HTML search plugin #25

Closed
wants to merge 18 commits into
from

Contributor

lucassmagal commented Jun 9, 2012

Hello,

I've made a plugin that parses the HTML5Doctor index.

Contributor

rpicard commented Jun 9, 2012

Thanks for the submission! I think I see where you're going with this, but the formatting of output.txt is not quite right. Each item should be on one line. You may just have some unescaped \n characters in there. For example:

area tag    http://dev.w3.org/html5/spec/Overview.html#the-area-element    

<img src="floorplan.png" usemap="#rooms" alt="1 bedroom, a kitchen, lounge and bathroom">

<map name="rooms">
    <area shape="rect" coords="20,20,140,140" href="bedroom.html" alt="Bedroom">
    <area shape="rect" coords="20,140,140,280" href="lounge.html" alt="Lounge">
    <area shape="poly" coords="140,140,140,280,280,140" href="kitchen.html" alt="Kitchen">
    <area shape="rect" coords="140,20,280,140" href="bathroom.html" alt="Bathroom">
</map>

should be something like this in output.txt (the exact number of escapes may not be correct here):

area tag    http://dev.w3.org/html5/spec/Overview.html#the-area-element    <img src="floorplan.png" usemap="#rooms" alt="1 bedroom, a kitchen, lounge and bathroom">\\n<map name="rooms">\n\t<area shape="rect" coords="20,20,140,140" href="bedroom.html" alt="Bedroom">\n\t<area shape="rect" coords="20,140,140,280" href="lounge.html" alt="Lounge">\n\t<area shape="poly" coords="140,140,140,280,280,140" href="kitchen.html" alt="Kitchen">\n\t<area shape="rect" coords="140,20,280,140" href="bathroom.html" alt="Bathroom">\n</map>

In your code all of those \ns and \ts would need to be escaped: \\n or \\t (you might need more escapes than that).

See the Hello World plugin for an example of that. Remember that the code in that one also includes \n characters that need to appear in the end abstract, so they are escaped as something like \\\\n. I don't think you'll need to worry about that though.
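To illustrate the escaping described above, here is a minimal Python sketch that turns a multi-line snippet into a single-line value. The function name `escape_snippet` is hypothetical and not taken from the plugin, and the exact number of backslashes the processing scripts expect is not confirmed in this thread, so treat the single level of escaping as an assumption.

```python
def escape_snippet(snippet: str) -> str:
    # Replace each real newline with the two characters backslash + n,
    # and each real tab with backslash + t, so the value fits on one line.
    # Assumption: one level of escaping is enough for output.txt.
    return snippet.replace("\n", "\\n").replace("\t", "\\t")


snippet = '<map name="rooms">\n\t<area shape="rect">\n</map>'
escaped = escape_snippet(snippet)
print(escaped)
```

If the abstract itself must display a literal `\n`, as in the Hello World plugin, more backslashes would be needed than this sketch produces.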

Also, there should be a little description of the element alongside the code. This would be the $description element where the code is in the $synopsis element.

Edit: One more thing: we don't need the word "tag" after each title, i.e. area tag should just be area.
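A small Python sketch of the title cleanup requested here, assuming titles arrive as plain strings such as `area tag`. The helper name `clean_title` is made up for illustration.

```python
def clean_title(title: str) -> str:
    # Drop a trailing " tag" so that "area tag" becomes "area".
    title = title.strip()
    suffix = " tag"
    if title.endswith(suffix):
        title = title[: -len(suffix)]
    return title


print(clean_title("area tag"))
```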

Let me know if you have any questions!

Contributor

lucassmagal commented Jun 9, 2012

Ok, I'm working to solve these bugs.

Contributor

lucassmagal commented Jun 10, 2012

So @rpicard , is this correct now?

pypi/parse.rb: Formatting of the abstract and switch to general outpu…
…t format

This now uses the format: "Package description: the package description
goes here." The first letter of the abstract is lowercased unless the
second letter was originally capitalized too, signaling an acronym.

It's using the general output format instead of the programming
format now too. Thanks to ezgraphs for including both formats in the
code!
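The capitalization rule in this commit message can be sketched as follows. This is a Python illustration only: the commit itself touches `pypi/parse.rb`, and the function name `format_abstract` and the handling of very short strings are assumptions.

```python
def format_abstract(description: str) -> str:
    text = description.strip()
    # A capitalized second letter is treated as the sign of an acronym,
    # in which case the first letter is left untouched.
    is_acronym = len(text) > 1 and text[1].isupper()
    if text and not is_acronym:
        text = text[0].lower() + text[1:]
    return "Package description: " + text


print(format_abstract("A small HTTP library."))
print(format_abstract("HTTP client for humans."))
```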
Contributor

lucassmagal commented Jun 12, 2012

@rpicard , is the code OK now?

Contributor

rpicard commented Jun 12, 2012

@lucassmagal Thanks for fixing that. I'm testing it out now. I'll let you know if I have some feedback.

Contributor

rpicard commented Jun 12, 2012

@lucassmagal The escaping still isn't right. I'm going to change some of the details (title of the box, the "more at" link, etc.) but here's what the abstract looks like now: https://robert.duckduckgo.com/?q=html+a

Edit: I'm working on this so that link may not work now.

Contributor

lucassmagal commented Jun 12, 2012

Well, I think the code snippets are being interpreted as real code. Maybe they should go into a pre tag?

Or, I think there's a problem with spacing around '\t'.

Thanks for feedback, I'll investigate.

Contributor

rpicard commented Jun 12, 2012

@lucassmagal The scripts we use to process the output actually should be putting it into a <pre> tag already. I'm trying to figure out exactly where the problem is occurring now.

Contributor

lucassmagal commented Jun 12, 2012

@rpicard , I've stripped all elements before creating the output, eliminating spacing around the text. Besides, I've inserted one more '\t' character between elements.

Tell me if it's still wrong.

rpicard added some commits Jun 14, 2012

iso_3166_codes: Several changes
List python package dependencies in README.md
Download the data file to download/
Update the format of the abstract
Decode unicode characters in the abstract with unidecode
iso_3166_codes/parse.py: Unidecode before assembling abstract string
The capitalization of some letters was weird since they were decoded
after going through unidecode().
.gitignore: Add .swp files for convenience
I don't like them cluttering the output of git status
Contributor

lucassmagal commented Jun 15, 2012

So @rpicard , any news?

Contributor

rpicard commented Jun 15, 2012

@lucassmagal Sorry, I'm working on figuring out why it's not processing right. I'm not sure if it's something in the script or something on our end yet.

That last commit should be reversed though. Elements should be joined by just one \t.
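As a rough Python sketch of the rule stated here, each field can be stripped of surrounding whitespace and the fields joined with exactly one tab. The name `build_line` and the example field values are hypothetical, and the real output.txt may carry more fields than shown.

```python
def build_line(fields):
    # Strip whitespace around each field, then join with a single tab.
    return "\t".join(field.strip() for field in fields)


line = build_line([
    " area ",
    "http://dev.w3.org/html5/spec/Overview.html#the-area-element ",
    " <map name=\"rooms\">\\n</map>",
])
print(line)
```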

Contributor

lucassmagal commented Jun 15, 2012

@rpicard , OK, I'll wait for your review of the problem.

Contributor

rpicard commented Jun 18, 2012

@lucassmagal I've got it working [1]. Do you have a Twitter handle that we can use for attribution when we announce it?

[1] https://robert.duckduckgo.com/?q=html+a

Contributor

lucassmagal commented Jun 19, 2012

Yes, @rpicard : http://twitter.com/lsmagalhaes

Thank you very much for your help!

Contributor

rpicard commented Jun 19, 2012

@lucassmagal Great! I'll try to get this live in the next day or two. No guarantees that I won't run into more things to be fixed though. :)

Contributor

rpicard commented Jun 19, 2012

Thanks for the submission. I'll let you know when it's live.

Contributor

rpicard commented Jun 20, 2012

@lucassmagal The plugin is live now. I've also merged it into the repo. It's a great addition. I was already using it when I just had it on robert.duckduckgo.com.

https://duckduckgo.com/?q=html+abbr

@rpicard rpicard closed this Jun 20, 2012

Contributor

lucassmagal commented Jun 20, 2012

Thank you @rpicard ! And thank you again for DuckDuckHack, it's a great opportunity for me to contribute to it =D

Contributor

rpicard commented Jun 20, 2012

@lucassmagal It's great for us too! I hope I'll see more pull requests from you.
