This repository has been archived by the owner on Oct 15, 2022. It is now read-only.

HTML search plugin #25

Closed
wants to merge 18 commits into from


lucassmagal
Contributor

Hello,

I've made a plugin that parses the HTML5Doctor index.

@rpicard
Contributor

rpicard commented Jun 9, 2012

Thanks for the submission! I think I see where you're going with this, but the formatting of output.txt is not quite right. Each item should be on one line. You may just have some unescaped \n characters in there. For example:

area tag    http://dev.w3.org/html5/spec/Overview.html#the-area-element    

<img src="floorplan.png" usemap="#rooms" alt="1 bedroom, a kitchen, lounge and bathroom">

<map name="rooms">
    <area shape="rect" coords="20,20,140,140" href="bedroom.html" alt="Bedroom">
    <area shape="rect" coords="20,140,140,280" href="lounge.html" alt="Lounge">
    <area shape="poly" coords="140,140,140,280,280,140" href="kitchen.html" alt="Kitchen">
    <area shape="rect" coords="140,20,280,140" href="bathroom.html" alt="Bathroom">
</map>

should be something like this in output.txt (the exact number of escapes may not be correct here):

area tag    http://dev.w3.org/html5/spec/Overview.html#the-area-element    <img src="floorplan.png" usemap="#rooms" alt="1 bedroom, a kitchen, lounge and bathroom">\\n<map name="rooms">\n\t<area shape="rect" coords="20,20,140,140" href="bedroom.html" alt="Bedroom">\n\t<area shape="rect" coords="20,140,140,280" href="lounge.html" alt="Lounge">\n\t<area shape="poly" coords="140,140,140,280,280,140" href="kitchen.html" alt="Kitchen">\n\t<area shape="rect" coords="140,20,280,140" href="bathroom.html" alt="Bathroom">\n</map>

In your code all of those \ns and \ts would need to be escaped: \\n or \\t (you might need more escapes than that).

See the Hello World plugin for an example of that. Remember that the code in that one also includes \n characters that need to appear in the end abstract, so they are escaped as something like \\\\n. I don't think you'll need to worry about that though.
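
As a rough illustration of the escaping described above, here is a minimal Python sketch. It assumes a simplified three-field record (title, URL, synopsis) like the example earlier in this comment. `escape_field` and `make_line` are hypothetical helper names, not part of the plugin or of DuckDuckGo's tooling, and the real pipeline may need more backslashes than this shows.

```python
def escape_field(text):
    """Replace real newline and tab characters with the literal
    two-character sequences backslash-n and backslash-t."""
    return text.replace("\n", "\\n").replace("\t", "\\t")


def make_line(title, url, synopsis):
    """Join the fields with single tabs so the whole record fits on one line."""
    return "\t".join([title, url, escape_field(synopsis)])


# Hypothetical sample data based on the area example above.
snippet = '<map name="rooms">\n\t<area shape="rect" href="bedroom.html" alt="Bedroom">\n</map>'
line = make_line(
    "area",
    "http://dev.w3.org/html5/spec/Overview.html#the-area-element",
    snippet,
)
print(line)
```

After this step the record contains no real newline characters, and the only real tabs are the two field separators.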

Also, there should be a little description of the element alongside the code. This would be the $description element where the code is in the $synopsis element.

Edit: One more thing: we don't need the word "tag" after each title, i.e. area tag should just be area.
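
Dropping the trailing word could look something like the sketch below. `clean_title` is a hypothetical helper name, and it assumes the titles end in exactly " tag" as in the example above.

```python
def clean_title(title):
    """Remove a trailing ' tag' from a title, leaving other titles unchanged."""
    suffix = " tag"
    if title.endswith(suffix):
        return title[: -len(suffix)]
    return title


print(clean_title("area tag"))
```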

Let me know if you have any questions!

@lucassmagal
Contributor Author

Ok, I'm working to solve these bugs.

@lucassmagal
Contributor Author

So @rpicard , is this correct now?

…t format

This now uses the format: "Package description: the package description
goes here." The first letter of the abstract is lowercased unless the
second letter was originally capitalized too, signaling an acronym.

It's using the general output format instead of the programming
format now too. Thanks to ezgraphs for including both formats in the
code!
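
The lowercasing rule in this commit message can be sketched roughly as follows. This is only an illustration of the rule as described, not the commit's actual code; `lowercase_first` and `format_abstract` are hypothetical names.

```python
def lowercase_first(abstract):
    """Lowercase the first letter unless the second letter is uppercase,
    which is taken as a sign that the text starts with an acronym."""
    if len(abstract) >= 2 and abstract[1].isupper():
        return abstract
    return abstract[:1].lower() + abstract[1:]


def format_abstract(description):
    """Prefix the description in the 'Package description: ...' style."""
    return "Package description: " + lowercase_first(description)


print(format_abstract("The package description goes here."))
print(format_abstract("HTTP client library."))
```
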
@lucassmagal
Contributor Author

@rpicard, is the code OK now?

@rpicard
Contributor

rpicard commented Jun 12, 2012

@lucassmagal Thanks for fixing that. I'm testing it out now. I'll let you know if I have some feedback.

@rpicard
Contributor

rpicard commented Jun 12, 2012

@lucassmagal The escaping still isn't right. I'm going to change some of the details (title of the box, the "more at" link, etc.) but here's what the abstract looks like now: https://robert.duckduckgo.com/?q=html+a

Edit: I'm working on this so that link may not work now.

@lucassmagal
Contributor Author

Well, I think the code snippets are being interpreted as real code. Maybe they should go inside a pre tag?

Or, I think there's a problem with spacing around '\t'.

Thanks for the feedback, I'll investigate.

@rpicard
Contributor

rpicard commented Jun 12, 2012

@lucassmagal The scripts we use to process the output actually should be putting it into a <pre> tag already. I'm trying to figure out exactly where the problem is occurring now.

@lucassmagal
Contributor Author

@rpicard, I've stripped all elements before creating the output, eliminating the spacing around the text. I've also inserted one more '\t' character between elements.

Tell me if it's still wrong.

List python package dependencies in README.md
Download the data file to download/
Update the format of the abstract
Decode unicode characters in the abstract with unidecode
The capitalization of some letters was weird since they were decoded
after going through unidecode().
I don't like them cluttering the output of git status
@lucassmagal
Contributor Author

So @rpicard , any news?

@rpicard
Contributor

rpicard commented Jun 15, 2012

@lucassmagal Sorry, I'm working on figuring out why it's not processing right. I'm not sure if it's something in the script or something on our end yet.

That last commit should be reversed though. Elements should be joined by just one \t.
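
For illustration, stripping each element and joining with exactly one tab might look like the sketch below. `build_record` is a hypothetical name, and the sample fields are made up from the area example earlier in this thread.

```python
def build_record(fields):
    """Strip surrounding whitespace from each field, then join the
    fields with exactly one tab character between neighbours."""
    cleaned = [field.strip() for field in fields]
    return "\t".join(cleaned)


record = build_record([
    "  area ",
    "http://dev.w3.org/html5/spec/Overview.html#the-area-element\n",
    " <map>\\n</map> ",
])
print(record)
```

Stripping first means stray spaces or newlines at the edges of an element can't end up next to the separators.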

@lucassmagal
Contributor Author

@rpicard, OK, I'll wait for your review of the problem.

@rpicard
Contributor

rpicard commented Jun 18, 2012

@lucassmagal I've got it working [1]. Do you have a Twitter handle that we can use for attribution when we announce it?

[1] https://robert.duckduckgo.com/?q=html+a

@lucassmagal
Contributor Author

Yes, @rpicard : http://twitter.com/lsmagalhaes

Thank you very much for your help!

@rpicard
Contributor

rpicard commented Jun 19, 2012

@lucassmagal Great! I'll try to get this live in the next day or two. No guarantees that I won't run into more things to be fixed though. :)

@rpicard
Contributor

rpicard commented Jun 19, 2012

Thanks for the submission. I'll let you know when it's live.

@rpicard
Contributor

rpicard commented Jun 20, 2012

@lucassmagal The plugin is live now. I've also merged it into the repo. It's a great addition. I was already using it when I just had it on robert.duckduckgo.com.

https://duckduckgo.com/?q=html+abbr

@rpicard rpicard closed this Jun 20, 2012
@lucassmagal
Contributor Author

Thank you @rpicard! And thank you again for DuckDuckHack, it's a great opportunity for me to contribute to it =D

@rpicard
Contributor

rpicard commented Jun 20, 2012

@lucassmagal It's great for us too! I hope I'll see more pull requests from you.

2 participants