Entity decoding in Text tokens #11

merged 5 commits into yihuang:master from chrisdone/entity_decoding on Feb 1, 2014


3 participants

chrisdone commented Jan 28, 2014

The decode function in both the ByteString and Text modules now decodes HTML entities:

> Text.HTML.TagStream.ByteString.decode "foo bar &quot;mu&lt; zot &hello;"
Right [Text "foo bar \"mu< zot "]
> Text.HTML.TagStream.Text.decode "foo bar &quot;mu&lt; zot &hello;"
Right [Text "foo bar \"mu< zot "]

The conduit (thanks to @snoyberg) also works:

> CL.sourceList (T.chunksOf 1 "&gt;") C.$= T.tokenStream C.$$ CL.consume
[Text ">"]
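
The one-character chunks in that example matter because an entity such as `&gt;` can be split across chunk boundaries, so a streaming decoder has to hold back a trailing fragment until it knows whether the entity completes. The following is only a rough sketch of that idea using plain `String`s and base, not the conduit code from this pull request. The names `entityTable`, `decodeKnown`, `splitIncomplete` and `decodeChunks` are hypothetical, the entity table is deliberately tiny, numeric entities are not handled, and unrecognised entities are simply passed through unchanged to keep the sketch short.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical, deliberately tiny entity table (real HTML has many more).
entityTable :: [(String, String)]
entityTable = [("quot", "\""), ("lt", "<"), ("gt", ">"), ("amp", "&")]

-- Decode complete, known entities; copy everything else through unchanged.
decodeKnown :: String -> String
decodeKnown [] = []
decodeKnown ('&' : rest) =
  case span isAlphaNum rest of
    (name, ';' : rest')
      | Just value <- lookup name entityTable -> value ++ decodeKnown rest'
    _ -> '&' : decodeKnown rest
decodeKnown (c : cs) = c : decodeKnown cs

-- Split a chunk into the part that is safe to decode now and a trailing
-- fragment that might be the start of an entity (e.g. "&g" from "&gt;").
splitIncomplete :: String -> (String, String)
splitIncomplete chunk =
  case break (== '&') (reverse chunk) of
    (revTail, '&' : revInit)
      | all isAlphaNum revTail -> (reverse revInit, '&' : reverse revTail)
    _ -> (chunk, "")

-- Feed chunks one at a time, carrying the incomplete fragment forward.
decodeChunks :: [String] -> String
decodeChunks = go ""
  where
    go leftover [] = leftover -- flush a fragment that never completed
    go leftover (c : cs) =
      let (ready, pending) = splitIncomplete (leftover ++ c)
      in decodeKnown ready ++ go pending cs
```

Loaded in GHCi, `decodeChunks ["&", "g", "t", ";"]` evaluates to `">"`, mirroring the chunked conduit example above.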

In the spirit of tagstream-conduit's liberal parsing, invalid or unknown entities are skipped:

> Text.HTML.TagStream.Text.decode "&foo; &bar &quot;mu&lt; zot &hello;"
Right [Text "  \"mu< zot "]

A test case has been added.


yihuang commented Jan 31, 2014

This is cool, thanks.
I think it's better to leave invalid or unknown entities as they are, which is what browsers do. What do you think?

> Text.HTML.TagStream.Text.decode "&foo; &bar &quot;mu&lt; zot &hello;"
Right [Text "&foo; &bar \"mu< zot &hello;"]
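
To make the proposed behaviour concrete, here is a minimal sketch of a decoder that replaces known entities and leaves invalid or unknown ones untouched. It is not the code merged in this pull request: it works on `String` rather than `Text` or `ByteString`, the names `entityTable` and `decodeEntities` are hypothetical, and the table covers only four entities for illustration.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical, deliberately tiny entity table (real HTML has many more).
entityTable :: [(String, String)]
entityTable =
  [ ("quot", "\"")
  , ("lt", "<")
  , ("gt", ">")
  , ("amp", "&")
  ]

-- Decode known entities. Anything that is not a complete, known entity
-- (no terminating ';' or a name missing from the table) is copied as is.
decodeEntities :: String -> String
decodeEntities [] = []
decodeEntities ('&' : rest) =
  case span isAlphaNum rest of
    (name, ';' : rest')
      | Just value <- lookup name entityTable -> value ++ decodeEntities rest'
    _ -> '&' : decodeEntities rest
decodeEntities (c : cs) = c : decodeEntities cs
```

Loaded in GHCi, `decodeEntities "&foo; &bar &quot;mu&lt; zot &hello;"` evaluates to `"&foo; &bar \"mu< zot &hello;"`: only `&quot;` and `&lt;` are replaced.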

chrisdone commented Jan 31, 2014

No problem, I'll update the pull request!


chrisdone commented Jan 31, 2014

@yihuang Updated!

@yihuang yihuang added a commit that referenced this pull request Feb 1, 2014

@yihuang yihuang Merge pull request #11 from chrisdone/entity_decoding
Entity decoding in Text tokens

@yihuang yihuang merged commit d82a501 into yihuang:master Feb 1, 2014

@chrisdone chrisdone deleted the unknown repository branch Feb 1, 2014


chrisdone commented Feb 3, 2014

Can you (or I) push this to Hackage?


yihuang commented Feb 4, 2014

Done, uploaded 0.5.5.


chrisdone commented Feb 4, 2014


yihuang notifications@github.com writes:

Done, uploaded 0.5.5.

