Update to ExtractorHTML.java for cond. comments #149

Merged
merged 1 commit into from Feb 19, 2016

3 participants
@eleclerc
Contributor

eleclerc commented Feb 17, 2016

Lots of websites use "Downlevel Revealed" conditional comments with a twist that keeps the page valid HTML. This is an update to the regex to allow such a case, e.g.:

<!--[if expression]><!-->
  HTML
<!--<![endif]-->

- reference: https://css-tricks.com/downlevel-hidden-downlevel-revealed/
- reference: http://www.sitepoint.com/web-foundations/internet-explorer-conditional-comments/
- example site where this type of tag can be found: https://www.canada.ca/index.html
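
For readers who want to see the two forms side by side, here is a minimal, self-contained sketch. It is not the regex from ExtractorHTML.java; the class name `ConditionalCommentSketch`, the method `revealedBodies`, and the pattern itself are hypothetical and only illustrate the idea of accepting both the classic downlevel-revealed markers (`<![if expression]>` ... `<![endif]>`) and the valid-HTML variant shown above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: NOT the pattern used by Heritrix's ExtractorHTML. */
final class ConditionalCommentSketch {

    // Opening marker: classic "<![if expr]>" or the valid-HTML
    // twist "<!--[if expr]><!-->".
    private static final String OPEN =
        "(?:<!\\[if\\s+[^\\]]+\\]>|<!--\\[if\\s+[^\\]]+\\]><!-->)";

    // Closing marker: classic "<![endif]>" or the twist "<!--<![endif]-->".
    private static final String CLOSE =
        "(?:<!\\[endif\\]>|<!--<!\\[endif\\]-->)";

    // DOTALL lets the lazily matched body span several lines.
    private static final Pattern REVEALED = Pattern.compile(
        OPEN + "(.*?)" + CLOSE,
        Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns the trimmed HTML found inside each downlevel-revealed block. */
    static List<String> revealedBodies(CharSequence html) {
        List<String> bodies = new ArrayList<>();
        Matcher m = REVEALED.matcher(html);
        while (m.find()) {
            bodies.add(m.group(1).trim());
        }
        return bodies;
    }

    public static void main(String[] args) {
        String page = "<!--[if !IE]><!-->\n  <a href=\"/x\">x</a>\n<!--<![endif]-->";
        System.out.println(revealedBodies(page));
    }
}
```

A downlevel-hidden block such as `<!--[if IE]> ... <![endif]-->` is deliberately left unmatched by this sketch, since other browsers treat its content as an ordinary comment.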
@kris-sigur

Collaborator

kris-sigur commented Feb 18, 2016

Thanks Eric. This resolves a comparable issue I've been dealing with in one of the more popular CMSs here in Iceland. Vote for merging.

@nlevitt

Member

nlevitt commented Feb 18, 2016

@kris-sigur I think you have the power.

kris-sigur added a commit that referenced this pull request Feb 19, 2016

Merge pull request #149 from eleclerc/patch-1
Update to ExtractorHTML.java for cond. comments

@kris-sigur kris-sigur merged commit ae413b1 into internetarchive:master Feb 19, 2016
