Update to ExtractorHTML.java for cond. comments #149

Merged: 1 commit merged into internetarchive:master on Feb 19, 2016

@eleclerc (Contributor) commented Feb 17, 2016

Lots of websites use "Downlevel Revealed" conditional comments with a twist that keeps the page valid HTML. This updates the regex to allow such a case, for example:

<!--[if expression]><!-->
  HTML
<!--<![endif]-->

- reference: https://css-tricks.com/downlevel-hidden-downlevel-revealed/
- reference: http://www.sitepoint.com/web-foundations/internet-explorer-conditional-comments/
- example site where this type of tag can be found: https://www.canada.ca/index.html
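
To illustrate why the extra `<!-->` matters, here is a minimal sketch. It assumes a simplified comment matcher; `NAIVE`, `TOLERANT`, and `comments` are hypothetical names for illustration, not the actual expression or methods in ExtractorHTML.java. The idea is that a matcher which only exempts comments opening with `[if` will start a new comment at `<!-->` and swallow the revealed HTML up to the closing `-->`, whereas also refusing to start at `<!-->` leaves that HTML visible to link extraction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ConditionalCommentSketch {
    // Hypothetical, simplified patterns; NOT the regex used in ExtractorHTML.java.
    // NAIVE treats everything from "<!--" to "-->" as a comment unless it opens with "[if".
    static final Pattern NAIVE = Pattern.compile("(?is)<!--(?!\\[if).*?-->");
    // TOLERANT additionally refuses to open a comment at "<!-->",
    // the tail of the valid-HTML "<!--[if expression]><!-->" opener.
    static final Pattern TOLERANT = Pattern.compile("(?is)<!--(?!\\[if|>).*?-->");

    // Returns every span the given pattern would treat as a comment.
    static List<String> comments(Pattern pattern, String html) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(html);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<!--[if !IE]><!--><a href=\"/x\">x</a><!--<![endif]-->";
        // The link is swallowed: [<!--><a href="/x">x</a><!--<![endif]-->]
        System.out.println(comments(NAIVE, html));
        // Only the closing marker is a comment: [<!--<![endif]-->]
        System.out.println(comments(TOLERANT, html));
    }
}
```

With the tolerant pattern, the `<a>` element sits outside every matched comment, so an extractor working on the remaining markup would still see its `href`.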

@kris-sigur (Collaborator) commented Feb 18, 2016

Thanks Eric. This resolves a comparable issue I've been dealing with in one of the more popular CMSs here in Iceland. Vote for merging.

@nlevitt (Member) commented Feb 18, 2016

@kris-sigur I think you have the power.

kris-sigur added a commit that referenced this pull request Feb 19, 2016
Update to ExtractorHTML.java for cond. comments
@kris-sigur kris-sigur merged commit ae413b1 into internetarchive:master Feb 19, 2016