Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix generated_html failure when the meta line doesn't have a "content" attribute #4628

Merged
merged 5 commits into from
Aug 29, 2019

Conversation

lildude
Copy link
Member

@lildude lildude commented Aug 29, 2019

Description

It has come to light that the changes in generated_html() added in #4574 don't correctly handle meta lines that contain name="generated" attribute, but not a content="?" attribute, for example:

  <meta name='generator' value='Ronn/v0.7.3 (http://github.com/rtomayko/ronn/tree/0.7.3)'>

... from https://github.com/tj/git-extras/blob/a56e14488e5cd713ad27370ec432fdd510353f53/man/git-alias.html#L5

This results in a 500 error (try clicking the first commit in this PR 😉) on GitHub.com with the following failure stack:

NoMethodError: undefined method `match' for nil:NilClass
    /home/runner/work/linguist/linguist/lib/linguist/generated.rb:624:in `block in generated_html?'
    /home/runner/work/linguist/linguist/lib/linguist/generated.rb:622:in `any?'
    /home/runner/work/linguist/linguist/lib/linguist/generated.rb:622:in `generated_html?'
    /home/runner/work/linguist/linguist/lib/linguist/generated.rb:89:in `generated?'
    /home/runner/work/linguist/linguist/lib/linguist/generated.rb:12:in `generated?'
    /home/runner/work/linguist/linguist/test/test_generated.rb:31:in `generated_loading_data'
    /home/runner/work/linguist/linguist/test/test_generated.rb:40:in `generated_fixture_loading_data'
    /home/runner/work/linguist/linguist/test/test_generated.rb:161:in `test_check_generated'

... as can be seen in the test failure I deliberately forced. See https://github.com/github/linguist/runs/206726836#step:5:68

This PR updates generated_html() to better handle the case where content doesn't exist, and also adds support for the value attribute and additionally marks those generated by Ronn as generated too.

Now the catch: I can't include the two sample files I want to include in this PR as doing so causes a 500 error when attempting to create this pull request.

I can add them back now the PR has been created, but then the files can't be viewed in the "Files Changed" tab as we hit the same 500 error. You can't view the individual commits containing those files either. All of these are caused by the issue I'm trying to fix here.

So for the moment, I'm going to leave out the tests and fixture files and add them once this PR has been merged and a new release made and pushed to GitHub.com.

The fix can be verified locally as follows:

  1. Create a new test/fixtures/HTML/no-content.html using another fixture as the source: sed -e 's/content/value/' test/fixtures/HTML/unknown.html > test/fixtures/HTML/no-content.html
  2. Add generated_fixture_loading_data("HTML/no-content.html", true) to the list of tests in test_generated.rb
  3. Run the generated tests: bundle exec rake test TEST=test/test_generated.rb - all test should pass.

Similar changes can be made to test the value="Ronn/.... support.

Template removed as it's not relevant.

@Alhadis
Copy link
Collaborator

Alhadis commented Aug 29, 2019

I literally just commented about this on #4620 at the very same time you submitted this. :|

You can't view the individual commits containing those files either. All of these are caused by the issue I'm trying to fix here

I checked your branch out locally:

$ cd /tmp && git clone --depth=10 --single-branch --branch lildude/fix-gen_html-error -- git@github.com:github/linguist.git

... which revealed some disconcerting metadata. 😕 Note the Author field:

469594ab9d668e0b399a32fafb677fc789549e21
commit 469594ab9d668e0b399a32fafb677fc789549e21
Author: The rugged tests are fragile <colin@github.com>
Date:   Thu Aug 29 12:21:53 2019 +0100

    Revert "Add test fixture that illustrates the failure"
    
    This reverts commit fc0da0388afc18e70baf9de328e0b6e9468276bb.

diff --git a/test/fixtures/HTML/no-content.html b/test/fixtures/HTML/no-content.html
deleted file mode 100644
index e61c29c..0000000
--- a/test/fixtures/HTML/no-content.html
+++ /dev/null
@@ -1,4 +0,0 @@
-<!DOCTYPE html>
-<html>
-<meta name="generator" value="Some sick tool nobody's using yet" />
-</html>
diff --git a/test/test_generated.rb b/test/test_generated.rb
index cd0c690..b7fa5ff 100644
--- a/test/test_generated.rb
+++ b/test/test_generated.rb
@@ -158,7 +158,6 @@ class TestGenerated < Minitest::Test
     generated_fixture_loading_data("HTML/quotes-single.html")
     generated_fixture_loading_data("HTML/uppercase.html")
     generated_fixture_loading_data("HTML/unknown.html", true)
-    generated_fixture_loading_data("HTML/no-content.html", true)
     generated_sample_loading_data("HTML/pages.html")
 
   end
d62d0ec0479f17e826fc5a60f1f9b7280f74e708
commit d62d0ec0479f17e826fc5a60f1f9b7280f74e708
Author: The rugged tests are fragile <colin@github.com>
Date:   Thu Aug 29 12:21:36 2019 +0100

    Revert "Add ronn generated fixture"
    
    This reverts commit 84428d3b9ef746ea94046ad65781ac4245d5f629.

diff --git a/test/fixtures/HTML/ronn.html b/test/fixtures/HTML/ronn.html
deleted file mode 100644
index 5374a17..0000000
--- a/test/fixtures/HTML/ronn.html
+++ /dev/null
@@ -1,4 +0,0 @@
-<!DOCTYPE html>
-<html>
-<meta name="generator" value="Ronn/v0.7.3 (http://github.com/rtomayko/ronn/tree/0.7.3)" />
-</html>
diff --git a/test/test_generated.rb b/test/test_generated.rb
index 981fff5..cd0c690 100644
--- a/test/test_generated.rb
+++ b/test/test_generated.rb
@@ -157,7 +157,6 @@ class TestGenerated < Minitest::Test
     generated_fixture_loading_data("HTML/quotes-none.html")
     generated_fixture_loading_data("HTML/quotes-single.html")
     generated_fixture_loading_data("HTML/uppercase.html")
-    generated_fixture_loading_data("HTML/ronn.html")
     generated_fixture_loading_data("HTML/unknown.html", true)
     generated_fixture_loading_data("HTML/no-content.html", true)
     generated_sample_loading_data("HTML/pages.html")
41e2e331b2da58336a13b899cf3a998c55111b06
commit 41e2e331b2da58336a13b899cf3a998c55111b06
Author: The rugged tests are fragile <colin@github.com>
Date:   Thu Aug 29 12:20:24 2019 +0100

    Support content or value and handle nil

diff --git a/lib/linguist/generated.rb b/lib/linguist/generated.rb
index a06dcc9..e6b1cea 100644
--- a/lib/linguist/generated.rb
+++ b/lib/linguist/generated.rb
@@ -621,23 +621,27 @@ module Linguist
       return false if matches.empty?
       return matches.map {|x| extract_html_meta(x) }.any? do |attr|
         attr["name"].to_s.downcase == 'generator' &&
-        attr["content"].match(/^
-          ( org \s+ mode
-          | j?latex2html
-          | groff
-          | makeinfo
-          | texi2html
-          ) \b
-        /ix)
+        [attr["content"], attr["value"]].any? do |cv|
+          !cv.nil? &&
+          cv.match(/^
+            ( org \s+ mode
+            | j?latex2html
+            | groff
+            | makeinfo
+            | texi2html
+            | ronn
+            ) \b
+          /ix)
+        end
       end
     end
 
     # Internal: Extract a Hash of name/content pairs from an HTML <meta> tag
     def extract_html_meta(match)
       (match.last.sub(/\/\Z/, "").strip.scan(/
-        (?<=^|\s)        # Check for preceding whitespace
-        (name|content)   # Attribute names we're interested in
-        \s* = \s*        # Key-value separator
+        (?<=^|\s)              # Check for preceding whitespace
+        (name|content|value)   # Attribute names we're interested in
+        \s* = \s*              # Key-value separator
 
         # Attribute value
         ( "[^"]+"        # name="value"
84428d3b9ef746ea94046ad65781ac4245d5f629
commit 84428d3b9ef746ea94046ad65781ac4245d5f629
Author: The rugged tests are fragile <colin@github.com>
Date:   Thu Aug 29 12:19:53 2019 +0100

    Add ronn generated fixture

diff --git a/test/fixtures/HTML/ronn.html b/test/fixtures/HTML/ronn.html
new file mode 100644
index 0000000..5374a17
--- /dev/null
+++ b/test/fixtures/HTML/ronn.html
@@ -0,0 +1,4 @@
+<!DOCTYPE html>
+<html>
+<meta name="generator" value="Ronn/v0.7.3 (http://github.com/rtomayko/ronn/tree/0.7.3)" />
+</html>
diff --git a/test/test_generated.rb b/test/test_generated.rb
index cd0c690..981fff5 100644
--- a/test/test_generated.rb
+++ b/test/test_generated.rb
@@ -157,6 +157,7 @@ class TestGenerated < Minitest::Test
     generated_fixture_loading_data("HTML/quotes-none.html")
     generated_fixture_loading_data("HTML/quotes-single.html")
     generated_fixture_loading_data("HTML/uppercase.html")
+    generated_fixture_loading_data("HTML/ronn.html")
     generated_fixture_loading_data("HTML/unknown.html", true)
     generated_fixture_loading_data("HTML/no-content.html", true)
     generated_sample_loading_data("HTML/pages.html")
fc0da0388afc18e70baf9de328e0b6e9468276bb
commit fc0da0388afc18e70baf9de328e0b6e9468276bb
Author: The rugged tests are fragile <colin@github.com>
Date:   Thu Aug 29 11:42:35 2019 +0100

    Add test fixture that illustrates the failure

diff --git a/test/fixtures/HTML/no-content.html b/test/fixtures/HTML/no-content.html
new file mode 100644
index 0000000..e61c29c
--- /dev/null
+++ b/test/fixtures/HTML/no-content.html
@@ -0,0 +1,4 @@
+<!DOCTYPE html>
+<html>
+<meta name="generator" value="Some sick tool nobody's using yet" />
+</html>
diff --git a/test/test_generated.rb b/test/test_generated.rb
index b7fa5ff..cd0c690 100644
--- a/test/test_generated.rb
+++ b/test/test_generated.rb
@@ -158,6 +158,7 @@ class TestGenerated < Minitest::Test
     generated_fixture_loading_data("HTML/quotes-single.html")
     generated_fixture_loading_data("HTML/uppercase.html")
     generated_fixture_loading_data("HTML/unknown.html", true)
+    generated_fixture_loading_data("HTML/no-content.html", true)
     generated_sample_loading_data("HTML/pages.html")
 
   end

I'm wondering if this has something to do with the 500 error, because there's no logical reason the changes should be screwing with anything...

@lildude
Copy link
Member Author

lildude commented Aug 29, 2019

I'm wondering if this has something to do with the 500 error, because there's no logical reason the changes should be screwing with anything...

Sorted and they have absolutely nothing to do with the 500 errors. You can see it in real life at https://github.com/tj/git-extras/pull/769/files

I was pinged on those 500 errors.

@lildude
Copy link
Member Author

lildude commented Aug 29, 2019

I've also corrected the author info and force-pushed to this branch. Problem persists.

Copy link
Collaborator

@Alhadis Alhadis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, <meta /> elements don't have a value= attribute, and I assume this error was propagated due to the common pairing of name and value when naming stuff that represents a key/value pair.

Valid or not, it's still in our interests to support it. The DOM purist in me couldn't approve this PR without pointing this out, however. 😉

(Then again, I wouldn't expect accurate markup from Ronn of all things... 😜)

@lildude
Copy link
Member Author

lildude commented Aug 29, 2019

Gonna test on staging before merging this, just in case it's not a complete fix.

@lildude
Copy link
Member Author

lildude commented Aug 29, 2019

Confirmed this does the trick in staging. Merging and will now make a new release.

@lildude lildude merged commit 6027cf3 into master Aug 29, 2019
@lildude lildude deleted the lildude/fix-gen_html-error branch August 29, 2019 13:03
lildude added a commit that referenced this pull request Aug 30, 2019
* Add test fixture that illustrates the failure

* Add ronn generated fixture
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants