Include Copyright owner/date in HTML license report; show multiple licenses #114

DonnKey · 2019-10-24T18:31:19Z

This change addresses issue #109 (nee #50) about including Copyright text in the report.
It also handles the rare (but real -- see "Checker Qual") case where there are multiple license kinds by including all the licenses. Some minor changes to improve readability of both the raw html and the final result.

The raw information in the POM is often incomplete in this regard, so the report makes a "best guess" when that happens. Those guesses are easily discovered and edited in the raw html if needed.

Incidentally, it also adds another spelling of the MIT license URL (with a .php suffix). I didn't check whether the other "standard" licenses have that form.

# Conflicts: # src/main/kotlin/com/jaredsburrows/license/internal/report/HtmlReport.kt

jaredsburrows · 2019-10-26T02:25:09Z

Can you resolve the conflicts?

DonnKey · 2019-10-26T23:07:49Z

I resolved the conflicts. The edit in GitHub dropped a }, which I've fixed locally but not pushed yet.

However... I also realized I hadn't run the regressions, and expected that it would fail because of my changes. I tried to fix that, and ran up against test harness problems. (Several.)

Details follow below.

Before getting into the test harness issues, there are a couple of cosmetic issues that I'm stuck on. I don't like to blame the tools, but in this case I'm stumped and it really does look like the tools to me. (I've been doing this stuff for a very long time, so that statement is not lightly made, but I also freely admit I could be wrong.)

The resulting HTML has some problems with whitespace/indentation around the <a> entries (and a <br>, but that might be a cascade); it looks to me as if the code to adjust indentation is simply losing track of where it is. I've spent several days causing the code to generate "wrong" html trying to determine if there's some alternate pattern I should be following, and the <a> simply sticks at the left margin. I'm open to suggestions, but I lean toward living with it. (The html is correct and as intended except for the whitespace.)

Here's a snippet. Note where the <a> entries align.

    <ul>
      <li><a href="#-1748871456">Android SDK Fabric (1.4.8)</a>
        <dl>
          <dt>Copyright &copy; 20xx The original author or authors</dt>
        </dl>
      </li>
<a name="-1748871456"></a>
      <pre>Fabric Software and Services Agreement
<a href="https://fabric.io/terms">https://fabric.io/terms</a></pre>
<br>
      <hr>

The problems I've run across in the test harness are not yours, I don't think, but I'm not sure how to hack around them. (Only the first of the errors is reported at any time, so you have to unmask them one at a time.) Let me know how you'd like to proceed. (My preference is to check in the "right answer" as intended and try to fix the test harness over time, but that's messy for you!)

1. Fatal error in assertHtml because the resulting HTML contains &copy;: it's being reported as an "undefined entity" even though it is official HTML. (Yeah, using (c) would work, but the copyright symbol is available and a better choice.)
2. Fatal error in assertHtml because there's no matching </hr> for the <hr>. Except that <hr> doesn't require a matching </hr>! (Some HTML dialects apparently do, but at least w3schools says <hr> has no end tag for straight HTML.) The <hr> really makes the result more readable, and I haven't a workaround in mind.
3. Fatal error like the above for <br>. Here there's no ambiguity: <br> has never had an end tag, and it makes no sense to have one. The <br> is really needed, and I don't see an alternative.
4. I'm using Windows, and it looks as if there's a problem (another one) with respect to Windows path names. I'm seeing the following:

result.output.find("Wrote HTML report to .*${reportFolder}/${taskName}.html.")
|      |      |                             |               |
|      |      |                             |               licenseDebugReport
|      |      |                             C:\Users\donnt\AppData\Local\Temp\junit2089520722103704024/build/reports/licenses
|      |      java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 26
|      |      Wrote HTML report to .*C:\Users\donnt\AppData\Local\Temp\junit2089520722103704024/build/reports/licenses/licenseDebugReport.html.
|      |                                ^
|

Note the caret under the \U - it looks as if it's trying to treat the Windows path separator as an escape. I'm inclined to fix everything else however we decide to do it, and rely on the automated testing to deal with that (give up testing on Windows). (Yeah, I know, "Windows"...).
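For reference, the exception comes from splicing the raw Windows path into the regex, so \U is parsed as an escape sequence. One possible way around it, sketched here in plain Java rather than in the project's Groovy spec, is to quote the path with Pattern.quote before building the pattern. The class and method names below are hypothetical and not part of the plugin or its tests.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class ReportPathPattern {
    private ReportPathPattern() {}

    // Mirrors the failing check: the folder is spliced into the regex unescaped.
    static boolean compilesUnquoted(String reportFolder, String taskName) {
        try {
            Pattern.compile("Wrote HTML report to .*" + reportFolder + "/" + taskName + ".html.");
            return true;
        } catch (PatternSyntaxException e) {
            // A backslash followed by 'U' (as in the Users folder) is not a valid regex escape.
            return false;
        }
    }

    // Hypothetical fix: Pattern.quote makes the whole path match literally.
    static Pattern quoted(String reportFolder, String taskName) {
        String path = reportFolder + "/" + taskName + ".html";
        return Pattern.compile("Wrote HTML report to .*" + Pattern.quote(path) + "\\.");
    }
}
```

This only addresses the escape problem. Whether the logged path uses the same separators as reportFolder on Windows is a separate question that would need checking.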

Lastly, for the record if you haven't seen it, I'm seeing a gripe about Groovy internals. Presumably someone is working on it, but for the record:

An illegal reflective access operation has occurred
Illegal reflective access by org.codehaus.groovy.vmplugin.v7.Java7$1 (file:/C:/Users/donnt/.gradle/wrapper/dists/gradle-5.4.1-all/3221gyojl5jsh0helicew7rwx/gradle-5.4.1/lib/groovy-all-1.0-2.5.4.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
Please consider reporting this to the maintainers of org.codehaus.groovy.vmplugin.v7.Java7$1
Use --illegal-access=warn to enable warnings of further illegal reflective access operations
All illegal access operations will be denied in a future release

DonnKey · 2019-10-27T17:31:53Z

Overnight it occurred to me to look at exactly what assertHtml did (I had assumed it was a built-in). The problem is that it's comparing XML when the input is HTML, and it matters in this case. I didn't see any HTML comparison tools similar to DiffBuilder, and I suspect that <hr> and <br> can't be dealt with at all because it's assuming well-formed (properly closed) XML. Either a preliminary translation phase to convert the HTML into legal XML (e.g. <hr> -> <hr/>) or a completely different comparison strategy will be needed. As final owner (and arbiter) I think at least the outline of the solution should be your call.
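A minimal sketch of the preliminary translation idea, written in plain Java under the assumption that only the void tags <br> and <hr> need rewriting. The class and method names are hypothetical, and the well-formedness check uses the JDK's built-in XML parser rather than the comparison library the tests use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class HtmlToXmlNormalizer {
    private HtmlToXmlNormalizer() {}

    // Rewrites the void tags <br> and <hr> as self-closing so an XML parser accepts them.
    static String selfCloseVoidTags(String html) {
        return html.replace("<br>", "<br/>").replace("<hr>", "<hr/>");
    }

    // Returns true when the text parses as well-formed XML.
    static boolean isWellFormedXml(String text) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(text)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Plain string replacement is deliberately naive: it would miss tags written with attributes or different casing, so it suits a fixed, generated report better than arbitrary HTML.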

jaredsburrows · 2019-10-27T17:34:54Z

@DonnKey If you have a better library or way to compare the HTML, please let me know.

DonnKey · 2019-10-27T18:20:09Z

I don't know of anything that would work directly, and didn't find anything on the web that looked at all promising. Pre-transforming the strings to make them into something the XML comparison will accept seems possible (subject to actually trying it). Of course, straight string comparison would work but be less resilient to unintended changes (such as the indentation of <a> elements). If you want me to give pre-transformation a try, I can, if it would be something you'd accept if it worked.

DonnKey · 2019-10-29T17:46:25Z

The remaining failure is in comparing some Unicode characters that seem to be changed/escaped differently on Windows and (presumably) Linux. It works fine for me locally. I'll do something to "neutralize" the difference.

src/main/kotlin/com/jaredsburrows/license/internal/report/HtmlReport.kt

src/test/groovy/test/TestUtils.groovy

jaredsburrows · 2019-10-30T16:22:42Z

I want all the imports listed explicitly. You can update your IntelliJ settings to not use star imports.

On Wed, Oct 30, 2019, 9:19 AM DonnKey wrote, commenting on src/main/kotlin/com/jaredsburrows/license/internal/report/HtmlReport.kt:

 import com.jaredsburrows.license.internal.pom.Project
-import kotlinx.html.A
-import kotlinx.html.FlowOrInteractiveOrPhrasingContent
-import kotlinx.html.HtmlTagMarker
-import kotlinx.html.a
-import kotlinx.html.attributesMapOf
-import kotlinx.html.body
-import kotlinx.html.h3
-import kotlinx.html.head
-import kotlinx.html.html
-import kotlinx.html.li
-import kotlinx.html.pre
+import kotlinx.html.*

I can do that, but help me understand the value of that. I didn't make the change purely on my own: IntelliJ inserted the wildcard and then reported duplicate imports.

jaredsburrows · 2019-11-01T04:23:24Z

If we are updating the output, should we also update the README.md?

DonnKey · 2019-11-01T17:41:17Z

README is mostly done. I left the indentation problems with the HTML as is. Let me know (either way, please) if you'd prefer that to be corrected in the README for readability reasons.

The CHANGELOG doesn't seem to need anything from me.

jaredsburrows · 2019-11-03T22:35:41Z

@DonnKey It would be nice to keep the README.md up to date with the code. This way people know what they are getting when they use the plugin.

src/main/kotlin/com/jaredsburrows/license/internal/LicenseHelper.kt

src/main/kotlin/com/jaredsburrows/license/internal/report/HtmlReport.kt

jaredsburrows · 2019-11-10T06:37:37Z

src/test/groovy/test/TestUtils.groovy

+    // this exact case, so update as needed.
+    text = text.replaceAll('<br>', '<br/>')
+    text = text.replaceAll('<hr>', '<hr/>')
+    text = text.replaceAll('&copy;', '(c)')


Sorry, one last thing before we merge: why replace &copy; with (c)?

The XML comparison recognizes that &copy; is an "entity" (ampersand-semicolon), but doesn't have the copyright entity built-in, yielding a complaint about "undefined entity" and failing the test. The replacement doesn't really matter, but that seemed obvious. (The message the XML comparison generates is obvious only after you've figured out what it means!)
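To illustrate the explanation above: an XML parser with no DTD only knows the five predefined entities (amp, lt, gt, apos, quot), so &copy; is rejected. The sketch below, in plain Java with a hypothetical class name and assuming the JDK's built-in parser, shows that, and also that the numeric reference &#169; would be accepted if keeping the symbol in the comparison ever mattered. It is not the project's test code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CopyrightEntityCheck {
    private CopyrightEntityCheck() {}

    // Returns the text content of the root element, or null if the input is not well-formed XML.
    static String textOrNull(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            return builder.parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getTextContent();
        } catch (Exception e) {
            return null;
        }
    }

    // True when the parser accepts the input.
    static boolean parses(String xml) {
        return textOrNull(xml) != null;
    }
}
```

With this helper, the named entity fails to parse, while both the (c) replacement and the numeric reference go through; the numeric form decodes to the copyright character (U+00A9).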

jaredsburrows · 2019-11-10T06:38:00Z

src/test/groovy/com/jaredsburrows/license/LicensePluginJavaSpec.groovy

          <title>Open source licenses</title>
        </head>
        <body>
          <h3>Notice for packages:</h3>
          <ul>
-            <li>
-              <a href="#314129783">appcompat-v7</a>
+            <li><a href="#1934118923">appcompat-v7 (26.1.0)</a>


Sorry, one last thing before we merge, can we remove the parentheses around the version?

I did that explicitly so that if there wasn't a version (theoretically possible, but, granted, unlikely) there'd be an obvious placeholder that could be edited. I don't mind making the change, but there was a reason for doing it this way. If you still want a change, would you prefer no placeholder or some other placeholder?

jaredsburrows · 2019-11-16T05:54:53Z

Thanks!

DonnKey added 3 commits October 16, 2019 18:59

Handle copyright, multiple licenses in html output.

08afe75

Handle copyright, multiple licenses in html output.

70e2f1b

Merge remote-tracking branch 'origin/master'

e5a09d6

# Conflicts: # src/main/kotlin/com/jaredsburrows/license/internal/report/HtmlReport.kt

Merge branch 'master' into master

8dccb24

Handle copyright, multiple licenses in html output.

cdb4f63

Handle copyright, multiple licenses in html output.

de74bde