Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Implements the HTML output MVP #1127

Merged
merged 47 commits into from
May 17, 2022
Merged

Conversation

maximearmstrong
Copy link
Contributor

@maximearmstrong maximearmstrong commented Apr 28, 2022

Summary:

Fixes #1126: Implement HTML output for validation report (MVP)

This PR implements HtmlOutputUtil.java, script.js and style.css to generate a HTML version of the validation report.

Expected behavior:

The HTML report output, defaulted to report.html is located in the output folder. It is a standalone HTML file (all dependencies are included in it). It can be viewed in a local browser by opening the file.

Example:

Capture d’écran, le 2022-04-28 à 20 02 38

Please make sure these boxes are checked before submitting your pull request - thanks!

  • Run the unit tests with gradle test to make sure you didn't break anything
  • Format the title like "feat: [new feature short description]". Title must follow the Conventional Commit Specification(https://www.conventionalcommits.org/en/v1.0.0/).
  • Linked all relevant issues
  • Include screenshot(s) showing how this pull request works and fixes the issue(s)

@github-actions
Copy link
Contributor

This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: ./gradlew goJF.

@github-actions
Copy link
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

0 out of 1248 sources are corrupted.

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 92a5887 here (report will disappear after 90 days).

@github-actions
Copy link
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

0 out of 1248 sources are corrupted.

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 2ff6b65 here (report will disappear after 90 days).

@github-actions
Copy link
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

0 out of 1248 sources are corrupted.

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit a6da3b7 here (report will disappear after 90 days).

@github-actions
Copy link
Contributor

This contribution does not follow the conventions set by the Google Java style guide. Please run the following command line at the root of the project to fix formatting errors: ./gradlew goJF.

@maximearmstrong maximearmstrong marked this pull request as ready for review April 29, 2022 00:05
@maximearmstrong maximearmstrong added this to In Review in The Tech Dashboard (archived) via automation Apr 29, 2022
Copy link
Member

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @maximearmstrong! HTML file is definitely an improvement over just the JSON. Some thoughts below.

main/build.gradle Show resolved Hide resolved
docs/USAGE.md Outdated Show resolved Hide resolved
Scanner scanner = new Scanner(stream).useDelimiter("\\A");
content = scanner.hasNext() ? scanner.next() : "";
} catch (IOException e) {
System.out.println("The resource was not found at \"" + resourcePath + "\".");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All print statement in catch blocks that are errors should be output to System.err instead of System.out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think the main result of this would be the HTML file would be malformed/broken? If so you should probably add that to the output message, otherwise the user might not tie these two things together.

}

protected static String noticeColumnsBuilder(ArrayList<String> noticeFields) {
StringBuilder columns = new StringBuilder(" ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current HTML representation is definitely an improvement over digging through raw JSON, but I think the workflow for the HTML report could be improved further. Currently it's still very difficult to track down what an error actually means.

Right now, we show the user the list of errors/warnings:

image

The natural question is "what does this error mean?"

So I would click on NOTICES.md and RULES.md and search for the code like equal_shape_distance_diff_coordinates to find more info.

However, if I search RULES.md, I don't get any matches because the rules are listed by notice class name and not the error code.

If I search NOTICES.md, I do get a hit in a list of notice codes mapped to notice class names:

image

...but that's not very helpful by itself. If I click on the notice class name I'm taken to an anchor lower on the same NOTICES.md page that describes what the fields within that notice mean:

https://github.com/MobilityData/gtfs-validator/blob/master/docs/NOTICES.md#EqualShapeDistanceDiffCoordinatesNotice

image

But that still doesn't describe what the actual error means. If I click on the notice class name linked in this anchor THEN I get taken over to RULES.md for the description of the error that I was originally looking for:

https://github.com/MobilityData/gtfs-validator/blob/master/RULES.md#EqualShapeDistanceDiffCoordinatesNotice

image

IMHO in the HTML report itself we should link directly to these error descriptions so the user has a single click to find out what the error means. We did this in the GTFS Realtime Validator HTML so the user is directly linked to, for example:
https://github.com/MobilityData/gtfs-realtime-validator/blob/master/RULES.md#e050---timestamp-is-in-the-future

...when clicking on that error in the report.

Maybe a simple approach is just to add the error code to RULES.md for now, maybe as section anchors in the markdown if that's possible so you can link directly to them?

That's what we did with RT validator - see https://raw.githubusercontent.com/MobilityData/gtfs-realtime-validator/master/RULES.md and anchors like <a name="E003"/>.

Likely beyond the scope of this issue, but I think we should also move the field descriptions in NOTICES.md into RULES.md, possibly as collapsed text that you can expand if you want the field details.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@barbeau this is a great piece of feed-back.
We were thinking of having this PR merged as a first step and then implementing (at least) the two additional parts you flagged here (mentioned in the 3.1.0 release plan issue):

  1. A description of the notice (what is this problem?)
  2. Where exactly is the issue and how to fix it (How to fix my dataset?)

Before implementing these parts, we would like to (1) gather user feedback & do some user testing on different options for the next version of the HTML output, (2) discuss the best strategy to integrate the info from RULES.md and NOTICES.md into the results (knowing that we will want to translate).

Maybe a simple approach is just to add the error code to RULES.md for now, maybe as section anchors in the markdown if that's possible so you can link directly to them?

I really like this idea and this would be actionable for the next release, and I'm willing to PR this one. What about merging RULES.md and NOTICES.md in one document, using the collapsed text to make it more digestible?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about merging RULES.md and NOTICES.md in one document, using the collapsed text to make it more digestible?

👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Done in PR #1132, and the anchor links have been added in the HTML output

return columns.toString();
}

protected static String noticeRowsBuilder(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's just me, but I find it hard to read some of the data that's dumped into tables now, like this:

image

Creating human-readable sentences would obviously be an improvement - is that outside of the scope of this PR?

I was thinking similar to what we have on the GTFS Realtime Validator:

image

@bdferris-v2
Copy link
Collaborator

I wanted to link to the comment I made on the feature request issue about the technical approach here, since I accidentally posted it there instead of here on the PR. Curious if @maximearmstrong or @barbeau have opinions?

@barbeau
Copy link
Member

barbeau commented May 2, 2022

@bdferris-v2 ThymeLeaf (https://www.thymeleaf.org/) looks really nice - looks like there is an IntelliJ plugin too. I agree that having a framework that allows us to manage the HTML as HTML content (as opposed to strings within a Java class) would greatly reduce the amount of effort to maintain and enhance the HTML output. The internationalization feature also looks like it would be very useful.

@maximearmstrong
Copy link
Contributor Author

@bdferris-v2 I also agree. I will implement it.

@maximearmstrong
Copy link
Contributor Author

When I try to run this PR in IntelliJ with -i tma-tn-us.zip -o output -c us and tma-tn-us.zip (after merging master into this PR branch, to see what it will look like after merge), I'm getting this error at runtime and there is no output from the validator:

Oh, the problem is that we set the Validator-Version attribute in main/build.gradle to be in the manifest stored in the JAR file. We could use the Implementation-Version attribute instead, which solves the problem in IntelliJ, but the version displayed in the report would be v3.0-cli instead of v3.0. What do you think would be better?

After discussion with @bdferris-v2, we decided to use the Implementation-Version attribute for now, as it might be refactored and fixed along with issue #1139. Running the validator in IntelliJ now works, however the HTML output obtained this way does not display the Implementation-Version that we would get using the JAR file. Since the HTML output obtained with the JAR file is correct, I think we are ready to merge.

@github-actions
Copy link
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

0 out of 1248 sources are corrupted.

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 2c6019d here (report will disappear after 90 days).

@maximearmstrong maximearmstrong self-assigned this May 13, 2022
Copy link
Member

@barbeau barbeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @maximearmstrong, looks good! Two comments in-line. Note there are conflicts now that need to be resolved too.

README.md Outdated
@@ -33,6 +34,12 @@ You can run this validator using a GTFS dataset on your computer, or from a URL.

More detailed instructions with all the parameters that exists are available on our ["Usage"](/docs/USAGE.md) page.

### <a name="visualize-results"></a>Visualize the results
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit - You don't need the anchor here, you can just link to #visualize-the-results below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I thought it wouldn't work because we have two "Visualize the results" titles, but the first one is properly referenced. We don't need it in this case, but do you know a way to reference the second one using the same method?

@barbeau
Copy link
Member

barbeau commented May 13, 2022

@maximearmstrong Also, do you want to open a separate issue for human-readable sentences in the HTML report? That was discussed in this PR but my understanding is it's getting tabled for future work.

@bdferris-v2
Copy link
Collaborator

(Apologies in advance for the merge conflicts, @maximearmstrong!)

@maximearmstrong
Copy link
Contributor Author

No worries @bdferris-v2! I merged the master branch. I had to change the behavior around the argument table in report.html because we no longer have access to raw arguments - a ValidatorRunnerConfig object is passed to the ValidatorRunner, not an Arguments.

I removed the exportNoticeSchema and help arguments because we don't keep them in the config object - I don't think this is a problem since they weren't really relevant. The input and URL arguments are now both represented by config.gtfsSource, so I had to check whether the URI scheme was related to a file or a URL to display the correct one. The countryCode is not displayed when it is unknown because we don't have access to the raw data.

I think our original idea of displaying the raw arguments was probably better, but it looks like we would have to break the ValidatorRunner design to make this possible. Any thoughts? @barbeau @isabelle-dr Any concerns on your end?

@bdferris-v2
Copy link
Collaborator

I understand the motivation to display the raw arguments, but I think given the ValidationRunner refactor, there is value in just displaying the contents of the config object, as you have done, as that ultimately controls the execution. As such, I wouldn't be opposed if you got rid of the idea of command-line arguments in the html report entirely and just described them as the validation configuration. That'll probably scale better to other execution environments for the validator in the future.

If you really wanted to record the raw cli arguments for debugging purposes, I would add an Optional<List> rawArguments() field to the config where you can optionally pass those in if it makes sense for the execution environment.

@maximearmstrong
Copy link
Contributor Author

maximearmstrong commented May 16, 2022

Thank you @bdferris-v2, that's helpful. I thought about changing the argument table to a configuration table and changing the "Argument Description" and "Argument Short Name" columns to "Related Argument" and "Related Argument Short Name". @isabelle-dr Do you think it would be relevant for the end user?

@isabelle-dr
Copy link
Contributor

For the arguments, my only concern is consistency between (1) the table in USAGE.md and (2) the way the params are written in the "Advanced" section in PR #1146.
If a user ticks a box that says "number of threads for execution" on the GUI and then sees "threads" and "-t" in the HTML report, this is slightly confusing (although pretty straight forward to guess).

I think from a user perspective, the best would be to have the same short descriptions than in the GUI, and remove the argument short name. There is a link to the USAGE.md file if users want to map this with the exact params used in the cli. The short descriptions could be:

  • Input directory
  • Output directory
  • Number of threads used to run the validator
  • Country code (for phone validation)
  • Validation report name
  • System errors report name
  • HTML report name

@maximearmstrong
Copy link
Contributor Author

@isabelle-dr I agree. I made the changes. I used "GTFS input (ZIP file, directory or URL)" instead of "Input directory". This would be the final version:
Capture d’écran, le 2022-05-16 à 18 26 15

@github-actions
Copy link
Contributor

Thank you for this contribution! 🍰✨🦄

Information about source corruption

0 out of 1248 sources are corrupted.

Acceptance test details

The changes in this pull request did not trigger any new errors on known GTFS datasets from the MobilityDatabase.
Download the full acceptance test report for commit 7951a8a here (report will disappear after 90 days).

@isabelle-dr
Copy link
Contributor

This looks great @maximearmstrong. Thanks for adding the sentence in the value for country code.
One last tiny change 😃: can you replace Argument description by Parameter description? We use the term Parameter above and in the documentation.
After this, ready to merge from my perspective, I'm approving now. great work!

The Tech Dashboard (archived) automation moved this from Pending approval to Approved May 17, 2022
@maximearmstrong maximearmstrong merged commit 57813dd into master May 17, 2022
@maximearmstrong maximearmstrong deleted the issue/1126/html-output-mvp branch May 17, 2022 14:28
The Tech Dashboard (archived) automation moved this from Approved to Done May 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Implement HTML output for validation report (MVP)
4 participants