Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor GraphBuilderAnnotations #2871

Merged
merged 14 commits into from
Nov 18, 2019

Conversation

gmellemstrand
Copy link
Contributor

To be completed by pull request submitter:

  • issue: #2869
  • roadmap: This is a refactor and does not add new functionality.
  • tests: Existing tests still pass
  • formatting: Have you followed the suggested code style?
  • documentation: No new documentation needed
  • changelog: No entry added to changelog

To be completed by @opentripplanner/plc:

  • reviews and approvals by 2 members, ideally from different organizations
  • after merging: update the relevant card on the roadmap

This is a OTP 2.0 port of @vpaturet 's refactoring of the GraphBuilderAnnotations that was done in Entur's OTP branch. It allows the annotations to be logged in a central place and with a common log level (currently set to info). It also converts some of the log messages during graph building to annotations.

@gmellemstrand gmellemstrand requested a review from a team as a code owner November 7, 2019 15:10
Copy link
Member

@abyrd abyrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does seem cleaner to push the logging step down into the addBuilderAnnotation method, centralizing logging and other annotation cataloging functionality there. This was how it originally worked. The main reason for the strange idiom we were using (log.warning(graph.addBuilderAnnotation(new SomeKindOfAnnotation(param1, param2)));) is to ensure the annotation messages were logged with the file names and line numbers where they were created, rather than all in one place in a logger. This is described in the Javadoc of class GraphBuilderAnnotation. This doesn't seem as important to me now as it did at the time that Javadoc was written.

The annotation's log message often alerts the user that something is wrong with the build process, and we need to jump to that line in the code to investigate or fix the problem. It also in theory allows users to fine tune which messages appear in the logs based on the characteristics of their local data. Realistically though we can easily jump to places where the annotation type constructor is called. And a good annotation summary and reporting system is a better solution than fine tuning log configurations.

The new GRAPH_BUILDER_ANNOTATION_LOG is however the only place where we use a logger name that is not a class name. We might just want to put this method on a smaller class like GraphBuilderAnnotation (or a new class like GraphBuilderAnnotationStore, see below) and use its class name. The log messages would then uniformly represent the class and line where the logging actually occurs.

Clearly graph builder annotation logging is too noisy, and when there are annotations there are typically so many of them that they need to be presented and browsed in a summary view (like HTML/JSON reports). Many times we have decided to hide certain annotations because they are too noisy. So I agree the correct approach is probably logging everything or nothing (i.e. using a single logger instance at a single log level) while ensuring we provide nice clear summaries. However, some annotations represent serious errors that make the resulting graph essentially unusable, while others represent minor annoyances that can be ignored. When a graph is built with serious errors, that should be clearly and loudly stated in the annotation summary at the end of the build.

To this end, we may want to make a set or map of annotation types that are considered severe (e.g. Set<? extends GraphBuilderAnnotation> GraphBuilderAnnotation.severeAnnotationTypes), so we can highlight them and log at error level in the post-build summary.

I also appreciate that you've made many more log messages into annotations. This should help generate comprehensive reports.

The main thing I don't understand in this PR is the AddBuilderAnnotation abstraction. If we are allowing for multiple implementations, it is not clear to me how those alternative implementations would be injected into each graph builder module. The empty lambda function is repeatedly supplied as an implementation for this interface in tests. Ideally there would be a named class implementing trivial logging to the console. Completely skipping the logging process is likely to make it difficult to debug certain future problems in tests.

The naming of this interface would benefit from more grammatical parallelism. OTP generally adopts a "world of nouns" OOP paradigm, with interfaces naming a class of things (e.g. BuilderAnnotationStore).

The interface as it currently stands seems to be serving as a way to access multiple identical implementations of GraphBuilderAnnotation storage (on Graph and OSMDatabase). It would make more sense to me to encapsulate that storage in a concrete class (e.g. BuilderAnnotationStore), state that there is a single BuilderAnnotationStore instance per graph build operation, and pass that single instance into each graph builder via a new method on GraphBuilder. Or have each builder create its own BuilderAnnotationStore, and merge them all into one in the GraphBuilder.

On further examination I see that the AddBuilderAnnotation interface already exists on dev-2.x. So this is a pre-existing situation, but if there is agreement that it should be refactored, I think this PR may be a good place to do so.

@gmellemstrand
Copy link
Contributor Author

I have now added a new BuilderAnnotationStore class and moved all annotation logic from the Graph to this class. It is created in the GraphBuilder and passed into each GraphBuilderModule. I have changed all the GraphBuilderModules and tests to account for this.

In the BuilderAnnotationStore constructor you have a parameter where you choose if annotations should be saved or just logged.

I think this is a good starting point and should be enough for this pull request. Now we have a single class which contains all the GraphBuilderAnnotations and we can start implementing different severity levels and ways to display the annotations.

@t2gran
Copy link
Member

t2gran commented Nov 14, 2019

To me the build report is to report on data issues, not errors in the code - at Entur the report is read by the the person responsible for following up on OSM and transit data quality. Based on this assumtion, I make sence to me that the annotations should contain enough information to follow up the problem, and should be easy to turn off in the log. There are of cause cases which might be errors in the code, but I think it is secondary - and as @abyrd say, I is not that difficult to find the line in the code that instantiate the annotation.

I suggested a logger name different from the class name, because this is in my eyes a x-cutting concern - the logger is not logging problems with the GraphBuilderAnnotationStore, but with the data. By using a different name and documenting why we clearly indicate that this is not the same. The GraphBuilderAnnotationStore could have 2 loggers, one for build annotations and one for local issues. If people think this is unconventional or strange I am happy to use the standard Java class name approach - I just wanted to clarify why we choose a different logger name.

If we want to support severity we can add that to the Annotation with a default method to the interface, and override it when suitable. A nice summary should be logged at the end of the build prossess. We can also add a threshold which will abort the process if the data do not pass. This should probably be configurable. At the moment this is not a high priority for Entur, so we wont be doing it for now unless the OTP community insist.

When it comes to naming AddBuilderAnnotation, I also find the name strange. The verb add is odd, and most developers I have talked to about this do not understand what a "BuildAnnotation" is - including my self. Something like DataImportIssue would be easier to understand to me. I am not a native english speaker, so I am on thin ice here.

+ visibilityPoints.size() + " > " + MAX_AREA_NODES);
annotationStore.add(
new AreaTooComplicated(
group.getSomeOSMObject().getId(), visibilityPoints.size()));
Copy link
Member

@t2gran t2gran Nov 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Info - Format like this:

xxxxxx(
    xxxxxx(
        xxx(xx)
    )
)        

The "Egyptian end parentheses" should "close" the expression on a new line. We will reformat the entire codebase later, so not fix is needed.

all doc from "graph builder annotation" to "data import issue".
- Added a OTP2-MigrationGuide.md
plural form is used for the package name, because it is the collection
of all issue types - and only issue types.
@t2gran t2gran force-pushed the logger_refactor branch 2 times, most recently from 5d11474 to 779c9ec Compare November 15, 2019 15:19
@t2gran t2gran mentioned this pull request Nov 15, 2019
@t2gran
Copy link
Member

t2gran commented Nov 15, 2019

I noticed the the log level is set to INFO for both the summary and the logging of each issue, this is not ideal since the "default" should be to not log issues, but log the summary. I will change the level for issues to DEBUG.

  - Log each issue with DEBUG level and the summary with INFO level.
  - Pretty format the summary with alphabetic order, std. numberformat,
    and fixed width columns.
t2gran
t2gran previously approved these changes Nov 15, 2019
@gmellemstrand gmellemstrand merged commit d4040ee into opentripplanner:dev-2.x Nov 18, 2019
@t2gran t2gran deleted the logger_refactor branch November 18, 2019 13:56
@abyrd abyrd added this to the 2.0 milestone Oct 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants