-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor GraphBuilderAnnotations #2871
Refactor GraphBuilderAnnotations #2871
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does seem cleaner to push the logging step down into the addBuilderAnnotation method, centralizing logging and other annotation cataloging functionality there. This was how it originally worked. The main reason for the strange idiom we were using (log.warning(graph.addBuilderAnnotation(new SomeKindOfAnnotation(param1, param2)));
) is to ensure the annotation messages were logged with the file names and line numbers where they were created, rather than all in one place in a logger. This is described in the Javadoc of class GraphBuilderAnnotation. This doesn't seem as important to me now as it did at the time that Javadoc was written.
The annotation's log message often alerts the user that something is wrong with the build process, and we need to jump to that line in the code to investigate or fix the problem. It also in theory allows users to fine tune which messages appear in the logs based on the characteristics of their local data. Realistically though we can easily jump to places where the annotation type constructor is called. And a good annotation summary and reporting system is a better solution than fine tuning log configurations.
The new GRAPH_BUILDER_ANNOTATION_LOG
is however the only place where we use a logger name that is not a class name. We might just want to put this method on a smaller class like GraphBuilderAnnotation (or a new class like GraphBuilderAnnotationStore, see below) and use its class name. The log messages would then uniformly represent the class and line where the logging actually occurs.
Clearly graph builder annotation logging is too noisy, and when there are annotations there are typically so many of them that they need to be presented and browsed in a summary view (like HTML/JSON reports). Many times we have decided to hide certain annotations because they are too noisy. So I agree the correct approach is probably logging everything or nothing (i.e. using a single logger instance at a single log level) while ensuring we provide nice clear summaries. However, some annotations represent serious errors that make the resulting graph essentially unusable, while others represent minor annoyances that can be ignored. When a graph is built with serious errors, that should be clearly and loudly stated in the annotation summary at the end of the build.
To this end, we may want to make a set or map of annotation types that are considered severe (e.g. Set<? extends GraphBuilderAnnotation> GraphBuilderAnnotation.severeAnnotationTypes
), so we can highlight them and log at error level in the post-build summary.
I also appreciate that you've made many more log messages into annotations. This should help generate comprehensive reports.
The main thing I don't understand in this PR is the AddBuilderAnnotation abstraction. If we are allowing for multiple implementations, it is not clear to me how those alternative implementations would be injected into each graph builder module. The empty lambda function is repeatedly supplied as an implementation for this interface in tests. Ideally there would be a named class implementing trivial logging to the console. Completely skipping the logging process is likely to make it difficult to debug certain future problems in tests.
The naming of this interface would benefit from more grammatical parallelism. OTP generally adopts a "world of nouns" OOP paradigm, with interfaces naming a class of things (e.g. BuilderAnnotationStore).
The interface as it currently stands seems to be serving as a way to access multiple identical implementations of GraphBuilderAnnotation storage (on Graph and OSMDatabase). It would make more sense to me to encapsulate that storage in a concrete class (e.g. BuilderAnnotationStore), state that there is a single BuilderAnnotationStore instance per graph build operation, and pass that single instance into each graph builder via a new method on GraphBuilder. Or have each builder create its own BuilderAnnotationStore, and merge them all into one in the GraphBuilder.
On further examination I see that the AddBuilderAnnotation
interface already exists on dev-2.x
. So this is a pre-existing situation, but if there is agreement that it should be refactored, I think this PR may be a good place to do so.
I have now added a new BuilderAnnotationStore class and moved all annotation logic from the Graph to this class. It is created in the GraphBuilder and passed into each GraphBuilderModule. I have changed all the GraphBuilderModules and tests to account for this. In the BuilderAnnotationStore constructor you have a parameter where you choose if annotations should be saved or just logged. I think this is a good starting point and should be enough for this pull request. Now we have a single class which contains all the GraphBuilderAnnotations and we can start implementing different severity levels and ways to display the annotations. |
To me the build report is to report on data issues, not errors in the code - at Entur the report is read by the the person responsible for following up on OSM and transit data quality. Based on this assumtion, I make sence to me that the annotations should contain enough information to follow up the problem, and should be easy to turn off in the log. There are of cause cases which might be errors in the code, but I think it is secondary - and as @abyrd say, I is not that difficult to find the line in the code that instantiate the annotation. I suggested a logger name different from the class name, because this is in my eyes a x-cutting concern - the logger is not logging problems with the GraphBuilderAnnotationStore, but with the data. By using a different name and documenting why we clearly indicate that this is not the same. The GraphBuilderAnnotationStore could have 2 loggers, one for build annotations and one for local issues. If people think this is unconventional or strange I am happy to use the standard Java class name approach - I just wanted to clarify why we choose a different logger name. If we want to support severity we can add that to the Annotation with a default method to the interface, and override it when suitable. A nice summary should be logged at the end of the build prossess. We can also add a threshold which will abort the process if the data do not pass. This should probably be configurable. At the moment this is not a high priority for Entur, so we wont be doing it for now unless the OTP community insist. When it comes to naming |
src/main/java/org/opentripplanner/netex/loader/mapping/NetexMapper.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opentripplanner/graph_builder/BuilderAnnotationStore.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opentripplanner/graph_builder/annotation/GraphBuilderAnnotation.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opentripplanner/graph_builder/annotation/ElevationPropagationLimit.java
Outdated
Show resolved
Hide resolved
+ visibilityPoints.size() + " > " + MAX_AREA_NODES); | ||
annotationStore.add( | ||
new AreaTooComplicated( | ||
group.getSomeOSMObject().getId(), visibilityPoints.size())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Info - Format like this:
xxxxxx(
xxxxxx(
xxx(xx)
)
)
The "Egyptian end parentheses" should "close" the expression on a new line. We will reformat the entire codebase later, so not fix is needed.
src/main/java/org/opentripplanner/routing/impl/StreetVertexIndexServiceImpl.java
Outdated
Show resolved
Hide resolved
all doc from "graph builder annotation" to "data import issue". - Added a OTP2-MigrationGuide.md
plural form is used for the package name, because it is the collection of all issue types - and only issue types.
5d11474
to
779c9ec
Compare
I noticed the the log level is set to INFO for both the summary and the logging of each issue, this is not ideal since the "default" should be to not log issues, but log the summary. I will change the level for issues to DEBUG. |
…name in previous commits.
- Log each issue with DEBUG level and the summary with INFO level. - Pretty format the summary with alphabetic order, std. numberformat, and fixed width columns.
779c9ec
to
0030aee
Compare
To be completed by pull request submitter:
To be completed by @opentripplanner/plc:
This is a OTP 2.0 port of @vpaturet 's refactoring of the GraphBuilderAnnotations that was done in Entur's OTP branch. It allows the annotations to be logged in a central place and with a common log level (currently set to info). It also converts some of the log messages during graph building to annotations.