Refactor Importer APIs to support filename and archive name cleanly #2963
Comments
OMG, this is such a mess! Not only did the
I guess we need a miracle... I think he goes by the name of Tom? :-)
Yes, the architecture of our importers is not ideal at the moment. Also, we rely on the CSV importer in many tests to create a test project, which we should not do (that is my fault). I am working on this architecture in the Spark migration and I have removed the dependency on the CSV importer in tests, so perhaps it is not worth duplicating this effort in 3.x?
These issues are orthogonal to the use of the CSV importer to create test cases. After ripping out all this code, all the tests pass again, so I guess the metadata functionality has zero test coverage. :( As part of the investigation, I also realized that the recent addition of optionally saving the archive filename broke the public importer API with an incompatible change, which is uncool, so that will need to be redone.
It looks like this mess started with #1055, which was merged without review, and has snowballed from there. It's too late to restore compatibility for pre-2015 era importers, but we can at least a) not break current importers for 3.5 and b) try to create something which can be stabilized going forward.
Fixes OpenRefine#2963 - restore binary compatibility to the API - hoist the handling of both fileSource and archiveFileName from TabularImportingParserBase and TreeImportingParserBase to ImportingParserBase so that there's only one copy. These 3 classes are all part of the internal implementation, so there should be no compatibility issue.
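The hoisting described in this commit message can be pictured with a minimal sketch. All class and method names below are hypothetical simplifications, not OpenRefine's actual classes or signatures; the point is only that the fileSource and archiveFileName handling lives once in the shared base class and both the tabular and tree variants call it instead of carrying their own copy.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the shared base class; not the real ImportingParserBase.
abstract class SketchImportingParserBase {
    // The single shared copy of the source-name handling.
    protected final Map<String, String> sourceMetadata(String fileSource, String archiveFileName) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("fileSource", fileSource);
        if (archiveFileName != null) {
            // Only present when the file came out of an archive.
            metadata.put("archiveFileName", archiveFileName);
        }
        return metadata;
    }

    abstract Map<String, String> parse(String fileSource, String archiveFileName);
}

// Hypothetical tabular variant: adds its own detail, reuses the shared handling.
class SketchTabularParser extends SketchImportingParserBase {
    @Override
    Map<String, String> parse(String fileSource, String archiveFileName) {
        Map<String, String> result = sourceMetadata(fileSource, archiveFileName);
        result.put("shape", "tabular");
        return result;
    }
}

// Hypothetical tree variant: same shared handling, different detail.
class SketchTreeParser extends SketchImportingParserBase {
    @Override
    Map<String, String> parse(String fileSource, String archiveFileName) {
        Map<String, String> result = sourceMetadata(fileSource, archiveFileName);
        result.put("shape", "tree");
        return result;
    }
}

class HoistingSketch {
    public static void main(String[] args) {
        System.out.println(new SketchTabularParser().parse("data.csv", "bundle.zip"));
        System.out.println(new SketchTreeParser().parse("data.xml", null));
    }
}
```

Because the helper is declared on the base class only, a later change to how source names are recorded touches one place rather than two parallel copies.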
* Make sure data directory is a directory, not a file
* Add a test for zip archive import. Also tests the saving of the archive file name and source filename
* Add TODOs - no functional changes
* Cosmetic cleanups
* Revert importer API changes for archive file name parameter. Fixes #2963 - restore binary compatibility to the API - hoist the handling of both fileSource and archiveFileName from TabularImportingParserBase and TreeImportingParserBase to ImportingParserBase so that there's only one copy. These 3 classes are all part of the internal implementation, so there should be no compatibility issue.
* Revert weird flow of control for import options metadata. This reverts the very convoluted control flow that was introduced when adding the input options to the project metadata. Instead the metadata is all handled in the importer framework rather than having to change APIs or have individual importers worry about it. The feature never had test coverage, so that is still to be added.
* Add test for import options in project metadata & fix bug. Fixes bug where same options object was being reused and overwritten, so all copies in the list ended up the same.
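The last bug in that list (one options object reused and overwritten, so every entry in the list ends up identical) is a classic aliasing problem. The sketch below reproduces it in isolation, assuming the options are held in a plain `Map`; the class, method names, and the `fileSource` key usage are illustrative and not taken from the OpenRefine code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OptionsAliasingSketch {
    // Buggy variant: the same mutable map is added for every file,
    // so each list entry is a reference to one object.
    static List<Map<String, Object>> recordShared(List<String> fileNames) {
        List<Map<String, Object>> history = new ArrayList<>();
        Map<String, Object> options = new HashMap<>();
        for (String name : fileNames) {
            options.put("fileSource", name); // overwrites the value seen by earlier entries
            history.add(options);
        }
        return history;
    }

    // Fixed variant: each entry gets its own copy of the options at that moment.
    static List<Map<String, Object>> recordCopied(List<String> fileNames) {
        List<Map<String, Object>> history = new ArrayList<>();
        Map<String, Object> options = new HashMap<>();
        for (String name : fileNames) {
            options.put("fileSource", name);
            history.add(new HashMap<>(options)); // shallow copy is enough for flat options
        }
        return history;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.csv", "b.csv");
        System.out.println(recordShared(names)); // both entries report the last file
        System.out.println(recordCopied(names)); // each entry keeps its own file
    }
}
```

A shallow copy only protects top-level keys; if the options contained nested mutable values, a deep copy would be needed instead.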
In 818e139 the NotImplementedException throwing behavior of the methods in the base class got removed so that they're now silent no-ops if someone calls them for a subclass where they are not implemented. This is unsafe and misleading behavior that just cost me a good chunk of time to track down. It looks like TreeImportingParserBase suffers from the same issue: 818e139#diff-fc818390bcd00535a45c7ef1f5dfc8d2L177-R178
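To illustrate why the throwing defaults matter, here is a minimal sketch of a base class whose unimplemented methods fail fast. The class and method names are hypothetical, and it uses the JDK's `UnsupportedOperationException` in place of `NotImplementedException` so that the example has no external dependencies.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical, simplified stand-in for an importer base class (not OpenRefine's real API).
abstract class SketchParserBase {
    // Defaults fail loudly so an unsupported call cannot pass as a successful import.
    public void parseOneFile(Reader reader) {
        throw new UnsupportedOperationException(
                getClass().getSimpleName() + " does not parse from a Reader");
    }

    public void parseOneFile(InputStream stream) {
        throw new UnsupportedOperationException(
                getClass().getSimpleName() + " does not parse from an InputStream");
    }
}

// A subclass that only supports character input overrides just that variant.
class SketchTextParser extends SketchParserBase {
    int filesParsed = 0;

    @Override
    public void parseOneFile(Reader reader) {
        filesParsed++; // real parsing omitted in this sketch
    }
}

class FailFastSketch {
    public static void main(String[] args) {
        SketchTextParser parser = new SketchTextParser();
        parser.parseOneFile(new StringReader("a,b")); // supported: runs the override
        try {
            parser.parseOneFile(new ByteArrayInputStream(new byte[0])); // unsupported variant
        } catch (UnsupportedOperationException e) {
            System.out.println("Failed fast: " + e.getMessage());
        }
    }
}
```

With empty method bodies in the base class, the second call would return normally and the caller would have no signal that nothing was imported.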