-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move from com.google.refine to org.openrefine, for #226 #335
base: 4.0-point-1
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a cursory review and transferred over some of the comments from my PR. My biggest questions/concerns:
- mapping of legacy package/class names (it seemed broken to me)
- loading of legacy projects which have hardcoded classnames serialized
- being careful not to change legacy test data with over aggressive global replace
import org.openrefine.extension.database.model.DatabaseColumn; | ||
import org.openrefine.extension.database.model.DatabaseQueryInfo; | ||
import org.openrefine.extension.database.model.DatabaseRow; | ||
import org.slf4j.Logger; | ||
import org.slf4j.LoggerFactory; | ||
|
||
import com.google.refine.extension.database.model.DatabaseColumn; | ||
import com.google.refine.extension.database.model.DatabaseQueryInfo; | ||
import com.google.refine.extension.database.model.DatabaseRow; | ||
import com.google.refine.importers.TabularImportingParserBase.TableDataReader; | ||
import com.google.refine.importing.ImportingJob; | ||
import org.openrefine.importers.TabularImportingParserBase.TableDataReader; | ||
import org.openrefine.importing.ImportingJob; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This import ordering seems rather ad hoc. I'd suggest, as I have in the past, agreeing on a target import ordering which can then be used as part of this refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd be happy enforcing a certain import order, as I mentioned here: https://forum.openrefine.org/t/java-import-ordering/846
Do you know if our current linter can be configured for that, or if we should introduce a new one?
That being said, it's not clear to me at which point in the git history this linting should be done.
@@ -5,7 +5,7 @@ importPackage(org.openrefine.wikidata.commands); | |||
* Function invoked to initialize the extension. | |||
*/ | |||
function init() { | |||
var RefineServlet = Packages.com.google.refine.RefineServlet; | |||
var RefineServlet = Packages.org.openrefine.RefineServlet; | |||
RefineServlet.registerClassMapping( | |||
"org.openrefine.wikidata.operations.SaveWikibaseSchemaOperation$WikibaseSchemaChange", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this identity mapping? If it actually does something, an explanatory comment would be helpful, otherwise it should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly it's a workaround for a rather obscure class loading problem. I'd be happy to add a comment at this stage already but given that Change classes are dropped later on, it's a problem that gets solved at this point.
@@ -48,14 +48,16 @@ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT | |||
import javax.servlet.http.HttpServletRequest; | |||
import javax.servlet.http.HttpServletResponse; | |||
|
|||
import org.openrefine.ProjectManager; | |||
import org.openrefine.RefineModel; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RefineModel seems to be from a separate refactoring. I can't find where the package namespace class remapping is done any more, but I'm guessing perhaps it moved there?
It didn't appear to work in my experimentation and I also discovered that there are hardcoded classnames (!!) embedded in the serialized workspace data which caused workspace corruption for me, so this is an area that I'd want to inspect very carefully to make sure we have a robust solution in place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, RefineModel is introduced in the previous step (#334) to separate out some functionality (class loading, version information…) from the servlet. The idea being that OpenRefine's operations (and other core logic, such as facets, clustering, importers, exporters…) are exposed in an independent Maven module (or-workflow
) which can be used outside of the context of a Butterfly module (which does the work of defining the web API to interact with this functionality).
registerClassMapping("com.metaweb.refine.*", "org.openrefine.*"); | ||
registerClassMapping("com.google.gridworks.*", "org.openrefine.*"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can probably be removed.
registerClassMapping("com.google.gridworks.*", "com.google.refine.*"); | ||
registerClassMapping("com.metaweb.refine.*", "org.openrefine.*"); | ||
registerClassMapping("com.google.gridworks.*", "org.openrefine.*"); | ||
registerClassMapping("com.google.refine.*", "org.openrefine.*"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work? In my version it couldn't read serialized JSON data with embedded com.google.refine
classnames embedded in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked into the issue closely given that I gave up on backwards compatibility with previous workspaces (see #335 (comment)). It's very well possible that somehow this class mapping mechanism has been broken by something else in the past and we just did not notice because it's not actually relied on unless someone is upgrading from a 10-year old version.
I could look into fixing it but I am not sure it's worth the effort, I'd rather make sure we don't use class names in workspace serialization in the new architecture (which should not be hard to reach).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a commit which fixes this class name mapping functionality: ef5f47b
|
||
public class TopListTests { | ||
@Test | ||
public void serializeTopList() throws JsonParseException, JsonMappingException, IOException { | ||
String json = "{" | ||
+ "\"class\":\"com.google.refine.preference.TopList\"," | ||
+ "\"class\":\"org.openrefine.preference.TopList\"," |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be left with the original name to test loading legacy data. For newly written projects we need to make sure they don't include hardcoded classnames.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated tests so that they point to an ostensibly old class name (com.google.gridworks
) to make it clear that we're testing for backwards compatibility in ef5f47b
Thanks for the review! Indeed, I did not attempt to maintain compatibility with previous workspaces at all. Although that could indeed be done at this stage, I did not judge it being worth the effort given that the 4.0 version is meant to use a different workspace directory than the previous versions. That is because project serialization and architecture changes are too big to bridge in a reliable way. I did implement later on an importer to read OpenRefine 3.x projects in 4.0, but it does so by discarding the history of the project (only importing its current state). By using different workspaces we are making sure that old projects don't get corrupted by the new version and can still be opened with the old one. I agree that it's a bit unfortunate that class names are stored in the project serialization. The new architecture addresses this problem later on, when I removed the notion of Change entirely (whose functionality is served by Operation classes). But class names are still used in the preference store, that's something that could be worth cleaning up too. And one could want to import the preference store from the old workspace into the new one. |
This is the second PR in an attempt to break my work on 4.0 into reviewable chunks.
This PR moves from the
com.google.refine
to theorg.openrefine
namespace, for OpenRefine#2269.