VIVO 3838: Upgrading to 4 #379

Draft
wants to merge 8 commits into
base: jena4-upgrade

Conversation

kaladay
Contributor

@kaladay kaladay commented Mar 9, 2023

This is a draft of incomplete fixes, provided as a PR so that another developer can more easily pick up where I am leaving off.

This is not intended to be merged as-is and will eventually be closed rather than merged into the target branch. Consider using the branch represented by this Draft PR as a starting point.

The state of these changes:

  • This is based off of VIVO 3828: Remove SDB in preparation to moving to new Jena. #374 in expectation of that code being merged into the jena4-upgrade branch (This will be easier to review once that PR is merged).
  • Jena is upgraded to 4.7.0.
  • This uses DatasetFactory.createGeneral() as a replacement for the once-deprecated and now-removed DatasetFactory.createMem().
  • DatasetFactory.createTxnMem() cannot be used without further design changes, as several tests fail and there appears to be a performance loss.
  • Several methods were removed and a few were added as necessary (some being empty stubs).
  • Some lost functionality had to be re-created in a small number of cases.
  • There are incidental code refactor changes.
  • Currently only a handful of tests fail; all of the remaining tests pass.
  • This has been confirmed to run, populate, and work (without extensive testing).
  • Given the test failures, there is an expectation of problems that are not observed with minimal startup and click testing.
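
As a rough illustration of the two factory calls discussed in the list above, the following sketch assumes Jena 4.x is on the classpath. The class and method names (`DatasetFactorySketch`, `addWithoutTransaction`, `addInsideTransaction`) are hypothetical and are not part of this PR; the sketch only contrasts the calling patterns and does not reproduce the test failures or the performance loss.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDF;

// Hypothetical helper class for illustration only; not code from this PR.
final class DatasetFactorySketch {

    // Adds one triple without opening a transaction, the way code written
    // against the old in-memory dataset commonly does.
    static long addWithoutTransaction(Dataset dataset) {
        Model model = dataset.getDefaultModel();
        Resource a = model.createResource("http://test.vivo/a");
        model.add(a, RDF.type, OWL.Thing);
        return model.size();
    }

    // Adds the same triple inside an explicit write transaction, which is
    // the style a transactional dataset is designed for.
    static long addInsideTransaction(Dataset dataset) {
        dataset.begin(ReadWrite.WRITE);
        try {
            Model model = dataset.getDefaultModel();
            Resource a = model.createResource("http://test.vivo/a");
            model.add(a, RDF.type, OWL.Thing);
            long size = model.size();
            dataset.commit();
            return size;
        } finally {
            // end() after commit() just closes the transaction;
            // after a failure it aborts the pending write.
            dataset.end();
        }
    }

    public static void main(String[] args) {
        System.out.println(addWithoutTransaction(DatasetFactory.createGeneral()));
        System.out.println(addInsideTransaction(DatasetFactory.createTxnMem()));
    }
}
```

Moving to `createTxnMem()` would presumably mean converting call sites from the first pattern to the second, which is the kind of design change the list above refers to; that reading is an assumption rather than something this PR confirms.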

Some of the incidental refactoring happened while cross-referencing the current implementation against the upstream class being overridden. I found it easier to make the files identical and scroll through each in parallel until everything being overridden matched. I added comments that break up such files for easier comparison.

The following are some observations regarding one of the failing tests.

The recomputeABox1 test from SimpleReasonerSameAsTest is failing.
The problem appears to be associated with ABoxRecomputer.

I observed a problem where after the recompute is performed the data in the inference becomes incorrect.

The data is looking like:

http://test.vivo/b @owl:sameAs http://test.vivo/b;

When instead the data should look like:

http://test.vivo/b @owl:sameAs http://test.vivo/a;

The entire inference before recomputing looks like:

{
  http://test.vivo/a @rdf:type owl:Thing;
  http://test.vivo/a @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/d @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/c @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/b @owl:sameAs http://test.vivo/b;
  http://test.vivo/b @rdf:type owl:Thing;
  http://test.vivo/b @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing
}

After recomputing, the data looks like:

{
  http://test.vivo/a @rdf:type owl:Thing;
  http://test.vivo/a @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/d @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/c @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing;
  http://test.vivo/b @owl:sameAs http://test.vivo/b;
  http://test.vivo/b @rdf:type owl:Thing;
  http://test.vivo/b @http://vitro.mannlib.cornell.edu/ns/vitro/0.7#mostSpecificType owl:Thing
}
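
The symptom above, an `owl:sameAs` statement that points back at its own subject, can be checked for mechanically. The following is a minimal sketch assuming Jena 4.x on the classpath; `SameAsCheckSketch` and `reflexiveSameAs` are hypothetical names and do not come from `SimpleReasonerSameAsTest` or `ABoxRecomputer`.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.RDFNode;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.Statement;
import org.apache.jena.rdf.model.StmtIterator;
import org.apache.jena.vocabulary.OWL;

// Hypothetical diagnostic helper for illustration only; not code from this PR.
final class SameAsCheckSketch {

    // Collects every owl:sameAs statement whose object equals its subject.
    static List<Statement> reflexiveSameAs(Model inferences) {
        List<Statement> result = new ArrayList<>();
        StmtIterator it = inferences.listStatements(null, OWL.sameAs, (RDFNode) null);
        try {
            while (it.hasNext()) {
                Statement s = it.nextStatement();
                if (s.getObject().equals(s.getSubject())) {
                    result.add(s);
                }
            }
        } finally {
            it.close();
        }
        return result;
    }

    public static void main(String[] args) {
        Model inferences = ModelFactory.createDefaultModel();
        Resource b = inferences.createResource("http://test.vivo/b");
        // Mirrors the incorrect statement reported above.
        inferences.add(b, OWL.sameAs, b);
        System.out.println(reflexiveSameAs(inferences));
    }
}
```

Running a check like this on the inference model before and after the recompute might help narrow down where the incorrect statement is introduced.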

Consider reviewing the individual commits as they may have further details regarding that given change set.

kaladay and others added 8 commits March 6, 2023 14:37
The Jena version being switched to (Jena 4) has removed support for SDB.

The current forms of `DatasetWrapperFactory` and `StaticDatasetFactory` should no longer be needed.

The SDB related code has been stripped out.
Many of the classes with "SDB" in the name that actually provide more than just SDB have been re-created.
These re-created classes are generally child classes of a similarly named Jena class.
They have "DB" in their name rather than "SDB".

The `DatasetFactory.createMem()` is deprecated and may not be available in Jena 4.
Attempts to replace this with `createTxnMem()` have revealed problems that are potentially transaction related.
Replacing with `createGeneral()` might be possible but this is not done to avoid introducing more potential problems.

Notable points in regards to replacing `DatasetFactory.createMem()`:
1) The method is deprecated.
2) The `DatasetFactory.createGeneral()` is the compatible equivalent.
3) The `DatasetFactory.createGeneral()` better supports TDB (which is the main reason for including in this commit set).
4) The `DatasetFactory.createTxnMem()` is the more recommended alternative and is transactional.
5) This better prepares the code for the upcoming Jena upgrade changes.

There are some serious existing problems with closing dataset connections (and related).
The (now removed) `DatasetWrapperFactory` probably should be rewritten to provide a new dataset each time rather than copying an existing one.
The `close()` functionality appears not to have been called for TDB because of the SDB conditions.
With this SDB condition removed, the connections are now being closed.
This has exposed several problems that show up both in tests and at runtime.
This problem is considered out of scope for the changes and the close code is commented out with a FIXME statement.

The documentation for `example.runtime.properties` refers to `VitroConnection.DataSource.*` only being used by SDB but this is incorrect.
The OpenSocial code appears to talk directly to an SQL database using these properties.
Update the documentation in this regard and replace the references to SDB with OpenSocial.

Remove no longer necessary SDB helper code that is added to TDB as well because it "shouldn't hurt".

Remove much of the documentation and installation functionality using or referencing SDB.

The `IndividualDaoSDB.java` file implements `getAllIndividualUris()` and `getUpdatedSinceIterator()` via the now removed `IndividualSDB` class.
This functionality is used by `RebuildIndexTask` and `UpdateUrisTask` classes.

The new *DB classes now have Javadoc comments.
These comments have a bare-bones level of detail.
…ropertyStatementDao.java

Co-authored-by: Dragan Ivanovic <chenejac@uns.ac.rs>
…dividualDaoJena.java

Co-authored-by: Dragan Ivanovic <chenejac@uns.ac.rs>
…y.createGeneral().

The current design of Vitro does not seem to properly handle transactional memory.
Switching from `DatasetFactory.createTxnMem()` to `DatasetFactory.createGeneral()` fixes several of the problems (namely failed tests).
The `@Override` annotation is not required by Java.
Historically, I have wanted to remove it for this reason, as doing so saves space and makes the code a little easier to maintain.

The migration from Jena 3 to Jena 4 presented me with a strong reason not only to keep these annotations but also to promote their continued use.

When an upgraded dependency completely removes an overridden method, the `@Override` annotation results in a compile error.
This makes it easy to identify which methods are removed during such a migration.
If the removed method is to be still maintained, then the `@Override` can simply be removed.
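
The point about `@Override` can be sketched in plain Java. The classes below (`UpstreamGraph`, `LocalGraph`) are hypothetical stand-ins for an upstream library class and a local subclass; they are not Jena or Vitro classes.

```java
// Hypothetical stand-in for a class provided by an upstream library.
// Assume an older release also declared: String legacyName()
class UpstreamGraph {
    String describe() {
        return "upstream";
    }
}

// Hypothetical local subclass that customizes the upstream behavior.
class LocalGraph extends UpstreamGraph {

    // Still present upstream, so the annotation compiles.
    @Override
    String describe() {
        return "local";
    }

    // No longer present upstream in this sketch. Had this method kept its
    // @Override annotation, javac would reject it because it no longer
    // overrides anything, flagging the removal at compile time.
    // With the annotation dropped, it is just an ordinary method.
    String legacyName() {
        return "local-legacy";
    }
}

final class OverrideSketch {
    public static void main(String[] args) {
        UpstreamGraph graph = new LocalGraph();
        System.out.println(graph.describe());
        System.out.println(new LocalGraph().legacyName());
    }
}
```

Without the annotation, a method that silently stopped overriding anything would still compile, and the removal upstream could go unnoticed.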