Replaced TDB with HDT #64

Merged
merged 30 commits into from
Jan 22, 2021

@svandenhoek svandenhoek commented Jan 18, 2021

Changes

  • Vibe now uses an HDT database instead of a TDB.
  • The TDB code has been removed, but can be restored through commit fa97f91.
  • Minor overhaul of the shared test resources to reduce the chance of overlapping test resources (individual modules now only need to avoid having a shared directory of their own).
  • The HDT was created from the vibe-3.1.0 TTL files (so not from scratch), so the output should be identical (except perhaps for the order of diseases with the same score within the results of a single gene).
  • The README has been updated.

Testing notes

  • New HDT still gives same results as original TDB (script was adjusted to use the optimized HDT instead of optimized TDB):
$ sh TestOptimizedQueries.sh -or ~/Programming/data/vibe/database_creation/vibe-5.0.0-sources-tdb/ -op ~/Programming/data/vibe/database_creation/vibe-5.0.0-hdt/vibe-5.0.0.hdt
### Running original TDB/query.
Time: 74,405 sec
### Running optimized HDT/query.
### Validating if optimized TDB/query output files are equal to their original counterparts.
test/genes_for_hpo-optimized.tsv: OK
  • Some new tests are not supported on Jenkins (because they change file/dir permissions within a unit test), so a tag skipOnJenkins has been created and -DexcludedGroups='skipOnJenkins' was added to the Jenkinsfile in several places. These tests are still run locally by default, and mvn clean install did not cause any issues.
  • The md5 checksums of the final .tsv files seem to differ between the TDB and HDT runs, though multiple runs for a single format do not differ. The likely cause is that certain genes (or diseases for a single gene) with the same score are output in a different order.
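
To check whether row order alone explains the md5 difference, the two .tsv files can be compared as sorted line lists. The following is a minimal sketch, not part of this PR: the class name `TsvOrderCheck` and its method are hypothetical, and it only ignores the order of whole rows. If diseases are ordered differently within a single row, the rows will still be reported as different, and a field-level normalization that depends on the actual output format would be needed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of vibe): compares two TSV outputs while ignoring row order.
public class TsvOrderCheck {

    /** Returns true when both inputs contain the same lines (including duplicates), in any order. */
    public static boolean sameLinesIgnoringOrder(List<String> first, List<String> second) {
        // Copy before sorting so the callers' lists are left untouched.
        List<String> a = new ArrayList<>(first);
        List<String> b = new ArrayList<>(second);
        Collections.sort(a);
        Collections.sort(b);
        return a.equals(b);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: TsvOrderCheck <first.tsv> <second.tsv>");
            System.exit(2);
        }
        List<String> first = Files.readAllLines(Paths.get(args[0]));
        List<String> second = Files.readAllLines(Paths.get(args[1]));
        System.out.println(sameLinesIgnoringOrder(first, second)
                ? "Same rows (order ignored)"
                : "Rows differ");
    }
}
```

If this reports the same rows for the TDB and HDT outputs, the checksum difference comes from row ordering only; if it reports a difference, the within-row ordering (or the content itself) still needs to be inspected.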

sonarcloud bot commented Jan 21, 2021

Kudos, SonarCloud Quality Gate passed!

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 3 (rating A)

Coverage: 81.4%
Duplication: 0.0%

@bartcharbon bartcharbon self-assigned this Jan 22, 2021
  * Download the [HPO.owl][hpo_owl]
  * Make sure you have [Java 8 or higher][java_download]
- * Open a terminal and run VIBE. `java -jar vibe-with-dependencies.jar -d -t TDB/ -o results.tsv -p HP:0002996 -p HP:0001377`
+ * Open a terminal and run VIBE. `java -jar vibe-with-dependencies-<version>.jar -d -t vibe-<db-version>.hdt -o results.tsv -p HP:0002996 -p HP:0001377`

When I try to run with these arguments, I get an error: Missing arguments: -w

After adding the missing argument, it works fine.


This seems to be a documentation error that already existed before this PR.
