Snovault conversion #1287

Closed
mrmin123 opened this issue Feb 6, 2017 · 3 comments

@mrmin123
Contributor

mrmin123 commented Feb 6, 2017

No description provided.

@mrmin123 mrmin123 self-assigned this Feb 6, 2017
@kilodalton kilodalton added R11 and removed R10 labels Feb 21, 2017
@mrmin123
Contributor Author

mrmin123 commented Mar 6, 2017

Sanity checklist:

  • Get clincoded to build w/ imported snovault package ✅
  • Get server up and running locally ✅
    • Changes to python backend (imports, new functions, implementations, etc) ✅
  • Get frontend to parity ✅
    • Get logins working ✅
    • Get standard pages working ✅
    • Get db/form interactions in curation pages working ✅
    • Get collections working ✅
    • Get search working ✅
    • Get view pages working ✅
  • Get tests to work ✅
    • Pyramid tests ✅
    • Browser tests ✅
    • Javascript test ✅
  • Get server up and running on instances ✅
    • Get server up and running on instances using production data ❓

@mrmin123
Contributor Author

What's gone?

The entire contentbase folder is gone, so its functionality (tests, elasticsearch settings, postgres db, etc.) has all been offloaded to snovault.

snovault dependency

The branch currently points to ClinGen/snovault (see buildout.cfg), but should point to the official ENCODE-DCC/snovault repo once PR#26 is merged. The PR notes explain why this modified version of snovault is necessary.

Many things have changed as part of the transition over to snovault, which will be covered in this (and possibly subsequent) comment(s).

Updated commands

There's a new build command:

python3 bootstrap.py -v 2.4.1 --setuptools-version 18.5

There's a new browser test command:

bin/test -m bdd -v -v --wsgi-arg port_range.min 65525 --wsgi-arg port_range.max 65535

(Note the new wsgi-arg arguments; they are why the modifications to snovault were necessary, and they are how we get our authentication tests to work.)

There's a new database dump command:

pg_dump --no-owner postgresql://postgres@:5432/postgres?host=/tmp/snovault/pgdata > DEV_TEST_DB_DUMP_9.4

(README and docs have been updated to reflect these new commands)

How far did I get?

I can't say for certain that everything is at parity, but from my testing, it seems to be very, very close. This branch should be thoroughly tested before merging, however, as much of our frontend does not have test coverage, and I certainly may have missed something.

@mrmin123
Contributor Author

Changes to data objects

snovault introduced some changes to the data schema and how it is used, primarily in how objects are named and referenced. An object's name is specified and referred to in the following places, using the article object as an example:

  • Data schema json (article.json)
  • Test data json (article.json)
  • Object loaded into Pyramid via loadxl.py (defined in ORDER variable as article + defined in PHASE1_PIPELINES and PHASE2_PIPELINES as necessary; I am unsure what this second step does)
  • Object and embedded objects/reverse links specified in types/__init__.py
    • Collection URL defined (articles)
    • Collection unique key defined (article:pmid)
    • Class name defined (Article)
    • Item type defined (article)
    • Schema defined (clincoded:schemas/article.json)
  • In the javascript, when defining views and checking against item @types: Article

Basically, how an object's class is defined in the types/__init__.py file determines how it should be referenced within the javascript. I don't know if this is the underlying reason, but because Python class names typically start with an upper case letter, the object types in the javascript now also start with an upper case letter. If you define the class with a lower case name, it is possible to refer to it within the javascript in lower case.

That said, the way snovault reads types by default has changed so that it only accepts all-lowercase type names (this is independent of the class name and of how types are referenced in the javascript). Objects that originally had camel-cased names (caseControl, curatorHistory, evidenceScore, orphaPhenotype, and provisionalClassification) caused some issues because snovault automatically converts to, and expects, fully-lowercase names. Every reference other than the Python class name (e.g. Article) needs to be fully lowercase; otherwise the __getitem__() method in snovault's resources.py will error out when viewing the Collection views for the objects, even though accessing the objects seems to work fine otherwise.
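
As a rough illustration of the lowercase rule above, here is a minimal sketch of a check over the names listed for an object. TypeNames and not_lowercase are hypothetical helpers, not part of snovault or clincoded, and the "old" orphaPhenotype values are illustrative guesses rather than values copied from the repo:

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class TypeNames:
    """The names one object is known by (hypothetical helper, not snovault API)."""
    collection_url: str
    unique_key: str
    class_name: str  # Python class name; the only name allowed upper case
    item_type: str
    schema: str


def not_lowercase(names: TypeNames) -> List[str]:
    """Return the fields, other than class_name, that are not fully lowercase."""
    return [
        f.name
        for f in fields(names)
        if f.name != "class_name"
        and getattr(names, f.name) != getattr(names, f.name).lower()
    ]


# Values for article are taken from the list above.
article = TypeNames(
    collection_url="articles",
    unique_key="article:pmid",
    class_name="Article",
    item_type="article",
    schema="clincoded:schemas/article.json",
)

# Illustrative guess at a pre-conversion definition (values are assumptions).
old_orpha = TypeNames(
    collection_url="orphaphenotypes",
    unique_key="orphaphenotype:uuid",
    class_name="OrphaPhenotype",
    item_type="orphaPhenotype",
    schema="clincoded:schemas/orphaPhenotype.json",
)

print(not_lowercase(article))
print(not_lowercase(old_orpha))
```

The class name is deliberately skipped, since it is the one name that can keep its upper case letters.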

While it is possible to name the schema and test data JSON files in camel case and get around the collections view issue by referencing them in loadxl.py by their lowercase names (a file named orphaPhenotype.json, referred to as orphaphenotype in loadxl.py), this causes issues in tests and on instances, because there the system looks for orphaphenotype.json.
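
One way to catch leftover camel-cased files before they break tests or instances is a quick scan of the schema and test data folders. This is only a sketch: mixed_case_json is a hypothetical helper, and the directory you pass to it depends on your checkout.

```python
from pathlib import Path
from typing import List


def mixed_case_json(directory: str) -> List[str]:
    """Return the sorted names of .json files whose file name is not fully lowercase."""
    return sorted(
        path.name
        for path in Path(directory).glob("*.json")
        if path.name != path.name.lower()
    )
```

Running it against a folder that still contains orphaPhenotype.json alongside article.json would report only orphaPhenotype.json.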

These camel-cased objects were renamed to their fully-lowercase counterparts in the related PR, but this does mean that live/production data must somehow be converted to the new object types in the postgres database/elasticsearch backend; otherwise the data will not be loaded properly.
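
The conversion itself is not worked out here, but the renaming part could look something like the sketch below. It assumes each stored record can be treated as a plain dict with an "item_type" key; that key name and the convert_item_type helper are assumptions for illustration, not the actual storage layout or an existing function.

```python
# The five camel-cased object names mentioned above.
CAMEL_CASED = [
    "caseControl",
    "curatorHistory",
    "evidenceScore",
    "orphaPhenotype",
    "provisionalClassification",
]

# Old name -> new fully-lowercase name.
RENAMES = {name: name.lower() for name in CAMEL_CASED}


def convert_item_type(record: dict) -> dict:
    """Return a copy of record with a camel-cased item_type replaced by its lowercase name.

    The "item_type" key is an assumption about how a record is represented.
    Records with other types, or without the key, are returned unchanged.
    """
    converted = dict(record)
    old_type = converted.get("item_type")
    if old_type in RENAMES:
        converted["item_type"] = RENAMES[old_type]
    return converted
```

Any real migration would also need to handle references to these types elsewhere in the stored data and a reindex of elasticsearch, which this sketch does not attempt.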
