Relocate IngestPipeline method about metadata convention to CellMetadata Class (SCP-3631) #216
Conversation
eweitz
left a comment
Looks good!
setup.py
Outdated
  setup(
      name="scp-ingest-pipeline",
-     version="1.11.0",
+     version="1.11.1",
Current version is 1.12.0, so this should be higher than that.
Great catch. I had incremented to test using TestPyPI. I forgot that I needed to update it accurately (PyPI won't let you use a version tag twice and I was afraid I'd accidentally mess up 1.12.1 in my testing!)
def conforms_to_metadata_convention(self):
    """Determines if cell metadata file follows metadata convention"""
    convention_file_object = IngestFiles(self.JSON_CONVENTION, ["application/json"])
Out of scope, but it'd be nice if we defined default MIME types for given extensions (e.g. {"json": ["application/json"]}) in only one place, so we could typically omit second parameters like this ["application/json"].
Agreed, it would be nice to do.
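One possible shape for that single place, as a minimal sketch only: the names DEFAULT_MIME_TYPES and resolve_allowed_mime_types are hypothetical and not part of scp-ingest-pipeline, and the extensions listed are illustrative. The idea is that a caller passes MIME types only when it needs to override the default for the file's extension.

```python
import os

# Hypothetical central registry of default MIME types, keyed by file extension.
# The entries here are illustrative; the real set would come from the pipeline.
DEFAULT_MIME_TYPES = {
    ".json": ["application/json"],
    ".txt": ["text/plain"],
    ".csv": ["text/csv"],
}


def resolve_allowed_mime_types(file_path, allowed_mime_types=None):
    """Return the explicit MIME types if given, else the defaults for the extension."""
    if allowed_mime_types is not None:
        # An explicit argument always wins over the registry.
        return list(allowed_mime_types)
    extension = os.path.splitext(file_path)[1].lower()
    if extension not in DEFAULT_MIME_TYPES:
        raise ValueError(f"No default MIME types registered for {extension!r}")
    # Return a copy so callers cannot mutate the shared registry.
    return list(DEFAULT_MIME_TYPES[extension])
```

With something like this, a constructor such as IngestFiles could call the helper when its second argument is omitted, so call sites would only spell out MIME types in unusual cases.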
Codecov Report
@@               Coverage Diff                @@
##           development    #216      +/-    ##
===============================================
- Coverage        71.44%   71.29%   -0.15%
===============================================
  Files               26       26
  Lines             3089     3101     +12
===============================================
+ Hits              2207     2211      +4
- Misses             882      890      +8
bistline
left a comment
Thanks for figuring this out finally!
Co-authored-by: bistline <bistline@broadinstitute.org>
Tests in the single_cell_portal repo fail because classes in this repo were refactored earlier this year, and the methods those tests rely on are now obsolete. To use scp-ingest-pipeline as-is in the tests, an IngestPipeline object would need to be instantiated in order to call the conforms_to_metadata_convention method, which is more complex than instantiating a CellMetadata object, as the current, broken tests do. Logically, conforms_to_metadata_convention should be a CellMetadata method because none of the other ingest file types require the metadata convention. To ease the repair of tests in the single_cell_portal repo, this PR moves the conforms_to_metadata_convention method to the CellMetadata class.
To test (this is kinda finicky right now, sorry!):
Pre-test: find a study in your dev instance and note a valid study_id and study_file_id.
Ensure your Python environment is up to date with requirements.txt in the scp-ingest-pipeline repo.
(I'm not sure whether this works differently if your gcloud config is set to your service account rather than your Broad identity; I used my Broad identity.)
Have your MongoDB credentials set up as environment variables. (Let me know if this isn't documented already; I think Eno made a doc somewhere. I have a script that sets it up for me using vault commands.)
Run the following:
python ingest_pipeline.py --study-id --study-file-id ingest_cell_metadata --cell-metadata-file <path/to/repo>/scp-ingest-pipeline/tests/data/annotation/metadata/convention/valid_array_v2.1.2.txt --study-accession --ingest-cell-metadata --validate-convention
Expect many warning messages but no errors.
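If it helps to script the run, the command above could be assembled roughly as follows. This is a sketch under assumptions: build_validation_command is a hypothetical helper, and it assumes --study-id, --study-file-id and --study-accession each take the value found in the pre-test step, since the command as pasted shows those flags without values.

```python
def build_validation_command(study_id, study_file_id, study_accession, metadata_path):
    """Assemble the argument list for the test run described above.

    Assumption: each of --study-id, --study-file-id and --study-accession
    is followed by its value; no values are hard-coded here.
    """
    values = {
        "study_id": study_id,
        "study_file_id": study_file_id,
        "study_accession": study_accession,
        "metadata_path": metadata_path,
    }
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise ValueError("Missing required values: " + ", ".join(missing))
    return [
        "python", "ingest_pipeline.py",
        "--study-id", study_id,
        "--study-file-id", study_file_id,
        "ingest_cell_metadata",
        "--cell-metadata-file", metadata_path,
        "--study-accession", study_accession,
        "--ingest-cell-metadata",
        "--validate-convention",
    ]
```

The returned list can be passed to subprocess.run(command, check=True) from the directory containing ingest_pipeline.py; building a list rather than a shell string avoids quoting problems with paths.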
This work supports SCP-3631 and corrects the import errors that blocked SCP-3414.