@jlchang jlchang commented Nov 5, 2021

Dense expression matrix files that are generated incorrectly can include dataframe row indices. When this user error occurs, the row indices become the first column of the expression matrix and the gene names end up in the second column, where gene expression values are expected, causing ingest to fail.
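As an illustration of how this can happen, here is a minimal sketch that assumes the matrix is written from a pandas DataFrame (the source only says "dataframe", so pandas is an assumption). The toy data and the `to_dense_csv` helper are hypothetical and are not part of the ingest pipeline. `DataFrame.to_csv` writes the row index by default, which shifts the gene names into the second column.

```python
import io

import pandas as pd

# Hypothetical toy matrix: one gene row, two cells.
df = pd.DataFrame({"GENE": ["Sox17"], "cell_1": [0.0], "cell_2": [1.5]})


def to_dense_csv(frame, include_index):
    """Serialize a DataFrame to CSV text, optionally writing the row index."""
    buf = io.StringIO()
    frame.to_csv(buf, index=include_index)
    return buf.getvalue()


# index=True is the pandas default: the row index becomes the first column.
bad = to_dense_csv(df, include_index=True)

# index=False keeps the gene names in the first column.
good = to_dense_csv(df, include_index=False)

print(bad)
print(good)
```

Passing `index=False` when writing the file avoids the problem in this scenario.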

This PR improves upon the existing error message by passing along the offending value for easier troubleshooting.
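For reviewers, the idea of passing along the offending value can be sketched as follows. This is not the pipeline's actual code: `parse_dense_row` is a hypothetical helper, and the sketch only assumes that each row holds a gene name followed by numeric scores. It relies on the fact that Python's `float()` names the bad value in its `ValueError` message.

```python
def parse_dense_row(line, delimiter=","):
    """Split one dense-matrix row into (gene, scores).

    Hypothetical helper for illustration only: the first field is treated
    as the gene name and every remaining field must parse as a float.
    """
    gene, *raw_scores = line.rstrip("\n").split(delimiter)
    try:
        scores = [float(value) for value in raw_scores]
    except ValueError as err:
        # float() reports the offending value, so include its message.
        raise ValueError(f"Expected numeric expression score - {err}") from err
    return gene, scores


# A well-formed row parses cleanly.
print(parse_dense_row("Sox17,0.0,1.5"))

# A row with a leading dataframe index puts the gene name where a score belongs.
try:
    parse_dense_row("0,Sox17,0.0,1.5")
except ValueError as err:
    print(err)
```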

To test, set your local instance's "Ingest Pipeline Docker Image" configuration to use
gcr.io/broad-singlecellportal-staging/scp-ingest-jlc_improve_error_msg:6b54eca

and upload a processed matrix, using the dense matrix file found at:
gs://fc-2f8ef4c0-b7eb-44b1-96fe-a07f0ea9a982/test_Data/ingest_manual_test/non-numeric_dense_value.csv

The error in the email notification should state

Expected numeric expression score - could not convert string to float: 'Sox17'

OR

run ingest_pipeline locally - you'll need both a valid study-id and a valid study-file-id
(recommendation: create a designated local study for such tests, upload a small text file as a file of type "other" and use its study-file-id)

python ingest_pipeline.py --study-id <your local test study-id> --study-file-id <your local test study-file-id> ingest_expression --matrix-file gs://fc-2f8ef4c0-b7eb-44b1-96fe-a07f0ea9a982/test_Data/ingest_manual_test/non-numeric_dense_value.csv --matrix-file-type dense --taxon-name "Homo sapiens" --taxon-common-name human --ncbi-taxid 9606

The resulting user_log.txt file should contain:

Expected numeric expression score - could not convert string to float: 'Sox17'

This addresses SCP-3832

@jlchang jlchang merged commit daa9f22 into development Nov 10, 2021
@jlchang jlchang deleted the jlc_improve_error_msg branch November 10, 2021 15:35