Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/hcatoscea_test_data_ami_update #59

Conversation

ami-day
Copy link
Contributor

@ami-day ami-day commented Jan 25, 2022

Fixed input test files (TEST01.xlsx, TEST02.xlsx) which were resulting in incorrect output files. Updated the arguments.csv file to reflect this.

@amnonkhen amnonkhen changed the base branch from master to feature/hca-to-scea-test-data January 26, 2022 13:06
@@ -0,0 +1,30 @@

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to suggest a differnt naming convention for the test data.
Please name the spreadsheet and the directory containing the idf and sdrf using the accession number.
In this case:

  • spreadsheet: SRP274475.xlsx
  • idf & sdrf should be in test/golden/SRP274475

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok will do.

Copy link
Contributor Author

@ami-day ami-day left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes are correct and intended.

@amnonkhen
Copy link
Collaborator

These changes are correct and intended.

@ami-day I do not understand your comment.

@ami-day
Copy link
Contributor Author

ami-day commented Feb 15, 2022

These changes are correct and intended.

@ami-day I do not understand your comment.

Hi @amnonkhen I'm not sure what I meant by this. I made the changes you suggested. Can this request be approved now?

Copy link
Collaborator

@amnonkhen amnonkhen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the hca to scea test input spreadsheet. I do not, however, see the expected results files, namely the idf/sdrf files for each of the test cases?
Before merging your changes, the tests should run successfully.
At least for your test cases.
You can run the tests on your environment by running python -munittest.
Also, once you push them to the remote branch github runs them for you. Please examine the output, and fix the errors. If you don't undersatnd the error messages, talk to me.

…ich are not supported by the hca2scea tool

as it is currently. I will create tickets for relevant enhancements to the code.
Removed a .DS_Store file that was causing issues with test validation.
Replaced None values with empty string when a protocol is applied to some samples and not others.
Also changed how the protocol columns are ordered on a protocol type basis.
These changes preserve the correct protocol id order for each protocol type.
…-to-scea-test-data-ami-update

The hca2scea.py script in the feature/hca-to-scea-test-data-ami-update branch incorrectly enters duplicated SRA file names in the sdrf file. The test branch needs to be updated with the corrected code, and corresonding corrected expected test output files.
@ami-day
Copy link
Contributor Author

ami-day commented Feb 23, 2022

Thanks for updating the hca to scea test input spreadsheet. I do not, however, see the expected results files, namely the idf/sdrf files for each of the test cases? Before merging your changes, the tests should run successfully. At least for your test cases. You can run the tests on your environment by running python -munittest. Also, once you push them to the remote branch github runs them for you. Please examine the output, and fix the errors. If you don't undersatnd the error messages, talk to me.

Thanks @amnonkhen I have done this, the test is returning "OK" now.

enrichment_protocol_cols = []
library_preparation_protocol_cols = []
for col in list(protocols_sdrf_before_sequencing.columns):
if col.split(".protocol_core.protocol_id")[0] == "collection_protocol":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to repeat col.split(".protocol_core.protocol_id")[0] 4 times.
Define a variable and use it in the 4 locations.

@@ -11,10 +11,10 @@ Person Email alison.simmons@imm.ox.ac.uk
Person Affiliation MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) Target Discovery Institute MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) John Radcliffe Hospital, University of Oxford MRC Weatherall Institute of Molecular Medicine Sir William Dunn School of Pathology, University of Oxford MRC Weatherall Institute of Molecular Medicine MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) MRC Weatherall Institute of Molecular Medicine (WIMM) Target Discovery Institute Target Discovery Institute St Mary's Hospital, Imperial College MRC Weatherall Institute of Molecular Medicine MRC Weatherall Institute of Molecular Medicine (WIMM) EMBL-EBI
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we have Test01 output file but now Test01 spreadsheet in the arguments?
This file will not be used anywhere.

@ami-day
Copy link
Contributor Author

ami-day commented Oct 19, 2022

Closing this branch, it is an older version that is now out of date.

@ami-day ami-day closed this Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants