Dev #160

Merged
merged 28 commits into master from dev
Dec 19, 2023

Conversation

charles-cowart
Collaborator

Review merge of dev into master branch.

dhakim87 and others added 24 commits March 24, 2023 10:49
* Spike-in analysis ported from R in two parts. Can calculate coefficients for linear models that correct relative abundance to absolute abundance. Can rescale relative abundance read counts to absolute abundance read counts, relative abundance cell counts, or absolute abundance cell counts. TODO: absolute abundance cell counts per gram

* Updated file paths to be based from root of the checkout

* Lint nonsense for ancient 80 character terminals

* More linter nonsense

* PR review comments
Removed function to generate read count output; updated function for generating cell counts to generate cell counts per gram of input sample material
Added back function needed for testing to work
Added column for sample input weights and updated pool variable name
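
The spike-in correction described in the first commit above can be sketched roughly as follows. This is a minimal illustration, not the ported code: it assumes a log-log linear fit between spike-in read counts and known input amounts, and `fit_line`, `rescale`, and all of the numbers are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical spike-ins: observed read counts and their known input amounts.
spike_reads = [100, 1000, 10000]
spike_known = [1.0, 10.0, 100.0]

# Assumption: the model is fitted on log10-transformed values.
slope, intercept = fit_line(
    [math.log10(r) for r in spike_reads],
    [math.log10(k) for k in spike_known],
)


def rescale(read_count):
    """Convert a read count to an absolute amount using the fitted line."""
    return 10 ** (slope * math.log10(read_count) + intercept)
```

Dividing the rescaled value by the sample input weight would then give a per-gram figure, which is what the later commit in this group appears to add.
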
* add final tests for pooling nb

* move compress_plates_exp file

* move plate map files

* fix filepath of platemaps in compression test

* move the files back

* more path errors

* more path errors 2

* fix tests monday oct 23

* flake8 changes

* revert to original version of 2 functions; comment out 2 test cases

* try 2: remove changes to functions; remove test cases

* working notebook

* flake8 fixes

* flake8 fix
config file for abs quant sample information calculations
…and unit-test (#155)

* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* Notebook debugging

* Updates to readme

* Remove .DS_Store files

* Code review fixes

* Code review fix from Mackenzie

* Corrected MacKenzie's name--apologies!

* Removed early implementation of SynDNA which now is in a separate repo.
* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* Notebook debugging

* Updates to readme

* Remove .DS_Store files

* Code review fixes

* Code review fix from Mackenzie

* Corrected MacKenzie's name--apologies!

* Removed early implementation of SynDNA which now is in a separate repo.

* Lint fixes

* Fix weird formatting

* Ugh still more lint fixes
…. Added unit test for matrix_tube_pipeline (#153)

* sync absolute_quant, matrix_tube, plate_replication notebooks.

Rewrote the iSeqnorm notebook to include absolute_quant and plate_replication features. Also added the same features to new matrix_tube notebook.

* added validate_plate_dataframe() and moved functions to metapool.py

refactored the sample validation modules into a function. Moved new functions into metapool.py

* tested notebooks and cleared output for commit. Small change to validate function arguments

* added unit tests for compress_plates(), add_controls(), and validate_plate_df()

compress_plates() could use better error handling. It currently emits no warnings and handles no errors itself; it relies on pandas error handling.

* cleaned up notebook. renamed it.

* plate replication feature added to notebooks

deleted plate_replication notebook since I added the feature modules to metagenomics_pipeline_iSeqnorm.ipynb and matrix_tube_pipeline_seqcount_norm.ipynb

* Removed DS_Files. Adjusted gitignore.

* Update metapool.py

Incorporated feedback from @charles-cowart

* Update metapool.py

fixed read_file() helper function

* addressed issues #148 and #82

* flake8

* Added helper function, read_viosionmate_file() and associated tests

Homogenized the VisionMate expected files. Added more tests and warnings

* flake8

---------

Co-authored-by: Charles Cowart <ccowart@ucsd.edu>
updating DNA/RNA shield lot number information for second and third lots for NPH
* WIP refactoring code into class objects

Refactoring code into class objects so that each version of a
sample-sheet can have its own list of requirements.

* Fixed existing tests

* Added create_sample_sheet()

Added create_sample_sheet(), which creates and returns the proper
SampleSheet object based on input parameters. It replaces direct object
creation with KLSampleSheet().

* WIP: Profiles outlined

A pair of (SheetType, SheetVersion) is implemented as a subclass of
KLSampleSheet e.g.: MetagenomicSampleSheetv100, AbsQuantSampleSheetv10.
Each implementation is able to override the values of column lists and
key/value pairs as needed to realize a specific version of a
sample-sheet. Entire sections can be added or removed from a subclass if
needed, but this hasn't been needed yet.
All legacy tests were working, but I broke them again once I removed
synda_pool_number from classes other than AbsQuantSampleSheetv10, since
they don't need it. I also reverted the contains_replicates column, and
well_id_384 became Sample_Well again for the base KLSampleSheet class.

* WIP: Cleaning up existing tests

* Fixed existing tests

* Flake8

* KLSampleSheet() no longer instantiable.

Subclasses of KLSampleSheet() are used instead.

* updated make_sample_sheet()

updated client-facing interface for creating sample-sheets using
metadata dictionary and data from a DF to reflect latest discussions
w/Rodolfo.

* cleanup

* Add instantiation test

KLSampleSheet() can no longer be called. One of the child classes of
KLSampleSheet() must be instantiated instead.
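
A rough sketch of the pattern these commits describe: a base class that refuses direct instantiation, one subclass per (SheetType, SheetVersion) pair, and a factory that picks the subclass. The class names and the 'standard_metag' / 'abs_quant_metag' type strings come from the commit messages in this PR; the attribute names, version strings, column lists, and the factory signature are assumptions and not the actual metapool code.

```python
class KLSampleSheet:
    # Assumed attribute names; subclasses override them.
    sheet_type = None
    sheet_version = None
    data_columns = ("Sample_ID", "Sample_Well")

    def __new__(cls, *args, **kwargs):
        # Block direct use of the base class; only subclasses may be created.
        if cls is KLSampleSheet:
            raise TypeError(
                "KLSampleSheet cannot be instantiated directly; "
                "use one of its subclasses"
            )
        return super().__new__(cls)


class MetagenomicSampleSheetv100(KLSampleSheet):
    sheet_type = "standard_metag"
    sheet_version = "100"  # assumed from the class name


class AbsQuantSampleSheetv10(KLSampleSheet):
    sheet_type = "abs_quant_metag"
    sheet_version = "10"  # assumed from the class name
    # Only this version carries the extra column (spelled as in the commit).
    data_columns = KLSampleSheet.data_columns + ("synda_pool_number",)


# Map each (type, version) pair to the class that implements it.
_SHEET_CLASSES = {
    (cls.sheet_type, cls.sheet_version): cls
    for cls in (MetagenomicSampleSheetv100, AbsQuantSampleSheetv10)
}


def create_sample_sheet(sheet_type, sheet_version):
    """Hypothetical factory: return an instance of the matching subclass."""
    try:
        return _SHEET_CLASSES[(sheet_type, sheet_version)]()
    except KeyError:
        raise ValueError(
            f"unsupported sheet: {sheet_type!r} version {sheet_version!r}"
        ) from None
```

With this layout, adding a new sheet version means adding one subclass and one registry entry, and callers never name a concrete class.
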

* Update metapool/sample_sheet.py

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

* Update metapool/sample_sheet.py

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

* Update metapool/sample_sheet.py

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

* Update metapool/sample_sheet.py

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

* Update metapool/sample_sheet.py

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

---------

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>
* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* port to modules of Mackenzie's NPH_calc_df_SLURRY.ipynb functionality

* Notebook debugging

* Updates to readme

* Remove .DS_Store files

* Code review fixes

* Code review fix from Mackenzie

* Corrected MacKenzie's name--apologies!

* Removed early implementation of SynDNA which now is in a separate repo.

* Lint fixes

* Fix weird formatting

* Ugh still more lint fixes

* Fix incorrect comment id'd by MacKenzie
#158)

* Added absolute_quant specific column tracking and SheetType and SheetVersion user interface

Added tracking for columns needed for absolute quant. Added front facing fields to specify SheetType and SheetVersion when generating samplesheets. Updated unit tests accordingly. Incorporated Daniel's feedback on some fields in the plate_compression form (Project_Plate is now a concatenation of Project Name and the plate number)

* Added new samplesheet generation scheme to notebooks. Plus other minor fixes.

Corrected some column header discrepancies. Added a global samplesheet requirement for plate_replication flag (boolean). Successfully tested abs_quant_metag samplesheet generation. Changed notebook evp total_volume parameter value (200->190).

* Fixes issues w/tests

* flake8

* Edited notebooks to default to "standard_metag" samplesheet. Cleaned up test_output folder.

I verified the make_sample_sheet() function worked for both standard and absolute quant metag. Cleaned up the test_output folder to delete files that had dates instead of YYYY_MM_DD.

---------

Co-authored-by: Charles Cowart <ccowart@ucsd.edu>
@charles-cowart
Collaborator Author

@RodolfoSalido can you confirm that notebooks/plate_replication.ipynb was removed or renamed from the repository? I can see it exists in master, but not in the current dev branch. Thanks!

@RodolfoSalido
Collaborator

it was removed, but the functionality is now added to both the metagenomic_pooling_notebook_iSeqnorm, and the matrix_tube_pipeline_seqnorm notebooks.

@charles-cowart
Collaborator Author

> it was removed, but the functionality is now added to both the metagenomic_pooling_notebook_iSeqnorm, and the matrix_tube_pipeline_seqnorm notebooks.

Thanks Rodolfo!

@RodolfoSalido left a comment
Collaborator

Reviewed most of the changes that I am familiar with. Added some comments for future work and some small corrections that shouldn't have big consequences.

charles-cowart and others added 4 commits December 14, 2023 17:08
* Updated README.md

* Reapply Metagenomics -> Metagenomic

* Updates to notebooks

Updates to notebooks that modify/use YYYY_MM_DD_FinRisk_33-36_samplesheet.csv.
Confirmed current csv is correct in form.
Notebooks updated to use validate_sample_sheet() correctly.
validate_sample_sheet() and its helpers updated to no longer return a
sheet object.

* tests updated

* Removing obsolete notebook + files.

Removing obsolete notebook on Rodolfo's recommendation.
Removing files that are no longer mentioned in any file in this repo.

* Updating files

* Removed obsolete test
* Added support for abs-quant in seqpro

* Fixed? test

* cleanup
Give AmpliconSampleSheet a SheetType and SheetVersion, since it's
expected that children of KLSampleSheet have them defined. Dummy
amplicon sample-sheets are created by SPP to pass to bcl-convert, based
on the lab-supplied pre-prep file. Giving it the type 'dummy_amp' avoids
having it erroneously say 'standard_metag'.
@charles-cowart
Collaborator Author

@RodolfoSalido Thanks for the review! I corrected the documentation. I agree it would be nice to remove some of the exp output files but that can wait for another day.

@charles-cowart charles-cowart merged commit 12560b9 into master Dec 19, 2023
7 checks passed
8 participants