Pride submission prep #43
Merged
- save some intermediate files
- visualize some stats
- prepare upload to pride (with unified and anonymized identifiers)
- data curation
- allow selection of raw data
- only load intensities
- transpose and create mask view in separate document
- dump counts for samples and features
- create machine-specific subfolders for pride
- instrument_name added for subfolders
- don't support long data for now
- skip categorical data checking; keep old code as comments for now (as a reminder)
- was at some point used to investigate which data to use
- notebook is for exploration of a single MaxQuant folder
- erda notebooks create dumps which are then processed in "hela" notebooks
- rename and describe
- create folder and put commands for raw files
- use -f to run commands read from a file with lftp
- start uploading
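The command-file approach from the last commit could be sketched as follows. This is an illustration only: the helper `build_lftp_commands`, the file names, the instrument names, and the `upload` base folder are hypothetical and not taken from the repository. It relies only on the lftp commands `mkdir -p` and `put <local> -o <remote>`.

```python
from pathlib import Path, PurePosixPath


def build_lftp_commands(files, remote_base="upload"):
    """Return lftp commands that create instrument folders and upload raw files.

    `files` maps a local raw-file path to the instrument name it was recorded on.
    Assumption: paths contain no spaces, so no quoting is needed.
    """
    commands = []
    folders_seen = set()
    for local_path, instrument_name in sorted(files.items()):
        folder = PurePosixPath(remote_base) / instrument_name
        if folder not in folders_seen:
            # -p also creates missing parent folders
            commands.append(f"mkdir -p {folder}")
            folders_seen.add(folder)
        remote_file = folder / PurePosixPath(local_path).name
        # put <local file> -o <remote file>
        commands.append(f"put {local_path} -o {remote_file}")
    return commands


# Hypothetical raw files and instrument names
example = {
    "data/run_001.raw": "instrument_1",
    "data/run_002.raw": "instrument_1",
    "data/run_003.raw": "instrument_2",
}
commands = build_lftp_commands(example)

# One command per line, as expected by `lftp -f <file>`
Path("lftp_commands.txt").write_text("\n".join(commands) + "\n")
```

The resulting file could then be passed to `lftp -f lftp_commands.txt`. A real command file would also need an `open` line for the target server at the top, which is left out here because the server details are not part of this conversation.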
- sanity checks and upload missing or incomplete files
- rename sample names in MQ output
technology type -> indicates that it is not RNA (MAGE-TAB format)
- all 7,444 selected files for upload are used to create unified dumps
- small plotting improvements and minor other changes
- check all files are in list of files (queried from server)
- create some dummy files (placeholders) locally for pride submission tool
- manually annotate the submission.px text file from the submission tool (basically: add files)
- 🐛 SDRF file had ontology issues (and cell line template was not enough)
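The file check and the local placeholders described in this commit might look roughly like the sketch below. The helper names `find_missing` and `create_placeholders` and all file names are hypothetical; how the server listing is actually queried is not shown in this conversation, so it is represented by a plain list.

```python
import tempfile
from pathlib import Path


def find_missing(selected, on_server):
    """Return selected file names that are absent from the server listing."""
    return sorted(set(selected) - set(on_server))


def create_placeholders(filenames, folder):
    """Create empty local files so a tool that expects local paths can list them."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    created = []
    for name in filenames:
        path = folder / name
        path.touch(exist_ok=True)  # empty file, no data is copied
        created.append(path)
    return created


selected = ["run_001.raw", "run_002.raw", "run_003.raw"]  # hypothetical selection
on_server = ["run_001.raw", "run_003.raw"]                # hypothetical server listing
missing = find_missing(selected, on_server)

# A temporary folder is used here only to keep the example free of side effects
with tempfile.TemporaryDirectory() as tmp:
    placeholders = create_placeholders(selected, tmp)
    all_exist = all(p.exists() for p in placeholders)
```

Any name returned by `find_missing` would point to a file that still needs to be uploaded or re-uploaded before the submission file is annotated.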
- plots based on metadata
- metadata is provided on pride ("pride_metadata.csv")
Split metadata creation from analysis
- relevant information of mq_summaries.csv also provided in metadata_pride.csv
Pride data release preparation.
- pride submission scripts -> 00_0_*.ipynb
- update erda (FTP server) notebooks, build dumps for pride of 7,444 selected files -> erda_*.ipynb
- add python version of notebooks for better diffs in the future