Environment and directory structure updates #8

Merged
merged 29 commits into from
Mar 5, 2020

evanroyrees
Collaborator

This PR is partly meant to demonstrate the git workflow via forks, pull requests, reviews, and merges, so you do not need to do a comprehensive review here. The edits are minor and should work out some of the kinks I encountered while testing on the Boreal datasets as well as on my Mac.

Updates

  • Removed unused scripts
  • Changed the shebang so that env now specifies python instead of python3. (The python3 shebang otherwise results in a cryptic error when the user-defined python env is not selected at run time. Thank you, @cfrancoeur, for bringing this to my attention.)
  • Added an ncbi directory under databases, kept in the repository via a .gitignore file. (This ensures the user has the required database directories when downloading NCBI or markers files.)
  • Moved post-processing scripts into a validation directory.
  • Updated parameters in tests/metagenome.config
  • Minor tqdm edit in lca.py to display thousand delimiter for ORFs search description.
  • Added bhsne.py under the binning directory so the user can run the python2.7 BH-tSNE. (Note: the conda env setup is within the bhsne.py script.)
  • Bug fix to hmmer.py main script where input and output args were the same. Updated argparse parameters accordingly.
  • Suppressed numba logger at DEBUG level in kmers.py
  • Updated kmers.py args: --n-components now defaults to 2, and --do-pca as well as --pca-dimensions are now available in the main call to kmers.
  • Minor edits to README.md with information on the first version of Autometa and how to construct a coverage table with the current iteration (in dev).
  • Created a datasets directory, which will be populated with functions to easily retrieve the data specified in upcoming Autometa releases, e.g.:

Example usage of scripts populating datasets directory

from autometa.datasets import simulated_community
from autometa.datasets import synthetic_community
# Retrieve specified simulated community
fasta = simulated_community.get('78.125')
# Retrieve all simulated communities mentioned.
fastas = simulated_community.get('all')
# Same for synthetic, etc...
fasta = synthetic_community.get('MIX51')
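
For reviewers who want a concrete picture of the argparse, logger, and description changes listed under Updates, here is a minimal sketch. It is not the code in this PR: the function names, the positional argument names, the `store_true` behaviour of `--do-pca`, and the default for `--pca-dimensions` are all assumptions made for illustration. Only the three flag names and the `--n-components` default of 2 come from the list above.

```python
import argparse
import logging


def build_parser():
    """Hypothetical parser illustrating the kmers.py / hmmer.py argument updates."""
    parser = argparse.ArgumentParser(description="k-mer embedding (illustrative sketch)")
    # Input and output get distinct destinations, so they cannot collide
    # the way the hmmer.py bug fix describes. Both names are assumed.
    parser.add_argument("fasta", help="input sequences (assumed name)")
    parser.add_argument("out", help="output table (assumed name)")
    # Flag names and the default of 2 are taken from the Updates list.
    parser.add_argument("--n-components", type=int, default=2)
    # Treating --do-pca as a boolean switch is an assumption.
    parser.add_argument("--do-pca", action="store_true")
    # The default of 50 is an assumption, not a value from this PR.
    parser.add_argument("--pca-dimensions", type=int, default=50)
    return parser


def quiet_numba(level=logging.WARNING):
    """Raise the level of the 'numba' logger so its DEBUG records are dropped."""
    numba_logger = logging.getLogger("numba")
    numba_logger.setLevel(level)
    return numba_logger


def orf_description(n_orfs):
    """Build a progress description with a thousands delimiter (wording assumed)."""
    return f"Searching {n_orfs:,} ORFs"
```

A string like the one returned by `orf_description` could be passed as the `desc` argument of a tqdm progress bar. Setting the level on the named logger rather than on the root logger leaves the DEBUG output of other modules untouched.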

chanana and others added 27 commits February 2, 2020 10:17
This is a temporary setup and could change later. For now, each user
(class User; see https://docs.djangoproject.com/en/3.0/ref/contrib/auth/) has projects (class Project), and each project has jobs
(class Job). Each class creates a data table and is mapped to the
related table using foreign keys. Read more here:
https://docs.djangoproject.com/en/3.0/ref/models/fields/#foreignkey
This reverts commit aa1dbc1.
…ctory with 78Mbp simulated community. updated config files. py2.7 bhsne for kmers in its own script to run py2.7 version. removed shebang specifically specifying python3 to avoid cryptic errors where user defined python env is not selected when run. Added .gitignore to ncbi dir under databases to keep empty directory. Post-processing are in their own directory under validation.
@evanroyrees added the bug, documentation, and enhancement labels on Feb 11, 2020
@jason-c-kwan
Collaborator

I'm going to have to finish my review of the refactored code first before I can assess this. Is that OK? I just would feel better understanding the code first.

@evanroyrees
Collaborator Author

Yes, no problem. Since I have a fork of the repository, I can operate on my own repo until the changes are merged.

…reflect checkpointing. Renamed MAG class to Mag to follow python conventions. Added prodigal parsing to lca.py. Reflected in majority_vote.py. removed superfluous attributes for DiamondResult object. Updated metagenome.config with checkpoints.tsv file.
…and hmmer.py). By default will pass in ORFs retrieved from Mag object.
@jason-c-kwan jason-c-kwan merged commit 83109f3 into KwanLab:dev Mar 5, 2020
3 participants