adding shogun #37
Conversation
.travis.yml
Outdated
- pip install sphinx sphinx-bootstrap-theme coveralls biopython
- pip install https://github.com/qiita-spots/qiita_client/archive/master.zip
- pip install atropos
- pip install git+https://github.com/knights-lab/SHOGUN.git
Is this line really necessary? I thought the conda install (conda install --yes -c knights-lab shogun) will take care of this.
Yeah, that shouldn't have been there. It was from an old draft.
qp_shotgun/shogun/shogun.py
Outdated
# Returns filepaths of new combined files
both_fp = []
samples = make_read_pairs_per_sample(fwd_seqs, rev_seqs, map_file)
for run_prefix, sample, f_fp, r_fp in samples:
You could replace this with something like the following, which will generate a single FNA file:
output_fp = join(temp_path, 'merged.fna')
output = open(output_fp, "w")
for run_prefix, sample, f_fp, r_fp in samples:
    counter = 0
    # Loop through forward file
    for seq_header, seq, qual_header, qual in _read_fastq_seqs(f_fp):
        output.write(">%s_%d\n" % (sample, counter))
        output.write("%s\n" % seq)
        counter += 1
    # Loop through reverse file
    for seq_header, seq, qual_header, qual in _read_fastq_seqs(r_fp):
        output.write(">%s_%d\n" % (sample, counter))
        output.write("%s\n" % seq)
        counter += 1
output.close()
return output_fp
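For reference, a self-contained, runnable variant of that sketch. The _read_fastq_seqs helper below is a minimal stand-in I'm assuming for the real one in qp_shotgun (which may handle gzip, etc.), and the '>' header prefix follows the FASTA/FNA convention:

```python
from os.path import join


def _read_fastq_seqs(filepath):
    # Minimal 4-line FASTQ reader; stand-in for the real qp_shotgun helper.
    with open(filepath) as f:
        while True:
            header = f.readline().rstrip()
            if not header:
                return
            seq = f.readline().rstrip()
            qual_header = f.readline().rstrip()
            qual = f.readline().rstrip()
            yield header, seq, qual_header, qual


def generate_fna_file(temp_path, samples):
    # Combine forward and reverse reads of every sample into one FNA file.
    output_fp = join(temp_path, 'merged.fna')
    with open(output_fp, 'w') as output:
        for run_prefix, sample, f_fp, r_fp in samples:
            counter = 0
            for fp in (f_fp, r_fp):
                for _, seq, _, _ in _read_fastq_seqs(fp):
                    # '>' marks a FASTA header; the counter keeps read IDs
                    # unique within a sample.
                    output.write(">%s_%d\n%s\n" % (sample, counter, seq))
                    counter += 1
    return output_fp
```

Using a single counter across the forward and reverse files avoids duplicate read IDs in the merged output.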
A question spanning multiple comments.
qp_shotgun/shogun/shogun.py
Outdated
# Step 2 converting to fna
qclient.update_job_step(
    job_id, "Step 2 of 6: Converting to FNA for Shogun")
temp_path = os.environ['QC_SHOGUN_TEMP_DP']
Do we really need this new folder? Or can we simply do something like:
temp_path = join(out_dir, 'tmp')
Note that the difference is that if we use QC_SHOGUN_TEMP_DP the folder is always going to be created in that fixed path, which is in home/where-the-software-was-installed, but with my suggestion the path is going to be created in the mount/job folder.
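A minimal sketch of that suggestion (the make_temp_dir name is illustrative, not from the PR):

```python
import os
from os.path import join


def make_temp_dir(out_dir):
    # Create the temp folder inside the job's output directory
    # instead of a path fixed by the QC_SHOGUN_TEMP_DP variable.
    temp_path = join(out_dir, 'tmp')
    os.makedirs(temp_path, exist_ok=True)
    return temp_path
```

Since out_dir is per job, the temp files land next to the job's other outputs and get cleaned up with them.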
The idea was that we can change the base temp path by changing the environment variable without having to reload the whole plugin. However, it makes sense to simply make the temp directory in the output directory.
self.assertEqual(obs, exp)

def test_generate_fna_file(self):
    temp_path = os.environ['QC_SHOGUN_TEMP_DP']
For tests you could use from tempfile import mkdtemp to avoid having to set this variable.
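For instance, in a unittest-style test (class and method names are illustrative):

```python
import shutil
import unittest
from os.path import exists
from tempfile import mkdtemp


class GenerateFnaTests(unittest.TestCase):
    def setUp(self):
        # A throwaway directory per test; no environment variable needed.
        self.temp_path = mkdtemp()

    def tearDown(self):
        # Clean up so test runs leave nothing behind.
        shutil.rmtree(self.temp_path)

    def test_temp_dir_exists(self):
        self.assertTrue(exists(self.temp_path))
```

Each test gets its own fresh directory, so tests can't interfere with each other or depend on the machine's configuration.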
Looks pretty good, a few questions and suggestions.
I think in the current iteration (prior to separating out the taxonomy per-level biom generation as a separate step) you'll want to incorporate the 'regroup' command for the taxonomy tables as with the functional tables.
qp_shotgun/shogun/shogun.py
Outdated
# Step 3 align
qclient.update_job_step(
    job_id, "Step 3 of 6: Aligning FNA with Shogun")
generate_shogun_align_commands(
This should return a cmd, right?
qp_shotgun/shogun/shogun.py
Outdated
qclient.update_job_step(
    job_id, "Step 5 of 6: Functional profile with Shogun")
levels = ['genus', 'species', 'level']
for level in levels:
You'll also want to create redistributed Biom files for the taxonomy tables at these levels, a la:
@semarpetrus The linked lines are the command you'll want to add to get the redistributed taxonomy tables. These are then what you'll convert to BIOM representation for the taxonomy portion of the command.
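A sketch of what such a command generator might look like, mirroring the existing generate_shogun_*_commands helpers. The function name, output filename pattern, and the exact shogun redistribute flags below are my assumptions, not taken from this PR:

```python
from os.path import join


def generate_shogun_redist_commands(profile_fp, temp_dir, database, levels):
    # Build one 'shogun redistribute' call per taxonomic level; each
    # collapses the taxonomy profile to that level before BIOM conversion.
    cmds = []
    for level in levels:
        out_fp = join(temp_dir, 'profile.redist.%s.tsv' % level)
        cmds.append('shogun redistribute --input %s --database %s '
                    '--level %s --output %s'
                    % (profile_fp, database, level, out_fp))
    return cmds
```

The per-level TSVs produced this way would then be the inputs for generating the per-level taxonomy BIOM tables.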
exp_cmd = [
    ('shogun assign_taxonomy --aligner bowtie2 '
     '--database %sshogun --input %s/alignment.bowtie2.sam '
I don't understand why %sshogun?
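If the substituted value is a database directory ending in a path separator, %sshogun just concatenates the two with no separator of its own. A quick illustration (the database_dp value is made up):

```python
# Hypothetical database directory, note the trailing '/'.
database_dp = '/databases/'

# '%sshogun' concatenates the prefix and the literal 'shogun',
# so the trailing '/' must come from database_dp itself.
cmd = '--database %sshogun' % database_dp
```

If database_dp lacked the trailing slash, the result would be a broken path, which is likely what the question is getting at.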
Codecov Report

@@       Coverage Diff        @@
##       master      #37  +/- ##
=========================================
  Coverage      ?   86.84%
  Files         ?        5
  Lines         ?      403
  Branches      ?       78
=========================================
  Hits          ?      350
  Misses        ?       24
  Partials      ?       29

Continue to review full report at Codecov.
The error is a timeout from the humann2 tests, which we will move to a new repo. Thus, @semarpetrus, @tanaes and I decided it's OK to merge.
No description provided.