adding shogun #37

Merged: 36 commits into qiita-spots:master from antgonza:shogun on Apr 20, 2018

antgonza
Member

No description provided.

@coveralls
coveralls commented Feb 21, 2018

Coverage decreased (-5.3%) to 89.154% when pulling 40f78ae on antgonza:shogun into 77e5cd3 on qiita-spots:master.

.travis.yml Outdated
- pip install sphinx sphinx-bootstrap-theme coveralls biopython
- pip install https://github.com/qiita-spots/qiita_client/archive/master.zip
- pip install atropos
- pip install git+https://github.com/knights-lab/SHOGUN.git
Member Author

Is this line really necessary? I thought the conda install (conda install --yes -c knights-lab shogun) would take care of this.

Contributor

Yeah, that shouldn't have been there. It was from an old draft.

# Returns filepaths of new combined files
both_fp = []
samples = make_read_pairs_per_sample(fwd_seqs, rev_seqs, map_file)
for run_prefix, sample, f_fp, r_fp in samples:
Member Author

You could replace this loop with the following, which generates a single FNA file:

    output_fp = join(temp_path, 'merged.fna')
    # The context manager closes the file even if a read fails
    with open(output_fp, "w") as output:
        for run_prefix, sample, f_fp, r_fp in samples:
            # Reads are numbered per sample; both files share the counter
            counter = 0
            # Loop through forward file
            for seq_header, seq, qual_header, qual in _read_fastq_seqs(f_fp):
                # FASTA header lines start with '>'
                output.write(">%s_%d\n" % (sample, counter))
                output.write("%s\n" % seq)
                counter += 1
            # Loop through reverse file
            for seq_header, seq, qual_header, qual in _read_fastq_seqs(r_fp):
                output.write(">%s_%d\n" % (sample, counter))
                output.write("%s\n" % seq)
                counter += 1
    return output_fp
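
A self-contained sketch of the suggestion above, for illustration only. The `read_fastq_seqs` helper is a hypothetical stand-in for the plugin's `_read_fastq_seqs` (its implementation is not shown in this thread), and it assumes plain, uncompressed FASTQ files with exactly four lines per record.

```python
from os.path import join
from tempfile import mkdtemp


def read_fastq_seqs(filepath):
    """Yield (seq_header, seq, qual_header, qual) tuples from a FASTQ file.

    Hypothetical stand-in for the plugin's _read_fastq_seqs helper; assumes
    four lines per record and no compression.
    """
    with open(filepath) as fh:
        while True:
            line = fh.readline()
            if not line:
                break  # readline() returns '' only at end of file
            seq_header = line.rstrip('\n')
            seq = fh.readline().rstrip('\n')
            qual_header = fh.readline().rstrip('\n')
            qual = fh.readline().rstrip('\n')
            yield seq_header, seq, qual_header, qual


def generate_fna_file(temp_path, samples):
    """Write the forward and reverse reads of all samples to one FASTA file."""
    output_fp = join(temp_path, 'merged.fna')
    with open(output_fp, 'w') as output:
        for run_prefix, sample, f_fp, r_fp in samples:
            counter = 0  # restarts for every sample
            for fp in (f_fp, r_fp):
                for _, seq, _, _ in read_fastq_seqs(fp):
                    output.write(">%s_%d\n%s\n" % (sample, counter, seq))
                    counter += 1
    return output_fp
```

Calling `generate_fna_file(mkdtemp(), samples)` with `samples` as produced by `make_read_pairs_per_sample` would return the path of the merged file; records are named `<sample>_<n>`, with `n` counting forward reads first and then reverse reads.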

Member Author

@antgonza left a comment

A multiple comment question.

# Step 2 converting to fna
qclient.update_job_step(
    job_id, "Step 2 of 6: Converting to FNA for Shogun")
temp_path = os.environ['QC_SHOGUN_TEMP_DP']
Member Author

Do we really need this new folder? Or can we simply do something like:
temp_path = join(out_dir, 'tmp')
Note the difference: with QC_SHOGUN_TEMP_DP the folder is always created under that path, which is in home/where-the-software-was-installed, whereas with my suggestion it is created in the mount/job-folder.
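
A minimal sketch of this suggestion, assuming `out_dir` is the job's output directory; the `make_temp_dir` helper name is hypothetical and not part of the plugin.

```python
import os
from os.path import join


def make_temp_dir(out_dir):
    """Create (if needed) and return a 'tmp' folder inside the job's out_dir."""
    temp_path = join(out_dir, 'tmp')
    # exist_ok avoids an error if an earlier step already created the folder
    os.makedirs(temp_path, exist_ok=True)
    return temp_path
```

Because the folder lives inside the job folder, it is cleaned up or archived together with the rest of the job output and needs no extra configuration.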

Contributor

The idea was that we could change the base temp path by changing the environment variable without having to reload the whole plugin. However, it makes sense to simply create the temp directory in the output directory.

self.assertEqual(obs, exp)

def test_generate_fna_file(self):
    temp_path = os.environ['QC_SHOGUN_TEMP_DP']
Member Author

For tests you could use: from tempfile import mkdtemp to avoid having to set this variable.
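
A short sketch of what this could look like in a test case. The class and test names are hypothetical; only `mkdtemp`, `shutil.rmtree` and the `unittest` API are standard library.

```python
import shutil
import unittest
from os.path import isdir
from tempfile import mkdtemp


class TempDirExampleTests(unittest.TestCase):
    def setUp(self):
        # A fresh directory per test; no environment variable required
        self.temp_path = mkdtemp()
        # Remove the directory after the test, even if the test fails
        self.addCleanup(shutil.rmtree, self.temp_path, ignore_errors=True)

    def test_temp_path_exists(self):
        self.assertTrue(isdir(self.temp_path))
```

Each test then receives its own directory, so tests cannot interfere with each other through leftover files.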

Contributor

@tanaes left a comment

Looks pretty good, a few questions and suggestions.

I think in the current iteration (prior to separating out the taxonomy per-level biom generation as a separate step) you'll want to incorporate the 'regroup' command for the taxonomy tables as with the functional tables.

# Step 3 align
qclient.update_job_step(
    job_id, "Step 3 of 6: Aligning FNA with Shogun")
generate_shogun_align_commands(
Contributor

This should return a cmd, right?
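
One way to read this comment is that the helper should build and return the command so the caller and the tests can inspect it. The sketch below is only an illustration: the function signature is assumed, and the `shogun align` flags are modelled on the `shogun assign_taxonomy` call in the tests rather than confirmed by this thread.

```python
def generate_shogun_align_commands(input_fp, temp_dir, database, aligner,
                                   threads):
    """Build the alignment command and return it instead of running it.

    The argument names and the CLI flags are assumptions for illustration.
    """
    cmd = ('shogun align --aligner {aligner} --threads {threads} '
           '--database {database} --input {input_fp} --output {temp_dir}'
           ).format(aligner=aligner, threads=threads, database=database,
                    input_fp=input_fp, temp_dir=temp_dir)
    # Returning a list mirrors the list-style exp_cmd used in the tests
    return [cmd]
```

With a return value, the caller can pass the commands to whatever runs them, and a unit test can compare them against an expected list without invoking SHOGUN.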

qclient.update_job_step(
    job_id, "Step 5 of 6: Functional profile with Shogun")
levels = ['genus', 'species', 'level']
for level in levels:
Contributor

You'll also want to create redistributed BIOM files for the taxonomy tables at these levels, a la:

https://github.com/biocore/oecophylla/blob/251fa6c5a34014b919736acf3e003da13f09053b/oecophylla/taxonomy/taxonomy.rule#L378

Contributor

@semarpetrus The linked lines are the command you'll want to add to get the redistributed taxonomy tables. These are then what you'll convert to BIOM representation for the taxonomy portion of the command.


exp_cmd = [
('shogun assign_taxonomy --aligner bowtie2 '
'--database %sshogun --input %s/alignment.bowtie2.sam '
Contributor

I don't understand: why %sshogun?
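
For context on the question: with %-formatting, `%sshogun` is a `%s` placeholder immediately followed by the literal text `shogun`, so the result is a valid path only when the substituted value already ends in a path separator. A small illustration with a made-up database directory (the value used in the actual tests is not shown in this thread):

```python
from os.path import join

db_dir = '/databases/'  # hypothetical value; note the trailing slash

# '%s' is replaced by db_dir and the literal 'shogun' follows it directly
fragment = '--database %sshogun' % db_dir
print(fragment)

# Without the trailing slash the two parts run together
broken = '--database %sshogun' % '/databases'
print(broken)

# os.path.join adds the separator only when it is missing
safer = '--database %s' % join('/databases', 'shogun')
print(safer)
```

Building the path with `join` removes the dependency on how the directory value happens to be written.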

@codecov-io

Codecov Report

❗ No coverage uploaded for pull request base (master@77e5cd3).
The diff coverage is 86.66%.

@@            Coverage Diff            @@
##             master      #37   +/-   ##
=========================================
  Coverage          ?   86.84%           
=========================================
  Files             ?        5           
  Lines             ?      403           
  Branches          ?       78           
=========================================
  Hits              ?      350           
  Misses            ?       24           
  Partials          ?       29
| Impacted Files | Coverage Δ |
|---|---|
| qp_shotgun/qc_filter/qc_filter.py | 94.59% <100%> (ø) |
| qp_shotgun/shogun/utils.py | 78.94% <78.94%> (ø) |
| qp_shotgun/shogun/shogun.py | 90.16% <90.16%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 77e5cd3...05170e0.

@antgonza changed the title from "WIP: adding shogun" to "adding shogun" on Apr 20, 2018
@antgonza
Member Author

The error is a timeout from the humann2 tests, which we will move to a new repo. Thus, @semarpetrus, @tanaes and I decided it is OK to merge.

@antgonza merged commit b3e76ec into qiita-spots:master on Apr 20, 2018
5 participants