Add ability to join paired reads #12

Closed
johnchase opened this issue Oct 3, 2016 · 6 comments
Comments

@johnchase

Similar to running dada2 in R, it would be nice to be able to join paired reads with the q2 plugin.

@benjjneb
Collaborator

benjjneb commented Oct 5, 2016

Is there currently a QIIME object for paired fastq files (or lists of paired fastq files)?

@johnchase
Author

I don't believe that there is. @ebolyen can you confirm?

@ebolyen
Member

ebolyen commented Oct 6, 2016

There is, if I understand the question correctly: SampleData[PairedEndSequencesWithQuality], which uses SingleLanePerSamplePairedEndFastqDirFmt as its backing directory format.

@benjjneb
Collaborator

benjjneb commented Oct 6, 2016

If that exists already, then adding a paired end workflow should be straightforward.

Is there documentation of SampleData[PairedEndSequencesWithQuality] objects? For example, can it be assumed there are always forward and reverse files for each sample?

@ebolyen
Member

ebolyen commented Oct 6, 2016

No real documentation, and we can't yet enforce that there exists a forward and reverse for each sample, but I believe it is our goal to be able to validate that property. That particular directory format does contain a manifest file which maps the sample id to the filepath and direction.

relevant lines for format
manifest definition
q2-demux which manipulates these

These parts of the API (multi-file directory formats) are still very rough around the edges, but we're thinking about ways to make it easier.
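
To illustrate the property discussed above (every sample having both a forward and a reverse entry in the manifest), here is a minimal Python sketch of such a check. It is not part of QIIME 2 or q2-dada2. The function name `find_unpaired_samples`, the column names `sample-id`, `filename`, and `direction`, and the example file names are all assumptions made for illustration; the only detail taken from the thread is that the manifest maps a sample id to a filepath and a direction.

```python
import csv
import io
from collections import defaultdict


def find_unpaired_samples(manifest_text):
    """Return {sample_id: [missing directions]} for incomplete samples.

    Assumption: the manifest is CSV with 'sample-id', 'filename', and
    'direction' columns, and direction is 'forward' or 'reverse'.
    """
    seen = defaultdict(set)
    for row in csv.DictReader(io.StringIO(manifest_text)):
        seen[row["sample-id"]].add(row["direction"])

    required = {"forward", "reverse"}
    return {
        sample_id: sorted(required - directions)
        for sample_id, directions in seen.items()
        if required - directions
    }


# Hypothetical manifest: sample s2 has no reverse read file.
example_manifest = """sample-id,filename,direction
s1,s1_R1.fastq.gz,forward
s1,s1_R2.fastq.gz,reverse
s2,s2_R1.fastq.gz,forward
"""

unpaired = find_unpaired_samples(example_manifest)
print(unpaired)  # {'s2': ['reverse']}
```

An empty result would mean every listed sample has both directions, which is the condition a paired-end workflow would want to hold before processing.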

benjjneb added a commit that referenced this issue Jan 16, 2017
The script run_dada_faster_paired.R was added, and implements paired
end functionality. It inputs directories of demultiplexed forward and
reverse reads as they come off the Illumina sequencer (i.e. F/R are in
matched order for each sample).

Output is the sequence table. Multithreading and error estimation from
a subset of the data are part of this workflow.

This solves #12 from the R side.
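
The commit message above describes taking directories of demultiplexed forward and reverse reads and treating them as matched per sample. As a rough sketch of that pairing step only, and not a port of run_dada_faster_paired.R, the following Python matches forward and reverse file names by sample. The helper names `sample_name` and `pair_reads`, the rule that the sample name is the text before the first underscore, and the example file names are assumptions for illustration.

```python
def sample_name(filename):
    # Assumption: the sample name is everything before the first underscore,
    # e.g. "A_S1_L001_R1_001.fastq.gz" -> "A".
    return filename.split("_")[0]


def pair_reads(forward_files, reverse_files):
    """Return (sample, forward_file, reverse_file) tuples sorted by sample.

    Raises ValueError if any sample lacks a forward or a reverse file.
    """
    fwd = {sample_name(f): f for f in forward_files}
    rev = {sample_name(f): f for f in reverse_files}

    unmatched = sorted(set(fwd) ^ set(rev))
    if unmatched:
        raise ValueError(f"samples without both reads: {unmatched}")

    return [(name, fwd[name], rev[name]) for name in sorted(fwd)]


# Hypothetical file names; input order does not need to match across lists.
pairs = pair_reads(
    ["A_S1_L001_R1_001.fastq.gz", "B_S2_L001_R1_001.fastq.gz"],
    ["B_S2_L001_R2_001.fastq.gz", "A_S1_L001_R2_001.fastq.gz"],
)
for name, forward, reverse in pairs:
    print(name, forward, reverse)
```

Pairing by an explicit sample key rather than by list position makes a missing file fail loudly instead of silently shifting every later pair.
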
@benjjneb
Collaborator

benjjneb commented Feb 7, 2017

Added in a1857e5 and b66432d

Assumes dada2 1.2+

@benjjneb benjjneb closed this as completed Feb 7, 2017

3 participants