Commit
desc for deprecated software (#3339)
* adding the software description to the GUI for deprecated software
* clean docs
* add workflow not active message
* Update qiita_pet/support_files/doc/source/processingdata/qp-fastp-minimap2.rst

Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>

---------

Co-authored-by: Charles Cowart <42684307+charles-cowart@users.noreply.github.com>
Co-authored-by: Daniel McDonald <d3mcdonald@eng.ucsd.edu>
1 parent 35d9335, commit 30eb772
Showing 8 changed files with 60 additions and 4 deletions.
37 changes: 37 additions & 0 deletions
qiita_pet/support_files/doc/source/processingdata/qp-fastp-minimap2.rst
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
Adapter and host filtering | ||
========================== | ||
|
||
At the end of August 2023, we discovered that the parameters used by | ||
qp-fastp-minimap2 did not trigger adapter filtering. By default, fastp | ||
autodetects and filters adapters for single-end data, but it does not perform | ||
these operations on paired-end data. We did not expect this behavior. We | ||
discovered it while manually assessing replicated sequences, which BLAST | ||
searches against NT reported to be adapters. | ||
|
||
Adapter filtering for paired-end data with fastp requires either specifying the | ||
exact adapters to remove (i.e., no autodetection) or explicitly passing | ||
``--detect_adapter_for_pe``. Qiita previously indicated to users that the | ||
qp-fastp-minimap2 plugin was performing adapter autodetection and filtering. | ||
However, because this flag was not specified, that behavior did not occur. | ||
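
The two options described above can be sketched as a small command builder. This is only an illustration, not the plugin's actual code: the function name ``build_fastp_pe_command`` and the file paths are hypothetical, while the fastp options used (``--in1``, ``--in2``, ``--out1``, ``--out2``, ``--adapter_fasta``, ``--detect_adapter_for_pe``) are standard fastp command-line flags.

```python
from typing import List, Optional


def build_fastp_pe_command(
    in1: str,
    in2: str,
    out1: str,
    out2: str,
    adapter_fasta: Optional[str] = None,
) -> List[str]:
    """Build a fastp argument list for paired-end adapter filtering.

    Hypothetical helper for illustration; not taken from qp-fastp-minimap2.
    """
    cmd = ["fastp", "--in1", in1, "--in2", in2, "--out1", out1, "--out2", out2]
    if adapter_fasta is not None:
        # Option 1: supply the exact adapters to remove (no autodetection).
        cmd += ["--adapter_fasta", adapter_fasta]
    else:
        # Option 2: explicitly request adapter autodetection for paired-end data.
        cmd.append("--detect_adapter_for_pe")
    return cmd
```

Calling ``build_fastp_pe_command("R1.fastq.gz", "R2.fastq.gz", "R1.out.fastq.gz", "R2.out.fastq.gz")`` yields a command that ends in ``--detect_adapter_for_pe``. Passing ``adapter_fasta="adapters.fna"`` switches to the explicit adapter list instead.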
|
||
In the metagenomic dataset in which the adapters were discovered, we observed a | ||
few highly replicated sequences that were assigned to a few genomes in RS210. | ||
The coverage of those genomes, using all metagenomic short reads, was constrained | ||
to very specific regions. The replicated sequences exhibited high identity to | ||
known adapters. As such, we suspect the replicated sequences we observed were | ||
adapters. We suspect that either the observed genomes themselves suffer from | ||
adapter contamination, or the constructs used in the samples we examined were | ||
derived from real organisms. Although we cannot distinguish between these | ||
possibilities definitively in the data we examined, in either case these short | ||
reads are likely artifactual. | ||
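
One way to quantify coverage that is "constrained to very specific regions" is the breadth of coverage: the fraction of genome positions touched by at least one read. The sketch below is a generic illustration and not the analysis performed for the dataset described here. The function names, the half-open interval convention, and the example coordinates are all assumptions.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open alignment interval [start, end)


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into disjoint covered regions."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def coverage_breadth(intervals: Iterable[Interval], genome_length: int) -> float:
    """Return the fraction of genome positions covered by at least one interval."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / genome_length
```

A genome with many assigned reads but a very small breadth, where the merged regions collapse to a handful of short windows, matches the pattern described above.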
|
||
For the dataset we examined, removal of these false positives was important | ||
for the biological interpretation of the results. However, whether the removal | ||
is important likely depends on the dataset and question. | ||
|
||
qp-fastp-minimap2 has been updated to perform adapter filtering on paired-end data. | ||
The fastp autodetection is compile-time limited to `the first 256k sequences <https://github.com/OpenGene/fastp/blob/7784d047fdf0a8df4211967156f5c97920c6d2e8/src/evaluator.cpp#L410-L417>`_. | ||
Because of this, we opted for the more conservative approach of not relying on | ||
autodetection; instead, we now test all adapters that fastp is aware of. Specifically, | ||
we now provide fastp with a known-adapters FASTA, which is a serialized representation | ||
of its `known adapter list <https://github.com/OpenGene/fastp/blob/7784d047fdf0a8df4211967156f5c97920c6d2e8/src/knownadapters.h#L11>`_. | ||
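
Serializing an adapter list to FASTA could look roughly like the sketch below. It is not the script used to produce the file shipped with the plugin. It assumes the list is available as a mapping from adapter sequence to adapter name, and the two entries shown are widely published Illumina TruSeq adapter sequences used purely as placeholders. The names ``EXAMPLE_ADAPTERS`` and ``adapters_to_fasta`` are hypothetical.

```python
from typing import Dict

# Placeholder subset for illustration only (assumption: sequence -> name mapping).
EXAMPLE_ADAPTERS: Dict[str, str] = {
    "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA": "Illumina TruSeq Adapter Read 1",
    "AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT": "Illumina TruSeq Adapter Read 2",
}


def adapters_to_fasta(adapters: Dict[str, str]) -> str:
    """Serialize a sequence -> name mapping as FASTA text, one record per adapter."""
    records = []
    for sequence, name in adapters.items():
        # Replace spaces so the whole name is kept as the FASTA record identifier.
        header = name.replace(" ", "_")
        records.append(f">{header}\n{sequence}\n")
    return "".join(records)
```

Writing the returned text to a file gives a FASTA that can be passed to fastp through its ``--adapter_fasta`` option.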
|
||
The new command is named: `Adapter and host filtering v2023.12`. |