
Merging of paired reads when no assembly is performed

@iquasere iquasere released this 30 Jan 09:36

Previously, MOSCA called genes directly from the preprocessed reads.
It now merges paired-end reads first, and then calls genes on the merged reads.
During gene calling, MOSCA still treats the data as reads (-complete=0), not as complete genomes (-complete=1).

Update on sortmerna functions

SortMeRNA databases have been updated, and are now provided as a tar file containing multiple database files. Each of these databases can be used separately for a specific type of search. MOSCA now provides the sortmerna_database parameter, which sets which database will be used:

  • if fast, MOSCA will use the smr_v4.3_fast_db.fasta database.
  • if default, MOSCA will use the smr_v4.3_default_db.fasta database.
  • if sensitive, MOSCA will use the smr_v4.3_sensitive_db.fasta database.
  • if sensitive_with_rfam, MOSCA will use the smr_v4.3_sensitive_db_rfam_seeds.fasta database.

Only one database file can be used at a time.
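
The mapping above can be sketched as a simple lookup table. This is only an illustration, not MOSCA's actual code: the names `SORTMERNA_DATABASES` and `database_file` are hypothetical, and only the parameter values and database file names are taken from the list above.

```python
# Hypothetical lookup table; keys are the accepted sortmerna_database values,
# values are the database file names listed in these release notes.
SORTMERNA_DATABASES = {
    "fast": "smr_v4.3_fast_db.fasta",
    "default": "smr_v4.3_default_db.fasta",
    "sensitive": "smr_v4.3_sensitive_db.fasta",
    "sensitive_with_rfam": "smr_v4.3_sensitive_db_rfam_seeds.fasta",
}


def database_file(sortmerna_database: str) -> str:
    """Return the single database file selected by the parameter value."""
    try:
        return SORTMERNA_DATABASES[sortmerna_database]
    except KeyError:
        valid = ", ".join(SORTMERNA_DATABASES)
        raise ValueError(
            f"Unknown sortmerna_database '{sortmerna_database}'; "
            f"expected one of: {valid}"
        ) from None
```

Returning exactly one file name per value reflects the rule that only one database file can be used at a time.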

minimum_read_length parameter split for MG and MT

The minimum length of reads kept for further analysis can now be set separately for MG and MT data, with the minimum_mg_read_length and minimum_mt_read_length parameters.
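
As a rough sketch of how the two parameters might be looked up per data type: the helper `minimum_read_length`, the plain-dict config, and the numeric values below are all assumptions made for illustration. Only the two parameter names come from this release.

```python
def minimum_read_length(config: dict, data_type: str) -> int:
    """Pick the length threshold for 'mg' or 'mt' data (hypothetical helper)."""
    key = f"minimum_{data_type.lower()}_read_length"
    if key not in config:
        raise KeyError(f"Missing parameter: {key}")
    return int(config[key])


# Illustrative values only; these are not MOSCA's defaults.
config = {"minimum_mg_read_length": 100, "minimum_mt_read_length": 50}

mg_threshold = minimum_read_length(config, "MG")
mt_threshold = minimum_read_length(config, "MT")
```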

Added minimum_envs folder and contents

It holds the commands and resources needed to update the envs when required.

Also, some fixes

  • Converting readcounts (for MG and MT) to int was turning them all to zeros (because they are normalized). MOSCA now keeps them as float.
  • Blocked printing of MOSCA's TXT logo. It fails in the tests for a reason that has not been identified yet.
  • Fix on Summary Report: rows now have information for both "Name" and "Sample" levels (before, there were separate rows for "Name" and for "Sample").
  • Another fix on Summary Report: annotated genes were not being counted properly.
  • When not performing assembly, General Report was not importing the readcounts correctly. Now it does.
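
The readcounts fix in the first item can be illustrated with a small sketch. The values are made up, and it assumes the normalized readcounts fall below 1, which is the case where an int conversion loses everything:

```python
# Illustrative normalized readcounts (made-up values, all below 1).
normalized = [0.25, 0.5, 0.031]

as_int = [int(value) for value in normalized]      # truncates toward zero
as_float = [float(value) for value in normalized]  # values are preserved

print(as_int)    # [0, 0, 0]
print(as_float)  # [0.25, 0.5, 0.031]
```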