refactor: update transcript list #168

Merged
merged 8 commits into from
Oct 4, 2024

balajtimate
Collaborator

@balajtimate balajtimate commented Oct 2, 2024

Description

  • Update transcripts.fasta.gz to include transcript sequences from an updated set of small subunit ribosomal protein (RP) genes (RPSA, RPS2, RPS3, RPS9, RPS11, RPS13, RPS14, RPS15, RPS15A, RPS16, RPS18, RPS20, RPS23, RPS29) that are shared across the domains of Bacteria, Archaea and Eukaryotes, so that Bacteria can also be included in the future. The RP genes were selected based on https://bangroup.ethz.ch/research/nomenclature-of-ribosomal-proteins.html. Human genes and their orthologs were downloaded from Ensembl main (Vertebrates), Ensembl Plants, Ensembl Fungi and Ensembl Metazoa. The set now consists of 14410 transcripts from 385 organisms.
  • Update the version to the correct current version, v0.11.0
  • Add a pylint disable statement for too-many-positional-arguments in get_read_layout.py (a check added in pylint v3.3)
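
For readers unfamiliar with the last item, here is a minimal sketch of what such a disable statement can look like. The function `describe_read` and its parameters are hypothetical and are not taken from get_read_layout.py; the sketch only assumes pylint's default limit of five positional arguments.

```python
"""Sketch: silencing pylint's too-many-positional-arguments check."""


# Hypothetical helper, not the project's code. With six positional
# parameters it exceeds pylint's default limit of five, so pylint >= 3.3
# would report too-many-positional-arguments (and too-many-arguments)
# unless the checks are disabled on the "def" line.
def describe_read(  # pylint: disable=too-many-arguments,too-many-positional-arguments
    name,
    length,
    adapter,
    orientation,
    relationship,
    source,
):
    """Return a one-line summary of a read."""
    return (
        f"{name}: {length} nt, adapter={adapter}, "
        f"orientation={orientation}, relationship={relationship}, "
        f"source={source}"
    )


if __name__ == "__main__":
    print(describe_read("read_1", 75, "AGATCGGAAG", "SF", "first_mate", "hsapiens"))
```

An alternative that avoids the positional-arguments warning without a disable comment is to make some parameters keyword-only by placing a bare `*` in the signature, at the cost of changing how callers invoke the function.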

Closes #108
Closes #56

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Checklist

Please carefully read these items and tick them off if the statements are true
or do not apply.

  • I have performed a self-review of my own code
  • My code follows the existing coding style, lints and generates no new
    warnings
  • I have added type annotations to all function/method signatures, and I
    have added type annotations for any local variables that are non-trivial,
    potentially ambiguous or might otherwise benefit from explicit typing.
  • I have commented my code in hard-to-understand areas
  • I have added Google-style docstrings to all new modules, classes,
    methods/functions or updated previously existing ones
  • I have added tests that prove my fix is effective or that my feature
    works
  • New and existing unit tests pass locally with my changes and I have not
    reduced the code coverage relative to the previous state
  • I have updated any sections of the app's documentation that are affected
    by the proposed changes

If for some reason you are unable to tick off all boxes, please leave a
comment explaining the issue you are facing so that we can work on it
together.

@balajtimate
Collaborator Author

balajtimate commented Oct 2, 2024

With the updated transcript list and the new set of SRA samples (which contains 720 samples), we have the best result for library source inference yet.

I think with this we can close #108 and #56 (improving library source inference with a consistent set of RP genes).

@balajtimate
Collaborator Author

Also, the checks are failing because of an ongoing issue with setup-miniconda caused by the mamba v2 release. A fix seems to be on the way; should we wait for it or pin mamba to 1.5.10, @uniqueg? That worked in the previous commit.

@uniqueg
Member

uniqueg commented Oct 3, 2024

If you think a fix will come within the next few days, then wait - otherwise pin.

codecov bot commented Oct 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (51957ac) to head (a75cfd7).
Report is 1 commits behind head on dev.

Additional details and impacted files
@@            Coverage Diff            @@
##               dev      #168   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           13        13           
  Lines         1143      1142    -1     
=========================================
- Hits          1143      1142    -1     


@balajtimate balajtimate merged commit d3cbe0e into dev Oct 4, 2024
19 checks passed
@balajtimate balajtimate deleted the update_transcripts branch October 4, 2024 07:32
Successfully merging this pull request may close these issues.

  • Analyze organism/source
  • Improve library source inference
2 participants