Merging changes on develop #1874

Merged
merged 36 commits into from
May 23, 2024

Conversation

GavinHuttley
Collaborator

No description provided.

GavinHuttley and others added 30 commits May 15, 2024 17:43
[NEW] implemented using slots, so it takes less memory
    than the dict it replaces. It supports dictionary
    style indexing, so old code will work. It also
    aliases certain old keys.

[CHANGED] updated return type hints to reflect the new class

[CHANGED] updated tests to reflect these changes
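
The idea in the commit message above (a slots-based class that still accepts dictionary-style indexing and aliases some old keys) can be sketched roughly as follows. This is a minimal illustration, not the class added in this PR: the class name, field names and alias table are all hypothetical.

```python
class Record:
    """Hypothetical slots-based record; the field names are illustrative only."""

    # __slots__ avoids a per-instance __dict__, which is where the memory saving comes from
    __slots__ = ("seqid", "biotype", "start", "stop", "strand")

    # hypothetical mapping of old dict keys onto the new attribute names
    _aliases = {"SeqID": "seqid", "Type": "biotype", "Start": "start", "End": "stop"}

    def __init__(self, seqid=None, biotype=None, start=None, stop=None, strand=None):
        self.seqid = seqid
        self.biotype = biotype
        self.start = start
        self.stop = stop
        self.strand = strand

    def _resolve(self, key):
        # translate an old key to its attribute name, rejecting unknown keys
        key = self._aliases.get(key, key)
        if key not in self.__slots__:
            raise KeyError(key)
        return key

    def __getitem__(self, key):
        return getattr(self, self._resolve(key))

    def __setitem__(self, key, value):
        setattr(self, self._resolve(key), value)


rec = Record(seqid="chr1", biotype="gene", start=10, stop=50, strand="+")
rec["End"] = 60  # an old-style key still works and writes to rec.stop
```

Code written against the old dict keeps working through `__getitem__` and `__setitem__`, while new code can use plain attribute access.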
[CHANGED] this frees parser from having to always start at
    the top of a file to figure out the format version.
[CHANGED] previously, this was reversed if a feature was on the minus
    strand.
[CHANGED] this was a private method on GffAnnotationDb but has been
    made a function to facilitate chunked reading of Gff files.
[CHANGED] iter_line_blocks() now supports num_lines=None, which results
    in all lines being returned.
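
As a rough sketch of the `num_lines=None` behaviour described above: the function below is a hypothetical stand-in that works on any iterable of lines, and its name and signature are assumptions rather than those of `iter_line_blocks()` itself.

```python
from itertools import islice


def iter_blocks(lines, num_lines=None):
    """Yield lists of at most num_lines lines.

    When num_lines is None, everything is returned as a single block.
    Hypothetical helper for illustration only.
    """
    it = iter(lines)
    if num_lines is None:
        block = list(it)
        if block:
            yield block
        return

    # islice pulls the next num_lines items; an empty list ends the loop
    while block := list(islice(it, num_lines)):
        yield block
```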
[CHANGED] just calls bound sqlitedb's close method
[CHANGED] incomplete records in a GFF database can be updated
…tations

[CHANGED] we achieve a ~75% reduction in RAM for creating a GffAnnotationDb
    for the human genome by combining iter_line_blocks(), which uses
    iter_splitlines(), merged_gff_records() and
    GffAnnotationDb.update_record_spans(). The
    load_annotations(lines_per_block=500_000) argument controls how many lines
    are read before the insert is done. We track all record names that have
    been inserted and update their existing spans.
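
The block-wise strategy described above (read a block, merge records within it, insert new names, update the spans of names already inserted) can be sketched with the standard library alone. Everything here is an assumption made for illustration: the table schema, the function name, and the simplification of each record's spans to a single (start, stop) extent. It is not the `GffAnnotationDb` implementation.

```python
import sqlite3
from itertools import islice


def load_in_blocks(records, lines_per_block=2):
    """Insert (name, start, stop) tuples into an in-memory db, one block at a time."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE feature (name TEXT PRIMARY KEY, start INTEGER, stop INTEGER)"
    )
    seen = set()  # names already written, so later blocks update instead of insert
    it = iter(records)
    while block := list(islice(it, lines_per_block)):
        # merge records sharing a name within this block
        merged = {}
        for name, start, stop in block:
            if name in merged:
                s, e = merged[name]
                merged[name] = (min(s, start), max(e, stop))
            else:
                merged[name] = (start, stop)

        for name, (start, stop) in merged.items():
            if name in seen:
                # extend the stored extent; two-argument MIN/MAX are scalar in SQLite
                db.execute(
                    "UPDATE feature SET start = MIN(start, ?), stop = MAX(stop, ?) "
                    "WHERE name = ?",
                    (start, stop, name),
                )
            else:
                db.execute("INSERT INTO feature VALUES (?, ?, ?)", (name, start, stop))
                seen.add(name)
        db.commit()
    return db
```

Only one block of parsed records is held in memory at a time, plus the set of names seen so far, which is the trade-off the `lines_per_block` argument controls.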
[NEW] builds indexes for standard columns, biotype, seqid, start, etc.
[NEW] thanks to comment in code review by khiron, added
    # codacy:ignore[sql-injection] - limited SQL injection exposure
    to silence this codacy warning. As this is purely in a test,
    it doesn't seem to have much risk.
[CHANGED] seems the comment was incorrect
[CHANGED] this is from the bandit tool, which indicates B608
    as the error for hardcoded_sql_expressions
Improve performance of annotation db creation, querying
NEW: abstract base class for views, fixes #1865
fredjaya and others added 2 commits May 22, 2024 16:22
[NEW] MolType.is_compatible_alphabet() checks that the characters
     in an alphabet match those in one of the members of the
     MolType.alphabets. The argument strict=False (the default)
     means the exact ordering of elements must match.

[NEW] AlphabetGroup.iter_alphabets() yields individual alphabets
     from the group.
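
The two kinds of comparison these messages distinguish (the same set of characters versus the same characters in the same order) and the iteration over a group's members can be sketched as below. The helper names are hypothetical and operate on plain strings, not on cogent3 alphabet objects; how the `strict` argument selects between the two comparisons is left to the actual method.

```python
def same_characters(a, b):
    """True if both alphabets contain the same characters, in any order."""
    return set(a) == set(b)


def same_order(a, b):
    """True if both alphabets contain the same characters in the same order."""
    return tuple(a) == tuple(b)


def iter_alphabets(group):
    """Yield each non-empty alphabet from a hypothetical group (any iterable)."""
    for alpha in group:
        if alpha:
            yield alpha


def compatible_with_any(alphabet, group, compare):
    """True if alphabet matches at least one member of group under compare."""
    return any(compare(alphabet, member) for member in iter_alphabets(group))
```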
@coveralls
Collaborator

coveralls commented May 22, 2024

Pull Request Test Coverage Report for Build 9200124195

Details

  • 248 of 265 (93.58%) changed or added relevant lines in 5 files are covered.
  • 10 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.04%) to 91.91%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| src/cogent3/core/sequence.py | 102 | 103 | 99.03% |
| src/cogent3/core/moltype.py | 6 | 8 | 75.0% |
| src/cogent3/parse/gff.py | 96 | 110 | 87.27% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| src/cogent3/parse/gff.py | 10 | 84.81% |

| Totals | Coverage Status |
|---|---|
| Change from base Build 9090395347 | -0.04% |
| Covered Lines | 30278 |
| Relevant Lines | 32943 |

💛 - Coveralls

DOC: Add IndelMap param docstring
@GavinHuttley GavinHuttley merged commit b832dd4 into seq-collections-refactor May 23, 2024
31 of 33 checks passed

4 participants