Change add_dna_sequences to use get_seq #1867

Merged
martinkim0 merged 4 commits into main from scbasset-setup on Jan 26, 2023
martinkim0 commented Jan 25, 2023

add_dna_sequence fails on some datasets, such as the Buenrostro 2018 dataset used in scBasset batch correction. See notebook: https://colab.research.google.com/drive/1cTkQqQGmbTAV7jtFiV5bK7sr7nD0g3rJ?usp=sharing. The new code outputs the same DNA sequence for the initial tutorial.

codecov bot commented Jan 25, 2023

Codecov Report

Base: 90.42% // Head: 90.42% // No change to project coverage 👍

Coverage data is based on head (1cd2dc1) compared to base (508ed0b).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1867   +/-   ##
=======================================
  Coverage   90.42%   90.42%           
=======================================
  Files         141      141           
  Lines       11058    11058           
=======================================
  Hits         9999     9999           
  Misses       1059     1059           
Impacted Files Coverage Δ
scvi/data/_preprocessing.py 76.50% <100.00%> (ø)
scvi/external/scbasset/_module.py 98.64% <100.00%> (ø)


@@ -357,7 +357,6 @@ def _concat_anndata(multi_anndata, other):


def _dna_to_code(nt: str) -> int:
    nt = nt.upper()
Contributor Author


Not necessary, as sequences are uppercased in add_dna_sequences.
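
To illustrate the reasoning in this comment, here is a minimal sketch. The names `dna_to_code_sketch` and `encode_sequence_sketch`, the A/C/G/T to 0-3 mapping, and the use of -1 for any other character are all assumptions made for illustration; this is not the scvi-tools implementation. It shows why uppercasing the whole sequence once in the caller makes a per-nucleotide `.upper()` redundant.

```python
# Hypothetical mapping, for illustration only; not the scvi-tools code.
_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def dna_to_code_sketch(nt: str) -> int:
    """Map one uppercase nucleotide to an integer; -1 marks anything else (assumed convention)."""
    return _CODES.get(nt, -1)


def encode_sequence_sketch(seq: str) -> list[int]:
    """Uppercase the whole sequence once, then encode each nucleotide."""
    seq = seq.upper()  # single pass; a per-nucleotide .upper() is then unnecessary
    return [dna_to_code_sketch(nt) for nt in seq]
```

In this sketch the encoder assumes uppercase input, so calling it directly on a lowercase character falls through to the -1 branch; the caller is responsible for normalizing case.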

Comment on lines +446 to 448
block_mid = (chrom_df[start_var_key] + chrom_df[end_var_key]) // 2
block_starts = block_mid - (seq_len // 2)
block_ends = block_starts + seq_len
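
The arithmetic in the three lines above can be checked with a small scalar sketch. The function name `centered_window` and the example coordinates are hypothetical; the PR code operates on DataFrame columns, while this version uses plain integers to keep the example self-contained.

```python
def centered_window(start: int, end: int, seq_len: int) -> tuple[int, int]:
    """Return (block_start, block_end) for a window of seq_len centered on a region."""
    block_mid = (start + end) // 2            # integer midpoint of the region
    block_start = block_mid - (seq_len // 2)  # shift left by half the window
    block_end = block_start + seq_len         # window length is always seq_len
    return block_start, block_end


# Example values are made up for illustration.
window = centered_window(start=10_000, end=10_500, seq_len=1_344)
print(window)  # (9578, 10922)
```

The window length equals `seq_len` regardless of how wide the original region is. This sketch does no bounds checking, so a region close to the start of a chromosome can yield a negative `block_start`.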

martinkim0 merged commit 928ce6c into main Jan 26, 2023
adamgayoso deleted the scbasset-setup branch January 26, 2023 16:13