Excluding chromosomes from analysis #5

Open
hwm08 opened this issue Jun 14, 2023 · 1 comment

hwm08 commented Jun 14, 2023

I am trying to run cfDNAPro on data aligned to a "synthetic" genome that contains pUC and lambda sequences, which I use to check methylation conversion.
I only want to look at reads aligned to chromosomes 1 to 22, X and Y, but I cannot get the BAM file to read into cfDNAPro:

read_bam_insert_metrics(bamfile = file.bam, genome_label = "hg38-NCBI", chromosome_to_keep = append(1:22, c("X", "Y")))

bamfile was supplied.
Reading bam into galp...
Curating seqnames and strand information...
Removing outward facing fragments ...
Correcting start and end coordinates of fragments ...
Error in .normarg_seqlengths(value, seqnames(x)) :
the length of the supplied 'seqlengths' vector must be equal to the
number of sequences
Calls: read_bam_insert_metrics ... seqlengths<- -> seqlengths<- -> .normarg_seqlengths
In addition: Warning message:
In .merge_two_Seqinfo_objects(x, y) :
Each of the 2 combined objects has sequence levels not in the other:

  • in 'x': KI270728.1, KI270727.1, KI270442.1, KI270729.1, GL000225.1, KI270743.1, GL000008.2, GL000009.2, KI270747.1, KI270722.1, GL000194.1, KI270742.1, GL000205.2, GL000195.1, KI270736.1, KI270733.1, GL000224.1, GL000219.1, KI270719.1, GL000216.2, KI270712.1, KI270706.1, KI270725.1, KI270744.1, KI270734.1, GL000213.1, GL000220.1, KI270715.1, GL000218.1, KI270749.1, KI270741.1, GL000221.1, KI270716.1, KI270731.1, KI270751.1, KI270750.1, KI270519.1, GL000214.1, KI270708.1, KI270730.1, KI270438.1, KI270737.1, KI270721.1, KI270738.1, KI270748.1, KI270435.1, GL000208.1, KI270538.1, KI270756.1, KI270739.1, KI270757.1, KI270709.1, KI270746.1, KI270753.1, KI270589.1, KI270726.1, KI270735.1, KI270711.1, KI270745.1, KI270714.1, KI270732.1, KI270713.1, KI270754.1, KI270710.1, KI270717.1, KI270724.1, KI270720.1, KI270723.1, KI270718.1, KI270317.1, KI270740.1, KI270755.1, KI270707.1, KI270579.1, KI270752.1, KI270512.1, KI27032 [... truncated]
    Execution halted

I get the same error when I subset the BAM file to only the relevant chromosomes with samtools view and then re-index it. I believe this is because the header retains the old chromosome names:


9 138394717 1374698 0
MT 16569 0 0
X 156040895 1739164 0
Y 57227415 7892 0
KI270728.1 1872759 0 0
KI270727.1 448248 0 0
KI270442.1 392061 0 0

Is there a way around this?
many thanks!
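
A minimal sketch for checking which sequences are still present after subsetting. It assumes the listing above has the four-column layout of `samtools idxstats` (name, length, mapped, unmapped); the post does not say which command produced it. The function name `list_extra_contigs` is hypothetical. It prints every sequence that is not 1 to 22, X or Y, together with its mapped-read count, so sequences with 0 reads that are still listed point to header entries that survived the subsetting.

```shell
# Hypothetical helper: read idxstats-style rows (name, length, mapped, unmapped)
# from stdin and print the sequences that are NOT 1..22, X or Y,
# followed by their mapped-read count.
list_extra_contigs() {
    awk '
        BEGIN {
            # Build the set of sequence names we want to keep.
            for (i = 1; i <= 22; i++) keep[i ""] = 1
            keep["X"] = 1
            keep["Y"] = 1
        }
        # Skip the "*" row that idxstats prints for unplaced reads.
        $1 != "*" && !($1 in keep) { print $1 "\t" $3 }
    '
}

# Usage (assumes samtools is installed and subset.bam is indexed):
#   samtools idxstats subset.bam | list_extra_contigs
```

If this prints nothing, the BAM header already lists only the sequences of interest.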


hwm08 commented Jun 14, 2023

Sorry, I stand corrected. This boils down to
chromosome_to_keep = append(1:22, c("X", "Y")) not working.
