Using demuxalot for ATAC data #17

Open
himanshiarora7 opened this issue Nov 1, 2021 · 3 comments
Labels
question Further information is requested

Comments

@himanshiarora7

I have recently been working with ATAC data and wanted to ask whether demuxalot can be used for it.
Each time I try to run it on my data, I get the following error:

KeyError: "tag 'NH' not present"

If demuxalot can be used for ATAC data, is there a way to solve this issue?

@arogozhnikov
Owner

Hi @himanshiarora7

It should be entirely possible to use demuxalot for ATAC data, though I have never tested that.

NH is the tag that records the number of genomic alignments for a read in STAR aligner output.

Please sample ~10 random alignments from your BAM file and post them here so I can look at the available alignment tags.
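One way to collect such a sample is reservoir sampling over the BAM with pysam. This is a minimal sketch, assuming pysam is installed; `possorted_bam.bam` is a hypothetical path, and the helper names `reservoir_sample` and `sample_alignments` are illustrative and not part of demuxalot.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(items: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Return k items drawn uniformly at random from a stream of unknown length."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, item in enumerate(items):
        if i < k:
            # fill the reservoir with the first k items
            sample.append(item)
        else:
            # replace a reservoir slot with probability k / (i + 1)
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def sample_alignments(bam_path: str, k: int = 10) -> List[str]:
    """Sample k alignments from a BAM file and return them as SAM-formatted lines."""
    import pysam  # imported lazily so reservoir_sample works without pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        # until_eof=True reads the file sequentially and does not need an index
        records = (read.to_string() for read in bam.fetch(until_eof=True))
        return reservoir_sample(records, k)


# Usage (hypothetical path):
# for line in sample_alignments("possorted_bam.bam"):
#     print(line)
```

Each printed line ends with the optional tags (for example `AS:i:...`), which is the part that matters for deciding which filters a read-parsing callback can rely on.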

@himanshiarora7
Author

Thank you for your response.
I am attaching sample alignments and the header from my ATAC BAM file.

check_atac_bam.txt
header_bam.txt

@arogozhnikov
Owner

Hi, I think that after a recent update you can actually use demuxalot for scATAC demultiplexing, though with some trickery.

You'll need to provide a custom callback for parsing reads; I think this one should be optimal:

import random
from typing import Optional, Tuple

from pysam import AlignedRead


def parse_read(read: AlignedRead) -> Optional[Tuple[float, int]]:
    """
    Returns None if the read should be ignored.
    A read can still be ignored later if its barcode is not in the barcode list.
    """
    if read.get_tag("AS") <= len(read.seq) - 6:
        # more than 2 edits
        return None

    # No check for the "UB" (molecule barcode) tag here: scATAC reads are not
    # expected to carry one, which is why a random stand-in is generated below.

    if read.mapq < 20:
        # low mapping quality; stands in for the NH-based multi-mapping filter
        return None

    p_misaligned = 0.01  # default value
    # random molecule id, so every read is treated as its own molecule
    fake_ub = random.randint(0, 2 ** 31)
    return p_misaligned, fake_ub

A word of warning: 1) I don't have any scATAC data to test this on, and 2) the most important changes in demuxalot are scRNA-seq-specific, so you're unlikely to benefit from them.
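Since the callback is untested on scATAC data, its filtering logic can be sanity-checked without a BAM file by running it against a small stand-in object. The sketch below is only an illustration: `FakeRead`, `demo_callback`, and `summarize` are hypothetical names, and `demo_callback` is a simplified filter (alignment score and mapping quality only), not demuxalot's own code.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


class FakeRead:
    """Hypothetical stand-in exposing only the read attributes a callback touches."""

    def __init__(self, seq: str, mapq: int, tags: Dict[str, int]):
        self.seq = seq
        self.mapq = mapq
        self._tags = tags

    def has_tag(self, name: str) -> bool:
        return name in self._tags

    def get_tag(self, name: str) -> int:
        # a missing tag raises KeyError, like the NH error reported in this thread
        return self._tags[name]


def demo_callback(read) -> Optional[Tuple[float, int]]:
    """Simplified filter for illustration: alignment score and mapq checks only."""
    if read.get_tag("AS") <= len(read.seq) - 6:
        return None
    if read.mapq < 20:
        return None
    return 0.01, 0


def summarize(callback: Callable, reads: Iterable) -> Dict[str, int]:
    """Count reads that are kept, dropped, or that lack a tag the callback needs."""
    counts = {"kept": 0, "dropped": 0, "missing_tag": 0}
    for read in reads:
        try:
            result = callback(read)
        except KeyError:
            counts["missing_tag"] += 1
            continue
        counts["kept" if result is not None else "dropped"] += 1
    return counts


reads = [
    FakeRead("A" * 50, mapq=60, tags={"AS": 50}),  # perfect alignment
    FakeRead("A" * 50, mapq=60, tags={"AS": 44}),  # score at the cutoff
    FakeRead("A" * 50, mapq=5, tags={"AS": 50}),   # low mapping quality
    FakeRead("A" * 50, mapq=60, tags={}),          # no AS tag at all
]
print(summarize(demo_callback, reads))
```

A non-zero `missing_tag` count on real reads would indicate that the callback depends on a tag the aligner did not write, which is the same kind of failure as the original `NH` error.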

@arogozhnikov arogozhnikov added the question Further information is requested label Feb 28, 2024