Support for ASAP-seq #5

caleblareau · 2020-09-09T06:34:32Z

Hi kite team,

We recently put out a pre-print on quantifying protein counts using 10x scATAC-seq. We used an intermediate Python script to convert the output of cellranger-atac mkfastq into something that we could fit into kite (specifically, we made the ATAC R1/R2/R3 look like an R1/R2 from a 10x v2 feature barcoding experiment). Funnily enough, my simple Python script to cut and paste parts of fastqs was markedly slower than computing the tag abundances using kite, so any support for this assay would probably dramatically reduce the workflow time.

This Python script has the key function that converts the reads before I feed them through kite. I don't have a good sense of how hard it would be to modify this tool to accept the R1/R2/R3 format of the scATAC data, as that seems to be the barrier for kite to support this type of data directly.

Any input on the feasibility of supporting this would be great! Thanks!
-Caleb

jasegehring · 2020-09-10T19:20:11Z

Hi Caleb,

Congrats on the recent manuscript! I was checking it out myself just the other day - very cool.

OK, I'm not super familiar with the 10X ATAC-seq workflow, but here are my thoughts. It looks like the ATAC-seq protocol generates a cell barcode (Read3) and two genomic DNA reads (R1/R2). If you know where your antibody barcodes are going to be (either in R1 or R2), then you might be able to just drop one of them and run kite only on the reads that include the cell barcode and the antibody oligo. If you don't know where the antibody barcodes are going to show up in R1 or R2, then my suggestion would be to simply run kite on R1/R3 and then again on R2/R3 and deal with any necessary merging at the BUS file stage where things are small and easy to quickly parse.

I hope this is helpful and makes sense. If not, or if you just find it sort of gross and sketchy, I'm happy to do a quick call and see if we can sort this out and come up with a better solution.

The kallisto | bustools team may be able to incorporate a third read into the workflow, but my guess is this would take quite a bit of engineering with relatively limited applications since kallisto is a poor choice for genome alignments. I could be wrong, though.

Looking forward to your thoughts and congrats again on the cool workflow.

Jase

On Tue, Sep 8, 2020 at 11:34 PM Caleb Lareau wrote:

Hi kite team, We recently put out a pre-print on quantifying protein counts using 10x scATAC-seq <https://www.biorxiv.org/content/10.1101/2020.09.08.286914v1>. We used an intermediate python script to convert the output of cellranger-atac mkfastq into something that we could fit into kite (specifically, we made the ATAC R1/R2/R3 look like an R1/R2 from a 10x v2 feature barcoding experiment). Funnily enough, my simple python script to cut and paste parts of fastqs was markedly slower than performing the tag abundances using kite, so any support for this assay would probably dramatically reduce the workflow time. This python script has the key function <https://github.com/caleblareau/asap_to_kite/blob/master/asap_to_kite_v2.py#L104> to do the conversion of the reads before I feed them through kite. I don't have a good sense of how hard it would be to modify this tool to facilitate the R1/R2/R3 format of the scATAC data as that seems to be the barrier for kite to support this type of data directly. Any input of the feasibility of supporting this would be great! Thanks! -Caleb

caleblareau · 2020-09-15T05:01:39Z

Hey Jase, Thanks for the quick and detailed response and the kind words!

Unfortunately, the UMI (or in our case UBI), cell barcode, and antibody barcode are all encoded in 3 different files, so there's no way for me to supply only 2 of the 3 files. It seems like we will necessarily have to wrap the R1/R2/R3 into another format. We have an existing solution linked that I can keep using and will suggest that others do the same.
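
For anyone reading later, the wrapping step can be sketched roughly as below. This is only an illustration of the idea, not the asap_to_kite implementation: the record type, the function names, the 16 bp barcode and 10 bp UMI/UBI lengths, and which of the three reads carries which piece are all assumptions made for the example.

```python
from typing import NamedTuple, Tuple


class FastqRecord(NamedTuple):
    """One FASTQ entry (hypothetical helper type for this sketch)."""
    title: str
    sequence: str
    quality: str


# Assumed lengths for illustration only; check them against your own assay.
BARCODE_LEN = 16
UMI_LEN = 10


def wrap_trio(
    umi_read: FastqRecord,
    barcode_read: FastqRecord,
    tag_read: FastqRecord,
    barcode_len: int = BARCODE_LEN,
    umi_len: int = UMI_LEN,
) -> Tuple[FastqRecord, FastqRecord]:
    """Collapse three reads into a pair: (cell barcode + UMI, antibody tag)."""
    new_r1 = FastqRecord(
        title=tag_read.title,
        # Barcode bases first, then UMI bases, as in a barcode+UMI style R1.
        sequence=barcode_read.sequence[:barcode_len] + umi_read.sequence[:umi_len],
        # Qualities are sliced the same way so they stay aligned with the bases.
        quality=barcode_read.quality[:barcode_len] + umi_read.quality[:umi_len],
    )
    # The antibody tag read is passed through unchanged as the new R2.
    return new_r1, tag_read


def format_record(record: FastqRecord) -> str:
    """Render a record as four FASTQ lines."""
    return f"@{record.title}\n{record.sequence}\n+\n{record.quality}\n"


if __name__ == "__main__":
    umi = FastqRecord("read1", "ACGTACGTACGG", "I" * 12)
    barcode = FastqRecord("read1", "T" * 16 + "AA", "F" * 18)
    tag = FastqRecord("read1", "GGGCCC", "######")
    r1, r2 = wrap_trio(umi, barcode, tag)
    print(format_record(r1), end="")
    print(format_record(r2), end="")
```

The per-read string slicing is cheap; in a pure Python pipeline most of the time usually goes to reading and writing the (often gzipped) fastq files, which is consistent with the script being slower than kite itself.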

"The kallisto | bustools team may be able to incorporate a third read into
the workflow, but my guess is this would take quite a bit of engineering
with relatively limited applications since kallisto is a poor choice for
genome alignments."

^^ this was my sense too but wanted to raise it just in case.

Thanks again for the response and for maintaining kite... I use it almost every day, it seems!
-Caleb

caleblareau closed this as completed Sep 15, 2020
