Cannot convert output from Scirpy to dandelion #343
Comments
Hi @sbenjamaporn, thanks for reporting this issue.
Thanks,
Thanks for your prompt response! Here are my results, attached as a PDF file.
@sbenjamaporn, thanks for the stacktrace! Regarding my second request, you checked
@zktuong, according to the stacktrace, the error occurs within Dandelion. It could theoretically be that there's a problem with scirpy's output, but I have absolutely no idea where the
I think it's got to do with malformed entries in
So @sbenjamaporn, can you check what the unique values are for:
I think the following are dandelion's columns:
I don't think
I wonder if the round trip of
What's also weird is that in addition to the
"productive" should be a chain-level attribute (i.e. only available as IR_V(D)J_1/2_productive). Could you please describe
Hmm, it looks like those additional productive columns are from dandelion. Can you try removing every column from "clone_id_by_size" onwards and see if the conversion issue is still there?
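A minimal sketch of that column-trimming step, in plain Python so it runs without any data. The helper name `columns_before` and the example column names are made up for illustration; only "clone_id_by_size" comes from this thread.

```python
def columns_before(columns, marker):
    """Return the columns that come before `marker`.

    `marker` itself and everything after it are dropped. If `marker`
    is absent, all columns are returned unchanged.
    """
    columns = list(columns)
    if marker not in columns:
        return columns
    return columns[: columns.index(marker)]


# Hypothetical column order, for illustration only.
obs_columns = [
    "sample",
    "IR_VJ_1_productive",
    "clone_id",
    "clone_id_by_size",
    "extra_productive_column",
]

keep = columns_before(obs_columns, "clone_id_by_size")
print(keep)
```

On a real AnnData object, where `adata.obs` is a pandas DataFrame, the same idea would be roughly `adata.obs = adata.obs[columns_before(adata.obs.columns, "clone_id_by_size")]`. It is safer to try this on a copy first.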
@grst,
Dear @grst, is there any way for Scirpy to output the full information in the AIRR standard? (I tried ir.io.write_airr, but it did not write the full information.)
@zktuong Thanks for the suggestion, I will try!
It seems the actual problem is in the conversion from dandelion to scirpy. @zktuong
Happy to discuss this. Could you please open a separate issue and describe what's missing?
That's right, it's just a wrapper to call
A small update on this: the issue seems to lie in:
# works ok
irdata = ddl.to_scirpy(vdj)  # or ir.io.from_dandelion(vdj)
vdj2 = ir.io.to_dandelion(irdata)
# same issue: "ValidationError: field productive has invalid bool T + T" appears
irdata = ddl.to_scirpy(vdj, transfer=True)  # or ir.io.from_dandelion(vdj, transfer=True)
vdj2 = ir.io.to_dandelion(irdata)
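To illustrate why a value like "T + T" trips a boolean check, here is a small stand-alone sketch. It is not the validator that scirpy, dandelion, or the airr library actually uses; the accepted token sets and the function names are assumptions chosen for the example. The point is only that a `productive` field is expected to hold one boolean-like token, and "T + T" is not one.

```python
# Illustrative token sets; the real validator may accept a different set.
TRUE_TOKENS = {"T", "TRUE", "True", "true", "1"}
FALSE_TOKENS = {"F", "FALSE", "False", "false", "0"}


def parse_bool(value):
    """Convert a single boolean-like token to bool, or raise ValueError."""
    if isinstance(value, bool):
        return value
    token = str(value).strip()
    if token in TRUE_TOKENS:
        return True
    if token in FALSE_TOKENS:
        return False
    raise ValueError(f"invalid bool {value}")


def invalid_bool_values(values):
    """Return the sorted unique values that parse_bool rejects."""
    rejected = set()
    for value in values:
        try:
            parse_bool(value)
        except ValueError:
            rejected.add(str(value))
    return sorted(rejected)


# Hypothetical contents of a "productive" column.
productive = ["T", "F", "T + T", "T", "T + T"]
print(invalid_bool_values(productive))
```

Running a check like this over the unique values of each productive-like column is a quick way to find which column carries the offending entries.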
OK, the 'issue' is with line 273: scirpy/scirpy/io/_datastructures.py Lines 260 to 273 in 2c5b99e
because dandelion's Lines 726 to 739 in 2c5b99e
I can confirm that if you just change the name away from
@sbenjamaporn, if you just rename the current
you should be able to do the transfer. I will action this on dandelion's side to rename
@zktuong Thanks, it works now! I have one more question, about updating germline sequences with update_germline. I have many samples to update. Should the fasta file be "tigger_heavy_igblast_db-pass_genotype.fasta" (I also got the error below in this case), or should I specify it manually for each sample?
OSError: Environmental variable GERMLINE must be set. Otherwise, please provide path to folder containing germline IGHV, IGHD, and IGHJ fasta files.
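Going only by the wording of that error message, one thing to try is setting the GERMLINE environment variable to the germline folder before calling update_germline. A hedged sketch follows: the helper `set_germline_dir` is hypothetical, and a temporary directory stands in for the real folder containing the IGHV, IGHD, and IGHJ fasta files. Whether this layout is what dandelion expects is not confirmed here.

```python
import os
import tempfile
from pathlib import Path


def set_germline_dir(path):
    """Point the GERMLINE environment variable at an existing folder."""
    folder = Path(path).expanduser()
    if not folder.is_dir():
        raise FileNotFoundError(f"germline folder not found: {folder}")
    os.environ["GERMLINE"] = str(folder)
    return folder


# Placeholder: replace with the path to your own germline database folder.
germline_dir = set_germline_dir(tempfile.mkdtemp())
print(os.environ["GERMLINE"])
```

The variable has to be set in the same Python process (or exported in the shell) before the call that reads it.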
@grst
Thanks a lot @zktuong! LMK once you have a release including the fix, then I'll pin the latest version of dandelion.
Let's follow up over on dandelion's side:
Hi @grst, I've just released sc-dandelion==0.2.3 (it's actually 0.2.2, but I thought my upload went wrong).
Description of the bug
I tried to convert the AnnData object, after clonal assignment with Scirpy, into dandelion format. The conversion failed with "field productive has invalid bool T + T". I would like to convert it in order to update the germline sequence of each BCR sequence using dandelion, because I did not find this function in Scirpy. If you could suggest another way, please let me know.
Minimal reproducible example
The error message produced by the code above
Version information
versions