ERROR: No enum auxiliary type exists. At src/slow5.c:1458 #9
Oh yea that's my bad. Let me fix that and get back to you. |
Hey, Any chance you could show me the header columns of your data? The first 2 lines above the actual reads (and below the header values) It should be something like
What I'm looking for here is this part of it
This is the list that blue-crab tried to get from your slow5 file. If it's not present or fails, it tries to make it a list of just ["unknown"]. It looks like it's trying to use a value that is outside the length of the list. So having a look at the list is a good start to see if there is anything weird going on there. An easy way to get that value from a blow5 file is to run this command
and just scroll down to that header line and copy paste it here. Thanks |
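The lookup described above can be sketched roughly as follows. This is not blue-crab's actual code: the function names are hypothetical, and it assumes the end_reason labels appear in the type header line as `enum{label1,label2,...}`. It shows the fallback to `["unknown"]` and a bounds check on the index, which is the failure mode suspected here.

```python
import re


def end_reason_labels(type_line: str) -> list[str]:
    """Return the labels of the first enum{...} in a header type line.

    Falls back to ["unknown"] when no enum is present (hypothetical helper).
    """
    match = re.search(r"enum\{([^}]*)\}", type_line)
    if match is None:
        return ["unknown"]
    labels = [label.strip() for label in match.group(1).split(",") if label.strip()]
    return labels or ["unknown"]


def end_reason_name(labels: list[str], index: int) -> str:
    """Look up a label by index, guarding against values outside the list."""
    if 0 <= index < len(labels):
        return labels[index]
    return "unknown"


# A type line without any enum, as in files that carry no end_reason labels.
no_enum = "#char* uint32_t double double double double uint64_t"
print(end_reason_labels(no_enum))
```

With a guard like `end_reason_name`, an index that points past the end of the list falls back to "unknown" instead of raising.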
I have also just pushed a change to the dev branch that has a check on this line of code that will spit out what So another way is to switch to the dev branch, run Thanks |
Hi @Psy-Fer,

I am also getting the same error:

blue-crab s2p minion_sim_1000_itrs.blow5 -o minion_test.pod5

The last two lines before the actual reads are:

#char* uint32_t double double double double uint64_t int16_tchar double int32_t uint8_t uint64_t

There is no enum{} in my files. The slow5 files were generated (using the subprocess.run function of python) with the dna-r10-min model and full-contigs:

"squigulator " + fasta_derep + " -x dna-r10-min -o ./tmp/tmp_" + str(i) + ".slow5 --full-contig --seed " + str(random_numbers[i])

and then merged:

slow5tools merge tmp -o minion_sim_1000_itrs.slow5

The individual tmp files as well as the merged files have the same structure and no enum{} on line 9. I tried the dna-r9-min model, and there also is no enum{} on line 9.

#char* uint32_t double double double double uint64_t int16_tchar double int32_t uint8_t uint64_t

Thanks, Tomas |
Ahh so these reads were built with squigulator? I'll need to tell @hasindu2008 to put a dummy end_reason in the blow5 output. In the meantime, I'll modify blue-crab to insert a dummy enum via an argument, making all reads end in the signal_positive state. I'll get back to you in a sec James |
Hi Tomas, Could you please try using the dev branch and showing me the error it gives you? You can do this by activating your environment
Then re-install this dev version into your env with pip install . and re-run your blue-crab command. Something fishy is going on, but this should help figure it out. Cheers, |
Hi James, Thanks for looking into this. Here is the output of the dev branch: blue-crab s2p minion_sim_1000_itrs.blow5 -o minion_test.pod5 |
Ahh progress! Okay so now the issue is the readID isn't a valid uuid. Again I think that's a squigulator issue. @hasindu2008 what are the readIDs you make? The issue here is that pod5 requires the readID to be a uuid. So I can't just use any old string. Ideally squigulator would create these and then blue-crab just reads the string and converts it. Another option is in the absence of valid uuids I add an option to create one. But then you can't link the old reads to the new reads (unless I make a tsv file that provides the mapping). What do you think? |
James, I agree, the solution is to have a valid UUID and a dummy end_reason in the slow5/blow5 output generated by squigulator @hasindu2008. Not having this also most likely breaks the buttery-eel wrapper. Ultimately, I need to be able to basecall the simulated slow5/blow5 files generated by squigulator so I can use the called fastq files for downstream analyses. Thanks, Tomas |
Buttery-eel I can unbreak by using dummy uuids when I basecall and then restoring the original readID when the read comes back. The issue is that going over to pod5 you can't do this because of their strict typing. So yea, either squigulator produces uuids or I create them in blue-crab and give a file that maps squigulator readIDs to uuids. Let's see what @hasindu2008 thinks and then we will implement it asap James |
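The mapping-file idea could look something like the sketch below. It is only an illustration and not what blue-crab or squigulator do: the namespace, column names, and example read IDs are made up, and it uses `uuid.uuid5` so that the same readID always maps to the same UUID, in keeping with the preference for deterministic IDs.

```python
import csv
import io
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
READ_ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def to_uuid(read_id: str) -> uuid.UUID:
    """Keep read IDs that already parse as UUIDs, derive one otherwise."""
    try:
        return uuid.UUID(read_id)
    except ValueError:
        # uuid5 is a name-based hash, so the result is reproducible.
        return uuid.uuid5(READ_ID_NAMESPACE, read_id)


def write_mapping(read_ids, handle) -> dict:
    """Write a TSV of original readID to UUID and return the mapping."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["original_read_id", "uuid"])
    mapping = {}
    for read_id in read_ids:
        mapping[read_id] = to_uuid(read_id)
        writer.writerow([read_id, str(mapping[read_id])])
    return mapping


# Made-up read IDs purely for demonstration.
buffer = io.StringIO()
mapping = write_mapping(["simulated_read_1", "simulated_read_2"], buffer)
print(buffer.getvalue())
```

Because the UUIDs are derived rather than random, the TSV can be regenerated from the original readIDs at any time, so losing the file does not break the link between old and new reads.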
Hey all,

The reason I adhere to the current readID format in squigulator is so that it is compatible with the "mapeval" utility in Minimap2's Paftools companion script. This is quite useful for assessing the mapping accuracy once the reads are basecalled. Also, I prefer deterministic read IDs to random ones.

It is very strange that POD5 needs the readID to be a UUID. Perhaps in their implementation they simply store the UUID as a 128-bit integer instead of storing it as a variable-length string. This is not great, as it means POD5 is either stuck with UUIDs as read IDs forever, or might change later and break backward compatibility. The readID in many bioinformatics formats, including BAM, has always been a variable-length string. Perhaps, I can implement Squigulator an option called

By the way, @Psy-Fer, is this UUID thing applicable to buttery-eel too? It wasn't a problem when using ont-guppy-server with the eel. Perhaps they enforced this UUID in ont-dorado-server? If they have enforced it (which makes limited sense to me), I would be very glad if you could do some internal mapping with a fake uuid when sending to the ont-basecall-server, but write the original readID to the FASTQ/SAM. |
Also cross-referencing to the issue in squigulator that raises the same issue: hasindu2008/squigulator#13 |
Hey, Okay I'll just make absolutely sure what pod5 is doing so we are 100% correct when we do this. James |
@hasindu2008 and @Psy-Fer Perhaps, I (@hasindu2008) can implement Squigulator an option called --ont-friendly that produces some fake UUIDs for the read IDs, as well as a fake end_reason with the value "unknown". Let me know your thoughts on this. This way, there is no need for the blue crab to do any "UUIdification" of the readIDs. If you all are happy, I can implement this to squigulator ASAP. I think this is a great solution that will maintain maximum compatibility for downstream use. Thanks, Tomas |
Okay I have confirmed that pod5 requires a uuid type for the readID, even though it shouldn't have to be.
This is what happens if we just pass a str: it's trying to access the bytes method on the uuid type specifically, as that is what they expect. So yea, I think we need to go with dummy uuids, and just make a tsv file that maps the uuid with the more verbose read information you want to store. James |
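To illustrate the kind of failure described here without touching pod5 itself, the sketch below uses only Python's standard `uuid` module. `read_id_to_bytes` is a hypothetical stand-in for any writer that expects a `uuid.UUID` and reads its `.bytes`; it is not pod5's API. A plain string has no such attribute, so it fails, and `is_valid_uuid` shows one way to check a readID up front.

```python
import uuid


def read_id_to_bytes(read_id):
    """Stand-in for a writer that expects uuid.UUID and uses its 16 raw bytes."""
    return read_id.bytes


def is_valid_uuid(text: str) -> bool:
    """Return True if the string parses as a UUID."""
    try:
        uuid.UUID(text)
    except ValueError:
        return False
    return True


valid_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
raw = read_id_to_bytes(valid_id)  # a UUID exposes its 128 bits as 16 bytes

try:
    read_id_to_bytes("simulated_read_1")  # made-up non-UUID read ID
except AttributeError as err:
    print(f"plain string rejected: {err}")
```

Checking with something like `is_valid_uuid` before conversion would let a tool report the offending readID rather than fail deep inside the writer.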
@Psy-Fer I am implementing an option in squigulator to generate uuids for readids, so blue-crab does not need to do anything. Please check if the buttery-eel is also broken due to this uuid thing? |
Buttery-eel should be fine, unless they change something in the dorado server code |
I should probably merge the buttery-eel/skipped branch into main and do a release to handle this. |
If you compile squigulator from the dev branch, and specify the option When you specify If you encounter issues let me know, thanks. Seems like buttery-eel works even without things being uuid as James mentioned above. |
Thanks for implementing this option. I can now convert the squigulator generated files to pod5. Thanks for your help, Tomas |
Hi @Psy-Fer ,
I suspect this issue might be related to a recent pull request based on the new pod5 spec from about a month ago. Is there a way to avoid this error? |
Hmm..make sure you have the latest pod5 version? Which version do you have? Please do a pip list for me? |
Thank you for the fast answer, upgrading pod5 fixed the problem! |
Hi @Psy-Fer,
I am trying to convert some blow5 files to pod5 and get this error:
Any ideas of what might cause this and how I might fix it?
Thanks!
Rich