How to acquire Remora models in the toml format that Dorado expects as input? #38
Comments
Well, listed under "Features":
|
Hmm -- yep it does look like the -h suggests this might be possible, although it's not very informative as to what format it wants the remora models in...
When I try to pass it an .onnx from the Remora repository, it treats it like a directory that should contain a .toml file.
But there are no toml files in the Remora repository, or in rerio, and none for the basecall models distributed with Guppy. There's also no clear documentation on this format, although it seems to be alluded to in nanoporetech/bonito#278 which talks about "a bonito basecalling model [tar+toml]". So the question is, how does one acquire (or create?) Remora models in the tar+toml format that Dorado accepts? |
I will get |
Thanks! Looking forward to it! |
@oneillkza v0.0.2 has a matching 5mC model for each simplex model.

$ dorado download --list
[2022-11-10 16:25:06.843] [info] > simplex models
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_fast@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_hac@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_sup@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_fast@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_hac@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_sup@v3.5.2
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_fast@v3.4
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_hac@v3.3
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_sup@v3.3
[2022-11-10 16:25:06.846] [info] > modification models
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_fast@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_hac@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_260bps_sup@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_fast@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_hac@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r10.4.1_e8.2_400bps_sup@v3.5.2_5mCG@v2
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_fast@v3.4_5mCG@v0
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_hac@v3.4_5mCG@v0
[2022-11-10 16:25:06.846] [info] - dna_r9.4.1_e8_sup@v3.4_5mCG@v0

In this release you have to specify the model manually, like so:

$ dorado basecaller ${models}/dna_r10.4.1_e8.2_400bps_hac@v3.5.2 ${data} \
    --remora-models ${models}/dna_r10.4.1_e8.2_400bps_hac@v3.5.2_5mCG@v2 > mods.sam

But I intend to simplify this with automatic model matching and a simpler CLI, i.e.
|
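Until automatic matching exists, the manual pairing described above could be scripted roughly as follows. This is only a sketch: `mod_model_name` and `basecall_cmd` are hypothetical helpers, not part of Dorado, and they assume the `<simplex model>_5mCG@<version>` naming seen in the `dorado download --list` output. The script prints the command instead of running it, using only the `basecaller` arguments and the `--remora-models` option shown in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of Dorado): build the modification model
# name from a simplex model name, assuming the "<simplex>_5mCG@<version>"
# naming pattern from the listing in this thread.
mod_model_name() {
    simplex="$1"      # e.g. dna_r10.4.1_e8.2_400bps_hac@v3.5.2
    mod_version="$2"  # e.g. v2 for the r10.4.1 models in the listing
    printf '%s_5mCG@%s\n' "$simplex" "$mod_version"
}

# Hypothetical helper: print (do not run) the basecaller command that pairs
# a simplex model with its assumed modification model.
basecall_cmd() {
    models="$1"       # directory holding the downloaded models
    simplex="$2"
    mod_version="$3"
    data="$4"         # input data location
    mod="$(mod_model_name "$simplex" "$mod_version")"
    printf 'dorado basecaller %s/%s %s --remora-models %s/%s\n' \
        "$models" "$simplex" "$data" "$models" "$mod"
}

basecall_cmd models 'dna_r10.4.1_e8.2_400bps_hac@v3.5.2' v2 reads
```

Note that the naming rule does not hold for every entry in the listing: the r9.4.1 hac and sup simplex models are listed as v3.3, while their modification models are named after v3.4, so those pairs would still need to be specified by hand.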
Thanks @iiSeymour ! |
Does this also work using the rerio remora all cytosine context model? |
@jcolicchio-soundag not yet. |
What's the timeline for getting support for modified basecalling models in Dorado?
(Or is this possible already?)