Data preparation scripts for Remora models with random bases #38

Closed
AnWiercze opened this issue Oct 14, 2022 · 3 comments

Comments

@AnWiercze

Hello Remora Team,

In this year's ONT update, Clive mentioned that the newer models, which perform better than BS-seq, are trained on sequences containing a modified position with ±30 random bases around it, if I understand correctly. Are the scripts for preparing training data from this kind of input publicly available? Right now, the data preparation scripts uploaded here only support fully modified and unmodified reads, correct?
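
For illustration only, the kind of construct described above could be sketched as follows. This is not ONT's actual design or part of Remora: the function name `random_flank_construct`, the choice of `C` as the center base, and drawing flanking bases uniformly from A, C, G and T are all assumptions made for the example.

```python
import random

def random_flank_construct(center="C", flank=30, rng=None):
    """Build one sequence: `flank` random bases, a center base, `flank` random bases.

    Hypothetical helper for illustration; returns the sequence and the
    0-based index of the center (candidate modified) position.
    """
    rng = rng or random.Random()
    left = "".join(rng.choice("ACGT") for _ in range(flank))
    right = "".join(rng.choice("ACGT") for _ in range(flank))
    return left + center + right, flank

# Seeded generator so the example is reproducible.
seq, pos = random_flank_construct(rng=random.Random(0))
```

Here `seq` is 61 bases long and `seq[pos]` is the center base whose modification state a model would be trained to call.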

Thanks for your help!

Cheers,
Anna

@marcus1487
Collaborator

These scripts are not currently publicly available. We are working to improve the robustness of this workflow and release this code at some point in the future.

We will be updating the data preparation scripts very soon to take pod5 and bam input to directly create a Remora dataset. This will add a lot more flexibility to dataset generation outside of the "fully modified at a motif" type datasets.

@AnWiercze
Author

Thanks a lot for sharing this information! I am looking forward to the next release. :)

@marcus1487
Collaborator

Betta is now available via a developer release. Please see instructions for accessing the repository in the community note here (login required): https://community.nanoporetech.com/posts/betta-tool-release
