structure #103

Open
3 of 6 tasks
brentp opened this issue Jun 10, 2022 · 0 comments

Contributor

brentp commented Jun 10, 2022

This is a suggestion for structuring the code. Currently, it's very focused on the evaluations. Let's design the user-facing code first and build the evaluations and training around it.

extract-signals

  • This takes the BAM/CRAM file and extracts all relevant signals. This is working and simple to use (if a bit slow with thousands of contigs).

generate channels

This takes the output from extract-signals and a set of SVs and generates the arrays (channels) to be used by the NN.

  • This should be updated to accept VCF (currently requires BEDPE).
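
To illustrate what accepting VCF could involve, here is a minimal sketch that turns one VCF record into a BEDPE-like breakpoint pair. It is not the project's code. It assumes symbolic alleles (e.g. `<DEL>`) with an `END` key in INFO, treats `CHR2` as an optional caller-specific key for the second chromosome, and uses single-base intervals at each breakpoint as an illustrative choice. The function name `vcf_line_to_bedpe` is hypothetical, and BND records are not handled.

```python
def vcf_line_to_bedpe(line):
    """Convert one VCF data line for a symbolic SV to a BEDPE-like tuple.

    Assumptions (not taken from the project): END is present in INFO,
    CHR2 is optional, and each breakpoint is a single-base interval.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, var_id = fields[0], int(fields[1]), fields[2]

    # Parse INFO into a dict; flag-style keys get an empty string value.
    info = {}
    for item in fields[7].split(";"):
        key, _, value = item.partition("=")
        info[key] = value

    if "END" not in info:
        raise ValueError("record %s has no END in INFO" % var_id)
    end = int(info["END"])
    chrom2 = info.get("CHR2", chrom)

    # VCF positions are 1-based; BEDPE intervals are 0-based, half-open.
    return (chrom, pos - 1, pos, chrom2, end - 1, end,
            var_id, info.get("SVTYPE", "."))


record = "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000"
bedpe = vcf_line_to_bedpe(record)
```

A real implementation would more likely rely on an existing VCF parser and decide explicitly how to treat BND records and confidence intervals.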

score (predict)

This should take a trained model along with a VCF or BEDPE and output a score for each variant in the sample field, with an option for QUAL.

  • This should be updated to accept VCF (currently requires BEDPE).
  • This should NOT accept labels; that is part of train/evaluate.
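
As a rough sketch of the output side, the snippet below appends a score to the sample field of a single-sample VCF line and optionally writes it to QUAL as well. The FORMAT key `SCORE`, the function name `annotate_score`, and the `write_qual` flag are all hypothetical, and how a model score should be scaled for QUAL is left open here. A complete writer would also add a matching `##FORMAT` header line.

```python
def annotate_score(line, score, key="SCORE", write_qual=False):
    """Append a per-variant score to the first sample column of a VCF line.

    `key` and `write_qual` are illustrative names, not an existing interface.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected FORMAT and at least one sample column")

    text = format(score, ".3f")
    fields[8] = fields[8] + ":" + key    # FORMAT column
    fields[9] = fields[9] + ":" + text   # first sample column
    if write_qual:
        fields[5] = text                 # QUAL column (scaling is an open choice)
    return "\t".join(fields)


record = "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000\tGT\t0/1"
scored = annotate_score(record, 0.25)
scored_qual = annotate_score(record, 0.25, write_qual=True)
```

Note that the function takes only a record and a score, with no label argument, which matches the separation between scoring and train/evaluate described above.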

train/evaluate

This will be handled by Luca and includes the optimization and LOCSO. We will keep this more isolated since it is harder to run.
Simplify it so that LOCSO is the only scheme and is always used.

  • Find models that tend to work well, to reduce the optimizer's search space and the variability among runs. Currently, when running LOCSO for different chromosomes, we can get dramatically different results because of the network architecture or hyperparameters.
  • Use more true negative variants in training. This can help prevent over-fitting.
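
For reference, a chromosome-wise hold-out loop could look like the sketch below. It assumes that LOCSO holds out one chromosome per run, as the per-chromosome runs mentioned above suggest. The project's actual procedure may differ, for example by also holding out samples. The function name and the `(chrom, pos, label)` tuple layout are illustrative only.

```python
from collections import defaultdict


def leave_one_chromosome_out(variants):
    """Yield (held_out_chrom, train, test) splits, one per chromosome.

    `variants` is an iterable of (chrom, pos, label) tuples (assumed layout).
    """
    by_chrom = defaultdict(list)
    for variant in variants:
        by_chrom[variant[0]].append(variant)

    for held_out in sorted(by_chrom):
        test = by_chrom[held_out]
        train = [v for chrom, vs in by_chrom.items()
                 if chrom != held_out for v in vs]
        yield held_out, train, test


variants = [
    ("chr1", 100, 1), ("chr1", 500, 0),
    ("chr2", 200, 1),
    ("chr3", 300, 0), ("chr3", 900, 0),
]
folds = list(leave_one_chromosome_out(variants))
```

Fixing the architecture and hyperparameters across all folds, rather than re-optimizing per held-out chromosome, is one way to make the per-chromosome results comparable.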