Predicting with DeepNovo

DeepNovo de novo sequence prediction can be performed via the Postnovo predict_deepnovo subcommand. DeepNovo is automatically run multiple times with different fragment mass tolerance parameterizations. Postnovo can distribute these processes via Slurm on a compute cluster.

It can take a few minutes for the DeepNovo processes to start at each fragment mass tolerance parameterization. The Postnovo command can be run in the background by appending " &" to it; logging out of the server will not cause the spawned DeepNovo processes to exit.
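
For example, the following runs the command from the first example below in the background and captures its output in a log file. The nohup wrapper and the log file name are illustrative additions rather than Postnovo options; nohup is a common extra precaution against the shell sending a hangup signal to background jobs on logout.

    # Run the prediction in the background; stdout and stderr go to a log file (illustrative path).
    nohup python main.py predict_deepnovo --container /path/to/tensorflow.simg --mgf /path/to/sample.mgf --frag_resolution high > predict_deepnovo.log 2>&1 &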

  1. Here is an example of prediction on a single machine at each of the fragment mass tolerances that Postnovo uses for high-resolution data (0.01, 0.03, 0.05, 0.1, and 0.5 Da).

    python main.py predict_deepnovo --container /path/to/tensorflow.simg --mgf /path/to/sample.mgf --frag_resolution high &

  2. Here is an example of prediction at two of the five high-resolution parameterizations. Predicting at specific parameterizations can be useful if some of the jobs for the whole set of parameterizations failed, say, by running out of memory.

    python main.py predict_deepnovo --container /path/to/tensorflow.simg --mgf /path/to/sample.mgf --frag_resolution high --frag_mass_tols 0.1 0.5

  3. Here is an example of prediction on a compute cluster via Slurm. The jobs run on nodes of a partition called "smallmem", with each job using 16 CPUs, a maximum of 20 GB of memory, and a time limit of 36 hours. DeepNovo requires at most ~16 GB per job, with the least memory needed at the highest fragment mass tolerance.

    python main.py predict_deepnovo --container /path/to/tensorflow.simg --mgf /path/to/sample.mgf --frag_resolution high --slurm --partition smallmem --cpu 16 --mem 20 --time_limit 36
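
For reference, the options above map onto standard Slurm resource requests roughly as in the following sketch. This only illustrates the correspondence between the Postnovo flags and common sbatch directives; it is not the job script that Postnovo itself generates.

    #!/bin/bash
    #SBATCH --partition=smallmem    # --partition smallmem
    #SBATCH --cpus-per-task=16      # --cpu 16
    #SBATCH --mem=20G               # --mem 20 (GB)
    #SBATCH --time=36:00:00         # --time_limit 36 (hours)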