Testing Postnovo

The Postnovo test subcommand tests the algorithm's predictive models. Testing requires that de novo sequence predictions be cross-validated against FDR-controlled PSMs. Postnovo recognizes FDR-controlled results in the TSV format produced by an MSGF+ database search. Postnovo test can run MSGF+ itself, provided that MSGF+ has first been downloaded by Postnovo setup. MSGF+ should be run with the MGF file produced by Postnovo format_mgf.

The --test_plots option produces precision-recall and precision-yield plots that evaluate model accuracy. The definitions of precision and recall are not entirely straightforward, so three methods are used to calculate classification statistics, as described in the diagrams at the bottom of the page.
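As a rough illustration of what the points on such plots represent, the sketch below computes precision, recall, and yield at a few score thresholds. It uses one generic set of definitions and does not reproduce any of the three methods Postnovo uses. The function name, the inputs, and the choice to define yield as the number of reported predictions are all assumptions made for illustration.

```python
def precision_recall_yield(scores, is_correct, total_true, thresholds):
    """Return (threshold, precision, recall, yield) tuples.

    Hypothetical helper, not part of Postnovo.
    scores:      confidence score of each de novo prediction
    is_correct:  whether each prediction matched its reference PSM
    total_true:  number of reference sequences that could have been recovered
    """
    points = []
    for t in thresholds:
        # Keep only predictions scoring at or above the threshold.
        kept = [ok for s, ok in zip(scores, is_correct) if s >= t]
        n_reported = len(kept)
        if n_reported == 0:
            continue  # precision is undefined when nothing is reported
        n_correct = sum(kept)
        precision = n_correct / n_reported
        recall = n_correct / total_true if total_true else 0.0
        # Assumption: "yield" here means the count of reported predictions.
        points.append((t, precision, recall, n_reported))
    return points


# Toy data, invented for the example.
scores = [0.95, 0.9, 0.8, 0.6, 0.4]
is_correct = [True, True, False, True, False]
points = precision_recall_yield(scores, is_correct, total_true=4,
                                thresholds=[0.5, 0.85])
for point in points:
    print(point)
```

Raising the threshold trades recall and yield for precision, which is the trade-off the two plot types display.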

The following command tests the Postnovo low-resolution model with a properly formatted MGF file, running MSGF+ in the process (see Training Postnovo for more examples) and writing plots to the directory of the MGF input.

python main.py test --mgf /path/to/dataset1.mgf --clusters /path/to/MaRaCluster.clusters_p2.tsv --precursor_mass_tol 20 --frag_method CID --frag_resolution low --denovogui --deepnovo --ref_fasta reference_proteins.fasta --msgf --test_plots --cpus 32
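
When the test has to be launched from a script, the same command can be assembled as an argument list. The sketch below is one possible wrapper, not something Postnovo provides: the function name build_test_command is hypothetical, and only the options shown in the command above are used.

```python
import shlex


def build_test_command(mgf, clusters, ref_fasta, cpus=32):
    """Assemble the Postnovo test command shown above as an argument list.

    Hypothetical helper; the option values mirror the example command.
    """
    return [
        "python", "main.py", "test",
        "--mgf", mgf,
        "--clusters", clusters,
        "--precursor_mass_tol", "20",
        "--frag_method", "CID",
        "--frag_resolution", "low",
        "--denovogui", "--deepnovo",
        "--ref_fasta", ref_fasta,
        "--msgf", "--test_plots",
        "--cpus", str(cpus),
    ]


cmd = build_test_command(
    "/path/to/dataset1.mgf",
    "/path/to/MaRaCluster.clusters_p2.tsv",
    "reference_proteins.fasta",
)
# Print the command for inspection; shlex.join needs Python 3.8 or later.
print(shlex.join(cmd))
# To actually run it from the Postnovo directory:
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a path contains spaces.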

The leave_one_out option of the Postnovo train subcommand also tests the models thoroughly, but it produces different plots, which show the true versus predicted accuracy of predictions.

Classification Statistics Diagram 1

Classification Statistics Diagram 2
