How to access CANOPUS confidence scores? #32

Closed
wkumler opened this issue Apr 7, 2021 · 4 comments

Comments

@wkumler

wkumler commented Apr 7, 2021

Hi again,

I've been using SIRIUS + CANOPUS from the command line to get potential compound classes for untargeted metabolomic data, but I can't figure out how to access the data that's available from the GUI. This kind of output is incredible and super informative, but I'd like to be able to access it programmatically:

image

Specifically, I'm interested in the posterior probability for each compound class. The classes themselves are available in the canopus_summary.tsv file that's written out for the project as a whole, but I'd like to filter out the low-confidence class estimations. I can't seem to find those values in the individual compound files either; the "canopus" folder contains only a .fpt file apparently containing raw floating-point values from an unknown process.

image

Any advice would be great!

@kaibioinfo
Contributor

In your project space directory there should be a canopus.tsv file. It lists every compound class with its metadata and its relative index. The relative index (starting at 0) tells you which line of a canopus .fpt file belongs to which compound class.

Alternatively, you can use the canopus_treemap Python library, which contains code for parsing the compound classes from the project space.
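For readers who want to do the lookup by hand, here is a minimal sketch of reading canopus.tsv into a relative-index-to-class-name table. The column names `relativeIndex` and `name` are assumptions about the file header (check them against your own file), and `load_class_index` is a hypothetical helper, not part of SIRIUS or canopus_treemap. The sample text is made up so the snippet runs on its own.

```python
import csv
import io


def load_class_index(tsv_text, index_col="relativeIndex", name_col="name"):
    """Map each relative index to its compound class name.

    index_col and name_col are ASSUMED header names; adjust them to
    match the header row of your own canopus.tsv.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {int(row[index_col]): row[name_col] for row in reader}


# Made-up two-row sample standing in for the contents of canopus.tsv.
sample_tsv = (
    "relativeIndex\tname\n"
    "0\tOrganic compounds\n"
    "1\tInorganic compounds\n"
)

class_index = load_class_index(sample_tsv)
```

With a real project you would pass the text of the file instead, e.g. `load_class_index(open("canopus.tsv").read())`, where the path is whatever your project space uses.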

@wkumler
Author

wkumler commented Apr 8, 2021

Ah, I think I understand! CANOPUS evaluates each compound's suitability for every compound class in ClassyFire, and the .fpt file gives the confidence associated with each class. So a 0.9999 in the very first line of my .fpt file corresponds to a 0.9999 match to "Organic compounds", which is the very first line of the canopus.tsv file? And similarly, a 0.0001 in the second line of my .fpt file corresponds to a 0.0001 match to "Inorganic compounds", which is the second line of the canopus.tsv?

@kaibioinfo
Contributor

Correct.
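Given that confirmation, filtering out low-confidence classes could look roughly like the sketch below. It assumes the .fpt file holds one probability per line and that `class_names` is already ordered by relative index; `confident_classes` and the 0.5 cutoff are illustrative choices, not anything SIRIUS provides. The two values are the ones from the example above.

```python
def confident_classes(fpt_text, class_names, threshold=0.5):
    """Pair each .fpt line with its class and keep the confident ones.

    ASSUMES one floating-point value per line in the .fpt text and that
    class_names[i] is the class with relative index i.
    """
    probs = [float(line) for line in fpt_text.splitlines() if line.strip()]
    if len(probs) != len(class_names):
        raise ValueError(
            f"{len(probs)} probabilities but {len(class_names)} classes"
        )
    return [(name, p) for name, p in zip(class_names, probs) if p >= threshold]


# Values taken from the example in this thread.
example_fpt = "0.9999\n0.0001\n"
example_classes = ["Organic compounds", "Inorganic compounds"]

kept = confident_classes(example_fpt, example_classes, threshold=0.5)
```

The length check is there because a mismatch between the two files would otherwise silently pair probabilities with the wrong classes.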

@wkumler
Author

wkumler commented Apr 8, 2021

Fantastic, thanks!

@wkumler wkumler closed this as completed Apr 8, 2021
mfleisch pushed a commit that referenced this issue Apr 15, 2024
Resolve "New project space for REST API"

Closes #161, #145, #149, #162, #160, #151, #34, #33, #32, #165, and #150

See merge request bright-giant/sirius/sirius-frontend!41