
Question about implementation #39

Closed

hanulpark98 opened this issue Jul 21, 2023 · 4 comments

@hanulpark98

Hi, I've been trying to study the code, and thanks for your effort.
This may be a silly question, but in the file "run_experiments.py" I found that on lines 535 and 536 there aren't any arguments for get_cindex and build_combined_categorical, yet in the file "emetrics.py" the function get_cindex(Y, P) requires Y and P, and build_combined_categorical requires four arguments.
Do I have to fill them in myself? If so, can you explain Y and P for me?

@hkmztrk
Owner

hkmztrk commented Jul 21, 2023

Hi @hanulpark98, no worries :)

You can think of the lines you refer to as assigning the selected function to a placeholder variable, meaning

perfmeasure = get_cindex

The perfmeasure variable can hold any function, as long as that function takes the same inputs, Y and P. If you follow the calls where perfmeasure is passed as an argument, you will eventually reach the line below.

rperf = prfmeasure(val_Y, predicted_labels)

The code is a little bit old now, but the idea is to give it a little more flexibility in whether you choose the get_cindex or get_aupr function from emetrics.py. Hope this makes sense.
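For illustration, here is a minimal, self-contained sketch of that pattern (not the repository's exact code): a metric function is stored in a variable without being called, and is only invoked later, once Y (true affinities) and P (predictions) are available.

def get_cindex(Y, P):
    # Simplified concordance index: fraction of correctly ordered pairs.
    concordant = 0.0
    pairs = 0
    for i in range(len(Y)):
        for j in range(len(Y)):
            if Y[i] > Y[j]:            # a comparable pair of true affinities
                pairs += 1
                if P[i] > P[j]:        # predictions ordered the same way
                    concordant += 1.0
                elif P[i] == P[j]:     # ties count as half
                    concordant += 0.5
    return concordant / pairs if pairs else 0.0

def validate(prfmeasure, val_Y, predicted_labels):
    # prfmeasure is whichever metric function was selected earlier.
    rperf = prfmeasure(val_Y, predicted_labels)
    return rperf

perfmeasure = get_cindex  # selected here, but not called: no arguments yet
print(validate(perfmeasure, [5.0, 7.2, 6.1], [5.3, 7.0, 6.4]))  # prints 1.0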

You can refer to here for a PyTorch implementation.

@hanulpark98
Author

Oh, that makes sense.
Big thanks for the comment!

@hanulpark98
Author


Hi, I've been studying your code since then, and I found that of the 11902 proteins × 1353 ligands binding-affinity entries in your dataset Y, only 50181 (about 0.3 percent of the data) were not NaN. Can you explain how it is possible to train with so much missing data?

@hkmztrk
Owner

hkmztrk commented Aug 16, 2023

Hi @hanulpark98, it is true that actual experimental data is quite limited, so this ends up being an imbalanced dataset. Therefore, one should pay attention to metrics such as AUPR to interpret the results more carefully. Hope this answers your question. I'm closing the issue, but feel free to comment/reopen.
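To make this concrete, a common way to handle such a sparse affinity matrix is to train and evaluate only on the observed (non-NaN) pairs and, when computing AUPR, binarize the affinities at a domain threshold. Below is a minimal sketch with made-up numbers; the matrix and the threshold are illustrative, not the repository's actual preprocessing.

import numpy as np
from sklearn.metrics import average_precision_score

# Toy affinity matrix: rows = proteins, columns = ligands,
# with most entries missing (NaN), as noted in the question above.
Y = np.full((5, 4), np.nan)
Y[0, 1], Y[2, 3], Y[4, 0] = 7.2, 5.1, 8.9  # the few measured affinities

# Keep only the observed (non-NaN) pairs for training/evaluation.
rows, cols = np.where(~np.isnan(Y))
observed = Y[rows, cols]

# Stand-in predictions for those pairs (these would come from the model).
predicted = np.array([7.0, 5.5, 8.5])

# For AUPR, binarize at an assumed binding threshold (here, >= 7.0).
labels = (observed >= 7.0).astype(int)
print(average_precision_score(labels, predicted))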

@hkmztrk hkmztrk closed this as completed Aug 16, 2023