
Question about implementation #39

Closed

hanulpark98 opened this issue Jul 21, 2023 · 4 comments

@hanulpark98

Hi, I've been trying to study the code, and thanks for your effort.
This may be a silly question, but in the file "run_experiments.py" I found that on lines 535 and 536 there aren't any arguments for get_cindex and build_combined_categorical, yet in the file "emetrics.py" the function get_cindex(Y, P) requires Y and P, and build_combined_categorical requires four arguments.
Do I have to fill them in myself? If so, can you explain Y and P for me?

@hkmztrk
Owner

hkmztrk commented Jul 21, 2023

Hi @hanulpark98, no worries :)

You can think of the lines you refer to as assigning the selected function to a placeholder variable, meaning

perfmeasure = get_cindex

The perfmeasure variable can hold any function, as long as that function takes the same inputs, Y and P. If you follow the calls where perfmeasure is passed as an argument, you will eventually reach the line below.

rperf = prfmeasure(val_Y, predicted_labels)

The code is a little bit old now, but the idea is to give it a little more flexibility in whether you choose the get_cindex or get_aupr function from emetrics.py. Hope this makes sense.
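For illustration, here is a minimal, self-contained sketch of that pattern (not the repository's exact code): a metric function is stored in a variable without being called, and is only invoked later, once Y (true affinities) and P (predictions) are available.

def get_cindex(Y, P):
    # Simplified concordance index: fraction of correctly ordered pairs.
    concordant = 0.0
    pairs = 0
    for i in range(len(Y)):
        for j in range(len(Y)):
            if Y[i] > Y[j]:            # a comparable pair of true affinities
                pairs += 1
                if P[i] > P[j]:        # predictions ordered the same way
                    concordant += 1.0
                elif P[i] == P[j]:     # ties count as half
                    concordant += 0.5
    return concordant / pairs if pairs else 0.0

def validate(prfmeasure, val_Y, predicted_labels):
    # prfmeasure is whichever metric function was selected earlier.
    rperf = prfmeasure(val_Y, predicted_labels)
    return rperf

perfmeasure = get_cindex  # selected here, but not called: no arguments yet
print(validate(perfmeasure, [5.0, 7.2, 6.1], [5.3, 7.0, 6.4]))  # prints 1.0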

You can refer to here for a PyTorch implementation.

@hanulpark98
Author

Oh, that makes sense.
Big thanks for the comment!

@hanulpark98
Author


Hi, I've been studying your code since then, and I found that of the 11902 proteins × 1353 ligands binding-affinity entries in your dataset Y, only 50181 (about 0.3 percent of the data) were not NaN. Can you explain how it is possible to train with so much missing data?

@hkmztrk
Owner

hkmztrk commented Aug 16, 2023

Hi @hanulpark98, it is true that actual experimental data is quite limited, so this ends up being an imbalanced dataset. Therefore, one should pay attention to metrics such as AUPR to interpret the results more carefully. Hope this answers your question. I'm closing the issue, but feel free to comment/reopen.
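To make this concrete, a common way to handle such a sparse affinity matrix is to train and evaluate only on the observed (non-NaN) pairs and, when computing AUPR, binarize the affinities at a domain threshold. Below is a minimal sketch with made-up numbers; the matrix and the threshold are illustrative, not the repository's actual preprocessing.

import numpy as np
from sklearn.metrics import average_precision_score

# Toy affinity matrix: rows = proteins, columns = ligands,
# with most entries missing (NaN), as noted in the question above.
Y = np.full((5, 4), np.nan)
Y[0, 1], Y[2, 3], Y[4, 0] = 7.2, 5.1, 8.9  # the few measured affinities

# Keep only the observed (non-NaN) pairs for training/evaluation.
rows, cols = np.where(~np.isnan(Y))
observed = Y[rows, cols]

# Stand-in predictions for those pairs (these would come from the model).
predicted = np.array([7.0, 5.5, 8.5])

# For AUPR, binarize at an assumed binding threshold (here, >= 7.0).
labels = (observed >= 7.0).astype(int)
print(average_precision_score(labels, predicted))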

@hkmztrk hkmztrk closed this as completed Aug 16, 2023