RMSE #22

Closed
ErickQuintanar opened this issue Jul 11, 2023 · 2 comments

Comments

@ErickQuintanar

Hi,

I ran into trouble while trying to use DLKcat. I am doing research on E. coli, and using DLKcat to predict kcat values would be very helpful. In the paper, you report an RMSE of 1.06 on the test dataset. However, when I run my own calculations with the data from your database, I get 13742.58 for the whole set, 14360.11 for only the E. coli substrates, and 29778.05 for only the E. coli wild-type substrates. I have already double-checked my implementation of the RMSE calculation (I compute it twice, once with numpy and once with sklearn).
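To illustrate how much the scale matters here, below is a minimal sketch comparing an RMSE computed on raw kcat values with one computed on log-transformed values. The function names `rmse` and `log_rmse` and the toy numbers are made up for illustration; whether the paper's 1.06 was computed on a log scale is an assumption, not something confirmed in this thread.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def log_rmse(y_true, y_pred, base=10.0):
    """RMSE after log-transforming both sequences (values must be > 0)."""
    log_true = [math.log(t, base) for t in y_true]
    log_pred = [math.log(p, base) for p in y_pred]
    return rmse(log_true, log_pred)


# Toy values (hypothetical, not taken from the DLKcat database).
measured = [0.1, 10.0, 1000.0]
predicted = [1.0, 10.0, 100.0]

raw_error = rmse(measured, predicted)      # dominated by the largest kcat
log_error = log_rmse(measured, predicted)  # each miss counts as one order of magnitude
print(raw_error, log_error)
```

On the raw scale a single large kcat dominates the error, while on the log scale every prediction that is off by a factor of ten contributes equally. This is one way an RMSE in the tens of thousands and an RMSE near 1 could describe the same predictions.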

If you apply this patch, 0001-E.-coli-workflow.patch, following the instructions in this stackoverflow thread, you get all the Python and shell scripts I use to extract the data, make the predictions, and calculate the RMSE. After applying the patch, change into the DeepLearningApproach directory and unzip the Data/input.zip file (as per your instructions). The DeepLearningApproach directory contains the ecoli.sh script; to run it, execute sh ecoli.sh. The script extracts the data from your database and separates it according to our needs (using DeepLearningApproach/Data/database/extract_subtrates.py). Then it uses DLKcat to make the predictions (as indicated in the repository: python3 prediction_for_input.py input_file.tsv). Next, it merges the predictions with the measured values (using DeepLearningApproach/Data/database/merge_dbs.py), and finally it calculates the RMSE (using DeepLearningApproach/Data/database/rmse.py).
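For readers without the patch, the merge step can be pictured roughly as follows. This is only a sketch of the idea, not the contents of merge_dbs.py: the function name `merge_on_key` and the column names `substrate`, `sequence`, `kcat_measured` and `kcat_predicted` are hypothetical, and the inline TSV strings stand in for the real files.

```python
import csv
import io


def merge_on_key(measured_tsv, predicted_tsv, key_cols=("substrate", "sequence")):
    """Pair measured and predicted kcat values that share the same key columns.

    Returns a list of (measured, predicted) tuples. If a key occurs more than
    once in the measured table, the last row wins in this simple sketch.
    """
    measured = {}
    for row in csv.DictReader(io.StringIO(measured_tsv), delimiter="\t"):
        key = tuple(row[c] for c in key_cols)
        measured[key] = float(row["kcat_measured"])

    pairs = []
    for row in csv.DictReader(io.StringIO(predicted_tsv), delimiter="\t"):
        key = tuple(row[c] for c in key_cols)
        if key in measured:  # keep only rows present in both tables
            pairs.append((measured[key], float(row["kcat_predicted"])))
    return pairs


# Hypothetical stand-ins for the measured and predicted TSV files.
MEASURED = "substrate\tsequence\tkcat_measured\nATP\tMKV\t12.5\nNADH\tMSA\t0.4\n"
PREDICTED = "substrate\tsequence\tkcat_predicted\nATP\tMKV\t8.0\nFAD\tMTT\t3.0\n"

pairs = merge_on_key(MEASURED, PREDICTED)
print(pairs)
```

Joining on an explicit key rather than on row order guards against the two tables being sorted or filtered differently, which would silently pair each prediction with the wrong measurement.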

It would be very helpful if you could tell us how you calculated the RMSE and whether the predictions made by the model are correct.

Thanks in advance.

Erick Quintanar

@jiejiangtao

Kcat = data['Value']
regression.append(np.array([math.log2(float(Kcat))]))
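The author takes the logarithm of every kcat value.
Still, I think something is off with the RMSE. Processing the data the same way the author does, I get an RMSE of about 3 on my test set, while the R2 on my test set reaches 0.5.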
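An RMSE of about 3 and an R2 of 0.5 are not contradictory in themselves, since R2 = 1 - MSE / Var(y) when the variance is the population variance of the targets. The sketch below checks that identity on toy data and computes the target spread the two numbers would imply. The function names and toy values are made up for illustration, and nothing here reproduces the actual test set.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def implied_target_std(rmse_value, r2_value):
    """Population std of the targets implied by a given RMSE and R2 (R2 < 1)."""
    return rmse_value / math.sqrt(1.0 - r2_value)


# Toy log2(kcat) values (hypothetical).
y_true = [0.0, 2.0, 4.0, 6.0]
y_pred = [1.0, 2.0, 4.0, 5.0]
print(rmse(y_true, y_pred), r2_score(y_true, y_pred))

# Spread of log2(kcat) that would make RMSE = 3 and R2 = 0.5 consistent.
print(implied_target_std(3.0, 0.5))
```

Under these assumptions, the two reported numbers would fit together if the log2(kcat) values in the test set have a standard deviation of roughly 4.2, that is, if kcat spans several orders of magnitude.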

@jiejiangtao

I see that in the figures of the author's paper, kcat is always log10-transformed, but the code the author provides takes the base-2 logarithm.
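The choice of base only rescales the error: since log10(x) = log2(x) * log10(2), an RMSE on log2 values equals the RMSE on log10 values divided by log10(2), which is about 0.301. A minimal sketch of the conversion follows; the helper name `convert_log_rmse` is hypothetical, and treating the paper's 1.06 as a log10-scale value is an assumption based on the figures.

```python
import math


def convert_log_rmse(rmse_value, from_base, to_base):
    """Rescale an RMSE computed on log_{from_base} values to log_{to_base} values."""
    return rmse_value * math.log(from_base) / math.log(to_base)


def log_rmse(y_true, y_pred, base):
    """RMSE after log-transforming both sequences with the given base."""
    errors = [math.log(t, base) - math.log(p, base) for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# An RMSE of about 3 on log2(kcat), expressed on a log10 scale.
print(convert_log_rmse(3.0, 2, 10))

# An RMSE of 1.06 on log10(kcat) (assumed scale), expressed on a log2 scale.
print(convert_log_rmse(1.06, 10, 2))

# Consistency check on toy values (hypothetical).
measured = [0.5, 4.0, 120.0]
predicted = [1.0, 3.0, 60.0]
print(log_rmse(measured, predicted, 10), convert_log_rmse(log_rmse(measured, predicted, 2), 2, 10))
```

Under that assumption, a log2 RMSE of about 3 corresponds to roughly 0.9 on a log10 scale, so the base has to be fixed before two reported RMSE values can be compared.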

@feiranl feiranl closed this as completed Dec 29, 2023