Hi,
I ran into some trouble while trying to use DLKcat. I am doing research on E. coli, and using DLKcat to predict kcat values would be very helpful. In the paper, you report an RMSE of 1.06 on the test dataset. However, when I run my own calculations with the data from your database, I get 13742.58 for the whole set, 14360.11 for only the E. coli substrates and 29778.05 for only the E. coli wild-type substrates. I have already double-checked my implementation of the RMSE calculation (I compute it twice, once with numpy and once with sklearn).
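For reference, the RMSE calculation in question can be sketched as below. This is a minimal pure-Python version, not the actual `rmse.py` from the patch; it should agree with `np.sqrt(np.mean((y - y_hat) ** 2))`. The `rmse_log10` variant is included only as an assumption: whether the 1.06 in the paper refers to raw or log-transformed kcat values is not stated here, and the two give very different magnitudes. The sample values are made up for illustration and are not taken from the database.

```python
import math


def rmse(measured, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    squared_errors = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_log10(measured, predicted):
    """RMSE after log10-transforming both sequences (all values must be > 0).

    Assumption: shown only for comparison; it is not confirmed that the
    paper's RMSE was computed this way.
    """
    return rmse([math.log10(m) for m in measured],
                [math.log10(p) for p in predicted])


# Made-up kcat values (1/s), for illustration only.
measured = [1.0, 100.0, 10000.0]
predicted = [10.0, 100.0, 1000.0]

linear = rmse(measured, predicted)      # dominated by the largest value
logged = rmse_log10(measured, predicted)  # errors in orders of magnitude
print(f"RMSE on raw values:   {linear:.2f}")
print(f"RMSE on log10 values: {logged:.2f}")
```

With these toy numbers every prediction is within one order of magnitude of the measurement, yet the raw-scale RMSE is in the thousands, so the scale on which the error is computed matters a lot when comparing against a reported value.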
If you apply the patch 0001-E.-coli-workflow.patch following the instructions in this Stack Overflow thread, you get all the Python and shell scripts I use to extract the data, make the predictions and calculate the RMSE. After applying the patch, change into the `DeepLearningApproach` directory and unzip the `Data/input.zip` file (as per your instructions). The `DeepLearningApproach` directory contains the `ecoli.sh` script; to run it, execute `sh ecoli.sh`. The script extracts the data from your database and separates it according to our needs (using `DeepLearningApproach/Data/database/extract_subtrates.py`). Then it uses DLKcat to make the predictions (as indicated in the repository: `python3 prediction_for_input.py input_file.tsv`). After that, it merges the predictions with the measured values (using `DeepLearningApproach/Data/database/merge_dbs.py`) and finally calculates the RMSE (using `DeepLearningApproach/Data/database/rmse.py`).
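The merge step can be pictured roughly as follows. This is only a sketch of the idea and not the contents of `merge_dbs.py`: the column names (`substrate`, `sequence`, `kcat`, `kcat_predicted`), the join key and the sample rows are all hypothetical.

```python
import csv
import io

# Hypothetical file contents; column names and values are illustrative only.
measured_tsv = (
    "substrate\tsequence\tkcat\n"
    "ATP\tMKT\t12.5\n"
    "NADH\tMAL\t3.0\n"
)
predicted_tsv = (
    "substrate\tsequence\tkcat_predicted\n"
    "ATP\tMKT\t10.0\n"
    "NADH\tMAL\t4.5\n"
)


def read_tsv(text):
    """Parse tab-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def merge_on_key(measured_rows, predicted_rows, key=("substrate", "sequence")):
    """Inner-join both tables on `key`; return (measured, predicted) pairs."""
    lookup = {
        tuple(row[col] for col in key): float(row["kcat_predicted"])
        for row in predicted_rows
    }
    pairs = []
    for row in measured_rows:
        row_key = tuple(row[col] for col in key)
        if row_key in lookup:  # keep only rows present in both tables
            pairs.append((float(row["kcat"]), lookup[row_key]))
    return pairs


pairs = merge_on_key(read_tsv(measured_tsv), read_tsv(predicted_tsv))
print(pairs)
```

The resulting pairs are what the RMSE is then computed over, so any mismatch in the join key or in the units of the two columns would carry straight into the error value.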
It would be very helpful if you could tell us how you calculated the RMSE and whether the predictions made by the model are correct.
Thanks in advance.
Erick Quintanar