# Testing your own Algorithm

This notebook provides a step-by-step guide for testing a data file generated by your own algorithm. To ensure a fair comparison, use the [USPTO-50k dataset curated by Dai et al.](https://www.dropbox.com/sh/6ideflxcakrak10/AAAESdZq7Y0aNGWQmqCEMlcza/typed_schneider50k?dl=0&subfolder_nav_tracking=1) when training and testing your algorithm.
<br> <br>
Before running any cells below, please install the correct environment (evalretro) as outlined in the README.md file. Make sure to add the ipykernel package to the environment for running this notebook:
```
pip install ipykernel
```

In [1]:
import os
script_path = os.path.dirname(os.getcwd())

## Running the preprocess script
The data_import.py script ensures that the datafile contains the correct headers and entries before running main.py.
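
Before running the script, it can help to sanity-check the file yourself. The following is only a minimal sketch and not the check performed by data_import.py: it assumes the header row is exactly `idx,target,reactant` and that the predictions for each target are separated by an empty line, as described in the configuration comments in this notebook. The function name `split_into_blocks` and the sample rows are made up for illustration.

```python
import csv
import io

# Assumed column names, taken from the [idx, target, reactant] structure
# mentioned in the configuration comments.
EXPECTED_HEADER = ["idx", "target", "reactant"]


def split_into_blocks(text):
    """Check the header and group rows into blocks separated by blank lines."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows or [cell.strip() for cell in rows[0]] != EXPECTED_HEADER:
        raise ValueError(f"expected header {EXPECTED_HEADER}")

    blocks, current = [], []
    for row in rows[1:]:
        if not any(cell.strip() for cell in row):
            # A blank line closes the block of predictions for one target.
            if current:
                blocks.append(current)
                current = []
        elif len(row) != len(EXPECTED_HEADER):
            raise ValueError(f"unexpected number of columns in row: {row}")
        else:
            current.append(row)
    if current:
        blocks.append(current)
    return blocks


# Placeholder rows for illustration only, not real predictions.
sample = (
    "idx,target,reactant\n"
    "0,CCO,CC.O\n"
    "1,CCO,C.CO\n"
    "\n"
    "0,CCN,CC.N\n"
)
blocks = split_into_blocks(sample)
print(len(blocks), "targets found")
```

To check a real file, pass its contents instead of `sample`, for example `split_into_blocks(open(path).read())`.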
<br> <br>
For this example, the .csv file containing retrosynthesis predictions was placed in the 'examples/data' directory. The configuration file for this prediction was placed in the 'examples/config' directory with name 'example_config.json'. <br><br>
The configuration for the file is as follows: <br>
```
{"my_algorithm":{
    "file":"my_predictions.csv",    -> name of the csv file within data directory
    "preprocess":false,             -> false as .csv already follows structure of [idx, target, reactant]
    "skip":false,                   -> class is LineSeparated, hence false
    "class":"LineSeparated",        -> the predictions for each target are separated through an empty line
    "delimiter":"comma",            -> .csv, hence comma
    "colnames": null,               -> null as colnames are provided within .csv
    "type": "",                     -> Not Needed
    "name": ""                      -> Not Needed
}
}
```
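
The `->` annotations above are explanatory only and are not valid JSON, so they must not appear in the actual file. As a minimal sketch, the same configuration could be generated from Python with the standard `json` module. The helper name `write_config` is made up for illustration, and the demonstration writes to a temporary directory rather than to 'examples/config'.

```python
import json
import tempfile
from pathlib import Path

# Same keys and values as the annotated configuration, without the comments.
config = {
    "my_algorithm": {
        "file": "my_predictions.csv",
        "preprocess": False,
        "skip": False,
        "class": "LineSeparated",
        "delimiter": "comma",
        "colnames": None,  # serialized as null
        "type": "",
        "name": "",
    }
}


def write_config(config_dir, config_name="example_config.json"):
    """Write the configuration as JSON and return the path of the file."""
    path = Path(config_dir) / config_name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=4))
    return path


# Demonstration in a temporary directory; read the file back to confirm
# that it parses as JSON.
with tempfile.TemporaryDirectory() as tmp:
    config_file = write_config(tmp)
    loaded = json.loads(config_file.read_text())
```

Calling `write_config("examples/config")` from the repository root would place the file where the commands in this notebook expect it.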


In [2]:
# Running pre-process script on example .csv file with correct paths
!python $script_path/data_import.py --data_path $script_path/'examples/data' --config_path $script_path/'examples/config' --config_name 'example_config.json'

my_algorithm data saved to /home/friedrich/phd/evalretro/examples/data/my_algorithm directory.


The script above can also be run from the command line (instead of from a Jupyter notebook) as:
```
python data_import.py --data_path 'examples/data' --config_path 'examples/config' --config_name 'example_config.json'
``` 
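
If you would rather launch the script from a Python program than from a shell, one option is the standard `subprocess` module. This is a sketch only: the helper names `build_command` and `run_script` are made up for illustration, and the flags are the ones used in the command above.

```python
import subprocess
import sys


def build_command(script, data_path, config_path, config_name):
    """Assemble the argument list for one of the evalretro scripts."""
    return [
        sys.executable, script,
        "--data_path", data_path,
        "--config_path", config_path,
        "--config_name", config_name,
    ]


def run_script(cmd):
    """Run the command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


cmd = build_command(
    "data_import.py", "examples/data", "examples/config", "example_config.json"
)
print(" ".join(cmd))
# run_script(cmd)  # uncomment to execute; assumes the repository root as cwd
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.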

## Running the main script
Call the main script as follows, providing the desired arguments: <br> <br>
<font color='red'>Note:</font> When using --quick_eval True, the Top-k Invalidity is reported as a fraction, **NOT** as a percentage as in the paper. Multiply this result by 100 to obtain Invalidity %.

In [3]:
!python $script_path/main.py --data_path $script_path/'examples/data' --config_path $script_path/'examples/config' --config_name 'example_config.json' --quick_eval True

Instructions for updating:
Use `tf.cast` instead.
-------------------------------------------------------------------------------
2023-12-12 19:00:03,306 INFO Evaluating my_algorithm:
 Results will be saved in /home/friedrich/phd/evalretro/examples/results/my_algorithm
-------------------------------------------------------------------------------
2023-12-12 19:00:03,306 INFO Canonicalizing smile strings from my_algorithm dataset.
 Results are written to /home/friedrich/phd/evalretro/examples/data/my_algorithm/my_algorithm_processed.csv
100%|███████████████████████████████████████| 519/519 [00:00<00:00, 1939.28it/s]
2023-12-12 19:00:03,581 INFO Evaluating diversity
100%|█████████████████████████████████████████████| 9/9 [00:00<00:00, 11.85it/s]
2023-12-12 19:00:04,744 INFO Evaluating duplicates
100%|███████████████████████████████████████████| 9/9 [00:00<00:00, 7743.33it/s]
2023-12-12 19:00:04,749 INFO Evaluating invsmiles
100%|████████████████████████████████████████████| 9/9 [00:00<0

The script above can also be run from the command line (instead of from a Jupyter notebook) as:
```
python main.py --data_path 'examples/data' --config_path 'examples/config' --config_name 'example_config.json' --quick_eval True
``` 
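
The conversion mentioned in the --quick_eval note can be sketched as below. The dictionary layout and its values are placeholders made up for illustration; they are not the format in which main.py stores its results and not real output.

```python
def to_percent(fraction):
    """Convert a fractional invalidity score to a percentage."""
    return 100.0 * fraction


# Hypothetical Top-k Invalidity values keyed by k (placeholders only).
top_k_invalidity = {1: 0.125, 5: 0.25}

invalidity_percent = {k: to_percent(v) for k, v in top_k_invalidity.items()}
for k, pct in invalidity_percent.items():
    print(f"Top-{k} Invalidity: {pct:.1f} %")
```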