# Tutorial A - basic usage

In this tutorial you will learn the basics of running retrosynthesis experiments with AiZynthFinder.

After the completion of this tutorial, you will know:
* How to download public models and data files
* How to write a simple configuration file
* How to select models to be used in search
* How to select stock to be used in search
* How to perform a retrosynthesis search
* How to perform basic analysis of the output


We will start by installing the package from PyPI.

In [None]:
!pip install --quiet aizynthfinder
!pip install --ignore-installed Pillow==9.0.0

### Download public data files

Throughout this tutorial we will use publicly available models and data files.
These can be downloaded to our local folder using a convenient tool.

We will download
- Expansion models trained on the USPTO data
- Filter model trained on USPTO data
- ZINC stock file

In [None]:
!mkdir --parents data && download_public_data data

### The aizynthfinder configuration file

The main Python interface to AiZynthFinder is a class called `AiZynthFinder`. It is instantiated with a configuration, either from disk in the form of a YAML file or from a dictionary.

The configuration is central to the execution and holds information about:
- What models to use
- What stock to use
- How to configure the search algorithm
- What score to compute for the routes

In this tutorial, we will only look at the first two; the others will be covered in upcoming tutorials.

The script that we used above to download the public models also provided us with a config file that looks like this:

```yaml
expansion:
  uspto:
    - uspto_model.onnx
    - uspto_templates.csv.gz
  ringbreaker:
    - uspto_ringbreaker_model.onnx
    - uspto_ringbreaker_templates.csv.gz
filter:
  uspto: uspto_filter_model.onnx
stock:
  zinc: zinc_stock.hdf5
```

The `expansion` section specifies the expansion models to load into memory. This does not, however, mean that they will be used in the search.

Here we load two models, one general and one specific for breaking rings. `uspto` and `ringbreaker` are labels that we can use to reference the models when setting up the search.

The two files specified for each model are 1) an ONNX model file containing the weights of the neural network, and 2) a CSV file with metadata on the templates.

Similarly, the `filter` section specifies the filter model. Here, only one file needs to be specified - the ONNX model weights.

Finally, the `stock` section specifies the stock to load. Here we load one that we will refer to as `zinc`; its compounds will be loaded from `zinc_stock.hdf5`.
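
Before loading the configuration, it can be useful to confirm that every file it names is actually on disk. The sketch below mirrors the configuration shown above as a Python dictionary and collects the referenced file names. The helpers `referenced_files` and `missing_files` are hypothetical and not part of AiZynthFinder, and the check assumes that the file names are relative to the `data` folder; the generated config file may contain full paths instead.

```python
from pathlib import Path

# The configuration shown above, written as a plain dictionary.
CONFIG = {
    "expansion": {
        "uspto": ["uspto_model.onnx", "uspto_templates.csv.gz"],
        "ringbreaker": [
            "uspto_ringbreaker_model.onnx",
            "uspto_ringbreaker_templates.csv.gz",
        ],
    },
    "filter": {"uspto": "uspto_filter_model.onnx"},
    "stock": {"zinc": "zinc_stock.hdf5"},
}


def referenced_files(config):
    """Return (section, label, filename) for every file named in the config."""
    found = []
    for section, entries in config.items():
        for label, files in entries.items():
            # A label maps either to one file name or to a list of file names.
            if isinstance(files, str):
                files = [files]
            for filename in files:
                found.append((section, label, filename))
    return found


def missing_files(config, folder="data"):
    """Return the referenced file names that do not exist in `folder`."""
    return [
        filename
        for _, _, filename in referenced_files(config)
        if not (Path(folder) / filename).exists()
    ]


for section, label, filename in referenced_files(CONFIG):
    print(f"{section:<10} {label:<12} {filename}")
```

Calling `missing_files(CONFIG)` after the download step should return an empty list if the assumption about relative file names holds.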


### Initializing the AiZynthFinder interface

Now we can start to set up the retrosynthesis search using the `AiZynthFinder` interface. We will also initialize the logging level so that we get some useful information printed to the screen.

In [None]:
import logging
from aizynthfinder.utils.logging import setup_logger
setup_logger(logging.INFO)

from aizynthfinder.aizynthfinder import AiZynthFinder

In [None]:
finder = AiZynthFinder("data/config.yml")

When instantiating the `AiZynthFinder` class with our config file, we see that the two template-based models, the filter model, and the stock file are loaded.

Even though they are loaded into memory, they are not automatically used in the search. For this we need to select what stock and models we want to use.

We will start by selecting all stocks (although we only loaded one) and the expansion policy with the tag `uspto`, i.e., the general expansion model.

In [None]:
finder.stock.select_all()
finder.expansion_policy.select("uspto")
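
The selection step can be wrapped in a small helper so that the same choices are applied consistently. This is only a sketch: `select_for_search` is a hypothetical name, and it assumes that `finder.filter_policy` offers the same `select` method as the expansion policy. The filter is left unselected by default, matching the cell above.

```python
def select_for_search(finder, expansion="uspto", filter_label=None):
    """Select all stocks, one expansion model and, optionally, a filter model.

    `finder` is expected to be an AiZynthFinder instance created as above.
    """
    finder.stock.select_all()
    finder.expansion_policy.select(expansion)
    if filter_label is not None:
        # Assumption: the filter policy is selected by label,
        # in the same way as the expansion policy.
        finder.filter_policy.select(filter_label)
    return finder
```

For example, `select_for_search(finder, filter_label="uspto")` would additionally activate the filter model that was loaded from the configuration.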

### Starting a search

There are two steps to a retrosynthesis search once you have set up the interface:
- Set the target SMILES
- Initiate the search

In [None]:
finder.target_smiles = "Cc1cccc(C)c1N(CC(=O)Nc1ccc(-c2ncon2)cc1)C(=O)C1CCS(=O)(=O)CC1"
display(finder.target_mol.rd_mol)

In [None]:
finder.tree_search(show_progress=True)

That was quick, right?


### Analysis of the output

Now we need to extract routes from the retrosynthesis search tree

In [None]:
finder.build_routes()
finder.analysis.tree_statistics()

The `tree_statistics` method returns some general information about the search tree and the top-ranked routes.

We can for instance read that:
- There are 618 nodes in the search tree
- The depth of the search tree is 6
- There are 174 routes in the search tree, of which 37 are solved (all starting materials are in stock)
- The top-ranked route is a 2-step route with 3 starting materials
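
The figures listed above can also be pulled out programmatically. The sketch below assumes that `tree_statistics` returns a dictionary, and the key names it uses (`number_of_nodes`, `max_transforms`, `number_of_routes`, `number_of_solved_routes`, `number_of_steps`, `number_of_precursors`) are assumptions that should be checked against the actual output. `summarize_statistics` is a hypothetical helper, not part of AiZynthFinder.

```python
def summarize_statistics(stats):
    """Build a short text summary from a statistics dictionary.

    Missing keys are reported as "n/a" rather than raising an error,
    because the key names used here are assumptions.
    """
    fields = [
        ("Nodes in search tree", "number_of_nodes"),
        ("Depth of search tree", "max_transforms"),
        ("Routes", "number_of_routes"),
        ("Solved routes", "number_of_solved_routes"),
        ("Steps in top-ranked route", "number_of_steps"),
        ("Starting materials in top-ranked route", "number_of_precursors"),
    ]
    return "\n".join(f"{label}: {stats.get(key, 'n/a')}" for label, key in fields)
```

In the notebook, this could be called as `print(summarize_statistics(finder.analysis.tree_statistics()))`.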

We only extract the top-ranked routes by default.

In [None]:
len(finder.routes)

We can visualize the top-ranked route like this:

In [None]:
finder.routes.reaction_trees[0].to_image()

We can iterate over all the starting materials and display them together with their SMILES strings

In [None]:
for mol in finder.routes.reaction_trees[0].leafs():
    print(mol.smiles)
    display(mol.rd_mol)

We can compute some scores for the extracted routes. You will learn more about how this is done in a forthcoming tutorial.

We will import `pandas` so that we can get a nice-looking table

In [None]:
import pandas as pd
finder.routes.compute_scores(*finder.scorers.objects())
pd.DataFrame(
    finder.routes.all_scores
)
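
The score table can also be used to pick out routes programmatically. The sketch below assumes that `finder.routes.all_scores` is a list with one dictionary of numeric scores per route, and that the score of interest is called `"state score"`. Both the score name and the helper `rank_routes` are assumptions made for illustration.

```python
def rank_routes(all_scores, score_name="state score"):
    """Return route indices ordered from highest to lowest score.

    Routes that lack the requested score are left out.
    """
    scored = [
        (scores[score_name], index)
        for index, scores in enumerate(all_scores)
        if score_name in scores
    ]
    # Sort by score (descending); ties keep their original route order.
    scored.sort(key=lambda pair: -pair[0])
    return [index for _, index in scored]
```

With the objects from this tutorial, `rank_routes(finder.routes.all_scores)[0]` would give the position of the best-scoring route in `finder.routes.reaction_trees`.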

That is all for now!

Let's continue with the next tutorial, where you will learn how to do more advanced route analysis.