# Correct Ph005

Ph005 was incorrectly specified as 4-trifluoromethylphenyl-KAT `O=C(c1ccc(C(F)(F)F)cc1)[B-](F)(F)F.[K+]` on ChemInventory. New characterization data showed that it is in fact 3,5-bis(trifluoromethyl)phenyl-KAT `O=C(c1cc(C(F)(F)F)cc(C(F)(F)F)c1)[B-](F)(F)F.[K+]`.

We need to:
- update `building_blocks` table (SMILES, mass, molecular formula)
- enumerate reactions and update `virtuallibrary` table (products A-D, F, G)
- update all existing entries in the `experiments` table with new product SMILES (products A-D, F, G)
- from corrected DB, prepare MoBiAS submission files for reprocessing the data (affected data: row J of exps 4, 7, 9, 19, 23)
- after reprocessing, repeat evaluation for affected plates

In [None]:
import sys
import pathlib
sys.path.append(str(pathlib.Path().absolute().parents[1]))

from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize
from rdkit.Chem.rdMolDescriptors import CalcExactMolWt, CalcMolFormula

from src.util.db_utils import SynFermDatabaseConnection
from src.definitions import DATA_DIR

In [None]:
# Connect to database
con = SynFermDatabaseConnection()

In [None]:
# correct structure for Ph005
mol = Chem.MolFromSmiles("O=C(c1cc(C(F)(F)F)cc(C(F)(F)F)c1)[B-](F)(F)F.[K+]")
mol

## Correct `building_blocks` table

In [None]:
# old data
con.con.execute("SELECT * FROM building_blocks WHERE long = 'Ph005';").fetchall()

In [None]:
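# update image
# import Draw explicitly instead of relying on it being loaded as a side effect of another import
from rdkit.Chem import Draw

Draw.MolToFile(mol, str(DATA_DIR / "db" / "static" / "image" / "Ph005.png"))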

In [None]:
# calc new data
smiles = Chem.MolToSmiles(mol)

fragment_chooser = rdMolStandardize.LargestFragmentChooser()
mol_anion = fragment_chooser.choose(mol)

lcms_mass = CalcExactMolWt(mol_anion)
lcms_formula = CalcMolFormula(mol_anion)

print(smiles, lcms_mass, lcms_formula)

In [None]:
# update building_blocks table
with con.con:
    con.con.execute(
        "UPDATE building_blocks SET SMILES = ?, lcms_mass_1 = ?, lcms_formula_1 = ? WHERE long = 'Ph005';",
        (smiles, lcms_mass, lcms_formula),
    )

## Correct `virtuallibrary` table

We corrected this outside the notebook by dropping all rows with `initiator_long = 'Ph005'`, then running the `add_new_products_to_vl.ipynb` notebook to fill in the missing rows.
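
The row-dropping step could look roughly like the sketch below. It runs against an in-memory SQLite database with a toy `virtuallibrary` table. The exact column set, the `id` column, and the sample rows (including the placeholder SMILES) are assumptions for illustration and do not reflect the real schema. The helper `drop_initiator_rows` is hypothetical as well.

```python
import sqlite3


def drop_initiator_rows(con, initiator_long):
    """Delete all virtuallibrary rows for one initiator and return how many were removed."""
    with con:  # commits on success, rolls back on error
        cur = con.execute(
            "DELETE FROM virtuallibrary WHERE initiator_long = ?;",
            (initiator_long,),
        )
    return cur.rowcount


# Toy database standing in for the real one (schema and rows are assumptions)
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE virtuallibrary ("
    "id INTEGER PRIMARY KEY, long_name TEXT, initiator_long TEXT, type TEXT, SMILES TEXT);"
)
demo.executemany(
    "INSERT INTO virtuallibrary (long_name, initiator_long, type, SMILES) VALUES (?, ?, ?, ?);",
    [
        ("demo_product_1", "Ph005", "A", "C"),    # placeholder SMILES
        ("demo_product_1", "Ph005", "B", "CC"),   # placeholder SMILES
        ("demo_product_2", "Ph001", "A", "CCC"),  # unaffected initiator
    ],
)

n_deleted = drop_initiator_rows(demo, "Ph005")
n_remaining = demo.execute("SELECT COUNT(*) FROM virtuallibrary;").fetchone()[0]
print(n_deleted, n_remaining)
```

Passing the initiator as a query parameter rather than formatting it into the SQL string keeps the statement reusable for other building blocks.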

## Update `experiments` table

We did this externally with these SQLite queries:

Flag test / legacy experiments (outside experiments 4-29) that do not need immediate reprocessing with a comment:
```sqlite
UPDATE experiments SET comment = 'Ph005 needs reprocessing' WHERE initiator_long = 'Ph005' AND exp_nr NOT BETWEEN 4 AND 29;
```

Clear the current LCMS ratios and the `valid` flag for the experiments that will be reprocessed:
```sqlite
UPDATE experiments SET product_A_lcms_ratio = NULL,
                       product_B_lcms_ratio = NULL,
                       product_C_lcms_ratio = NULL,
                       product_D_lcms_ratio = NULL,
                       product_E_lcms_ratio = NULL,
                       product_F_lcms_ratio = NULL,
                       product_G_lcms_ratio = NULL,
                       product_H_lcms_ratio = NULL,
                       valid = NULL
                   WHERE initiator_long = 'Ph005' AND exp_nr BETWEEN 4 AND 29;
```

Replace product SMILES with corrected values from virtual library table (run for each product A-D, F, G, substituting the right letter):
```sqlite
UPDATE experiments
SET product_G_smiles = helper.smi
FROM (SELECT experiments.id AS eid, v.SMILES AS smi
      FROM experiments JOIN virtuallibrary v ON experiments.long_name = v.long_name
      WHERE v.type = 'G') AS helper
WHERE experiments.id = helper.eid AND experiments.initiator_long = 'Ph005' AND experiments.exp_nr BETWEEN 4 AND 29;
```
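
Instead of editing the letter by hand six times, the per-product queries can be generated. The following is a minimal sketch: `PRODUCT_TYPES`, `QUERY_TEMPLATE`, and `build_update_query` are hypothetical names introduced here, and the template simply mirrors the query above. Column names cannot be bound as SQL parameters, so the letter is checked against a fixed whitelist before it is formatted into the string.

```python
# Products affected by the Ph005 correction (from the task list at the top)
PRODUCT_TYPES = ("A", "B", "C", "D", "F", "G")

# Same query as above, with the product letter left as a placeholder
QUERY_TEMPLATE = """
UPDATE experiments
SET product_{letter}_smiles = helper.smi
FROM (SELECT experiments.id AS eid, v.SMILES AS smi
      FROM experiments JOIN virtuallibrary v ON experiments.long_name = v.long_name
      WHERE v.type = '{letter}') AS helper
WHERE experiments.id = helper.eid
  AND experiments.initiator_long = 'Ph005'
  AND experiments.exp_nr BETWEEN 4 AND 29;
"""


def build_update_query(letter):
    """Return the UPDATE statement for one product type, rejecting unknown letters."""
    if letter not in PRODUCT_TYPES:
        raise ValueError(f"unexpected product type: {letter!r}")
    return QUERY_TEMPLATE.format(letter=letter)


queries = [build_update_query(letter) for letter in PRODUCT_TYPES]
for query in queries:
    print(query)
```

Each generated statement could then be run in any SQLite client. Note that the `UPDATE ... FROM` syntax requires SQLite 3.33 or newer.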


## Update submission files for MoBiAS

Rerun the LCMS submission generator:
```bash
python -m src.experiment_planning.generatelcmssubmission JG246 JG247 JG248 JG249 JG250 JG251 JG264 JG265 JG266 JG267 JG268 JG269 JG277 JG278 JG279 JG280 JG281 JG282 JG339 JG340 JG341 JG342 JG343 JG344 JG363 JG364 JG365 JG366 JG367 JG368
```
Then prune the output with `head` / `tail` in bash so that only row J remains, copy the result into the MoBiAS submission files, and submit them for reprocessing.
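
As an alternative to counting lines for `head` / `tail`, the pruning could be done by filtering on the well identifier. The sketch below is only illustrative: the layout of the submission files is not shown in this notebook, so the CSV format, the `well` column name, and well IDs such as `J3` are all assumptions, and `keep_plate_row` is a hypothetical helper.

```python
import csv
import io


def keep_plate_row(csv_text, row_letter="J", well_column="well"):
    """Return CSV text containing the header and only the records from one plate row.

    Assumes (hypothetically) a header line and a column holding well IDs like 'J3'.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in reader:
        if record[well_column].startswith(row_letter):
            writer.writerow(record)
    return out.getvalue()


# Toy input standing in for a generated submission file
demo_csv = "well,sample\nI20,x\nJ3,y\nJ4,z\nK3,w\n"
pruned = keep_plate_row(demo_csv)
print(pruned)
```

Filtering by content avoids off-by-one mistakes when the number of lines per plate row differs between files.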