Investigation of the structure-odor relationship using a Transformer model

Visualization results of attention matrices

Only a part of visualization results are shown in the paper, more results can be found at 'attention_result' file.

More visualization results corresponding to the figures in paper can be found according to the following table.

Figure in the paper	file path for more results	memo
Figure 6	attention_result/substructure prediction/sub-12_30_61-eachdc	TP samples
Figure 7 (2 decoder layers)	attention_result/substructure prediction/sub-12_30_62-eachdc	TP samples
Figure 7 (3 decoder layers)	attention_result/substructures/sub-12_30_63-eachdc	TP samples
Figure 7 (4 decoder layers)	attention_result/substructures/sub-12_30_64-eachdc	TP samples
Figure 8 (1 decoder layers)	attention_result/substructures/sub-12_30_61-eachdc	TP samples
Figure 8 (2 decoder layers)	attention_result/substructures/sub-12_30_62	TP samples
Figure 8 (3 decoder layers)	attention_result/substructures/sub-12_30_63	TP samples
Figure 8 (4 decoder layers)	attention_result/substructures/sub-12_30_64	TP samples
Figure 9	attention_result/OD prediction/od-dc2	TP samples
Figure 10	attention_result/OD prediction/attn_repeat100-tp	TP samples
Figure 11	attention_result/OD prediction/attn_repeat100-tn	TN samples screened by substructures constraints

Experimental code

'od_data.ipynb' is used to collect SMILES and corresponding odor descriptors from The Good Scents Company (http://www.thegoodscentscompany.com/). You can collect data with this code directly after changing the save path at first line.

'experiment.ipynb' is an example to run the program that used to predict substructures and odor descriptors. Remember to change all the file path appear in this file to your local file path if you want to use this code. (In the experiment of substructre prediction, we creat target by: (1). mol = rdkit.Chem.MolFromSmiles('SMILES string') (2). target = rdkit.Chem.MolFromSmarts('SMARTS string') (3). mol.GetSubstructMatches(target))

'mainCode.py' is the code that imported by 'experiment.ipynb'. Remember to change all the file path appear in this file to your local file path if you want to use this code.

'Transformer_mol3.py' is the code of proposed model mentioned in paper, and it is called by Transformer2OD_tada.modelBuild2 in 'mainCode.py'

The version of Tensorflow used in this work is 2.8. The version of RDKit used in this work is 2021.09.4

Dataset

We collected SMILES and corresponding odor descriptors (ODs) from The Good Scents Company (TGSC) website http://www.thegoodscentscompany.com/ .

‘smiles-website_odorant.txt’ records 4240 canonical SMILES (converted by RDKit) of odorant and their corresponding webpages on TGSC. The ODs for each SMILES can be obtained by accessing the webpage corresponding to the SMILES. Similarly, 'smiles-website_odorless.txt' contains 222 canonical SMILES (converted by RDKit) of molecules labeled as odorless and their corresponding web pages on TGSC.

For copyright reasons, we cannot directly make the dataset that records the SMILES used in the experiment and their corresponding odor descriptors available in this GitHub repository. If you need the dataset anyway, please contact us by email (zheng.xiaofan.413@s.kyushu-u.ac.jp) . We would like to send you the dataset used in the experiments, however, please note that the copyright belongs to TGSC.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Investigation of the structure-odor relationship using a Transformer model

Visualization results of attention matrices

Experimental code

Dataset

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 58 Commits
attention_result		attention_result
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Transformer_mol3.py		Transformer_mol3.py
experiment.ipynb		experiment.ipynb
mainCode.py		mainCode.py
od_data.ipynb		od_data.ipynb
smiles-website_odorant.txt		smiles-website_odorant.txt
smiles-website_odorless.txt		smiles-website_odorless.txt

License

zhenghah/0607

Folders and files

Latest commit

History

Repository files navigation

Investigation of the structure-odor relationship using a Transformer model

Visualization results of attention matrices

Experimental code

Dataset

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages