# Fragnet search example

Source notebook: https://github.com/OpenRiskNet/notebooks/blob/master/DataCure/Fragnet/fragnet-search.ipynb

Example of using Fragnet Search to fetch related molecules using the Fragment Network on the OpenRiskNet infrastructure.
**NOTE:** Details are subject to change.

Fragnet Search is an API around data from the Fragment Network. The Fragment Network was conceived by Astex and is used in their
fragment-based drug design processes. It is described in this publication: https://pubs.acs.org/doi/10.1021/acs.jmedchem.7b00809

Astex did not make the code available, so Anthony Bradley, while he was working at Diamond Light Source, reimplemented it in
Python (see https://github.com/xchem/fragalysis). Informatics Matters have been working with Diamond to further develop this code
for use in Diamond's fragment screening program ([XChem](https://www.diamond.ac.uk/Instruments/Mx/Fragment-Screening.html)) and
are working on finding other uses for this powerful technology. One of these is the [Fragnet Search](https://fragnet.informaticsmatters.com/)
web application. More details can be found at: https://www.informaticsmatters.com/pages/fragment_network.html

The REST search API of Fragnet Search has been deployed to the OpenRiskNet reference site, allowing applications on the site to use this
API. The typical use is to expand a molecule using the fragment network and fetch a set of related molecules.

The key benefit of the fragment network over traditional chemical fingerprint-based similarity searches is that the results are more
'chemically intuitive', and this is especially true for small molecules such as fragments and building blocks.

In [7]:
import requests
import json
import urllib.parse

# The requests_toolbelt module is used to handle multipart responses.
# If it is missing, install it with `pip install requests-toolbelt`. This may need repeating each time the Notebook pod starts.
try:
    from requests_toolbelt.multipart import decoder
except ImportError:
    %pip install requests-toolbelt
    from requests_toolbelt.multipart import decoder

In [8]:
# Define some URLs and params
base_url = 'http://fragnet-search.fragnet-search.svc:8080/fragnet-search/rest'
expansion_url = base_url + '/v2/search/expand/'
keycloak_url = 'https://sso.prod.openrisknet.org/auth/realms/openrisknet/protocol/openid-connect/token'

# set to False if self signed certificates are being used
tls_verify=True

In [9]:
# Test the PING service. Should give a 200 response and return 'OK'.
# If not then nothing else is going to work.
# This endpoint is not authenticated

url = base_url + '/ping'

print("Requesting GET " + url)
resp = requests.get(url, verify=tls_verify)
print('Response Code: ' + str(resp.status_code))
print(resp.text)

Requesting GET http://fragnet-search.fragnet-search.svc:8080/fragnet-search/rest/ping
Response Code: 200
OK


## Authentication

You need to fetch an access token from the OpenRiskNet SSO environment (Keycloak) and pass that token with your requests to the Fragnet Search API.
Details of this may well change soon.
Contact tdudgeon@informaticsmatters.com for details about how to log in.

In [10]:
# Need to specify your Keycloak SSO username and password so that we can get a token

import getpass
username = input('Username')
password = getpass.getpass('Password')

Username user1
Password ···········


In [11]:
# Get token from Keycloak. This will have a finite lifetime.
# If your requests are getting a 401 error your token has probably expired.

data = {'grant_type': 'password', 'client_id': 'fragnet-search', 'username': username, 'password': password}
kresp = requests.post(keycloak_url, data=data, verify=tls_verify)
print('Response code: ' + str(kresp.status_code))

j = kresp.json()
token = j['access_token']
print("Token length: " + str(len(token)))
#token

Response code: 200
Token length: 1334
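
Because the token has a finite lifetime, a long-running notebook may want to refresh it automatically rather than wait for a 401. The sketch below is one possible approach, not part of the original notebook. It assumes the Keycloak token response carries the standard OAuth 2.0 `expires_in` field (lifetime in seconds); `TokenCache` and the `fetch_token` callable are hypothetical names introduced here for illustration.

```python
import time


class TokenCache:
    """Cache an access token and fetch a new one shortly before it expires."""

    def __init__(self, fetch_token, margin_seconds=30, clock=time.monotonic):
        # fetch_token: hypothetical callable returning the parsed token response (a dict).
        self._fetch_token = fetch_token
        self._margin = margin_seconds
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            payload = self._fetch_token()
            self._token = payload['access_token']
            # Assumption: 'expires_in' is present. If it is missing, the token
            # is treated as already expired and is fetched again on the next call.
            self._expires_at = now + payload.get('expires_in', 0) - self._margin
        return self._token


# Possible use in this notebook (needs network access, so left as comments):
# def fetch_from_keycloak():
#     resp = requests.post(keycloak_url, data=data, verify=tls_verify)
#     resp.raise_for_status()
#     return resp.json()
#
# tokens = TokenCache(fetch_from_keycloak)
# headers = {'Authorization': 'bearer ' + tokens.get()}
```

Calling `tokens.get()` before each request keeps the header current without asking for the password again, at the cost of holding the credentials in memory for the life of the kernel.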


## Run the expansion search
The parameters are:
* query_smiles - the molecule to search for, as SMILES (passed as the last element of the URL path)
* hops - the number of edges to traverse in the fragment network. Must be 1 or 2
* hac - the maximum change in heavy atom count between the query and the result molecules
* rac - the maximum change in ring atom count between the query and the result molecules

POST operations using Molfile format are also supported.

The result is JSON.
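
The parameters above map onto a GET URL: the SMILES goes into the path and `hac`, `rac` and `hops` go into the query string. As a minimal sketch, the helper below builds that URL with the standard library and rejects a `hops` value other than 1 or 2 before any request is sent. `build_expansion_url` is a hypothetical name, not part of the Fragnet Search API.

```python
from urllib.parse import quote, urlencode


def build_expansion_url(expansion_url, smiles, hops=1, hac=3, rac=1):
    """Return the GET URL for an expansion search."""
    if hops not in (1, 2):
        raise ValueError('hops must be 1 or 2, got %r' % (hops,))
    # Percent-encode characters such as '#', '(' and ')' that are common in SMILES.
    path = expansion_url + quote(smiles)
    query = urlencode({'hac': hac, 'rac': rac, 'hops': hops})
    return path + '?' + query


example = build_expansion_url(
    'http://fragnet-search.fragnet-search.svc:8080/fragnet-search/rest/v2/search/expand/',
    'NC1CCCNC1', hops=1, hac=3, rac=1)
print(example)
```

The same URL could be produced by passing a `params` dictionary to `requests.get`; building it explicitly makes it easy to print and inspect before sending.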

In [12]:
query_smiles = 'NC1CCCNC1'
hops = 1
hac = 3
rac = 1
url = expansion_url + urllib.parse.quote(query_smiles) + '?hac=' + str(hac) + '&rac=' + str(rac) + '&hops=' + str(hops)
print("Requesting GET " + url)
jobs_resp = requests.get(url, headers={'Authorization': 'bearer ' + token}, verify=tls_verify)
print('Response Code: ' + str(jobs_resp.status_code))
# NOTE: the name `json` shadows the json module imported earlier; later cells rely on this name.
json = jobs_resp.json()
json

Requesting GET http://fragnet-search.fragnet-search.svc:8080/fragnet-search/rest/v2/search/expand/NC1CCCNC1?hac=3&rac=1&hops=1
Response Code: 200


{'query': 'MATCH p=(m:F2)-[:FRAG]-(e:Mol)<-[:NonIso*0..1]-(c:Mol)\nWHERE m.smiles=$smiles AND e.smiles <> $smiles AND abs(m.hac - e.hac) <= $hac AND abs(m.chac - e.chac) <= $rac\nRETURN p LIMIT $limit',
 'parameters': {'limit': 5000, 'hac': 3, 'smiles': 'NC1CCCNC1', 'rac': 1},
 'refmol': 'NC1CCCNC1',
 'resultAvailableAfter': 1,
 'processingTime': 192,
 'pathCount': 73,
 'size': 66,
 'members': [{'smiles': 'C1CCNCC1',
   'props': {'chac': 6, 'neighbours': 42434, 'hac': 6},
   'cmpd_ids': ['MOLPORT:003-791-712',
    'MOLPORT:000-871-527',
    'MOLPORT:023-329-895',
    'MOLPORT:006-116-298']},
  {'smiles': 'NCCN1CCCC(N)C1',
   'props': {'chac': 6, 'neighbours': 3, 'hac': 10},
   'cmpd_ids': ['CHEMSPACE-BB:CSC015917530']},
  {'smiles': 'NCCN1CCC[C@@H](N)C1',
   'props': {'neighbours': 0},
   'cmpd_ids': ['CHEMSPACE-BB:CSC015917530']},
  {'smiles': 'NCC1(N)CCCNC1',
   'props': {'chac': 6, 'neighbours': 17, 'hac': 9},
   'cmpd_ids': ['CHEMSPACE-BB:CSC015489974']},
  {'smiles': 'NC1CNCCC1O',
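
In the output above, each entry in `members` has `smiles`, `props` and `cmpd_ids`, but not every `props` dictionary carries `hac` and `chac` (the stereo-specific entry only has `neighbours`). A defensive way to tabulate the members might look like the sketch below. `summarise_members` is a hypothetical helper, and the sample data is copied from the complete entries shown above.

```python
def summarise_members(result):
    """Return (smiles, hac, chac, number of compound IDs) for each member."""
    rows = []
    for member in result.get('members', []):
        props = member.get('props', {})
        rows.append((
            member['smiles'],
            props.get('hac'),   # None when the property is absent
            props.get('chac'),  # None when the property is absent
            len(member.get('cmpd_ids', [])),
        ))
    return rows


# The first four members from the response shown above.
sample_result = {'members': [
    {'smiles': 'C1CCNCC1',
     'props': {'chac': 6, 'neighbours': 42434, 'hac': 6},
     'cmpd_ids': ['MOLPORT:003-791-712', 'MOLPORT:000-871-527',
                  'MOLPORT:023-329-895', 'MOLPORT:006-116-298']},
    {'smiles': 'NCCN1CCCC(N)C1',
     'props': {'chac': 6, 'neighbours': 3, 'hac': 10},
     'cmpd_ids': ['CHEMSPACE-BB:CSC015917530']},
    {'smiles': 'NCCN1CCC[C@@H](N)C1',
     'props': {'neighbours': 0},
     'cmpd_ids': ['CHEMSPACE-BB:CSC015917530']},
    {'smiles': 'NCC1(N)CCCNC1',
     'props': {'chac': 6, 'neighbours': 17, 'hac': 9},
     'cmpd_ids': ['CHEMSPACE-BB:CSC015489974']},
]}

for row in summarise_members(sample_result):
    print(row)
```

Using `dict.get` rather than indexing avoids a `KeyError` on members that lack the atom-count properties.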

Extract the SMILES strings

In [13]:
# Collect the SMILES of every member of the result set
mols = [member['smiles'] for member in json['members']]
mols

['C1CCNCC1',
 'NCCN1CCCC(N)C1',
 'NCCN1CCC[C@@H](N)C1',
 'NCC1(N)CCCNC1',
 'NC1CNCCC1O',
 'N[C@@H]1CNCC[C@H]1O',
 'NC1CNCCC1F',
 'NC1CNCCC1C(=O)O',
 'NC1CNCC(O)C1',
 'NC1CNCC(N)C1',
 'N[C@@H]1CNC[C@H](N)C1',
 'NC1CNCC(C(=O)O)C1',
 'NC1CCCNC1C(=O)O',
 'NC1CCCN(CCO)C1',
 'N[C@@H]1CCCN(CCO)C1',
 'N[C@H]1CCCN(CCO)C1',
 'NC1CCCN(CCF)C1',
 'N[C@H]1CCCN(CCF)C1',
 'N[C@@H]1CCCN(CCF)C1',
 'NC1CCCN(CCCl)C1',
 'NC1CCCN(CCBr)C1',
 'NC1CCCN(CC=O)C1',
 'NC1CCCN(C=O)C1',
 'N[C@@H]1CCCN(C=O)C1',
 'NC1(C(F)F)CCCNC1',
 'NC1(C(=O)O)CCCNC1',
 'NC(=O)N1CCCC(N)C1',
 'NC(=O)N1CCC[C@@H](N)C1',
 'NC(=O)C1(N)CCCNC1',
 'N#CN1CCCC(N)C1',
 'N#CN1CCC[C@@H](N)C1',
 'N#CCN1CCCC(N)C1',
 'N#CCN1CCC[C@@H](N)C1',
 'N#CC1(N)CCCNC1',
 'C[S+]([O-])N1CCCC(N)C1',
 'C[S+]([O-])N1CCC[C@@H](N)C1',
 'COC1CCNCC1N',
 'CN1CCCC(N)C1',
 'CN1CCC[C@@H](N)C1',
 'CN1CCC[C@H](N)C1',
 'CN(C)N1CCCC(N)C1',
 'CCN1CCCC(N)C1',
 'CCN1CCC[C@H](N)C1',
 'CCN1CCC[C@@H](N)C1',
 'CCCN1CCCC(N)C1',
 'CCCN1CCC[C@@H](N)C1',
 'CCCC1CCNCC1N',
 'CCC1CCNCC1N',

Now use those SMILES however you want!
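
As one example of a next step, the members can be indexed by where they can be sourced. The sketch below assumes that the text before the first `:` in each `cmpd_ids` entry identifies the source (for example `MOLPORT` or `CHEMSPACE-BB`); that reading is inferred from the output above rather than from API documentation. `smiles_by_supplier` is a hypothetical helper.

```python
from collections import defaultdict


def smiles_by_supplier(members):
    """Map each compound ID prefix to the sorted SMILES listed under it."""
    index = defaultdict(set)
    for member in members:
        for cmpd_id in member.get('cmpd_ids', []):
            # Assumption: IDs look like 'PREFIX:identifier'.
            prefix, _, _ = cmpd_id.partition(':')
            index[prefix].add(member['smiles'])
    return {prefix: sorted(smiles) for prefix, smiles in index.items()}


# A few members copied from the expansion result shown earlier.
sample_members = [
    {'smiles': 'C1CCNCC1',
     'cmpd_ids': ['MOLPORT:003-791-712', 'MOLPORT:000-871-527']},
    {'smiles': 'NCCN1CCCC(N)C1',
     'cmpd_ids': ['CHEMSPACE-BB:CSC015917530']},
    {'smiles': 'NCC1(N)CCCNC1',
     'cmpd_ids': ['CHEMSPACE-BB:CSC015489974']},
]

for prefix, smiles in smiles_by_supplier(sample_members).items():
    print(prefix, smiles)
```

In the notebook itself the helper would be called on the `members` list of the expansion result instead of the sample data.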

## OpenAPI

Initial OpenAPI (Swagger) docs are available, but these are a work in progress.
We hope to add OpenRiskNet semantic annotations to this as well.

In [None]:
# NOTE - THIS VERSION HAS NOT BEEN DEPLOYED TO ORN YET

url = base_url + '/api-doc'

print("Requesting GET " + url)
resp = requests.get(url, verify=tls_verify)
print('Response Code: ' + str(resp.status_code))
print(resp.text)