# Interactive Example: Accessing the IUPAC Gold Book API in Python

Author: [Stuart Chalk](https://orcid.org/0000-0002-0703-7776)
Topic: How to use the IUPAC Gold Book API to retrieve the definition of a chemistry concept
Format: Interactive Jupyter Notebook (Python)
Skills: You should be familiar with:
    - Application Programming Interfaces (APIs)
    - The JavaScript Object Notation (JSON) file format
    - Introductory Python
Learning outcomes: After completing this example you should understand:
    - what a Python function (def) is
    - how to write Python code to request data from a URL (typically an API)
    - how to use the value of a Python variable in an API call to download the data that matches it
Reuse: This notebook is made available under the IUPAC FAIR Chemistry Cookbook MIT license.

## Step 1: Import the Python packages needed to run this code

In [3]:
import requests                             # package to get data from a URL
import json                                 # package to read/write/display JSON formatted data
import re                                   # package to use regular expression (regex) searching

## Step 2: Add a function to remove HTML tags from textual data

In [4]:
# Source: https://medium.com/@jorlugaqui/how-to-strip-html-tags-from-a-string-in-python-7cb81a2bbf44
def remove_html_tags(text):                 # a 'def' is a (defined) function that can be called later
    clean = re.compile('<.*?>')             # sets up a regular expression to search with
    return re.sub(clean, '', text)          # removes the matches to the regular expression
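
As a quick check of what the function does, the sketch below applies it to two hand-written strings. Both sample strings are illustrative assumptions, not titles taken from the Gold Book. The second call shows a limitation of this simple pattern: any text between a `<` and the next `>` is removed, even when it is not an HTML tag.

```python
import re

def remove_html_tags(text):                 # same function as in Step 2
    clean = re.compile('<.*?>')             # non-greedy match of anything between '<' and '>'
    return re.sub(clean, '', text)          # replace every match with an empty string

sample_title = '<i>cis</i>-<i>trans</i> isomers'      # hypothetical title with italic markup
cleaned_title = remove_html_tags(sample_title)
print(cleaned_title)                                  # cis-trans isomers

not_really_html = 'a < b and c > d'                   # hypothetical text with comparison signs
over_stripped = remove_html_tags(not_really_html)
print(repr(over_stripped))                            # 'a  d' (the middle part is lost)
```

For titles that only contain simple formatting tags this is usually enough; a full HTML parser would be the safer choice for arbitrary markup.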

## Step 3: Download a JSON file with data about all the IUPAC Recommended Terms currently available

In [8]:
allpath = "https://goldbook.iupac.org/terms/index/all/json"  # URL of the IUPAC Gold Book API index of all terms
reqdata = requests.get(allpath)                              # download file in JSON
terms = json.loads(reqdata.content)                          # convert JSON to a Python dictionary
print(str(len(terms['terms']['list'])) + ' terms')           # print the number of terms in the list

7052 terms
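
To make the structure of the parsed index easier to picture, the sketch below parses a small hand-written JSON string that mimics the shape accessed above: a `terms` object holding a `list` keyed by term code, where each entry has a `title`. The codes and titles in the sample are hypothetical placeholders, not real API output, and the real entries may carry additional fields.

```python
import json

# Hand-written sample that mimics the shape used in Step 3; NOT real Gold Book data
sample = '''
{"terms": {"list": {
    "X00001": {"title": "<i>example</i> term one"},
    "X00002": {"title": "example term two"}
}}}
'''

sample_terms = json.loads(sample)               # parse the JSON text into nested Python dictionaries
term_list = sample_terms['terms']['list']       # dictionary keyed by term code
print(str(len(term_list)) + ' terms')           # number of entries in the sample
for code, term in term_list.items():            # each key is a code, each value a term object
    print(code, term['title'])
```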


## Step 4: Search for a term in the recommended term list and, if present, get its code

In [9]:
searchterm = "cis-trans isomers"                            # the term to be found
searchcode = None                                           # empty variable to contain the searchcode
for code, term in terms['terms']['list'].items():           # iterate over each term in the list (code (str), term (obj))
    cleaned = remove_html_tags(term['title'])               # remove any HTML formatting in the title
    if cleaned == searchterm:                               # check if the term matches the one we want
        searchcode = code                                   # if it does, get the code for the term
        break                                               # we have found the term so we can get out of the for loop
print(searchcode)                                           # IUPAC Gold Book term code (if found)

C01093
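
The loop above only finds a term when the spelling and capitalization match exactly. One possible variation, sketched here, wraps the search in a function and compares titles without regard to case or surrounding spaces. The function name `find_term_code` and the sample list are hypothetical and are used only for illustration; with the real index you would pass `terms['terms']['list']` instead.

```python
import re

def remove_html_tags(text):                         # same idea as the Step 2 function
    return re.sub('<.*?>', '', text)

def find_term_code(term_list, searchterm):
    """Return the code of the first term whose cleaned title matches, or None."""
    wanted = searchterm.strip().casefold()          # normalize the text we are looking for
    for code, term in term_list.items():            # iterate over (code, term object) pairs
        title = remove_html_tags(term['title'])     # strip any HTML formatting from the title
        if title.strip().casefold() == wanted:      # case-insensitive comparison
            return code
    return None                                     # nothing matched

# Hypothetical sample data in the same shape as terms['terms']['list']
sample_list = {
    "X00001": {"title": "<i>cis</i>-<i>trans</i> isomers"},
    "X00002": {"title": "example term"},
}

found = find_term_code(sample_list, "Cis-Trans Isomers")
missing = find_term_code(sample_list, "no such term")
print(found, missing)
```

Returning `None` for a missing term keeps the same convention as `searchcode` above, so the caller can test for it before building a URL.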


## Step 5: Use the term code to create a URL, get data about the term, and print the term, its code, and its definition

In [10]:
path = "https://goldbook.iupac.org/terms/view/**/json"      # URL path to the IUPAC Gold Book API for a term
reqdata = requests.get(path.replace("**", searchcode))      # request data from the Gold Book server
jsondata = json.loads(reqdata.content)                      # get the downloaded JSON
print(searchterm + " (" + searchcode + ")")                 # print the title and Gold Book term code
print(jsondata['term']['definitions'][0]['text'])           # print the definition of the term

cis-trans isomers (C01093)
Stereoisomeric olefins or cycloalkanes (or hetero-analogues) which differ in the positions of atoms (or groups) relative to a reference plane: in the cis-isomer the atoms are on the same side, in the trans-isomer they are on opposite sides. [image: molecular structures showing cis/trans isomerism]
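
If the term is not found in Step 4, `searchcode` is still `None` and the `replace` call above fails with a `TypeError`. The sketch below shows one way to guard against that, and against a response that has no definitions. The helper names `term_url` and `first_definition` are hypothetical, and the sample response is a hand-written placeholder that only mimics the keys used above.

```python
def term_url(code):
    """Build the Gold Book API URL for a term code, refusing an empty code."""
    if not code:                                    # covers None and the empty string
        raise ValueError("no term code: the search term was not found in the index")
    return "https://goldbook.iupac.org/terms/view/**/json".replace("**", code)

def first_definition(jsondata):
    """Return the text of the first definition, or None if there is none."""
    definitions = jsondata.get('term', {}).get('definitions', [])
    return definitions[0]['text'] if definitions else None

url = term_url("C01093")                            # code found in Step 4
print(url)

# Hand-written placeholder in the shape used in Step 5; NOT real API output
sample_response = {"term": {"definitions": [{"text": "placeholder definition text"}]}}
print(first_definition(sample_response))
print(first_definition({}))                         # None when the keys are absent
```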


## Step 6: Try other terms by changing the value of the 'searchterm' variable above and rerunning steps 4 and 5