# Getting to Know JSON - A Quick Python Exercise
Up until now we've mostly saved our data in .csv files. These are great for handling data with a simple tabular structure (such as your Arduino data), but not so great for storing data with a more complicated structure. 

JSON is extremely good for building data files with a complicated structure, such as information about chemicals or, say, the experimental conditions that you used to gather your Arduino data.

In this workbook, we'll look through how to produce, save, and load JSON files to build up some basic information about the compound betamethasone. The details are taken from [its ChEMBL entry](https://www.ebi.ac.uk/chembl/web_components/explore/compound/CHEMBL632).

### 1.1 Creating a JSON File
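JSON stores data in collections very similar to Python's: objects (which behave like dictionaries) and arrays (which behave like lists). JSON has no tuple type, so a Python tuple is saved as an array and comes back as a list.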

Just like dictionaries in Python, JSON files have a key-value format.
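
As a rough illustration of how Python types map onto JSON, the sketch below converts a small dictionary to a JSON string with `json.dumps` and back again with `json.loads`. The `labels`, `checked`, and `notes` entries are made up purely for this example.

```python
import json

# Made-up example values, used only to show how each type is converted
example = {
    'CompoundName': 'Betamethasone',  # str   -> JSON string
    'labels': ('a', 'b'),             # tuple -> JSON array
    'checked': True,                  # True  -> true
    'notes': None,                    # None  -> null
}

text = json.dumps(example)        # Python dictionary -> JSON text
round_tripped = json.loads(text)  # JSON text -> Python dictionary

print(text)
print(type(round_tripped['labels']))  # the tuple comes back as a list
```

Notice that the JSON text uses double quotes for every string, and that the tuple does not survive the round trip as a tuple.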

First, we'll import the library and create a dictionary that stores the name of our compound.

In [1]:
import json

data = {'CompoundName' : 'Betamethasone'}

### 1.2 Writing to a JSON File Using Python
You can read or write JSON files easily in Python. The example below writes a file called 'chemdata.json' into the directory that your code is running in (for a notebook, this is normally the folder the notebook is saved in). Run the code below and check that it has worked.

In [2]:
output_file = './chemdata.json'
# Write the data to the output file
with open(output_file, 'w') as outfile:
    json.dump(data, outfile, indent=4)

data

{'CompoundName': 'Betamethasone'}
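
If you'd like to see the text that ends up in the file without opening it, one option is `json.dumps`, which returns the JSON as a string instead of writing it to disk. This is a minimal sketch using the same dictionary as above.

```python
import json

data = {'CompoundName': 'Betamethasone'}

# json.dumps returns the JSON text that json.dump would write to a file
json_text = json.dumps(data, indent=4)
print(json_text)
```

The printed text uses double quotes, which is how JSON marks strings, even though the Python dictionary was typed with single quotes.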

### 1.3 Reading from a JSON File into Python
Now let's read in the JSON file we've just saved, again to check things are behaving.

In [4]:
input_file = './chemdata.json'
# Read the data back in from the JSON file
with open(input_file, 'r') as file:
    data = json.load(file)
data

{'CompoundName': 'Betamethasone'}
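
A quick way to check that nothing was lost is to compare the dictionary you saved with the one you loaded. The sketch below does this in a temporary folder so that it doesn't touch your own files; the filename `roundtrip_check.json` is hypothetical.

```python
import json
import os
import tempfile

original = {'CompoundName': 'Betamethasone'}

# A temporary folder is used here so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp_dir:
    check_file = os.path.join(tmp_dir, 'roundtrip_check.json')  # hypothetical filename

    # Save the dictionary as JSON
    with open(check_file, 'w') as outfile:
        json.dump(original, outfile, indent=4)

    # Load it straight back in
    with open(check_file, 'r') as infile:
        loaded = json.load(infile)

# True if the loaded data matches what was saved
print(loaded == original)
```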

### 1.4 Building a More Complicated JSON File
JSON files can store dictionaries within dictionaries. Let's use this property to give our molecule an identifier, in this case 'Molecule1'.

Write the dictionary below to your JSON file:

In [5]:
data = {'Molecule1' : 
        {'CompoundName' : 'Betamethasone'}
}

Note that the identifier isn't a very good one. Think about why that might be (how does this identifier distinguish it from other molecules...?). We'll come back to this later.
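
To get a value back out of a nested dictionary, you chain the keys together, working from the outside in. This short sketch uses the dictionary from above; the key `'MeltingPoint'` is a made-up example of a key that hasn't been added.

```python
data = {'Molecule1': {'CompoundName': 'Betamethasone'}}

# The first key selects the inner dictionary, the second selects the value
inner = data['Molecule1']
name = data['Molecule1']['CompoundName']
print(name)

# .get() returns None instead of raising an error if a key is missing
missing = data['Molecule1'].get('MeltingPoint')  # hypothetical key, not in the data
print(missing)
```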

### 1.5 Adding Multiple Pieces of Data

You can now add as many pieces of information as you like to your dictionary. <b>Write the following information to your JSON file:</b>

In [None]:
data = {'Molecule1' : 
        {'CompoundName' : 'Betamethasone',
         'MolecularFormula' : 'C22H29FO5',
         'inchikey' : 'UREBDLICKHMUKA-DVTGEIKXSA-N'
         },
}
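
You don't have to retype the whole dictionary each time. As a sketch of an alternative, the code below starts from the smaller dictionary and adds the new key-value pairs one at a time.

```python
import json

data = {'Molecule1': {'CompoundName': 'Betamethasone'}}

# Add new key-value pairs to the inner dictionary one at a time
data['Molecule1']['MolecularFormula'] = 'C22H29FO5'
data['Molecule1']['inchikey'] = 'UREBDLICKHMUKA-DVTGEIKXSA-N'

# Print the result as JSON text to check the structure
print(json.dumps(data, indent=4))
```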

### 1.6 Dictionaries within Dictionaries within Dictionaries
Sometimes, you might want to relate more than one value to a single trait. For instance, if your drug had multiple brand names, you could place multiple dictionaries inside a list:

In [None]:
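data = {'Molecule1' :
            {'CompoundName' : 'Betamethasone',
             'MolecularFormula' : 'C22H29FO5',
             'inchikey' : 'UREBDLICKHMUKA-DVTGEIKXSA-N',

             # A list of dictionaries, one per synonym
             'synonyms': [
                 {
                     'synonym': 'Betatrex',
                     'source': 'ChEMBL',
                 },
                 {
                     'synonym': 'Celestone',
                     'source': 'ChEMBL',
                 }
             ]
            }
        }

output_file = './chemdata.json'
# Write the data to the output file
with open(output_file, 'w') as outfile:
    json.dump(data, outfile, indent=4)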
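
Once the synonyms are stored as a list of dictionaries, you can loop over them like any other Python list. The following is a minimal sketch that prints each synonym with its source and then collects just the names.

```python
data = {
    'Molecule1': {
        'CompoundName': 'Betamethasone',
        'synonyms': [
            {'synonym': 'Betatrex', 'source': 'ChEMBL'},
            {'synonym': 'Celestone', 'source': 'ChEMBL'},
        ],
    }
}

# Each item in the list is a dictionary with 'synonym' and 'source' keys
for entry in data['Molecule1']['synonyms']:
    print(entry['synonym'], 'from', entry['source'])

# Collect just the names into a new list
names = [entry['synonym'] for entry in data['Molecule1']['synonyms']]
print(names)
```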


### JSON Exercise

Assign each group member a molecule from your list. Search the internet for the information to fill out the above JSON template as far as possible, and save each file.

<b>Now go back to the Workbook and start Exercise 3.3</b>

# JSON Activity 2
The code below can be used to join multiple JSON files together - it will be useful in Exercise 3.5.

You'll need to run this code in the same directory that the Workshop 6 repo is stored in.


In [9]:
import collate_json as cj
input_directory = './saved_jsons' 
output_file_path = './json_combined.json'
cj.combine_json_files(input_directory, output_file_path)

Take a look at the file ./json_combined.json to see the output of the code.
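
If you're curious how files can be joined like this, the sketch below shows one possible approach. It is not the actual `collate_json` code: the function name `combine_json_sketch`, the example filenames, and the assumption that each file holds a single top-level dictionary are all hypothetical.

```python
import json
import os
import tempfile


def combine_json_sketch(input_directory, output_file_path):
    """Hypothetical stand-in, NOT the real collate_json implementation."""
    combined = {}
    for filename in sorted(os.listdir(input_directory)):
        if not filename.endswith('.json'):
            continue  # skip anything that isn't a JSON file
        with open(os.path.join(input_directory, filename), 'r') as infile:
            # Assumes each file contains one top-level dictionary
            combined.update(json.load(infile))
    with open(output_file_path, 'w') as outfile:
        json.dump(combined, outfile, indent=4)
    return combined


# Demonstration in a temporary folder with two made-up files
with tempfile.TemporaryDirectory() as tmp_dir:
    in_dir = os.path.join(tmp_dir, 'saved_jsons')
    os.mkdir(in_dir)
    examples = {
        'a.json': {'MoleculeA': {'CompoundName': 'Betamethasone'}},
        'b.json': {'MoleculeB': {'CompoundName': 'Placeholder compound'}},
    }
    for filename, content in examples.items():
        with open(os.path.join(in_dir, filename), 'w') as outfile:
            json.dump(content, outfile)

    out_path = os.path.join(tmp_dir, 'json_combined.json')
    combined = combine_json_sketch(in_dir, out_path)

print(sorted(combined))
```

Because this sketch merges with `update`, two files that share the same top-level key would overwrite one another, which is worth keeping in mind when you choose your identifiers.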

### Thinking About Uniqueness
In the above example we've given our molecule an identifier - 'Molecule1'. However, it's not a very good identifier, because it's not unique, which will make Gemma sad. What might be a suitable unique identifier that you've encountered in this workshop?
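
To see why a non-unique identifier causes trouble, consider what happens when two group members both call their molecule 'Molecule1' and the dictionaries are merged. This is a small sketch; 'Placeholder compound' is a made-up name standing in for a second molecule.

```python
# Two entries from different group members, both using the same identifier
first = {'Molecule1': {'CompoundName': 'Betamethasone'}}
second = {'Molecule1': {'CompoundName': 'Placeholder compound'}}  # made-up second entry

combined = {}
combined.update(first)
combined.update(second)  # same key, so this replaces the first entry

print(len(combined))
print(combined['Molecule1']['CompoundName'])
```

Only one entry survives, and nothing warns you that the first one has been lost.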

### Further Reading

You can find full details about the module we used and some examples at the [JSON Python module documentation page](https://docs.python.org/3/library/json.html).

There are also a huge number of useful tutorials that you can find online if you search "JSON and Python", [such as this one](https://realpython.com/python-json/).