# Working with JSON

[JSON](http://www.json.org/) (JavaScript Object Notation) is a popular data-interchange format. It is well suited to structured data that mixes numbers and strings. Python supports JSON through the `json` module, which is part of the _standard_ library and is therefore always available without installing anything.

To read a JSON file with Python, we
+ import the `json` library,
+ open the JSON file with the built-in `open` function,
+ read the file handle with the `json.load` function.

In [1]:
import json

with open('data/glossary.json', mode='r', encoding='utf-8') as fid:
    glossary = json.load(fid)

glossary

{'glossary': {'GlossDiv': {'GlossList': {'GlossEntry': {'Abbrev': 'ISO 8879:1986',
     'Acronym': 'SGML',
     'GlossDef': {'GlossSeeAlso': ['GML', 'XML'],
      'para': 'A meta-markup language, used to create markup languages such as DocBook.'},
     'GlossSee': 'markup',
     'GlossTerm': 'Standard Generalized Markup Language',
     'ID': 'SGML',
     'SortAs': 'SGML'}},
   'title': 'S'},
  'title': 'example glossary'}}
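
JSON text does not have to come from a file. As a minimal sketch, `json.loads` parses a string and `json.dumps` turns Python data back into JSON text; the short document in `text` is made up for illustration and is not the glossary file.

```python
import json

# A small JSON document held in a string (made up for illustration).
text = '{"title": "example glossary", "entries": ["SGML", "XML"], "count": 2}'

# json.loads parses a string; json.load (used above) parses an open file.
data = json.loads(text)
print(data['title'])

# json.dumps serializes Python data back to JSON text.
round_trip = json.dumps(data, indent=2)
print(round_trip)
```

Parsing `round_trip` again with `json.loads` gives back a dictionary equal to `data`.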

## Lists and dictionaries

Two of the basic, but very powerful, data structures in Python are _lists_ (`list`) and _dictionaries_ (`dict`). These are also the main building blocks of the JSON format (although in JSON they are called _arrays_ and _objects_).

A `list` is an ordered sequence of elements. These elements can be of different types, and can even be lists themselves. A `dict` is a collection of key-value pairs; since Python 3.7 it keeps the pairs in the order they were inserted. A value can be anything (even another `dict`), while a key must be hashable. Most often strings are used as keys, and in JSON the keys are always strings.
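
To see how JSON values map to Python types, here is a small sketch. The JSON text and its key names are invented for this example; the mapping itself is what the `json` module does by default.

```python
import json

# Hypothetical JSON text with one value of each JSON type.
text = '''
{
  "text": "hello",
  "integer": 3,
  "real": 2.5,
  "flag": true,
  "nothing": null,
  "array": [1, 2, 3],
  "object": {"key": "value"}
}
'''

values = json.loads(text)

# Print the Python type that each JSON value was converted to.
for key, value in values.items():
    print(key, type(value).__name__)
```

JSON arrays become `list`, JSON objects become `dict`, `true`/`false` become `True`/`False`, and `null` becomes `None`.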

In [2]:
# list of cities
cities = ['Wien', 'Oslo', 'London', 'Barcelona']
for city in cities:
    print(city)

Wien
Oslo
London
Barcelona


In [3]:
# dict of countries with their capitals
capitals = {'Austria': 'Wien', 'Norway': 'Oslo', 'Portugal': 'Lisboa', 'Finland': 'Helsinki'}
for country, capital in capitals.items():
    print(country, capital)

Austria Wien
Norway Oslo
Portugal Lisboa
Finland Helsinki


## Indexing

To pick out one element from a list or dictionary we use _indexing_. This is denoted by square brackets. For lists we need to use a numerical index, the element number counting from 0:

In [4]:
cities[0]

'Wien'

In [5]:
cities[2]

'London'

For dictionaries the indexing is done by the keys. In our example above we only used string-keys, namely the names of countries.

In [6]:
capitals['Norway']

'Oslo'
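
Since parsed JSON is usually nested, indexing steps are often chained. The sketch below uses a small hand-written structure (`countries` is a made-up example, not data from the glossary file) to show list and dictionary indexing combined.

```python
# A small nested structure, shaped like parsed JSON (made up for illustration).
countries = {
    'Norway': {'capital': 'Oslo', 'cities': ['Oslo', 'Bergen', 'Trondheim']},
    'Austria': {'capital': 'Wien', 'cities': ['Wien', 'Graz', 'Linz']},
}

# Each pair of square brackets steps one level deeper.
norway = countries['Norway']            # a dict
norwegian_cities = norway['cities']     # a list

# The same lookup written as one chained expression, ending in a list index.
second_city = countries['Norway']['cities'][1]
print(second_city)  # Bergen
```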

With numerical indices we can also use _slicing_ to pick out several elements at once (getting a sub-list from a list). A slice gives both a start index and an end index, separated by a colon. The start index is inclusive, while the end index is exclusive.

In [7]:
cities[0:2]      # Includes the elements 0 and 1, but not 2

['Wien', 'Oslo']

Either index can be omitted. If the start index is omitted it defaults to 0, and if the end index is omitted the slice runs to the end of the list.

In [8]:
cities[:2]

['Wien', 'Oslo']

In [9]:
cities[1:]

['Oslo', 'London', 'Barcelona']

It is also possible to specify a third number after a second colon, which is the stride (step size). For instance `[::2]` will pick out every second element of a list, starting with the first.

In [10]:
cities[::2]

['Wien', 'London']
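
The three numbers can be combined. The following sketch repeats the `cities` list so that it runs on its own, and also shows that a slice is a new list, independent of the original.

```python
cities = ['Wien', 'Oslo', 'London', 'Barcelona']

# Start at index 1, stop before index 4, take every second element.
every_other_from_1 = cities[1:4:2]
print(every_other_from_1)   # ['Oslo', 'Barcelona']

# A slice is a new list; changing it leaves the original untouched.
first_two = cities[:2]
first_two[0] = 'Vienna'
print(first_two)            # ['Vienna', 'Oslo']
print(cities[0])            # Wien
```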