[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/scott2b/PythonReview/blob/main/notebooks/Python.08.Strings.ipynb)

## Python string handling, including parsing JSON

## Parsing JSON data

In [1]:
import json

# This is a dictionary
brand_hq = {
    'Nike': 'Beaverton',
    'Adidas': 'Herzogenaurach',
    'Reebok': 'Boston'
}

# json.dumps serializes the dictionary to a JSON string
brand_hq_json = json.dumps(brand_hq)
brand_hq_json

'{"Nike": "Beaverton", "Adidas": "Herzogenaurach", "Reebok": "Boston"}'

Note the quotes (') around the output. `brand_hq_json` is a string now, not a dictionary:

In [2]:
brand_hq['Nike'] # this works

'Beaverton'

In [3]:
brand_hq_json['Nike'] # but this fails: a string cannot be indexed by key

TypeError: string indices must be integers
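
To make the difference concrete, here is a minimal sketch that checks the type of each object and shows what indexing the JSON string actually does. The variable names `first_char` and `error_name` are introduced here only for illustration.

```python
import json

brand_hq = {
    'Nike': 'Beaverton',
    'Adidas': 'Herzogenaurach',
    'Reebok': 'Boston'
}
brand_hq_json = json.dumps(brand_hq)

print(type(brand_hq).__name__)       # dict
print(type(brand_hq_json).__name__)  # str

# An integer index on the string returns a single character,
# here the opening brace of the JSON text.
first_char = brand_hq_json[0]
print(first_char)

# A key lookup on the string raises TypeError.
try:
    brand_hq_json['Nike']
except TypeError as err:
    error_name = type(err).__name__
    print(error_name)
```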

This string can then be written to a file, which produces a file in the standard JSON data format:

```
with open('brandhq.json', 'w') as outfile:
    outfile.write(brand_hq_json) # just like writing any other string to a file
```

It is not necessary to encode the data as a JSON string before writing it out. The `json` module can also encode to and parse from a file directly. Starting again with our data dictionary instead of the JSON string:

```
with open('brandhq.json', 'w') as outfile:
    json.dump(brand_hq, outfile) # the object comes first, then the file; note the function is dump, not dumps (which stands for dump-string)
```

We can also go the other way. Given a JSON string, `json.loads` parses it into a dictionary:

In [4]:
data = json.loads(brand_hq_json)
data

{'Nike': 'Beaverton', 'Adidas': 'Herzogenaurach', 'Reebok': 'Boston'}

Note that the output is no longer wrapped in quotes. This is a dictionary, not a string:

In [5]:
data['Nike']

'Beaverton'

There is also a `load` function for reading directly from a file:

```
with open('brandhq.json') as infile:
    data = json.load(infile)
```
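
Putting the file-based functions together, the following is a small round-trip sketch. It writes to a temporary directory (an assumption made here so the example does not leave a file behind) and checks that the data read back equals the original dictionary. Note that `json.dump` takes the object first and the file second.

```python
import json
import tempfile
from pathlib import Path

brand_hq = {
    'Nike': 'Beaverton',
    'Adidas': 'Herzogenaurach',
    'Reebok': 'Boston'
}

# A temporary directory is used only to keep the example self-cleaning.
with tempfile.TemporaryDirectory() as tmpdir:
    path = Path(tmpdir) / 'brandhq.json'

    # Encode the dictionary straight into the file.
    with open(path, 'w') as outfile:
        json.dump(brand_hq, outfile)

    # Parse the file straight back into a dictionary.
    with open(path) as infile:
        restored = json.load(infile)

print(restored == brand_hq)
```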

Let's parse some Twitter data:

In [6]:
import json
from pathlib import Path

try:
    from google.colab import drive
    drive.mount('/content/drive')
    root = Path('drive/My Drive/')
except ModuleNotFoundError:
    root = Path('../..')

with open(root / 'MyProject/twitter_apiresponse_example.json') as infile:
    data = json.load(infile)[0] # we just want the first tweet

Mounted at /content/drive


In [7]:
data

{'created_at': 'Thu Apr 06 15:28:43 +0000 2017',
 'id': 850007368138018817,
 'id_str': '850007368138018817',
 'text': 'RT @TwitterDev: 1/ Today we’re sharing our vision for the future of the Twitter API platform!\nhttps://t.co/XweGngmxlP',
 'truncated': False,
 'entities': {'hashtags': [],
  'symbols': [],
  'user_mentions': [{'screen_name': 'TwitterDev',
    'name': 'TwitterDev',
    'id': 2244994945,
    'id_str': '2244994945',
    'indices': [3, 14]}],
  'urls': [{'url': 'https://t.co/XweGngmxlP',
    'expanded_url': 'https://cards.twitter.com/cards/18ce53wgo4h/3xo1c',
    'display_url': 'cards.twitter.com/cards/18ce53wg…',
    'indices': [94, 117]}]},
 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
 'in_reply_to_status_id': None,
 'in_reply_to_status_id_str': None,
 'in_reply_to_user_id': None,
 'in_reply_to_user_id_str': None,
 'in_reply_to_screen_name': None,
 'user': {'id': 6253282,
  'id_str': '6253282',
  'name': 'Twitter API',
  'screen_name': 
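
Once parsed, a tweet is just nested dictionaries and lists, so values are reached by chaining key and index lookups. The sketch below uses a hand-trimmed subset of the fields visible in the output above; the `tweet` variable and the key `'not_a_real_key'` are made up for this illustration.

```python
# A trimmed copy of a few fields from the parsed tweet shown above.
tweet = {
    'created_at': 'Thu Apr 06 15:28:43 +0000 2017',
    'id_str': '850007368138018817',
    'entities': {
        'hashtags': [],
        'user_mentions': [{'screen_name': 'TwitterDev', 'indices': [3, 14]}],
        'urls': [{'url': 'https://t.co/XweGngmxlP', 'indices': [94, 117]}],
    },
}

# Dictionary keys and list positions can be chained.
first_mention = tweet['entities']['user_mentions'][0]['screen_name']

# A comprehension collects one field from every item in a nested list.
urls = [u['url'] for u in tweet['entities']['urls']]
hashtag_count = len(tweet['entities']['hashtags'])

# .get returns None for a missing key instead of raising KeyError.
missing = tweet.get('not_a_real_key')

print(first_mention, urls, hashtag_count, missing)
```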

## Advanced string usage

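Strings are iterable sequences, so they support looping, indexing, and slicing much like lists do. Unlike lists, strings are immutable and cannot be changed in place: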

In [8]:
for letter in 'abcd':
    print(letter)

a
b
c
d


In [9]:
'abcd'[:2]

'ab'

In [10]:
'abcd'[2:]

'cd'

In [11]:
'abcd'[0]

'a'

In [12]:
'abcd'[-1]

'd'
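
One place where strings and lists differ is assignment. The sketch below assumes nothing beyond built-in behavior: a list item can be replaced in place, while the same assignment on a string raises `TypeError`, so a changed string has to be built as a new object. The names `letters`, `assignment_failed`, and `changed` are illustrative.

```python
word = 'abcd'

# A list built from the string can be modified in place.
letters = list(word)
letters[0] = 'z'

# The same assignment on the string itself is not allowed.
try:
    word[0] = 'z'
except TypeError:
    assignment_failed = True
else:
    assignment_failed = False

# Instead, build a new string from slices of the old one.
changed = 'z' + word[1:]

print(letters, assignment_failed, changed)
```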

In [13]:
sorted('cbda')

['a', 'b', 'c', 'd']

How would you put this back together as a string?
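
One possible answer, offered as a sketch rather than the only approach, is the `str.join` method, which is called on the separator and takes an iterable of strings:

```python
letters = sorted('cbda')

# Joining on the empty string concatenates the letters.
rejoined = ''.join(letters)

# Any other separator works the same way.
dashed = '-'.join(letters)

print(rejoined)
print(dashed)
```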

There are also several methods specific to string handling:

In [14]:
'aBcD'.lower()

'abcd'

In [15]:
'aBcD'.upper()

'ABCD'

In [16]:
'abcd'.startswith('a')

True

In [17]:
'abcd'.endswith('b')

False
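
These methods are often combined. As a rough sketch, lowercasing before calling `startswith` gives a case-insensitive prefix check. The helper `matches_prefix` and the mixed-case brand list are invented for this example.

```python
# Brand names with deliberately inconsistent casing (made up for the example).
brands = ['Nike', 'ADIDAS', 'reebok']

def matches_prefix(name, prefix):
    """Return True if name starts with prefix, ignoring case."""
    return name.lower().startswith(prefix.lower())

a_brands = [b for b in brands if matches_prefix(b, 'a')]
r_brands = [b for b in brands if matches_prefix(b, 'R')]

# startswith and endswith also accept a tuple of alternatives.
ends_in_c_or_d = 'abcd'.endswith(('c', 'd'))

print(a_brands, r_brands, ends_in_c_or_d)
```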