# RWET - Feb 23

## Outline
- homework presentations
- dictionaries
- JSON

In [12]:
import random as rng

## `for` loops! \*yay\*

In Python, a `for` loop steps through the items of a list one at a time, instead of using a counter and a stop condition

```python
for item in a_list:
    thing()
    thing()
    more_things()
```

What we place in `a_list` has to be something that **evaluates** to a list (or to another iterable, such as a string or a `range`)

In [2]:
cheesus = ['cheddar', 'camembert', 'brie', 'parmesan', 'gorgonzola', 'goat']

In [5]:
# if we want to print it, we can do...
print('\n'.join(cheesus))


print('==')
# or, using a for loop
for item in cheesus:
    yell_cheese = item.upper()
    print(yell_cheese)

cheddar
camembert
brie
parmesan
gorgonzola
goat
==
CHEDDAR
CAMEMBERT
BRIE
PARMESAN
GORGONZOLA
GOAT


The function `range(X)` is extremely useful in `for` loops, as it produces X integers, from `0` to `X-1` (in Python 3 it returns a lazy `range` object rather than an actual list)

In [6]:
for i in range(11):
    print('👾')

👾
👾
👾
👾
👾
👾
👾
👾
👾
👾
👾


`range` can have more parameters:
- `range(a, b)` goes from `a` up to, but not including, `b`
- `range(a, b, c)` goes from `a` up to, but not including, `b`, with a step of `c`

In [10]:
for i in range(5, 20):
    print(i)
    
print('\n==\n')
for i in range(5, 20, 2):
    print(i)


5
6
7
8
9
10
11
12
13
14
15
16
17
18
19

==

5
7
9
11
13
15
17
19
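
To see the "up to, but not including" rule at a glance, a small sketch that turns a few ranges into lists. The variable names are made up for this example.

```python
# range() is lazy, so we wrap it in list() to see the numbers it produces
first_five = list(range(5))          # starts at 0, stops before 5
five_to_nine = list(range(5, 10))    # starts at 5, stops before 10
by_fives = list(range(5, 20, 5))     # step of 5, stops before 20
countdown = list(range(10, 0, -3))   # a negative step counts down, stops before 0

print(first_five)    # [0, 1, 2, 3, 4]
print(five_to_nine)  # [5, 6, 7, 8, 9]
print(by_fives)      # [5, 10, 15]
print(countdown)     # [10, 7, 4, 1]
```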


## Write a computer-generated **[acrostic](https://en.wikipedia.org/wiki/Acrostic)**

        

In [14]:
# read the whole file, lowercase it and split it into words
with open('../rwet/frost.txt') as f:
    words = f.read().lower().split()

In [21]:
# `findStart(words, letter)` looks for a word starting with a specific `letter`

def findStart(words, letter):
    # first we collect all the words that start with the desired letter
    possibles = [ w for w in words if w[0].lower()==letter.lower() ]
    # if no word starts with that letter, we return an empty string
    if len(possibles) == 0:
        return ""
    else: # else, we return a random one
        return rng.choice(possibles)

In [23]:
for c in "ROBERT FROST":
    print( findStart(words, c).capitalize() )

Roads
One
Both
Equally
Really
Trodden

Far
Roads
Other,
Step
Took


This is a good solution, but it is not very efficient if we're dealing with a **big** corpus and/or too many calls to this function, because every call scans the whole word list again...

We need a data structure that maps each letter to the words that start with it:
- 'r' -> ["robert", "rotten", ...]
- 'o' -> ["other", "of", ...]

So... 🌈🌈**DICTIONARIES** 🌈🌈!!!

### Dictionaries

A dictionary is a data structure that maps a `key` to a `value`

We define a dictionary with curly brackets `{}` and map the keys to values as `key:value`

In [28]:
dist_AU = {"mercury":0.387, "venus":0.723, "earth":1.000, "mars":1.523 }
print(type(dist_AU))
dist_AU

<class 'dict'>


{'earth': 1.0, 'mars': 1.523, 'mercury': 0.387, 'venus': 0.723}

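Dictionaries in Python **are not accessed by position**: you look values up by key. (Before Python 3.7 they were also unordered; since 3.7 they keep insertion order. The notebook display above appears to sort the keys alphabetically.)

To access a `value` from a `key`, you look it up with square brackets (`[key]`)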

In [31]:
dist_AU['venus']

0.723

You can check whether a key exists with the `in` operator

In [33]:
print( 'mars' in dist_AU )
print( 'nibiru' in dist_AU )
print( 'nibiru' not in dist_AU )

True
False
True
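
Why this matters: looking up a missing key with square brackets raises a `KeyError`. A minimal sketch of guarding the lookup with `in`; the `planet` and `message` names are just for this example.

```python
dist_AU = {"mercury": 0.387, "venus": 0.723, "earth": 1.000, "mars": 1.523}

planet = "nibiru"

# check first, so we never ask for a key that is not there
if planet in dist_AU:
    message = planet + " is " + str(dist_AU[planet]) + " AU from the sun"
else:
    message = planet + " is not in the dictionary"
print(message)

# without the check, the lookup fails with a KeyError
try:
    dist_AU[planet]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```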


To add a new key-value pair, you assign a value to the new key (assigning to a key that already exists changes its value instead)

In [35]:
dist_AU['jupiter'] = 5.2
dist_AU

{'earth': 1.0, 'jupiter': 5.2, 'mars': 1.523, 'mercury': 0.387, 'venus': 0.723}
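
A short sketch of both cases, adding and then changing, followed by a `for` loop over the pairs with `.items()`. The second Jupiter value is only an illustrative stand-in for "a different number", and the `size_*` names are made up for this example.

```python
dist_AU = {"mercury": 0.387, "venus": 0.723, "earth": 1.000, "mars": 1.523}

dist_AU["jupiter"] = 5.2           # new key: the pair is added
size_after_add = len(dist_AU)

dist_AU["jupiter"] = 5.203         # existing key: the value is replaced (illustrative value)
size_after_change = len(dist_AU)   # same size as before, nothing was added

print(size_after_add, size_after_change)

# .items() gives us each key together with its value
for planet, distance in dist_AU.items():
    print(planet, distance)
```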

Keys can also be numbers

In [39]:
number_words = {0:'zero', 1:'one', 2:'two', 3:'three', 4:'four', 5:'five'}
number_words[2]

'two'

And values can also be lists!

In [41]:
planet_moons = {
    "Mercury": [],
    "Venus": [],
    "Earth": ["Moon"],
    "Mars": ["Phobos", "Deimos"],
    "Jupiter": ["Io", "Europa", "Ganymede", "Callisto"]
}

planet_moons["Mars"]

['Phobos', 'Deimos']

... going back to the **acrostic**

In [71]:
# first, initialize the dictionary
start_letter = {}

# second, fill the dictionary: file every word under its first letter
for w in words:
    initial = w[0].lower()
    if initial not in start_letter:
        start_letter[initial] = []
    start_letter[initial].append(w)

# map spaces and punctuation to themselves, so seed phrases can contain them
for c in [' ', '.', ',', '-', '?']:
    start_letter[c] = [c]

In [74]:
for c in "ROBERT FROST.":
    print( rng.choice( start_letter[c.lower()] ).capitalize() )

Really
One
Both
Ever
Really
Them
 
First
Roads
On
Somewhere
The
.


In [75]:
# another way...
seed_phrase = "academics"

acrostic = [rng.choice( start_letter[c] ).capitalize() for c in seed_phrase]
print('\n'.join(acrostic))

Ages
Claim,
As
Diverged
Equally
Morning
I
Could
Should
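
One thing to watch for: `start_letter[c]` raises a `KeyError` if no word in the corpus starts with `c`. A minimal sketch of a more forgiving version, assuming a tiny stand-in word list instead of `frost.txt`. The names `sample_words`, `build_index`, `make_acrostic` and the `filler` parameter are all made up for this example.

```python
import random as rng

# tiny stand-in corpus (hypothetical; the notes read frost.txt instead)
sample_words = "two roads diverged in a yellow wood and sorry i could not travel both".split()

def build_index(words):
    # same idea as start_letter: first letter -> list of words
    index = {}
    for w in words:
        initial = w[0].lower()
        if initial not in index:
            index[initial] = []
        index[initial].append(w)
    return index

def make_acrostic(seed_phrase, index, filler="?"):
    lines = []
    for c in seed_phrase.lower():
        # .get() returns the fallback list when the letter is missing
        candidates = index.get(c, [filler])
        lines.append(rng.choice(candidates).capitalize())
    return lines

acrostic = make_acrostic("wax", build_index(sample_words))
print('\n'.join(acrostic))
```

Here no word starts with `x`, so the last line of the acrostic is the filler instead of an error.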


## Guest speaker: [Darius Kazemi](https://github.com/dariusk) - [Corpora Project](https://github.com/dariusk/corpora)

So cool! But... how do we use it in our scripts?!

Well... JSON looks a lot like a Python dictionary, but the contents of a JSON file are still just a string!! It is **not** Python syntax!!

- Go to the file > RAW > Save as... > .json file
- import the `json` package
- and `json.loads()` the data!
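
To see the string-versus-dictionary difference without downloading anything, a small sketch with a hand-typed JSON string. The `raw_text` snippet is a made-up stand-in, not the actual Corpora file.

```python
import json

# a JSON document is just text (note the double quotes that JSON requires)
raw_text = '{"animals": ["aardvark", "alligator", "alpaca"]}'  # hypothetical snippet

data = json.loads(raw_text)   # parse the text into Python objects

print(type(raw_text))         # <class 'str'>
print(type(data))             # <class 'dict'>
print(data["animals"][0])     # aardvark

# json.dumps() goes the other way: Python objects back to a JSON string
back_to_text = json.dumps(data)
print(type(back_to_text))     # <class 'str'>
```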

In [76]:
import json

In [83]:
# read the file as text, then parse the text into a dictionary
with open("./sources/common.json", 'r') as f:
    animals_data = json.loads( f.read() )

In [85]:
print(animals_data)

{'animals': ['aardvark', 'alligator', 'alpaca', 'antelope', 'ape', 'armadillo', 'baboon', 'badger', 'bat', 'bear', 'beaver', 'bison', 'boar', 'buffalo', 'bull', 'camel', 'canary', 'capybara', 'cat', 'chameleon', 'cheetah', 'chimpanzee', 'chinchilla', 'chipmunk', 'cougar', 'cow', 'coyote', 'crocodile', 'crow', 'deer', 'dingo', 'dog', 'donkey', 'dromedary', 'elephant', 'elk', 'ewe', 'ferret', 'finch', 'fish', 'fox', 'frog', 'gazelle', 'gila monster', 'giraffe', 'gnu', 'goat', 'gopher', 'gorilla', 'grizzly bear', 'ground hog', 'guinea pig', 'hamster', 'hedgehog', 'hippopotamus', 'hog', 'horse', 'hyena', 'ibex', 'iguana', 'impala', 'jackal', 'jaguar', 'kangaroo', 'koala', 'lamb', 'lemur', 'leopard', 'lion', 'lizard', 'llama', 'lynx', 'mandrill', 'marmoset', 'mink', 'mole', 'mongoose', 'monkey', 'moose', 'mountain goat', 'mouse', 'mule', 'muskrat', 'mustang', 'mynah bird', 'newt', 'ocelot', 'opossum', 'orangutan', 'oryx', 'otter', 'ox', 'panda', 'panther', 'parakeet', 'parrot', 'pig', 'plat

But, this is a dictionary! If we want to access the `list`, we need to access the value from the `"animals"` key

In [86]:
animals_list = animals_data["animals"]

In [89]:
print(rng.sample(animals_list, 10))

['bear', 'parrot', 'chipmunk', 'mole', 'gopher', 'marmoset', 'gorilla', 'rhinoceros', 'fox', 'dromedary']
