# Dictionaries in Python
The basic data container in Python is a list. A `list` can hold a variety of data types. Once a `list` exists, it can be appended to, have elements deleted, and be converted into `arrays` or `tuples`. Of course, there are more sophisticated ways of storing, accessing, and processing data in Python. Let's look at `dictionaries`. Content below is generously sampled from http://openbookproject.net/thinkcs/python/english3e/dictionaries.html.

Dictionaries are collections of values that are indexed by `keys`. In other languages, dictionaries are often referred to as `associative arrays` because they associate `keys` with values.

Creating a dictionary lets you associate two pieces of data, one referred to as the `key` and the other as the `value`. Values can be any datatype and can be duplicated. Keys, on the other hand, must be unique within a dictionary and must be hashable (strings and numbers work; lists do not).
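
The rules for keys and values can be checked directly. The following is a minimal sketch; the names `ages` and `repeated` and their contents are hypothetical and do not appear elsewhere in this notebook.

```python
# Hypothetical example data, not from the original notebook
ages = {"Ana": 30, "Ben": 30}      # two keys may share the same value
ages["Ana"] = 31                   # assigning to an existing key overwrites its value
repeated = {"x": 1, "x": 2}        # a repeated key in a literal keeps only the last value

print(ages)
print(repeated)

# Keys must be hashable: a list used as a key raises TypeError
try:
    bad = {[1, 2]: "list as key"}
except TypeError as err:
    print("TypeError:", err)
```
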
Examples of using dictionaries:
* English to Spanish dictionary

In [23]:
entosp = {"one":"uno","two":"dos","three":"tres"} # `key:value` elements
entosp["two"]

'dos'

* Inventory of supplies

In [24]:
office = {"highlighters":200,"pencils":300,"pens":500}
office['pens']

500

* Dictionary for an encryption code:

In [25]:
codex = { "a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6, "g": 7, "h": 8, "i": 9, "j": 10,"k": 11,"l": 12,"m": 13,"n": 14,
         "o": 15,"p": 16,"q": 17,"r": 18,"s": 19,"t": 20,"u": 21,"v": 22,"w": 23,"x": 24,"y": 25}
print(codex)
print(codex["g"])

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25}
7


You can add new keys after the initial definition:

In [26]:
codex["z"]=26
print(codex)

{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10, 'k': 11, 'l': 12, 'm': 13, 'n': 14, 'o': 15, 'p': 16, 'q': 17, 'r': 18, 's': 19, 't': 20, 'u': 21, 'v': 22, 'w': 23, 'x': 24, 'y': 25, 'z': 26}
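
As a possible use of such a code table, the sketch below encodes a word into numbers and decodes it again. It rebuilds the table with a dict comprehension instead of typing it out; the `decoder` dictionary and the message `"hello"` are assumptions made for illustration.

```python
import string

# Rebuild the same letter-to-number table with a dict comprehension
codex = {letter: i for i, letter in enumerate(string.ascii_lowercase, start=1)}

# Invert it so numbers map back to letters (works because the values are unique)
decoder = {number: letter for letter, number in codex.items()}

message = "hello"  # hypothetical message
encoded = [codex[ch] for ch in message]
decoded = "".join(decoder[n] for n in encoded)

print(encoded)
print(decoded)
```

Inverting a dictionary this way is only safe when no value is repeated, since keys must be unique.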


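You can update the values in the dictionary. The two statements below are equivalent ways of adding 200, so each run of the cell adds 400 to `pens`; the total printed depends on how many times the cell has been run.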

In [28]:
office["pens"]+=200
office['pens'] = office['pens']+200
print(office)

{'highlighters': 200, 'pencils': 300, 'pens': 1100}
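
Updating assumes the key already exists; looking up a missing key with square brackets raises a `KeyError`. The sketch below shows a few standard ways to handle that, using a fresh copy of the `office` dictionary with its initial values (the `staplers` entry is a hypothetical addition).

```python
office = {"highlighters": 200, "pencils": 300, "pens": 500}

print("staplers" in office)          # membership tests look at the keys
print(office.get("staplers", 0))     # .get returns the default instead of raising KeyError

# Add to a count whether or not the key exists yet
office["staplers"] = office.get("staplers", 0) + 50

removed = office.pop("highlighters") # remove a key and return its value
print(removed, office)
```
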


You can find out how many `key:value` pairs a dictionary has:

In [29]:
len(codex)

26

## Dictionary Methods
A dictionary is an instance of the built-in class `dict`, and classes have functions associated with them called `methods`. Methods allow you to perform operations on and with the dictionary.
You can always get the keys and the values from any dictionary, using the methods `keys` and `values`.

In [30]:
type(office)

dict

In [37]:
print(type(office.values()))
print(office.keys())
# Don't forget about typecasting
print(list(office.keys())) 

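<class 'dict_values'>
dict_keys(['highlighters', 'pencils', 'pens'])
['highlighters', 'pencils', 'pens']

Without the conversion to a `list`, indexing the view directly (for example `office.keys()[0]`) raises `TypeError: 'dict_keys' object is not subscriptable`.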
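
The objects returned by `keys` and `values` are live views of the dictionary, while `list(...)` takes a snapshot. A minimal sketch of the difference, using a fresh copy of `office` and a hypothetical `staplers` entry:

```python
office = {"highlighters": 200, "pencils": 300, "pens": 500}

keys_view = office.keys()        # a live view, not a copy
keys_list = list(office.keys())  # a snapshot that supports indexing

office["staplers"] = 900         # add a key after both were created

print(keys_view)      # the view now includes 'staplers'
print(keys_list)      # the list does not
print(keys_list[0])   # indexing works on the list
```
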

You can loop over the keys to reach each value, or loop over the values directly.

In [33]:
print("Let's count in Spanish")
for k in entosp.keys():  
    print(k)
    print(entosp[k])


Let's count in Spanish
one
uno
two
dos
three
tres


In [34]:
print("Let's look at the code")
for k in codex:   # iterating over a dictionary yields its keys, in insertion order (Python 3.7+)
    print(k, "oops")

Let's look at the code
a oops
b oops
c oops
d oops
e oops
f oops
g oops
h oops
i oops
j oops
k oops
l oops
m oops
n oops
o oops
p oops
q oops
r oops
s oops
t oops
u oops
v oops
w oops
x oops
y oops
z oops


The method `items` returns the `(key, value)` pairs of a dictionary. You can iterate over them directly or convert them to a list of tuples.

In [35]:
x=list(office.items())
print(type(x[0]))
print(x)
x.append(('staplers',900))
print(x)

<class 'tuple'>
[('highlighters', 200), ('pencils', 300), ('pens', 1100)]
[('highlighters', 200), ('pencils', 300), ('pens', 1100), ('staplers', 900)]
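
The conversion also works in the other direction: `dict()` accepts a list of `(key, value)` tuples. The sketch below starts from the list printed above; the `names` and `counts` lists are hypothetical helpers used to show `zip`.

```python
pairs = [("highlighters", 200), ("pencils", 300), ("pens", 1100), ("staplers", 900)]

supplies = dict(pairs)          # each (key, value) tuple becomes one entry
print(supplies["staplers"])

# zip pairs up two parallel lists, which dict() can also consume
names = ["highlighters", "pencils"]
counts = [200, 300]
from_lists = dict(zip(names, counts))
print(from_lists)
```
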


In [36]:
for (k,v) in codex.items():
    print(k, "=", v)

a = 1
b = 2
c = 3
d = 4
e = 5
f = 6
g = 7
h = 8
i = 9
j = 10
k = 11
l = 12
m = 13
n = 14
o = 15
p = 16
q = 17
r = 18
s = 19
t = 20
u = 21
v = 22
w = 23
x = 24
y = 25
z = 26
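
Because `items` yields tuples, it combines well with `sorted` and `max`. The following sketch orders the office supplies by count; it uses a fresh copy of `office` with the values printed earlier, and the variable names `by_count` and `most_stocked` are illustrative.

```python
office = {"highlighters": 200, "pencils": 300, "pens": 1100}

# Sort the (key, value) pairs by value, largest first
by_count = sorted(office.items(), key=lambda pair: pair[1], reverse=True)
for name, count in by_count:
    print(f"{name:>12}: {count}")

# Key with the largest value
most_stocked = max(office, key=office.get)
print(most_stocked)
```
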


# Exercise: Simple Harmonic Motion Lab Data
Assume you've just completed a simple harmonic motion lab measuring both pendulums and spring systems. The data has been organized for analysis using dictionaries.

In [39]:
import numpy as np

# Simple Harmonic Motion Lab - organized experimental data
lab_data = {
    "pendulum": {
        "lengths": [0.25, 0.50, 0.75, 1.00, 1.25],  # meters
        "periods": [1.0, 1.42, 1.74, 2.01, 2.24],   # seconds
        "setup_status": "complete"
    },
    "spring_system": {
        "masses": np.array([0.1, 0.2, 0.3, 0.4, 0.5]),      # kg
        "periods": np.array([0.63, 0.89, 1.09, 1.26, 1.41]), # seconds  
        "spring_constant": 25.0,  # N/m (theoretical)
        "setup_status": "complete"
    },
    "lab_info": {
        "date": "2025-09-15",
        "temperature": 22.5,  # Celsius
        "group_members": 4,
        "lab_complete": True
    }
}

g_assumed = 9.81  # m/s^2

**Questions**
1. Extract the values of the masses used in the spring experiment. Extract the periods measured in the spring experiment. Calculate the spring constants from the data. Create a new field `exp_spring_constant` in the dictionary. What is the percent error of the average calculated spring constant relative to the theoretical spring constant?
2. How many pendulum lengths were used in the pendulum experiment? Find the longest pendulum length used. What was the period of the longest pendulum?
3. Write a summary statement including the completion date, temperature, and number of group members using an f-string.

In [45]:
m = lab_data['spring_system']['masses']
T = lab_data['spring_system']['periods']
print(T)
print(m)

[0.63 0.89 1.09 1.26 1.41]
[0.1 0.2 0.3 0.4 0.5]
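
One possible way to continue with question 1 is sketched below. It assumes the ideal mass-spring relation $T = 2\pi\sqrt{m/k}$, so $k = 4\pi^2 m / T^2$; the notebook does not state which formula is intended. To stay self-contained it copies the values into plain lists; with the NumPy arrays `m` and `T` from the cell above, the same formula can be applied elementwise.

```python
import math

# Values copied from lab_data["spring_system"] above, as plain lists
spring = {
    "masses": [0.1, 0.2, 0.3, 0.4, 0.5],        # kg
    "periods": [0.63, 0.89, 1.09, 1.26, 1.41],  # seconds
    "spring_constant": 25.0,                    # N/m (theoretical)
}

# Assumes the ideal relation T = 2*pi*sqrt(m/k), so k = 4*pi^2 * m / T^2
k_values = [4 * math.pi**2 * m / T**2
            for m, T in zip(spring["masses"], spring["periods"])]

spring["exp_spring_constant"] = k_values   # new field in the dictionary

k_mean = sum(k_values) / len(k_values)
percent_error = abs(k_mean - spring["spring_constant"]) / spring["spring_constant"] * 100

print(f"mean k = {k_mean:.2f} N/m, percent error = {percent_error:.1f}%")
```
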


In [76]:
le=lab_data['pendulum']['lengths']
np.max(le)

1.25
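
To finish question 2, the period that belongs to the longest pendulum sits at the same position in the `periods` list. A minimal sketch, with the pendulum data copied from `lab_data` above into a stand-alone dictionary:

```python
pendulum = {
    "lengths": [0.25, 0.50, 0.75, 1.00, 1.25],  # meters
    "periods": [1.0, 1.42, 1.74, 2.01, 2.24],   # seconds
}

n_lengths = len(pendulum["lengths"])
longest = max(pendulum["lengths"])
i = pendulum["lengths"].index(longest)      # position of the longest length
period_of_longest = pendulum["periods"][i]  # same position in the periods list

print(n_lengths, longest, period_of_longest)
```
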

In [56]:
f"Experiment data of pendulum lengths is {lab_data['pendulum']['lengths']} meters"  # outer double quotes: reusing the same quote inside an f-string needs Python 3.12+

'Experiment data of pendulum lengths is [0.25, 0.5, 0.75, 1.0, 1.25] meters'
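
A summary statement for question 3 could follow the same pattern. This is one possible wording, not a required answer; the `lab_info` values are copied from `lab_data` above.

```python
lab_info = {
    "date": "2025-09-15",
    "temperature": 22.5,  # Celsius
    "group_members": 4,
    "lab_complete": True,
}

# Double quotes outside, single quotes inside: works on Python 3.6 and later
summary = (f"Lab completed on {lab_info['date']} at {lab_info['temperature']} C "
           f"with {lab_info['group_members']} group members.")
print(summary)
```
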

## Dictionaries and json files
One of the more useful applications of dictionaries in Python is processing [json files](https://stackoverflow.blog/2022/06/02/a-beginners-guide-to-json-the-data-format-for-the-internet/). JSON files have a format that looks familiar:
``` json
{
  "key": "String",
  "Number": 1,
  "array": [1,2,3],
  "nested": {
    "literals": true
  }
}
```
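Yes, this is almost exactly the syntax of a dictionary in Python (this is actually a nested dictionary). The main difference is in the literals: JSON writes `true`, `false`, and `null` where Python writes `True`, `False`, and `None`. Many organizations and sites publish their data in `json` format.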
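
The standard-library `json` module converts between JSON text and Python objects. The sketch below parses the sample above from a string; the variable names are illustrative.

```python
import json

text = """
{
  "key": "String",
  "Number": 1,
  "array": [1, 2, 3],
  "nested": {"literals": true}
}
"""

data = json.loads(text)            # JSON text -> Python dict
print(type(data))
print(data["nested"]["literals"])  # JSON true becomes Python True

round_trip = json.dumps(data, indent=2)  # Python dict -> JSON text
print(round_trip)
```
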

Python can pull and process this information directly. People use many packages to pull data from the internet; one of the common packages is `requests`. Using this package's functions you can download the content served at a URL. The following examples pull directly from `json` files, and the response's `json` method converts the content into Python lists and dictionaries.

### Particle Data

In [57]:
import requests as rq
particle  = rq.get("https://pdg.lbl.gov/2022/pdgid/PDGIdentifiers-2022v0.json")
print(particle)
particle_data = particle.json()
type(particle_data)

<Response [200]>


list

In [58]:
print(type(particle_data[0]))
particle_data[0]

<class 'dict'>


{'description': 'gamma (photon)', 'pdgId': 'S000'}

In [59]:
print(particle_data[0]['description'])
len(particle_data)

gamma (photon)
11374

In [60]:
particle_data[114].values()

dict_values(['<N(rho+-)>', 'S044RHC'])
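
A list of dictionaries like this is often easier to search once it is reorganized. The sketch below uses only the two records shown in the outputs above as a stand-in for `particle_data`; the names `sample`, `by_id`, and `photons` are illustrative.

```python
# Two records copied from the outputs above; the full list has many more
sample = [
    {"description": "gamma (photon)", "pdgId": "S000"},
    {"description": "<N(rho+-)>", "pdgId": "S044RHC"},
]

# Build a lookup table keyed by pdgId
by_id = {entry["pdgId"]: entry["description"] for entry in sample}
print(by_id["S000"])

# Keep only the entries whose description mentions a word
photons = [entry for entry in sample if "photon" in entry["description"]]
print(photons)
```
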

### Time-Date
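You can pull the current GMT time and Unix epoch time (reported here as the number of milliseconds elapsed since 00:00 UTC on January 1, 1970). Web services are not always reachable: the request below failed with a `ConnectionError` because the host name could not be resolved.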

In [61]:
import requests as rq
date = rq.get("http://date.jsontest.com")
datetime = date.json()

ConnectionError: HTTPConnectionPool(host='date.jsontest.com', port=80): Max retries exceeded with url: / (Caused by NameResolutionError("<urllib3.connection.HTTPConnection object at 0x000002504DAA68D0>: Failed to resolve 'date.jsontest.com' ([Errno 11001] getaddrinfo failed)"))

In [None]:
datetime['time']
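
If the service cannot be reached, similar information can be built locally with the standard library. This is a stand-in sketch, not the web service's response: the key names in `local_time` are illustrative assumptions. Note that importing `datetime` would replace the notebook variable of the same name used above, so rename one of them if you run both.

```python
from datetime import datetime, timezone

now = datetime.now(timezone.utc)

# Key names here are illustrative; they are not guaranteed to match the web service
local_time = {
    "date": now.strftime("%m-%d-%Y"),
    "time": now.strftime("%I:%M:%S %p"),
    "milliseconds_since_epoch": int(now.timestamp() * 1000),
}
print(local_time["time"])
```
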

### New York City Bikes
Organizations and governments can post loads of information for the public.

In [None]:
import requests as rq
bikes = rq.get("https://gbfs.citibikenyc.com/gbfs/en/station_information.json")
print(bikes)
bikedata = bikes.json()

In [62]:
type(bikedata)

dict

In [63]:
bikedata.keys()

dict_keys(['data', 'last_updated', 'ttl', 'version'])

In [65]:
bikedata['data'].keys()

dict_keys(['stations'])

In [73]:
bikedata['data']['stations'][0]

{'lat': 40.62994,
 'electric_bike_surcharge_waiver': False,
 'eightd_has_key_dispenser': False,
 'eightd_station_services': [],
 'short_name': '2625.01',
 'has_kiosk': False,
 'lon': -74.03121,
 'capacity': 0,
 'name': '78 St & Ridge Blvd',
 'region_id': '71',
 'rental_uris': {'ios': 'https://bkn.lft.to/lastmile_qr_scan',
  'android': 'https://bkn.lft.to/lastmile_qr_scan'},
 'external_id': '2124036103510616644',
 'station_id': '2124036103510616644',
 'station_type': 'classic',
 'rental_methods': ['KEY', 'CREDITCARD']}

In [71]:
bikedata['data']['stations'][67]['lon']  # evaluated but not shown: a cell displays only its last expression
bikedata['data']['stations'][67]['lat']

40.6686627
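
Since `bikedata['data']['stations']` is a list of dictionaries, list comprehensions are a convenient way to work with it. The sketch below runs offline on a small hypothetical sample with the same shape: only the first record comes from the output above, the other two are invented for illustration.

```python
# Hypothetical sample with the same shape as bikedata['data']['stations'];
# only the first record comes from the output above, the others are invented
stations = [
    {"name": "78 St & Ridge Blvd", "capacity": 0, "lat": 40.62994, "lon": -74.03121},
    {"name": "Example Ave & 1 St", "capacity": 27, "lat": 40.70, "lon": -73.99},
    {"name": "Sample Plaza", "capacity": 55, "lat": 40.75, "lon": -73.98},
]

# Pull one field out of every record
names = [s["name"] for s in stations]

# Count the records that satisfy a condition
n_large = sum(1 for s in stations if s["capacity"] > 30)

# Keep the records whose name contains a substring
on_a_street = [s["name"] for s in stations if "St" in s["name"]]

print(names)
print(n_large)
print(on_a_street)
```
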

# Exercise: NYC Bikes Exploration
Use the bike data at https://gbfs.citibikenyc.com/gbfs/en/station_information.json to answer the following questions about CitiBike. 
1. How many stations are there in the NYC database?
2. What information is available for each station?
3. Print the name and capacity of the first 3 stations.
4. How many stations have a capacity greater than 50?
5. Find and print all the station names that contain "Brooklyn" in their name. *(Hint: there is an `in` operator that could help here)*