## Practical 3: Foundations (Part 2)

Getting to grips with Dictionaries, LOLs and DOLs

In this notebook we are exploring basic (in the sense of fundamental)
data structures so that you understand both how to manage more complex
types of data and are prepared for what we will encounter when we start
using `pandas` to perform data analysis. To achieve that, you will need
to be ‘fluent’ in nested lists and dictionaries; we will focus primarily
on lists-of-lists and dictionaries-of-lists, but note that file formats
like JSON can be understood as
dictionaries-of-dictionaries-of-lists-of-… so this is just a *taster* of
real-world data structures.

> **Tip**
>
> You should download this notebook and then save it to your own copy of
> the repository. Follow the process used *last* week
> (i.e. `git add ...`, `git commit -m "..."`, `git push`) right away and
> then do this again at the end of the class and you’ll have a record of
> everything you did.

## 1. From Lists to Data (Little Steps)

We’re going to start off using lists and dictionaries that *we* define
right at the start of the ‘program’, but the *real* value of these data
structures comes when we build a list or dictionary *from* data such as
a file or a web page… and that’s what we’re going to do below!

First, here’s a reminder of some useful methods (*i.e.* functions) that
apply to lists which we covered in the
[lecture](https://jreades.github.io/fsds/sessions/week2.html#pre-recorded-lectures)
and
[practical](https://jreades.github.io/fsds/sessions/week2.html#practical)
in Week 2:

| Method | Action |
|--------------------|----------------------------------------------------|
| `list.count(x)` | Return the number of times x appears in the list |
| `list.insert(i, x)` | Insert value `x` at a given position `i` |
| `list.pop([i])` | Remove and return the value at position `i` (`i` is optional) |
| `list.remove(x)` | Remove the first element from the list whose value is `x` |
| `list.reverse()` | Reverse the elements of the list in place |
| `list.sort()` | Sort the items of the list in place |
| `list.index(x)` | Find the first occurrence of `x` in the list |
| `list[x:y]` | Slice the list from index `x` to `y-1` |

This should all be revision… because it’s how we finished things up
*last week*. But I want to go over it *briefly* again because we’re
going to build on it this week.
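
If it helps to see several of these methods working together, here is a minimal sketch. The `names` list is made up purely for illustration and is not part of the practical's data.

```python
# A throwaway list of made-up listing names (note the deliberate duplicate)
names = ["Sunny 1-Bed", "Fantastic Dbl", "Sunny 1-Bed", "Whole House"]

print(names.count("Sunny 1-Bed"))   # how many times the value appears
print(names.index("Sunny 1-Bed"))   # position of the *first* match only

names.insert(1, "Trendy Terrace")   # insert at position 1, shifting the rest along
removed = names.pop()               # no index given: removes and returns the last item
names.remove("Sunny 1-Bed")         # removes only the first matching value
names.sort()                        # sorts in place (and returns None)

print(removed)
print(names)
print(names[0:2])                   # slice: positions 0 and 1, but not 2
```

Notice that `sort()` and `reverse()` change the list itself rather than handing back a new one, so `names = names.sort()` would leave you with `None`.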

> **Hint**
>
> As before, `??` will highlight where one or more bits of code are
> missing and need to be filled in…

### 1.1 List Refresher

> **Difficulty: Low.**

To complete these tasks, all of the methods that you need are listed
above, so this is about testing yourself on your understanding *both* of
how to read the help *and* how to index elements in a list.

The next line creates a list of (made up) Airbnb property names where
each element is a string:

In [1]:
listings = ["Sunny 1-Bed", "Fantastic Dbl",
    "Home-Away-From-Home", "Sunny Single", 
    "Whole House", "Trendy Terrace"]

#### 1.1.1 List Arithmetic

Replace the `??` so that it prints <code>Sunny Single</code>.

##### 1.1.1.1 Question

In [2]:
print(listings[2 + 1]) # the first element, "Sunny 1-Bed", has index 0

Sunny Single


#### 1.1.2 Negative List Arithmetic

Now use a **negative** index to print <code>Whole House</code>:

##### 1.1.2.1 Question

In [3]:
print(listings[-2]) # negative indexes count back from the end, starting at 1: -1 is the last element

Whole House


#### 1.1.3 Finding a Position in a List

Replace the `??` so that it prints the *index* for <code>Fantastic
Dbl</code> in the list.

##### 1.1.3.1 Question

In [4]:
print("The position of 'Fantastic Dbl' in the list is: " + str(listings.index("Fantastic Dbl") )) # str() converts the integer index to a string

The position of 'Fantastic Dbl' in the list is: 1


### 1.2 Looking Across Lists

> **Connections**
>
> This section draws on the
> [LOLs](https://jreades.github.io/fsds/sessions/week3.html#pre-recorded-lectures)
> lecture and you will also find Code Camp’s
> [Loops](https://jreades.github.io/code-camp/lessons/Loops.html)
> session useful here.

> **Difficulty: Medium.**

Notice that the list of `prices` is the same length as the list of
`listings`, that’s because these are (made-up) prices for each listing.

In [5]:
listings = ["Sunny 1-Bed", "Fantastic Dbl", "Home-Away-From-Home", "Sunny Single", "Whole House", "Trendy Terrace"]
prices = [37.50, 46.00, 125.00, 45.00, 299.99, 175.00]

#### 1.2.1 Lateral Thinking

Given what you know about `listings` and `prices`, how do you print:

> `"The nightly price for Home-Away-From-Home is £125.0."`

But you have to do this *without* doing any of the following:

1.  Using a list index directly (*i.e.* `listings[2]` and `prices[2]`)
    or
2.  Hard-coding the name of the listing?

To put it another way, **neither** of these solutions is the answer:

In [6]:
print("The nightly price for `Home-Away-From-Home` is £" + str(prices[2]) + ".")
# ...OR...
listing=2
print("The nightly price for `Home-Away-From-Home` is £" + str(prices[listing]) + ".")

The nightly price for `Home-Away-From-Home` is £125.0.
The nightly price for `Home-Away-From-Home` is £125.0.


> **Tip**
>
> You will need to combine some of the ideas above and also think about
> the fact that the list index that we need is the same in both
> lists… Also, remember that you’ll need to wrap a `str(...)` around
> your price to make it into a string.

##### 1.2.1.1 Question

In [7]:
listing="Home-Away-From-Home" # Use this to get the solution...
position=listings.index(listing)
price=prices[position]

# This way is perfectly fine
print("The nightly price of " + listing + " is £" + str(price) + '.')
# This way is more Python 3 and a bit easier to read
print(f"The nightly price of {listing} is £{price}.")

The nightly price of Home-Away-From-Home is £125.0.
The nightly price of Home-Away-From-Home is £125.0.


#### 1.2.2 Double-Checking Your Solution

In [8]:
listing = 'Sunny Single'

You’ll know that you got the ‘right’ answer to the question above if you
can copy+paste your code and change only **one** thing in order to print
out: “The nightly price of Sunny Single is £45.0”

##### 1.2.2.1 Question

In [9]:
listing = 'Sunny Single' # either '' or "" can be used to quote a string
position=listings.index(listing)
price=prices[position] # use square brackets [] to access a list element

print("The nightly price of "+listing+ " is £" +str(price) + ".")
# or, more conveniently:
print(f"The nightly price of {listing} is £{price}.")

The nightly price of Sunny Single is £45.0.
The nightly price of Sunny Single is £45.0.


#### 1.2.3 Loops

Now use a `for` loop over the listings to print out the price of each.
But first, some information about formatting numbers in Python…

> **Formatting Numbers**
>
> We often want to format numbers in a particular way to make them more
> readable. Commonly, in English we use commas for thousands separators
> and a full-stop for the decimal. Other countries follow other
> standards, but by default Python goes the English way. So:
>
> ``` python
> print(f"{1234567.25:.0f}")
> print(f"{1234567.25:.1f}")
> print(f"{1234567.25:.2f}")
> print(f"{1234567:.2f}")
> print(f"{1234567:,.2f}")
> ```
>     1234567
>     1234567.2
>     1234567.25
>     1234567.00
>     1,234,567.00
>
> You might like to read this [useful
> article](https://blog.teclado.com/python-formatting-numbers-for-printing/)
> on formatting numbers in Python.

That should then help you with the output of the following block of
code!

##### 1.2.3.1 Question

In [25]:
print(f"{1234567:,.2f}")   # comma as the thousands separator
print(f"{1234567:_.2f}")   # underscore as the thousands separator

1,234,567.00
1_234_567.00


In [24]:
for l in listings: # remember to indent the loop body (one Tab) to mark it as part of the for loop
    position=listings.index(l)
    price=prices[position]
    print(f"The nightly price of {l} is £{price:.2f}.")

The nightly price of Sunny 1-Bed is £37.50.
The nightly price of Fantastic Dbl is £46.00.
The nightly price of Home-Away-From-Home is £125.00.
The nightly price of Sunny Single is £45.00.
The nightly price of Whole House is £299.99.
The nightly price of Trendy Terrace is £175.00.


The output should be:

In [None]:
The nightly price of Sunny 1-Bed is £37.50
The nightly price of Fantastic Dbl is £46.00
The nightly price of Home-Away-From-Home is £125.00
The nightly price of Sunny Single is £45.00
The nightly price of Whole House is £299.99
The nightly price of Trendy Terrace is £175.00
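
Looking up the position with `listings.index(...)` inside the loop works here, but it searches the list from the start every time and only ever finds the *first* match, so two listings with the same name would both get the first one's price. One possible alternative, sketched below with the same made-up data, is Python's built-in `zip`, which walks both lists in step. The `lines` list is an addition of ours, there only to collect the output.

```python
listings = ["Sunny 1-Bed", "Fantastic Dbl", "Home-Away-From-Home",
            "Sunny Single", "Whole House", "Trendy Terrace"]
prices = [37.50, 46.00, 125.00, 45.00, 299.99, 175.00]

lines = []
# zip pairs the first name with the first price, the second with the second, ...
for name, price in zip(listings, prices):
    lines.append(f"The nightly price of {name} is £{price:.2f}")

for line in lines:
    print(line)
```

This relies on the two lists being the same length and in the same order; `zip` stops quietly at the end of the shorter list.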

## 2. Dictionaries

> **Connections**
>
> This section draws on the
> [Dictionaries](https://jreades.github.io/fsds/sessions/week3.html#pre-recorded-lectures)
> lecture and Code Camp
> [Dictionaries](https://jreades.github.io/code-camp/lessons/Dicts.html)
> session.

> **Difficulty: Low.**

Remember that dictionaries (a.k.a. dicts) are like lists in that they
are [data
structures](https://docs.python.org/2/tutorial/datastructures.html)
containing multiple elements. A key difference between
[dictionaries](https://docs.python.org/2/tutorial/datastructures.html#dictionaries)
and [lists](https://docs.python.org/2/tutorial/introduction.html#lists)
is that while elements in lists are ordered, dicts are unordered in most
programming languages (Python is an exception: since version 3.7 its
dicts preserve insertion order). This means that whereas for lists we
use integers as indexes to access elements, in dictionaries we use
‘keys’ (which can be of several different types: strings, integers,
etc.). Consequently, the important term here is key-value pairs.
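
To make the idea of key-value pairs concrete, here is a small sketch. The `lookup` dictionary and its contents are invented for illustration only.

```python
# Made-up example: keys can be strings, integers, or tuples
lookup = {
    "London": "a string key",
    2024: "an integer key",
    (51.51, -0.08): "a tuple key",
}

print(lookup["London"])
print(lookup[2024])
print(lookup[(51.51, -0.08)])

# Since Python 3.7, iterating over a dict follows insertion order
print(list(lookup.keys()))

# A list cannot be used as a key because it can be changed after creation
try:
    bad = {[51.51, -0.08]: "a list key"}
except TypeError as e:
    print("Lists cannot be keys:", e)
```

The rule of thumb is that a key must be something that cannot change once created, which is why a tuple works and a list does not.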

### 2.1 Creating an Atlas

The code below creates an atlas using a dictionary. The dictionary `key`
is a listing, and the `value` is the latitude, longitude, and price.

In [26]:
listings = {
    'Sunny 1-Bed': [37.77, -122.43, '£37.50'],
    'Fantastic Dbl': [51.51, -0.08, '£46.00'],
    'Home-Away-From-Home': [48.86, 2.29, '£125.00'],
    'Sunny Single': [39.92, 116.40 ,'£45.00'],
}

#### 2.1.1 Adding to a Dict

Add a record to the dictionary for “Whole House” following the same
format.

##### 2.1.1.1 Question

In [36]:
listings['Whole House'] = [13.08, 80.28, '£299.99']

#### 2.1.2 Accessing a Dict

In *one* line of code, print out the price for ‘Whole House’:

##### 2.1.2.1 Question

In [38]:
print(listings['Whole House'][2])

£299.99


### 2.2 Dealing With Errors

Check you understand the difference between the following two blocks of
code by running them.

In [39]:
try:
    print(listings['Trendy Terrace'])
except KeyError as e:
    print("Error found")
    print(e)

Error found
'Trendy Terrace'


In [40]:
try:
    print(listings.get('Trendy Terrace','Not Found'))
except KeyError as e:
    print("Error found")
    print(e)

Not Found


Notice that trying to access a non-existent element of a dict with
square brackets triggers a `KeyError`, while asking the dict to `get`
the *same element* does not: it simply returns the default that you
supplied (here `'Not Found'`), or `None` if you did not supply one. Can
you think why, depending on the situation, *either* of these might be
the ‘correct’ answer?
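
As a rough illustration of the three behaviours side by side, here is a sketch using a cut-down, made-up dictionary called `demo` so that your `listings` variable is left untouched.

```python
# A one-entry stand-in for the atlas (hypothetical, for illustration only)
demo = {'Sunny 1-Bed': [37.77, -122.43, '£37.50']}
missing = 'Trendy Terrace'

no_default = demo.get(missing)                 # no default supplied
with_default = demo.get(missing, 'Not Found')  # default supplied
print(no_default)
print(with_default)

# Square brackets are strict: useful when a missing key means something is wrong
try:
    demo[missing]
    raised = False
except KeyError:
    raised = True
print(raised)
```

One way to think about it: use square brackets when a missing key is a mistake that you want to hear about, and `get` when missing data is expected and you have a sensible fallback.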

### 2.3 Thinking Data

This section makes use of both the
[Dictionaries](https://jreades.github.io/fsds//sessions/week3.html#pre-recorded-lectures)
lecture and the [DOLs to
Data](https://jreades.github.io/fsds/sessions/week3.html#pre-recorded-lectures)
lecture.

> **Tip**
>
> In this section you’ll need to look up (i.e. Google) and make use of a
> few new functions that apply to dictionaries: `<dictionary>.items()`,
> `<dictionary>.keys()`. *Remember*: if in doubt, add `print(...)`
> statements to see what is going on!

#### 2.3.1 Iterating over a Dict

Adapting the code below, print out the listing name and price.

##### 2.3.1.1 Question

In [62]:
for l in listings.keys():
    print(l, '->', listings[l][2])
    # or, equivalently (the stored price already includes the £ sign)
    #print(f'{l} -> {listings[l][2]}')

Sunny 1-Bed -> £37.50
Fantastic Dbl -> £46.00
Home-Away-From-Home -> £125.00
Sunny Single -> £45.00
Whole House -> £299.99


The output should look something like this:

In [None]:
Sunny 1-Bed -> £37.50
Fantastic Dbl -> £46.00
Home-Away-From-Home -> £125.00
Sunny Single -> £45.00
Whole House -> £299.99

#### 2.3.2 More Complex Dicts

How would your code need to change to produce the *same output* from
this data structure:

In [75]:
listings = {
    'Sunny 1-Bed': {
        'lat': 37.77, 
        'lon': -122.43, 
        'price': '£37.50'},
    'Fantastic Dbl': {
        'lat': 51.51, 
        'lon': -0.08, 
        'price': '£46.00'},
    'Home-Away-From-Home': {
        'lat': 48.86, 
        'lon': 2.29, 
        'price': '£125.00'},
    'Sunny Single': {
        'lat': 39.92, 
        'lon': 116.40, 
        'price': '£45.00'},
}

##### 2.3.2.1 Question

So to print out the below for each listing it’s…

In [84]:
for l in listings.keys():
    print(l, '->', listings[l]['price'])
    # or, equivalently (double quotes outside, so that this also runs before Python 3.12)
    #print(f"{l} -> {listings[l]['price']}")
    

Sunny 1-Bed -> £37.50
Fantastic Dbl -> £46.00
Home-Away-From-Home -> £125.00
Sunny Single -> £45.00


Your output should be:

In [None]:
Sunny 1-Bed -> £37.50
Fantastic Dbl -> £46.00
Home-Away-From-Home -> £125.00
Sunny Single -> £45.00

#### 2.3.3 More Dictionary Action!

And how would it need to change to print out the name and latitude of
every listing?

##### 2.3.3.1 Question

In [87]:
for l in listings.keys():
    print(l, 'is at latitude', listings[l]['lat'])
    # or, equivalently (double quotes outside, so that this also runs before Python 3.12)
    #print(f"{l} is at latitude {listings[l]['lat']}")

Sunny 1-Bed is at latitude 37.77
Fantastic Dbl is at latitude 51.51
Home-Away-From-Home is at latitude 48.86
Sunny Single is at latitude 39.92


The output should be something like this:

In [None]:
Sunny 1-Bed is at latitude 37.77
Fantastic Dbl is at latitude 51.51
Home-Away-From-Home is at latitude 48.86
Sunny Single is at latitude 39.92

#### 2.3.4 And Another Way to Use a Dict

Now produce the *same output* using this new data structure:

In [124]:
listings_alt = [
    {'name':     'Sunny 1-Bed',
     'position': [37.77, -122.43],
     'price':    '£37.50'},
    {'name':     'Fantastic Dbl',
     'position': [51.51, -0.08],
     'price':    '£46.00'},
    {'name':     'Home-Away-From-Home',
     'position': [48.86, 2.29],
     'price':    '£125.00'},
    {'name':     'Sunny Single',
     'position': [39.92, 116.40],
     'price':    '£45.00'},
    {'name':     'Whole House', 
     'position': [13.08, 80.28],
     'price':    '£299.99'}
]

In [131]:
# If you only want the first record
first =  listings_alt[0]
print(first)

{'name': 'Sunny 1-Bed', 'position': [37.77, -122.43], 'price': '£37.50'}


##### 2.3.4.1 Question

In [110]:
for l in listings_alt:
    print(l['name'], 'is at latitude', l['position'][0])
    # or, equivalently (double quotes outside, so that this also runs before Python 3.12)
    #print(f"{l['name']} is at latitude {l['position'][0]}")

Sunny 1-Bed is at latitude 37.77
Fantastic Dbl is at latitude 51.51
Home-Away-From-Home is at latitude 48.86
Sunny Single is at latitude 39.92
Whole House is at latitude 13.08


The output should be something like this:

In [None]:
Sunny 1-Bed is at latitude 37.77
Fantastic Dbl is at latitude 51.51
Home-Away-From-Home is at latitude 48.86
Sunny Single is at latitude 39.92
Whole House is at latitude 13.08

#### 2.3.5 Think Data!

What are some of the main differences that you can think of between
these data structures? There is no right answer.

-   Point 1 here.
-   Point 2 here.
-   Point 3 here.
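
One way to start thinking about the differences is to convert one structure into another and compare how you then find a single listing. The sketch below uses a two-entry, made-up dictionary (`by_name`) rather than the full data, and the variable names are our own.

```python
# Dict-of-dicts: each listing is reached directly by its name
by_name = {
    'Sunny 1-Bed':   {'lat': 37.77, 'lon': -122.43, 'price': '£37.50'},
    'Fantastic Dbl': {'lat': 51.51, 'lon': -0.08,   'price': '£46.00'},
}

# Convert to a list-of-dicts in the style of listings_alt
as_list = []
for name, details in by_name.items():
    as_list.append({
        'name': name,
        'position': [details['lat'], details['lon']],
        'price': details['price'],
    })

# Direct lookup in the dict-of-dicts...
print(by_name['Fantastic Dbl']['price'])

# ...versus scanning through the list-of-dicts for a matching name
found = None
for record in as_list:
    if record['name'] == 'Fantastic Dbl':
        found = record['price']
print(found)
```

The dict-of-dicts gives you a quick lookup by name but cannot hold two listings with the same name; the list-of-dicts can, at the cost of having to search.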

### 2.4 Add to Git/GitHub

Now follow the same process that you used last week to ensure that your
edited notebook is updated in Git and then synchronised with GitHub.

# Updated Review (practice with video)

In [9]:
# A list is wrapped in [] and its elements are separated by commas
cities = [
    'San Francisco',
    'London',
    'Paris',
    'Beijing'
]

# Prints London
print(cities[1]) # access by position

London


In [6]:
# A dict is wrapped in {} and its entries are separated by commas
cities = {
    'San Francisco': 837442,
    'London': 8673713,
    'Paris': 837442,
    'Beijing': 17000000
}

# Prints population of London
print(cities['London']) # access by key

8673713


In [20]:
# Accessing a dict value directly with []: raises an error
cities = {
    'San Francisco': 837442,
    'London': 8673713,
    'Paris': 837442,
    'Beijing': 0.1700000
}

print(cities['Sao Paulo'])          # Throws KeyError
# The key does not exist, so a KeyError is raised immediately

KeyError: 'Sao Paulo'

In [21]:
# Accessing a value with .get(): no error
print(cities.get('Sao Paulo'))      # Returns None
# The key does not exist, so None is returned (no error)

None


In [22]:
print(cities.get('Sao Paulo', 'No Data'))   # Returns 'No Data'
# The key does not exist, so the default value is returned (no error)

No Data


In [27]:
# Checking whether a key exists with `in`
cities = {
    'San Francisco': 837442,
    'London': 8673713,
    'Paris': 837442,
    'Beijing': 0.1700000
}

c = cities.get('Sao Paulo')
if not c:
    print('Sorry,no city by that name.')

if 'Beijing' in cities:
    print('Found Beijing!')


Sorry,no city by that name.
Found Beijing!
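
One thing to watch with the `if not c:` pattern: it cannot tell a missing key apart from a key whose value happens to be 'falsy' (such as `0` or an empty string). The sketch below uses invented data (`counts`, including a fictional city) to show the difference.

```python
# Made-up data: one key exists but its value is 0
counts = {'London': 8673713, 'Atlantis': 0}

c = counts.get('Atlantis')
print(not c)                 # True, even though the key exists
print(c is None)             # False: the key exists, its value is just 0
print('Atlantis' in counts)  # True

missing = counts.get('Sao Paulo')
print(missing is None)       # True: this key really is absent
```

So if a value of zero is plausible in your data, testing with `in` or with `is None` is probably the safer choice.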


In [57]:
# Adding, updating, and deleting keys in a dict
cities = {}                 # empty dictionary
cities['Beijing'] = 21710662     # set key -> value
cities['Toronto'] = 2930000      #set key -> value


print(cities['Toronto'])    # print key
#del cities['Toronto']   # del(delete) Toronto key and value
cities.pop('Toronto', 'Default')  # Toronto exists here, so pop removes it and returns its value; 'Default' is returned only if the key is missing


#cities['Beijing'] = 16620   # updates the old value by overwriting it
print(cities)  # print the final dict

2930000
{'Beijing': 21710662}


In [58]:
# Check
cities = {}                 # empty dictionary
cities['Beijing'] = 21710662     # set key -> value
cities['Toronto'] = 2930000      # set key -> value

del cities['Toronto']   # del(delete) Toronto key and value
#cities.pop('Toronto', 'Default')  # Toronto no longer exists, so this would return 'Default'

print(cities) 

{'Beijing': 21710662}


In [60]:
# Check
cities = {}                 # empty dictionary
cities['Beijing'] = 21710662     # set key -> value
cities['Toronto'] = 2930000      # set key -> value

del cities['Toronto']   # del(delete) Toronto key and value
cities.pop('Toronto')  # Toronto no longer exists and no default is given, so this raises a KeyError

print(cities) 

KeyError: 'Toronto'
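
The three cells above can be summarised in a single sketch. It uses its own dictionary, `cities_demo`, so that it does not interfere with `cities`.

```python
cities_demo = {'Beijing': 21710662, 'Toronto': 2930000}

# Key exists: pop removes it and returns its value
first = cities_demo.pop('Toronto', 'Default')
# Key is now gone: pop returns the default instead of raising an error
second = cities_demo.pop('Toronto', 'Default')
print(first)
print(second)

# del has no default, so deleting a missing key raises a KeyError
try:
    del cities_demo['Toronto']
except KeyError as e:
    print('del raised a KeyError for', e)

print(cities_demo)
```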

In [71]:
# A for loop over a dict iterates over its keys by default, not its values!
cities = {
    'San Francisco': 837442,
    'London': 8673713,
    'Paris': 837442,
    'Beijing': 0.1700000
}

for keys in cities:    # or: for keys in cities.keys():
    print(keys)

San Francisco
London
Paris
Beijing


In [68]:
# Continuing from above: to iterate over the values
for values in cities.values():
    print(values)

837442
8673713
837442
0.17


In [75]:
# Continuing from above: to iterate over both keys and values
for keys,values in cities.items():
    print(keys,values)

San Francisco 837442
London 8673713
Paris 837442
Beijing 0.17


In [78]:
# or
for keys,values in cities.items():
    print(f'{keys} -> {values}')

San Francisco -> 837442
London -> 8673713
Paris -> 837442
Beijing -> 0.17


In [86]:
cities = {
    'San Francisco': {
        'lat': 37.77,
        'lon': -122.43,
        'airport': 'SFO'}
    }
print(cities['San Francisco']['lat'])  # a dict inside a dict

37.77


In [92]:
# Print a matrix
my_list = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

print(my_list)
print(my_list[2])

for i in my_list:    # iterate over each element (row) of my_list
    print(i)
    
# Compare the two ways of printing

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
[7, 8, 9]
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]


In [101]:
a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 9]

for i in my_list:
    print(i)

[1, 2, 3]
[4, 5, 6]
[7, 8, 9]


In [100]:
# Print every element of the matrix
a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 9]

print(a[0])

my_list = [a, b, c]
for i in my_list:
    print(f'{i}')
    for j in i:          # nested for loop: j runs over the items of i (remember that i is a list)
        print(j)

1
[1, 2, 3]
1
2
3
[4, 5, 6]
4
5
6
[7, 8, 9]
7
8
9
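
The same nested-loop pattern can also *collect* values rather than print them. As a small sketch (with `my_matrix` and `flat` being names of our own choosing), this flattens a list-of-lists into one list:

```python
my_matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
]

flat = []
for row in my_matrix:        # row is itself a list
    for value in row:        # value is a single number from that row
        flat.append(value)

print(flat)
print(len(flat))
```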


In [103]:
# Accessing elements in a nested list
a = [1, 2, 3]   # row 0
b = [4, 5, 6]   # row 1
c = [7, 8, 9]   # row 2


# 3×3 Matrix

i, j = 0, 1
print(my_list[i][j])     # double index: the value at row i, column j

2


In [105]:
# Continuing from above
print(my_list[i])
print(my_list[j])

[1, 2, 3]
[4, 5, 6]


In [112]:
# Lists of lists (LOLs)
my_cities = [
    ['London', [51.5072, 0.1275], +0],
    ['New York', [40.7127, 74.0059], -5],
    ['Tokyo', [35.6833, 139.6833], +8]
]

print(my_cities[0][0])    # name (London)
print(my_cities[0][1][0]) # latitude
print(my_cities[0][2])    # row 0, column 2

London
51.5072
0


In [118]:
# LOLs of Data
# To output all of the latitudes
ds1 = [
    ['lat', 'lon', 'name', 'tz'],
    [51.51, 0.13, 'London', +0],
    [40.71, 74.01, 'New York', -5],
    [35.69,139.68,'Tokyo', +8]
]

lats = []
for row in ds1[1:]:
    lats.append(row[0])

print(lats)

[51.51, 40.71, 35.69]
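
Extracting a column from a list-of-lists always needs a loop like the one above. If you find yourself doing that for several columns, one option is to reorganise the data into a dictionary-of-lists once, using the header row for the keys. The sketch below is one possible way to do that; the names `lol`, `dol`, `header` and `rows` are our own.

```python
lol = [
    ['lat', 'lon', 'name', 'tz'],
    [51.51, 0.13, 'London', +0],
    [40.71, 74.01, 'New York', -5],
    [35.69, 139.68, 'Tokyo', +8]
]

header = lol[0]     # the first row holds the column names
rows = lol[1:]      # everything else is data

dol = {}
for col, field in enumerate(header):
    dol[field] = []                    # one empty list per column
    for row in rows:
        dol[field].append(row[col])    # copy that column's value from each row

print(dol['lat'])
print(dol['name'])
```

Here `enumerate` simply hands back each column name together with its position, so that the position can be used to index into every row.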


In [125]:
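# DOLs (dict of lists) of data: better suited to extracting data because
# each list holds values of a single type (float, int, or string), which makes operations faster
# To output all of the latitudes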
ds2 = {
    'lat': [51.51,40.71,35.69],
    'lon': [0.13,74.01,139.68],
    'tz' : [+0,-5,+8],
    'name':['London','New York','Tokyo']
}
lat = ds2['lat']
print(lat)   # or simply print(ds2['lat'])

[51.51, 40.71, 35.69]


In [128]:
# Finding an element with a list plus an index
ds2 = {
    'lat': [51.51,40.71,35.69],
    'lon': [0.13,74.01,139.68],
    'tz' : [+0,-5,+8],
    'name':['London','New York','Tokyo']
}

print(ds2['name'][0])
print(ds2['lat'][0])
print(ds2['tz'][0])

London
51.51
0


In [156]:
ds2 = {
    'lat': [51.51,40.71,35.69],
    'lon': [0.13,74.01,139.68],
    'tz' : [+0,-5,+8],
    'name':['London','New York','Tokyo']
}

city_name = 'Tokyo'

city_index = ds2['name'].index(city_name)
print(city_index)     # print the index of the city

print(f"The time zone of {city_name} is {ds2['tz'][city_index]}.")

2
The time zone of Tokyo is 8.
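
The index trick above generalises: once you know a city's position in the `name` list, the same position picks out its value in every other list. As a sketch, the hypothetical helper `get_record` below (not something from the lecture) gathers all of a city's values into one small dictionary.

```python
dol = {
    'lat': [51.51, 40.71, 35.69],
    'lon': [0.13, 74.01, 139.68],
    'tz':  [+0, -5, +8],
    'name': ['London', 'New York', 'Tokyo']
}

def get_record(data, name):
    """Hypothetical helper: collect every field for the row whose 'name' matches."""
    i = data['name'].index(name)      # raises ValueError if the name is absent
    record = {}
    for field, values in data.items():
        record[field] = values[i]     # same position in every list
    return record

print(get_record(dol, 'Tokyo'))
```

This only works because every list in the dictionary has the same length and the same ordering, which is exactly the assumption that `pandas` makes about the columns of a data frame.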
