# 3. Python Built-in Data Structures

## List

Python lists are versatile containers (boxes) that are useful in many circumstances. A list, defined using square brackets `[]`, is an ordered collection that can hold any number of Python objects, such as strings, integers, floats, or even other lists. Each element can be accessed by its numerical index.

1. Create a Python list
2. Access, modify, and delete list elements
3. Merge lists
4. Use the slicing syntax to operate on sublists
5. Loop over lists

### Create a list

In [1]:
city = ["Lancaster", "Warwick", "Lancaster", "Bath"]

### Access list elements

Python follows C's convention of starting numerical indices at zero. This means that, for a list of length $n$, the first element is at index $0$ (think of it as a location), the second is at index $1$, and so on. Be careful: the last element of the list is therefore at index $n-1$, not $n$.

To get the **first** element in the above list, we type:

In [2]:
# Python is inspired by C
# follows C's convention that index starts with 0
city[0]

'Lancaster'

How about the **last** element? A negative index counts from the end of the list, so `-1` refers to the last element:

In [3]:
city[-1]

'Bath'
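
To make the link between positive and negative indices concrete, here is a minimal sketch. It uses a separate copy of the four-element list (the names `city_demo`, `n`, `last_by_length` and `last_by_negative` are illustrative, not part of the original notebook) and shows that index `n - 1` and index `-1` point at the same element.

```python
# A separate copy of the original list, so the notebook's `city` is untouched
city_demo = ["Lancaster", "Warwick", "Lancaster", "Bath"]

n = len(city_demo)                  # number of elements: 4
last_by_length = city_demo[n - 1]   # last element via its positive index
last_by_negative = city_demo[-1]    # same element, counting from the end

print(n, last_by_length, last_by_negative)
print(city_demo[-2])                # second-to-last element
```

Asking for `city_demo[n]` would raise an `IndexError`, because valid positive indices stop at `n - 1`.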

#### A simple loop

#### mix of strings and numbers

In [4]:
mix = [1, [2, 3], 'Python rocks!', True]

#### manipulate lists

In [9]:
# add a single element to the end of the list
city.append('Manchester')
print(city)

['Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester']


In [10]:
# insert an element into a specific location
city.insert(0, "Cambridge")
print(city)

['Cambridge', 'Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester']


In [11]:
# update an element
city[0] = "Oxford"

In [12]:
# add another list
city.extend(['York', 'Edinburgh'])
print(city)

['Oxford', 'Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester', 'York', 'Edinburgh']


In [None]:
# or concatenate two lists with the + sign
city = city + ['Liverpool', 'Glasgow']
print(city)

['Oxford', 'Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester', 'York', 'Edinburgh', 'Liverpool', 'Glasgow']


In [13]:
# delete element by index
del city[0]
print(city)

['Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester', 'York', 'Edinburgh', 'Liverpool', 'Glasgow']


In [14]:
# or by value
city.remove("Glasgow")
print(city)

['Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'York', 'Edinburgh', 'Manchester', 'York', 'Edinburgh', 'Liverpool']


In [15]:
# sort list
city.sort()
print(city)

['Bath', 'Edinburgh', 'Edinburgh', 'Lancaster', 'Lancaster', 'Liverpool', 'Manchester', 'Manchester', 'Oxford', 'Warwick', 'York', 'York']
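
Two details of the methods above are easy to miss: `remove()` deletes only the first matching value, and `sort()` changes the list in place and returns `None`, whereas the built-in `sorted()` returns a new list. The sketch below illustrates both on a small made-up list (`demo`, `ordered` and `result` are illustrative names).

```python
# A small stand-alone list with a duplicate value (illustrative data)
demo = ["York", "Bath", "York", "Oxford"]

demo.remove("York")      # removes only the first "York"
print(demo)              # ['Bath', 'York', 'Oxford']

ordered = sorted(demo)   # sorted() builds and returns a new list
print(ordered)           # ['Bath', 'Oxford', 'York']
print(demo)              # original order unchanged: ['Bath', 'York', 'Oxford']

result = demo.sort()     # sort() reorders the list in place and returns None
print(result, demo)      # None ['Bath', 'Oxford', 'York']
```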


#### Another list of numbers

In [None]:
# a list of square numbers that we will slice using (:)
sq = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

#### another counterintuitive convention

We can select a group of elements in a list starting from the first element indicated and going up to (**but not including**) the last element indicated. For example:

In [None]:
# to get the first three elements
sq[0:3] # Remember index starts with 0

We can select everything in a list

In [None]:
sq[:]

We can select everything before or after a certain point

In [None]:
sq[8:]

In [None]:
sq[:3]

We can also add a third component to slice a list, which is the step size we want to take. So instead of taking every single element, we can take every other element. For instance:

In [None]:
# the max index for the list is 10
# if using 10 as the second component, the last element will be excluded
sq[0:11:2]
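
As a quick reference, the slicing forms discussed above can be compared side by side. This is a minimal sketch on a copy of the same numbers (the name `squares` is illustrative); the general pattern is `list[start:stop:step]`, where `stop` is never included.

```python
squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]  # same values as `sq`

print(squares[0:3])     # indices 0, 1, 2        -> [0, 1, 4]
print(squares[8:])      # index 8 to the end     -> [64, 81, 100]
print(squares[-3:])     # the last three         -> [64, 81, 100]
print(squares[::2])     # every other element    -> [0, 4, 16, 36, 64, 100]
print(squares[0:10:2])  # stop=10 drops index 10 -> [0, 4, 16, 36, 64]
```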

### loop through lists

We can create an empty list for university names and loop through all the cities in the list `city` and add "University of" to city names:

In [19]:
uni = []
for i in range(len(city)): # len() returns the number of elements
    uni.append("University of {}".format(city[i]))

print(uni)

['University of Bath', 'University of Edinburgh', 'University of Edinburgh', 'University of Lancaster', 'University of Lancaster', 'University of Liverpool', 'University of Manchester', 'University of Manchester', 'University of Oxford', 'University of Warwick', 'University of York', 'University of York']
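
The loop above walks through index positions. When the index itself is not needed, a common alternative is to loop over the elements directly. The sketch below assumes a short illustrative list (`cities_demo`) rather than the full `city` list, and builds the same kind of result in `uni_demo`.

```python
cities_demo = ["Bath", "Lancaster", "York"]  # illustrative subset of cities

uni_demo = []
for name in cities_demo:                     # `name` is each element in turn
    uni_demo.append("University of {}".format(name))

print(uni_demo)
```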


We can also do mathematical operations on each element in a list

In [None]:
import math

for value in sq:
    print("square root for {} is {}".format(value, int(math.sqrt(value))))

In [None]:
# even better: enumerate() gives us both the position and the value
for index, value in enumerate(sq):
    print("# {} element ==> {}".format(index+1, value))

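#### Useful built-in function `range()` to generate sequences of numbers

In [None]:
# range(stop) counts from 0 up to, but not including, stop
# in Python 3, range() is lazy, so we wrap it in list() to see the values
my_range = range(10)
print(list(my_range))

In [None]:
# range(start, stop)
my_range = range(1, 11)
print(list(my_range))

In [None]:
# range(start, stop, step)
my_range = range(1, 11, 2)
print(list(my_range))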

#### Strings behave like lists of characters!

We already know that strings are basically text. In Python, strings are indexed in much the same way as lists, which means that we can use the same built-in operations to access and slice them. Unlike lists, however, strings cannot be changed in place: to "update" a string we build a new one.

In [1]:
my_string = "I love Python!"
print(my_string[0]) # "I"
print(my_string[2:6]) # "love"
print(my_string[::-1]) # reverse it!

I
love
!nohtyP evol I
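
One difference from lists is worth seeing in action: a string cannot be modified by assigning to an index. The following sketch (with the illustrative names `text` and `new_text`) shows the error Python raises and the usual workaround of building a new string from slices.

```python
text = "I love Python!"

try:
    text[0] = "U"               # strings do not support item assignment
except TypeError as err:
    print("TypeError:", err)

# Build a new string instead of changing the old one
new_text = "You" + text[1:]
print(new_text)                 # You love Python!
print(text)                     # the original is unchanged
```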


Regular expressions are a powerful tool for working with text and are widely used in programming and data analysis. We will see an example at the end of today's session.

### Tuple and Set

A `tuple` is a data container that is very similar to a list. The main difference is that, unlike a list, a tuple cannot be modified once it has been defined. A tuple is defined using parentheses `( )`.

In [None]:
my_tuple = ("I", "want", "my", "lunch")
print(my_tuple)

In [None]:
my_tuple[3] = "dinner" # raises a TypeError because tuples cannot be modified

A `set` is an unordered collection of unique elements, defined using curly braces `{ }` or the `set()` function. Note: a set does not support indexing, but we can check whether an element is in the set.

In [None]:
uni_set = set(uni)
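
To see what "unique" and "membership" mean in practice, here is a minimal sketch. It uses a short made-up list with one duplicate (`uni_demo`) instead of the full `uni` list, and the names `uni_set_demo` and `literal_set` are likewise illustrative.

```python
uni_demo = ["University of York", "University of Bath", "University of York"]

uni_set_demo = set(uni_demo)                   # duplicates collapse to one entry
print(len(uni_demo), len(uni_set_demo))        # 3 2
print("University of Bath" in uni_set_demo)    # True
print("University of Leeds" in uni_set_demo)   # False

literal_set = {"a", "b", "a"}                  # set literal with curly braces
print(sorted(literal_set))                     # ['a', 'b']
```

Because a set has no fixed order, `sorted()` is used here to obtain a predictable listing.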

## Dictionary

Another essential data structure in Python is the dictionary. Dictionaries are defined with a combination of curly braces (`{ }`) and colons (`:`). The braces define the beginning and end of a dictionary and the colons indicate key-value pairs. A dictionary is essentially a set of `key-value pairs`. 

A dictionary is loosely comparable to a row in an Excel spreadsheet or table, with each `key` as a column name and each `value` as the entry stored in that column.

1. Create Python dictionaries
2. Access, modify, and delete dictionary items
3. Merge dictionaries
4. Loop over dictionary keys, values, and items

#### Create a dictionary

In [8]:
capitals = {'China': 'Beijing',
            'United Kingdom': 'London',
            'France': 'Paris',
            'Italy': 'Rome'}

#### Access dictionary items

Instead of a numerical index, as in a list, a dictionary uses its keys to look up values, which is often more intuitive.

In [22]:
capitals['China']

'Beijing'

#### Add items to dictionary

In [23]:
capitals['United States'] = 'Washington, DC'

In [24]:
capitals

{'China': 'Beijing',
 'France': 'Paris',
 'Italy': 'Rome',
 'United Kingdom': 'London',
 'United States': 'Washington, DC'}

If we ask for a key that does not exist in the dictionary, Python raises a `KeyError`:

In [26]:
capitals['Germany']

KeyError: 'Germany'
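
When a missing key should not stop the program, the dictionary method `get()` offers a gentler lookup: it returns `None`, or a default of our choosing, instead of raising `KeyError`. A minimal sketch, using a small illustrative dictionary (`capitals_demo`) rather than the notebook's `capitals`:

```python
capitals_demo = {'China': 'Beijing', 'France': 'Paris'}

print(capitals_demo.get('China'))               # Beijing
print(capitals_demo.get('Germany'))             # None, and no KeyError
print(capitals_demo.get('Germany', 'unknown'))  # unknown (our chosen default)
```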

#### Check if an item belongs to a dictionary

In [1]:
'Germany' in capitals # Python knows that you are talking about keys

False

In [28]:
'United Kingdom' in capitals

True

#### Combine dictionaries

In [29]:
morecapitals = {'Germany': 'Berlin',
                'Japan': 'Tokyo'}

In [31]:
capitals.update(morecapitals)
capitals

{'China': 'Beijing',
 'France': 'Paris',
 'Germany': 'Berlin',
 'Italy': 'Rome',
 'Japan': 'Tokyo',
 'United Kingdom': 'London',
 'United States': 'Washington, DC'}

#### Delete Key-Value pair

In [32]:
del capitals['United States']
capitals

{'China': 'Beijing',
 'France': 'Paris',
 'Germany': 'Berlin',
 'Italy': 'Rome',
 'Japan': 'Tokyo',
 'United Kingdom': 'London'}

#### Loop over dictionaries

In [33]:
# if we want to know what are the keys we have in the dictionary
# loop through keys
for key in capitals.keys():
    print(key)

United Kingdom
Italy
Japan
China
Germany
France


In [37]:
# if we want to know what values we stored
# loop through values
for value in capitals.values():
    print(value)

London
Rome
Tokyo
Beijing
Berlin
Paris


In [38]:
# or loop over key-value pairs
for key, value in capitals.items():
    print("{}: {}".format(key, value))

United Kingdom: London
Italy: Rome
Japan: Tokyo
China: Beijing
Germany: Berlin
France: Paris


### Comprehensions
In Python programming, especially when dealing with data, there are many cases where we want to loop through a list or dictionary, perform an operation on every element, and then collect the results in a new list or dictionary for future analysis.

We can certainly do this with a traditional `for` loop, but Python offers a more compact approach, comprehensions, which let us write shorter and cleaner code that achieves the same effect.

1. rewrite a loop as a comprehension
2. filter items
3. create a list or a dictionary from a comprehension

#### Create a list using `for` loop

In [41]:
sq = []
for i in range(10):
    sq.append(i**2)
sq

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

#### Create a list using `comprehension`

In [43]:
# inline for loop
sq = [i**2 for i in range(10)]
sq

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

#### Filter elements in a list using `for` loop

In [46]:
sq = []
for i in range(10):
    if i < 5:
        sq.append(i**2)
sq

[0, 1, 4, 9, 16]

#### Filter elements in a list using `comprehension`

In [47]:
# if condition
sq = [i**2 for i in range(10) if i < 5]
sq

[0, 1, 4, 9, 16]

#### Create a dictionary using `comprehension`

In [51]:
i_sq = {i: i**2 for i in range(10) if i < 5}

In [52]:
i_sq

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
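
Comprehensions work on any kind of element, not just numbers. The sketch below applies the same three patterns (transform, filter, build a dictionary) to a short illustrative list of city names; `cities_demo`, `uni_demo`, `long_names` and `name_lengths` are made-up names for this example.

```python
cities_demo = ["Bath", "Lancaster", "York", "Oxford"]

# transform every element
uni_demo = ["University of {}".format(name) for name in cities_demo]

# filter: keep only names longer than four characters
long_names = [name for name in cities_demo if len(name) > 4]

# dictionary comprehension: map each name to its length
name_lengths = {name: len(name) for name in cities_demo}

print(uni_demo)
print(long_names)     # ['Lancaster', 'Oxford']
print(name_lengths)
```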

### TASK: Can you use a comprehension to transpose key and value of the dictionary `capitals`?

# 4. Flow Control/Logical Statements

### Basic Logic

If you recall, a boolean takes one of the two values, `True` or `False` (sometimes 1 or 0). The basic logical statements that we can make are defined using Python built-in **comparison operators**:

- `==`: equal to
- `!=`: NOT equal to
- `<`: less than
- `<=`: less than or equal to
- `>`: greater than
- `>=`: greater than or equal to

In [None]:
1 == 1

In [None]:
2.0 != 2

In [None]:
3 > 3

In [None]:
3 >= 3

#### Boolean Operators

We can also put various comparisons together using `Boolean Operators`:
- `and`: and
- `or`: or
- `not`: not

In [None]:
True and True

In [None]:
True and False

In [None]:
(1 == 0) and (1 < 2)

In [None]:
(1 == 0) or (1 < 2)

In [None]:
not (1 == 0) and (1 < 2)
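
When `not` is mixed with `and` or `or`, it helps to know that `not` binds more tightly than `and`, which in turn binds more tightly than `or`. The sketch below (variable names `a` and `b` are illustrative) uses a pair of comparisons where the placement of parentheses changes the result.

```python
a = not (1 == 1) and (1 > 2)     # read as (not (1 == 1)) and (1 > 2)
b = not ((1 == 1) and (1 > 2))   # extra parentheses make `and` run first

print(a)  # False: (not True) is False, and False and ... is False
print(b)  # True:  (True and False) is False, and not False is True
```

Adding explicit parentheses, as in `b`, is a simple way to make the intended order clear.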

#### Truthiness
As a general rule, containers such as strings, tuples, dictionaries, lists, and sets evaluate to `True` if they contain anything at all and to `False` if they are empty.

In [3]:
bool('')

False

In [4]:
bool('I am not empty')

True

In [5]:
bool([])

False

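This feature becomes very handy when we need to test whether a container or a stored value is empty before using it, so that we can avoid errors. Note that it only works on objects that exist: looking up a missing dictionary key still raises a `KeyError`.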

In [9]:
bool(capitals['China'])

True
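
The same rule extends beyond containers: the number `0` and the special value `None` are also treated as false. The sketch below lists the truth value of a few sample objects and then uses truthiness in an `if` test; `values`, `basket` and `message` are illustrative names.

```python
values = ['', 'text', [], [0], {}, {'a': 1}, (), 0, 1, None]
for value in values:
    print(repr(value), '->', bool(value))

basket = []
if not basket:                 # an empty list is falsy
    message = "basket is empty"
else:
    message = "basket has items"
print(message)
```

Note that `[0]` is truthy: the list contains one element, even though that element is itself falsy.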

### If-statements
We can create segments of code that only execute if a set of `conditions` is met. 

We use if-statements in conjunction with logical statements in order to create `branches` in our code.

An `if branch` gets executed when the `condition` is considered to be `True`. If condition is evaluated as `False`, the if block will simply be skipped unless there is an `else branch`. 

Conditions are made using either logical operators or by using the truthiness of values in Python. An if-statement is defined with a colon and a block of indented text:

In [11]:
# Multiple-branch logical statements

if "Condition_1":         # a non-empty string is truthy, so this branch runs
    print("1")            # take action and skip the remaining branches
elif not "Condition_2":   # checked only if the first condition is False
    print("not 2")        # take action and skip the else branch
else:                     # runs only if none of the conditions above is True
    print("never see me") # fallback action

1


In [12]:
capitals

{'China': 'Beijing',
 'France': 'Paris',
 'Italy': 'Rome',
 'United Kingdom': 'London'}

In [None]:
# Test if data exists
if capitals.get('United States'): # .get() returns None (falsy) if the key is missing
    print('Yes!')
# if not, we can then add US
else:
    capitals['United States'] = 'Washington, DC'

Multiple Conditions:

In [14]:
name = "Liang"
age = 5

if name == "Alice":
    print("How are you today, Alice.")
else:
    if age < 12:
        print("You are not Alice, kiddo!")
    elif (age >= 12) and (age <=100):
        print("You are not Alice, stranger!")
    else:
        print("You are not Alice, vampire!")

You are not Alice, kiddo!


### `for` loop

As we have seen many times, a `for` loop is extremely useful when we need to operate on each data point in a container such as a list or a dictionary.

It is also the natural choice when we want to repeat something a fixed number of times, such as the rounds of an arcade game!
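
Both uses of the `for` loop can be sketched in a few lines. The dictionary `capitals_demo` and the variables `lines`, `lives` and `attempt` below are illustrative and not part of the notebook's earlier cells.

```python
capitals_demo = {'France': 'Paris', 'Italy': 'Rome'}

# 1. visit every item in a container
lines = []
for country, capital in capitals_demo.items():
    lines.append("{} -> {}".format(country, capital))
print(lines)

# 2. repeat an action a fixed number of times
lives = 3
for attempt in range(1, lives + 1):   # attempt takes the values 1, 2, 3
    print("Attempt {} of {}".format(attempt, lives))
```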

### `while` loop

We can make a block of code execute over and over again with a `while` statement. The body block (actions) will be executed as long as the `while` statement's condition is `True`.

In [19]:
spam = 0

while spam < 5:
    print("Hello World")
    spam = spam + 1

Hello World
Hello World
Hello World
Hello World
Hello World


### break/continue statements

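Beware of the `infinite loop`: a `while` loop keeps running for as long as its condition stays true.
To leave or shortcut a loop we use two statements: `break` exits the loop immediately, and `continue` skips the rest of the current iteration and jumps back to the top of the loop.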

In [27]:
while True:
    print("Please type your name.")
    name = input()  # in Python 2, use raw_input() instead
    
    if name == 'Alice':
        break
print("Nice to meet you, {}".format(name))

Please type your name.
Alice
Nice to meet you, Alice


In [31]:
while True:
    print("Please input your username")
    name = input()  # in Python 2, use raw_input() instead
    
    if name != 'Huawei':
        continue
    else:
        print("Hello, master, please input your password (It is a phone)")
        password = input()
        
        if password == 'P20':
            break
print("Access granted")

Please input your username
Huawei
Hello, master, please input your password (It is a phone)
P20
Access granted
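
The interactive examples above need keyboard input. To see `continue` and `break` side by side without typing anything, here is a minimal sketch on a made-up list of numbers (`numbers` and `evens` are illustrative names): odd numbers are skipped, and the loop stops at the first even number above 15.

```python
numbers = [3, 8, 5, 12, 7, 20, 4]
evens = []

for n in numbers:
    if n % 2 == 1:
        continue      # odd number: skip the rest of this iteration
    if n > 15:
        break         # stop the whole loop; later numbers are never seen
    evens.append(n)

print(evens)          # [8, 12]
```

The final `4` is never reached, because `break` ends the loop as soon as `20` is encountered.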


### ===== Your First Python Toy Game =====

In [None]:
import random

secretNumber = random.randint(1, 20)
print("I am thinking of a number between 1 and 20")

print("You have 6 chances")
for guessesTaken in range(1,7):
    print("Take a guess")
    guess = int(input())
    
    # check with stored secretNumber
    if guess < secretNumber:
        print("too low, try again!")
    elif guess > secretNumber:
        print("too high, try again!")
    else:
        break

if guess == secretNumber:
    print("Good job! You guessed my number {} in {} guesses".format(secretNumber, guessesTaken))
else:
    print("Sorry, I was thinking of {}".format(secretNumber))

### Functions

We can enhance the game above by wrapping its code in a function whose arguments let us choose the range of the secret number and the number of guesses allowed:

In [24]:
import random

def myFirstGame(x, y, z):
    # x, y: lower and upper bounds of the secret number
    # z: number of guesses allowed
    secretNumber = random.randint(x, y)
    print("I am thinking of a number between {} and {}".format(x, y))

    print("You have {} chances".format(z))
    
    for guessesTaken in range(1,z+1):
        print("Take a guess")
        guess = int(input())
    
        # check with stored secretNumber
        if guess < secretNumber:
            print("too low, try again!")
        elif guess > secretNumber:
            print("too high, try again!")
        else:
            break

    if guess == secretNumber:
        print("Good job! You guessed my number {} in {} guesses".format(secretNumber, guessesTaken))
    else:
        print("Sorry, I was thinking of {}".format(secretNumber))

In [25]:
myFirstGame(1, 100, 2)

I am thinking of a number between 1 and 100
You have 2 chances
Take a guess
1
too low, try again!
Take a guess
50
too low, try again!
Sorry, I was thinking of 52
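
A function can also hand a result back to its caller with `return`, which makes it easy to test without any keyboard input. As a hedged sketch, the comparison step of the game could be pulled out into a small helper; the name `check_guess` and its return strings are hypothetical and not part of the original game.

```python
def check_guess(guess, secret_number):
    """Return a hint describing how guess compares with secret_number."""
    if guess < secret_number:
        return "too low"
    elif guess > secret_number:
        return "too high"
    return "correct"

# Try the helper with the guesses from the session above (secret number 52)
print(check_guess(1, 52))    # too low
print(check_guess(50, 52))   # too low
print(check_guess(52, 52))   # correct
```

Separating the decision (`check_guess`) from the input and printing keeps each piece short and lets us check the logic on its own.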
