# 3. Python Built-in Data Structures

## List

Python lists are versatile containers (boxes) that are useful in many circumstances. A list, defined using square brackets `[]`, is an ordered collection that can hold any number of Python objects, such as strings, integers, floats, other lists, and more. Each element can be accessed by its numerical index.

#### Todos
1. Create Python list
2. Access, modify, and delete list elements
3. Merge lists
4. Use the slicing syntax to operate on sublists
5. Loop over lists

### Create a list

In [3]:
city = ["Lancaster", "Warwick", "Lancaster", "Bath"]

### Access list elements

Python follows C's convention of numbering positions from zero. If a list has length $n$, the first element is at index $0$ (think of it as a location), the second is at index $1$, and so on. Be careful: the last element is therefore at index $n-1$, not $n$.

To get the **first** element in the above list, we type:

In [4]:
# Python is inspired by C
# follows C's convention that index starts with 0
city[0]

'Lancaster'

How about the **last** element?

In [5]:
city[-1]

'Bath'
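As a small sketch of the indexing rule above, the snippet below checks that the last element sits at index `len(...) - 1`, that negative indices count from the end, and that an index one past the end raises `IndexError`. The list `cities` is a hypothetical copy made for this illustration, so the running `city` example is left untouched.

```python
# Hypothetical copy of the example list, used only for this illustration
cities = ["Lancaster", "Warwick", "Lancaster", "Bath"]

n = len(cities)                   # number of elements
last_by_length = cities[n - 1]    # last element via index n-1
last_by_negative = cities[-1]     # same element via a negative index
second_to_last = cities[-2]       # negative indices count from the end

try:
    cities[n]                     # index n is one past the end
    out_of_range = False
except IndexError:
    out_of_range = True

print(last_by_length, last_by_negative, second_to_last, out_of_range)
```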

#### mix of strings and numbers

In [7]:
mix = [1, [2, 3], 'Python rocks!', True]

#### manipulate lists

In [8]:
# add a single element to the end of the list
city.append('Manchester')
print(city)

['Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester']


In [9]:
# insert an element into a specific location
city.insert(0, "Cambridge")
print(city)

['Cambridge', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester']


In [10]:
# update an element
city[0] = "Oxford"

In [11]:
# add another list
city.extend(['York', 'Edinburgh'])
print(city)

['Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester', 'York', 'Edinburgh']


In [12]:
# or just + sign
city = city + ['Liverpool', 'Glasgow']
print(city)

['Oxford', 'Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester', 'York', 'Edinburgh', 'Liverpool', 'Glasgow']


In [13]:
# delete element by index
del city[0]
print(city)

['Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester', 'York', 'Edinburgh', 'Liverpool', 'Glasgow']


In [14]:
# or by value
city.remove("Glasgow")
print(city)

['Lancaster', 'Warwick', 'Lancaster', 'Bath', 'Manchester', 'Manchester', 'York', 'Edinburgh', 'Liverpool']


In [15]:
# sort list
city.sort()
print(city)

['Bath', 'Edinburgh', 'Lancaster', 'Lancaster', 'Liverpool', 'Manchester', 'Manchester', 'Warwick', 'York']
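One detail worth a quick sketch: `list.sort()` reorders the list in place and returns `None`, while the built-in `sorted()` leaves the original alone and returns a new list. The list `towns` below is a hypothetical example, not part of the session above.

```python
# Hypothetical list used only for this illustration
towns = ["York", "Bath", "Warwick"]

ordered_copy = sorted(towns)        # new sorted list; `towns` is unchanged
print(towns)
print(ordered_copy)

result = towns.sort(reverse=True)   # sorts in place and returns None
print(towns)
print(result)
```

A common slip is writing `towns = towns.sort()`, which replaces the list with `None`.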


#### Slice a list

In [16]:
# slice list by using (:)
sq = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

#### another counterintuitive convention

We can select a group of elements in a list starting from the first element indicated and going up to (**but not including**) the last element indicated. For example:

In [17]:
# to get the first three elements
sq[0:3] # Remember index starts with 0

[0, 1, 4]

We can select everything in a list

In [18]:
sq[:]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

We can select everything before or after a certain point

In [19]:
# starting from index 8 (the ninth element)
sq[8:]

[64, 81, 100]

In [20]:
# another way to get the first three elements
sq[:3]

[0, 1, 4]

We can also add a third component to slice a list, which is the step size we want to take. So instead of taking every single element, we can take every other element. For instance:

In [21]:
# the max index for the list is 10
# a slice such as sq[0:10:2] would exclude the last element,
# so we leave start and stop empty to cover the whole list
sq[::2]

[0, 4, 16, 36, 64, 100]
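To make the `start:stop:step` form concrete, here is a minimal sketch using a hypothetical list `squares` that mirrors `sq`. It shows the excluded stop index, a non-zero start with a step, a negative step, and a negative start.

```python
# Hypothetical list with the same values as `sq`
squares = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

every_other_to_10 = squares[0:10:2]   # stop index 10 is excluded, so 100 is dropped
every_third = squares[1::3]           # indices 1, 4, 7, 10
reversed_copy = squares[::-1]         # a negative step walks backwards
last_two = squares[-2:]               # from the second-to-last element to the end

print(every_other_to_10)
print(every_third)
print(reversed_copy)
print(last_two)
```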

### loop through lists

We can create an empty list for university names and loop through all the cities in the list `city` and add "University of" to city names:

In [22]:
uni = []
for i in range(len(city)): # len() returns the number of elements
    uni.append("University of {}".format(city[i]))

print(uni)

['University of Bath', 'University of Edinburgh', 'University of Lancaster', 'University of Lancaster', 'University of Liverpool', 'University of Manchester', 'University of Manchester', 'University of Warwick', 'University of York']
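The index-based loop above works, but when the index itself is not needed we can loop over the elements directly. A minimal sketch, assuming a hypothetical short list `towns` in place of `city`:

```python
# Hypothetical list used only for this illustration
towns = ["Bath", "Edinburgh", "York"]

universities = []
for town in towns:                  # `town` is each element in turn
    universities.append("University of {}".format(town))

print(universities)
```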


We can also do mathematical operations on each element in a list

In [23]:
import math

for value in sq:
    print("square root for {} is {}".format(value, int(math.sqrt(value))))

square root for 0 is 0
square root for 1 is 1
square root for 4 is 2
square root for 9 is 3
square root for 16 is 4
square root for 25 is 5
square root for 36 is 6
square root for 49 is 7
square root for 64 is 8
square root for 81 is 9
square root for 100 is 10


In [24]:
# even better
for index, value in enumerate(sq):
    print("# {} element ==> {}".format(index+1, value))

# 1 element ==> 0
# 2 element ==> 1
# 3 element ==> 4
# 4 element ==> 9
# 5 element ==> 16
# 6 element ==> 25
# 7 element ==> 36
# 8 element ==> 49
# 9 element ==> 64
# 10 element ==> 81
# 11 element ==> 100


#### Useful built-in function `range()` to generate sequences of numbers

In [None]:
my_range = range(10) # <= a total of 10 numbers

for element in my_range:
    print(element)
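A short sketch of the other forms `range()` accepts. In Python 3, `range()` produces its numbers lazily, so the examples wrap it in `list()` to show the values; the variable names are illustrative only.

```python
first_ten = list(range(10))          # 0 up to, but not including, 10
two_to_seven = list(range(2, 8))     # start and stop
evens = list(range(0, 10, 2))        # start, stop, and step
countdown = list(range(5, 0, -1))    # a negative step counts down

print(first_ten)
print(two_to_seven)
print(evens)
print(countdown)
```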

#### Strings behave like lists of characters!

We already know that strings are basically text. In Python, strings are indexed in much the same way as lists, which means we can use the same built-in operations to access and slice strings. Unlike lists, strings cannot be modified in place; any "update" creates a new string.

In [None]:
my_string = "I love Python!"
print(my_string[0]) # "I"
print(my_string[2:6]) # "love"
print(my_string[::-1]) # reverse it!
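One difference from lists is worth sketching: a string cannot be changed in place. Assigning to an index raises `TypeError`, so we build a new string instead. The variable `text` is a hypothetical stand-in for `my_string`.

```python
text = "I love Python!"

try:
    text[0] = "U"                 # strings do not support item assignment
    modified_in_place = True
except TypeError:
    modified_in_place = False

new_text = "You" + text[1:]       # build a new string from a slice
shouting = text.upper()           # string methods also return new strings

print(modified_in_place)
print(new_text)
print(shouting)
print(text)                       # the original is unchanged
```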

Regular Expression is a powerful engine to deal with text and is widely used in programming and data analysis. We will see an example at the end of today's session.

### Tuple and Set

A `tuple` is a data container very similar to a list. The main difference is that a tuple cannot be modified once it is defined, whereas a list can. A tuple is defined using parentheses `( )`.

In [None]:
my_tuple = ("I", "want", "my", "lunch")
print(my_tuple)

In [None]:
my_tuple[3] = "dinner" # raises TypeError: a tuple cannot be updated
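A minimal sketch of working with this restriction: the failed assignment is caught so the cell keeps running, a new tuple is built by concatenation, and the original is unpacked into separate names. The names `meal` and `updated_meal` are hypothetical.

```python
meal = ("I", "want", "my", "lunch")

try:
    meal[3] = "dinner"                      # not allowed for tuples
    error_message = None
except TypeError as err:
    error_message = str(err)

updated_meal = meal[:3] + ("dinner",)       # build a new tuple instead
subject, verb, owner, food = meal           # unpack into four variables

print(error_message)
print(updated_meal)
print(food)
```

Note the trailing comma in `("dinner",)`; without it, the parentheses would just group a string.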

A `set` is an unordered collection of unique elements, defined using curly braces `{ }` or the `set()` function. Note: a `set` does not support indexing; instead, we can check whether an element is in the set.

In [None]:
uni_set = set(uni)
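As a rough illustration of uniqueness and membership tests, the sketch below uses a hypothetical list `names` with one duplicate rather than the full `uni` list.

```python
# Hypothetical list with a duplicate entry
names = ["University of Lancaster", "University of Bath", "University of Lancaster"]

unique_names = set(names)                        # duplicates collapse to one entry
count_before = len(unique_names)

has_bath = "University of Bath" in unique_names  # membership test
has_york = "University of York" in unique_names

unique_names.add("University of York")           # adds a new element
unique_names.add("University of Bath")           # already present, so no effect
count_after = len(unique_names)

print(count_before, has_bath, has_york, count_after)
```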

## Dictionary

Another essential data structure in Python is the dictionary. Dictionaries are defined with a combination of curly braces (`{ }`) and colons (`:`). The braces define the beginning and end of a dictionary and the colons indicate key-value pairs. A dictionary is essentially a set of `key-value pairs`. 

A dictionary is loosely similar to an Excel spreadsheet or table, with each `key` acting as a column name and each `value` as the entry stored under it.

### Todos
1. Create Python dictionaries
2. Access, modify, and delete dictionary items
3. Merge dictionaries
4. Loop over dictionary keys, values, and items

#### Create a dictionary

In [None]:
capitals = {'China': 'Beijing',
            'United Kingdom': 'London',
            'France': 'Paris',
            'Italy': 'Rome'}

#### Access dictionary items

Instead of using numerical index as we did in a list, we can use a more intuitive approach in a dictionary.

In [None]:
capitals['China']

#### Add items to dictionary

In [None]:
capitals['United States'] = 'Washington, DC'

In [None]:
capitals

If we look up a key that does not exist in the dictionary, we get an error (a `KeyError`):

In [None]:
capitals['Germany']
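When a missing key should not stop the program, the dictionary method `get()` returns `None` (or a default we supply) instead of raising `KeyError`. A small sketch with a hypothetical dictionary `capital_of`:

```python
# Hypothetical dictionary used only for this illustration
capital_of = {"China": "Beijing", "France": "Paris"}

missing = capital_of.get("Germany")                 # None, no error
fallback = capital_of.get("Germany", "unknown")     # our own default value
present = capital_of.get("France", "unknown")       # existing keys behave as usual

try:
    capital_of["Germany"]                           # square brackets still raise
    raised = False
except KeyError:
    raised = True

print(missing, fallback, present, raised)
```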

#### Check if an item belongs to a dictionary

In [None]:
'Germany' in capitals # Python knows that you are talking about keys

In [None]:
'United Kingdom' in capitals

#### Combine dictionaries

In [None]:
morecapitals = {'Germany': 'Berlin',
                'Japan': 'Tokyo'}

In [None]:
capitals.update(morecapitals)
capitals

#### Delete Key-Value pair

In [None]:
del capitals['United States']
capitals

#### Dictionaries are useful with `loop`

In [None]:
# if we want to know what are the keys we have in the dictionary
# loop through keys
for key in capitals.keys():
    print(key)

In [None]:
# if we want to know what values we stored
# loop through values
for value in capitals.values():
    print(value)

In [None]:
# or loop over pairs
for key, value in capitals.items():
    print(key, value)

### Comprehensions
In Python programming, especially when dealing with data, we often want to loop through a list or dictionary, perform an operation on every element, and collect the results in a new list or dictionary for further analysis.

We can certainly do this with a traditional `for` loop, but Python offers a more compact approach, comprehensions, which let us write shorter and cleaner code that achieves the same effect.

#### Todos
1. Rewrite a loop as a comprehension
2. Filter items
3. Create a list or a dictionary from a comprehension

#### Create a list using `for` loop

In [None]:
sq = []
for i in range(10):
    sq.append(i**2)
sq

#### Create a list using `comprehension`

In [None]:
# inline for loop
sq = [i**2 for i in range(10)]
sq

#### Filter elements in a list using `for` loop

In [None]:
sq = []
for i in range(10):
    if i < 5:
        sq.append(i**2)
sq

#### Filter elements in a list using `comprehension`

In [None]:
# if condition
sq = [i**2 for i in range(10) if i < 5]
sq

#### Create a dictionary using `comprehension`

In [None]:
i_sq = {i: i**2 for i in range(10) if i < 5}

In [None]:
i_sq
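Comprehensions are not limited to numbers from `range()`; they can run over any existing container. A brief sketch, assuming a hypothetical list `towns`, builds a dictionary of name lengths and a filtered, transformed list:

```python
# Hypothetical list used only for this illustration
towns = ["Bath", "York", "Lancaster", "Warwick"]

# dictionary comprehension: town name -> number of characters
name_length = {town: len(town) for town in towns}

# list comprehension with a filter and a transformation
short_upper = [town.upper() for town in towns if len(town) <= 4]

print(name_length)
print(short_upper)
```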

### TASK: Can you use a comprehension to transpose key and value of the dictionary `capitals`?

# 4. Flow Control/Logical Statements

### Basic Logic

If you recall, a boolean takes one of the two values, `True` or `False` (sometimes 1 or 0). The basic logical statements that we can make are defined using Python built-in **comparison operators**:

- `==`: equal to
- `!=`: NOT equal to
- `<`: less than
- `<=`: less than or equal to
- `>`: greater than
- `>=`: greater than or equal to

In [None]:
1 == 1

In [None]:
2.0 != 2

In [None]:
3 > 3

In [None]:
3 >= 3

#### Boolean Operators

We can also put various comparisons together using `Boolean Operators`:
- `and`: and
- `or`: or
- `not`: not

In [None]:
True and True

In [None]:
True and False

In [None]:
(1 == 0) and (1 < 2)

In [None]:
(1 == 0) or (1 < 2)

In [None]:
not (1 == 0) and (1 < 2)
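The last expression depends on operator precedence: `not` binds more tightly than `and`, which in turn binds more tightly than `or`. The sketch below (variable names are illustrative) compares expressions with and without extra parentheses to show where the grouping matters.

```python
a = not (1 == 0) and (1 < 2)      # read as (not False) and True
b = not ((1 == 0) and (1 < 2))    # read as not (False and True)

c = not (1 == 1) or (1 < 2)       # read as (not True) or True
d = not ((1 == 1) or (1 < 2))     # read as not (True or True)

print(a, b, c, d)
```

Here `a` and `b` happen to agree, but `c` and `d` differ, so adding parentheses is the safer habit.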

#### Truthiness
As a general rule, containers like strings, tuples, dictionaries, lists, and sets evaluate to `True` if they contain anything at all and `False` if they are empty.

In [None]:
bool('')

In [None]:
bool('I am not empty')

In [None]:
bool([])
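The same rule extends to numbers and `None`: zero and `None` are falsy, while non-zero numbers are truthy. A quick sketch over a hypothetical list of sample values:

```python
# Sample values chosen for illustration
values = [0, 1, 0.0, None, [], [0], {}, {"a": 1}, "", " "]

truthiness = [bool(value) for value in values]

for value, flag in zip(values, truthiness):
    print(repr(value), "->", flag)
```

Note that `[0]` and `" "` are truthy: a list holding a zero, or a string holding a space, is not empty.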

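This feature is handy when we need to test whether a value is empty (for example, an empty string or list) before using it. Note that it does not test whether a key or variable exists: looking up a missing key still raises a `KeyError`.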

In [None]:
bool(capitals['China'])

### If-statements
We can create segments of code that only execute if a set of `conditions` is met. 

We use if-statements in conjunction with logical statements in order to create `branches` in our code.

An `if branch` gets executed when the `condition` is considered to be `True`. If condition is evaluated as `False`, the if block will simply be skipped unless there is an `else branch`. 

Conditions are made using either logical operators or by using the truthiness of values in Python. An if-statement is defined with a colon and a block of indented text:

In [None]:
# Multiple-branch logical statements

if "Condition_1":         # First check Condition_1 (a non-empty string is truthy)
    print("1")            # If met, take action and finish
elif not "Condition_2":   # If not met, check the second condition
    print("not 2")        # If met, take action and finish
else:                     # If none of the conditions above is met
    print("never see me") # take action and finish the operation

In [None]:
capitals

In [None]:
# Test if data exists
if 'United States' in capitals: # membership test on the keys
    print('Yes!')
# if not, we can then add US
else:
    capitals['United States'] = 'Washington, DC'

Multiple Conditions:

In [None]:
name = "Liang"
age = 5

if name == "Alice":
    print("How are you today, Alice.")
else:
    if age < 12:
        print("You are not Alice, kiddo!")
    elif (age >= 12) and (age <=100):
        print("You are not Alice, stranger!")
    else:
        print("You are not Alice, vampire!")

### `for` loop

As we have seen many times, a `for` loop is extremely useful when we need to operate on each data point in a container such as a list or a dictionary.

It is also useful when we only want to repeat something a fixed number of times, as in an arcade game!
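A minimal sketch of the "fixed number of times" case, using `range()` to count attempts. The variable `lives` and the message text are hypothetical, loosely in the spirit of an arcade game.

```python
lives = 3            # hypothetical number of attempts
messages = []

for attempt in range(1, lives + 1):      # runs exactly `lives` times
    messages.append("Attempt {} of {}".format(attempt, lives))

for message in messages:
    print(message)
```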

### `while` loop

We can make a block of code execute over and over again with a `while` statement. The body block (actions) will be executed as long as `while` statement's condition is True.

In [None]:
spam = 0

while spam < 5:
    print("Hello World")
    spam = spam + 1

### break/continue statements

Beware of the `infinite loop`: a `while` loop can go on and on...
We need statements to control loops: `break` exits the loop immediately, and `continue` skips to the next iteration.

In [None]:
while True:
    print("Please type your name.")
    name = input()  # Python 3; raw_input() exists only in Python 2
    
    if name == 'Alice':
        break
print("Nice to meet you, {}".format(name))

In [None]:
while True:
    print("Please input your username")
    name = input()  # Python 3; raw_input() exists only in Python 2
    
    if name != 'Huawei':
        continue
    else:
        print("Hello, master, please input your password (It is a phone)")
        password = input()
        
        if password == 'P20':
            break
print("Access granted")
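`break` and `continue` work the same way inside a `for` loop. The sketch below avoids keyboard input so it can be run as is; the list `numbers` and the threshold of 15 are made up for illustration.

```python
# Hypothetical data used only for this illustration
numbers = [3, 8, 5, 12, 7, 20, 4]

evens_seen = []
for number in numbers:
    if number % 2 == 1:
        continue             # skip odd numbers and move to the next item
    if number > 15:
        break                # stop the whole loop at the first large even number
    evens_seen.append(number)

print(evens_seen)
```

The final `4` is never examined, because `break` ends the loop when it reaches `20`.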

### ===== Your First Python Toy Game =====

In [25]:
# Guess a number
import random

secretNumber = random.randint(1, 20)
print("I am thinking of a number between 1 and 20")

print("You have 6 chances")
for guessesTaken in range(1,7):
    print("Take a guess")
    guess = int(input())
    
    # check with stored secretNumber
    if guess < secretNumber:
        print("too low, try again!")
    elif guess > secretNumber:
        print("too high, try again!")
    else:
        break

if guess == secretNumber:
    print("Good job! You guessed my number {} in {} guesses".format(secretNumber, guessesTaken))
else:
    print("Sorry, I was thinking of {}".format(secretNumber))

I am thinking of a number between 1 and 20
You have 6 chances
Take a guess
5
too low, try again!
Take a guess
15
too high, try again!
Take a guess
10
too low, try again!
Take a guess
12
too low, try again!
Take a guess
13
too low, try again!
Take a guess
14
Good job! You guessed my number 14 in 6 guesses


### Functions

An enhanced version of the above game puts the code block in a function whose arguments we can set:

In [None]:
import random

def myFirstGame(x, y, z):
    secretNumber = random.randint(x, y)
    print("I am thinking of a number between {} and {}".format(x, y))

    print("You have {} chances".format(z))
    
    for guessesTaken in range(1,z+1):
        print("Take a guess")
        guess = int(input())
    
        # check with stored secretNumber
        if guess < secretNumber:
            print("too low, try again!")
        elif guess > secretNumber:
            print("too high, try again!")
        else:
            break

    if guess == secretNumber:
        print("Good job! You guessed my number {} in {} guesses".format(secretNumber, guessesTaken))
    else:
        print("Sorry, I was thinking of {}".format(secretNumber))

In [None]:
myFirstGame(1, 100, 2)
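Functions can also hand a result back with `return` instead of printing it, which makes them easier to reuse and check. As a sketch only, the hypothetical helper `check_guess` below isolates the comparison step of the game; it is not part of the original game code.

```python
def check_guess(guess, secret):
    """Compare a guess with the secret number and return a short verdict."""
    if guess < secret:
        return "too low"
    elif guess > secret:
        return "too high"
    return "correct"


# Try the helper with a fixed secret number, no keyboard input needed
verdicts = [check_guess(g, 14) for g in (5, 15, 14)]
print(verdicts)
```

Because the helper takes no keyboard input, it can be tried with fixed numbers before it is wired into the interactive loop.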