# Introduction to Python - Lecture 04 - 18 July 2022
## Story so far...
+ primitive python objects / data-types and operations (numbers, strings)
+ logical / boolean operators
+ variables and variable naming conventions (pep8, reserved keywords)
+ expressions and simple statements (assignment)
+ misc (user input, comments, mutability, terminology)
+ Compound statements: if/else conditionals
<br />


# Today's topics

## Sequential data types
+ Introduce composite object types (data structures) -> a way to organize data for processing
    + lists  
    + dictionaries
    + sets
    + tuples
    
## Assert statements 
+ These are at the end of each code block in your assignments 


<br />


# Sequential Data Types

It is often necessary to group information together. This is the function of sequential data types.
The primary difference between different sequential types is how data is accessed and stored.

## Strings

+ Strings are sequences of characters.
+ Each character in a string is assigned a specific index.
  + The first index is always **0**
  + This always corresponds to the leftmost character
  + Each subsequent character will have an index one greater than the previous index.
 
Eg:


String: 'ABCDEF'

|String|A|B|C|D|E|F|
|------|-|-|-|-|-|-|
|Index |0|1|2|3|4|5|

Each character can be accessed using the relevant index placed in square brackets [] after the string.

```python
'ABCDEF'[index]
```

In [None]:
# variable with index 

my_name = "Genevieve"
my_name[0]

my_name[0:8]

print(len(my_name))


Don't confuse **length** with **index**.

Eg:

'ABCDEF' **has indices from 0 to 5**, but calling `len('ABCDEF')` returns ***6***. The last valid index is always one less than the length.


|String|A|B|C|D|E|F|
|------|-|-|-|-|-|-|
|Index |0|1|2|3|4|5|
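
The relationship between length and last index can be checked directly. This is a minimal sketch; the variable names `length`, `last_index` and `message` are only for illustration.

```python
sequence = 'ABCDEF'

length = len(sequence)        # number of characters: 6
last_index = length - 1       # highest valid index: 5

print(sequence[last_index])   # F
print(sequence[-1])           # F, negative indices count from the right

# Indexing at len(sequence) is one past the end and raises IndexError.
try:
    sequence[length]
except IndexError as error:
    message = str(error)
    print('IndexError:', message)
```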



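A slice has the form `[start:end]`; it includes `start` but excludes `end`. Leaving the start blank begins the slice at the first character, and leaving the end blank continues it through the last character.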

```python
'ABCDEF'[start:]
'ABCDEF'[:end]
```

In [None]:
# len vs. index 
name = "Genevieve"
print(len(name)) 
name[0:9] # remember, "up to, but not including"


In [None]:
name[0:]

In [None]:
name[:9]

#### Practice

In [None]:
# Print B
sequence = 'ABCDEF'

print(sequence[1])


In [None]:
# Print E

sequence = 'ABCDEF'
print(sequence[4])


In [None]:
# Print CD

sequence = 'ABCDEF'
print(sequence[2:4])


In [None]:
# Print ABC

sequence = 'ABCDEF'
print(sequence[0:3])




In [None]:
# Print DEF

sequence = 'ABCDEF'
print(sequence[3:])



In [None]:
# Spell out a name using letters from the ABC string (this prints a list of letters)
sequence = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

subseq = [sequence[6], sequence[4], sequence[13], sequence[4], sequence[21], sequence[8], sequence[4], sequence[21],sequence[4]]
print(subseq)




# Lists

+ Another sequence data type (like strings) that stores a sequence of objects. For example:
    ```python
    [1, 2, 3, 4, 5]
    ```
+ Elements in a list have position and order, like what we saw with strings

|Index |0|1|2|3|4|
|------|-|-|-|-|-|
|List Values |1|2|3|4|5|

+ Elements / items / components of a list can be of **any type**, including **mixed**.  
    ```python
    [1, 2, 'a', [3, 4]]
    ```
   
   
|Index |0|1|2|3|
|------|-|-|-|-|
|List Values |1|2|'a'|[3, 4]|
|Data Type |Int|Int|Str|List|


+ Some examples from real world:
    - List of employees in a company
    - List of genes associated with a disease
    - List of book recommendations for a user
    - List of items in an order basket
    - List of all Citi Bike stations in the city [link](https://gbfs.citibikenyc.com/gbfs/en/station_information.json)
<br />
+ **Key characteristics**
    - Elements have position and order (**ordered collection**)
    - Elements can be heterogeneous (**arbitrarily typed**)  
    - Lists can expand or contract dynamically
    - Lists can be single- or multi-dimensional (lists inside lists)
<br />
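
To make the last two characteristics concrete, here is a small sketch of a list of lists that grows and shrinks. The `grid` data is hypothetical, and `append` and `pop` are covered later in this lecture.

```python
# A hypothetical 2 x 3 grid stored as a list of rows.
grid = [[1, 2, 3],
        [4, 5, 6]]

first_row = grid[0]       # the whole first row
value = grid[1][2]        # row 1, column 2
print(first_row, value)

grid.append([7, 8, 9])    # the list grows: now 3 rows
grid.pop(0)               # the list shrinks: back to 2 rows
print(grid)
```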

In [None]:
# Example of a list 

diabetes_tests = ["Fasting plasma glucose (FPG) test", "A1C test", "Random plasma glucose (RPG) test"]
print(len(diabetes_tests))

print(diabetes_tests[0])
print(diabetes_tests[1].title())

print(diabetes_tests[0:2])

print(diabetes_tests[-1])

# Common List operations:
+ Create
+ Access elements or chunks
+ Modify elements or chunks
+ Check membership of an element
+ Find position / index of a specific element
+ Traverse through the list and do something
+ Make it bigger / smaller (add and remove elements)
+ Sort / reverse
+ ...

```python
help(list)
help(list.index)
```

+ Some generic operations
    - len(x), sum(x), max(x), etc.  
<br />
+ User-defined operations (will be covered later - e.g., list comprehension, traversing with a for loop)
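
The generic operations listed above work on any list of numbers. A minimal sketch, using made-up `scores`:

```python
scores = [72, 95, 88, 61]   # hypothetical values

print(len(scores))   # 4
print(sum(scores))   # 316
print(max(scores))   # 95
print(min(scores))   # 61

# Built-ins combine naturally, for example to compute a mean.
average = sum(scores) / len(scores)
print(average)       # 79.0
```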

In [None]:
# help(list)

help(list.index) # help on a particular method 


# Lists: Create


```python
x = [1, 2, 3, 4, 5]        # direct assignment
y = [1, 'a', [1, 2, 3]]

z = []                     # creates an empty list

print(type(x), x)
print(type(y), y)

```

In [None]:
# direct assignment 

students = ['Brendan','May','Wally','Steven','Max']
print(students)

# creation with the list() constructor
x = list(("apple", "banana", "orange"))
print(x)


# Lists: Access and Modify
+ All sequences (lists, strings, ...) support two basic access operations:
    - Indexing
    - Slicing
    

Let's take this list of students for example: 

In [None]:
students = ['Brendan','May','Wally','Steven','Max']

In [None]:
# indexing starts with '0'

print(students[0])


In [None]:
# index 1

print(students[1])

In [None]:
# negative indices go backwards

print(students[-1])



In [None]:
# out of range raises IndexError: You're responsible to respect list length
print(students[10])


In [None]:
# slicing syntax: [start_idx : stop_idx]; excludes stop_idx; 

print(students[1:3])


In [None]:
# just like we saw in strings

print(students[3:])  # from index 3 through the end
print(students[:4])  # from the start up to, but not including, index 4



In [None]:
print(students[1::2])  # every-other-element



In [None]:
print(students[::2])   # step_size is optional; every other one starting with the first

In [None]:
print(students[::-2])  # every other one starting with the last

In [None]:
# my_list[start_idx : stop_idx]
# fourth from the back to the end
# '5' is both index 4 and -1 

students = ['1','2','3','4','5']

sub_list = students[-4::]

print(sub_list)

# Lists are mutable (unlike strings)

```python
students = ['Brendan','May','Wally','Steven']
print(id(students), students)

students[0] = 'New Name'
students[1:3] = ['Name1', 'Name2']    # slice reassignment
print(id(students), students)
```

In [None]:
# mutability with id(): the id stays the same after the changes
students = ['Brendan','May','Wally','Steven']
print(id(students), students)

students[0] = "New Name"
students[1:3] = ["Name1", "Name2"]
print(id(students), students)
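
For contrast, the same kind of assignment fails on a string. This sketch shows the error and the usual workaround of building a new string; the example values are only illustrative.

```python
name = 'Brendan'
try:
    name[0] = 'b'          # strings do not support item assignment
except TypeError as error:
    print('TypeError:', error)

# To "change" a string, build a new one and rebind the variable.
name = 'b' + name[1:]
print(name)                # brendan

students = ['Brendan', 'May']
students[0] = 'brendan'    # works: the list is modified in place
print(students)
```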

# Lists: Membership
+ <font color='blue'>**in**</font> operator, similar to string type
+ Lets us do some "logical checks"

```python
students = ['Brendan','May','Wally','Steven']
print('Brendan' in students)      # boolean expressions
print('Roxanne' in students)      # evaluate to True/False
print('Wally' not in students)    # evaluates to False
```

In [None]:
# Example

students = ['Brendan','May','Wally','Steven']

print('Brendan' in students)
print('Roxanne' in students)
print('Wally' not in students) 



# Lists: index of a specific element

+ `list.index()` lets you find *where* in a list an element is

```python
students = ['Brendan','May','Wally','Steven']
print(students.index('Brendan'))
print(students.index('Roxanne'))          # ValueError exception
```

In [None]:
# index
students = ['Brendan','May','Wally','Steven']
print(students.index("Brendan"))
print(students.index('Roxanne')) 


In [None]:
# if statement 

students = ['Brendan','May','Wally','Steven']
name_to_find = "Roxanne"

if name_to_find in students:
  results = students.index(name_to_find)
  print(results)
else: 
  print(name_to_find + " not in list")


# Lists: Add / remove elements

```python
students = ['Brendan','May','Wally','Steven']
students.append('Norman')
print(students)
```

```python
students.extend(['Tate','Liza'])     # or students += ['Tate', 'Liza']
print(students)
```

```python
students.insert(0, "Katie")     # insert at index 0 (the front of the list)
print(students)
```

```python
students.pop()     # Will remove the last element 
print(students)
```

```python
students.pop(2)    # Will remove element of the desired index
print(students)
```

```python
students.pop(students.index('May'))    # What will this do?
print(students)
```




In [None]:
students = ['Brendan','May','Wally','Steven']

students.append("Norman")
print(students)

In [None]:
students.extend(["Tate", "Liza"])
print(students)

In [None]:
students.insert(0, "Katie")
print(students)

In [None]:
students.pop()
print(students)

In [None]:
students.pop(2)

In [None]:
#students.pop(students.index('Steven'))

print(students)



In [None]:
list1 = ['Brendan','May','Wally','Steven']
list2 = ['Brendan','May','Wally','Steven']

In [None]:
print(list1 + list2)

In [None]:

list1 += list2 # concatenate and update list1 
print(list1)
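
The difference between `+` and `+=` shows up when a second name refers to the same list. A minimal sketch; `alias` and `combined` are names introduced here for illustration.

```python
list1 = ['Brendan', 'May']
list2 = ['Wally', 'Steven']
alias = list1                 # a second name for the same list object

combined = list1 + list2      # builds a new list; list1 is untouched
print(list1)                  # ['Brendan', 'May']

list1 += list2                # extends list1 in place
print(alias)                  # alias sees the change too
print(alias is list1)         # True
```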

# List Exercise - Breakout room

In [5]:
# Create a list of objects (e.g., names, values, equipment, etc)
my_objects = ['roller skates', 'kayak', 'tennis racket', 'soldering iron']

# Use insert() to add something to the beginning of your list
my_objects.insert(0,'paint brushes')

# Use insert() to add something to the middle of your list
my_objects.insert(3, 'picture frames')

# Use append to add something to the end of your list 
my_objects.append('car washing sponge')

# use pop() to remove an item 
my_objects.pop(0)

print(my_objects)



['roller skates', 'kayak', 'picture frames', 'tennis racket', 'soldering iron', 'car washing sponge']


## In-place operations

Because lists are mutable, some operations change the original list in place, while others leave it untouched and return a new list.


In [None]:

x = [5, 5, 2, 6, 1, 9, 8, 3]
print('---ORIGINAL DATA---')
print(id(x), x)

# Sort two ways
x.sort()        # in place: reorders x itself and returns None
y = sorted(x)   # returns a new sorted list with a different id

print()
print('---SORTED BY x.sort()---')
print(id(x), x)
print()
print('---SORTED BY sorted()---')
print(id(y), y)


# Try help commands like this

# help(x.sort)
#help(sorted) 
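
A common pitfall with in-place methods is assigning their return value. The sketch below shows what `sort()` returns and how `sorted()` leaves its input alone; the variable names are illustrative.

```python
x = [5, 2, 9, 1]

result = x.sort()      # sorts x in place and returns None
print(result)          # None
print(x)               # [1, 2, 5, 9]

z = [5, 2, 9, 1]
descending = sorted(z, reverse=True)   # new list; z keeps its order
print(z)               # [5, 2, 9, 1]
print(descending)      # [9, 5, 2, 1]
```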


# Tuples

Tuples and lists are very similar:
  + Both store values
  + Both can hold different data types
  + Both are accessed in the same way
  
The main difference between them is that tuples are immutable.

This means that once a tuple is created, its values cannot be changed, added to, or removed.

There are performance benefits to this:
  + Lists usually reserve more memory than they currently use
    + This allows new elements to be added without reallocating the list every time (once the spare capacity is used up, the list is resized)
  + Tuples are a fixed size, so the memory requirement is known in advance
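
As a rough check of the memory point above, `sys.getsizeof` reports the size of the container object itself. This sketch assumes CPython; exact byte counts vary by version and platform, so none are shown.

```python
import sys

as_list = ['John', 'Doe', 78]
as_tuple = ('John', 'Doe', 78)

# getsizeof measures the container only, not the objects stored inside it.
list_size = sys.getsizeof(as_list)
tuple_size = sys.getsizeof(as_tuple)

print('list :', list_size, 'bytes')
print('tuple:', tuple_size, 'bytes')
```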
  
Another use for tuples is as keys for dictionaries, which we will discuss next.
  + This works because tuples are immutable (as long as every element inside the tuple is immutable too)
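
A short preview of tuples as dictionary keys, using hypothetical coordinate data. Dictionary syntax is explained in the next section.

```python
# Hypothetical lookup of city names by (latitude, longitude).
locations = {(40.7, -74.0): 'New York', (51.5, -0.1): 'London'}
print(locations[(40.7, -74.0)])

# A list is mutable, so Python refuses to use it as a key.
try:
    bad = {[40.7, -74.0]: 'New York'}
except TypeError as error:
    key_error_message = str(error)
    print('TypeError:', key_error_message)
```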
  
### Defining a tuple
Where lists are defined using \[ \] tuples are defined using ()
```python
t = ('John', 'Doe', 78)
```
Alternatively, they can be defined using the tuple constructor, which takes a single iterable such as a list:
```python
t = tuple(['John', 'Doe', 78])
```


In [None]:
t = ('John', 'Doe', 78)
print(t)

## Tuples are **immutable** 
### Attempting to assign a value to a tuple will result in a TypeError

```python
t = ('John', 'Doe', 78)
print(t[0])
t[0] = 'Jane'
```


In [None]:
t = ('John', 'Doe', 78)
print(t[0])
t[0] = 'Jane'

### Tuples allow for mixed types
#### Even other tuples ...

```python
t = (['John', 'Doe'], 72, ('john@doe.com', 'retired'))
```

But, there may be an easier way to store this ...


---
# Dictionaries

+ Consist of a **set of mappings** between _**<font color='blue'>unique</font>**_ **keys** and their **values**.

#### Basic syntax:
<font color='magenta'>**\{**</font> **key1**: value1, **key2**: value2, ...  <font color='magenta'>**}**</font>
   
```python
# Example:
genetic_code = {'uuu': 'phe', 'uua': 'leu', 'aug': 'met', 'uaa': 'stop'}
```

**Comparison with Lists**
+ Lists are ordered: the order in which elements are added is the order in which they are stored
    + Access by position/index
        + Ex. letters = ['a', 'b', 'c', 'd', 'e', 'f']
        + letters[0] is 'a' etc.
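+ Dictionaries have no positional index (since Python 3.7 they remember insertion order, but items are still looked up by key, not by position)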
    + Access by key
        + Ex. dict_ = {'key1': 'a', 'key2': 'b', 'key3': 'c'}
        + dict_['key1'] is 'a'
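
The comparison above can be tried side by side. A minimal sketch using the same example values; `missing_key` is a name introduced here for illustration.

```python
letters = ['a', 'b', 'c']
dict_ = {'key1': 'a', 'key2': 'b', 'key3': 'c'}

print(letters[0])        # a  (access by position)
print(dict_['key1'])     # a  (access by key)

# 0 is not a key in dict_, so a positional-style lookup raises KeyError.
try:
    dict_[0]
except KeyError as error:
    missing_key = error.args[0]
    print('KeyError:', missing_key)

print(list(dict_))       # keys in insertion order (Python 3.7+)
```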

The association between a **key** and a **value** is often referred to as a **key**-**value** pair or an **item**.


#### Keys

+ must be immutable, or more precisely hashable (string, integer, float, tuple of immutable values)
+ must be unique
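
Uniqueness means that repeating a key does not create a second entry; the later value replaces the earlier one. A small sketch with made-up values:

```python
codes = {'uuu': 'phe', 'uua': 'leu', 'uuu': 'PHE'}   # 'uuu' appears twice

print(codes)          # {'uuu': 'PHE', 'uua': 'leu'}
print(len(codes))     # 2
```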


#### Values
+ Can be of any type, mutable or immutable, simple or composite 
    + primitives (single-character string: 'a', integer: 0, float: 3.4)
    + sequential and composite types (string: 'asd', list: [0, 1, 2], another dictionary: {'key':'value'}, tuple: (0, 1, 2))
    + user-defined types (discussed later) (functions, classes, objects etc.)


### Some Real World Examples

+ {**&lt;gene_id&gt;**: **&lt;**gene sequence**&gt;**, ...}
+ {**&lt;email&gt;**: **&lt;**user data**&gt;**, ...}
+ {**&lt;soc security&gt;**: **&lt;**individual**&gt;**, ...}
+ {**&lt;emp id&gt;**: **&lt;**emp data**&gt;**, ...}

#### Lookup Table

```python
elements = {'H': 'hydrogen',   'He': 'helium', 
            'Li': 'lithium',  'C': 'carbon', 
            'O': 'oxygen',  'N': 'nitrogen'}
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
print('H', '->', elements['H'])
print('A', '->', complement['A'])
```

#### Database Records
```python
person = {'name': 'Becky', 
          'surname': 'Chambers', 
          'contact': 
              {
              'phone': {'office': '415-456-7890',
                        'cell': '628-789-0123'
                       },
              'email': ['becky@gmail.com', 'becky.chambers@writers.com']
              }
          }
print(person['name'])
print(person['contact'])
print(person['contact']['phone'])
print(person['contact']['email'])
```


In [None]:
elements = {'H': 'hydrogen',   'He': 'helium', 
            'Li': 'lithium',  'C': 'carbon', 
            'O': 'oxygen',  'N': 'nitrogen', 'Na':'sodium'}

print(elements['Na'])

In [None]:
complement = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}

print(complement["A"])

In [None]:
person = {'name': 'Becky', 
          'surname': 'Chambers', 
          'contact': 
              {
              'phone': {'office': '415-456-7890',
                        'cell': '628-789-0123'
                       },
              'email': ['becky@gmail.com', 'becky.chambers@writers.com']
              }
          }


print(person.keys()) 
print(person['contact'].keys())
print(person['contact'])
print(person['contact']['phone']['cell'])
print(person['contact']['email'])
print(person.get("name")) # get() returns the value for a key (None if the key is missing)
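
The difference between `get()` and square brackets matters when a key is missing. A minimal sketch with a shortened `person` record; the `'age'` key and the `'unknown'` default are illustrative.

```python
person = {'name': 'Becky', 'surname': 'Chambers'}

print(person.get('name'))              # Becky
print(person.get('age'))               # None: key is missing, no error
print(person.get('age', 'unknown'))    # unknown: custom default value

# Square brackets raise KeyError for a missing key.
try:
    person['age']
except KeyError as error:
    print('KeyError:', error)
```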

## Operations

```python
help(dict)
```

+ Create
+ Access keys, values or (key, value) pairs / items
+ Modify items
+ Check membership of a key
+ Traverse through the dictionary and do something
+ Make it bigger / smaller (add and remove items)
+ ...


In [None]:
help(dict)

## Creating a dictionary
There are several ways to create a dictionary:

```python
dict_x = {'a': 1, 'b': 2}      # initialize by assignment
dict_y = dict(a=1, b=2)        # use dict built-in function
print(dict_x, dict_y)
print("The value for key '{}' is {}".format('a', dict_x['a']))
```
+ **keys** = 'a', 'b'
+ **values** = 1, 2
+ **items** = ('a', 1), ('b', 2)

+ Access by key:
```python
print("The value for key '{}' is {}".format('a', dict_x['a']))
```


In [None]:
dict_x = {'a': 1, 'b': 2}      # initialize by assignment
dict_y = dict(a="turtle", b="elephant")        # use dict built-in function

# print(dict_x, dict_y)
print("The value for key '{}' is {}".format('a', dict_y['a']))
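
Two further ways to build the same dictionary, sketched here as an optional extra: from a list of (key, value) tuples, and from two parallel lists joined with the built-in `zip`. The names `pairs`, `dict_z` and `dict_w` are illustrative.

```python
pairs = [('a', 1), ('b', 2)]
dict_z = dict(pairs)                      # from a list of (key, value) tuples

keys = ['a', 'b']
values = [1, 2]
dict_w = dict(zip(keys, values))          # from parallel lists of keys and values

print(dict_z, dict_w)
print(list(dict_z.items()))               # back to (key, value) tuples
```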

### Checking to see if key/value is in dictionary 

In [None]:
contact = {'first_name':'Steven' , 'last_name':'Stone', 'occupation': 'office worker'}


print('first_name' in contact) # see if a key is in the dictionary 
print('Steven' in contact.values()) # see if a value is in the dictionary
print(('first_name', 'Steven') in contact.items()) # see if a key, value pair exists



### Modifications

```python
contact = {'name': 'John'}
```

+ Changing the value for a key

```python
contact['name'] = 'Joe'
```

+ Adding individual key-value pair to a dictionary

```python
contact['salary'] = 100000
```

+ Updating a dictionary with another dictionary (updates existing values; adds new key-value pairs)
<br>

```python
some_other_dict = {'age': 67, 'gender': 'M'}
contact.update(some_other_dict)         
```

This is an ***in-place*** operation!



In [None]:
some_other_dict = {'age': 67, 'gender': 'M'}

In [None]:

del some_other_dict['age']

print(some_other_dict)
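
The modifications above can be run together in one cell. This sketch also checks that the dictionary keeps the same `id` throughout, and shows `pop()` as an alternative to `del` that returns the removed value; the specific values are illustrative.

```python
contact = {'name': 'John'}
before = id(contact)

contact['name'] = 'Joe'            # change the value for an existing key
contact['salary'] = 100000         # add a new key-value pair
contact.update({'age': 67, 'gender': 'M', 'name': 'Joseph'})  # update and add

print(contact)
print(id(contact) == before)       # True: same object, modified in place

removed = contact.pop('salary')    # removes the key and returns its value
del contact['age']                 # removes the key, returns nothing
print(removed, contact)
```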

### Extra - pretty Printing

+ Complicated dictionaries do not print nicely.
+ pprint is a module that prints dictionaries in a more structured manner
    + it is part of the standard library, so it comes with every Python installation
    + it still needs to be imported before use
+ If you want to configure the output, create a pretty printer object first before using it (otherwise the default configuration is used)


In [None]:
import pprint
dict_ = {'name': 'Joe', 'Surname': 'van Niekerk', 'email': 'jvn@c.m', 
        'friends': [{'name': 'Sally'}, {'name': 'Dave'}, {'name': 'Rick'}, {'name': 'James'}]}
print('\n' +'-'*50)


print("No pretty printing")
print('-'*50)
print(dict_)
print('\n' + '-'*50)

print("Default pretty printing")
print('-'*50)
pprint.pprint(dict_)
print('\n' + '-'*50)
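
For configured output, a pretty printer object can be created first. This is a minimal sketch: `indent=2` and `width=40` are arbitrary choices, and the dictionary is a shortened version of the one above.

```python
import pprint

dict_ = {'name': 'Joe',
         'friends': [{'name': 'Sally'}, {'name': 'Dave'}]}

# Create a configured printer once, then reuse it.
printer = pprint.PrettyPrinter(indent=2, width=40)
printer.pprint(dict_)

# pformat returns the formatted text as a string instead of printing it.
formatted = printer.pformat(dict_)
```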


## Assert Statements

These help you make sure your code is working the way it should.

```python
score = 20
assert score > 0
```

This continues without a problem because the condition is true. But if the condition is not met...


```python 
score = -10
assert score > 0
```

This raises an `AssertionError`. Personalize the error message by adding a comma and a string after the condition.


```python
score = 0
assert score > 0, "Please use only positive scores."
```
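
To see the custom message without stopping the whole cell, the failing assert can be wrapped in `try`/`except`. This is only a sketch for inspection; `assert_message` is an illustrative name, and the homework cells use plain asserts.

```python
score = -10

try:
    assert score > 0, "Please use only positive scores."
except AssertionError as error:
    assert_message = str(error)        # the string given after the comma
    print('AssertionError:', assert_message)
```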

You will see assert statements like this one in your homework:
```python

# Replace '0' with the expression for each operation
# ADDITION
number1 = #ADD CODE TO GET FIRST INT
number2 = #ADD CODE TO GET SECOND INT

result_addition = 0

assert result_addition == number1 + number2, "{} != {} + {}".format(result_addition, number1, number2)
```