# 2: Data Types and Data Structures

## Learning objectives
- Understand the nature of a list.
- Know how to index and slice a list.
- Know some ways to add items to a list.
- Know some methods associated with a list.
<br> <br>
- Understand the nature of a dictionary.
- Know how to index a dictionary.
- Know some functions and methods associated with a dictionary.
<br><br>
- Understand the nature of a tuple and the basic idea of mutability.
- Understand tuple assignment and tuple unpacking.
- Know how to perform tuple indexing/slicing.
- Know some tuple methods.
<br> <br>
- Understand the basics of the Boolean data type (bool).
- Know how to use comparison operators.
- Know how to use boolean operators.

## Lists

- Powerful data type in Python.
- Denoted by square brackets [].
- Store items as a mutable ordered sequence of elements.
- Each element in a list is an item.
- Support indexing and slicing.
- Can nest lists within each other.

![Lists](images/lists.png)

### Some definitions
- Mutable: can be changed after creation (supports addition/removal/reassignment of items).
- Ordered: as it sounds, has a fixed order (the order of elements provided at the time of assignment) and so can be indexed using numbers.
- Hence a list \[2,1,3,4\] will not be rearranged, 2 will be the first element, 1 will be the second... etc.
- Sequence of elements: fairly self explanatory.
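
As a minimal sketch of these definitions, the cell below shows in-place mutation, order preservation, and the difference between a second name for the same list and a slice copy. The names `original`, `alias` and `shallow` are made up for illustration and do not appear elsewhere in this lesson.

```python
# Hypothetical names, for illustration only
original = [2, 1, 3, 4]

alias = original        # a second name for the same list object
shallow = original[:]   # a full slice creates a new list with the same items

original[0] = 99        # mutate in place: reassign the first element
original.append(5)      # mutate in place: add an item to the end

print(original)  # [99, 1, 3, 4, 5]
print(alias)     # [99, 1, 3, 4, 5]  (same object, so it sees the changes)
print(shallow)   # [2, 1, 3, 4]      (independent copy, original order kept)
```

This is why `list2 = list1[:]` is a common way to keep a copy before changing a list.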

In [None]:
#week 3 lesson 1 recap

tuple1 = (1, 3, 5)
list(tuple1)

In [24]:
list1 = [3, 5, 8, 9, 10,]

#list1.reverse()

list2 = list1[:]

list2

[3, 5, 8, 9, 10]

In [19]:
list1

[3, 5, 8, 9, 10]

In [None]:
# can handle multiple object types
my_list = [3, "three", 3.0, True]

In [None]:
my_list[0]

In [None]:
# .split() breaks a string into a list of strings at the given separator
sentence = 'This is just a sentence'
sentence.split('s')

In [None]:
# can use the type() function to check the data type of an object
type(3)

In [None]:
type("three")

In [None]:
type(3.0)

In [None]:
type(True)

In [None]:
type(my_list)

### Indexing and Slicing

In Python, slicing:

- Starts at 0 (zero).
- Is inclusive at the lower bound (including).
- Is exclusive at the upper bound (up to but not including).
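
The three rules above can be checked with a short sketch. The list `letters` and the names `start`, `stop` and `piece` are illustrative assumptions, not part of the lesson's data.

```python
letters = ['a', 'b', 'c', 'd', 'e']   # illustrative list

start, stop = 1, 4
piece = letters[start:stop]           # items at index 1, 2 and 3

print(piece)                          # ['b', 'c', 'd']
print(len(piece) == stop - start)     # True: the upper bound is excluded

# Because the upper bound is excluded, two slices that share a boundary
# split the list with no overlap and no gap
print(letters[:2] + letters[2:] == letters)   # True
```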

In [None]:
my_list = ['John', 'Paul', 'George', 'Ringo']

In [None]:
# Index 0 gives first element
my_list[0]

In [None]:
# Use colon to indicate slice, 1:3 returns 2nd and 3rd items but not 4th
my_list[1:3]

In [None]:
# No upper bound starts with first index indicated and gives everything beyond
my_list[1:]

In [None]:
# No lower bound starts from index 0, up to but not including upper bound
my_list[:3]

In [None]:
# A third value sets the step: ::2 returns every second item
my_list[::2]

In [None]:
# You can also use a negative index to count from the end:
print(my_list[-1])
print(my_list[-2])
# A step of -1 returns a reversed copy of the list
print(my_list[::-1])

In [None]:
# can reassign elements in lists to new values
my_list[1] = my_list[1].upper()
my_list

In [None]:
# Can add lists, does not change original list
my_list + ['Yoko']
print(my_list)

In [None]:
# Must reassign list to change original
my_list = my_list + ['Yoko']

my_list

In [None]:
my_list = my_list[:4]

In [None]:
my_list

In [None]:
# Can multiply, same principle applies
my_list * 2

In [None]:
my_list

In [None]:
# Lists can be an item within a list
# Lists within lists are called nested
lst_1=[1,2,3]
lst_2=[4,5,6]
lst_3=[7,8,9]

# Make a list of lists to form a nested list
nest_list = [lst_1,lst_2,lst_3]
nest_list

![Lists](images/deeper.jpg)

In [None]:
print('Accessing second item', nest_list[1])
print('Accessing second item of the second item', nest_list[1][1])

### List Functions and Methods

In [None]:
# use len() function to check length
len(my_list)

In [None]:
# len() only works on objects that have a length; an int raises TypeError
len(3)

In [None]:
# use min() and max() to find the lowest and highest items in a list
# strings are compared by Unicode code point, so uppercase letters sort before lowercase

new_list = ['AApple', 'ABanana', 'cranberry']

print(min(new_list))
print(max(new_list))

#### .append() vs .extend()
- .append() adds a single item to the end of a list (a list passed to it is added as one nested item).
- .extend() adds each item of a list (or other iterable) to the end of a list.
- Both change the list in place and return None.
- We can see the difference below.
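
A side-by-side sketch of the two methods, starting from the same data each time. The names `base`, `appended` and `extended` are hypothetical and used only here.

```python
base = ['John', 'Paul']   # illustrative data
appended = base[:]        # copies, so each method starts from the same list
extended = base[:]

appended.append(['George', 'Ringo'])   # the whole list becomes ONE new item
extended.extend(['George', 'Ringo'])   # each item is added separately

print(appended)        # ['John', 'Paul', ['George', 'Ringo']]
print(len(appended))   # 3
print(extended)        # ['John', 'Paul', 'George', 'Ringo']
print(len(extended))   # 4

# Both methods change the list in place and return None
print(base.append('Yoko'))   # None
```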

In [None]:
# add a single item with the .append() method
# .append() changes the list in place and returns None, so do not reassign its result

my_list.append(['Lennon','McCartney','Harrison', 'Starr'])

print(my_list)

In [None]:
# add items in iterable itemwise by .extend() method

my_list.extend(['Lennon','McCartney','Harrison', 'Starr'])

my_list

In [None]:
my_list.insert(1, 'Lennon')

In [None]:
my_list

In [None]:
# use .pop method to remove and return last item
last_item = my_list.pop()

In [None]:
last_item

In [None]:
print(my_list)

In [None]:
# can pass an index to .pop(); the default is -1 (the last item)
my_list.pop(0)

In [None]:
my_list

In [None]:
# returned item can be assigned to variable

popped_item = my_list.pop(0)

popped_item

In [None]:
my_list

In [None]:
# use .sort() method to sort list, changes original list, no returned value

let_list = ["a", "d", "v", "x", "g"]

num_list = [13,42,4,24,2,46,3,7]

In [None]:
let_list.sort()
num_list.sort()

In [None]:
print(let_list)
print(num_list)

In [None]:
# use .reverse() method to reverse list
num_list.reverse()

print(num_list)
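
For comparison, here is a hedged sketch of the built-in `sorted()` function next to the `.sort()` method. `sorted()` is not otherwise covered in this lesson, and the list `scores` is made up for the example.

```python
scores = [13, 42, 4, 24]   # illustrative data

ordered = sorted(scores)   # built-in sorted() returns a NEW sorted list
print(ordered)             # [4, 13, 24, 42]
print(scores)              # [13, 42, 4, 24], the original is unchanged

result = scores.sort(reverse=True)   # .sort() works in place; reverse=True sorts descending
print(result)              # None
print(scores)              # [42, 24, 13, 4]
```

Use `sorted()` when the original order still matters, and `.sort()` when it does not.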

In [None]:
# use "sep".join(list) to join a list of strings using a separator

list_of_strings = ["This", "is", "a", "sentence."]

print("\t".join(list_of_strings))

In [None]:
my_list

In [None]:
# my_list.remove('PAUL')
# .index() returns the position of the first item equal to its argument
idx = my_list.index(['Lennon', 'McCartney', 'Harrison', 'Starr'])
print(idx)

In [None]:
my_list

## Lists Exercises

### Question 1:
Write a program that checks if the two words in a two-word string start with the same letter. <br>
Copy-paste (and slightly modify) your code to try it for both cases.

In [None]:
phrase1 = 'Clean Couch'

split_string = phrase1.split()

split_string[0][0] == split_string[1][0]
# CODE HERE

In [None]:
phrase2 = 'Giant Table'

split_string = phrase2.split()

split_string[0][0] == split_string[1][0]
# CODE HERE

### Question 2:
Write a program that returns a string with the __words__ reversed.
Once again, try the same operation with both test cases.

In [None]:
my_string1 = 'This is a short phrase'

# CODE HERE

In [None]:
my_string2 = 'This is actually a significantly longer phrase than the previous one'

# CODE HERE

## A Brief Introduction to Sets

- Sets are a data type in Python.
- They follow the rules of mathematical sets that you should already be familiar with.
- They are mutable and unordered, and they do not contain repeated items (items are unique).
- This means one useful usage of a set is to find all unique items in a list, as we can see below.
- Sets also have their own methods, with operations familiar from mathematical sets.
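
A brief sketch of the familiar set operations, using both operators and the equivalent methods. The sets `evens` and `small` are illustrative assumptions.

```python
evens = {0, 2, 4, 6, 8}   # illustrative sets
small = {0, 1, 2, 3, 4}

print(evens | small)   # union: items in either set
print(evens & small)   # intersection: items in both sets
print(evens - small)   # difference: items in evens but not in small
print(evens ^ small)   # symmetric difference: items in exactly one set

# The method forms give the same results as the operators
print(evens.union(small) == evens | small)          # True
print(evens.intersection(small) == evens & small)   # True
print(evens.difference(small) == evens - small)     # True
```

The order in which a set's items are printed is not guaranteed, so the checks compare sets with `==` rather than relying on printed order.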

In [None]:
my_set = set([1, 2, 3, 4, 4, 4, 6])
# sets are unordered, so they cannot be indexed: this raises TypeError
print(my_set[1])

![Venn Diagram](images/venn.jpg)

In [None]:
# can return unique items in list by casting list to set using set() function
long_list = [1,1,1,1,2,3,3,4,5,5,4,4,5,5,6,6,5,55,5,5,5,5]

set_of_list = set(long_list)

print(set_of_list)

In [None]:
# we can build a set by creating an empty set with set() and adding items with .add()
# ({} creates an empty dictionary, not an empty set)
set_x = set()

print(set_x)

set_x.add(1)

print(set_x)

set_x.add(2)

print(set_x)

set_x.add(2)

print(set_x)

In [None]:
# if we add 1 again, we see the set does not change, as items in a set are unique
print(set_x)

set_x.add(1)

print(set_x)

In [None]:
# .union() finds the union (mathematical union) of one set and another
set_x.add(10)
print(set_x)
print(set_of_list)
set_x.union(set_of_list)

In [None]:
set_x = set_x.union(set_of_list)

In [None]:
# .update() updates a set in place with the union of it and another set
# it returns None, so dummy below is None
print(set_x)

print(set_of_list)

dummy = set_x.update(set_of_list)

print(dummy)

In [None]:
# a.difference(b) returns the items in a that are NOT in b
set_x = set()
set_x.add(1)
set_x.add(2)
print(set_x)
print(set_of_list)
set_of_list.difference(set_x)

In [None]:
type(list(set([1, 2, 3, 4, 4, 4, 5, 5, 5, 5, 5])))

- We will not cover further set methods here, but more information is available at:
https://docs.python.org/3/library/stdtypes.html#set

## Dictionaries

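- Dictionaries are collections of key:value pairs, accessed by key rather than by position (since Python 3.7 they keep insertion order).
- Keys can be any hashable (immutable) type, such as strings, numbers or tuples; it is best practice to use strings.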
- Values can be any data type, including dictionaries themselves (nesting).
- Indexed using keys.
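
A small sketch of these points, assuming made-up dictionaries `stock` and `grid`. It also uses the `.get()` method, which is not otherwise covered in this lesson.

```python
# Illustrative dictionary; the keys and values are made up for this sketch
stock = {'apples': 3, 'pears': 0}

print(stock['apples'])         # 3
print(stock.get('plums'))      # None: .get() avoids a KeyError for a missing key
print(stock.get('plums', 0))   # 0: an optional default value

# Tuples of immutable items are hashable, so they can be used as keys
grid = {(0, 0): 'origin', (1, 0): 'east'}
print(grid[(1, 0)])            # east

# Lists are mutable and unhashable, so they cannot be used as keys
try:
    bad = {[0, 0]: 'origin'}
except TypeError as err:
    print('TypeError:', err)
```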

In [None]:
# flexibility of data assignment inc. lists and sub-dictionaries
# keys must be unique: a repeated key keeps only the last value assigned to it
d = {'k1': 123, 'k2': [0,1,2], 'k4': {'insidekey': [100,200]}}

print(d)

In [None]:
# chained indexing is evaluated left to right: d['k4'], then ['insidekey'], then [1]
d['k4']['insidekey'][1]

In [None]:
# a list cannot be used as a key: this raises TypeError (unhashable type)
d = {[1, 2, 3]: 123, 'k1':[0,1,2], 'k4':{'insidekey':[100,200]}}

In [None]:
# can stack calls
d1 = {'k1':["a", "b", "c"]}

print(d1['k1'][2].upper())

In [None]:
# both of these methods give the same result
print(d1['k1'][2].upper())

x = d1['k1'][2]
print(x.upper())

In [None]:
print(d)
d["k7"] = "NEW"
print(d)

In [None]:
# assigning to a new key adds a pair; assigning to an existing key overwrites its value
d["k7"] = "VALUE"

print(d)

In [None]:
# call all keys/values/pairs by .keys / .values / .items methods, .items returns tuples
print(d.keys())
print()
print(d.values())
print()
print(d.items())

In [None]:
d['k7']

In [None]:
# use in to check membership; for a dictionary, in checks the keys
d1 = {"k1": 10, "k2":[1,2,3], "k3":345}

In [None]:
"k2" in d1

In [None]:
345 in d1

In [None]:
d1.values()

In [None]:
345 in d1.values()

In [None]:
345 in d1.keys()

In [None]:
d1.items()

In [None]:
('k3', 345) in d1.items()

In [None]:
d1 = {"k1": 10, "k2":[1,2,3], "k3":345}
d2 = {'k1': 50, 'k5': 42}
d1.update(d2)
print(d1)

In [None]:
d1.pop('k1')
print(d1)

## Tuples

- Tuples are like lists: flexible data input.
- But they are immutable: cannot be changed once created.
- Therefore no append/extend/remove/pop methods and no item reassignment for tuples.
- Useful for holding values in data that you do not want to be reassigned by accident.
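
The sketch below illustrates what immutability does and does not mean for a tuple. The names `point`, `moved` and `mixed` are hypothetical.

```python
point = (1, 2, 3)   # illustrative tuple

try:
    point[0] = 10   # item reassignment is not allowed
except TypeError as err:
    print('TypeError:', err)

# To "change" a tuple, build a new one from the pieces you want to keep
moved = (10,) + point[1:]
print(moved)   # (10, 2, 3)
print(point)   # (1, 2, 3), the original is untouched

# Immutability is shallow: a list stored inside a tuple can still be mutated
mixed = (1, [2, 3])
mixed[1].append(4)
print(mixed)   # (1, [2, 3, 4])
```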

In [None]:
# immutable but flexible data input
t = (1,2,3)
t1 = (1, "two", 3)
t2 = ("a", "a", "b")

In [None]:
t

In [None]:
# can use in operator to check if item is in tuple
1 in t1

In [None]:
# tuples have no .pop() method: this raises AttributeError
t1.pop()

In [None]:
# item reassignment is not allowed on a tuple: this raises TypeError
t1[1] = "ten"

In [None]:
# check length with len() function
len(t)
print("The length of the tuple is {x}".format(x=len(t)))

In [None]:
# count instances using .count() method
t2.count("a")
print("a occurs {x} times in the tuple".format(x=t2.count("a")))

In [None]:
t2

In [None]:
# find first index using .index() method
t2.index("b")
print("b occurs first at index {x} in the tuple".format(x=t2.index("b")))

### Tuple Packing and Unpacking

- One of the most powerful aspects of tuples is a technique called tuple unpacking.
- This allows us to assign variables using commas from a single tuple in order.
- The syntax works as below; the brackets can be omitted unless they are needed for clarity.

Python here 'unpacks' the tuple automatically and picks out the values and assigns them to the comma-separated variables:

In [None]:
a, b, c = (1, 2, 3)

print(a)
print(b)
print(c)

We can also assign a tuple to a variable and perform tuple unpacking on the variable:

In [None]:
t1 = (1,2,3)

a, b, c = t1

print(a)
print(b)
print(c)

In the comma notation below the brackets are implied; Python performs the tuple unpacking in the same way 'under the hood'. <br>
This comma notation is useful shorthand for assigning multiple variables:

In [None]:
t1 = (1, 3, 5)

In [None]:
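# brackets alone do not make a tuple: (1) is just the int 1; a one-item tuple needs a trailing comma, (1,)
t1 = (1)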

In [None]:
type(t1)

In [None]:
a, b, c = 1, 2, 3

print(a)
print(b)
print(c)

Tuples can also be packed in the same way. Again, useful for multiple variable assignment:

In [None]:
print(a)
print(b)
print(c)

t1 = a, b, c

print(list(t1))
print(type(list(t1)))
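
Two common uses of packing and unpacking, sketched under the assumption of made-up names (`x`, `y`, `first`, `rest`, `prices`). Starred unpacking is a small extension of the syntax shown above and is not otherwise covered in this lesson.

```python
x, y = 1, 2
x, y = y, x    # the right-hand side is packed into a tuple, then unpacked
print(x, y)    # 2 1

# A starred name collects any leftover items into a list
first, *rest = (10, 20, 30, 40)
print(first)   # 10
print(rest)    # [20, 30, 40]

# .items() yields (key, value) tuples, which can be unpacked as well
prices = {'pen': 1.5, 'pad': 2.0}   # illustrative values
(name1, price1), (name2, price2) = prices.items()
print(name1, price1)   # pen 1.5
```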

## Booleans and Comparison Operators

Booleans in Python function as they do in maths. <br>
Can use: 
- (<) less than
- (>) more than
- (<=) less than or equal to
- (>=) more than or equal to
- (==) equal to (single = is assignment)
- (!=) not equal

to evaluate equality/inequality.

Use keywords to combine or negate boolean expressions:
- and
- or
- not
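
A short sketch of how these keywords combine with comparisons, using an illustrative value `x`. Chained comparisons such as `5 < x < 10` are standard Python but are not otherwise covered in this lesson.

```python
x = 7   # illustrative value

print(x > 5 and x < 10)   # True: both comparisons hold
print(5 < x < 10)         # True: Python allows comparisons to be chained directly
print(x < 5 or x == 7)    # True: only one side needs to be True
print(not x == 7)         # False: not flips the result of x == 7

# Precedence: not binds tightest, then and, then or
print(True or False and False)     # True, read as True or (False and False)
print((True or False) and False)   # False
```

When in doubt about precedence, add brackets to make the intended grouping explicit.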

In [None]:
1 < 2

In [None]:
1 >= 2

In [None]:
# and/or for boolean comparison chaining
1 < 2 and 10 < 20

In [None]:
True and False

In [None]:
True or False

In [None]:
# and/or for boolean comparison chaining
25 > 37 or 55 < 100

In [None]:
# use not to return opposite boolean
not 100 > 1

In [None]:
# use in keyword to check if item is in iterable (also works for strings)
print("x" in [1,2,3])

print("x" in ['x','y','z'])

print("a" in "a world")

## Data Types and Structures Exercises

### Question 1

Describe the key differences between a list, a dictionary and a tuple.

Answer here

### Question 2

What is mutability?

Answer here

### Question 3

Print the string 'python' from this dictionary:

In [None]:
d = {'start here':1,'k1':[1,2,3,{'k2':[1,2,{'k3':['keep going',{'further':[1,2,3,4,[{'k4':'python'}]]}]}]}]}

In [None]:
# CODE HERE

### Question 4

Create a nested dictionary called shop with sub-dictionaries called 'prices' and 'pack_sizes'. <br>
'prices' should contain items as keys and prices as values. <br>
'pack_sizes' should contain items as keys and pack sizes as values. <br>

- Tomatoes cost 87p for a pack of 6
- 500g sugar costs £1.09
- Washing sponges cost 29p for a pack of 10
- Juice is £1.89 per 1.5l bottle
- Foil is £1.29 per 30m roll

Use the same keys for both sub-dictionaries, e.g. 'tomato':0.87 and 'tomato':'Pack of 6'. <br>
Use the provided list as the values for pack_sizes (copy-paste the strings):

In [None]:
["Pack of 6", "500g", "Pack of 10", "1.5l bottle", "30m roll"]

shop = {}

In [None]:
shop

### Question 5

Using the nested dictionary above, find the price of the following shopping list:

- 18 tomatoes
- 2 packs of washing sponges
- 4.5 litres of juice
- 4 rolls of foil
- 2kg sugar

Do it in this (rather tedious) order:

Create 4 lists using indexing:
- The first containing the cost per pack of each item called 'unit_cost', indexing the 'prices' dictionary.
- The second containing the pack size of each item called 'pack', indexing the 'pack_sizes' dictionary.
- The third containing the quantities of each item called 'quant' (created for you).
- The fourth containing cost per pack multiplied by quantity for each item called 'tsp', indexing 'unit_cost' and 'quant'.

Print your answer as a nested list of 4 lists called order.

Using sum, find the subtotal (exc. VAT) as a variable called order_subtotal.

Find the total (inc. VAT) as a variable called order_total (VAT is 20%).

(This is a very tedious way of doing this, but it will show you why for loops are super useful later on)

In [None]:
unit_cost = []

pack = []

quant = [3,4,2,3,4]

tsp = []

order = []

order_subtotal

order_total

In [None]:
print(order)

In [None]:
print("The subtotal is £{:4.2f}.".format(order_subtotal))

In [None]:
print("The total is £{:4.2f}.".format(order_total))

## Summary
We now understand:
- The nature of lists, dictionaries and tuples.
- The basic concept of mutability.
- The basic concept of tuple unpacking.
- The nature of booleans and how to use them.
<br><br>

We now know:
- How to index and slice lists.
- List functions and methods including len(), .append(), .extend() etc.
- How to use a set to find the unique values in a list.
- How to index a dictionary.
- Dictionary methods including .keys(), .values(), .items().
- Tuple methods including .count() and .index().
- How to use booleans and logical operators.
<br><br>

Please use this notebook as a reference, and refer to the links below for more information.

## Further reading
- List methods: https://docs.python.org/3/tutorial/datastructures.html
- Dictionary methods: https://docs.python.org/3/library/stdtypes.html#typesmapping
- Built-in types: https://docs.python.org/3/library/stdtypes.html
- Sets: https://docs.python.org/3/library/stdtypes.html#set

## Szymon's extra exercise 1

In [8]:
def fizz_buzz_exercise1(values=range(100), m=3, n=5):
    # fizzbuzz: multiples of m*n; buzz: other multiples of m; fizz: other multiples of n
    fizz, buzz, fizzbuzz = [],[],[]

    for value in values:
        if value % (m*n) == 0:
            fizzbuzz.append(value)
        elif value % m == 0:
            buzz.append(value)
        elif value % n == 0:
            fizz.append(value)

    return fizz, buzz, fizzbuzz


In [9]:
fizz_buzz_exercise1([15,7,8,10,9])

([10], [9], [15])

## Szymon's extra exercise 2

In [50]:
from math import sqrt

def fizz_buzz_exercise2(values, value1, value2=None):

    fizz, buzz, fizzbuzz = [],[],[]

    # if no second divisor is given, derive one from value1
    if value2 is None:
        value2 = round(sqrt(value1*7))
    for value in values:
        if value % (value1*value2) == 0:
            fizzbuzz.append(value)
        elif value % value2 == 0:
            buzz.append(value)
        elif value % value1 == 0:
            fizz.append(value)

    return fizz, buzz, fizzbuzz

In [51]:
fizz_buzz_exercise2(range(100),9)

([9, 18, 27, 36, 45, 54, 63, 81, 90, 99],
 [8, 16, 24, 32, 40, 48, 56, 64, 80, 88, 96],
 [0, 72])

## Szymon's extra exercise 3 (yield)

In [75]:
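from math import sqrt

fizz, buzz, fizzbuzz = [],[],[]

def fizz_buzz_exercise2(values, value1, value2=None):
    # generator: yields the values one at a time instead of returning lists
    if value2 is None:
        value2 = round(sqrt(value1*7))
    for value in values:
        yield value


In [76]:
# value1 and value2 are local to the generator, so they must also be
# defined here before the loop can use them
value1 = 9
value2 = round(sqrt(value1*7))

for num in fizz_buzz_exercise2([15,7,8,10,9], value1, value2):
    if num % (value1*value2) == 0:
        fizzbuzz.append(num)
    elif num % value2 == 0:
        buzz.append(num)
    elif num % value1 == 0:
        fizz.append(num)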

## Szymon's extra exercise 4 (bubble sort)

In [44]:
def bubblesort(unsorted_list):
    # sorts the list in place and returns it
    for j in range(1, len(unsorted_list)):
        # after each pass the largest remaining item has moved to the end
        for i in range(len(unsorted_list)-j):
            if unsorted_list[i] > unsorted_list[i+1]:
                unsorted_list[i+1], unsorted_list[i] = unsorted_list[i], unsorted_list[i+1]

    return unsorted_list

In [45]:
bubblesort([3, 2,1,4,5,7,3,2143,7465,9000])

[1, 2, 3, 3, 4, 5, 7, 2143, 7465, 9000]
