# FIT9136 Algorithms and programming foundations in Python

# Week 3 Lab Activities: Introduction to Data Structures, Collective Data Types and Control Structures I

\#string \#list \#set \#tuple \#dictionary \#collectiveDataTypes

Simply speaking, collective data types are different kinds of lockers, which can store 0 or more *objects*.

### 1. String
\#ordered \#immutable

```
Index:    0  1  2  3  4  5   ...
          ↓  ↓  ↓  ↓  ↓  ↓
         'p  y  t  h  o  n'  ...
```

#### 1.1 Creating a string

In [None]:
'' # empty string

In [None]:
"" # also an empty string 

In [None]:
'python' # 'python' as a string

#### 1.2 String Formatting
String formatting combines the values of variables with a string, using placeholders, to produce meaningful output.

Mainly there are three different ways to do string formatting:
  1. "%" modulo operator
  2. "str.format()"
  3. f-string

Want to learn more? Check [this](https://realpython.com/python-string-formatting/) out!

**str.format()**

```python
# values of str(var1) and str(var2) will be put into the first and second brackets respectively.
'some string with {} placeholders {}.'.format(var1, var2)
```

In [None]:
# Example
the_name = 'Sherlock Holmes'
interest = 'solving crimes'
the_age = 79.9999999 # Never going to 80
# using positional arguments
'My name is {}, {:.0f} years old. I love {}.'.format(the_name, the_age, interest) # What does :.0f do?

In [None]:
# using keyword arguments
'My name is {str_name}, {str_age:.0f} years old. I love {str_interest}.'.format(str_name = the_name, str_age = the_age, str_interest = interest)

**f-string**
```python
f'Some string with {variables inside curly brackets}.'
```

In [None]:
# Example
the_name = 'Sherlock Holmes'
interest = 'solving crimes'
the_age = 79.9999999
f'My name is {the_name}, {the_age:.0f} years old. I love {interest}.'
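
The part after the colon inside the curly brackets is a format specifier. As a small sketch (the variable names `value`, `rounded`, `two_places` and `padded` are only illustrative), the same idea works in f-strings and in `str.format()`:

```python
# Format specifiers follow the colon inside the curly brackets.
value = 79.9999999

rounded = f'{value:.0f}'     # fixed-point with 0 decimal places, rounds to '80'
two_places = f'{value:.2f}'  # fixed-point with 2 decimal places, gives '80.00'
padded = f'{42:>6}'          # right-align the number in a field 6 characters wide

print(rounded, two_places, repr(padded))
```

Note that `.0f` rounds the value for display only; `value` itself is unchanged.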

**%**: more than just modulo
```python
'Some string with (%variable placeholder formats starting with %)' % (variables)
```

In [None]:
# Example
the_name = 'Sherlock Holmes'
interest = 'solving crimes'
the_age = 79.9999999
# substitute the placeholders in positional order using tuple
'My name is %s, %.0f years old. I love %s.' % (the_name, the_age, interest) # what are %s and %.0f?

In [None]:
# substitute the placeholders according to keys of dictionary
'My name is %(str_name)s, %(str_age).0f years old. I love %(str_interest)s.' % {'str_name': the_name, 'str_age':the_age, 'str_interest':interest}

#### 1.3 Retrieving / Modifying items in string

Every character of a string is labeled by an index, and the index starts with 0.

We can retrieve a character of a string using its index.

In [None]:
a_string = 'some thing'

In [None]:
print(a_string[0]) # getting the 0th character from a_string

In [None]:
print(a_string[-1]) # getting the last character from a_string

In [None]:
print(a_string[5:]) # getting the sub-string starting from index 5 to the end, this technique is called slicing

In [None]:
print(a_string[:4]) # getting the sub-string from the start up to index 3 ← BE CAREFUL! It is 3 not 4!!
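
Slicing follows the general pattern `[start:stop:step]`, where `stop` is excluded and any of the three parts may be left out. A short sketch, using an illustrative variable `text` rather than `a_string`:

```python
text = 'some thing'

first_four = text[:4]       # indices 0, 1, 2, 3 (index 4 is excluded)
from_five = text[5:]        # index 5 up to the end
every_second = text[::2]    # indices 0, 2, 4, 6, 8
reversed_text = text[::-1]  # a negative step walks backwards

print(first_four, from_five, repr(every_second), reversed_text)
```

The same notation works on lists and tuples, since they are ordered as well.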

In [None]:
a_string[4] = 'a' # We want to change the space into 'a'. What will happen?

Remember, a string is *immutable*. That means we cannot modify the items in a string.

<b><font color='red'> Question:</font></b> However, how can we change a_string from 'some thing' to 'something'?

<b><font color='red'>Answer</font></b>

In [None]:
# Retrieve the substrings at indices 0 to 3 and 5 to the end, CONCATENATE (+) them, and assign the result to new_string
a_string = 'some thing'

new_string = a_string[:4] + a_string[5:] # 'some' + 'thing'
new_string

#### 1.4 Built-in methods of strings

There are many useful built-in methods of string objects. <br>Let's have a look at a few of them:

**a. upper/lower/capitalize**: transform the string to uppercase/lowercase or capitalise the string.

In [None]:
'lower-case'.upper() # transform the string to upper case

In [None]:
'UPPER-CASE'.lower() # transform the string to lower case

In [None]:
'lower case. the end.'.capitalize() # transform the first character of the string to upper case and the rest to lower case

**b. isdigit/isalpha/isalnum**: check if the string contains only digits, only letters, or only digits and letters respectively.

In [None]:
# check if the following strings contain digits only
'123'.isdigit(), '123.45'.isdigit(), 'abc'.isdigit()

In [None]:
# check if the following strings contain letters only
'123'.isalpha(), '123.45'.isalpha(), 'abc'.isalpha()

In [None]:
# check if the following strings contain digits and letters only
'123'.isalnum(), '123.45'.isalnum(), 'abc'.isalnum()

**c. startswith, endswith**: check if the string starts or ends with a pattern.

In [None]:
'FIT9136 Python'.startswith('FI'), 'FIT9136 Python'.endswith(6) # What will be returned? Why is that?

In [None]:
# remember, you can always seek help
?str.endswith

**d. find**: returns the index of the first occurrence of the pattern we want to search for, or -1 if the pattern is not found.

In [None]:
sentence = 'FIT9136 is so interesting.'
sentence.find('i'), sentence.find('E') # What will be returned?

**e. split**: splits the string based on a pattern and returns a list of tokens.

In [None]:
sentence = 'FIT9136 is so interesting.'
sentence.split()

In [None]:
# Now we want to split by pattern ' is so '
sentence.split(' is so ')

For the details, please refer to the [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods). 

#### 1.5 String comparison

'A' is greater or 'a' is greater?

In [None]:
'A' > 'a' # True or False

In Python, strings are compared **character by character** using the Unicode code point of each character, which matches the [ASCII codes](https://www.asciitable.com/) for English letters, digits and common symbols. Now, let's use the `ord()` function to get the integer representation of 'A' and 'a'.

In [None]:
ord('A'), ord('a')
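
To make the character-by-character rule concrete, here is a small sketch with a few illustrative comparisons (the variable names are only for this example). The first position where the two strings differ decides the result; if one string is a prefix of the other, the shorter one is smaller.

```python
print(ord('A'), ord('a'))  # 65 97

upper_before_lower = 'A' < 'a'          # 65 < 97
third_char_decides = 'apple' < 'apricot'  # 'a' == 'a', 'p' == 'p', then 'p' < 'r'
case_matters = 'Zebra' < 'apple'        # 'Z' (90) < 'a' (97)
prefix_is_smaller = 'app' < 'apple'     # 'app' runs out of characters first

print(upper_before_lower, third_char_decides, case_matters, prefix_is_smaller)
```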

#### 1.6 String length

We can obtain the **number of items** in a collective data type object using `len()` function.

In [None]:
len('abcd')

### 2. List

\#ordered \#mutable

```
Index:    0   1   2   ...
          ↓   ↓   ↓
        ┌───┬───┬───┬
Value:  │'a'│ 5 │'z'│ ...
        └───┴───┴───┴
```

#### 2.1 List creation

In [None]:
list() # An empty list

In [None]:
[] # Another empty list

In [None]:
['a', 5, 'z'] # A list with items

#### 2.2 Retrieving / Modifying item of list



In [None]:
a_list = ['a', 5, 'z'] # remember a_list, the variable, is just a pointer to the list object created

Every item of a list is labeled by an **index**, and the index starts with 0.

We can retrieve or modify an item of a list using its index.

In [None]:
a_list[0] # getting the 0th item from a_list → ?

In [None]:
a_list[2] = 'abc' # set the item at index 2 of a_list to 'abc'
a_list # What will a_list look like?

**We can also use the keyword `in` to check whether an item is in a collective data type object.**

In [None]:
'a' in a_list # check if 'a' is in a_list

#### 2.3 List comparison

What? Lists can be compared?!!!!

In [None]:
a = [1, 2, 3]
b = [1, 2, 3]
c = [3, 2, 1]

In [None]:
a == b

In [None]:
a == c

In [None]:
a is b

When we compare two lists using `==` operator, we are comparing the **values** of two lists **item-wise**. 

Because lists are **ordered**, the order of items matters! That's why `a == b` while `a != c`.

On the other hand, when we compare using `is`, we are comparing the identities (ids) of the two lists. Since a and b are **different list objects with the same items**, **a is not b**.
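
The difference between `==` and `is` can be sketched as follows. The name `alias` is introduced only for this illustration: it is a second variable pointing to the same list object as `a`.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # a separate list object with equal items
alias = a       # no copy is made; both names refer to one object

same_values = a == b        # True: items are equal, position by position
same_object = a is b        # False: two distinct objects
alias_is_a = alias is a     # True: one object, two names

alias.append(4)             # modifying through one name...
print(a)                    # ...is visible through the other: [1, 2, 3, 4]
print(b)                    # b is unaffected: [1, 2, 3]
```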

How about the "greater than" or "smaller than" comparisons?

In [None]:
[1, 2, 3] < [2, 1, 3]

In [None]:
[1, 2, 3] < [1, 3]

**Yes**, your thinking is right. Lists are compared item by item from left to right, and the first pair of items that differ decides the result.
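
This left-to-right rule is usually called lexicographic ordering. A brief sketch with illustrative variable names, including the case where one list is a prefix of the other:

```python
first_item_decides = [1, 2, 3] < [2, 1, 3]   # 1 < 2, the rest is never checked
second_item_decides = [1, 2, 3] < [1, 3]     # 1 == 1, then 2 < 3
prefix_is_smaller = [1, 2] < [1, 2, 0]       # all shared items equal, shorter list is smaller

# sorted() relies on the same comparison when sorting a list of lists
ordered = sorted([[2, 1], [1, 3], [1, 2, 3]])
print(ordered)  # [[1, 2, 3], [1, 3], [2, 1]]
```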

### 3. Tuple

\#ordered #immutable

Tuple is exactly like list, but:
<blockquote>
IT IS IMMUTABLE, which means you cannot modify the items of it.
</blockquote>

#### 3.1 Tuple creation

In [None]:
tuple()

In [None]:
()

In [None]:
(1) # Is this a tuple? How can we create a tuple with only one element?

<b><font color='red'> Question:</font></b> Can we change a tuple from `(1,[1,2,3],2)` to `(1,[1,2],2)`?

The answer is **YES**, because we are modifying the item of a list instead of the item of a tuple.

In [None]:
a_tuple = (1,[1,2,3],2)
a_tuple[1].pop(-1) # remove the last element of the list stored at index 1 of the tuple
a_tuple

Actually, the item at index 1 of the tuple is only a pointer to the list. The pointer remains unchanged here!
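
One way to check this is to compare the `id()` of the inner list before and after the change, and to see what happens when we try to replace the item itself. This is only a sketch; `id_before`, `id_after` and `message` are illustrative names.

```python
a_tuple = (1, [1, 2, 3], 2)

id_before = id(a_tuple[1])
a_tuple[1].pop()            # mutate the list that the tuple refers to
id_after = id(a_tuple[1])

print(a_tuple)                 # (1, [1, 2], 2)
print(id_before == id_after)   # True: the tuple still holds the same list object

# Replacing the item itself is an attempt to modify the tuple, which fails
message = None
try:
    a_tuple[1] = [1, 2]
except TypeError as error:
    message = str(error)
print(message)
```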

### 4. Set

\#unordered #mutable #uniqueItems
```
╭──────────────╮
│ 'a'    3     │
│   '5'   123.4│
│ 17.5  'str'  │
│    79.6 72.1 │
╰──────────────╯
```

#### 4.1 Set creation

In [None]:
set()

In [None]:
{1, 3, 5}

In [None]:
set([1,3,5]) # convert a list to a set

In [None]:
{} # how about this?

In [None]:
type({}) # let's find out the data type of {}

Please note that **all items in a set must be hashable**, which in practice means immutable (for example numbers, strings, and tuples of immutable items).
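
A quick sketch of what this restriction looks like in practice (the names `ok_set` and `failed` are illustrative): a tuple can be a set item, whereas a list cannot.

```python
# Numbers, strings and tuples of immutable items are fine
ok_set = {1, 'a', (2, 3)}
print((2, 3) in ok_set)  # True

# A list is mutable, so Python refuses to store it in a set
failed = False
try:
    bad_set = {1, [2, 3]}
except TypeError as error:
    failed = True
    print(error)
```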

#### 4.2 Retrieving/Modifying item of set

In [None]:
a_set = {1, 3, 5}

What do the following code blocks do? 

In [None]:
a_set[0]

In [None]:
a_set.pop() # What is left in a_set?

In [None]:
a_set.add(5) # Add 5 to a_set. What is left in a_set?

#### 4.3 Set operations

There are 4 handy operations exclusive to the set data type:
1. Union
2. Difference
3. Intersection
4. Symmetric Difference

In [None]:
from IPython.display import Image
print('Image source: https://www.learnbyexample.org/')
Image(url='https://www.learnbyexample.org/wp-content/uploads/python/Python-Set-Operatioons.png')

**1) Union**

When we want to get all the elements in set A and set B, then we can use `union()` or `|` operator.

In [None]:
set_a = {1,3,5,2,4,6}
set_b = {2,4,6,7,9,11}

# set_a.union(set_b)
set_a | set_b

**2) Difference**

When we want to get all the elements in set A that do not appear in set B, then we can use `difference()` or `-` operator.

In [None]:
set_a = {1,3,5,2,4,6}
set_b = {2,4,6,7,9,11}

# set_a.difference(set_b)
set_a - set_b

**3) Intersection**

When we want to get all the elements that appear in both set A and set B, then we can use `intersection()` or `&` operator.

In [None]:
set_a = {1,3,5,2,4,6}
set_b = {2,4,6,7,9,11}

# set_a.intersection(set_b)
set_a & set_b

**4) Symmetric Difference**

When we want to get all the elements that appear in either set A or set B, but not in both, then we can use `symmetric_difference()` or `^` operator.

In [None]:
set_a = {1,3,5,2,4,6}
set_b = {2,4,6,7,9,11}

# set_a.symmetric_difference(set_b)
set_a ^ set_b

<b><font color='red'> Task:</font></b> How can we check whether a super long string contains all 26 letters of the alphabet? If not, which letter(s) is/are missing?

In [None]:
import string
the_string = 'Supercalifragilisticexpialidocious$&#@*() Pneumonoultramicroscopicsilicovolcanoconiosis +-*/&#Q$*^&*@#$*&^'

unique_char = set(the_string.lower()) & set(string.ascii_lowercase) # get all unique characters from the_string(in lower-case) and filter out all the symbols
set(string.ascii_lowercase) - unique_char # get the missing letters by subtracting the unique characters in the string from the full set of lower-case letters

### 5. Dictionary

\#unordered #mutable

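Remember that every item in a list is labeled by an index? In a dictionary, every item ('value') is labeled by a 'key'. Since Python 3.7 a dictionary remembers the order in which items were inserted, but its items still cannot be accessed by position, which is why it is tagged as unordered here.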

```
key  value
┌───┬───┐
│ 1 │'a'│
├───┼───┤
│'a'│ 1 │
├───┼───┤
│'t'│ 9 │
├───┼───┤
    ⋮
```

#### 5.1 Dictionary creation

In [None]:
dict()

In [None]:
{}

In [None]:
# every item is a key-value pair
{1: 'a', 
 'a': 1, 
 't': 9} 

In [None]:
# it can also be created by converting a list of tuples or list of lists
# however, each inner tuple/list should only contain 2 items
# the first item will be the 'key' and the second item will be the 'value'
dict([(1,'a'),('a',1),('t',9)]) 

In [None]:
dict([(1,'a'),(1,'b')]) # what will be the resulting dictionary? 

#### 5.2 Retrieving/Modifying item of dictionary

In [None]:
a_dict = {'a':1,'b':2,'c':3}

In [None]:
a_dict['a'] # 'a' is the key of value 1, this will return 1

In [None]:
a_dict['a'] = 3 # modifying the value with key 'a' to 3

In [None]:
a_dict['d'] # what will be returned? 

In [None]:
'a' in a_dict # check whether 'a' is one of the **keys** of a_dict

In [None]:
a_dict.items() # get all key-value pairs as a view of (key, value) tuples

In [None]:
a_dict.keys() # get a view of the keys of a_dict

In [None]:
# how about getting all values from a_dict?
a_dict.values()
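
Two details are worth a small sketch here. First, `keys()`, `values()` and `items()` return view objects that follow later changes to the dictionary; wrap them in `list()` when a real list is needed. Second, `get()` looks up a key without raising an error when the key is missing. The names `keys_view`, `missing` and `fallback` are illustrative.

```python
a_dict = {'a': 1, 'b': 2, 'c': 3}

keys_view = a_dict.keys()   # a view, not a copy
a_dict['d'] = 4             # add a new key-value pair afterwards
print('d' in keys_view)     # True: the view reflects the change

pairs = list(a_dict.items())  # convert the view into a real list of tuples
print(pairs)                  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

missing = a_dict.get('z')      # None instead of a KeyError
fallback = a_dict.get('z', 0)  # or a default value of our choice
print(missing, fallback)
```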

### 6. Discussion

Below is a small summary of properties of aforementioned collective data types:

Properties | List | Tuple | String | Set | Dictionary 
--- | --- | --- | --- | --- | ---
Mutable (Able to modify item after creation) | True | False | False | True | True
Ordered (Values associated with indices) | True | True | True | False | False
Allow Duplicate Values | True | True | True | False | True
Key-value pair | False | False | False | False | True

**[OPTIONAL]**:

It is interesting to see the effect of searching an item from these collective data types based on the time/speed of access. 

Let us try to run the below code and see the effect.

#### 6.1 Searching an item from a set

In [None]:
a_set = set([1,3,5])
%timeit 6 in a_set

#### 6.2 Searching an item from a list

In [None]:
a_list = [1,3,5]
%timeit 6 in a_list

#### 6.3 Searching a key from a dictionary

In [None]:
a_dict = {1:'a',3:'b',5:'c'}
%timeit 6 in a_dict

#### 6.4 Searching a pattern from a string

In [None]:
a_string = '135'
%timeit '6' in a_string

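Based on the outputs above, searching for an item in a **list** tends to take longer than in the other collective data types, and the gap becomes much more pronounced as the collection grows. This comes from the difference in implementation: a list is scanned item by item, whereas a set or dictionary looks the item up in a hash table.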

For more information, you can refer to the time complexities of the algorithms used in search operations:
1. [linear search](https://en.wikipedia.org/wiki/Linear_search) for list
2. searching in [hash table](https://en.wikipedia.org/wiki/Hash_table) for set and dictionary
3. [Two-way string-matching algorithm](https://en.wikipedia.org/wiki/Two-way_string-matching_algorithm) for string
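
With only three items the timings are close together, so the sketch below repeats the experiment on larger collections. It uses the standard `timeit` module instead of the `%timeit` magic so that it also runs outside a notebook. The size `n`, the repeat count and the variable names are arbitrary choices for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit

n = 100_000
big_list = list(range(n))
big_set = set(big_list)
big_dict = dict.fromkeys(big_list)
missing = -1  # not present, so the list has to be scanned to the end

# Time 100 membership tests on each collection
list_time = timeit.timeit(lambda: missing in big_list, number=100)
set_time = timeit.timeit(lambda: missing in big_set, number=100)
dict_time = timeit.timeit(lambda: missing in big_dict, number=100)

print(f'list: {list_time:.6f} s')
print(f'set:  {set_time:.6f} s')
print(f'dict: {dict_time:.6f} s')
```

Searching for a missing item is the worst case for the list, because every item has to be checked before the answer is known.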

### 7. In-class Practices

#### 7.1 Combine Dictionaries

Write a Python script to concatenate the following dictionaries to create a new one.

<br>

    Sample Dictionary:
    dic1={'one':10, 'two':20}
    dic2={'three':30, 'four':40}
    dic3={'five':50,'six':60}

    Expected Result : {'one': 10, 'two': 20, 'three': 30, 'four': 40, 'five': 50, 'six': 60}

Hint: try `?dict.update`

<b><font color='red'>Solution</font></b>

In [None]:
# define the dictionaries
dic1={'one':10, 'two':20}
dic2={'three':30, 'four':40}
dic3={'five':50,'six':60}

In [None]:
# solution 1 - the straight-forward way
# define a new empty dictionary
dic4 = {}

dic4.update(dic1) # update the key-value pairs in dic4 with the items of dic1
dic4.update(dic2)
dic4.update(dic3)

print("dict 4: ",dic4)

In [None]:
# solution 2 - using loop
# define a new empty dictionary
dic4 = {}

# iterate through a tuple of dictionaries
for d in (dic1, dic2, dic3):
    # update the key-value pairs in dic4 with dic1, dic2 and dic3 in 1st, 2nd and 3rd iteration respectively
    dic4.update(d)

print("dict 4: ",dic4)

In [None]:
# solution 3 - using kwargs(keyword arguments)
dic4 = dict(**dic1, **dic2, **dic3)

"""
   dict(**dic1,         **dic2,            **dic3         )
=  dict(one=10, two=20, three=30, four=40, five=50, six=60)
"""

print("dict 4: ",dic4)

In [None]:
# dict(one=10, two=20, three=30, four=40, five=50, six=60)

#### 7.2 Convert Lists into a Dictionary

Write Python code to combine the following two lists into a dictionary.


    Sample List:
    keys = ['zero', 'one', 'two']
    values = [0, 1, 2]
    Expected Result :{'zero': 0, 'one': 1, 'two': 2}

Hint: try `?zip`

<b><font color='red'>Solution</font></b>

In [None]:
# define 2 lists
keys = ['zero', 'one', 'two']
values = [0, 1, 2]
# using zip function to combine two lists itemwise and convert to dictionary
sampleDict = dict(zip(keys, values))
print(sampleDict)

#### 7.3 List Merging

There are two lists, and your final goal is to merge them. Before joining, your manager asks you to save the elements at **odd** indices of the first list and the elements at **even** indices of the second list into two new lists, and then combine them.
<br> 

    list1 = [2,6,4,8,9,10,15,17,20,22]
    list2 = [5,6,3,27,16,19,10,26,33,80]
<br>

    Expected output:
    Element at odd-index positions from list1
    [6, 8, 10, 17, 22]
    Element at even-index positions from list2
    [5, 3, 16, 10, 33]
    Printing Final third list
    [6, 8, 10, 17, 22, 5, 3, 16, 10, 33]

Hint: try `?list.extend` and list slicing

<b><font color='red'>Solution</font></b>

In [None]:
# define 2 lists
list1 = [2,6,4,8,9,10,15,17,20, 22]
list2 = [5,6,3,27,16,19,10,26,33,80]
# create an empty list to store the merged list
list3 = list()

# getting odd-indexed items using list slicing
oddElements = list1[1::2]
print("Element at odd-index positions from list1")
print(oddElements)

# getting even-indexed items using list slicing
evenElement = list2[0::2]
print("Element at even-index positions from list2")
print(evenElement)

# display the final odd,even index list and merged list.
print("Printing Final third list")
list3.extend(oddElements)
list3.extend(evenElement)
print(list3)

## Exercise

Now, try to write the code and add comments for the following questions.

*You only need to use the techniques learnt in this notebook to complete the tasks*

### Task A: list certain value in a list

There is a nested list. Your task is to write the code that retrieves and prints the ‘z’ in the string ‘baz’.
<br>

    x = [10, [3.141, 20, [30, 'baz', 2.718]], 'foo'] 


<b><font color='red'>Solution</font></b>

In [None]:
# define the nested list
x = [10, [3.141, 20, [30, 'baz', 2.718]], 'foo']

"""
All we need is to peel off the layers and get to the core
x[1] → [3.141, 20, [30, 'baz', 2.718]]
x[1][2] → [30, 'baz', 2.718]
x[1][2][1] → 'baz'
x[1][2][1][2] → 'z'
"""
x[1][2][1][2]

### Task B: Access to the dictionary value

An interview test asks you to write the code to access the marks of the chemistry subject.
<br>

    SampDict={'Class': {'Student' : {'name': 'Mike', 'Marks': {'physics' : 80, 'chemistry':70}}}}


In [None]:
#define dictionary
SampDict={'Class': {'Student' : {'name': 'Mike', 'Marks': {'physics' : 80, 'chemistry':70}}}}
#display chemistry subject marks
print(SampDict['Class']['Student']['Marks']['chemistry'])


### Task C: Calculator

a) Write Python code that stores two integers as ‘x’ and ‘y’, then performs the following operations: +, -, ÷, ×, mod, x sin(y).
- Ensure that your statements print in a way that isn’t ambiguous. 
- At the end print the types of both x and y.

In [None]:
# import sin() function from math library
from math import sin

# assign value to variable x

# assign value to variable y

# print out the results from the operations between x and y

b) Now, rather than hard coding ‘x’ and ‘y’, try taking input from the user. Did you have any problems just using ‘input()’? 

In [None]:
# write the answer here
