# Week 1: Introduction to Data Types

# Intermediate: Collections

Welcome to the *Week 1 Intermediate Python Notebook*. This notebook is designed for students who already have some experience with Python and are ready to build on the basics.  

Your task today is to read through the material carefully and complete the exercises provided at the end. These exercises are designed to deepen your understanding and give you practical experience with new concepts.  

> **Important:** Before starting this notebook, make sure you are confident with everything in the [`Beginner`](./week_01_intro_to_data_types_beginner.ipynb) notebook. You should be comfortable working with the basic data types (numeric, boolean, and string) and should have attempted at least $4$ of the [`Beginner exercises`](./week_01_intro_to_data_types_beginner.ipynb#Exercises). The material in this notebook builds directly on those foundations.  

In this notebook, you will explore Python’s *collection data types*: *tuples*, *lists*, *dictionaries*, and *sets*. These are powerful tools for grouping, organizing, and working with data in more complex ways.  

Be sure to work through the examples and attempt all the exercises. They are designed to help you practice, reinforce your learning, and prepare you for the more advanced topics ahead.  

### Table of Contents

 - [Welcome Page](./week_01_home.ipynb)

 - [Beginner: Basic Data Types](./week_01_intro_to_data_types_beginner.ipynb)

 - [**Intermediate: Collections**](./week_01_intro_to_data_types_intermediate.ipynb)
   - [Tuples and lists](#Tuples-and-lists)
   - [Sets and Dictionaries](#Sets-and-Dictionarys)
   - [Exercises](#Exercises)
 - [Advanced: Copying and References](./week_01_intro_to_data_types_advanced.ipynb)
 - [Slides](./week_01_slides.ipynb) ([Powerpoint](./Lecture1_Introduction_And_Data_Types.pptx))

## Tuples and lists



In this section, we are going to look at Tuples and Lists. 

 - A tuple is a collection which is ordered and ***unchangeable***. In Python tuples are written with ***round brackets***.

 - A list is a collection which is ordered and ***changeable***. In Python lists are written with ***square brackets***. 

*Here, **changeable** means that, once a list has been created, its elements can be modified, added, or removed. The elements of a tuple, by contrast, cannot be altered after creation.*

Both tuples and lists are built-in data types, can hold elements of arbitrary data types and behave, in some respects, like mathematical vectors (their elements are stored in a fixed order and accessed by position). However, for numerical vectors and arrays it is much better to use NumPy arrays, which are covered later in the course.

Below are some examples of lists and tuples. Note again that lists and tuples can contain data of any type.

In [None]:
# Example tuple
example_tuple = ('random string', 8, True)
print(example_tuple)

# Example list
example_list = [10, 'str', False]
print(example_list)

This means they can also be nested. That is, we can put a list in a list, a tuple in a tuple, a list in a tuple and a tuple in a list!

In [None]:
TupleAndListInTuple = (example_tuple, example_list)
TupleAndListInList = [example_tuple, example_list]
print(TupleAndListInTuple)
print(TupleAndListInList)
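
The difference between *changeable* and *unchangeable* collections can be seen by trying to replace an element. The sketch below is illustrative only; the variable names `mutable_list` and `immutable_tuple` are made up for this example.

```python
# A list can be modified in place after it has been created
mutable_list = [10, 'str', False]
mutable_list[0] = 99              # replace the first element
print(mutable_list)               # [99, 'str', False]

# A tuple cannot: assigning to one of its elements raises a TypeError
immutable_tuple = ('random string', 8, True)
try:
    immutable_tuple[0] = 'new string'
except TypeError as error:
    print('TypeError:', error)

# The tuple is unchanged
print(immutable_tuple)            # ('random string', 8, True)
```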

### Adding to Lists and Tuples

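Lists and tuples can be extended using the concatenation operator, `+`. Because tuples are unchangeable, concatenating tuples creates a new, longer tuple rather than modifying the existing one: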

In [None]:
example_list = [1, 2, 3]
example_list = example_list + [4]
example_list +=  [5]
print(example_list)

example_tuple = (1, 2, 3)
example_tuple = example_tuple + (4,)
example_tuple +=  (5,)
print(example_tuple)
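
Lists also provide the methods `append` and `extend` for adding elements in place, while concatenating tuples always builds a new tuple and leaves the original untouched. A minimal sketch, using made-up variable names:

```python
growing_list = [1, 2, 3]
growing_list.append(4)            # add a single element to the end
growing_list.extend([5, 6])       # add every element of another collection
print(growing_list)               # [1, 2, 3, 4, 5, 6]

# Concatenating tuples gives a new tuple; the original is not changed
original_tuple = (1, 2, 3)
longer_tuple = original_tuple + (4,)
print(original_tuple)             # (1, 2, 3)
print(longer_tuple)               # (1, 2, 3, 4)
```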

### Indexing Lists and Tuples

To index in Python, square brackets are used. This is a common convention which applies to almost all vector-like datatypes in Python, e.g. tuples, lists, strings, NumPy arrays, etc.

Python uses zero-based indexing, which means that the first element in a list of length `n` has index `0` and the last element has index `n-1`. For example, see the code below (note that the same can also be done for tuples):

In [None]:
example_list = [2,'str',4,9,0]

# Work out the length of the list
n = len(example_list)

print(example_list[0])
print(example_list[n-1])
print(example_list[3])
#print(example_list[n]) # This line will fail as the highest index for a python list is n-1

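Negative numbers can also be used to index lists (and other similar datatypes). For example, the index `-1` will give the last element of the list, the index `-2` will give the second to last element of the list and so on. In other words, negative indices count backwards from the end of the list, so for a list of length `n` the index `-k` refers to the same element as the index `n-k`. This is sometimes described as wrap-around, although the indices only wrap once.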

For a list of length `n` any index can be used between `-n` and `(n-1)` (inclusive) to access elements in the list. Indices outside of this range, however, will give an "`index out of range`" error.

In [None]:
# Examples of negative indexing (counting backwards from the end)
print(example_list)
print(example_list[-2])
print(example_list[-4])
print(example_list[-n])

# Examples of indices which will fail/give errors
#print(example_list[n])
#print(example_list[-n-1])
#print(example_list[20*n + 2]) 
#print(example_list[-20*n - 2])
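
The relationship between negative and non-negative indices can be checked directly. The sketch below uses a made-up list called `letters`, and catches the error raised by an out-of-range index so that the cell still runs to completion.

```python
letters = ['a', 'b', 'c', 'd', 'e']
num_letters = len(letters)

# Index -k refers to the same element as index num_letters - k
print(letters[-1] == letters[num_letters - 1])    # True
print(letters[-2] == letters[num_letters - 2])    # True
print(letters[-num_letters] == letters[0])        # True

# An index outside the allowed range raises an IndexError
try:
    letters[num_letters]
except IndexError as error:
    print('IndexError:', error)
```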

Note that nested lists and tuples support nested indexing, but this involves multiple sets of square brackets (for example, see below).

You may be more familiar with indexing of the form `example_list[a,b]` (i.e. only one set of square brackets); this syntax does exist in Python but unfortunately not for lists. In fact, this syntax is used heavily by NumPy arrays (which we will cover later in the course).

In [None]:
exampleNested = [('str', 20, 'str2'), [False, 8, 9.5]]
print(exampleNested[0][1])
print(exampleNested[1][0])
print(exampleNested[1][-1])

### Slicing Tuples and Lists

A range of index values can be specified to extract values from a list using a colon, `:`. This way of accessing data is known as *slicing* and can be performed as follows:

In [None]:
example_list = [1,2,3,4,5]
print(example_list[0:3])

> **Warning:** In some other languages, such as MATLAB, the syntax `0:3` represents the range `[0,1,2,3]`; however, in Python the slice `0:3` selects only the indices `[0,1,2]`. In other words, the syntax `k:n` includes the indices `k, k+1, ..., n-1` but does not include `n` itself! This applies to every sliceable type in Python (lists, tuples, strings, etc.) and should always be remembered when slicing!

When slicing in Python you can leave the start and end values implicit. An omitted start value defaults to the beginning of the list and an omitted end value defaults to the end of the list. For example:


In [None]:
example_list = [1,2,3,4,5]

# The below two lines will give the same
print(example_list[0:3])
print(example_list[:3])

# The below two lines will also give the same
print(example_list[2:5])
print(example_list[2:])

# This will print the whole list
print(example_list[:])

You can also indicate a step size when indexing by using the following syntax:

In [None]:
# Print every second element between the 0th and 5th elements
print(example_list[0:5:2])
print(example_list[::2])

And you can run backwards through the list by using negative integers as the step size. For example:

In [None]:
# Print the list backwards
print(example_list[::-1])
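
The examples above all slice lists, but the same syntax applies to tuples. The short sketch below (with a made-up tuple called `numbers_tuple`) also shows negative slice bounds and what happens when a slice runs past the end of the collection.

```python
numbers_tuple = (1, 2, 3, 4, 5)

# Slicing a tuple returns a tuple
print(numbers_tuple[1:4])         # (2, 3, 4)

# Negative indices can be used as slice bounds
print(numbers_tuple[-2:])         # (4, 5)
print(numbers_tuple[:-1])         # (1, 2, 3, 4)

# Unlike indexing, a slice that runs past the end does not raise an error
print(numbers_tuple[3:100])       # (4, 5)
```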

### Operations on Lists and Tuples

Other operations are also available for lists and tuples. Most notably, the `*` symbol can be used to replicate a list or tuple like so:

In [None]:
example_list = [1,2,3,4]
print(example_list*2)

example_tuple = (1,2,3,4)
print(example_tuple*2)

Other notable list methods include:

 - `insert`: This method adds an element to the list at a specified position.
 - `pop`: This method removes the element at the specified position and returns it.
 - `remove`: This method removes the first item with the specified value from the list.
 - `reverse`: This method reverses the order of the list in place.
 - `sort`: This method sorts the list in place.
 
A few examples of these methods are given below. Try changing and editing the code below to check your understanding! This is by no means a comprehensive list; more information can be found in the [Python documentation](https://docs.python.org/3/tutorial/datastructures.html) or in the [W3Schools documentation](https://www.w3schools.com/python/python_lists.asp).

In [None]:
# Here is an example list
example_list = [1,2,3,4,5]
print(example_list)

# We will now insert the value 3 at index 2
example_list.insert(2,3)
print(example_list)

# We will now remove the item at index 4 (i.e. the fifth item in the list)
example_list.pop(4)
print(example_list)

# We will now remove an element with the value 3 from the list
example_list.remove(3)
print(example_list)

# We will now reverse the list
example_list.reverse()
print(example_list)

# We will now sort the list
example_list.sort()
print(example_list)
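
Two details of these methods are easy to miss: `pop` hands back the element it removes, and `sort` and `reverse` change the list in place rather than returning a new list. The sketch below illustrates this with a made-up list called `scores`, and contrasts `sort` with the built-in `sorted` function.

```python
scores = [3, 1, 2]

# pop removes the element at the given index and also returns it
removed = scores.pop(0)
print(removed)                    # 3
print(scores)                     # [1, 2]

# sort modifies the list in place and returns None
result = scores.sort(reverse=True)
print(result)                     # None
print(scores)                     # [2, 1]

# The built-in sorted function returns a new list and leaves the original alone
print(sorted(scores))             # [1, 2]
print(scores)                     # [2, 1]
```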

Not as many operations are available for tuples as tuples are immutable (not meant to change in value). Two available operations are:

 - `count`: This function returns the number of times a specified value occurs in the tuple.
 - `index`: This function searches the tuple for a specified value and returns the position of where it was found.
 
Examples of these are given below. For more information on tuples please visit the [Python API](https://docs.python.org/3/tutorial/datastructures.html) or the [W3 schools documentation](https://www.w3schools.com/python/python_tuples.asp).

In [None]:
example_tuple = (1,2,3,3,4,5,6)

# Count how many times the value 3 occurs
print(example_tuple.count(3))

# Find the number 2 in the tuple and return its index
# (Remember indexing starts at 0 in Python!)
print(example_tuple.index(2))
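
Membership in a tuple (or a list) can be tested with the `in` keyword. This is useful alongside `index`, which raises an error when the value is not found. A small sketch, using a made-up tuple called `colours`:

```python
colours = ('red', 'green', 'blue')

# Check whether a value is present
print('green' in colours)         # True
print('pink' in colours)          # False

# index raises a ValueError if the value is not present
try:
    colours.index('pink')
except ValueError as error:
    print('ValueError:', error)
```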

## Sets and Dictionarys

In this section, we are going to look in more detail at sets and dictionaries.

 - A set is a collection which is **unordered**, **changeable** and **unindexed**. In Python sets are written with **curly** brackets.
 - A dictionary is a collection of key-value pairs which is **changeable** and **indexed** by key. Since Python 3.7 dictionaries remember the order in which entries were inserted, but entries are accessed by key rather than by position. In Python dictionaries are also written with **curly** brackets but with keys and values.

Like tuples and lists, both sets and dictionaries are built-in Python types. Dictionary values and keys can be of mixed types, although set elements and dictionary keys must be hashable (see below). However, unlike tuples and lists, they do not behave like vectors, as their elements cannot be accessed by position. In other programming languages, such as Java, you may have heard of dictionaries referred to as hash maps (e.g. `HashMap`).

Below are some examples of sets and dictionaries.

In [None]:
example_set = {1,'str',3}
print(example_set)

example_dict = {'A':1, 'B':'str'}
print(example_dict)

# We can also create dictionaries using the dict function if using strings for keys
example_dict = dict(A=1, B='str')
print(example_dict)
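
Because sets and dictionaries share curly brackets, an empty pair of brackets is ambiguous. Python resolves this in favour of the dictionary, so an empty set has to be created with `set()`. The variable names below are made up for illustration.

```python
# Empty curly brackets create a dictionary, not a set
empty_braces = {}
print(type(empty_braces))         # <class 'dict'>

# An empty set must be created with the set() constructor
empty_set = set()
print(type(empty_set))            # <class 'set'>
```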

### Adding elements to and retrieving elements from Dictionaries

Elements are added to dictionaries as `key`-`value` pairs. For example, in the first assignment below the `key` is `'a'` and the `value` is `'b'`. Note that a key does not have to be a string; integers and floats work just as well. In general, any hashable object can be used as a key, which rules out changeable types such as lists. Values can be of any data type.

In [None]:
# Here we assign value 'b' to key 'a'
example_dict['a'] = 'b'

# Here we assign value 2 to key 10
example_dict[10] = 2

# Here we assign value [2,1] to key 1.1
example_dict[1.1] = [2,1]

print(example_dict)

We can then retrieve `value`s from the dictionary using the `key`s we stored them under. This can be done with either square brackets or the dictionary's `get` method. For example:

In [None]:
# Using square brackets
print(example_dict[10])
print(example_dict['a'])

# Using the `get` function
print(example_dict.get(10))
print(example_dict.get('a'))
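
The two approaches differ when the key is not in the dictionary: square brackets raise a `KeyError`, whereas `get` returns `None` (or a default value, if one is given as a second argument). A minimal sketch, using a made-up dictionary called `lookup`:

```python
lookup = {'a': 'b', 10: 2}

# Square brackets raise a KeyError for a missing key
try:
    lookup['missing']
except KeyError as error:
    print('KeyError:', error)

# get returns None instead, or a default value if one is supplied
print(lookup.get('missing'))      # None
print(lookup.get('missing', 0))   # 0
```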

### Removing elements from a dictionary

Elements can be removed from a dictionary using the `pop` method or the `del` statement. For example:

In [None]:
# Example dictionary
example_dict = {'A':1, 'B':'str', 'C':2}
print(example_dict)

# Remove element 'A'
example_dict.pop('A')
print(example_dict)

# Remove element C
del example_dict['C']
print(example_dict)
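
As with lists, `pop` returns the value it removes, which `del` does not. It also accepts an optional second argument that is returned if the key is missing. The dictionary `stock` below is made up for illustration.

```python
stock = {'apples': 3, 'pears': 5}

# pop returns the value that was stored under the removed key
removed_value = stock.pop('apples')
print(removed_value)              # 3
print(stock)                      # {'pears': 5}

# If the key is missing, the second argument is returned instead of an error
print(stock.pop('plums', 0))      # 0
```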

### Operations on Dictionaries

Other operations are also available for dictionaries. Some of the most useful are listed here. Each returns a *view* object rather than a list; a view can be iterated over or converted to a list with `list()`.

 - `keys`: This method returns a view of the dictionary's keys.
 - `values`: This method returns a view of all the values in the dictionary.
 - `items`: This method returns a view containing a tuple for each (key, value) pair.
 
Examples of these functions are given below. For full documentation and details on more functions see the [Python API](https://docs.python.org/3/tutorial/datastructures.html) and [W3 schools Python documentation](https://www.w3schools.com/python/python_dictionaries.asp).

In [None]:
example_dict = {'A':1, 'B':'str', 'C':2}
print(example_dict.keys())
print(example_dict.values())
print(example_dict.items())
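
In Python 3 the objects printed above are views onto the dictionary rather than lists: they update when the dictionary changes and do not support indexing. The sketch below, using a made-up dictionary called `ages`, shows this and how to obtain an ordinary list.

```python
ages = {'A': 1, 'B': 2}
keys_view = ages.keys()
print(keys_view)                  # dict_keys(['A', 'B'])

# The view reflects later changes to the dictionary
ages['C'] = 3
print(keys_view)                  # dict_keys(['A', 'B', 'C'])

# Wrap a view in list() to get a list that supports indexing
keys_list = list(ages.keys())
print(keys_list[0])               # A
```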

### Adding elements to a set

Elements can be added to a set using the `add` function. For example, see the below code. 

Note that a `set` is designed to mimic the idea of a `set` in mathematics and, therefore, `set`s do not allow for duplicate entries. An entry is either in or not in a set; it cannot be "in" a set twice.

Multiple elements can also be added to a set at once using the `update` method.

In [None]:
example_set = {1,'str',3}
print(example_set)

# Add an element
example_set.add(8)
print(example_set)

# Add another element; note this element is already in the set and therefore the
# set is left unchanged
example_set.add(1)
print(example_set)

# Add two elements to the set
example_set.update([10,'blah'])
print(example_set)

However, there is no notion of indexing for Python `set`s as sets are **unordered**. This means that individual elements cannot be accessed in the same manner as in dictionaries, lists and tuples. We can, however, check whether an element is in a set using the `in` keyword.

In [None]:
print(1 in example_set)
print(2 in example_set)

### Removing elements from a set

Elements can be removed from a set using the `remove` method. Note: `remove` will raise a `KeyError` if the requested value is not in the set.

In [None]:
example_set.remove(1)
print(example_set)
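
If it is not known in advance whether a value is in the set, the `discard` method can be used instead; it behaves like `remove` but does nothing when the value is absent. A short sketch, using a made-up set called `small_set`:

```python
small_set = {1, 2, 3}

# remove raises a KeyError when the value is absent
try:
    small_set.remove(10)
except KeyError as error:
    print('KeyError:', error)

# discard silently does nothing when the value is absent
small_set.discard(10)

# discard removes the value when it is present
small_set.discard(2)
print(small_set == {1, 3})        # True
```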

### Operations on Sets

Other operations are also available for sets, many of which are designed to mimic mathematical set operations. For example:

 - `difference`: This method returns a set containing the elements of the first set which are not in the other set(s).
 - `intersection`: This method returns a set containing the elements common to all of the sets involved.
 - `issubset`: This method returns `True` if the first set is contained within the second set (i.e. every element of the first set is also in the second) and `False` otherwise.
 - `issuperset`: This method returns `True` if the first set contains the second set (i.e. every element of the second set is also in the first) and `False` otherwise.
 - `union`: This method returns a set containing the union of the input sets.
 
Examples of these are given below. For full documentation and details on more functions see the [Python API](https://docs.python.org/3/tutorial/datastructures.html) and [W3 schools Python documentation](https://www.w3schools.com/python/python_sets.asp).

In [None]:
print('Sets')
set1 = {1,2,3,4}
set2 = {2,4,5,6}
print(set1)
print(set2)

# Difference
print('\nSet Differences')
print(set1.difference(set2))
print(set2.difference(set1))

# Intersection
print('\nIntersection')
print(set1.intersection(set2))

# Is subset or superset
print('\nSubset/Superset')
print(set1.issubset(set2))
print(set1.issuperset({1,2}))

# Union
print('\nUnion')
print(set1.union(set2))
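
Python also provides operator shorthands for these set methods. The sketch below, using two made-up sets `left` and `right`, checks that each operator gives the same result as the corresponding method.

```python
left = {1, 2, 3, 4}
right = {2, 4, 5, 6}

# - is difference, & is intersection and | is union
print((left - right) == left.difference(right))      # True
print((left & right) == left.intersection(right))    # True
print((left | right) == left.union(right))           # True

# <= tests for a subset and >= tests for a superset
print({1, 2} <= left)                                # True
print(left >= {1, 2})                                # True
```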

## Exercises

**Question 1:** The below code creates a `list` of numbers:

In [None]:
# List of numbers
nums = [5, 2, 8, 2, 9]

Using this list, write code which does the following:
 1. retrieves the third element of the list
 2. slices out the first three elements of the list
 3. replaces the last element with `10`
 4. appends `7` to the end of the list
 5. removes the first occurrence of `2` 
 
Make sure to print the list after each change.

In [None]:
# Write your code here...

**Question 2:** The below code creates a `list`, casts it to a `set` and then casts it back to a `list`.

In [None]:
# Create the list
my_list = [4,2,3]

# Convert it to a set
my_set = set(my_list)

# Convert it back to a list
my_new_list = list(my_set)

print(my_list==my_new_list)

The new list doesn't equal the original list. Why do you think this might be?

*Hint: Try printing `my_list` and `my_new_list`.*

**Question 3:** A string can be converted to a set using the `set()` constructor. Use this fact to identify which letter appears in `string2` but not in `string1` below:

In [None]:
# Here are two strings
string1 = "This is a long random sentence - I wonder which letters it contains and which letters it doesn't."
string2 = "This task makes a clear and coherent string with words taken inside a small set"

*Hint: You might want to recap the `difference` method from this notebook and the `lower()` method from the [beginner notebook](./week_01_intro_to_data_types_beginner.ipynb).*

**Question 4:** The below code creates a tuple named `t`.

In [None]:
t = ('a', 'b', 'c', 'd', 'e')

Write some code which does the following:

 1. access the first and last elements of `t` 
 2. slice out `('b', 'c', 'd')` 
 3. check whether `'c'` is in `t`
 4. find the index of `'d'` in `t`
 5. count how many times `'a'` appears in `t`

In [None]:
# Write your code here...

**Question 5:** You are given the below dataset for `5` individuals in a clinical trial. 


| Name     | Weight (kg) | Height (cm) |
|----------|-------------|-------------|
| John     | 82          | 178         |
| Alice    | 68          | 165         |
| Maria    | 74          | 170         |
| David    | 90          | 185         |
| Sarah    | 60          | 160         |

Using the collection types introduced in this notebook (lists, tuples, sets, dictionaries - no loops or user-defined functions), choose an appropriate way to represent the clinical-trial table. Write your code in the box below.

In [None]:
my_data = None  # Replace None with your code...