![ContributION - An introduction to Python and Data Science](contribution.png)

# Collection data types

So far, the data types we've looked at were simple: each contained a single value.  More complex data types also exist.  They let you remember (store) several values as part of a single variable.  The trade-off is that you need to do a little extra work to get the exact value you're after.

## List (array)
A **list** remembers an ordered sequence of values (which may be of different types).

You can create a list like this:

In [None]:
a = [1, 2, 4, 8, 16]

In [None]:
print(a)

Lists can also contain different types of values.

In [None]:
b = [1, 0.2, True, 'Hello']
print(b)

#### Make your own list of things about yourself, like your name, surname, age, favorite movie and favorite food.

In [None]:
name = 'Grant'
print(name)
me = [name, 'Stead', 28, '5th Element', 'Roast lamb']
print(me)


### Indexing

To get to one of those individual values, you need to tell the list which one you're after.  We do this by referring to its index.  Indexing is zero-based, meaning the first one is 0, the second one is 1, etc.

In [None]:
me[4]

You can also **start at the end** by counting down.  -1 is the last one, -2 is the second last one, etc.

In [None]:
me[-1]

You can only refer to valid indexes.  If you go outside of these boundaries, you get an error.

In [None]:
a[-100]


You can assign values to individual places in the list (provided they already exist).

In [None]:
a[0] = 100
print(a)

The index can also be a variable, provided the variable contains a valid index.

In [None]:
name_index = 0
me[name_index] = 'Bruce'
print(me)

Assigning to an invalid index will cause an error.

In [None]:
a[100] = 42

So, how do you know how big the list is?

### Lists are smarter than you think.  
Lists aren't just variables to remember a list of numbers.  A list variable also has functions/methods you can use.

(By the way, plain variables are also smarter)

**len** determines the length of a list.

In [None]:
len(a)
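
As a small sketch of how **len** relates to valid indexes (the list `numbers` and the other names are made up for this illustration): the last valid index is always one less than the length, and using the length itself as an index should already raise an error.

```python
# Hypothetical list, made up for this illustration.
numbers = [1, 2, 4, 8, 16]
size = len(numbers)            # 5 values in the list

first = numbers[0]             # the first valid index is 0
last = numbers[size - 1]       # the last valid index is size - 1 (same as -1)

# An index equal to the length is already out of range.
try:
    numbers[size]
except IndexError as error:
    print('Invalid index:', error)

print(first, last)
```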

What does **pop** do?

In [None]:
a.pop(0)

In [None]:
print(a)

In [None]:
a.pop(0)

In [None]:
print(a)

Can you deduce what **pop** does?

There is an easy way to find out more about a variable and the methods it has...

In [None]:
help(a)

Use the help above to determine what the following functions/methods do.  You can typically go back and re-run a *cell* by clicking on it, holding Shift down and pressing Enter.

In [None]:
me.insert(4, 'Rock music')

In [None]:
me

In [None]:
me.remove('Rock music')
me

In [None]:
me.reverse()
print(me)

In [None]:
c = [1, 4, 7, 9, 2, 6, 4, 3, 4]
c.sort()
c.reverse()
print(c)

In [None]:
a.append(16)
print(a)
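
Here is a rough summary of the methods used above, on a made-up list called `shopping` (all names here are just for illustration).  One thing worth noticing: **pop** hands back the value it removed, while **sort** changes the list in place and returns `None`.

```python
# Hypothetical list, made up for this illustration.
shopping = ['milk', 'bread', 'eggs']

shopping.append('tea')        # add a value at the end
shopping.insert(1, 'jam')     # put 'jam' at index 1, shifting the rest right
removed = shopping.pop(0)     # remove the value at index 0 and return it
shopping.remove('eggs')       # remove the first value equal to 'eggs'

result = shopping.sort()      # sorts the list itself; returns None

print(removed)                # milk
print(shopping)               # ['bread', 'jam', 'tea']
print(result)                 # None
```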

#### What else can you do with these lists?  Make your own and play with it.

#### How many values are there in this list?

In [None]:
how_long = [1, 2, 3, 4, 5, 6]

### Lists within lists
It is possible to have a list within a list.  One of the elements in the (outer) list is not a single value, but rather a list itself, which can in turn contain many values.

How can you make a list containing a list?  Just put it in.

In [None]:
favorite_movies = ['Die hard', 'Logan', '5th Element']
me = ['Grant', 'Stead', 44, favorite_movies]
print(me)

The **classes** list contains 2 subjects for every day of the week (5 of them).

In [None]:
classes = [['SVT', 'Math'], ['Sport', 'English'], ['French', 'Physics'], ['Chemistry', 'French'], ['Math', 'Geography']]
len(classes)

So, how do you access a value in a list of a list?

You start from the outer list.  For **classes** it is the days of the week.  Let's look at Wednesday (remember, lists are zero-based).

In [None]:
wednesday = classes[2]
print(wednesday)

On Wednesday, we have 2 classes.  The first one is French.

In [None]:
wednesday[0]

You don't have to create a variable for the inner list; you can do it in one go.  The following 2 statements are equivalent and show Friday's second class (remember... zero-based).

In [None]:
print( (classes[4])[1] )
print( classes[4][1] )
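
The same two-step indexing also works for **len** and for assignment.  A minimal sketch on a made-up `timetable` (not the **classes** list above):

```python
# Hypothetical two-day timetable, made up for this illustration.
timetable = [['Art', 'Music'], ['History', 'Biology']]

second_day = timetable[1]            # the inner list ['History', 'Biology']
last_class = timetable[1][1]         # 'Biology'
classes_per_day = len(timetable[0])  # 2 values in the first inner list

timetable[0][1] = 'Drama'            # change one value inside an inner list
print(timetable)                     # [['Art', 'Drama'], ['History', 'Biology']]
```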

What is the first class of the week?

## Mutable vs. immutable
Mutable means a value can be changed in place.  Immutable means it cannot be changed (even though it might look like it can).  The difference shows up when 2 variables point to the same place in memory.  If the value there is mutable and you change it through one variable, the other variable sees the change too.  If the value is immutable, "changing" a variable really makes it point to a new value somewhere else, so the other variable is unaffected.  A list is mutable, and a list variable just points to the list in memory.

In [None]:
a = [1, 2, 3]

In [None]:
print(a)

In [None]:
b = a

When we say **b = a**, we're making them point to the same place in memory.

In [None]:
print(b)

In [None]:
a[1] = 100

In [None]:
print(a)

In [None]:
print(b)
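
If you want to check whether two variables really point to the same list, Python's `is` operator can tell you.  A small sketch, with made-up names `original` and `alias`:

```python
# Hypothetical names, made up for this illustration.
original = [1, 2, 3]
alias = original              # a second name for the same list, not a copy

print(alias is original)      # True: both names point to one list

alias.append(4)               # change the list through one name...
print(original)               # [1, 2, 3, 4]  ...and the other name sees it
```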

The **copy** method creates a copy in memory and returns a pointer to the new copy, hence e does not point to the same place in memory as a, so changing one doesn't affect the other.

In [None]:
e = a.copy()
print(e)

In [None]:
a[0] = 200
print(a)

In [None]:
print(e)

#### What happened here?

In [None]:
c = 1

In [None]:
print(c)

In [None]:
d = c

In [None]:
print(d)

In [None]:
c = 2

In [None]:
print(c)

In [None]:
print(d)

Why did **c** change but **d** didn't?  

When we assigned **d = c**, both variables pointed to the same value, 1.  Because *int*egers are immutable, **c = 2** could not change that value.  Instead, it made **c** point to a new value, 2, and left **d** pointing to 1.
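
The difference between changing a value in place and pointing a variable somewhere new can be sketched like this (the names are made up for this illustration):

```python
# Hypothetical names, made up for this illustration.
count = 1
other = count                    # both names refer to the same integer
same_before = other is count     # True

count = count + 1                # points count at a new integer, 2
print(count, other)              # 2 1

numbers = [1, 2, 3]
same_list = numbers
numbers[0] = 100                 # changes the list itself, in place
print(same_list)                 # [100, 2, 3]
```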

## Slicing

Just like you can refer to a single value in a list using **indexing**, you can also refer to a subset of the list using **slicing**. **Slicing** also uses square brackets, but gives a start index (included) and an end index (excluded).

In [None]:
a = [10, 20, 30, 40, 50, 60, 70, 80, 90]

In [None]:
a[3:6]

The example above starts at index 3 (the 4th one in our list), and goes to just **before** index 6 (the 7th one in the list).

Looking at index 2 (3rd one) in our sliced list gives 60.

In [None]:
a[3:6][2]

#### How can you get the last 3 values in the list?

In [None]:
a[6:9]

#### What if the list was longer, can you still use it?  Try leaving the number blank...

In [None]:
a[6:]

In [None]:
a[-3:]

In [None]:
a[len(a)-3:len(a)]

**Slicing** can also take every **nth** value by specifying a 3rd value in the brackets.

In [None]:
a = [10, 20, 30, 40, 50, 60, 70, 80, 90]

In [None]:
a[0:5:2]

It basically builds a new list from these values:

In [None]:
[a[0], a[2], a[4]]
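
The start, end and step can each be left out, in which case Python uses the whole list and a step of 1.  A quick sketch on a made-up list of letters:

```python
# Hypothetical list, made up for this illustration.
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

every_second = letters[::2]     # start and end left out: indexes 0, 2, 4, 6
odd_positions = letters[1::2]   # start at index 1: indexes 1, 3, 5
every_third = letters[2:6:3]    # indexes 2 and 5 (6 is excluded)

print(every_second)             # ['a', 'c', 'e', 'g']
print(odd_positions)            # ['b', 'd', 'f']
print(every_third)              # ['c', 'f']
```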

#### If you slice a list, do you get the same list or a new one?

In [None]:
a = [1, 2, 3, 4, 5]
b = a[:]
print(a)
print(b)
a[0] = 100
print(a)
print(b)

While assigning a list to a second variable makes both variables point to the same list, **slicing** gives a new list.

In [None]:
a = [1, 2, 3, 4, 5, 6]

In [None]:
b = a[2:4]

In [None]:
print(b)

In [None]:
b[0] = 10

In [None]:
print(b)

In [None]:
print(a)

**a** didn't change as **b** doesn't **point** to the same place anymore.

#### What about **indexing**?  Does it give a new value or not?  What about lists of lists?

In [None]:
classes = [['SVT', 'Math'], ['Sport', 'English'], ['French', 'Physics'], ['Chemistry', 'French'], ['Math', 'Geography']]
len(classes)
wednesday = classes[2]
print(classes)
print(wednesday)
wednesday[0] = 'English'
print(wednesday)
print(classes)
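
One catch with lists of lists: slicing (and **copy**) only copies the outer list, so the inner lists are still shared.  The sketch below uses the standard library's `copy.deepcopy` to copy the inner lists as well; the `week` list is made up for this illustration.

```python
import copy

# Hypothetical two-day list, made up for this illustration.
week = [['SVT', 'Math'], ['Sport', 'English']]

shallow = week[:]               # new outer list, but the same inner lists
deep = copy.deepcopy(week)      # new outer list and new inner lists

week[0][0] = 'Art'              # change a value inside an inner list

print(shallow[0][0])            # Art  (the inner list is shared)
print(deep[0][0])               # SVT  (the deep copy is independent)
```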

#### How would you reverse a list?

In [None]:
a.reverse()
print(a)

#### Is there another way of reversing a?
(Hint: Use step -1 in slicing)

In [None]:
a = [1,2,3,4,5,6,7]
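
One possible answer, sketched on a made-up list called `digits`: a slice with step -1 walks the list backwards and gives a new list, leaving the original untouched (unlike **reverse**, which changes the list in place).

```python
# Hypothetical list, made up for this illustration.
digits = [1, 2, 3, 4, 5, 6, 7]

backwards = digits[::-1]        # new list, read from the end to the start

print(backwards)                # [7, 6, 5, 4, 3, 2, 1]
print(digits)                   # [1, 2, 3, 4, 5, 6, 7]  (unchanged)
```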

# Sets

A set is similar to a list, except it isn't ordered.  You can't refer to the items by index, and hence you can't slice it.

In [None]:
s = {1, 2, 3, 4, 5}

In [None]:
s[0]

In [None]:
s[0:2]

Sets also can't contain duplicate values.

In [None]:
l = [1,1,1,1,2,3,4,5]
print(l)

s = set(l)
print(s)

### So, what are sets good for?
Sets allow you to do things like unions between two different sets.

In [None]:
ones = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19}
twos = {2, 4, 6, 8, 10, 12, 14, 16, 18}
threes = {3, 6, 9, 12, 15, 18}

In [None]:
twos.union(threes)

Note that values are not duplicated.

In [None]:
threes.intersection(twos)

In [None]:
twos.issubset(threes)
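
Python also has operators that do the same job as these methods.  A short sketch with two made-up sets (`sorted` is only used here so the values print in a predictable order):

```python
# Hypothetical sets, made up for this illustration.
evens = {2, 4, 6, 8, 10, 12}
multiples_of_three = {3, 6, 9, 12}

either = evens | multiples_of_three       # union
both = evens & multiples_of_three         # intersection
only_even = evens - multiples_of_three    # difference
exactly_one = evens ^ multiples_of_three  # in one set but not in both
is_subset = both <= evens                 # same idea as both.issubset(evens)

print(sorted(either))        # [2, 3, 4, 6, 8, 9, 10, 12]
print(sorted(both))          # [6, 12]
print(sorted(only_even))     # [2, 4, 8, 10]
print(sorted(exactly_one))   # [2, 3, 4, 8, 9, 10]
print(is_subset)             # True
```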

#### What else can you do with sets?  Try something yourself.

In [None]:
help(twos)

#### Are sets mutable or immutable?
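
One way to test it, as a rough sketch with made-up names: change a set through one variable and see whether a second variable notices.  The built-in `frozenset` is shown for comparison, since it has no methods for adding values.

```python
# Hypothetical names, made up for this illustration.
colours = {'red', 'green'}
same_set = colours              # a second name for the same set

colours.add('blue')             # add a value in place
colours.discard('red')          # remove a value in place

print(same_set == {'green', 'blue'})   # True: the other name sees the changes

frozen = frozenset(colours)     # a set that cannot be changed
try:
    frozen.add('pink')
except AttributeError as error:
    print('Cannot change a frozenset:', error)
```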

# Tuples
Tuples are like lists, except they are immutable: once a tuple is created, its contents cannot be changed.  Lists are more commonly used.

In [None]:
t = (1, 2, 3, 4, 5)

In [None]:
t[2]

In [None]:
t[2] = 100

In [None]:
help(tuple)
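
Since a tuple can't be changed, the usual approach is to build a new one.  A minimal sketch with a made-up tuple called `point`:

```python
# Hypothetical tuple, made up for this illustration.
point = (3, 4)

try:
    point[0] = 10               # tuples do not allow assignment by index
except TypeError as error:
    print('Cannot change a tuple:', error)

x, y = point                    # unpack the two values into variables
moved = (x + 1, y)              # build a new tuple instead

print(point)                    # (3, 4)
print(moved)                    # (4, 4)
```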