##  Lecture 3

### Learning objectives
- Learn about collections of variables: The `list()`, `tuple()` and `set()` data structures.
- Learn about _objects_.
- Learn about _methods_ which allow you to do things to (and with) _objects_.

## Data Structures

Now that we know about variables, it would be handy to group them together in some way.  In Python there are many ways to do this: **lists**, **tuples**, **sets** and **dictionaries** (among others). These group arbitrary variables and/or values (e.g., strings, integers and floats) together in containers.

We'll go through some of the various data structures, starting with something called `list`.

### Lists (`list`)

- Lists are denoted with `[ ]`  and can contain any arbitrary set of elements (entries can have different types), including other lists!

- Lists preserve the original order of the elements (i.e. the order they are inserted).

- The length of a list can change dynamically (grow or shrink).

- Elements in the list are referenced by an index number (**index**).  Similar to the strings we encountered in the last lecture, the indices begin at 0.  This may differ from what you are used to, where the first element would be index 1; in Python, the first element is **index** 0.

- You can  count from the end to the beginning by starting with -1 (the last item in the list), -2 (second to last), etc. 

- Lists have _methods_ that allow items to  be sorted, deleted, inserted, sliced, counted, concatenated, replaced, added on to, etc.


Let's take a look at some examples: 

In [1]:
my_list = ['a', 2.0, '400', 'spam', 42, [24, 2]] # defines a list - note the square brackets
print(my_list) # prints the list


['a', 2.0, '400', 'spam', 42, [24, 2]]


Notice that in the above the elements (entries) in the list have different types (strings, a float, an integer and even another list).

But if we want to print out the third **element** in the list we use **index** number 2: 

In [2]:
print(my_list[2]) # print the third element in the list (starting from zero)

400


And similarly, to print the last **element** we use **index** number -1:

In [3]:
print(my_list[-1]) # print the last element

[24, 2]


But if you want to print, say, the last three **elements**, this does not do what you might expect: 

In [4]:
print(my_list[-3:-1])

['spam', 42]


This is because the slice `list[begin:end]` means from `index = begin` up to and _not including_ `index = end`.  To actually slice out the last three **elements**, you can do it this way:

In [5]:
print(my_list[-3:])

['spam', 42, [24, 2]]
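
The slicing rule can be checked on a smaller example. The sketch below uses a made-up list called `letters` (it is not one of the lecture's variables) and shows what happens when `begin` or `end` is left out:

```python
# 'letters' is a hypothetical list used only for illustration
letters = ['a', 'b', 'c', 'd', 'e']

first_two = letters[:2]       # indices 0 and 1; index 2 is not included
middle = letters[1:4]         # indices 1, 2 and 3
last_three = letters[-3:]     # from the third-to-last element to the end
all_but_last = letters[:-1]   # everything up to, but not including, the last element

print(first_two)
print(middle)
print(last_three)
print(all_but_last)
```

Leaving out `begin` means "start at the beginning" and leaving out `end` means "go all the way to the end", which is why `[-3:]` includes the last element while `[-3:-1]` does not.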


Unlike strings, you can change list **elements** "in place".  By "in place" we mean that you don't have to assign the result to a new variable name; the list itself gets changed:

In [6]:
my_list[1] = 26.3   # replaces the second element
print(my_list)

['a', 26.3, '400', 'spam', 42, [24, 2]]


To delete, for example, the 4th **element** from a list, you can use the command `del`:

In [7]:
del my_list[3] # deletes the fourth element 
my_list

['a', 26.3, '400', 42, [24, 2]]

Like strings, you can  slice out a chunk of the middle of a list and assign it to another variable:

In [8]:
new_list = my_list[1:3] # takes the 2nd and 3rd values and puts them in new_list
# note: the slice goes up to, but does not include, the end index
print(new_list)

[26.3, '400']


You can determine the number of entries in a list by using the built-in function `len()`:

In [9]:
print(len(new_list))

2


Making copies of lists behaves in ways you might not expect if you are coming from other programming languages.  We will learn more about copies of lists in later lectures, but here are some pro tips for now.  

You can assign a list to another variable name like this: 

In [10]:
my_copy = my_list


The variable `my_copy` looks like a _copy_ of `my_list`, but it is not really a copy at all: it is a second name for the very same list object.  Both names are bound to one object, so if I change the list through one name, I see the change through the other.

In [11]:
my_list[2] = 'new'
print(my_list)
print(my_copy)

['a', 26.3, 'new', 42, [24, 2]]
['a', 26.3, 'new', 42, [24, 2]]


See how `my_copy` was changed when we changed `my_list`?

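To spawn a list that is an independent object, you can slice out the whole list.  (Strictly speaking this is a _shallow_ copy: the new list is separate, but nested objects such as the inner list `[24, 2]` are still shared.  A _deep_ copy duplicates the nested objects as well.)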

In [12]:
my_copy = my_list[:]
# now try changing mylist... 
my_list[2] = 1003
print(my_copy) # if there are two things to print, use 'print'
my_list # otherwise only prints the last one

['a', 26.3, 'new', 42, [24, 2]]


['a', 26.3, 1003, 42, [24, 2]]

See how `my_copy` stayed the way it was, even as `my_list` changed?  

There are more ways to make shallow and deep copies of Python objects which we will explore further in a later lecture.
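
As a preview, the sketch below puts three cases side by side: plain assignment, a slice copy, and `copy.deepcopy()` from the standard library's `copy` module. The variable names are made up for illustration.

```python
import copy

# 'original' is a hypothetical list containing a nested list
original = ['a', 26.3, [24, 2]]

alias = original                 # same object, just a second name
shallow = original[:]            # new outer list; the nested list is still shared
deep = copy.deepcopy(original)   # new outer list and a new nested list

original[0] = 'changed'          # replace a top-level element
original[2].append(99)           # modify the nested list in place

print(alias)     # follows every change
print(shallow)   # keeps 'a', but sees the change to the nested list
print(deep)      # unaffected by both changes
```

For a flat list of numbers and strings, the slice copy is usually all you need; the difference only shows up when the list contains other mutable objects.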

#### Objects in Python

An object in Python is a bundle of data that has _attributes_ and _methods_. The list variable `my_list` is an example of an object - a `list` object.  In fact, objects are instances of **classes**, which we will learn more about in later lectures, but I just wanted to mention them here.    

So what is so special about _objects?_  Python objects have _methods_ which allow you to do things to the object.  Methods have the form:
```python
object.method() # no arguments
```
or 
```python
object.method(argument1) # one argument
```
or 
```python
object.method(argument1, argument2, ...)
```

Here, `argument1` is something that can get passed into the method.  

Let's look at a few examples starting with the `.append()` method for lists (which appends something (the `argument1`) to the end of a list). 

In [13]:
my_list.append('why not?') 
# append() is a method of lists that appends the argument 'why not?' to the list
print(my_list)

['a', 26.3, 1003, 42, [24, 2], 'why not?']


`.count()` is another method.  Let's see what it does: 

In [14]:
my_list.append('why not?') 
print(my_list.count('why not?'))

2


The method `.count()`  returns the number of times the argument (here the string `'why not?'`) occurs in the list.  

Another very handy list method is the `.index()` method.  It returns the **index** of the first occurrence of the desired argument in the list.  For example, if we wanted to know what the index of the element, 42, is in `my_list`, we would type: 

In [15]:
print(my_list.index(42))

3


These are just a few of the methods for lists. To view all of the methods for a list object, see:
https://docs.python.org/tutorial/datastructures.html.
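
A few more of these methods are sketched below on a made-up list called `shopping` (not part of the lecture's examples). Each call follows the `object.method(argument)` pattern described above.

```python
# 'shopping' is a hypothetical list used only for illustration
shopping = ['spam', 'eggs', 'bacon']

shopping.insert(1, 'toast')   # insert 'toast' at index 1
shopping.remove('bacon')      # remove the first occurrence of 'bacon'
last_item = shopping.pop()    # remove the last element and return it
shopping.sort()               # sort the list in place (alphabetically here)
shopping.reverse()            # reverse the order in place

print(last_item)
print(shopping)
```

Note that `.sort()` needs elements that can be compared with each other; calling it on a list that mixes strings and numbers raises a `TypeError`.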

### Creating lists

For example, we could make this list: 

In [16]:
santas_list = ['naughty', 'nice']
print(santas_list)
# and check it twice
print(santas_list)

['naughty', 'nice']
['naughty', 'nice']


#### More useful lists:  

Earlier, we made a list by defining a variable with square brackets. We added to that list with `.append()`.  Another way to generate a list is to use `range()`, which is one of the Python built-in functions we mentioned in Lecture 2. The function `range()` returns a `range` object: a sequence of integers between two numbers,  the `start` and `end`, where each number is separated by a specified `interval`. The integers are produced on demand rather than stored, so to get an actual list you convert the `range` object like this:
```python
list(range(start, end, interval))
```
Note that `range()`, like `list` slicing, goes up to but does not include the value specified by `end`.  

In [17]:
# creates a list starting from 2, up to 20 (not including 20!) in increments / intervals of 4
numlist = list(range(2, 20, 4)) 
print(numlist)


[2, 6, 10, 14, 18]
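
The `start` and `interval` arguments are optional, and the interval may be negative. A short sketch of these variations (the variable names are made up for illustration):

```python
# only 'end' given: start defaults to 0 and the interval to 1
zero_to_four = list(range(5))

# 'start' and 'end' given: the interval defaults to 1
two_to_five = list(range(2, 6))

# a negative interval counts down; 'end' is still excluded
countdown = list(range(10, 0, -3))

print(zero_to_four)
print(two_to_five)
print(countdown)
```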


You can also perform operations on a `list`. For example, you can "add" two lists together using the "+" operator. This works similarly to strings, where "add" actually concatenates two lists together.

In [18]:
var_la = [1, 2, 3]
var_lb = [4, 5, 6]
var_lc = var_la + var_lb
print(var_lc)

[1, 2, 3, 4, 5, 6]


### Tuples (`tuple`)

**Tuples** are another important object in Python that are similar to lists, but have important differences.  They are denoted by parentheses `( )` and  consist of  values separated by commas. Like lists, they can contain different elements (each entry may assume a different type) and they also preserve the order of the elements. 

However, unlike lists, the elements of a tuple cannot be changed in place. The length of the tuple also cannot change dynamically (grow or shrink) - its length is fixed once it is created. Tuples are thus called _immutable_. Their primary use is to pass information into and out of programs as we shall see in the coming lectures.  


As with lists and strings, you can slice and concatenate tuples. For more see: 
 
 http://docs.python.org/tutorial/datastructures.html#tuples-and-sequences

Here are three equivalent ways to create a tuple: 

The most explicit is to create a tuple like this, using a list as the argument:

In [19]:
t = tuple([1234, 2.0, 'hello'])
print(t)

(1234, 2.0, 'hello')


The second approach is elegant - just use the round brackets (cf. list which uses square brackets):

In [20]:
t = (1234, 2.0, 'hello')
print(t)

(1234, 2.0, 'hello')


Lastly, one can simply do:

In [21]:
t = 1234, 2.0, 'hello'
print(t)


(1234, 2.0, 'hello')


You can retrieve the contents of a tuple (unpack it) like this:

In [22]:
a, b, c = t
print(a)
print(b)
print(c)

1234
2.0
hello
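
Unpacking is especially handy with functions that hand back more than one value as a tuple. The sketch below uses the built-in `divmod()`, which returns a (quotient, remainder) tuple, and also shows the one case where the parentheses alone are not enough to make a tuple. The variable names are made up for illustration.

```python
result = divmod(17, 5)         # built-in function that returns a tuple
quotient, remainder = result   # unpack the two entries into two variables
print(result)
print(quotient, remainder)

single = (5,)       # the trailing comma makes this a one-element tuple
not_a_tuple = (5)   # without the comma this is just the integer 5
print(type(single))
print(type(not_a_tuple))
```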


You can access an element in a tuple by using the index number, exactly like a list.

In [23]:
t[0]

1234

But, you can't change it: 

In [24]:
t[0] = 'haha'
print(t)

TypeError: 'tuple' object does not support item assignment

Recall that you cannot change the length of a `tuple` - there is no `append()` method. That said, you can "add" two tuples together with the "+" operator, which creates a new tuple:

In [25]:
var_ta = (10, 11, 12)
var_tb = (20, 21, 22)
var_tc = var_ta + var_tb
print(var_tc)

(10, 11, 12, 20, 21, 22)


### Sets (`set`)

There are more data structures that come in handy, and one of them is the `set`. Sets are denoted with curly braces `{ }`.  A `set` "contains an unordered collection of unique and immutable objects". Translation - (i) entries in a set do not preserve their original order - hence "unordered"; (ii) entries within a `set` are unique - no duplicates will appear; (iii) the entries themselves must be immutable objects (e.g., numbers, strings or tuples, but not lists), although they may be of different types; (iv) you cannot change (mutate) an entry in place. However, like `list`, you can insert new entries into a `set`, and you can remove them.

You can create sets in several ways. The first would be to use the Python built-in `set()` function on a list: 

In [26]:
S1 = set(['spam', 'ocelot', 42])
S1

{42, 'ocelot', 'spam'}

The elegant alternative is to create a set just using curly braces:

In [27]:
S1 = {'spam', 'ocelot', 42}
S1

{42, 'ocelot', 'spam'}

Notice how the order changed: compare the order of the values in our input to `set()` with the order of the values printed for the set.

Also, notice what happens if we violate the "unique" part of the definition:

In [28]:
S2 = set(['spam', 'ocelot', 'ocelot'])
S2

{'ocelot', 'spam'}

Only one of the entries with the value `'ocelot'` made it into the set - duplicates are discarded. [By the way, "ocelot" is another Monty Python joke - look it up if you like.] 
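
This behavior makes sets a convenient tool for removing duplicates from a list and for checking whether a value is present. A minimal sketch, using a made-up list called `words`:

```python
# 'words' is a hypothetical list with repeated entries
words = ['spam', 'ocelot', 'spam', 'eggs', 'ocelot']

unique_words = set(words)              # duplicates are discarded
print(len(words), len(unique_words))   # compare the number of entries

has_spam = 'spam' in unique_words      # membership test with 'in'
print(has_spam)

sorted_unique = sorted(unique_words)   # sorted() returns an ordered list
print(sorted_unique)
```

Because a set has no order, the built-in `sorted()` is used here to get the unique values back as a list in a predictable order.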

Sets contain immutable objects, but they themselves can be changed. 
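
To see both halves of that statement, the sketch below tries to put a (mutable) list into a set, which fails, and then changes the set itself. The `try` / `except` construct is used here only to catch the error so that the rest of the example keeps running; the names are made up for illustration.

```python
# integers, strings and tuples are immutable, so they are allowed in a set
S = {1, 'two', (3, 4)}

try:
    S.add([5, 6])            # a list is mutable, so this raises a TypeError
except TypeError as err:
    message = str(err)
    print(message)

S.add(5)        # the set itself can grow ...
S.discard(1)    # ... and shrink
print(S)
```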


Below are a few useful methods the `set` class supports:

In [29]:
# "add" Add a single new entry to the set
print(S1) 
S1.add('chocolate')
print(S1)

{'spam', 42, 'ocelot'}
{'spam', 42, 'chocolate', 'ocelot'}


Note that if you try to `.add()` an entry already defined in the set, the set will not change (entries in a `set` are always unique).

In [30]:
# "clear" Remove all entries from the set 
S2.clear() 
S2

set()

See how `S2` is now just an empty set object `set()`.

Now let's try copying `S1`. 

In [31]:
# copy
S2 = S1.copy()
print(S2)
S1.clear()
print(S1)
print(S2)

{'spam', 42, 'chocolate', 'ocelot'}
set()
{'spam', 42, 'chocolate', 'ocelot'}


The `.copy()` method for sets does not behave like the plain assignment `my_copy = my_list` we used for lists - it made an independent object `S2` which did not clear when `S1` got cleared.  

`.difference()` is another handy method - it can be used to find what is different about two sets. 

In [32]:
# difference
S1 = set(['spam', 'ocelot', 42])
S2 = set(['spam', 'ocelot'])
S1.difference(S2)

{42}

Suppose you want to know what values are the same (i.e. in common) between two sets.  For that, you can use the `.intersection()` method.  

In [33]:
# intersection
S1.intersection(S2)

{'ocelot', 'spam'}

There is a nice short cut for this:

In [34]:
S1&S2

{'ocelot', 'spam'}

You might be tempted to try "adding" two sets together - you cannot:

In [35]:
var_sa = {22, 21, 20}
var_sb = {20, -11, 12}
print(var_sb)
var_sc = var_sa + var_sb
print(var_sc)

{12, 20, -11}


TypeError: unsupported operand type(s) for +: 'set' and 'set'

Okay - "+" is not allowed with `set`, but you can combine two sets this way:

In [36]:
var_sc = set(list(var_sa) + list(var_sb))
print(var_sc)


{12, 20, -11, 21, 22}


You should think about why this actually works.

The "proper" way to achieve the same objective would be to use the `set` method `union()`:

In [37]:
var_sc = var_sa.union(var_sb)
print(var_sc)

{20, 21, 22, -11, 12}


Note that the result here contains the same entries as our "convert to list then add" version, but the ordering is different - there is no order to the entries in a `set`. Besides the behavior of `set`, you should walk away with the take-home message that there is usually more than one way to achieve a desired objective in Python.
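
Just as `&` is a shortcut for `.intersection()`, the other set methods used above have operator shortcuts too. A brief sketch, using two made-up sets `A` and `B`:

```python
# 'A' and 'B' are hypothetical sets used only for illustration
A = {'spam', 'ocelot', 42}
B = {'spam', 'ocelot', 'eggs'}

union = A | B        # same as A.union(B): entries in either set
difference = A - B   # same as A.difference(B): entries in A but not in B
symmetric = A ^ B    # same as A.symmetric_difference(B): entries in exactly one set

print(union)
print(difference)
print(symmetric)
```

The printed order of the entries may vary, since sets are unordered.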

### References

1. The `list` class: https://docs.python.org/3/tutorial/datastructures.html
2. The `tuple` class: http://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences
3. The `set` class: https://docs.python.org/3/tutorial/datastructures.html#sets
4. Methods for `set`: https://www.python-course.eu/python3_sets_frozensets.php