# Data Structures

This unit will focus on the various containers used for data, both those built into Python and others we will import from various packages:

* **Lists**: an ordered sequence of objects, indexed by integers.
* **Tuples**: a less flexible version of the list, which is *immutable* (i.e.: you cannot change it once you've created it).
* **Dictionaries**: a more flexible version of the list, where we can define our own index (called 'keys').
* **Pandas Series**: a hybrid of a list and a dictionary, a one-dimensional NumPy array where we can specify the index.
* **Pandas Dataframe**: built on the Series object, Dataframes are multi-column indexed collections of data.

Most of the data work we'll do involves dataframes, but understanding each of these containers is important.  We will often use lists and dictionaries in the creation of dataframes.

## Lists

A *list* is an ordered sequence of data enclosed in square brackets, with each data point separated by a comma.  For example:

In [None]:
l = [2,3,5,7]

Lists can contain multiple data types as entries:

In [None]:
m = [10,'ES',-1.234,True]

and can even have other lists as entries:

In [None]:
J = [21,5,['apples',12,False]]

You can create an empty list by assigning empty brackets ```[]``` to a variable name:

In [None]:
empty_list = []
# Calling it will produce the empty list
empty_list

A list is an ordered sequence, and Python will remember the order the elements appear with an assigned integer index, similar to characters in a string.  We can access individual list elements by index with the same syntax as with strings, with the format ```listname[index_number]``` (and once again, we have to remember that counting starts at 0).  For example:

In [None]:
L = ['a',17,'Q']
L[0]

We can incorporate lists into previous control flow techniques we've learned.  Suppose we wanted to iterate over the list indices, and print the list elements if they are strings.  We could run this code:

In [None]:
for x in range(len(L)): # like with strings, len(list) will output the length of the list
    if type(L[x])==str:
        print(L[x])

We can also iterate over the *list itself*, instead of worrying about indices.  Rewriting the previous example:

In [None]:
for x in L:
    if type(x)==str:
        print(x)

Iterating in this way lets us have more compact code.<br>
Consider the following list.  How would we add the first and third numbers from the list together, and store the answer in a variable?

In [None]:
l = [2,3,5,7]
x = 

In the case above, where a list is an element of another list, calling the index where the inner list sits will return that inner list:

In [None]:
J

In [None]:
J[2]

How would we access the second element of the inner list (which is itself the third element of the list `J`)?

This allows us to effectively create *matrices* with lists (we'll  be speaking more about matrices when we get to the unit on statistics and linear algebra).  We create a list of lists, where the internal lists represent the *rows* of the matrix.  For example, we can represent the matrix:

$$ B = \begin{bmatrix} 1 & 2 \\ 0 & 4 \end{bmatrix} $$

as:

In [None]:
B = [[1,2],[0,4]]

The element $2$ in the matrix $B$ is in the first row, second column.  How would we retrieve that from the list object `B` using indices?

In [None]:
B

### Slicing list elements
If we wanted to access a slice of list elements, we would use the same syntax for slicing string elements:  for a given list `l`, to slice out the elements from index `n` up to (but not including) index `m`, we would write:<br>
`l[n:m]`<br>
As an example, if we consider the following list:

In [None]:
sl = ['apples','pears','oranges','cherries','grapefruits','pomegranates']

and wanted the second through fourth elements (indices 1 to 3), we would slice:

In [None]:
sl[1:4]

Leaving the first and/or last index call blank will make Python assume you mean "from the beginning" or "to the end".  

In [None]:
sl[2:]

This slices from the third element (index 2) until the end.<br>
We can use the step feature as well, with a second colon:

In [None]:
sl[1::2]

This slice starts from the second item (index 1), goes until the end, and only picks out every second item.
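
The three slice positions `start:stop:step` can be combined or omitted freely. A short sketch using the same fruit list (the names `first_three`, `every_other`, and `middle` are just illustrative):

```python
sl = ['apples', 'pears', 'oranges', 'cherries', 'grapefruits', 'pomegranates']

first_three = sl[:3]     # start omitted: begin at index 0, stop before index 3
every_other = sl[::2]    # start and stop omitted: whole list, every second item
middle = sl[1:4]         # indices 1, 2, 3 (index 4 is excluded)

print(first_three)   # ['apples', 'pears', 'oranges']
print(every_other)   # ['apples', 'oranges', 'grapefruits']
print(middle)        # ['pears', 'oranges', 'cherries']
print(len(sl))       # 6: slicing builds new lists and leaves sl unchanged
```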

So far we've been accessing list elements by index, but what if we want to find out what the index is of a particular list element? (Particularly handy if we have very long lists.)  We accomplish this using the ```index()``` command, with the syntax ```listname.index(element_name)```.  If we wanted to know what the index of ```'cherries'``` from the list ```sl``` is, we would write:

In [None]:
sl.index('cherries')

One issue with this is that it will only return the index of the *first* occurrence of the item you ask for.

In [None]:
repeat = [14,15,16,16,17]
repeat.index(16)

How would we write code that would give us a list of all of the indices of the occurrences of 16 in the list called ```repeat```?

### Replacing and adding list elements

We access a list element by calling ```listname[index_number]```, but we can also use this syntax to redefine list elements.  If we use the list ```l``` above, and wanted to replace the second element with the string ```'lantern'```, we would write:

In [None]:
l[1] = 'lantern'
l

This is the same syntax we use for defining a variable.  But what if we wanted to add something new to the list?  There are a few options here:
1. We can use the list's `append` method to tack on whatever we want to the end of the list.  This operation is performed "in place", meaning the existing list itself is modified: it keeps all of our old elements, and gains the new element at the end.
2. We can concatenate two lists together using `+`, i.e.: we can define a new list which is the old list + another list containing the new element(s) we want added.

In [None]:
l = [1,2,3,4]
l.append(5)
l

In [None]:
m = [1,2,3,4]
newlist = m + [5]
newlist

It's important to remember that the element `5` we're adding to the list must itself be in a list.  If we tried to just add the number 5, we would get a `TypeError`:

In [None]:
m+5
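
The difference between the two approaches can be checked directly. The sketch below uses a second name, `alias` (introduced only for this illustration), to show that `append` changes the existing list while `+` builds a new one, and it catches the error from adding a bare number so the rest of the code keeps running:

```python
l = [1, 2, 3, 4]
alias = l            # a second name for the same list object
l.append(5)          # modifies the list in place
print(alias)         # [1, 2, 3, 4, 5]  (the alias sees the change)

m = [1, 2, 3, 4]
newlist = m + [5]    # builds a brand-new list
print(m)             # [1, 2, 3, 4]  (m is untouched)
print(newlist)       # [1, 2, 3, 4, 5]

# Adding a bare number instead of a list raises a TypeError.
try:
    m + 5
except TypeError as err:
    print('TypeError:', err)
```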

We can also concatenate copies of a list onto itself a set number of times by using `*`:

In [None]:
m * 3

This is especially handy when you want a list of length `n` whose entries are all the same element.  For example, if we wanted a list of all 1s, whose length was 50, we could easily write:

In [None]:
ones = [1]*50
print(ones)
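
One caveat when the repeated element is itself a list (as in the list-of-lists matrices above): `*` repeats references to the same object rather than making copies. That is harmless for numbers, but every "row" ends up being the same inner list. A minimal sketch, with the illustrative names `shared` and `independent`:

```python
shared = [[0] * 3] * 2          # two references to ONE inner list
shared[0][0] = 9
print(shared)                   # [[9, 0, 0], [9, 0, 0]]

independent = []
for _ in range(2):
    independent.append([0] * 3) # a fresh inner list for each row
independent[0][0] = 9
print(independent)              # [[9, 0, 0], [0, 0, 0]]
```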

Let's do an exercise where we expand our list `sl` to include kiwis, pineapples, grapes, and tomatoes.

In [None]:
sl = ['apples','pears','oranges','cherries','grapefruits','pomegranates']



### Creating new lists with For loops

If we begin with an empty list, `l=[]`, we can add elements to it iteratively using a for loop. This is achieved the same way we *append* elements to an existing list.

As an example, if we wanted to create a list called `one_to_ten` that contains all of the numbers from 1 to 10, we would write:

In [None]:
one_to_ten = []
for x in range(1,11):  # Note:  why 11 as the upperbound?
    one_to_ten.append(x)
one_to_ten

We can append with formulas as well.  If, instead of just the numbers from 1 to 10, we wanted the list of `2*x+1` for x in the same range, we could make:

In [None]:
another_list = []
for x in range(1,11):  
    another_list.append(2*x+1)
another_list

We can also add conditionals to our loop to get specific things.  For example, if we wanted to get a list of all of the capital letters in a string sentence, we could use an `if` statement to check if each character is a capital before appending:

In [None]:
sentence = 'Creating new lists with For loops.'
capital_letters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
caps = []
for x in sentence:
    if x in capital_letters:
        caps.append(x)
        
caps        

As an exercise, an obvious way to make a list of, say, the even numbers between 0 and 50 with this method is: 

In [None]:
evens = []
for x in range(0,51):
    if x % 2 == 0:
        evens.append(x)

print(evens)

How would we create a list with all of the even numbers between 0 and 50 *without* using a conditional (if) statement?

### List sorting

The order of elements in a list matters, but what if it's not in the order we want it?  Python comes with a few list sorting options, that usually follow alphabetical or numerical rules.  For Python 3, the common tool is the ```sorted()``` function.  If we pass a list of comparable elements (can we compare strings and numbers?), it will produce a list in numerical or alphabetical order.

In [None]:
ln = [2,-5,10,7,3,5]
sorted(ln)

In [None]:
lm = ['apples','pears','oranges','cherries','grapefruits','pomegranates']
sorted(lm)
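
`sorted()` returns a new list and leaves the original alone. Two other standard options are the `reverse=True` argument and the in-place list method `.sort()`. A short sketch (the names `ascending` and `descending` are illustrative):

```python
ln = [2, -5, 10, 7, 3, 5]

ascending = sorted(ln)                  # new list; ln keeps its order
descending = sorted(ln, reverse=True)   # largest first
print(ascending)    # [-5, 2, 3, 5, 7, 10]
print(descending)   # [10, 7, 5, 3, 2, -5]
print(ln)           # [2, -5, 10, 7, 3, 5]

result = ln.sort()  # sorts ln itself and returns None
print(ln)           # [-5, 2, 3, 5, 7, 10]
print(result)       # None
```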

What would happen if we tried to sort the following two lists?

In [None]:
test1 = [1,3,0,True,10,3.5,False]

In [None]:
test2 = [1,'2',3,'4',5,'A']

For test2, what could we do to create a new list which first has the sorted integers, and then has the sorted strings?

In [None]:
test2_sorted = 

## Tuples

Tuples are very similar to lists, in that they are ordered sequences of objects.  The difference between lists and tuples, however, is that tuples are *immutable*, meaning that they cannot be modified once you have created them (the only way to change them is to redefine them).  We will not be using tuples directly for most of the work we'll be doing, but we will be interacting with them via some of the modelling techniques.

Tuples are created with parentheses, and have their entries separated by commas:

In [None]:
t = (1,2,3)

We can get individual elements via their index with the same syntax as lists and strings.  For example, pulling the first value:

In [None]:
t[0]

But if we try to change one of the values by reassigning it, we get an error message:

In [None]:
t[0] = 4

Similarly, if we try to use `.append()` as we do with lists, we get an error:

In [None]:
(1,2).append(5)

We *can* concatenate two tuples together using `+`, as with lists.  Why doesn't this break the rules of immutability?

In [None]:
(1,2) + (3,4)

We can also "tuple-fy" any sequence by using the `tuple()` command on it.  So you can make a tuple out of a list:

In [None]:
tuple([3,5,7])

or a range object:

In [None]:
tuple(range(1,10))

or even a string:

In [None]:
tuple('data')

### Tuple Packing / Unpacking

While we saw that tuples are enclosed in parentheses, we can actually create them without using parentheses.  If we define a variable to be a sequence of objects separated by commas, this will produce a tuple:

In [None]:
t = 10,'x',True
t

This is called *tuple packing*.  This can also be run in reverse, which is where we will encounter it.  If we have a tuple, we can perform *tuple unpacking* by choosing a number of variable names equal to the size of the tuple, and assigning them (separated by commas) to the tuple:

In [None]:
a,b,c = t

In [None]:
a

In [None]:
b

We get an error message if we don't use the same number as the tuple size:

In [None]:
a,b = t
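
The error raised here is a `ValueError`. If we only need some of the values, Python's starred assignment (not used elsewhere in this unit) collects the remainder into a list. A small sketch, with the illustrative names `first` and `rest`:

```python
t = 10, 'x', True

try:
    a, b = t                 # two names, three values
except ValueError as err:
    print('ValueError:', err)

# A starred name collects whatever is left over into a list.
first, *rest = t
print(first)   # 10
print(rest)    # ['x', True]
```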

This unpacking comes in handy when you have a function that returns multiple values.  When a function always returns the same number of values, you can have it return a tuple:

In [None]:
def testf(x):
    return x, 2*x, 3*x+2

In [None]:
testf(2)

If we know what the function is supposed to return, we can unpack the values to variables to use in later computations:

In [None]:
a,b,c = testf(5)

In [None]:
c

In [None]:
a

Many of the NumPy/SciPy functions that provide regression modelling return the relevant parameters as a tuple; this will allow us to unpack the values as variables to create our model.  For example, we'll see code that looks like this:

`slope, intercept, r_value, p_value, slope_std_error = stats.linregress(x, y)`

Then we can define a function with the important parameters to create our linear model:

```
def lin_model(x):
    return slope*x + intercept
```
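
To make the pattern concrete without SciPy, here is a minimal sketch. `fit_line` is a hypothetical helper written for this example (it is not a NumPy or SciPy function, and it returns fewer values than `stats.linregress`); it computes an ordinary least-squares slope and intercept and returns them as a tuple, which we then unpack as described above. The data points are made up.

```python
def fit_line(xs, ys):
    """Hypothetical helper: least-squares slope and intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = 0
    sxx = 0
    for i in range(n):
        sxy += (xs[i] - mean_x) * (ys[i] - mean_y)
        sxx += (xs[i] - mean_x) ** 2
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept          # packed into a tuple


# Made-up data lying exactly on the line y = 2x + 1
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])

def lin_model(x):
    return slope * x + intercept

print(slope, intercept)   # 2.0 1.0
print(lin_model(10))      # 21.0
```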

As a note, this kind of unpacking works for other kinds of sequences as well:

In [None]:
x,y,z = [1,2,3]
x

In [None]:
first,second,third = 'abc'

In [None]:
second
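
Packing and unpacking can also happen in a single statement, which gives a compact way to swap two variables. A quick sketch:

```python
x, y = 1, 2
x, y = y, x        # the right side is packed into (2, 1), then unpacked
print(x, y)        # 2 1
```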

## List Comprehensions

We saw that we can create new lists using for loops and the `.append()` function.  For iteratively creating lists, however, there is a much better method:  **list comprehensions**.  They have the basic format:
<br>   `l = [f(x) for x in iterable]`,<br> where `f(x)` is some function on `x`. 
Not only do list comprehensions make your code neater and more compact, they're also often faster than building the same list with a for loop and `.append()`. If we rewrite the example from above where we wanted the list of `2*x+1` for x between 1 and 10:

In [None]:
another_list = [2*x+1 for x in range(1,11)]
another_list

So, we're iterating from 1 to 10, and at each step we're appending the value `2*x+1` to the list.  You can add conditionals to list comprehensions, just as we did with the for loop.  The syntax is<br>
    ` l= [f(x) for x in iterable if (condition on x)]`.  
    
For example, if we wanted all of the elements of a given list that were divisible by 3: 


In [None]:
numbers = [2,3,5,8,9,14,15,21,23,25,30]
l = [x for x in numbers if x%3==0]
l

(Note here the 'function' we're applying is the identity function. We just want the list elements that satisfy the condition, so we're not applying any changes to them.  Hence, it's just `x for x in numbers`.)

This syntax is very similar to *set-builder notation*, which you may have seen in math courses.  For example, if we wanted the collection of integers bigger than 12, we would write

$$ S = \{\,x\in\mathbb{Z}\,|\, x>12\}, $$

i.e. "$S$ is the set of $x$ in the integers **such that** $x$ is bigger than 12".  The first part shows where the elements we're looking at come from, and the vertical line represents "such that", after which comes the condition.
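
The comparison can be made literal. A list cannot hold all of the integers, so the sketch below assumes a finite universe, `range(0, 20)`, chosen only for illustration:

```python
# Set-builder:   S = { x in {0, ..., 19} | x > 12 }
S = [x for x in range(0, 20) if x > 12]
print(S)         # [13, 14, 15, 16, 17, 18, 19]

# Applying a function to each element as well:  { x^2 | x in {0, ..., 19}, x > 12 }
squares = [x**2 for x in range(0, 20) if x > 12]
print(squares)   # [169, 196, 225, 256, 289, 324, 361]
```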

Suppose we had a list of words (as strings).  How would we create a list of all of the words that contain a `w`?

In [None]:
words = ['We', 'saw', 'that', 'we', 'can', 'create', 'new', 'lists', 'using', 'for', 'loops', 'and', 'the', 
         '.append()', 'function.', 'For', 'iteratively', 'creating', 'lists,', 'however,', 'there', 'is', 'a', 
         'much', 'better', 'method:', 'list', 'comprehensions.', 'They', 'have', 'the', 'basic', 'format:']
l1 = [    ]

How would we create a list of the third letter of words (from the `words` list) that contain either w or a?

In [None]:
l2 = [  ]

Recalling a previous list creation exercise, how would we create a list containing fifty copies of the number 1 using a list comprehension?

In [None]:
ones = [ ]

We can see the speed difference by using the `timeit` package.  This will allow us to time small scripts by running them a specified number of times.  The syntax will be

``timeit.timeit(stmt='',setup='',number=x)``

Where `stmt` is the code you want to run, `setup` is an optional argument for any definitions you want to make ahead of time, and `number` is the number of times you want to run the code.

In [None]:
import timeit
# We need to import the package to be able to use it.  

In [None]:
timeit.timeit(stmt='''
l = []  # reset on every run; a list made once in setup would keep growing across all runs
for x in range(10000):
    l.append(x)''', number=10000)

In [None]:
timeit.timeit(stmt='t= [x for x in range(10000)]', number=10000)

So, repeating the same comprehensions and for-loop appends, the loops took over 3 times longer to complete than the comprehensions did (the exact ratio will vary with your machine and Python version).

## Dictionaries

Dictionaries are similar to lists in that they're indexed collections of data.  The main difference, however, is that the elements of a dictionary are not looked up by their position.  So if position doesn't matter, how do we keep track of the data?  The answer is with *keys*.
Dictionaries are enclosed in braces `{}`, and follow the syntax:<br>
`dictname = {key1:value1, key2:value2, ...}`
<br>
Here, the `keys` are the index we're choosing, which can be any immutable (hashable) object, such as integers, floats, strings, or tuples.  The `values` can be any other Python object we want, like numbers, lists, strings, other dictionaries, dataframes, etc.  One restriction is that keys must be unique; values can be repeated as many times as you like.  Here is an example dictionary, where the keys are the names of students, and the values are their midterm marks.

In [None]:
midterm_marks = {'Patrick':86,'Lindsay':95,'Ivan':92,'Emily':97,'Iva':89}

If we wanted to access the value of a particular key, the syntax is the same as calling the index of a list item: ```dictname[key]```.  So to access Ivan's mark, we write

In [None]:
midterm_marks['Ivan']
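
Asking for a key that is not in the dictionary raises a `KeyError`. Two standard ways to guard against that are the `in` operator and the `.get()` method, which returns `None` (or a default you supply) instead of raising. A short sketch:

```python
midterm_marks = {'Patrick': 86, 'Lindsay': 95, 'Ivan': 92, 'Emily': 97, 'Iva': 89}

print('Ivan' in midterm_marks)      # True
print('Tyler' in midterm_marks)     # False

try:
    midterm_marks['Tyler']          # no such key yet
except KeyError as err:
    print('KeyError:', err)

print(midterm_marks.get('Tyler'))       # None
print(midterm_marks.get('Tyler', 0))    # 0 (a default of our choosing)
```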

If we had a dictionary with many keys and wanted to access them, we can use the `dictname.keys()` command.  Similarly, we can call the values with `dictname.values()`:

In [None]:
midterm_marks.keys()

In [None]:
midterm_marks.values()

If we wanted both the keys and values, we can use the ```.items()``` command:

In [None]:
midterm_marks.items()
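
Each item is a `(key, value)` tuple, so the tuple unpacking from earlier lets us loop over both at once. A small sketch; the list name `above_90` is just illustrative:

```python
midterm_marks = {'Patrick': 86, 'Lindsay': 95, 'Ivan': 92, 'Emily': 97, 'Iva': 89}

# Unpack each (key, value) pair into two loop variables
for name, mark in midterm_marks.items():
    print(name, mark)

# Collect the names of everyone who scored above 90
above_90 = []
for name, mark in midterm_marks.items():
    if mark > 90:
        above_90.append(name)
print(above_90)   # ['Lindsay', 'Ivan', 'Emily']
```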

We can use the ```sorted()``` function with dictionaries too.  Calling it on the dictionary itself sorts the keys, and calling it on ```.values()``` sorts the values; either way it produces a sorted list, assuming you can compare the entries:

In [None]:
sorted(midterm_marks)

In [None]:
sorted(midterm_marks.values())

Say we wanted to find the name of the person with the highest mark.  How would we do this?  There are a number of ways, including the very compact code ```max(midterm_marks, key=midterm_marks.get)```.  Suppose we didn't know this; how could we write code that would find it?

To show issues with trying to define non-unique keys, consider the following example:

In [None]:
bad = {'A':1, 'B':2, 'A':3}

In [None]:
bad['A']

Why does it show the value is 3?  Because the second time we defined the key `'A'`, Python overwrote the original value.  This can be seen by looking at the actual dictionary:

In [None]:
bad

### Updating key values, adding new key values
Adding new entries and updating existing entries both use the same assignment syntax we used to replace list elements: `dictname[key] = value`.  If we wanted to add Tyler's mark of 90 to the midterm marks, we'd write:

In [None]:
midterm_marks['Tyler'] = 90
midterm_marks

If it turns out that Patrick's mark was actually 89, how would we change it?

## Dictionary Comprehensions

We can create dictionaries using comprehensions similar to lists.  In this case, however, we need to specify the keys and the values simultaneously.  If we have different sources for them, our best bet is to use the `zip` function; this will take in multiple iterables, and create tuple pairs from them.  For example:

In [None]:
animals = ['dog','cat','rabbit','crow']
numbers = [9,5,3,7]

z = zip(animals,numbers)
list(z)

(The reason we have to call `list(z)` is that `z` itself is a lazy iterator, a `zip` object, which only produces its pairs when asked.)

In [None]:
z
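
Because a `zip` object produces its pairs lazily, it can only be consumed once, and it stops as soon as the shortest input runs out. Both behaviours are easy to check (the names `first_pass`, `second_pass`, and `short` are illustrative):

```python
animals = ['dog', 'cat', 'rabbit', 'crow']
numbers = [9, 5, 3, 7]

z = zip(animals, numbers)
first_pass = list(z)
second_pass = list(z)      # z has already been consumed
print(first_pass)    # [('dog', 9), ('cat', 5), ('rabbit', 3), ('crow', 7)]
print(second_pass)   # []

# zip stops at the shortest input
short = list(zip(animals, [1, 2]))
print(short)         # [('dog', 1), ('cat', 2)]
```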

We can now make a dictionary:

In [None]:
d = {key:value for key,value in zip(animals,numbers)}
d

We can also create a new dictionary by referencing another dictionary, and optionally placing a condition.  If we wanted a dictionary of all of the items in `d` where the value was greater than 4, we'd write:

In [None]:
new_dict = {k:v for k,v in d.items() if v>4 }
new_dict

Dictionaries are not built to look up a *key* by its *value*.  If we have a dictionary, however, that has unique values (why?), how could we use a dictionary comprehension to facilitate a key-lookup by value?

In [None]:
store_sales = {'Malvern':153, 'Cedarbrae':115, 'Agincourt':201, 'Markham': 120 }

# Find, through programming, which key had 120 sales.



## NumPy and Pandas:  Series and Dataframes

NumPy and Pandas are additional packages for Python that come with their own unique set of functions, objects, etc.  The Anaconda distribution comes with them, so we don't have to worry about installing them.  To use them, though, we must *import* them; when Python starts, it only loads the basic packages it needs to function, as a time- and memory-saving feature.  We're going to import NumPy and Pandas with this command:

In [None]:
import numpy as np
import pandas as pd

The reason we're importing them with variable names is so we can call their functions/objects prefaced with their variable name.  It's a good way of keeping track of what belongs where.  So, for example, if we wanted to create a Pandas dataframe, we'd use the command `df = pd.DataFrame()`.

### Pandas Series objects

Pandas Series are built on 1-dimensional NumPy arrays (the basic object of NumPy), but with the addition of an index.  As a result, a Series is much like a list in that it is an ordered sequence of Python objects, whose elements can be called by their index.  The difference here, though, is that we can specify an index of our choosing (much like picking keys for a dictionary). If we do not specify an index, it will default to zero-based integer indexing.

In [None]:
s = pd.Series([2,3,5,7,11,13])
s

Here, the left-hand column is the index, and the right-hand column is the set of values.  We can specify the index when we create the series:

In [None]:
s = pd.Series([2,3,5,7,11,13],index=['a', 'b', 'c', 'd', 'e','f'])
s

Or, we can assign the index afterwards:

In [None]:
s = pd.Series([2,3,5,7,11,13])
s

In [None]:
s.index

In [None]:
# Setting a new index
s.index = ['a', 'b', 'c', 'd', 'e','f']
s

We can take a dictionary, and create a Series out of it.  This will automatically use the keys as the index:

In [None]:
marks = pd.Series(midterm_marks)
marks

### Slicing, referencing

Items in both Series and Dataframes can be called and sliced using the syntax we're used to with lists and dictionaries:

In [None]:
marks['Emily']

In [None]:
s['b':'d']

You will notice that, unlike in regular Python, slicing a Series by its index labels will *include* the last item listed.<br>
One of the most useful aspects of Series (and Dataframes, as we'll see), is *vectorized* operations.  If we apply operations to the Series, the computer will perform the instructions to every element effectively simultaneously. For example, if we wanted to add two series together, component-wise, we would use:

In [None]:
s1 = pd.Series([4,10,7],index=['a','b','c'])
s2 = pd.Series([1,1,1],index=['a','b','c'])

s3 = s1+s2
s3

And if we wanted to multiply everything in ```s3``` by 2 and then add 5:

In [None]:
s3*2 + 5

One of the great abilities of these types of vectorized operation is to handle mismatched indices.  If we have two Series which overlap in their indices, but which contain differences, it will fill in the gaps for us:

In [None]:
s4 = pd.Series([2,-1,6],index=['b','c','d'])
s5 = pd.Series([3,1,0],index=['a','b','c'])

s4+s5

The index of the resulting Series is the union of the two original indices.  Pandas will fill in any missing data with `np.nan`. <br> 
**Note**: NaN (not a number) is the marker used to denote missing data in Pandas and NumPy.
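
If a missing entry should count as zero rather than produce NaN, the Series method `.add()` accepts a `fill_value`. A brief sketch using the same two Series (the names `total` and `filled` are illustrative):

```python
import pandas as pd

s4 = pd.Series([2, -1, 6], index=['b', 'c', 'd'])
s5 = pd.Series([3, 1, 0], index=['a', 'b', 'c'])

total = s4 + s5
print(total.isna())   # True at 'a' and 'd', where only one Series had a value

# Treat a missing entry as 0 instead of producing NaN
filled = s4.add(s5, fill_value=0)
print(filled)
```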

### Vectorization: why we care

Why is vectorization important?  Why are we learning about Series (and soon Dataframes) that look kind of like other objects we've already seen, instead of just trying to use those objects?  There are two main reasons:

1. **Speed**.  Performing vectorized operations will go much faster than iterative ones. 
2. **Code length**.  The amount of code you write will be dramatically reduced by not having to write complicated for-loops.

Instead of having to iterate through all of the indices, we can give Python instructions for the entire Series. 
Vectorized operations allow us to very quickly perform operations on entire groups of data.  As we'll see later on, we can even create vectorized versions of custom functions.  Pandas and NumPy come packaged with a lot of very handy and powerful tools for dealing with very large datasets that only require a line or two of code, which reduces the amount of work we have to do.
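
To illustrate the code-length point, here is the same computation written both ways on a small made-up Series. No timings are shown, since the speed benefit mainly appears on much larger data; the names `looped` and `vectorized` are illustrative.

```python
import pandas as pd

values = [4, 10, 7, 1, 8]
s = pd.Series(values)

# Iterative version: visit every element ourselves
looped = []
for v in values:
    looped.append(v * 2 + 5)

# Vectorized version: one expression for the whole Series
vectorized = s * 2 + 5

print(looped)                # [13, 25, 19, 7, 21]
print(vectorized.tolist())   # [13, 25, 19, 7, 21]
```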



### Pandas Dataframes

We will only work with Series in an oblique fashion, as most of our time will be spent with Pandas Dataframe objects.  Dataframes are tables of indexed columns, containing potentially different types of data.  Each column is a pd.Series object.<br>
We can create a dataframe from scratch using a dictionary of Series:




In [None]:
d = {'one': pd.Series([2, 4, 6], index=['a', 'b', 'c']),
     'two': pd.Series(['alpha', 'beta', 'gamma', 'delta'], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)

In [None]:
df
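
A dictionary of plain lists works as well, provided the lists all have the same length; each key becomes a column name. A minimal sketch reusing a few of the midterm marks from earlier (the names `data` and `df_small` are illustrative):

```python
import pandas as pd

data = {'name': ['Patrick', 'Lindsay', 'Ivan'],
        'midterm': [86, 95, 92]}

# The index argument is optional; without it the rows are labelled 0, 1, 2
df_small = pd.DataFrame(data, index=['a', 'b', 'c'])

print(df_small)
print(df_small.shape)   # (3, 2): three rows, two columns
```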

Let's create a larger dataframe so we can get a better view of accessing data.  The actual code to produce this dataframe is not overly important for right now.

In [None]:
df = pd.DataFrame(np.random.randn(8,6), columns=['A','B','C','D','E','F'])
df

We can access the top `n` rows by calling `df.head(n)`, and the bottom `m` rows by calling `df.tail(m)`.  If we leave `n` or `m` blank, it will default to 5.

In [None]:
df.head()

In [None]:
df.tail(3)

We can slice rows with the same syntax as lists and Series.

In [None]:
df[2:4]

We can call an individual column with the syntax we use for list/dictionary items, and we can call a set of columns by passing a list of column names:

In [None]:
df['A']

(Note:  when we call only a single column, we will be given a Series object)

In [None]:
df[['A','D']]

We can define a new column as we would add a new key/value to a dictionary:

In [None]:
df['G'] = [1,1,3,7,5,5,3,9]

In [None]:
df

One important thing to note is that when we slice out rows or columns and operate on them, the result is a new object; the original dataframe is not changed.  If we slice out rows 2:4 and multiply by 0, we get:

In [None]:
df[2:4]*0

In [None]:
df

The rows remain unchanged in df.  Similarly for columns:

In [None]:
df['A']*0

In [None]:
df

To make the change stick, we have to assign the result back to the rows/columns:

In [None]:
df['A'] = df['A']*0
df

### Loading Data from CSVs
Most of the data we deal with won't be generated inside Pandas, but will come from CSVs.  Download the CSV from the portal, and we'll load it in.  The general syntax is <br>
`variablename = pd.read_csv('filepath') ` <br>
If the file is in the same directory as the notebook, you only need to put the filename as the path.  If it is located in another directory, you'll need to put the entire filepath, e.g.: `pd.read_csv('c:\\data_sets\\Unit2_testdata.csv')`.

In [None]:
dm = pd.read_csv('Unit2_testdata.csv')

But when we read it, we notice an issue:

In [None]:
dm

The index column was mistaken for a column of data.  So we have to tell Pandas what column the index is when it reads it in.  In our case the first column (i.e.: the *0th* column) is the index.  Don't forget to adjust the filepath in your notebook:

In [None]:
dm = pd.read_csv('Unit2_testdata.csv', index_col=0)
dm

Now it recognizes the index properly, and we can go about our slicing, operating business as usual.

In [None]:
dm[3:5]

Note that when we feed Pandas integers to slice on, it ignores what the index labels actually are, and slices by zero-based row position.
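
The contrast between position-based and label-based slicing can be made explicit with `.iloc` and `.loc`. The sketch below does not use the course file; it reads a small made-up CSV from an in-memory string (via `io.StringIO`) so that it runs on its own. The column names and the names `frame`, `by_position`, `by_iloc`, and `by_label` are all illustrative.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the contents are made up for this example.
csv_text = """id,price,qty
r1,2.5,4
r2,3.0,1
r3,1.2,7
r4,9.9,2
"""

frame = pd.read_csv(io.StringIO(csv_text), index_col=0)

by_position = frame[1:3]            # rows at positions 1 and 2; position 3 is excluded
by_iloc = frame.iloc[1:3]           # the same slice, written explicitly by position
by_label = frame.loc['r2':'r4']     # label slice; both endpoints are included

print(by_position.index.tolist())   # ['r2', 'r3']
print(by_iloc.index.tolist())       # ['r2', 'r3']
print(by_label.index.tolist())      # ['r2', 'r3', 'r4']
```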

In the next lesson, we will learn how to dive deeper into the data in a Pandas Dataframe, performing various checks and queries, and learning about timeseries data.

# Assignment 2

1. Adapt your string-reversing function from the prep course assignment to take in an arbitrary list, and return a new list which is the original list in reverse order. (Again, ```return listname[::-1]``` is easy, but not the point of the exercise.)
2. Suppose we're out grocery shopping, and currently have the following produce in our basket:<br>
`shopping_basket = ['cherry','lemon','celery','grapefruit','apricot'] `.<br>
We have the following dictionary of some fruit by type: <br>
`f_dict = {'citrus':['lemon','lime','grapefruit','orange','pomelo'],'stone fruit':['cherry','apricot','peach'],'pome':['apple','pear','quince']}`. <br>
Write a script which will give a list of all of the items in your basket which are citrus.
3. Continuing from the prep course, write a function which takes in an arbitrary list of numbers, and returns the mean.  (Note:  while `return np.mean(listname)` is easy, it's not really the point of the exercise).  Use this list of numbers to test it: `[25, 54, 27, 54, 23, 47, 23,  4, 27, 36, 26, 12, 25, 29, 41]`. How would you adjust your code to deal with lists that include non-numeric entries?
4. Consider this dictionary of class marks:<br>
`ixlist = ['Patrick','Lindsay','Ivan','Emily','Iva']
class_marks = {'Assignment 1':pd.Series([72,85,87,94,77],index = ixlist),'Assignment 2':pd.Series([82,89,92,92,84],index = ixlist), 'Assignment 3':pd.Series([80,94,90,99,85],index = ixlist), 'Midterm':pd.Series([86,95,92,97,89],index = ixlist), 
'Final Exam':pd.Series([84,92,90,91,92],index = ixlist)}`<br>
Turn this into a Dataframe, and create a new column to compute the 'Final Grade' with the following weighting of marks:  30% for Assignments, 30% for the Midterm, and 40% for the Final Exam.
5.  1. Create a list using a comprehension which contains all of the odd numbers from $-10$ to $10$ (inclusive).
   2. Use a list comprehension to determine how many times letters from the first half of the alphabet (capital and lowercase!) appear in the following sentence:<br>
    'To construct the notion of a Lie group in Dirac geometry, extending the definition of Poisson Lie groups, the Courant algebroids A must themselves carry a multiplicative structure.'
    3. Write a list comprehension whose elements are lists `[x,y]`, where `x` can take on the values `[1,2,3,4]`, where `y` can take on values `[2,4,6]`, and where the `x` and `y` values are not equal.  (Hint: you can chain the `for` statements together in a comprehension).
6. (Bonus) In financial futures trading, each instrument (e.g.: wheat, oil, USD/CAD, S&P500) has its own symbol.  For example, the S&P500 e-mini is called ES.  Each instrument also has a specific number of months it trades, each of which has its own single-letter symbol; ES trades March (H), June (M), September (U), and December (Z).  To refer to a specific contract, you need the instrument name, month code, and year.  The December 2015 S&P500 contract would be ESZ15.  Consider the following list of contracts:<br>
`contracts = ['ESM15','ESZ14','ESU15','ESH15','ESZ15','ESU14']`<br>
If we want them in chronological order (e.g.: 'ESU14' is first, then 'ESZ14', etc.), we can't use the sorted() function, because this will sort them alphanumerically.  Write a function which sorts this list of contracts into chronological order.
7. (Bonus) As we saw, we can approximate a *matrix* by creating a list of lists, e.g.: `[[1,2],[0,1]]` can be seen as representing the matrix $\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ by considering each of the internal lists as the rows.  We can *transpose* a matrix by switching the rows and the columns (i.e.: the first row becomes the first column, etc.).  For example, the transpose of the matrix `[[1,2],[0,1]]` is `[[1,0],[2,1]]`, i.e.: $\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}$.<br>
Write a list comprehension that gives us the transpose of a matrix.