# Exercise Set 1

Below are five problems, each worth 1 point. These problems are interleaved with short tutorials on Python. This assignment will be autograded to ensure a quick turnaround. After each problem, there are tests associated with your answer which will help you determine if you have solved the problem correctly. If you can run the cell after your answer and not get any errors, then you very likely have gotten the question right. If you have errors, hopefully the error will help you identify the mistake.

Note that just because you don't get errors doesn't mean that you got the question correct. I have some additional tests held back that I do not show here, though if you pass the ones shown, you will likely pass those as well.

When you are done with the assignment, you should save this notebook manually by clicking on the save button in the toolbar (the floppy disk icon). **Do not rely on autosave. Save manually!** Ensure that you have not renamed the file. **The autograder that is used to grade this notebook requires that the file be named `Exercise_I.ipynb`.** Once you save the notebook, follow the instructions in the `README.md` file to submit the assignment.

Finally, you are encouraged to add new cells as you go through the notebook and experiment. Any cell that should not be copied or deleted is marked as such. As long as you don't copy or delete the cells marked as such, then you should feel free to experiment as much as you would like with this notebook.

The goal of this set of exercises is to introduce you to basic Python syntax and to get you to begin to think like a programmer. The primary skill you must develop to succeed in these exercises is paying careful attention to syntax. In programming, writing something in exactly the right way is essential. A computer cannot understand typos.

The answer to each exercise should be closely related to an example in the material near the exercise. If you find yourself doing something significantly different from or beyond what is presented in the examples, you are likely on the wrong path. This exercise set is designed to introduce all of the Python syntax and concepts necessary to complete each exercise, so carefully understanding the examples in the code should allow you to complete each exercise successfully.

#### Problem \#1 - 1 point

Create a variable `answer_1` and assign to it the value `10`. So, your code will look something like
``` python
answer_1 = ...
```

You may want to refer back to the section on "Variables" in the "Intro to Python.ipynb" notebook.

In [6]:
answer_1 = 10

The below cell is a grading cell. It will check to see if your answer is right. If your answer is right, it should print out the number `10` like so: 
``` python
10
```


If it's wrong, you will see an error like the following (which was produced by setting `answer_1` to the value `9`):

``` python
9
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-4-ecd9fd4d3dfb> in <module>
      2 print(answer_1)
      3 from nose.tools import assert_equal
----> 4 assert_equal(answer_1, 10)

~/.brew/Caskroom/miniconda/base/envs/darden/lib/python3.7/unittest/case.py in assertEqual(self, first, second, msg)
    851         assertion_func = self._getAssertEqualityFunc(first, second)
--> 852         assertion_func(first, second, msg=msg)
    853 
    854     def assertNotEqual(self, first, second, msg=None):

~/.brew/Caskroom/miniconda/base/envs/darden/lib/python3.7/unittest/case.py in _baseAssertEqual(self, first, second, msg)
    843             standardMsg = '%s != %s' % _common_shorten_repr(first, second)
    844             msg = self._formatMessage(msg, standardMsg)
--> 845             raise self.failureException(msg)
    846 
    847     def assertEqual(self, first, second, msg=None):

AssertionError: 9 != 10
```

If you do not see an error, then you have the correct solution.

In [7]:
# THIS IS A GRADING CELL. DO NOT EDIT AND DO NOT COPY.
print(answer_1)
from nose.tools import assert_equal
assert_equal(answer_1, 10)

10


## Collections or Putting Stuff in Other Stuff

We will regularly want to bundle things together. Often these will be data points, though sometimes we will bundle together the steps of a machine learning model so that they can be packaged as one unit.

### Lists

Lists are probably the handiest and most flexible type of collection. We will use them all of the time.

Lists are declared with square brackets `[]`. 

Individual elements of a list can be selected using the syntax `a[ind]`.

In [8]:
# Lists are created with square bracket syntax
a = ['blueberry', 'strawberry', 'pineapple', 'orange']
print(a, type(a))

['blueberry', 'strawberry', 'pineapple', 'orange'] <class 'list'>


Lists (like most collections) are also indexed with square brackets. NOTE: The first index is zero, not one.

In [9]:
print(a[0])
print(a[1])

blueberry
strawberry


In [10]:
## You can also count from the end of the list
print('last item is:', a[-1])
print('second to last item is:', a[-2])

last item is: orange
second to last item is: pineapple


In [11]:
# you can access multiple items from a list by slicing, using a colon between indexes
# NOTE: The end value is not inclusive
print('a =', a)
print('get first two:', a[0:2])
print('get middle two:', a[1:3])

a = ['blueberry', 'strawberry', 'pineapple', 'orange']
get first two: ['blueberry', 'strawberry']
get middle two: ['strawberry', 'pineapple']


In [12]:
# You can leave off the start or end if desired
print(a[:2])
print(a[2:])
print(a[:])
print(a[:-1])

['blueberry', 'strawberry']
['pineapple', 'orange']
['blueberry', 'strawberry', 'pineapple', 'orange']
['blueberry', 'strawberry', 'pineapple']


In [13]:
# We can skip items by adding a third parameter for a skip frequency
print(a[1:4:2])
# We can reverse a by making the third parameter negative
print(a[::-1])

['strawberry', 'orange']
['orange', 'pineapple', 'strawberry', 'blueberry']


In [14]:
# Lists are objects, like everything else, and have methods such as append
a.append('banana')
print(a)

a.append([1,2])
print(a)

a.pop()
print(a)

['blueberry', 'strawberry', 'pineapple', 'orange', 'banana']
['blueberry', 'strawberry', 'pineapple', 'orange', 'banana', [1, 2]]
['blueberry', 'strawberry', 'pineapple', 'orange', 'banana']


__TIP:__ A 'gotcha' for some new Python users is that many collections, including lists,
actually store pointers to data, not the data itself. 

Remember when we set `b=a` and then changed `a`?

What happens when we do this in a list?

In [15]:
a = 1
b = a
a = 2
## What is b?
print('What is b?', b)

a = [1, 2, 3]
b = a
print('original b', b)
a[0] = 42
print('What is b after we change a ?', b)

What is b? 1
original b [1, 2, 3]
What is b after we change a ? [42, 2, 3]


In order to avoid this, you have to `copy` the list.

In [16]:
a = [1, 2, 3]
b = a.copy() # We set b equal to a copy of a
print('original b', b)
a[0] = 42
print('What is b after we change a ?', b)
print('What is a ?', a)

original b [1, 2, 3]
What is b after we change a ? [1, 2, 3]
What is a ? [42, 2, 3]
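As a small sketch of the same idea, slicing with `[:]` and calling `list(...)` also produce new lists, and the `is` operator (not otherwise covered in this notebook) can confirm whether two names refer to the same list. All variable names below are made up for the illustration.

```python
# Illustrative names only; none of these are used by the grading cells.
original = [1, 2, 3]
alias = original                 # a second name for the same list
copy_by_method = original.copy()
copy_by_slice = original[:]
copy_by_list = list(original)

original[0] = 42

print(alias is original)          # True: both names point to one list
print(copy_by_slice is original)  # False: the slice is a separate list
print(alias)
print(copy_by_method, copy_by_slice, copy_by_list)
```

Only `alias` sees the change, because it is the only name that still points at the original list.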


Often, we will need to know how long a list is. We can get that with `len`:

In [17]:
print(len(a))

3
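One place `len` comes in handy is connecting the two ways of indexing: the last item sits at index `len(...) - 1`, which is the same item as index `-1`. A short sketch, using a made-up list called `fruits`:

```python
fruits = ['blueberry', 'strawberry', 'pineapple', 'orange']  # made-up example list
n = len(fruits)
print(n)
print(fruits[n - 1])                 # the last item, counted from the front
print(fruits[n - 1] == fruits[-1])   # the same item, counted from the end
```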


#### Problem \#2 - 1 point

The list below contains the words of the first paragraph of the famous "Lorem ipsum" text that is often used as a placeholder. The text reads as follows:

> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

The cell is read only, and you should not try to edit or delete it. You will use the list to solve problem 2.

In [18]:
# DO NOT DELETE THIS CELL.
problem_2_list = ["Lorem", "ipsum", "dolor", "sit", "amet,", "consectetur", "adipiscing", "elit,", "sed", "do", "eiusmod", "tempor", "incididunt", "ut", "labore", "et", "dolore", "magna", "aliqua.", "Ut", "enim", "ad", "minim", "veniam,", "quis", "nostrud", "exercitation", "ullamco", "laboris", "nisi", "ut", "aliquip", "ex", "ea", "commodo", "consequat.", "Duis", "aute", "irure", "dolor", "in", "reprehenderit", "in", "voluptate", "velit", "esse", "cillum", "dolore", "eu", "fugiat", "nulla", "pariatur.", "Excepteur", "sint", "occaecat", "cupidatat", "non", "proident,", "sunt", "in", "culpa", "qui", "officia", "deserunt", "mollit", "anim", "id", "est", "laborum."]

Store in the variable `answer_2` a sublist of `problem_2_list` that begins with the third word, `dolor`, and ends with the fourth word from the end of the list, `anim`, including all of the words in between. So, your code will look something like
``` python
answer_2 = ...
```
I suggest that you confirm that the answer is what you expect by printing it out. Note that you are free to just type in this whole list, but I recommend you look for an easier way using list indexing. Remember that in Python, the first element is index `0`, the second element is index `1`, and so on.
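As a reminder of how a negative end index behaves in a slice, here is a brief sketch on a made-up list called `letters`. The end index is not inclusive, so an end of `-1` stops just before the last item.

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']   # made-up example list
middle = letters[1:-1]    # drops the first and the last item
inner = letters[2:-2]     # drops two items from each end
print(middle)
print(inner)
```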

In [19]:
answer_2 = problem_2_list[2:-3]

In [20]:
# THIS IS A GRADING CELL. DO NOT EDIT AND DO NOT COPY.
print(answer_2)
from nose.tools import assert_equal
assert_equal(answer_2[0], "dolor")
assert_equal(answer_2[5], "elit,")
assert_equal(answer_2[-1], "anim")
assert_equal(answer_2[19], "ad")

['dolor', 'sit', 'amet,', 'consectetur', 'adipiscing', 'elit,', 'sed', 'do', 'eiusmod', 'tempor', 'incididunt', 'ut', 'labore', 'et', 'dolore', 'magna', 'aliqua.', 'Ut', 'enim', 'ad', 'minim', 'veniam,', 'quis', 'nostrud', 'exercitation', 'ullamco', 'laboris', 'nisi', 'ut', 'aliquip', 'ex', 'ea', 'commodo', 'consequat.', 'Duis', 'aute', 'irure', 'dolor', 'in', 'reprehenderit', 'in', 'voluptate', 'velit', 'esse', 'cillum', 'dolore', 'eu', 'fugiat', 'nulla', 'pariatur.', 'Excepteur', 'sint', 'occaecat', 'cupidatat', 'non', 'proident,', 'sunt', 'in', 'culpa', 'qui', 'officia', 'deserunt', 'mollit', 'anim']


### Tuples

We won't say a whole lot about tuples except to mention that they basically work just like lists, with
two major exceptions:

1. You declare tuples using `()` instead of `[]`
1. Once you make a tuple, you can't change what's in it (referred to as immutable)

You'll see tuples come up throughout the Python language, and over time you'll develop a feel for when
to use them. Often, they are used as arguments to functions.

In general, they are used instead of lists:

1. to group items when the position in the collection is critical, such as coord = (x,y)
1. when you want to prevent accidental modification of the items, e.g. shape = (12,23)
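A minimal sketch of both uses follows. The names `coord` and `shape` echo the two items above, and the values are made up. Unpacking a tuple into separate variables is shown as one common way of reading items by position.

```python
coord = (3, 4)            # position matters: x first, then y
x, y = coord              # unpacking assigns by position
print(x, y)

shape = (12, 23)          # a fixed pair that should not change by accident
rows, cols = shape
print(rows * cols)
```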

In [21]:
xy = (23, 45)
print(xy[0])

23


The following cell won't work because you cannot assign to the items of a tuple. It will raise an error.

In [22]:
xy = (23, 45)
print(xy[0])
xy[0] = "this won't work with a tuple"

23


TypeError: 'tuple' object does not support item assignment

### Anatomy of a traceback error

An error! Let's take a closer look at how to read it.

Errors are `raised` when you ask code to do something it isn't meant to do. The traceback that comes with an error is meant to be informative, but like many things, it is not always as informative as we would like.

Looking at our error:

``` python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-2b1e2849198f> in <module>
      1 xy = (23, 45)
      2 print(xy[0])
----> 3 xy[0] = "this won't work with a tuple"

TypeError: 'tuple' object does not support item assignment
```
    
1. The command you tried to run raised a **TypeError**. This suggests you are using a variable in a way that its **type** doesn't support.
2. The arrow `---->` points to the line where the error occurred, in this case line 3 of the cell above.
3. Learning how to **read** a traceback is an important skill to develop, and it helps you know how to ask questions about what has gone wrong in your code. Once you can read the error, you will know a) which line of your code is wrong, and b) what to google to try to fix it.
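If it helps to see the two pieces of the last line separately, the sketch below catches the same kind of `TypeError` using `try`/`except` (a feature this notebook does not otherwise cover) and prints the error's type and message. The variable name `point` is made up for the example.

```python
point = (23, 45)
try:
    point[0] = 99                     # tuples do not allow item assignment
except TypeError as err:
    error_type = type(err).__name__   # the name before the colon
    error_message = str(err)          # the text after the colon
    print(error_type)
    print(error_message)
```

The type tells you what kind of mistake was made, and the message tells you what specifically went wrong.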

Tracebacks can look a little more complicated, as the piece of code below shows:

In [23]:
from sklearn.impute import SimpleImputer
import numpy as np
SimpleImputer(missing_values=np.nan, strategy='mean').fit([[0, 1, 1, "nonsense", np.nan, np.nan]])

ValueError: Cannot use mean strategy with non-numeric data:
could not convert string to float: 'nonsense'

Here, there are multiple `--->` arrows in the error. That's because this code uses a method that calls other code, and the error actually happens several steps down. However, the error is due to the method's input `[[0, 1, 1, "nonsense", np.nan, np.nan]]`, and we can see this because one of the arrows points to 

``` python
----> 3 SimpleImputer(missing_values=np.nan, strategy='mean').fit([[0, 1, 1, "nonsense", np.nan, np.nan]])
```

Since this is a line of our code, this is likely where the error is. Moreover, since the line at the bottom is:

``` python
ValueError: Cannot use mean strategy with non-numeric data:
could not convert string to float: 'nonsense'
```

it probably has something to do with the list we are giving it as input. If we replace `"nonsense"` with a number, then the error goes away.

In [24]:
SimpleImputer(missing_values=np.nan, strategy='mean').fit([[0, 1, 1, 1, np.nan, np.nan]])

Learning to read these errors takes a little time, but when you do get a sense for what it's saying, you will be much better equipped to understand why your code isn't working.

### Dictionaries

Dictionaries are the collection to use when you want to store and retrieve things by their names
(or some other kind of key) instead of by their position in the collection. A good example is a set
of model parameters, each of which has a name and a value. Dictionaries are declared using `{}`.

We are not going to use dictionaries a lot, but we will see them crop up. It is good to have a basic familiarity with them.

In [25]:
# Make a dictionary of unit conversion factors (39 inches per metre is approximate)
convertors = {'inches_in_feet' : 12,
              'inches_in_metre' : 39}

print(convertors)
print(convertors['inches_in_feet'])

{'inches_in_feet': 12, 'inches_in_metre': 39}
12


In [26]:
## Add a new key:value pair
convertors['metres_in_mile'] = 1609.34
print(convertors)

{'inches_in_feet': 12, 'inches_in_metre': 39, 'metres_in_mile': 1609.34}


In [27]:
# Looking up a key that is not in the dictionary raises a KeyError
print(convertors['blueberry'])

KeyError: 'blueberry'
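If you are not sure whether a key exists, two standard ways to avoid the `KeyError` are the `in` operator and the dictionary's `get` method. A small sketch, using a made-up stand-in dictionary called `lengths`:

```python
# A small stand-in dictionary; the name and contents are illustrative.
lengths = {'inches_in_feet': 12}

has_feet = 'inches_in_feet' in lengths
has_blueberry = 'blueberry' in lengths
print(has_feet, has_blueberry)

# .get returns None (or a default you supply) instead of raising KeyError
missing = lengths.get('blueberry')
missing_with_default = lengths.get('blueberry', 0)
print(missing, missing_with_default)
```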

## Defining Functions or Learning How Not to Repeat Yourself

One way to write a program is to simply string together commands, like the ones described above, in a long
file, and then to run that file to generate your results. This may work, but it can be cognitively difficult
to follow the logic of programs written in this style. Also, it does not allow you to reuse your code
easily - for example, what if we wanted to run our logistic growth model for several different choices of
initial parameters?

The most important way to "chunk" code into more manageable pieces is to create functions and then
to gather these functions into modules, and eventually packages. Below we will discuss how to create
functions and modules. A third common type of "chunk" in Python is classes, but we will not be covering
object-oriented programming in this class.

In [28]:
x = 3.333333
print(round(x, 2))
print(np.sin(x))

3.33
-0.19056763565080653


It is very easy to write your own functions.

In [29]:
# It's very easy to write your own functions
def multiply(x, y):
    z = x*y
    return z

Note that in the above, the function `return`s the value `z`. A function will often `return` a value, which allows you to write `answer = multiply(5, 4)` and have the variable `answer` hold `20`. If you leave out the `return` line and write `answer = multiply(5, 4)`, you will not get an error, but the variable `answer` will hold `None` (Python's way of saying "nothing") rather than `20`.
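To make the point concrete, here is a sketch of the same function with the `return` line left out. The names `multiply_no_return` and `result` are made up for this illustration.

```python
def multiply_no_return(x, y):
    z = x * y          # computed, but never handed back to the caller

result = multiply_no_return(5, 4)
print(result)          # shows None, not 20
```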

In [30]:
# Once a function is "run" and saved in memory, it's available just like any other function
print(type(multiply))
print(multiply(4, 3))

<class 'function'>
12


In [31]:
# It's useful to include docstrings to describe what your function does
def say_hello(time, people):
    '''
    Function says a greeting. Useful for engendering goodwill
    '''
    return 'Good ' + time + ', ' + people

**Docstrings**: A docstring is a special string that describes what a function does. It goes between `'''` and `'''` at the beginning of a function. You can see it when you ask for help about a function. The docstring is completely optional, and we will often define functions that do not have one.
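As a sketch of two standard ways to read a docstring in plain Python, the built-in `help` function prints it along with the function's signature, and the `__doc__` attribute holds the raw text. The function `say_goodbye` below is made up for the example.

```python
def say_goodbye(people):
    '''
    Function says a farewell.
    '''
    return 'Goodbye, ' + people

help(say_goodbye)              # prints the signature and the docstring
print(say_goodbye.__doc__)     # the raw docstring text
```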

In [32]:
say_hello('afternoon', 'friends')

'Good afternoon, friends'

In [33]:
?say_hello

Signature: say_hello(time, people)
Docstring: Function says a greeting. Useful for engendering goodwill
File:      /tmp/ipykernel_219/1345993101.py
Type:      function

In [34]:
# All required arguments must be present, or the function will raise an error
say_hello('afternoon')

TypeError: say_hello() missing 1 required positional argument: 'people'

In [36]:
# Keyword arguments can be used to make some arguments optional by giving them a default value
# All mandatory arguments must come first, in order
def say_hello(time, people='friends'):
    return 'Good ' + time + ', ' + people

In [37]:
say_hello('afternoon')

'Good afternoon, friends'

In [38]:
say_hello('afternoon', 'students')

'Good afternoon, students'
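Arguments can also be passed by name, in which case their order no longer matters. A brief sketch with a made-up function `greet` that mirrors `say_hello`:

```python
def greet(time, people='friends'):
    return 'Good ' + time + ', ' + people

by_name = greet(time='evening')                       # default used for people
swapped = greet(people='students', time='morning')    # order does not matter
print(by_name)
print(swapped)
```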

You may define new variables inside of a function in order to help you do what you want to do. For example, the below code will add two variables and then double the result.

In [41]:
def hidden_variable(a, b):
    c = a + b
    return c*2

In [42]:
hidden_variable(4, 5)

18

Note that in the above function `hidden_variable`, there is a variable called `c`. It is only accessible within that function. If you try to print it outside of the function, look at what happens:

In [43]:
print(c)

NameError: name 'c' is not defined

Moreover, `c` is created anew every time the function is used, so it acts as a temporary variable that is only used in that function and then thrown away. This is often useful for keeping track of intermediate values inside a function.
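A sketch of this behavior: the made-up function `add_then_double` below uses a local variable called `total`, while a separate variable with the same name exists outside the function. Calling the function leaves the outer one unchanged.

```python
total = 100                      # defined outside the function

def add_then_double(a, b):
    total = a + b                # a new, local `total`; the outer one is untouched
    return total * 2

first = add_then_double(4, 5)
second = add_then_double(1, 1)   # the local `total` starts fresh on each call
print(first, second)
print(total)                     # still the outer value
```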

### Problems 3-5

Problems 3-5 require you to complete a function so that it returns the specified output for a given input. When you are writing functions, remember that the body of the function needs to be indented. Also, after each problem, there are tests associated with your answer. If you can run the cell after your answer and not get any errors, then you very likely have gotten the question right. If you have errors, hopefully the error will help you identify the mistake.

Note that just because you don't get errors doesn't mean that you got the question correct. I have some additional tests held back that I do not show here, though if you pass the ones shown, you will likely pass those as well.

#### IMPORTANT Note About `return` vs `print`

For the following problems, you will need to make sure that your function `return`s the answer. Do not just `print()` your answer. While it will look like your function is doing the right thing when you run it by itself, the function will not actually be returning any answer at all and the grading cell will give an error and count it as wrong.

As an example, the following code is a well written function with proper indentation and a `return` statement. This function will take in two numbers (called `a` and `b`) and return the sum of the two numbers.

In [44]:
def sum_two(a, b):
    answer = a + b
    return answer

And we can see that the function works correctly:

In [45]:
sum_two(5, 7)

12

Note that the cell displayed `12` as its output, but the function did not `print` 12. An incorrect function (that looks like it gives the right answer) that uses `print` instead of `return` is:

In [46]:
def print_sum_two(a, b):
    answer = a + b
    print(answer)

We can see that it looks like it does the right thing:

In [47]:
print_sum_two(5, 7)

12


However, if you try to use the function to assign to a variable, you will get very different results.

In [48]:
right_answer = sum_two(5, 7)
wrong_answer = print_sum_two(5, 7)

12


When you look at what is in the variables, you will see the difference:

In [49]:
right_answer

12

In [50]:
wrong_answer

As you can see, nothing is displayed for the `wrong_answer` variable because nothing was `return`ed to it (the variable holds `None`). The `print` function just puts the value on the screen, but it doesn't pass the answer back. This means that you cannot assign the answer to a variable (and incidentally, it means you will get the problem wrong if you use `print` instead of `return`).
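One way to see this directly is to `print` the variable, which shows `None`, Python's placeholder for "no value". The sketch below defines small stand-ins (`returns_sum` and `prints_sum`, both made-up names) so that it runs on its own.

```python
def returns_sum(a, b):
    return a + b

def prints_sum(a, b):
    print(a + b)               # shows the value, but returns nothing

kept = returns_sum(5, 7)
lost = prints_sum(5, 7)        # prints 12 as a side effect

print(kept)                    # the returned value
print(lost)                    # None
print(lost is None)
```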

Therefore, when you are solving the below problems, make sure that you `return` the answer, and not just `print` it. If you use `print`, you will get the question wrong. The grading cell will tell you that the question is wrong as well. **Please pay careful attention to this.**

#### Problem \#3 - 1 point

Complete the function called `double` that doubles a given number `n`. I.e., if you write `double(4)`, it should `return` `8`. You may want to refer back to the "Operators" section of the "Intro to Python.ipynb" notebook.

The grading cell will check the function at several values. If the function does not return the expected answer, then you will see an error. When the assignment is graded, there will be additional checks, but if you do not see an error when running the grading cell, you should feel fairly confident that your solution is correct. If you get an error, then your solution is incorrect.

In [53]:
# YOUR ANSWER SHOULD GO IN THIS CELL. DO NOT COPY THIS CELL.
def double(n):
    return n * 2

In [54]:
# THIS IS A GRADING CELL. DO NOT EDIT OR COPY.
from nose.tools import assert_equal
assert_equal(double(4), 8)
assert_equal(double(3.5), 7)

#### Problem \#4 - 1 point

Complete the function called `make_abba` such that given two strings, `a` and `b`, `make_abba` returns the result of putting them together in the order abba, e.g. "Hi" and "Bye" returns "HiByeByeHi". You may want to refer back to the "Intro to Python.ipynb" section on "String Operators".

The grading cell will check the function at several values. If the function does not return the expected answer, then you will see an error. When the assignment is graded, there will be additional checks, but if you do not see an error when running the grading cell, you should feel fairly confident that your solution is correct. If you get an error, then your solution is incorrect.

In [58]:
# YOUR ANSWER SHOULD GO IN THIS CELL. DO NOT COPY THIS CELL.
def make_abba(a, b):
    return a + b + b + a

In [57]:
# THIS IS A GRADING CELL. DO NOT EDIT.
from nose.tools import assert_equal
assert_equal(make_abba("Hi", "Bye"), "HiByeByeHi")
assert_equal(make_abba("a", "b"), "abba")

#### Problem \#5 - 1 point

Complete the function called `double_and_replace` that, given a string (called `string`), returns the string repeated twice with every letter 'a' replaced by 'b'. I.e., given the string "apple", it should return "bpplebpple". You may want to refer back to the "Intro to Python.ipynb" section on "String Operators" and "Methods".

The grading cell will check the function at several values. If the function does not return the expected answer, then you will see an error. When the assignment is graded, there will be additional checks, but if you do not see an error when running the grading cell, you should feel fairly confident that your solution is correct. If you get an error, then your solution is incorrect.
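As a quick refresher on the string tools involved, here is a sketch on a made-up string called `word`. It shows `+` for joining, `*` for repeating, and the `replace` method, which returns a new string and leaves the original unchanged.

```python
word = 'banana'                    # made-up example string
joined = word + '!'
repeated = word * 2
swapped = word.replace('n', 'm')   # returns a new string
print(joined)
print(repeated)
print(swapped)
print(word)                        # the original is unchanged
```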

In [59]:
# YOUR ANSWER SHOULD GO IN THIS CELL. DO NOT COPY THIS CELL.
def double_and_replace(string):
    return string.replace('a', 'b') * 2

In [60]:
# THIS IS A GRADING CELL. DO NOT EDIT.
from nose.tools import assert_equal
assert_equal(double_and_replace("apple"), "bpplebpple")
assert_equal(double_and_replace("test"), "testtest")