# Discussion 0

This is the first (or 0th) discussion of Data 100. It covers basic Python, calculus, and probability concepts. 

## Question 1

For this question, write your answer on paper.

### Part A

Find the value of $x$ that minimizes the function $f(x) = x^2 + 4x + 4$.


We first set the derivative equal to 0 and solve for $x$.

$$\frac{d}{dx} f(x) = 2x + 4$$

$$2x + 4 = 0$$

$$x = -2$$

We then check that the second derivative is positive, which shows that the curve is convex and confirms that this critical point is indeed a minimum:

$$\frac{d^2}{dx^2} f(x) = 2$$ 

Since the second derivative is positive, we know that $x = -2$ is a minimum.
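
As a quick numerical sanity check of the result above, the sketch below evaluates $f$ on a grid of candidate points and picks the one with the smallest value. The grid bounds and spacing are arbitrary choices made for illustration.

```python
def f(x):
    """The quadratic from Part A."""
    return x ** 2 + 4 * x + 4

# Candidate points from -10 to 10 in steps of 0.01 (bounds chosen arbitrarily).
xs = [i / 100 for i in range(-1000, 1001)]

# Pick the candidate with the smallest function value.
x_min = min(xs, key=f)
print(x_min)     # -2.0
print(f(x_min))  # 0.0
```

A grid search like this only confirms the answer to within the grid spacing; the calculus argument is what shows that $x = -2$ is the exact minimizer.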


### Part B

Calculate the partial derivative of the following expression with respect to $x$.

$$f(x, y) = xy + \sin(x^2y) + \ln(x^3y)$$

**Note**: The partial derivative of a function with respect to a variable $x$ is the derivative of the function with respect to $x$, holding all other variables constant.


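Using the log rule $\ln(x^3y) = 3\ln(x) + \ln(y)$ (valid for $x, y > 0$), we first rewrite the function:

$$f(x, y) = xy + \sin(x^2y) + 3\ln(x) + \ln(y)$$

Differentiating term by term, with the chain rule applied to the sine term, gives:

$$\frac{\partial}{\partial x} f(x, y) = y + 2xy\cos(x^2y) + \frac{3}{x}$$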


### Part C

Now, calculate the partial derivative of the expression from Part B with respect to $y$.

$$f(x, y) = xy + \sin(x^2y) + 3\ln(x) + \ln(y)$$

$$\frac{\partial}{\partial y} f(x, y) = x + x^2\cos(x^2y) + \frac{1}{y}$$
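
One way to double-check the partial derivatives from Parts B and C is to compare them against central finite differences. The sketch below does this at a single test point; the point `(x0, y0)`, the step size `h`, and the function names `df_dx` and `df_dy` are illustrative choices, not part of the original problem.

```python
import math

def f(x, y):
    """The function from Part B (requires x > 0 and y > 0 for the log)."""
    return x * y + math.sin(x ** 2 * y) + math.log(x ** 3 * y)

def df_dx(x, y):
    """Analytic partial derivative with respect to x (Part B)."""
    return y + 2 * x * y * math.cos(x ** 2 * y) + 3 / x

def df_dy(x, y):
    """Analytic partial derivative with respect to y (Part C)."""
    return x + x ** 2 * math.cos(x ** 2 * y) + 1 / y

# Arbitrary test point and step size.
x0, y0, h = 1.5, 0.7, 1e-6

# Central differences: vary one variable while holding the other constant.
num_dx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
num_dy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)

print(abs(num_dx - df_dx(x0, y0)) < 1e-5)
print(abs(num_dy - df_dy(x0, y0)) < 1e-5)
```

Agreement at one point does not prove the formulas are correct everywhere, but a mismatch would be a strong hint that a term was dropped.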


## Question 2

For this question, write your answer on paper.

**Are you smarter than a doctor? ONLY 46% OF DOCTORS GOT THIS QUESTION RIGHT.**

100 out of 10,000 women at age forty who participate in routine screening have breast cancer. 80 of those 100 women with breast cancer test positive. 950 out of 9,900 women without breast cancer also test positive. If 10,000 women in this age group undergo a routine screening, about what fraction of these women with positive tests will actually have breast cancer?

Always begin by figuring out what you want to know. In this case, we want to know what fraction (or percentage) of the women with positive tests actually have breast cancer.

First, let’s figure out how many women have positive tests. That’s the denominator of our fraction.

The story above says that 950 of the 9,900 that do not have breast cancer will test positive. So that’s 950 women with a positive test result right there.

The story also says that 80 out of the 100 women who do have breast cancer will get a positive test result. So that’s another 80 women, and 950 + 80 = 1,030 women with a positive test result.

Good. We've got half our fraction. Now, how do we find the numerator? How many of those 1,030 women with a positive test result actually have breast cancer?

Well, the story says that 80 of the 100 women with breast cancer will get a positive test result, so 80 is our numerator.

The fraction of women with positive test results who actually have breast cancer is 80/1,030 ≈ 0.078, or about 7.8%.

So if one of these 40-year-old women tests positive, and the doctor knows the above statistics, the doctor should tell her that she has only about a 7.8% chance of having breast cancer, even though her mammogram was positive. That’s much less stressful for the woman than being told she has a 70%-80% chance of having breast cancer, as most doctors from 1976 apparently would have said!
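
The counting argument above can be reproduced in a few lines of Python. The sketch below first divides the counts directly, then recomputes the same quantity with Bayes' rule; all variable names are chosen here for illustration.

```python
# Counts taken from the problem statement.
with_cancer = 100
without_cancer = 9_900
true_positives = 80    # women with breast cancer who test positive
false_positives = 950  # women without breast cancer who test positive

# Direct counting: positives with cancer divided by all positives.
all_positives = true_positives + false_positives
p_cancer_given_positive = true_positives / all_positives
print(all_positives)                      # 1030
print(round(p_cancer_given_positive, 3))  # 0.078

# The same quantity via Bayes' rule, using probabilities instead of counts.
p_cancer = with_cancer / (with_cancer + without_cancer)
p_pos_given_cancer = true_positives / with_cancer
p_pos_given_no_cancer = false_positives / without_cancer
posterior = (p_pos_given_cancer * p_cancer) / (
    p_pos_given_cancer * p_cancer + p_pos_given_no_cancer * (1 - p_cancer)
)
print(round(posterior, 3))                # 0.078
```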


## Question 3

Suppose we have the following list `lst` and array `arr`.

In [None]:
import numpy as np
arr = np.arange(5)
lst = list(range(5))

In [None]:
arr

In [None]:
lst

What will be the output of the following lines of Python/NumPy code? Try to predict the output before running the cell.

In [None]:
arr + 5

In [None]:
lst + 5

In [None]:
arr * 2

In [None]:
lst * 2
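
After making your predictions, the following sketch summarizes the behavior: arithmetic on a NumPy array is applied element by element, whereas `+` on a list means concatenation (so adding an integer fails) and `*` means repetition. The `try`/`except` is added here only so that the whole cell can run without stopping at the error.

```python
import numpy as np

arr = np.arange(5)
lst = list(range(5))

# Array arithmetic is elementwise.
print(arr + 5)  # [5 6 7 8 9]
print(arr * 2)  # [0 2 4 6 8]

# Multiplying a list by an integer repeats it.
print(lst * 2)  # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

# Adding an integer to a list is not defined and raises a TypeError.
try:
    lst + 5
except TypeError as err:
    print("TypeError:", err)
```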

## Question 4

***Note:*** The line `raise NotImplementedError()` indicates that the implementation still needs to be added. This is an exception derived from `RuntimeError`. Please comment out that line when you have implemented the function.
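
To illustrate the note above, here is a minimal sketch of what happens when a placeholder is called before it has been implemented. The function `unfinished` is hypothetical and not part of the assignment.

```python
def unfinished(n):
    """Hypothetical placeholder function that has not been implemented."""
    raise NotImplementedError()

# NotImplementedError is a subclass of RuntimeError.
print(issubclass(NotImplementedError, RuntimeError))  # True

# Calling the placeholder raises the exception.
try:
    unfinished(3)
except NotImplementedError:
    print("unfinished() has not been implemented yet")
```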


### Part A

Write a function that returns a list of numbers such that $ x_i=i^2 $ for $1\leq i \leq n$. Don't worry about the case where $n \leq 0$.

In [None]:
def squares(n):
    """Compute the squares of numbers from 1 to n such that the ith element of the returned list equals i^2."""
    ### BEGIN SOLUTION
    if n < 1:
        raise ValueError("n must be greater than or equal to 1")
    return [i ** 2 for i in range(1, n + 1)]
    ### END SOLUTION

Your function should return `[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]` for $n=10$.  
Check that it does:

In [None]:
squares(10)

Check that `squares` returns the correct output for several inputs.

In [None]:
assert squares(1) == [1]
assert squares(2) == [1, 4]
assert squares(10) == [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
assert squares(11) == [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121]

### Part B

Evaluate the following summation (you may use code):

$$\sum_{i=1}^{100} \left(i^3 + 3 i^2\right)$$

In [None]:
### BEGIN SOLUTION
# Sum i^3 + 3i^2 for i = 1, ..., 100 (range excludes its upper bound).
q2b_sum = sum(i ** 3 + 3 * i ** 2 for i in range(1, 101))
### END SOLUTION

Check that your sum is correct.

In [None]:
assert q2b_sum == 26517550
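
The sum can also be checked without a loop by using the standard closed forms $\sum_{i=1}^{n} i^3 = \left(\frac{n(n+1)}{2}\right)^2$ and $\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6}$. The sketch below is an optional cross-check, with variable names chosen here for illustration.

```python
n = 100

# Closed forms for the sum of cubes and the sum of squares.
sum_cubes = (n * (n + 1) // 2) ** 2
sum_squares = n * (n + 1) * (2 * n + 1) // 6

closed_form = sum_cubes + 3 * sum_squares
print(sum_cubes)    # 25502500
print(sum_squares)  # 338350
print(closed_form)  # 26517550
```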

### Part C

Write a function `map_func` that implements mapping and filtering on a list of values `list_vals`. The parameter `mapper` is a function that takes an input `x` and maps it to a new value of the same type. The parameter `filter_func` is a function that takes an input `y` and returns a boolean indicating whether `y` satisfies a certain condition. In short, `map_func` should apply `mapper` to each value in `list_vals` that satisfies the condition established by `filter_func`, and return the resulting list.

**Note**: To see examples of `map_func` in use, look at the tests below.

In [None]:
def map_func(list_vals, mapper, filter_func):
    """Filter list_vals with filter_func, then apply mapper to each element that passes.

    Return a list containing mapper(x) for every x in list_vals on which filter_func returns True."""
    ### BEGIN SOLUTION
    return [mapper(i) for i in list_vals if filter_func(i)]
    ### END SOLUTION

Check that `map_func` returns the correct output for several inputs.

In [None]:
assert map_func([1, 2, 3], lambda x: x*2, lambda x: x > 1) == [4, 6]
assert map_func([], lambda x: x*1000, lambda x: x > -10) == []
assert map_func(["piglet", "gavin", "jim", "andy"], lambda x: x[:0:-1], lambda x: len(x) < 6) == ["niva", "mi", "ydn"]
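
Note that in the tests above the filter is applied to the original values, not the mapped ones. For comparison, the same behavior can be sketched with Python's built-in `filter` and `map`; the name `map_func_builtin` is hypothetical and used only for this illustration.

```python
def map_func_builtin(list_vals, mapper, filter_func):
    """Hypothetical alternative: filter the original values first, then map."""
    return list(map(mapper, filter(filter_func, list_vals)))

# 1 fails the filter; 2 and 3 pass and are then doubled.
print(map_func_builtin([1, 2, 3], lambda x: x * 2, lambda x: x > 1))  # [4, 6]
```

The list comprehension and the `map`/`filter` version are equivalent here; which one to use is largely a matter of readability.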

## Question 5

### Part A

Write a function which takes in a string and returns `True` if the string is a palindrome. (A string is a palindrome if it is the same forwards and backwards.)

In [None]:
def is_palindrome(word):
    """Return True if word is a palindrome."""
    ### BEGIN SOLUTION
    if len(word) <= 1:
        return True
    return word[0] == word[-1] and is_palindrome(word[1:-1])
    ### END SOLUTION

Your function should return `True` for `"racecar"`.

In [None]:
is_palindrome("racecar")

Check that the function works for several inputs.

In [None]:
assert is_palindrome("aviddiva") == True
assert is_palindrome("clearlynotapalindrome") == False
assert is_palindrome("kayak") == True
assert is_palindrome("ab") == False
assert is_palindrome("abb") == False
assert is_palindrome("a") == True
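
A common non-recursive alternative compares the string with its reverse, which slicing with a step of `-1` produces. The sketch below uses the hypothetical name `is_palindrome_slice` to keep it separate from the function you wrote above.

```python
def is_palindrome_slice(word):
    """Hypothetical alternative: a palindrome equals its own reverse."""
    return word == word[::-1]

print(is_palindrome_slice("racecar"))  # True
print(is_palindrome_slice("abb"))      # False
```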

### Part B

Write a function that flattens a nested Python list.

In [None]:
def flatten(lst):
    """Flattens the input list lst so that there are no nested lists."""
    ### BEGIN SOLUTION
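    # Base case: an empty list is already flat.
    if len(lst) == 0:
        return []
    # If the first element is itself a list, flatten it and the rest separately.
    if isinstance(lst[0], list):
        return flatten(lst[0]) + flatten(lst[1:])
    # Otherwise keep the first element and flatten the rest.
    return lst[:1] + flatten(lst[1:])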
    ### END SOLUTION

Check that the function works for several inputs.

In [None]:
assert flatten([1, 2, 3]) == [1, 2, 3]
assert flatten([1, 2, [3, 4]]) == [1, 2, 3, 4]
assert flatten([1, 2, [3, [4]], 5]) == [1, 2, 3, 4, 5]
assert flatten([1, 2, [3, [4]], [5, [[6]]]]) == [1, 2, 3, 4, 5, 6]
assert flatten([1, 2, [3, [[4]]], [5, [[6]]], [[[7]]]]) == [1, 2, 3, 4, 5, 6, 7]

Although Python lists have no built-in `flatten` method, NumPy arrays do. For example, to flatten a NumPy array `arr` we would call `arr.flatten()`, which returns a one-dimensional copy of the array.
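
As a small illustration, the sketch below flattens a two-dimensional array; the example values are arbitrary. It also shows that the result is a copy, so modifying it leaves the original array unchanged.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat = arr.flatten()
print(flat)        # [1 2 3 4 5 6]
print(flat.shape)  # (6,)

# flatten returns a copy, so changing it does not affect arr.
flat[0] = 100
print(arr[0, 0])   # 1
```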
