# Lambda School Data Science - Getting Started with Python

Complete the following exercises after watching the first intro lecture. Workflow:

1. Sign in to a Google account
2. Copy the notebook (`File` -> `Save a copy in Drive`)
3. Complete the exercises! This means filling out the code cells and running them (shift-enter, or click the play button that appears when you're in one of them)
4. Take a look at your work, and write comments/add text cells as appropriate to explain it
5. Make the notebook URL viewable and submit with the standup form

## Exercise 1 - A bit of Math

For these "word" problems, use Python to clearly solve them. Your code will "show your work" - use good variable names! To show your answers you should write a `print()` statement at the end.

As you work, follow the **20-minute rule**: if you're stuck on something for 20 minutes, ask a question!

### a) It's a gas

A taxi driver is calculating their profit over two weeks by adding up the fares they charge and subtracting the cost of gas. The price of gas changes over time - it was `$3.52`/gallon the first week and `$3.57`/gallon this second week. Their car gets 20 miles per gallon.

For the first week the driver had a total of 23 passengers with average `$29` fare each, and drove a total of 160 miles. For the second week they had 17 passengers with average `$30` fare each, and drove a total of 220 miles. Assume that for both weeks they purchase all the gas needed during that week (i.e. they refuel every week to maintain a constant level of gas in the tank).

Based on the above, answer the following questions:

- What is their total profit over both weeks?
- During which week was their average (mean) profit per passenger higher?


In [0]:
# A) profit = revenue - total cost
MILES_PER_GALLON = 20

# Gas cost per week: gallons used * price per gallon
gallons_used_week1 = 160 / MILES_PER_GALLON
gas_cost_week1 = 3.52 * gallons_used_week1
gallons_used_week2 = 220 / MILES_PER_GALLON
gas_cost_week2 = 3.57 * gallons_used_week2

# Revenue per week: passengers * average fare
passengers_week1 = 23
passengers_week2 = 17
average_fare_week1 = 29
average_fare_week2 = 30
revenue_week1 = passengers_week1 * average_fare_week1
revenue_week2 = passengers_week2 * average_fare_week2

# Profit per week, then the total over both weeks
profit_week1 = revenue_week1 - gas_cost_week1
profit_week2 = revenue_week2 - gas_cost_week2
total_profit = profit_week1 + profit_week2
print(f'Their total profit over both weeks is {total_profit:.2f}')


# B) Week with higher profit per passenger:
# divide each week's own profit by that week's passenger count
profit_per_passenger_week1 = profit_week1 / passengers_week1
profit_per_passenger_week2 = profit_week2 / passengers_week2
higher_week = 'first' if profit_per_passenger_week1 > profit_per_passenger_week2 else 'second'
print(f"The average profit per passenger is {profit_per_passenger_week1:.2f} in the first week and {profit_per_passenger_week2:.2f} in the second week. Therefore the average profit per passenger is higher in the {higher_week} week")


Their total profit over both weeks is 1109.57
The average profit per passenger is 27.78 in the first week and 27.69 in the second week. Therefore the average profit per passenger is higher in the first week


### b) Mo' money...

A cash drawer contains 160 bills, all 10s and 50s. The total value of the 10s and 50s is $1,760.

How many of each type of bill are in the drawer? You can figure this out by trial and error (or by doing algebra with pencil and paper), but try to use loops and conditionals to check plausible possibilities and stop when you find the correct one.

In [0]:
# TODO your code here!
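
One possible approach to the cash drawer problem is sketched below. It is not the only way to solve it. The helper name `count_bills` and its parameters are hypothetical, chosen only for illustration. The loop tries every possible number of 10s, derives the number of 50s from the total bill count, and stops at the first split whose value matches.

```python
TOTAL_BILLS = 160
TOTAL_VALUE = 1760


def count_bills(total_bills, total_value):
    """Return (tens, fifties) matching both totals, or None if no split works.

    Hypothetical helper for illustration; the name is not from the assignment.
    """
    for tens in range(total_bills + 1):
        # Whatever is not a 10 must be a 50.
        fifties = total_bills - tens
        if 10 * tens + 50 * fifties == total_value:
            # Stop as soon as the totals match.
            return tens, fifties
    return None


print(count_bills(TOTAL_BILLS, TOTAL_VALUE))  # (156, 4)
```

You can check the result with algebra: if `t + f = 160` and `10t + 50f = 1760`, then `40f = 160`, so `f = 4` and `t = 156`.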

## Exercise 2 - Drawing a plot

Use NumPy and Matplotlib to draw a scatterplot of uniform random `(x, y)` values all drawn from the `[0, 1]` interval. Helpful documentation:

*   https://matplotlib.org/tutorials/index.html
*   https://docs.scipy.org/doc/numpy/user/quickstart.html

Stretch goal - draw more plots! You can refer to the [Matplotlib gallery](https://matplotlib.org/gallery.html) for inspiration, but don't just reproduce something - try to apply it to your own data.

How to get data? There are *many* ways, but a good place to get started is with [sklearn.datasets](http://scikit-learn.org/stable/datasets/index.html):

```
from sklearn import datasets
dir(datasets)
```

In [0]:
import matplotlib.pyplot as plt
import numpy as np
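
A minimal sketch of the scatterplot, assuming NumPy 1.17 or later for `np.random.default_rng`. The seed, the number of points, and the axis labels are arbitrary choices, not requirements of the exercise. Note that `rng.uniform(0, 1, ...)` samples from the half-open interval `[0, 1)`.

```python
import matplotlib.pyplot as plt
import numpy as np

# Seeded generator so the plot is reproducible; the seed value is arbitrary.
rng = np.random.default_rng(seed=42)

# 100 is an arbitrary sample size.
x = rng.uniform(0, 1, size=100)
y = rng.uniform(0, 1, size=100)

fig, ax = plt.subplots()
ax.scatter(x, y, alpha=0.7)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Uniform random points")
plt.show()
```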

## Exercise 3 - Writing a function
Write a function that, given a list of numbers, calculates the mean, median, and mode of those numbers. Return a dictionary with keys for the mean, median, and mode.

For example:

```
mmm_dict = meanMedianMode([1, 2, 6, 7, 8, 9, 3, 4, 5, 10, 10])
print(mmm_dict)
> {'mean': 5.909090909090909, 'median': 6, 'mode': 10}
```

There are Python standard libraries that make calculating these numbers very easy, but first try your hand at implementing it using the `reduce()` function:

In [0]:
from functools import reduce
from statistics import median
from statistics import mode
from statistics import mean
help(reduce)

Help on built-in function reduce in module _functools:

reduce(...)
    reduce(function, sequence[, initial]) -> value
    
    Apply a function of two arguments cumulatively to the items of a sequence,
    from left to right, so as to reduce the sequence to a single value.
    For example, reduce(lambda x, y: x+y, [1, 2, 3, 4, 5]) calculates
    ((((1+2)+3)+4)+5).  If initial is present, it is placed before the items
    of the sequence in the calculation, and serves as a default when the
    sequence is empty.
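
To make the help text concrete, here is a small sketch of `reduce()` on a short list. The variable names are illustrative only. It shows a running sum, a running maximum, and the optional initial value acting as the default for an empty sequence.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

# Running sum: ((((1+2)+3)+4)+5)
total = reduce(lambda x, y: x + y, numbers)
print(total)  # 15

# Running maximum: keep whichever of the two arguments is larger.
largest = reduce(lambda a, b: a if a > b else b, numbers)
print(largest)  # 5

# With an initial value, an empty sequence returns that value instead of raising TypeError.
empty_total = reduce(lambda x, y: x + y, [], 0)
print(empty_total)  # 0
```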



In [0]:
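def meanMedianMode(numbers):
    # Mean: reduce() sums the values, then divide by the count.
    mean_value = reduce(lambda total, n: total + n, numbers) / len(numbers)
    # Median: statistics.median sorts the data and handles even-length lists.
    median_value = median(numbers)
    # Mode: build a {value: count} tally with reduce(), then pick the most frequent value.
    # (The previous version took the maximum value, which only matched the mode by coincidence.)
    counts = reduce(lambda tally, n: {**tally, n: tally.get(n, 0) + 1}, numbers, {})
    mode_value = max(counts, key=counts.get)
    return {'mean': mean_value, 'median': median_value, 'mode': mode_value}

print(meanMedianMode([1, 2, 6, 7, 8, 9, 3, 4, 5, 10, 10]))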

{'mean': 5.909090909090909, 'median': 6, 'mode': 10}
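
For comparison, the standard-library route mentioned above can be sketched with the `statistics` module. The function name `mean_median_mode_stdlib` is hypothetical, used here only to keep it separate from the `reduce()` version.

```python
from statistics import mean, median, mode


def mean_median_mode_stdlib(numbers):
    """Return mean, median, and mode using only the statistics module."""
    return {
        'mean': mean(numbers),
        'median': median(numbers),
        'mode': mode(numbers),
    }


print(mean_median_mode_stdlib([1, 2, 6, 7, 8, 9, 3, 4, 5, 10, 10]))
# {'mean': 5.909090909090909, 'median': 6, 'mode': 10}
```

Running both versions on the same input is a quick way to check a hand-written implementation.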
