# Intro to Python Packages

- Python comes with many [Built-In Functions](https://docs.python.org/3/library/functions.html)
    - e.g. print, type, range, etc
- But **most of the functionality we need as data scientists is not included** in base Python.


- We can **download other collections of functions and classes, called packages** (a.k.a. libraries or modules).
    - Python has a Package Index (PyPI) that is basically an app store for Python.

    - In a code cell, we can install any PyPI package we need using:
        - `!pip install <package name>`


- **Packages You will Be Using in Stack 1:**
    - [NumPy](https://numpy.org/doc/stable/index.html#)
    - [Pandas](https://pandas.pydata.org/docs/)
    - [Matplotlib](https://matplotlib.org/)
    - [Seaborn](https://seaborn.pydata.org/)




- Thankfully, Google Colab has most of the packages we need already installed!
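
Before installing anything, it can help to check whether a package is already available in the current environment. The following is a minimal sketch using the standard library's `importlib`; the helper name `is_installed` is hypothetical and not part of any package.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


# Check the packages listed above (results depend on your environment)
for name in ["numpy", "pandas", "matplotlib", "seaborn"]:
    print(f"{name}: {is_installed(name)}")
```

If any of these print `False`, installing the package with `!pip install` in a code cell would be the next step.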


## Importing Packages & Modules

- When we import a package, we can import it under its full name:
```python
import numpy
```

- We can also give it an alias/handle (a short nickname):
```python
import numpy as np
```

In [2]:
# import numpy
import numpy
numpy

<module 'numpy' from '/usr/local/lib/python3.7/dist-packages/numpy/__init__.py'>

In [3]:
## import numpy with an alias
import numpy as np
np

<module 'numpy' from '/usr/local/lib/python3.7/dist-packages/numpy/__init__.py'>

In [4]:
## functions and classes stored in the package are referenced with dot notation (package.name)
np.array

<function numpy.array>
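
As a quick check that an alias is only a second name for the same module (and not a copy), the sketch below compares the two names directly.

```python
import numpy
import numpy as np

# Both names point at the same module object
same_module = np is numpy
print(same_module)

# So anything reached through the alias is the same object as well
same_function = np.array is numpy.array
print(same_function)
```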

### Submodules

- Packages can be made of smaller pieces called submodules. 
    - Submodules allow functions to be organized in a helpful way.
    - Numpy has a submodule called `np.random` that contains functions related to generating or selecting data based on random chance.  


In [5]:
## show the np.random module
np.random

<module 'numpy.random' from '/usr/local/lib/python3.7/dist-packages/numpy/random/__init__.py'>

In [6]:
## Can't choose a dinner option? Let numpy do it!
np.random.choice(['Cheeseburger','Chicken Tikka Masala','Lasagna', "Filet Mignon"])

'Cheeseburger'
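
The cell above can return a different dish every time it runs. If a repeatable result is wanted, one option is a seeded random generator. This is a minimal sketch assuming NumPy's `np.random.default_rng`; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

options = ['Cheeseburger', 'Chicken Tikka Masala', 'Lasagna', 'Filet Mignon']

# Two generators created with the same seed produce the same sequence of picks
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

pick_a = rng_a.choice(options)
pick_b = rng_b.choice(options)

print(pick_a, "|", pick_b)
```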

# Why NumPy?

- Python lists and tuples are not efficient with large amounts of data. 
- Linear algebra has a lot of helpful mathematical manipulations we can use.
- We need a way to store our data in an organized linear fashion.
>- The solution: numpy arrays!
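
To make the difference concrete, here is a small sketch (with made-up numbers) of what happens when a list and an array are each multiplied by 2: the list is repeated, while the array is multiplied element by element.

```python
import numpy as np

numbers_list = [1, 2, 3]
numbers_array = np.array(numbers_list)

# Multiplying a list repeats its contents
doubled_list = numbers_list * 2
print(doubled_list)   # [1, 2, 3, 1, 2, 3]

# Multiplying an array applies the math to every element
doubled_array = numbers_array * 2
print(doubled_array)  # [2 4 6]
```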



## Working with NumPy Arrays


- Make a `calories_per_serving` array with the calories per serving:

|                      |   Calories Per Serving |
|:---------------------|-----------------------:|
| Cheeseburger         |                    740 |
| Chicken Tikka Masala |                    240 |
| Lasagna              |                    408 |
| Filet Mignon         |                    301 |



In [7]:
## How many calories are in each dish? (source: www.calorieking.com)
calories_per_serving = np.array([740, 240, 408, 301])
calories_per_serving

array([740, 240, 408, 301])


- Make a `prices` array with their prices:

|                      |   Price |
|:---------------------|--------:|
| Cheeseburger         |    8.5  |
| Chicken Tikka Masala |   12.5  |
| Lasagna              |   11    |
| Filet Mignon         |   15.75 |


    

In [15]:
# what is the price? https://www.numbeo.com/food-prices/
prices = np.array([8.5, 12.5, 11, 15.75])
prices

array([ 8.5 , 12.5 , 11.  , 15.75])

### Q1: What would our total calories be if we ate:

- 2 servings of lasagna, 1 filet mignon, and 3 cheeseburgers?

>Total calories = the sum of (calories per serving * number of servings ordered).
- Hint: Make a `servings` array.

In [10]:
# 2 servings of lasagna, 1 filet mignon, and 3 cheeseburgers
servings = np.array([3,0,2,1])
servings

array([3, 0, 2, 1])

In [13]:
## Calculate total calories
calories = servings * calories_per_serving
np.sum(calories)

# Another option
# (servings * calories_per_serving).sum()

3337
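
Multiplying two arrays element by element and then summing the result is the same as taking a dot product, which ties back to the linear algebra motivation earlier. A minimal sketch using `np.dot`, with the arrays redefined here so the block runs on its own:

```python
import numpy as np

calories_per_serving = np.array([740, 240, 408, 301])
servings = np.array([3, 0, 2, 1])

# Dot product: multiply matching elements, then add everything up
total_calories = np.dot(servings, calories_per_serving)
print(total_calories)  # 3337
```

The `@` operator (`servings @ calories_per_serving`) should give the same result for one-dimensional arrays like these.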

### Q2: What would our total bill be?

In [16]:
## calculate the total bill
(prices * servings).sum()

63.25

### Q3:  What if we decided to add 2 orders of Tikka Masala?

- Hmmm...what index was Tikka Masala?  🤔

### 💡 How to remind ourselves of the name/integer index of each item

- Make an `options_array` of the names of the dinner options:
    - 'Cheeseburger', 'Chicken Tikka Masala','Lasagna', "Filet Mignon"



In [8]:
## arrays can store strings
options_array = np.array(['cheeseburger', 'chicken tikka masala', 'lasagna', 'filet mignon'])
options_array

array(['cheeseburger', 'chicken tikka masala', 'lasagna', 'filet mignon'],
      dtype='<U20')

##### Using Enumerate 

- We can use the `enumerate` function to loop over the dinner options and get each one paired with its integer index.


In [9]:
## I can't remember what index is what! 
# help me, enumerate!
for i, option in enumerate(options_array):
    print(f"[{i}] {option}")

[0] cheeseburger
[1] chicken tikka masala
[2] lasagna
[3] filet mignon


- We will want to re-use this so we can wrap it into a simple function!

In [18]:
## make the index_report function
def index_report():
    for i, option in enumerate(options_array):
        print(f"[{i}] {option}")

index_report()

[0] cheeseburger
[1] chicken tikka masala
[2] lasagna
[3] filet mignon


### Q3 Continued: What if we decided to add 2 orders of Tikka Masala?

In [19]:
# run our function
index_report()

[0] cheeseburger
[1] chicken tikka masala
[2] lasagna
[3] filet mignon


In [21]:
## use the index to replace the value for chicken tikka masala with 2
servings[1] = 2

servings

array([3, 2, 2, 1])
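
As an alternative to reading the index off the printed report, the names could be mapped to their positions with a dictionary. This is only a sketch of one possible approach; the name `index_of` is hypothetical, and the arrays are redefined so the block runs on its own.

```python
import numpy as np

options_array = np.array(['cheeseburger', 'chicken tikka masala',
                          'lasagna', 'filet mignon'])
servings = np.array([3, 0, 2, 1])

# Build a lookup from dish name to its integer index
index_of = {str(name): i for i, name in enumerate(options_array)}

# Update the servings by name instead of by a remembered number
servings[index_of['chicken tikka masala']] = 2
print(servings)  # [3 2 2 1]
```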

In [None]:
# calculate total bill
(prices * servings).sum()


88.25

### Q4: What if there were discounted happy hour promotions?
- Cheeseburgers and Filet Mignon are both 25% off
> Hint: make a `discounts` array.


In [None]:
## discounts array
discounts = None

In [None]:
## discounted prices
discounted_prices = None
discounted_prices

In [None]:
## calculate the total prices with the discounts
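
One possible way to work through Q4 is sketched below. It assumes the promotion exactly as worded above (25% off cheeseburgers and filet mignon, nothing off the other dishes) and stores each discount as a fraction of the price; the name `discounted_total` is introduced here for illustration.

```python
import numpy as np

prices = np.array([8.5, 12.5, 11, 15.75])
servings = np.array([3, 2, 2, 1])

# Fraction taken off each dish, in the same order as the prices
discounts = np.array([0.25, 0.0, 0.0, 0.25])

# Each price is reduced by its own discount fraction
discounted_prices = prices * (1 - discounts)
print(discounted_prices)

# Bill for the current order at the discounted prices
discounted_total = (discounted_prices * servings).sum()
print(discounted_total)
```

Storing the discount as a fraction keeps the calculation to a single element-wise multiplication; storing the multiplier directly (0.75 and 1.0) would work just as well.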


## Wouldn't it be nice...
>- if we had a way to group ALL of this information, without memorizing indices, that was really easy to visualize?
- Hmmm....🤔 - a dictionary might work!

- Make a dinner_data dictionary that contains the data from:
    - prices
    - calories_per_serving
    - discounts
    - and servings

In [24]:
# We could use a dictionary for price, calories per serving, discount, servings
dinner_data = {"dish": options_array,
               "prices": prices,
               "calories": calories_per_serving,
               "servings": servings}

dinner_data

{'calories': array([740, 240, 408, 301]),
 'dish': array(['cheeseburger', 'chicken tikka masala', 'lasagna', 'filet mignon'],
       dtype='<U20'),
 'prices': array([ 8.5 , 12.5 , 11.  , 15.75]),
 'servings': array([3, 2, 2, 1])}

- Hmmm, that's **better**, but it's still really hard to see the data aligned.

> 🐼 PANDAS TO THE RESCUE!

In [22]:
import pandas as pd

In [28]:
## make a dataframe from our dinner_data
dinner_df = pd.DataFrame(dinner_data)
dinner_df = dinner_df.set_index('dish')

dinner_df

| dish                 |   prices |   calories |   servings |
|:---------------------|---------:|-----------:|-----------:|
| cheeseburger         |     8.5  |        740 |          3 |
| chicken tikka masala |    12.5  |        240 |          2 |
| lasagna              |    11    |        408 |          2 |
| filet mignon         |    15.75 |        301 |          1 |


In [None]:
## calculate the order total using the dataframe 


76.375

### Pandas is Built On Top of Numpy

> Pandas is built ON TOP of NumPy and **therefore can do many of the same things as numpy arrays!**
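
As a rough illustration of that overlap, the sketch below rebuilds the dinner DataFrame from plain lists and uses a few operations that behave much like their NumPy counterparts. The variable names `prices_array`, `average_price`, and `total_servings` are introduced here for illustration.

```python
import numpy as np
import pandas as pd

dinner_df = pd.DataFrame({
    "dish": ['cheeseburger', 'chicken tikka masala', 'lasagna', 'filet mignon'],
    "prices": [8.5, 12.5, 11, 15.75],
    "calories": [740, 240, 408, 301],
    "servings": [3, 2, 2, 1],
}).set_index("dish")

# A column's underlying data is available as a NumPy array
prices_array = dinner_df["prices"].values
print(type(prices_array))

# Columns support NumPy-style aggregations such as mean and sum
average_price = dinner_df["prices"].mean()
total_servings = dinner_df["servings"].sum()
print(average_price, total_servings)
```

Recent pandas documentation recommends `.to_numpy()` over `.values` for this conversion; both return an array here.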

In [None]:
## you can get the data as an array using .values


In [None]:
## what is the average price of our foods?


In [None]:
## how many servings did we order in total?


> We will talk MUCH more about Pandas and DataFrames next week!