# Example using Jupyter Notebook with Python

To demonstrate how well VS Code works with Python, I will run some basic data manipulation code using the `pandas` package. This brief example comes from Lesson 1 of GitHub user guipsamora's repository [pandas_exercises](https://github.com/guipsamora/pandas_exercises). To work on similar exercises yourself, I recommend forking this repository.

## Chipotle

Datasets and materials made available thanks to [justmarkham](https://github.com/justmarkham).

### Step 1. Import the needed libraries

In [1]:
import numpy as np 
import pandas as pd 

print("Pandas version: {0}".format(pd.__version__))
print("Numpy version: {0}".format(np.__version__))

Pandas version: 1.0.4
Numpy version: 1.18.5


### Step 2. Import the dataset from [here](https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv) and assign it to an object named `chipo`

In [2]:
url = "https://raw.githubusercontent.com/justmarkham/DAT8/master/data/chipotle.tsv"
chipo = pd.read_csv(url, sep="\t")
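
The `sep="\t"` argument is what makes `read_csv` treat the file as tab-separated. As a minimal offline sketch of the same call, the snippet below parses a small in-memory sample instead of the URL. The two rows in `sample_tsv` are hypothetical and only mimic the column layout of the real file.

```python
import io

import pandas as pd

# Hypothetical two-row sample laid out like the tab-separated Chipotle file.
sample_tsv = (
    "order_id\tquantity\titem_name\tchoice_description\titem_price\n"
    "1\t1\tIzze\t[Clementine]\t$3.39\n"
    "1\t1\tSide of Chips\t\t$1.69\n"
)

# sep="\t" splits fields on tabs; an empty field is read as a missing value.
sample = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

print(sample.shape)
print(sample["choice_description"].isna().sum())
```

Passing a file-like object works the same way as passing a URL or a path, which makes it convenient for trying out parsing options without a network connection.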

### Step 3. Peek at the first 10 entries

In [3]:
chipo.head(10)

| | order_id | quantity | item_name | choice_description | item_price |
|---|---|---|---|---|---|
| 0 | 1 | 1 | Chips and Fresh Tomato Salsa | | $2.39 |
| 1 | 1 | 1 | Izze | [Clementine] | $3.39 |
| 2 | 1 | 1 | Nantucket Nectar | [Apple] | $3.39 |
| 3 | 1 | 1 | Chips and Tomatillo-Green Chili Salsa | | $2.39 |
| 4 | 2 | 2 | Chicken Bowl | [Tomatillo-Red Chili Salsa (Hot), [Black Beans... | $16.98 |
| 5 | 3 | 1 | Chicken Bowl | [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou... | $10.98 |
| 6 | 3 | 1 | Side of Chips | | $1.69 |
| 7 | 4 | 1 | Steak Burrito | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | $11.75 |
| 8 | 4 | 1 | Steak Soft Tacos | [Tomatillo Green Chili Salsa, [Pinto Beans, Ch... | $9.25 |
| 9 | 5 | 1 | Steak Burrito | [Fresh Tomato Salsa, [Rice, Black Beans, Pinto... | $9.25 |


This dataset is an item-level description of orders placed at the restaurant Chipotle. It is in long format: each row is one order item, so an `order_id` repeats across rows, and each row carries the `quantity` and `item_name` for that item. Note that the same item ordered with different choices is listed as a separate order item.
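
To make the long format concrete, here is a small sketch that collapses item-level rows into one row per order. The `orders` frame and the `line_items` and `total_units` column names are hypothetical and exist only for illustration.

```python
import pandas as pd

# Hypothetical miniature of the long format: one row per order item.
orders = pd.DataFrame(
    {
        "order_id": [1, 1, 2, 3, 3],
        "quantity": [1, 1, 2, 1, 1],
        "item_name": [
            "Izze",
            "Nantucket Nectar",
            "Chicken Bowl",
            "Chicken Bowl",
            "Side of Chips",
        ],
    }
)

# Collapse to one row per order: number of rows and total units ordered.
per_order = orders.groupby("order_id").agg(
    line_items=("item_name", "size"),
    total_units=("quantity", "sum"),
)

print(per_order)
```

Order 2 has a single row but two units, which is why counting rows and summing `quantity` answer different questions in this layout.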

### Step 4. Convert the item_price to numeric and calculate the total revenue

In [6]:
chipo.item_price.dtype

dtype('O')

The original variable has dtype `object` (it is stored as strings) because each value starts with a dollar sign, so we need to remove the sign and convert the values to float.

In [7]:
# Drop surrounding whitespace and the leading "$", then convert to float.
dollar_scrub = lambda x: float(x.strip().lstrip("$"))
chipo.item_price = chipo.item_price.apply(dollar_scrub)
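
The same conversion can also be sketched with pandas' vectorized string methods instead of a per-element function. The `prices` values below are hypothetical samples, with and without trailing whitespace, chosen to show that the cleanup does not depend on the exact string length.

```python
import pandas as pd

# Hypothetical price strings; some carry trailing whitespace.
prices = pd.Series(["$2.39 ", "$16.98 ", "$10.98"])

# Strip whitespace, drop the leading "$", then parse the remainder as numbers.
cleaned = pd.to_numeric(prices.str.strip().str.lstrip("$"))

print(cleaned.dtype)
print(cleaned.tolist())
```

`pd.to_numeric` raises an error on values it cannot parse, which is a useful early warning if a price column contains unexpected text.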

Now we check to make sure that the type has been changed.

In [9]:
chipo.item_price.dtype

dtype('float64')

We have successfully converted this variable, so if we look at the first 10 entries again, `item_price` no longer contains the $ sign.

In [10]:
chipo.head(10)

| | order_id | quantity | item_name | choice_description | item_price |
|---|---|---|---|---|---|
| 0 | 1 | 1 | Chips and Fresh Tomato Salsa | | 2.39 |
| 1 | 1 | 1 | Izze | [Clementine] | 3.39 |
| 2 | 1 | 1 | Nantucket Nectar | [Apple] | 3.39 |
| 3 | 1 | 1 | Chips and Tomatillo-Green Chili Salsa | | 2.39 |
| 4 | 2 | 2 | Chicken Bowl | [Tomatillo-Red Chili Salsa (Hot), [Black Beans... | 16.98 |
| 5 | 3 | 1 | Chicken Bowl | [Fresh Tomato Salsa (Mild), [Rice, Cheese, Sou... | 10.98 |
| 6 | 3 | 1 | Side of Chips | | 1.69 |
| 7 | 4 | 1 | Steak Burrito | [Tomatillo Red Chili Salsa, [Fajita Vegetables... | 11.75 |
| 8 | 4 | 1 | Steak Soft Tacos | [Tomatillo Green Chili Salsa, [Pinto Beans, Ch... | 9.25 |
| 9 | 5 | 1 | Steak Burrito | [Fresh Tomato Salsa, [Rice, Black Beans, Pinto... | 9.25 |


Now that `item_price` is a float, we can calculate the total revenue as $\text{revenue}=\text{quantity}\times\text{price}$, summed over all rows.

In [11]:
rev = (chipo.quantity * chipo.item_price).sum()
print("Total Revenue: $" + str(np.round(rev, decimals=2)))

Total Revenue: $39237.02
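
The same formula can be broken down per order, which is a quick way to sanity-check the total. The sketch below uses a hypothetical three-row frame with made-up prices; `line_revenue` is an illustrative column name, not part of the dataset.

```python
import pandas as pd

# Hypothetical item rows with made-up prices.
items = pd.DataFrame(
    {
        "order_id": [1, 1, 2],
        "quantity": [1, 1, 2],
        "item_price": [2.50, 3.00, 8.00],
    }
)

# Revenue per row, following revenue = quantity * price.
items["line_revenue"] = items["quantity"] * items["item_price"]

# Aggregate per order, then overall; the two totals should agree.
per_order_revenue = items.groupby("order_id")["line_revenue"].sum()
total = items["line_revenue"].sum()

print(per_order_revenue)
print(f"Total revenue: ${total:,.2f}")
```

Formatting with an f-string (`:,.2f`) keeps two decimals and adds thousands separators without a separate rounding step.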


## Writing Math Equations 

In a Jupyter notebook you can also write LaTeX, which makes typesetting math equations simple. Below is an example: the weak ignorability assumption that I am using as part of my dissertation work.

$$ Y_{i}(\mathbf{d})\perp\mathbf{D}_{i}\mid \mathbf{C} \quad \forall \quad \mathbf{d}\in\mathcal{D} $$
