# Jupyter Notebook Introduction
## What is Jupyter Notebook?
- Jupyter Notebook is an interactive browser application in which you can combine code, the output of that code, data visualizations, and explanatory text in a single document (like the file you are reading now).
- Jupyter Notebook allows you to run Python in the browser. The browser only provides an interactive surface for typing the code; the code itself is executed by the Python installation on your computer. So you don't need Internet access to open and run a Jupyter Notebook file.
- A notebook typically contains multiple cells, in which you can type your Python code. You then run a cell to execute its code. Any results from the code will show up below the cell.
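
As a small illustration of the last point, a cell could contain something like the following. The variable name `total` is just an example; running the cell would print the result directly below it.

```python
# A cell can hold any Python code
total = 2 + 3

# print() writes its output below the cell when the cell is run
print(total)
```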

## Complete the Following Task
- This file illustrates how to load a dataset and then navigate that dataset in Jupyter Notebook.
- Please follow the instructions above each empty cell and type the code in that cell. You can then execute the cell by placing your cursor in it and clicking "Run".

## Step 1: Import the pandas package
---
```Python
import <packagename> as <shortname>
```
---
- We need to import `pandas` before we can load the dataset.
- `pandas` is a Python library for data manipulation and analysis.
- You can think of Python as your mobile phone's operating system, and `pandas` as an app installed on your phone that lets you do more with it. However, you need to open the app before you can do any task (data analysis). This is called importing a package.
- `import pandas as pd` means we `import` `pandas` and give it the short name `pd` for easy reference afterwards. You can use any short name you'd like, or none at all, but a short name makes the package easier to reference in later code.
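
As a minimal sketch, the import described above would look like the following, assuming `pandas` is already installed in your Python environment. The `print` line is not required; it is only there to confirm that the import worked.

```python
# Import the pandas package and give it the short name "pd"
import pandas as pd

# Optional check: show which version of pandas was imported
print(pd.__version__)
```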

## Step 2: Load the dataset
---
```Python
df = pd.read_csv(<filelocation>)
```
---
- First, make sure you have saved the dataset `Compustat_fy2019.csv` in the same local folder as the notebook file (the folder with your last name).
- We use the function `read_csv` from the `pandas` package, which we imported as `pd`, to open this dataset. In the code, we refer to this function as `pd.read_csv()`, which you can think of as telling Python "use the `read_csv` function found in `pandas`". In this sense, the dot `.` after `pd` tells Python to look for the function inside the pandas *app*.
- In the parentheses of `pd.read_csv()`, we can specify parameters. In this example, we need to point Python to the dataset. Because it is stored in the same folder as the notebook, the file name alone is enough: `'Compustat_fy2019.csv'` (remember that we need to put quotes around a string).
- Lastly, we need to tell Python to keep the dataset in memory by giving it a name. We name the dataset `df` in this example; `df` is short for "`d`ata `f`rame", but you can name it anything you want. If you do not assign it to a name (in other words, you don't use the equal sign), Python will read and display the data but will not keep it for later use!
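
The sketch below shows the pattern described above. So that it runs without the course file, it reads from a small in-memory stand-in whose rows and values are made up for illustration (they are not real Compustat data); the name `sample_csv` is likewise only an example. With the real file saved next to the notebook, you would pass the file name instead, as shown in the comment.

```python
import io
import pandas as pd

# Stand-in for the course file: two made-up rows, NOT real Compustat data
sample_csv = io.StringIO(
    "tic,conm,fyear,at,lt\n"
    "AAA,Example Co A,2019,100.0,60.0\n"
    "BBB,Example Co B,2019,250.0,90.0\n"
)

# The equal sign stores the loaded data under the name "df"
df = pd.read_csv(sample_csv)

# With the real file in the notebook's folder, the call would instead be:
# df = pd.read_csv('Compustat_fy2019.csv')

# Show the size of the loaded data as (rows, columns)
print(df.shape)
```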

## Step 3: Navigate the dataset
---
```Python
df.head(<n>)
```
---
- We can have Python return the first *n* rows of the loaded dataset using `df.head(n)`. Again, think of `head(n)` as a function applied to the data frame `df`.
- Columns contain variables:
    - tic (ticker)
    - conm (company name)
    - datadate (fiscal year end)
    - fyear (fiscal year)
    - at (total assets)
    - lt (total liabilities)
    - teq (total stockholders' equity)
    - revt (revenue)
    - ni (net income)
    - exchg (exchange code: 11 = New York Stock Exchange, 12 = American Stock Exchange, 14 = NASDAQ)
- Rows represent companies
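
To make the navigation step concrete, here is a minimal sketch on a made-up stand-in data frame that reuses a few of the column names listed above. The three companies and their values are invented for illustration only. In a notebook, a bare `df.head(2)` on the last line of a cell would display the rows by itself; `print` is used here so the sketch also works outside a notebook.

```python
import pandas as pd

# Made-up stand-in rows (NOT real Compustat data) with some of the listed columns
df = pd.DataFrame({
    "tic": ["AAA", "BBB", "CCC"],
    "conm": ["Example Co A", "Example Co B", "Example Co C"],
    "fyear": [2019, 2019, 2019],
    "at": [100.0, 250.0, 75.0],
    "exchg": [11, 14, 12],
})

# Return the first 2 rows (each row is one company)
first_two = df.head(2)
print(first_two)

# Number of rows and columns, as (rows, columns)
print(df.shape)

# Names of the columns (variables)
print(list(df.columns))
```

Calling `df.head()` with nothing in the parentheses returns the first 5 rows, which is the default in pandas.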