**Required libraries:** pandas, openpyxl
- Run this command in the terminal:
    ```
    pip install pandas openpyxl --upgrade
    ```

Pandas Concepts to be covered:
- Loading Data into Pandas
- Series and DataFrames
- Viewing Data
    - head()
    - tail()
    - sample()
- Selection
    - Single Column
    - Multiple Columns
    - Row Selection by Label
        - loc
    - Row Selection by Position
        - iloc
- Data Manipulation
    - Adding Columns
    - Removing Columns
    - Renaming Columns
    - Replacing Values
    - Applying Functions
- Data Cleaning
    - Handling Missing Data
    - Handling Duplicates
    - Handling Outliers
    - Handling Incorrect Data Types
    - Handling Inconsistent Data Entry
- Grouping and Aggregating
    - Grouping
    - Aggregating
    - Applying Functions
- Sorting
    - Sorting by Index
    - Sorting by Values
- Data Visualization
    - Matplotlib
    - Seaborn
    - Plotly


In [None]:
import pandas as pd
import numpy as np

In [None]:
pd.set_option('display.max_columns', None)

In [None]:
file = 'Canada.xlsx'
canada = pd.read_excel(file, sheet_name=1, skiprows=20, skipfooter=2)
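
The `skiprows` and `skipfooter` arguments drop lines above and below the real table. As a rough sketch that does not need the workbook, the same two options can be tried with `pd.read_csv` on an in-memory string. The text and numbers in `raw` are made up for illustration and are not taken from `Canada.xlsx`.

```python
import io

import pandas as pd

# Hypothetical raw export: 2 junk lines on top, 2 footer lines at the bottom.
raw = "\n".join([
    "Some report title",
    "Generated for illustration only",
    "Country,1980,1981",
    "Japan,10,20",
    "France,30,40",
    "Footer note A",
    "Footer note B",
])

# skiprows drops the first 2 lines, skipfooter drops the last 2.
# In read_csv, skipfooter is only supported by the Python parsing engine.
demo = pd.read_csv(io.StringIO(raw), skiprows=2, skipfooter=2, engine="python")
print(demo)
```

In the `read_excel` call above, `sheet_name=1` selects the second sheet, because sheet positions are counted from 0.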

In [None]:
canada['AreaName'] # Series: a single column (1-dimensional)

In [None]:
canada # DataFrame: 2-dimensional data structure (rows and columns)
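
To make the Series/DataFrame difference concrete, here is a small sketch on a hypothetical two-row frame named `demo` (a stand-in, not the real `canada` data). A single label in brackets gives a Series, while a one-element list gives a DataFrame with one column.

```python
import pandas as pd

# Tiny stand-in frame; the values are illustrative only.
demo = pd.DataFrame(
    {"AreaName": ["Asia", "Europe"], "RegName": ["Eastern Asia", "Western Europe"]},
    index=["Japan", "France"],
)

one_col = demo["AreaName"]      # single label -> Series (1-D)
as_frame = demo[["AreaName"]]   # list of labels -> DataFrame (2-D)

print(type(one_col).__name__, one_col.ndim)
print(type(as_frame).__name__, as_frame.shape)
```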

In [None]:
canada.head() # first 5 rows (the default)

In [None]:
canada.head(2) # first 2 rows

In [None]:
canada.tail() # last 5 rows (the default)

In [None]:
canada.sample(5) # random sample of 5 rows
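
Because `sample()` picks different rows on each run, it can help to fix the seed when the output needs to be repeatable. A minimal sketch, assuming a hypothetical ten-row frame `demo`:

```python
import pandas as pd

demo = pd.DataFrame({"value": range(10)})

# The same random_state gives the same rows on every call.
first = demo.sample(3, random_state=42)
second = demo.sample(3, random_state=42)

print(first.index.tolist() == second.index.tolist())
```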

In [None]:
canada[5:10] # row slice: positions 5 through 9

In [None]:
canada['RegName'] # dict-like access to a Series

In [None]:
canada.RegName # attribute-style access to the same Series
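
Attribute-style access is convenient, but it only works when the column name is a valid Python identifier and does not clash with an existing DataFrame attribute. The sketch below uses a hypothetical one-row frame `demo` with made-up values to show both limits.

```python
import pandas as pd

# Illustrative frame: a string column, an integer-named column,
# and a column whose name collides with the DataFrame.count method.
demo = pd.DataFrame({"RegName": ["Eastern Asia"], 2013: [10], "count": [1]})

print(demo.RegName.tolist())    # works: valid identifier, no clash
print(demo[2013].tolist())      # integer labels need bracket access
print(callable(demo.count))     # demo.count is the method, not the column
print(demo["count"].tolist())   # brackets always reach the column
```

Bracket access is the safer default for the year columns in this dataset, since their names are integers.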

In [None]:
cols = list(range(1980, 1991))
canada[cols]

In [None]:
cols = ['AreaName','RegName',2013]
canada[cols]

In [None]:
canada[['AreaName','RegName',2013]]

In [None]:
canada.set_index('OdName', inplace=True)
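
With `inplace=True` the frame is modified directly, so running that cell a second time raises a `KeyError` because `OdName` is no longer a column. An alternative sketch, on a hypothetical frame `demo`, assigns the result to a new name and leaves the original untouched:

```python
import pandas as pd

demo = pd.DataFrame({"OdName": ["Japan", "France"], "AreaName": ["Asia", "Europe"]})

# Without inplace, set_index returns a new frame.
indexed = demo.set_index("OdName")
print(demo.columns.tolist())      # original still has OdName as a column
print(indexed.index.tolist())     # new frame is indexed by country name

# reset_index moves the index back into a regular column.
restored = indexed.reset_index()
print(restored.columns.tolist())
```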

In [None]:
canada

In [None]:
canada.iloc[0] # first row (position 0), regardless of its label

In [None]:
canada.iloc[100]

In [None]:
canada.iloc[10:20] # rows at positions 10-19 (end is excluded)

In [None]:
canada.iloc[10:16, :10] # rows at positions 10-15, columns at positions 0-9
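
`iloc` follows normal Python slicing, so the end position is not included. A short sketch on a hypothetical six-row frame `demo`:

```python
import pandas as pd

demo = pd.DataFrame({"a": range(6), "b": range(10, 16), "c": range(20, 26)})

rows = demo.iloc[2:4]        # positions 2 and 3 only
block = demo.iloc[2:4, :2]   # same rows, first two columns

print(rows.index.tolist())
print(block.columns.tolist())
```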

In [None]:
canada.loc['Japan'] # row whose index label is 'Japan'

In [None]:
canada.loc[['Japan','France']]

In [None]:
canada.loc[['Japan','France'], [1980, 1981, 1982, 1983, 1984, 1985]]

In [None]:
vs = canada.loc[['Japan','France','India'], [1980, 1981, 1982, 1983, 1984, 1985]]
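
Unlike `iloc`, `loc` works on labels, and a label slice includes both endpoints. That means a run of consecutive years can be written as a slice instead of a typed-out list. A minimal sketch, assuming a hypothetical frame `demo` with made-up values and integer year columns:

```python
import pandas as pd

demo = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["Japan", "France", "India"],
    columns=[1980, 1981, 1982],
)

# Lists pick exactly the labels named.
by_list = demo.loc[["Japan", "India"], [1980, 1982]]

# Label slices include the end label: 1980:1981 returns both years.
by_slice = demo.loc["Japan":"France", 1980:1981]

print(by_list)
print(by_slice)
```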

In [None]:
# add a Total column: for each row, sum the yearly columns 1980-2013
years = list(range(1980, 2014)) # range end is excluded, so this stops at 2013
canada['Total'] = canada[years].sum(axis=1) # axis=1 sums across the columns of each row
canada.head()
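
The `axis` argument decides the direction of the sum. The sketch below contrasts `axis=1` (one total per row) with `axis=0` (one total per column) on a hypothetical two-row frame `demo` with made-up values.

```python
import pandas as pd

demo = pd.DataFrame(
    {1980: [1, 4], 1981: [2, 5], 1982: [3, 6]},
    index=["Japan", "France"],
)
years = list(range(1980, 1983))

# axis=1: add up the year columns within each row.
demo["Total"] = demo[years].sum(axis=1)

# axis=0 (the default): add up the rows within each year column.
per_year = demo[years].sum(axis=0)

print(demo)
print(per_year)
```

Selecting `demo[years]` before summing keeps the new `Total` column out of the calculation if the cell is run again.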