# From CSV to DataFrame

In this activity, you will import a CSV file and create a DataFrame with a Datetime Index.

Instructions:

1. Import the required libraries and dependencies.

2. Using the Pandas `read_csv` function, create a Pandas DataFrame by importing the "prices.csv" file from the Resources folder. The Pandas `read_csv` function will take in 4 parameters:
    * Using the Path module, specify the relative path to the "prices.csv" file.
    * Set the `index_col` parameter to specify the "Date" column as the index for the Pandas DataFrame.
    * Set the `parse_dates` parameter to True.
    * Set the `infer_datetime_format` parameter to True. 

3. Review the first five rows of the DataFrame using the Pandas `head` function.

4. Review the last five rows of the DataFrame using the Pandas `tail` function.

5. Review both the first and last seven rows of the DataFrame from one cell by calling the IPython `display` function in conjunction with the `head` and `tail` functions.

6. Review basic information of the DataFrame by calling the Pandas `info` function.

7. Generate the summary statistics for the DataFrame by calling the Pandas `describe` function. 


## Import the required libraries and dependencies.


In [1]:
# Import the Pandas library
import pandas as pd

# Import the Path module from the pathlib library
from pathlib import Path
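
As a small sketch of how the Path module represents the relative path used in this activity, the cell below builds the same path two ways and inspects it. The variable names `csv_path` and `same_path` are assumptions made for illustration; whether the file exists depends on the folder the notebook runs from.

```python
from pathlib import Path

# Same relative path as in the activity: up one folder, then into Resources
csv_path = Path("../Resources/prices.csv")

# Equivalent construction with the "/" operator (hypothetical alternative spelling)
same_path = Path("..") / "Resources" / "prices.csv"

print(csv_path.name)          # prices.csv
print(csv_path.suffix)        # .csv
print(csv_path == same_path)  # True
print(csv_path.exists())      # depends on the current working directory
```

If `exists()` prints False, the notebook is probably running from a different folder than the one the relative path assumes.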


## Using the Pandas `read_csv` function, create a Pandas DataFrame by importing the "prices.csv" file from the Resources folder.

The `read_csv` function should take in 4 parameters:

1. Using the Path module, specify the relative path to the "prices.csv" file.
2. Set the `index_col` parameter to specify the "Date" column as the index for the Pandas DataFrame.
3. Set the `parse_dates` parameter to True.
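4. Set the `infer_datetime_format` parameter to True. In pandas 2.0 and later, this parameter is deprecated and emits a warning; those versions infer the date format by default, so it can be omitted there.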

In [2]:
# Use the `read_csv` function to create the Pandas DataFrame
prices_df = pd.read_csv(
    Path("../Resources/prices.csv"),
    index_col="Date",
    parse_dates=True,
    infer_datetime_format=True
)
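
To see what `index_col` and `parse_dates` do without needing the Resources folder, here is a minimal sketch that reads a small in-memory CSV. The names `csv_text` and `sample_df` are assumptions made for illustration, and the three rows are copied from the first rows of the activity's data. `infer_datetime_format` is left out so the sketch also runs without warnings on pandas 2.0 and later.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for "prices.csv" (first three rows of the data)
csv_text = """Date,MSFT,AAPL,TWTR
2020-01-02,160.62,75.09,32.3
2020-01-03,158.62,74.36,31.52
2020-01-06,159.03,74.95,31.64
"""

# index_col selects the "Date" column as the index;
# parse_dates=True asks pandas to parse that index as dates
sample_df = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

# The index is now a DatetimeIndex rather than plain strings
print(type(sample_df.index).__name__)  # DatetimeIndex

# Rows can be looked up by timestamp
print(sample_df.loc[pd.Timestamp("2020-01-03"), "MSFT"])
```

Without `parse_dates=True`, the index would hold the date strings as plain text.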

## Review the first five rows of the DataFrame using the Pandas `head` function.

In [3]:
# Review the first five rows of the DataFrame
prices_df.head()

Date,MSFT,AAPL,TWTR
2020-01-02,160.62,75.09,32.3
2020-01-03,158.62,74.36,31.52
2020-01-06,159.03,74.95,31.64
2020-01-07,157.58,74.6,32.54
2020-01-08,160.09,75.8,33.05


## Review the last five rows of the DataFrame using the Pandas `tail` function.

In [4]:
# Review the last five rows of the DataFrame
prices_df.tail()

Date,MSFT,AAPL,TWTR
2020-03-25,146.92,61.38,25.97
2020-03-26,156.11,64.61,26.41
2020-03-27,149.7,61.94,25.29
2020-03-30,160.23,63.7,25.59
2020-03-31,157.71,63.57,24.56


## Review both the first and last seven rows of the DataFrame from one cell by calling the IPython `display` function in conjunction with the `head` and `tail` functions.

In [5]:
# Review the first and last seven rows of the DataFrame from the same cell
display(prices_df.head(7))
display(prices_df.tail(7))


Date,MSFT,AAPL,TWTR
2020-01-02,160.62,75.09,32.3
2020-01-03,158.62,74.36,31.52
2020-01-06,159.03,74.95,31.64
2020-01-07,157.58,74.6,32.54
2020-01-08,160.09,75.8,33.05
2020-01-09,162.09,77.41,33.22
2020-01-10,161.34,77.58,32.78


Date,MSFT,AAPL,TWTR
2020-03-23,135.98,56.09,24.69
2020-03-24,148.34,61.72,25.85
2020-03-25,146.92,61.38,25.97
2020-03-26,156.11,64.61,26.41
2020-03-27,149.7,61.94,25.29
2020-03-30,160.23,63.7,25.59
2020-03-31,157.71,63.57,24.56
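
A notebook cell renders only its last expression automatically, which is why both calls above are wrapped in `display` (a function supplied by IPython rather than pandas). The sketch below shows a rough equivalent for a plain Python script, where `print` plays that role. The frame `demo_df` and its `value` column are hypothetical and exist only to make the row counts easy to check.

```python
import pandas as pd

# Hypothetical frame: ten business days with a made-up counter column
demo_df = pd.DataFrame(
    {"value": range(10)},
    index=pd.date_range("2020-01-02", periods=10, freq="B", name="Date"),
)

# head(n) and tail(n) take the number of rows to return (the default is 5)
first_rows = demo_df.head(3)
last_rows = demo_df.tail(3)

# Printing each slice explicitly shows both, even outside a notebook
print(first_rows)
print(last_rows)
```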


## Review basic information of the DataFrame by calling the Pandas `info` function.

In [6]:
# Review basic information about the DataFrame
prices_df.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 62 entries, 2020-01-02 to 2020-03-31
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   MSFT    62 non-null     float64
 1   AAPL    62 non-null     float64
 2   TWTR    62 non-null     float64
dtypes: float64(3)
memory usage: 1.9 KB
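
The `info` output is printed text, so it can help to pull the same facts out as values. The sketch below is one way to do that, using a hypothetical frame `gap_df` with a deliberately missing entry so that the non-null counts differ between columns. The column names `A` and `B` are made up for illustration.

```python
import pandas as pd

# Hypothetical frame with one missing value in column "B"
gap_df = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [4.0, None, 6.0]},
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)

# Counterpart of the "Non-Null Count" column in the info output
non_null_counts = gap_df.notna().sum()
print(non_null_counts)

# Counterpart of the "Dtype" column
print(gap_df.dtypes)

# Counterpart of the "DatetimeIndex: ..." line
print(type(gap_df.index).__name__, len(gap_df.index))
```

In the activity's data every column reports 62 non-null values out of 62 entries, so no prices are missing.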


## Generate the summary statistics for the DataFrame by calling the Pandas `describe` function. 

In [7]:
# Generate the summary statistics for the DataFrame
prices_df.describe()

,MSFT,AAPL,TWTR
count,62.0,62.0,62.0
mean,164.449032,73.541452,32.349355
std,13.675092,7.303351,4.341601
min,135.42,56.09,22.0
25%,158.29,68.5,31.55
50%,163.895,75.745,33.22
75%,172.3075,79.655,34.96
max,188.7,81.8,39.05
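
Each row of the `describe` output corresponds to a statistic that can also be computed on its own. As a rough illustration, the sketch below compares a few of them on a hypothetical four-value column; `toy_df` and the column name `x` are assumptions, not part of the activity's data.

```python
import pandas as pd

# Hypothetical single-column frame with easy-to-check values
toy_df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})

summary = toy_df.describe()

# "mean" matches Series.mean()
print(summary.loc["mean", "x"], toy_df["x"].mean())

# "std" is the sample standard deviation, matching Series.std()
print(summary.loc["std", "x"], toy_df["x"].std())

# "50%" is the median; "25%" and "75%" are the other quartiles
print(summary.loc["50%", "x"], toy_df["x"].median())
print(summary.loc["25%", "x"], toy_df["x"].quantile(0.25))
```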
