# Group Dynamics

In this activity, you’ll use the Pandas `groupby` function to calculate the aggregate statistics of a DataFrame.

Instructions:

1. Import `crypto_data.csv` into a Pandas DataFrame by using `read_csv`. Set the index as `data_date`. Be sure to include the `parse_dates` and `infer_datetime_format` parameters. Review the first five rows of the DataFrame.

2. Drop the `data_time` and `timestamp` columns from the DataFrame. Remove any missing values from the remaining columns. Review the first five rows of the cleaned DataFrame.

3. Group the DataFrame by `cryptocurrency`, and then plot the `data_priceUsd` for each cryptocurrency on a single plot.

4. Calculate the `average` price across two years for each cryptocurrency.

5. Calculate the `max` price across two years for each cryptocurrency.

6. Calculate the `min` price across two years for each cryptocurrency.

7. Answer the following questions in your Jupyter notebook:

    * Which cryptocurrency do you recommend investing in?

    * Which cryptocurrency had the largest swing in prices?

References:

[Pandas groupby](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.groupby.html)


In [1]:
# Import the required libraries and dependencies
import pandas as pd
from pathlib import Path

## Step 1: Import `crypto_data.csv` into a Pandas DataFrame by using `read_csv`. Set the index as `data_date`. Be sure to include the `parse_dates` and `infer_datetime_format` parameters. Review the first five rows of the DataFrame.

In [2]:
# Using the read_csv function and the Path module, read in the crypto_data.csv file
# Use the 'data_date' column as the index. Include the parse_dates and infer_datetime_format parameters.
crypto_data_df = pd.read_csv(
    Path("../Resources/crypto_data.csv"), 
    index_col='data_date', 
    parse_dates=True, 
    infer_datetime_format=True)  # Note: deprecated since pandas 2.0, where it has no effect

# Review the first five rows of the DataFrame
# YOUR CODE HERE
display(crypto_data_df.head())

FileNotFoundError: [Errno 2] No such file or directory: '..\\Resources\\crypto_data.csv'
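
The `FileNotFoundError` above means the relative path `../Resources/crypto_data.csv` did not resolve from the notebook's working directory. As a minimal sketch of what the `index_col` and `parse_dates` parameters do, the following reads a few rows from an in-memory string instead of the file. The sample rows are hypothetical and are not taken from the real dataset; only the column names come from this activity.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real crypto_data.csv has more columns and rows.
csv_text = """data_date,cryptocurrency,data_priceUsd
2020-01-01,bitcoin,7200.0
2020-01-02,bitcoin,7000.0
2020-01-01,ethereum,130.0
"""

# index_col selects the column used as the index;
# parse_dates=True asks pandas to parse that index as dates.
sample_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="data_date",
    parse_dates=True,
)

# The index should now be a DatetimeIndex rather than plain strings.
print(type(sample_df.index).__name__)
print(sample_df.head())
```

With a date-typed index, later steps such as plotting can place prices on a proper time axis.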

## Step 2: Drop the `data_time` and `timestamp` columns from the DataFrame. Remove any missing values from the remaining columns. Review the first five rows of the cleaned DataFrame.

In [None]:
# Drop the data_time and timestamp columns
crypto_data_df_one = crypto_data_df.drop(columns=['data_time','timestamp'])

# Remove any rows with missing values
crypto_data_df_clean = crypto_data_df_one.dropna()

# Review the first five rows of the cleaned DataFrame
# YOUR CODE HERE
crypto_data_df_clean.head()

In [None]:
crypto_data_df_clean.tail()
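
To see the effect of the two cleaning calls in isolation, here is a small sketch on a toy DataFrame. The values are made up for illustration; only the column names match this activity.

```python
import pandas as pd

# Toy data (hypothetical values) with one missing price.
toy_df = pd.DataFrame(
    {
        "cryptocurrency": ["bitcoin", "bitcoin", "ethereum"],
        "data_priceUsd": [7200.0, float("nan"), 130.0],
        "data_time": ["00:00", "00:00", "00:00"],
        "timestamp": [1, 2, 3],
    }
)

# drop(columns=...) returns a new DataFrame without the listed columns;
# dropna() then removes every row that still contains a missing value.
toy_clean = toy_df.drop(columns=["data_time", "timestamp"]).dropna()

print(toy_clean)
print(toy_clean.isna().sum())
```

Both methods return new objects by default, so `toy_df` itself is left unchanged.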

## Step 3: Group the DataFrame by `cryptocurrency`, and then plot the `data_priceUsd` for each cryptocurrency on a single plot.

In [None]:
# Group the DataFrame by cryptocurrency and then plot the data_priceUsd
# The plot should include parameters for legend and figsize.
crypto_data_plot = crypto_data_df_clean.groupby('cryptocurrency')['data_priceUsd']

# View the plot
# YOUR CODE HERE
crypto_data_plot.plot(legend=True, figsize=(16,8))
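
It can help to look at what the grouped object holds before plotting it. The sketch below uses hypothetical prices and iterates over the groups; each group is a price Series for one cryptocurrency. Calling `.plot` on the grouped object plots each of those Series in turn, which in a notebook typically ends up as one line per cryptocurrency on shared axes.

```python
import pandas as pd

# Toy data (hypothetical values) indexed by date.
toy_df = pd.DataFrame(
    {
        "cryptocurrency": ["bitcoin", "ethereum", "bitcoin", "ethereum"],
        "data_priceUsd": [7200.0, 130.0, 7000.0, 128.0],
    },
    index=pd.to_datetime(
        ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"]
    ),
)

# Selecting one column from a groupby gives a SeriesGroupBy object.
grouped = toy_df.groupby("cryptocurrency")["data_priceUsd"]

# Iterating yields (group name, Series of that group's prices).
for name, prices in grouped:
    print(name, len(prices), prices.index.min().date())

# Number of rows in each group.
group_sizes = grouped.size()
print(group_sizes)
```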

## Step 4: Calculate the `average` price across two years for each cryptocurrency.

In [None]:
# Determine average price, or mean, for each cryptocurrency
crypto_data_avg = crypto_data_plot.mean()

# View the resulting Series
# YOUR CODE HERE
crypto_data_avg

## Step 5: Calculate the `max` price across two years for each cryptocurrency.

In [None]:
# Determine max price for each cryptocurrency
crypto_data_max = crypto_data_plot.max()

# View the resulting Series
# YOUR CODE HERE
crypto_data_max

## Step 6: Calculate the `min` price across two years for each cryptocurrency.

In [None]:
# Determine min price for each cryptocurrency
crypto_data_min = crypto_data_plot.min()

# View the resulting Series
# YOUR CODE HERE
crypto_data_min
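
Steps 4 through 6 compute each statistic separately. As an optional alternative, the `agg` method accepts a list of function names and returns all of them in one table. This is a sketch on hypothetical prices, not output from the real dataset.

```python
import pandas as pd

# Toy data (hypothetical values).
toy_df = pd.DataFrame(
    {
        "cryptocurrency": ["bitcoin", "ethereum", "bitcoin", "ethereum"],
        "data_priceUsd": [7200.0, 130.0, 7000.0, 128.0],
    }
)

# One row per cryptocurrency, one column per statistic.
summary = toy_df.groupby("cryptocurrency")["data_priceUsd"].agg(
    ["mean", "max", "min"]
)

print(summary)
```

The separate `mean()`, `max()`, and `min()` calls each return a Series, whereas `agg` with a list returns a DataFrame, which can be easier to compare side by side.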

## Step 7: Answer the following questions:

**Question** Which cryptocurrency had the largest swing in prices?

**Answer**  # YOUR ANSWER HERE


**Question** Which cryptocurrency do you recommend investing in?

**Answer** # YOUR ANSWER HERE
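
For the price-swing question, one possible approach is to subtract each cryptocurrency's minimum price from its maximum. The sketch below assumes "swing" can mean either the absolute range or the range relative to the minimum price; the activity does not specify which, and the two can rank the coins differently. All values are hypothetical.

```python
import pandas as pd

# Toy data (hypothetical values).
toy_df = pd.DataFrame(
    {
        "cryptocurrency": ["bitcoin", "bitcoin", "ethereum", "ethereum"],
        "data_priceUsd": [7000.0, 7200.0, 130.0, 260.0],
    }
)

price_stats = toy_df.groupby("cryptocurrency")["data_priceUsd"].agg(
    ["min", "max"]
)

# Absolute swing: difference between highest and lowest price, in USD.
swing_abs = price_stats["max"] - price_stats["min"]

# Relative swing: the same difference as a fraction of the lowest price.
swing_rel = swing_abs / price_stats["min"]

# idxmax returns the index label (the cryptocurrency) with the largest value.
print("Largest absolute swing:", swing_abs.idxmax())
print("Largest relative swing:", swing_rel.idxmax())
```

In this toy data the coin with the largest dollar swing is not the one with the largest relative swing, so it is worth stating which definition an answer uses.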