The `groupby` function segments data into groups.
Once groups are created, a function or operation can be executed against each group.

For example:

1. Grouping data by gender (male, female)
2. Grouping sales data by country
3. In our case, grouping data by ticker (here, the `cryptocurrency` column)

Once the data is grouped, a count, sum, or average can be computed for each group.
Summarizing data by grouping similar values is a common technique in data analysis.
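
As a minimal sketch of the idea, the snippet below groups a small, made-up sales table by country and computes a sum and an average per group. The `sales` DataFrame and its column names are hypothetical and are not part of `crypto_data.csv`.

```python
import pandas as pd

# Hypothetical toy data, used only to illustrate grouping
sales = pd.DataFrame({
    "country": ["US", "US", "CA", "CA", "CA"],
    "amount": [10.0, 20.0, 5.0, 7.0, 9.0],
})

# Split the rows by country, then apply an aggregation to each group
totals = sales.groupby("country")["amount"].sum()
averages = sales.groupby("country")["amount"].mean()

print(totals)
print(averages)
```

Each result is a Series indexed by the group key, with one entry per distinct country.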

### Import Libraries and Dependencies

In [1]:
import pandas as pd

%matplotlib inline

### Read in File and Clean Data

In [2]:
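# Read CSV, using data_date as a parsed DatetimeIndex
# (infer_datetime_format is deprecated since pandas 2.0 and has no effect there, so it is omitted)
crypto_data = pd.read_csv('crypto_data.csv', index_col='data_date', parse_dates=True)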
crypto_data.head()


| data_date | cryptocurrency | data_priceUsd | data_time | timestamp |
|---|---|---|---|---|
| 2017-05-09 | bitcoin | NaN | 1494288000000.0 | 1557285000000.0 |
| 2017-05-10 | bitcoin | 1743.723523 | 1494374000000.0 | NaN |
| 2017-05-11 | bitcoin | 1828.678209 | 1494461000000.0 | NaN |
| 2017-05-12 | bitcoin | 1776.443919 | 1494547000000.0 | NaN |
| 2017-05-13 | bitcoin | 1714.964198 | 1494634000000.0 | NaN |


In [None]:
# Drop all columns except cryptocurrency and data_priceUsd
crypto_data = crypto_data.drop(columns=['data_time','timestamp'])

# Sort the dates in ascending order
crypto_data = crypto_data.sort_index()

# Drop missing values
crypto_data = crypto_data.dropna()
crypto_data.head()

### Group DataFrame and perform `count` aggregation
To group data, call the `groupby` function on a column that contains repeated (non-unique) values. The `groupby` function accepts a column name, or a list of column names, as its argument.

In [None]:
# Group crypto data by cryptocurrency and perform count
crypto_data_grp = crypto_data.groupby('cryptocurrency').count()
crypto_data_grp

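`df.groupby().count()` gives the number of non-null values in each column for each cryptocurrency. Because missing values were dropped earlier, this equals the number of rows per cryptocurrency.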
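
The difference between counting values and counting rows only shows up when data is missing. The sketch below uses a hypothetical toy frame (not the contents of `crypto_data.csv`) with one missing price to contrast `count` with `size`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing price
toy = pd.DataFrame({
    "cryptocurrency": ["bitcoin", "bitcoin", "ethereum"],
    "data_priceUsd": [1743.72, np.nan, 88.5],
})

# count() tallies non-null values in the selected column
counts = toy.groupby("cryptocurrency")["data_priceUsd"].count()

# size() tallies rows per group, including rows with missing values
sizes = toy.groupby("cryptocurrency").size()

print(counts)
print(sizes)
```

Here the bitcoin group has two rows but only one non-null price, so the two results differ for that group.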

### Group DataFrame without aggregate function

An aggregation or other function normally follows a `groupby` call.

When no function is chained to `groupby`, the output is a `DataFrameGroupBy` object rather than an actual DataFrame.

A `DataFrameGroupBy` object must be aggregated before it yields summarized results.

In [None]:
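# Group crypto data by cryptocurrency without aggregating
crypto_data_grp = crypto_data.groupby('cryptocurrency')
# head() on a grouped object returns the first 5 rows of each group, not a summary
crypto_data_grp.head()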
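
To see what an unaggregated grouping holds, the grouped object can be inspected directly. The following is a small sketch on a hypothetical toy frame; the variable names `toy`, `grouped`, and `group_sizes` are illustrative only.

```python
import pandas as pd

# Hypothetical toy frame standing in for the cleaned crypto data
toy = pd.DataFrame({
    "cryptocurrency": ["bitcoin", "ethereum", "bitcoin"],
    "data_priceUsd": [1743.72, 88.5, 1828.68],
})

grouped = toy.groupby("cryptocurrency")

# The result is a grouped object, not a DataFrame
print(type(grouped).__name__)
# Number of distinct groups
print(grouped.ngroups)

# Iterating yields (group key, sub-DataFrame) pairs
group_sizes = {name: len(frame) for name, frame in grouped}
print(group_sizes)
```

Iterating this way is mainly useful for inspection; for summaries, an aggregation such as `count` or `mean` is usually simpler.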

### Group DataFrame by `cryptocurrency` and calculate the average `data_priceUsd`

Aggregate functions that can be applied to `DataFrameGroupBy` objects include `count`, `sum`, and `mean`. These functions follow the `groupby` call.

In [None]:
# Calculate average data_priceUsd for each crypto
crypto_data_mean = crypto_data.groupby('cryptocurrency').mean()
crypto_data_mean
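
When more than one statistic is needed, `agg` can compute several in a single pass. This is a sketch on hypothetical toy data; the output column names `avg_price` and `n_obs` are made up for illustration.

```python
import pandas as pd

# Hypothetical toy frame standing in for the cleaned crypto data
toy = pd.DataFrame({
    "cryptocurrency": ["bitcoin", "bitcoin", "ethereum", "ethereum"],
    "data_priceUsd": [1700.0, 1800.0, 90.0, 110.0],
})

# Named aggregation: each keyword becomes an output column,
# defined as (source column, aggregation function)
stats = toy.groupby("cryptocurrency").agg(
    avg_price=("data_priceUsd", "mean"),
    n_obs=("data_priceUsd", "count"),
)
print(stats)
```

The result is a DataFrame indexed by cryptocurrency, with one column per named aggregation.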

### Group by more than one column and calculate count

DataFrames can be grouped by more than one column.

This groups rows by each unique combination of values in the specified columns and summarizes each combination into one record. Because a combination with a count greater than one occurs more than once, this approach can be used to identify duplicates within the data.

In [None]:
# Group by more than one column
multi_group = crypto_data.groupby(['cryptocurrency','data_priceUsd'])['data_priceUsd'].count()
multi_group
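
One way to turn the multi-column count into a duplicate check is to keep only the combinations that occur more than once. The sketch below assumes a hypothetical toy frame in which one bitcoin price is repeated.

```python
import pandas as pd

# Hypothetical toy frame with one repeated (cryptocurrency, price) pair
toy = pd.DataFrame({
    "cryptocurrency": ["bitcoin", "bitcoin", "bitcoin", "ethereum"],
    "data_priceUsd": [1743.72, 1743.72, 1828.68, 88.5],
})

# Count the rows for every (cryptocurrency, price) combination
pair_counts = toy.groupby(["cryptocurrency", "data_priceUsd"]).size()

# Combinations that appear more than once are duplicates
duplicates = pair_counts[pair_counts > 1]
print(duplicates)
```

The result is indexed by both grouping columns, so each remaining entry names the duplicated combination and how many times it appears.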

### Compare single column grouping to multicolumn grouping

In [None]:
# Compare single-column grouping with multi-column grouping
single_group = crypto_data.groupby('cryptocurrency')['data_priceUsd'].count()
single_group

### Plot grouped data to generate more than one line on the same chart

Once data is grouped, each group can be plotted for comparison. Calling `plot` on the grouped column draws one line per group on the same chart, so the cryptocurrencies can be compared directly.

In [None]:
# Plot data_priceUsd for each crypto across time
crypto_data.groupby('cryptocurrency')['data_priceUsd'].plot(legend=True)
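
An alternative route to one line per cryptocurrency is to reshape the data so that each cryptocurrency becomes its own column. The sketch below uses a hypothetical toy frame and assumes each date appears at most once per cryptocurrency, which `pivot` requires.

```python
import pandas as pd

# Hypothetical toy frame indexed by date, like the cleaned crypto data
toy = pd.DataFrame(
    {
        "cryptocurrency": ["bitcoin", "ethereum", "bitcoin", "ethereum"],
        "data_priceUsd": [1743.72, 88.5, 1828.68, 90.1],
    },
    index=pd.to_datetime(
        ["2017-05-10", "2017-05-10", "2017-05-11", "2017-05-11"]
    ),
)
toy.index.name = "data_date"

# Reshape from long to wide: one row per date, one column per cryptocurrency
wide = toy.pivot(columns="cryptocurrency", values="data_priceUsd")
print(wide)
```

Calling `wide.plot()` on the reshaped frame would then draw one line per column, provided matplotlib is installed.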