## Binning TED

### Instructions
* Read in the CSV file provided and print it to the screen.
* Find the minimum "views" and maximum "views".
* Using the minimum and maximum "views" as a reference, create 10 bins in which to slice the data.
* Create a new column called "View Group" and fill it with the values collected through your slicing.
* Group the DataFrame based upon the values within "View Group".
* Find out how many rows fall into each group before finding the averages for "comments", "duration", and "languages".

In [1]:
# Import Dependencies
import pandas as pd

In [2]:
# Create a path to the csv and read it into a Pandas DataFrame
csv_path = "Resources/ted_talks.csv"
ted_df = pd.read_csv(csv_path)

In [4]:
ted_df.head()

,comments,description,duration,event,languages,main_speaker,name,title,views
0,4553,Sir Ken Robinson makes an entertaining and pro...,1164,TED2006,60,Ken Robinson,Ken Robinson: Do schools kill creativity?,Do schools kill creativity?,47227110
1,265,With the same humor and humanity he exuded in ...,977,TED2006,43,Al Gore,Al Gore: Averting the climate crisis,Averting the climate crisis,3200520
2,124,New York Times columnist David Pogue takes aim...,1286,TED2006,26,David Pogue,David Pogue: Simplicity sells,Simplicity sells,1636292
3,200,"In an emotionally charged talk, MacArthur-winn...",1116,TED2006,35,Majora Carter,Majora Carter: Greening the ghetto,Greening the ghetto,1697550
4,593,You've never seen data presented like this. Wi...,1190,TED2006,48,Hans Rosling,Hans Rosling: The best stats you've ever seen,The best stats you've ever seen,12005869


In [11]:
# Figure out the minimum and maximum views for a TED Talk
print(f'Maximum views of the Ted Talk is - {ted_df["views"].max()}')
print(f'Minimum views of the Ted Talk is - {ted_df["views"].min()}')

Maximum views of the Ted Talk is - 47227110
Minimum views of the Ted Talk is - 50443
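
The instructions ask for 10 bins spanning the minimum and maximum "views", while the cells that follow use four hand-picked edges. A minimal sketch of the 10-bin variant, assuming equal-width bins are acceptable, is shown here; `sample_views`, `edges`, and the "Group n" labels are hypothetical names, and the sample values are the view counts visible in the output above.

```python
import numpy as np
import pandas as pd

# Stand-in for ted_df["views"], built from view counts shown in the output above
sample_views = pd.Series([50443, 1636292, 1697550, 3200520, 12005869, 47227110])

# 11 evenly spaced edges define 10 equal-width bins between the min and the max
edges = np.linspace(sample_views.min(), sample_views.max(), num=11)
group_labels = [f"Group {i}" for i in range(1, 11)]  # hypothetical label names

# include_lowest=True keeps the minimum value inside the first bin
view_group = pd.cut(sample_views, bins=edges, labels=group_labels, include_lowest=True)
print(view_group.tolist())
```

Because view counts are heavily skewed, equal-width bins tend to put most talks into the first group, which is one plausible reason to prefer hand-picked edges.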


In [57]:
# Create bins in which to place values based upon TED Talk views
bins = [0,60000,600000,6000000,50000000]
# Create labels for these bins
labels = ['Very Low','Low','Medium','High']

In [58]:
# Slice the data and place it into bins
ted_df_bins = pd.cut(ted_df["views"],bins, labels=labels)
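
To illustrate how `pd.cut` treats the bin edges, the sketch below applies the same `bins` and `labels` to a handful of view counts. `sample_views` is a hypothetical Series: three values come from the output above, and 60000 and 60001 are added to sit on and just past an edge. By default each interval is closed on the right, so a value equal to an edge falls into the lower bin.

```python
import pandas as pd

bins = [0, 60000, 600000, 6000000, 50000000]
labels = ["Very Low", "Low", "Medium", "High"]

# Hypothetical sample: values on and just past the 60000 edge, plus three real view counts
sample_views = pd.Series([50443, 60000, 60001, 3200520, 47227110])

# right=True is the default, so the intervals are (0, 60000], (60000, 600000], ...
sample_bins = pd.cut(sample_views, bins, labels=labels)
print(sample_bins.tolist())
# ['Very Low', 'Very Low', 'Low', 'Medium', 'High']
```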

In [59]:
type(ted_df_bins)

pandas.core.series.Series

In [60]:
# Place the data series into a new column inside of the DataFrame
ted_df["Rating"] = ted_df_bins

In [61]:
ted_df.head()

,comments,description,duration,event,languages,main_speaker,name,title,views,Rating
0,4553,Sir Ken Robinson makes an entertaining and pro...,1164,TED2006,60,Ken Robinson,Ken Robinson: Do schools kill creativity?,Do schools kill creativity?,47227110,High
1,265,With the same humor and humanity he exuded in ...,977,TED2006,43,Al Gore,Al Gore: Averting the climate crisis,Averting the climate crisis,3200520,Medium
2,124,New York Times columnist David Pogue takes aim...,1286,TED2006,26,David Pogue,David Pogue: Simplicity sells,Simplicity sells,1636292,Medium
3,200,"In an emotionally charged talk, MacArthur-winn...",1116,TED2006,35,Majora Carter,Majora Carter: Greening the ghetto,Greening the ghetto,1697550,Medium
4,593,You've never seen data presented like this. Wi...,1190,TED2006,48,Hans Rosling,Hans Rosling: The best stats you've ever seen,The best stats you've ever seen,12005869,High


In [62]:
ted_group = ted_df.groupby("Rating")

In [63]:
type(ted_group)

pandas.core.groupby.groupby.DataFrameGroupBy

In [64]:
# Create a GroupBy object based upon "Rating"
ted_df_rating_group = ted_df.groupby("Rating")

In [65]:
type(ted_df_rating_group)

pandas.core.groupby.groupby.DataFrameGroupBy

In [66]:
# Find how many rows fall into each bin
ted_df_rating = ted_df_rating_group["Rating"].count()

In [67]:
ted_df_rating

Rating
Very Low       1
Low          400
Medium      2066
High          83
Name: Rating, dtype: int64
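
As a hedged alternative for the row counts, `size()` counts every row in a group, whereas `count()` counts only non-null values in the selected column. The sketch below uses a small hypothetical `sample_df` built from view counts shown earlier. Passing `observed=False` explicitly keeps labels with no rows in the result; recent pandas releases may warn when `observed` is left unspecified for a categorical key.

```python
import pandas as pd

bins = [0, 60000, 600000, 6000000, 50000000]
labels = ["Very Low", "Low", "Medium", "High"]

# Small stand-in for ted_df, using view counts shown in the output above
sample_df = pd.DataFrame({"views": [50443, 3200520, 1636292, 12005869]})
sample_df["Rating"] = pd.cut(sample_df["views"], bins, labels=labels)

# observed=False keeps every label in the result, even those with no rows
rating_counts = sample_df.groupby("Rating", observed=False).size()
print(rating_counts)
```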

In [70]:
# Get the average of each numeric column within the GroupBy object
ted_df_rating_group[["comments", "duration", "languages", "views"]].mean()

Rating,comments,duration,languages,views
Very Low,20.0,3573.0,0.0,50443.0
Low,96.4675,853.04,20.085,415319.7
Medium,189.485963,817.712972,28.200871,1539028.0
High,703.60241,884.542169,40.783133,11865650.0
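
The row counts and the averages can also be collected in one table. The following is a minimal sketch using pandas named aggregation, not the notebook's own method; `sample_df`, `summary`, and the output column names are hypothetical, and the five rows are the ones shown by `head()` above.

```python
import pandas as pd

bins = [0, 60000, 600000, 6000000, 50000000]
labels = ["Very Low", "Low", "Medium", "High"]

# The five rows displayed by ted_df.head(), numeric columns only
sample_df = pd.DataFrame({
    "comments": [4553, 265, 124, 200, 593],
    "duration": [1164, 977, 1286, 1116, 1190],
    "languages": [60, 43, 26, 35, 48],
    "views": [47227110, 3200520, 1636292, 1697550, 12005869],
})
sample_df["Rating"] = pd.cut(sample_df["views"], bins, labels=labels)

# observed=True drops labels that have no rows in this small sample
summary = sample_df.groupby("Rating", observed=True).agg(
    talks=("views", "count"),
    avg_comments=("comments", "mean"),
    avg_duration=("duration", "mean"),
    avg_languages=("languages", "mean"),
)
print(summary)
```

Selecting the columns explicitly avoids averaging text columns such as "description", which newer pandas versions refuse to do.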
