## Unit 5 | Assignment - The Power of Plots

## Background

What good is data without a good plot to tell the story?

Take what you've learned about Matplotlib and apply it to some real-world situations. For this assignment, you'll need to complete **1 of 2** Data Challenges. As always, the choice is yours; consider picking the one most relevant to your future career.

## Option 1: Pyber

![Ride](Images/Ride.png)

The ride-sharing bonanza continues! Seeing the success of notable players like Uber and Lyft, you've decided to join a fledgling ride-sharing company of your own. In your new role as Chief Data Strategist, you'll be expected to offer data-backed guidance on new opportunities for market differentiation.

You've since been given access to the company's complete recordset of rides. This contains information about every active driver and historic ride, including details like city, driver count, individual fares, and city type.

Your objective is to build a [Bubble Plot](https://en.wikipedia.org/wiki/Bubble_chart) that showcases the relationship between four key variables:

* Average Fare ($) Per City
* Total Number of Rides Per City
* Total Number of Drivers Per City
* City Type (Urban, Suburban, Rural)
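
As a rough illustration of how the four variables above can map onto one chart, here is a minimal sketch, not the reference solution. It assumes pandas 0.25 or later (for named aggregation), uses a handful of made-up rows with hypothetical city names instead of the Pyber data, and picks an arbitrary assignment of the three scheme colors to city types. Ride count goes on the x-axis, average fare on the y-axis, driver count sets the bubble size, and city type sets the color.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only (hypothetical cities, not the Pyber dataset)
rides = pd.DataFrame({
    "city": ["Alpha", "Alpha", "Beta", "Beta", "Beta", "Gamma"],
    "type": ["Urban", "Urban", "Suburban", "Suburban", "Suburban", "Rural"],
    "driver_count": [60, 60, 20, 20, 20, 3],
    "fare": [10.0, 20.0, 30.0, 30.0, 36.0, 40.0],
})

# One row per city: average fare, number of rides, and the driver count
# (driver_count is constant within a city, so "first" is enough)
per_city = rides.groupby(["city", "type"]).agg(
    avg_fare=("fare", "mean"),
    ride_count=("fare", "count"),
    driver_count=("driver_count", "first"),
).reset_index()

# Assumed mapping of the color scheme to city types
palette = {"Urban": "gold", "Suburban": "lightskyblue", "Rural": "lightcoral"}

fig, ax = plt.subplots()
for city_type, group in per_city.groupby("type"):
    ax.scatter(
        group["ride_count"],
        group["avg_fare"],
        s=group["driver_count"] * 10,  # bubble area scales with driver count
        c=palette[city_type],
        alpha=0.7,
        edgecolor="black",
        linewidths=1,
        label=city_type,
    )

ax.set_xlabel("Total Number of Rides (Per City)")
ax.set_ylabel("Average Fare ($)")
ax.set_title("Ride Sharing Data (illustrative)")
ax.legend(title="City Type")
```

Plotting one `scatter` call per city type keeps the legend simple: each call contributes a single labeled entry.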

In addition, you will be expected to produce the following three pie charts:

* % of Total Fares by City Type
* % of Total Rides by City Type
* % of Total Drivers by City Type
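
Each of these charts follows the same pattern: sum a column per city type, then let `autopct` print each wedge's share. The sketch below shows the fares version under stated assumptions: the rows are made up for illustration, and the color-to-type mapping and the choice of which wedge to explode are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative rows only (not the Pyber dataset)
rides = pd.DataFrame({
    "type": ["Urban", "Urban", "Suburban", "Rural"],
    "fare": [25.0, 25.0, 30.0, 20.0],
})

# Total fare per city type, and each type's share of the grand total
fare_by_type = rides.groupby("type")["fare"].sum()
pct = 100 * fare_by_type / fare_by_type.sum()

# Assumed mapping of the color scheme to city types
palette = {"Urban": "gold", "Suburban": "lightskyblue", "Rural": "lightcoral"}

fig, ax = plt.subplots()
ax.pie(
    fare_by_type,
    labels=fare_by_type.index,
    colors=[palette[t] for t in fare_by_type.index],
    autopct="%1.1f%%",  # wedge percentages
    explode=[0.1 if t == "Urban" else 0 for t in fare_by_type.index],
    shadow=True,
    startangle=140,
)
ax.set_title("% of Total Fares by City Type")
```

Swapping `"fare"` and `sum()` for a ride count gives the rides chart; the drivers chart needs one row per city so that driver counts are not repeated for every ride.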

As final considerations:

* You must use the Pandas Library and the Jupyter Notebook.
* You must use the Matplotlib and Seaborn libraries.
* You must include a written description of three observable trends based on the data.
* You must use proper labeling of your plots, including aspects like: Plot Titles, Axes Labels, Legend Labels, Wedge Percentages, and Wedge Labels.
* Remember when making your plots to consider aesthetics!
  * You must stick to the Pyber color scheme (Gold, Light Sky Blue, and Light Coral) in producing your plot and pie charts.
  * When making your Bubble Plot, experiment with effects like `alpha`, `edgecolor`, and `linewidths`.
  * When making your Pie Chart, experiment with effects like `shadow`, `startangle`, and `explode`.
* You must include an exported markdown version of your Notebook called `README.md` in your GitHub repository.
* See [Example Solution](Pyber/Pyber_Example.pdf) for a reference on expected format.

In [5]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

In [6]:
# Load the city and ride tables
city_data = pd.read_csv("raw_data/city_data.csv")
ride_data = pd.read_csv("raw_data/ride_data.csv")
# Pyber color scheme (hex equivalents: "#FFD700", "#87CEFA", "#F08080")
colors = ["gold", "lightskyblue", "lightcoral"]
sns.palplot(sns.color_palette(colors))
# Display seaborn's current default palette for comparison
sns.color_palette()

[(0.12156862745098039, 0.4666666666666667, 0.7058823529411765),
 (1.0, 0.4980392156862745, 0.054901960784313725),
 (0.17254901960784313, 0.6274509803921569, 0.17254901960784313),
 (0.8392156862745098, 0.15294117647058825, 0.1568627450980392),
 (0.5803921568627451, 0.403921568627451, 0.7411764705882353),
 (0.5490196078431373, 0.33725490196078434, 0.29411764705882354),
 (0.8901960784313725, 0.4666666666666667, 0.7607843137254902),
 (0.4980392156862745, 0.4980392156862745, 0.4980392156862745),
 (0.7372549019607844, 0.7411764705882353, 0.13333333333333333),
 (0.09019607843137255, 0.7450980392156863, 0.8117647058823529)]

In [7]:
df = pd.merge(city_data, ride_data, on='city')
df.head()

| | city | driver_count | type | date | fare | ride_id |
|---|---|---|---|---|---|---|
| 0 | Kelseyland | 63 | Urban | 2016-08-19 04:27:52 | 5.51 | 6246006544795 |
| 1 | Kelseyland | 63 | Urban | 2016-04-17 06:59:50 | 5.54 | 7466473222333 |
| 2 | Kelseyland | 63 | Urban | 2016-05-04 15:06:07 | 30.54 | 2140501382736 |
| 3 | Kelseyland | 63 | Urban | 2016-01-25 20:44:56 | 12.08 | 1896987891309 |
| 4 | Kelseyland | 63 | Urban | 2016-08-09 18:19:47 | 17.91 | 8784212854829 |
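
Note that after the merge, `driver_count` repeats on every ride row (63 appears on each Kelseyland ride above). Summing that column on the merged frame would therefore count a city's drivers once per ride. The following sketch, using two small made-up tables with hypothetical city names, shows the difference between summing on the merged frame and summing on the one-row-per-city table.

```python
import pandas as pd

# Illustrative tables only (hypothetical cities, not the Pyber dataset)
city_data = pd.DataFrame({
    "city": ["Alpha", "Beta"],
    "driver_count": [60, 20],
    "type": ["Urban", "Rural"],
})
ride_data = pd.DataFrame({
    "city": ["Alpha", "Alpha", "Alpha", "Beta"],
    "fare": [10.0, 12.0, 14.0, 40.0],
})

merged = pd.merge(city_data, ride_data, on="city")

# Summing on the merged frame counts Alpha's 60 drivers once per ride
inflated = merged.groupby("type")["driver_count"].sum()

# Summing on the per-city table counts each city's drivers exactly once
correct = city_data.groupby("type")["driver_count"].sum()

print(inflated)
print(correct)
```

For driver totals by city type, working from the per-city table (or from the merged frame after dropping duplicate cities) avoids the inflated figure.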


In [8]:
df = df.drop(['date', 'ride_id'], axis=1).reset_index(drop=True)
df.head()

| | city | driver_count | type | fare |
|---|---|---|---|---|
| 0 | Kelseyland | 63 | Urban | 5.51 |
| 1 | Kelseyland | 63 | Urban | 5.54 |
| 2 | Kelseyland | 63 | Urban | 30.54 |
| 3 | Kelseyland | 63 | Urban | 12.08 |
| 4 | Kelseyland | 63 | Urban | 17.91 |


In [9]:
urban = df.loc[df['type'] == 'Urban'].reset_index(drop=True)
urban.head()

| | city | driver_count | type | fare |
|---|---|---|---|---|
| 0 | Kelseyland | 63 | Urban | 5.51 |
| 1 | Kelseyland | 63 | Urban | 5.54 |
| 2 | Kelseyland | 63 | Urban | 30.54 |
| 3 | Kelseyland | 63 | Urban | 12.08 |
| 4 | Kelseyland | 63 | Urban | 17.91 |


In [10]:
suburban = df.loc[df['type'] == 'Suburban'].reset_index(drop=True)
suburban.head()

| | city | driver_count | type | fare |
|---|---|---|---|---|
| 0 | Carrollbury | 4 | Suburban | 25.0 |
| 1 | Carrollbury | 4 | Suburban | 49.47 |
| 2 | Carrollbury | 4 | Suburban | 35.33 |
| 3 | Carrollbury | 4 | Suburban | 20.26 |
| 4 | Carrollbury | 4 | Suburban | 46.67 |


In [11]:
rural = df.loc[df['type'] == 'Rural'].reset_index(drop=True)
rural.head()

| | city | driver_count | type | fare |
|---|---|---|---|---|
| 0 | South Elizabethmouth | 3 | Rural | 22.79 |
| 1 | South Elizabethmouth | 3 | Rural | 26.72 |
| 2 | South Elizabethmouth | 3 | Rural | 46.39 |
| 3 | South Elizabethmouth | 3 | Rural | 31.09 |
| 4 | South Elizabethmouth | 3 | Rural | 16.5 |


In [12]:
# Per-city ride count and total fare for rural cities
rural_group = rural.groupby('city')['fare'].agg(['count', 'sum'])

rural_group.head()