## Option 1: Pyber

![Ride](Images/Ride.png)

The ride-sharing bonanza continues! Seeing the success of notable players like Uber and Lyft, you've decided to join a fledgling ride-sharing company of your own. In your new role as Chief Data Strategist, you'll be expected to offer data-backed guidance on new opportunities for market differentiation.

You've since been given access to the company's complete recordset of rides. This contains information about every active driver and historic ride, including details like city, driver count, individual fares, and city type.

Your objective is to build a [Bubble Plot](https://en.wikipedia.org/wiki/Bubble_chart) that showcases the relationship between four key variables:

* Average Fare ($) Per City
* Total Number of Rides Per City
* Total Number of Drivers Per City
* City Type (Urban, Suburban, Rural)

In addition, you will be expected to produce the following three pie charts:

* % of Total Fares by City Type
* % of Total Rides by City Type
* % of Total Drivers by City Type

As final considerations:

* You must use the Pandas Library and the Jupyter Notebook.
* You must use the Matplotlib and Seaborn libraries.
* You must include a written description of three observable trends based on the data.
* You must use proper labeling of your plots, including aspects like: Plot Titles, Axes Labels, Legend Labels, Wedge Percentages, and Wedge Labels.
* Remember when making your plots to consider aesthetics!
  * You must stick to the Pyber color scheme (Gold, Light Sky Blue, and Light Coral) in producing your plot and pie charts.
  * When making your Bubble Plot, experiment with effects like `alpha`, `edgecolor`, and `linewidths`.
  * When making your Pie Chart, experiment with effects like `shadow`, `startangle`, and `explode`.
* You must include an exported markdown version of your Notebook called `README.md` in your GitHub repository.
* See [Example Solution](Pyber/Pyber_Example.pdf) for a reference on expected format.

# Dependencies
import numpy as np
import pandas as pd
from scipy import stats
import seaborn as sns
import matplotlib.pyplot as plt

# Read data
city_data = pd.read_csv("raw_data/city_data.csv")
ride_data = pd.read_csv("raw_data/ride_data.csv")
#city_data = city_data.reset_index()


city_data.head()

ride_data.head()

# Join ride records to their city's attributes on the shared 'city' column
combined_data = city_data.merge(ride_data, on='city', how='outer')
combined_data.head()
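
One thing worth checking after this merge: every ride row now carries its city's `driver_count`, so counting or summing that column over rides no longer gives the number of drivers. The sketch below illustrates the effect on made-up data; the city names and numbers are hypothetical and are not taken from the Pyber files.

```python
import pandas as pd

# Hypothetical toy data, NOT the Pyber dataset
city_data = pd.DataFrame({
    "city": ["Alpha", "Beta"],
    "driver_count": [5, 2],
    "type": ["Urban", "Rural"],
})
ride_data = pd.DataFrame({
    "city": ["Alpha", "Alpha", "Alpha", "Beta"],
    "fare": [10.0, 20.0, 30.0, 40.0],
    "ride_id": [1, 2, 3, 4],
})

combined = city_data.merge(ride_data, on="city", how="outer")

# Summing over ride rows counts each city's drivers once per ride: 5*3 + 2*1
inflated_drivers = combined["driver_count"].sum()

# Taking one value per city recovers the real total: 5 + 2
true_drivers = combined.groupby("city")["driver_count"].first().sum()

print(inflated_drivers, true_drivers)
```

On this toy data the two totals differ (17 versus 7), which is why driver figures are safer to take once per city, or directly from the city table.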

# Per-city aggregates. driver_count repeats on every ride row after the merge,
# so take a single value per city rather than counting rows.
city_stats = combined_data.groupby(['city', 'type']).agg(
    rides=('ride_id', 'nunique'),
    avg_fare=('fare', 'mean'),
    drivers=('driver_count', 'first'),
).reset_index()
city_stats.head()

# Pyber color scheme; which color goes with which city type is our choice
colors = {'Urban': 'lightcoral', 'Suburban': 'lightskyblue', 'Rural': 'gold'}

# One scatter call per city type so each gets its own color and legend entry.
# Bubble size is scaled from the city's driver count.
for city_type, color in colors.items():
    subset = city_stats[city_stats['type'] == city_type]
    plt.scatter(subset['rides'], subset['avg_fare'], s=subset['drivers'] * 10,
                c=color, alpha=0.5, edgecolor='black', linewidths=1,
                label=city_type)

plt.title('Pyber Ride Sharing Data')
plt.xlabel('Total Number of Rides (Per City)')
plt.ylabel('Average Fare ($)')
plt.legend(title='City Type')
plt.show()
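
The assignment also asks for three pie charts by city type, which the code so far does not produce. The following is a minimal sketch of one way to approach them, not a reference solution. It assumes a table with `type` and `fare` columns; the helper `percent_by_type`, the toy data, and the color-to-type mapping are all hypothetical choices made for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd


def percent_by_type(df, value_col, how="sum"):
    """Return each city type's share of the total, in percent (hypothetical helper)."""
    totals = df.groupby("type")[value_col].agg(how)
    return totals / totals.sum() * 100


# Hypothetical toy rides, NOT the Pyber dataset
rides = pd.DataFrame({
    "type": ["Urban", "Urban", "Urban", "Suburban", "Rural"],
    "fare": [20.0, 30.0, 10.0, 30.0, 10.0],
})

fare_pct = percent_by_type(rides, "fare")

# Pyber colors; the assignment of color to city type is an assumption
colors = {"Urban": "lightcoral", "Suburban": "lightskyblue", "Rural": "gold"}
# Pull the Urban wedge slightly out of the pie
explode = [0.1 if t == "Urban" else 0 for t in fare_pct.index]

fig, ax = plt.subplots()
ax.pie(fare_pct, labels=fare_pct.index,
       colors=[colors[t] for t in fare_pct.index],
       explode=explode, autopct="%1.1f%%", shadow=True, startangle=140)
ax.set_title("% of Total Fares by City Type")
ax.axis("equal")  # keep the pie circular
```

The same helper could cover the other two charts: `percent_by_type(rides, "fare", how="count")` gives the share of rides, and for drivers it would be applied to the city-level table (one row per city) with `driver_count` as the value column, so that drivers are not counted once per ride.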


