### Observations and Conclusions:
+ For the first five years of Nice Ride MN operations, ridership (the number of distinct rides) in the Twin Cities increased by more than 10% per year. In July, typically the most popular month, ridership grew from about 22k rides in 2010 to about 90k in 2014, an increase of roughly 300%.
+ The number of rides is clearly affected by weather, as expected. Nice Ride MN has an 8-month riding season (April through November) because it's not profitable to operate when there's snow on the ground (December through March). As the weather warms up, ridership increases, with the summer months being the most popular. Ridership usually peaks in July, Minneapolis's warmest month. As it cools down, ridership decreases, as evidenced by the sharp drop-off in October (steep negative slope).
+ Over the last four years for which we have data (2015-2018), ridership has remained steady or even decreased, as evidenced by the bunched-up lines for those years. We speculate that this flattening of growth is due to competition from scooters, which have a similar rental model. Also, Nice Ride has stopped serving the St. Paul market.
+ We can expect the ridership trends that exist in Minneapolis to also appear in our hometown, Eau Claire. Although it's a smaller metro area, Eau Claire's climate and terrain are very similar to those of Minneapolis.
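
As a quick check on the growth figures in the first observation, the arithmetic can be sketched as follows. The inputs are the rounded July counts quoted above (about 22k and about 90k), not exact values from the data, and the function names are illustrative only.

```python
def percent_increase(start, end):
    """Return the percentage increase from start to end."""
    return (end - start) / start * 100


def compound_annual_growth(start, end, intervals):
    """Return the constant yearly growth rate (in percent) that turns start into end."""
    return ((end / start) ** (1 / intervals) - 1) * 100


# Approximate July ride counts quoted in the observations (rounded, not exact).
july_2010 = 22_000
july_2014 = 90_000

total_growth = percent_increase(july_2010, july_2014)
# 2010 -> 2014 spans four year-over-year intervals.
yearly_growth = compound_annual_growth(july_2010, july_2014, 4)

print(f"Total July growth, 2010-2014: {total_growth:.0f}%")
print(f"Implied average yearly growth: {yearly_growth:.0f}%")
```

With these rounded inputs, the total comes out a little above 300%, and the implied average yearly rate is well above the 10% floor stated in the observation.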


### Individual Contributions:
+ Offered the original idea for the project.  Made a case for why we should do this. Others agreed.
+ Developed the first list of questions that we would try to answer with our visualizations.
+ Helped the group narrow our focus and refine our primary question by offering the premise that we were developing a presentation for the City of Eau Claire, WI, which was soliciting bids for a bikeshare service according to this news article: https://volumeone.org/articles/2019/03/21/28625_sharing_their_wheels.
+ Answered the question with my visualization--How did monthly ridership change over time in MN?

In [1]:
# Import pandas and NumPy for data handling and matplotlib for charting
import numpy as np
import matplotlib.pyplot as plt 
import pandas as pd
from datetime import datetime
from dateutil import parser


In [2]:
# Save path to data set in a variable
dataFile2010 = "Project1CSV/Niceride 2010to2018/NR2010/Nice_Ride_trip_history_2010_season.csv"
dataFile2011 = "Project1CSV/Niceride 2010to2018/NR2011/Nice_Ride_trip_history_2011_season.csv"
dataFile2012 = "Project1CSV/Niceride 2010to2018/NR2012/Nice_Ride_trip_history_2012_season.csv"
dataFile2013 = "Project1CSV/Niceride 2010to2018/NR2013/Nice_Ride_trip_history_2013_season.csv"
dataFile2014 = "Project1CSV/Niceride 2010to2018/NR2014/Nice_Ride_trip_history_2014_season.csv"
dataFile2015 = "Project1CSV/Niceride 2010to2018/NR2015/Nice_Ride_trip_history_2015_season.csv"
dataFile2016 = "Project1CSV/Niceride 2010to2018/NR2016/Nice_Ride_trip_history_2016_season.csv"
dataFile2017 = "Project1CSV/Niceride 2010to2018/NR2017/Nice_Ride_trip_history_2017_season.csv"

# 2018 data is monthly, so import the monthly files
dataAPR2018 = "Project1CSV/Niceride 2010to2018/NR2018/201804-niceride-tripdata.csv"
dataMAY2018 = "Project1CSV/Niceride 2010to2018/NR2018/201805-niceride-tripdata.csv"
dataJUN2018 = "Project1CSV/Niceride 2010to2018/NR2018/201806-niceride-tripdata.csv"
dataJUL2018 = "Project1CSV/Niceride 2010to2018/NR2018/201807-niceride-tripdata.csv"
dataAUG2018 = "Project1CSV/Niceride 2010to2018/NR2018/201808-niceride-tripdata.csv"
dataSEP2018 = "Project1CSV/Niceride 2010to2018/NR2018/201809-niceride-tripdata.csv"
dataOCT2018 = "Project1CSV/Niceride 2010to2018/NR2018/201810-niceride-tripdata.csv"
dataNOV2018 = "Project1CSV/Niceride 2010to2018/NR2018/201811-niceride-tripdata.csv"
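
The paths above follow two regular patterns, so they could also be generated rather than typed out. The sketch below is one possible way to do that; `BASE_DIR`, `season_files`, and `monthly_files_2018` are names introduced here for illustration and are not used elsewhere in this notebook.

```python
# Folder that holds all of the Nice Ride CSV files (same prefix as the paths above).
BASE_DIR = "Project1CSV/Niceride 2010to2018"

# One season-long file per year for 2010-2017, keyed by year.
season_files = {
    year: f"{BASE_DIR}/NR{year}/Nice_Ride_trip_history_{year}_season.csv"
    for year in range(2010, 2018)
}

# One file per month for 2018, keyed by month number (April = 4 ... November = 11).
monthly_files_2018 = {
    month: f"{BASE_DIR}/NR2018/2018{month:02d}-niceride-tripdata.csv"
    for month in range(4, 12)
}

print(season_files[2010])
print(monthly_files_2018[11])
```

Keeping the paths in dictionaries means later steps (reading, counting, plotting) can loop over the years instead of repeating one line per year.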



In [3]:
type(dataNOV2018)

str

In [4]:
# Use Pandas to read data
nr2010_df = pd.read_csv(dataFile2010)
nr2011_df = pd.read_csv(dataFile2011)
nr2012_df = pd.read_csv(dataFile2012)
nr2013_df = pd.read_csv(dataFile2013)
nr2014_df = pd.read_csv(dataFile2014)
nr2015_df = pd.read_csv(dataFile2015)
nr2016_df = pd.read_csv(dataFile2016)
nr2017_df = pd.read_csv(dataFile2017)





In [5]:
# The 2018 data is monthly, so it must be handled differently.
# Read each .csv into a dataframe.
APR2018_df = pd.read_csv(dataAPR2018)
MAY2018_df = pd.read_csv(dataMAY2018)
JUN2018_df = pd.read_csv(dataJUN2018)
JUL2018_df = pd.read_csv(dataJUL2018)
AUG2018_df = pd.read_csv(dataAUG2018)
SEP2018_df = pd.read_csv(dataSEP2018)
OCT2018_df = pd.read_csv(dataOCT2018)
NOV2018_df = pd.read_csv(dataNOV2018)

# Count the records in each df and assign to a variable.
countAPR2018 = len(APR2018_df)
countMAY2018 = len(MAY2018_df)
countJUN2018 = len(JUN2018_df)
countJUL2018 = len(JUL2018_df)
countAUG2018 = len(AUG2018_df)
countSEP2018 = len(SEP2018_df)
countOCT2018 = len(OCT2018_df)
countNOV2018 = len(NOV2018_df)

# Make a list of the variables to be used later in the plot.
list2018 = [countAPR2018, countMAY2018, countJUN2018, countJUL2018, countAUG2018,
            countSEP2018, countOCT2018, countNOV2018]

# type(list2018)




In [6]:
# Back to the 2010-2017 data...
# Change the Start date column from a string to a python datetime object to run the analysis:
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html

nr2010_df["Start date"] = pd.to_datetime(nr2010_df["Start date"], format='%m/%d/%Y %H:%M')
nr2011_df["Start date"] = pd.to_datetime(nr2011_df["Start date"], format='%m/%d/%Y %H:%M')
nr2012_df["Start date"] = pd.to_datetime(nr2012_df["Start date"], format='%m/%d/%Y %H:%M')
nr2013_df["Start date"] = pd.to_datetime(nr2013_df["Start date"], format='%m/%d/%Y %H:%M')
nr2014_df["Start date"] = pd.to_datetime(nr2014_df["Start date"], format='%m/%d/%Y %H:%M')
nr2015_df["Start date"] = pd.to_datetime(nr2015_df["Start date"], format='%m/%d/%Y %H:%M')
nr2016_df["Start date"] = pd.to_datetime(nr2016_df["Start date"], format='%m/%d/%Y %H:%M')
nr2017_df["Start date"] = pd.to_datetime(nr2017_df["Start date"], format='%m/%d/%Y %H:%M')
# nr2014_df.dtypes    
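
To illustrate what the conversion does, here is a small sketch on a made-up two-row DataFrame. The timestamps are invented for the example; only the column name and the format string are taken from the cell above.

```python
import pandas as pd

# Made-up timestamps in the same "month/day/year hour:minute" layout.
sample = pd.DataFrame({"Start date": ["6/10/2010 15:45", "11/07/2010 08:05"]})

# Before conversion the column holds plain strings.
sample["Start date"] = pd.to_datetime(sample["Start date"], format="%m/%d/%Y %H:%M")

# After conversion, date parts such as the month are available through .dt.
print(sample["Start date"].dtype)
print(sample["Start date"].dt.month.tolist())
```

Passing an explicit `format` avoids having pandas guess the layout, which matters for dates such as 11/07 where month and day could otherwise be swapped.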

In [7]:
# print(nr2014_df.index)
print(nr2017_df.index)

RangeIndex(start=0, stop=460718, step=1)


In [8]:
# Move the index to the 'Start date' column.
nr2010_df.index=nr2010_df['Start date']
nr2011_df.index=nr2011_df['Start date']
nr2012_df.index=nr2012_df['Start date']
nr2013_df.index=nr2013_df['Start date']
nr2014_df.index=nr2014_df['Start date']
nr2015_df.index=nr2015_df['Start date']
nr2016_df.index=nr2016_df['Start date']
nr2017_df.index=nr2017_df['Start date']
# print(nr2014_df.index)

In [9]:
# Count the rides in each calendar month (by the month of the index) and assign to a variable.
# https://stackoverflow.com/questions/24082784/pandas-dataframe-groupby-datetime-month
# https://www.geeksforgeeks.org/python-pandas-index-value_counts/

rideCounts2010 = nr2010_df.index.month.value_counts()
rideCounts2011 = nr2011_df.index.month.value_counts()
rideCounts2012 = nr2012_df.index.month.value_counts()
rideCounts2013 = nr2013_df.index.month.value_counts()
rideCounts2014 = nr2014_df.index.month.value_counts()
rideCounts2015 = nr2015_df.index.month.value_counts()
rideCounts2016 = nr2016_df.index.month.value_counts()
rideCounts2017 = nr2017_df.index.month.value_counts()


In [10]:
# type(rideCounts)
rideCounts2012

6     52427
7     49231
8     48879
9     41713
5     36193
10    22982
4     22322
11     2003
Name: Start date, dtype: int64

In [11]:
# Sort the series.
rideCounts2010.sort_index(ascending=True, inplace=True)
rideCounts2011.sort_index(ascending=True, inplace=True)
rideCounts2012.sort_index(ascending=True, inplace=True)
rideCounts2013.sort_index(ascending=True, inplace=True)
rideCounts2014.sort_index(ascending=True, inplace=True)
rideCounts2015.sort_index(ascending=True, inplace=True)
rideCounts2016.sort_index(ascending=True, inplace=True)
rideCounts2017.sort_index(ascending=True, inplace=True)
# rideCounts2014
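
The count-then-sort steps can also be wrapped in one small helper and applied to the datetime column directly. This is only a sketch: `monthly_ride_counts` is a hypothetical helper name, and the three rides in `toy` are invented to keep the example small.

```python
import pandas as pd


def monthly_ride_counts(df, date_column="Start date"):
    """Count rows per calendar month, ordered by month number."""
    return df[date_column].dt.month.value_counts().sort_index()


# Three invented rides: two in June, one in April.
toy = pd.DataFrame(
    {"Start date": pd.to_datetime(["2012-06-01 09:00", "2012-04-15 17:30", "2012-06-20 12:10"])}
)

counts = monthly_ride_counts(toy)
print(counts)
```

Using `.dt.month` on the column gives the same month numbers as `.index.month` on a datetime index, and chaining `sort_index()` returns a new sorted Series instead of modifying one in place.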

In [12]:
%matplotlib notebook

In [13]:
# Create a list of the months that will be plotted.
x_axis = ['APR','MAY','JUN','JUL','AUG','SEP','OCT','NOV']

In [14]:
# Convert each pandas.core.series.Series to a list for use in the plot.
list2010 = rideCounts2010.tolist()
list2011 = rideCounts2011.tolist()
list2012 = rideCounts2012.tolist()
list2013 = rideCounts2013.tolist()
list2014 = rideCounts2014.tolist()
list2015 = rideCounts2015.tolist()
list2016 = rideCounts2016.tolist()
list2017 = rideCounts2017.tolist()

# The 2010 list has only six elements because it started in June that year. 
# Add two elements to the beginning of the 2010 list.
list2010.insert(0, 0)
list2010.insert(1, 0)

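# The 2013 list also has January, February, and March.  Rides were < 10 each, so I'm removing them.
# Delete all three with one slice: calling pop(0), pop(1), pop(2) in turn would skip
# elements, because the list shrinks after every pop.
del list2013[:3]

# Confirm that eight months (April-November) remain.
len(list2013)

8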
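
Padding and trimming the lists by position depends on knowing exactly which months each year contains. An alternative sketch is to align every year's counts to the same April-November months by label with `Series.reindex`. The names `SEASON_MONTHS` and `align_to_season` are introduced here for illustration, and the counts are made up.

```python
import pandas as pd

# Month numbers that appear on the x-axis: April (4) through November (11).
SEASON_MONTHS = list(range(4, 12))


def align_to_season(counts):
    """Return counts for April-November as a list, using 0 for missing months."""
    return counts.reindex(SEASON_MONTHS, fill_value=0).tolist()


# A season that started in June (like 2010); the counts are invented.
late_start = pd.Series({6: 30, 7: 40, 8: 35, 9: 20, 10: 10, 11: 1})

# A season with a few stray winter rides (like 2013); the counts are invented.
stray_winter = pd.Series(
    {1: 2, 2: 3, 3: 1, 4: 10, 5: 20, 6: 30, 7: 40, 8: 35, 9: 20, 10: 10, 11: 1}
)

print(align_to_season(late_start))
print(align_to_season(stray_winter))
```

Because `reindex` selects by month number, missing months become zeros and out-of-season months are dropped in one step, so every list ends up with the eight values the plot expects.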

In [35]:
# Generate the plot. Use the style that others in my group liked.
plt.style.use('fivethirtyeight')

plt.plot(x_axis, list2010, color="lightblue", marker="o", linewidth=1, label="2010")
plt.plot(x_axis, list2011, color="skyblue", marker="o", linewidth=1, label="2011")
plt.plot(x_axis, list2012, color="cornflowerblue", marker="o", linewidth=1, label="2012")
plt.plot(x_axis, list2013, color="dodgerblue", marker="o", linewidth=1, label="2013")
plt.plot(x_axis, list2014, color="royalblue", marker="o", linewidth=1, label="2014")
plt.plot(x_axis, list2015, color="mediumblue", marker="o", linewidth=1, label="2015")
plt.plot(x_axis, list2016, color="darkblue", marker="o", linewidth=1, label="2016")
plt.plot(x_axis, list2017, color="navy", marker="o", linewidth=1, label="2017")
plt.plot(x_axis, list2018, color="black", marker="o", linewidth=1, label="2018")
# plt.plot(x_axis, yearly_casual_count, color="teal", marker = "o", linewidth=1, linestyle="-", label="Casual")

plt.title("Nice Rides per Month (April-November)")
plt.xlabel("Month")
plt.xticks(x_axis, x_axis, rotation='vertical', fontsize=10)
plt.ylabel("Number of Rides")
plt.yticks(fontsize=10)

plt.legend(bbox_to_anchor=(1, 1),  ncol=1, title="Year", title_fontsize=14, fancybox=True, frameon=True, 
           shadow=True, facecolor="white", fontsize=12)

plt.grid(color="white")

# Fit the layout, then save the figure and display it here.
plt.tight_layout()
plt.savefig('niceRidesOverTime_Dave.png', bbox_inches='tight')
plt.show()
