# Boston Hubway Analysis

### In this exercise, we will pull the Hubway bikeshare trip data into a dataframe to analyze ridership, focusing primarily on station locations and overall popularity.  
### The data include station names and unique IDs, latitude and longitude for each station, bike IDs, and user information such as type, birth year, and gender.  
### Additionally, trip duration is available as a time measure.

In [1]:
import pandas as pd
import numpy as np

In [2]:
dfx = pd.read_csv('C:\\Users\\benjamin.brosch\\Documents\\PythonFiles\\201712-hubway-tripdata.csv') 
dfx.head()

Unnamed: 0,tripduration,starttime,stoptime,start station id,start station name,start station latitude,start station longitude,end station id,end station name,end station latitude,end station longitude,bikeid,usertype,birth year,gender
0,200,12/1/2017 0:02,12/1/2017 0:05,160,Wentworth Institute of Technology - Huntington...,42.337586,-71.096271,12,Ruggles T Stop - Columbus Ave at Melnea Cass Blvd,42.336244,-71.087986,1938,Subscriber,1982,0
1,365,12/1/2017 0:06,12/1/2017 0:12,226,Commonwealth Ave At Babcock St,42.351547,-71.121262,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,57,Subscriber,1997,1
2,297,12/1/2017 0:09,12/1/2017 0:14,74,Harvard Square at Mass Ave/ Dunster,42.373268,-71.118579,76,Central Sq Post Office / Cambridge City Hall a...,42.366426,-71.105495,1201,Subscriber,1977,1
3,1128,12/1/2017 0:09,12/1/2017 0:28,46,Christian Science Plaza - Massachusetts Ave at...,42.343666,-71.085824,130,Upham's Corner TEMPORARY WINTER LOCATION,42.317509,-71.064166,1148,Subscriber,1991,1
4,2594,12/1/2017 0:10,12/1/2017 0:54,193,Brookline Village - Station Street @ MBTA TEMP...,42.333765,-71.120464,32,Landmark Center - Brookline Ave at Park Dr,42.345194,-71.101697,1712,Subscriber,1963,1


### The most popular start-end pairs can be identified using a groupby function. User Type is pulled in here to act as a counter - this could be renamed if required.

In [3]:
dfx_origin_pairs = dfx[['start station name', 'end station name', 
                        'usertype']].groupby(['start station name', 
                                 'end station name']).count().sort_values('usertype', ascending = False).reset_index()
dfx_origin_pairs.head(12)

Unnamed: 0,start station name,end station name,usertype
0,MIT at Mass Ave / Amherst St,MIT Vassar St,264
1,MIT Vassar St,MIT at Mass Ave / Amherst St,235
2,Linear Park - Mass. Ave. at Cameron Ave.,Davis Square,206
3,MIT Vassar St,MIT Stata Center at Vassar St / Main St,205
4,MIT Pacific St at Purrington St,MIT at Mass Ave / Amherst St,182
5,Davis Square,Linear Park - Mass. Ave. at Cameron Ave.,178
6,MIT Stata Center at Vassar St / Main St,MIT Vassar St,173
7,MIT Stata Center at Vassar St / Main St,MIT Pacific St at Purrington St,169
8,MIT Pacific St at Purrington St,Kendall T,160
9,MIT Pacific St at Purrington St,MIT Stata Center at Vassar St / Main St,158
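
### As a possible alternative to borrowing the User Type column as a counter, the sketch below counts start-end pairs with `size()` and names the count column directly.
### The small `trips` table and the `trip_count` column name are made up for illustration. Note that `count()` skips missing values in the counted column, while `size()` counts every row.

```python
import pandas as pd

# Hypothetical miniature of the trip table; only the two station-name columns matter here.
trips = pd.DataFrame({
    'start station name': ['A', 'A', 'B', 'A', 'B'],
    'end station name':   ['B', 'B', 'A', 'C', 'A'],
})

# size() counts rows per group, so no extra column is needed as a counter.
pair_counts = (
    trips.groupby(['start station name', 'end station name'])
         .size()
         .reset_index(name='trip_count')   # 'trip_count' is an illustrative column name
         .sort_values('trip_count', ascending=False)
         .reset_index(drop=True)
)
print(pair_counts)
```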


### Next, we can pull ride counts by start station and by end station.
### This is done with two more groupby calls, which count the rides leaving and arriving at each station. 
##### (Note that the individual data frames "dfx_rides_start" and "dfx_rides_end" are not displayed below.)

In [4]:
dfx_rides_start = dfx[['start station id', 
                       'start station name']].groupby(['start station id']).count().reset_index()
dfx_rides_start.rename(columns={'start station name': 'RideCount_Start', 
                                'start station id': 'station id'}, inplace=True)

dfx_rides_end = dfx[['end station id', 
                     'end station name']].groupby(['end station id']).count().reset_index()
dfx_rides_end.rename(columns={'end station name': 'RideCount_End', 
                              'end station id': 'station id'}, inplace=True)

### A merge of the start and end station tables provides a summary table with both counts for each station. Note that `pd.merge` defaults to an inner join, so a station that appears only as a start or only as an end would be dropped.
### We have also added columns for "Color_Start" and "Color_End" to assign color categories based on ridership. A "Color_Net" column will display a color based on the percentage difference between the two.

In [5]:
dfx_rides = pd.merge(dfx_rides_start, dfx_rides_end, on='station id')
dfx_rides['Net Rides (% over start)'] = ((dfx_rides['RideCount_End']- 
                                          dfx_rides['RideCount_Start'])/ 
                                         dfx_rides['RideCount_Start'] 
                                         *100).round(1)
dfx_rides['Color_Start'] = ''
dfx_rides['Color_End'] = ''
dfx_rides['Color_Net'] = ''
dfx_rides = dfx_rides.sort_values('Net Rides (% over start)', ascending = False).reset_index()
dfx_rides.head(10)

Unnamed: 0,index,station id,RideCount_Start,RideCount_End,Net Rides (% over start),Color_Start,Color_End,Color_Net
0,0,1,2,16,700.0,,,
1,169,219,4,13,225.0,,,
2,40,48,98,151,54.1,,,
3,129,170,2,3,50.0,,,
4,110,138,44,65,47.7,,,
5,156,203,5,7,40.0,,,
6,146,190,978,1359,39.0,,,
7,132,174,18,23,27.8,,,
8,161,210,4,5,25.0,,,
9,114,142,41,51,24.4,,,
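
### Because `pd.merge` performs an inner join by default, one way to check that no station is lost is an outer merge with `indicator=True`. The sketch below uses made-up counts (the `starts` and `ends` tables are hypothetical) and the same percentage formula as the cell above.
### Filling missing counts with 0 and leaving the percentage as NaN when the start count is 0 are assumptions, not choices made in the original analysis.

```python
import pandas as pd

# Hypothetical per-station counts; station 3 only appears as a start, station 4 only as an end.
starts = pd.DataFrame({'station id': [1, 2, 3], 'RideCount_Start': [2, 98, 40]})
ends = pd.DataFrame({'station id': [1, 2, 4], 'RideCount_End': [16, 151, 7]})

# how='outer' keeps stations present on either side;
# indicator=True adds a '_merge' column recording which side each row came from.
rides = pd.merge(starts, ends, on='station id', how='outer', indicator=True)

# Stations missing from one side get a count of 0 rather than NaN.
rides[['RideCount_Start', 'RideCount_End']] = (
    rides[['RideCount_Start', 'RideCount_End']].fillna(0).astype(int)
)

# Same formula as above: (end - start) / start * 100.
# A start count of 0 would divide by zero, so those rows are left as NaN.
nonzero_start = rides['RideCount_Start'].where(rides['RideCount_Start'] > 0)
rides['Net Rides (% over start)'] = (
    (rides['RideCount_End'] - rides['RideCount_Start']) / nonzero_start * 100
).round(1)
print(rides)
```

### Rows where `_merge` is not `both` are the stations an inner join would have dropped.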


### Now, based on the statistical output from the describe function, we can capture ridership in quartiles and assign colors for each group. 
### Lower ride totals will be captured in lighter shades of green and higher ride totals will be captured in darker shades of green.
### Additionally we can capture change in net rides (% increase of End over Start) as a third category, with a separate color scheme.

In [6]:
dfx_rides.describe().transpose()

Unnamed: 0,count,mean,std,min,25%,50%,75%,max
index,177.0,88.0,51.239633,0.0,44.0,88.0,132.0,176.0
station id,177.0,111.768362,67.227713,1.0,56.0,107.0,174.0,232.0
RideCount_Start,177.0,311.141243,321.358352,2.0,87.0,255.0,404.0,2074.0
RideCount_End,177.0,311.141243,333.845087,1.0,73.0,240.0,420.0,2009.0
Net Rides (% over start),177.0,2.449153,58.215144,-75.0,-11.0,-0.8,7.7,700.0
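
### The cell that fills in the color columns is not shown in this notebook. The sketch below is one way it could be done, assuming each ride count column is split at its own quartiles and that the color names match those seen in the later output.
### The `rides` table here is made up, and the original cell may have used different cut points.

```python
import numpy as np
import pandas as pd

# Hypothetical ride counts for eight stations (not the real Hubway numbers).
rides = pd.DataFrame({
    'RideCount_Start': [2, 50, 90, 200, 260, 400, 450, 2000],
    'RideCount_End':   [16, 40, 80, 210, 250, 380, 500, 1900],
})
rides['Net Rides (% over start)'] = (
    (rides['RideCount_End'] - rides['RideCount_Start'])
    / rides['RideCount_Start'] * 100
).round(1)

# Lightest to darkest green, one per quartile; this ordering is an assumption.
greens = ['yellowgreen', 'chartreuse', 'limegreen', 'darkgreen']

# qcut splits each column at its own 25%/50%/75% quantiles.
rides['Color_Start'] = pd.qcut(rides['RideCount_Start'], 4, labels=greens).astype(str)
rides['Color_End'] = pd.qcut(rides['RideCount_End'], 4, labels=greens).astype(str)

# Net change: bottom quartile tomato, top quartile dodgerblue, middle half grey.
net = rides['Net Rides (% over start)']
low, high = net.quantile(0.25), net.quantile(0.75)
rides['Color_Net'] = np.select(
    [net <= low, net >= high],
    ['tomato', 'dodgerblue'],
    default='grey',
)
print(rides)
```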


###  This short section will be a simple import of the mapping library and building a base map. 
### Here, we have pulled the average latitude and longitude coordinates, and used these calculations to center our Folium map.


In [8]:
avg_lat = dfx['start station latitude'].mean()  # mean start latitude, used to center the map
avg_lon = dfx['start station longitude'].mean()  # mean start longitude

import folium
mapCap = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
mapCap

### The dfx_list0 and dfx_list dataframes come off as a bit gratuitous, since they are only used to create the merged dataframe below. This code can be cleaned up in a future review.


In [9]:
dfx_list0 = dfx[['end station id', 'end station name', 'end station latitude', 'end station longitude']] 
dfx_list0 = dfx_list0.drop_duplicates()
dfx_list0.sort_values(['end station id'], ascending = True, inplace = True)
dfx_list0.rename(columns={'end station latitude': 'latitude', 
                         'end station longitude': 'longitude',
                         'end station id': 'station id'}, inplace=True)
dfx_list0.head(10)


Unnamed: 0,station id,end station name,latitude,longitude
8707,1,18 Dorrance Warehouse,42.387151,-71.075978
704,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619
339,4,Tremont St at E Berkeley St,42.345392,-71.069616
156,5,Northeastern University - North Parking Lot,42.341814,-71.090179
389,6,Cambridge St at Joy St,42.361199,-71.065195
207,7,Fan Pier,42.352941,-71.043885
1228,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313
100,9,Commonwealth Ave at Buick St,42.350622,-71.112882
1,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279
92,11,Longwood Ave at Binney St,42.338629,-71.1065


In [10]:
dfx_list = dfx[['start station id', 'start station name', 'start station latitude', 'start station longitude' ]] 
dfx_list = dfx_list.drop_duplicates()
dfx_list.sort_values(['start station id'], ascending = True, inplace = True)
dfx_list.rename(columns={'start station latitude': 'latitude', 
                         'start station longitude': 'longitude', 'start station id': 'station id'}, inplace=True)
dfx_list.head(10)


Unnamed: 0,station id,start station name,latitude,longitude
28694,1,18 Dorrance Warehouse,42.387151,-71.075978
1384,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619
131,4,Tremont St at E Berkeley St,42.345392,-71.069616
327,5,Northeastern University - North Parking Lot,42.341814,-71.090179
21,6,Cambridge St at Joy St,42.361199,-71.065195
1496,7,Fan Pier,42.352941,-71.043885
186,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313
187,9,Commonwealth Ave at Buick St,42.350622,-71.112882
447,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279
429,11,Longwood Ave at Binney St,42.338629,-71.1065


### This next section merges the start and end station lists and checks that their latitudes agree, so that each station can later be plotted on the map by latitude and longitude.

In [11]:
combined_list = pd.merge(dfx_list, dfx_list0, left_on='station id', right_on = 'station id')
combined_list['Check'] = combined_list['latitude_x'] == combined_list['latitude_y']
combined_list.head(10)

Unnamed: 0,station id,start station name,latitude_x,longitude_x,end station name,latitude_y,longitude_y,Check
0,1,18 Dorrance Warehouse,42.387151,-71.075978,18 Dorrance Warehouse,42.387151,-71.075978,True
1,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,True
2,4,Tremont St at E Berkeley St,42.345392,-71.069616,Tremont St at E Berkeley St,42.345392,-71.069616,True
3,5,Northeastern University - North Parking Lot,42.341814,-71.090179,Northeastern University - North Parking Lot,42.341814,-71.090179,True
4,6,Cambridge St at Joy St,42.361199,-71.065195,Cambridge St at Joy St,42.361199,-71.065195,True
5,7,Fan Pier,42.352941,-71.043885,Fan Pier,42.352941,-71.043885,True
6,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,True
7,9,Commonwealth Ave at Buick St,42.350622,-71.112882,Commonwealth Ave at Buick St,42.350622,-71.112882,True
8,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,True
9,11,Longwood Ave at Binney St,42.338629,-71.1065,Longwood Ave at Binney St,42.338629,-71.1065,True


In [12]:
combined_list['Check'].unique()

array([ True], dtype=bool)
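
### The two station lists and the coordinate check could possibly be folded into one step. The sketch below stacks the start and end station columns into a single table and confirms that each station id maps to one pair of coordinates.
### The `trips` table, the `station_side` helper, and the `station name` column are hypothetical and only mirror the column layout of `dfx`.

```python
import pandas as pd

# Hypothetical miniature of the trip table with the same column names as dfx.
trips = pd.DataFrame({
    'start station id': [1, 3, 3],
    'start station name': ['Station One', 'Station Three', 'Station Three'],
    'start station latitude': [42.387151, 42.340115, 42.340115],
    'start station longitude': [-71.075978, -71.100619, -71.100619],
    'end station id': [3, 1, 4],
    'end station name': ['Station Three', 'Station One', 'Station Four'],
    'end station latitude': [42.340115, 42.387151, 42.345392],
    'end station longitude': [-71.100619, -71.075978, -71.069616],
})

def station_side(df, side):
    """Return the station columns for one side ('start' or 'end') under shared names."""
    mapping = {
        f'{side} station id': 'station id',
        f'{side} station name': 'station name',
        f'{side} station latitude': 'latitude',
        f'{side} station longitude': 'longitude',
    }
    return df[list(mapping)].rename(columns=mapping)

# Stack both sides so every station seen as a start or an end is included.
stacked = pd.concat([station_side(trips, 'start'), station_side(trips, 'end')])

# Equivalent of the 'Check' column: each id should have exactly one latitude and longitude.
coords_per_station = stacked.groupby('station id')[['latitude', 'longitude']].nunique()
consistent = bool((coords_per_station == 1).all().all())

# One row per station id, sorted for readability.
stations = (
    stacked.drop_duplicates(subset='station id')
           .sort_values('station id')
           .reset_index(drop=True)
)
print(consistent)
print(stations)
```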

### Now, we will merge the data frames based on the Station ID

In [13]:
combined_list = pd.merge(combined_list, dfx_rides, on='station id')
combined_list.head(10)

Unnamed: 0,station id,start station name,latitude_x,longitude_x,end station name,latitude_y,longitude_y,Check,index,RideCount_Start,RideCount_End,Net Rides (% over start),Color_Start,Color_End,Color_Net
0,1,18 Dorrance Warehouse,42.387151,-71.075978,18 Dorrance Warehouse,42.387151,-71.075978,True,0,2,16,700.0,yellowgreen,yellowgreen,dodgerblue
1,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,True,1,169,187,10.7,chartreuse,chartreuse,dodgerblue
2,4,Tremont St at E Berkeley St,42.345392,-71.069616,Tremont St at E Berkeley St,42.345392,-71.069616,True,2,477,427,-10.5,darkgreen,darkgreen,grey
3,5,Northeastern University - North Parking Lot,42.341814,-71.090179,Northeastern University - North Parking Lot,42.341814,-71.090179,True,3,134,154,14.9,chartreuse,chartreuse,dodgerblue
4,6,Cambridge St at Joy St,42.361199,-71.065195,Cambridge St at Joy St,42.361199,-71.065195,True,4,574,572,-0.3,darkgreen,darkgreen,grey
5,7,Fan Pier,42.352941,-71.043885,Fan Pier,42.352941,-71.043885,True,5,185,196,5.9,chartreuse,chartreuse,grey
6,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,True,6,155,119,-23.2,chartreuse,chartreuse,tomato
7,9,Commonwealth Ave at Buick St,42.350622,-71.112882,Commonwealth Ave at Buick St,42.350622,-71.112882,True,7,273,286,4.8,limegreen,limegreen,grey
8,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,True,8,384,381,-0.8,limegreen,limegreen,grey
9,11,Longwood Ave at Binney St,42.338629,-71.1065,Longwood Ave at Binney St,42.338629,-71.1065,True,9,383,415,8.4,limegreen,darkgreen,dodgerblue


### Using the GeoPy library, we can use the latitude/longitude information to pull the full address.
### This address can then be converted to a string to return the Zip Code associated with each location.
### Within each section, we can also do a little bit of cleanup to drop unnecessary columns from the data frame - specifically, repeats of lat/lon identifiers and station names, and the remaining (unused) location information.

In [14]:
from geopy.geocoders import Nominatim
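# Recent geopy releases require an explicit user_agent; 'hubway-analysis' is a placeholder name.
locate_this = Nominatim(user_agent='hubway-analysis')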
#location = locate_this.reverse('42.336244, -71.087986') 
#print(location.address)

In [15]:
combined_list.drop(columns=['end station name', 'latitude_y', 'longitude_y'], inplace=True)

In [16]:
combined_list['Lat-Lon'] = list(zip(combined_list['latitude_x'], combined_list['longitude_x']))
combined_list['Location']= combined_list['Lat-Lon'].apply(locate_this.reverse)
combined_list.head(10)

Unnamed: 0,station id,start station name,latitude_x,longitude_x,Check,index,RideCount_Start,RideCount_End,Net Rides (% over start),Color_Start,Color_End,Color_Net,Lat-Lon,Location
0,1,18 Dorrance Warehouse,42.387151,-71.075978,True,0,2,16,700.0,yellowgreen,yellowgreen,dodgerblue,"(42.387151, -71.075978)","(99, Temple Street, Charlestown, Boston, Suffo..."
1,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,True,1,169,187,10.7,chartreuse,chartreuse,dodgerblue,"(42.34011512, -71.10061884)","(Ave Louis Pasteur @ The Fenway, Avenue Louis ..."
2,4,Tremont St at E Berkeley St,42.345392,-71.069616,True,2,477,427,-10.5,darkgreen,darkgreen,grey,"(42.345392, -71.069616)","(Hubway - Tremont St. at Berkeley St., Tremont..."
3,5,Northeastern University - North Parking Lot,42.341814,-71.090179,True,3,134,154,14.9,chartreuse,chartreuse,dodgerblue,"(42.341814, -71.090179)","(Hubway - Northeastern U / North Parking Lot, ..."
4,6,Cambridge St at Joy St,42.361199,-71.065195,True,4,574,572,-0.3,darkgreen,darkgreen,grey,"(42.36119942, -71.06519487)","(Hubway - Cambridge St. at Joy St., Cambridge ..."
5,7,Fan Pier,42.352941,-71.043885,True,5,185,196,5.9,chartreuse,chartreuse,grey,"(42.3529408, -71.04388475)","(100 Northern Avenue, 100, Northern Avenue, Ch..."
6,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,True,6,155,119,-23.2,chartreuse,chartreuse,tomato,"(42.353334, -71.137313)",(Hubway - Union Square - Brighton Ave. at Camb...
7,9,Commonwealth Ave at Buick St,42.350622,-71.112882,True,7,273,286,4.8,limegreen,limegreen,grey,"(42.35062177, -71.11288219)","(Boston Unversity Lot F, Dummer Street, Coolid..."
8,10,B.U. Central - 725 Comm. Ave.,42.350406,-71.108279,True,8,384,381,-0.8,limegreen,limegreen,grey,"(42.350406, -71.108279)","(BU Farmer's Market - Thursdays, Commonwealth ..."
9,11,Longwood Ave at Binney St,42.338629,-71.1065,True,9,383,415,8.4,limegreen,darkgreen,dodgerblue,"(42.338629, -71.1065)","(Hubway - Longwood Ave / Binney St, 330, Brook..."


In [17]:
combined_list['Location'] = combined_list['Location'].astype('str')
combined_list['Zip_Code'] = combined_list['Location'].str.extract(r'(\d{5})', expand = False)
combined_list.drop(columns=['Check', 'Location'], inplace=True)
combined_list.head(7)

Unnamed: 0,station id,start station name,latitude_x,longitude_x,index,RideCount_Start,RideCount_End,Net Rides (% over start),Color_Start,Color_End,Color_Net,Lat-Lon,Zip_Code
0,1,18 Dorrance Warehouse,42.387151,-71.075978,0,2,16,700.0,yellowgreen,yellowgreen,dodgerblue,"(42.387151, -71.075978)",2129
1,3,Colleges of the Fenway - Fenway at Avenue Loui...,42.340115,-71.100619,1,169,187,10.7,chartreuse,chartreuse,dodgerblue,"(42.34011512, -71.10061884)",2118
2,4,Tremont St at E Berkeley St,42.345392,-71.069616,2,477,427,-10.5,darkgreen,darkgreen,grey,"(42.345392, -71.069616)",2118
3,5,Northeastern University - North Parking Lot,42.341814,-71.090179,3,134,154,14.9,chartreuse,chartreuse,dodgerblue,"(42.341814, -71.090179)",2118
4,6,Cambridge St at Joy St,42.361199,-71.065195,4,574,572,-0.3,darkgreen,darkgreen,grey,"(42.36119942, -71.06519487)",2114
5,7,Fan Pier,42.352941,-71.043885,5,185,196,5.9,chartreuse,chartreuse,grey,"(42.3529408, -71.04388475)",2114
6,8,Union Square - Brighton Ave at Cambridge St,42.353334,-71.137313,6,155,119,-23.2,chartreuse,chartreuse,tomato,"(42.353334, -71.137313)",2134
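
### The pattern `(\d{5})` returns the first five-digit run in the address, which could be a house number rather than a Zip Code. The sketch below takes the last standalone five-digit run instead, assuming the postcode sits near the end of the address string.
### The address strings are made up for illustration. Keeping the result as a string preserves the leading zero of Massachusetts Zip Codes.

```python
import pandas as pd

# Made-up address strings (assumption: the postcode sits near the end, before the country).
addresses = pd.Series([
    '12 Example Street, Boston, Suffolk County, Massachusetts, 02129, United States',
    '10025 Example Plaza, Example Avenue, Boston, Massachusetts, 02118, United States',
    'Somewhere Without A Postcode, Boston, Massachusetts, United States',
])

# \b\d{5}\b matches standalone five-digit runs; findall returns every match per row.
matches = addresses.str.findall(r'\b\d{5}\b')

# Take the last match (closest to the country name); rows with no match get None.
zip_codes = matches.apply(lambda found: found[-1] if found else None)
print(zip_codes.tolist())
```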


###  Next, we will plot points on the map, based on the locations of each station. Plot points have been stylized (color/shape) to echo the Hubway branding.

In [18]:
from folium.plugins import FloatImage
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')
FloatImage(url, bottom = 0, left = 85).add_to(mapCap)

for i in range(0,len(combined_list)):
    folium.RegularPolygonMarker(location =[combined_list.iloc[i]['latitude_x'], 
                                           combined_list.iloc[i]['longitude_x']],
                                radius = 5, number_of_sides = 6, 
                                color = 'limegreen', fill_color = 'white'
                                ).add_to(mapCap)

### Calling this will produce a quick map of all the locations of Hubway bike stations.

In [19]:
mapCap

### We can use the same map center defined through our earlier lat/lon coordinates and produce a heatmap of Hubway station locations with the aptly named HeatMap function. 
### Here, we have created a separate map for the heatmap, without the station markers, so that the output does not confuse or overwhelm the viewer.

In [20]:
from folium.plugins import HeatMap
mapCap1 = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')

FloatImage(url, bottom = 0, left = 85).add_to(mapCap1)
heatmappin = [[row['latitude_x'],row['longitude_x']] for index, row in combined_list.iterrows()]
HeatMap(heatmappin, min_opacity = 0.6).add_to(mapCap1)
mapCap1

### Using earlier information, we can also plot locations and differentiate between ride total based on the identified color to demonstrate where riders start their rides and where riders end their rides.

In [21]:
mapCap2 = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')
FloatImage(url, bottom = 0, left = 85).add_to(mapCap2)
for i in range(0,len(combined_list)):
    folium.RegularPolygonMarker(location =[combined_list.iloc[i]['latitude_x'], 
                                           combined_list.iloc[i]['longitude_x']],
                        radius = 5, number_of_sides = 6,
                        #popup = combined_list.iloc[i]['start station name'],
                        color = combined_list.iloc[i]['Color_Start'], fill_color = 'white',
                       ).add_to(mapCap2)
mapCap2



In [22]:
mapCap3 = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')
FloatImage(url, bottom = 0, left = 85).add_to(mapCap3)
for i in range(0,len(combined_list)):
    folium.RegularPolygonMarker(location =[combined_list.iloc[i]['latitude_x'], 
                                           combined_list.iloc[i]['longitude_x']],
                        radius = 5, number_of_sides = 6,
                        #popup = combined_list.iloc[i]['start station name'],
                        color = combined_list.iloc[i]['Color_End'], fill_color = 'white',
                       ).add_to(mapCap3)
mapCap3

### For Net Rides, we can look at % Change between End and Start location. 
### The first quartile (-11% or below, outlined in red) and the fourth quartile (7.7% or above, outlined in blue) are used to identify changes in Net Rides at each location.
### We can also fill each with the "End Color" (total number of end point rides) to present a volume-based visual to accompany change in ridership.

In [23]:
(combined_list['Color_Net'] == 'grey').sum()

89

In [24]:
(combined_list['Color_Net'] == 'dodgerblue').sum()

44

In [25]:
(combined_list['Color_Net'] == 'tomato').sum()

44

In [26]:
mapCap4 = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')
FloatImage(url, bottom = 0, left = 85).add_to(mapCap4)
for i in range(0,len(combined_list)):
    folium.RegularPolygonMarker(location =[combined_list.iloc[i]['latitude_x'], 
                                           combined_list.iloc[i]['longitude_x']],
                        radius = 8, number_of_sides = 4,
                        #popup = combined_list.iloc[i]['start station name'],
                        color = combined_list.iloc[i]['Color_Net'], 
                                fill_color = combined_list.iloc[i]['Color_End']
                       ).add_to(mapCap4)
mapCap4

## Time for Pedro to set up his Arepa stand!

### In all seriousness, we can now filter to show stations that are (1) in the 4th quartile for popular end destinations (by volume) and (2) in the 4th quartile for % increase between end vs. start locations.
### With fewer data points, we can include popups to show location information (station name).

In [27]:
Arepa_Time = combined_list[
    (combined_list['Color_Net'] == 'dodgerblue') & 
    (combined_list['Color_End'] == 'darkgreen')]

mapCap5 = folium.Map(location=[avg_lat, avg_lon], tiles = 'CartoDB positron', zoom_start=12)
url = ('http://hubwaydatachallenge.org/static/img/hubway_logo_green.png')
FloatImage(url, bottom = 0, left = 85).add_to(mapCap5)

for i in range(0,len(Arepa_Time)):
    folium.RegularPolygonMarker(location =[Arepa_Time.iloc[i]['latitude_x'], 
                                           Arepa_Time.iloc[i]['longitude_x']],
                        radius = 8, number_of_sides = 4,
                        popup = Arepa_Time.iloc[i]['start station name'],
                        color = Arepa_Time.iloc[i]['Color_Net'], 
                                fill_color = Arepa_Time.iloc[i]['Color_End']
                       ).add_to(mapCap5)
mapCap5
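
### The same filter could also be written against the numeric columns rather than the color labels. The sketch below keeps stations in the top quartile for both end-ride volume and net change, using a made-up `stations` table.
### The variable names and the use of `>=` at the 75th percentile are assumptions.

```python
import pandas as pd

# Hypothetical station summary (not the real Hubway numbers).
stations = pd.DataFrame({
    'start station name': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
    'RideCount_End': [16, 40, 80, 210, 250, 380, 500, 1900],
    'Net Rides (% over start)': [700.0, -20.0, -11.1, 5.0, -3.8, -5.0, 11.1, -5.0],
})

# 75th-percentile cut-offs for each measure.
end_cut = stations['RideCount_End'].quantile(0.75)
net_cut = stations['Net Rides (% over start)'].quantile(0.75)

# Keep stations in the top quartile on both measures.
busy_and_growing = stations[
    (stations['RideCount_End'] >= end_cut)
    & (stations['Net Rides (% over start)'] >= net_cut)
].reset_index(drop=True)
print(busy_and_growing['start station name'].tolist())
```

### Resetting the index after filtering keeps row positions and row labels aligned, which helps when the filtered table is looped over by position.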

## ---STILL UNDER CONSTRUCTION---
#### Connect a location on a map through the mpl_toolkits library
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from matplotlib import cm
%matplotlib inline

fig = plt.figure(figsize=(14,10))
ax = fig.add_subplot(111)

m=Basemap(projection='merc', lat_0=42.3, lon_0=-71.1,
          resolution = 'i', area_thresh = 0.05, 
          llcrnrlon=-71.2, llcrnrlat=42.3, 
          urcrnrlon=-71.0, urcrnrlat=42.5, )
x, y = m(dfx_list['longitude'].values, dfx_list['latitude'].values)
m.hexbin(x, y, gridsize=1000,
         bins='log', cmap=cm.YlOrRd_r);

#startlat = 42.387151; startlon = -71.075978
#arrlat = 42.365064; arrlon = -71.119233
#mapit = m.drawgreatcircle(startlon,startlat,arrlon,arrlat, linewidth=2, color='orange')
#show(mapit)