# Week 3 / Day 4 / Mapping Data with Folium

![](images/folium.png)

Below, we will go through a brief introduction to the **Folium** library, a convenient way to build interactive map visualizations.  We will run these examples in Jupyter notebooks, but the maps are also easily saved as `.html` files ready to be served.  To begin, let's make sure we have folium installed.
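One hedged way to check the installation before running the cells is to ask Python whether the package can be found. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name introduced here for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be imported in this environment."""
    return importlib.util.find_spec(package_name) is not None


# Report a missing dependency instead of failing later at `import folium`.
if not is_installed("folium"):
    print("folium is missing; install it with: pip install folium")
```

If the message is printed, installing the package and restarting the notebook kernel should make the import below succeed.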

In [2]:
import folium
import pandas as pd

### We can make basic maps centered at any geolocation.  For example, below we create a basic map of Los Angeles, CA.  

In [3]:
m = folium.Map(location=[34.052235, -118.243683], zoom_start = 12)

In [4]:
m

In [4]:
m.save('index_map.html')

### We can add arguments that include changing the style of the map and the initial zoom level.
[Map style templates](https://deparkes.co.uk/2016/06/10/folium-map-tiles/)

In [5]:
folium.Map(
    location=[34.020012, -117.949509],
    tiles='Stamen Toner',
    zoom_start=13
)

### We can use the `popup` argument to attach information that is displayed when a marker is clicked.

In [6]:
tooltip = 'Click me!'
m = folium.Map(
    location=[45.372, -121.6972],
    zoom_start=12,
    tiles='Stamen Terrain'
)



folium.Marker([45.3288, -121.6625], popup='<i>Mt. Hood Meadows</i>', tooltip=tooltip).add_to(m)
folium.Marker([45.3311, -121.7113], popup='<b>Timberline Lodge</b>', tooltip=tooltip).add_to(m)
m

### We can even include HTML markup in popups (as above) and customize marker icons.

In [7]:
m = folium.Map(
    location=[45.372, -121.6972],
    zoom_start=12,
    tiles='Stamen Terrain'
)

folium.Marker(
    location=[45.3288, -121.6625],
    popup='Mt. Hood Meadows',
    icon=folium.Icon(icon='cloud')
).add_to(m)

folium.Marker(
    location=[45.3311, -121.7113],
    popup='Timberline Lodge',
    icon=folium.Icon(color='green')
).add_to(m)

folium.Marker(
    location=[45.3300, -121.6823],
    popup='Some Other Location',
    icon=folium.Icon(color='red', icon='info-sign')
).add_to(m)

<folium.map.Marker at 0x110a10668>

In [8]:
m

### We can manually control radii for markers of interest.  Below, we plot two circles at specific locations: `Circle` takes its radius in meters, while `CircleMarker` takes its radius in screen pixels.

In [9]:
m = folium.Map(
    location=[45.5236, -122.6750],
    tiles='Stamen Toner',
    zoom_start=13
)

folium.Circle(
    radius=100,
    location=[45.5244, -122.6699],
    popup='The Waterfront',
    color='crimson',
    fill=False,
).add_to(m)

folium.CircleMarker(
    location=[45.5215, -122.6261],
    radius=50,
    popup='Laurelhurst Park',
    color='#3186cc',
    fill=True,
    fill_color='#3186cc'
).add_to(m)


<folium.vector_layers.CircleMarker at 0x1109f6240>

In [10]:
m
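The two circles above are not sized on the same scale: a `CircleMarker` keeps a fixed size in pixels as you zoom, whereas a `Circle` covers a fixed distance on the ground. As a rough sketch, assuming the standard Web Mercator projection with 256-pixel tiles, a pixel radius can be converted to an approximate ground distance. The helper names below are hypothetical and not part of Folium.

```python
import math

# Ground resolution at the equator for zoom level 0 with 256-pixel tiles
# (Earth's equatorial circumference divided by 256).
EQUATOR_METERS_PER_PIXEL = 156543.03392


def meters_per_pixel(latitude_deg, zoom):
    """Approximate ground distance covered by one screen pixel."""
    return EQUATOR_METERS_PER_PIXEL * math.cos(math.radians(latitude_deg)) / (2 ** zoom)


def pixels_to_meters(radius_px, latitude_deg, zoom):
    """Approximate ground radius of a marker drawn with a pixel radius."""
    return radius_px * meters_per_pixel(latitude_deg, zoom)


# The 50-pixel marker above, at its latitude and the map's starting zoom of 13.
approx_radius_m = pixels_to_meters(50, 45.5215, 13)
```

Under these assumptions the 50-pixel marker spans several hundred meters at zoom 13, much more than the 100-meter `Circle`, and the gap changes with every zoom step.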

---
## Problem

In [5]:
gardens = pd.read_json('https://data.cityofnewyork.us/resource/yes4-7zbb.json')

In [6]:
gardens.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 536 entries, 0 to 535
Data columns (total 17 columns):
address             535 non-null object
bbl                 431 non-null object
bin                 431 non-null object
boro                536 non-null object
census_tract        429 non-null float64
community_board     536 non-null object
council_district    495 non-null float64
cross_streets       464 non-null object
garden_name         536 non-null object
jurisdiction        536 non-null object
latitude            429 non-null float64
longitude           429 non-null float64
neighborhoodname    323 non-null object
nta                 431 non-null object
postcode            431 non-null object
propid              536 non-null object
size                536 non-null object
dtypes: float64(4), object(13)
memory usage: 71.3+ KB


In [7]:
garden_map = gardens.dropna(subset=['latitude', 'longitude'])

In [8]:
garden_map.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 429 entries, 0 to 535
Data columns (total 17 columns):
address             429 non-null object
bbl                 429 non-null object
bin                 429 non-null object
boro                429 non-null object
census_tract        429 non-null float64
community_board     429 non-null object
council_district    403 non-null float64
cross_streets       368 non-null object
garden_name         429 non-null object
jurisdiction        429 non-null object
latitude            429 non-null float64
longitude           429 non-null float64
neighborhoodname    265 non-null object
nta                 429 non-null object
postcode            429 non-null object
propid              429 non-null object
size                429 non-null object
dtypes: float64(4), object(13)
memory usage: 60.3+ KB


In [9]:
m = folium.Map(location=[garden_map.latitude.iloc[0], garden_map.longitude.iloc[0]], tiles='Stamen Toner')
m

In [10]:
for index, row in garden_map.iterrows():
    folium.CircleMarker(location=(row["latitude"],
                                  row["longitude"]),
                        radius = 1.0,
                        color='blue',
                        fill=True).add_to(m)

In [11]:
m
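Centering the map on the first garden works, but it does not guarantee that every point is in view. A minimal sketch of an alternative is to compute a bounding box from the coordinates. `bounding_box` is a hypothetical helper and the sample coordinates are made up; the `[[south, west], [north, east]]` shape is the one Folium's `Map.fit_bounds` accepts.

```python
def bounding_box(points):
    """Return [[south, west], [north, east]] for an iterable of (lat, lon) pairs."""
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    lats = [lat for lat, lon in points]
    lons = [lon for lat, lon in points]
    return [[min(lats), min(lons)], [max(lats), max(lons)]]


# Illustrative coordinates only (not taken from the gardens dataset).
sample = [(40.70, -74.01), (40.85, -73.90), (40.65, -73.95)]
box = bounding_box(sample)
# With folium available, the view could then be fitted with: m.fit_bounds(box)
```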

- What information should be included about the gardens? ***Garden name and address***
- Can you alter the loop above so that this information appears when you hover over or click a point?
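Popups are rendered as HTML, so text taken from a dataset is safer to display if it is escaped first. The sketch below shows one possible way to assemble the popup string. `popup_html` is a hypothetical helper and the address value is a placeholder, not a real record.

```python
from html import escape


def popup_html(title, **fields):
    """Build a small HTML snippet: a bold title followed by 'label: value' lines."""
    parts = ["<b>{}</b>".format(escape(str(title)))]
    for label, value in fields.items():
        parts.append("{}: {}".format(escape(label), escape(str(value))))
    return "<br>".join(parts)


# Characters such as & and < are escaped so they show up literally in the popup.
example = popup_html("6th Street & Avenue B Garden", address="<unknown>")
```

The resulting string could be passed as the `popup` argument of a marker inside the loop, built from that row's `garden_name` and `address` values.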

In [107]:
garden_map = gardens.dropna(subset=['latitude', 'longitude', 'garden_name', 'address'])

In [109]:
# Index the row by column name (a string), not by a column of the full DataFrame.
popup_text = """{}<br>
address: {}"""

popup_text = popup_text.format(row["garden_name"], row["address"])

In [110]:
popup_text = row["garden_name"]

In [113]:
popup_text

In [115]:
for index, row in garden_map.iterrows():
    folium.CircleMarker(
        location=(row["latitude"], row["longitude"]),
        radius = 1.0,
        color='blue',
        fill=True,
        # Index the row by column name; row[gardens.garden_name] passes a whole Series.
        popup="{}<br>address: {}".format(row["garden_name"], row["address"]),
        tooltip=row["garden_name"]).add_to(m)

In [94]:
m

### Mapping Bike Data

Now, we will use a dataset from NYC's Citi Bike program.  Our goal is to compare incoming and outgoing traffic at each station at a given time of day.
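One way to make this comparison concrete is to count, for a single hour, how many trips start and end at each station and take the difference. The sketch below does this in plain Python on a few made-up trip records. The `trips` list and the `net_departures_by_station` helper are illustrative assumptions, not part of the Citi Bike data or of pandas.

```python
from collections import Counter
from datetime import datetime

# Made-up trips: (start station id, start time, end station id, stop time).
trips = [
    (444, "2013-06-01 10:00:01", 434, "2013-06-01 10:11:36"),
    (444, "2013-06-01 10:05:08", 434, "2013-06-01 11:02:41"),
    (434, "2013-06-01 10:20:00", 444, "2013-06-01 10:31:00"),
    (406, "2013-06-01 09:59:00", 406, "2013-06-01 10:15:00"),
]


def hour_of(timestamp):
    """Extract the hour (0-23) from a 'YYYY-MM-DD HH:MM:SS' string."""
    return datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour


def net_departures_by_station(trips, hour):
    """Departures minus arrivals per station for trips starting/ending in `hour`."""
    departures = Counter(start for start, t0, end, t1 in trips if hour_of(t0) == hour)
    arrivals = Counter(end for start, t0, end, t1 in trips if hour_of(t1) == hour)
    stations = set(departures) | set(arrivals)
    # A Counter returns 0 for stations that appear on only one side.
    return {station: departures[station] - arrivals[station] for station in stations}


net_at_10 = net_departures_by_station(trips, 10)
```

A positive value means more bikes left the station than arrived during that hour; a negative value means the opposite.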

In [12]:
folium_map = folium.Map(location=[40.738, -73.98],
                        zoom_start=13,
                        tiles="CartoDB dark_matter")
marker = folium.CircleMarker(location=[40.738, -73.98])
marker.add_to(folium_map)

<folium.vector_layers.CircleMarker at 0x10a8237f0>

In [13]:
folium_map

In [14]:
bikes = pd.read_csv('data/201306-citibike-tripdata.csv')

In [15]:
bikes.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 577703 entries, 0 to 577702
Data columns (total 15 columns):
tripduration               577703 non-null int64
starttime                  577703 non-null object
stoptime                   577703 non-null object
start station id           577703 non-null int64
start station name         577703 non-null object
start station latitude     577703 non-null float64
start station longitude    577703 non-null float64
end station id             559644 non-null float64
end station name           559644 non-null object
end station latitude       559644 non-null float64
end station longitude      559644 non-null float64
bikeid                     577703 non-null int64
usertype                   577703 non-null object
birth year                 337382 non-null float64
gender                     577703 non-null int64
dtypes: float64(6), int64(4), object(5)
memory usage: 66.1+ MB


In [16]:
bikes['starttime'] = pd.to_datetime(bikes['starttime'])
bikes['stoptime'] = pd.to_datetime(bikes['stoptime'])
bikes['hour'] = bikes['starttime'].dt.hour    # hour in which the trip started
bikes['ehour'] = bikes['stoptime'].dt.hour    # hour in which the trip ended

In [17]:
bikes.head()

| | tripduration | starttime | stoptime | start station id | start station name | start station latitude | start station longitude | end station id | end station name | end station latitude | end station longitude | bikeid | usertype | birth year | gender | hour | ehour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 695 | 2013-06-01 00:00:01 | 2013-06-01 00:11:36 | 444 | Broadway & W 24 St | 40.742354 | -73.989151 | 434.0 | 9 Ave & W 18 St | 40.743174 | -74.003664 | 19678 | Subscriber | 1983.0 | 1 | 0 | 0 |
| 1 | 693 | 2013-06-01 00:00:08 | 2013-06-01 00:11:41 | 444 | Broadway & W 24 St | 40.742354 | -73.989151 | 434.0 | 9 Ave & W 18 St | 40.743174 | -74.003664 | 16649 | Subscriber | 1984.0 | 1 | 0 | 0 |
| 2 | 2059 | 2013-06-01 00:00:44 | 2013-06-01 00:35:03 | 406 | Hicks St & Montague St | 40.695128 | -73.995951 | 406.0 | Hicks St & Montague St | 40.695128 | -73.995951 | 19599 | Customer | | 0 | 0 | 0 |
| 3 | 123 | 2013-06-01 00:01:04 | 2013-06-01 00:03:07 | 475 | E 15 St & Irving Pl | 40.735243 | -73.987586 | 262.0 | Washington Park | 40.691782 | -73.97373 | 16352 | Subscriber | 1960.0 | 1 | 0 | 0 |
| 4 | 1521 | 2013-06-01 00:01:22 | 2013-06-01 00:26:43 | 2008 | Little West St & 1 Pl | 40.705693 | -74.016777 | 310.0 | State St & Smith St | 40.689269 | -73.989129 | 15567 | Subscriber | 1983.0 | 1 | 0 | 0 |


In [18]:
locations = bikes.groupby('start station id').first()

In [19]:
locations = locations.loc[:, ["start station latitude", "start station longitude", "start station name"]]

In [20]:
locations

| start station id | start station latitude | start station longitude | start station name |
|---|---|---|---|
| 72 | 40.767272 | -73.993929 | W 52 St & 11 Ave |
| 79 | 40.719116 | -74.006667 | Franklin St & W Broadway |
| 82 | 40.711174 | -74.000165 | St James Pl & Pearl St |
| 83 | 40.683826 | -73.976323 | Atlantic Ave & Fort Greene Pl |
| 116 | 40.741776 | -74.001497 | W 17 St & 8 Ave |
| 119 | 40.696089 | -73.978034 | Park Ave & St Edwards St |
| 120 | 40.686768 | -73.959282 | Lexington Ave & Classon Ave |
| 127 | 40.731724 | -74.006744 | Barrow St & Hudson St |
| 128 | 40.727103 | -74.002971 | MacDougal St & Prince St |
| 137 | 40.761628 | -73.972924 | E 56 St & Madison Ave |


In [21]:
subset = bikes[bikes["hour"]==10]

In [22]:
subset

| | tripduration | starttime | stoptime | start station id | start station name | start station latitude | start station longitude | end station id | end station name | end station latitude | end station longitude | bikeid | usertype | birth year | gender | hour | ehour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 980 | 4447 | 2013-06-01 10:00:06 | 2013-06-01 11:14:13 | 457 | Broadway & W 58 St | 40.766953 | -73.981693 | 457.0 | Broadway & W 58 St | 40.766953 | -73.981693 | 16813 | Customer | | 0 | 10 | 11 |
| 981 | 481 | 2013-06-01 10:00:08 | 2013-06-01 10:08:09 | 528 | 2 Ave & E 31 St | 40.742909 | -73.977061 | 474.0 | 5 Ave & E 29 St | 40.745168 | -73.986831 | 17543 | Subscriber | 1976.0 | 1 | 10 | 10 |
| 982 | 1498 | 2013-06-01 10:00:05 | 2013-06-01 10:25:03 | 348 | W Broadway & Spring St | 40.724910 | -74.001547 | | | | | 16949 | Subscriber | 1977.0 | 1 | 10 | 10 |
| 983 | 488 | 2013-06-01 10:00:44 | 2013-06-01 10:08:52 | 317 | E 6 St & Avenue B | 40.724537 | -73.981854 | 545.0 | E 23 St & 1 Ave | 40.736502 | -73.978095 | 16164 | Subscriber | 1985.0 | 1 | 10 | 10 |
| 984 | 1394 | 2013-06-01 10:00:43 | 2013-06-01 10:23:57 | 423 | W 54 St & 9 Ave | 40.765849 | -73.986905 | 423.0 | W 54 St & 9 Ave | 40.765849 | -73.986905 | 16177 | Subscriber | 1948.0 | 1 | 10 | 10 |
| 985 | 1138 | 2013-06-01 10:00:45 | 2013-06-01 10:19:43 | 435 | W 21 St & 6 Ave | 40.741740 | -73.994156 | 151.0 | Cleveland Pl & Spring St | 40.722104 | -73.997249 | 16443 | Customer | | 0 | 10 | 10 |
| 986 | 189 | 2013-06-01 10:01:09 | 2013-06-01 10:04:18 | 293 | Lafayette St & E 8 St | 40.730207 | -73.991026 | 495.0 | W 47 St & 10 Ave | 40.762699 | -73.993012 | 15143 | Subscriber | 1961.0 | 1 | 10 | 10 |
| 987 | 2682 | 2013-06-01 10:01:11 | 2013-06-01 10:45:53 | 396 | Lefferts Pl & Franklin Ave | 40.680342 | -73.955769 | 437.0 | Macon St & Nostrand Ave | 40.680983 | -73.950048 | 15750 | Subscriber | 1981.0 | 1 | 10 | 10 |
| 988 | 921 | 2013-06-01 10:01:20 | 2013-06-01 10:16:41 | 319 | Fulton St & Broadway | 40.711066 | -74.009447 | 337.0 | Old Slip & Front St | 40.703799 | -74.008387 | 15257 | Subscriber | 1985.0 | 1 | 10 | 10 |
| 989 | 412 | 2013-06-01 10:01:28 | 2013-06-01 10:08:20 | 212 | W 16 St & The High Line | 40.743349 | -74.006818 | | | | | 18099 | Subscriber | 1968.0 | 1 | 10 | 10 |


In [23]:
dept_counts = subset.groupby("start station id").count()

In [24]:
dept_counts = dept_counts.iloc[:, [0]]

In [25]:
dept_counts.columns = ["Departure Counts"]

In [26]:
dept_counts

| start station id | Departure Counts |
|---|---|
| 72 | 80 |
| 79 | 117 |
| 82 | 25 |
| 83 | 44 |
| 116 | 115 |
| 119 | 7 |
| 120 | 20 |
| 127 | 178 |
| 128 | 73 |
| 137 | 9 |


### Problem

Repeat the above for arrivals, in anticipation of joining the two for our map.

In [27]:
bikes['hour'] = bikes['starttime'].map(lambda x: x.hour)

locations = locations.loc[:, ["start station latitude", "start station longitude", "start station name"]]

# Departures: trips that started during the 18:00 hour, counted per start station.
subset = bikes[bikes["hour"]==18]
dept_counts = subset.groupby("start station id").count()
dept_counts = dept_counts.iloc[:, [0]]
dept_counts.columns = ["Departure Counts"]

# Arrivals: trips that ended during the 18:00 hour, counted per end station.
locations2 = bikes.groupby('end station id').first()
locations2 = locations2.loc[:, ["end station latitude", "end station longitude", "end station name"]]
subset = bikes[bikes["ehour"]==18]
arr_counts = subset.groupby("end station id").count()
arr_counts = arr_counts.iloc[:, [0]]
arr_counts.columns = ["Arrival Counts"]

In [28]:
trip_counts = dept_counts.join(locations).join(arr_counts)

In [29]:
trip_counts.head()

| start station id | Departure Counts | start station latitude | start station longitude | start station name | Arrival Counts |
|---|---|---|---|---|---|
| 72 | 187 | 40.767272 | -73.993929 | W 52 St & 11 Ave | 198 |
| 79 | 403 | 40.719116 | -74.006667 | Franklin St & W Broadway | 342 |
| 82 | 76 | 40.711174 | -74.000165 | St James Pl & Pearl St | 74 |
| 83 | 111 | 40.683826 | -73.976323 | Atlantic Ave & Fort Greene Pl | 142 |
| 116 | 215 | 40.741776 | -74.001497 | W 17 St & 8 Ave | 213 |


In [30]:
for index, row in trip_counts.iterrows():
    
    net_departures = (row["Departure Counts"]-row["Arrival Counts"])
    
    # Size by the magnitude of the imbalance; the color below carries the sign.
    radius = abs(net_departures)/7
    
    if net_departures>0:
        color="#E37222" # tangerine
    else:
        color="#0A8A9F" # teal
    
    folium.CircleMarker(location=(row["start station latitude"],
                                  row["start station longitude"]),
                        radius=radius,
                        color=color,
                        fill=True).add_to(folium_map)

In [31]:
folium_map
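The sizing and coloring rule used in the loop above can also be factored into a small function, which makes it easy to check on its own. This is a sketch under the assumption that the radius should reflect the size of the imbalance regardless of its direction. `marker_style` and the `scale` parameter are hypothetical names; the two colors are the ones used above.

```python
TANGERINE = "#E37222"  # more departures than arrivals
TEAL = "#0A8A9F"       # arrivals match or exceed departures


def marker_style(departures, arrivals, scale=7):
    """Return (radius, color) for a station from its departure and arrival counts."""
    net = departures - arrivals
    radius = abs(net) / scale  # magnitude only; direction is shown by the color
    color = TANGERINE if net > 0 else TEAL
    return radius, color


# Counts for two stations taken from the trip_counts table above.
style_79 = marker_style(403, 342)
style_72 = marker_style(187, 198)
```

Inside the loop, the two returned values could be passed as the `radius` and `color` arguments of `folium.CircleMarker`.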

In [32]:



popup_text = """{}<br>
                total departures: {}<br> 
                total arrivals: {}<br>
                net departures: {}"""
 
popup_text = popup_text.format(row["start station name"],
                               row["Departure Counts"],
                               row["Arrival Counts"],
                               net_departures)

In [33]:
for index, row in trip_counts.iterrows():
   
    popup_text = row["start station name"]

In [34]:
popup_text

'NYCBS Test'

In [35]:
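for index, row in trip_counts.iterrows():
    net_departures = (row["Departure Counts"]-row["Arrival Counts"])
    radius = abs(net_departures)/7  # size by magnitude; color carries the sign
    if net_departures>0:
        color="#E37222" # tangerine
    else:
        color="#0A8A9F" # teal
    
    # Build the popup inside the loop so each marker describes its own station.
    popup_text = """{}<br>
                total departures: {}<br> 
                total arrivals: {}<br>
                net departures: {}""".format(row["start station name"],
                                             row["Departure Counts"],
                                             row["Arrival Counts"],
                                             net_departures)
    
    folium.CircleMarker(location=(row["start station latitude"],
                                  row["start station longitude"]),
                        radius=radius,
                        color=color,
                        fill=True, popup=popup_text).add_to(folium_map)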

In [36]:
folium_map

### PROBLEM

Compare this map to the one for the hour when people are leaving work.  Do you see what you expect?  What does this tell you about movement in the city?

In [72]:
subset2 = bikes[bikes["hour"]==18]
subset2

| | tripduration | starttime | stoptime | start station id | start station name | start station latitude | start station longitude | end station id | end station name | end station latitude | end station longitude | bikeid | usertype | birth year | gender | hour | ehour |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5848 | 1140 | 2013-06-01 18:00:19 | 2013-06-01 18:19:19 | 438 | St Marks Pl & 1 Ave | 40.727791 | -73.985649 | 433.0 | E 13 St & Avenue A | 40.729554 | -73.980572 | 18294 | Customer | | 0 | 18 | 18 |
| 5850 | 3631 | 2013-06-01 18:00:41 | 2013-06-01 19:01:12 | 500 | Broadway & W 51 St | 40.762288 | -73.983362 | 500.0 | Broadway & W 51 St | 40.762288 | -73.983362 | 19283 | Customer | | 0 | 18 | 19 |
| 5851 | 1105 | 2013-06-01 18:00:54 | 2013-06-01 18:19:19 | 438 | St Marks Pl & 1 Ave | 40.727791 | -73.985649 | 342.0 | Columbia St & Rivington St | 40.717400 | -73.980166 | 15511 | Subscriber | 1968.0 | 1 | 18 | 18 |
| 5852 | 497 | 2013-06-01 18:00:42 | 2013-06-01 18:08:59 | 497 | E 17 St & Broadway | 40.737050 | -73.990093 | 380.0 | W 4 St & 7 Ave S | 40.734011 | -74.002939 | 19436 | Subscriber | 1984.0 | 1 | 18 | 18 |
| 5853 | 498 | 2013-06-01 18:01:00 | 2013-06-01 18:09:18 | 301 | E 2 St & Avenue B | 40.722174 | -73.983688 | 310.0 | State St & Smith St | 40.689269 | -73.989129 | 14909 | Subscriber | 1964.0 | 1 | 18 | 18 |
| 5854 | 436 | 2013-06-01 18:01:14 | 2013-06-01 18:08:30 | 216 | Columbia Heights & Cranberry St | 40.700379 | -73.995481 | 233.0 | Cadman Plaza W & Pierrepont St | 40.694757 | -73.990527 | 15732 | Subscriber | 1965.0 | 2 | 18 | 18 |
| 5855 | 200 | 2013-06-01 18:01:17 | 2013-06-01 18:04:37 | 444 | Broadway & W 24 St | 40.742354 | -73.989151 | 402.0 | Broadway & E 22 St | 40.740343 | -73.989551 | 20232 | Customer | | 0 | 18 | 18 |
| 5856 | 1534 | 2013-06-01 18:01:26 | 2013-06-01 18:27:00 | 499 | Broadway & W 60 St | 40.769155 | -73.981918 | 353.0 | S Portland Ave & Hanson Pl | 40.685396 | -73.974315 | 19959 | Subscriber | 1967.0 | 1 | 18 | 18 |
| 5857 | 287 | 2013-06-01 18:01:26 | 2013-06-01 18:06:13 | 482 | W 15 St & 7 Ave | 40.739355 | -73.999318 | 453.0 | W 22 St & 8 Ave | 40.744751 | -73.999154 | 17518 | Customer | | 0 | 18 | 18 |
| 5858 | 397 | 2013-06-01 18:01:45 | 2013-06-01 18:08:22 | 216 | Columbia Heights & Cranberry St | 40.700379 | -73.995481 | 233.0 | Cadman Plaza W & Pierrepont St | 40.694757 | -73.990527 | 15091 | Subscriber | 1961.0 | 1 | 18 | 18 |


In [95]:
dept_counts2 = subset2.groupby("start station id").count()

In [96]:
dept_counts2 = dept_counts2.iloc[:, [0]]

In [97]:
dept_counts2.columns = ["Departure Counts"]

In [None]:
# Arrivals for the same hour as the departures above: trips that ended during the 18:00 hour.
locations2 = bikes.groupby('end station id').first()
locations2 = locations2.loc[:, ["end station latitude", "end station longitude", "end station name"]]
subset = bikes[bikes["ehour"]==18]
arr_counts = subset.groupby("end station id").count()
arr_counts = arr_counts.iloc[:, [0]]
arr_counts.columns = ["Arrival Counts"]