# CRQ2
###### Visualize taxi movements! NYC is divided into many taxi zones. For each yellow cab trip we know the zone where the taxi picked up and dropped off its passengers. Let's visualize, on a choropleth map, the number of trips that start in each zone. Then, make another map that counts the trips that end in each zone. Comment on your discoveries. To perform this task we use the library folium.

In [None]:
import pandas as pd
import folium
import json

### Loading Data

**Variables explanation**: <br>
*df* : contains two columns, PULocationID and DOLocationID, which hold the location IDs for the pickup and the drop-off of each trip record. This DataFrame is the core of the next steps. <br>
*geo_data* : the GeoJSON description of the taxi zones, loaded from `taxi_zones.json`.

In [None]:
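# Read the first 10,000 rows of each monthly file, keeping only the two zone columns.
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
df = pd.concat(
    [pd.read_csv("yellow_tripdata_2018-"+month+".csv", nrows=10000, usecols=['PULocationID', 'DOLocationID'])
     for month in ['01', '02', '03', '04', '05']],
    ignore_index=True
)

# Load the taxi zone boundaries; the with-block closes the file afterwards
with open('taxi_zones.json') as geo_file:
    geo_data = json.load(geo_file)
df.head()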

The next chunk creates a simple map centered on the coordinates of New York City. As you can see, it is empty: the following chunks fill it with data.

In [None]:
NYmap=folium.Map(
    location=[40.7142700, -74.0059700],   #coordinates of New York
    zoom_start=11,                        
    tiles='CartoDB positron'              #style of our map
)
NYmap

*This shows the recorded zones*: at this point there are no pickups or drop-offs on the map. <br>
For now we only want to show the zones where data were recorded.

In [None]:
folium.GeoJson(
    geo_data,
    style_function=lambda feature: {
        'fillColor':'blue',
        'color' : 'orchid',
        'weight' : 1,
        'fillOpacity' : 0.5,
        'opacity' : 0.5          #stroke opacity; Leaflet path styles have no 'colorOpacity' option
    }
).add_to(NYmap)

NYmap


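**Core of data manipulation** <br>
We count the pickups and the drop-offs per zone, then build a DataFrame with one row for each zone ID from 1 to 265, using 0 for zones that never appear in the data.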

In [None]:
location_id_to_number_of_pickups=df.groupby('PULocationID')['PULocationID'].count()
location_id_to_number_of_dropoff=df.groupby('DOLocationID')['DOLocationID'].count()

location_id_to_number_of_pickups.head()

df_zone_to_pickup_to_dropoff=pd.DataFrame(index=list(range(1,266)),columns=[])
df_zone_to_pickup_to_dropoff['ZoneID']=list(range(1,266))

zone_index_to_dropoff_number = []
zone_index_to_pickup_number = []

for i in range(1,266):
    if i in location_id_to_number_of_pickups:
        zone_index_to_pickup_number.append(location_id_to_number_of_pickups[i])
    else:
        zone_index_to_pickup_number.append(0)    
        
    if i in location_id_to_number_of_dropoff:
        zone_index_to_dropoff_number.append(location_id_to_number_of_dropoff[i])
    else:
        zone_index_to_dropoff_number.append(0)     #we need this check because some zones are missing from the counts (maybe no taxi was taken in that zone)

df_zone_to_pickup_to_dropoff['taxi_pickups'] = zone_index_to_pickup_number
df_zone_to_pickup_to_dropoff['taxi_dropoff'] = zone_index_to_dropoff_number
df_zone_to_pickup_to_dropoff.head()
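
The loop above can also be written more compactly. The following is only a sketch of the same idea, not a replacement for the cell above: it counts with `value_counts` and fills the missing zones with `reindex(..., fill_value=0)`. The `trips` frame holds made-up toy values, and the names `trips`, `zone_ids` and `zone_counts` are hypothetical; in the notebook, `zone_ids` would be `range(1, 266)`.

```python
import pandas as pd

# Toy trips, for illustration only (not the real data set)
trips = pd.DataFrame({
    'PULocationID': [1, 1, 3, 5, 5, 5],
    'DOLocationID': [3, 3, 3, 2, 1, 5],
})

zone_ids = range(1, 6)  # hypothetical small range; the notebook uses range(1, 266)

# Count the trips per zone, then insert a 0 for every zone that never appears
pickups = trips['PULocationID'].value_counts().reindex(zone_ids, fill_value=0)
dropoffs = trips['DOLocationID'].value_counts().reindex(zone_ids, fill_value=0)

zone_counts = pd.DataFrame({
    'ZoneID': list(zone_ids),
    'taxi_pickups': pickups.to_numpy(),
    'taxi_dropoff': dropoffs.to_numpy(),
})
print(zone_counts)
```

Zones 2 and 4 have no pickups in the toy data, so they get a 0 instead of being dropped, which is what the explicit `if`/`else` check achieves in the loop.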

Now we have a complete data frame that we can use to create our choropleth maps.


In [None]:
NYmap2 = folium.Map(
    location=[40.7142700, -74.0059700],   #coordinates of New York
    zoom_start=11,                        
    tiles='CartoDB positron'              #style of our map
)

#Map.choropleth is deprecated in favor of the folium.Choropleth class, which takes the same arguments
folium.Choropleth(
    geo_data=geo_data,  #our GeoJSON data
    data=df_zone_to_pickup_to_dropoff,    #our DataFrame
    columns=['ZoneID', 'taxi_pickups'],
    key_on='feature.properties.LocationID', #the GeoJSON property that identifies each zone
    fill_color='YlGnBu',   #the color scale that we want
    fill_opacity=0.8,
    line_opacity=0.2,
    legend_name='Number of taxi pickups in 2018',
    highlight=True    #highlight each area when you hover over it
).add_to(NYmap2)

folium.Marker(
    location=[40.7730135746, -73.8702298524],
    popup='LaGuardia Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap2)

folium.Marker(
    location=[40.6413111, -73.7781391],
    popup='John F. Kennedy International Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap2)

folium.Marker(
    location=[40.7828647, -73.9653551],
    popup = 'Central Park',
    icon=folium.Icon(color='red')
).add_to(NYmap2)

folium.Marker(
    location=[40.758895, -73.985131],
    popup = 'Times Square',
    icon=folium.Icon(color='red')
).add_to(NYmap2)

folium.Marker(
    location=[40.7061927, -74.0091604],
    popup = 'Wall Street',
    icon=folium.Icon(color='red')
).add_to(NYmap2)

folium.Marker(
    location=[40.692013, -74.181557],
    popup='Newark Liberty International Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap2)

NYmap2
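
The choropleth only colors a zone when the value found through `key_on` in the GeoJSON matches a value in the first entry of `columns`. The sketch below shows one way to check that match before plotting. It is an illustration under assumptions: `toy_geo` is a made-up stand-in for `geo_data` with the geometry omitted, `resolve_key` is a hypothetical helper that mimics how the dotted path is read (it is not folium's own code), and `data_ids` stands in for the `ZoneID` column.

```python
# Toy stand-in for geo_data: a FeatureCollection with geometry left out for brevity
toy_geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"LocationID": 1}, "geometry": None},
        {"type": "Feature", "properties": {"LocationID": 2}, "geometry": None},
        {"type": "Feature", "properties": {"LocationID": 3}, "geometry": None},
    ],
}


def resolve_key(feature, key_on):
    """Follow a dotted path such as 'feature.properties.LocationID' inside one feature."""
    value = feature
    for part in key_on.split('.')[1:]:  # skip the leading 'feature'
        value = value[part]
    return value


# IDs present in the GeoJSON versus IDs present in the data (toy values)
geo_ids = {resolve_key(f, 'feature.properties.LocationID') for f in toy_geo['features']}
data_ids = {1, 2, 4}

missing_in_data = sorted(geo_ids - data_ids)  # zones drawn without any value
missing_in_geo = sorted(data_ids - geo_ids)   # data rows with no shape to color

print(missing_in_data, missing_in_geo)
```

In the notebook the same check could be run with `geo_data` and `set(df_zone_to_pickup_to_dropoff['ZoneID'])`; a type mismatch (for example strings in one and integers in the other) would show up as every ID being reported as missing.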

Now we do the same for the drop-off locations.

In [None]:
NYmap3 = folium.Map(
    location=[40.7142700, -74.0059700],   #coordinates of New York
    zoom_start=11,                        
    tiles='CartoDB positron'              #style of our map
)

#Map.choropleth is deprecated in favor of the folium.Choropleth class, which takes the same arguments
folium.Choropleth(
    geo_data=geo_data,
    data=df_zone_to_pickup_to_dropoff,
    columns=['ZoneID', 'taxi_dropoff'],
    key_on='feature.properties.LocationID',
    fill_color='YlGnBu',
    fill_opacity=0.8,
    line_opacity=0.2,
    legend_name='Number of taxi drop-offs in 2018',
    highlight=True
).add_to(NYmap3)

folium.Marker(
    location=[40.7730135746, -73.8702298524],
    popup='LaGuardia Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap3)

folium.Marker(
    location=[40.6413111, -73.7781391],
    popup='John F. Kennedy International Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap3)

folium.Marker(
    location=[40.7828647, -73.9653551],
    popup = 'Central Park',
    icon=folium.Icon(color='red')
).add_to(NYmap3)

folium.Marker(
    location=[40.758895, -73.985131],
    popup = 'Times Square',
    icon=folium.Icon(color='red')
).add_to(NYmap3)

folium.Marker(
    location=[40.7061927, -74.0091604],
    popup = 'Wall Street',
    icon=folium.Icon(color='red')
).add_to(NYmap3)

folium.Marker(
    location=[40.692013, -74.181557],
    popup='Newark Liberty International Airport',
    icon=folium.Icon(icon='plane')
).add_to(NYmap3)

NYmap3

## RESULTS

As the maps show, the use of yellow taxis is highly concentrated in Manhattan. This is true for drop-offs and even more so for pickups.
Outside Manhattan, two other zones see heavy taxi use. Predictably, one is John F. Kennedy International Airport and the other is LaGuardia Airport.