<a href="https://colab.research.google.com/github/AvonleaFisher/Analyzing-NYC-311-Service-Requests/blob/main/Exploring_the_Data/Mapbox_Density_Heatmaps.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

In this section, we'll create summaries and visualizations to explore the data.

**Note:** The exploratory visualizations are split across multiple notebooks to reduce file size. Dependencies required across all EDA notebooks are imported below. To view animated plots that do not render on GitHub, paste the notebook's URL into [Jupyter Notebook Viewer](https://nbviewer.jupyter.org/).

# Loading Dependencies

In [1]:
# Standard library
import os
import random
import re
import string
from os import path

# Data handling and plotting
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import seaborn as sns
from PIL import Image
from wordcloud import WordCloud, STOPWORDS

# Text processing
import nltk
from nltk import word_tokenize
from nltk.collocations import *
from nltk.corpus import stopwords
from nltk.probability import FreqDist
from nltk.stem import WordNetLemmatizer

# Google Drive access from Colab
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials

# Display Plotly figures inline
pio.renderers.default = "plotly_mimetype+notebook_connected"

In [2]:
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

In [3]:
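from google.colab import drive  # note: rebinds the name `drive` from the previous cell
drive.mount("/content/drive")
# Named data_path so that it does not shadow the `path` module imported from os
data_path = '/content/drive/MyDrive/Colab Notebooks/community_board_311.csv'
df = pd.read_csv(data_path, index_col=0)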

Mounted at /content/drive




## Animated Heatmaps for Call Volume by Month

To visualize how daily call volume changes over time, we'll use Plotly's [Mapbox Density Heatmap](https://plotly.com/python/mapbox-density-heatmaps). Each call is plotted at its geographic coordinates, and the color at each location reflects how many other points are nearby. In the color scale used for this map (Inferno), areas with a higher density of calls appear yellow, while low-density areas appear purple. The data is subset by month so that a separate animation can be produced for each month. To reduce file size, only a random sample of each month's data is used (10% for August and 50% for the other months), and the code for rendering the animation is active only for August.
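The idea that color tracks the number of nearby points can be illustrated with a small, self-contained sketch. This is not how Plotly computes the heatmap (the real map smooths each point over a pixel radius); it only counts neighbors within a fixed distance. The function name `neighbor_counts`, the 1 km radius, and the coordinates are all assumptions made for illustration.

```python
import math

def neighbor_counts(points, radius_km=1.0):
    """For each (lat, lon) point, count the other points within radius_km."""
    def haversine_km(a, b):
        # Great-circle distance between two (lat, lon) pairs in kilometers
        lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * math.asin(math.sqrt(h))

    return [sum(1 for j, q in enumerate(points)
                if j != i and haversine_km(p, q) <= radius_km)
            for i, p in enumerate(points)]

# Hypothetical coordinates: three calls clustered together and one far away
calls = [(40.850, -73.900), (40.851, -73.901), (40.852, -73.899), (40.580, -74.150)]
counts = neighbor_counts(calls)
print(counts)
```

Points with a higher count correspond to the areas that would be drawn toward the yellow end of the scale, while isolated points stay toward the purple end.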

In [4]:
# Subset the data by month, keeping a random sample of each month
June = df[df.month == 6].sample(frac=0.5)
July = df[df.month == 7].sample(frac=0.5)
August = df[df.month == 8].sample(frac=0.1)
September = df[df.month == 9].sample(frac=0.5)
October = df[df.month == 10].sample(frac=0.5)
November = df[df.month == 11].sample(frac=0.5)
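Because the samples above are drawn without a seed, each run selects different rows, so the maps will differ slightly between runs. A minimal sketch of a reproducible variant follows, assuming a hypothetical helper named `sample_month` and a made-up stand-in DataFrame; only the `month` column name comes from the notebook.

```python
import pandas as pd

def sample_month(df, month, frac, seed=42):
    """Return a random fraction of the rows for one month, reproducibly."""
    return df[df["month"] == month].sample(frac=frac, random_state=seed)

# Hypothetical stand-in for the 311 data: 100 rows in August, 40 in June
demo = pd.DataFrame({"month": [8] * 100 + [6] * 40,
                     "day": list(range(1, 21)) * 7})

august_sample = sample_month(demo, month=8, frac=0.1)
print(len(august_sample))
```

Passing the same `seed` returns the same rows on every run, which makes it easier to compare maps across sessions.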

In [5]:
def plot_calls(month_df, month):
    """Takes in a subsetted dataframe with the data for a given month, and the
       name of the month as a string. Displays an animated mapbox density heatmap
       to show variation in call volume across time and space."""

    # Sort by day so the animation frames play in calendar order
    # (frames follow the order in which each day first appears in the data)
    month_df = month_df.sort_values("day")

    fig = px.density_mapbox(month_df, lat="latitude", lon="longitude",
                            radius=2,
                            animation_frame="day",
                            hover_name=None,
                            hover_data=['complaint_type', 'community_board'],
                            width=550, height=550,
                            color_continuous_scale=px.colors.sequential.Inferno)

    fig.update_layout(mapbox_style="carto-positron", mapbox_zoom=8.5,
                      mapbox_center={"lat": 40.6885, "lon": -73.93211})

    # Hide the color bar
    fig.layout.coloraxis.showscale = False

    # The font must be passed to update_layout to take effect
    fig.update_layout(
        title={'text': 'Call Volume in {}'.format(month),
               'x': 0.5,
               'xanchor': 'center',
               'yanchor': 'top'},
        font=dict(family="silom", size=14, color="#58508d"))

    fig.update_layout(transition={'duration': 10})

    fig.show()

In [6]:
# plot_calls(June, 'June')

In [7]:
# plot_calls(July, 'July')

In [8]:
plot_calls(August, 'August')

In [9]:
# plot_calls(September, 'September')

In [10]:
# plot_calls(October, 'October')

In [11]:
# plot_calls(November, 'November')

We can see that the greatest call density is consistently concentrated in the Bronx and Upper Manhattan. Staten Island, conversely, has a consistently low call density. This is unsurprising, in part because it is the least densely populated borough in New York City. On August 4th and the following couple of days, there is a notable flare-up in calls throughout the entire city. In the next section of the analysis, we will explore this further with area plots.
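A visual impression such as the early-August flare-up can be cross-checked by counting calls per day. The sketch below assumes a hypothetical helper named `daily_call_counts` and uses made-up rows rather than the 311 data; only the `month` and `day` column names come from the notebook. On the real data it would make sense to run this on the full `df` rather than on a sampled subset.

```python
import pandas as pd

def daily_call_counts(df, month):
    """Count rows (calls) per day for one month, ordered by day."""
    return df[df["month"] == month].groupby("day").size().sort_index()

# Hypothetical data in which day 4 has more calls than any other day
demo = pd.DataFrame({"month": [8] * 9,
                     "day": [1, 2, 3, 4, 4, 4, 4, 5, 5]})

counts = daily_call_counts(demo, 8)
busiest_day = counts.idxmax()
print(counts)
print("Busiest day:", busiest_day)
```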

# References

[Mapbox Density Heatmap in Python](https://plotly.com/python/mapbox-density-heatmaps/)
