# NYC 311 Noise Complaint Analysis

**Author:** Olivia Mohning  
**Date:** July 2025  
**Dataset:** `311_noise_complaints_sample_small.csv`  

Exploring patterns and forecasting trends in noise complaint data from NYC Open Data. This project includes data cleaning, exploratory data analysis (EDA), and time-series modeling using Python, SQL, and other core data science tools.

## Table of Contents

1. [Imports and Setup](#Imports-and-Setup)
2. [Data Preview & Basic Structure](#Data-Preview-&-Basic-Structure)
3. [Filtering to Noise Complaints, Basic Cleaning](#Filtering-to-Noise-Complaints,-Basic-Cleaning)
4. [Time-Based Trends](#Time-Based-Trends) *(coming soon)*
5. [Forecasting and Modeling](#Forecasting-and-Modeling) *(coming soon)*
6. [Conclusions & Next Steps](#Conclusions-&-Next-Steps) *(coming soon)*

## Imports and Setup

Load core libraries for data analysis.

In [1]:
import pandas as pd

## Data Preview & Basic Structure

Load the dataset, preview its dimensions, and inspect the columns to get a sense of the data.

In [2]:
# Load a random sample of 20,000 rows from the NYC 311 Service Requests public dataset (all complaint types; filtered to noise below)
df = pd.read_csv("311_noise_complaints_sample_small.csv")

# Preview dataset size
print(f"\nDataset shape: {df.shape}")


Dataset shape: (20000, 41)


In [3]:
# Column names, data types, and non-null counts
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 41 columns):
 #   Column                          Non-Null Count  Dtype  
---  ------                          --------------  -----  
 0   unique_key                      20000 non-null  int64  
 1   created_date                    20000 non-null  object 
 2   closed_date                     16668 non-null  object 
 3   agency                          20000 non-null  object 
 4   agency_name                     20000 non-null  object 
 5   complaint_type                  20000 non-null  object 
 6   descriptor                      19461 non-null  object 
 7   location_type                   17596 non-null  object 
 8   incident_zip                    19825 non-null  float64
 9   incident_address                19312 non-null  object 
 10  street_name                     19310 non-null  object 
 11  cross_street_1                  17116 non-null  object 
 12  cross_street_2                  

Below is a reference key describing the columns, adapted from the [NYC Open Data documentation](https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9/data_dictionary).

- **unique_key**: Unique identifier for each service request  
- **created_date**: Date and time the complaint was created  
- **closed_date**: Date and time the complaint was closed (if closed)  
- **agency**: Code for the agency handling the complaint  
- **agency_name**: Full name of the agency  
- **complaint_type**: General category of the complaint (e.g., "Noise")  
- **descriptor**: More detailed sub-category of the complaint  
- **location_type**: Type of location (e.g., residential building, street, etc.)  
- **incident_zip**: ZIP code where the incident occurred  
- **incident_address**: Street address where the complaint occurred  
- **street_name**: Street name only  
- **cross_street_1** / **cross_street_2**: Intersecting streets near the incident  
- **intersection_street_1** / **intersection_street_2**: Alternate fields for intersection location  
- **address_type**: How the address was provided (e.g., exact, intersection, etc.)  
- **city**: Name of the city the incident occurred in  
- **landmark**: Noted nearby landmark (if applicable)  
- **facility_type**: Type of facility involved (very sparse)  
- **status**: Current status of the complaint (e.g., “Closed”, “Open”)  
- **due_date**: When the agency aimed to resolve the issue by (rarely filled)  
- **resolution_description**: Description of the resolution or response  
- **resolution_action_updated_date**: Timestamp of last resolution update  
- **community_board**: Community board jurisdiction  
- **bbl**: Borough-Block-Lot code (for NYC land lots)  
- **borough**: NYC borough (Manhattan, Bronx, etc.)  
- **x_coordinate_state_plane** / **y_coordinate_state_plane**: NYC-specific coordinates  
- **open_data_channel_type**: How the complaint was submitted (phone, app, etc.)  
- **park_facility_name**: Park facility name (if applicable)  
- **park_borough**: Borough assigned to the park  
- **vehicle_type**: Type of vehicle involved (sparse)  
- **taxi_company_borough**: Borough of taxi company (rare)  
- **taxi_pick_up_location**: Taxi pick-up area (rare)  
- **bridge_highway_name**, **bridge_highway_direction**, **road_ramp**, **bridge_highway_segment**: Location data for complaints on highways/bridges (rare)  
- **latitude** / **longitude**: Geographic coordinates  
- **location**: Combined lat/long string
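
Several columns in the key above are described as sparse or rarely filled. One way to put a number on that is the fraction of missing values per column. The following is a minimal sketch on a small made-up frame; the `null_share` helper and the toy values are illustrative assumptions, not part of the original notebook. On the real data it would be called on `df`.

```python
import pandas as pd

def null_share(frame: pd.DataFrame) -> pd.Series:
    """Return the fraction of missing values per column, most sparse first."""
    return frame.isna().mean().sort_values(ascending=False)

# Toy stand-in for the real dataset (values are made up for illustration)
toy = pd.DataFrame({
    "unique_key": [1, 2, 3, 4],
    "vehicle_type": [None, None, None, "SUV"],
    "closed_date": ["2025-06-12", None, "2025-06-14", "2025-06-15"],
})

shares = null_share(toy)
print(shares)
```

Columns with a share close to 1.0 are likely candidates for dropping.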


## Filtering to Noise Complaints, Basic Cleaning
Narrow the dataset to noise complaints only, and drop columns that become entirely empty after filtering.

In [4]:
# Narrowing the dataset to only noise-related reports
noise_df = df[df['complaint_type'].str.contains("Noise", na=False)].copy()

# Dropping columns where every value is now null after filtering for just noise complaints
noise_df = noise_df.dropna(axis=1, how='all')

# Display the remaining column count and each column's dtype
print(f"Remaining columns: {noise_df.shape[1]}\n")
display(noise_df.dtypes)

Remaining columns: 33



unique_key                          int64
created_date                       object
closed_date                        object
agency                             object
agency_name                        object
complaint_type                     object
descriptor                         object
location_type                      object
incident_zip                      float64
incident_address                   object
street_name                        object
cross_street_1                     object
cross_street_2                     object
intersection_street_1              object
intersection_street_2              object
address_type                       object
city                               object
landmark                           object
status                             object
resolution_description             object
resolution_action_updated_date     object
community_board                    object
bbl                               float64
borough                           
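
The column count drops from 41 to 33 after filtering, but the output does not say which columns were removed. A small sketch of how the dropped columns could be listed, using a made-up three-row frame (the toy values and the `noise_toy` and `dropped` names are assumptions for illustration):

```python
import pandas as pd

# Toy frame: 'taxi_company_borough' is only filled for a non-noise row (made-up values)
toy = pd.DataFrame({
    "complaint_type": ["Noise - Residential", "Taxi Complaint", "Noise - Park"],
    "borough": ["BRONX", "QUEENS", "BROOKLYN"],
    "taxi_company_borough": [None, "QUEENS", None],
})

# Same two steps as the cell above: filter to noise, then drop all-null columns
noise_toy = toy[toy["complaint_type"].str.contains("Noise", na=False)].copy()
noise_toy = noise_toy.dropna(axis=1, how="all")

# Columns present before filtering but gone afterwards
dropped = [c for c in toy.columns if c not in noise_toy.columns]
print(dropped)
```

On the real data, the same comparison between `df.columns` and `noise_df.columns` would name the eight removed columns.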

In [18]:
# Checking for redundant entries
dupes = noise_df.duplicated().sum()
print(f"Duplicate rows: {dupes}\n")

# Exploring complaint type counts
print(noise_df['complaint_type'].value_counts().head(10))

Duplicate rows: 0

complaint_type
Noise - Residential         2467
Noise - Street/Sidewalk     1709
Noise - Commercial           493
Noise - Vehicle              349
Noise                        326
Noise - Park                 109
Noise - Helicopter            85
Noise - House of Worship      12
Name: count, dtype: int64
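
Raw counts are easier to compare as shares of the whole. The sketch below rebuilds the counts from the output above as a plain `pd.Series` (so it runs without the CSV) and converts them to percentages. The eight counts sum to 5,550 rows.

```python
import pandas as pd

# Counts copied from the value_counts() output above
counts = pd.Series({
    "Noise - Residential": 2467,
    "Noise - Street/Sidewalk": 1709,
    "Noise - Commercial": 493,
    "Noise - Vehicle": 349,
    "Noise": 326,
    "Noise - Park": 109,
    "Noise - Helicopter": 85,
    "Noise - House of Worship": 12,
})

# Share of all noise complaints in the sample, as a percentage
shares = (counts / counts.sum() * 100).round(1)
print(shares)
```

On the real frame, `noise_df['complaint_type'].value_counts(normalize=True)` should give the same proportions directly.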


In [25]:
# Viewing a random sample of instances to decide if any columns must be reformatted
with pd.option_context('display.max_columns', None):
    display(noise_df.sample(15, random_state=42))

Unnamed: 0,unique_key,created_date,closed_date,agency,agency_name,complaint_type,descriptor,location_type,incident_zip,incident_address,street_name,cross_street_1,cross_street_2,intersection_street_1,intersection_street_2,address_type,city,landmark,status,resolution_description,resolution_action_updated_date,community_board,bbl,borough,x_coordinate_state_plane,y_coordinate_state_plane,open_data_channel_type,park_facility_name,park_borough,vehicle_type,latitude,longitude,location
6843,65234814,2025-06-11T23:29:54.000,2025-06-12T00:42:19.000,NYPD,New York City Police Department,Noise - Commercial,Loud Music/Party,Club/Bar/Restaurant,10004.0,47 STONE STREET,STONE STREET,COENTIES ALLEY,MILL LANE,COENTIES ALLEY,MILL LANE,ADDRESS,NEW YORK,STONE STREET,Closed,The Police Department responded to the complai...,2025-06-12T00:42:23.000,01 MANHATTAN,1000298000.0,MANHATTAN,981390.0,195892.0,ONLINE,Unspecified,MANHATTAN,,40.704355,-74.010315,"\n, \n(40.70435458995396, -74.01031512653833)"
18866,65264690,2025-06-14T15:27:27.000,2025-06-14T16:06:14.000,NYPD,New York City Police Department,Noise - Residential,Loud Music/Party,Residential Building/House,11236.0,10622 FARRAGUT ROAD,FARRAGUT ROAD,EAST 105 STREET,EAST 108 STREET,EAST 105 STREET,EAST 108 STREET,ADDRESS,BROOKLYN,FARRAGUT ROAD,Closed,The Police Department responded to the complai...,2025-06-14T16:06:18.000,18 BROOKLYN,3081740000.0,BROOKLYN,1012907.0,176581.0,ONLINE,Unspecified,BROOKLYN,,40.651304,-73.896725,"\n, \n(40.65130435746643, -73.89672531475608)"
15312,65219843,2025-06-10T23:56:16.000,2025-06-11T00:56:36.000,NYPD,New York City Police Department,Noise - Residential,Loud Music/Party,Residential Building/House,11206.0,141 MONTROSE AVENUE,MONTROSE AVENUE,MANHATTAN AVENUE,GRAHAM AVENUE,MANHATTAN AVENUE,GRAHAM AVENUE,ADDRESS,BROOKLYN,MONTROSE AVENUE,Closed,The Police Department responded to the complai...,2025-06-11T00:56:40.000,01 BROOKLYN,3030520000.0,BROOKLYN,999829.0,196964.0,ONLINE,Unspecified,BROOKLYN,,40.707284,-73.943809,"\n, \n(40.70728372457755, -73.94380894115152)"
742,65265720,2025-06-14T21:32:29.000,2025-06-14T21:50:11.000,NYPD,New York City Police Department,Noise - Residential,Loud Music/Party,Residential Building/House,11235.0,206 CORBIN PLACE,CORBIN PLACE,ORIENTAL BOULEVARD,DEAD END,ORIENTAL BOULEVARD,DEAD END,ADDRESS,BROOKLYN,CORBIN PLACE,Closed,The Police Department responded to the complai...,2025-06-14T21:50:15.000,13 BROOKLYN,3087230000.0,BROOKLYN,997078.0,149362.0,ONLINE,Unspecified,BROOKLYN,,40.576631,-73.953822,"\n, \n(40.576630914121814, -73.95382188377533)"
12129,65199101,2025-06-07T15:07:55.000,2025-06-07T16:22:15.000,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,11385.0,55-06 MYRTLE AVENUE,MYRTLE AVENUE,MADISON STREET,PUTNAM AVENUE,MADISON STREET,PUTNAM AVENUE,ADDRESS,RIDGEWOOD,MYRTLE AVENUE,Closed,The Police Department responded to the complai...,2025-06-07T16:22:20.000,05 QUEENS,4035450000.0,QUEENS,1009767.0,194287.0,ONLINE,Unspecified,QUEENS,,40.699913,-73.907974,"\n, \n(40.699912913907106, -73.9079742673068)"
18209,65274883,2025-06-15T03:53:06.000,2025-06-15T04:45:55.000,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,10458.0,2340 CROTONA AVENUE,CROTONA AVENUE,EAST 183 STREET,EAST 187 STREET,EAST 183 STREET,EAST 187 STREET,ADDRESS,BRONX,CROTONA AVENUE,Closed,The Police Department responded to the complai...,2025-06-15T04:46:02.000,06 BRONX,2031020000.0,BRONX,1016286.0,249983.0,PHONE,Unspecified,BRONX,,40.852762,-73.884198,"\n, \n(40.85276240967357, -73.8841983188242)"
3956,65257782,2025-06-14T00:45:37.000,2025-06-14T01:02:28.000,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,10453.0,DAVIDSON AVENUE,DAVIDSON AVENUE,DAVIDSON AVENUE,WEST 176 STREET,DAVIDSON AVENUE,WEST 176 STREET,INTERSECTION,,,Closed,The Police Department responded to the complai...,2025-06-14T01:02:31.000,05 BRONX,,BRONX,1008326.0,248396.0,PHONE,Unspecified,BRONX,,40.848432,-73.912977,"\n, \n(40.84843186100232, -73.91297729395332)"
7029,65214548,2025-06-09T19:02:42.000,2025-06-09T19:23:14.000,NYPD,New York City Police Department,Noise - Residential,Loud Talking,Residential Building/House,11204.0,1 DAHL COURT,DAHL COURT,DEAD END,58 STREET,DEAD END,58 STREET,ADDRESS,BROOKLYN,DAHL COURT,Closed,The Police Department responded to the complai...,2025-06-09T19:23:17.000,12 BROOKLYN,3054940000.0,BROOKLYN,988666.0,166099.0,ONLINE,Unspecified,BROOKLYN,,40.622579,-73.984092,"\n, \n(40.6225787685612, -73.98409238126375)"
16176,65232097,2025-06-11T22:49:28.000,2025-06-11T23:28:28.000,NYPD,New York City Police Department,Noise - Vehicle,Car/Truck Music,Street/Sidewalk,10455.0,990 LEGGETT AVENUE,LEGGETT AVENUE,BECK STREET,FOX STREET,BECK STREET,FOX STREET,ADDRESS,BRONX,LEGGETT AVENUE,Closed,The Police Department responded to the complai...,2025-06-11T23:28:31.000,02 BRONX,2026840000.0,BRONX,1011965.0,236061.0,ONLINE,Unspecified,BRONX,SUV,40.814565,-73.899875,"\n, \n(40.814565178701955, -73.89987509966802)"
1334,65215816,2025-06-09T23:04:36.000,2025-06-10T00:12:20.000,NYPD,New York City Police Department,Noise - Residential,Banging/Pounding,Residential Building/House,10460.0,985 EAST 174 STREET,EAST 174 STREET,BRYANT AVENUE,LONGFELLOW AVENUE,BRYANT AVENUE,LONGFELLOW AVENUE,ADDRESS,BRONX,EAST 174 STREET,Closed,The Police Department responded to the complai...,2025-06-10T00:12:23.000,03 BRONX,2029980000.0,BRONX,1016184.0,243988.0,MOBILE,Unspecified,BRONX,,40.836308,-73.884596,"\n, \n(40.83630827673318, -73.88459557090619)"


## Further Data Cleaning

Some findings from the above random sample of instances, and plan of action for more cleaning:
1. `created_date` and `closed_date`: Both are still stored as strings (`object` dtype) in the output above. Convert both to datetime.
2. `agency` and `agency_name`: NYPD in every row of the sample above. If that holds across the full subset, neither column is useful for filtering or analysis.
3. `complaint_type` and `descriptor`: Rich source of info. Good candidates for grouping/severity classification later.
4. `location_type`: Potentially interesting; might correlate with complaint_type or borough.
5. `incident_zip`: Probably not necessary if `borough` included. Will check.
6. `incident_address`, `street_name`: Useful for mapping or aggregating. Street name redundant after address. Could combine/clean.
7. `cross_street_1`, `cross_street_2`, `intersection_street_1`, `intersection_street_2`: Redundant after address. Could clean.
8. `address_type`: Could be useful. Explore further.
9. `city`, `landmark`: Consider omitting in favor of borough and address.
10. `status`, `resolution_description`: Could be useful. Explore further.
11. `resolution_action_updated_date`: Compare to created_date and closed_date to determine whether useful.
12. `community_board`, `bbl`: May or may not be useful.
13. `borough`: Definitely useful. Central for geographic analysis.
14. `x_coordinate_state_plane`, `y_coordinate_state_plane`, `latitude`, `longitude`, `location`: Probably redundant, location is messy, latitude/longitude likely best for plotting.
15. `open_data_channel_type`: Keep for now, might show behavioral trends of complaint callers.
16. `park_facility_name`: Likely just noise, consider dropping.
17. `park_borough`: Redundant after borough, drop it.
18. `vehicle_type`: Mostly full of NULLs, also irrelevant. Drop it.
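
Several items above call a column redundant based on a handful of sampled rows. Before dropping anything, it may be worth measuring how often two columns actually agree. The following is a rough sketch under that assumption; `match_rate` is a hypothetical helper, and the toy rows are only loosely modelled on the sample.

```python
import pandas as pd

def match_rate(frame: pd.DataFrame, col_a: str, col_b: str) -> float:
    """Fraction of rows where two columns agree (both-missing counts as agreement)."""
    a, b = frame[col_a], frame[col_b]
    same = (a == b) | (a.isna() & b.isna())
    return float(same.mean())

# Toy rows loosely modelled on the sample above (illustrative only)
toy = pd.DataFrame({
    "cross_street_1": ["COENTIES ALLEY", "EAST 105 STREET", None, "BECK STREET"],
    "intersection_street_1": ["COENTIES ALLEY", "EAST 105 STREET", None, "FOX STREET"],
    "borough": ["MANHATTAN", "BROOKLYN", "BRONX", "BRONX"],
    "park_borough": ["MANHATTAN", "BROOKLYN", "BRONX", "BRONX"],
})

street_rate = match_rate(toy, "cross_street_1", "intersection_street_1")
borough_rate = match_rate(toy, "borough", "park_borough")
print(street_rate, borough_rate)
```

A pair with a match rate at or near 1.0 on `noise_df` would support dropping one of the two columns, for example with `noise_df.drop(columns=[...])`.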

## Data Manipulation
Convert `created_date` from string to datetime so it can support the time-based trends and forecasting sections.

In [7]:
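# Convert 'created_date' to datetime on the filtered frame
# (noise_df is a copy of df, so converting df would leave noise_df unchanged)
noise_df['created_date'] = pd.to_datetime(noise_df['created_date'], errors='coerce')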
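
Because `errors='coerce'` silently turns unparseable values into `NaT`, it can help to count how many values were lost in the conversion. The sketch below does that on toy timestamps in the same format as the sample rows, and also derives a time-to-close column. The `minutes_to_close` column name and the toy frame are assumptions for illustration; the first row reuses the timestamps from the first sampled complaint.

```python
import pandas as pd

# Toy timestamps in the same ISO-like format as the sample rows (illustrative only)
toy = pd.DataFrame({
    "created_date": ["2025-06-11T23:29:54.000", "2025-06-14T15:27:27.000", "not a date"],
    "closed_date": ["2025-06-12T00:42:19.000", None, "2025-06-14T16:06:14.000"],
})

failed = {}
for col in ["created_date", "closed_date"]:
    missing_before = toy[col].isna().sum()
    toy[col] = pd.to_datetime(toy[col], errors="coerce")
    # Values that were present but could not be parsed become NaT
    failed[col] = int(toy[col].isna().sum() - missing_before)
    print(f"{col}: {failed[col]} value(s) failed to parse")

# Minutes from creation to closure; NaN where either date is missing
toy["minutes_to_close"] = (toy["closed_date"] - toy["created_date"]).dt.total_seconds() / 60
print(toy)
```

A non-zero failure count on the real data would suggest inspecting the raw strings before relying on the converted column.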