# Finding Datasets
This notebook shows how to find the datasets that are available for a given state.


In [1]:
import openpolicedata as opd

In [2]:
# Query for the entire table of available data as a pandas DataFrame (https://pandas.pydata.org/docs/user_guide/10min.html#min)
# This shows all the datasets that are available for access
# This information can be filtered to find a dataset of interest
datasets = opd.datasets.query()
datasets.head()

Unnamed: 0,State,SourceName,Agency,AgencyFull,TableType,coverage_start,coverage_end,last_coverage_check,Description,source_url,readme,URL,Year,DataType,date_field,dataset_id,agency_field,min_version,query
0,Arizona,Chandler,Chandler,Chandler Police Department,ARRESTS,2018-01-01,2024-05-08,05/09/2024,Arrest reports completed by a Chandler Police ...,https://data.chandlerpd.com/catalog/arrest-boo...,,https://data.chandlerpd.com/catalog/arrest-boo...,MULTIPLE,CSV,arrest_date_time,,,0.2,
1,Arizona,Chandler,Chandler,Chandler Police Department,CALLS FOR SERVICE,2018-01-01,2024-05-08,05/09/2024,This dataset contains details for all of the c...,https://data.chandlerpd.com/catalog/calls-for-...,,https://data.chandlerpd.com/catalog/calls-for-...,MULTIPLE,CSV,call_received_date_time,,,,
2,Arizona,Chandler,Chandler,Chandler Police Department,INCIDENTS,2018-01-01,2024-05-02,05/09/2024,This dataset contains details for all of the g...,https://data.chandlerpd.com/catalog/general-of...,,https://data.chandlerpd.com/catalog/general-of...,MULTIPLE,CSV,report_event_date,,,0.4.1,
3,Arizona,Gilbert,Gilbert,Gilbert Police Department,CALLS FOR SERVICE,2006-11-15,2024-05-08,05/09/2024,,https://data.gilbertaz.gov/maps/2dcb4c20c9a444...,,https://maps.gilbertaz.gov/arcgis/rest/service...,MULTIPLE,ArcGIS,EventDate,,,,
4,Arizona,Gilbert,Gilbert,Gilbert Police Department,EMPLOYEE,NaT,NaT,07/06/2023,A data set of all employees that have previous...,https://data.gilbertaz.gov/datasets/TOG::gilbe...,,https://services1.arcgis.com/JLuzSHjNrLL4Okwb/...,NONE,ArcGIS,,,,,


In [3]:
# List the states that have at least one dataset available
print(f"These states have datasets: {datasets['State'].unique()}")

These states have datasets: ['Arizona' 'Arkansas' 'California' 'Colorado' 'Connecticut' 'Delaware'
 'District of Columbia' 'Florida' 'Georgia' 'Idaho' 'Illinois' 'Indiana'
 'Iowa' 'Kansas' 'Kentucky' 'Louisiana' 'Maryland' 'Massachusetts'
 'Michigan' 'Minnesota' 'Mississippi' 'Missouri' 'Montana' 'Nebraska'
 'Nevada' 'New Hampshire' 'New Jersey' 'New York' 'North Carolina'
 'North Dakota' 'Ohio' 'Oklahoma' 'Oregon' 'Pennsylvania' 'Rhode Island'
 'South Carolina' 'South Dakota' 'Tennessee' 'Texas' 'Vermont' 'Virginia'
 'Washington' 'Wisconsin' 'Wyoming']
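
Beyond listing the states, it can help to see how many datasets each state has. The sketch below assumes only pandas. To stay self-contained it builds a small stand-in table from a few rows shown in this notebook's outputs rather than calling `opd.datasets.query()`, so the counts are illustrative and are not the real totals.

```python
import pandas as pd

# Stand-in for the table returned by opd.datasets.query(): a handful of rows
# and columns copied from the outputs in this notebook (not the full table).
datasets = pd.DataFrame(
    {
        "State": ["Arizona", "Arizona", "Arizona", "Maryland", "Maryland"],
        "SourceName": ["Chandler", "Chandler", "Gilbert", "Baltimore", "Montgomery County"],
        "TableType": ["ARRESTS", "INCIDENTS", "CALLS FOR SERVICE", "ARRESTS", "TRAFFIC STOPS"],
    }
)

# Count the dataset rows listed for each state, ordered by state name
counts = datasets["State"].value_counts().sort_index()
print(counts)
```

Run against the real table, the same `value_counts()` call should give a per-state count of dataset entries.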


In [4]:
# To see all available datasets for a single state, pass the state name as a filter
df = opd.datasets.query(state="Maryland")
df.head()

Unnamed: 0,State,SourceName,Agency,AgencyFull,TableType,coverage_start,coverage_end,last_coverage_check,Description,source_url,readme,URL,Year,DataType,date_field,dataset_id,agency_field,min_version,query
470,Maryland,Baltimore,Baltimore,Baltimore Police Department,ARRESTS,2010-01-01,2024-05-04,05/10/2024,This dataset represents arrests made by the Ba...,https://data.baltimorecity.gov/datasets/baltim...,,https://egis.baltimorecity.gov/egis/rest/servi...,MULTIPLE,ArcGIS,ArrestDateTime,,,0.2,
471,Maryland,Baltimore,Baltimore,Baltimore Police Department,CALLS FOR SERVICE,2017-01-01,2017-12-31,07/06/2023,Police Emergency and Non-Emergency calls to 911,https://data.baltimorecity.gov/datasets/baltim...,,https://services1.arcgis.com/UWYHeuuJISiGmgXx/...,2017,ArcGIS,,,,,
472,Maryland,Baltimore,Baltimore,Baltimore Police Department,CALLS FOR SERVICE,2018-01-01,2018-12-31,07/06/2023,Police Emergency and Non-Emergency calls to 912,https://data.baltimorecity.gov/datasets/baltim...,,https://services1.arcgis.com/UWYHeuuJISiGmgXx/...,2018,ArcGIS,,,,,
473,Maryland,Baltimore,Baltimore,Baltimore Police Department,CALLS FOR SERVICE,2019-01-01,2019-12-31,07/06/2023,Police Emergency and Non-Emergency calls to 913,https://data.baltimorecity.gov/datasets/baltim...,,https://services1.arcgis.com/UWYHeuuJISiGmgXx/...,2019,ArcGIS,,,,,
474,Maryland,Baltimore,Baltimore,Baltimore Police Department,CALLS FOR SERVICE,2020-01-01,2020-12-31,07/06/2023,Police Emergency and Non-Emergency calls to 914,https://data.baltimorecity.gov/datasets/baltim...,,https://services1.arcgis.com/UWYHeuuJISiGmgXx/...,2020,ArcGIS,,,,,


In [5]:
# Next, narrow the search to a particular table type within a state
# First, list the table types that are available for the state
df = opd.datasets.query(state="Maryland")
print(f"{df.iloc[0]['State']} has the following tables available: {df['TableType'].unique()}")


Maryland has the following tables available: ['ARRESTS' 'CALLS FOR SERVICE' 'STOPS' 'TRAFFIC STOPS' 'COMPLAINTS'
 'CRASHES - INCIDENTS' 'CRASHES - NONMOTORIST' 'CRASHES - SUBJECTS'
 'INCIDENTS']
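
Once the table types for a state are known, a natural follow-up is to ask which sources publish each type. The sketch below assumes only pandas and uses a small stand-in DataFrame with Maryland rows copied from this notebook's outputs. The variable name `md` is illustrative, and the real table contains more rows and table types.

```python
import pandas as pd

# Stand-in for opd.datasets.query(state="Maryland"); rows copied from the
# outputs shown in this notebook, limited to two columns.
md = pd.DataFrame(
    {
        "SourceName": ["Baltimore", "Baltimore", "Maryland", "Montgomery County"],
        "TableType": ["ARRESTS", "CALLS FOR SERVICE", "TRAFFIC STOPS", "TRAFFIC STOPS"],
    }
)

# Map each table type to the sorted list of distinct sources that provide it
sources_by_type = {
    table_type: sorted(group["SourceName"].unique())
    for table_type, group in md.groupby("TableType")
}

for table_type, sources in sources_by_type.items():
    print(f"{table_type}: {sources}")
```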


In [6]:
# For example, to find all traffic stops datasets in Maryland, pass one of the table names from the previous cell as table_type
df = opd.datasets.query(table_type='TRAFFIC STOPS', state="Maryland")
df.head()
# To learn how to load the data, open the notebook loading_datasets.ipynb

Unnamed: 0,State,SourceName,Agency,AgencyFull,TableType,coverage_start,coverage_end,last_coverage_check,Description,source_url,readme,URL,Year,DataType,date_field,dataset_id,agency_field,min_version,query
479,Maryland,Maryland,MULTIPLE,,TRAFFIC STOPS,2007-01-01,2014-03-31,01/10/2024,Standardized stop data from the Stanford Open ...,https://openpolicing.stanford.edu/data/,https://github.com/stanford-policylab/opp/blob...,https://stacks.stanford.edu/file/druid:yg821jf...,MULTIPLE,CSV,date,,department_name,,
485,Maryland,Montgomery County,Montgomery County,Montgomery County Police Department,TRAFFIC STOPS,2012-06-07,2024-05-09,05/10/2024,This dataset contains traffic violation inform...,https://data.montgomerycountymd.gov/Public-Saf...,,data.montgomerycountymd.gov,MULTIPLE,Socrata,date_of_stop,4mse-ku6q,,,
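
The `coverage_start` and `coverage_end` columns can be used to check whether a dataset covers a period of interest. The sketch below assumes only pandas and that both columns hold datetime values, as the `NaT` entries in the earlier output suggest. The helper `covers_year` is hypothetical and is not part of openpolicedata. The stand-in rows are the two Maryland traffic stops entries shown above.

```python
import pandas as pd

# Stand-in for the TRAFFIC STOPS query result above (two rows, three columns)
stops = pd.DataFrame(
    {
        "SourceName": ["Maryland", "Montgomery County"],
        "coverage_start": pd.to_datetime(["2007-01-01", "2012-06-07"]),
        "coverage_end": pd.to_datetime(["2014-03-31", "2024-05-09"]),
    }
)


def covers_year(df, year):
    """Return the rows whose coverage window overlaps the given calendar year.

    Hypothetical helper for illustration; not an openpolicedata function.
    """
    year_start = pd.Timestamp(year=year, month=1, day=1)
    year_end = pd.Timestamp(year=year, month=12, day=31)
    # Two date ranges overlap when each one starts before the other ends
    overlaps = (df["coverage_start"] <= year_end) & (df["coverage_end"] >= year_start)
    return df[overlaps]


print(covers_year(stops, 2020)["SourceName"].tolist())
```

Comparisons against `NaT` evaluate to False, so rows without coverage dates drop out of the result. The coverage dates are only as recent as `last_coverage_check`, so a dataset may extend beyond the `coverage_end` shown in the table.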
