# DATA 202 - Module 10: AI and Dashboards
* Instructor: Dr. Josh Fagan

### Instructions

Welcome to the Module 10 assignment of DATA 202. This assignment is meant to help you review and become familiar with creating dashboards.

To receive credit for an assignment, answer all questions correctly and submit before the deadline listed on Canvas.

---
### Collaboration Policy

Data science is a collaborative activity. While you may talk with others about the labs, we ask that you **write your solutions individually**. If you do discuss the assignments with others, please **include their names** below.

**Collaborators**: *list collaborators here*
* Jessi Hudgins 
* Joseph Beller (vol study center)

## Introduction
The goal of this assignment is to use the Boston crime dataset (listed as `crime.csv`) to create an interactive dashboard using `Plotly` and `Dash`. Please review any posted class materials for instructions on installing appropriate tools.

As we have been using `seaborn` for most of our plotting, and not `Plotly`, I thought this was also a prime time to introduce exciting new AI tools and have ChatGPT take a first pass at creating the code for us.

So this project boils down to two main components:
1) Craft a prompt for ChatGPT to generate an initial pass at coding up our dashboard  
2) Edit and debug the provided code to get it to look like what we want

## Exercise 1 - Craft a Prompt
Craft a prompt to ChatGPT telling the tool to create a dashboard using `Plotly` and `Dash` and give you the code.

Use this link when telling ChatGPT what data to use:  
https://www.kaggle.com/datasets/AnalyzeBoston/crimes-in-boston?select=crime.csv

The dashboard should have the following elements:
- A map element showing the location of each crime
- A bar plot showing the total number of crimes by district
- A pie chart showing the total number of crimes by day of the week
- A line plot showing how the number of crimes has changed over time. There should be a separate line for each district
- Have an interactive element to change what crimes you are looking at

Paste your prompt into the markdown cell below:

**Your prompt here:**
I am learning how to use the python packages Plotly and Dash and using this dataset: https://www.kaggle.com/datasets/AnalyzeBoston/crimes-in-boston?select=crime.csv

I need to make a dashboard that includes the following: 
- A map element showing the location of each crime
- A bar plot showing the total number of crimes by district
- A pie chart showing the total number of crimes by day of the week
- A line plot showing how the number of crimes has changed over time. There should be a separate line for each district
- Have an interactive element to change what crimes you are looking at

Paste and run the given code (without edits) in the cell below.

In [1]:
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd

# Load the dataset
data = pd.read_csv("/Users/carolinelpetersen/Desktop/DATA202/data/crime.csv")

# Preprocess the data
# Extract necessary information for each plot

# Create the Dash app
app = dash.Dash(__name__)

# Define the layout of the dashboard
app.layout = html.Div([
    dcc.Dropdown(
        id='crime-dropdown',
        options=[{'label': crime, 'value': crime} for crime in data['OFFENSE_CODE_GROUP'].unique()],
        value='Select Crime Type'
    ),
    html.Div([
        dcc.Graph(id='crime-map'),
        dcc.Graph(id='crime-bar'),
    ], style={'display': 'flex'}),
    html.Div([
        dcc.Graph(id='crime-pie'),
        dcc.Graph(id='crime-line'),
    ], style={'display': 'flex'}),
])

# Define callback to update map
@app.callback(
    Output('crime-map', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_map(selected_crime):
    if selected_crime == 'Select Crime Type':
        filtered_data = data.copy()
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]

    # Create map plot using Plotly Express
    fig = px.scatter_mapbox(filtered_data, lat='Lat', lon='Long', hover_name='DISTRICT',
                            hover_data=['OFFENSE_CODE_GROUP'], zoom=10)
    fig.update_layout(mapbox_style="open-street-map")
    return fig

# Define callback to update bar plot
@app.callback(
    Output('crime-bar', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_bar(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby('DISTRICT').size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby('DISTRICT').size().reset_index(name='Total Crimes')

    # Create bar plot using Plotly Express
    fig = px.bar(grouped_data, x='DISTRICT', y='Total Crimes', title='Total Crimes by District')
    return fig

# Define callback to update pie chart
@app.callback(
    Output('crime-pie', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_pie(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby('DAY_OF_WEEK').size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby('DAY_OF_WEEK').size().reset_index(name='Total Crimes')

    # Create pie chart using Plotly Express
    fig = px.pie(grouped_data, names='DAY_OF_WEEK', values='Total Crimes', title='Total Crimes by Day of the Week')
    return fig

# Define callback to update line plot
@app.callback(
    Output('crime-line', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_line(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby(['DISTRICT', 'YEAR']).size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby(['DISTRICT', 'YEAR']).size().reset_index(name='Total Crimes')

    # Create line plot using Plotly Express
    fig = px.line(grouped_data, x='YEAR', y='Total Crimes', color='DISTRICT',
                  title='Total Crimes Over Time by District')
    return fig

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 8190: invalid start byte
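
The error above arises because `pd.read_csv` decodes files as UTF-8 by default, and the file contains a byte that is not valid UTF-8. A minimal sketch of the same failure, assuming a made-up field value that contains the byte `0xa0` reported in the traceback (the actual contents at position 8190 are not shown in this notebook):

```python
# 0xa0 is a non-breaking space in Latin-1, but it cannot start a UTF-8 character.
# The field value below is hypothetical and used only for illustration.
raw = b"LARCENY\xa0ALL OTHERS"

try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 maps every possible byte value to a character, so decoding cannot fail.
text = raw.decode("latin-1")

print(utf8_ok)
print(repr(text))
```

Because Latin-1 accepts every byte, passing `encoding='latin-1'` to `pd.read_csv` avoids the error. Whether Latin-1 is the file's true encoding is an assumption here; it is simply an encoding that never raises.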

### Exercise 1 Grading Notes

Final Exercise 1 Grade:

50/50

## Exercise 2 - Editing and Debugging
There will likely be a few things you need to change after getting the code from ChatGPT. Below is a list of steps you may need to take:

1. Change where the data is coming from  
ChatGPT will try to use a link that most likely does not work. Run the cell below this one (changing the path to the file) to load the data and get a subset for our dashboard.
2. Ensure the right columns are being used
3. Ensure the arguments are structured and passed correctly
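
One way to approach step 2 is to compare the column names the generated code refers to against the columns the data actually has. The sketch below uses only the standard library so it runs without the data file. The header is copied from the table output in this notebook (index column omitted), and `missing_columns` is a hypothetical helper name. With pandas, the same comparison could be made against `crimes.columns`.

```python
import csv
import io

# Column names copied from the table output in this notebook (index column omitted).
header_line = (
    "INCIDENT_NUMBER,OFFENSE_CODE,OFFENSE_CODE_GROUP,OFFENSE_DESCRIPTION,"
    "DISTRICT,REPORTING_AREA,SHOOTING,OCCURRED_ON_DATE,YEAR,MONTH,"
    "DAY_OF_WEEK,HOUR,UCR_PART,STREET,Lat,Long,Location"
)
available = set(next(csv.reader(io.StringIO(header_line))))


def missing_columns(required, available):
    """Return the required column names that are absent, in sorted order."""
    return sorted(set(required) - set(available))


# Columns the generated dashboard code in this notebook refers to.
required = ["OFFENSE_CODE_GROUP", "DISTRICT", "DAY_OF_WEEK", "YEAR", "Lat", "Long"]
missing = missing_columns(required, available)
print(missing)

# A name the data does not have ("Latitude" is a made-up example) is reported.
print(missing_columns(["Lat", "Latitude"], available))
```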

Copy and paste the code given to you from ChatGPT in the cell below and make the appropriate edits. If you need to delete lines, just comment them out instead of getting rid of them. On lines that you have to edit, leave a comment at the end of the line with your initials. 

For example, if the code gives you:
```
# Load the dataset from the CSV file
url = "https://raw.githubusercontent.com/AnalyzeBoston/crimes-in-boston/master/crime.csv"
df = pd.read_csv(url)
```
And you want to change it to use the `crimes` DataFrame created in the following cell, you may have:
```
# Load the dataset from the CSV file
#url = "https://raw.githubusercontent.com/AnalyzeBoston/crimes-in-boston/master/crime.csv"
#df = pd.read_csv(url)
df = crimes # edited by JF
```

**Note**: Make sure to visit the site `http://127.0.0.1:8050/` to see your dashboard in action!

In [3]:
import pandas as pd

#crimes = pd.read_csv('/Users/carolinelpetersen/Desktop/DATA202/data/crime.csv', encoding='latin-1', parse_dates=['OCCURRED_ON_DATE'])
#have to write it to my own device to grade: 
crimes = pd.read_csv("crime.csv", encoding='latin-1', parse_dates=['OCCURRED_ON_DATE'])



# Drop rows with missing locations
crimes.dropna(subset=['Lat', 'Long', 'DISTRICT'], inplace=True)

# Focus on major crimes in 2018
crimes = crimes[crimes.OFFENSE_CODE_GROUP.isin([
    'Larceny', 'Auto Theft', 'Robbery', 'Larceny From Motor Vehicle', 'Residential Burglary',
    'Simple Assault', 'Harassment', 'Ballistics', 'Aggravated Assault', 'Other Burglary', 
    'Arson', 'Commercial Burglary', 'HOME INVASION', 'Homicide', 'Criminal Harassment', 
    'Manslaughter'])]
crimes = crimes[crimes.YEAR>=2018]

# Display the table (the notebook truncates long output)
crimes

Unnamed: 0,INCIDENT_NUMBER,OFFENSE_CODE,OFFENSE_CODE_GROUP,OFFENSE_DESCRIPTION,DISTRICT,REPORTING_AREA,SHOOTING,OCCURRED_ON_DATE,YEAR,MONTH,DAY_OF_WEEK,HOUR,UCR_PART,STREET,Lat,Long,Location
0,I182070945,619,Larceny,LARCENY ALL OTHERS,D14,808,,2018-09-02 13:00:00,2018,9,Sunday,13,Part One,LINCOLN ST,42.357791,-71.139371,"(42.35779134, -71.13937053)"
6,I182070933,724,Auto Theft,AUTO THEFT,B2,330,,2018-09-03 21:25:00,2018,9,Monday,21,Part One,NORMANDY ST,42.306072,-71.082733,"(42.30607218, -71.08273260)"
8,I182070931,301,Robbery,ROBBERY - STREET,C6,177,,2018-09-03 20:48:00,2018,9,Monday,20,Part One,MASSACHUSETTS AVE,42.331521,-71.070853,"(42.33152148, -71.07085307)"
19,I182070915,614,Larceny From Motor Vehicle,LARCENY THEFT FROM MV - NON-ACCESSORY,B2,181,,2018-09-02 18:00:00,2018,9,Sunday,18,Part One,SHIRLEY ST,42.325695,-71.068168,"(42.32569490, -71.06816778)"
24,I182070908,522,Residential Burglary,BURGLARY - RESIDENTIAL - NO FORCE,B2,911,,2018-09-03 18:38:00,2018,9,Monday,18,Part One,ANNUNCIATION RD,42.335062,-71.093168,"(42.33506218, -71.09316781)"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
200006,I162071070,522,Residential Burglary,BURGLARY - RESIDENTIAL - NO FORCE,B2,281,,2018-02-13 01:17:00,2018,2,Tuesday,1,Part One,GREENVILLE ST,42.326968,-71.080519,"(42.32696802, -71.08051941)"
215588,I162054268,522,Residential Burglary,BURGLARY - RESIDENTIAL - NO FORCE,D4,627,,2018-07-27 17:50:00,2018,7,Friday,17,Part One,BOYLSTON ST,42.344423,-71.098331,"(42.34442266, -71.09833083)"
318679,I152049494,614,Larceny From Motor Vehicle,LARCENY THEFT FROM MV - NON-ACCESSORY,C6,196,,2018-03-12 11:00:00,2018,3,Monday,11,Part One,OLD COLONY AVE,42.335811,-71.055754,"(42.33581104, -71.05575441)"
318867,I142025834-00,335,Robbery,ROBBERY - UNARMED - CHAIN STORE,D4,147,,2018-07-17 01:00:00,2018,7,Tuesday,1,Part One,COLUMBUS AVE,42.340872,-71.081458,"(42.34087158, -71.08145768)"


In [4]:
import dash
from dash import dcc, html
from dash.dependencies import Input, Output
import plotly.express as px
import pandas as pd

# Load the dataset
#data = pd.read_csv("/Users/carolinelpetersen/Desktop/DATA202/data/crime.csv", encoding='latin-1', parse_dates=['OCCURRED_ON_DATE']) #edited CP
#have to write it to my own device to grade: 
data = pd.read_csv("crime.csv", encoding='latin-1', parse_dates=['OCCURRED_ON_DATE'])



# Preprocess the data
# Extract necessary information for each plot

# Create the Dash app
app = dash.Dash(__name__)

# Define the layout of the dashboard
app.layout = html.Div([
    dcc.Dropdown(
        id='crime-dropdown',
        options=[{'label': crime, 'value': crime} for crime in data['OFFENSE_CODE_GROUP'].unique()],
        value='Select Crime Type'
    ),
    html.Div([
        dcc.Graph(id='crime-map'),
        dcc.Graph(id='crime-bar'),
    ], style={'display': 'flex'}),
    html.Div([
        dcc.Graph(id='crime-pie'),
        dcc.Graph(id='crime-line'),
    ], style={'display': 'flex'}),
])

# Define callback to update map
@app.callback(
    Output('crime-map', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_map(selected_crime):
    if selected_crime == 'Select Crime Type':
        filtered_data = data.copy()
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]

    # Create map plot using Plotly Express
    fig = px.scatter_mapbox(filtered_data, lat='Lat', lon='Long', hover_name='DISTRICT',
                            hover_data=['OFFENSE_CODE_GROUP'], zoom=9.5, center={"lat": 42.31137, "lon": -71.08115}) #edited CP (centered points)
    fig.update_layout(mapbox_style="open-street-map")
    return fig

# Define callback to update bar plot
@app.callback(
    Output('crime-bar', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_bar(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby('DISTRICT').size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby('DISTRICT').size().reset_index(name='Total Crimes')

    # Create bar plot using Plotly Express
    fig = px.bar(grouped_data, x='DISTRICT', y='Total Crimes', title='Total Crimes by District')
    return fig

# Define callback to update pie chart
@app.callback(
    Output('crime-pie', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_pie(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby('DAY_OF_WEEK').size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby('DAY_OF_WEEK').size().reset_index(name='Total Crimes')

    # Create pie chart using Plotly Express
    fig = px.pie(grouped_data, names='DAY_OF_WEEK', values='Total Crimes', title='Total Crimes by Day of the Week')
    return fig

# Define callback to update line plot
@app.callback(
    Output('crime-line', 'figure'),
    [Input('crime-dropdown', 'value')]
)
def update_line(selected_crime):
    if selected_crime == 'Select Crime Type':
        grouped_data = data.groupby(['DISTRICT', 'YEAR']).size().reset_index(name='Total Crimes')
    else:
        filtered_data = data[data['OFFENSE_CODE_GROUP'] == selected_crime]
        grouped_data = filtered_data.groupby(['DISTRICT', 'YEAR']).size().reset_index(name='Total Crimes')

    # Create line plot using Plotly Express
    fig = px.line(grouped_data, x='YEAR', y='Total Crimes', color='DISTRICT',
                  title='Total Crimes Over Time by District')
    
    fig.update_xaxes(tickformat='d') #edited CP (updated axis)

    return fig

# Run the app
if __name__ == '__main__':
    app.run_server(debug=True)
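
The four callbacks above repeat the same filtering step before grouping. A minimal sketch of how that shared logic could be factored out, written with plain Python lists so it runs without the data file. `filter_rows` and `count_by` are hypothetical helper names, not part of Dash or pandas. In the notebook itself the equivalents would be boolean indexing and `groupby(...).size()`.

```python
from collections import Counter

# Sentinel the dropdown above uses as its initial value; it is not a real crime type.
ALL_CRIMES = "Select Crime Type"


def filter_rows(rows, selected_crime):
    """Return all rows for the sentinel, otherwise only rows of the selected crime."""
    if selected_crime == ALL_CRIMES:
        return list(rows)
    return [row for row in rows if row["OFFENSE_CODE_GROUP"] == selected_crime]


def count_by(rows, column):
    """Count rows per distinct value of `column`, similar to groupby(column).size()."""
    return Counter(row[column] for row in rows)


# First five rows of the table shown earlier in this notebook, reduced to two columns.
sample = [
    {"OFFENSE_CODE_GROUP": "Larceny", "DISTRICT": "D14"},
    {"OFFENSE_CODE_GROUP": "Auto Theft", "DISTRICT": "B2"},
    {"OFFENSE_CODE_GROUP": "Robbery", "DISTRICT": "C6"},
    {"OFFENSE_CODE_GROUP": "Larceny From Motor Vehicle", "DISTRICT": "B2"},
    {"OFFENSE_CODE_GROUP": "Residential Burglary", "DISTRICT": "B2"},
]

all_counts = count_by(filter_rows(sample, ALL_CRIMES), "DISTRICT")
robbery_counts = count_by(filter_rows(sample, "Robbery"), "DISTRICT")
print(dict(all_counts))
print(dict(robbery_counts))
```

Keeping the filter in one place means a change to the sentinel value or the filter column only has to be made once.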

### Exercise 2 Grading Notes


Final Exercise 2 Grade:

50/50

---
## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. I recommend going to the "Kernel" menu at the top and selecting "Restart & Run All". This will ensure that everything runs correctly when it is run sequentially. 