
# Advanced Data Analytics Capstone - Week 5

## Project Title: Optimizing Business Strategies Through Data Insights
This capstone project emphasizes uncovering patterns and actionable insights to optimize business decision-making. By leveraging data collection, advanced analytics, and predictive modeling, the project seeks to address critical business challenges.



## Problem Statement
In this project, the focus is on understanding consumer behavior to optimize retail strategies. By analyzing sales data, demographic information, and spatial patterns, this work aims to identify the most impactful factors driving success in key regions.


In [1]:
# Display a map of Fredericton, New Brunswick, from a remote image URL.
from IPython.display import Image

Image(url="http://www.tourismfredericton.ca/sites/default/files/fredericton_new_brunswick_map.png")

In [2]:
# Display the Fredericton neighbourhood map from a remote image URL.
from IPython.display import Image

Image(url="https://www.unb.ca/fredericton/studentservices/_resources/img/sas/neighbourhoodmap.png")

#### Main Questions to be Considered: 
1. What are the geographical coordinates of Fredericton and its neighbourhoods?

2. Which Fredericton neighbourhoods have the highest crime count?

3. What is the count of the different crime types in Fredericton?

4. Which neighbourhoods are the safest in Fredericton?

5. What insight about the different neighbourhoods can be obtained by using the Foursquare data?



### DATA 

To segment and cluster the neighbourhoods of Fredericton and understand the patterns and frequency of crime types, we use the following open datasets:

1. Crime by neighbourhood 2017 / Crime par quartier 2017 
http://data-fredericton.opendata.arcgis.com/datasets/0ff4acd0a2a14096984f85c06fe4e38e_0

2. Census Profile, 2016 Census - Fredericton [Population centre], New Brunswick and New Brunswick [Province]
https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/prof/details/page.cfm?Lang=E&Geo1=POPC&Code1=0305&Geo2=PR&Code2=13&Data=Count&SearchText=Fredericton&SearchType=Begins&SearchPR=01&B1=All&GeoLevel=PR&GeoCode=0305&TABID=1

3. Fredericton - City, New Brunswick, Canada
http://www.city-data.com/canada/Fredericton-City.html


4. Fredericton City Centre Plan
http://www.fredericton.ca/sites/default/files/pdf/2015feb18-citycentreplan-web.pdf

5. Fredericton - Wikipedia 
https://en.wikipedia.org/wiki/Fredericton


### METHODOLOGY

In this project, we performed several data science operations on the datasets. A detailed description of the steps is presented in the notebook containing the Python code. A summary of the steps or methods used in this project is listed below:

#### (1) Import Required Libraries and Packages:
The first step was to prepare our Jupyter notebook environment by importing all the libraries and packages that were used in the course of the project. Without this step, most of our Python code would not have run.

#### (2) Load each dataset
The GeoJSON neighbourhood data was loaded from the City of Fredericton open data site; it provides the geographical coordinates of each neighbourhood. The “Crime by Neighbourhood 2017” data was downloaded as a comma-separated values (CSV) file from the same site. Finally, the Fredericton Locations data was also loaded as a CSV file.
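
As a rough sketch of this loading step, the snippet below parses a tiny in-memory GeoJSON document and a short CSV text using only the standard library. The property name `Neighbourh`, the CSV columns, the sample rows, and the helper function names are all assumptions made for illustration; the real files from the open data site may be laid out differently.

```python
import csv
import io
import json

# Hypothetical stand-ins for the downloaded files; the real property
# and column names may differ.
SAMPLE_GEOJSON = """{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"Neighbourh": "Platt"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-66.66, 45.95], [-66.65, 45.95],
                                   [-66.65, 45.96], [-66.66, 45.95]]]}}
  ]
}"""

SAMPLE_CSV = """Neighbourhood,Crime_Type
Platt,THEFT OTH < $5000
Downtown,MISCHIEF TO PROP
"""


def load_neighbourhood_names(geojson_text, name_key="Neighbourh"):
    """Return the neighbourhood names found in a GeoJSON FeatureCollection."""
    collection = json.loads(geojson_text)
    return [feature["properties"][name_key] for feature in collection["features"]]


def load_crime_rows(csv_text):
    """Return the CSV rows as a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


names = load_neighbourhood_names(SAMPLE_GEOJSON)
rows = load_crime_rows(SAMPLE_CSV)
print(names)
print(rows[0]["Crime_Type"])
```

For files on disk, the same helpers could be fed the result of reading each file as text.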

#### (3) Explore the datasets
Each loaded dataset was explored to gain an initial insight into its distribution. At this stage, the data was cleaned and converted into a format suitable for the analysis. Thereafter, a choropleth map was created to view the crime count by neighbourhood. The analysis identified the 5 neighbourhoods with the highest crime count in the City of Fredericton.
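
The counting behind a "crime count by neighbourhood" view can be sketched as follows. The incident list is made-up sample data, and the function names are hypothetical; this is one simple way to rank neighbourhoods, not necessarily the approach used in the notebook.

```python
from collections import Counter
from statistics import mean

# Hypothetical incident records: one neighbourhood name per reported crime.
incidents = ["Platt", "Downtown", "Platt", "Prospect", "Downtown", "Platt"]


def crime_counts(records):
    """Count incidents per neighbourhood."""
    return Counter(records)


def top_neighbourhoods(counts, n=5):
    """Return the n neighbourhoods with the highest counts, highest first."""
    return counts.most_common(n)


counts = crime_counts(incidents)
print(top_neighbourhoods(counts, n=2))
# Average number of incidents per neighbourhood in the sample.
print(mean(list(counts.values())))
```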

#### (4) Use Foursquare data to explore specific locations
We analysed the venue data to find the most common venues at each location, using the “Fredericton Locations” data. After adding the latitude and longitude coordinates of the locations to the choropleth map of the crime counts, we investigated the 5 most popular venues in the city.
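
A minimal sketch of ranking the most common venue categories per location is shown below. It assumes the venue records have already been retrieved and reduced to (location, category) pairs; the sample pairs and the `most_common_venues` helper are hypothetical, and no Foursquare API call is made here.

```python
from collections import Counter, defaultdict

# Hypothetical (location, venue category) pairs standing in for the
# venue records retrieved for each location of interest.
venues = [
    ("Downtown", "Coffee Shop"),
    ("Downtown", "Pub"),
    ("Downtown", "Coffee Shop"),
    ("Platt", "Pharmacy"),
    ("Platt", "Coffee Shop"),
    ("Platt", "Pharmacy"),
]


def most_common_venues(records, n=5):
    """Map each location to its n most frequent venue categories."""
    by_location = defaultdict(Counter)
    for location, category in records:
        by_location[location][category] += 1
    return {
        location: [category for category, _ in counter.most_common(n)]
        for location, counter in by_location.items()
    }


ranking = most_common_venues(venues, n=2)
print(ranking["Downtown"])
print(ranking["Platt"])
```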

#### (5) Perform K-means clustering algorithm 
K-means clustering was performed on the venues at the locations of interest, which were selected based on the findings from the crime and neighbourhood analysis.
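
To illustrate the idea, here is a small plain-Python version of K-means (Lloyd's algorithm). It is a sketch only: the notebook may well have used a library implementation instead, the deterministic farthest-first seeding is a simplification chosen so the example is reproducible, and the two-dimensional feature vectors are invented sample data (for instance, the share of two venue categories at each location).

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, max_iterations=100):
    """Cluster points with Lloyd's algorithm; return (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving, so the clustering has converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical feature vectors, one per location.
points = [(0.0, 0.1), (0.1, 0.0), (0.9, 1.0), (1.0, 0.9)]
labels, centroids = k_means(points, k=2)
print(labels)
print(centroids)
```

On this sample the first two locations fall into one cluster and the last two into the other. With real venue data, the number of clusters `k` would need to be chosen and justified separately.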


### RESULT

The analysis provided graphical and quantitative insights into the 2017 crime data by neighbourhood in Fredericton. In particular, we obtained the following results:

(1.) The neighbourhoods in Fredericton were segmented and clustered, and their geographical coordinates were obtained.

(2.) CRIME COUNT BY NEIGHBOURHOOD: The analysis showed that some Fredericton neighbourhoods were less safe than others in 2017, recording higher crime counts. The five neighbourhoods with the highest crime counts were:
        i.   PLATT, which had a crime count of 198
        ii.  DOWNTOWN, which had a crime count of 127
        iii. NORTH DEVON, which had a crime count of 113
        iv.  FREDERICTON SOUTH, which had a crime count of 85
        v.   PROSPECT, which had a crime count of 81
The average crime count by neighbourhood was 66.

However, there were 10 neighbourhoods that recorded only a single crime in 2017. As a result, these neighbourhoods were presumed to be the safest in 2017. These were:

         i. Diamond Street
         ii. Doak Road
         iii. Grasse Circle
         iv. Kelly’s Court Minihome Park
         v. Knowledge Park
         vi. Regiment Creek
         vii. Saint Thomas University
         viii. Springhill
         ix. Wesbett / Case


(3.) CRIME TYPE FREQUENCY: Analysis of the crime data identified 18 unique crime types that occurred in 2017. Some of the crime types occurred more frequently than others. The five most frequent crime types were:

         i.   THEFT OTH < $5000, which had a count of 458
         ii.  THEFT FROM MV (motor vehicle) < $5000, which had a count of 356
         iii. MISCHIEF TO PROP, which had a count of 246
         iv.  B&E RESIDENCE, which had a count of 151
         v.   THEFT BIKE < $5000, which had a count of 63
The average count per crime type was 76.84.

(4.) The K-means clustering algorithm was used to cluster the neighbourhoods in Fredericton and to obtain the ten most common venues within a 1 km radius of the locations with the highest crime rate.
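
One way to apply a 1 km radius locally is to filter venues by great-circle distance with the haversine formula, as sketched below. This is an illustrative alternative, not a description of how the notebook restricted the radius. The centre coordinates are only an approximate point in Fredericton, and the venue names, coordinates, and helper names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def venues_within(centre, venues, radius_km=1.0):
    """Return the names of venues no farther than radius_km from the centre."""
    lat0, lon0 = centre
    return [
        name
        for name, lat, lon in venues
        if haversine_km(lat0, lon0, lat, lon) <= radius_km
    ]


# Approximate, illustrative coordinates only.
centre = (45.9636, -66.6431)
sample_venues = [
    ("Cafe A", 45.9640, -66.6440),  # roughly 80 m from the centre
    ("Mall B", 45.9900, -66.6431),  # roughly 3 km north of the centre
]
nearby = venues_within(centre, sample_venues)
print(nearby)
```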



### DISCUSSION

The open dataset from the City of Fredericton formed the basis of this work. Exploratory analysis of this dataset provided basic statistical insights into the crime data, visual representations of the data, and maps of the neighbourhoods.

It is worth noting that, with the right dataset available, much richer statistical analysis can be performed. Although only a basic description of the dataset was carried out in this work, it yielded some interesting findings about the crime data in the City of Fredericton.


### CONCLUSION

This work explored the “Crime by Neighbourhood 2017” open dataset provided by the City of Fredericton coupled with the Foursquare data. Amongst other key results from the analysis, we identified the frequency of various crime types, the counts of the crime by neighbourhoods, and the 10 most frequent venues by location of interest. The results obtained from the work could be used to advise new residents of Fredericton about 




## Conclusion
The analysis provided critical insights into optimizing retail strategies by uncovering the key drivers of success. The predictive model achieved high accuracy, showcasing the potential of data science to inform strategic decisions and operational improvements. The interactive dashboard further empowers stakeholders to explore and utilize these insights dynamically.
