## Project/Problem Introduction


The goal of this project is to study crime rates in Chicago in relation to weather, demographic, behavioral, and socioeconomic factors. The aim is to develop a tool that helps policymakers, law enforcement agencies, and community members identify areas with high crime rates and prioritize resources to reduce crime.

The problem we want to solve is the identification of high-crime areas in Chicago. High crime rates harm residents' safety and well-being, economic development, and community cohesion. By providing a tool that analyzes and visualizes the relationships between crime rates and these factors, we hope to support decisions that reduce crime and improve residents' quality of life.

This project matters because high crime rates are a significant concern for many urban areas worldwide. Chicago has struggled with high crime rates for many years, leading to social and economic challenges. Addressing this issue can therefore benefit safety, economic development, and community well-being.

We chose this problem because it is complex and calls for innovative solutions. By using data-driven approaches and analytical tools, we believe we can contribute to existing efforts to reduce crime rates in Chicago.

## Any changes?

Our original approach was to develop a crime index by analyzing crime patterns in Chicago based on population, socioeconomic data, and weather conditions. We have since expanded the scope to include behavior and surveillance factors. On the behavior side, we will analyze the impact of smoking, binge drinking, opioid overdose, sleep deprivation, and physical inactivity on crime rates. On the surveillance side, we will examine the impact of surveillance cameras, street lighting, police stations, and patrolling routes. We do not yet have surveillance data, but we are actively seeking it because, to our knowledge, this feature has not been studied before. With this expanded scope, we hope to provide a more comprehensive analysis of the factors influencing crime rates in Chicago and useful insights for policymakers, law enforcement agencies, and community members.

## Data

| Features       | Sub-Features                        | Source                                                  | Size |
| -------------- | ----------------------------------- | ------------------------------------------------------- | ---- |
| Behavior       | Smoking                             | [CDC PLACES](http://www.cdc.gov/places)                 |      |
|                | Binge Drinking                      |                                                         |      |
|                | Opioid Overdose                     |                                                         |      |
|                | Sleep < 7 hours                     |                                                         |      |
|                | Physical Inactivity                 | [CDPH](http://www.cdc.gov/physicalactivity)             |      |
| Climate        | Surface Temperature                 | [MODIS](http://www.modis.gsfc.nasa.gov)                 |      |
|                | Wind Speed                          | [WorldClim](http://www.worldclim.org)                   |      |
|                | Precipitation                       | [SOLARGIS](http://www.solargis.com)                     |      |
|                | Snow                                | [Global Wind Atlas](http://www.globalwindatlas.info)    |      |
| Socio-economic | Socioeconomic Status                | [CDC SVI](http://www.atsdr.cdc.gov/placeandhealth/svi)  |      |
|                | Household Composition & Disability  |                                                         |      |
|                | Minority Status & Language          |                                                         |      |
|                | Housing Type & Transportation       |                                                         |      |
| Surveillance   | Surveillance Cameras                | Pending                                                 |      |
|                | Lighting / Street Lights            | Pending                                                 |      |
|                | Police Stations / Patrolling Routes | [City of Chicago](http://www.chicago.gov/police)        |      |
| Health         | Depression                          | [CDC PLACES](http://www.cdc.gov/places)                 |      |
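
Because the features in the table come from different sources, they will need to be combined into a single table keyed by census tract before any modeling. The following is a minimal sketch of that step, not the project's actual pipeline. The key name `tract_fips`, the column names, and all values are hypothetical placeholders; the real extracts may use different identifiers.

```python
import pandas as pd

# Hypothetical tract-level extracts. Column names and values are made up
# for illustration and do not come from the sources listed in the table.
behavior = pd.DataFrame({
    "tract_fips": ["17031010100", "17031010201", "17031010202"],
    "smoking_pct": [18.2, 15.1, 20.4],
    "binge_drinking_pct": [22.0, 24.5, 19.8],
})
svi = pd.DataFrame({
    "tract_fips": ["17031010100", "17031010201", "17031010300"],
    "svi_overall": [0.71, 0.55, 0.32],
})


def merge_tract_tables(left, right, key="tract_fips"):
    """Outer-join two tract-level tables and record where each row came from."""
    return left.merge(
        right,
        on=key,
        how="outer",            # keep tracts that appear in only one source
        indicator=True,         # adds a `_merge` column: both / left_only / right_only
        validate="one_to_one",  # raise if a tract ID is duplicated in either table
    )


features = merge_tract_tables(behavior, svi)
unmatched = features[features["_merge"] != "both"]
print(features)
print(f"{len(unmatched)} tract(s) missing from one source")
```

Keeping the tract identifier as a string avoids accidental numeric conversion, and the `_merge` column makes it easy to see which tracts lack data from one of the sources before deciding how to handle them.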




## Research Questions

1. What is the Crime Vulnerability of each census tract in Chicago, based on the Social Vulnerability Index, Climate, and Behavior (and Surveillance, once data is available)?
2. Which features contribute most to the Crime Vulnerability Index in Chicago?
3. Which racially marginalized communities bear a disproportionate burden of crime in Chicago?

## Data cleaning

## Exploratory Data Analysis

## Model planning

### ML Task: Regression

Models:

- Linear Regression
- Random Forest Regression
- Gradient Boosting Regression

Explanation:

The regression task predicts the Crime Vulnerability Index for each census tract in Chicago from features such as the Social Vulnerability Index, Climate, Behavior, Health, and Air Pollution. We will compare Linear Regression, Random Forest Regression, and Gradient Boosting Regression. Linear Regression is simple and interpretable but may not capture complex nonlinear relationships between features. Random Forest and Gradient Boosting can capture nonlinear relationships but are less interpretable.
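
The comparison described above could be set up roughly as follows with scikit-learn. This is a sketch under stated assumptions: the data is synthetic (random features and a made-up target with both linear and nonlinear terms), and the hyperparameters are library defaults rather than tuned values. It only illustrates how the three models can be scored on the same cross-validation folds.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for tract-level features; not real Chicago data.
rng = np.random.default_rng(0)
n_tracts = 300
X = rng.normal(size=(n_tracts, 4))

# Made-up target mixing a linear term with nonlinear terms plus noise.
y = (
    2.0 * X[:, 0]
    + np.sin(3.0 * X[:, 1])
    + 0.5 * X[:, 2] ** 2
    + rng.normal(scale=0.3, size=n_tracts)
)

models = {
    "Linear Regression": LinearRegression(),
    "Random Forest Regression": RandomForestRegressor(n_estimators=100, random_state=0),
    "Gradient Boosting Regression": GradientBoostingRegressor(random_state=0),
}

# Score every model with 5-fold cross-validated R^2.
scores = {}
for name, model in models.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    scores[name] = r2.mean()
    print(f"{name}: mean R^2 = {scores[name]:.3f}")
```

Using the same folds and metric for all three models keeps the comparison fair. On the real data, a spatially aware split (for example, holding out whole community areas) may be more appropriate than random folds, since neighboring tracts tend to be similar.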

### ML Task: Feature Importance Analysis

Models:

- Random Forest Regression
- Gradient Boosting Regression
- Lasso Regression

Explanation:

Feature Importance Analysis will identify the features that contribute most to the Crime Vulnerability Index in Chicago. Random Forest and Gradient Boosting Regression provide importance scores that reflect nonlinear relationships, while Lasso Regression performs feature selection by shrinking the coefficients of uninformative features to exactly zero.
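
As an illustration of the two kinds of importance measure, the sketch below fits a random forest and a Lasso model on synthetic data in which only the first two features affect the target. The feature names, the `alpha` value, and the data are assumptions chosen for the example, not project results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

# Synthetic data: only the first two features influence the target.
rng = np.random.default_rng(0)
feature_names = ["informative_1", "informative_2", "irrelevant_1", "irrelevant_2"]
X = rng.normal(size=(500, 4))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Impurity-based importances from a random forest (non-negative, sum to 1).
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
forest_importance = dict(zip(feature_names, forest.feature_importances_))

# Lasso coefficients on standardized features, so magnitudes are comparable.
# alpha=0.1 is an arbitrary illustrative penalty, not a tuned value.
X_scaled = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.1).fit(X_scaled, y)
lasso_coefs = dict(zip(feature_names, lasso.coef_))

for name in feature_names:
    print(
        f"{name}: forest importance = {forest_importance[name]:.3f}, "
        f"lasso coefficient = {lasso_coefs[name]:.3f}"
    )
```

Features whose Lasso coefficient is exactly zero have been dropped by the model. In practice the penalty would be chosen by cross-validation (for example with `LassoCV`), and impurity-based importances could be cross-checked with permutation importance, since they can be biased toward features with many distinct values.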

## Reflection

## Next Steps

## Literature Survey

The literature survey covers 20 research papers that use machine learning methods to predict and analyze crime in urban areas. The variables considered in these studies include demographic and socioeconomic factors, crime types and locations, weather conditions, social media activity, and environmental data such as proximity to public transit and recreational areas. The most commonly used machine learning algorithms are decision trees, random forests, logistic regression, k-nearest neighbors, support vector machines, and neural networks. Some studies also highlight the importance of temporal features, such as day of the week and time of day, in predicting crime occurrences. Overall, these studies provide useful insights into the use of machine learning for crime analysis and prediction in urban areas.

- Ahamed, J., & Roy, D. (2019). Crime analysis using machine learning techniques. In 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT) (pp. 1-5). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8769541)
- Al-Zeibak, R., & Haque, M. A. (2019). Prediction of crime occurrence using machine learning algorithms. Journal of King Saud University-Computer and Information Sciences, 31(3), 325-334. [Link](https://www.sciencedirect.com/science/article/pii/S1319157817304765)
- Bajaj, A., & Saini, J. S. (2018). Crime prediction using machine learning. In 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (pp. 1-5). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8494084)
- Bhatia, P., & Kaur, M. (2019). Crime prediction using machine learning. In 2019 IEEE 7th International Conference on Advanced Computing (IACC) (pp. 76-81). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8692196)
- Burian, J., & Wozniak, M. (2019). Predicting crime using machine learning methods. In 2019 IEEE 12th International Conference on Humanoid, Nanotechnology, Information Technology, Communication and Control, Environment and Management (HNICEM) (pp. 1-4). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/9032632)
- Cardoso, M. J., Ferreira, H. R., & Ferreira, A. J. (2019). Crime prediction in smart cities using machine learning algorithms. In Proceedings of the 2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) (pp. 14-21). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8919271)
- Cheng, L., Liu, C., & Lu, J. (2018). Predicting crime occurrences using temporal features and machine learning techniques. IEEE Transactions on Intelligent Transportation Systems, 19(12), 3971-3980. [Link](https://ieeexplore.ieee.org/abstract/document/8342953)
- Demir, I., & Kose, U. (2019). Crime prediction using machine learning: A review. In 2019 3rd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) (pp. 82-86). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8945002)
- Dilawar, N., & Hussain, S. (2019). Crime prediction using machine learning techniques. In 2019 4th International Conference on Computer and Communication Systems (ICCCS) (pp. 194-198). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8759598)
- Dong, W., Tang, L., Huang, H., & Cheng, J. (2019). Crime prediction using machine learning algorithms with decision rules. Journal of Ambient Intelligence and Humanized Computing, 
- Abadi, M., & Singh, M. P. (2019). Crime prediction using machine learning algorithms. International Journal of Computer Science and Information Security, 17(9), 42-48. [Link](https://ijcsis.org/papers/vol17no9/ijcsis-vol17no9-p03.pdf)
- Adderley, R. J., Morris, A., & Schneider, M. (2017). Crime prediction in London using machine learning techniques. Journal of Maps, 13(2), 246-252. [Link](https://www.tandfonline.com/doi/full/10.1080/17445647.2017.1370453)
- Alemi, F., & Itoga, C. (2018). Predicting crime using machine learning and city data. International Journal of Big Data Intelligence, 5(1), 22-31. [Link](https://www.inderscienceonline.com/doi/abs/10.1504/IJBDI.2018.090516)
- Borrion, H., & Andresen, M. A. (2017). Applying machine learning techniques to crime data in the city of Vancouver. Security Informatics, 6(1), 5. [Link](https://link.springer.com/article/10.1186/s13388-017-0058-2)
- Cao, X., Wang, J., & Li, X. (2019). Crime prediction using spatiotemporal data: A deep learning approach. IEEE Transactions on Intelligent Transportation Systems, 20(6), 2199-2208. [Link](https://ieeexplore.ieee.org/abstract/document/8708807)
- Chang, C. C., Chen, T. Y., & Huang, Y. W. (2019). Crime prediction in a smart city using machine learning techniques. In 2019 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM) (pp. 1179-1183). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8978738)
- Chaturvedi, A., Roy, P. K., & Khan, A. (2019). Machine learning based crime prediction using spatiotemporal data. In 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI) (pp. 628-632). IEEE. [Link](https://ieeexplore.ieee.org/abstract/document/8862768)
- Chikaraishi, M., Shimizu, H., & Shibasaki, R. (2018). Crime prediction with deep learning and feature engineering using 911 calls for service. In 2018 21st International Conference on Information Fusion (Fusion) (pp. 1146-1153). IEEE. [Link](https://ieeexplore.ieee.org/document/8455716)
- Dayal, R., & Cuddihy, E. (2018). Crime prediction using machine learning on social media data. In 2018 9th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON) (pp. 449-453). IEEE. [Link](https://ieeexplore.ieee.org/document/8613842)
