This project analyzes and predicts crime occurrences in Chicago using machine learning. The workflow covers data cleaning, preprocessing, feature engineering, and the application of several classification models: Random Forest, Logistic Regression, Decision Tree, Support Vector Machine (SVM), and k-Nearest Neighbors (KNN). The notebook demonstrates exploratory data analysis, outlier detection, feature importance visualization, and model evaluation using accuracy, precision, recall, F1 score, and confusion matrices.

Key features:

- Data cleaning and preprocessing of real-world crime data
- Encoding categorical features and scaling numerical features
- Visualization of feature correlations and outliers
- Implementation and comparison of multiple classification algorithms
- Model performance evaluation and optimization
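
The encoding, scaling, and model comparison steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is synthetic, and the column names (`location_type`, `hour`, `district`, `arrest`) and the target definition are hypothetical stand-ins chosen only so that the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the crime data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "location_type": rng.choice(["STREET", "RESIDENCE", "APARTMENT"], size=n),
    "hour": rng.integers(0, 24, size=n),
    "district": rng.integers(1, 26, size=n),
})
# Hypothetical binary target, tied to the features so the models have signal.
df["arrest"] = ((df["hour"] > 18) | (df["location_type"] == "STREET")).astype(int)

X = df.drop(columns="arrest")
y = df["arrest"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# One-hot encode the categorical column and standardize the numeric ones.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["location_type"]),
    ("num", StandardScaler(), ["hour", "district"]),
])

models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model behind the same preprocessing step and collect test metrics.
results = {}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("clf", model)])
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

print(pd.DataFrame(results).T.round(3))
```

Wrapping the preprocessing and the classifier in one `Pipeline` means the encoder and scaler are fitted on the training split only, which avoids leaking test-set statistics into the scaled features.
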

Technologies used:

- Python
- Pandas
- NumPy
- Seaborn
- Matplotlib
- scikit-learn

To reproduce the analysis and predictions, open the notebook and run the cells in order. Adjust parameters and models as needed for further experimentation.
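
As one possible way to experiment with parameters, the sketch below tunes a Random Forest with scikit-learn's `GridSearchCV` and reports a confusion matrix and per-class metrics. It assumes nothing about the notebook's own settings: the data comes from `make_classification`, and the parameter grid is an arbitrary example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data standing in for the prepared features.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Example grid only; the values are illustrative, not recommended settings.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

# 3-fold cross-validated search, selecting the combination with the best F1.
search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=3, scoring="f1"
)
search.fit(X_train, y_train)

# GridSearchCV refits the best model on the full training split by default.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

print("Best parameters:", search.best_params_)
print(cm)
print(classification_report(y_test, y_pred))
```

The same pattern applies to the other classifiers by swapping the estimator and the grid, for example `C` and `kernel` for `SVC` or `n_neighbors` for `KNeighborsClassifier`.
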