The dataset contains details about bank customers who are closing their accounts. It includes personal information such as Age, Income, Sex, and the type of credit card used.
The goal of this repository is to predict customer attrition after adding noise to the customers' personal details.
This project addresses the following Data Analysis topics:
Learn about the dataset:
- Is there missing data?
- Is it categorical/ordinal?
- Plotting relational heatmaps of features
- Dropping some of the columns that may not contribute much to our analysis
- Encoding features into ordinal values
- Encoding features using One Hot Encoding
- Adding noise to features using the Differential Privacy Library
- Prediction
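
The topics above can be sketched end to end on a small synthetic table. This is only an illustration under stated assumptions, not the notebook's actual code: the column names (`Age`, `Sex`, `Income`, `CardType`, `Attrited`), the category values, and the helper `add_laplace_noise` are hypothetical, and the noise step uses a plain NumPy Laplace mechanism as a stand-in for Diffprivlib. Because the labels are random, the resulting accuracy carries no meaning.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical toy data; the real dataset's columns and values may differ.
n = 200
df = pd.DataFrame({
    "Age": rng.integers(25, 70, size=n),
    "Sex": rng.choice(["M", "F"], size=n),
    "Income": rng.choice(["Low", "Medium", "High"], size=n),
    "CardType": rng.choice(["Blue", "Silver", "Gold"], size=n),
    "Attrited": rng.integers(0, 2, size=n),
})

# Is there missing data? Count missing values per column.
missing_per_column = df.isna().sum()

# Ordinal encoding: Income has a natural order, so map it to integers.
income_order = {"Low": 0, "Medium": 1, "High": 2}
df["Income"] = df["Income"].map(income_order)

# One-hot encoding: Sex and CardType have no natural order.
df = pd.get_dummies(df, columns=["Sex", "CardType"], dtype=float)

# Pairwise correlations; seaborn.heatmap(corr) would plot this matrix.
corr = df.corr()


def add_laplace_noise(values, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to each value."""
    scale = sensitivity / epsilon
    return values + rng.laplace(loc=0.0, scale=scale, size=len(values))


# Assumed Age bounds of [25, 70] give a sensitivity of 45; epsilon is arbitrary.
df["Age"] = add_laplace_noise(
    df["Age"].to_numpy(dtype=float), sensitivity=45.0, epsilon=1.0, rng=rng
)

# Prediction on the noised features.
X = df.drop(columns="Attrited")
y = df["Attrited"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"Test accuracy on noised features: {accuracy:.2f}")
```

A smaller `epsilon` gives a larger noise scale, so privacy improves while the features become less informative for the classifier.
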
Prerequisites and installation:
- Python 3 and pip.
- Set up a virtual environment (optional, but recommended).
- Install dependencies using pip: pip install -r requirements.txt
Dependencies:
- Seaborn
- Matplotlib
- Diffprivlib
- scikit-learn
- NumPy
- Pandas
- Jupyter
- Python 3