We are building a predictive classification model that, based on a system's configuration, predicts whether the system is likely to be attacked by malware.
The dataset contains 82 attributes.
- Data Cleaning
- Handling Missing Values
- Skewness
- Categorical Data Handling
- Category reduction
- Case-sensitive merging
- Special Character handling
- Target variable
- Distribution & Bias
- EDA Tasks
- Data Visualisation
- Dataframe modification and conversion
- Train Test Split
- Storage in S3 bucket
- Accuracy
- F1 Score
- ROC curve and AUC
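
The categorical-handling steps (case-sensitive merging, special character handling, category reduction) and the evaluation metrics listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual code: the function names, the `min_count` threshold, the `"other"` bucket, and the 1 = attacked / 0 = not attacked label encoding are all hypothetical, and a real pipeline would more likely rely on a dataframe and ML library.

```python
import re
from collections import Counter


def normalize_category(value):
    """Trim, lower-case, and strip special characters from one label."""
    text = str(value).strip().lower()
    # Collapse every run of non-alphanumeric characters into one underscore.
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def reduce_categories(values, min_count=2, other="other"):
    """Normalize labels, then merge rare ones into a single bucket.

    min_count and the bucket name are assumptions chosen for illustration.
    """
    cleaned = [normalize_category(v) for v in values]
    counts = Counter(cleaned)
    return [v if counts[v] >= min_count else other for v in cleaned]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    for small examples only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical raw values for a single categorical attribute.
raw = ["Windows-Defender", "windows defender", " WINDOWS_DEFENDER ",
       "Avast!", "avast", "RareAV"]
print(reduce_categories(raw))

# Hypothetical labels, hard predictions, and predicted probabilities.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
y_score = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, y_score))
```

Accuracy alone can be misleading when the target variable is imbalanced, which is why the outline pairs it with F1 and ROC/AUC; the "Distribution & Bias" check on the target is what tells you how much weight to give each metric.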