Predicting whether a system is vulnerable to malware. Analysis of different models.

amit-singh-rathore/malware-detection

Malware Detection

Description

We are trying to build a predictive classification model that, based on a system's configuration, predicts whether the system is likely to be attacked by malware.

Data Definition

Dataset contains 82 attributes.

Data Preparation

  1. Data Cleaning
    • Handling Missing Values
    • Skewness
  2. Categorical Data Handling
    • Category reduction
    • Merging categories that differ only in case
    • Special Character handling
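
The preparation steps above can be sketched roughly as follows. This is a minimal illustration in plain Python rather than the repository's own code: the column names (`av_product`, `free_disk_gb`), the sample rows, the median imputation, the `log1p` transform, and the `MIN_COUNT` threshold are all assumptions chosen for the example.

```python
import math
import re
import statistics
from collections import Counter

# Hypothetical rows; the column names are illustrative, not taken from the dataset.
rows = [
    {"av_product": "Windows Defender", "free_disk_gb": 120.0},
    {"av_product": "windows_defender", "free_disk_gb": None},
    {"av_product": "WINDOWS-DEFENDER", "free_disk_gb": 4000.0},
    {"av_product": "Avast!", "free_disk_gb": 80.0},
    {"av_product": None, "free_disk_gb": 100.0},
]


def normalise(value):
    """Lower-case a label and collapse runs of special characters into '_'."""
    if value is None:
        return "missing"  # treat a missing category as its own level
    cleaned = re.sub(r"[^a-z0-9]+", "_", value.lower())
    return cleaned.strip("_")


# Missing numeric values: impute with the median of the observed values.
observed = [r["free_disk_gb"] for r in rows if r["free_disk_gb"] is not None]
fill_value = statistics.median(observed)

for r in rows:
    if r["free_disk_gb"] is None:
        r["free_disk_gb"] = fill_value
    # Skewness: log1p compresses a long right tail.
    r["free_disk_log"] = math.log1p(r["free_disk_gb"])
    # Case merging and special-character handling in one pass.
    r["av_product"] = normalise(r["av_product"])

# Category reduction: levels seen fewer than MIN_COUNT times become "other".
MIN_COUNT = 2  # assumed threshold
counts = Counter(r["av_product"] for r in rows)
for r in rows:
    if counts[r["av_product"]] < MIN_COUNT:
        r["av_product"] = "other"

print([r["av_product"] for r in rows])
```

Normalising before counting matters here: the three spellings of the same product only reach the `MIN_COUNT` threshold once they have been merged.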

Data Exploration

  1. Target variable
    • Distribution & Bias
  2. EDA Tasks
  3. Data Visualisation
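
A quick way to check the target variable for bias is to look at each class's share of the rows. The sketch below assumes a binary target where 1 means a malware detection and 0 means none; the labels are made-up sample values, and the helper names are hypothetical.

```python
from collections import Counter

# Hypothetical target values: 1 = malware detected, 0 = not detected.
labels = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]


def class_distribution(values):
    """Return each class's share of the rows, keyed by class."""
    counts = Counter(values)
    total = len(values)
    return {cls: n / total for cls, n in sorted(counts.items())}


def imbalance_ratio(values):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(values)
    return max(counts.values()) / min(counts.values())


dist = class_distribution(labels)
ratio = imbalance_ratio(labels)
print(dist, ratio)
```

A ratio close to 1 suggests a balanced target. A large ratio is a hint that accuracy alone may be misleading and that stratified splitting is worth considering.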

Data Preparation for SageMaker

  1. Dataframe modification and conversion
  2. Train Test Split
  3. Storage in S3 bucket
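
The conversion and split steps might look like the sketch below. It assumes a SageMaker built-in algorithm such as XGBoost, which reads CSV training data with no header row and the label in the first column; the README does not say which algorithm is used. The helper names, the 80/20 split, and the sample rows are hypothetical. The S3 upload is left out because it needs credentials and a bucket name that the README does not give.

```python
import csv
import io
import random


def stratified_split(rows, label_key, test_fraction=0.2, seed=42):
    """Split rows into train and test, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    train, test = [], []
    for members in by_class.values():
        members = members[:]  # copy so the caller's list order is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def to_headerless_csv(rows, label_key, feature_keys):
    """Serialise rows as CSV text: label first, then features, no header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        writer.writerow([row[label_key]] + [row[k] for k in feature_keys])
    return buffer.getvalue()


# Hypothetical, already-numeric rows with a binary label.
data = [{"label": i % 2, "f1": i, "f2": i * 0.5} for i in range(20)]

train, test = stratified_split(data, "label")
train_csv = to_headerless_csv(train, "label", ["f1", "f2"])
test_csv = to_headerless_csv(test, "label", ["f1", "f2"])
print(len(train), len(test))
```

The resulting CSV text would then be written to files and uploaded to the bucket under separate train and test prefixes.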

Model Training in SageMaker

Model Inference

Model Performance

  1. Accuracy
  2. F1 score
  3. ROC curve and AUC
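
For reference, the three metrics can be computed from first principles as in the sketch below. The labels and scores are made-up sample values, the 0.5 decision threshold is an assumption, and the AUC is computed through its pairwise-ranking interpretation rather than by tracing the ROC curve. In practice a library such as scikit-learn would normally be used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This equals the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores.
y_true = [0, 0, 1, 1, 1, 0]
scores = [0.1, 0.6, 0.35, 0.8, 0.9, 0.2]
y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, f1, auc)
```

Accuracy and F1 depend on the chosen threshold, while AUC summarises ranking quality across all thresholds, which is why the three are usually reported together.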

Feature importance

Pandas Profiling top 15 features
