This project focuses on classifying deepfake media using a machine learning model. The objective is to analyze a dataset containing metadata about various media files and predict whether they are real or fake using a Random Forest Classifier.
The dataset deepfake_detection_metadata_dataset.csv contains 1000 rows of media metadata. It includes the following features:
- media_type: Image or Video
- content_category: News, Social Media, Interview, Political Speech
- face_count: Number of faces detected in the media
- audio_present: Whether audio is present
- lip_sync_score: Assessment of the lip sync quality
- visual_artifacts_score: Score indicating the presence of visual artifacts
- compression_level: Level of compression applied to the media
- lighting_inconsistency_score: Score evaluating lighting inconsistencies
- source_platform: Social media or news platform where the media was sourced
- label: Real or Fake
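As a quick sanity check on the schema, the columns above can be inspected with pandas. The snippet below builds a few synthetic rows standing in for deepfake_detection_metadata_dataset.csv (the values are invented for illustration), so it runs without the file present; in the notebook you would instead call pd.read_csv on the CSV.

```python
import pandas as pd

# Synthetic stand-in rows matching the dataset description; the real
# notebook loads deepfake_detection_metadata_dataset.csv instead.
df = pd.DataFrame({
    "media_type": ["Image", "Video", "Video"],
    "content_category": ["News", "Interview", "Political Speech"],
    "face_count": [1, 2, 1],
    "audio_present": [False, True, True],
    "lip_sync_score": [0.0, 0.72, 0.35],
    "visual_artifacts_score": [0.10, 0.55, 0.80],
    "compression_level": [3, 7, 5],
    "lighting_inconsistency_score": [0.05, 0.40, 0.65],
    "source_platform": ["NewsSite", "SocialApp", "SocialApp"],
    "label": ["Real", "Fake", "Fake"],
})

# Shape, column dtypes, and class balance of the target
print(df.shape)
print(df.dtypes)
print(df["label"].value_counts())
```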
The analysis runs in a Jupyter Notebook (deepfake_analysis.ipynb) and covers the following steps:
- Data Loading and Exploration: Loading the data using pandas.
- Data Cleaning: Dropping irrelevant columns.
- Encoding: Converting categorical data into numeric values (One-Hot Encoding) and mapping the target label (Real to 0, Fake to 1).
- Feature Scaling: Preprocessing numerical features using StandardScaler.
- Model Training: Splitting the data into training (80%) and testing (20%) sets, then training a RandomForestClassifier.
- Evaluation: The model is evaluated on the test set. An initial run using all features performs exceptionally well, while a more realistic evaluation that excludes the directly predictive artifact scores performs markedly worse, demonstrating the challenges in deepfake detection.
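The steps above can be sketched end-to-end as follows. A small randomly generated frame stands in for the real CSV so the sketch is self-contained; the column subset, random seeds, and forest settings are illustrative assumptions, not the notebook's exact configuration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for deepfake_detection_metadata_dataset.csv
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "media_type": rng.choice(["Image", "Video"], n),
    "face_count": rng.integers(1, 4, n),
    "lip_sync_score": rng.random(n),
    "visual_artifacts_score": rng.random(n),
    "label": rng.choice(["Real", "Fake"], n),
})

# Encoding: one-hot the categorical column, map the target (Real -> 0, Fake -> 1)
X = pd.get_dummies(df.drop(columns="label"), columns=["media_type"])
y = df["label"].map({"Real": 0, "Fake": 1})

# 80/20 train/test split, then scale features (fit the scaler on train only)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train and evaluate the Random Forest
clf = RandomForestClassifier(random_state=42)
clf.fit(X_train_scaled, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test_scaled)))
```

Because the synthetic labels are random, the printed accuracy here hovers near chance; on the real metadata the notebook's all-features run scores much higher.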
- Python 3
- pandas
- scikit-learn
- Jupyter Notebook
- Activate the provided virtual environment (venv).
- Ensure required packages are installed (e.g., pip install pandas scikit-learn).
- Start a Jupyter server and open deepfake_analysis.ipynb.
- Run the notebook cells sequentially to reproduce the workflow.
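On a Unix-like system, the setup steps above correspond roughly to the commands below (the venv path and the extra notebook package are assumptions; adjust for your environment):

```shell
# Activate the project's virtual environment (path assumed to be ./venv)
source venv/bin/activate

# Install the required packages, plus Jupyter itself if not already present
pip install pandas scikit-learn notebook

# Launch Jupyter and open the analysis notebook
jupyter notebook deepfake_analysis.ipynb
```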