This project aims to build and train a model to detect AI-generated images.
The code combines image embeddings, an autoencoder for dimensionality reduction, and a Random Forest Classifier to identify AI-generated images. The pipeline has four steps:
- Data Preparation: Extracts images from zip files, generates image embeddings using the 'clip-ViT-L-14' model, and creates datasets of embeddings and labels (real/fake).
- Autoencoder Training: Trains an autoencoder to reduce the dimensionality of the image embeddings.
- Random Forest Classifier Training: Trains a Random Forest Classifier using the encoded embeddings to classify images as real or fake.
- Inference: Loads the trained models, generates embeddings for test images, encodes them using the autoencoder, and predicts the labels using the Random Forest Classifier.
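
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`read_zip_images`, `embed_images`, `train_models`, `predict_labels`), the accepted image extensions, the single-layer autoencoder, and the values `encoding_dim=64`, `epochs=20`, and `n_estimators=100` are all assumptions chosen for the example. Heavy libraries are imported inside the functions that need them, so the zip helper can be used on its own.

```python
import io
import zipfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed set of image formats


def read_zip_images(zip_source):
    """Return {entry name: raw bytes} for every image file in a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        return {
            name: archive.read(name)
            for name in sorted(archive.namelist())
            if name.lower().endswith(IMAGE_EXTENSIONS)
        }


def embed_images(raw_images, model_name="clip-ViT-L-14"):
    """Embed raw image bytes with a CLIP model loaded via sentence_transformers."""
    from PIL import Image
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    images = [Image.open(io.BytesIO(data)).convert("RGB") for data in raw_images]
    return model.encode(images)  # array of shape (n_images, embedding_dim)


def train_models(embeddings, labels, encoding_dim=64, epochs=20):
    """Fit an autoencoder on the embeddings, then a Random Forest on its codes."""
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from tensorflow import keras

    embeddings = np.asarray(embeddings, dtype="float32")
    n_features = embeddings.shape[1]

    # Assumed architecture: one dense bottleneck layer and a linear decoder.
    inputs = keras.Input(shape=(n_features,))
    code = keras.layers.Dense(encoding_dim, activation="relu")(inputs)
    outputs = keras.layers.Dense(n_features)(code)
    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(embeddings, embeddings, epochs=epochs, verbose=0)

    # The encoder shares weights with the trained autoencoder.
    encoder = keras.Model(inputs, code)
    forest = RandomForestClassifier(n_estimators=100, random_state=0)
    forest.fit(encoder.predict(embeddings, verbose=0), labels)
    return encoder, forest


def predict_labels(encoder, forest, embeddings):
    """Encode new embeddings and classify them with the trained forest."""
    import numpy as np

    encoded = encoder.predict(np.asarray(embeddings, dtype="float32"), verbose=0)
    return forest.predict(encoded)
```

In this sketch, one archive of real images and one of AI-generated images would each be passed through `read_zip_images` and `embed_images`, the resulting arrays stacked, and a matching label list (for example 0 for real, 1 for fake, an assumed convention) handed to `train_models`.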
- Python 3.x
- sentence_transformers
- Pillow (PIL)
- TensorFlow
- scikit-learn
- pandas
- numpy
- matplotlib
- requests
- zipfile, io, os (part of the Python standard library; no installation needed)
- Install Dependencies: