This project contains the code for a machine learning model that classifies images as human or AI-generated. It consists of three distinct models, a Convolutional Neural Network (CNN), a Support Vector Machine (SVM), and a Generative Adversarial Network (GAN). It is NOT an ensemble model, but an investigation into the performance of different models on the same dataset.
main.py
: Main script to run the projectsrc
: Source Foldercnn
: CNN modelCNNmodel.py
: Primary code for the CNN model, containing preprocessing, training, and testing functions
svm
: SVM modelmodel.py
: Primary code for the SVM model, containing preprocessing, training, and testing functions
gan
: GAN modelmodel_gan.py
: Primary code for the GAN model, containing preprocessing, training, and testing functionsmodel
: Folder containing the GAN modelimages
: Folder containing the GAN generated images for each epoch
datasets
: Folder containing the dataset utilities for training and testingfake_dataset.py
: Class representing the streamed fake dataset - identical data types as real datasetreal_dataset.py
: Class representing the streamed real dataset - identical data types as fake datasetTrain_GCC-training.tsv
: Training dataset, consists of URLs and captionsValidation_GCC-1.1.0-Validation.tsv
: Validation dataset, consists of URLs and captionsataset_InfImagine___fake_image_dataset_default_0.0.0_4edf8865bd54ae2acf4708b709b74b6086b61cba.lock
: Fake image dataset, small file containing information retrieved from the HuggingFace datasets librarytest
: Folder containing a 20k test dataset, downloaded and saved from the full datasettrain
: Folder containing a 100k train dataset, downloaded and saved from the full dataset
Additional files in the root directory include
requirements.txt
: Required packages for the projectREADME.md
: This file!job.sbatch
: Slurm job script for running the project on the ICE clusterReport-XXXXXXX.out
: Output file from running the project on the ICE clustergitattributes
: Git LFS file for tracking large files - in our case the training TSVgitignore
: Generic gitignore
Report documentation is found in the astro-docs
branch, with content served on a GH pages site, available here