# Introduction

## Foreword

This notebook contains Chris's iterations of neural network (NN) models.

Outputs are exported to `../Results/model_scores.csv`.

## Problem Statement

The rapid evolution of generative artificial intelligence (GPAI, LLMs) and social media has greatly increased the public’s access to powerful, deceptive tools. One such concern is the increasing prevalence of deepfake images, which pose a significant threat to public trust and undermine the epistemic integrity of visual media (Source).

These manipulated images can be used to spread false information, manipulate public opinion, and polarize communities, with serious consequences for both social and political discourse. In this project, we aim to develop a machine learning model that can distinguish deepfakes from real images, in order to combat the spread of manipulated visual media and protect the integrity of social discourse.

## Imports, Global Variables, and Helper Functions

In [8]:
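# Importing global variables (config lives in ../Helper)
import sys
sys.path.append('../Helper')
import config

# Assumes the TensorFlow-bundled Keras API
from tensorflow.keras.utils import set_random_seed

# Setting the random seed for reproducibility
set_random_seed(config.random_seed_value)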

## Data Loading

In [9]:
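# Extracting image dimensions from a sample image
# Note: img_to_array returns (height, width, channels), so `w` holds the
# height and `h` the width; both are passed downstream in this same order.
from tensorflow.keras.utils import img_to_array, load_img
image_shape = img_to_array(load_img("../../Data/Train/Real/real_1.jpg")).shape
w = image_shape[0]
h = image_shape[1]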

In [10]:
# Setting filepaths to image data
train_directory = "../../Data/Train"
validation_directory = "../../Data/Validation/"
test_directory = "../../Data/Test/"
directories = [train_directory, validation_directory, test_directory]

In [11]:
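# Loading data (one batched dataset per split: train, validation, test)
from tensorflow.keras.utils import image_dataset_from_directory
mega_data = []
for directory in directories:
    mega_data.append(image_dataset_from_directory(
        directory=directory,
        image_size=(w, h),
        batch_size=64,
        seed=config.random_seed_value,
        label_mode='binary'
    ))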

Found 140002 files belonging to 2 classes.
Found 39428 files belonging to 2 classes.
Found 10905 files belonging to 2 classes.
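
As a quick sanity check on the loader output above, the number of batches per epoch follows from the file counts and `batch_size=64`. The sketch below assumes the final, smaller batch is kept rather than dropped, which is consistent with the 2188 steps shown in the training log further down. The helper name `batches_per_epoch` is made up for illustration.

```python
import math

BATCH_SIZE = 64  # same value passed to image_dataset_from_directory

# File counts reported by the loader for each split
file_counts = {"train": 140002, "validation": 39428, "test": 10905}


def batches_per_epoch(n_files: int, batch_size: int = BATCH_SIZE) -> int:
    """Number of batches when the last, smaller batch is kept."""
    return math.ceil(n_files / batch_size)


steps = {split: batches_per_epoch(n) for split, n in file_counts.items()}
print(steps)  # {'train': 2188, 'validation': 617, 'test': 171}
```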


In [12]:
# Assigning descriptive names to each split
train_ds = mega_data[0]
val_ds = mega_data[1]
test_ds = mega_data[2]

# Modeling

## Baseline Model (Sequential)

### Preprocessing (Topology + Compiling)

In [22]:
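# imports (assumes the TensorFlow-bundled Keras API)
from tensorflow.keras import Sequential, Input
from tensorflow.keras.layers import (Rescaling, BatchNormalization, Dropout,
                                     Conv2D, MaxPooling2D, Flatten, Dense)
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import BinaryCrossentropy

# instantiate
model = Sequential()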

# input layer
model.add(Input(shape=(w, h, 3)))
model.add(Rescaling(1./255))
model.add(BatchNormalization())

# convolutional layers
model.add(Dropout(0.3))
model.add(Conv2D(64, (3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Dropout(0.3))
model.add(Conv2D(32, (3,3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten())

# hidden layers
model.add(Dropout(0.3))
model.add(Dense(64, activation="relu"))
model.add(BatchNormalization())

model.add(Dropout(0.3))
model.add(Dense(32, activation="relu"))
model.add(BatchNormalization())

# output layer
model.add(Dense(1, activation="sigmoid"))

# compile
model.compile(
    optimizer=Adam(learning_rate=0.005),
    loss=BinaryCrossentropy(),
    metrics=config.standard_metrics
)
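
To get a feel for the size of this topology before running `model.summary()`, the layer shapes and weight counts can be worked out by hand. The sketch below is only illustrative: the notebook reads the real image size from disk, so the 256x256 input used here is an assumption, and the helper functions are made up for this example. It relies on the Keras defaults for `Conv2D` (no padding, stride 1) and `MaxPooling2D` (stride equal to the pool size), and it leaves out the small `BatchNormalization` parameter counts.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Output length of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_out(size: int, pool: int = 2) -> int:
    """Output length of non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(in_channels: int, filters: int, kernel: int = 3) -> int:
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(n_inputs: int, units: int) -> int:
    """Weight matrix plus one bias per unit."""
    return n_inputs * units + units


# ASSUMPTION: 256x256 RGB input; the real size comes from the sample image.
height = width = 256

# Two Conv2D(3x3) + MaxPooling2D(2x2) stages
h1, w1 = pool_out(conv_out(height)), pool_out(conv_out(width))
h2, w2 = pool_out(conv_out(h1)), pool_out(conv_out(w1))
flat = h2 * w2 * 32  # length of the Flatten output (32 filters in last conv)

params = {
    "conv2d_64": conv_params(3, 64),
    "conv2d_32": conv_params(64, 32),
    "dense_64": dense_params(flat, 64),
    "dense_32": dense_params(64, 32),
    "dense_1": dense_params(32, 1),
}
print(flat, params)
```

Under this assumed input size, almost all of the weights sit in the first `Dense` layer after `Flatten`, so the image resolution largely determines the model size.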

### Fitting

In [23]:
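# Reduce the learning rate when validation loss plateaus
# (early stopping is currently disabled)
from tensorflow.keras.callbacks import ReduceLROnPlateau
# es = EarlyStopping(patience=3)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                              patience=5, min_lr=0.0005)
res = model.fit(train_ds,
                validation_data=val_ds,
                epochs=20,
                callbacks=[reduce_lr])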

Epoch 1/20
   3/2188 [..............................] - ETA: 2:11:14 - loss: 0.9890 - binary_accuracy: 0.5078 - auc: 0.4889 - precision: 0.4922 - recall: 0.5081 - true_negatives: 67.0000 - true_positives: 63.0000 - false_positives: 65.0000 - false_negatives: 61.0000

KeyboardInterrupt: 
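
The effect of the `ReduceLROnPlateau` settings above can be previewed without training. The sketch below assumes the callback's documented behaviour of multiplying the current rate by `factor` on each plateau and never going below `min_lr`. The function name `lr_after_reductions` is hypothetical.

```python
initial_lr = 0.005  # Adam learning rate from the compile step
factor = 0.2        # ReduceLROnPlateau factor
min_lr = 0.0005     # ReduceLROnPlateau floor


def lr_after_reductions(n: int) -> float:
    """Learning rate after n plateau-triggered reductions."""
    lr = initial_lr
    for _ in range(n):
        lr = max(lr * factor, min_lr)  # shrink, but never below the floor
    return lr


schedule = [lr_after_reductions(n) for n in range(4)]
print(schedule)
```

With these values the floor is one tenth of the starting rate, so the second reduction is already clipped to `min_lr` and any later plateau leaves the rate unchanged. Since each reduction needs 5 epochs without improvement in `val_loss`, a 20-epoch run leaves room for only a few reductions in any case.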

### Evaluation

In [None]:
model.summary()
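
Because training was interrupted, there is no final score to evaluate yet, but the partial log line in the Fitting section still works as a check on how the reported metrics relate to each other. The sketch below recomputes accuracy, precision, and recall from the confusion counts in that line, using the standard definitions; it makes no claim about how `config.standard_metrics` is defined internally.

```python
# Confusion counts copied from the interrupted training log
tn, tp, fp, fn = 67, 63, 65, 61

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found

print(round(accuracy, 4), round(precision, 4), round(recall, 4))
```

The rounded values match the `binary_accuracy`, `precision`, and `recall` figures in the log, and all three sit near 0.5, which is what an untrained binary classifier would be expected to produce.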

In [None]:
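# Saving results of model to csv
# Note: `res` is a Keras History object with no to_csv method, so its
# per-epoch `history` dict is wrapped in a DataFrame first (needs pandas as pd)
# pd.DataFrame(res.history).to_csv("../Results/model_scores.csv", mode="a")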

### Graphing

In [None]:
pass

In [None]:
# Saving image
# plt.savefig("../../Images/_.png")