# 🔥 AI-Generated vs. Real Image Classification (CIFAKE Dataset)
### **Built with CNN & Transfer Learning (ResNet50) + Explainability (Grad-CAM)**

**Author:** Your Name  
**Objective:** Train a CNN to classify AI-generated vs. real images using the CIFAKE dataset.

**🔹 Techniques Used:**
- Convolutional Neural Networks (CNNs)
- Transfer Learning with ResNet50
- Data Augmentation & Dropout for Overfitting Prevention
- Hyperparameter Optimization (Adam, Learning Rate Scheduling)
- Evaluation Metrics (Confusion Matrix, ROC-AUC, Classification Report)
- Explainability with Grad-CAM

In [1]:
# 📌 Step 1: Import Libraries
import numpy as np
import pandas as pd
import os
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report


## 📌 Step 2: Load the CIFAKE Dataset

In [3]:
import os
import pandas as pd

# Define the dataset directory
data_dir = '/kaggle/input/cifake-real-and-ai-generated-synthetic-images'

# Define train and test directories
train_dir = os.path.join(data_dir, 'train')
test_dir = os.path.join(data_dir, 'test')

# Get train images
train_real_images = [os.path.join(train_dir, 'REAL', img) for img in os.listdir(os.path.join(train_dir, 'REAL'))]
train_fake_images = [os.path.join(train_dir, 'FAKE', img) for img in os.listdir(os.path.join(train_dir, 'FAKE'))]

# Get test images
test_real_images = [os.path.join(test_dir, 'REAL', img) for img in os.listdir(os.path.join(test_dir, 'REAL'))]
test_fake_images = [os.path.join(test_dir, 'FAKE', img) for img in os.listdir(os.path.join(test_dir, 'FAKE'))]

# Create DataFrames
train_df = pd.DataFrame({'image_path': train_real_images + train_fake_images,
                         'label': ['real']*len(train_real_images) + ['fake']*len(train_fake_images)})

test_df = pd.DataFrame({'image_path': test_real_images + test_fake_images,
                        'label': ['real']*len(test_real_images) + ['fake']*len(test_fake_images)})

# Convert labels to binary (real = 1, fake = 0)
train_df['label'] = train_df['label'].map({'real': 1, 'fake': 0})
test_df['label'] = test_df['label'].map({'real': 1, 'fake': 0})

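# Shuffle datasets (fixed seed so the row order, and any split derived from it, is reproducible)
train_df = train_df.sample(frac=1, random_state=42).reset_index(drop=True)
test_df = test_df.sample(frac=1, random_state=42).reset_index(drop=True)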

# Display first few rows
train_df.head(), test_df.head()


(                                          image_path  label
 0  /kaggle/input/cifake-real-and-ai-generated-syn...      0
 1  /kaggle/input/cifake-real-and-ai-generated-syn...      1
 2  /kaggle/input/cifake-real-and-ai-generated-syn...      0
 3  /kaggle/input/cifake-real-and-ai-generated-syn...      0
 4  /kaggle/input/cifake-real-and-ai-generated-syn...      0,
                                           image_path  label
 0  /kaggle/input/cifake-real-and-ai-generated-syn...      0
 1  /kaggle/input/cifake-real-and-ai-generated-syn...      1
 2  /kaggle/input/cifake-real-and-ai-generated-syn...      0
 3  /kaggle/input/cifake-real-and-ai-generated-syn...      1
 4  /kaggle/input/cifake-real-and-ai-generated-syn...      1)
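
The four list comprehensions above repeat one pattern per split and class. A minimal sketch of how they could be folded into one function follows. The name `build_split_df` and its `seed` parameter are hypothetical and not part of the original notebook, and it assumes each split directory contains `REAL/` and `FAKE/` subfolders as above.

```python
import os
import pandas as pd

# Hypothetical helper (not in the original notebook): builds one shuffled
# DataFrame from a split directory that contains REAL/ and FAKE/ subfolders.
def build_split_df(split_dir, seed=42):
    rows = []
    # Same encoding as above: real = 1, fake = 0
    for class_name, label in (("REAL", 1), ("FAKE", 0)):
        class_dir = os.path.join(split_dir, class_name)
        for fname in sorted(os.listdir(class_dir)):
            rows.append({"image_path": os.path.join(class_dir, fname),
                         "label": label})
    df = pd.DataFrame(rows, columns=["image_path", "label"])
    # A fixed seed makes the shuffle reproducible across runs
    return df.sample(frac=1, random_state=seed).reset_index(drop=True)
```

With this helper, `train_df = build_split_df(train_dir)` would replace the train-side code, and `train_df['label'].value_counts()` shows how many images each class contributes.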

## 📌 Step 3: Preprocess Images and Create Data Generators

In [None]:
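img_size = (128, 128)
batch_size = 32
# flow_from_dataframe with class_mode='binary' requires string labels ('0' = fake, '1' = real)
train_df['label'] = train_df['label'].astype(str)
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_gen = datagen.flow_from_dataframe(train_df, x_col='image_path', y_col='label', target_size=img_size, batch_size=batch_size, subset='training', class_mode='binary')
# shuffle=False keeps the validation order fixed, so predictions line up with val_gen.classes
val_gen = datagen.flow_from_dataframe(train_df, x_col='image_path', y_col='label', target_size=img_size, batch_size=batch_size, subset='validation', class_mode='binary', shuffle=False)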
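
The techniques list mentions data augmentation, but the generators above only rescale pixel values. The sketch below shows one way augmentation could be added with `ImageDataGenerator`. The variable names `train_datagen` and `val_datagen` are hypothetical, and the ranges are illustrative guesses, not tuned values from this notebook.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation for training batches only; the ranges below are
# illustrative assumptions, not values taken from the notebook.
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    validation_split=0.2,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
)
# Validation images are only rescaled, so metrics reflect unmodified data.
val_datagen = ImageDataGenerator(rescale=1.0 / 255, validation_split=0.2)

# Quick check on a random stand-in image: the transform keeps the shape.
dummy = np.random.rand(128, 128, 3).astype("float32")
augmented = train_datagen.random_transform(dummy, seed=0)
print(augmented.shape)
```

Two generators are used so that validation images are never augmented. `flow_from_dataframe` splits by row position, so calling it on the same DataFrame with `subset='training'` on `train_datagen` and `subset='validation'` on `val_datagen` should select disjoint rows, as long as both use the same `validation_split`.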

## 📌 Step 4: Build & Train a CNN Model with Transfer Learning (ResNet50)

In [None]:
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
base_model.trainable = False  # Freeze base model layers
model = Sequential([
    base_model,
    Flatten(),
    Dropout(0.3),
    Dense(128, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train_gen, validation_data=val_gen, epochs=5)
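
The techniques list also mentions learning rate scheduling, which the training call above does not use. A minimal sketch with Keras callbacks is shown below. The monitored metric, factors, and patience values are assumptions, and `demo_model` with its random data is a tiny stand-in used only to show the wiring.

```python
import numpy as np
import tensorflow as tf

callbacks = [
    # Halve the learning rate when validation loss stops improving
    tf.keras.callbacks.ReduceLROnPlateau(
        monitor="val_loss", factor=0.5, patience=2, min_lr=1e-6),
    # Stop early and keep the best weights seen so far
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=4, restore_best_weights=True),
]

# Stand-in model and random data (hypothetical, not the notebook's model)
rng = np.random.default_rng(0)
x = rng.random((64, 8)).astype("float32")
y = rng.integers(0, 2, size=(64,)).astype("float32")
demo_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
demo_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                   loss="binary_crossentropy", metrics=["accuracy"])
demo_history = demo_model.fit(x, y, validation_split=0.25, epochs=3,
                              callbacks=callbacks, verbose=0)
print(sorted(demo_history.history.keys()))
```

In this notebook the same list would be passed as `model.fit(train_gen, validation_data=val_gen, epochs=..., callbacks=callbacks)`. A larger epoch count makes sense in that setup, since early stopping ends training once validation loss plateaus.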

## 📌 Step 5: Evaluate Performance

In [None]:
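# Collect labels and predictions batch by batch so they stay aligned even if the generator shuffles
y_true, y_prob = [], []
for i in range(len(val_gen)):
    x_batch, y_batch = val_gen[i]
    y_true.extend(y_batch)
    y_prob.extend(model.predict(x_batch, verbose=0).ravel())
y_true = np.array(y_true).astype(int)
y_pred = (np.array(y_prob) > 0.5).astype(int)  # threshold the sigmoid outputs at 0.5
cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', xticklabels=['Fake', 'Real'], yticklabels=['Fake', 'Real'])
plt.show()
print(classification_report(y_true, y_pred, target_names=['Fake', 'Real']))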
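
ROC-AUC appears in the techniques list but is not computed in the cell above. The sketch below uses scikit-learn's `roc_auc_score` and `roc_curve`. The helper name `plot_roc` and the demo arrays are hypothetical. In the notebook, the inputs would be the true 0/1 labels and the raw sigmoid outputs (before thresholding at 0.5).

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import roc_auc_score, roc_curve

# Hypothetical helper: y_true holds 0/1 labels, y_prob the predicted
# probability of the positive class (here: "real").
def plot_roc(y_true, y_prob, ax=None):
    ax = ax or plt.gca()
    auc = roc_auc_score(y_true, y_prob)
    fpr, tpr, _ = roc_curve(y_true, y_prob)
    ax.plot(fpr, tpr, label=f"ROC (AUC = {auc:.3f})")
    ax.plot([0, 1], [0, 1], linestyle="--", label="Chance")
    ax.set_xlabel("False positive rate")
    ax.set_ylabel("True positive rate")
    ax.legend()
    return auc

# Made-up scores, only to exercise the function
demo_true = np.array([0, 0, 1, 1])
demo_prob = np.array([0.1, 0.4, 0.35, 0.8])
demo_auc = plot_roc(demo_true, demo_prob)
print(demo_auc)  # 0.75
```

Unlike the confusion matrix, the AUC does not depend on the 0.5 threshold, so it is a useful complement when the two classes are not equally easy to separate.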

## 📌 Step 6: Explainability using Grad-CAM

In [None]:
# Grad-CAM implementation (explainability)
# [To be implemented for visualization of CNN predictions]
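
Step 6 is left as a placeholder above. One possible Grad-CAM sketch is shown below; it is not the author's implementation. It assumes the model is a `Sequential` whose first layer is the convolutional base and whose remaining layers form the classification head (as in Step 4), and that the image is already preprocessed the same way as the training data. The function name `grad_cam` and the tiny `demo_model` are hypothetical.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image):
    """Return a heatmap in [0, 1] for one preprocessed image (H, W, 3).

    Assumes model.layers[0] is the conv base and model.layers[1:] the head.
    """
    conv_base, head_layers = model.layers[0], model.layers[1:]
    batch = tf.convert_to_tensor(image[np.newaxis, ...], dtype=tf.float32)
    with tf.GradientTape() as tape:
        feature_maps = conv_base(batch, training=False)
        tape.watch(feature_maps)
        output = feature_maps
        for layer in head_layers:
            output = layer(output)
        score = output[:, 0]  # sigmoid output of the single unit
    grads = tape.gradient(score, feature_maps)
    # Channel weights: gradients averaged over the spatial dimensions
    weights = tf.reduce_mean(grads, axis=(1, 2), keepdims=True)
    cam = tf.nn.relu(tf.reduce_sum(weights * feature_maps, axis=-1))[0]
    cam = cam / (tf.reduce_max(cam) + 1e-8)  # scale to [0, 1]
    return cam.numpy()

# Tiny stand-in model (hypothetical) so the sketch runs without ResNet50
demo_base = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(4, 3, strides=2, padding="same", activation="relu"),
])
demo_model = tf.keras.Sequential([
    demo_base,
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
heatmap = grad_cam(demo_model, np.random.rand(32, 32, 3).astype("float32"))
print(heatmap.shape)
```

With the notebook's model the call would be `grad_cam(model, img)`, where `img` is a single rescaled 128x128 image. The heatmap has the spatial size of the last feature maps, so it would need to be resized (for example with `tf.image.resize`) before overlaying it on the image with `plt.imshow(..., alpha=0.4)`. Since the output unit scores the "real" class, regions supporting a "fake" prediction could be highlighted by differentiating `1 - score` instead.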

### 🚀 **Next Steps:**
- Fine-tune ResNet50 by unfreezing some layers
- Train for more epochs and experiment with hyperparameters (learning rate, dropout rate, batch size)
- Deploy as a web app using Flask or Streamlit
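
The first item in the list above (unfreezing part of ResNet50) can be sketched as follows. The helper name `unfreeze_top_layers` and the default of 10 layers are assumptions, and `demo_base` is a small stand-in for the real base model. Keeping `BatchNormalization` layers frozen is a common precaution during fine-tuning, not something the original notebook specifies.

```python
import tensorflow as tf

# Hypothetical helper: make only the last `n_layers` of the base trainable,
# keeping BatchNormalization layers frozen so their statistics stay stable.
def unfreeze_top_layers(base_model, n_layers=10):
    base_model.trainable = True
    for layer in base_model.layers[:-n_layers]:
        layer.trainable = False
    for layer in base_model.layers[-n_layers:]:
        if isinstance(layer, tf.keras.layers.BatchNormalization):
            layer.trainable = False
    # Number of layers that will be updated during training
    return sum(layer.trainable for layer in base_model.layers)

# Small stand-in for the pretrained base (hypothetical)
demo_base = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(4, 3),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(4, 3),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(4, 3),
])
n_trainable = unfreeze_top_layers(demo_base, n_layers=3)
print(n_trainable)
```

After changing `trainable` flags, the full model has to be compiled again for the change to take effect. A much smaller learning rate than the initial 0.001 (for example `Adam(learning_rate=1e-5)`) is a common choice at this stage to avoid destroying the pretrained weights.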

**📌 If this helped you, consider starring the GitHub repo!** ⭐