# **Image-to-Poem Generation Using Deep Generative Models**

### Student Correspondences:

1. Neha Anusooya Thimmarayi - nanusooy@depaul.edu
2. Rohan Shankar Patil - rpatil5@depaul.edu

#### Project Description:

This project explores the use of deep generative models to generate creative, emotionally resonant poetry from visual inputs. The core objective is to develop a machine learning system that accepts an image and generates a corresponding poem that captures the image’s mood, theme, or aesthetic, rather than providing literal descriptions.

***Sufficient explanations on why each step is essential.
Instructions on how to test each function with example cases to illustrate functionality.
Commentary on the purpose of each implementation choice, especially if choices deviate from typical practices.***

***(We'll remove all content which are in Italics)***

----------------

### 1. Introduction to Libraries

*Briefly introduce the libraries used (e.g., NumPy, PyTorch), explaining their roles.
Mention any installation or setup instructions if special libraries are required.*

#### 1.1 Importing Required Libraries

In [2]:

import os
import json
import random
import pandas as pd
import numpy as np
from PIL import Image
import requests
from io import BytesIO
import torch
import torchvision
from torchvision import transforms
from torchvision.models import resnet18
from tqdm import tqdm


In [18]:
import torch

if torch.cuda.is_available():
    print("GPU is available")
else:
    print("GPU is not available")

GPU is not available


### Dataset Preparation


In [3]:
dataset_path = '/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/multim_poem.json'

with open(dataset_path, 'r') as f:
    data = json.load(f)

print(f"Total entries in dataset: {len(data)}")

# sample
print("Example entry:")
print(json.dumps(data[0], indent=2))

Total entries in dataset: 8292
Example entry:
{
  "poem": "what is lovely never dies\nbut passes into other loveliness\nstar-dust or sea-foam flower or winged air",
  "image_url": "https://farm2.staticflickr.com/1086/1002051357_0e9162423e.jpg",
  "id": 0
}


In [4]:
# Clean Dataset
cleaned_data = []

for item in data:
    poem = item.get("poem")
    url = item.get("img") or item.get("image_url")

    if poem and url and poem.strip():
        cleaned_data.append({
            "image_url": url,
            "poem": poem.strip()
        })

print(f"Cleaned dataset size: {len(cleaned_data)} entries")


df = pd.DataFrame(cleaned_data)
df.head()

Cleaned dataset size: 8292 entries


Unnamed: 0,image_url,poem
0,https://farm2.staticflickr.com/1086/1002051357...,what is lovely never dies\nbut passes into oth...
1,https://farm8.staticflickr.com/7434/1002469112...,sods on the dugout begin to be fledged\nwith f...
2,https://farm1.staticflickr.com/19/100255672_97...,one must have the mind of winter\nto regard th...
3,https://farm2.staticflickr.com/1034/1002997433...,to put meaning in one's life may end in madnes...
4,https://farm4.staticflickr.com/3741/1004000893...,of living pained branches\nmy garden's braided...


In [5]:

os.makedirs("/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/processed", exist_ok=True)
df.to_csv("/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/processed/cleaned_poems.csv", index=False)
df.to_json("/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/processed/cleaned_poems.json", orient="records", indent=2)

print("Cleaned dataset saved.")

Cleaned dataset saved.


In [6]:

image_dir = "/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/images"
os.makedirs(image_dir, exist_ok=True)

In [10]:
def download_image(url, save_path):
    try:
        response = requests.get(url, timeout=5)
        response.raise_for_status()
        img = Image.open(BytesIO(response.content)).convert("RGB")
        img.save(save_path)
        return True
    except Exception as e:
        return False


sample_df = df.sample(n=1000, random_state=42).reset_index(drop=True)

success_count = 0

valid_data = []
image_dir = "/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/images"
os.makedirs(image_dir, exist_ok=True)

img_id = 0  # counter for naming valid image files

for _, row in tqdm(sample_df.iterrows(), total=len(sample_df)):
    url = row['image_url']
    poem = row['poem']
    filename = f"{img_id}.jpg"
    img_path = os.path.join(image_dir, filename)

    if download_image(url, img_path):
        valid_data.append({
            "image_filename": filename,
            "poem": poem
        })
        img_id += 1

print(f"Successfully downloaded: {len(valid_data)} images")

# Save valid entries
valid_df = pd.DataFrame(valid_data)
valid_df.to_csv("/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/processed/filtered_poem_data.csv", index=False)

100%|██████████| 1000/1000 [00:27<00:00, 36.09it/s]

Successfully downloaded: 720 images





In [11]:
# Align images to poem
valid_indices = [i for i in range(1000) if os.path.exists(f"/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/images/{i}.jpg")]


filtered_df = sample_df.loc[valid_indices].reset_index(drop=True)
print(f"Final usable pairs: {len(filtered_df)}")


filtered_df.to_csv("/workspaces/Image-to-Poem-Generation-Using-Deep-Generative-Models/data/processed/filtered_poem_data.csv", index=False)

Final usable pairs: 899


### 2. Model Design and Implementation

*Explain the model’s architecture, including input and output structures.
Provide annotated code blocks for each part of the model, detailing the purpose and functionality of each section.
Include small test cases to demonstrate the correctness of key methods.*

### 3. Training Process

*Outline your training pipeline, including data loading, pre-processing, and any regularization techniques.
Briefly describe hyperparameters used (learning rate, batch size, epochs) and reasoning behind their choice.
Include sample output or logs from training to illustrate model performance and learning curves.*

### 4. Evaluation Results

*Present evaluation metrics and explain the criteria used to assess the model’s performance.
Show example predictions or outputs to demonstrate model accuracy and behavior.
Provide insights into the model’s strengths, weaknesses, and areas for improvement based on the results.*