# Decoding Fitzgerald
### _Exploring Themes and Emotions in F. Scott Fitzgerald’s Works with NLP_
**Author:** Virginia Herrero

## Import Libraries

In [1]:
# Utilities
import os
import re
import requests
import time

# Text processing
import nltk
from nltk.tokenize import sent_tokenize

# Download NLTK resources
nltk.download('punkt')

# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns

[nltk_data] Downloading package punkt to C:\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


## Data Collection

This project analyzes themes and emotions in F. Scott Fitzgerald’s works using natural language processing techniques. The texts were sourced from Project Gutenberg, where, of his novels, only the first three are freely available: _This Side of Paradise_, _The Beautiful and Damned_, and _The Great Gatsby_. _Tender Is the Night_ is not yet in the public domain and is therefore excluded from the analysis. Fitzgerald’s unfinished final novel, _The Last Tycoon_, published posthumously, is also excluded.

In [3]:
# Ensure the raw data directory exists
os.makedirs("../data/raw", exist_ok = True)

# Dictionary of Fitzgerald's books and their Project Gutenberg plain text URLs
books = {
    "This-side-of-paradise": "https://www.gutenberg.org/cache/epub/805/pg805.txt",
    "The-beautiful-and-damned": "https://www.gutenberg.org/cache/epub/9830/pg9830.txt",
    "The-great-gatsby": "https://www.gutenberg.org/cache/epub/64317/pg64317.txt"
}

# Download and save each book in the raw data directory
for title, url in books.items():
    print(f"Downloading {title}...")
    response = requests.get(url, timeout = 30)  # Time out instead of hanging on a stalled connection
    if response.status_code == 200:
        filepath = f"../data/raw/{title}.txt"
        with open(filepath, "w", encoding = "utf-8") as f:
            f.write(response.text)
        print(f"Saved to {filepath}\n")
    else:
        print(f"Failed to download {title} from {url}\n")
    time.sleep(2)

print("All downloads completed!")

Downloading This-side-of-paradise...
Saved to ../data/raw/This-side-of-paradise.txt

Downloading The-beautiful-and-damned...
Saved to ../data/raw/The-beautiful-and-damned.txt

Downloading The-great-gatsby...
Saved to ../data/raw/The-great-gatsby.txt

All downloads completed!
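
Project Gutenberg plain-text files normally wrap the book itself in a header and a license footer, which would skew any downstream theme or emotion analysis if left in place. The following is a minimal sketch of one way to trim them, not necessarily the approach used later in this notebook. It assumes the files contain the usual `*** START OF THE PROJECT GUTENBERG EBOOK ... ***` and `*** END OF THE PROJECT GUTENBERG EBOOK ... ***` marker lines; the function name `strip_gutenberg_boilerplate` and the sample text are hypothetical and serve only as illustration.

```python
import re

# Assumed marker format: the book text sits between a "*** START OF ..." line
# and a "*** END OF ..." line. Older files may say "THIS" instead of "THE".
START_MARKER = re.compile(
    r"^\*\*\*\s*START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)
END_MARKER = re.compile(
    r"^\*\*\*\s*END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$",
    re.IGNORECASE | re.MULTILINE,
)


def strip_gutenberg_boilerplate(text):
    """Return the text between the START and END marker lines.

    If a marker is missing, that side of the text is left untouched.
    """
    start = START_MARKER.search(text)
    begin = start.end() if start else 0
    end = END_MARKER.search(text, begin)
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


# Small in-memory example (hypothetical text, not a real download)
sample = (
    "Header and license notes\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK EXAMPLE TITLE ***\n"
    "CHAPTER I\n"
    "Some body text.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK EXAMPLE TITLE ***\n"
    "Footer and license text\n"
)
body = strip_gutenberg_boilerplate(sample)
```

To apply this to the downloaded books, each file in `../data/raw` could be read with `encoding = "utf-8"`, passed through the function, and written to a separate directory so the raw downloads stay unchanged. It is worth checking each cleaned file by eye, since front matter such as title pages and tables of contents sits inside the markers and is not removed by this sketch.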
