# Cleaned text of the Odyssey in Peter Green's translation

A solid, clean copy. Its main characteristics:

- Clean at the source
    * no pagination or other in-page marginalia
    * no weird characters in the text
    * born-digital version (no scan, OCR, etc.)
- Separated into books
- Line number every five lines
- Macrons on long vowels
- Superscripts for footnotes remain
- Source indicator stripped from the end of each book, i.e., the `footer line`: EBSCOhost: eBook Collection (EBSCOhost) printed on 3/7/2025 11:00:14 PM UTC via UNIVERSITAET TUEBINGEN. All use subject to https://www.ebsco.com/terms-of-use.
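
The layout described above (a `Book N` heading per book, and a line number on its own line every five verses) can be parsed back into a structure. The following is a minimal sketch, not part of the original notebook: `split_books` and the `sample` string are hypothetical, and it assumes that headings always have the form `Book N` and that line-number markers are lines made up only of ASCII digits.

```python
import re

def split_books(text):
    """Group verse lines by book number, dropping standalone line-number markers."""
    books = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = re.fullmatch(r"Book (\d+)", line)
        if heading:
            # A new book starts here
            current = int(heading.group(1))
            books[current] = []
        elif re.fullmatch(r"[0-9]+", line):
            # A marker such as "5" or "10", not a verse
            continue
        elif current is not None:
            books[current].append(line)
    return books

# Hypothetical miniature input that mimics the cleaned file's layout
sample = "Book 1\nfirst verse\nsecond verse\n5\nthird verse\nBook 2\nanother verse\n"
books = split_books(sample)
print({number: len(verses) for number, verses in books.items()})
```

Footnote superscripts are left untouched by this sketch, since they sit inside the verse lines rather than on lines of their own.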

In [1]:
# Step 1: Remove everything before the "Book 1" heading, which precedes Green's first line:
# "The man, Muse—tell me about that resourceful man, who wandered."
filepath = "/Users/debr/odysseys_en/raw_txts/Odyssey_Green.txt"

found_start = False
filtered_lines = []

with open(filepath, "r", encoding="utf-8") as inputfile:
    for line in inputfile:
        stripped_line = line.strip()  # Remove leading/trailing spaces
        
        # Skip empty lines
        if not stripped_line:
            continue
        
        if not found_start:
            # Normalize the line (strip spaces, convert to lowercase)
            if stripped_line.lower() == "Book 1".lower():
                found_start = True  # Start recording from now on
                filtered_lines.append(line)  # Keep the starting line
        else:
            filtered_lines.append(line)

print("".join(filtered_lines)[:200])

Book 1
The man, Muse—tell me about that resourceful man, who wandered
far and wide, when he’d sacked Troy’s sacred citadel:
many men’s townships he saw, and learned their ways of thinking,
many the gr


In [2]:
# Step 2: Remove everything after Green's last line
# "likening herself to Mentōr, in both voice and appearance."
# Also remove any footer that starts with "EBSCOhost"

cleaned_lines = []
footer_start_phrase = "EBSCOhost"
end_marker = "likening herself to Mentōr, in both voice and appearance."

for line in filtered_lines:
    # Drop footer lines (those starting with "EBSCOhost") entirely
    if line.strip().startswith(footer_start_phrase):
        continue
    
    # Add the cleaned line before checking for the end marker
    cleaned_lines.append(line)
    
    # Stop after including the end marker
    if line.strip().lower() == end_marker.lower():
        break  

# Print the last 100 characters to check the ending
print("".join(cleaned_lines)[-100:])

s Athēnē, daughter of aegis-bearing Zeus,
likening herself to Mentōr, in both voice and appearance.
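
Step 2 only catches footers that begin a line. If the footer ever shares a line with verse text (the notebook does not confirm that this happens in the raw file, so treat it as an assumption), the line could be cut at the footer phrase instead. A minimal sketch with a hypothetical helper, `drop_footer`:

```python
def drop_footer(line, phrase="EBSCOhost"):
    """Cut a line at the footer phrase, keeping any verse text that precedes it."""
    before, found, _ = line.partition(phrase)
    if not found:
        return line  # no footer on this line, keep it unchanged
    before = before.rstrip()
    # Keep the verse text (with a newline) if there is any, otherwise drop the line
    return before + "\n" if before else ""

# Hypothetical example lines
print(repr(drop_footer("last verse of the book EBSCOhost: eBook Collection\n")))
print(repr(drop_footer("EBSCOhost: eBook Collection (EBSCOhost)\n")))
print(repr(drop_footer("an ordinary verse\n")))
```

`str.partition` splits at the first occurrence only, so everything from the first `EBSCOhost` to the end of the line is discarded.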



In [3]:
# Assign cleaned text to corpus variable
corpus = "".join(cleaned_lines)

# Print the corpus to check
print(corpus)

Book 1
The man, Muse—tell me about that resourceful man, who wandered
far and wide, when he’d sacked Troy’s sacred citadel:
many men’s townships he saw, and learned their ways of thinking,
many the griefs he suffered at heart on the open sea,
battling for his own life and his comrades’ homecoming. Yet
5
no way could he save his comrades, much though he longed to—
it was through their own blind recklessness that they perished,
the fools, for they slaughtered the cattle of Hēlios the sun god
and ate them: for that he took from them their day of returning.
Tell us this tale, goddess, child of Zeus; start anywhere in it!
10
Now the rest, all those who’d escaped from sheer destruction,
were home by now, survivors of both warfare and the sea;
Him alone, though longing for his homecoming and his wife,
the queenly nymph Kalypsō, bright among goddesses,
held back in her hollow cavern, desiring him for her husband.
15
But when the year arrived, with its circling seasons, in which
the gods had or

In [4]:
# Define the file path to save the cleaned text
filepath = "/Users/debr/odysseys_en/cleaned_txts/Odyssey_Green_cleaned.txt"

# Save the corpus to the file
with open(filepath, "w", encoding="utf-8") as outputfile:
    outputfile.write(corpus)

print(f"Cleaned text saved to {filepath}")

Cleaned text saved to /Users/debr/odysseys_en/cleaned_txts/Odyssey_Green_cleaned.txt
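
After saving, the file can be read back to confirm that it starts and ends where expected. The sketch below is an optional check, not part of the original notebook: `check_saved_text` is a hypothetical helper, and the demo writes a tiny stand-in file to a temporary directory instead of touching the real output path.

```python
import tempfile
from pathlib import Path

def check_saved_text(path, first_line, last_line):
    """Return True if the file's first and last non-empty lines match the expected ones."""
    text = Path(path).read_text(encoding="utf-8")
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return bool(lines) and lines[0] == first_line and lines[-1] == last_line

end_marker = "likening herself to Mentōr, in both voice and appearance."

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the cleaned file, for demonstration only
    demo_path = Path(tmp) / "demo_cleaned.txt"
    demo_path.write_text(f"Book 1\nsome verse\n{end_marker}\n", encoding="utf-8")
    ok = check_saved_text(demo_path, "Book 1", end_marker)

print(ok)
```

For the real file, the same call would take the path from the cell above, with `"Book 1"` and the end marker from Step 2 as the expected lines.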
