# Epub Translator Colab Notebook

This notebook uses the Epub Translator tool to quickly translate EPUB books into bilingual books. The tool is designed to preserve the original text formatting while providing a rough translation.

To support my work, please consider making a donation: [via PayPal](https://paypal.me/duocnguyen)

## Setup

First, let's install the necessary dependencies and the Epub Translator tool.

In [None]:
# Install Epub Translator
!curl -fsSL https://raw.githubusercontent.com/nguyenvanduocit/epubtrans/main/scripts/install_unix.sh | bash

# Verify installation
!epubtrans --version

# Default working directory; the optional Google Drive step overrides it
import os
working_dir = os.getcwd()

## Set up Anthropic API Key

To use the translation feature, you need an Anthropic API key. Enter your key when prompted; it is stored in the `ANTHROPIC_KEY` environment variable for this session.

In [None]:
import os
from getpass import getpass
anthropic_key = getpass('Enter your Anthropic API key: ')
os.environ['ANTHROPIC_KEY'] = anthropic_key
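
Before moving on, it can help to confirm that the key was actually stored without printing it in full. The following is a minimal sketch; `describe_key` is a hypothetical helper, not part of the notebook or of Epub Translator, and it only inspects the `ANTHROPIC_KEY` environment variable set above.

```python
import os

def describe_key(env_name="ANTHROPIC_KEY"):
    """Return a masked summary of the key in the environment, or None if it is unset or empty."""
    key = os.environ.get(env_name, "").strip()
    if not key:
        return None
    # Mask everything except the last four characters (real API keys are much longer than four)
    return "*" * max(len(key) - 4, 0) + key[-4:]

summary = describe_key()
print(summary if summary else "ANTHROPIC_KEY is empty; rerun the key prompt cell.")
```

If the message about an empty key appears, rerun the prompt cell and paste the key again.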

## Connect Google Drive

This step is optional. If you connect Google Drive, you can open and save files directly in your Drive.

In [None]:
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Define the fixed path
fixed_path = '/content/drive/My Drive/books'

# Check if the directory exists, if not, create it
if not os.path.exists(fixed_path):
    os.makedirs(fixed_path)
    print(f"Created directory: {fixed_path}")
else:
    print(f"Directory already exists: {fixed_path}")

# Print the absolute path
working_dir = os.path.abspath(fixed_path)
print(working_dir)
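
Since mounting Drive is optional, the later cells need a working directory either way. One possible approach, sketched below, is to use the Drive folder only when its parent directory exists and to fall back to a local directory otherwise. `pick_working_dir` is a hypothetical helper introduced for illustration; it is not part of the notebook.

```python
import os

def pick_working_dir(drive_books_dir, fallback_dir):
    """Use the Drive books folder if its parent directory exists, otherwise the fallback."""
    drive_root = os.path.dirname(drive_books_dir)
    chosen = drive_books_dir if os.path.isdir(drive_root) else fallback_dir
    # Create the chosen directory if it is missing
    os.makedirs(chosen, exist_ok=True)
    return os.path.abspath(chosen)

# Falls back to the current directory when Drive has not been mounted
print(pick_working_dir('/content/drive/My Drive/books', os.getcwd()))
```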

## Upload EPUB File

1. Run the code block below.
2. Click on "Choose Files" when prompted.
3. Select your EPUB file.

In [None]:
from google.colab import files
import os
import re

print(f"Uploading file to working directory: {working_dir}")

uploaded = files.upload()
epub_filename = None
unpacked_dir = None
epub_file_path = None

if not uploaded:
    print("No file was uploaded.")
else:
    # Get the filename of the uploaded file
    original_filename = next(iter(uploaded))

    # Strip the " (1)"-style suffix that Colab appends to duplicate uploads,
    # leaving digits that belong to the title (e.g. "1984") untouched
    base_name, extension = os.path.splitext(original_filename)
    base_name = re.sub(r'\s*\(\d+\)$', '', base_name)
    epub_filename = f"{base_name}{extension}"

    # Check if the file has .epub extension
    if not epub_filename.lower().endswith('.epub'):
        print(f"Error: {epub_filename} is not an EPUB file. Please upload only EPUB files.")
    else:
        # Create the name for the unpacked directory
        unpacked_dir = os.path.join(working_dir, base_name)

        # Create full path for the uploaded file
        epub_file_path = os.path.join(working_dir, epub_filename)

        # Check if the file already exists
        if os.path.exists(epub_file_path):
            print(f"File {epub_filename} already exists in {working_dir}. Skipping upload.")
        else:
            # Save the uploaded file to the working directory with the original name
            with open(epub_file_path, 'wb') as f:
                f.write(uploaded[original_filename])

            print(f"Uploaded file: {original_filename}")
            print(f"Saved as: {epub_filename}")
            print(f"Saved to: {epub_file_path}")

        print(f"Unpacked directory name will be: {unpacked_dir}")
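
When the same file is uploaded more than once, Colab saves it under a name with a suffix such as ` (1)`. The sketch below shows one way to recover the clean name by removing only that suffix, so that digits in the title itself are kept. `normalize_upload_name` is a hypothetical helper used for illustration, and the sample filenames are made up.

```python
import os
import re

def normalize_upload_name(filename):
    """Drop a trailing ' (N)' duplicate suffix from the base name; return (base, extension)."""
    base, ext = os.path.splitext(filename)
    # Only a parenthesized number at the very end of the base name is removed
    base = re.sub(r"\s*\(\d+\)$", "", base)
    return base, ext

print(normalize_upload_name("My Book (1).epub"))
print(normalize_upload_name("1984.epub"))
```

Stripping bare trailing digits instead would turn a title like `1984` into an empty name, which is why the pattern requires the parentheses.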

## Prepare

The cell below prepares the book for translation by running the `unpack`, `clean`, and `mark` commands in order.

In [None]:
print(f"unpacking to : {unpacked_dir}")
!epubtrans unpack "{epub_file_path}"

print(f"cleaning : {unpacked_dir}")
!epubtrans clean "{unpacked_dir}"

print(f"marking : {unpacked_dir}")
!epubtrans mark "{unpacked_dir}"

## Translate

This cell can be run multiple times without overwriting previous translations. Each run continues with any text that is still untranslated, so rerun it as often as needed.

In [None]:
!epubtrans translate "{unpacked_dir}" --source English --target Vietnamese
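
Because the translation step is safe to repeat, it could be wrapped in a small retry loop instead of being rerun by hand. This is only a sketch: it assumes the command exits with a non-zero status when a run fails, which the source does not confirm for `epubtrans`, and `run_with_retries` is a hypothetical helper. A stand-in command is used so the example runs anywhere.

```python
import subprocess
import sys

def run_with_retries(command, max_attempts=3):
    """Run a command until it exits with status 0; return the attempt number, or None if all fail."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        print(f"Attempt {attempt} exited with status {result.returncode}")
    return None

# Stand-in command; in the notebook the list would instead be
# ["epubtrans", "translate", unpacked_dir, "--source", "English", "--target", "Vietnamese"]
print(run_with_retries([sys.executable, "-c", "print('ok')"]))
```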

## (Optional) Style Adjustment

This step allows you to modify the visibility of the original text:

- To fade the original text so the translation stands out, run the cell as is.
- To hide the original text completely, add the `--hide=source` option to the command.

In [None]:
!epubtrans styling "{unpacked_dir}"

## Package into a Bilingual Book

In [None]:
!epubtrans pack "{unpacked_dir}"

## Download the Translated Book

After the translation process is complete, you can download the translated book using the following code:

In [None]:
import re

# Function to extract the number from the filename
def extract_number(filename):
    match = re.search(r'\((\d+)\)', filename)
    return int(match.group(1)) if match else 0

# Look for packed books next to the unpacked directory
# (assumes `epubtrans pack` writes its output there)
output_dir = os.path.dirname(unpacked_dir)
prefix = f"{os.path.basename(unpacked_dir)}-bilangual"
epubFiles = [f for f in os.listdir(output_dir) if f.startswith(prefix) and f.endswith('.epub')]

if not epubFiles:
    print("No matching files found.")
else:
    # Pick the file with the highest number in parentheses (if present)
    latest_file = max(epubFiles, key=extract_number)

    print(f"Latest file: {latest_file}")
    files.download(os.path.join(output_dir, latest_file))
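
To see how the latest file is chosen, the selection logic can be tried on a plain list of names. The sketch below repeats `extract_number` from the cell above and applies it to made-up filenames; whether the tool actually produces numbered copies like these is an assumption.

```python
import re

def extract_number(filename):
    """Return the number inside parentheses in the filename, or 0 if there is none."""
    match = re.search(r"\((\d+)\)", filename)
    return int(match.group(1)) if match else 0

# Hypothetical filenames, for illustration only
candidates = ["book-bilangual.epub", "book-bilangual (2).epub", "book-bilangual (10).epub"]
print(max(candidates, key=extract_number))
```

Comparing the extracted integers rather than the raw strings matters here: as text, `(2)` sorts after `(10)`, while numerically 10 is the larger value.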

## Important Notes

1. The quality of translation depends on the Anthropic API and may not be perfect for all types of content.
2. Large books may take a considerable amount of time to translate.
3. Make sure to keep your Anthropic API key confidential and do not share it publicly.
4. This notebook provides a basic workflow. For more advanced usage or troubleshooting, refer to the [Epub Translator GitHub repository](https://github.com/nguyenvanduocit/epubtrans).

Feel free to modify this notebook to suit your specific needs or to experiment with different options provided by the Epub Translator tool.