<a href="https://colab.research.google.com/github/PaulNjinu254/Seq2Seq/blob/main/Seq2Seq.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Interpretation of the English-to-French translation code
'''
Lines 1 to 6: Model metadata — title, author, date info, description, accelerator.

Lines 8 to 29: Introduction — explains the model purpose, dataset, and training/inference workflow.

Lines 31 to 33: Section header — Setup.

Lines 35 to 39: Import libraries — numpy, keras, os, Path.

Lines 41 to 43: Section header — Download the data.

Lines 45 to 47: Download and unzip English–French dataset.

Lines 49 to 56: Section header — Configuration.

Lines 58 to 63: Hyperparameters and dataset path.

Lines 65 to 67: Section header — Prepare the data.

Lines 69 to 73: Initialize storage lists and sets for texts and characters.

Lines 74 to 76: Read dataset file.

Lines 77 to 85: Process lines, add start/end tokens, store texts, collect characters.

Lines 87 to 91: Sort characters, compute vocab sizes, and max sequence lengths.

Lines 93 to 97: Print dataset statistics.

Lines 99 to 100: Create character-to-index dictionaries.

Lines 102 to 110: Initialize NumPy arrays for encoder/decoder one-hot data.

Lines 112 to 121: Populate one-hot encoded arrays with sequences and padding.

Lines 123 to 125: Section header — Build the model.

Lines 127 to 129: Define encoder input + LSTM + extract states.

Lines 131 to 132: Store encoder states.

Lines 134 to 135: Define decoder input.

Lines 137 to 141: Define decoder LSTM + dense softmax output layer.

Lines 143 to 145: Create training model.

Lines 147 to 149: Section header — Train the model.

Lines 151 to 153: Compile model.

Lines 154 to 160: Train model, save to disk.

Lines 162 to 168: Section header — Run inference (sampling).

Lines 171 to 175: Reload model from disk.

Lines 177 to 180: Build encoder inference model.

Lines 182 to 190: Build decoder inference model.

Lines 192 to 194: Create reverse lookup dictionaries.

Lines 197 to 227: Define decoding function for inference.

Lines 230 to 236: Test decoding on sample sequences and print results.
'''
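
The one-hot preparation step described above (character-to-index dictionaries, zero-initialized arrays, space padding) can be sketched as follows. This is a minimal illustration on two toy sentence pairs, not the original script: the sample texts and the helper name `one_hot` are assumptions made for the example.

```python
import numpy as np

# Toy stand-ins for the dataset; the real script reads these from the data file.
input_texts = ["Hi.", "Run!"]
target_texts = ["\tSalut.\n", "\tCours !\n"]  # tab = start token, newline = end token

# Sorted character vocabularies; space is added explicitly because it is used for padding.
input_chars = sorted(set("".join(input_texts)) | {" "})
target_chars = sorted(set("".join(target_texts)) | {" "})
input_token_index = {ch: i for i, ch in enumerate(input_chars)}
target_token_index = {ch: i for i, ch in enumerate(target_chars)}

max_encoder_seq_length = max(len(t) for t in input_texts)
max_decoder_seq_length = max(len(t) for t in target_texts)


def one_hot(texts, token_index, max_len):
    """Return a (num_texts, max_len, vocab_size) one-hot array padded with spaces."""
    data = np.zeros((len(texts), max_len, len(token_index)), dtype="float32")
    for i, text in enumerate(texts):
        for t, ch in enumerate(text):
            data[i, t, token_index[ch]] = 1.0
        data[i, len(text):, token_index[" "]] = 1.0  # pad the remainder with spaces
    return data


encoder_input_data = one_hot(input_texts, input_token_index, max_encoder_seq_length)
decoder_input_data = one_hot(target_texts, target_token_index, max_decoder_seq_length)
# The target is the decoder input shifted one step ahead (it omits the start token).
decoder_target_data = one_hot(
    [t[1:] for t in target_texts], target_token_index, max_decoder_seq_length
)

print(encoder_input_data.shape, decoder_input_data.shape)
```

The one-step shift between `decoder_input_data` and `decoder_target_data` is what lets the decoder learn to predict the next character from the previous one during training.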



In [4]:
#  Upload pretrained model ZIP

from google.colab import files
import zipfile
import os

print("📂 Please upload your pretrained model ZIP file...")
uploaded = files.upload()

📂 Please upload your pretrained model ZIP file...


Saving pretrained_model.zip to pretrained_model.zip


In [10]:
# Extract the zip
for filename in uploaded.keys():
    zip_path = filename
    break  # get the first uploaded file

with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    zip_ref.extractall("/content/model")
print("✅ Model files extracted to /content/model")

✅ Model files extracted to /content/model


In [16]:
import zipfile

zip_path = "/content/pretrained_model.zip"
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
    print(zip_ref.namelist())

['encoder-5-3000.pkl', 'decoder-5-3000.pkl']
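
The two cells above extract the archive first and only list its contents afterwards. A small helper can check for the expected files before extracting. This is a sketch only: the function name `extract_checkpoint` and its parameters are hypothetical, and the required file names are taken from the listing above.

```python
import zipfile
from pathlib import Path


def extract_checkpoint(zip_path, dest_dir, required):
    """Extract zip_path into dest_dir after checking that all required members exist."""
    with zipfile.ZipFile(zip_path, "r") as zf:
        names = set(zf.namelist())
        missing = [name for name in required if name not in names]
        if missing:
            raise FileNotFoundError(f"Archive is missing: {missing}")
        zf.extractall(dest_dir)
    # Return the on-disk paths of the required files for later loading.
    return [Path(dest_dir) / name for name in required]
```

In this notebook it could be called as `extract_checkpoint("/content/pretrained_model.zip", "/content/model", ["encoder-5-3000.pkl", "decoder-5-3000.pkl"])`.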


In [27]:
# Tokenize captions
import nltk
from nltk.tokenize import word_tokenize
import string

# Download required NLTK resources
nltk.download('punkt')
nltk.download('punkt_tab')

# `captions` is assumed to be a list of caption strings defined in an earlier cell
tokens = []
for caption in captions:
    # Lowercase + remove punctuation
    caption = caption.lower().translate(str.maketrans('', '', string.punctuation))
    # Tokenize
    tokens.extend(word_tokenize(caption))


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
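
For reference, the same lowercase-and-strip-punctuation step can be tried without NLTK. The sketch below swaps `word_tokenize` for a plain whitespace split, which is an assumption made to keep the example dependency-free; the two differ on contractions and other edge cases. The sample captions are the ones used in the vocabulary cell of this notebook, and `simple_tokenize` is a hypothetical helper name.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def simple_tokenize(caption):
    """Lowercase, drop ASCII punctuation, then split on whitespace."""
    return caption.lower().translate(_PUNCT_TABLE).split()


captions = [
    "A man riding a horse on a beach.",
    "A group of people playing football.",
    "A cat sitting on a mat.",
]

tokens = []
for caption in captions:
    tokens.extend(simple_tokenize(caption))

print(len(tokens), tokens[:8])
```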


In [32]:
# Build Vocabulary from captions
from collections import Counter
import nltk
nltk.download('punkt')

def build_vocab(captions, threshold=5):
    counter = Counter()
    for caption in captions:
        tokens = nltk.tokenize.word_tokenize(caption.lower())
        counter.update(tokens)

    # Keep only words that occur >= threshold times
    words = [word for word, cnt in counter.items() if cnt >= threshold]

    # Add special tokens
    vocab = {}
    vocab['<pad>'] = 0
    vocab['<start>'] = 1
    vocab['<end>'] = 2
    vocab['<unk>'] = 3

    for i, word in enumerate(words, 4):
        vocab[word] = i

    return vocab

captions_list = [
    "A man riding a horse on a beach.",
    "A group of people playing football.",
    "A cat sitting on a mat."
]
vocab = build_vocab(captions_list, threshold=1)  # Lower threshold for demo

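# Encoder & Decoder initialization
# NOTE: Encoder and Decoder are not defined anywhere in this notebook; they are
# assumed to come from the project's model definitions (e.g., model.py) and must
# be imported before this cell runs.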
embed_size = 256
hidden_size = 512
vocab_size = len(vocab)

encoder = Encoder(embed_size)
decoder = Decoder(embed_size, hidden_size, vocab_size)

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
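
Once a vocabulary like the one above exists, captions still have to be turned into ID sequences. The sketch below shows one way to do that with the same four special tokens; `encode_caption` and `pad_batch` are hypothetical helper names, and the small `vocab` dictionary is a hand-written stand-in for the output of `build_vocab`.

```python
def encode_caption(tokens, vocab):
    """Wrap tokens in <start>/<end> and map out-of-vocabulary words to <unk>."""
    unk = vocab["<unk>"]
    ids = [vocab["<start>"]]
    ids.extend(vocab.get(tok, unk) for tok in tokens)
    ids.append(vocab["<end>"])
    return ids


def pad_batch(sequences, pad_id=0):
    """Right-pad every sequence with pad_id to the length of the longest one."""
    max_len = max(len(s) for s in sequences)
    return [s + [pad_id] * (max_len - len(s)) for s in sequences]


# Hand-written stand-in for the dictionary returned by build_vocab.
vocab = {"<pad>": 0, "<start>": 1, "<end>": 2, "<unk>": 3, "a": 4, "cat": 5, "mat": 6}

batch = pad_batch([
    encode_caption(["a", "cat"], vocab),
    encode_caption(["a", "dog", "on", "a", "mat"], vocab),
])
print(batch)
```

Words filtered out by the frequency threshold ("dog" and "on" here) fall back to the `<unk>` ID rather than raising a `KeyError`.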


In [34]:
# Upload your custom images

from google.colab import files

print("📷 Please upload the images you want captions for...")
uploaded_images = files.upload()

image_paths = []
for filename in uploaded_images.keys():
    path = f"/content/{filename}"
    image_paths.append(path)

print(f"Uploaded {len(image_paths)} images!")

📷 Please upload the images you want captions for...


Saving man-riding-a-horse-through-a-downhill-field.jpg to man-riding-a-horse-through-a-downhill-field.jpg
Saving kreit-Sl0paD0U9KM-unsplash.jpg to kreit-Sl0paD0U9KM-unsplash.jpg
✅ Uploaded 2 images!


In [36]:
# Preprocess images (example preprocessing)

from PIL import Image
import numpy as np

def preprocess_image(img_path):
    img = Image.open(img_path).convert('RGB')
    img = img.resize((299, 299))
    img_array = np.array(img) / 255.0
    img_array = np.expand_dims(img_array, axis=0)
    return img_array
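
`preprocess_image` returns a channels-last array of shape (1, 299, 299, 3) scaled to [0, 1]. If the uploaded `.pkl` checkpoints are PyTorch models (an assumption; the notebook does not say), they would expect channels-first input and possibly ImageNet-style normalization. The sketch below shows that conversion in NumPy. The mean and standard deviation are the standard ImageNet statistics; whether this checkpoint was trained with them is also an assumption, and `to_channels_first` is a hypothetical helper name.

```python
import numpy as np

# Standard ImageNet channel statistics (RGB order).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype="float32")
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype="float32")


def to_channels_first(batch_nhwc, normalize=True):
    """Convert a (N, H, W, 3) batch scaled to [0, 1] into a (N, 3, H, W) batch."""
    batch = batch_nhwc.astype("float32")
    if normalize:
        # Broadcasting applies the per-channel statistics along the last axis.
        batch = (batch - IMAGENET_MEAN) / IMAGENET_STD
    return np.transpose(batch, (0, 3, 1, 2))


# Random data standing in for the output of preprocess_image.
example = np.random.rand(1, 299, 299, 3)
print(to_channels_first(example).shape)
```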

In [5]:
# Function to generate captions

import os
import pickle

import numpy as np

TOKENIZER_PATH = "/content/model/tokenizer.pkl"

def generate_caption(image_path):
    # Step 1: Encode the image into a feature vector.
    # Assumes `encoder` and `decoder` are already-loaded models that expose a
    # Keras-style .predict(); they must be defined in an earlier cell.
    processed_img = preprocess_image(image_path)
    feature_vector = encoder.predict(processed_img)

    # Step 2: Decode the feature vector into per-step token probabilities.
    description_tokens = decoder.predict(feature_vector)

    # Step 3: Map token IDs back to words, if a pickled tokenizer is available.
    if not os.path.exists(TOKENIZER_PATH):
        return "Generated description (tokenizer not found)"

    with open(TOKENIZER_PATH, "rb") as f:
        tokenizer = pickle.load(f)
    words = [tokenizer.index_word.get(int(np.argmax(token)), '') for token in description_tokens[0]]
    return ' '.join(w for w in words if w)
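
The last step of `generate_caption` takes the argmax of each output step and joins the resulting words. A slightly more careful variant stops at the end token and skips padding. The sketch below assumes the decoder output is an array of per-step probability rows and that an `index_word` mapping is available; the helper name `tokens_to_caption` and the toy mapping are illustrative only.

```python
import numpy as np


def tokens_to_caption(step_probs, index_word, end_token="<end>", skip=("<pad>", "<start>")):
    """Greedy decoding: pick the most likely word per step and stop at end_token."""
    words = []
    for probs in step_probs:
        word = index_word.get(int(np.argmax(probs)), "")
        if word == end_token:
            break
        if word and word not in skip:
            words.append(word)
    return " ".join(words)


# Toy mapping and one-hot "probabilities" for the sequence: <start> a horse <end> a
index_word = {0: "<pad>", 1: "<start>", 2: "<end>", 3: "a", 4: "horse"}
step_probs = np.eye(5)[[1, 3, 4, 2, 3]]
print(tokens_to_caption(step_probs, index_word))
```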

In [17]:
# Upload images
from google.colab import files

print("\n Please upload the images you want to caption...\n")
uploaded = files.upload()

# Store image paths in a list
image_paths = list(uploaded.keys())

# Generate captions for uploaded images
import os

print("\n Generating captions for your uploaded images...\n")

# Loop through each uploaded image and generate a caption
for img_path in image_paths:
    try:
        caption = generate_caption(img_path)
        print(f"{os.path.basename(img_path)} → {caption}")
    except Exception as e:
        print(f"Could not generate caption for {os.path.basename(img_path)}: {e}")



 Please upload the images you want to caption...



Saving man-riding-a-horse-through-a-downhill-field.jpg to man-riding-a-horse-through-a-downhill-field (2).jpg

📝 Generating captions for your uploaded images...

Could not generate caption for man-riding-a-horse-through-a-downhill-field (2).jpg: name 'encoder' is not defined


In [None]:
# Running It with Keras Instead of PyTorch
'''
If you have a PyTorch implementation but want to run it in Keras:

Steps:

Model Architecture Conversion

Identify the layers and structure in the PyTorch model.py.

Recreate the architecture in Keras using tf.keras.layers equivalents (e.g., nn.Linear → Dense, nn.Conv2d → Conv2D, nn.LSTM → LSTM).

Keep parameter sizes identical so weights can be mapped.

Weight Conversion

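PyTorch and Keras store weights in different file formats and tensor layouts.

Export the PyTorch model to ONNX with torch.onnx.export, then convert it to TensorFlow with onnx-tf. (tf2onnx converts in the opposite direction, from TensorFlow/Keras to ONNX.)

Alternatively, load the .pth checkpoint with torch.load() (it is typically a state dict), convert each tensor to NumPy, and assign it to the matching Keras layer with layer.set_weights(). Layouts differ, so transpose where needed: for example, nn.Linear stores weights as (out_features, in_features) while Dense expects (in_features, out_features).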

Tokenizer & Data Preprocessing

Replace PyTorch text/image preprocessing (torchvision.transforms, custom tokenizers) with Keras equivalents (tf.image, Tokenizer, TextVectorization).

Training / Inference Adjustments

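PyTorch needs model.eval() (usually together with torch.no_grad()) before inference; in Keras, model.predict() already runs layers such as dropout and batch normalization in inference mode.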

Batch handling will be via tf.data pipelines instead of PyTorch DataLoader.
'''
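
The manual weight-conversion route described above mostly comes down to reordering axes. The sketch below uses random NumPy arrays as stand-ins for tensors read from a PyTorch state dict, so it runs without either framework installed. The helper name `lstm_to_keras` is hypothetical, and the claim that both frameworks order the LSTM gates as input, forget, cell, output reflects my understanding of their documentation; it is worth confirming by comparing outputs on a real model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shapes as PyTorch stores them (random stand-ins for state-dict tensors).
linear_weight = rng.normal(size=(4, 3))      # nn.Linear: (out_features, in_features)
linear_bias = rng.normal(size=(4,))
conv_weight = rng.normal(size=(8, 3, 5, 5))  # nn.Conv2d: (out_ch, in_ch, kH, kW)

# Layouts expected by the Keras layers.
dense_kernel = linear_weight.T                         # Dense: (in_features, out_features)
conv_kernel = np.transpose(conv_weight, (2, 3, 1, 0))  # Conv2D: (kH, kW, in_ch, out_ch)


def lstm_to_keras(weight_ih, weight_hh, bias_ih, bias_hh):
    """Map single-layer nn.LSTM parameters to the list expected by LSTM.set_weights()."""
    kernel = weight_ih.T            # (input_dim, 4 * units)
    recurrent_kernel = weight_hh.T  # (units, 4 * units)
    bias = bias_ih + bias_hh        # Keras keeps a single bias vector
    return [kernel, recurrent_kernel, bias]


# Sanity check: both layouts compute the same affine map for a Linear/Dense layer.
x = rng.normal(size=(2, 3))
torch_style = x @ linear_weight.T + linear_bias
keras_style = x @ dense_kernel + linear_bias
print(np.allclose(torch_style, keras_style), conv_kernel.shape)
```

With real models, the converted arrays would be passed to `layer.set_weights([...])` on a Keras layer built with matching sizes.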

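# Rewriting model.py in Keras (sample code)

from tensorflow.keras import layers, Model, Input

# vocab_size is assumed to be defined earlier (e.g., len(vocab)).

# Encoder: a deliberately tiny CNN standing in for a real image backbone
image_input = Input(shape=(224, 224, 3))
x = layers.Conv2D(64, (3, 3), activation='relu')(image_input)
x = layers.GlobalAveragePooling2D()(x)
encoder_output = layers.Dense(256, activation='relu')(x)  # (batch, 256)

# Decoder
caption_input = Input(shape=(None,))  # integer token IDs
embedding = layers.Embedding(vocab_size, 256)(caption_input)  # (batch, T, 256)
# Prepend the image feature as the first time step. Concatenating the 2-D
# feature directly with the 3-D embedding would fail on mismatched ranks.
image_step = layers.Reshape((1, 256))(encoder_output)
merged = layers.Concatenate(axis=1)([image_step, embedding])  # (batch, T + 1, 256)
lstm_out = layers.LSTM(512, return_sequences=True)(merged)
output = layers.Dense(vocab_size, activation='softmax')(lstm_out)

model = Model(inputs=[image_input, caption_input], outputs=output)
# 'categorical_crossentropy' expects one-hot targets; with integer token IDs,
# use 'sparse_categorical_crossentropy' instead.
model.compile(optimizer='adam', loss='categorical_crossentropy')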

'''
Translating Between Japanese and English
Steps:

Use a Japanese tokenizer like MeCab or SentencePiece (Japanese text does not separate words with spaces).

Prepare a parallel corpus (e.g., JESC, Tatoeba, or the Kyoto Free Translation Task dataset).

Train a Seq2Seq or Transformer model, or fine-tune a pretrained one (Hugging Face's MarianMT or T5 are common starting points).

For inference, ensure proper preprocessing:

Japanese → tokenization (subwords)

English → detokenization



Advanced Machine Translation Methods

Attention Mechanisms (Bahdanau, Luong)

Transformers (Vaswani et al., 2017): the architecture behind translation models such as MarianMT, and also behind BERT and GPT, which are not translation models themselves.

Multilingual Models: a single model trained on multiple languages (mBART, mT5).

Pre-trained Models with Fine-tuning: start with a general MT model, then fine-tune it on your specific domain.



Generating Images from Text (Opposite of Captioning)
Techniques:

Diffusion Models (e.g., Stable Diffusion, DALL·E 2, Midjourney)

GANs (StackGAN, AttnGAN — text-conditioned image generation)

CLIP + Diffusion (guiding generation with text embeddings)

Neural Rendering (NeRF-based, though mainly 3D)

Basic Flow for Text-to-Image:

Encode the text into a vector representation (BERT/CLIP encoder).

Feed it into a generative model (Diffusion or GAN).

Generate image pixels conditioned on the text embedding.
'''
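
The attention mechanisms listed above under advanced machine translation methods can be illustrated in a few lines. The sketch below implements dot-product scoring in the style of Luong attention for a single decoder step, in NumPy. It is a minimal illustration with made-up shapes, not a trainable layer; the function name `dot_product_attention` and the toy inputs are assumptions made for the example.

```python
import numpy as np


def dot_product_attention(decoder_state, encoder_states):
    """Attend over encoder states for one decoder step.

    decoder_state:  (hidden,)          current decoder hidden state
    encoder_states: (src_len, hidden)  one hidden state per source position
    Returns the context vector (hidden,) and the attention weights (src_len,).
    """
    scores = encoder_states @ decoder_state   # similarity per source position
    scores = scores - scores.max()            # stabilize the softmax numerically
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ encoder_states        # weighted sum of encoder states
    return context, weights


rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 8))  # 5 source positions, hidden size 8
decoder_state = rng.normal(size=(8,))
context, weights = dot_product_attention(decoder_state, encoder_states)
print(weights.round(3), context.shape)
```

In a full model the context vector is combined with the decoder state before predicting the next token, so the decoder can look at different source positions at each step instead of relying on a single fixed encoding.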