random: For randomly selecting words during text generation.

re: Regular expressions used to extract words from the input text.

defaultdict: A dictionary subclass that creates a default value for any missing key. Here `defaultdict(list)` gives every unseen key an empty list (used to store the model).

In [1]:
import random
import re
from collections import defaultdict
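
To make the roles of `re` and `defaultdict` concrete, here is a small sketch on a made-up sentence (the sample string and the `followers` name are assumptions for illustration, not taken from `input.txt`). It shows that the pattern `\b\w+\b` discards punctuation and splits on apostrophes, and that `defaultdict(list)` supplies an empty list the first time a key is used.

```python
import re
from collections import defaultdict

# Hypothetical sample text, chosen to show how punctuation is handled.
sample = "Open the file's URL. Open it!"

# \w+ matches runs of letters, digits, and underscores, so punctuation
# is discarded and "file's" becomes two tokens: "file" and "s".
tokens = re.findall(r'\b\w+\b', sample.lower())
print(tokens)  # ['open', 'the', 'file', 's', 'url', 'open', 'it']

# defaultdict(list) creates an empty list on first access to a missing key,
# so we can append without checking whether the key already exists.
followers = defaultdict(list)
for first, second in zip(tokens, tokens[1:]):
    followers[first].append(second)

print(followers['open'])       # ['the', 'it']
print('missing' in followers)  # False: membership tests do not create keys
```

The apostrophe split is why fragments such as `file s properties` can show up in generated text.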

- **Class Name**: `MarkovChainTextGenerator`
- **Purpose**: Generates text based on a simple Markov chain model.
- **Key Components**:
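  - `__init__()`: Initializes an empty `defaultdict(list)` that maps each word pair to the words seen after it.
  - `train(text)`:
    - Lowercases the input text and splits it into words (punctuation is discarded).
    - Adds a `<START>` marker before the first word and an `<END>` marker after the last.
    - Builds a model of word pairs and the words that follow them.
  - `generate(length=50)`:
    - Starts with a word that followed `<START>` in the training text.
    - Repeatedly picks the next word at random from those that followed the previous two words.
    - Stops when it reaches `<END>`, finds no continuation, or hits the desired length.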
- **How it works**:
  - Learns how words follow each other in the input text.
  - Uses random selection to build new, similar-sounding text.
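
As a worked example of the training step described above, the following sketch builds the pair-to-followers table for one short sentence. The helper name `build_model` and the sample sentence are assumptions made for illustration; the logic mirrors the description rather than replacing the class.

```python
import re
from collections import defaultdict

def build_model(text):
    """Hypothetical helper mirroring the training steps described above."""
    words = ['<START>'] + re.findall(r'\b\w+\b', text.lower()) + ['<END>']
    model = defaultdict(list)
    # Slide a window of three words: the first two form the key,
    # the third is recorded as a possible follower.
    for first, second, third in zip(words, words[1:], words[2:]):
        model[(first, second)].append(third)
    return model

model = build_model("the cat sat on the cat bed")
for pair, nexts in model.items():
    print(pair, '->', nexts)

# The pair ('the', 'cat') occurs twice, so it has two followers.
print(model[('the', 'cat')])  # ['sat', 'bed']
```

A pair with several followers is where generation can branch. A follower that appears more than once in the list is proportionally more likely to be picked.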


In [7]:
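class MarkovChainTextGenerator:
    def __init__(self):
        # Maps each (word, next_word) pair to the words seen after it.
        self.model = defaultdict(list)

    def train(self, text):
        # Lowercase, then keep runs of word characters; punctuation is dropped.
        words = re.findall(r'\b\w+\b', text.lower())
        words = ['<START>'] + words + ['<END>']
        for i in range(len(words) - 2):
            key = (words[i], words[i + 1])
            self.model[key].append(words[i + 2])

    def generate(self, length=50):
        # Opening candidates: every word that followed <START> in training.
        starts = [w2 for (w1, w2) in self.model if w1 == '<START>']
        if not starts or length <= 0:
            return ''  # untrained model or empty request
        current = ('<START>', random.choice(starts))
        output = [current[1]]  # the opening word is part of the text
        while len(output) < length:
            next_words = self.model.get(current)
            if not next_words:
                break
            next_word = random.choice(next_words)
            if next_word == '<END>':
                break
            output.append(next_word)
            current = (current[1], next_word)
        return ' '.join(output)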
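
To trace the generation walk in isolation, here is a minimal sketch that runs on a hand-written model instead of a trained one. The `walk` helper, the tiny model, and the seed value are all assumptions for illustration. This sketch keeps the opening word in its output and uses a seeded `random.Random` so repeated runs produce the same text.

```python
import random

# Hypothetical hand-built model in the same shape that train() produces.
model = {
    ('<START>', 'the'): ['cat'],
    ('the', 'cat'): ['sat', 'bed'],
    ('cat', 'sat'): ['on'],
    ('sat', 'on'): ['the'],
    ('on', 'the'): ['cat'],
    ('cat', 'bed'): ['<END>'],
}

def walk(model, length, rng):
    """Hypothetical helper: follow the chain from <START> for up to `length` words."""
    starts = [w2 for (w1, w2) in model if w1 == '<START>']
    current = ('<START>', rng.choice(starts))
    output = [current[1]]
    while len(output) < length:
        next_words = model.get(current)
        if not next_words:
            break  # no recorded continuation for this pair
        next_word = rng.choice(next_words)
        if next_word == '<END>':
            break
        output.append(next_word)
        current = (current[1], next_word)
    return output

rng = random.Random(0)  # seeded so the walk is reproducible
print(' '.join(walk(model, 12, rng)))
```

In this model the pair `('the', 'cat')` can lead back to itself through `sat`, `on`, `the`. Because each step depends only on the last two words, a walk that re-enters such a pair may repeat the same phrase, which is one likely reason generated text sometimes loops.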



- **Purpose**: Runs the code only when the script is executed directly.
- **Steps Performed**:
  - Opens and reads the contents of `input.txt`.
  - Creates an instance of `MarkovChainTextGenerator`.
  - Trains the model on the input text.
  - Generates and prints up to 100 words of text based on the trained model (fewer if `<END>` is reached first).
- **Special Note**:
  - The `if __name__ == "__main__":` check ensures this part doesn't run if the script is imported as a module elsewhere.


In [11]:
if __name__ == "__main__":
    with open("input.txt", "r", encoding="utf-8") as f:
        text = f.read()

    generator = MarkovChainTextGenerator()
    generator.train(text)

    print("\n📢 Generated Text:\n")
    print(generator.generate(length=100))


📢 Generated Text:

4 create a private bucket in aws upload a file and locate the uploaded file and locate the uploaded files will be accessible publicly until the url into a new browser tab and press enter 3 expected output you should see an access denied message indicating the file s properties 2 open the object url in a web browser o expected output you should see an access denied message indicating the file s properties 2 open the object url in a web browser o expected output you should see an access denied message indicating the file s properties 2 open
