
In [1]:
! pip install markovify

Collecting markovify
  Downloading markovify-0.9.4.tar.gz (27 kB)
  Preparing metadata (setup.py) ... done
Collecting unidecode (from markovify)
  Downloading Unidecode-1.4.0-py3-none-any.whl.metadata (13 kB)
Downloading Unidecode-1.4.0-py3-none-any.whl (235 kB)
Building wheels for collected packages: markovify
  Building wheel for markovify (setup.py) ... done
  Created wheel for markovify: filename=markovify-0.9.4-py3-none-any.whl size=18606 sha256=ebed1987e4b6160cab78d4dd4206d5b5750e98c1bec661fc4f0a272905c01438
  Stored in directory: /root/.cache/pip/wheels/9c/20/eb/1a3fb93f3132f2f9683e4efd834800f80c53aeddf50e84ae80
Successfully built markovify
Installing collected packages: unidecode, markovify
Successfully installed markovify-0.9.4 unidecode-1.4.0


In [2]:
# Markov Chain Text Generator
# This implementation uses character-level Markov chains to generate text

from collections import defaultdict
import random
import requests

In [3]:
class MarkovChain:
    def __init__(self, order=2):
        """
        Initialize the Markov Chain with a given order.
        Order determines how many previous characters to consider for prediction.
        """
        self.order = order
        self.model = defaultdict(lambda: defaultdict(int))
        self.start_states = []

    def train(self, text):
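        """
        Train the model on the given text.
        For every `order`-character state, count how often each next
        character follows it. Texts no longer than `order` are ignored.
        """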
        if len(text) <= self.order:
            return

        for i in range(len(text) - self.order):
            # Get the current state (sequence of 'order' characters)
            current_state = text[i:i+self.order]

            # Get the next character
            next_char = text[i+self.order]

            # Update the model
            self.model[current_state][next_char] += 1

            # Record the opening state of this text (one per train() call);
            # generate() picks its starting point from these
            if i == 0:
                self.start_states.append(current_state)

    def generate(self, length=100, start_state=None):
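        """
        Generate up to `length` characters of text.
        `start_state` is truncated to `order` characters; if it was never
        seen in training, a recorded start state is used instead.
        Generation stops early if the current state has no known successor.
        """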
        if not self.model:
            return ""

        # Choose a random start state if none provided
        if start_state is None:
            current_state = random.choice(self.start_states)
        else:
            current_state = start_state[:self.order]
            if current_state not in self.model:
                current_state = random.choice(self.start_states)

        generated_text = current_state

        for _ in range(length - self.order):
            # Get possible next characters and their counts.
            # .get() avoids inserting empty entries into the defaultdict.
            next_chars = self.model.get(current_state)

            # Dead end: this state has no recorded successor
            if not next_chars:
                break

            # Choose the next character in proportion to its count;
            # random.choices treats the raw counts as relative weights
            next_char = random.choices(
                list(next_chars.keys()),
                weights=list(next_chars.values()),
                k=1
            )[0]

            generated_text += next_char

            # Update current state
            current_state = generated_text[-self.order:]

        return generated_text
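
To make the table that `train` builds more concrete, the following is a minimal standalone sketch that constructs the same kind of state-to-next-character counts for a tiny string. The helper name `build_transitions` and the sample string are illustrative assumptions, not part of the class above.

```python
from collections import Counter, defaultdict


def build_transitions(text, order):
    """Hypothetical helper: map each `order`-character state to next-character counts."""
    table = defaultdict(Counter)
    for i in range(len(text) - order):
        state = text[i:i + order]          # the current `order` characters
        table[state][text[i + order]] += 1  # the character that follows them
    return table


# Sample input chosen only for illustration
table = build_transitions("abracadabra", order=2)
for state, counts in table.items():
    print(repr(state), dict(counts))
```

In this sample, the state "ab" occurs twice and is followed by "r" both times, so that transition gets a count of 2, while a state such as "ra" has a single recorded successor. Sampling from these counts is what `generate` does at each step.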

In [6]:
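# Download some sample text (Alice in Wonderland, from Project Gutenberg)
url = "https://www.gutenberg.org/files/11/11-0.txt"
response = requests.get(url, timeout=30)
response.raise_for_status()  # Fail loudly instead of training on an error page
text = response.text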

# Flatten line breaks to spaces so the model sees one continuous string
text = text.replace('\r', ' ').replace('\n', ' ')
text = text[:10000]  # Use only the first 10,000 characters (this includes the Gutenberg header and table of contents)

# Create and train the model
markov = MarkovChain(order=3)
markov.train(text)

# Generate some text
generated_text = markov.generate(length=200)
print("Generated Text:")
print(generated_text)

# Try a higher order: longer states have fewer distinct successors,
# so the output follows the source text much more closely
print("\nTrying with order=10:")
markov = MarkovChain(order=10)
markov.train(text)
generated_text = markov.generate(length=10000)
# Start a new line after each sentence-ending period for readability
formatted_text = generated_text.replace('. ', '.\n')
print(formatted_text)

Generated Text:
*** START OF THE PROJECT GUTENBERG EBOOK 11 ***    *            *    *     *   *   Alice had picking do _very some see it.  In a passaged ope.”  Do came a rand to look fill it Send? “I wouldn’t a long

Trying with order=10:
*** START OF THE PROJECT GUTENBERG EBOOK 11 ***  [Illustration]     Alice’s Adventures in Wonderland  by Lewis Carroll  THE MILLENNIUM FULCRUM EDITION 3.0  Contents   CHAPTER IX.
   The Mock Turtle’s Story  CHAPTER XI.
   Who Stole the Tarts?  CHAPTER IV.
   The Rabbit was no longer to be seen: she found herself in a long, low hall, which was lit up by a row of lamps hanging from the roof.
 There were doors all round the hall, but they were all locked; and when Alice had no idea what Latitude or Longitude either, but the Rabbit-Hole  CHAPTER VI.
   Pig and Pepper  CHAPTER I.
    Down the Rabbit-Hole   Alice was not a _very_ good opportunity for showing off her knowledge, as there was nothing else to do, so Alice ventured to taste it, and finding it v
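
The order=10 output reads almost like the source itself. One way to see why is to measure what fraction of states have more than one distinct successor: when that fraction is small, the chain rarely has a real choice and mostly copies the training text. The sketch below is a rough diagnostic under that assumption; `branching_fraction` and the sample string are hypothetical and not part of the notebook.

```python
from collections import defaultdict


def branching_fraction(text, order):
    """Hypothetical diagnostic: share of states with more than one distinct next character."""
    successors = defaultdict(set)
    for i in range(len(text) - order):
        successors[text[i:i + order]].add(text[i + order])
    if not successors:
        return 0.0  # text too short to form any state
    branching = sum(1 for chars in successors.values() if len(chars) > 1)
    return branching / len(successors)


# Tiny sample for illustration; on the notebook's `text` the same call
# could be used to compare order=3 with order=10
sample = "abcabdabe"
for order in (1, 2, 3):
    print(order, branching_fraction(sample, order))
```

On this sample the fraction drops to zero by order 3, meaning every state has exactly one continuation and generation becomes a replay of the input. Running the same function on the training text for several orders is one way to pick an order that balances coherence against copying.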