<a href="https://colab.research.google.com/github/Mahecoding/PRODIGY_GA_03/blob/main/PRODIGY_GA_03.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Text Generation with Markov Chains**
**Overview:** Markov chains are mathematical models for generating sequences in which the next element (a word or a character) is chosen based on the current state or on the previous few states.

In text generation, this means each new word is sampled according to the word(s) that precede it. The model is "memoryless" in the sense that it depends only on a fixed number of previous states and ignores everything earlier in the sequence. The implementation in this notebook uses a single preceding word as the state.
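To make the idea concrete, here is a minimal sketch of a first-order word chain built from a toy sentence. The sentence and the names `toy_text` and `transitions` are illustrative assumptions, not part of the notebook's dataset. Each word maps to the list of words that followed it, and repeats are kept so that picking uniformly from the list reproduces the observed frequencies.

```python
import random
from collections import Counter

# Toy corpus, assumed only for illustration
toy_text = "the cat sat on the mat the cat ran"
words = toy_text.split()

# Map each word to every word that followed it (duplicates kept on purpose)
transitions = {}
for current, following in zip(words, words[1:]):
    transitions.setdefault(current, []).append(following)

# Turn the successor list for "the" into empirical probabilities
counts = Counter(transitions["the"])
total = sum(counts.values())
probabilities = {word: n / total for word, n in counts.items()}

print(transitions["the"])                 # ['cat', 'mat', 'cat']
print(probabilities)                      # 'cat' has weight 2/3, 'mat' has 1/3
print(random.choice(transitions["the"]))  # 'cat' roughly two times out of three
```

Keeping duplicates in the list is a simple way to get frequency-weighted sampling without storing explicit probabilities.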

In [7]:
!pip install gradio

In [8]:
import random
import gradio as gr

In [9]:
from google.colab import drive

# Mount Google Drive so the dataset file is reachable
drive.mount('/content/drive')

# Read the whole dataset file into a single string
def load_dataset(file_path):
    with open(file_path, 'r', encoding='utf-8') as file:
        return file.read()

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

In [10]:
# Load the dataset from a file in Google Drive
# (load_dataset must already be defined, so this cell runs after the one above)
file_path = '/content/drive/MyDrive/Colab Notebooks/PRODIGY/dataset.txt'
dataset_text = load_dataset(file_path)


In [12]:

# Step 1: Function to generate word pairs from text
def generate_word_pairs(text):
    words = text.split()  # Split on any run of whitespace
    for i in range(len(words) - 1):
        yield (words[i], words[i + 1])

# Step 2: Function to build the Markov chain from word pairs
def build_markov_chain(text):
    chain = {}
    for word1, word2 in generate_word_pairs(text):
        if word1 in chain:
            chain[word1].append(word2)
        else:
            chain[word1] = [word2]
    return chain

# Step 3: Function to generate text using the Markov chain
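def generate_text(chain, length=10, seed_word=None):
    if not chain:
        return ''  # Nothing to sample from (empty or one-word dataset)

    length = int(length)  # Gradio sliders may pass a float
    states = list(chain.keys())  # Computed once instead of on every step
    if seed_word is None or seed_word not in chain:
        seed_word = random.choice(states)

    word = seed_word
    generated_words = [word]

    for _ in range(length - 1):
        # Restart from a random state if the current word has no recorded successor
        successors = chain.get(word) or [random.choice(states)]
        word = random.choice(successors)
        generated_words.append(word)

    return ' '.join(generated_words)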

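# Step 4: Gradio interface function
# Build the chain once up front instead of on every request
markov_chain = build_markov_chain(dataset_text)

def gradio_markov_interface(seed_word, length):
    # Strip stray whitespace; an empty seed falls back to a random starting word
    seed_word = (seed_word or '').strip()
    return generate_text(markov_chain, length=int(length), seed_word=seed_word)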

# Step 5: Create the Gradio interface
interface = gr.Interface(
    fn=gradio_markov_interface,
    inputs=[
        gr.Textbox(placeholder="Enter a seed word (optional)", label="Seed Word (Optional)"),
        gr.Slider(minimum=5, maximum=100, step=1, value=20, label="Length of Generated Text")
    ],
    outputs="text",
    title="Markov Chain Text Generator",
    description="Generate text based on Markov chains using the dataset loaded from Google Drive. Set a seed word if desired and choose the length of the generated text."
)

# Step 6: Launch the Gradio app
interface.launch()


Setting queue=True in a Colab notebook requires sharing enabled. Setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://cd0586de16229d705f.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
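The overview mentions chains that condition on the previous few states, while the cell above uses only one preceding word. As a hedged sketch of how that could be extended, the code below keys the chain on tuples of `order` consecutive words. The names `build_ngram_chain` and `generate_ngram_text`, the `rng` parameter, and the sample sentence are hypothetical and are not part of the notebook.

```python
import random

def build_ngram_chain(text, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = {}
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain.setdefault(state, []).append(words[i + order])
    return chain

def generate_ngram_text(chain, length=10, rng=random):
    """Generate up to `length` words, stopping early at a dead-end state."""
    if not chain:
        return ''
    state = rng.choice(list(chain.keys()))
    output = list(state)
    while len(output) < length:
        successors = chain.get(state)
        if not successors:
            break  # No observed continuation for this state
        next_word = rng.choice(successors)
        output.append(next_word)
        # Slide the window: drop the oldest word, append the newest
        state = state[1:] + (next_word,)
    return ' '.join(output[:length])

# Illustrative sample text (an assumption, not the notebook's dataset)
sample = "the cat sat on the mat and the cat ran to the mat"
chain = build_ngram_chain(sample, order=2)
print(generate_ngram_text(chain, length=8))
```

A larger `order` tends to produce more coherent output but needs more text, because each longer state is observed less often and dead ends become more likely.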


