The Next Word Prediction program is a simple text generation application built on a basic n-gram language model. It takes the last few words of the user's input as context, looks up the words that followed that same context in a training text, and returns one of them. Because the returned word is sampled at random from the observed continuations, frequent continuations are proportionally more likely to be chosen, but the result is not guaranteed to be the single most likely word. The core idea is to use the statistical patterns observed in the training text to make an educated guess about the next word in a sequence.
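
To make the idea concrete, the following minimal sketch estimates next-word probabilities by relative frequency on a tiny made-up corpus. The corpus and the names `toy_words`, `context`, and `followers` are illustrative assumptions and are not part of the notebook's pipeline.

```python
from collections import Counter

# Hypothetical toy corpus, used only for illustration
toy_words = "i like tea i like coffee i like tea".split()

# Count which words follow the one-word context "like"
context = "like"
followers = Counter(
    toy_words[i + 1]
    for i in range(len(toy_words) - 1)
    if toy_words[i] == context
)

# Relative frequency: count(context, word) / count(context)
total = sum(followers.values())
probabilities = {word: count / total for word, count in followers.items()}
print(probabilities)
```

Here "tea" follows "like" twice and "coffee" once, so the estimated probabilities are 2/3 and 1/3. The cells below apply the same counting idea to longer contexts and a real text.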

In [1]:
# Import necessary libraries (standard library only)
import re      # regular expressions for text cleaning
import random  # random sampling of candidate next words

In [2]:
# Read the training text (plain prose that still contains punctuation and other special symbols)
with open("/content/1661-0 (1).txt", "r", encoding="utf-8") as file:
    text_data = file.read()

In [3]:
def preprocess_text(text):
    """Remove everything except letters, digits, and whitespace, then lowercase."""
    text = re.sub(r"[^a-zA-Z0-9\s]", "", text)
    text = text.lower()
    return text
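
As a quick check of what this cleaning step does, the sketch below runs the same rule on a made-up sentence. The `sample` string is an assumption chosen to show two side effects: apostrophes and hyphens are deleted rather than replaced by spaces, so the word parts are joined.

```python
import re

def preprocess_text(text):
    # Same cleaning rule as the notebook: keep letters, digits, whitespace
    text = re.sub(r"[^a-zA-Z0-9\s]", "", text)
    return text.lower()

# Hypothetical sample input, used only for illustration
sample = "Mr. Holmes's well-known case, No. 221B!"
cleaned = preprocess_text(sample)
print(cleaned)
print(cleaned.split())
```

With this rule, "Holmes's" becomes "holmess" and "well-known" becomes "wellknown", which is worth keeping in mind when reading the predictions further down.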

In [4]:
preprocessed_text = preprocess_text(text_data)
words = preprocessed_text.split()

In [5]:
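# Build n-gram frequencies
def build_ngram_frequencies(words, n):
    """Map each n-word context (a tuple) to the list of words that followed it.

    Repeated followers are kept, so a word's frequency is the number of
    times it appears in the list.
    """
    ngrams = {}
    for i in range(len(words) - n):
        ngram = tuple(words[i:i+n])   # the n context words
        next_word = words[i+n]        # the word that follows them
        ngrams.setdefault(ngram, []).append(next_word)
    return ngrams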
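
To see what the resulting table looks like, the sketch below builds it for a short made-up sentence. The function is a compact copy of the one above so the example runs on its own; `toy_words` and `table` are illustrative names.

```python
def build_ngram_frequencies(words, n):
    # Map each n-word context to the list of words observed after it
    ngrams = {}
    for i in range(len(words) - n):
        ngrams.setdefault(tuple(words[i:i + n]), []).append(words[i + n])
    return ngrams

# Hypothetical toy corpus, used only for illustration
toy_words = "the cat sat and the cat sat and the cat ran".split()
table = build_ngram_frequencies(toy_words, 2)

for context, followers in table.items():
    print(context, "->", followers)
```

The context `("the", "cat")` maps to `["sat", "sat", "ran"]`. Keeping the duplicate "sat" is what lets a later random draw favor the more frequent continuation.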

In [6]:
ngram_order = 2  # Number of context words; with 2, the previous two words predict the third
ngram_frequencies = build_ngram_frequencies(words, ngram_order)

In [7]:
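# Predict the next word based on user input
def predict_next_word(user_input):
    """Return a word seen after the last `ngram_order` words of the input."""
    # Clean the input the same way as the training text
    user_input = preprocess_text(user_input)
    user_words = user_input.split()

    if len(user_words) < ngram_order:
        return "Input more words to predict"

    # Use only the last `ngram_order` words as the lookup context
    ngram = tuple(user_words[-ngram_order:])
    if ngram in ngram_frequencies:
        next_word_candidates = ngram_frequencies[ngram]
        # Duplicates in the list make frequent continuations more likely
        return random.choice(next_word_candidates)
    return "No prediction available"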
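
The function above samples a continuation at random, so repeated calls with the same input can return different words. If a deterministic "most likely" answer is preferred, one possible variant is sketched below. `predict_most_likely` is a hypothetical helper, not part of the original notebook, and the `followers` list stands in for one entry of the n-gram table.

```python
from collections import Counter

def predict_most_likely(candidates):
    """Return the most frequent candidate and its relative frequency."""
    counts = Counter(candidates)
    word, count = counts.most_common(1)[0]
    return word, count / len(candidates)

# Hypothetical list of observed followers for a single context
followers = ["sat", "sat", "ran"]
best_word, probability = predict_most_likely(followers)
print(best_word, round(probability, 2))
```

Random sampling gives more varied output when generating text, while picking the most frequent follower gives repeatable predictions. Which one fits better depends on how the predictor is used.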

In [8]:
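# User interaction loop: type "exit" to quit
while True:
    user_input = input("Enter a few words: ")
    # Ignore surrounding whitespace and case when checking for the exit command
    if user_input.strip().lower() == "exit":
        break
    predicted_word = predict_next_word(user_input)
    print("Predicted next word:", predicted_word)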

Enter a few words: the
Predicted next word: Input more words to predict
Enter a few words: yes
Predicted next word: Input more words to predict
Enter a few words: Title: The Adventures of Sherlock Holmes
Predicted next word: alone
Enter a few words: Arthur Conan 
Predicted next word: doyle
Enter a few words: Produced by an anonymous
Predicted next word: project
Enter a few words: You may copy
Predicted next word: it
Enter a few words: exit
