**Next Word Predictor**

In [1]:
import re
from collections import defaultdict


Dataset

In [2]:
# Small corpus of everyday sentences, one sentence per line
corpus = """
hello how are you doing today
hello what is your name
hello it is nice to meet you
how are you feeling today
how is the weather outside
how was your day at work
what are you doing right now
what is the plan for tomorrow
what time will the meeting start
where are you going for lunch
where is the nearest coffee shop
where can I find a good restaurant nearby
the weather is sunny and warm today
the weather forecast predicts rain tomorrow
it is going to be a beautiful day
the team meeting has been postponed
the project deadline is approaching fast
we need to finalize the report by monday
are you coming to the office tomorrow
are you free this evening for a call
can you please send me the latest report
can you let me know the meeting agenda
traveling is a great way to explore new places
travel plans for this summer include visiting europe
do you have any recommendations for good travel destinations
the flight to new york is scheduled for tomorrow morning
the train to boston will arrive at 5 pm
please confirm your hotel reservations before the trip
how was your recent trip to the mountains
what did you enjoy most about your vacation
learning new skills is essential for career growth
attending workshops and seminars can help improve knowledge
reading books is a great way to gain new perspectives
technology is advancing at a rapid pace
artificial intelligence is transforming many industries
machine learning and data science are popular career paths
programming languages like python and java are widely used
writing clean and efficient code is important for software development
staying healthy requires regular exercise and a balanced diet
eating fruits and vegetables is good for your health
drinking plenty of water helps keep you hydrated
a good night's sleep is important for mental health
spending time with family and friends improves happiness
life is about finding balance between work and personal time
hobbies like painting, reading, and gardening can be relaxing
watching movies and listening to music are popular pastimes
sports like football, basketball, and tennis are enjoyed worldwide
success comes with hard work and dedication
challenges are opportunities for growth and learning
perseverance and consistency are keys to achieving goals
"""

Preprocess the Data

In [3]:
# Clean and tokenize the corpus
def preprocess(text):
    # Keep only ASCII letters and whitespace; this drops punctuation,
    # digits, and apostrophes (e.g. "night's" becomes "nights")
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    return text.lower().split()
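
To make the effect of the cleaning step concrete, the sketch below runs a copy of `preprocess` on two lines taken from the corpus, one with an apostrophe and one with a digit. The variable names `with_apostrophe` and `with_digit` are only illustrative.

```python
import re

def preprocess(text):
    # Same logic as the notebook cell: keep letters and whitespace only
    text = re.sub(r'[^a-zA-Z\s]', '', text)
    return text.lower().split()

with_apostrophe = preprocess("a good night's sleep is important")
with_digit = preprocess("the train to boston will arrive at 5 pm")

print(with_apostrophe)  # ['a', 'good', 'nights', 'sleep', 'is', 'important']
print(with_digit)       # ['the', 'train', 'to', 'boston', 'will', 'arrive', 'at', 'pm']
```

Note that the digit disappears entirely, so "at" ends up directly followed by "pm" in the token list.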


Training the Model

In [4]:
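def train_next_word_model(corpus):
    # Map each word to the list of words observed immediately after it
    model = defaultdict(list)
    words = preprocess(corpus)
    # The corpus is tokenized as one continuous stream, so the last word
    # of a line is also paired with the first word of the next line
    for i in range(len(words) - 1):
        model[words[i]].append(words[i + 1])
    return model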
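
Because the training function above tokenizes the whole corpus at once, word pairs can span two unrelated sentences. One possible variation, sketched here under the assumption that each line of the text is an independent sentence, counts pairs line by line. The name `train_by_line` and the `sample` text are hypothetical and not part of the notebook.

```python
import re
from collections import Counter, defaultdict

def train_by_line(text):
    """Count word -> next-word pairs without crossing line boundaries."""
    counts = defaultdict(Counter)
    for line in text.strip().splitlines():
        # Same cleaning rule as preprocess: letters and whitespace only
        words = re.sub(r'[^a-zA-Z\s]', '', line).lower().split()
        # Pair each word with the word that follows it on the same line
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

sample = "hello how are you\nhow are things\nyou look well"
counts = train_by_line(sample)

print(counts["how"]["are"])     # 2
print("how" in counts["you"])   # False: "you" ends line 1, "how" starts line 2
print("things" in counts)       # False: "things" only appears at the end of a line
```

Storing a `Counter` per word instead of a list also keeps the frequencies ready for lookup, at the cost of a slightly different model structure than the one used in the rest of the notebook.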


In [5]:
def predict_next_word(model, word):
    if word in model:
        # Most frequent next word; ties are broken by set iteration
        # order, so the result among tied words is arbitrary
        return max(set(model[word]), key=model[word].count)
    else:
        return "No prediction available"

Testing the Model

In [6]:
# Training the model
model = train_next_word_model(corpus)

# User input for word prediction
print("Simple Next-Word Predictor (type 'exit' to quit)")
while True:
    word = input("\nEnter a word: ").strip().lower()
    if word == 'exit':
        break
    next_word = predict_next_word(model, word)
    print(f"Predicted next word: {next_word}")

Simple Next-Word Predictor (type 'exit' to quit)

Enter a word: hello
Predicted next word: what

Enter a word: what
Predicted next word: is

Enter a word: is
Predicted next word: the

Enter a word: the
Predicted next word: weather

Enter a word: exit
