# Predicting the Next Word: Historical Evolution of NLP

This notebook walks through the evolution of NLP techniques for the task of predicting the **next word** in a sentence. We’ll build models from scratch using the following techniques:

1. Basic Python and dictionaries
2. Regex-based patterns
3. N-gram language model

The first and last sections predict the next word for a small example sentence. The IVR (interactive voice response) sections in between apply the same rule-based ideas to routing caller input. Everything is built **from scratch**, without any pre-trained tools.


## Predicting the Next Word with Basic Python

In [29]:
def next_word_predictor(context):
    if context == "the":
        return "sun"
    elif context == "sun":
        return "rises"
    elif context == "rises":
        return "in"
    elif context == "in":
        return "the"
    elif context == "the east":
        return "and"
    elif context == "and":
        return "sets"
    elif context == "sets":
        return "in"
    elif context == "the west":
        return "."
    elif context == "in the":
        return "east"  # ambiguous: "west" is equally valid here, but a fixed rule can return only one
    else:
        return "unknown"

# Test the function
contexts = ["the", "sun", "rises", "in", "the east", "and", "sets", "in the", "the west"]

for c in contexts:
    print(f"Input: '{c}' ➜ Predicted next word: '{next_word_predictor(c)}'")

Input: 'the' ➜ Predicted next word: 'sun'
Input: 'sun' ➜ Predicted next word: 'rises'
Input: 'rises' ➜ Predicted next word: 'in'
Input: 'in' ➜ Predicted next word: 'the'
Input: 'the east' ➜ Predicted next word: 'and'
Input: 'and' ➜ Predicted next word: 'sets'
Input: 'sets' ➜ Predicted next word: 'in'
Input: 'in the' ➜ Predicted next word: 'east'
Input: 'the west' ➜ Predicted next word: '.'
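
The same hard-coded rules can also be stored in a dictionary, which keeps the data separate from the lookup logic. The sketch below is a minimal illustration, not part of the original notebook; the names `NEXT_WORD` and `next_word_lookup` are hypothetical.

```python
# Hypothetical lookup table mirroring the if/elif chain above.
NEXT_WORD = {
    "the": "sun",
    "sun": "rises",
    "rises": "in",
    "in": "the",
    "the east": "and",
    "and": "sets",
    "sets": "in",
    "the west": ".",
    "in the": "east",
}

def next_word_lookup(context):
    """Return the stored next word, or 'unknown' for an unseen context."""
    return NEXT_WORD.get(context, "unknown")

for c in ["the", "in the", "moon"]:
    print(f"Input: '{c}' -> Predicted next word: '{next_word_lookup(c)}'")
```

Like the `if`/`elif` version, this table can hold only one answer per context, so it cannot tell "in the east" from "in the west".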


## IVR

In [30]:
def ivr_system(user_input):
    if user_input == "1":
        return "You selected Account Balance. Your balance is $2,540."
    elif user_input == "2":
        return "You selected Credit Card Services. Press 1 for Bill Payment, 2 for Card Lost."
    elif user_input == "2-1":
        return "You selected Bill Payment. Your due is $120, payable by June 15."
    elif user_input == "2-2":
        return "You selected Card Lost. A block request has been initiated."
    elif user_input == "3":
        return "Connecting you to a representative... Please wait."
    elif user_input == "0":
        return "Thank you for calling ABC Bank. Goodbye!"
    else:
        return "Invalid option. Please try again."

# 🧪 Simulate user input:
inputs = ["1", "2", "2-1", "2-2", "3", "0", "9"]

for i in inputs:
    print(f"Input: {i} ➜ Response: {ivr_system(i)}")

Input: 1 ➜ Response: You selected Account Balance. Your balance is $2,540.
Input: 2 ➜ Response: You selected Credit Card Services. Press 1 for Bill Payment, 2 for Card Lost.
Input: 2-1 ➜ Response: You selected Bill Payment. Your due is $120, payable by June 15.
Input: 2-2 ➜ Response: You selected Card Lost. A block request has been initiated.
Input: 3 ➜ Response: Connecting you to a representative... Please wait.
Input: 0 ➜ Response: Thank you for calling ABC Bank. Goodbye!
Input: 9 ➜ Response: Invalid option. Please try again.


## Advanced IVR: Keyword Matching

In [32]:
def keyword_ivr_system(user_input):
    user_input = user_input.lower()

    if "balance" in user_input:
        return "Your current balance is $2,540."
    elif "lost" in user_input and "card" in user_input:  # also covers the phrase "lost card"
        return "Your card has been blocked. A new one will be issued."
    elif "loan" in user_input:
        return "We offer personal, home, and auto loans. Visit abc.com/loans."
    elif "representative" in user_input or "human" in user_input:
        return "Connecting you to a representative... please wait."
    elif "exit" in user_input or "goodbye" in user_input:
        return "Thank you for calling ABC Bank. Have a great day!"
    else:
        return "Sorry, I didn’t catch that. Could you please repeat?"

# 🧪 Test cases
inputs = [
    "Can you tell me my balance?",
    "I lost my card!",
    "Tell me about your loan options",
    "I want to speak to a human",
    "exit",
    "blargle foo"
]

for i in inputs:
    print(f"User: {i}\nSystem: {keyword_ivr_system(i)}\n")

User: Can you tell me my balance?
System: Your current balance is $2,540.

User: I lost my card!
System: Your card has been blocked. A new one will be issued.

User: Tell me about your loan options
System: We offer personal, home, and auto loans. Visit abc.com/loans.

User: I want to speak to a human
System: Connecting you to a representative... please wait.

User: exit
System: Thank you for calling ABC Bank. Have a great day!

User: blargle foo
System: Sorry, I didn’t catch that. Could you please repeat?



## Using Regex

In [33]:
import re

def regex_ivr_system(user_input):
    user_input = user_input.lower()

    if re.search(r"\bbalance\b", user_input):
        return "Your current balance is $2,540."
    
    elif re.search(r"\blost\b.*\bcard\b|\bcard\b.*\blost\b", user_input):
        return "Your card has been blocked. A new one will be issued."
    
    elif re.search(r"\bloan(s)?\b", user_input):
        return "We offer personal, home, and auto loans. Visit abc.com/loans."
    
    elif re.search(r"\b(representative|human|agent)\b", user_input):
        return "Connecting you to a representative... please wait."
    
    elif re.search(r"\b(exit|goodbye|bye)\b", user_input):
        return "Thank you for calling ABC Bank. Have a great day!"
    
    else:
        return "Sorry, I didn’t catch that. Could you please repeat?"

# 🧪 Test inputs
inputs = [
    "What’s my balance?",
    "I think I lost my credit card yesterday",
    "Tell me about your loans and interest rates",
    "Can I speak to a human being?",
    "Okay goodbye",
    "What is foo bar?"
]

for i in inputs:
    print(f"User: {i}\nSystem: {regex_ivr_system(i)}\n")

User: What’s my balance?
System: Your current balance is $2,540.

User: I think I lost my credit card yesterday
System: Your card has been blocked. A new one will be issued.

User: Tell me about your loans and interest rates
System: We offer personal, home, and auto loans. Visit abc.com/loans.

User: Can I speak to a human being?
System: Connecting you to a representative... please wait.

User: Okay goodbye
System: Thank you for calling ABC Bank. Have a great day!

User: What is foo bar?
System: Sorry, I didn’t catch that. Could you please repeat?
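
Regex rules can be applied to next-word prediction as well as to intent matching. The sketch below is one possible approach, not the notebook's own method: each pattern is anchored to the end of the context, and more specific patterns are listed first so they win over shorter ones. The names `RULES` and `regex_next_word` are hypothetical.

```python
import re

# Hypothetical rules: (pattern matched at the end of the context, next word).
# Order matters: longer, more specific patterns come before shorter ones.
RULES = [
    (re.compile(r"\brises in the$"), "east"),
    (re.compile(r"\bsets in the$"), "west"),
    (re.compile(r"\bsun$"), "rises"),
    (re.compile(r"\b(rises|sets)$"), "in"),
    (re.compile(r"\bin$"), "the"),
    (re.compile(r"\bthe$"), "sun"),
]

def regex_next_word(context):
    """Return the next word for the first rule that matches, else 'unknown'."""
    context = context.lower().strip()
    for pattern, word in RULES:
        if pattern.search(context):
            return word
    return "unknown"

for c in ["the sun", "the sun rises in the", "the sun sets in the", "moon"]:
    print(f"Input: '{c}' -> Predicted next word: '{regex_next_word(c)}'")
```

Because the patterns can look back several words, this version separates "rises in the" from "sets in the", which a single-word rule cannot do. The rules are still written by hand.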



## Using an N-gram Language Model

In [20]:
text = "the sun rises in the east the sun sets in the west"

In [14]:
# Tokenize the sentence
tokens = text.lower().split()

In [18]:
from collections import defaultdict, Counter

# Build bigram model
bigram_model = defaultdict(Counter)

In [19]:
bigram_model

defaultdict(collections.Counter, {})

In [20]:
for i in range(len(tokens) - 1):
    bigram_model[tokens[i]][tokens[i+1]] += 1

In [21]:
bigram_model

defaultdict(collections.Counter,
            {'the': Counter({'sun': 2, 'east': 1, 'west': 1}),
             'sun': Counter({'rises': 1, 'sets': 1}),
             'rises': Counter({'in': 1}),
             'in': Counter({'the': 2}),
             'east': Counter({'the': 1}),
             'sets': Counter({'in': 1})})
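
The raw counts above can be normalized into conditional probabilities by dividing each follower count by the total for that word. The sketch below rebuilds the counts so it runs on its own; `counts` and `next_word_probabilities` are illustrative names, not part of the original notebook.

```python
from collections import Counter, defaultdict

text = "the sun rises in the east the sun sets in the west"
tokens = text.lower().split()

# Count how often each word follows each other word (same idea as bigram_model).
counts = defaultdict(Counter)
for current, following in zip(tokens, tokens[1:]):
    counts[current][following] += 1

def next_word_probabilities(word):
    """Return {next_word: P(next_word | word)}, or {} if the word was never seen."""
    followers = counts.get(word)
    if not followers:
        return {}
    total = sum(followers.values())
    return {w: c / total for w, c in followers.items()}

print(next_word_probabilities("the"))  # {'sun': 0.5, 'east': 0.25, 'west': 0.25}
```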

In [22]:
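def predict_next_ngram(word):
    # Return the most frequent follower of `word`; ties go to the follower seen first.
    if word in bigram_model:
        return bigram_model[word].most_common(1)[0][0]
    return ""  # `word` never appeared as the first token of a bigram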

In [27]:
print("Next word after 'in':", predict_next_ngram("in"))

Next word after 'in': the
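
A bigram model sees only one word of context, so it cannot tell whether "east" or "west" should follow "in the". One way to address this, sketched below under the assumption that a longer context is acceptable, is to generalize the counting to n-grams. The helpers `build_ngram_model` and `predict_next` are hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict

text = "the sun rises in the east the sun sets in the west"
tokens = text.lower().split()

def build_ngram_model(tokens, n):
    """Map each (n-1)-word context tuple to a Counter of the words that follow it."""
    model = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        model[context][tokens[i + n - 1]] += 1
    return model

def predict_next(model, context_words):
    """Return the most frequent follower of the context, or '' if it is unseen."""
    followers = model.get(tuple(context_words))
    return followers.most_common(1)[0][0] if followers else ""

four_gram = build_ngram_model(tokens, 4)
print(predict_next(four_gram, ["rises", "in", "the"]))  # east
print(predict_next(four_gram, ["sets", "in", "the"]))   # west
```

With three words of context the two endings no longer collide. The trade-off is sparsity: longer contexts occur less often, so more of them are unseen and return an empty prediction.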
