<a href="https://colab.research.google.com/github/diya0510/LSTM-Word-Predictor/blob/main/Next_Word_Predictor.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
text="""Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think and act like humans. These systems are capable of learning from data, adapting to new inputs, and performing tasks that traditionally require human cognition. AI is a broad field that encompasses subfields such as machine learning, natural language processing, computer vision, robotics, and expert systems.

Machine Learning (ML) is a core component of AI and involves the use of algorithms that can improve automatically through experience. Supervised learning, unsupervised learning, and reinforcement learning are the primary categories of machine learning. Supervised learning relies on labeled data, while unsupervised learning seeks patterns in data without labels. Reinforcement learning involves agents learning to make decisions by interacting with an environment and receiving feedback in the form of rewards or penalties.

Natural Language Processing (NLP) enables computers to understand, interpret, and generate human language. It is used in various applications such as chatbots, machine translation, sentiment analysis, and speech recognition. Techniques in NLP often involve syntactic parsing, semantic analysis, and language modeling. Modern NLP relies heavily on deep learning models like recurrent neural networks (RNNs), transformers, and attention mechanisms.

Computer Vision allows machines to interpret and make decisions based on visual data. It is used in facial recognition, object detection, medical imaging, and autonomous vehicles. Convolutional Neural Networks (CNNs) are a popular architecture for computer vision tasks due to their ability to capture spatial hierarchies in images.

AI also plays a significant role in robotics, where it helps in enabling perception, decision-making, and control in autonomous systems. Robots equipped with AI can perform tasks in manufacturing, healthcare, agriculture, and even space exploration. These robots often rely on sensor fusion, path planning, and reinforcement learning to function effectively in dynamic environments.

Ethical considerations in AI are becoming increasingly important as the technology becomes more pervasive. Issues such as algorithmic bias, data privacy, job displacement, and autonomous decision-making raise questions about accountability, transparency, and fairness. Researchers and policymakers emphasize the importance of responsible AI development that prioritizes human values and societal well-being.

The integration of AI in everyday life is accelerating. Virtual assistants like Siri, Alexa, and Google Assistant use AI to respond to voice commands and perform tasks. Recommendation systems on platforms like Netflix and Amazon utilize AI to personalize content for users. In healthcare, AI algorithms assist in diagnosis, treatment planning, and drug discovery.

As AI continues to evolve, so does its potential to transform industries and redefine human-machine interaction. Emerging areas such as explainable AI (XAI), federated learning, and neuromorphic computing aim to make AI systems more understandable, privacy-preserving, and brain-like in architecture. The future of AI will likely involve greater collaboration between humans and intelligent systems, where machines augment human capabilities rather than replace them.

Despite its promise, AI also presents technical challenges. Data quality, model interpretability, and generalization across domains remain significant hurdles. Researchers are working on improving model robustness, reducing training time, and developing more efficient algorithms that can learn from limited data. Advances in hardware, such as GPUs and TPUs, have also been instrumental in scaling up AI applications.

In conclusion, artificial intelligence represents a transformative force across multiple domains. From its mathematical foundations to its real-world applications, AI is shaping how we work, communicate, and make decisions. Continued innovation, ethical stewardship, and interdisciplinary collaboration will be crucial in unlocking the full potential of AI in the coming years.
"""

In [3]:
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer

In [4]:
tokenizer=Tokenizer()

In [5]:
tokenizer.fit_on_texts([text])

In [8]:
tokenizer.word_index

{'and': 1,
 'in': 2,
 'ai': 3,
 'to': 4,
 'learning': 5,
 'the': 6,
 'of': 7,
 'data': 8,
 'as': 9,
 'human': 10,
 'that': 11,
 'are': 12,
 'systems': 13,
 'is': 14,
 'on': 15,
 'like': 16,
 'a': 17,
 'such': 18,
 'machine': 19,
 'tasks': 20,
 'language': 21,
 'make': 22,
 'its': 23,
 'intelligence': 24,
 'machines': 25,
 'from': 26,
 'computer': 27,
 'vision': 28,
 'algorithms': 29,
 'can': 30,
 'reinforcement': 31,
 'decisions': 32,
 'nlp': 33,
 'it': 34,
 'applications': 35,
 'autonomous': 36,
 'also': 37,
 'more': 38,
 'artificial': 39,
 'humans': 40,
 'these': 41,
 'natural': 42,
 'processing': 43,
 'robotics': 44,
 'involves': 45,
 'use': 46,
 'supervised': 47,
 'unsupervised': 48,
 'relies': 49,
 'with': 50,
 'interpret': 51,
 'used': 52,
 'analysis': 53,
 'recognition': 54,
 'often': 55,
 'involve': 56,
 'neural': 57,
 'networks': 58,
 'architecture': 59,
 'for': 60,
 'significant': 61,
 'where': 62,
 'decision': 63,
 'making': 64,
 'robots': 65,
 'perform': 66,
 'healthcare': 

In [9]:
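input_sequences=[]
# Build one training example per prefix: for each line of the corpus,
# emit its first 2 tokens, then its first 3, and so on up to the full line.
for sentence in text.split("\n"):
  tokenized_sentence=tokenizer.texts_to_sequences([sentence])[0]

  for i in range(1,len(tokenized_sentence)):
    input_sequences.append(tokenized_sentence[:i+1])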
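The loop above turns each line into every prefix of two or more tokens, so that the last token of each prefix can later serve as the label. A minimal pure-Python sketch of that step on a made-up token list (the helper name `prefix_sequences` and the IDs are illustrative, not taken from the tokenizer):

```python
def prefix_sequences(token_ids):
    """Return every prefix of token_ids that contains at least two tokens."""
    return [token_ids[: i + 1] for i in range(1, len(token_ids))]


toy_ids = [5, 9, 2, 7]  # hypothetical token IDs, for illustration only
prefixes = prefix_sequences(toy_ids)
print(prefixes)  # [[5, 9], [5, 9, 2], [5, 9, 2, 7]]
```

A line with n tokens therefore contributes n - 1 training examples, and a line with a single token contributes none.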


In [13]:
input_sequences

[[39, 24],
 [39, 24, 3],
 [39, 24, 3, 78],
 [39, 24, 3, 78, 4],
 [39, 24, 3, 78, 4, 6],
 [39, 24, 3, 78, 4, 6, 79],
 [39, 24, 3, 78, 4, 6, 79, 7],
 [39, 24, 3, 78, 4, 6, 79, 7, 10],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4, 81],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4, 81, 1],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4, 81, 1, 82],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4, 81, 1, 82, 16],
 [39, 24, 3, 78, 4, 6, 79, 7, 10, 24, 2, 25, 11, 12, 80, 4, 81, 1, 82, 16, 40],
 [39,
  24,
  3,
  78,
  4,
  6,
  79,
  7,
  10,
  24,
  2,
  25,
  11,
  12,
  80,
  4,
  81,
  1,
  82,


In [14]:
max_len=max(len(sentence) for sentence in input_sequences)

In [15]:
max_len

73

In [17]:
from tensorflow.keras.preprocessing.sequence import pad_sequences
padded_input_sequences = pad_sequences(input_sequences, maxlen = max_len, padding='pre')

In [18]:
padded_input_sequences

array([[  0,   0,   0, ...,   0,  39,  24],
       [  0,   0,   0, ...,  39,  24,   3],
       [  0,   0,   0, ...,  24,   3,  78],
       ...,
       [  0,   0,   0, ...,   3,   2,   6],
       [  0,   0,   0, ...,   2,   6, 319],
       [  0,   0,   0, ...,   6, 319, 320]], dtype=int32)

In [19]:
X=padded_input_sequences[:,:-1]
y=padded_input_sequences[:,-1]
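Pre-padding pushes every sequence to the right edge of the array, which is why the last column can be sliced off as the label and everything before it kept as input. A small pure-Python sketch of the same two steps, assuming a hypothetical helper `pad_pre` that mimics `pad_sequences(..., padding='pre')` with its default front truncation:

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad each sequence with `value` to length maxlen.

    Sequences longer than maxlen lose tokens from the front.
    """
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


toy_sequences = [[5, 9], [5, 9, 2], [5, 9, 2, 7]]  # illustrative IDs
toy_padded = pad_pre(toy_sequences, maxlen=4)

toy_X = [row[:-1] for row in toy_padded]  # all tokens except the last
toy_y = [row[-1] for row in toy_padded]   # the last token is the label

print(toy_X)  # [[0, 0, 5], [0, 5, 9], [5, 9, 2]]
print(toy_y)  # [9, 2, 7]
```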

In [20]:
X.shape

(565, 72)

In [21]:
y.shape

(565,)

In [23]:
from tensorflow.keras.utils import to_categorical
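# Number of classes = vocabulary size + 1, because index 0 is reserved for padding (321 here).
y = to_categorical(y, num_classes=len(tokenizer.word_index) + 1)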
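`to_categorical` turns each integer label into a one-hot row of length `num_classes`. A pure-Python sketch of the same idea on made-up labels (the helper `one_hot` is illustrative and not part of Keras):

```python
def one_hot(labels, num_classes):
    """Return one row per label, with 1.0 at the label's index and 0.0 elsewhere."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


toy_labels = [2, 0, 3]  # hypothetical class indices
encoded = one_hot(toy_labels, num_classes=4)
print(encoded[0])  # [0.0, 0.0, 1.0, 0.0]
```

If the one-hot matrix becomes too large for a bigger vocabulary, Keras also provides the `sparse_categorical_crossentropy` loss, which accepts the integer labels directly.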

In [24]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

In [30]:
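model=Sequential()
# 321 = vocabulary size plus 1 for the padding index 0; 72 = max_len - 1 input tokens per example.
model.add(Embedding(321,100,input_length=72))
model.add(LSTM(150))
# One softmax output per class, matching the one-hot width of y.
model.add(Dense(321,activation="softmax"))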



In [31]:
model.compile(loss='categorical_crossentropy', optimizer='adam',metrics=['accuracy'])

In [33]:
model.summary()

In [34]:
model.fit(X,y,epochs=100)

Epoch 1/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 7s 191ms/step - accuracy: 0.0321 - loss: 5.7589
Epoch 2/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 3s 156ms/step - accuracy: 0.0487 - loss: 5.4121
Epoch 3/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 130ms/step - accuracy: 0.0551 - loss: 5.3492
Epoch 4/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 125ms/step - accuracy: 0.0507 - loss: 5.3306
Epoch 5/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 2s 129ms/step - accuracy: 0.0447 - loss: 5.3131
Epoch 6/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 4s 220ms/step - accuracy: 0.0431 - loss: 5.2735
Epoch 7/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 4s 137ms/step - accuracy: 0.0692 - loss: 5.2171
Epoch 8/100
18/18 ━━━━━━━━━━━━━━━━━━━━ 3s 143ms/step - accuracy: 0.0599 - loss: 5.2160
Epoch 9/100
18/18 ━

<keras.src.callbacks.history.History at 0x7a778b29d950>
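As a sanity check on the first epochs: a model that spreads its probability uniformly over all 321 classes has a cross-entropy of ln(321), about 5.77, which is close to the loss of 5.7589 reported for epoch 1, so training starts near chance level. The 18 steps per epoch are likewise consistent with 565 samples at the Keras default batch size of 32. A short sketch of both calculations (the helper name is illustrative):

```python
import math


def categorical_cross_entropy(probs, true_index):
    """Cross-entropy of a single prediction against a one-hot target."""
    return -math.log(probs[true_index])


num_classes = 321  # output size of the softmax layer
num_samples = 565  # rows in X
batch_size = 32    # Keras default when fit() is called without batch_size

uniform = [1.0 / num_classes] * num_classes
baseline_loss = categorical_cross_entropy(uniform, true_index=0)
steps_per_epoch = math.ceil(num_samples / batch_size)

print(round(baseline_loss, 3))  # 5.771
print(steps_per_epoch)          # 18
```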

In [36]:
import numpy as np

In [40]:
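import time
text = "Natural Language Processing"

for i in range(10):
  # tokenize the text generated so far
  token_text = tokenizer.texts_to_sequences([text])[0]
  # pad to the input length the model was trained on (max_len - 1 = 72, not 56)
  padded_token_text = pad_sequences([token_text], maxlen=max_len - 1, padding='pre')
  # predict: index of the most probable next word
  pos = np.argmax(model.predict(padded_token_text))

  # map the index back to its word; index 0 is padding and has no word
  word = tokenizer.index_word.get(int(pos))
  if word is not None:
    text = text + " " + word
    print(text)
    time.sleep(2)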

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 73ms/step
Natural Language Processing nlp
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 52ms/step
Natural Language Processing nlp enables
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 57ms/step
Natural Language Processing nlp enables computers
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 47ms/step
Natural Language Processing nlp enables computers to
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 57ms/step
Natural Language Processing nlp enables computers to understand
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 52ms/step
Natural Language Processing nlp enables computers to understand interpret
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 49ms/step
Natural Language Processing nlp enables computers to understand interpret and
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 57ms/step
Natural Language Processing nlp e