In [1]:
faqs = """My name is Satyam Singh, and I am a 2nd-year Computer Science and Engineering (CSE) student at NIT Nagaland. Over the course of my academic journey, I have developed a profound interest in technology, programming, and artificial intelligence. I have honed my skills in Machine Learning, Deep Learning, Object Detection, Data Structures, and Database Management Systems (DBMS). These disciplines have laid the foundation for my technical proficiency and have driven my curiosity to explore the cutting-edge realms of technology.

As a dedicated learner, I have achieved several milestones. I hold a certification in Full Stack Development, which demonstrates my expertise in the MERN stack and showcases my ability to manage and execute projects efficiently. Beyond academics, I have taken on leadership roles, such as serving as the Assistant Coding Secretary of the coding club at NIT Nagaland. In this capacity, I have been instrumental in organizing tech fests like Ekarithin, fostering a culture of innovation and collaboration within the student community. Additionally, I have led the college graphics designing team, where I successfully organized and managed various design projects and events. These experiences have not only enhanced my technical and creative skills but also improved my ability to lead and work collaboratively.

My technical journey is marked by a series of projects that reflect my interest in solving real-world problems. One of my notable projects is an MNIST classifier developed using deep learning techniques. This project involved leveraging neural networks to classify handwritten digits, showcasing my understanding of convolutional neural networks and their applications. Another significant project is an SMS spam classifier, built using the Naive Bayes algorithm. This project highlights my ability to work with text data and apply machine learning techniques to distinguish between spam and legitimate messages. I have also developed a house price predictor model, which demonstrates my capability to handle regression problems and make predictions based on historical data.

Beyond individual projects, I have also been actively involved in competitive programming, which has further refined my problem-solving skills. I have tackled challenging problems such as finding the longest common prefix between arrays, calculating the coverage of zeros in a binary matrix, and solving advanced problems on platforms like LeetCode. These experiences have not only enhanced my algorithmic thinking but also prepared me to approach problems with a structured mindset.

My academic and professional journey has also been enriched by practical exposure. I secured a one-month internship at CodeAlpha, where I worked on a credit scoring model using the Gradient Boosting classifier algorithm. This internship allowed me to apply my machine learning knowledge in a real-world scenario, gaining insights into feature engineering, model optimization, and evaluation metrics. Additionally, I organized a NIT-N Weekly Contest Hackathon, which served as a platform to inspire and challenge my peers to engage in coding and problem-solving activities.

Open-source contributions have been a significant part of my journey. As part of GSSoC, I made an impactful contribution by adding an error 404 handler function in Flask to a GitHub repository. This experience not only improved my coding skills but also deepened my understanding of collaborative development and the open-source ecosystem. Furthermore, I am customizing my personal website, KV Nexus, to align with my futuristic vision. The website features a dark background, neon blue and pink gradients, and a network-themed design, reflecting my creative approach and attention to detail.

My learning journey is ongoing, and I am currently delving into Recurrent Neural Networks (RNNs) to broaden my understanding of sequential data processing. I have also worked with time series data, exploring columns such as High, Low, Open, Close, Adj Close, Year, Month, and Day. These experiences are shaping my expertise in handling temporal data and developing predictive models.

I approach challenges with a preference for structured and concise information. This trait is evident in my problem-solving methods and the way I organize my thoughts. My dedication to learning and improving is further demonstrated by my participation in a one-day program on cutting-edge technology, where I gained valuable insights into emerging trends.

My journey is not limited to technical endeavors; it also encompasses a strong emphasis on communication and collaboration. My email, nitish.campusx@gmail.com, serves as a point of contact for networking and opportunities. My overarching goal is to leverage my technical skills, leadership abilities, and passion for innovation to make a meaningful impact in the field of technology. As I continue to grow and learn, I am excited to embrace new challenges and contribute to the ever-evolving landscape of computer science and engineering.


"""

In [2]:
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import pad_sequences

In [3]:
tokenizer = Tokenizer()
tokenizer.fit_on_texts([faqs])

In [4]:
len(tokenizer.word_index)

370

In [5]:
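# Build next-word training samples: for every line of the corpus, emit each
# prefix of length >= 2 (the last token of a prefix is the prediction target).
input_sequences = []
for sentence in faqs.split('\n'):
    tokenized_sentence = tokenizer.texts_to_sequences([sentence])[0]

    for i in range(1, len(tokenized_sentence)):
        n_gram = tokenized_sentence[:i + 1]
        input_sequences.append(n_gram)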
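
The loop above turns one tokenized line into a series of growing prefixes, each of which later becomes one (context, next word) training pair. A minimal, Keras-free sketch of that idea on a toy list of ids follows; `prefix_sequences` is a hypothetical helper name used only for illustration and is not part of the notebook.

```python
def prefix_sequences(token_ids):
    """Return every prefix of length >= 2 from a list of token ids."""
    return [token_ids[:i + 1] for i in range(1, len(token_ids))]


toy_ids = [1, 109, 10, 110]  # toy ids, in the style of the sequences printed by the notebook
for seq in prefix_sequences(toy_ids):
    # Everything but the last id is the context; the last id is the target.
    print(seq[:-1], "->", seq[-1])
```

A line with a single token yields no prefixes, which is why empty and one-word lines contribute nothing to `input_sequences`.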
        

In [6]:
input_sequences

[[1, 109],
 [1, 109, 10],
 [1, 109, 10, 110],
 [1, 109, 10, 110, 111],
 [1, 109, 10, 110, 111, 2],
 [1, 109, 10, 110, 111, 2, 3],
 [1, 109, 10, 110, 111, 2, 3, 24],
 [1, 109, 10, 110, 111, 2, 3, 24, 4],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34, 113],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34, 113, 60],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34, 113, 60, 35],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34, 113, 60, 35, 36],
 [1, 109, 10, 110, 111, 2, 3, 24, 4, 112, 57, 58, 59, 2, 34, 113, 60, 35, 36, 61],
 ...

In [7]:
max_len = max(len(seq) for seq in input_sequences)

In [8]:
# Left-pad with zeros so that every sequence ends with its target token
padded_input_sequences = pad_sequences(input_sequences, maxlen=max_len, padding="pre")

In [9]:
padded_input_sequences[0]

array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   1, 109])

In [10]:
x = padded_input_sequences[:, :-1]  # context: every token except the last
y = padded_input_sequences[:, -1]   # target: the last token of each sequence
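
Pre-padding matters for the split above: because zeros are added on the left, the target word always sits in the last column. The sketch below imitates `padding="pre"` in plain Python on toy data. `pre_pad` is a hypothetical helper written for illustration only; it is not how Keras implements `pad_sequences`.

```python
def pre_pad(sequences, maxlen, value=0):
    """Left-pad each sequence with `value` to exactly `maxlen` items.

    Sequences longer than `maxlen` keep only their last `maxlen` items.
    Assumes maxlen >= 1.
    """
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded


toy_sequences = [[1, 109], [1, 109, 10], [1, 109, 10, 110]]
padded = pre_pad(toy_sequences, maxlen=4)

features = [row[:-1] for row in padded]  # plays the role of x
targets = [row[-1] for row in padded]    # plays the role of y

for f, t in zip(features, targets):
    print(f, "->", t)
```

With post-padding, the last column would mostly hold zeros and the same slicing would no longer recover the next word.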

In [11]:
print(x.shape,y.shape)

(759, 122) (759,)


In [12]:
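from tensorflow.keras.utils import to_categorical

# One class per vocabulary word plus index 0, which is reserved for padding (370 + 1 = 371)
y = to_categorical(y, num_classes=len(tokenizer.word_index) + 1)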
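
The class count is the vocabulary size plus one because word ids start at 1 and id 0 is left for padding. A small plain-Python sketch of the one-hot step, using a toy vocabulary, is shown below; `one_hot` is a hypothetical helper for illustration, not the Keras function.

```python
def one_hot(index, num_classes):
    """Return a list of zeros with a single 1.0 at position `index`."""
    if not 0 <= index < num_classes:
        raise ValueError(f"index {index} is outside 0..{num_classes - 1}")
    row = [0.0] * num_classes
    row[index] = 1.0
    return row


toy_word_index = {"my": 1, "name": 2, "is": 3}  # ids start at 1, as in the notebook
num_classes = len(toy_word_index) + 1           # +1 for the padding id 0

print(one_hot(toy_word_index["is"], num_classes))
```

If the one-hot matrix becomes too large, Keras also provides the `sparse_categorical_crossentropy` loss, which takes the integer targets directly.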

In [13]:
y.shape

(759, 371)

In [14]:
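from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding

vocab_size = len(tokenizer.word_index) + 1  # 370 words + padding index 0 = 371

model = Sequential()
model.add(Input(shape=(x.shape[1],)))   # 122 tokens per sample, matching x
model.add(Embedding(vocab_size, 100))   # one 100-dimensional vector per token id
model.add(LSTM(100, return_sequences=True))
model.add(LSTM(75))
model.add(Dense(vocab_size, activation="softmax"))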



In [15]:
model.compile(optimizer='adam',loss="categorical_crossentropy",metrics=['accuracy'])
model.summary()
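
A quick sanity check for the training run: a model that spreads probability evenly over all 371 classes has a categorical cross-entropy of ln(371), so the first-epoch loss should start near that value (the training log reports 5.8961). The sketch below computes this baseline; `uniform_cross_entropy` is a hypothetical helper name.

```python
import math


def uniform_cross_entropy(num_classes):
    """Cross-entropy (in nats) of a uniform prediction over `num_classes` classes."""
    return -math.log(1.0 / num_classes)


baseline = uniform_cross_entropy(371)
print(f"baseline loss for 371 classes: {baseline:.2f}")
```

A loss that stays close to this baseline for many epochs suggests the model is still predicting little more than word frequencies.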

In [16]:
history = model.fit(x,y,epochs=100)

Epoch 1/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 8s 134ms/step - accuracy: 0.0281 - loss: 5.8961
Epoch 2/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 4s 154ms/step - accuracy: 0.0339 - loss: 5.5463
Epoch 3/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 4s 156ms/step - accuracy: 0.0514 - loss: 5.3792
Epoch 4/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 4s 152ms/step - accuracy: 0.0457 - loss: 5.3701
Epoch 5/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 3s 129ms/step - accuracy: 0.0384 - loss: 5.3439
Epoch 6/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 3s 128ms/step - accuracy: 0.0475 - loss: 5.3831
Epoch 7/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 3s 126ms/step - accuracy: 0.0429 - loss: 5.3595
Epoch 8/100
24/24 ━━━━━━━━━━━━━━━━━━━━ 3s 127ms/step - accuracy: 0.0506 - loss: 5.3604
Epoch 9/100
...

In [17]:
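import time
import numpy as np

text = input()
index_word = tokenizer.index_word  # reverse lookup: token id -> word

for i in range(10):
    # tokenize the text generated so far
    token_text = tokenizer.texts_to_sequences([text])[0]
    # pad to the same length the model was trained on
    padded_input_text = pad_sequences([token_text], maxlen=x.shape[1], padding="pre")
    # pick the most probable next token id
    pos = int(np.argmax(model.predict(padded_input_text)))

    word = index_word.get(pos)
    if word is None:  # id 0 is the padding token and has no word
        break
    text = text + " " + word
    print(text)
    time.sleep(2)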



1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 474ms/step
 My name is satyam
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 36ms/step
 My name is satyam singh
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step
 My name is satyam singh and
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 29ms/step
 My name is satyam singh and i
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
 My name is satyam singh and i am
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 28ms/step
 My name is satyam singh and i am a
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 51ms/step
 My name is satyam singh and i am a 2nd
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 58ms/step
 My name is satyam singh and i am a 2nd year
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 41ms/step
 My name is satyam singh and i am a 2nd year computer
...
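
The generation cell is a greedy decoding loop: encode the text so far, ask for the most likely next id, append its word, repeat. The sketch below isolates that control flow so it can be tried without a trained model. `generate` and `predict_next_id` are hypothetical names, the stub predictor stands in for `model.predict` followed by `argmax`, and the whitespace tokenization is a simplification of what the Keras `Tokenizer` does.

```python
def generate(seed_text, n_words, predict_next_id, word_index):
    """Greedily extend `seed_text` by up to `n_words` words.

    `predict_next_id` maps a list of token ids to the id of the next token.
    Generation stops early if the predicted id has no word (for example id 0).
    """
    index_word = {idx: word for word, idx in word_index.items()}
    text = seed_text
    for _ in range(n_words):
        # Simplified encoding: lowercase, split on whitespace, skip unknown words.
        ids = [word_index[w] for w in text.lower().split() if w in word_index]
        word = index_word.get(predict_next_id(ids))
        if word is None:
            break
        text = f"{text} {word}"
    return text


# Stub predictor: follows a fixed chain of ids instead of calling a model.
toy_word_index = {"my": 1, "name": 2, "is": 3, "satyam": 4}
chain = {1: 2, 2: 3, 3: 4}
print(generate("my", 3, lambda ids: chain.get(ids[-1], 0), toy_word_index))
```

Passing the predictor in as an argument keeps the loop independent of the model, which makes it easy to swap greedy selection for another strategy later.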

In [18]:
model.save("model_predictor.h5")
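
Saving the model alone is not enough to reuse it: the word-to-id mapping and the padded sequence length are needed to prepare new input in the same way. The sketch below stores both in a JSON file using only the standard library. The function names, the file name, and the toy values are assumptions made for illustration; in the notebook the arguments would be `tokenizer.word_index` and `max_len`.

```python
import json
import os
import tempfile


def save_preprocessing(path, word_index, max_len):
    """Write the vocabulary mapping and sequence length to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"word_index": word_index, "max_len": max_len}, f)


def load_preprocessing(path):
    """Read back the vocabulary mapping and sequence length."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return data["word_index"], data["max_len"]


# Round trip with toy values (hypothetical file name in the temp directory).
demo_path = os.path.join(tempfile.gettempdir(), "preprocessing_demo.json")
save_preprocessing(demo_path, {"my": 1, "name": 109}, 123)
loaded_index, loaded_max_len = load_preprocessing(demo_path)
print(loaded_index, loaded_max_len)
```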

