In [59]:
faqs = """Professional Background & Education
What is Kiran's educational background?
Kiran holds an MS in Artificial Intelligence & Machine Learning from Drexel University (GPA: 3.93/4.0, graduated Sep 2025), a BS in Electronics from Mumbai University (GPA: 8.1/10.0, Jun 2019), and a Diploma in Computer Engineering from S4's Polytechniq (GPA: 9.5/10.0, Jun 2016).
What is Kiran's current professional status?
Kiran recently completed his MS in AI & ML from Drexel University and has over 4 years of professional experience in Machine Learning, Data Science, and AI roles. He most recently worked as a Machine Learning Research Intern at Drexel University (Dec 2024 - Mar 2025).
What certifications does Kiran hold?
Kiran has completed the MIT-WPU PG program in AI & ML (2023) and IBM Professional Data Science certification (2022).
Technical Skills & Expertise
What programming languages does Kiran know?
Kiran is proficient in Python, Java, Javascript, C++, and GoLang.
What AI/ML technologies is Kiran experienced with?
Kiran has expertise in Machine Learning Algorithms, NLP, Computer Vision, Large Language Models (LLMs), Agentic AI, Data Analysis, ETL, Data Visualization, and A/B Testing.
What cloud platforms can Kiran work with?
Kiran has hands-on experience with AWS, GCP, and Azure. He's also skilled in MLOps tools including Docker, Kubernetes, CI/CD pipelines, git, and Jenkins.
Does Kiran have experience with databases?
Yes, Kiran works with both SQL (MySQL) and NoSQL databases (MongoDB), as well as vector databases like ChromaDB and Pinecone DB for AI applications.
What is Kiran's strongest technical area?
Kiran excels in building production-grade ML systems, particularly in NLP, Computer Vision, and deploying scalable AI solutions using cloud infrastructure and MLOps practices.
Work Experien
What kind of companies has Kiran worked for?
Kiran has diverse experience across startup environments (Hefshine Softwares), mid-sized product teams (Great Software Laboratory), and academic research settings (Drexel University).
What was Kiran's role at Great Software Laboratory?
As a Data Scientist (May 2021 - Dec 2023), Kiran architected production-grade anomaly detection systems, automated multi-source ETL workflows processing millions of records, and containerized ML risk engines for deployment across 5 business units.
What achievements did Kiran accomplish at Great Software Laboratory?
Kiran elevated threat detection accuracy by 15%, lowered false positives by 10% across 10M+ daily security events, cut latency by 12%, and boosted data-driven security decision implementation by 30%.
What did Kiran do as a Machine Learning Engineer at Hefshine?
At Hefshine Softwares (Nov 2019 - Apr 2021), Kiran accelerated ML integration boosting data-processing speed by 30%, delivered predictive attrition models that cut turnover by 10%, and led the full ML lifecycle in a startup setting.
What research work has Kiran done?
At Drexel University, Kiran developed zero-shot vision-language models like GPT-4V and LLaVA, boosting classification accuracy by 10% and robustness by 20%. He led 20+ prompt engineering experiments and synthesized insights from 50+ research publications into production prototypes.
Projects & Technical Capabilities
What notable projects has Kiran built?
Kiran has built: (1) A Transformer NLP Summarizer for automated research paper analysis with ROUGE-L 87 and BERTScore 89, (2) A Self-Driving Car Risk System achieving 96% lane-keeping precision, and (3) An AI Trip Planner with personalized recommendations for 100+ destinations.
Can Kiran work with Computer Vision?
Yes, Kiran has extensive Computer Vision experience. He built a self-driving car risk system using CNNs, TensorFlow, and OpenCV, achieving 12% improvement in behavior prediction accuracy and 96% lane-keeping precision.
Does Kiran have experience with Generative AI?
Absolutely. Kiran has worked with GPT-4V, LLaVA, Transformers, and Gemini API. His projects include NLP summarization systems and AI-powered travel planners using GenAI technologies.
Has Kiran deployed ML models to production?
Yes, Kiran has extensive production deployment experience. He's containerized ML systems using Docker and Kubernetes, implemented CI/CD pipelines, and deployed models on AWS SageMaker and GCP AI Platform.
Can Kiran work with real-time data processing?
Yes, Kiran has experience with real-time data processing using Kafka, PySpark, and ETL workflows handling millions of API calls and log records with 99% SLA compliance.
Awards & Recognition
What awards has Kiran received?
Kiran placed 5th out of 100+ teams at the 2024 Philly Codefest for building a multilingual AI app for legal assistance. He also received a Top 3 out of 250 "Pat on the Back" award in 2022 for excellence in scalable ML system design.
Collaboration & Work Style
Can Kiran work in cross-functional teams?
Yes, Kiran has demonstrated strong cross-functional collaboration, mentoring junior engineers, and working with diverse teams across startup, enterprise, and research environments.
Does Kiran have leadership experience?
Yes, Kiran has led ML initiatives including 20+ prompt engineering experiments, managed full ML lifecycles in startup settings, and mentored junior engineers while driving scalable production deployments.
What is Kiran's approach to ML development?
Kiran follows a comprehensive approach covering the full ML lifecycle: from research and experimentation through production deployment, with strong emphasis on scalability, monitoring, and business impact measurement.
Availability & Contact
How can I contact Kiran?
You can reach Kiran at kiran.shidruk.us@gmail.com, connect on LinkedIn at linkedin.com/in/kiranshidruk/, or check his GitHub at github.com/kiranshidruk. Phone: +1 215 240-9822.
Is Kiran available for full-time positions?
Kiran recently graduated with his MS in September 2025 and is available for full-time opportunities in Machine Learning, Data Science, and AI roles.
What type of roles is Kiran looking for?
Kiran is seeking roles in Machine Learning Engineering, Data Science, AI Research, or MLOps where he can leverage his expertise in building production-grade AI systems and working with cutting-edge technologies like LLMs and Computer Vision.
What is Kiran's location?
Kiran is based in Philadelphia, Pennsylvania, USA.
Domain Knowledge
What industries has Kiran worked in?
Kiran has experience in cybersecurity (anomaly detection and threat analysis), HR tech (attrition prediction), autonomous vehicles (self-driving car systems), and travel/recommendations.
Does Kiran have experience with security and risk systems?
Yes, Kiran has significant experience building anomaly detection systems for enterprise security, processing 10M+ daily security events, and developing ML-based risk scoring engines.
Can Kiran work with NLP applications?
Yes, Kiran has completed coursework in NLP with Deep Learning and built production NLP systems including research paper summarizers and behavioral signal detection models using transformers and LLMs.
Academic Coursework
What graduate-level courses has Kiran completed?
Kiran has completed Machine Learning, Software Design, Data Structures and Algorithms, NLP with Deep Learning, Applied AI, Data Analysis and Interpretation, Computer Vision (as Research Assistant), and Recommender Systems for Data Science.
Does Kiran have a strong foundation in computer science fundamentals?
Yes, Kiran has completed comprehensive coursework in Data Structures and Algorithms, Operating Systems, Object-Oriented Programming, and System Design, providing a solid foundation for building scalable systems.
"""

In [60]:
import tensorflow
from tensorflow.keras.preprocessing.text import Tokenizer

In [61]:
tokenizer = Tokenizer()

In [62]:
tokenizer.fit_on_texts([faqs])

In [63]:
tokenizer.index_word

{1: 'kiran',
 2: 'and',
 3: 'in',
 4: 'has',
 5: 'with',
 6: 'what',
 7: 'ai',
 8: 'a',
 9: 'ml',
 10: 'data',
 11: 'for',
 12: 'is',
 13: 'experience',
 14: 'systems',
 15: 'learning',
 16: 'research',
 17: 'at',
 18: 'yes',
 19: 'production',
 20: 'machine',
 21: 'computer',
 22: 'nlp',
 23: 'can',
 24: 'work',
 25: 'by',
 26: 'of',
 27: 'does',
 28: 'vision',
 29: "kiran's",
 30: 'from',
 31: 'university',
 32: 'completed',
 33: 'science',
 34: 'as',
 35: 'models',
 36: 'using',
 37: 'drexel',
 38: '10',
 39: 'his',
 40: 'he',
 41: 'the',
 42: 'on',
 43: 'have',
 44: 'building',
 45: 'detection',
 46: 'processing',
 47: 'risk',
 48: 'security',
 49: 'full',
 50: 'professional',
 51: 'engineering',
 52: 'roles',
 53: 'worked',
 54: 'analysis',
 55: 'scalable',
 56: 'across',
 57: 'startup',
 58: 'teams',
 59: 'software',
 60: 'built',
 61: 'driving',
 62: 'system',
 63: 'time',
 64: 'ms',
 65: 'gpa',
 66: '3',
 67: '0',
 68: '2025',
 69: '1',
 70: 'recently',
 71: 'technical',
 72: '

In [64]:
len(tokenizer.index_word)

437
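
How the word index above comes about can be approximated in plain Python. This is a simplified sketch, not the Keras implementation: `build_word_index` and `FILTERS` are names made up for illustration, and the filter string is assumed to mirror the Tokenizer defaults (lowercasing, punctuation stripped, apostrophes kept, which is why "kiran's" is its own token).

```python
from collections import Counter

# Characters replaced by spaces in this sketch (assumed to match the
# Keras Tokenizer default filters; the apostrophe is not in the list).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'

def build_word_index(texts):
    """Return {word: rank}, where rank 1 is the most frequent word."""
    table = str.maketrans({ch: " " for ch in FILTERS})
    counts = Counter()
    for text in texts:
        counts.update(text.lower().translate(table).split())
    # most_common() sorts by count; ties keep first-seen order.
    ranked = counts.most_common()
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}

sample = "Kiran knows Python. Kiran's projects use Python and NLP."
word_index = build_word_index([sample])
index_word = {rank: word for word, rank in word_index.items()}
print(index_word)
```

Index 0 is never assigned to a word, which leaves it free to act as the padding value later on.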

In [73]:
# Convert every sentence into a sequence of word indices.
# Strategy: given the words so far, predict the next word.
# If 50 is the input, the output should be 97; if 50, 97 is the input, the output should be 188.

# 50 --> 97
# 50, 97 --> 188
# 6 --> 12
# 6, 12 --> 29
# 6, 12, 29 --> 189

# and so on

# We treat this as a supervised ML task, specifically classification, because the targets are discrete, not continuous.
# It is a multiclass classification problem with one-hot encoded targets.
# Output 2 is represented as [0,0,1,0,0,...] and 3 as [0,0,0,1,0,...], so there are 438 classes in total:
# 437 words plus index 0, which is reserved for padding. Softmax gives a probability per class, and the class with the highest probability is the prediction.

for sentence in faqs.split('\n'):
   print(tokenizer.texts_to_sequences([sentence])[0])


[50, 97, 188]
[6, 12, 29, 189, 97]
[1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66, 193, 99, 67, 100, 194, 68, 8, 195, 3, 196, 30, 197, 31, 65, 198, 69, 38, 67, 101, 102, 2, 8, 199, 3, 21, 51, 30, 200, 201, 65, 202, 103, 38, 67, 101, 203]
[6, 12, 29, 204, 50, 205]
[1, 70, 32, 39, 64, 3, 7, 9, 30, 37, 31, 2, 4, 206, 99, 207, 26, 50, 13, 3, 20, 15, 10, 33, 2, 7, 52, 40, 208, 70, 53, 34, 8, 20, 15, 16, 209, 17, 37, 31, 104, 105, 210, 68]
[6, 211, 27, 1, 212]
[1, 4, 32, 41, 213, 214, 215, 216, 3, 7, 9, 106, 2, 217, 50, 10, 33, 218, 107]
[71, 219, 72]
[6, 108, 220, 27, 1, 221]
[1, 12, 222, 3, 223, 224, 225, 226, 2, 227]
[6, 7, 9, 73, 12, 1, 228, 5]
[1, 4, 72, 3, 20, 15, 74, 22, 21, 28, 229, 109, 35, 75, 230, 7, 10, 54, 76, 10, 231, 2, 8, 232, 233]
[6, 110, 234, 23, 1, 24, 5]
[1, 4, 235, 42, 13, 5, 111, 112, 2, 236, 113, 114, 237, 3, 77, 238, 78, 115, 116, 117, 118, 119, 239, 2, 240]
[27, 1, 43, 13, 5, 79]
[18, 1, 241, 5, 242, 243, 244, 2, 245, 79, 246, 34, 247, 34, 248, 79, 80, 249
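
The prefix-to-next-word strategy from the comments in this cell can be written out explicitly. A minimal sketch in plain Python; `to_training_pairs` is a hypothetical helper, and the token ids are taken from the second line of the output above.

```python
def to_training_pairs(token_ids):
    """Split one tokenized sentence into (context, next_token) pairs."""
    pairs = []
    for i in range(1, len(token_ids)):
        # Everything before position i is the context; position i is the target.
        pairs.append((token_ids[:i], token_ids[i]))
    return pairs

pairs = to_training_pairs([6, 12, 29, 189, 97])
for context, target in pairs:
    print(context, "-->", target)
```

A sentence with n tokens yields n - 1 pairs, so one-word lines and blank lines contribute nothing to the dataset.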

In [66]:
input_sequences = []
for sentence in faqs.split('\n'):
   tokenizer_sequence = tokenizer.texts_to_sequences([sentence])[0]

   for i in range(1, len(tokenizer_sequence)):
      n_gram_sequence = tokenizer_sequence[:i+1]
      input_sequences.append(n_gram_sequence)

In [67]:
# We created n-gram prefixes of every sentence; this prepares the dataset as input/output pairs,
# where the last token of each prefix is the target and the tokens before it are the input.

input_sequences

[[50, 97],
 [50, 97, 188],
 [6, 12],
 [6, 12, 29],
 [6, 12, 29, 189],
 [6, 12, 29, 189, 97],
 [1, 190],
 [1, 190, 98],
 [1, 190, 98, 64],
 [1, 190, 98, 64, 3],
 [1, 190, 98, 64, 3, 191],
 [1, 190, 98, 64, 3, 191, 192],
 [1, 190, 98, 64, 3, 191, 192, 20],
 [1, 190, 98, 64, 3, 191, 192, 20, 15],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66, 193],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66, 193, 99],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66, 193, 99, 67],
 [1, 190, 98, 64, 3, 191, 192, 20, 15, 30, 37, 31, 65, 66, 193, 99, 67, 100],
 [1,
  190,
  98,
  64,
  3,
  191,
  192,
  20,
  15,
  30,
  37,
  31,
  65,
  66,
  193,
  99,
  67,
  100,
  194],
 [1,
  190,
  98,
  64,
  3,
  191,
  192

In [68]:
# The rows have different lengths, but supervised training needs equal-length rows, so we pad them with zeros.
# To know how much to pad, we first need the length of the longest sequence.


max_sequence_len = max([len(x) for x in input_sequences])
max_sequence_len

50

In [69]:
from tensorflow.keras.preprocessing.sequence import pad_sequences

padded_input_sequences = pad_sequences(input_sequences, max_sequence_len, padding='pre')

In [70]:
padded_input_sequences

array([[  0,   0,   0, ...,   0,  50,  97],
       [  0,   0,   0, ...,  50,  97, 188],
       [  0,   0,   0, ...,   0,   6,  12],
       ...,
       [  0,   0,   0, ..., 187,  11,  44],
       [  0,   0,   0, ...,  11,  44,  55],
       [  0,   0,   0, ...,  44,  55,  14]], dtype=int32)
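
What pre-padding does, and why it makes the later `X`/`y` split a simple column slice, can be shown without Keras. This is a sketch only: `pad_pre` is a hypothetical stand-in for `pad_sequences(..., padding='pre')`, and it assumes `maxlen >= 1`.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad each sequence with `value` up to `maxlen` (assumes maxlen >= 1)."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # keep only the last maxlen tokens if too long
        padded.append([value] * (maxlen - len(seq)) + seq)
    return padded

rows = pad_pre([[50, 97], [50, 97, 188], [6, 12]], maxlen=5)

# With zeros on the left, the target is always the last column.
X_rows = [row[:-1] for row in rows]
y_vals = [row[-1] for row in rows]
print(rows)
print(X_rows, y_vals)
```

Padding on the right instead would put zeros in the last column for short rows, and the target could no longer be read from a fixed position.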

In [71]:
X = padded_input_sequences[:,:-1]
y = padded_input_sequences[:,-1]

In [74]:
X.shape, y.shape

((1057, 49), (1057,))

In [75]:
from tensorflow.keras.utils import to_categorical
y = to_categorical(y, num_classes=len(tokenizer.word_index)+1)
# +1 because word indices start at 1 and index 0 is reserved for padding,
# so the one-hot vectors need positions 0..437 (438 classes)

In [76]:
y.shape

(1057, 438)

In [77]:
y

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
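
The printed array looks all-zero only because the output is abbreviated; each row holds exactly one 1. A plain-Python sketch of the encoding (the `one_hot` helper is hypothetical, not the Keras function) also shows why the class count has to be 438 rather than 437.

```python
def one_hot(index, num_classes):
    """Return a list of zeros with a single 1.0 at position `index`."""
    vec = [0.0] * num_classes
    vec[index] = 1.0
    return vec

vocab_size = 437              # len(tokenizer.word_index) in this notebook
num_classes = vocab_size + 1  # positions 0..437; position 0 is the padding index

encoded = one_hot(97, num_classes)
print(len(encoded), encoded.index(1.0))

# The highest word index (437) only fits because of the extra class.
last_word = one_hot(vocab_size, num_classes)
```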

In [78]:
#  LSTM architecture will be embedding layer, LSTM layer and Dense Layer

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

In [79]:
model = Sequential()
model.add(Embedding(len(tokenizer.word_index)+1, 100, input_length=max_sequence_len-1))
model.add(LSTM(150))
model.add(Dense(len(tokenizer.word_index)+1, activation='softmax'))



In [80]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [84]:
model.build(input_shape=(None, max_sequence_len-1))

In [85]:
model.summary()

*** Embedding layer *** --> replaces each integer word index with a dense, trainable vector, instead of a sparse one-hot vector that is mostly zeros.
It maps every word to a 100-dimensional embedding.

So a single input goes from (1, 49) --> (1, 49, 100)

and the whole dataset from (1057, 49) --> (1057, 49, 100)

That is why the embedding layer has 438 (vocabulary size, including the padding index) * 100 = 43,800 parameters.

After this layer, every word in every row is a 100-dimensional vector, and this data is fed to the LSTM.


*** LSTM layer *** --> processes the input one timestep at a time, so each row takes 49 timesteps. At every timestep it consumes one word embedding and updates its state. After the 49th word, the final hidden state h_t, with shape (1, 150) because the layer has 150 units, is fed to the Dense layer.

LSTM(150) means the layer has 150 units. Each LSTM cell has four sets of weights: the forget gate, the input gate, the output gate, and the candidate cell state.

So the formula for the parameter count is 4 * (input_dim + hidden_dim + 1) * hidden_dim ---> 4 * (100 + 150 + 1) * 150 ----> 150,600


*** Dense layer *** ----> has 150 * 438 weights + 438 biases = 66,138 parameters. It outputs a probability for each of the 438 classes, and the word with the highest probability is the prediction.
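
The three parameter counts can be checked with a few lines of arithmetic. A small sketch; the helper names are made up for illustration, and the sizes (438, 100, 150) are the ones used in this notebook.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry (padding index included).
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four weight sets (forget, input, output, candidate), each with
    # input weights, recurrent weights, and a bias.
    return 4 * (input_dim + units + 1) * units

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

emb = embedding_params(438, 100)
lstm = lstm_params(100, 150)
dense = dense_params(150, 438)
total = emb + lstm + dense
print(emb, lstm, dense, total)
```

The total should agree with the trainable parameter count that `model.summary()` reports.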


In [86]:
model.fit(X,y,epochs=100)

Epoch 1/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 6s 87ms/step - accuracy: 0.0183 - loss: 6.0235
Epoch 2/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - accuracy: 0.0711 - loss: 5.5950
Epoch 3/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 4s 128ms/step - accuracy: 0.0848 - loss: 5.4220
Epoch 4/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 4s 87ms/step - accuracy: 0.0825 - loss: 5.2934
Epoch 5/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 5s 83ms/step - accuracy: 0.0821 - loss: 5.2310
Epoch 6/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 5s 86ms/step - accuracy: 0.0884 - loss: 5.0500
Epoch 7/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 5s 87ms/step - accuracy: 0.1019 - loss: 5.0595
Epoch 8/100
34/34 ━━━━━━━━━━━━━━━━━━━━ 4s 132ms/step - accuracy: 0.1049 - loss: 4.8652
Epoch 9/100
34/34 ━━━━━━━

<keras.src.callbacks.history.History at 0x7cbf9d5f5370>
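
One way to read the first-epoch loss: a model that spreads its probability evenly over all 438 classes has a categorical cross-entropy of ln(438), roughly 6.08. The short check below is a sketch using only the standard library; the value 6.0235 is copied from the Epoch 1 line of the log above.

```python
import math

num_classes = 438

# Cross-entropy when every class gets probability 1 / num_classes.
uniform_loss = -math.log(1.0 / num_classes)

epoch_1_loss = 6.0235  # taken from the training log in this notebook
print(round(uniform_loss, 4), epoch_1_loss)
```

The epoch 1 loss sits just below that baseline, which suggests the network starts out close to guessing uniformly and only then begins to fit the data.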

In [125]:
# for testing
import numpy as np
seed_text = "Kiran elevated threat detection accuracy by"

# Inference needs the same preprocessing as training:
# tokenize ---> pad ---> predict


for i in range(5):
  text_tokenized = tokenizer.texts_to_sequences([seed_text])[0]
  text_tokenized = pad_sequences([text_tokenized], maxlen=max_sequence_len-1, padding='pre')

  pos = int(np.argmax(model.predict(text_tokenized)))  # index of the most probable next word
  word = tokenizer.index_word.get(pos)  # direct lookup; None for index 0 (padding)
  if word is not None:
    seed_text += " " + word
    print(seed_text)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 49ms/step
Kiran elevated threat detection accuracy by 15
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 48ms/step
Kiran elevated threat detection accuracy by 15 lowered
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 47ms/step
Kiran elevated threat detection accuracy by 15 lowered false
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 60ms/step
Kiran elevated threat detection accuracy by 15 lowered false positives
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 63ms/step
Kiran elevated threat detection accuracy by 15 lowered false positives by


In [126]:
model.predict(text_tokenized)

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step


array([[3.88250632e-09, 1.31606994e-05, 8.61441193e-04, 9.69803659e-04,
        5.09086112e-06, 9.75607909e-05, 4.07259693e-09, 2.76435657e-07,
        9.21170908e-07, 1.18313306e-04, 2.48517472e-06, 4.29905558e-05,
        3.19855940e-06, 3.44468353e-05, 3.11233889e-05, 7.13458530e-06,
        4.87167017e-06, 3.38624632e-05, 2.36897746e-09, 8.51327877e-06,
        6.74023992e-09, 2.56475694e-08, 2.03498605e-08, 1.71005752e-07,
        3.07621768e-08, 9.80278611e-01, 5.85879374e-04, 5.44773957e-06,
        6.17176693e-06, 1.42315329e-07, 8.48905438e-06, 1.12145593e-04,
        2.06519900e-07, 2.78909079e-06, 1.82831078e-04, 2.03760617e-04,
        5.02769899e-06, 2.49614391e-06, 1.92915578e-03, 2.32542414e-04,
        7.48151506e-05, 6.73005707e-06, 1.58361763e-06, 7.41719575e-08,
        9.58655846e-07, 2.19488106e-06, 1.41573444e-04, 1.54540940e-05,
        1.78623566e-04, 1.55032706e-06, 1.11313248e-05, 1.79610197e-06,
        2.84750149e-06, 3.24608777e-07, 2.83786653e-06, 9.523014
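
In the array above, the largest visible entry is 9.80278611e-01 at position 25, which `index_word` maps to 'by', the last word generated. Reading off more than just the argmax can be sketched in plain Python. The `top_k` helper, the `"<pad>"` label, and the toy probabilities are illustrative assumptions, not values from the model.

```python
def top_k(probs, index_word, k=3):
    """Return the k most probable (word, probability) pairs."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    # Index 0 has no word in index_word, so it is labelled "<pad>" here.
    return [(index_word.get(i, "<pad>"), probs[i]) for i in ranked[:k]]

# Toy stand-ins for tokenizer.index_word and one row of model.predict(...)
toy_index_word = {1: "kiran", 2: "and", 3: "in", 4: "has"}
toy_probs = [0.01, 0.05, 0.04, 0.80, 0.10]

best = top_k(toy_probs, toy_index_word, k=2)
print(best)
```

Looking at the runners-up gives a rough sense of how confident the model is at each generation step.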

In [127]:
# The model performs very well on the training data, but it has memorized a small corpus
# and may not generalize to test/unseen text.
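
One way to measure that would be to hold out some FAQ lines before building the n-gram prefixes, so that no prefix of a validation sentence appears in training. A rough sketch in plain Python; `split_lines`, the 20% fraction, and the seed are assumptions for illustration, and with a corpus this small the validation score would be noisy.

```python
import random

def split_lines(lines, val_fraction=0.2, seed=42):
    """Shuffle non-empty lines and split them into (train, validation)."""
    kept = [line for line in lines if line.strip()]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(kept)
    n_val = max(1, int(len(kept) * val_fraction))
    return kept[n_val:], kept[:n_val]

# Toy stand-in for faqs.split('\n')
sample_lines = [f"line {i}" for i in range(10)] + ["", "  "]
train_lines, val_lines = split_lines(sample_lines)
print(len(train_lines), len(val_lines))
```

The tokenizer would then be fitted on the training lines only, and the prefixes for each split built separately.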