An ML Engineer specializing in building Natural Language Processing applications using Deep Learning models
You can reach me via
- Specialist in NLP who has ...
  - extensively applied NLP techniques to text data
  - decent mathematical knowledge of the fundamentals of Statistics/Probability, ML/DL and NLP
- Generalist who builds Python-based Data Science applications/solutions which ...
  - use Sklearn for ML models and the PyTorch ecosystem for DL
  - use Pandas, Altair, Jupyter and Streamlit for data exploration and visualization
Work Experience Summary
- Total Experience: 12+ Years | 2010 - Present
- NLP Experience: 8+ Years | 2014 - Present
- Market Research Experience: 4 Years | 2010 - 2014
Company | Designation | Timeline |
---|---|---|
Toyota Connected India | Senior ML Engineer | Jul'22 - Present |
Ford Motor Company | Senior Analyst, Deputy Manager | May'18 - Jun'22 |
LatentView | Senior Analyst, Assistant Manager | Apr'14 - Apr'18 |
Beroe | Analyst, Senior Analyst, Lead Analyst | Jul'10 - Dec'13 |
Current Role
- Since Jul'22, I have been working at Toyota Connected India
- I co-develop NLP applications in the cloud with a team of Software Engineers to aid Connected Car customers
Previous Roles
- From May'18 till Jun'22, I worked on data science NLP projects at the Ford Analytics Division
  - Worked for teams such as Artificial Intelligence Advancement Center, Customer Experience and Operations Analytics
- For 4 years, I offered Social Media Analytics and Text Analysis solutions to an F100 Tech client of LatentView Analytics
- In the first 4 years of my career, I worked in the Market Research domain
Key Technical Skills
- Python | NLP via Rules, Linguistics and ML Techniques | Deep Learning for NLP | ML Projects Execution
- with state-of-the-art transfer learning models (Feature Extraction and Fine-tuning)
- with customized text preprocessing logic using computational linguistic techniques wherever it helps!
- which are deployed using Python CLI apps, FastAPI REST APIs or Streamlit UIs in Kubernetes
- which is reusable, replicable and runnable in docker containers
- which is modularized and packaged (as `from some_internal_package import what_you_need`)
- committed to GitHub for co-development and issue-resolution
- with docstrings and pytests, subjected to Pull Requests when multiple developers are involved
- WSL for local development, and Linux machines for developing GPU-powered, dockerized applications
- predominantly PyCharm (Professional) for remote development, and open-source VS Code for local development
- Jupyter Notebook to learn coding concepts
- draw.io, Mermaid and Markdown for flowcharts and documentation
- OneNote for taking lots of notes from emails, meetings and websites
- Slack for communicating, weekly updates and jotting down reminder messages to self
- Microsoft PPT for conveying data stories/insights to non-developer teammates or superiors
- A People Manager
  - directly managing the delivery of Social Media Analytics projects of 8+ members in my stint at LatentView Analytics
- A Technical Mentor/Trainer
  - enhancing the NLP/Python expertise of fellow team members or reportees
Academic Background
- B.E. Madras Institute of Technology, 8.6 CGPA | 2006 - 2010
- State topper in State-level Eng. Entrance Exam | 2006
- Twelfth Grade - 95% | 2006 ; Tenth Grade - 92% | 2004
Course Work
- Google Cloud Platform Big Data and Machine Learning Fundamentals | Coursera-GCP | Apr 2021
- 5-course Deep Learning Specialization | Coursera-Deeplearning.ai | Nov'18 - May'19
- Applied ML and Applied Text Mining Courses | Coursera-University of Michigan | Dec'17 - Jan'18
- Stanford Online Certification Course on SQL | Stanford University Online | 2015
Project #1: Aspect-based Sentiment Analysis
- Built a reusable Sequence Classification ML Pipeline which converts customer comments into Aspects and Sentiment
- Highlights of the Pipeline:
  - Spark+Spacy Preprocessing
  - Transfer Learning + Clustering aided Annotation
  - Less annotation for Training (compared to traditional ML) by intelligent use of DL+ML models
  - Dockerized Environment for Model Training and Inference
  - Fine-tuned Transformer models
- Look here for more details
Project #2: Personally Identifiable Information (PII) Detection using NER
- Anonymized PII in text data, which allowed less restricted use of the data
  - by building a Named Entity Recognition (NER) system that can detect PII
- Highlights of the Pipeline:
  - Bootstrapped the training data using Spacy rules (thus easing the annotation process)
  - Spacy's Roberta Base Transformer model handled long sentences without truncating them to a maximum length
  - Inference REST API (via an asynchronous FastAPI deployment using K8s) that can be plugged into multiple applications
- Look here for more details
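
The rule-based bootstrapping idea can be illustrated with a minimal sketch. It is not the project's code: plain regular expressions stand in for the Spacy rules, and the two patterns, the labels and the helper names (`bootstrap_annotations`, `anonymize`) are hypothetical. The rules produce candidate character spans that an annotator would then only need to review, and the same spans can drive anonymization.

```python
import re

# Hypothetical patterns for illustration only; the project itself used Spacy rules.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def bootstrap_annotations(text):
    """Return sorted (start, end, label) character spans found by the rules."""
    spans = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def anonymize(text, spans):
    """Replace each span with a [LABEL] placeholder.

    Spans are applied right to left so earlier offsets stay valid.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


sample = "Call me at 555-123-4567 or mail jane.doe@example.com."
spans = bootstrap_annotations(sample)
print(spans)
print(anonymize(sample, spans))
```

In a real setup, the rule-generated spans would be corrected by annotators and used to train the NER model, which can then catch PII that the rules miss.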
Project #3: NLP Semantic Search Pipeline
- Goal: To create "digital threads" for connecting automotive data sources
  - which have technician comments about issues before the launch of a vehicle,
  - by assigning semantically matching common part categories to every issue in both data sources
- Built a pipeline that ensembles results of 3 pairs of Retriever-Reader models wherein
  - the Retriever narrows down the search space and
  - the Reader zeroes in on the right results
- Look here for more details
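
The retrieve-then-read pattern can be sketched in a few lines. This is only an illustration under stated assumptions: token overlap and `difflib` string similarity stand in for the neural Retriever and Reader models, a single pair is shown rather than an ensemble of three, and the part categories and the issue text are made up.

```python
from difflib import SequenceMatcher


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def retrieve(query, candidates, top_k=2):
    """Cheap first pass: rank candidates by Jaccard token overlap, keep top_k."""
    query_tokens = tokens(query)

    def jaccard(candidate):
        candidate_tokens = tokens(candidate)
        union = query_tokens | candidate_tokens
        return len(query_tokens & candidate_tokens) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:top_k]


def read(query, shortlist):
    """Costlier second pass: pick the shortlisted entry most similar to the query."""
    return max(
        shortlist,
        key=lambda c: SequenceMatcher(None, query.lower(), c.lower()).ratio(),
    )


# Made-up part categories and issue text, for illustration only.
part_categories = [
    "brake pedal assembly",
    "door handle",
    "brake fluid reservoir",
    "seat belt",
]
issue = "brake pedal feels loose"

shortlist = retrieve(issue, part_categories)
best = read(issue, shortlist)
print(shortlist)
print(best)
```

The design point is the split in cost: the first stage is fast enough to scan every category, so the slower, more precise second stage only has to score a handful of candidates.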
Project #4: Unsupervised Clustering Pipeline
- Built reusable Text Clustering pipelines
  - for deriving actionable insights from an unlabeled text corpus
- Highlights of the Pipeline:
  - The clustering pipeline provided options for both Traditional Topic Modeling and DL-Embedding based Hard Clustering
  - Incorporated the models into an easy-to-use Streamlit UI deployed via K8s
  - The codebase was built on top of the main open source libraries
    - PyTorch (Transformers, Sentence Transformers) and Sklearn
- Look here for more details
👨💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyProfessionalStory.txt
🤵
How did I start my career?
- I started out in July 2010, providing customized Market Research (MR) for the first 4 years of my career.
- Simply put, it was no-code work
  - involving cold-calling, speaking to experts and reading a lot of secondary research material
  - to write actionable procurement intelligence reports.
- This first job, right after my undergraduate engineering degree,
  - taught me the importance of tough-to-learn soft skills,
  - especially in communication, be it written, one-on-one, cold-calling, team presentations and more.
When did I transition to NLP?
- Since 2014, I have been in the field of Data Science, and the romance has not died down yet :).
  - Largely because of the interesting NLP opportunities that came my way.
- I primarily worked on Social Media Analytics at LatentView from 2014 to 2018, where
  - I helped my F100 tech major client use social media insights effectively in their marketing decisions
- Since May 2018, as a Data Scientist at Ford,
- my technical learnings in ML/DL and NLP have been on an upward trend!
What are my mottos?
I strive to follow the mottos below for professional betterment:
- To keep upskilling my technical knowledge
  - I firmly believe there are miles to go before I sleep
- To bring my most collaborative, transparent and, importantly, humble self to my interactions with colleagues/friends
  - This is so that trust is enabled, long-term partnerships are forged and great results are achieved
- To stand on the shoulders of the giants of open source
  - In other words, to be an applied practitioner first, and not try to reinvent the wheel unless it has some learning/business benefit
👨💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyPersonalStory.txt
👨👩👦
My Small World
I am here working happily in the Data Science field largely because of the sacrifice & guidance of my wife.
- She guided my transition from Market Research to Data Science. She is a fellow analytics professional too.
- She is on a break to take care of our possibly autistic toddler son.
- I am cognizant of the privilege that I am enjoying (being able to work when she couldn't).
  - It has been particularly exacerbated by the COVID situation and personal losses
Speaking of my son
- He is the apple of my eye
- He seems to have an exemplary memory, well beyond his age! (possibly biased opinion 🙂)
- He grasps abstract things like shapes, numbers, letters, and words quickly
- He could be on the autism spectrum (slower learning of social skills compared to kids of his age)
  - With my wife's leadership, we diagnosed it early and
  - hopefully we are acting on it early, before it becomes too noticeable
My Interests
- For the last 2 years, I have spent (okay, wasted!) a lot of time on many must-watch TV series. Some are iconic, I must say.
- My favorite genres: Sci-Fi, Comics, Legal/Medical thrillers and anything out of this world
- My favorite novels include many works of mythological fiction
- An ardent tea lover!
👨💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyPDFResume.txt
📜
- Here is my résumé in pdf