senthilkumarm1901/README.md

Hi there 👋 I'm Senthil Kumar 👨‍💻

An ML Engineer specializing in building Natural Language Processing applications using Deep Learning models

You can reach me via


LinkedIn | GitHub | myResume | Blog

🤵‍♂️ I am a ...

  • Specialist in NLP who has ...
    • extensively applied NLP techniques for text data
    • a decent mathematical grounding in the fundamentals of Statistics/Probability, ML/DL and NLP
  • Generalist who builds Python-based Data Science applications/solutions which ...
    • use Sklearn for ML models and the PyTorch ecosystem for DL
    • use Pandas, Altair, Jupyter and Streamlit for Data Exploration and Visualization

📈 My Career Graph ...


Toyota Connected India | Ford (GDIA Skill division) | LatentView Analytics | Beroe - A Procurement MR Firm

Work Experience Summary


  • Total Experience: 12+ Years | 2010 - Present
  • NLP Experience: 8+ Years | 2014 - Present
  • Market Research Experience: 4 Years | 2010 - 2014

| Company | Designation | Timeline |
|---|---|---|
| Toyota Connected India | Senior ML Engineer | Jul'22 - Present |
| Ford Motor Company | Senior Analyst, Deputy Manager | May'18 - Jun'22 |
| LatentView | Senior Analyst, Assistant Manager | Apr'14 - Apr'18 |
| Beroe | Analyst, Senior Analyst, Lead Analyst | Jul'10 - Dec'13 |

Current Role
  • Since Jul'22, I have been working at Toyota Connected India
  • I co-develop NLP applications in the cloud with a team of Software Engineers to aid Connected Car customers

Previous Roles
  • From May'18 till Jun'22, I worked on data science NLP projects at the Ford Analytics Division
  • Worked for teams such as Artificial Intelligence Advancement Center, Customer Experience and Operations Analytics
  • For 4 years, I offered Social Media Analytics and Text Analysis solutions to an F100 Tech client of LatentView Analytics
  • In the first 4 years of my career, I worked in the Market Research domain.
Key Technical Skills
  • Python | NLP via Rules, Linguistics and ML Techniques | Deep Learning for NLP | ML Projects Execution

While Coding ...


🛠️ I typically build NLP applications ...

  • with state-of-the-art transfer learning models (Feature Extraction and Fine-tuning)
  • with customized text preprocessing logic using computational linguistic techniques wherever it helps!
  • which are deployed using Python CLI apps, FastAPI REST APIs or Streamlit UIs in Kubernetes

🛠️ I typically write code ...

  • which is reusable, replicable and runnable in docker containers
  • which is modularized and packaged (as from some_internal_package import what_you_need)
  • committed to GitHub for co-development and issue-resolution
  • with docstrings and pytests, subjected to Pull Requests when multiple developers are involved

🛠️ I typically build using ...

Extensively Used Working Knowledge
Tools Python Git Shell Markdown
Jupyter PyCharm Docker
Kubernetes Poetry
Venv Conda
Python Libraries SpaCy HuggingFace Transformers PyTorch
Pandas regex sklearn
PySpark Altair/Seaborn GenSim
FastAPI Streamlit

💻 I typically use tools such as ...

  • WSL for local development, and linux machines for GPU-powered, dockerized applications development
  • predominantly PyCharm (Professional) for remote development, and open-source VS Code for local development
  • Jupyter Notebook to learn coding concepts
  • draw.io, mermaid and markdown for flowcharts and documentation purposes

While not coding ...


📅 🎙️ I typically use tools such as ...

  • OneNote for taking lots of notes from emails, meetings and websites
  • Slack for communicating, weekly updates and jotting down reminder messages to self
  • Microsoft PPT for conveying data stories/insights to non-developer teammates or superiors

🧔 Apart from being a Data Science Developer, I have donned the hat of ...

  • A People Manager
    • directly managing a team of 8+ members delivering Social Media Analytics projects during my stint at LatentView Analytics
  • A Technical Mentor/Trainer
    • enhancing the NLP/Python expertise of fellow team members or reportees

👨‍🎓 My Educational Background ...

Academic Background
  • B.E. Madras Institute of Technology, 8.6 CGPA | 2006 - 2010
  • State topper in State-level Eng. Entrance Exam | 2006
  • Twelfth Grade - 95% | 2006 ; Tenth Grade - 92% | 2004
Course Work
  • Google Cloud Platform Big Data and Machine Learning Fundamentals | Coursera-GCP | Apr 2021
  • 5-course Deep Learning Specialization | Coursera-Deeplearning.ai | Nov'18 - May'19
  • Applied ML and Applied Text Mining Courses | Coursera-University of Michigan | Dec'17 - Jan'18
  • Stanford Online Certification Course on SQL | Stanford University Online | 2015

🧑‍💼 Short Summaries of my Key Projects ...

Project #1: Aspect-based Sentiment Analysis
  • Built a reusable Sequence Classification ML Pipeline which converts customer comments into Aspects and Sentiment
  • Highlights of the Pipeline:
    • Spark+Spacy Preprocessing
    • Transfer Learning + Clustering aided Annotation
      • Less annotation for Training (compared to traditional ML) by intelligent use of DL+ML models
    • Dockerized Environment for Model Training and Inference
    • Fine-tuned Transformer models
  • Look here for more details
Project #2: Personally Identifiable Information (PII) Detection using NER
  • Anonymized PII in text data, which allowed less restricted use of the data
    • by building a Named Entity Recognition (NER) system that can detect PII
  • Highlights of the Pipeline:
    • Bootstrapped the training data using spaCy rules (thus easing the annotation process)
    • spaCy's RoBERTa-base Transformer model allowed sentences to be processed without truncation at a maximum length
    • Inference REST API (via an asynchronous FastAPI deployment using K8s) that can be plugged into multiple applications
  • Look here for more details
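
A minimal sketch of the rule-based bootstrapping idea, not the project's actual code. It assumes plain `re` patterns as a stand-in for spaCy rules; the `PII_PATTERNS` rule set, the labels and the sample sentence are all hypothetical. The rules produce character-offset spans that could seed annotation, and the same spans drive a simple anonymization step.

```python
import re

# Hypothetical rule set (label -> pattern); real PII rules would be far broader.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def bootstrap_annotations(text):
    """Return sorted (start, end, label) character spans found by the rules."""
    spans = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)

def anonymize(text, spans):
    """Replace each span with a [LABEL] placeholder.

    Spans are processed right to left so earlier offsets stay valid.
    """
    for start, end, label in sorted(spans, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text

sample = "Call 555-123-4567 or mail jane.doe@example.com for details."
spans = bootstrap_annotations(sample)
print(spans)
print(anonymize(sample, spans))
```

Spans found this way are only weak labels; they would still need human review before being used to train an NER model.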
Project #3: NLP Semantic Search Pipeline
  • Goal: To create "digital threads" for connecting automotive data sources

    • which have technician comments about issues before the launch of a vehicle,
    • by assigning semantically matching common part categories to every issue in both data sources
  • Built a pipeline that ensembles the results of 3 pairs of Retriever-Reader models wherein

    • the Retriever narrows down the search space and
    • the Reader zeroes in on the right results
  • Look here for more details
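
A toy sketch of the retriever-reader-ensemble pattern described above, under stated assumptions. The actual models are not named here, so two cheap string scorers (`jaccard`, `char_ratio`) stand in for them, and the part categories and query are invented for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher

def tokens(text):
    return set(text.lower().split())

def jaccard(a, b):
    """Token-overlap score (stand-in for a real model)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def char_ratio(a, b):
    """Character-level similarity (stand-in for a real model)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def retrieve(query, candidates, scorer, top_k=3):
    """Retriever: narrow the search space to the top_k candidates."""
    return sorted(candidates, key=lambda c: scorer(query, c), reverse=True)[:top_k]

def read(query, shortlist, scorer):
    """Reader: pick the single best match from the shortlist."""
    return max(shortlist, key=lambda c: scorer(query, c))

def ensemble(query, candidates, pairs, top_k=3):
    """Majority vote over the answers of several (retriever, reader) pairs."""
    votes = Counter(
        read(query, retrieve(query, candidates, r, top_k), rd) for r, rd in pairs
    )
    return votes.most_common(1)[0][0]

# Hypothetical part categories and issue text.
part_categories = ["brake pads", "engine oil filter", "windshield wiper", "brake fluid"]
pairs = [(jaccard, char_ratio), (char_ratio, jaccard), (jaccard, jaccard)]
print(ensemble("worn brake pads squeal", part_categories, pairs))
```

Majority voting is only one way to combine the pairs; score averaging or rank fusion would fit the same structure.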

Project #4: Unsupervised Clustering Pipeline
  • Built reusable Text Clustering pipelines
    • for deriving actionable insights from unlabeled text corpus
  • Highlights of the Pipeline:
    • The clustering pipeline provided options for both Traditional Topic Modeling and DL-Embedding based Hard Clustering
    • Incorporated the models into an easy-to-use Streamlit UI deployed via K8s
    • The codebase was built on top of the main open source libraries
      • PyTorch (Transformers, Sentence Transformers) and Sklearn
  • Look here for more details
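
To illustrate what embedding-based hard clustering means, here is a minimal pure-Python k-means sketch. It is an assumption-laden toy: the 2-D vectors are made-up stand-ins for sentence embeddings, the documents are invented, and the real pipeline relied on PyTorch and Sklearn rather than this hand-rolled loop.

```python
import math

def kmeans(vectors, initial_centroids, iterations=10):
    """Hard clustering: every vector gets exactly one cluster label."""
    centroids = [list(c) for c in initial_centroids]  # copy, do not mutate input
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels

# Hypothetical documents with made-up 2-D "embeddings".
docs = ["battery drains fast", "battery dies overnight", "screen flickers", "display flickering"]
embeddings = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

labels = kmeans(embeddings, initial_centroids=[embeddings[0], embeddings[2]])
for doc, label in zip(docs, labels):
    print(label, doc)
```

Topic modeling, by contrast, gives each document a mixture over topics rather than a single label.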

👨‍💻 ssh SenthilKumar@WannaKnowMore

👨‍💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyProfessionalStory.txt 🤵
How did I start my career?
  • Back in July 2010, I started out providing customized Market Research (MR), which I did for the first 4 years of my career.
    • Simply put, it was no-code work
      • involving cold-calling, speaking to experts and reading a lot of secondary research material
      • to write actionable procurement intelligence reports.
    • This first job, right after my undergraduate engineering degree,
      • taught me the importance of tough-to-learn soft skills
      • especially in communication, be it written, one-on-one, cold-calling, team presentations and many more.
When did I transition to NLP?
  • Since 2014, I have been in the field of Data Science, and the romance has not died down yet :).
  • This is largely because of the interesting NLP opportunities that came my way.
  • I primarily worked on Social Media Analytics at LatentView from 2014 to 2018 where
    • I helped my F100 tech major client use social media insights effectively in their marketing decisions
  • Since May 2018, as a Data Scientist at Ford,
    • my technical learnings in ML/DL and NLP have been on an upward trend!
What are my mottos?

Striving to follow the below mottos for professional betterment:

  • To keep upskilling my technical knowledge
    • Firmly believe there are Miles to go before I sleep
  • To bring my most collaborative, transparent and, importantly, humble self to my interactions with colleagues/friends,
    • This is so that trust is enabled, long-term partnerships are forged and great results are achieved
  • To stand on the shoulders of the giants of open source
    • In other words, be an applied practitioner first, and not try to reinvent the wheel unless it has some learning/business benefit
👨‍💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyPersonalStory.txt 👨‍👩‍👦
My Small World
  • I am here working happily in the Data Science field largely because of the sacrifice & guidance of my wife.

    • She guided my transition from Market Research to Data Science. She is a fellow analytics professional too
    • She is on a break to take care of our possibly autistic toddler son.
    • I am cognizant of this privilege that I am enjoying (me being able to work when she couldn't).
    • This has been particularly exacerbated by the COVID situation and personal losses
  • Speaking of my son

    • He is the apple of my eye
    • He seems to have exemplary memory, well beyond his age! (possibly biased opinion 🙂)
    • He grasps abstract things like shapes, numbers, letters, and words faster
    • He could be on the autism spectrum (slower learning in social skills compared to kids of his age)
      - With my wife's leadership we diagnosed it early and
      - Hopefully we are acting on it early, before it becomes too noticeable
My Interests
  • For the last 2 years, I have spent (okay, wasted!) a lot of time on many must-watch TV series. Some are iconic, I must say.
    • My favorite genres: Sci-Fi, Comics, Legal/Medical thrillers and anything out of this world
  • My favorites among novels include many works of mythological fiction
  • An ardent tea lover!
👨‍💻 SenthilKumar@WannaKnowMore:~WhoAmI$ cat MyPDFResume.txt 📜

Popular repositories

  1. myNLPnotes (Public)

    This repo has a collection of my notes on the theoretical concepts in NLP

    Jupyter Notebook

  2. aws_serverless_recipes (Public)

    This repository holds various small scale serverless recipes

    HTML

  3. MyCourseWorkNotes (Public)

    This repo has notes from some of the courses I did

    Jupyter Notebook

  4. PythonTutorials (Public)

    This repo holds my practice notebooks in several Python libraries

    Jupyter Notebook

  5. WordEmbedding (Public)

    This repository covers the theory of the embedding algorithms and implementation in python

    Jupyter Notebook

  6. StatisticalLanguageModels (Public)

    A gentle intro to the theory of Statistical Language Models (LMs) | An attempt to understand ABCs of NLP in the era of Transformer LMs generating Poems ;)

    Jupyter Notebook