shirish-01/README.md



👨‍💻 About Me

Hello! I’m VENKATA SUBBARAO SHIRISH ADDAGANTI 👋. I completed my MS in Data Analytics Engineering at Northeastern University. My work revolves around creating end-to-end data solutions: wrangling massive datasets, building robust ETL pipelines, and developing and deploying AI models with industry-standard MLOps practices.

What drives me is the challenge of transforming raw data into actionable insights. I love exploring new technologies that push the limits of data-driven problem-solving, whether it’s cloud-native platforms, containerization tools, or the latest machine learning frameworks. In my spare time, you’ll often find me reading up on cutting-edge data science trends, tinkering with side projects on GitHub, or brainstorming new ways to leverage AI in real-world scenarios.


🔥 GitHub Highlights

GitHub Stats Top Languages


🚀 Featured Project

Scalable MLOps Pipeline for Real-Time Amazon Reviews

We developed an end-to-end sentiment analysis solution on the UCSD Amazon Reviews 2023 dataset (~338 million reviews) that automates data ingestion, validation, preprocessing, and model deployment. The system uses Apache Airflow for pipeline orchestration, DVC for data versioning, BERT and RoBERTa models for sentiment classification, RAG (Retrieval-Augmented Generation) for aspect-wise summarization, and MLflow for model tracking. It is packaged with Docker, deployed on Vertex AI, and released through CI/CD on GitHub Actions. Interactive Streamlit dashboards surface real-time analytics, helping teams make data-driven decisions that improve customer experience and drive business growth.

  • Pipeline Flow:
    1. Data Ingestion: Real-time data streams from Amazon’s review endpoints are stored in a GCP bucket.
    2. ETL & Preprocessing: Airflow DAGs handle tokenization, cleaning, language detection, and sentiment labeling.
    3. Modeling: A TensorFlow-based sentiment classifier trained on the labeled reviews reaches ~80% accuracy.
    4. Continuous Delivery: GitHub Actions triggers container rebuilds and automatically deploys new model versions to Kubernetes clusters.
    5. Monitoring & Alerting: Logs and metrics are collected in Stackdriver, with Slack notifications on anomalies.
  • Throughput: Scaled to handle 1,000+ predictions per second, giving marketing, product, and user experience teams near real-time insights.
  • Impact: Gave product managers near real-time sentiment insights, so they can respond quickly to customer feedback and iterate on the product.
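
As a rough illustration of the cleaning, tokenization, and sentiment labeling in step 2 of the pipeline flow, here is a minimal standard-library sketch. It is not the project's actual code: the `text` and `rating` field names, the regex tokenizer, and the rating-to-label thresholds are all assumptions made for this example, and language detection is left out.

```python
import html
import re

# Assumed patterns for this sketch: strip HTML tags, keep simple word tokens.
TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def clean_review(text: str) -> str:
    """Unescape HTML entities, drop tags, lowercase, and collapse whitespace."""
    text = html.unescape(text)
    text = TAG_RE.sub(" ", text)
    return " ".join(text.lower().split())


def tokenize(text: str) -> list:
    """Split a cleaned review into lowercase word tokens."""
    return TOKEN_RE.findall(clean_review(text))


def label_from_rating(rating: float) -> str:
    """Hypothetical labeling rule: 4-5 stars positive, 1-2 negative, else neutral."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def preprocess(record: dict) -> dict:
    """Turn one raw review record into tokens plus a sentiment label."""
    return {
        "tokens": tokenize(record["text"]),
        "label": label_from_rating(record["rating"]),
    }


if __name__ == "__main__":
    sample = {"text": "Great &amp; sturdy!<br/>Works   well.", "rating": 5}
    print(preprocess(sample))
```

In a real Airflow DAG, a function like `preprocess` would typically run inside a task over batches of records rather than one review at a time.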

🎓 Education

Northeastern University, Boston

  • MS in Data Analytics Engineering (Dec 2024) | GPA: 4.0/4.0
  • Relevant Coursework: Data Management, Data Mining, Data Visualization, Algorithms, Statistical Methods, MLOps

⚙️ Tech & Tools (So Far, But Not Limited To...)

Data & Analytics
Python R Java SQL Excel PowerBI Tableau Hadoop BigQuery MongoDB Snowflake scikit-learn NLTK TensorFlow PyTorch


MLOps & Cloud
Docker Airflow MLflow GitHub Actions AWS Azure GCP FastAPI Flask

📫 Contact Me

Location: Boston, MA

LinkedIn Badge


Pinned

  1. telugu-question-answering-system-using-roBERTa (Public)

    Jupyter Notebook 1

  2. 2-Level-attendance-system (Public)

    Python 2

  3. Airline_Database_Management_System (Public)

    Airline DataBase management using sql

  4. DataCake (Public)

    Forked from nikhil-swamix/DataCake

    data mechanic and hyperScraper

    Python

  5. AlgoDM-Fall2023-Team5/Wardrobe-Stylist (Public)

    Jupyter Notebook 1 1

  6. MLOps-Group-3/Amazon-Reviews-Sentiment-Analysis (Public)

    Python 2 3