Hello! I’m VENKATA SUBBARAO SHIRISH ADDAGANTI 👋. I completed my MS in Data Analytics Engineering at Northeastern University. My work centers on building end-to-end data solutions: wrangling massive datasets, building robust ETL pipelines, and developing and deploying AI models with industry-standard MLOps practices.
What drives me is the challenge of transforming raw data into actionable insights. I love exploring new technologies that push the limits of data-driven problem-solving, whether it’s cloud-native platforms, containerization tools, or the latest machine learning frameworks. In my spare time, you’ll often find me reading up on cutting-edge data science trends, tinkering with side projects on GitHub, or brainstorming new ways to leverage AI in real-world scenarios.
We developed an end-to-end sentiment analysis solution on the UCSD Amazon Reviews 2023 dataset (~338 million reviews) that automates data ingestion, validation, preprocessing, and model deployment. The system uses Apache Airflow for pipeline orchestration, DVC for data versioning, state-of-the-art models like BERT and RoBERTa for sentiment classification, RAG (Retrieval-Augmented Generation) for aspect-wise summarization, and MLflow for model tracking. It is deployed via Docker and Vertex AI, with CI/CD and monitoring through GitHub Actions, and it provides real-time analytics and interactive dashboards in Streamlit so that teams can make data-driven decisions that enhance customer experience and drive business growth.
- Pipeline Flow:
  - Data Ingestion: Real-time data streams from Amazon’s review endpoints are stored in a GCP bucket.
  - ETL & Preprocessing: Tokenization, cleaning, language detection, and sentiment labeling run in Airflow DAGs.
  - Modeling: A TensorFlow-based sentiment classifier trained on massive labeled data achieves ~80% accuracy.
  - Continuous Delivery: GitHub Actions triggers container rebuilds and automatically deploys new model versions to Kubernetes clusters.
  - Monitoring & Alerting: Automatic logs and metrics go to Stackdriver, with Slack notifications on anomalies.
- Throughput: Scaled to handle 1,000+ predictions per second, ensuring near real-time insights for marketing, product, and user experience teams.
- Impact: Provided instantaneous sentiment insights, aiding product managers in rapid response to customer feedback and iterative product improvements.
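
The ETL & Preprocessing step above can be illustrated with a minimal, standard-library-only Python sketch. It is not the project's actual code: the function names, the `text` and `rating` field names, and the rating thresholds used for sentiment labeling are all assumptions made for illustration, and language detection is left out because it would need a third-party library.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Simple patterns for cleaning; a real pipeline would likely be more thorough.
_TAG_RE = re.compile(r"<[^>]+>")          # HTML-like tags
_URL_RE = re.compile(r"https?://\S+")     # bare URLs
_TOKEN_RE = re.compile(r"[a-z0-9']+")     # lowercase word-like tokens


def label_from_rating(rating: float) -> str:
    """Map a star rating to a sentiment label (hypothetical thresholds)."""
    if rating >= 4:
        return "positive"
    if rating <= 2:
        return "negative"
    return "neutral"


def clean_text(text: str) -> str:
    """Strip tags and URLs, lowercase, and collapse whitespace."""
    text = _TAG_RE.sub(" ", text)
    text = _URL_RE.sub(" ", text)
    return " ".join(text.lower().split())


def tokenize(text: str) -> list[str]:
    """Split cleaned text into simple word tokens."""
    return _TOKEN_RE.findall(text)


@dataclass
class ProcessedReview:
    tokens: list[str]
    label: str


def preprocess(review: dict) -> ProcessedReview | None:
    """Clean, tokenize, and label one review; return None if no tokens remain."""
    tokens = tokenize(clean_text(review.get("text", "")))
    if not tokens:
        return None
    return ProcessedReview(tokens=tokens, label=label_from_rating(review["rating"]))


if __name__ == "__main__":
    sample = {"text": "Great <b>phone</b>! See https://example.com", "rating": 5}
    print(preprocess(sample))
```

In an orchestrated pipeline, a function like `preprocess` would typically be called from inside a task on each batch of reviews, with empty results filtered out before training.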
Northeastern University, Boston
- MS in Data Analytics Engineering (Dec 2024) | GPA: 4.0/4.0
- Relevant Coursework: Data Management, Data Mining, Data Visualization, Algorithms, Statistical Methods, MLOps
Location: Boston, MA


