Building scalable data pipelines and cloud infrastructure with Python, GCP, and Big Data technologies
- Currently working on: Building automated data pipelines and cloud-native applications
- Learning: FastAPI, AI Integration, Advanced Cloud Architecture, System Design
- Experience: 2.5+ years in Cloud Engineering, Data Processing
- Writing: Technical articles on Medium
- Interests: AI, Geospatial Data, Big Data, and Scalable Systems
July 2023 - January 2025
- Built scalable data pipelines processing 100GB+ geospatial datasets (TIFF, CSV) using Google Earth Engine API
- Optimized algorithms, achieving a 20% efficiency improvement
- Deployed containerized applications on Kubernetes with infrastructure managed via Terraform
- Implemented comprehensive monitoring and logging solutions using GCP Cloud Logging
Tech Stack: Python, GCP (GCS, Pub/Sub, Kubernetes, VM), Docker, Terraform, Google Earth Engine API
January 2023 - July 2023
- Processed large-scale healthcare device data using PySpark on distributed systems (AWS Glue)
- Developed automated data export pipelines with Cloud Scheduler, Batch Jobs, and Cloud Workflows
- Created interactive data visualization dashboards using Streamlit for real-time insights
- Analyzed accident data to identify high-risk patterns and trends using big data techniques
Tech Stack: PySpark, AWS Glue, Streamlit, Terraform, Cloud Workflows
IntelliScraper
Asynchronous web scraping library with anti-bot-detection evasion, built with Playwright
Description: A production-ready Python library that bypasses anti-bot systems to scrape protected websites (job platforms, social networks, e-commerce sites). It features session management, proxy support, and advanced HTML parsing. Published on PyPI with 2,000+ downloads.
Tech Stack: Python, Playwright, Asyncio, Bright Data Proxy
Highlights:
- Session management with cookies and browser fingerprints for authenticated scraping
- Advanced anti-detection techniques to bypass bot protection systems
- Fully asynchronous architecture for high-performance concurrent scraping
- Published open-source library with 2.08K+ PyPI downloads
- Integrated proxy support (Bright Data) and CLI tool for session generation
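
The asynchronous architecture mentioned above can be illustrated with a minimal sketch of bounded concurrent fetching. This is not IntelliScraper's actual API: `fetch_page`, `scrape_all`, and `max_concurrency` are hypothetical names, and the fetch is a stub that only simulates latency where a real scraper would drive a browser page.

```python
import asyncio


async def fetch_page(url: str) -> str:
    """Hypothetical stand-in for a real browser fetch; not the library's API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>{url}</html>"


async def scrape_all(urls: list[str], max_concurrency: int = 3) -> dict[str, str]:
    """Fetch every URL concurrently with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded_fetch(url: str) -> tuple[str, str]:
        # The semaphore caps how many fetches run at the same time.
        async with semaphore:
            return url, await fetch_page(url)

    results = await asyncio.gather(*(bounded_fetch(u) for u in urls))
    return dict(results)


if __name__ == "__main__":
    example_urls = [f"https://example.com/page/{i}" for i in range(5)]
    pages = asyncio.run(scrape_all(example_urls))
    print(f"Fetched {len(pages)} pages")
```

Capping concurrency with a semaphore keeps throughput high without opening an unbounded number of pages at once, which matters when each fetch holds a browser context or a proxy connection.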
- 4-Star Competitive Programming - CodeChef
- 5-Star C++ - HackerRank
- Competitive Programming Essentials - Master Algorithms Certification
- Active problem solver on GeeksforGeeks and LeetCode
Check out more of my articles on Medium
Open to collaborating on int