I'm Sophie (Huidong). With 3+ years of experience, I specialize in implementing machine learning algorithms and building scalable, data-driven applications. My strengths lie in handling very large datasets and creating optimized ETL processes. I'm passionate about the ever-evolving world of data and always eager to learn, adapt, and contribute.
Telematics Data Mastery
- Managed end-to-end lifecycle of large-scale telematics (GPS/accelerometer) data.
- Developed ETL processes to manage billions of rows, extracting essential features for risk prediction.
Performance Optimization ⚡
- Designed a real-time infrastructure for GLM-based risk prediction, integrating it with auto insurance pricing to exceed industry benchmarks by 200%.
- Reduced processing time by 90% through geospatial data optimization.
Automation & Monitoring
- Introduced automation with Databricks and Apache Airflow.
- Set up monitoring protocols for consistent data quality and model performance.
Cloud Mastery ☁️
- Designed and hosted a Streamlit web app on AWS EC2, significantly reducing manual effort.
Projects
- Semantic Search Engine: Launched a resilient search engine backed by a Jenkins CI/CD pipeline for timely updates.
- Netflix Data Analysis ETL: Built an ETL pipeline with dbt, Snowflake, and Airflow, with visualizations in Amazon QuickSight.
- Article Recommendation Engine: A web application that recommends BBC articles based on user preferences using word2vec.
- Spark Streaming: Real-time data ingestion and processing using Azure Event Hubs, Databricks, and Snowflake.
- Feature Importance Exploration: Detailed exploration of diverse feature-importance methods for ML models.
- Machine Learning Implementations: Implemented ML algorithms from scratch, including Naive Bayes (for sentiment analysis), k-means, decision trees, random forests, and linear regression.
If you'd like to connect, feel free to reach out to me through LinkedIn.
I'm always open to discussing new projects, ideas, or opportunities. Let's get in touch and explore the world of data together!