Notebooks (.ipynb) and scripts (.py):
- Repository - CX Data Analyst Portfolio Project - Insurance Industry
- Repository - Hotel Booking Cancellation Analysis
- Repository - LEGO Database Analysis: SQL in Python Environment
- Repository - Yandex Practicum Assorted DA projects

Data Analysis Projects

| Project | Type of Research | Annotation |
|---|---|---|
| 01. Customer Lifetime Value Optimization Through Proactive Health Engagement | Predictive Analytics & Customer Segmentation | Analysis of 25,000 health insurance records to predict churn and identify high-value segments through activity-based CLV modeling. Includes retention scoring, geospatial trend analysis, and personalized engagement strategies. Tech: Pandas, Matplotlib, Seaborn, Scikit-Learn. Skills: EDA, predictive modeling, customer segmentation, health metrics analysis. |
| 02. Hotel Booking Cancellation Analysis | Exploratory Data Analysis & Business Intelligence | EDA of 36,274 hotel bookings identifying a $4.2M revenue loss from cancellations. Includes statistical validation (t-tests, chi-square), interactive dashboards, and targeted interventions projected to recover $400K-$600K annually. Tech: Pandas, Matplotlib, Seaborn, Scikit-Learn, Plotly, React. Skills: Statistical analysis, feature engineering, business impact quantification, dashboard development. |
| 03. LEGO Database Analysis: SQL in Python Environment | SQL + Python Integration | PostgreSQL analysis via a Jupyter notebook on the Rebrickable LEGO database (11,673 sets, 8 tables). Analyzes production trends, color patterns, and theme popularity. Tech: PostgreSQL, Python, Jupyter, SQLAlchemy. Skills: CTEs, multi-table JOINs (3+ tables), aggregate functions, subqueries, Python-SQL integration. |
| 04. PostgreSQL Queries: Startup Investment Analysis | Advanced SQL Database Analysis | 23 progressive SQL queries analyzing venture capital investments across a 7-table relational database. Includes acquisition ROI calculations, geographic funding patterns, and temporal trend analysis (2010-2013). Tech: PostgreSQL. Skills: Complex joins (4+ tables), CTEs, nested subqueries, window functions, date manipulation, pivot-style reporting. |
| 05. Music Service Research | Exploratory Data Analysis | Hypothesis testing comparing user behavior across two major cities using Yandex Music service data. Tech: Pandas, Matplotlib. Skills: EDA fundamentals, hypothesis testing, comparative analysis. |
| 06. Borrower Reliability Study | Data Preprocessing & Feature Engineering | Analysis of how marital status and family size relate to loan repayment. Tech: Pandas, NLTK. Skills: Data cleaning, lemmatization, categorical data handling, missing value treatment. |
| 07. Real Estate Price Analysis | Exploratory Data Analysis & Regression | Identifying key parameters impacting real estate property values through statistical analysis and visualization. Tech: Pandas, Matplotlib, Seaborn, NumPy. Skills: EDA, data preprocessing, correlation analysis, feature importance, time series analysis. |
| 08. Government-Supported Film Performance Analysis | Market Research & Trend Analysis | Analysis of the local film distribution market, focusing on government-supported films and viewer engagement trends. Tech: Pandas, Matplotlib, Seaborn, NumPy. Skills: EDA, market trend identification, stakeholder-focused insights, data visualization. |
