Imaneimy/README.md

Imane Moussafir

Data & BI Engineer

Projects

| Repository | What it does | Stack |
| --- | --- | --- |
| etl-pipeline-testing | PySpark ETL pipeline with schema validation, business rules, and XRAY-format test cases | PySpark, Pytest |
| datalake-regression-tests | Medallion architecture regression harness (Bronze→Silver→Gold) with a DataFrame comparator that detects schema drift | PySpark, Pytest |
| sql-data-validation | 20 SQL data quality checks (nulls, FK integrity, value ranges, business rules) with an HTML report | Python, SQLite |
| anomaly-dashboard | SQLite-backed anomaly tracker with a generated HTML dashboard; a lightweight JIRA/XRAY alternative | Python, SQLite |
| retail-sales-analysis | ETL + KPI analysis on retail sales data, segmented by category, region, age, gender | Pandas, Matplotlib |
| sales-forecasting | Monthly revenue forecasting with ARIMA: stationarity testing, seasonal decomposition, AIC order selection | Statsmodels, Pandas |
| customer-segmentation-rfm | RFM scoring + K-Means clustering on 400 customers: elbow method, silhouette, segment revenue breakdown | Scikit-learn, Pandas |
| customer-churn-prediction | Churn prediction with logistic regression: feature encoding, evaluation, feature importance | Scikit-learn, Pandas |
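To illustrate the schema-drift check mentioned for datalake-regression-tests, here is a minimal sketch in plain Python. It is not the repository's code: `SchemaDiff`, `compare_schemas`, and the sample column names are hypothetical, and schemas are assumed to be simple `{column: type}` mappings rather than PySpark objects.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Differences between an expected and an actual column schema."""
    missing: list = field(default_factory=list)       # in expected, absent from actual
    unexpected: list = field(default_factory=list)    # in actual, absent from expected
    type_changes: dict = field(default_factory=dict)  # column -> (expected, actual)

    @property
    def has_drift(self) -> bool:
        return bool(self.missing or self.unexpected or self.type_changes)


def compare_schemas(expected: dict, actual: dict) -> SchemaDiff:
    """Compare two {column_name: type_name} mappings (hypothetical helper)."""
    diff = SchemaDiff()
    diff.missing = sorted(set(expected) - set(actual))
    diff.unexpected = sorted(set(actual) - set(expected))
    # For columns present on both sides, record any type change.
    for col in sorted(set(expected) & set(actual)):
        if expected[col] != actual[col]:
            diff.type_changes[col] = (expected[col], actual[col])
    return diff


# Hypothetical Silver-layer schemas from two pipeline runs.
baseline = {"order_id": "bigint", "amount": "double", "region": "string"}
current = {"order_id": "bigint", "amount": "string", "channel": "string"}

diff = compare_schemas(baseline, current)
print(diff.missing, diff.unexpected, diff.type_changes)
# → ['region'] ['channel'] {'amount': ('double', 'string')}
```

With PySpark, the two mappings could be built from `dict(df.dtypes)` on each DataFrame, and a Pytest regression test could then assert `not diff.has_drift`.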

Stack

Languages: Python, SQL
Data: Pandas, NumPy, PySpark
ML: Scikit-learn, Statsmodels
BI: Power BI, DAX, Power Query
Testing: Pytest, pytest-cov
Databases: PostgreSQL, MySQL, SQLite


Background

  • Orange Maroc (Feb–Aug 2025) — built 15+ Power BI dashboards for procurement KPIs and supplier performance; automated Excel/Power Query reporting pipelines
  • EMSI — Engineering degree in Computer Science & Networks (MIAGE), specialized in Data Engineering and BI
  • Self-studying ML/DL through Scikit-learn, Keras, and hands-on projects

LinkedIn

Pinned

  1. anomaly-dashboard (Public)

    Test anomaly tracker for Big Data projects — JIRA-inspired ticket management with HTML dashboard and SQLite backend

    Python

  2. datalake-regression-tests (Public)

    Regression test suite for a Medallion data lake architecture (Bronze/Silver/Gold): schema drift, row count, and statistical comparisons

    Python

  3. etl-pipeline-testing (Public)

    ETL pipeline testing suite with PySpark and Pytest — schema validation, business rules, unit and integration tests

    Python

  4. sql-data-validation (Public)

    SQL data quality checks on a DataMart with automated HTML and CSV reporting — nulls, uniqueness, FK integrity and business rules

    Python

  5. customer-segmentation-rfm (Public)

    Customer segmentation using RFM scoring and K-Means clustering — elbow method, silhouette score, segment revenue breakdown. 400 customers, 2800+ transactions.

    Python

  6. sales-forecasting (Public)

    Monthly sales forecasting with ARIMA — stationarity testing, seasonal decomposition, AIC-based order selection. Inspired by procurement forecasting work at Orange Maroc.

    Python