Xelly is a full-stack web application designed to simplify the cleaning, clustering, analysis, and forecasting of trade and product data stored in Excel or CSV files. It leverages machine learning techniques and time-series forecasting to deliver insights without requiring coding expertise.
Live App: clean-excel.vercel.app
- Problem Statement
- Proposed Solution
- Key Features
- System Architecture
- Deployment and Security
- Installation (Local Development)
- Tech Stack
- Author
Consulting firms and analysts frequently deal with messy Excel files containing raw trade data with inconsistent entries, typographical errors, and non-standardized naming conventions. Manual cleaning and clustering is tedious and error-prone, while advanced analysis and forecasting are limited in Excel.
Xelly automates:
- Data cleaning and standardization
- Clustering similar entries using TF-IDF + Cosine Similarity and Sentence Transformers
- Trade data filtering and interactive dashboards
- Forecasting product prices or quantities using Prophet, Linear Regression, and Polynomial Regression
- Upload Excel (.xlsx) or CSV (.csv) files via drag-and-drop or file picker.
- Automated data cleaning: trimming spaces, normalizing formats, currency (USD), and unit conversions (e.g., UQC → KG).
- Preview cleaned data and download standardized files.
- Groups similar text entries like
“Petroleum coke”and“Petroleum coke bulk”. - Adjustable similarity threshold.
- Option to accept or reject suggested replacements.
- Trade Type Detection (Import/Export).
- Multi-level filtering by city, supplier, HS code, value, year, etc.
- Insights include:
- Top importer-supplier pairs
- Market share metrics
- Unit price trends
- Importer-supplier heatmaps
- Time-series forecasts for columns (e.g., quantity or unit price).
- Models:
- Linear Regression
- Polynomial Regression
- Prophet (Meta)
- Outputs:
- 2-year forecast (line + bar charts)
- Model performance metrics (MAE, RMSE, R²)
- Seasonal patterns and historical trends
- Key statistics: transaction count, unique products, time range.
- Quantity and price metrics.
- Product-specific transaction records.
- Frontend: React (Vercel hosting).
- Backend: Flask API (Render hosting).
- Core Modules:
upload_bp– File upload, cleaning.cluster_bp/cosine_bp– TF-IDF & cosine similarity clustering.filter_bp– Trade data filtering and analysis.company_bp– Company-level metrics.forecast_bp– Prophet-based time series forecasting.
- Frontend: Hosted on Vercel (HTTPS, TLS, CI/CD).
- Backend: Hosted on Render (HTTPS endpoints, environment variables for sensitive keys).
- CORS Policies & Server Validation for data protection.
# Clone repository
git clone https://github.com/saun09/clean-excel.git
# Backend Setup
cd backend
pip install -r requirements.txt
flask run
# Frontend Setup
cd frontend
npm install
npm run dev