Professional AI-powered data analysis platform for comprehensive data visualization and intelligent recommendations
DataScope is a web application that combines statistical computing with AI-generated commentary. Upload a CSV or Excel dataset to receive summary statistics, visualizations, and AI-generated insights that support data-driven decisions.
- AI-Powered Analysis: Leverages Google Gemini AI for intelligent insights and recommendations
- Advanced Visualizations: Automatic generation of correlation matrices, distributions, and statistical plots
- Comprehensive Analytics: Deep statistical analysis including outlier detection and data quality assessment
- Smart Recommendations: AI-driven suggestions for data cleaning, preprocessing, and ML readiness
- Modern UI/UX: Responsive design with interactive grid backgrounds and professional styling
- Security-First: Temporary file processing with automatic cleanup and no permanent data storage
- Production-Ready: Docker, Heroku, and Railway deployment configurations included
- Python 3.10+ - Core runtime environment
- Flask 2.3.3 - Web framework with modern architecture
- Pandas 2.1.1 - Data manipulation and analysis
- NumPy 1.24.3 - Numerical computing
- Matplotlib 3.7.2 - Statistical visualizations
- Seaborn 0.12.2 - Advanced statistical graphics
- Scikit-learn 1.3.0 - Machine learning utilities
- Google Generative AI 0.3.0 - AI-powered insights
- Plotly 5.17.0 - Interactive visualizations
- Modern CSS3 - Custom design system with CSS Grid and Flexbox
- Vanilla JavaScript - Interactive animations and dynamic grid system
- Canvas API - Animated background effects
- Font Awesome 6.5.1 - Professional iconography
- Progressive Enhancement - Works without JavaScript
- Gunicorn 21.2.0 - WSGI HTTP Server
- Docker - Containerized deployment
- Environment Configuration - Secure settings management
- Health Check Endpoints - Monitoring and uptime tracking
- Statistical Summaries: Mean, median, mode, standard deviation, quartiles
- Data Quality Assessment: Missing value analysis, duplicate detection, data type validation
- Feature Analysis: Unique value counts, cardinality analysis, data distribution patterns
- Outlier Detection: Statistical outlier identification using IQR and z-score methods
- Correlation Analysis: Pearson, Spearman correlation matrices with significance testing
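The IQR and z-score outlier rules named above can be sketched in a few lines. This is a minimal illustration in plain Python, not DataScope's actual code; the function names, the 1.5 IQR multiplier, and the z-score thresholds are conventional defaults assumed here.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # constant column: nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]

data = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(data))                    # [95]
print(zscore_outliers(data, threshold=2.0))  # [95]
print(zscore_outliers(data))                 # []
```

The last call shows why the two methods are worth running together: in a small sample a single extreme value inflates the standard deviation enough that it never reaches a z-score of 3, while the IQR rule still flags it.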
- Distribution Analysis: Histograms with KDE overlays for all numeric columns
- Correlation Heatmaps: Interactive correlation matrices with statistical significance
- Box Plots: Outlier visualization and quartile analysis
- Categorical Analysis: Bar charts for categorical data distribution
- Missing Data Visualization: Heatmaps showing missing value patterns
- Data Quality Scoring: Automated assessment of dataset quality
- Cleaning Recommendations: AI-generated suggestions for data preprocessing
- Pattern Recognition: Automatic detection of trends, seasonality, and anomalies
- ML Readiness Assessment: Evaluation of data suitability for machine learning
- Next Steps Guidance: Intelligent recommendations for further analysis
- Interactive Grid Background: Dynamic background that responds to mouse movements
- Professional Design System: Consistent color palette and typography
- Responsive Layout: Optimized for desktop, tablet, and mobile devices
- Real-time Processing: Progress indicators and smooth loading states
- Error Handling: Comprehensive error pages and user feedback
DataScope Deployment/
├── Core Application
│   ├── app.py                      # Flask application with comprehensive routing
│   ├── requirements.txt            # Python dependencies with version pinning
│   └── .env                        # Environment configuration (not in repo)
├── Frontend Assets
│   └── static/
│       ├── css/
│       │   ├── style.css           # Modern design system with CSS variables
│       │   ├── dashboard.css       # Dashboard-specific styles
│       │   └── dashboard_fixed.css # Fixed dashboard layout styles
│       ├── js/
│       │   ├── script.js           # Interactive grid system and animations
│       │   └── simple_script.js    # Utility functions
│       ├── images/
│       │   ├── datascope-logo.svg    # Vector logo
│       │   ├── datascope-favicon.svg # Favicon
│       │   └── FORMAL.png            # Additional branding
│       └── plots/                  # Generated visualization storage
├── Templates
│   ├── base.html                   # Base template with navigation
│   ├── index.html                  # Landing page with features showcase
│   ├── upload.html                 # File upload interface
│   ├── upload_new.html             # Alternative upload interface
│   ├── results_dashboard.html      # Analysis results display
│   ├── about.html                  # About page
│   ├── contact.html                # Contact form
│   ├── 404.html                    # 404 error page
│   └── 500.html                    # 500 error page
├── Deployment Configuration
│   ├── Dockerfile                  # Docker containerization
│   ├── Procfile                    # Heroku deployment config
│   ├── runtime.txt                 # Python version specification
│   └── .gitignore                  # Git ignore rules
└── Temporary Storage
    └── uploads/                    # Temporary file uploads (auto-cleaned)
- Python 3.10+ (specified in runtime.txt)
- pip package manager
- Git for repository cloning
- Clone the Repository
  git clone <your-repository-url>
  cd "DataScope Deployment"
- Create Virtual Environment
  python -m venv venv
  # Windows
  venv\Scripts\activate
  # macOS/Linux
  source venv/bin/activate
- Install Dependencies
  pip install -r requirements.txt
- Environment Configuration
  # Create .env file with required variables
  GEMINI_API_KEY=your_gemini_api_key_here
  FLASK_SECRET_KEY=your_secure_secret_key
  FLASK_ENV=development
  MAX_CONTENT_LENGTH=16777216
  UPLOAD_FOLDER=uploads
  SMTP_SERVER=smtp.gmail.com
  SMTP_PORT=587
  SMTP_USERNAME=your_email@gmail.com
  SMTP_PASSWORD=your_app_password
- Run Application
  python app.py
  Application will be available at http://localhost:5000
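As a rough sketch of how the environment variables from the configuration step might be read at startup: the helper name `load_config`, the `AI_ENABLED` flag, and the fallback defaults are assumptions for illustration and are not taken from `app.py`.

```python
import os

def load_config(env=os.environ):
    """Read DataScope-style settings from an environment mapping (hypothetical helper)."""
    api_key = env.get("GEMINI_API_KEY")
    return {
        "GEMINI_API_KEY": api_key,
        # Assumed flag: AI features are switched off when no key is configured.
        "AI_ENABLED": bool(api_key),
        "SECRET_KEY": env.get("FLASK_SECRET_KEY", ""),
        # MAX_CONTENT_LENGTH is expressed in bytes; the default here mirrors the sample .env.
        "MAX_CONTENT_LENGTH": int(env.get("MAX_CONTENT_LENGTH", 16 * 1024 * 1024)),
        "UPLOAD_FOLDER": env.get("UPLOAD_FOLDER", "uploads"),
        "SMTP_PORT": int(env.get("SMTP_PORT", 587)),
    }

config = load_config({"MAX_CONTENT_LENGTH": "16777216", "UPLOAD_FOLDER": "uploads"})
print(config["MAX_CONTENT_LENGTH"] // (1024 * 1024), "MiB upload limit")
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.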
# Build Docker image
docker build -t datascope .
# Run container
docker run -p 5000:5000 --env-file .env datascope
# Install Heroku CLI and login
heroku login
# Create new Heroku app
heroku create your-datascope-app
# Set environment variables
heroku config:set GEMINI_API_KEY=your_api_key
heroku config:set FLASK_SECRET_KEY=your_secret_key
# Deploy application
git push heroku main
- Connect your GitHub repository to Railway
- Configure environment variables in Railway dashboard
- Automatic deployment on git push
- Maximum File Size: 50MB (configurable via MAX_CONTENT_LENGTH, given in bytes; 50MB is 52428800, while the sample .env value 16777216 corresponds to 16MB)
- Supported Formats: CSV (.csv), Excel (.xlsx, .xls)
- Processing Timeout: 120 seconds
- Automatic Cleanup: Files deleted after analysis
- Provider: Google Gemini AI
- Model: gemini-pro
- Rate Limiting: Built-in request throttling
- Fallback: Graceful degradation if AI unavailable
- File Validation: Extension and content type checking
- Data Sanitization: Automatic data cleaning and validation
- Session Management: Secure session handling
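- CSRF Protection: Token-based form protection (Flask does not include CSRF protection by itself; it is typically provided by an extension such as Flask-WTF)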
- No Data Persistence: Files automatically deleted post-analysis
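The extension and size checks described above might look roughly like this. It is a sketch under assumptions: `validate_upload` and its return shape are hypothetical, the 50MB default comes from the stated limit, and content-type checking is left out.

```python
from pathlib import Path

# Formats listed as supported: CSV and Excel.
ALLOWED_EXTENSIONS = {".csv", ".xlsx", ".xls"}

def validate_upload(filename, size_bytes, max_bytes=50 * 1024 * 1024):
    """Return (ok, reason) for an upload, checking extension and size only."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        return False, f"unsupported file type: {suffix or 'none'}"
    if size_bytes > max_bytes:
        return False, "file exceeds the size limit"
    return True, "ok"

print(validate_upload("sales.CSV", 1024))
print(validate_upload("notes.txt", 10))
print(validate_upload("big.xlsx", 60 * 1024 * 1024))
```

An extension check alone does not prove the file is what it claims to be, so a real handler would still need to cope with parse errors when the file is loaded.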
- Descriptive Statistics: Complete statistical summaries
- Missing Value Analysis: Comprehensive missing data patterns
- Data Type Detection: Automatic type inference and validation
- Outlier Detection: Multiple statistical methods (IQR, Z-score)
- Correlation Analysis: Pearson and Spearman correlations
- Distribution Plots: Histograms with kernel density estimation
- Correlation Heatmaps: Interactive correlation matrices
- Box Plots: Quartile analysis and outlier visualization
- Bar Charts: Categorical data frequency analysis
- Missing Data Heatmaps: Visual missing value patterns
- Data Quality Scoring: Automated quality assessment
- Preprocessing Recommendations: AI-suggested data cleaning steps
- Pattern Detection: Trend and anomaly identification
- ML Readiness: Machine learning suitability assessment
- Custom Insights: Context-aware analytical recommendations
--primary-dark: #28262b;    /* Main background and primary elements */
--primary-light: #a9a29c;   /* Secondary text and accents */
--secondary-light: #d5ccc7; /* Primary text and highlights */
--neutral-dark: #333333;    /* Cards and surface elements */
- Font Family: Inter (Google Fonts)
- Weights: 300, 400, 500, 600, 700, 800, 900
- Scale: Modular scale from 0.75rem to 3.75rem
- Animated Grid Background: Canvas-based particle system
- 3D Card Hover Effects: CSS transforms with parallax
- Smooth Transitions: Easing functions for natural motion
- Loading Animations: Progress indicators and spinners
- GET / - Landing page
- GET /upload - File upload interface
- POST /analyze - File analysis (accepts multipart/form-data)
- GET /results/<timestamp> - Analysis results dashboard
- POST /chat - AI chat interface
- GET /export/<timestamp> - Export analysis data
- GET /api/health - Health check endpoint
- GET /api/status - Application status
- GET /about - About page
- GET /contact - Contact form
- POST /contact - Contact form submission
- 404 - Custom 404 error page
- 500 - Custom 500 error page
- Graceful error handling with user-friendly messages
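For orientation, a health check route in Flask can be as small as the sketch below. The route path comes from the list above; the JSON body (`{"status": "ok"}`) is an assumption, since the actual response format is not documented here.

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/health")
def health():
    # Hypothetical payload: the real endpoint may return different fields.
    return jsonify(status="ok"), 200

# Exercise the route in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.get("/api/health")
    print(response.status_code, response.get_json())
```

Using the test client keeps the check independent of a running server, which is convenient in CI before the curl-based smoke tests are run against a deployed instance.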
# Test file upload functionality
curl -X POST -F "file=@sample.csv" http://localhost:5000/analyze
# Test health endpoint
curl http://localhost:5000/api/health
# Test API status
curl http://localhost:5000/api/status
- CSV files with various data types
- Excel files (.xlsx, .xls)
- Files with missing values
- Large datasets (up to 50MB)
- Temporary Processing: Files stored only during analysis
- Automatic Cleanup: All uploads deleted after processing
- No Permanent Storage: Zero data retention policy
- Secure File Handling: Validated file types and content
- Memory Management: Efficient memory usage and cleanup
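One common way to guarantee that uploads are removed after processing is a `try`/`finally` around the analysis step. The sketch below assumes a hypothetical `analyze_then_delete` helper and an `analyze(path)` callback; it is not DataScope's actual cleanup code.

```python
import os
import tempfile

def analyze_then_delete(data: bytes, analyze, upload_dir=None):
    """Write an upload to a temp file, run analyze(path), and always delete the file."""
    fd, path = tempfile.mkstemp(suffix=".csv", dir=upload_dir)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return analyze(path)
    finally:
        # Runs on success and on error, so the upload never outlives the request.
        if os.path.exists(path):
            os.remove(path)

seen_paths = []

def count_lines(path):
    """Stand-in for the real analysis: count lines and remember the path used."""
    seen_paths.append(path)
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)

result = analyze_then_delete(b"a,b\n1,2\n3,4\n", count_lines)
print(result, os.path.exists(seen_paths[0]))
```

The `finally` block is what makes the cleanup unconditional: a parsing failure inside `analyze` still leaves no file behind.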
- No Data Sharing: Zero third-party data sharing
- Local Processing: Data processed on secure servers
- Encrypted Transmission: HTTPS for all communications
- Session Security: Secure session management
- Environment Variables: Sensitive data stored securely
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes with proper testing
- Commit changes: git commit -m 'Add amazing feature'
- Push to branch: git push origin feature/amazing-feature
- Open a Pull Request with detailed description
- Python: Follow PEP 8 style guidelines
- JavaScript: Use ES6+ features and modern practices
- CSS: Follow BEM methodology for class naming
- Documentation: Update README and code comments
- Testing: Add tests for new functionality
- Efficient Data Processing: Pandas vectorized operations
- Memory Management: Automatic garbage collection
- Caching: Static asset caching
- Compression: Gzip compression for responses
- CDN Integration: Ready for CDN deployment
- Analysis Speed: < 30 seconds for most datasets
- Memory Usage: Optimized for large datasets
- Response Time: < 2 seconds for page loads
- Uptime: 99.9% availability target
File Upload Errors
- Ensure file size is under 50MB
- Verify file format (CSV or Excel)
- Check file permissions and corruption
Analysis Failures
- Verify data quality and format
- Check for extremely large datasets
- Ensure network connectivity for AI features
Deployment Issues
- Verify all environment variables are set
- Check Python version compatibility
- Ensure all dependencies are installed
- GitHub Issues: Report bugs and feature requests
- Contact Form: Use in-app contact form for support
- Documentation: Check this README for comprehensive info
This project is licensed under the MIT License - see the LICENSE file for complete details.
- Google Gemini AI - For providing advanced AI capabilities
- Flask Community - For the excellent web framework
- Open Source Libraries - Pandas, NumPy, Matplotlib, Seaborn, and others
- Design Inspiration - Modern data visualization platforms
- Contributors - All developers who contribute to this project
- Languages: Python, JavaScript, CSS, HTML
- Framework: Flask with modern architecture
- AI Integration: Google Gemini API
- Deployment: Docker, Heroku, Railway ready
- UI/UX: Modern responsive design
- Security: Production-grade security features
Start analyzing your datasets with DataScope's AI-powered platform
Try DataScope Now
Built with ❤️ for data professionals worldwide
DataScope - Where Data Meets Intelligence