A Streamlit + Pandas based tool for uploading, cleaning, exploring, and analyzing spreadsheet (CSV/Excel) data. Inspired by Codedex project: Analyze Spreadsheet Data with Pandas + ChatGPT.

Fuser2k/pandas-spreadsheet-analyzer

Pandas Spreadsheet Analyzer

A simple Flask web app to upload a CSV (or use the bundled sample) and explore, clean, and analyze book bestseller data with pandas.

Features

  • Upload your own .csv or use the sample dataset.
  • Raw data exploration: preview, shape, columns, and numeric summary.
  • Cleaning: remove duplicates, standardize column names, and fix numeric types.
  • Analysis: top authors by count and total reviews, high-rated books (>= 4.8), genre performance (avg rating, avg reviews, median price), and yearly distribution.
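The analysis features above map onto a handful of pandas group-by operations. The following is a minimal sketch, not the project's actual code: the function name `analyze`, the result keys, and the tiny made-up frame are assumptions for illustration, and it assumes the columns already use the cleaned names (Title, Author, Rating, Reviews, Price, Genre, Publication Year).

```python
import pandas as pd


def analyze(df: pd.DataFrame) -> dict:
    """Compute the listed summaries on an already-cleaned frame (hypothetical helper)."""
    return {
        # Book count and total reviews per author, most prolific first.
        "top_authors": (
            df.groupby("Author")
            .agg(books=("Title", "count"), total_reviews=("Reviews", "sum"))
            .sort_values(["books", "total_reviews"], ascending=False)
        ),
        # Books at or above the 4.8 rating threshold.
        "high_rated": df[df["Rating"] >= 4.8],
        # Average rating, average reviews, and median price per genre.
        "genre_performance": df.groupby("Genre").agg(
            avg_rating=("Rating", "mean"),
            avg_reviews=("Reviews", "mean"),
            median_price=("Price", "median"),
        ),
        # Number of books per publication year, in chronological order.
        "books_per_year": df["Publication Year"].value_counts().sort_index(),
    }


# Made-up rows for illustration only; not taken from the sample dataset.
sample = pd.DataFrame(
    {
        "Title": ["A", "B", "C", "D"],
        "Author": ["X", "X", "Y", "Z"],
        "Rating": [4.9, 4.5, 4.8, 4.2],
        "Reviews": [100, 200, 50, 10],
        "Price": [10, 20, 8, 12],
        "Genre": ["Fiction", "Fiction", "Non Fiction", "Non Fiction"],
        "Publication Year": [2018, 2019, 2019, 2019],
    }
)
report = analyze(sample)
print(report["top_authors"])
```

Named aggregation (`agg(books=("Title", "count"), ...)`) keeps the output column names readable, which helps when the tables are passed straight to a template.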

Quick Start (Windows/Mac/Linux)

  1. Ensure Python 3.10+ is installed.
  2. Install deps:
    pip install flask pandas
  3. Run the app:
    python app.py
  4. Open http://127.0.0.1:5000/ in your browser.
  5. Upload your CSV or click "Use Sample CSV".

Expected Columns (Flexible)

The app adapts to common column names:

  • Title: Title or Name
  • Rating: Rating or User Rating
  • Year: Publication Year or Year
  • Optional: Author, Reviews, Price, Genre
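One way to picture this flexible matching is a small lookup that reports which header in an uploaded file fills each role. This is a sketch under assumptions: `find_columns`, `missing_required`, and the alias tables are hypothetical names built from the list above, and treating Title, Rating, and Year as required is inferred from only the last bullet being marked optional.

```python
# Accepted header names per role, taken from the list above (first match wins).
REQUIRED = {
    "Title": ["Title", "Name"],
    "Rating": ["Rating", "User Rating"],
    "Year": ["Publication Year", "Year"],
}
OPTIONAL = ["Author", "Reviews", "Price", "Genre"]


def find_columns(columns):
    """Map each role to the header that fills it, or None if absent (hypothetical helper)."""
    present = set(columns)
    found = {}
    for role, aliases in REQUIRED.items():
        found[role] = next((alias for alias in aliases if alias in present), None)
    for name in OPTIONAL:
        found[name] = name if name in present else None
    return found


def missing_required(found):
    """List the roles assumed to be required that no header satisfied."""
    return [role for role in REQUIRED if found[role] is None]


found = find_columns(["Name", "User Rating", "Year", "Author"])
print(found)
print(missing_required(found))
```

Because the helper only needs header names, it can be called with `df.columns` before any cleaning runs.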

Project Structure

.
├── app.py                    # Flask app (web UI)
├── main.py                   # CLI helpers and cleaning/analysis logic
├── templates/
│   └── index.html            # HTML template (English)
├── bestsellers with categories.csv   # Sample dataset
├── cleaned_bestsellers.csv   # Example cleaned output (optional)
├── .gitignore                # Typical Python/Flask ignores
├── .gitattributes            # Text normalization
└── LICENSE                   # Project license

Development Notes

  • app.py reuses clean(df) from main.py when available; includes a safe fallback.
  • Cleaning standardizes names (Name→Title, User Rating→Rating, Year→Publication Year), drops duplicates, and coerces numeric columns.
  • Jinja template uses explicit is not none checks to avoid pandas truthiness errors.
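The cleaning steps described in these notes could look roughly like the sketch below. It is an illustration, not the `clean(df)` shipped in main.py: the list of numeric columns and the choice of `errors="coerce"` (turning unparseable values into NaN) are assumptions.

```python
import pandas as pd

# Renames named in the notes above.
RENAMES = {"Name": "Title", "User Rating": "Rating", "Year": "Publication Year"}
# Assumption: which columns are treated as numeric.
NUMERIC_COLUMNS = ["Rating", "Reviews", "Price", "Publication Year"]


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Standardize names, drop duplicate rows, and coerce numeric columns."""
    cleaned = df.rename(columns=RENAMES).drop_duplicates().reset_index(drop=True)
    for column in NUMERIC_COLUMNS:
        if column in cleaned.columns:
            # Values that cannot be parsed become NaN instead of raising.
            cleaned[column] = pd.to_numeric(cleaned[column], errors="coerce")
    return cleaned


# Made-up rows: one exact duplicate and one unparseable rating.
raw = pd.DataFrame(
    {
        "Name": ["A", "A", "B"],
        "User Rating": ["4.7", "4.7", "n/a"],
        "Year": [2018, 2018, 2019],
    }
)
cleaned = clean(raw)
print(cleaned)
```

The function returns a new frame and leaves the input untouched, so the raw preview and the cleaned view can be shown side by side.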

CLI Usage (Optional)

You can also run the analysis from the terminal:

python main.py --csv path/to/your.csv

It will preview, clean, analyze, and write cleaned_bestsellers.csv next to your input.
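As a rough illustration of how the `--csv` option and the "next to your input" output location could fit together, here is a sketch using only the standard library. The helper names `build_parser` and `output_path_for`, the example path, and making `--csv` required are assumptions; main.py may handle these differently.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the single --csv option shown above."""
    parser = argparse.ArgumentParser(description="Clean and analyze a bestsellers CSV.")
    # Assumption: the option is required; the real script may default to the sample file.
    parser.add_argument("--csv", required=True, type=Path, help="Path to the input CSV")
    return parser


def output_path_for(csv_path: Path) -> Path:
    """Place cleaned_bestsellers.csv in the same directory as the input file."""
    return csv_path.with_name("cleaned_bestsellers.csv")


# Parse a hypothetical command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--csv", "data/books.csv"])
out = output_path_for(args.csv)
print(out)
```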

License

This project is licensed under the MIT License. See LICENSE for details.


Quick Summary

  • Project: a simple Flask interface for uploading a CSV and exploring, cleaning, and analyzing it.
  • Running: run pip install flask pandas, then python app.py, and open http://127.0.0.1:5000/.
  • Sample data: click "Use Sample CSV" to use the bundled file.
