This project automates the process of fetching invoice data from MongoDB, flattening nested documents, and exporting them into Excel (.xlsx) and CSV (.csv) formats. It runs in a continuous loop, checking for new records at regular intervals, and keeps track of statistics such as new documents, failed records, and success rates.
- Connects to a MongoDB collection and fetches documents
- Flattens nested invoice data into tabular format using pandas
- Exports results to both CSV and Excel files
- Adds a computed Grand Total column (gst + sgst)
- Runs in a continuous cycle with configurable update intervals
- Logs detailed processing info, warnings, and errors
- Tracks statistics: processed, failed, duplicates, no-invoices, invalid-invoices
mongo_excel_updater/
│
├── main.py                # Entry point for the application
├── config.py              # Configuration settings (Mongo URI, output paths, intervals)
│
├── utils/
│   ├── mongo_helper.py    # Fetch data from MongoDB
│   ├── flatten_helper.py  # Flatten nested invoices + add Grand Total
│   ├── excel_helper.py    # Save DataFrame to Excel/CSV
│   ├── logger.py          # Centralized logging setup
│   └── __init__.py
│
├── output/
│   ├── invoices.csv       # Generated CSV file
│   └── invoices.xlsx      # Generated Excel file
│
|__ logs/
|
|
└── README.md              # Project documentation
git clone <repo_url>
cd mongo_excel_updaterpython -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windowspip install -r requirements.txtEdit config.py with your details:
MONGO_URI = "mongodb://username:password@host:port/db"
OUTPUT_CSV = "output/invoices.csv"
OUTPUT_XLSX = "output/invoices.xlsx"
UPDATE_INTERVAL = 10  # seconds between fetch cyclespython main.pyThe app will:
- Start cycle logging
- Fetch data from MongoDB
- Flatten invoices
- Save updated CSV/Excel files
- Repeat after every UPDATE_INTERVALseconds
Logs are printed to the console with details such as:
- Total documents fetched
- Processed, failed, and skipped invoices
- New documents/rows detected in each cycle
- File save status
- Success rate per cycle and overall
Example log:
--- CYCLE #1 - 12:36:13 ---
Fetched: 120 documents (Change: +5)
Processed: 100 | Failed: 2 | No Invoices: 18
Output: 250 rows (Change: +10)
File Status: [OK] Saved
--- SUMMARY ---
New Records: +5 documents | +10 rows
Duration: 4.20s | Success Rate: 97.6%
- Python 3.x
- pandas for data processing
- openpyxl for Excel writing
- pymongo for MongoDB connection