Full-stack web application that ingests Google Spreadsheets, validates and normalizes data, stores it in MongoDB, and provides a React UI to manage parsed datasets.
Access the application: https://spreadsheet-parser-production.up.railway.app
Note: The application is deployed on Railway. Make sure all environment variables are configured correctly for production use.
The main interface showing the spreadsheet URL input form and ingestion summary with statistics.
The dataset management interface displaying the list of ingested datasets and the detailed data viewer with all parsed rows and columns.
- Ingest Google Sheets - Parse and import data from Google Spreadsheets
- Data Validation - Automatic validation and normalization of spreadsheet data
- MongoDB Storage - Persistent storage of parsed datasets
- Dataset Management - View, download, and delete parsed datasets
- CSV Export - Download datasets as CSV files
- Modern UI - Clean and intuitive React interface
- Backend (`/backend`): Express API, Google Sheets client, parser/validator, MongoDB persistence (Mongoose), CSV exporter.
- Frontend (`/frontend`): React + Vite single-page UI for ingestion, listing datasets, viewing rows, downloading CSV, deleting datasets.
- Database: MongoDB (local or Atlas). Each ingest is stored as a document containing headers, rows, and parsing summary/logs.
- Node.js 18+
- npm 9+
- MongoDB 6+ (local `mongod` or Atlas connection string)
- Google Cloud project with Sheets API enabled and a service account that has access to your copied spreadsheet.
- Open the provided sample sheet and choose `File → Make a copy`.
- Share your copy with the Google service account email (from the credentials you create below) as a Viewer.
- Grab the URL of your copy and keep it handy for testing.
Example test sheet URL (replace with your copy):
https://docs.google.com/spreadsheets/d/YOUR_COPY_ID/edit#gid=0
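A URL in this format carries the spreadsheet ID in its path and, optionally, a worksheet `gid` in the fragment. The following is a minimal sketch of how those parts could be extracted; `parseSheetUrl` is a hypothetical helper and not necessarily how the backend does it.

```javascript
// Hypothetical helper: pull the spreadsheet ID and optional gid out of a Google Sheets URL.
function parseSheetUrl(sheetUrl) {
  let url;
  try {
    url = new URL(sheetUrl);
  } catch {
    return null; // not a parseable URL
  }
  if (url.hostname !== 'docs.google.com') return null;

  // The ID is the path segment that follows /spreadsheets/d/
  const idMatch = url.pathname.match(/^\/spreadsheets\/d\/([A-Za-z0-9_-]+)/);
  if (!idMatch) return null;

  // gid may appear in the fragment (#gid=0) or in the query string (?gid=0)
  const gidMatch = (url.hash + url.search).match(/gid=(\d+)/);
  return {
    spreadsheetId: idMatch[1],
    gid: gidMatch ? Number(gidMatch[1]) : null,
  };
}
```

Returning `null` for anything that is not a Google Sheets URL lets the caller reject bad input before making any API call.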
Create /backend/.env (based on backend/env.example):
PORT=4000
MONGODB_URI=mongodb://localhost:27017/spreadsheet_parser
GOOGLE_SERVICE_ACCOUNT_EMAIL=your-service-account@project.iam.gserviceaccount.com
GOOGLE_PRIVATE_KEY="-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n"
GOOGLE_SHEETS_SCOPES=https://www.googleapis.com/auth/spreadsheets.readonly
ALLOWED_ORIGIN=http://localhost:5173
- Service account: In the Google Cloud console, create credentials → Service Account → add Sheets API role (or basic). Download the JSON key and copy `client_email` and `private_key`. Store the private key exactly as shown above (escaped newlines).
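Because the key is stored with escaped newlines, it usually has to be converted back to real line breaks before it is handed to an auth client. A minimal sketch, assuming the value is read from `process.env.GOOGLE_PRIVATE_KEY`; `normalizePrivateKey` is a hypothetical name, not a function known to exist in this repository.

```javascript
// Hypothetical helper: turn the escaped "\n" sequences from .env into real newlines.
function normalizePrivateKey(rawKey) {
  if (typeof rawKey !== 'string' || rawKey.length === 0) {
    throw new Error('GOOGLE_PRIVATE_KEY is missing');
  }
  // Strip surrounding double quotes if the env loader kept them
  const unquoted = rawKey.replace(/^"|"$/g, '');
  // Replace each literal backslash-n pair with an actual newline character
  return unquoted.replace(/\\n/g, '\n');
}

// Usage (assumption): const privateKey = normalizePrivateKey(process.env.GOOGLE_PRIVATE_KEY);
```

A key that already contains real newlines passes through unchanged, so the helper is safe to call regardless of how the hosting platform stores the value.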
Create /frontend/.env:
VITE_API_BASE_URL=http://localhost:4000
Note: In production, the frontend uses relative URLs (same origin), so this is only needed for local development with separate servers.
cd backend
npm install
npm run dev
Server runs on http://localhost:4000.
cd frontend
npm install
npm run dev
Vite dev server runs on http://localhost:5173.
# Install all dependencies
npm run install:all
# Build frontend
npm run build
# Start backend (serves both API and frontend)
npm start
- Open your web browser
- Navigate to: https://spreadsheet-parser-production.up.railway.app
- Create a Google Spreadsheet or use an existing one
- Ensure the first row contains column headers
- Fill in your data rows below the headers
- Share the spreadsheet with your Google Service Account email (set in environment variables) as a Viewer
- On the homepage, you'll see an input field for the Google Sheet URL
- Copy the Google Spreadsheet URL (format: `https://docs.google.com/spreadsheets/d/SPREADSHEET_ID/edit#gid=0`)
- Paste the URL into the input field
- Click the "Ingest Spreadsheet" button
- Wait for the processing to complete
After ingestion, you'll see:
- Summary Card showing:
  - Total rows read
  - Rows successfully inserted
  - Rows skipped (if any)
  - Skip reasons (if applicable)
- Dataset List showing all ingested spreadsheets
For each dataset, you can:
- View - Click to see the full dataset with all rows and columns
- Download CSV - Export the dataset as a CSV file
- Delete - Remove the dataset from the database
- Click on any dataset in the list
- View the complete data in a table format
- See headers and all parsed rows
- Navigate back to the dataset list using the back button
- Open the React UI.
- Paste your copied Google Sheet URL and click Ingest Spreadsheet.
- The backend fetches the first worksheet, validates headers, coerces values (numbers/dates), skips empty rows, logs issues, and stores everything in MongoDB.
- The UI refreshes, showing:
  - Summary of the ingest (rows read/inserted/skipped, warnings/logs).
  - Table of datasets with view / CSV download / delete actions.
- Dataset viewer renders rows as a table.
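The validate, coerce, and skip steps in this flow can be sketched as follows. This is an illustration only: `coerceValue`, `parseSheet`, the result shape, and the exact coercion rules are assumptions, and the real parser may differ. It assumes rows arrive as an array of arrays, the shape the Sheets API returns from `spreadsheets.values.get`.

```javascript
// Sketch only: function names, result shape, and coercion rules are assumptions.
function coerceValue(raw) {
  const text = String(raw ?? '').trim();
  if (text === '') return null;
  // Plain integers or decimals become numbers
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text);
  // ISO-style dates (YYYY-MM-DD) become Date objects
  if (/^\d{4}-\d{2}-\d{2}$/.test(text)) {
    const date = new Date(`${text}T00:00:00Z`);
    if (!Number.isNaN(date.getTime())) return date;
  }
  return text;
}

function parseSheet(values) {
  const [headerRow = [], ...dataRows] = values;
  const headers = headerRow.map((h) => String(h ?? '').trim());

  // Reject empty or duplicate headers before touching any data
  if (headers.length === 0 || headers.some((h) => h === '')) {
    throw new Error('Empty header found in first row');
  }
  if (new Set(headers).size !== headers.length) {
    throw new Error('Duplicate headers found in first row');
  }

  const rows = [];
  const logs = [];
  dataRows.forEach((row, index) => {
    const cells = headers.map((_, i) => coerceValue(row[i]));
    if (cells.every((cell) => cell === null)) {
      // +2: one for the header row, one for 1-based sheet numbering
      logs.push({ row: index + 2, reason: 'empty row' });
      return;
    }
    rows.push(Object.fromEntries(headers.map((h, i) => [h, cells[i]])));
  });

  return {
    headers,
    rows,
    summary: { read: dataRows.length, inserted: rows.length, skipped: logs.length },
    logs,
  };
}
```

Failing fast on header problems while only logging empty rows mirrors the behavior described above: a bad header row blocks the ingest, whereas an empty data row is skipped and reported in the summary.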
| Method | Endpoint | Description |
|---|---|---|
| GET | `/health` | Health check endpoint |
| POST | `/api/spreadsheets` | Ingest spreadsheet by URL |
| GET | `/api/spreadsheets` | List stored datasets |
| GET | `/api/spreadsheets/:id` | Retrieve dataset (headers + rows + summary) |
| GET | `/api/spreadsheets/:id/csv` | Download dataset as CSV |
| DELETE | `/api/spreadsheets/:id` | Delete dataset |
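For the CSV endpoint, the serialization step might look like the sketch below. `escapeCsvField` and `toCsv` are hypothetical names, and the input shape (a header array plus row objects keyed by header) is an assumption. The quoting rules follow RFC 4180.

```javascript
// Sketch only: toCsv and its argument shape are hypothetical.
function escapeCsvField(value) {
  if (value === null || value === undefined) return '';
  const text = value instanceof Date ? value.toISOString() : String(value);
  // Quote fields containing a comma, quote, or line break; double any inner quotes
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

function toCsv(headers, rows) {
  const lines = [headers.map(escapeCsvField).join(',')];
  for (const row of rows) {
    lines.push(headers.map((h) => escapeCsvField(row[h])).join(','));
  }
  // RFC 4180 uses CRLF line endings
  return lines.join('\r\n') + '\r\n';
}
```

Iterating over `headers` rather than each row's own keys keeps the column order stable even when a row is missing a value.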
- Start MongoDB (`mongod`) locally or ensure the Atlas cluster is reachable.
- Run the backend and frontend dev servers.
- In the UI, ingest your sheet URL.
- Confirm:
  - Summary shows expected counts/logs.
  - Dataset list refreshes with new entry.
  - Viewer renders rows with correct headers.
  - CSV download returns well-formed file.
  - Delete action removes dataset.
- Ensure the service account email has Viewer access to your copied sheet
- Verify that `GOOGLE_PRIVATE_KEY` retains newline escapes (`\n` in the env file)
- Check that the Google Sheets API is enabled in your Google Cloud project
- Duplicate headers / empty header: Fix the sheet's first row, then re-run ingest
- Missing data: Check that rows aren't completely empty (they'll be skipped)
- CORS issues: Set `ALLOWED_ORIGIN` to your frontend origin (e.g., `http://localhost:5173`)
- Mongo connection failures:
  - Verify the MongoDB URI
  - Ensure MongoDB is running (for local) or the Atlas cluster is accessible
  - Whitelist IP addresses in MongoDB Atlas Network Access settings
  - For Railway deployment, use `0.0.0.0/0` to allow all IPs
- Environment variables not working: Ensure variables are set at the service level in Railway, not project level
- Frontend not loading: Check that `npm run build` completed successfully and `frontend/dist` exists
- Port issues: Railway automatically sets the `PORT` environment variable
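Several of the problems above come down to a missing or malformed environment variable. A startup check along these lines can surface them early. The variable names come from the backend `.env` example; which of them are strictly required, and the helper names `findMissingEnv` and `resolvePort`, are assumptions.

```javascript
// Names come from the backend .env example; the helpers themselves are hypothetical.
const REQUIRED_ENV = [
  'MONGODB_URI',
  'GOOGLE_SERVICE_ACCOUNT_EMAIL',
  'GOOGLE_PRIVATE_KEY',
];

// Return the names of required variables that are unset or blank
function findMissingEnv(env = process.env) {
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === '');
}

// Use the platform-provided PORT when valid, otherwise the local default
function resolvePort(env = process.env, fallback = 4000) {
  const port = Number.parseInt(env.PORT ?? '', 10);
  return Number.isInteger(port) && port > 0 ? port : fallback;
}
```

Calling `findMissingEnv()` before connecting to MongoDB and exiting with the list of missing names makes a misconfigured Railway service fail with a readable message rather than a connection timeout.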
spreadsheet-parser/
├── backend/
│   ├── src/
│   │   ├── controllers/     # Request handlers
│   │   ├── infrastructure/  # Database connection
│   │   ├── models/          # Mongoose models
│   │   ├── routes/          # Express routes
│   │   ├── services/        # Business logic (Google Sheets, parser)
│   │   └── index.js         # Express app entry point
│   └── package.json
├── frontend/
│   ├── src/
│   │   ├── App.jsx          # Main React component
│   │   ├── main.jsx         # React entry point
│   │   └── styles.css       # Application styles
│   ├── index.html
│   └── package.json
├── package.json             # Root package.json (monorepo)
├── railway.toml             # Railway deployment config
└── README.md
This project is licensed under the MIT License.
MIT License
Copyright (c) 2025 Reyden Jenn Cagata
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Contributions are welcome! Please feel free to submit a Pull Request.
For issues, questions, or contributions, please open an issue on the repository.
Made with ❤️ using Node.js, Express, React, and MongoDB