Skip to content

Cruelhelp/DataCleanerWeb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 

Repository files navigation

Data Cleaner (Web)

Data Cleaner (Web) is an offline HTML/JavaScript application for cleaning and standardising government V‑series Excel/CSV datasets. Drag and drop one or more files and let the app extract important metadata (folder names, file numbers, names, ministries/departments), normalise dates, and generate cleaned CSV output – all in your browser with no server required.

Features

Multi‑file upload: Add multiple Excel (.xlsx, .xlsm, .xls) or CSV files at once; they will be processed sequentially.

Path trimming options: Choose whether to trim all slashes from the file path or remove only the leading slash before cleaning.

Cleaning pipeline: Extract file numbers, names, and ministries/agencies, normalise dates to DD/MM/YYYY format, trim whitespace, and optionally remove banned words.

Customisable settings: Enable or disable whitespace trimming, date normalisation, and banned‑word removal from the Settings tab.

Excel‑style previews: Compare “Before” and “After” snapshots of the first 30 rows with row/column headers and gridlines. The preview pane auto‑fills when a file is added and updates automatically after cleaning.

Downloadable outputs: Save cleaned files individually or download all cleaned CSVs at once.

Themes: Switch between dark and light user interface themes; your preference is saved between sessions.

Change log & About: View a dated change log of new features and improvements, and see links to the web and desktop versions of the project.

Responsive design: The interface adapts to both desktop and mobile screens.

Usage

Open the datacleanerweb.html file in a modern web browser (Chrome, Edge, Firefox). No internet connection or server is required.

Click Add Files or drag and drop your Excel/CSV files into the sidebar.

In the Settings tab, choose your file path trimming strategy (trim all slashes or trim only the first) and configure optional cleaning steps.

Click Start Cleaning to process your files. Logs will appear in the Logs tab and cleaned files in the Output tab.

Use the Preview links to compare the original and cleaned data for each file, or use Download to save individual cleaned CSVs.

Click Download All to download all cleaned files at once.

Check the Change Log and About tabs for recent updates and project information.

Development

This project is built with plain HTML, CSS, and JavaScript. It relies on the following libraries:

Papa Parse – Fast CSV parser for the browser.

SheetJS (xlsx) – Reads Excel files in the browser.

Day.js – Lightweight date handling.

All functionality runs locally in your browser; no backend server is required. If you wish to turn this app into a desktop executable, you can wrap it with frameworks such as Electron or package a similar Python version using PyInstaller.

Contributing

Pull requests are welcome! Feel free to open issues or suggest enhancements. See the Change Log in the app for recent updates.

License

This project is licensed under the MIT License. See the LICENSE file for details.

© 2025 Ruel McNeil – All Rights Reserved.

About

This application automates the cleaning of government V‑series datasets." " It extracts information from document names, formats dates, derives metadata," " and produces cleaned CSV files.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages