This Jupyter notebook cleans and preprocesses menu performance data from GoBiz and Grab Merchant invoices. It covers the following steps:
- Remove Unnecessary Words: Cleansing the dataset of superfluous words that add noise rather than value to the analysis.
- Replace Misspelled Words: Correcting common spelling errors to ensure consistency and accuracy in the data. For instance, correcting "chiken" to "chicken" in menu item descriptions.
- Date Formatting: Standardizing date formats across the dataset to facilitate time series analysis and reporting.
- Remove Unnecessary Columns: Identifying and eliminating columns that do not contribute to the analysis objectives, such as redundant information or irrelevant details, to streamline the dataset.
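The four steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`menu_item`, `order_date`, `quantity`, `internal_note`), the noise-word list, the day-first date format, and the lowercasing of item names are all assumptions made for the example. Only the "chiken" to "chicken" correction comes from the description above.

```python
import pandas as pd

# Hypothetical sample data; real GoBiz / Grab Merchant exports may use
# different column names and date formats.
raw = pd.DataFrame({
    "menu_item": ["Promo Chiken Rice", "chiken satay (NEW)", "Iced Tea"],
    "order_date": ["01/02/2024", "15/02/2024", "03/03/2024"],
    "quantity": [3, 5, 2],
    "internal_note": ["x", "y", "z"],
})

NOISE_WORDS = ["promo", "(new)"]         # assumed examples of unnecessary words
SPELLING_FIXES = {"chiken": "chicken"}   # misspelling taken from the description
DROP_COLUMNS = ["internal_note"]         # assumed example of an irrelevant column


def clean_menu_data(df):
    """Return a cleaned copy of df; the input frame is left unchanged."""
    out = df.copy()

    # Lowercase first so that matching is case-insensitive (an assumption).
    names = out["menu_item"].str.lower()

    # Remove unnecessary words (literal matches, not regular expressions).
    for word in NOISE_WORDS:
        names = names.str.replace(word, "", regex=False)

    # Replace misspelled words.
    for wrong, right in SPELLING_FIXES.items():
        names = names.str.replace(wrong, right, regex=False)

    # Collapse the whitespace left behind by the removals.
    out["menu_item"] = names.str.replace(r"\s+", " ", regex=True).str.strip()

    # Standardize dates: parse day/month/year strings into datetime values.
    out["order_date"] = pd.to_datetime(out["order_date"], format="%d/%m/%Y")

    # Remove unnecessary columns; ignore any that are already absent.
    return out.drop(columns=DROP_COLUMNS, errors="ignore")


cleaned = clean_menu_data(raw)
print(cleaned)
```

Keeping the word lists and column names in constants at the top makes it easy to adjust them once the real invoice columns are known.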
To run this notebook, you will need to have Python installed on your system, along with Jupyter Notebook or JupyterLab. The analysis relies on several Python libraries, including pandas, numpy, and matplotlib.
- Python (>=3.7)
- pandas
- numpy
- matplotlib
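One way to confirm these requirements before opening the notebook is a short check script. The sketch below uses only the standard library; the helper names `missing_packages` and `python_is_supported` are made up for this example and are not part of the notebook.

```python
import sys
from importlib.util import find_spec

REQUIRED_PACKAGES = ["pandas", "numpy", "matplotlib"]
MIN_PYTHON = (3, 7)


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


def python_is_supported(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


print("Python version OK:", python_is_supported())
print("Missing packages:", missing_packages(REQUIRED_PACKAGES))
```

Any package reported as missing can be installed with `pip install` followed by the package name.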
- Check Python Installation:
  - Verify your Python installation by running `python --version` in your terminal or command prompt. If Python is not installed, download it from python.org.
- Install Jupyter:
  - Install Jupyter Notebook by running the command `pip install notebook`.
- Prepare the Environment:
  - Clone this repository or download the notebook to your local machine.
- Navigate to Your Notebook:
  - Open your terminal or command prompt and navigate to the directory containing the notebook.
- Start Jupyter Notebook or JupyterLab:
  - Run the command `jupyter notebook` or `jupyter lab`, depending on which application you're using.
- Open the Notebook:
  - Navigate to the notebook file in the Jupyter interface and open it to start your data cleaning process.