In this project, I used a sample JSON sales dataset from Kaggle: Sales JSON Dataset. The data was imported into MongoDB for analysis. The key steps in this project are:
- Querying the MongoDB database to calculate key performance indicators (KPIs). This step was essential for validating the accuracy of the data imported into Tableau for visualization.
- Exporting the sales document from MongoDB in JSON format.
- Converting the JSON file to CSV format for compatibility with Tableau.
- Importing the CSV file into Tableau to create a comprehensive dashboard.
You can view and interact with the Tableau dashboard here.
- Obtain the dataset from Kaggle.
- Install Mongo Shell: Follow the instructions here: Mongo Shell Installation.
- Create Database and Collection:
  - Open Mongo Shell.
  - Create and switch to your database:
        use mydatabase
    Replace mydatabase with your desired database name. This command switches to the database; MongoDB creates it when data is first stored in it.
  - Import your JSON file into the database. Run this command from your system terminal, not from inside Mongo Shell:
        mongoimport --db mydatabase --collection mycollection --file data.json
    Replace mydatabase and mycollection with your database and collection names. A collection is created when data is first inserted. If the file holds a single JSON array rather than one document per line, add the --jsonArray flag.
- Set up VSCode: If not installed, download here: VSCode Download.
- Set up and navigate to project directory:
      mkdir myproject  # This creates the directory
      cd myproject     # This navigates to the directory
- Create virtual environment:
      python3 -m venv myenv  # Replace myenv with your desired virtual environment name
- Activate your virtual environment:
      source myenv/bin/activate
- Install requisite libraries from requirements file:
      pip install -r requirements.txt
- Connect to Database: Follow instructions in the db_operations file.
- Calculate KPIs: Refer to the kpi_calculations file.
- Convert JSON to CSV: Use the json_to_csv file to transform BSON data for use in Tableau.
- Import the CSV file into Tableau for data visualization.
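To give a feel for the KPI step, here is a minimal sketch of one KPI, revenue per region. It is not the contents of the kpi_calculations file. The field names (region, quantity, unit_price) and the sample documents are assumptions made for illustration; the real dataset may use different names. The pipeline list shows how the same grouping might be expressed for pymongo's collection.aggregate(), while the plain-Python function computes the same totals without a database connection.

```python
from collections import defaultdict

# Hypothetical sample documents; the real dataset's fields may differ.
sales = [
    {"_id": 1, "region": "East", "quantity": 2, "unit_price": 10.0},
    {"_id": 2, "region": "West", "quantity": 1, "unit_price": 25.0},
    {"_id": 3, "region": "East", "quantity": 3, "unit_price": 5.0},
]

# Aggregation pipeline that could be passed to collection.aggregate()
# in pymongo (field names are assumptions, as noted above).
pipeline = [
    {
        "$group": {
            "_id": "$region",
            "revenue": {"$sum": {"$multiply": ["$quantity", "$unit_price"]}},
        }
    },
]


def revenue_by_region(docs):
    """Sum quantity * unit_price per region, mirroring the pipeline above."""
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["region"]] += doc["quantity"] * doc["unit_price"]
    return dict(totals)


kpis = revenue_by_region(sales)
print(kpis)
```

Keeping a plain-Python version next to the pipeline makes it possible to cross-check the database result on a small sample before comparing against Tableau.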
Challenge: Working with a JSON file in Tableau as a beginner in data analysis.
- Initial Approach: Directly importing the JSON file into Tableau resulted in incorrect data mapping due to the flattening process.
- Solution: Flatten the database data into a CSV file before importing it into Tableau. This ensured accurate data representation.
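The flattening described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's json_to_csv file: the function names, the dotted column naming, and the sample documents are all hypothetical. Nested objects become dotted column names, and lists are kept as JSON strings so that each document stays on one CSV row.

```python
import csv
import io
import json


def flatten(doc, parent_key="", sep="."):
    """Flatten nested dicts into a single-level dict with dotted keys."""
    flat = {}
    for key, value in doc.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        elif isinstance(value, list):
            # Keep lists as a JSON string so the row count stays unchanged.
            flat[full_key] = json.dumps(value)
        else:
            flat[full_key] = value
    return flat


def documents_to_csv(docs, out):
    """Write flattened documents to a file-like object as CSV."""
    rows = [flatten(d) for d in docs]
    # Union of all keys, in first-seen order, so no column is dropped.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)  # missing keys are written as empty cells


# Hypothetical sample documents for illustration only.
sample = [
    {"_id": 1, "customer": {"name": "Ada", "city": "Paris"}, "total": 20.0},
    {"_id": 2, "customer": {"name": "Bob"}, "total": 25.0, "tags": ["gift"]},
]
buffer = io.StringIO()
documents_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write a real file, open it with open("sales.csv", "w", newline="") and pass the file object instead of the StringIO buffer; the file name here is a placeholder.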
To validate the data in Tableau:
- Comparison with KPIs: Validate the Tableau data by comparing it with pre-calculated KPIs from the database.
- Alternative Method: Pick IDs from visible rows in Tableau and query the same IDs in the database to compare results.
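The ID-based spot check above could be automated roughly as follows. This is a sketch under assumptions: the function name, the _id and total fields, and the sample data are hypothetical, and the source documents are passed in as plain dicts rather than fetched from MongoDB. Values are compared as strings, so a difference in number formatting (for example 20 versus 20.0) would be reported as a mismatch.

```python
import csv
import io


def find_mismatches(csv_text, source_docs, id_field, fields):
    """Compare CSV rows against source documents that share the same ID.

    Returns a list of (id, field, csv_value, source_value) tuples.
    """
    by_id = {str(doc[id_field]): doc for doc in source_docs}
    mismatches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = by_id.get(row[id_field])
        if doc is None:
            # The row's ID does not exist in the source documents.
            mismatches.append((row[id_field], id_field, row[id_field], None))
            continue
        for field in fields:
            if row[field] != str(doc[field]):
                mismatches.append(
                    (row[id_field], field, row[field], str(doc[field]))
                )
    return mismatches


# Hypothetical data: the second CSV row deliberately disagrees with the source.
csv_text = "_id,total\n1,20.0\n2,99.0\n"
source_docs = [{"_id": 1, "total": 20.0}, {"_id": 2, "total": 25.0}]

result = find_mismatches(csv_text, source_docs, "_id", ["total"])
print(result)
```

An empty list means every checked field matched for the sampled IDs.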
Together, these checks help confirm that the data shown in Tableau matches the data stored in MongoDB.
