This repository contains resources, including Jupyter Notebook files, for exploratory data analysis on a dataset, prepared for the Programming for Engineers college course.
This repository contains code for analyzing a dataset covering the full red wine inventory of a beverage manufacturing company. The dataset is provided as a CSV file named data.csv.
The objective of this project is to perform exploratory data analysis on the provided dataset and present the findings to investors at the end-of-year meeting.
Clone the repository:
git clone https://github.com/AWESOME04/SENG-207-Course-Project-1.git
Install the required dependencies:
pip install pandas matplotlib
The data analysis process involves the following steps:
Pre-processing the data to remove null values and duplicates
Summarizing the entire dataset using descriptive statistics and graphical representations
Summarizing individual columns in the dataset using descriptive statistics and graphical representations
Uncovering hidden relationships using correlation analysis between columns
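The steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the code from the notebooks: the column names (quantity, price, alcohol) and the inline sample rows are assumptions standing in for data.csv, whose actual schema is not described here, and the helper names preprocess, summarize, and correlations are hypothetical.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; the real column names and values may differ.
SAMPLE_CSV = """quantity,price,alcohol
100,12.5,9.4
100,12.5,9.4
150,,9.8
200,15.0,10.5
250,17.5,11.0
"""


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows containing null values, then drop exact duplicate rows."""
    return df.dropna().drop_duplicates().reset_index(drop=True)


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Descriptive statistics (count, mean, std, quartiles) for numeric columns."""
    return df.describe()


def correlations(df: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Pearson correlation between the numeric columns."""
    return df.select_dtypes("number").corr()


# For the real dataset, replace the StringIO source with pd.read_csv("data.csv").
raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
clean = preprocess(raw)
stats = summarize(clean)
corr = correlations(clean)

print(stats)
print(corr)
```

For the graphical representations, calling clean.hist() or clean.plot() on the cleaned frame is one option; both rely on matplotlib being installed.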
The results of the data analysis are presented in the presentation.pdf file in this repository. The presentation contains descriptive statistics, graphical representations, and correlation analysis of the dataset.
Based on the results of the data analysis, we can conclude that the inventory of red wine is generally stable throughout the year with some seasonal variations. There is also a positive correlation between the quantity of red wine in stock and the region where it is stored.
Evans Acheampong (@AWESOME04)