A machine learning project that uses Kaggle's games and game sales dataset to forecast the potential sales of a given example game from a handful of key features.

PKrystian/DataSetGames_Project

Machine Learning Game Sales Project

Games Dataset - kaggle link


Description:

Using Kaggle's dataset of games and game sales, the program predicts how many copies a given example game could sell, based on a few important features.

Operation scheme

First, after importing the data from the CSV file, we analyze the dataset. Here are the 5 best-selling games in our dataset:
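As a rough sketch of this first step, the snippet below reads a CSV with pandas and picks the five rows with the highest sales. The column names (`Rank`, `Name`, `Global_Sales` and so on), the helper names, and the tiny inline CSV are assumptions made for illustration; they are not taken from the project's code or from the real dataset.

```python
import io

import pandas as pd

# Invented stand-in for the Kaggle CSV; rows and column names are illustrative only.
CSV_TEXT = """Rank,Name,Platform,Year,Genre,Global_Sales
1,Game A,PlatX,2006,Sports,80.0
2,Game B,PlatY,1985,Platform,40.0
3,Game C,PlatX,2008,Racing,35.0
4,Game D,PlatX,2009,Sports,33.0
5,Game E,PlatZ,1996,Role-Playing,31.0
6,Game F,PlatZ,1989,Puzzle,30.0
7,Game G,PlatY,2006,Platform,29.0
"""


def load_games(source):
    """Read the games CSV from a file path or a file-like object."""
    return pd.read_csv(source)


def top_sellers(df, n=5, sales_col="Global_Sales"):
    """Return the n rows with the largest value in the sales column."""
    return df.nlargest(n, sales_col)


games = load_games(io.StringIO(CSV_TEXT))
top5 = top_sellers(games)
print(top5[["Name", "Global_Sales"]])
```

With the real file, `load_games` would be called with the path to the downloaded CSV instead of the in-memory text.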

Next, we check for irregularities with a box plot:

We can see that before 1995 there are only isolated game records; we remove them later, after the analysis.
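The same observation can be checked numerically rather than visually. The sketch below counts records per release year and lists the years that have at most one record. The `Year` column name, the helper names, and the sample values are assumptions for illustration, not the project's data.

```python
import pandas as pd

# Invented release years standing in for the dataset's year column.
years = pd.Series(
    [1983, 1989, 1996, 1996, 2005, 2005, 2005, 2006, 2006], name="Year"
)


def records_per_year(year_col):
    """Count how many records fall in each year, in chronological order."""
    return year_col.value_counts().sort_index()


def sparse_years(year_col, max_records=1):
    """List the years that have at most max_records records."""
    counts = records_per_year(year_col)
    return counts[counts <= max_records].index.tolist()


counts = records_per_year(years)
isolated = sparse_years(years)
print(counts)
print("Years with at most one record:", isolated)
```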

Here is the correlation heatmap of the columns. It shows that the columns are only weakly correlated with one another.
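The numbers behind such a heatmap come from a correlation matrix. The following is a minimal sketch using pandas, assuming the numeric columns are named `Year`, `NA_Sales` and `Global_Sales`; those names and all values are made up for the example.

```python
import pandas as pd

# Invented toy frame; column names and values are assumptions for illustration.
toy = pd.DataFrame(
    {
        "Year": [1996, 2001, 2005, 2008, 2010, 2013],
        "NA_Sales": [1.2, 0.4, 3.1, 0.9, 2.2, 0.3],
        "Global_Sales": [2.5, 1.1, 6.0, 1.5, 4.8, 0.7],
        "Genre": ["Sports", "Racing", "Sports", "Puzzle", "Racing", "Puzzle"],
    }
)


def correlation_matrix(df):
    """Pearson correlation between the numeric columns only."""
    return df.select_dtypes(include="number").corr()


corr = correlation_matrix(toy)

# Absolute correlation of every other numeric column with the sales target.
sales_corr = corr["Global_Sales"].drop("Global_Sales").abs()
print(corr)
print(sales_corr.sort_values(ascending=False))
```

The resulting `corr` frame is the kind of input a plotting function such as `seaborn.heatmap` accepts.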

After analyzing the dataset, we start removing unnecessary information. First we rename the columns for readability, then we drop the columns we do not need, e.g. Name and Rank. Next we delete individual records, including games released before 1995. Finally, we check how the box plot looks after cleaning.
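The cleaning steps described above could look roughly like this in pandas. The original and renamed column names, the `clean_games` helper, and the sample rows are all hypothetical; only the overall sequence (rename, drop `Name` and `Rank`, filter out games before 1995) follows the text.

```python
import pandas as pd

# Invented rows; column names are assumptions for illustration.
raw = pd.DataFrame(
    {
        "Rank": [1, 2, 3, 4],
        "Name": ["Game A", "Game B", "Game C", "Game D"],
        "Year": [1985, 1996, 2006, 1989],
        "Genre": ["Platform", "Racing", "Sports", "Puzzle"],
        "Global_Sales": [40.0, 31.0, 80.0, 30.0],
    }
)


def clean_games(df, min_year=1995):
    """Rename columns, drop identifier columns, and keep games from min_year on."""
    out = df.rename(columns={"Global_Sales": "Global Sales"})
    out = out.drop(columns=["Rank", "Name"])
    out = out[out["Year"] >= min_year]
    return out.reset_index(drop=True)


cleaned = clean_games(raw)
print(cleaned)
```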

Now we encode the categorical columns with a one-hot encoder function. Here is what our dataset looks like now.
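To illustrate what one-hot encoding does to the table, here is a small sketch that uses `pandas.get_dummies`. This is a stand-in chosen for brevity; the project may use a different encoder, and the column names and rows below are invented.

```python
import pandas as pd

# Invented cleaned rows; column names are assumptions for illustration.
cleaned = pd.DataFrame(
    {
        "Platform": ["PlatX", "PlatY", "PlatX"],
        "Genre": ["Sports", "Racing", "Puzzle"],
        "Year": [2006, 2008, 1996],
        "Global Sales": [5.0, 3.0, 1.0],
    }
)


def one_hot_encode(df, categorical_cols):
    """Replace each categorical column with one 0/1 column per category."""
    return pd.get_dummies(df, columns=categorical_cols, dtype=int)


encoded = one_hot_encode(cleaned, ["Platform", "Genre"])
print(encoded.columns.tolist())
print(encoded)
```

Each row ends up with exactly one `1` among the columns generated for a given categorical feature, while the numeric columns pass through unchanged.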

After cleaning the data, we can start training the model. We chose PCA, which reduces the data to its most important components. Here is what our weight graph looks like.
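A minimal sketch of the PCA step with scikit-learn is shown below. The random feature matrix, the choice of two components, and the standardization step before PCA are assumptions for the example, not details confirmed by the project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Invented feature matrix: 50 rows, 4 features.
# The third feature is made nearly identical to the first on purpose.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=50)


def fit_pca(features, n_components=2):
    """Standardize the features, then project them onto the top components."""
    scaled = StandardScaler().fit_transform(features)
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(scaled)
    return pca, reduced


pca, reduced = fit_pca(X)

# Share of the total variance captured by each kept component.
print("Explained variance ratio:", pca.explained_variance_ratio_)
# Each row holds the weights of the original features in one component.
print("Component weights:")
print(pca.components_)
```

The rows of `pca.components_` are the per-feature weights that a weight graph like the one mentioned above would typically plot.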

The weights show a low degree of importance, which means that our data is not correlated strongly enough with Global sales. After training the model, we obtained a score of: 0.0422979797979798

Conclusion:

Based on the results of the trained models, we can conclude that our dataset contains too little information that correlates with sales. Possible improvements include widening the range of the data or adding new columns that correlate more strongly with sales (e.g. game budget).
