## __Identifying and Defining__

__Data:__ I’m looking to analyse data on the ratings of Nintendo games over the last 20 years.\
__Goal:__ I want to find out if there is a major difference between critic and user ratings for games, and to determine if critics are biased towards certain genres or certain game series.\
__Source:__ https://www.kaggle.com/datasets/joebeachcapital/nintendo-games\
__Access:__ The data is publicly available on Kaggle.\
__Access Method:__ The data is contained within a csv file

### __Functional Requirements__

__Data Loading:__ The program must load a csv file and have a way to check that it is not being fed data from a different database or that it is not being sent data at all.\
__Description:__ Load the data from the csv file\
__Input:__ The Nintendo Dataset.\
__Output:__ The Nintendo Dataset is loaded into the program.

__Actor:__ User\
__Goal:__ To load the Nintendo Dataset into the system.\
__Preconditions:__ User has access to the Nintendo dataset.\
__Main Flow:__
1. User places the dataset for reading into the correct folder.
1. System validates the file format.
1. System loads the dataset and displays the information in a dataframe.

__Postconditions:__ Dataset is loaded and ready for cleaning.
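A minimal sketch of the loading flow above, assuming pandas is used for the dataframe. The function name `load_dataset` is hypothetical, and the expected column names are assumed from the "My Dataframe" table in this document rather than checked against the Kaggle file.

```python
from pathlib import Path

import pandas as pd

# Assumed column names, taken from the "My Dataframe" table in this document.
EXPECTED_COLUMNS = {"Title", "Platform", "Meta_Score", "User_Score", "Genres"}


def load_dataset(path):
    """Return the dataset as a dataframe, or None if the file cannot be used."""
    path = Path(path)
    # Validate the file format before trying to read it.
    if path.suffix.lower() != ".csv":
        print(f"Error: {path.name} is not a csv file.")
        return None
    try:
        df = pd.read_csv(path)
    except FileNotFoundError:
        print(f"Error: {path} was not found. Check that it is in the correct folder.")
        return None
    except pd.errors.EmptyDataError:
        print(f"Error: {path.name} contains no data.")
        return None
    # Guard against being fed a csv file from a different dataset.
    missing = EXPECTED_COLUMNS - set(df.columns)
    if missing:
        print(f"Error: {path.name} is missing columns: {sorted(missing)}")
        return None
    return df
```

Returning `None` instead of raising lets the calling code show a friendly message and ask the user to try again.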

__Data Cleaning:__ The program must clean the data by removing unnecessary columns and rows that have missing data, to avoid errors and make the data more readable.\
__Description:__ Remove all unnecessary and unhelpful information.\
__Input:__ The loaded Nintendo Dataset.\
__Output:__ A cleaned, usable dataset, ready for analysis.

__Actor:__ Programmer\
__Goal:__ To clean the Nintendo dataset to improve usability and accuracy.\
__Preconditions:__ User has access to the Nintendo dataset.\
__Main Flow:__
1. The program removes columns with unnecessary information.
1. Removes rows that are missing values.
1. Removes iOS titles.
1. Removes all third-party titles.

__Postconditions:__ Dataset is cleaned and ready for analysis.
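The first three cleaning steps above could be sketched as follows, assuming pandas and the column names from the "My Dataframe" table. The name `clean_dataset` is hypothetical. The third-party step is left out because this document does not say which column identifies the developer.

```python
import pandas as pd

# Assumed columns to keep, taken from the "My Dataframe" table in this document.
KEEP_COLUMNS = ["Title", "Platform", "Meta_Score", "User_Score", "Genres"]


def clean_dataset(raw):
    """Return a cleaned copy; the dataframe passed in is left unchanged."""
    # Work on a copy so the original dataset is never edited.
    df = raw[KEEP_COLUMNS].copy()
    # Remove rows that are missing any value.
    df = df.dropna()
    # Remove iOS titles (compared in upper case so 'iOS' and 'IOS' both match).
    df = df[df["Platform"].str.upper() != "IOS"]
    return df.reset_index(drop=True)
```

Copying before cleaning is one way to meet the reliability requirement that the original dataset is never modified.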

__Data Analysis:__ The program should determine the average critic score and the average user score for each genre.\
__Description:__ Averages for each category are determined and compared.\
__Input:__ The cleaned Nintendo Dataset.\
__Output:__ An analysed dataset, ready to be visualised.

__Actor:__ User\
__Goal:__ To analyse the cleaned Nintendo dataset.\
__Preconditions:__ User has access to the cleaned Nintendo dataset.\
__Main Flow:__
1. The system sorts the games by genre.
1. Calculates the average score of each genre for critics and users.
1. Compares the scores given by critics and users.

__Postconditions:__ Dataset is analysed and ready for display.
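One possible way to implement the analysis flow above is sketched below. It assumes the `Genres` column holds a string such as `"['Action', 'Metroidvania']"` (as it would after reading the csv file) and that the score columns are named as in the "My Dataframe" table. The function name and the `Difference` column are hypothetical.

```python
import ast

import pandas as pd


def average_scores_by_genre(df):
    """Return the mean critic and user score per genre, rounded to 0 dp."""
    # Turn each genre string into a real list, then give every genre its own row.
    parsed = df.assign(Genres=df["Genres"].apply(ast.literal_eval))
    exploded = parsed.explode("Genres")
    # Average both score columns within each genre.
    averages = exploded.groupby("Genres")[["Meta_Score", "User_Score"]].mean().round(0)
    # Positive values mean critics scored the genre higher than users.
    averages["Difference"] = averages["Meta_Score"] - averages["User_Score"]
    return averages
```

Note that a game listed under several genres is counted once in each of them, so a few highly rated multi-genre titles can raise several genre averages at once.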

__Data Visualisation:__ Charts that compare values from critics and users, and tables that show this data as well.\
__Description:__ Analysed data is displayed to the User.\
__Input:__ The analysed Nintendo Dataset.\
__Output:__ A visual to convey the insight gathered from the analysis of the Nintendo Dataset.

__Actor:__ User\
__Goal:__ To visualise the analysed Nintendo dataset.\
__Preconditions:__ User has access to the analysed Nintendo dataset.\
__Main Flow:__
1. The program imports matplotlib.
1. Creates graphs comparing the scores from critics and users.
1. Displays an accompanying table with values.

__Postconditions:__ Dataset is visualised for the user.
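The visualisation flow above might look like the sketch below. It assumes the analysed data is a dataframe indexed by genre with `Meta_Score` and `User_Score` columns; the function name `plot_genre_averages`, the labels and the figure size are illustrative choices, not taken from the actual program.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_genre_averages(averages):
    """Draw a grouped bar chart of critic and user averages for each genre."""
    fig, ax = plt.subplots(figsize=(10, 5))
    # Rename the columns so the legend reads naturally.
    labelled = averages.rename(columns={"Meta_Score": "Critics", "User_Score": "Users"})
    labelled[["Critics", "Users"]].plot.bar(ax=ax)
    ax.set_xlabel("Genre")
    ax.set_ylabel("Average score")
    ax.set_title("Average critic and user scores by genre")
    fig.tight_layout()
    return fig
```

Calling `plt.show()` after the function displays the chart, and `print(averages)` gives the accompanying table of values.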

__Data Reporting:__ The system should output the processed data into a csv file.\
__Description:__ Processed data is stored in a folder on the user's computer.\
__Input:__ The cleaned Nintendo Dataset and the analysed Nintendo Dataset.\
__Output:__ A folder containing all important information.

__Actor:__ User\
__Goal:__ To record the cleaned and analysed Nintendo datasets.\
__Preconditions:__ User has access to the cleaned Nintendo dataset.\
__Main Flow:__
1. The program places the cleaned and analysed datasets into the data folder.
1. Adds the matplotlib graph files as well.

__Postconditions:__ All files are stored in a new folder for the user.
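A short sketch of the reporting flow above, assuming the cleaned and analysed data are pandas dataframes and the graph is a matplotlib figure. The function name, the default folder name and the file names are all hypothetical.

```python
from pathlib import Path


def save_report(cleaned, averages, figure=None, folder="data"):
    """Write the cleaned data, the genre averages and (optionally) the chart."""
    out = Path(folder)
    # Create the output folder if it does not exist yet.
    out.mkdir(parents=True, exist_ok=True)
    cleaned.to_csv(out / "cleaned_dataset.csv", index=False)
    # Keep the index here because it holds the genre names.
    averages.to_csv(out / "genre_averages.csv")
    if figure is not None:
        figure.savefig(out / "genre_averages.png")
    return out
```

Writing to a separate folder means the original csv file is never overwritten.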


### __Non-Functional Requirements__

__Usability:__ What is required from the User Interface and a 'README' document?
        The ‘README’ document must explain what a User has to do with the dataset to get the program working. The User Interface should be simple, clear and easy to understand, while providing a user with the expected output.

__Reliability:__ What is required from the system when providing information to the user on errors and ensuring data integrity?
        The system should have a fail-safe that detects errors and tells the user when an error is detected. The system should never make edits to the original dataset, to ensure the integrity of the dataframe.


## __Researching and Planning__


__Purpose:__ The purpose of analysing the ratings of Nintendo games from Users and Critics over the past 20 years is to determine if there is any bias towards genre in either category and if Critics in general give higher ratings.

__Missing Data:__ Currently, any bias from critics or users is not shown. Discovering this potential bias gives a greater understanding of the quality of video games.

__Stakeholders:__ Consumers will benefit from this information as it will allow them to determine if a game is actually worth buying, in spite of reviews.

__Use:__ Consumers will be able to identify bias towards certain genres, gain a deeper understanding of how higher ratings don’t necessarily indicate a better game, and be educated on potential bias towards future games.

### __Privacy and Security__

- __Data Privacy of Source:__ Kaggle needs to protect the information on their site from potential data hacks to ensure data accuracy, and safety. Kaggle must also ensure that a user's data is not leaked while using the site and that download files are not malicious software.

- __Application Data Privacy:__ There is no personal data contained within the data I have sourced and there is no way to identify an individual from the data I am using alone. If I were to release this application to the general public I would have a responsibility to ensure that user data is not leaked while using the application and that the data is protected and cannot be tampered with.

- __Cyber Security:__ An application should use secure data encryption (which means that the data is scrambled so that it cannot be read without a key), and its data should be backed up on a regular basis so that it can be restored if it is corrupted. User authentication (the process of verifying that a person is who they claim to be, for example with a password or PIN) should be used to ensure that confidentiality and user data are protected. Password hashing (the process of turning a password into an unrecognisable string before it is stored) should also be used to prevent passwords being stolen in the event of a data breach.
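To illustrate the password hashing idea described above, here is a minimal sketch using only Python's standard library. This is not part of my program (which stores no passwords); the function names and the iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import secrets


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest) for a password using PBKDF2 with SHA-256."""
    if salt is None:
        # A random salt means two users with the same password get different hashes.
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=200_000):
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt, iterations)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and digest would be stored, so a data breach does not directly reveal the original passwords.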


## __My Dataframe__

|Field|Datatype|Format of Display|Description|Example|Validation|
|---|---|---|---|---|---|
|Title|Object|XX...XX|Title of the game|Super Mario Odyssey|Can be any number of characters and can contain numbers, but can't contain 'Wave ' or 'Edition'|
|Platform|Object|XX...XX|The name of the platform that the game released on|Switch|Must be 5 characters or shorter. Can contain numbers, but no special characters|
|Meta_Score|float64|NN.0|The rating a game got from critics|84.0|Must be a whole number ending with a .0|
|User_Score|float64|NN.0|The rating a game got from users|84.0|Must be a whole number ending with a .0|
|Genres|Object|['XX...XX','XX...XX']|The genres of the game|['Action','Metroidvania']|Must be in the format ['XX...XX','XX...XX'] and can't contain numbers|
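Some of the validation rules in the table can be checked in code. The sketch below covers the `Title`, score and `Genres` rules only, and assumes `Genres` is stored as a string. The function name `validate_row` and the wording of the messages are hypothetical.

```python
import ast


def validate_row(row):
    """Return a list of problems found in one row (an empty list means valid)."""
    problems = []
    title = str(row["Title"])
    if "Wave " in title or "Edition" in title:
        problems.append("Title contains 'Wave ' or 'Edition'")
    for column in ("Meta_Score", "User_Score"):
        try:
            # is_integer() is True for values such as 84.0 and False for 84.5 or NaN.
            whole = float(row[column]).is_integer()
        except (TypeError, ValueError):
            whole = False
        if not whole:
            problems.append(f"{column} must be a whole number ending with .0")
    try:
        genres = ast.literal_eval(row["Genres"])
    except (TypeError, ValueError, SyntaxError):
        genres = None
    # Every genre must be text with no digits in it.
    valid_genres = isinstance(genres, list) and all(
        isinstance(g, str) and not any(ch.isdigit() for ch in g) for g in genres
    )
    if not valid_genres:
        problems.append("Genres must be a list of names without numbers")
    return problems
```

The function accepts anything that supports `row["Title"]`-style access, so it works on a plain dictionary or on one row of a dataframe.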


## __Evaluation__
__Data Visualisation:__

<img src="images/Figure_1.png" alt="Visualised Data" width="400"/>

__Calculations:__ I used the `.mean` method to calculate the averages of the ratings for different genres of games. I then used `.round` to round all values to 0 dp.\
__Accuracy:__ The information is accurate, as I removed all titles that were classified as DLC or released on iOS, which means that only real games were considered in the averages. However, because many of the 'biggest' titles are classified under many genres and generally have the highest scores, the values for each genre were likely inflated.\
__Conclusions:__ The data showed that critics and users gave similar scores to games across the board. This differed from what I was expecting, but made it evident that users and critics generally perceived game quality the same way. However, the data did show that both critics and users had a strong positive bias towards fantasy and strategy games and a strong negative bias towards party and simulation games. Perhaps this can be explained by the fact that there is a much lower quantity of party and simulation games, meaning that there were fewer chances for 'great games' to be a part of these genres, ultimately bringing scores down. Fantasy and strategy games were also the most common types of games, as many games had one of these listed among their genres. Overall, the program gave some insight into which genres are generally rated higher and had some useful functions, which made it a decent tool despite the potential bias created by games with multiple genres.
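One way to check the explanation above would be to count how many games fall under each genre. The sketch below assumes pandas and a `Genres` column stored as a string such as `"['Party']"`; the function name `games_per_genre` is hypothetical and this check is not part of my current program.

```python
import ast

import pandas as pd


def games_per_genre(df):
    """Return the number of games listed under each genre, largest first."""
    # Parse each genre string into a list, then give every genre its own row.
    genres = df["Genres"].apply(ast.literal_eval).explode()
    return genres.value_counts()
```

Genres with only a handful of games have averages that depend heavily on one or two titles, so small counts would support treating those averages with caution.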

### __Peer Evaluation__
__Bodhi:__ Rufus' program completes his functional requirements and runs properly, as long as the correct file is placed in the correct spot. It worked properly and had a wide range of useful features for the user, which all worked properly in the program.\
__Evaluation:__ According to Bodhi's evaluation, my program fully meets my functional and non-functional requirements. Bodhi states that my program had a 'wide range of useful features', which shows how effective my program is in assisting the user. I somewhat agree with this, but believe that my program could still be improved in many ways to make information more accurate and helpful for the user.

__Yyoung:__ Rufus' program fulfills his functional requirements. It supported a lot of different features that are helpful to users who want to analyse and view the data. His program also fulfilled his mentioned non-functional requirements.\
__Evaluation:__ Yyoung essentially said the same as Bodhi in regard to my program's effectiveness in relation to my functional and non-functional requirements.

### __Evaluation of Final Product__
#### __Functional Requirements__

In relation to data loading, my final product is successful. In my functional requirements it is stated that there should be a fail-safe that prevents a user from getting an error while loading. This was accomplished by using a try and except statement, which means the user gets an error message from the program itself rather than a crash, resulting in a more user-friendly program.

In relation to data cleaning, my final product was a complete success, as it removes all unnecessary columns and rows with missing data from the dataframe, exactly as requested in my functional requirements, resulting in a more readable and usable piece of data for analysis.

In relation to data analysis, my final product was somewhat successful. This is because averages are calculated for each genre as requested in my functional requirements, but due to an oversight in how a game's genre is selected, some bias was created, causing the analysis to be less impactful than I first thought. Despite this, the program still provides some insight.

In relation to data visualisation, my product was a complete success. In my functional requirements I state that I should create a chart that compares values from critics and users. My program does this, leading to a successful and easily readable visualisation that improves the overall quality of the program.

In relation to data reporting, my product is successful, as I save the files that I said I would to the user's computer when they quit, ensuring that the user has easy access to relevant information provided by this program and enhancing the overall experience.

#### __Non-Functional Requirements__

In relation to usability, my product is successful. This is because a README is provided that tells the user what they have to do to get the program working, and because my program has a simple user interface, which results in an easy-to-use and helpful program for the user that is also simple to understand.

In relation to reliability, my product is incredibly successful, as it checks often for errors when loading through the use of try and except statements. This creates a program that rarely has errors, which makes the program reliable for the user.

#### __Evaluation of Project Management__

In relation to project management, I was mostly successful. I put a lot of effort and rigorous testing into this task: I committed many hours to it, and I committed my changes on GitHub often to explain the changes I made to my code. Because of the time I put into this task, I was able to overcome many challenges, including losing about 4-5 hours of progress due to errors with GitHub, and problems in my code that required me to look around online and determine the best way forward with my program. This resulted in an easy-to-use and helpful tool that provides some insight into the scores given to Nintendo games of different genres.
However, one area where I did struggle was time management. If I had used my time more effectively in class in the first week or two of this project, I likely could have saved myself several hours of work from home, leading me to swear off talking in class when starting a new project.

