
## ***Functional Requirements***
* **Data Loading:**
* Description: The program should load .csv files, detect errors in them and clearly inform the user when an error is found
* Input: The user accidentally inputs a dataset that does not include genres
* Output: The program rejects the file and displays an error message
#
* **Data Cleaning:** 
* Description: The program will remove all unneeded columns
* Input: The user will input a dataset into the system
* Output: The program will detect the columns and remove the unneeded ones
#
* **Data Analysis:**
* Description: The program will filter out any song that isn’t metal or one of its subgenres
* Input: The user will input a dataset into the program
* Output: The program will select any songs that include metal or one of its subgenres and create a new dataset that only includes those songs
#
* **Data Visualisation:** 
* Description: The program should display the information in both a Pandas dataframe and matplotlib graphs.
* Input: The user will ask to display the popularity of metal songs over time
* Output: The program will analyse the data and display the popularity of metal songs over time as a matplotlib bar graph
#
* **Data Reporting:** 
* Description: The program should give a clear confirmation of a successful or unsuccessful analysis and visualisation.
* Input: The user will run the program and interact with the user interface
* Output: The program will respond with a confirmation message along with the desired output 

## ***Use Cases***
*Data Loading*
* **Actor:** User
* **Goal:** To load a dataset into the program
* **Preconditions:** User has a dataset with the correct format ready to load.
* **Main Flow:**
    * User places the dataset for reading into the correct folder
    * System validates the file format as .csv
    * System loads the dataset and displays the information in a dataframe
* **Postconditions:** Dataset is loaded and ready for analysis.
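
The loading flow above can be sketched as follows. This is a minimal illustration, not the program's actual code: the function name `load_dataset` and the error messages are hypothetical, and the required column names are assumed from the Data Dictionary.

```python
from pathlib import Path

import pandas as pd

# Column names assumed from the Data Dictionary.
REQUIRED_COLUMNS = ["track", "artist", "genre", "popularity", "duration", "year"]


def load_dataset(path):
    """Load a .csv file, raising ValueError with a readable message on failure."""
    path = Path(path)
    # Validate the file format before trying to read it.
    if path.suffix.lower() != ".csv":
        raise ValueError(f"Expected a .csv file, got '{path.name}'")
    if not path.exists():
        raise ValueError(f"File not found: '{path}'")

    df = pd.read_csv(path)
    # Reject datasets that lack a needed column, for example one without genres.
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Dataset is missing required columns: {', '.join(missing)}")
    return df
```

The calling code would catch the `ValueError` and show its message to the user rather than letting the program crash.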
#
*Data Cleaning*
* **Actor:** Program
* **Goal:** To clean the dataset provided
* **Preconditions:** User has already loaded a dataset correctly
* **Main Flow:**
    * Program briefly analyses the dataset to ensure that there are the needed columns
    * Program selects the columns that are needed for the data analysis
    * Program creates a new dataframe with only those columns inside
* **Postconditions:** Dataset is cleaned and ready to be analysed
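
A minimal sketch of the cleaning step, assuming the needed columns are the six listed in the Data Dictionary. The name `clean_dataset` and the constant `NEEDED_COLUMNS` are illustrative, not taken from the program.

```python
import pandas as pd

# Columns assumed to be needed for the analysis (taken from the Data Dictionary).
NEEDED_COLUMNS = ["track", "artist", "genre", "popularity", "duration", "year"]


def clean_dataset(df):
    """Return a new DataFrame containing only the needed columns."""
    # Check that every needed column is present before selecting.
    missing = [col for col in NEEDED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Cannot clean dataset, missing columns: {missing}")
    # .copy() gives an independent DataFrame, leaving the original untouched.
    return df[NEEDED_COLUMNS].copy()
```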
#
*Data Analysis*
* **Actor:** Program
* **Goal:** To filter the dataset and remove all songs that are not metal or one of its subgenres
* **Preconditions:** Program has cleaned the dataset
* **Main Flow:**
    * Program selects all rows with one of the selected genres
    * Program creates a new dataframe
    * Program moves all of the selected rows to the new dataframe
    * Program loads and displays new dataset, with filtered results
* **Postconditions:** Dataset is loaded and displays the filtered information
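
One way to sketch the filtering step is a case-insensitive text match on the `genre` column. This assumes that every metal subgenre contains the word "metal" in its name (for example "nu metal"); subgenres named differently would need an explicit list instead. The function name `filter_metal` is hypothetical.

```python
import pandas as pd


def filter_metal(df, keyword="metal"):
    """Return a new DataFrame with only the rows whose genre mentions the keyword."""
    # case=False matches "Metal" as well as "nu metal";
    # na=False treats rows with a missing genre as non-matches.
    is_metal = df["genre"].str.contains(keyword, case=False, na=False, regex=False)
    # Boolean indexing builds a new DataFrame; the index is renumbered from 0.
    return df[is_metal].reset_index(drop=True)
```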
#
*Data Visualisation*
* **Actor:** Program
* **Goal:** To appropriately visualise the data.
* **Preconditions**: User has loaded a dataset into the program
* **Main Flow:**
    * Program validates the file format as .csv
    * Program loads the dataset and displays the information in a Pandas dataframe
    * Program uses data from the dataset to display the information in a matplotlib graph
* **Postconditions:** Dataset is loaded, displayed in a Pandas dataframe and displayed in a matplotlib graph
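
The graphing step could look something like the sketch below. It assumes that "popularity over time" means the mean `popularity` of the songs released in each `year`; the function name `plot_popularity_by_year` and the axis labels are illustrative.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_popularity_by_year(df):
    """Draw a bar graph of the mean popularity of the songs released each year."""
    # One value per release year: the average popularity of that year's songs.
    yearly = df.groupby("year")["popularity"].mean()
    fig, ax = plt.subplots()
    ax.bar(yearly.index, yearly.values)
    ax.set_xlabel("Release year")
    ax.set_ylabel("Mean popularity (0-100)")
    ax.set_title("Popularity of metal songs over time")
    return ax
```

In a script, calling `plt.show()` after the function displays the graph; in a notebook the figure normally appears on its own.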

## ***Non-Functional Requirements***
* **Usability**:
    * User Interface:
        * The user interface should be as simple to use and navigate as possible for newer users, while also allowing for more experienced users to filter data into more specific samples
    * README document:
        * The README document should clearly and concisely describe the purpose of the program, as well as explain how to set up and use the program properly
#
* **Reliability**:
    * Reliability depends on ensuring data integrity, for which there are five principles (commonly known as ALCOA):
        * Attributable: The data can be traced back to whoever generated it
        * Legible: The data can be easily read and understood
        * Contemporaneous: The data is recorded at the time it is generated
        * Original: The data is the original record, not a copy
        * Accurate: The data is correct and free from errors
#
*Data Reporting*
* **Actor**: Program
* **Goal**: To clearly communicate a successful process.
* **Preconditions**: User has input a command into the system.
* **Main Flow**:
    * Program executes command
    * Program confirms that the command has been successfully executed
    * Program outputs text as confirmation that the command has been executed successfully
* **Postconditions**: System has successfully executed a command and given visual confirmation of success
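
A minimal sketch of how this confirmation could be produced, also covering the unsuccessful case named in the functional requirements. The helper name `run_with_confirmation` and the "SUCCESS"/"FAILED" wording are assumptions made for illustration.

```python
def run_with_confirmation(description, action, *args):
    """Run one step and return (result, message) describing success or failure."""
    try:
        result = action(*args)
    except Exception as error:
        # Report the failure as text instead of letting the program crash.
        message = f"FAILED: {description} ({error})"
        print(message)
        return None, message
    message = f"SUCCESS: {description}"
    print(message)
    return result, message
```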
---

## ***Research of Issue***
*Purpose*
* The purpose of this project is to find out how popular the metal music genre is by analysing data using a tool called Pandas. Pandas is a software library in Python that helps us manage and analyse data. By using Pandas to organise and examine this data, we can identify trends and understand how popular metal music is compared to other genres or over time. The goal is to use data to get a clear picture of metal's popularity.
#
*Missing Data*
* This analysis is necessary as it provides a clear picture of how interests within the music community have changed over time, and of how different types of songs have become more popular even though they share a genre with other songs. Disappointingly, I have not been able to find data at this scale from before the 2000s.
#
*Stakeholders*
* Beneficiaries of this information could be musicians looking to take their music to the next level, as well as people who have a keen interest in music.
#
*Use*
* The data can be used to track trends, which allows professional musicians to suit their work to current trends and helps music fans discover artists who would otherwise stay underground.



## ***Privacy and Security***
* My data is sourced from Kaggle, a website for sharing datasets and raw information. The user who collected the data I am using is responsible for ensuring that it was gathered in an appropriate and legal manner, such as confirming that artists consented to and acknowledged that this data could be used in this way, and that the owners of the songs listed can have their songs removed without question. Data that should be protected includes the names of the artists and the names of Spotify users listed as listeners of the songs.
#
*Application of Privacy*
* My responsibility is very similar: ensuring that the data privacy rights of the artists of the songs are respected. If this program were to be published to GitHub or any other public site, I would need to respect the rights of the original creator of the data by crediting them as the original source and allowing them to have the data they collected removed from my program if they want.
#
*Cyber Security*
* A program that is open to the public should include at least one user authentication process.
Common methods include:
    * Passwords/PINs and strong password rules
    * Two-factor authentication
    * Biometrics
* A program may also include measures that apply after the password is entered, such as:
    * Encryption
    * Password Hashing
    * Web Application Firewalls
* ***User Authentication***
User authentication in computing is the process of verifying someone's identity when they try to access a computer system, website, or app. This is usually done by asking the person to enter a username and password. If the information matches what the system has on record, the person is allowed in. This process helps keep information safe by making sure only the right people can access certain parts of a system.
* ***Password Hashing***
Password hashing is a process in computing where a password is turned into a random-looking string of characters using a special algorithm. This makes it difficult for anyone to see or steal the original password. When you enter your password, the system hashes it and checks it against the stored hashed version. If they match, you get access. Hashing helps keep your password safe, even if someone gets hold of the stored data, because they can't easily reverse the hash to find out your actual password.
* ***Encryption***
Encryption in computing is the process of converting information or data into a secret code, so that only people with the correct key or password can read it. It’s like locking up your data in a safe, and only someone with the right key can open it and see what’s inside. This helps keep information secure from hackers or anyone who shouldn’t have access to it.
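
The password hashing process described above can be sketched with Python's standard library. This is an illustration under stated assumptions, not part of the project's program: the function names are hypothetical, the iteration count is an example value, and the random salt (extra random data mixed into each hash so that identical passwords produce different results) is a standard addition that the description above does not mention.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=600_000):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt for each password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password, salt, expected_digest, iterations=600_000):
    """Re-hash the attempt with the stored salt and compare it to the stored hash."""
    _, digest = hash_password(password, salt, iterations)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and the digest would be stored; the original password is never saved.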

## ***Data Dictionary***
| Field | Datatype | Format for Display | Description | Example | Validation |
|-------|----------|--------------------|-------------|---------|------------|
| track | object | XX..XX | Name of the song | Chop Suey! | Any number of characters in the modern English alphabet |
| artist | object | XX..XX | Name of the artist who made the song | System Of A Down | Any number of characters in the modern English alphabet |
| genre | object | XX..XX | The top genre of the song | Metal | Any number of characters in the modern English alphabet |
| popularity | int64 | NN..NN | The popularity of the song out of 100 | 79 | Whole number up to 100 |
| duration | int64 | NN..NN | The duration of the song in milliseconds | 185587 | Any positive whole number |
| year | int64 | NN..NN | The release year of the song | 2024 | Must be between 2000 and 2024 |
#
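
The numeric validation rules in the Data Dictionary could be checked along these lines. This is a sketch only: the function name `validate_rows` is hypothetical, a lower bound of 0 for `popularity` is an assumption (the table only gives the upper limit), and the text fields are not checked.

```python
import pandas as pd


def validate_rows(df):
    """Return a boolean Series that is True for rows meeting the numeric rules."""
    # between() includes both end points by default.
    popularity_ok = df["popularity"].between(0, 100)  # lower bound of 0 is assumed
    duration_ok = df["duration"] > 0
    year_ok = df["year"].between(2000, 2024)
    return popularity_ok & duration_ok & year_ok
```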
---

Popularity of metal songs over time

![image.png](attachment:image.png)

Amount of metal songs made per year

![image-2.png](attachment:image-2.png)
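
The yearly figures behind charts like the two above could be computed as in the sketch below. It is not the code that produced the images: it assumes one row per song, the column names from the Data Dictionary, and that yearly popularity is measured as a mean. The name `yearly_summary` is illustrative.

```python
import pandas as pd


def yearly_summary(df):
    """Return one row per release year with the song count and mean popularity."""
    return df.groupby("year").agg(
        songs=("track", "count"),
        mean_popularity=("popularity", "mean"),
    )
```

Putting both measures in one table makes it easier to compare the number of songs released with their popularity for the same year.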

## ***Conclusion***

From this data, we can conclude that the number of metal songs produced each year has steadily increased over time, indicating that more artists and bands are contributing to the genre. However, despite this growth in production, the overall popularity of metal music has declined over the years. This decline suggests that even though more metal music is being made, it hasn't been connecting with a broader audience as strongly as it might have in the past.

Interestingly, in 2023, there appears to be a notable shift in this trend. After years of declining popularity, metal music seems to have experienced a resurgence, gaining more attention and possibly reaching new or returning audiences. This could be due to a variety of factors, such as a renewed interest in the genre, successful releases by popular bands, or even cultural shifts that have brought metal back into the spotlight. Overall, while metal music faced a period of decline in popularity, the data suggests that 2023 marks a potential turning point, indicating a revival of interest in the genre.

## ***Evaluation***

Throughout this project there have been many ups and downs, and the results themselves have both upsides and an unfortunate downside. While writing the code for this project I carried out thorough testing and debugging. A major obstacle was how quickly the pandas API has changed: many tutorials are now out of date, so they gave little useful information and cost time, but I overcame this with time and effort. I feel that I worked effectively for the majority of the time provided to complete this project; however, some parts were left quite late due to poor time management.