## Case Study: Analyzing Anime Ratings in 2023

### Objectives
1. Determine the highest-rated anime shows of each genre. This can provide insights into the types of anime projects that a studio should work on in the future.
2. Explore how demographics such as gender and age affect anime ratings. This can yield actionable insights, such as targeted recommendations for anime creators and marketers.
3. Determine if there is a relationship between the number of episodes and an anime's rating.

### Source
The dataset is "Anime Dataset 2023", compiled by Sajid Uddin from the website myanimelist.net.

Source URL: https://www.kaggle.com/datasets/dbdmobile/myanimelist-dataset 

### Methodology

1. **Data Cleaning**: Ensure data is clean and consistent.
2. **Segmented Demographic Analysis**: Analyze scores by genre, gender, and age.
3. **Correlation Analysis**: Examine the relationship between number of episodes and scores.



### A. Data Cleaning

1. In the anime_dataset csv file, removed rows with missing values in the "anime_id" and "score" columns using the filter option in Excel.
2. In anime_dataset, split the "genre" column values into "main_genre", etc. in Excel.
3. In anime_dataset, split the "premiere date" column values into "year_premiered" and "season" in Excel.
4. In anime_dataset, filtered the "type" column and only retained the "TV" results for the number of episodes vs rating correlation analysis. 
5. In the user_details csv file, removed blank rows and invalid values from the "gender" and "year" (birthyear) columns using the filter option in Excel.
6. In user_details, verified that "rating" values fall within 1-10 using the filter option in Excel.
7. Removed duplicates from the "anime_dataset" and "user_details" tables in Microsoft SQL Server.
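
The Excel filtering and splitting steps above could also be scripted. The following is a minimal Python sketch using only the standard library; it is not the procedure actually used in this project. The sample rows, the comma-separated genre format, and the "season year" premiere format are assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample standing in for anime_dataset.csv; the real column
# names and value formats may differ.
RAW_CSV = """anime_id,anime_name,score,genre,premiere date,type
1,Show A,8.75,"Action, Sci-Fi",spring 1998,TV
2,Show B,,"Comedy",fall 2005,TV
3,Show C,7.10,"Drama, Romance",winter 2012,Movie
"""


def clean_rows(csv_text):
    """Drop rows without an id or score, then split genre and premiere."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Step 1: skip rows with a missing anime_id or score.
        if not row["anime_id"].strip() or not row["score"].strip():
            continue
        # Step 2: the first listed genre becomes main_genre.
        genres = [g.strip() for g in row["genre"].split(",")]
        row["main_genre"] = genres[0] if genres[0] else "UNKNOWN"
        # Step 3: split "season year" into two columns.
        season, _, year = row["premiere date"].partition(" ")
        row["season"] = season
        row["year_premiered"] = year
        cleaned.append(row)
    return cleaned


rows = clean_rows(RAW_CSV)
# Step 4: keep only TV entries for the episodes-vs-score analysis.
tv_rows = [r for r in rows if r["type"] == "TV"]
print([r["anime_name"] for r in tv_rows])
```

Scripting the steps this way keeps a record of each filter, which makes the cleaning easier to repeat if the source file is updated.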





In [None]:
-- Removed duplicates, saved result as anime_dataset_clean

SELECT DISTINCT anime_id
      ,anime_name
      ,score
      ,main_genre
      ,episodes
      ,year_premiered
      ,studio
      ,popularity
FROM [anime_project].[dbo].[anime_dataset];


-- Removed duplicates, saved result as user_details_clean

SELECT DISTINCT user_ID
      ,username
      ,gender
      ,year
FROM [anime_project].[dbo].[user_details];
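
For readers without a SQL Server instance, the effect of `SELECT DISTINCT` can be sketched in Python. This is an illustration under assumed sample data, not part of the original workflow; the rows and the `distinct_rows` helper are hypothetical.

```python
# Hypothetical rows standing in for user_details; the values are made up.
user_rows = [
    {"user_ID": 1, "username": "alice", "gender": "Female", "year": 1995},
    {"user_ID": 1, "username": "alice", "gender": "Female", "year": 1995},
    {"user_ID": 2, "username": "bob", "gender": "Male", "year": 2001},
]

COLUMNS = ("user_ID", "username", "gender", "year")


def distinct_rows(rows, columns):
    """Keep the first occurrence of each unique combination of columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


user_details_clean = distinct_rows(user_rows, COLUMNS)
print(len(user_rows), "->", len(user_details_clean))
```

Like `SELECT DISTINCT`, this compares every selected column, so two rows that share an ID but differ in any other selected column would both be kept.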



### B. Ratings per Genre

Compute the average score per genre for anime TV shows. 

This analysis will help in understanding the viewer preferences associated with different genres.


In [None]:
SELECT 
    main_genre,
    ROUND(AVG(score),2) AS average_score
FROM [anime_project].[dbo].[anime_dataset_clean]
WHERE 
    -- Removed invalid genres and UNKNOWN
    main_genre != 'Award Winning'
    AND main_genre != 'UNKNOWN'
GROUP BY
    main_genre
ORDER BY
    average_score DESC
    
    

Tableau viz URL: https://public.tableau.com/views/AverageScoreperAnimeGenreasof2023/Sheet1

![](https://i.imgur.com/SHnnHjt.png)
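
The grouping logic of the genre query can be sketched in Python as a quick cross-check. The records below are hypothetical and are not values from the dataset; `average_score_by_genre` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical (main_genre, score) pairs; not taken from the dataset.
anime = [
    ("Sports", 8.0), ("Sports", 7.5),
    ("Drama", 7.9),
    ("Award Winning", 8.8),
    ("UNKNOWN", 6.0),
]

# Same exclusions as the WHERE clause of the query.
EXCLUDED = {"Award Winning", "UNKNOWN"}


def average_score_by_genre(records, excluded=EXCLUDED):
    """Return [(genre, mean score)] sorted from highest to lowest mean."""
    scores = defaultdict(list)
    for genre, score in records:
        if genre not in excluded:
            scores[genre].append(score)
    averages = [(g, round(sum(v) / len(v), 2)) for g, v in scores.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)


result = average_score_by_genre(anime)
print(result)
```

Because each genre's values are collected in a list, the same structure could also report how many titles back each average, which helps when judging genres with few entries.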

#### Insights on Top Anime Genres by Average Score

##### Top 3 Results: Sports, Drama, and Gourmet

1. **Viewer Preferences:**
   - **Sports**: The high average score for sports-themed anime like "Haikyuu!!" and "Hajime No Ippo" suggests that viewers enjoy dynamic and competitive storylines. The excitement, teamwork, and personal growth often depicted in sports anime resonate well with audiences.
   - **Drama**: Drama's high ranking indicates that viewers appreciate emotionally engaging and character-driven stories. Drama anime like "Fruits Basket" often explore complex relationships and life challenges, offering depth and relatability.
   - **Gourmet**: The popularity of gourmet-themed anime shows like "Shokugeki no Souma" suggests that viewers are interested in culinary arts and enjoy watching food preparation and tasting. These shows often combine visual appeal with interesting narratives about cooking and food culture.

2. **Implications and Recommendations:**
   - **For Content Creators**:
     - **Leverage Popular Themes**: Given the success of sports, drama, and gourmet genres, anime makers should consider incorporating elements of these genres into new series. For example, a sports anime could include dramatic personal backstories, or a gourmet anime could feature competitive cooking battles.
     - **Character Development**: Focus on creating strong, relatable characters and emotionally engaging storylines. The success of drama anime highlights the importance of well-developed characters and narratives.
     - **Visual and Sensory Appeal**: In genres like gourmet, the visual representation of food is critical. Investing in high-quality animation to depict food can enhance viewer engagement. Similarly, visually dynamic sports scenes can attract sports anime fans.

By understanding and leveraging these insights, anime creators can cater to audience preferences and create content that is likely to be well-received, thereby increasing viewership and overall success.

### C. Gender Preferences

Find the average score per genre for anime TV shows per gender. 

This analysis helps identify which genres male viewers and female viewers rate most highly.




In [None]:
SELECT 
    user_details.gender AS user_gender,
    anime_data.main_genre AS anime_genre,
    ROUND(AVG(CAST(user_score.rating AS FLOAT)), 2) AS average_score 
FROM anime_project.dbo.user_details_clean AS user_details 
JOIN
    anime_project.dbo.user_score AS user_score 
    ON user_score.user_id = user_details.user_ID
JOIN
    anime_project.dbo.anime_dataset_clean AS anime_data 
    ON anime_data.anime_id = user_score.anime_id
GROUP BY
    user_details.gender,
    anime_data.main_genre
ORDER BY
    user_gender,
    average_score DESC


Tableau viz URL: https://public.tableau.com/views/MalesvsFemalesAverageScoreperAnimeGenreasof2023/Dashboard1

![](https://i.imgur.com/4BbAIvX.png)
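
The two inner joins and the grouping in the gender query can be sketched in Python. The miniature tables below are hypothetical, and the dictionary layout is an assumption chosen to keep the example short.

```python
from collections import defaultdict

# Hypothetical miniature versions of the three tables.
user_details = {1: "Female", 2: "Male"}          # user_ID -> gender
anime_data = {10: "Supernatural", 20: "Ecchi"}   # anime_id -> main_genre
user_score = [                                   # (user_id, anime_id, rating)
    (1, 10, 9), (1, 20, 4),
    (2, 10, 8), (2, 20, 5),
    (3, 10, 7),  # user 3 has no profile row, so the inner join drops it
]


def average_by_gender_and_genre(scores, users, anime):
    """Mimic the two inner joins, then average ratings per (gender, genre)."""
    buckets = defaultdict(list)
    for user_id, anime_id, rating in scores:
        if user_id in users and anime_id in anime:
            buckets[(users[user_id], anime[anime_id])].append(rating)
    return {key: round(sum(v) / len(v), 2) for key, v in buckets.items()}


averages = average_by_gender_and_genre(user_score, user_details, anime_data)
for (gender, genre), avg in sorted(averages.items()):
    print(gender, genre, avg)
```

The `CAST(... AS FLOAT)` in the SQL matters because `AVG` over an integer column in SQL Server returns an integer. Python's `/` operator always produces a float, so the sketch needs no equivalent cast.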

#### Insights on Anime Genre Ratings by Gender

##### High Ratings for "Supernatural" Genre

The analysis reveals that the "Supernatural" genre is the highest-rated among both male and female viewers. This indicates a strong preference for storylines that involve fantastical elements, mystical powers, and otherworldly settings. The universal appeal of supernatural themes might be attributed to their ability to captivate a broad audience by offering escapism and intriguing plot twists. For anime creators, focusing on well-developed supernatural themes could be a strategic move to attract and retain viewers across different demographics.

##### Low Ratings for "Ecchi" and "Boys Love" Genres

The data shows that the "Ecchi" genre is among the lowest rated by both males and females, while "Boys Love" is also notably low-rated among male viewers. These results suggest that explicit content and certain niche genres may not have widespread appeal. This insight can guide anime studios in understanding audience sensitivities and preferences. To cater to a broader audience, it might be beneficial to either avoid these genres or handle them with greater nuance and sensitivity.

##### Recommendations for Anime Creators

1. **Focus on Popular Genres**: Given the high ratings for the "Supernatural" genre, anime creators should consider investing more in developing high-quality supernatural series. Engaging storylines, unique characters, and innovative concepts within this genre can attract a larger audience base.

2. **Understanding Audience Preferences**: The low ratings for "Ecchi" and "Boys Love" highlight the importance of understanding and respecting audience preferences and sensitivities. Creators might explore these genres in more subtle ways or combine elements with other genres to make them more appealing.

3. **Diverse Content**: While certain genres are less popular, it's crucial to offer diverse content to cater to various audience segments. Balancing mainstream appeal with niche interests can help in building a loyal and diverse viewer base.

By focusing on these insights, anime creators can better align their content with viewer preferences, ultimately leading to higher satisfaction and engagement among their audience.

### D. Age Group Preferences

Determine the average score per genre for anime TV shows based on different age groups. 

This analysis will help reveal the preferences and differences in anime tastes among various age groups.

In [None]:
-- Categorize users into age groups based on year of birth
WITH age_groups AS (
    SELECT 
        user_ID,
        CASE 
            WHEN year < 1994 THEN 'thirties'
            WHEN year BETWEEN 1994 AND 2003 THEN 'twenties'
            WHEN year BETWEEN 2004 AND 2010 THEN 'teens'
            WHEN year BETWEEN 2011 AND 2016 THEN 'kids'
            ELSE 'unknown'
        END AS age_group
    FROM anime_project.dbo.user_details_clean
),
-- Joining the 3 tables
joined_tables AS (
    SELECT 
        anime_data.main_genre,
        age_groups.age_group,
        user_score.rating
    FROM 
        anime_project.dbo.user_score AS user_score
    JOIN 
        anime_project.dbo.anime_dataset_clean AS anime_data 
    ON user_score.anime_id = anime_data.anime_id
    JOIN 
        age_groups 
    ON user_score.user_id = age_groups.user_ID
),
-- Calculate the average score per genre within each age group and assign a rank
ranked_genres AS (
    SELECT 
        main_genre,
        age_group,
        ROUND(AVG(CAST(rating AS FLOAT)), 2) AS average_score,
        ROW_NUMBER() OVER (PARTITION BY age_group ORDER BY AVG(CAST(rating AS FLOAT)) DESC) AS rank
    FROM 
        joined_tables
    GROUP BY 
        main_genre,
        age_group
)
-- Keep only the top 5 genres within each age group
SELECT 
    main_genre,
    age_group,
    average_score
FROM 
    ranked_genres
WHERE 
    rank <= 5
ORDER BY 
    age_group, 
    average_score DESC;

Tableau viz URL: https://public.tableau.com/views/Top5GenresBasedonAverageScoreperAgeGroup/Dashboard1

![](https://i.imgur.com/EN2QZ5J.png)
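
The age bucketing and the top-N ranking can be sketched in Python to make the boundary years easy to test. The bucket edges mirror the `CASE` expression in the query; the sample ratings and the function names are hypothetical.

```python
from collections import defaultdict


def age_group(birth_year):
    """Map a birth year to the same buckets as the CASE expression."""
    if birth_year is None:
        return "unknown"
    if birth_year < 1994:
        return "thirties"
    if birth_year <= 2003:
        return "twenties"
    if birth_year <= 2010:
        return "teens"
    if birth_year <= 2016:
        return "kids"
    return "unknown"


def top_genres(ratings, n=5):
    """ratings: iterable of (birth_year, main_genre, rating) tuples."""
    buckets = defaultdict(list)
    for birth_year, genre, rating in ratings:
        buckets[(age_group(birth_year), genre)].append(rating)
    per_group = defaultdict(list)
    for (group, genre), values in buckets.items():
        per_group[group].append((genre, sum(values) / len(values)))
    # Sort each group's genres by mean rating and keep the first n.
    return {
        group: sorted(pairs, key=lambda p: p[1], reverse=True)[:n]
        for group, pairs in per_group.items()
    }


# Hypothetical ratings, for illustration only.
sample = [(2006, "Sports", 9), (2006, "Drama", 7), (1999, "Supernatural", 8)]
print(top_genres(sample, n=1))
```

As in the query, the "thirties" bucket holds every user born before 1994, so it also includes viewers older than 39.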

#### Insights on Top 5 Anime Genre Ratings per Age Group

##### Kids (Aged 12 and Below)
The top genre for kids being "Hentai" is highly unusual and suggests potential issues with data quality in the dataset. This result likely indicates either mislabeled data or inappropriate age categorization. Further validation of the dataset is necessary to ensure that age-appropriate content is correctly categorized and analyzed.

##### Teens (Aged 13 to 19)
The top genre for teens being "Sports" aligns well with the interests of this age group. Sports anime like "Haikyuu!!" and "Slam Dunk" often feature themes of teamwork, perseverance, and personal growth, which resonate strongly with teenagers. This genre’s popularity could be leveraged by anime creators to develop more content aimed at teenage audiences, emphasizing these universal themes.

##### Twenties and Thirties
For both the twenties and thirties age groups, "Supernatural" tops the list. The appeal of supernatural anime like "Death Note" likely stems from its diverse storytelling and imaginative elements that cater to an older audience looking for complex and engaging narratives. Anime creators can focus on producing high-quality supernatural content to maintain and grow their viewer base among adults.

##### Common Genre Across All Age Groups
The inclusion of "Suspense" in the top 5 genres for all age groups highlights its broad appeal. Suspenseful storylines, like in the anime "Gyakkyou Burai Kaiji", keep viewers engaged regardless of age, suggesting that well-crafted, thrilling narratives can attract a diverse audience. Anime creators should consider incorporating suspense elements to appeal to a wide demographic.

##### Recommendations
1. **Data Validation:** Verify the dataset to correct any misclassifications, especially ensuring age-appropriate content.
2. **Targeted Content:** Create more sports-themed anime for teens and supernatural-themed anime for adults in their twenties and thirties.
3. **Suspense Elements:** Incorporate suspense in various genres to attract a broader audience.

These insights can help anime studios tailor their content strategies to better meet the preferences of different age groups.

### E. Number of Episodes vs Anime Rating Correlation Analysis

Explore whether the length of an anime series correlates with its rating by plotting a smoothed trend line in R. This visual analysis helps reveal any trend, such as whether longer series tend to receive higher or lower ratings than shorter ones.

In [None]:
# loading packages (tidyverse already includes dplyr and ggplot2)

install.packages("tidyverse")
library(tidyverse)

# importing dataset

anime_dataframe <- read_csv("anime_dataset_clean.csv")

head(anime_dataframe)

# remove outliers: keep only anime shows with 25 or fewer total episodes

anime_dataframe_filtered <- anime_dataframe %>%
  filter(episodes <= 25)

# calculate the correlation coefficient

correlation_coefficient <- cor(anime_dataframe_filtered$episodes, anime_dataframe_filtered$score, use = "complete.obs")
print(correlation_coefficient)

# plot the relationship between number of episodes and score

ggplot(data=anime_dataframe_filtered) + 
  geom_smooth(mapping=aes(x=episodes, y=score),color="purple") +
  labs(title="Anime Number of Episodes vs Score", 
       caption="The dataset source is entitled 'Anime Dataset 2023' compiled by Sajid Uddin from the website myanimelist.net 
Source URL: https://www.kaggle.com/datasets/dbdmobile/myanimelist-dataset ") +
  annotate("text", x = Inf, y = Inf, label = paste("Correlation Coefficient:", round(correlation_coefficient, 2)), hjust = 1.1, vjust = 1.1)


![](https://i.imgur.com/onatmHA.png)
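
To show what the correlation coefficient measures, here is a minimal Python sketch of the Pearson formula. It drops incomplete pairs, similar in spirit to `use = "complete.obs"` in the R call. The sample values and the `pearson_r` helper are hypothetical and do not reproduce the project's result.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences, skipping None pairs."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (episodes, score) values; the pair with a missing score is dropped.
episodes = [12, 13, 24, 25, 12]
scores = [6.5, 7.0, 7.8, None, 6.9]
r = pearson_r(episodes, scores)
print(round(r, 2))
```

The result always lies between -1 and 1, with values near 0 indicating little linear relationship.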

#### Insights on the Correlation Analysis Between Anime Number of Episodes and Average Score

The smoothed line slopes upward from left to right, indicating a positive correlation: as the number of episodes increases, the score tends to increase.

However, the correlation coefficient of 0.3492048 is closer to 0 than to 1, which indicates only a weak to moderate positive relationship between the number of episodes and the score of anime shows.

##### Interpretation:

**Weak to Moderate Relationship:** A correlation coefficient of around 0.35 indicates some positive relationship, but not one strong enough to predict one variable from the other with high confidence.

**Positive Trend:** Although weak, the positive trend means that, among the shows analyzed (25 episodes or fewer), anime with more episodes tend to have slightly higher scores.
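
One way to make "weak to moderate" concrete is the coefficient of determination. In a simple linear fit, the square of the correlation coefficient is the share of variance in one variable that the other accounts for. The short Python sketch below applies this to the coefficient reported above; it is a supplementary calculation, not part of the original R analysis.

```python
r = 0.3492048  # correlation coefficient reported in this analysis
r_squared = r ** 2  # share of score variance a linear fit on episodes explains
print(f"r^2 = {r_squared:.3f}")
```

An r-squared of roughly 0.12 means that about 12% of the variation in score is accounted for by episode count in a linear fit, leaving most of it to other factors.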

##### Practical Implication:

**Influence:** The number of episodes is associated with the score to some degree, but correlation alone does not establish influence, and other factors likely play a significant role in determining an anime's score.

**Decision-Making:** Considering the number of episodes might be useful, but relying solely on this factor for decisions about anime ratings would be insufficient.

### Conclusion

**1. Insights on Top Anime Genres by Average Score**
The analysis reveals that the top three anime genres by average score in 2023 are Sports, Drama, and Gourmet. These genres likely resonate well with audiences due to their compelling narratives and character development. The strong showing of these genres suggests that viewers are highly engaged with emotionally driven and dynamic storylines.

**2. Insights on Anime Genre Ratings by Gender**
When analyzing ratings by gender, it was found that the "Supernatural" genre received high ratings from both males and females. Conversely, "Ecchi" and "Boys Love" genres received lower ratings. This indicates a preference for genres with imaginative and fantastical elements over those that may cater to more niche tastes.

**3. Insights on Top 5 Anime Genre Ratings per Age Group**
The analysis of anime genre ratings by age group shows that "Sports" is the top genre for teens, while "Supernatural" tops the list for both the twenties and thirties age groups. Additionally, the "Suspense" genre is consistently popular across all age groups, appearing in the top 5 for each. This suggests that while certain genres appeal to specific age groups, others have broad appeal across different demographics.

**4. Insights on the Correlation Analysis Between Anime Number of Episodes and Average Score**
The correlation analysis between the number of episodes and the average score of anime shows yielded a correlation coefficient of 0.3492048. This indicates a weak to moderate positive relationship: anime with more episodes tend to have slightly higher scores, but this factor alone does not strongly predict anime ratings. Other variables likely play a significant role in determining the success of an anime series.

Overall, these insights provide valuable information for anime producers and marketers to tailor their content to audience preferences and demographic trends, enhancing the appeal and success of future anime releases.
