# Assignment 4

### Randomization testing

### Dataset

A game enthusiast claims that Xbox 360 games are better than PS2 games. He randomly samples games from both consoles and collects the review scores that the sampled games received on this [website](https://www.ign.com/reviews/games). The dataset he collected contains the game title, platform, review score, genre, and release date of each game.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

sns.set_style("whitegrid")

### Exercise 1
Our game enthusiast wants to test whether Xbox 360 games receive a higher review score than PS2 games. 

1. Write down the null and alternative hypotheses.
2. Load the data and print out the head.
3. To run the statistical tests and make plots, create two new variables that contain the data for the PS2 and Xbox 360 datasets. Look at your code from the previous weeks to remind you how to do that. 
4. Before doing a statistical test, it is always a good idea to have a look at your data. Choose an appropriate plot to visualize the data. Show the data for both consoles separately. 
5. Does the plotted data suggest that the hypothesis of our game enthusiast may be right?
6. Do these data meet the normality assumptions required for a t-test?<div style='text-align: right;'>**7 points**</div>
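
As a supplement to item 6, a visual check can be backed up with numbers. The sketch below is only an illustration on synthetic data: the `scores` array and the seed are invented for the example and are not the assignment's dataset. It computes the sample skewness and runs a Shapiro-Wilk test, two common (but not the only) ways to probe normality.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic left-skewed "scores", invented for illustration only.
scores = 10 - rng.exponential(scale=1.5, size=200)

# Sample skewness: negative values indicate a longer left tail.
skewness = stats.skew(scores)

# Shapiro-Wilk test: a small p-value is evidence against normality.
shapiro_stat, shapiro_p = stats.shapiro(scores)

print(f"skewness = {skewness:.2f}, Shapiro-Wilk p = {shapiro_p:.3g}")
```

With real data, the same two calls can be applied to each console's scores separately.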

In [None]:
## your code/answer here
def print_question(question_number, sep_line_width = 60):
    print(f"Question {question_number}")
    print(sep_line_width * "=")

print_question(1)
print("H0: The mean review score is equal for Xbox 360 and PS2 games.")
print("H1: The mean review score is higher for Xbox 360 games.")

In [None]:
## your code/answer here
print_question(2)
dataframe = pd.read_csv('PS2_vs_Xbox360.csv')

dataframe.head()

In [None]:
## your code/answer here
print_question(3)
ps2_scores = dataframe[dataframe["platform"] == "PlayStation 2"]["score"]
print("PlayStation 2 scores:")
print(ps2_scores.head(), "\n")

x360_scores = dataframe[dataframe["platform"] == "Xbox 360"]["score"]
print("Xbox 360 scores:")
print(x360_scores.head())

In [None]:
## your code/answer here
print_question(4)

plt.figure(figsize=(10,6))
sns.histplot(ps2_scores, color='skyblue', label='PS2')
sns.histplot(x360_scores, color='lightcoral', label='Xbox 360')
plt.title('Distribution of Review Scores for PS2 and Xbox 360 Games')
plt.xlabel('Review Score')
plt.ylabel('Count')  # histplot shows counts by default, not densities
plt.legend()
plt.show()

plt.figure(figsize=(10,6))
sns.boxplot(x='platform', y='score', data=dataframe, hue='platform', palette={'PlayStation 2': 'skyblue', 'Xbox 360': 'lightcoral'})
plt.title('Boxplot of Review Scores by Platform')
plt.xlabel('Platform')
plt.ylabel('Review Score')
plt.show()


In [None]:
## your code/answer here
print_question(5)
print("The histogram shows that the distribution for the Xbox 360 scores is shifted slightly to the right.")
print("The boxplot shows that the median score for Xbox 360 is slightly higher than PS2.\n")
print("So, yes, the data seems to suggest that the hypothesis of the game enthusiast may be right.")

print("\n")
print_question(6)
print("When looking at the histogram, we can see that the distributions are skewed to the left.")
print("Something similar can be seen in the boxplot, where the median is not centered in the box.")
print("So, no, the data does not meet the normality assumptions required for a t-test.\n")

ps2_skew = ps2_scores.skew()
x360_skew = x360_scores.skew()
print("PS2 skew: ", ps2_skew)
print("Xbox 360 skew: ", x360_skew) # if value is further away from 0 then the data is more skewed

### Exercise 2

Non-parametric testing is a great alternative for datasets such as these. In this workgroup you will learn how you can use randomization to do hypothesis testing. Before we can do the randomization test, we will need to prepare some variables. Because the data are left-skewed, we will use the median instead of the mean to compare the two consoles. Do the following things:

 1. Calculate the difference in median score of both consoles, and store it in an appropriately named variable.
 2. Define how many randomization iterations you want to compute (usually 1000 up to 10000), and create an empty numpy array of this size.
 3. Now create a for-loop that loops $n$ times (the number you defined in item 2). Inside the loop, randomly assign each score to a console. Each console should have as many scores as it originally had. You can do this by:
    1. shuffling all the rows of the 'games' pandas dataframe (using the `sample` method in pandas),
    2. selecting the first x cases, which will represent the Xbox 360 games,
    3. selecting the remaining cases, which will represent the PS2 games.
 4. After you have randomly reassigned the rows to the consoles, calculate the (randomized) difference in the medians of the scores of the two consoles. Store these median differences in the numpy array that you created in item 2.<div style='text-align: right;'>**5 points**</div>
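
The shuffling procedure described above can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the assignment's solution: the helper name `randomized_median_diffs`, the two score arrays, and the seed are all made up for the example. It also uses NumPy's `Generator.permutation` in place of the pandas `sample` method mentioned in the steps; both shuffle without replacement.

```python
import numpy as np


def randomized_median_diffs(scores_a, scores_b, n_iter=1000, seed=0):
    """Return n_iter median differences (a minus b) under random relabelling."""
    rng = np.random.default_rng(seed)
    # Pool both groups: under the null hypothesis the labels are exchangeable.
    pooled = np.concatenate([scores_a, scores_b])
    n_a = len(scores_a)
    diffs = np.empty(n_iter)
    for i in range(n_iter):
        # Shuffle all scores without replacement.
        shuffled = rng.permutation(pooled)
        # The first n_a scores play the role of group a, the rest of group b.
        diffs[i] = np.median(shuffled[:n_a]) - np.median(shuffled[n_a:])
    return diffs


# Hypothetical scores, invented purely for illustration.
group_a = np.array([8.5, 7.9, 9.0, 6.5, 8.0, 7.2])
group_b = np.array([7.0, 6.8, 8.1, 5.5, 7.4])
diffs = randomized_median_diffs(group_a, group_b, n_iter=200)
print(diffs[:5])
```

Each group keeps its original size in every iteration, which is what makes the randomized differences comparable to the observed one.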

In [None]:
## your code/answer here
print_question(1)
ps2_median = ps2_scores.median()
xbox360_median = x360_scores.median()
diff = xbox360_median - ps2_median

print(f"median difference is {diff}")

print_question(2)
iterations = 1000
random_median_diff = np.empty(iterations)

print(f"Empty numpy array created : {random_median_diff}")

print_question(3)

xbox = len(x360_scores)
ps2 = len(ps2_scores)

print(xbox)
print(ps2)

# Randomization test
# Combine all scores once; only the shuffle has to be repeated.
all_score = pd.concat([x360_scores, ps2_scores])

for i in range(iterations):
    # Shuffle all scores (sampling without replacement).
    shuffled = all_score.sample(frac=1)

    # The first `xbox` scores are treated as Xbox 360 scores.
    random_xbox = shuffled.iloc[:xbox]

    # The remaining scores are treated as PS2 scores.
    random_ps2 = shuffled.iloc[xbox:]

    # Difference between the two medians for this randomization.
    random_median_diff[i] = random_xbox.median() - random_ps2.median()

print(f"First five randomized median differences: {random_median_diff[:5]}")

# The randomization was repeated `iterations` times to build the distribution
# of median differences that can be expected by chance alone.

In [None]:
## your code/answer here

### Exercise 3

Now that the randomizations are done, we want to obtain a p-value on the basis of which we can conclude whether Xbox 360 games are rated higher than PS2 games. 

1. Calculate the p-value, i.e., the probability that the randomized median differences are higher than the actual difference in the medians. 
2. Plot the randomized median differences and the actual median difference in one plot (use a `histplot` or `displot` to plot the randomized differences and use `plt.vlines` to plot the actual median difference).
3. What would the p-value be for a 2-sided test?<div style='text-align: right;'>**5 points**</div>
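
The one-sided and two-sided p-values differ only in which randomized differences count as "at least as extreme". The sketch below shows the arithmetic on a tiny, hand-made array; the values in `random_diffs` and `observed_diff` are hypothetical and chosen only so the result is easy to verify by hand.

```python
import numpy as np

# Hypothetical randomized differences and observed difference (illustration only).
random_diffs = np.array([-0.4, -0.1, 0.0, 0.2, 0.5, -0.6, 0.3, 0.1, -0.2, 0.7])
observed_diff = 0.5

# One-sided: fraction of randomized differences at least as large as the observed one.
p_one_sided = np.mean(random_diffs >= observed_diff)

# Two-sided: fraction at least as extreme in either direction.
p_two_sided = np.mean(np.abs(random_diffs) >= abs(observed_diff))

print(p_one_sided, p_two_sided)
```

Here two of the ten values (0.5 and 0.7) are at least 0.5, and three (0.5, -0.6 and 0.7) are at least 0.5 in absolute value, giving 0.2 and 0.3 respectively.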

In [None]:
## your code/answer here
print_question(1)

# Count the randomized differences that are at least as large as the actual difference.
count = np.sum(random_median_diff >= diff)

p = count / iterations

print(f"p value is {p}")


print_question(2)

# Draw a histogram of the randomized differences
sns.histplot(random_median_diff, color="red")

# Actual difference, drawn over the full height of the histogram
ymax = plt.gca().get_ylim()[1]
plt.vlines(diff, ymin=0, ymax=ymax, colors="green", label="Real Difference")


plt.title("Randomized Median Differences and Real Difference")
plt.xlabel("Median Diff")
plt.ylabel("Count")
plt.legend()
plt.grid(True)
plt.show()


In [None]:
## your code/answer here
print_question(3)

abs_diff = abs(diff)

count_2side = 0

for i in random_median_diff:
    if abs(i) >= abs_diff:
        count_2side += 1

p_2sided = count_2side / iterations

print(f"Two-sided p value is: {p_2sided}")

### Exercise 4
Instead of randomization we could also use the Mann-Whitney U test in this case, which is a non-parametric test to compare whether data in one population are "higher" than in another. 

1. Use the scipy module `stats.mannwhitneyu` to run the test. 
2. How does the p-value compare to the one you obtained through randomization?<div style='text-align: right;'>**2 points**</div>
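
The effect of the `alternative` argument of `stats.mannwhitneyu` can be seen on synthetic data. The sketch below is illustrative only: the two samples, their sizes, and the seed are assumptions made up for the example and say nothing about the actual review scores.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Synthetic scores, invented for illustration only.
sample_high = rng.normal(loc=8.0, scale=1.0, size=40)
sample_low = rng.normal(loc=7.0, scale=1.0, size=40)

# One-sided: is the first sample stochastically greater than the second?
one_sided = stats.mannwhitneyu(sample_high, sample_low, alternative="greater")

# Two-sided: do the samples differ in either direction?
two_sided = stats.mannwhitneyu(sample_high, sample_low, alternative="two-sided")

print(f"U = {one_sided.statistic}")
print(f"one-sided p = {one_sided.pvalue:.3g}, two-sided p = {two_sided.pvalue:.3g}")
```

The order of the two samples matters for the one-sided test: `alternative="greater"` refers to the first argument.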

In [None]:
## your code/answer here
print_question(1)

# One-sided test: are Xbox 360 scores greater than PS2 scores?
result = stats.mannwhitneyu(x360_scores, ps2_scores, alternative="greater")

# Show result

print(f"U statistic: {result.statistic}")
print(f"p value: {result.pvalue}")

In [None]:
## your code/answer here
print_question(2)

print(f"p value (Mann-Whitney U test): {result.pvalue}")
print(f"p value (randomization, one-sided): {p}")
print(f"p value (randomization, two-sided): {p_2sided}")

print("At the 5% significance level:\n"
      "The Mann-Whitney U test produced a highly significant result with a slightly smaller p-value.\n"
      "The one-sided randomization test produced a moderately significant result with a slightly larger p-value.\n"
      "The two-sided randomization p-value was the largest and was not significant.\n")

### Exercise 5
In the preceding exercises, you obtained three p-values, using different approaches. What is your conclusion regarding the hypothesis formulated in Exercise 1 above? <div style='text-align: right;'>**2 points**</div>

In [None]:
## your code/answer here
print_question(5)

print("The hypothesis from Exercise 1 was:")
print("H0: The mean review score is equal for Xbox 360 and PS2 games.")
print("H1: The mean review score is higher for Xbox 360 games.\n")

print("We used non-parametric tests (randomization and Mann-Whitney U) because the normality assumption was not met.\n")

print(f"The median difference was: {diff:.8f}\n")

print("P values:")
print(f"Randomization test (one-sided): p = {p}")
print(f"Mann-Whitney U test (one-sided): p = {result.pvalue:.8f}\n")


# significance level
alpha = 0.05

# Make the conclusion based on the one-sided p-values
print(f"Both one-sided p-values ({p} and {result.pvalue:.8f}) are less than {alpha}.")
print("Therefore, we reject the null hypothesis (H0), that the mean review score is equal for Xbox 360 and PS2 games.\n")
print("Conclusion: There is statistically significant evidence to support the game enthusiast's claim that Xbox 360 games receive higher review scores than PS2 games.")

In this assignment, we used a small sample of a larger dataset that you can find [here](https://github.com/erilyth/DeepLearning-Challenges/blob/master/Sentiment_Analysis/ign.csv) in case you are interested in the ratings of all console games.

**Total number of points**: 21