<center> <h2> Why You Had to Suffer through That Space Game Thingy</h2> </center>

Suppose that you've been hired as a data science (DS) intern in a games research lab. In a recent project, the lab's researchers investigated how the appearance of an AI agent influences players' enjoyment of a space game. In the game, players collaborate with an AI agent to defend their spaceship against enemy spaceships. Despite the technical challenges, the researchers ran an experiment comparing three possible designs for the agent. In this assignment, your supervisors would like your help in making an informed decision about which design to use to represent the AI agent in the game.

Based on prior research, the researchers came up with three alternatives for the representation of the AI agent: voice-based, robotic avatar, and humanoid avatar:
- The voice-based design represents the AI agent using an animated circle of dots.
- The robotic avatar displays a robot head to represent the AI agent.
- The humanoid avatar displays a human bust as the representation of the AI agent.

The representations are illustrated below:

<img src="https://i.ibb.co/M1ty00f/HW5-Conditions.png" width = 600>

<center> <h2> Experimental Design and Hypotheses</h2> </center>

The researchers were interested in the effect of avatar representation on game enjoyment, and you have been asked to analyze the data from an experiment designed to test which of these three designs would lead to better game enjoyment. For the experiment, they recruited 240 participants and divided them into three groups of equal size (80 participants each) to play the same game with one of the three avatar representations:

* The first group played the game with the voice representation and provided self-reported ratings of their enjoyment of the game.
* The second group played the game with the robot avatar and provided game enjoyment ratings as well.
* The third group played the game with the human avatar and also indicated their enjoyment of the game.

Game enjoyment was measured using a self-reported questionnaire from which an enjoyment score ranging from 1 to 7 is computed, with higher scores indicating greater game enjoyment.

The **goal of the experiment** was to determine which of these three AI representations led to a more enjoyable gaming experience. Prior research indicates that the more embodied (having a bodily form) the representation of the avatar is, the more people engage in and enjoy the game. Therefore, the researchers predicted that the robot and human avatars would lead to greater game enjoyment than the voice representation. The researchers also predicted that the human avatar would lead to greater enjoyment than the robot avatar, as prior research shows that the more human-like the avatar is, the more people like it.


<center> <h2> Dataset </h2> </center>

Here is the dataset, DS3000_HW5_Data.csv (be sure to download and place it in the same directory as this Notebook):

In [None]:
import pandas as pd
df = pd.read_csv("DS3000_HW5_Data.csv")
df.head()

  Condition  Enjoyment
0     Human    1.81545
1     Human   4.998519
2     Robot    4.49164
3     Voice   5.154917
4     Human   4.634129


<center> <h2> Part 1 </h2> </center>

### Question 1a
What is the independent variable of your experiment? What are the levels of the IV? Type your answer below:

In [None]:
# The independent variable is the avatar representation of the AI agent. The 3 levels of the IV are voice, robot, and human.

### Question 1b
What is the dependent variable of your experiment?

In [None]:
# The dependent variable is game enjoyment score.

### Question 1c (.5pts)
What type of experimental design did this study follow? Just name the design.

In [None]:
# Between-subjects experimental design.

### Question 1d
State the null hypothesis for the experiment described above.

In [None]:
# There is no difference in mean enjoyment score among the three AI agent representations (voice, robot, and human).
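
Stated formally (a sketch in standard notation, where each $\mu$ is assumed to denote the population mean enjoyment score for the named condition), the null and alternative hypotheses of the omnibus test can be written as:

```latex
H_0:\ \mu_{\text{voice}} = \mu_{\text{robot}} = \mu_{\text{human}}
\qquad
H_1:\ \text{at least one } \mu_i \text{ differs from the others}
```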

<center> <h2> Part 2 </h2> </center>

### Question 2 
Write a function that produces descriptive stats (count, mean, std, and sem) for each experimental condition. The function should take a dataframe and the names of columns for iv and dv and should return the descriptives dataframe as shown in the sample output.

In [None]:
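def describe_data(data, iv, dv):
    # Group by the IV column and summarize only the DV column, so the function
    # works for any column names instead of a hard-coded 'Condition'.
    return data.groupby(iv)[[dv]].agg(["count", "mean", "std", "sem"])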

In [None]:
descriptives = describe_data(data=df, iv="Condition", dv="Enjoyment")
descriptives

          Enjoyment
              count      mean       std       sem
Condition
Human            80  5.248073  1.220492  0.136455
Robot            80  4.643753  1.040908  0.116377
Voice            80  3.636178  0.950376  0.106255
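
As a quick sanity check on the descriptives above, the `sem` column should equal `std / sqrt(count)`. The sketch below recomputes it from the reported values using only the standard library; the names `reported` and `standard_error` are illustrative and not part of the assignment.

```python
import math

# (count, std, sem) per condition, copied from the descriptives output above.
reported = {
    "Human": (80, 1.220492, 0.136455),
    "Robot": (80, 1.040908, 0.116377),
    "Voice": (80, 0.950376, 0.106255),
}


def standard_error(std, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return std / math.sqrt(n)


for condition, (n, std, sem) in reported.items():
    recomputed = standard_error(std, n)
    print(f"{condition}: reported sem = {sem:.6f}, recomputed = {recomputed:.6f}")
```

These are the same values that the bar graph uses for its error bars.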


### Question 3 
Write a function that visualizes a given dataframe that contains the data for a between-subjects experiment using **Plotly Express**. You are going to write a generic function that can be used to produce graphs for any between-subjects experiment. Here are the requirements:

- Your function should receive the following arguments: 
    - the name of the dataframe (argument name = data), 
    - the name of the column representing the grouping variable or IV (argument name = iv), 
    - the name of the column for the dependent variable or DV (argument name = dv),
    - the type of the visualization requested (argument name = vis_type),
        - possible values: "bar", "box", "violin"
        - the error bars of the bar graph should represent the standard error of each condition
        - hint: you will need to use the describe_data function when producing the bar graph
        
    - and the width of the visualization (argument name = vis_width)

- Then the function should produce the appropriate visualization based on the vis_type argument
- The function should use the same template for all visualizations, namely "ggplot2".
- The visualizations should use a distinct color for each condition, as shown in the sample outputs.
- The function should return the visualization.

Refer to the sample function calls and the corresponding visualizations (provided as links).

In [None]:
import plotly.express as px

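def visualize_data(data, iv, dv, vis_type, vis_width):
    # Arguments shared by every chart type: one color per condition, same template.
    shared_args = {'x': iv, 'color': iv, 'width': vis_width, 'template': 'ggplot2'}
    if vis_type == 'bar':
        # Bar heights are the condition means; error bars show the standard error.
        described_data = describe_data(data, iv, dv)[dv].reset_index()
        return px.bar(data_frame=described_data, y='mean', error_y='sem', **shared_args)
    if vis_type == 'box':
        return px.box(data_frame=data, y=dv, **shared_args)
    if vis_type == 'violin':
        return px.violin(data_frame=data, y=dv, points='all', **shared_args)
    # Fail loudly instead of silently returning None for an unsupported type.
    raise ValueError(f"vis_type must be 'bar', 'box', or 'violin', got {vis_type!r}")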

In [None]:
bar_graph = visualize_data(data=df, iv="Condition", dv="Enjoyment", vis_type="bar", vis_width=500)
bar_graph.show()

<img src="https://i.ibb.co/Mg42MLM/hw5bar.png" alt="hw5bar" border="0">

Here is the sample output for the bar graph (lesson learned: they won't embed; hence, the links):
- https://i.ibb.co/bH0tnfx/bar.png

In [None]:
box_graph = visualize_data(data=df, iv="Condition", dv="Enjoyment", vis_type="box", vis_width=500)
box_graph.show()

<img src="https://i.ibb.co/2gjHSJ0/hw5box.png" alt="hw5box" border="0">

Here is the sample output for the box graph:
- https://i.ibb.co/XZvDDHW/box.png

In [None]:
violin_graph = visualize_data(data=df, iv="Condition", dv="Enjoyment", vis_type="violin", vis_width=500)
violin_graph.show()

<img src="https://i.ibb.co/4pnS3xT/hw5violin.png" alt="hw5violin" border="0">

Here is the sample output for the violin graph:
- https://i.ibb.co/kJ9DVNc/violin.png

### Question 4 
Write a function that takes a dataframe and the names of the IV and DV and conducts a one-way ANOVA on the dataframe using the IV and DV. Your function should calculate the F-test values, check for assumptions, and perform post-hoc comparisons, as seen in the output. You should use pingouin for these tests. 

The output from your function should be formatted in a way that matches the sample output below. That said, your function shouldn't be hard-coded to the values shown in the output. The output of the function should always be based on the arguments provided. You should carefully study the sample output and format your output accordingly. Figuring out how to do this is part of the assignment. The sample output is formatted using "\n" and "\t" and the other string formatting methods we've looked at in class. Use extra lines to properly format the output. The more readable, the better. Don't worry; you won't lose points for including an extra line or whitespace in your output. Simply make sure your output is not crammed and approximates the sample output as much as possible.


The function should indicate whether the results of the ANOVA test were significant. 

Please note that the sample output displays the posthoc tests, but this should only be done if the ANOVA is significant. If the ANOVA results are nonsignificant, no posthoc tests should be performed and the output shouldn't include them. Your code must check for both conditions. This cannot be seen in the sample output but should be included in your code.

Also note that for the posthoc tests, the sample output displays certain columns from the dataframes returned from pingouin. Your output should do the same, instead of directly printing out whatever is returned from pingouin functions.

The sample output doesn't display any messages for the assumption checks, as the returned results indicate whether the assumption was met (boolean values). However, you're more than welcome to add an appropriate interpretation. This is not required, though.

Assume an alpha level of .05 for hypothesis testing.

Use Bonferroni correction for posthocs.

Use 3 decimal places when displaying floats.
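
For reference, the Bonferroni correction multiplies each uncorrected p-value by the number of comparisons and caps the result at 1. The sketch below illustrates only that arithmetic; the function name `bonferroni` and the p-values are made up for illustration, and in the assignment itself pingouin applies the correction through its `padjust` argument.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)  # number of pairwise comparisons
    return [min(p * m, 1.0) for p in p_values]


# Hypothetical uncorrected p-values for three pairwise comparisons.
uncorrected = [0.001, 0.02, 0.5]
corrected = bonferroni(uncorrected)

alpha = 0.05
for p_unc, p_corr in zip(uncorrected, corrected):
    verdict = "significant" if p_corr < alpha else "not significant"
    print(f"p-unc = {p_unc:.3f} -> p-corr = {p_corr:.3f} ({verdict})")
```

Note that the second hypothetical comparison would pass an uncorrected test at alpha = .05 but not the corrected one, which is the point of applying the correction.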



In [None]:
import pingouin as pg

def oneway_ANOVA(data, iv, dv):
    
    alpha = 0.05
    
    levene_results = pg.homoscedasticity(data=data, dv=dv, group=iv)
    normality_results = pg.normality(data=data, dv=dv, group=iv)
    
    divider = '--------------------------'
    new_line = '\n'
    print(f'''{divider}{new_line}ONE-WAY ANOVA RESULTS{new_line}{divider}{new_line}
        {new_line}Assumption Checks{new_line}{divider}{new_line}
        {new_line}Assumption of Equality of Variances:{new_line}
        {new_line}{levene_results.round(3).to_string()}{new_line}
        {new_line}Assumption of Normality:{new_line}
        {new_line}{normality_results.round(3).to_string()}{new_line}{new_line}''')
    
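    # Run the ANOVA only if both assumption checks pass. Use .iloc[0] because the
    # homoscedasticity result is indexed by label ('levene'), not by position.
    if levene_results['pval'].iloc[0] > alpha and (normality_results['pval'] > alpha).all():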

        results = pg.anova(data=data, dv=dv, between=iv, detailed=True)

        print(f'''{new_line}F-test{new_line}{divider}{new_line}
            {new_line}F({results['DF'][0]},{results['DF'][1]}) = {results['F'][0]:.3f}, p = {results['p-unc'][0]:.3f}''')

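        if results['p-unc'][0] < alpha:
            # Newer pingouin releases provide pg.pairwise_tests as the replacement
            # for pg.pairwise_ttests; switch to it if this call is unavailable.
            posthocs = pg.pairwise_ttests(data=data, dv=dv, between=iv, padjust='bonf')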
            print(f'''{new_line}* Significant ANOVA results.{new_line}* Posthoc tests with bonferroni correction will be performed.{new_line}
                {new_line}Post-hoc Tests{new_line}{divider}{new_line}
                {new_line}{posthocs[['A', 'B', 'T', 'dof', 'p-corr']].round(3).to_string()}''')
        else:
            print(f'{new_line}* Nonsignificant ANOVA results.')
    
    else:
        print(f'{new_line}* Assumption checks failed.')
    

In [None]:
oneway_ANOVA(data=df, iv = "Condition", dv = "Enjoyment")

--------------------------
ONE-WAY ANOVA RESULTS
--------------------------

        
Assumption Checks
--------------------------

        
Assumption of Equality of Variances:

        
           W   pval  equal_var
levene  2.15  0.119       True

        
Assumption of Normality:

        
           W   pval  normal
Human  0.984  0.419    True
Robot  0.982  0.338    True
Voice  0.982  0.322    True



F-test
--------------------------

            
F(2,237) = 45.780, p = 0.000

* Significant ANOVA results.
* Posthoc tests with bonferroni correction will be performed.

                
Post-hoc Tests
--------------------------

                
       A      B      T    dof  p-corr
0  Human  Robot  3.370  158.0   0.003
1  Human  Voice  9.320  158.0   0.000
2  Robot  Voice  6.394  158.0   0.000
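
The F value above can be cross-checked by hand from the descriptive statistics reported earlier. The sketch below works from the rounded means and standard deviations in the descriptives table, so it only approximates what pingouin computes from the raw data; the helper name `f_from_summary` is hypothetical.

```python
# Summary statistics per condition: (n, mean, std), from the descriptives table.
groups = {
    "Human": (80, 5.248073, 1.220492),
    "Robot": (80, 4.643753, 1.040908),
    "Voice": (80, 3.636178, 0.950376),
}


def f_from_summary(groups):
    """One-way ANOVA F statistic computed from per-group n, mean, and std."""
    k = len(groups)
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

    # Between-groups: how far each group mean sits from the grand mean.
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    # Within-groups: variability of scores around their own group mean.
    ss_within = sum((n - 1) * std ** 2 for n, _, std in groups.values())

    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_value, df_between, df_within = f_from_summary(groups)
print(f"F({df_between},{df_within}) = {f_value:.3f}")
```

The result should land very close to the reported F(2,237) = 45.780; small differences in the last decimal can come from the rounding of the inputs.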


<center> <h2> Part 3 </h2> </center>

### Question 5 
Report the results of your test following the write-up example from the corresponding lecture notes. Failing to include all the required components (the test, its purpose, and the actual results) will lead to point deductions. You will basically explain your previous output using the template provided, but don't interpret the results here. Also describe the results of the post-hoc comparisons (were they significant?). Type your answer below in a markdown cell:

A one-way analysis of variance (ANOVA) was conducted to compare game enjoyment scores among the voice, robot, and human representations of the AI agent.
Results revealed a statistically significant difference among the three representations, F(2, 237) = 45.780, p < .001.
Post-hoc comparisons using the Bonferroni correction indicated that the mean enjoyment score for the human avatar (M = 5.248, SE = 0.136) was significantly greater than the scores for the robot avatar (M = 4.644, SE = 0.116), t(158) = 3.370, p = .003, and the voice representation (M = 3.636, SE = 0.106), t(158) = 9.320, p < .001. The robot avatar also had a significantly greater mean enjoyment score than the voice representation, t(158) = 6.394, p < .001.
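
The T values in the post-hoc table of the Question 4 output can also be reproduced from the descriptives. The sketch below computes the independent-samples t statistic with a pooled variance, which is consistent with the 158 degrees of freedom shown in that table (80 + 80 - 2); it works from rounded summary statistics rather than the raw data, and the helper name `pooled_t` is hypothetical.

```python
import math
from itertools import combinations

# Summary statistics per condition: (n, mean, std), from the descriptives table.
groups = {
    "Human": (80, 5.248073, 1.220492),
    "Robot": (80, 4.643753, 1.040908),
    "Voice": (80, 0.0, 0.0),
}
# Fill in the Voice row separately so each value is easy to compare to the table.
groups["Voice"] = (80, 3.636178, 0.950376)


def pooled_t(a, b):
    """Independent-samples t statistic and dof using a pooled variance."""
    (n1, m1, s1), (n2, m2, s2) = a, b
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / dof
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff, dof


t_values = {}
for name_a, name_b in combinations(groups, 2):
    t_value, dof = pooled_t(groups[name_a], groups[name_b])
    t_values[(name_a, name_b)] = t_value
    print(f"{name_a} vs {name_b}: t({dof}) = {t_value:.3f}")
```

The p-values are left out on purpose: computing them requires the t distribution, which pingouin handles in the actual analysis.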

### Question 6 
**Interpret** your results from the previous step. Which avatar should be used? Why? Type your answer below in a markdown cell:

These results indicate that the human avatar was more enjoyable than both the robot avatar and the voice representation, and that the robot avatar was more enjoyable than the voice representation. These findings support the researchers' predictions and are in line with prior research suggesting that the more embodied and human-like the avatar is, the more people enjoy the game. Therefore, the human avatar should be used to represent the game's AI agent.