# Calculate Correlations from the Paper

This notebook calculates all correlations reported in the paper and explains where the data comes from.

In [1]:
# Import dependencies
import pandas as pd
import numpy as np
from scipy.stats import spearmanr

The file `userstudy_including_survey.csv` contains data on the games played during the user study. The survey data is already joined with the data extracted from the videos.

Furthermore, the file records whether the player landed in a hot spot. This was determined in a separate analysis step.

Because most correlations of the paper are calculated for beginner Fortnite players, we first load the data from the user study.

In [2]:
# Load the data
df = pd.read_csv("userstudy_including_survey.csv")

Correlations in the paper are calculated using Spearman's rank correlation coefficient. The degrees of freedom (DF) for the Spearman coefficient are the sample size minus 2.

In [3]:
DF = len(df) - 2
print("DF=" + str(DF))

DF=85


This means that the DF is 85 for all correlations for beginner Fortnite players.
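One caveat about the DF: the cells below call `spearmanr` with `nan_policy='omit'`, which drops observations with missing values before computing the coefficient. If any of the columns involved contains missing values, the effective sample size for that correlation is smaller than `len(df)`, and so is its DF. The following is a minimal sketch of how the DF could be computed per pair of columns. The helper name `spearman_with_df` and the demo values are illustrative assumptions and are not part of the original analysis.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr


def spearman_with_df(x, y):
    """Return (rho, p-value, DF) using only observations where both values are present.

    Hypothetical helper for illustration; not used in the paper.
    """
    pairs = pd.DataFrame({
        "x": np.asarray(x, dtype=float),
        "y": np.asarray(y, dtype=float),
    }).dropna()
    rho, p = spearmanr(pairs["x"], pairs["y"])
    return float(rho), float(p), len(pairs) - 2


# Made-up demo values, NOT data from the user study.
x_demo = [1, 2, 3, 4, 5, np.nan, 7, 8]
y_demo = [2, 1, 4, 3, np.nan, 6, 8, 7]

rho_demo, p_demo, dof_demo = spearman_with_df(x_demo, y_demo)
# Two of the eight pairs are incomplete, so DF is 6 - 2 = 4 rather than 8 - 2 = 6.
print("DF=" + str(dof_demo))
```

Applied to the notebook's data, a call such as `spearman_with_df(df["Satisfaction"], df["Enjoyment"])` would report the DF that matches the pairs actually used.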

## Correlations for H1

In the following, we calculate the correlations for **H1**, which is __Satisfaction and enjoyment are correlated and are influenced by the player's success in the game__

In [4]:
# Correlation between Satisfaction and Enjoyment
spearmanr(df["Satisfaction"], df["Enjoyment"], nan_policy='omit')

SpearmanrResult(correlation=0.46805091489043021, pvalue=masked_array(data = 6.273626362055868e-06,
             mask = False,
       fill_value = 1e+20)
)

In [5]:
# Correlation between Satisfaction and duration of the game
spearmanr(df["Satisfaction"], df["playtime_seconds"], nan_policy='omit')

SpearmanrResult(correlation=0.60555915641562119, pvalue=masked_array(data = 8.265164539283189e-10,
             mask = False,
       fill_value = 1e+20)
)

In [6]:
# Correlation between Satisfaction and place
spearmanr(df["Satisfaction"], df["place"], nan_policy='omit')

SpearmanrResult(correlation=-0.53157827050353201, pvalue=masked_array(data = 1.6523359611615222e-07,
             mask = False,
       fill_value = 1e+20)
)

In [7]:
# Correlation between Satisfaction and kills
spearmanr(df["Satisfaction"], df["kills"], nan_policy='omit')

SpearmanrResult(correlation=0.47855633519539365, pvalue=masked_array(data = 3.612002994521827e-06,
             mask = False,
       fill_value = 1e+20)
)
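The four H1 correlations above follow the same pattern, so they could also be collected in a single table. The sketch below is one possible way to do this; the function name `correlation_table` and the demo DataFrame are assumptions made for illustration, and the demo values are made up rather than taken from the study.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr


def correlation_table(data, pairs):
    """Compute Spearman's rho, p-value, and DF for each (left, right) column pair.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    rows = []
    for left, right in pairs:
        # Keep only rows where both columns are present.
        complete = data[[left, right]].dropna()
        rho, p = spearmanr(complete[left], complete[right])
        rows.append({
            "x": left,
            "y": right,
            "rho": float(rho),
            "p": float(p),
            "DF": len(complete) - 2,
        })
    return pd.DataFrame(rows)


# Made-up demo values, NOT data from the user study.
demo = pd.DataFrame({
    "Satisfaction": [1, 2, 3, 4, 5, 4],
    "Enjoyment": [2, 1, 4, 3, 5, np.nan],
    "place": [90, 80, 50, 40, 10, 20],
})

table = correlation_table(
    demo,
    [("Satisfaction", "Enjoyment"), ("Satisfaction", "place")],
)
print(table)
```

With the real data, the pair list would contain `("Satisfaction", "Enjoyment")`, `("Satisfaction", "playtime_seconds")`, `("Satisfaction", "place")`, and `("Satisfaction", "kills")`, and `df` would be passed instead of `demo`.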

## Correlations for H3b

In the following, we calculate the correlations for **H3b**, which is __The enjoyment of Fortnite beginners is influenced by the chosen landing spot__.

Therefore, we analyzed whether the movement trace of each game started in a landing, activity, or killing hot spot and added the results of this analysis as the columns `hotspotlanding`, `activityhotspot` and `Killhotspotlanding`.

In [8]:
# Correlation between landing on a landing hot spot and enjoyment
spearmanr(df["hotspotlanding"], df["Enjoyment"], nan_policy='omit')

SpearmanrResult(correlation=-0.27318429121962823, pvalue=masked_array(data = 0.01360312662632364,
             mask = False,
       fill_value = 1e+20)
)

In [9]:
# Correlation between landing in an activity hot spot and the place
spearmanr(df["activityhotspot"], df["place"], nan_policy='omit')

SpearmanrResult(correlation=0.31213578074013532, pvalue=masked_array(data = 0.004306220020430569,
             mask = False,
       fill_value = 1e+20)
)

In [10]:
# Correlation between landing in an activity hot spot and the playtime
spearmanr(df["activityhotspot"], df["playtime_seconds"], nan_policy='omit')

SpearmanrResult(correlation=-0.31208651621525746, pvalue=masked_array(data = 0.004312657697834659,
             mask = False,
       fill_value = 1e+20)
)

## Correlations for H4

In the following, we calculate the correlations for **H4**, which is __The amount of time playing other games influences the success when starting to play Fortnite__.

In [11]:
# Correlation between landing in a landing hot spot and hours of watched Fortnite streams
spearmanr(df["hotspotlanding"], df["watchedHours"], nan_policy='omit')

SpearmanrResult(correlation=0.36583945924869671, pvalue=masked_array(data = 0.0007250312934609963,
             mask = False,
       fill_value = 1e+20)
)

Furthermore, we argue for H4 that achieving a good place and the number of kills are correlated. To analyze this statement, we need the data of all valid games, which can be found in the file `all_valid_games.csv`.

In [12]:
df_all = pd.read_csv("all_valid_games.csv")
spearmanr(df_all["kills"], df_all["place"], nan_policy='omit')

SpearmanrResult(correlation=-0.63313178622272914, pvalue=2.716163141849388e-92)

In [13]:
# The DF for this correlation differs because it is based on a different sample.
print("New DF: " + str(len(df_all) - 2))

New DF: 811
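The DF matters because it links the coefficient to its p-value. A common approximation, which to our understanding is also what `spearmanr` uses by default, converts rho into a t statistic with DF degrees of freedom. The sketch below shows that relationship on randomly generated data. The function name `spearman_p_from_rho` is an assumption for illustration, and the formula is undefined for rho equal to exactly 1 or -1.

```python
import numpy as np
from scipy import stats


def spearman_p_from_rho(rho, dof):
    """Two-sided p-value for Spearman's rho via the t approximation.

    Hypothetical helper for illustration; undefined for |rho| == 1.
    """
    t = rho * np.sqrt(dof / ((1.0 + rho) * (1.0 - rho)))
    return 2.0 * stats.t.sf(abs(t), dof)


# Randomly generated demo data, NOT data from the user study.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = x + rng.normal(size=50)

rho, p_scipy = stats.spearmanr(x, y)
p_manual = spearman_p_from_rho(rho, len(x) - 2)

print("scipy p-value: ", p_scipy)
print("manual p-value:", p_manual)
```

The same function could be called with a reported coefficient and its DF, for example `spearman_p_from_rho(-0.63313178622272914, 811)`, to check a p-value from this notebook.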
