# Run this on startup

In [1]:
import pandas as pd



Load the relevant data files.

In [23]:
data_dir = '../static/data/'

conditions_file = data_dir + 'conditions.csv'
conditions = pd.read_csv(conditions_file)

exit_survey_file = data_dir + 'exit_survey.csv'
exit_survey = pd.read_csv(exit_survey_file)
exit_survey = exit_survey.drop([0,1]) # drop extra qualtrics rows
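
An alternative to dropping the extra Qualtrics rows after loading is to skip them while reading. The sketch below uses a small in-memory stand-in for the export, not the real `exit_survey.csv`; the column names and values in it are made up for illustration.

```python
import io
import pandas as pd

# Toy stand-in for a Qualtrics export: line 0 is the real header,
# lines 1 and 2 are the extra descriptive rows (question text, import ids).
raw = io.StringIO(
    "Q1,Q8\n"
    "What is your gender?,What is your age?\n"
    "ImportId QID1,ImportId QID8\n"
    "2,24\n"
    "1,31\n"
)

# skiprows takes 0-based file line numbers, so [1, 2] skips the two extra rows
survey = pd.read_csv(raw, skiprows=[1, 2])

print(survey.dtypes)
print(survey)
```

Because the extra text rows never enter the frame, pandas can infer numeric dtypes for the answer columns. Any mapping that uses string keys such as `'1'` would then need integer keys instead.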

Rename the Qualtrics columns to human-readable names and decode the numeric answer codes. Only the relevant columns are defined here.

In [24]:
qualtrics_columns = {
    'StartDate': 'start_date',
    'EndDate': 'end_date',
    'Duration (in seconds)': 'duration',
    'Finished': 'is_finished', # 1 is finished
    'Q9': 'prolific_id', # user inputted prolific id
    'Q1': 'gender', # mapping below
    'Q8': 'age', # number input
    'Q3': 'drone_experience', # mapping below
    'Q4': 'video_game_experience', # mapping below
    'Q5': 'feedback_helped', # likert mapping below
    'Q6': 'change_from_feedback', # open text response
    'Q7': 'comments' # open text response, optional
}

# rename columns
exit_survey = exit_survey.rename(columns=qualtrics_columns)
# remove extra columns
exit_survey = exit_survey.drop(columns=['Status', 'Progress', 'RecordedDate', 'ResponseId', 'DistributionChannel', 'UserLanguage'])

# replace numeric values with strings
gender_map = {'1': "Man", '2': "Woman", '3': "Non-binary", '4': "Prefer not to say"}
exit_survey['gender'] = exit_survey['gender'].replace(gender_map)
drone_map = {'1': "None", '2': "Some", '3': "Regularly", '4': "Professional"}
exit_survey['drone_experience'] = exit_survey['drone_experience'].replace(drone_map)
game_map = {'1': "None", '2': "Monthly", '3': "Weekly", '4': "Daily"}
exit_survey['video_game_experience'] = exit_survey['video_game_experience'].replace(game_map)
likert_map = {'1': "Strongly Disagree", '2': "Disagree", '3': "Neutral", '4': "Agree", '5': "Strongly Agree"}
exit_survey['feedback_helped'] = exit_survey['feedback_helped'].replace(likert_map)

exit_survey

|   | start_date | end_date | duration | is_finished | prolific_id | gender | age | drone_experience | video_game_experience | feedback_helped | change_from_feedback | comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 2/16/2024 13:19 | 2/16/2024 13:44 | 1484 | 1 | b | Woman | 24 | Some | Weekly | Agree | asd | asd |
| 3 | 2/20/2024 11:39 | 2/20/2024 12:19 | 2394 | 1 | Shivendra | Man | 31 | Some | None | Agree | Yes | |
| 4 | 2/21/2024 12:23 | 2/21/2024 12:24 | 53 | 1 | 8aksf09q | Man | 19 | None | None | Strongly Disagree | I hated this | |
| 5 | 2/21/2024 12:24 | 2/21/2024 12:25 | 24 | 1 | apel09210h | Woman | 76 | Some | Monthly | Disagree | it was awesome | |
| 6 | 2/21/2024 12:25 | 2/21/2024 12:25 | 23 | 1 | 9sk3h59s | Non-binary | 33 | Regularly | Weekly | Neutral | meh | |
| 7 | 2/21/2024 12:25 | 2/21/2024 12:26 | 24 | 1 | 9wbnns76 | Prefer not to say | 25 | Professional | Daily | Agree | a fake answer | |
| 8 | 2/21/2024 12:26 | 2/21/2024 12:26 | 22 | 1 | s92hfks3 | Man | 44 | Some | Weekly | Strongly Agree | what | |


# Check participant data

## Match data sources

Do the participants in the web app data match the participants in the exit survey data?

In [8]:
web_participants = conditions['user_id']
# the Qualtrics column 'Q9' was renamed to 'prolific_id' above
exit_survey_participants = exit_survey['prolific_id']

# `in` on a Series tests the index labels, not the values,
# so build sets of the actual IDs for the membership checks
web_ids = set(web_participants)
exit_survey_ids = set(exit_survey_participants)

# check if all web participants are in exit survey
print("Web participants not in exit survey:")
for user_id in web_participants:
    if user_id not in exit_survey_ids:
        print("\t" + str(user_id))

# check if all exit survey participants are in web
print("Exit survey participants not in web:")
for user_id in exit_survey_participants:
    if user_id not in web_ids:
        print("\t" + str(user_id))

Web participants not in exit survey:
	breanne
	emily
	emily
	Shivendra Agrawal
	emily
	emily
	emily
	emily
	emily
	emily
Exit survey participants not in web:
	b
	Shivendra
	8aksf09q
	apel09210h
	9sk3h59s
	9wbnns76
	s92hfks3
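
The same comparison can also be tabulated in one step with an outer merge. This is a minimal sketch on toy data, not the study's files: the frames `web` and `survey` and the IDs in them are hypothetical stand-ins for `conditions` and `exit_survey`.

```python
import pandas as pd

# Hypothetical toy frames standing in for `conditions` and `exit_survey`
web = pd.DataFrame({"user_id": ["p1", "p2", "p2", "p3"]})
survey = pd.DataFrame({"prolific_id": ["p2", "p3", "p4"]})

# De-duplicate web IDs so repeated sessions do not multiply rows in the merge
web_ids = web[["user_id"]].drop_duplicates()

merged = web_ids.merge(
    survey[["prolific_id"]],
    left_on="user_id",
    right_on="prolific_id",
    how="outer",
    indicator=True,  # adds a `_merge` column: left_only / right_only / both
)

web_only = merged.loc[merged["_merge"] == "left_only", "user_id"].tolist()
survey_only = merged.loc[merged["_merge"] == "right_only", "prolific_id"].tolist()

print("Web only:", web_only)
print("Survey only:", survey_only)
```

The `_merge` column keeps the result as a table, which may be easier to filter or export than printed lists.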


Notes about manually fixing mismatches between web and exit survey data:
- Example here
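
One possible way to record manual fixes is an explicit correction map applied before the comparison, so that every change stays visible in the notebook. The sketch below is an assumption about how that could look, not the procedure used in this study: the helper `normalize_ids`, the dictionary `id_corrections`, and all IDs are hypothetical, and lowercasing IDs is only safe if IDs are case-insensitive.

```python
import pandas as pd

# Hypothetical IDs, for illustration only
survey_ids = pd.Series([" P1 ", "p2", "participant-3"], name="prolific_id")
web_ids = pd.Series(["p1", "p2", "p3"], name="user_id")

# Manual corrections, written out explicitly so each fix is auditable
id_corrections = {"participant-3": "p3"}


def normalize_ids(ids, corrections=None):
    """Strip whitespace, lowercase, then apply any manual corrections."""
    cleaned = ids.astype(str).str.strip().str.lower()
    if corrections:
        cleaned = cleaned.replace(corrections)
    return cleaned


survey_clean = normalize_ids(survey_ids, id_corrections)
web_clean = normalize_ids(web_ids)

# IDs that still fail to match after normalization and corrections
unmatched = sorted(set(survey_clean) - set(web_clean))
print("Still unmatched:", unmatched)
```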

## Distributions of demographic data

In [25]:
exit_survey['gender'].value_counts()


gender
Man                  3
Woman                2
Non-binary           1
Prefer not to say    1
Name: count, dtype: int64

In [26]:
exit_survey['age'].astype(int).describe()

count     7.000000
mean     36.000000
std      19.373521
min      19.000000
25%      24.500000
50%      31.000000
75%      38.500000
max      76.000000
Name: age, dtype: float64
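
`astype(int)` raises an error if any age is missing or not a number. A more tolerant conversion, sketched here on made-up values rather than the survey data, is `pd.to_numeric` with `errors="coerce"`:

```python
import pandas as pd

# Hypothetical raw answers, stored as strings the way a CSV export delivers them
raw_age = pd.Series(["24", "31", None, "n/a"], name="age")

# errors="coerce" turns anything non-numeric into NaN instead of raising
age = pd.to_numeric(raw_age, errors="coerce")

print(age.describe())
print("Unparseable or missing ages:", int(age.isna().sum()))
```

`describe()` ignores the NaN entries, so the count in its output shows how many ages were usable.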

In [27]:
exit_survey['drone_experience'].value_counts()

drone_experience
Some            4
None            1
Regularly       1
Professional    1
Name: count, dtype: int64

In [28]:
exit_survey['video_game_experience'].value_counts()

video_game_experience
Weekly     3
None       2
Monthly    1
Daily      1
Name: count, dtype: int64

In [29]:
exit_survey['feedback_helped'].value_counts()

feedback_helped
Agree                3
Strongly Disagree    1
Disagree             1
Neutral              1
Strongly Agree       1
Name: count, dtype: int64
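
`value_counts()` sorts by frequency, which scrambles the Likert scale. A minimal sketch of listing the counts in scale order, using the seven `feedback_helped` responses from the table above (the names `likert_levels` and `responses` are introduced here for illustration):

```python
import pandas as pd

# Scale order, from most negative to most positive
likert_levels = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]

# The feedback_helped responses shown in the exit survey table
responses = pd.Series(
    ["Agree", "Agree", "Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"],
    name="feedback_helped",
)

# Reindex the counts by the scale; fill_value=0 keeps levels nobody chose
counts = responses.value_counts().reindex(likert_levels, fill_value=0)
print(counts)
```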

# Research Questions

## How do learners perceive the feedback along each dimension?

## Which feedback modality leads to higher performance improvements?