# Meta-analysis of selected categories identified in the systematic literature review

In this file, we showcase how the coded data from the literature review could be used for a quantitative meta-analysis. The results are not presented in the final work.

## Import data

In [None]:
import pandas as pd

from src.data.make_dataset import create_data_set
from src.features.build_features import analyse_dimension

In [None]:
data = pd.read_csv("../data/interim/combined_data.csv")

## Meta-analysis data quality

### Identify dimensions for analysis

In [None]:
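import re


def identify_dimensions(data, regex):
    """Count non-null entries per column matching `regex`; keep columns with at least 5."""
    # re.match only anchors at the start of the column name.
    columns = [x for x in data.columns.tolist() if re.match(regex, x)]
    count = data[columns].count().sort_values(ascending=False)
    return count[count >= 5]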

In [None]:
identify_dimensions(data, "mobile_(?!participant|design).*")

In [None]:
identify_dimensions(data, "mobile_participants_.*")
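The two patterns above split the columns into outcome dimensions and participant dimensions. The following minimal sketch shows how the negative lookahead `(?!participant|design)` behaves. It uses only the column names that appear in this notebook plus one hypothetical design column, and the helper `select_columns` is an illustrative name, not part of `src`. Whether the real CSV contains exactly these names is an assumption.

```python
import re

# Column names taken from the cells in this notebook, plus one
# hypothetical design column added purely for illustration.
columns = [
    "mobile_completion_time_higher",
    "mobile_breakoff_rate_higher",
    "mobile_missing_items_rate_higher",
    "mobile_participants_younger",
    "mobile_participants_more_female",
    "mobile_participant_education_higher",
    "mobile_design_responsive",  # hypothetical name
]


def select_columns(names, regex):
    """Return the names that match `regex` from the start (re.match semantics)."""
    return [name for name in names if re.match(regex, name)]


# Outcome dimensions: everything after "mobile_" that does not start
# with "participant" or "design".
outcome_columns = select_columns(columns, r"mobile_(?!participant|design).*")

# Participant dimensions: requires the plural prefix "mobile_participants_".
participant_columns = select_columns(columns, r"mobile_participants_.*")

# Names that neither pattern selects.
unmatched = [
    name
    for name in columns
    if name not in outcome_columns and name not in participant_columns
]

print(outcome_columns)
print(participant_columns)
print(unmatched)
```

In this sketch, `mobile_participant_education_higher` is selected by neither pattern: the lookahead excludes it from the first, and the second requires the plural `participants_`. If that column should appear in the overview, a pattern such as `mobile_participants?_.*` would cover both spellings.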

### Exemplary meta-analysis

#### Completion time

In [None]:
completion_time = create_data_set(data, "mobile_completion_time_higher")

In [None]:
analyse_dimension(data, "mobile_completion_time_higher")

Our first research focus is the completion time of surveys on mobile phones compared to personal computers. In 21 surveys, there was statistically significant evidence that completion time is higher on mobile devices than on personal computers. In eight surveys, this hypothesis could not be verified. The evidence for longer completion times on smartphones is therefore substantial. However, the hypothesis could not be verified for all surveys for which we know whether they were optimised for mobile survey taking. This observation could indicate that completion time changes when surveys are better designed for mobile devices.
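One simple way to move from such tallies towards a quantitative statement is vote counting with an exact sign test. The sketch below is not the method used in this project. It assumes that every survey counts as one independent vote and that "could not be verified" is treated as a vote against the hypothesis, which ignores effect sizes and sample sizes. The function name `sign_test_p_value` is illustrative.

```python
from math import comb


def sign_test_p_value(supporting, total):
    """Two-sided exact sign test against a 50/50 split of votes.

    Assumption: each survey is one independent vote, and a survey that
    could not verify the hypothesis counts as a vote against it.
    """
    # Size of the larger group of votes.
    extreme = max(supporting, total - supporting)
    # Probability of a split at least this one-sided under p = 0.5.
    tail = sum(comb(total, i) for i in range(extreme, total + 1)) / 2**total
    # Double the tail for a two-sided test; cap at 1 for even splits.
    return min(1.0, 2 * tail)


# Tallies for completion time reported above: 21 supporting, 8 not verified.
p_value = sign_test_p_value(21, 21 + 8)
print(f"two-sided sign test p-value: {p_value:.4f}")
```

Vote counting is a coarse tool. A fuller meta-analysis would pool effect sizes and weight each survey by its precision, which requires data beyond the coded categories used here.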

#### Breakoff rate

In [None]:
breakoff_rate = create_data_set(data, "mobile_breakoff_rate_higher")

In [None]:
analyse_dimension(data, "mobile_breakoff_rate_higher")


Secondly, we analyse the breakoff rate of mobile surveys compared to PC surveys. Seven of ten articles validated the hypothesis that the breakoff rate is significantly higher on a smartphone than on a PC. There were no obvious patterns in the setups of the surveys that could not validate this hypothesis, so we interpret the evidence as reliable. The higher breakoff rate could be caused by more multitasking and other distractions among mobile users. Another explanation could be the shorter attention span of mobile users, which leads to faster frustration and, therefore, faster breakoff. Another important factor is, again, optimisation for mobile survey taking, since this can reduce the breakoff rate.

#### Missing item rate

In [None]:
missing_item_rate = create_data_set(data, "mobile_missing_items_rate_higher")

In [None]:
analyse_dimension(data, "mobile_missing_items_rate_higher")

A third point is missing items. We want to investigate the hypothesis that the missing item rate is higher in surveys taken on smartphones than in surveys taken on PCs. We found mixed evidence for this hypothesis: it was not supported five times and supported six times. An interesting pattern is observable in the countries where the surveys were conducted. In Germany and the Netherlands, the hypothesis was validated twice; in Spain and the USA, it was both validated and not supported; in South Korea and the UK, it was not supported. This result could indicate a dependence on the users' country and on mobile device sophistication. This theory would need more research.

#### Age of mobile participants

In [None]:
age = create_data_set(data, "mobile_participants_younger")

In [None]:
analyse_dimension(data, "mobile_participants_younger")

Another dimension is the age of mobile participants. There is strong evidence that people who choose to access online surveys via smartphone are younger than people who use a PC. Thirteen surveys supported this hypothesis, and only one did not. This result seems to fit the general trend of higher smartphone usage among younger adults.

#### Gender of mobile users

In [None]:
gender = create_data_set(data, "mobile_participants_more_female")

In [None]:
analyse_dimension(data, "mobile_participants_more_female")

When analysing the gender of smartphone participants in web surveys, we find mixed results. Eight surveys supported the hypothesis that females use smartphones more often to access surveys online, and five did not. Two surveys supported the hypothesis that more male users used smartphones to access surveys. While the evidence indicates more female than male users, it cannot be proven with statistical significance. We did not identify strong trends regarding countries or target groups. As the last survey was conducted in 2017, these results need an update to reflect current conditions.

#### Education level of mobile users

In [None]:
education = create_data_set(data, "mobile_participant_education_higher")

In [None]:
analyse_dimension(data, "mobile_participant_education_higher")

Another dimension we analyse is the education level of people using smartphones to access online surveys. The six surveys that address this hypothesis are split four to two. We do not have enough data to find patterns in how these surveys were conducted. It is not clear whether this effect is still relevant, as the last survey was conducted in 2015.

This section marks the end of the file. One could use the extracted results to extend the analysis to other categories or to carry out a more quantitative meta-analysis. This would be a task for a subsequent research project.