<!-- horizontal line -->

\newpage

# Introduction <!-- 800-1,000 words -->


```{=html}
<!-- Content: Introduce the topic and explain its relevance in both academic and practical contexts. Justify the research gap and formulate a clear and insightful research question.

Key Elements:
Background of the topic.
Identification of the research gap.
Relevance to current business practices.
Research question. -->
```


With the ever-growing popularity of cellphones [@charted2023], the popularity of mobile applications is also steadily increasing. In 2024, mobile applications are estimated to generate over \$900 billion in revenue [@global2019]. Mobile applications ('*apps*' from here on) are generally grouped into three categories [@roma2016]. Paid apps are the most transparent: their revenue is based on an up-front purchase by the user. Free apps, on the other hand, require no purchase by the user at any stage. According to @roma2016, these apps make their revenue from deals with third parties, either through advertisement or for other purposes such as market information.

Finally, freemium apps are, as the name suggests, a middle ground between free and premium. Users first get access to a basic version of the application and can unlock more features through an in-app payment [@kumar2014]. Of these three revenue models, freemium is the most commonly used [@salehudin2021] and leads to more downloads as well as revenue [@liu2014].

## Academic Background

<!-- In short: What is the scientific consensus on this topic? This is explored further in the Theory and Hypotheses section.-->

Most research uses these three established categories—paid, freemium, and free—when discussing revenue models for apps. However, by limiting the discussion to these three terms, nuances within these categories might be missed.

In a review paper from 2023 [@djaruma2023], different levels of monetization are suggested based on previous literature. These levels provide a clear framework for the revenue models of mobile apps.

| Strategy | Description |
|------------------------------------|------------------------------------|
| Level 5: Premium | Pay to use the application. This either happens up-front, or after a trial period. |
| Level 4: Semi-premium | Use a limited number of features for free. Unlock the app with all features through an in-app purchase. |
| Level 3: In-app advertisement and in-app purchases | Free application with ads, encouraging users to remove ads or to make in-app purchases. |
| Level 2: Sample and premium | Two different versions of the same app. One is a version with limited features and/or ads. The other version is a premium version. |
| Level 1: In-app advertisement | Only one version of the app, with only ads and no in-app purchases. |
| Level 0: Free | The app has no monetization. However, money can still be made through selling user information. |

: Six levels of monetization for apps {#tbl-levels}

<!-- I kept this short on purpose, since we also have a more extensive literature review. But maybe this should be a bit longer after all? -Noa -->

## Societal Background

<!-- Is there any tie to practical contexts? How is this relevant to current business practices? -->

Currently, most apps utilize the freemium revenue model [@salehudin2021]. However, as discussed in @djaruma2023, there are many revenue models between completely premium and completely free. A more fine-grained classification of app revenue models beyond the traditional "paid-freemium-free" framework holds significant societal and business implications.

For society, such distinctions enhance transparency. Some monetization models, such as free or ad-filled apps, may rely on selling user information as a source of revenue [@bamberger2020]. Therefore, clearer distinctions regarding the revenue model will empower consumers to make informed choices. It may also enable policymakers to identify and regulate exploitative practices, such as manipulative microtransactions or intrusive ad models, ensuring all applications align with ethical and legal standards [@mileros2024].

For businesses, this paper should provide more insight into the effectiveness of different revenue streams, allowing developers to tailor monetization strategies to specific audiences. Furthermore, both consumers and regulatory bodies are growing more concerned about the privacy implications of apps, especially ones that rely on market information [@mileros2024]. A granular understanding helps businesses adapt, aligning profitability with sustainability and ethical considerations.

<!-- I am not entirely happy with the flow of this part; I am missing a nice conclusion. But I cannot think of one :(. This has to be added later -Noa. -->

## Research Gap

<!-- In short: What is the relevance? -->

In short, apps play an increasingly important role in our techno-centric society. To improve the user experience and increase profits, consideration of revenue models is key. Despite the great depth of research on this topic, literature tends to be focussed on the three big categories of paid, freemium, and free. This lack of nuance prevents us from understanding the fine-grained details that may help improve future apps.

<!-- What is the research gap? And what, then, is the research question? -->

The levels of monetization as proposed by @djaruma2023 would allow for this nuance. However, their framework has never been used in an empirical setting, as the paper by @djaruma2023 was published only last year. Applying this framework to see how different revenue streams impact the popularity of an app may yield valuable insights into the preferences of consumers. Therefore, the question to answer within this paper will be: *How are the six different revenue models as proposed by @djaruma2023 correlated with the success of an app?*

<!-- Again: short and sweet, or too short? -Noa -->

# Theory and Hypotheses <!-- 1,000-1,200 words -->


```{=html}
<!-- 

Comprehensive coverage of relevant literature, effectively building towards the research question.
Development of up to three hypotheses, clearly grounded in the theoretical framework.

Content: Review relevant literature to build a strong theoretical framework. Develop up to three hypotheses that directly connect to the theory and research question.

Key Elements:
Comprehensive literature review.
Explanation of key theoretical concepts.
Development of hypotheses. -->
```


In this section, prior research into revenue streams and their correlation with app success will be discussed. As mentioned in the Introduction section, this paper will apply the six levels of revenue as proposed by @djaruma2023 to app data. The following section contains an overview of the existing research, as well as the hypotheses that arise from this theoretical framework.

## Literature Review {#sec-literature-review}

To answer the question "*How are the six different revenue models as proposed by @djaruma2023 correlated with the success of an app?*", we must first define what constitutes success. In this paper, success will be defined by three factors: popularity, rating, and estimated revenue.

### Popularity {#sec-popularity}

The popularity of an app can be measured by the number of downloads. It is important to note that the popularity of an app is complex and is not solely dependent on the chosen revenue model. Other features, such as whether an app is featured on charts, whether it has frequent updates, and word-of-mouth awareness, will also impact the popularity of an app [@aydingokgoz2021]. However, despite these other variables, two versions of the same app will still perform drastically differently with different revenue streams [@liu2014].

H1a: Apps that allow the user to have free access to all features (level 0 and 1) will have the highest number of downloads overall. However, the ratings may fluctuate, as quality can vary for free-to-access apps. <!-- Is a citation needed here? -->

H1b: The apps with the most downloads will be level 1. Most social media platforms, which dominate our culture, tend to have this revenue stream [@djaruma2023].

H1c: For apps that utilize a sample and a premium version of the same app (level 2), the free version of an app will have more downloads than its paid-for counterpart. Most, if not all, users will download the free version first, and then might upgrade. This means there should be a disparity in the number of downloads between the two versions, as is also demonstrated by @liu2012freemium.

H1d: The most downloaded apps in the gaming category will likely fall under level 4. Many popular games use this type of "pay-to-win" mechanism [@nieborg2016]. Therefore, it would be expected this same pattern would arise from our data.

### Rating {#sec-rating}

The downloads of an app are not everything. An app can be downloaded often, but may not be highly rated.

H2a: Apps that require the user to pay to unlock features (level 2, 3, and 4) will tend to have lower ratings than apps that require payment upfront (level 5). The main draw of a freemium model is to attract users and have them upgrade to a paid version [@kumar2014]. However, as @kumar2014 points out, this can be a double-edged sword. Too few features, and the app may not be attractive to users. Too many features, and the users will not upgrade.

H2b: Fully premium apps (level 5) will have less variance in their ratings, while all other levels will have more. In the same vein as H2a, users have more realistic expectations of paid apps compared to apps that require you to unlock features [@kumar2014]. Therefore, more users downloading premium apps will be satisfied with their purchase, leading to less variance.

H3: For apps that utilize a sample and a premium version of the same app (level 2), the rating of the paid-for version is positively associated with the rating of the free version of the same app. This was true for the study on the most popular apps in the Google Play Store by @liu2012freemium, so it is expected a similar pattern should arise for this dataset.

### Revenue Estimation

<!-- This part can also be moved to the discussion later if that is better for the flow. -Noa -->

It is important to point out that downloads and ratings likely do not directly correlate with the actual revenue of an app. For "premium" apps that require an upfront payment, the revenue is relatively simple to track and compare. However, for apps that rely on advertisement, in-app purchases, and/or selling market information, revenue is harder to track.

For apps that rely solely on advertisement, time retention can be a good measure of revenue [@ross2018]. However, this only works if ads are the app's only revenue stream. An example given by @djaruma2023 is TikTok: this app relies not only on advertisement, but also on users purchasing products through its shop. Therefore, time retention alone would not accurately capture the revenue of an app with both revenue streams. Furthermore, the selling of user data is usually not publicized, meaning it is not possible to know the revenue from this.

Unfortunately, our data only contains the price of "premium" app versions. The data does not include any details regarding in-app purchases or time retention. Because of this lack of sufficient data, only downloads and ratings will be taken into account as indicators of success.

<!-- Officially you are only allowed 3 hypotheses, but I have subdivided them. Would this be allowed? -Noa -->

# Methods and Data <!-- 1,000-1,200 words -->


```{=html}
<!--

Detailed description of the dataset, clear explanation of variable translation.
Proper use of statistical methods, with careful consideration of assumptions and appropriate handling of violations.

Content: Describe the dataset(s) used, explain variable selection and translation, and provide a detailed explanation of the statistical methods applied. Justify the methodological approach and handle assumptions rigorously.
Key Elements:
Dataset description.
Explanation of variable selection.
Statistical methods and justification.
Handling of assumptions.
-->
```


In this section, we will discuss the dataset and methods used to test the hypotheses outlined in the previous section. The focus lies on providing a comprehensive description of the dataset, including its structure and the variables it contains, followed by an explanation of the variable selection process. Additionally, we outline the statistical methods applied and discuss how assumptions, such as missing values and potential biases, were addressed to ensure the robustness of our analysis.

## Dataset Description

<!--Describe the data: how many instances, which variables, what is the data about?-->

The dataset used for this research consists of 1,016,666 instances and 27 variables, representing a detailed overview of mobile applications across various revenue models. Each instance corresponds to an app, and the variables capture key attributes such as app downloads, user ratings, and monetization strategies. Below is an overview of some of the variables:

| Variable | Description |
|------------------------------------|------------------------------------|
| my_app_id (object) | Unique identifier for each app. |
| date_published (object) | The publication date of the app. Only three missing values (0.000295% null). |
| privacy_policy (object) | Information about the app's privacy policy, missing in 28.57% of cases. |
| rating_app (float64) | The average rating of the app, with 8.76% missing values. |
| nb_rating (object) | Number of ratings received by the app, missing in 8.76% of cases. |
| num_downloads (object) | The number of downloads for the app, nearly complete with only 15 missing values (0.001475% null). |
| price_gplay (object) | The price of the app as listed on Google Play, missing in 0.43% of cases. |
| in_app (bool) | Indicates whether the app has in-app purchases (no missing values). |
| has_ads (bool) | Indicates whether the app contains advertisements (no missing values). |
| content_rating_app (object) | The app's content rating, with three missing values (0.000295% null). |
| developer_name (object) | The name of the app developer, missing in only 16 instances (0.001574% null). |

The dataset includes several additional variables related to app features, developer information, and user engagement metrics such as visit_website, more_from_developer, and family_library. However, some variables, such as whats_new (100% null) and in_app_product (89.57% null), were deemed unsuitable for analysis due to their high proportion of missing data.

The primary purpose of this dataset in this study is to analyze app monetization strategies by categorizing apps into distinct revenue levels and evaluating their performance based on key metrics like downloads and user ratings.

## Variable Selection


```{=html}
<!-- Which variables do we use? Which ones did we remove? (Expand on this in Handling of Assumptions).

How do we define the 5 levels?-->
```


The dataset utilized in this study consists of 1,016,666 entries, encompassing a broad range of attributes related to mobile applications. For the purpose of our analysis, 13 variables were selected, capturing critical information about app characteristics, user engagement, monetization strategies, and developer details. These variables include the app's unique identifier (my_app_id), the total number of downloads (num_downloads), average user ratings (rating_app), and the number of ratings (nb_rating). Additionally, the dataset provides information on app pricing (price_gplay), the presence of in-app purchases (in_app), and whether the app includes advertisements (has_ads). Other variables, such as content ratings (content_rating_app), app categories (categ_app), and developer information (developer_name and developer_info), further enhance the richness of the dataset. This subset of variables allows us to comprehensively examine the interplay between monetization strategies and app success.

To systematically explore monetization strategies, we classified the apps into six distinct levels based on their monetization models. These levels reflect varying approaches to generating revenue, ranging from completely free apps to fully premium paid apps.

Level 0 represents apps with no monetization, offering free services without ads or in-app purchases. At the opposite end, Level 5 includes premium apps requiring upfront payment, free from ads or in-app purchases, delivering a premium experience.

In between, Level 1 consists of free apps monetized solely through ads, while Level 3 combines ads and in-app purchases, offering additional features for users willing to pay. Level 4 refines the freemium model by removing ads and relying entirely on in-app purchases to monetize.

Level 2 employs a dual-version strategy, featuring both free sample apps with limited functionality (and potentially ads) and paid premium apps with comprehensive features and no ads or in-app purchases.

This classification is grounded in theoretical frameworks, such as the monetization levels proposed by @djaruma2023 and the app business models of (CITE), and allows for a nuanced analysis of how different revenue models impact app success metrics like user ratings and downloads. Our systematic categorization facilitates a deeper understanding of the relationship between monetization strategies and app performance.

## Statistical Methods

To test our hypotheses, we employed a combination of descriptive statistics, text processing, and machine learning techniques. Descriptive statistics were utilized to analyze distributions and trends in metrics such as num_downloads, rating_app, and price_gplay. We categorized applications into six monetization levels based on binary indicators: is_free, in_app, and has_ads. Price values were processed to distinguish between free and paid applications.
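The mapping from the three binary indicators to a level can be sketched as follows. This is a minimal illustration based on the level definitions in the Variable Selection section, not necessarily the exact implementation used in the analysis. The function name `classify_level` and the string labels are assumptions made for the example. Level 2 is left out on purpose, because it depends on pairing a sample app with its premium counterpart rather than on the flags alone.

```python
from typing import Optional


def classify_level(is_free: bool, in_app: bool, has_ads: bool) -> Optional[str]:
    """Map the three binary indicators to a monetization level label.

    Returns None for combinations that fit none of the predefined levels
    (for example, a paid app that also shows ads). Level 2 is not assigned
    here, since it requires matching sample and premium versions by name.
    """
    if is_free:
        if not in_app and not has_ads:
            return "0"  # free, no monetization visible in the flags
        if has_ads and not in_app:
            return "1"  # free, ads only
        if has_ads and in_app:
            return "3"  # free, ads and in-app purchases
        return "4"      # free, in-app purchases without ads
    if not in_app and not has_ads:
        return "5"      # paid up-front, no ads, no in-app purchases
    return None         # paid app with ads and/or in-app purchases
```

Applied row by row, such a function would produce a `level` column. Rows for which it returns `None` correspond to the rare combinations that were excluded from the analysis.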

To identify paired sample and premium applications within level 2, we applied Term Frequency-Inverse Document Frequency (TF-IDF) vectorization combined with cosine similarity on application names. This approach is effective for measuring textual similarity between documents (CITE Source 2). Subsequent filtering involved prefix matching and the identification of indicative terms (e.g., "Free," "Pro") to ensure logical pairing based on naming conventions and shared developers.
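To make the name-matching idea concrete, the sketch below computes TF-IDF vectors for a few app names and compares them with cosine similarity. It is a hand-rolled, standard-library illustration of the technique and not the pipeline used in this study. The word-level tokenizer, the smoothed IDF variant, and the example app names are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def tokenize(name):
    """Lowercase an app name and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", name.lower())


def tfidf_vectors(names):
    """Return one sparse TF-IDF vector (a dict token -> weight) per name."""
    docs = [Counter(tokenize(name)) for name in names]
    n_docs = len(docs)
    # Document frequency: in how many names does each token occur?
    doc_freq = Counter(token for doc in docs for token in doc)
    # Smoothed inverse document frequency (one common variant, an assumption here).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    return [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(token, 0.0) for token, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical app names, for illustration only.
example_names = ["Photo Editor Free", "Photo Editor Pro", "Chess Master"]
example_vectors = tfidf_vectors(example_names)
similar_pair = cosine_similarity(example_vectors[0], example_vectors[1])
unrelated_pair = cosine_similarity(example_vectors[0], example_vectors[2])
```

In this toy example the two "Photo Editor" names share most of their tokens and therefore score higher than the unrelated pair, which shares none. Candidate pairs above a chosen similarity threshold would then still need the prefix, keyword, and developer checks described above.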

To adjust ratings for applications with few reviews, we calculated a Bayesian average. This method provides a more robust measure of user satisfaction by accounting for the number of ratings and the overall average rating across all applications (CITE Source 3). Visualizations, including scatter plots and box plots, were employed to explore relationships between monetization levels and user engagement metrics; these are displayed in the Results section.
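A Bayesian average of this kind can be sketched as a weighted mix of an app's own rating and the overall mean rating. The snippet below is a minimal illustration under stated assumptions: the function name, the parameter names, and the default `prior_weight` of 50 are hypothetical and are not values taken from this study.

```python
def bayesian_average(rating, n_ratings, global_mean, prior_weight=50.0):
    """Shrink an app's average rating towards the global mean.

    rating       -- the app's own average rating
    n_ratings    -- number of ratings the app received
    global_mean  -- average rating across all apps
    prior_weight -- how many "virtual" ratings at the global mean to add
                    (hypothetical default, must be positive)
    """
    if prior_weight <= 0:
        raise ValueError("prior_weight must be positive")
    if n_ratings < 0:
        raise ValueError("n_ratings cannot be negative")
    return (prior_weight * global_mean + n_ratings * rating) / (prior_weight + n_ratings)
```

With few ratings the result stays close to `global_mean`, while for apps with many ratings it approaches the app's own average. A larger `prior_weight` shrinks low-volume apps more strongly.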

### Handling of Assumptions

We addressed missing values by removing rows with critical nulls, such as those in num_downloads, to maintain data integrity. Text-based variables like content_rating_app were standardized to ensure consistency. For price_gplay, currency symbols were removed to facilitate the classification of applications into free or paid categories.
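The price cleaning step can be illustrated with a small helper. This is only a sketch: it assumes that prices use a dot as decimal separator and that a literal "Free" may appear as a value, neither of which is confirmed by the dataset description. The function names `parse_price` and `is_free_app` are hypothetical.

```python
import re


def parse_price(raw):
    """Convert a raw price string such as "$2.99" to a float.

    Returns None when no usable number can be extracted.
    Assumes a dot as decimal separator (an assumption for this sketch).
    """
    if raw is None:
        return None
    text = str(raw).strip()
    if text.lower() == "free":  # assumed spelling of a free listing
        return 0.0
    cleaned = re.sub(r"[^0-9.]", "", text)  # drop currency symbols and spaces
    try:
        return float(cleaned)
    except ValueError:
        return None


def is_free_app(raw):
    """True for a zero price, False for a positive price, None if unknown."""
    price = parse_price(raw)
    return None if price is None else price == 0.0
```

Values that cannot be parsed are returned as `None` rather than being silently treated as free, so that they can be inspected or dropped explicitly.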

Outliers in metrics like num_downloads were retained if they represented industry-leading applications, as their exclusion could skew the analysis. The use of Bayesian averages mitigated bias in rating_app due to low review counts, providing a more accurate reflection of user satisfaction. Covariance checks were conducted to ensure the absence of multicollinearity among numerical variables, thereby enhancing the reliability of correlation and regression analyses.
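A check of this kind can be sketched with pairwise Pearson correlations between numerical variables. The example below is an illustration only: the 0.8 threshold and the function names are assumptions, and the study's own check may have been carried out differently.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def flag_collinear(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    columns   -- dict mapping a variable name to its list of values
    threshold -- hypothetical cut-off for "strongly correlated"
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:  # NaN never passes this comparison
            flagged.append((name_a, name_b, r))
    return flagged
```

Pairs flagged this way would be candidates for dropping or combining one of the two variables before running a regression.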

Some applications exhibited rare combinations of is_free, in_app, and has_ads that did not fit within the predefined monetization levels. These applications were excluded from the analysis but documented as a limitation. Edge cases in level 2 application pairing were flagged for potential mismatches due to naming ambiguities, ensuring transparency in the classification process.

These methodologies facilitated a systematic and accurate exploration of monetization models and their impact on application performance.

# Results <!-- 800-1,000 words -->


```{=html}
<!-- Clear presentation of results, including descriptive statistics and relevant tables/visualizations.
Effective interpretation of results, linked to hypotheses and research question.

Content: Present the results clearly, including descriptive statistics, tables, and visualizations. Interpret the results and link them back to the hypotheses and research question.

Key Elements:
Clear presentation of results.
Descriptive statistics.
Visualizations (e.g., graphs, charts).
Interpretation of results.

!! For full marks, make sure you refer back to the hypotheses !!
-->
```


In this section, we present the data through tables and figures. These largely explore the data in light of the hypotheses and research question discussed in the previous sections. The aim of this section is to present possible evidence for or against each hypothesis. The evidence will be discussed, and conclusions drawn, in the next section.

Before we look at the results, let us revisit the research question: "*How are the 6 different revenue models as proposed by @djaruma2023 correlated with the success of an app?*". We identify 6 different revenue models, described in this section as (revenue) levels. For reference, see @tbl-levels.

The dataset is divided into these levels. More than 75% of all apps belong to level 0 and 1 (see @fig-distribution-amount-apps-levels). The smallest group is level 2, which consists of two different versions of the same app.


In [None]:
#| echo: false
#| warning: false
#| error: false
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from IPython.display import Markdown
from tabulate import tabulate

df_level_combined = pd.read_csv('./data/google_data_levels_combined.csv', dtype={16: 'string'})
df_mapping_level_2 = pd.read_csv('./data/mapping_level_2.csv', dtype={16: 'string'})

In [None]:
#| echo: false
#| warning: false
#| error: false
#| label: fig-distribution-amount-apps-levels
#| fig-cap: Distribution of the amount of apps across the revenue levels

# Assuming df_level_combined['level'] exists
level_counts = df_level_combined['level'].value_counts()
level_counts = level_counts.reindex(["0", "1", "2 (Sample)", "2 (Premium)", "3", "4", "5"])  # Specify the order

# Creating labels for the legend with counts and percentages
total = level_counts.sum()
labels = [f"{level} - {count} ({count / total:.1%})" for level, count in level_counts.items()]

# Plotting the pie chart
plt.figure(figsize=(5, 5))
wedges, texts = plt.pie(
    level_counts, 
    labels=None,  # Hide labels on the pie chart itself
    startangle=90, 
    colors=plt.cm.Paired.colors
)

# Adding the custom legend
plt.legend(wedges, labels, title="Levels", loc="center left", bbox_to_anchor=(1, 0, 0.5, 1))

plt.show()


As discussed in @sec-literature-review, the success of an app can be measured through its popularity, rating, and estimated revenue. Popularity is measured by the number of downloads, and rating by the average rating given by users. (Note: revenue estimation is excluded, because it cannot be measured with the data.) These metrics are used to evaluate the hypotheses (see @sec-literature-review).

## Number of Downloads

@fig-total-average-downloads-by-level provides insights into the number of app downloads categorized by revenue level. @fig-total-average-downloads-by-level-1 displays the total number of downloads, while @fig-total-average-downloads-by-level-2 displays the average number of downloads. In the following three subsections, we will look into the key takeaways from these two graphs:

-   Distribution of app downloads: how the downloads are distributed among the revenue levels.

-   Gaming apps: analyzing the importance of gaming apps.

-   Free vs Paid: comparing free and paid apps.


In [None]:
#| label: fig-total-average-downloads-by-level
#| fig-cap: Comparison of Total and Average Downloads by Level
#| fig-subcap:
#|   - Total Number of Downloads by Level
#|   - Average Number of Downloads by Level
#| layout-ncol: 2
#| echo: false
#| warning: false
#| error: false

df_h1b = df_level_combined.copy()

# Calculate the sum of downloads for each level
total_downloads_h1b = df_h1b.groupby('level')['num_downloads'].sum().reset_index()

# Define the order of the levels
level_order = ["0", "1", "2 (Sample)", "2 (Premium)", "3", "4", "5"]

# First bar chart: Total Number of Downloads
sns.barplot(
    data=total_downloads_h1b,
    x='level',
    y='num_downloads',
    hue='level',
    palette='viridis',
    dodge=False,
    order=level_order
)
plt.xlabel('Level')
plt.ylabel('Total Number of Downloads')
plt.xticks(rotation=90)  # Rotate x-axis labels
plt.tight_layout()  # Adjust layout to avoid overlap
plt.show()

# Second bar chart: Average Number of Downloads
sns.barplot(
    data=df_h1b,
    x='level',
    y='num_downloads',
    hue='level',
    palette='viridis',
    dodge=False,
    order=level_order
)
plt.xlabel('Level')
plt.ylabel('Average Number of Downloads')
plt.xticks(rotation=90)  # Rotate x-axis labels
plt.tight_layout()  # Adjust layout to avoid overlap
plt.show()

### Distribution of app downloads

There is a large difference between the total and average number of downloads for level 0 and 1, shown in @fig-total-average-downloads-by-level-1 and @fig-total-average-downloads-by-level-2 respectively. This is largely because more than 75% of all apps fall under these levels (see @fig-distribution-amount-apps-levels). Hypothesis 1b (see @sec-popularity) claims that apps in level 1 are downloaded the most, as most social media platforms, which dominate our culture, use this revenue stream. At first glance, this claim does not appear to hold.

@fig-distribution-app-downloads provides insight into the inequality of the distribution. App downloads are highly concentrated, meaning a small proportion of the apps accounts for the vast majority of the downloads. @fig-distribution-app-downloads-1 shows roughly the same inequality as @fig-distribution-app-downloads-2. This indicates that a small number of apps dominates the download numbers in level 1, although the same pattern may also be present in the other levels.
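One simple way to put a number on this concentration is the share of all downloads captured by the top fraction of apps. The helper below is a hypothetical sketch to illustrate the idea; it is not part of the analysis code, and the function name `top_share` is an assumption.

```python
import math


def top_share(downloads, top_fraction):
    """Share of total downloads held by the top `top_fraction` of apps.

    downloads    -- iterable of download counts, one per app
    top_fraction -- fraction of apps to consider, in the interval (0, 1]
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(downloads, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0.0  # no downloads at all (or no apps)
    # Always include at least one app, rounding the group size up.
    k = max(1, math.ceil(top_fraction * len(ordered)))
    return sum(ordered[:k]) / total
```

A value close to 1 for a small `top_fraction` indicates a top-heavy distribution, whereas a perfectly equal distribution would return a share equal to the fraction itself.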


In [None]:
#| label: fig-distribution-app-downloads
#| fig-cap: Distribution of App Downloads (Top-Heavy Analysis)
#| fig-subcap:
#|   - All levels except 1
#|   - Level 1 Only
#| layout-ncol: 2
#| echo: false
#| warning: false
#| error: false

df_h1b = df_level_combined.copy()

# convert float to int for num_downloads
df_h1b['num_downloads'] = df_h1b['num_downloads'].astype(int)

# relevant columns
relevant_columns_h1b = ['my_app_id', 'num_downloads', 'categ_app', 'level', 'developer_name']

# filter the relevant columns
df_h1b = df_h1b[relevant_columns_h1b]

# Filter the dataframe to exclude level '1'
df_no_level_1 = df_h1b[df_h1b['level'] != '1'].copy()

# Sort the dataframe by 'num_downloads' in descending order
df_no_level_1_sorted = df_no_level_1.sort_values(by='num_downloads', ascending=False)

# Calculate the cumulative percentage of downloads
df_no_level_1_sorted['cumulative_percentage_downloads'] = (
    df_no_level_1_sorted['num_downloads'].cumsum() / df_no_level_1_sorted['num_downloads'].sum() * 100
)

# Create a Lorenz curve-style plot for all levels except '1'
plt.plot(
    range(1, len(df_no_level_1_sorted) + 1),
    df_no_level_1_sorted['cumulative_percentage_downloads'],
    label='Cumulative Downloads (All Levels Except 1)'
)
plt.plot(
    [1, len(df_no_level_1_sorted)],
    [0, 100],
    linestyle='--',
    color='gray',
    label='Equality Line'
)

# Add labels and legend
plt.xlabel('Apps (sorted by downloads)')
plt.ylabel('Cumulative Percentage of Downloads')
plt.legend()
plt.grid(True)
plt.show()


# Filter the dataframe for level '1'
df_level_1 = df_h1b[df_h1b['level'] == '1'].copy()

# Sort the dataframe by 'num_downloads' in descending order
df_level_1_sorted = df_level_1.sort_values(by='num_downloads', ascending=False)

# Calculate the cumulative percentage of downloads
df_level_1_sorted['cumulative_percentage_downloads'] = df_level_1_sorted['num_downloads'].cumsum() / df_level_1_sorted['num_downloads'].sum() * 100

# Create a Lorenz curve-style plot for level '1'
plt.plot(
    range(1, len(df_level_1_sorted) + 1),
    df_level_1_sorted['cumulative_percentage_downloads'],
    label='Cumulative Downloads (Level 1)'
)
plt.plot(
    [1, len(df_level_1_sorted)],
    [0, 100],
    linestyle='--',
    color='gray',
    label='Equality Line'
)

# Add labels and legend
plt.xlabel('Apps (sorted by downloads)')
plt.ylabel('Cumulative Percentage of Downloads')
plt.legend()
plt.grid(True)
plt.show()

The download ranges in @fig-num-downloads-by-level-log illustrate that level 1 dominates in terms of apps with over 1 billion downloads. Level 0 also has a few apps with over 1 billion downloads.


In [None]:
#| label: fig-num-downloads-by-level-log
#| fig-cap: Number of Downloads by Level (Log Scale)
#| echo: false
#| warning: false
#| error: false

downloads_across_level = df_level_combined.copy()

# Define the bins and labels
bins = [0, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000, np.inf]
labels = ['0-100', '101-1k', '1k-10k', '10k-100k', '100k-1M', '1M-10M', '10M-100M', '100M-1B', '1B+']

# Create a new column 'downloads_bin' based on the number of downloads
downloads_across_level['downloads_bin'] = pd.cut(
    downloads_across_level['num_downloads'], bins=bins, labels=labels,
    include_lowest=True  # keep apps with exactly 0 downloads in the '0-100' bin
)

# Group by 'level' and 'downloads_bin' to get the count of apps per download bin for each level
grouped = downloads_across_level.groupby(['level', 'downloads_bin']).size().unstack(fill_value=0)

# Plotting the bar chart with log scale
grouped.plot(kind='bar', stacked=False, figsize=(12, 8), logy=True)
plt.xlabel('Level')
plt.ylabel('Number of Apps (Log Scale)')
plt.xticks(rotation=90)
plt.legend(title='Download Range', bbox_to_anchor=(1.05, 1), loc='upper left')
plt.tight_layout()
plt.show()


The app categories Communication and Social can be classified as social platforms. As shown in @fig-num-downloads-1-billion, together they account for 40% of all level 1 apps with more than 1 billion downloads.


In [None]:
#| label: fig-num-downloads-1-billion
#| fig-cap: 1 Billion+ Downloads in Level 1
#| echo: false
#| warning: false
#| error: false

downloads_across_level = df_level_combined.copy()
downloads_across_level = downloads_across_level[downloads_across_level['level'] == '1'].copy()

# Add '1B Downloads' column
downloads_across_level['1B Downloads'] = downloads_across_level['num_downloads'] >= 1_000_000_000

# Filter for apps with 1 billion or more downloads
billion_downloads = downloads_across_level[downloads_across_level['1B Downloads']]

# Preparing data
billion_downloads_summary = billion_downloads['categ_app'].value_counts().reset_index()
billion_downloads_summary.columns = ['Category', 'Number of Apps']

# Sort and place "Social" next to "Communication"
custom_order = ['Communication', 'Social'] + [
    cat for cat in billion_downloads_summary['Category'] if cat not in ['Communication', 'Social']
]
billion_downloads_summary = billion_downloads_summary.set_index('Category').loc[custom_order].reset_index()

# Data for the pie chart
sizes = billion_downloads_summary['Number of Apps']
labels = billion_downloads_summary['Category']

# Plotting the pie chart
plt.figure(figsize=(6, 6))  # Adjust the size of the pie chart
wedges, texts, autotexts = plt.pie(
    sizes, 
    labels=None,  # Hide labels on the pie chart itself
    autopct='%1.1f%%',  # Display percentages
    startangle=90, 
    colors=plt.cm.Paired(np.linspace(0, 1, len(labels)))  # Distinct colors
)

# Customize percentage text size and color
for autotext in autotexts:
    autotext.set_color('white')
    autotext.set_fontsize(10)

# Adding the custom legend
plt.legend(
    wedges, 
    labels, 
    title="Categories", 
    loc="center left", 
    bbox_to_anchor=(1, 0, 0.5, 1)  # Custom legend position
)

# Display the chart
plt.tight_layout()
plt.show()

Taking a closer look at the apps listed by @djaruma2023 in @tbl-top-8-apps-paper, we see that Facebook, Instagram, Spotify, Snapchat and Amazon Shopping fall under level 1. (Disclaimer: TikTok didn't exist up until 2019.)


In [None]:
#| label: tbl-top-8-apps-paper
#| tbl-cap: Top 8 Apps by Downloads (in Billions)
#| echo: false
#| warning: false
#| error: false


df_h1b = df_h1b[df_h1b['my_app_id'].isin(['com.amazon.mShop.android.shopping', 'com.instagram.android', 'com.facebook.lite', 'com.netflix.mediaclient', 'com.snapchat.android', 'com.spotify.music', 'com.whatsapp', 'com.facebook.orca'])]

df_h1b = df_h1b[['my_app_id', 'num_downloads', 'categ_app', 'level']]

df_h1b['num_downloads'] = df_h1b['num_downloads'] / 1_000_000_000

app_mapping = {
    'com.amazon.mShop.android.shopping': 'Amazon Shopping',
    'com.instagram.android': 'Instagram',
    'com.facebook.lite': 'Facebook Lite',
    'com.netflix.mediaclient': 'Netflix',
    'com.snapchat.android': 'Snapchat',
    'com.spotify.music': 'Spotify',
    'com.whatsapp': 'WhatsApp',
    'com.facebook.orca': 'Facebook Messenger'
}

# Map app names to the 'my_app_id' column
df_h1b['app_name'] = df_h1b['my_app_id'].map(app_mapping)

# Reorder columns for better readability
df_h1b = df_h1b[['app_name', 'num_downloads', 'categ_app', 'level']]


# Rename columns as per requirement
df_h1b = df_h1b.rename(columns={
    'app_name': 'App Name',
    'num_downloads': 'Downloads (in B)',
    'level': 'Level',
    'categ_app': 'Category'
})

df_h1b.sort_values(by='Downloads (in B)', ascending=False, inplace=True)
df_h1b.reset_index(drop=True, inplace=True)
Markdown(tabulate(df_h1b, headers=df_h1b.columns))

H1b: *The apps with the most downloads will be level 1. Most social media platforms, which dominate our culture, tend to have this revenue stream [@djaruma2023].*

It can be concluded that the most downloaded apps indeed fall under level 1 (see @fig-total-average-downloads-by-level-1). This requires a nuanced view, however: 50% of the apps fall under level 1 (see @fig-distribution-amount-apps-levels), and its average number of downloads is relatively low compared to the other levels (see @fig-total-average-downloads-by-level-2). Understanding the popularity of level 1 therefore requires further analysis. @fig-distribution-app-downloads shows high inequality: a small proportion of the apps accounts for the majority of the downloads. Furthermore, @fig-num-downloads-by-level-log shows that level 1 apps are among the most downloaded apps, in particular in the categories under which social media platforms fall (see @fig-num-downloads-1-billion). Looking at the apps provided by @djaruma2023, there is enough evidence to support H1b (see @tbl-top-8-apps-paper).
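The inequality in downloads mentioned above can be expressed as the share of total downloads held by the top fraction of apps. The following is a minimal sketch under stated assumptions: the download counts are hypothetical, and the helper `top_share` is our own illustration rather than part of the analysis pipeline.

```python
import math


def top_share(downloads, fraction):
    """Share of total downloads held by the top `fraction` of apps."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(downloads, reverse=True)
    # Always include at least one app, even for very small fractions.
    k = max(1, math.ceil(len(ordered) * fraction))
    total = sum(ordered)
    return sum(ordered[:k]) / total if total else 0.0


# Hypothetical download counts: one blockbuster, a few mid-size apps, many small ones.
downloads = [1_000_000_000] + [10_000_000] * 9 + [100_000] * 90

print(f"Top 1% of apps:  {top_share(downloads, 0.01):.1%} of downloads")
print(f"Top 10% of apps: {top_share(downloads, 0.10):.1%} of downloads")
```

With a distribution this skewed, a level can contain the most downloaded apps while still having a modest average, which is the pattern described for level 1.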

### Gaming apps

1.  **Refer to Figure 2.**
2.  **If this happens –\> explain why you see it that way.**
3.  **Not clear why it happens.**
4.  **Introduce the hypothesis.**
5.  **It may be related to this (plots). (Testing the hypothesis)**
6.  **Hence the conclusion.**

**Important takeaway:** Revenue levels 3 and 4 stand out with high average app downloads, which suggests that they are the most popular revenue levels. They have two things in common: they are free and contain in-app purchases. The only difference is that level 3 apps also contain ads.

This takeaway aligns well with hypothesis 1d, which takes a closer look at these two revenue levels:

*The most downloaded apps in the gaming category will likely fall under level 4. Many popular games use this type of "pay-to-win" mechanism [@nieborg2016]. Therefore, it would be expected this same pattern would arise from our data.*

Games are hugely popular on the Google Play Store. The graphs below show that they are responsible for more than 25% of the total downloads while making up only 15% of all apps.


In [None]:
#| echo: false
#| warning: false
#| error: false

all_apps = df_level_combined.copy()

gaming_categories = [
    "Puzzle", "Casual", "Arcade", "Simulation", "Action", "Adventure", 
    "Trivia", "Racing", "Educational", "Card", "Word", "Board", 
    "Casino", "Role Playing", "Strategy", "Brain Games", 
    "Action & Adventure", "Pretend Play"
]

# Filter other than gaming apps
other_apps = all_apps[~all_apps['categ_app'].isin(gaming_categories)]


df_level_combined_h1d = df_level_combined.copy()

# relevant columns
relevant_columns_h1d = ['my_app_id', 'num_downloads', 'level', 'categ_app']

# filter the relevant columns
df_level_combined_h1d = df_level_combined_h1d[relevant_columns_h1d]

# Reuse the gaming_categories list defined at the top of this cell

# Filter on games category
df_level_combined_h1d = df_level_combined_h1d[df_level_combined_h1d['categ_app'].isin(gaming_categories)]

gaming_apps = df_level_combined_h1d.copy()

# relevant columns
relevant_columns_other_apps = ['my_app_id', 'num_downloads', 'level', 'categ_app']

# filter the relevant columns
other_apps = other_apps[relevant_columns_other_apps]

# Calculate the sum of downloads for each category
total_downloads_gaming_apps = gaming_apps['num_downloads'].sum()
total_downloads_other_apps = other_apps['num_downloads'].sum()
avg_downloads_gaming_apps = gaming_apps['num_downloads'].mean()
avg_downloads_other_apps = other_apps['num_downloads'].mean()
total_amount_gaming_apps = len(gaming_apps)
total_amount_other_apps = len(other_apps)

# Creating a new dataframe to represent the totals
summary_data = {
    "Category": ["Gaming Apps", "Other Apps"],
    "Total Downloads": [total_downloads_gaming_apps, total_downloads_other_apps],
    "Total Amount": [total_amount_gaming_apps, total_amount_other_apps],
    "Average Downloads": [avg_downloads_gaming_apps, avg_downloads_other_apps]

}
df_summary = pd.DataFrame(summary_data)

# Creating pie charts with a legend next to each other
fig, axs = plt.subplots(1, 2, figsize=(16, 8))

fig.suptitle('Figure 6: Proportion of Total Downloads and Number of Apps: Gaming vs Other Categories', fontsize=16)

# Pie chart for Total Downloads
wedges1, texts1, autotexts1 = axs[0].pie(
    df_summary['Total Downloads'], labels=df_summary['Category'], autopct='%1.1f%%', startangle=140
)
axs[0].set_title('Proportion of Total Downloads: Gaming vs Other Apps', fontsize=14)

# Pie chart for Total Amount
wedges2, texts2, autotexts2 = axs[1].pie(
    df_summary['Total Amount'], labels=df_summary['Category'], autopct='%1.1f%%', startangle=140
)
axs[1].set_title('Proportion of Total Number of Apps: Gaming vs Other Apps', fontsize=14)

# Adding legends
axs[0].legend(wedges1, df_summary['Category'], title="Categories", loc="upper right", bbox_to_anchor=(1.2, 1))
axs[1].legend(wedges2, df_summary['Category'], title="Categories", loc="upper right", bbox_to_anchor=(1.2, 1))

# Adjust layout
plt.tight_layout()
plt.show()

The table below compares the average downloads of gaming apps with those of apps in other categories. On average, gaming apps are downloaded more often than apps from other categories, with an average that is nearly 100% higher. This indicates that gaming apps are hugely popular on the Google Play Store.
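A percentage difference always depends on which group serves as the reference, so "100% higher" in one direction does not mean "100% lower" in the other. The sketch below makes this explicit. The group means are hypothetical round numbers, and `pct_diff` is an illustrative helper, not a function from our notebook.

```python
def pct_diff(value, reference):
    """Percentage difference of `value` relative to `reference`."""
    return (value - reference) / reference * 100


# Hypothetical group means, chosen so that gaming is exactly double.
gaming_avg = 200_000
other_avg = 100_000

print(pct_diff(gaming_avg, other_avg))  # gaming relative to other
print(pct_diff(other_avg, gaming_avg))  # other relative to gaming
```

When one average is double the other, it is 100% higher, while the smaller average is only 50% lower than the larger one.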


In [None]:
#| echo: false
#| warning: false
#| error: false

# Combine both datasets
all_apps = pd.concat([gaming_apps, other_apps])

# Calculate averages
gaming_avg = gaming_apps['num_downloads'].mean()
other_avg = other_apps['num_downloads'].mean()
overall_avg = all_apps['num_downloads'].mean()

# Percentage difference between gaming apps and overall average
gaming_vs_baseline_difference = ((gaming_avg - overall_avg) / overall_avg) * 100

# Percentage difference between other apps and overall average
other_vs_baseline_difference = ((other_avg - overall_avg) / overall_avg) * 100

# Percentage difference between gaming apps and other apps
gaming_vs_other_difference = ((gaming_avg - other_avg) / other_avg) * 100

# Create a summary table with shorter column names
summary_table = pd.DataFrame({
    'Category': ['Gaming', 'Other', 'Baseline'],
    'Avg Downloads': [gaming_avg, other_avg, overall_avg],
    '% Diff vs Baseline': [
        gaming_vs_baseline_difference, 
        other_vs_baseline_difference, 
        0  # Baseline itself, so difference is 0%
    ],
    '% Diff Gaming vs Other': [
        gaming_vs_other_difference,
        # Other relative to gaming; this is not simply the negated gaming-vs-other value
        ((other_avg - gaming_avg) / gaming_avg) * 100,
        None  # No comparison for the baseline
    ]
})

# Round the values for better readability
summary_table = summary_table.round(2)

summary_table


In [None]:
#| echo: false
#| warning: false
#| error: false

df_level_combined_h1d = df_level_combined.copy()

# relevant columns
relevant_columns_h1d = ['my_app_id', 'num_downloads', 'level', 'categ_app']

# filter the relevant columns
df_level_combined_h1d = df_level_combined_h1d[relevant_columns_h1d]

# Creating a list of categories that can be considered as part of the gaming category
gaming_categories = [
    "Puzzle", "Casual", "Arcade", "Simulation", "Action", "Adventure", 
    "Trivia", "Racing", "Educational", "Card", "Word", "Board", 
    "Casino", "Role Playing", "Strategy", "Brain Games", 
    "Action & Adventure", "Pretend Play"
]

# Filter on games category
df_level_combined_h1d = df_level_combined_h1d[df_level_combined_h1d['categ_app'].isin(gaming_categories)]

# barplot the number of downloads for each level
plt.figure(figsize=(5, 3))
sns.barplot(data=df_level_combined_h1d, x='level', y='num_downloads')
plt.title('Figure 7: Average Number of Downloads for Gaming Apps by Level', fontsize=14)
plt.xlabel('Level')
plt.ylabel('Average Number of Downloads')
plt.xticks(rotation=90)  # Rotate x-axis labels
plt.show()

Filtering on the gaming categories shows that the average number of downloads is also considerably higher in levels 3 and 4, with level 4 showing a higher variance than level 3.

To explore this further, the table below shows the top 5 most downloaded games from each of the two revenue levels. It reveals a notable difference in spread between them. In relative terms, the spread in level 4 is much larger than in level 3, even among the top 5 alone: downloads in level 4 range from 0.1 to 0.5 billion (a factor of five), while downloads in level 3 range from 0.5 to 1.0 billion (a factor of two).


In [None]:
#| echo: false
#| warning: false
#| error: false

# Filter for level 3, sort by downloads, and pick top 5
top_apps_level_3 = df_level_combined_h1d[df_level_combined_h1d['level'] == '3'] \
    .sort_values(by='num_downloads', ascending=False) \
    .head(5)

app_name_mapping = {
    "com.kiloo.subwaysurf": "Subway Surfers",
    "com.fingersoft.hillclimb": "Hill Climb Racing",
    "com.imangi.templerun2": "Temple Run 2",
    "com.outfit7.mytalkingtomfree": "My Talking Tom",
    "me.pou.app": "Pou"
}

# Convert num_downloads to per billion
top_apps_level_3['num_downloads'] = top_apps_level_3['num_downloads'] / 1_000_000_000

# Mapping the app column to their normal names
top_apps_level_3['App Name'] = top_apps_level_3['my_app_id'].map(app_name_mapping)

# Rename columns as per requirement
top_apps_level_3 = top_apps_level_3.rename(columns={
    'num_downloads': 'Downloads (in Billions)',
    'categ_app': 'Gaming Category'
})

top_apps_level_3.reset_index(drop=True, inplace=True)
top_apps_level_3 = top_apps_level_3[['App Name', 'Downloads (in Billions)', 'level']]

# Filter for level 4, sort by downloads, and pick top 5
top_apps_level_4 = df_level_combined_h1d[df_level_combined_h1d['level'] == '4'] \
    .sort_values(by='num_downloads', ascending=False) \
    .head(5)

additional_app_name_mapping = {
    "com.king.candycrushsaga": "Candy Crush Saga",
    "com.supercell.clashofclans": "Clash of Clans",
    "com.king.petrescuesaga": "Pet Rescue Saga",
    "com.king.farmheroessaga": "Farm Heroes Saga",
    "com.king.candycrushsodasaga": "Candy Crush Soda Saga"
}

# Convert num_downloads to per billion
top_apps_level_4['num_downloads'] = top_apps_level_4['num_downloads'] / 1_000_000_000

# Mapping the app column to their normal names
top_apps_level_4['App Name'] = top_apps_level_4['my_app_id'].map(additional_app_name_mapping)

# Rename columns as per requirement
top_apps_level_4 = top_apps_level_4.rename(columns={
    'num_downloads': 'Downloads (in Billions)',
    'categ_app': 'Gaming Category'
})

top_apps_level_4.reset_index(drop=True, inplace=True)
top_apps_level_4 = top_apps_level_4[['App Name', 'Downloads (in Billions)','level']]

# Combine the tables
combined_tables = pd.concat([top_apps_level_3, top_apps_level_4])
combined_tables.sort_values(by='Downloads (in Billions)', ascending=False, inplace=True)
combined_tables.reset_index(drop=True, inplace=True)
combined_tables

### Free vs Paid

1.  **Refer to Figure 2.**
2.  **If this happens –\> explain why you see it that way.**
3.  **Not clear why it happens.**
4.  **Introduce the hypothesis.**
5.  **It may be related to this (plots). (Testing the hypothesis)**
6.  **Hence the conclusion.**

**Key takeaways:**

-   Revenue levels 2 (premium) and 5 have the lowest total and average app downloads. These apps must be paid for at download.

-   Within revenue level 2, sample and premium apps show a large gap in both average and total downloads.

We see a high disparity in downloads between free(mium) and paid apps. To investigate this further, we turn to hypotheses 1a and 1c.

*H1a: Apps that allow the user to have free access to all features (level 0 and 1) will have the highest amount of downloads overall. However, the ratings may fluctuate, as quality can vary for free-to-access apps.* <!-- Is a citation needed here? -->

*H1c: For apps that utilize a sample and a premium version of the same app (level 2), the free versions of an app will have more downloads than their paid-for counterpart. Most, if not all, users will download the free version first, and then might upgrade. This means there should be a disparity between the number of downloads between the apps, as is also demonstrated by @liu2012freemium.*

The graph below illustrates that levels 0 and 1 together indeed have the highest overall downloads, exceeding the other levels combined. (Apps that fall under levels 0 and 1 offer free access to all features.)


In [None]:
#| echo: false
#| warning: false
#| error: false

df_h1a = df_level_combined.copy()

# Label levels 0 and 1 as 'Free all feature apps' and all other levels as 'Free/Paid not all feature apps'
df_h1a['category'] = ['Free all feature apps' if level in ['0', '1'] else 'Free/Paid not all feature apps' for level in df_h1a['level']]

# Calculate the sum of downloads for each category
total_downloads_h1a = df_h1a.groupby('category')['num_downloads'].sum().reset_index()


# Make a bar chart 
plt.figure(figsize=(10, 6))
sns.barplot(data=total_downloads_h1a, x='category', y='num_downloads', hue='category', palette='viridis', dodge=False)
plt.title('Figure 8: Total Number of Downloads by Category',fontsize=14)
plt.xlabel('Category')
plt.ylabel('Total Number of Downloads')
plt.show()

As shown in Figure 2, sample and premium apps in level 2 show a large gap in both average and total downloads. To illustrate how large this gap is, the graph below shows that premium apps account for just 1.1% of all level 2 downloads.


In [None]:
#| echo: false
#| warning: false
#| error: false

df_sample_h1c = df_level_combined[df_level_combined['level'] == '2 (Sample)']
df_premium_h1c = df_level_combined[df_level_combined['level'] == '2 (Premium)']
df_mapping_h1c = df_mapping_level_2.copy()

# relevant columns
relevant_columns_h1c = ['my_app_id', 'num_downloads', 'level', 'rating_app']
relevant_columns_mapping_h1c =  ['Sample app name',	'Premium app name']

# filter the relevant columns
df_sample_h1c = df_sample_h1c[relevant_columns_h1c]
df_premium_h1c = df_premium_h1c[relevant_columns_h1c]
df_mapping_h1c = df_mapping_h1c[relevant_columns_mapping_h1c]

# Merge mapping dataframe with sample dataframe and premium dataframe
df_mapped_h1c = (
    df_mapping_h1c
    .merge(df_sample_h1c[['my_app_id', 'num_downloads', 'rating_app']], left_on='Sample app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'num_downloads': 'Sample Downloads', 'rating_app': 'Sample Rating'})
    .merge(df_premium_h1c[['my_app_id', 'num_downloads', 'rating_app']], left_on='Premium app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'num_downloads': 'Premium Downloads', 'rating_app': 'Premium Rating'})
)

# Calculate metrics
df_mapped_h1c['Download Difference'] = df_mapped_h1c['Premium Downloads'] - df_mapped_h1c['Sample Downloads']
df_mapped_h1c['Download Ratio'] = df_mapped_h1c['Premium Downloads'] / df_mapped_h1c['Sample Downloads']

# Calculating total sample and premium downloads
total_sample_downloads = df_mapped_h1c['Sample Downloads'].sum()
total_premium_downloads = df_mapped_h1c['Premium Downloads'].sum()

# Creating a new dataframe to represent the totals
summary_data_h1c = {
    "Version Type": ["Sample Downloads", "Premium Downloads"],
    "Total Downloads": [total_sample_downloads, total_premium_downloads]
}
df_summary_h1c = pd.DataFrame(summary_data_h1c)

# Creating a pie chart to show the proportion of sample vs premium downloads
fig, ax = plt.subplots(figsize=(5, 5))
ax.pie(df_summary_h1c['Total Downloads'], labels=df_summary_h1c['Version Type'], autopct='%1.1f%%', startangle=140)
ax.set_title('Figure 9: Proportion of Total Downloads: Sample vs Premium', fontsize=14)

plt.show()

The disparity is also high at the level of individual apps, as the boxplot below shows.


In [None]:
#| echo: false
#| warning: false
#| error: false

# Plot 4: Boxplot for Sample and Premium Downloads
plt.figure(figsize=(10, 6))
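# Drop unmatched (NaN) entries from the left merge so that both boxes can be drawn
plt.boxplot([df_mapped_h1c['Sample Downloads'].dropna(), df_mapped_h1c['Premium Downloads'].dropna()], labels=['Sample Downloads', 'Premium Downloads'])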
plt.ylabel('Number of Downloads')
plt.title('Figure 10: Boxplot of Sample vs Premium Downloads', fontsize=14)
plt.grid(axis='y', linestyle='--')
plt.show()

However, the table below shows that the average download difference per app pair is close to 500,000, with an average download ratio of nearly 27%. This suggests that, for a typical app pair, about one in four users who download the sample app also download the premium app. This per-pair average weights every app pair equally, which is why it is much higher than the 1.1% share of total downloads shown above.
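An average download ratio of nearly 27% and a premium share of just 1.1% of total downloads can both hold at the same time, because the two figures aggregate differently. The sketch below illustrates this with hypothetical sample/premium pairs that are not taken from our dataset.

```python
from statistics import mean

# Hypothetical (sample_downloads, premium_downloads) pairs; NOT from the dataset.
pairs = [
    (10_000_000, 50_000),  # one very popular sample app with few upgrades
    (10_000, 4_000),
    (5_000, 2_500),
    (1_000, 300),
]

# Mean of per-pair ratios: every app pair counts equally.
per_pair_ratio = [premium / sample for sample, premium in pairs]
mean_of_ratios = mean(per_pair_ratio)

# Share of totals: large apps dominate the result.
total_sample = sum(sample for sample, _ in pairs)
total_premium = sum(premium for _, premium in pairs)
premium_share_of_total = total_premium / (total_sample + total_premium)

print(f"Mean of per-pair ratios:    {mean_of_ratios:.1%}")
print(f"Premium share of downloads: {premium_share_of_total:.1%}")
```

A single very popular sample app with few premium downloads is enough to pull the share of totals far below the mean of the per-pair ratios.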


In [None]:
#| echo: false
#| warning: false
#| error: false

df_sample_h1c = df_level_combined[df_level_combined['level'] == '2 (Sample)']
df_premium_h1c = df_level_combined[df_level_combined['level'] == '2 (Premium)']
df_mapping_h1c = df_mapping_level_2.copy()

# relevant columns
relevant_columns_h1c = ['my_app_id', 'num_downloads', 'level', 'rating_app']
relevant_columns_mapping_h1c =  ['Sample app name',	'Premium app name']

# filter the relevant columns
df_sample_h1c = df_sample_h1c[relevant_columns_h1c]
df_premium_h1c = df_premium_h1c[relevant_columns_h1c]
df_mapping_h1c = df_mapping_h1c[relevant_columns_mapping_h1c]

# Merge mapping dataframe with sample dataframe and premium dataframe
df_mapped_h1c = (
    df_mapping_h1c
    .merge(df_sample_h1c[['my_app_id', 'num_downloads', 'rating_app']], left_on='Sample app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'num_downloads': 'Sample Downloads', 'rating_app': 'Sample Rating'})
    .merge(df_premium_h1c[['my_app_id', 'num_downloads', 'rating_app']], left_on='Premium app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'num_downloads': 'Premium Downloads', 'rating_app': 'Premium Rating'})
)

# Calculate metrics
df_mapped_h1c['Download Difference'] = df_mapped_h1c['Premium Downloads'] - df_mapped_h1c['Sample Downloads']
df_mapped_h1c['Download Ratio'] = df_mapped_h1c['Premium Downloads'] / df_mapped_h1c['Sample Downloads']

# Data for the new dataframe
metric_data_h1c = {
    "Metric": [
        "Average Download Difference",
        "Average Download Ratio"
    ],
    "Value": [
        df_mapped_h1c['Download Difference'].mean(),
        df_mapped_h1c['Download Ratio'].mean(),
    ]
}

# Create the new dataframe
df_metrics_h1c = pd.DataFrame(metric_data_h1c)
df_metrics_h1c

## Ratings

Plotting the distribution of the ratings across the different revenue models does not yield the same insight as it did for the number of downloads. As can be seen below, the ratings themselves do not vary much across the revenue levels.


In [None]:
#| echo: false
#| warning: false
#| error: false

ratings_across_level = df_level_combined.copy()
# Group by 'level' and calculate mean for 'rating_app' and 'bayesian_average'
average_ratings = ratings_across_level.groupby('level')['rating_app'].mean()
bayesian_ratings = ratings_across_level.groupby('level')['bayesian_average'].mean()
# Plotting the ratings across levels
plt.figure(figsize=(10, 6))

# Bar plot for ratings and Bayesian ratings
average_ratings.plot(kind='bar', width=0.4, position=1, label='Average Rating', color='blue', alpha=0.6)
bayesian_ratings.plot(kind='bar', width=0.4, position=0, label='Bayesian Rating', color='green', alpha=0.6)

# Adding labels and title
plt.xlabel('Level')
plt.ylabel('Rating')
plt.title('Figure 11: Ratings Across Levels', fontsize=14)
plt.legend()
plt.xticks(rotation=0)

# Show plot
plt.tight_layout()
plt.show()

When plotting the variance of the ratings across the different levels, we do see a lot of variation. However, this variation diminishes when considering the Bayesian average, which smooths out the fluctuations and provides a more consistent view of the ratings. This indicates that some levels have strong outliers.
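The smoothing effect of a Bayesian average can be illustrated with a small sketch. How the `bayesian_average` column is computed is not restated in this section, so the code assumes one common form, in which an app's rating is shrunk toward a prior mean in proportion to how few ratings it has. The function, the prior mean of 4.0 and the prior weight of 100 are all hypothetical.

```python
def bayesian_average(rating, n_ratings, prior_mean, prior_weight):
    """Shrink an app's mean rating toward `prior_mean`.

    `prior_weight` acts like a number of pseudo-ratings at the prior mean.
    This is one common formulation, assumed here for illustration.
    """
    return (n_ratings * rating + prior_weight * prior_mean) / (n_ratings + prior_weight)


# Hypothetical prior: global mean rating of 4.0, worth 100 ratings.
PRIOR_MEAN = 4.0
PRIOR_WEIGHT = 100

few = bayesian_average(rating=1.0, n_ratings=5, prior_mean=PRIOR_MEAN, prior_weight=PRIOR_WEIGHT)
many = bayesian_average(rating=1.0, n_ratings=50_000, prior_mean=PRIOR_MEAN, prior_weight=PRIOR_WEIGHT)

print(round(few, 2))   # pulled strongly toward the prior mean
print(round(many, 2))  # stays close to the raw rating
```

Under this assumption, extreme ratings that rest on only a handful of reviews are pulled toward the overall mean, which reduces the variance within a level.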


In [None]:
#| echo: false
#| warning: false
#| error: false

variance_rating = df_level_combined.copy()

# Columns for which the per-level variance is calculated
columns_to_calculate = ['rating_app', 'bayesian_average']
variance_specific = variance_rating.groupby('level')[columns_to_calculate].var().reset_index()

# Melting the dataframe to long format for easier plotting
variance_melted = variance_specific.melt(id_vars='level', var_name='Attribute', value_name='Variance')

plt.figure(figsize=(12, 6))
sns.barplot(data=variance_melted, x='level', y='Variance', hue='Attribute', palette='viridis')
plt.title('Figure 12: Variance of Different Attributes Grouped by Level', fontsize=14)
plt.xlabel('Level')
plt.ylabel('Variance')
plt.xticks(rotation=45)
plt.legend(title='Attribute', loc='upper right')
plt.tight_layout()
plt.show()

In the next two subsections, we further investigate the variance and the difference in ratings between paid and free apps.

### Quality of Apps

Hypotheses 1a and 2a both touch on app quality as a reason for fluctuations in ratings. While 1a looks at apps with free access to all features, 2a looks at the drawbacks of the freemium revenue model, where the balance between free and paid features can result in lower ratings.

*H1a: Apps that allow the user to have free access to all features (level 0 and 1) will have the highest amount of downloads overall. However, the ratings may fluctuate, as quality can vary for free-to-access apps.* <!-- Is a citation needed here? -->

*H2a: Apps that require the user to pay to unlock features (level 2, 3, and 4) will tend to have lower ratings than the version that requires payment upfront (level 5). The main draw of a freemium model is to attract users, and have them update to a paid version [@kumar2014]. However, as @kumar2014 points out, this can be a double-edged sword. Too few features, and it may not be attractive to users. Too many features, and the users will not update.*

The table below shows the variance and standard deviation of the ratings for "free access to all features" apps and for "free access to not all features or paid" apps. The difference between the two groups is small.


In [None]:
#| echo: false
#| warning: false
#| error: false


df_h1a = df_level_combined.copy()

# Label levels 0 and 1 as 'Free all feature apps' and all other levels as 'Free/Paid not all feature apps'
df_h1a['category'] = ['Free all feature apps' if level in ['0', '1'] else 'Free/Paid not all feature apps' for level in df_h1a['level']]

# Calculate variance and standard deviation for each category
df_stats_h1a = df_h1a.groupby('category')['rating_app'].agg(['var', 'std']).reset_index()

# Rename columns for clarity
df_stats_h1a.rename(columns={'var': 'variance in rating', 'std': 'standard deviation in rating'}, inplace=True)

df_stats_h1a

The boxplot of the ratings below shows the same picture.


In [None]:
#| echo: false
#| warning: false
#| error: false

plt.figure(figsize=(10, 6))
sns.boxplot(data=df_h1a, x='category', y='rating_app')
plt.title('Figure 13: App Ratings by Category', fontsize=14)
plt.xlabel('Category')
plt.ylabel('Ratings')
plt.show()

Similarly, the mean rating and Bayesian average of apps that require users to unlock features do not differ much from those of paid apps with all features. Premium apps do tend to have a higher rating, as can be seen in the graph below.


In [None]:
#| echo: false
#| warning: false
#| error: false

df_level_combined_h2a = df_level_combined.copy()

# Filter only levels 2, 3, 4, and 5
df_h2a = df_level_combined_h2a[df_level_combined_h2a['level'].isin(['2 (Sample)', '2 (Premium)', '3', '4', '5'])].copy()

# Filter on premium and free apps
df_h2a['category'] = np.where((df_h2a['level'] == '5') | (df_h2a['level'] == '2 (Premium)'), 'Premium apps', 'Free apps')

# relevant columns
relevant_columns_h2a = ['category', 'rating_app', 'bayesian_average']

# filter the relevant columns
df_h2a = df_h2a[relevant_columns_h2a]

# Calculating the mean rating for Free and Premium apps
mean_rating_free = df_h2a[df_h2a['category'] == 'Free apps']['rating_app'].mean()
mean_rating_premium = df_h2a[df_h2a['category'] == 'Premium apps']['rating_app'].mean()

# Calculating the mean Bayesian average for Free and Premium apps
mean_bayesian_free = df_h2a[df_h2a['category'] == 'Free apps']['bayesian_average'].mean()
mean_bayesian_premium = df_h2a[df_h2a['category'] == 'Premium apps']['bayesian_average'].mean()

# Creating a new DataFrame from these results
mean_results = pd.DataFrame({
    'Category': ['Free apps', 'Premium apps'],
    'Mean Rating': [mean_rating_free, mean_rating_premium],
    'Mean Bayesian Average': [mean_bayesian_free, mean_bayesian_premium]
})

# Combine ratings for boxplot
categories = ['Free apps', 'Premium apps']
ratings_data = [df_h2a[df_h2a['category'] == cat]['rating_app'] for cat in categories]

plt.boxplot(ratings_data, labels=categories, patch_artist=True)
plt.title('Figure 14: Boxplot of Ratings for Free and Premium Apps', fontsize=14)
plt.ylabel('Ratings')
plt.xlabel('App Category')
plt.grid(axis='y', linestyle='--', alpha=0.7)

plt.show()

### Variance in Ratings

Hypothesis 2b addresses the difference in rating variance between premium apps and their freemium counterparts.

*H2b: Fully premium apps (level 5) will have less variance in their ratings, while all other levels will have more. In the same vein as H2a, users have more realistic expectations of paid apps compared to apps that require you to unlock features [@kumar2014]. Therefore, more users downloading premium apps will be satisfied with their purchase, leading to less variance.*

In the graph below, we see a clear difference in the variance of ratings: premium apps tend to have a higher variance, which suggests more outliers.


In [None]:
#| echo: false
#| warning: false
#| error: false


df_level_combined_h2b = df_level_combined.copy()

# Filter on level 5 and others
df_level_combined_h2b['category'] = np.where(df_level_combined_h2b['level'] == '5', 'Premium apps', 'Other apps')

# relevant columns
relevant_columns_h2b = ['category', 'rating_app', 'bayesian_average']

# filter the relevant columns
df_level_combined_h2b = df_level_combined_h2b[relevant_columns_h2b]

# Splitting the data into other apps and premium apps
other_apps = df_level_combined_h2b[df_level_combined_h2b['category'] == 'Other apps']
premium_apps = df_level_combined_h2b[df_level_combined_h2b['category'] == 'Premium apps']

# Calculating the variance in ratings for both groups
other_variance_rating = other_apps['rating_app'].var()
premium_variance_rating = premium_apps['rating_app'].var()

# Creating a bar plot for variances in ratings
fig, ax = plt.subplots(figsize=(10, 6))
ax.bar(['Other apps', 'Premium apps'], [other_variance_rating, premium_variance_rating], color=['skyblue', 'orange'])
ax.set_ylabel('Variance in Ratings', fontsize=12)
ax.set_title('Figure 15: Variance in Ratings for Other vs Premium Apps', fontsize=14)
ax.set_ylim(0, max(other_variance_rating, premium_variance_rating) * 1.2)

plt.tight_layout()
plt.show()

In the graph below, we once again see that level 5 has a high variance compared to the other levels. When the Bayesian average is used instead, however, its variance is among the lowest. This difference may arise because users who have paid for an app and are satisfied tend to rate it highly, while users who have paid and are not satisfied tend to rate it much lower.
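The explanation offered above, that paying users rate either very high or very low, implies polarized ratings. The sketch below shows how such polarization raises the variance without changing the mean. Both rating lists are hypothetical. `statistics.variance` computes the sample variance, which matches the default of pandas' `.var()`.

```python
from statistics import mean, variance

# Hypothetical ratings for two apps with the same mean rating.
polarized = [5, 5, 5, 1, 5, 1, 5, 5]  # satisfied buyers versus disappointed buyers
moderate = [4, 4, 3, 5, 4, 4, 3, 5]   # ratings clustered around the mean

print(mean(polarized), mean(moderate))
print(round(variance(polarized), 2), round(variance(moderate), 2))
```

Two groups of apps can therefore share a similar average rating while differing strongly in variance, which is why the mean and the variance are examined separately here.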


In [None]:
#| echo: false
#| warning: false
#| error: false

variance_rating = df_level_combined.copy()

# Columns for which the per-level variance is calculated
columns_to_calculate = ['rating_app', 'bayesian_average']
variance_specific = variance_rating.groupby('level')[columns_to_calculate].var().reset_index()

# Melting the dataframe to long format for easier plotting
variance_melted = variance_specific.melt(id_vars='level', var_name='Attribute', value_name='Variance')

# Plotting line plot with seaborn
plt.figure(figsize=(12, 6))
sns.lineplot(data=variance_melted, x='level', y='Variance', hue='Attribute', marker='o', palette='Set1')
plt.title('Figure 16: Variance of Attributes Across Different Levels', fontsize=14)
plt.xlabel('Level')
plt.ylabel('Variance')
plt.xticks(rotation=45)
plt.legend(title='Attribute')
plt.tight_layout()
plt.show()

### Relationships between App Versions

Level 2 has two versions of the same app: a sample and a premium version. According to hypothesis 3, the rating of the paid version correlates positively with the rating of the sample version.

*H3: For apps that utilize a sample and a premium version of the same app (level 2), the rating of the paid-for version is positively associated with the rating of the free version of the same app. This was true for the study on the most popular apps in the Google Play Store by @liu2012freemium, so it is expected a similar pattern should arise for this dataset.*

In the table below, the correlation between the ratings of sample and premium apps is moderately positive at 0.35. The correlation between the Bayesian average ratings of sample and premium apps is moderate to strong at 0.5.
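The coefficients reported here are Pearson correlations, which is the default of pandas' `Series.corr`; pairs with a missing value are left out of the calculation. The sketch below spells out that calculation in plain Python. The ratings are hypothetical, and `pearson` is an illustrative helper rather than code from our notebook.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation over the pairs in which both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / len(pairs)
    mean_y = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical ratings of matched sample/premium apps; None marks an unmatched app.
sample_ratings = [4.1, 3.8, 4.5, 3.2, 4.0, None]
premium_ratings = [4.3, 3.5, 4.4, 3.9, None, 4.2]

r = pearson(sample_ratings, premium_ratings)
print(round(r, 2))
```

Because incomplete pairs are dropped, the coefficient is based only on apps for which both a sample and a premium rating are available.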


In [None]:
#| echo: false
#| warning: false
#| error: false

df_sample_h3 = df_level_combined[df_level_combined['level'] == '2 (Sample)']
df_premium_h3 = df_level_combined[df_level_combined['level'] == '2 (Premium)']
df_mapping_h3 = df_mapping_level_2.copy()

# relevant columns
relevant_columns_h3 = ['my_app_id', 'level', 'rating_app', 'bayesian_average']
relevant_columns_mapping_h3 =  ['Sample app name',	'Premium app name']

# Keep only the relevant columns
df_sample_h3 = df_sample_h3[relevant_columns_h3]
df_premium_h3 = df_premium_h3[relevant_columns_h3]
df_mapping_h3 = df_mapping_h3[relevant_columns_mapping_h3]

# Attach the sample ratings and then the premium ratings to each mapped pair.
# Only the rating columns are merged in, so only those are renamed.
df_mapped_h3 = (
    df_mapping_h3
    .merge(df_sample_h3[['my_app_id', 'bayesian_average', 'rating_app']], left_on='Sample app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'rating_app': 'Sample Rating', 'bayesian_average': 'Sample Bayesian Average'})
    .merge(df_premium_h3[['my_app_id', 'bayesian_average', 'rating_app']], left_on='Premium app name', right_on='my_app_id', how='left')
    .drop(columns=['my_app_id'])
    .rename(columns={'rating_app': 'Premium Rating', 'bayesian_average': 'Premium Bayesian Average'})
)

# Pearson correlation between the raw ratings (pairs with a missing value are skipped)
correlation_rating_h3 = df_mapped_h3["Sample Rating"].corr(df_mapped_h3["Premium Rating"])

# Pearson correlation between the Bayesian average ratings
correlation_bayesian_rating_h3 = df_mapped_h3["Sample Bayesian Average"].corr(df_mapped_h3["Premium Bayesian Average"])

correlation_data_h3 = {
    "Metric": ["Sample Rating vs Premium Rating", "Sample Bayesian Average vs Premium Bayesian Average"],
    "Correlation Coefficient": [correlation_rating_h3, correlation_bayesian_rating_h3]
}
pd.DataFrame(correlation_data_h3)
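
Because the mapping is joined with `how='left'`, a pair whose sample or premium app has no rating ends up with a missing value, and `Series.corr` silently leaves such pairs out. It can therefore help to report how many complete pairs a coefficient rests on. The following is a minimal sketch of that check, not part of the original analysis. The helper `paired_correlation` and the toy values are hypothetical and only illustrate the idea; the column names mirror those used in the cell for H3.

```python
import pandas as pd

# Hypothetical toy data: one row per sample/premium pair.
# A missing value stands for an app that had no rating after the left merge.
toy_pairs = pd.DataFrame({
    "Sample Rating": [4.0, 3.0, 4.5, None, 2.0],
    "Premium Rating": [4.2, 3.2, 4.7, 4.0, None],
})


def paired_correlation(df, col_a, col_b):
    """Return (Pearson r, number of complete pairs) for two columns."""
    # Keep only rows where both ratings are present
    complete = df[[col_a, col_b]].dropna()
    r = complete[col_a].corr(complete[col_b])
    return r, len(complete)


r, n_pairs = paired_correlation(toy_pairs, "Sample Rating", "Premium Rating")
print(f"r = {r:.2f}, based on {n_pairs} of {len(toy_pairs)} pairs")
```

In the toy data every complete pair has a premium rating exactly 0.2 above the sample rating, so the coefficient is 1 and rests on three of the five pairs. Applied to the mapped level 2 dataframe, the same helper would show whether 0.35 and 0.5 are based on most of the mapped pairs or only a small subset.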

### Conclusion

**Summarize the results. Answer the research question.**

This was our research question.

Success is defined by three things: popularity, rating, and revenue (not included).

Popularity: a (supported/not supported), b (supported/not supported), c (supported/not supported), etc.

# Discussion <!--500-700 words -->


```{=html}
<!-- Insightful discussion of findings in relation to the research question and literature.
Reflection on practical implications and contributions to the ongoing debate.

Content: Provide a thoughtful discussion of the findings in relation to the research question and the literature. Reflect on the practical implications and contributions to the academic debate.

Key Elements:
Reflection on findings.
Linkage to existing literature.
Practical
-->
```


## Reflection on the Findings

Downloads do not necessarily indicate revenue for freemium models [@djaruma2023]. The time a user spends in an app and the purchases made within it [@ross2018] are better indicators of revenue for freemium apps.

## Practical Implications for Businesses

## Future Research Directions

# References

::: {#refs}
:::