# Applied Analytics Portfolio

**Predicting and Explaining Healthcare App Quality**

Group: `_1_`

Names & Student IDs: `Louis - `, `Christian - `, `Min Zhu - 5607778`

---

## 1. Introduction

Briefly describe the **decision context**:

- Mental health unit that wants to recommend high-quality healthcare apps.
- Patients have a range of mental illnesses and somatic comorbidities.

Explain **why prediction helps** and what the **overall goal** of this portfolio is:

- Use app metadata and user reviews to estimate whether an app is likely to be highly rated.
- Identify key factors that drive user-perceived app quality.

Conclude with a short **structure overview** of the notebook/report (what is done in Sections 2–5).

## 2. Data Understanding and Preparation

### 2.1 Research Goal and Operationalization
- Formulate a **precise prediction question**.
- Define what **"high-quality" / "highly rated"** means (e.g., rating threshold + minimum number of ratings).
- Specify which apps you will include (e.g., which categories, filters).
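
One way to make the definition of "highly rated" explicit is to encode it as a small function. The sketch below is only an illustration: the rating threshold (4.5), the minimum number of ratings (50), the `app_id` column, and the toy data are assumptions, not values prescribed by the assignment. The column names `averageRating`, `ratingCount`, and `high_quality` follow the placeholders used elsewhere in this template.

```python
import pandas as pd

# Toy stand-in for the apps table (values are made up for illustration).
apps = pd.DataFrame({
    "app_id": ["a", "b", "c", "d"],        # hypothetical identifier column
    "averageRating": [4.7, 4.9, 3.8, 4.5],
    "ratingCount": [1200, 3, 540, 50],
})

RATING_THRESHOLD = 4.5  # assumed cut-off; justify your own choice in the report
MIN_RATINGS = 50        # assumed minimum number of ratings


def add_high_quality(df, rating_threshold=RATING_THRESHOLD, min_ratings=MIN_RATINGS):
    """Return a copy of df with a binary high_quality column."""
    out = df.copy()
    out["high_quality"] = (
        (out["averageRating"] >= rating_threshold)
        & (out["ratingCount"] >= min_ratings)
    ).astype(int)
    return out


apps = add_high_quality(apps)
print(apps[["app_id", "averageRating", "ratingCount", "high_quality"]])
```

Keeping both thresholds as named constants makes it easy to rerun the analysis with alternative definitions and to report how sensitive the class balance is to that choice.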

### 2.2 Data Overview
- Number of apps and reviews after filtering.
- Brief description of key variables (metadata and text).

### 2.3 Cleaning and Filtering
- Handle outliers and unreliable records (e.g., extreme prices, apps with very few ratings).
- Check missing values in important variables and decide on imputation vs. dropping.
- Document inclusion criteria and any comparator groups (e.g., medical vs. non-medical apps).
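
The cleaning steps above could look roughly like the following sketch. Everything specific here is an assumption made for illustration: the toy data, the `app_id` and `price` columns, the limits `MIN_RATINGS = 10` and `MAX_PRICE = 100.0`, and the choice of median imputation for price. Your own data may call for different rules.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the apps table (values are made up for illustration).
apps = pd.DataFrame({
    "app_id": ["a", "b", "c", "d", "e"],   # hypothetical identifier column
    "price": [0.0, 1.99, 499.99, np.nan, 4.99],
    "averageRating": [4.6, np.nan, 4.1, 3.9, 4.8],
    "ratingCount": [800, 0, 35, 120, 2],
})

# 1) Inspect the share of missing values per column before deciding anything.
missing_share = apps.isna().mean()
print(missing_share)

# 2) Drop apps without a rating: the target cannot be defined for them.
clean = apps.dropna(subset=["averageRating"]).copy()

# 3) Impute missing prices with the median and keep a flag for the imputation.
clean["price_missing"] = clean["price"].isna().astype(int)
clean["price"] = clean["price"].fillna(clean["price"].median())

# 4) Apply documented inclusion criteria (both limits are assumptions).
MIN_RATINGS = 10
MAX_PRICE = 100.0
clean = clean[(clean["ratingCount"] >= MIN_RATINGS) & (clean["price"] <= MAX_PRICE)]

print(f"{len(apps)} apps before cleaning, {len(clean)} after")
```

Reporting the row counts before and after each step makes the inclusion criteria easy to document.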

In [None]:
# 2. Data Understanding and Preparation
# TODO: Load your datasets here and perform basic checks
import pandas as pd

# Example:
# apps = pd.read_csv('apps.csv')
# reviews = pd.read_csv('reviews.csv')
# apps.head()


## 3. Data Exploration

- Explore distributions of ratings, number of ratings, prices, categories, etc.
- Visualize relevant relationships (e.g., rating vs. price, rating vs. category).
- Use basic text mining on reviews: word frequencies, simple sentiment or topic structure.
- Create and justify **new features** that may help prediction (e.g., sentiment score, review length, price bins).
- Comment on what these patterns suggest about app quality.
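
As an illustration of the feature ideas listed above, the sketch below derives a log-transformed rating count, price bins, and the mean review length per app. The toy data, the `app_id`, `price`, and `text` columns, and the bin edges are assumptions for demonstration only.

```python
import numpy as np
import pandas as pd

# Toy stand-ins for the two tables (values are made up for illustration).
apps = pd.DataFrame({
    "app_id": ["a", "b", "c"],             # hypothetical identifier column
    "price": [0.0, 2.99, 12.99],
    "ratingCount": [0, 99, 9999],
})
reviews = pd.DataFrame({
    "app_id": ["a", "a", "b", "c"],
    "text": [                               # hypothetical review text column
        "Helps me sleep",
        "Crashes a lot on login",
        "Great",
        "Too expensive for what it does",
    ],
})

# Rating counts are usually heavily right-skewed; log1p also handles zeros.
apps["log_ratings"] = np.log1p(apps["ratingCount"])

# Price bins with assumed edges: free, up to 4.99, above 4.99.
apps["price_bin"] = pd.cut(
    apps["price"],
    bins=[-np.inf, 0, 4.99, np.inf],
    labels=["free", "low", "high"],
)

# Review length in words, averaged per app and merged back to app level.
reviews["n_words"] = reviews["text"].str.split().str.len()
mean_len = (
    reviews.groupby("app_id")["n_words"].mean()
    .rename("mean_review_words")
    .reset_index()
)
apps = apps.merge(mean_len, on="app_id", how="left")

print(apps)
```

The left merge keeps apps without any reviews in the table; their `mean_review_words` would be missing and needs an explicit decision later.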

In [None]:
# 3. Data Exploration
# TODO: EDA plots and feature creation
import numpy as np  # needed for np.log1p in the example below
import matplotlib.pyplot as plt

# Example placeholder:
# apps['log_ratings'] = np.log1p(apps['ratingCount'])
# apps['averageRating'].hist()
# plt.show()


## 4. Modeling Approach

### 4.1 Review Sentiment with Zero-/Few-Shot Learning (SetFit or alternative)

1. Define sentiment classes (e.g., positive / neutral / negative).
2. Manually label a small, balanced subset of reviews.
3. Fine-tune a SetFit model or an LLM and evaluate its performance.
4. Aggregate predicted sentiment to the **app level** (e.g., share of positive reviews).

These aggregated sentiment metrics will be used as features in Section 4.2.
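
The aggregation step can be sketched independently of the classifier that produces the labels. In the example below the predicted labels are hard-coded toy values, and the column names `app_id` and `pred_sentiment` are hypothetical; in the notebook they would come from your SetFit model or alternative.

```python
import pandas as pd

# Toy review-level predictions (in practice: output of the sentiment model).
reviews = pd.DataFrame({
    "app_id": ["a", "a", "a", "b", "b", "c"],   # hypothetical identifier column
    "pred_sentiment": [                           # hypothetical prediction column
        "positive", "negative", "positive",
        "neutral", "negative",
        "positive",
    ],
})

LABELS = ["negative", "neutral", "positive"]

# Row-normalized cross-tabulation: share of each sentiment class per app.
shares = (
    pd.crosstab(reviews["app_id"], reviews["pred_sentiment"], normalize="index")
    .reindex(columns=LABELS, fill_value=0.0)   # keep classes that never occur
    .add_prefix("share_")
)
shares.columns.name = None

# Number of reviews per app, useful for judging how reliable each share is.
shares["n_reviews"] = reviews.groupby("app_id").size()

app_sentiment = shares.reset_index()
print(app_sentiment)
```

Shares based on very few reviews are noisy, so keeping `n_reviews` alongside them allows filtering or weighting in Section 4.2.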

In [None]:
# 4.1 Sentiment Modeling
# TODO: Implement SetFit / alternative sentiment classifier and aggregate results
# Hint: start with a small labeled subset of reviews
pass


### 4.2 Predictive Modeling of App Quality

1. **Define the target** variable at app level (e.g., high_quality = 1 if avg rating ≥ threshold and sufficient rating count).
2. **Model A – Simple & interpretable:** Logistic Regression or a small Decision Tree.
3. **Model B – More powerful:** e.g., Random Forest or Gradient Boosting with basic hyperparameter tuning.
4. Compare performance (accuracy, precision, recall, F1, ROC-AUC, etc.) and comment on the trade-off between interpretability and performance.
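
A minimal sketch of the comparison between Model A and Model B follows. It runs on synthetic data from `make_classification`, so the scores say nothing about real apps; the model settings (e.g., `n_estimators=200`) and the choice of F1 and ROC-AUC as the reported metrics are assumptions for illustration, and no hyperparameter tuning is shown.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for the app-level feature matrix and target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=4,
    weights=[0.7, 0.3], random_state=42,
)

# Stratify so that both splits keep the same share of the positive class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    # Model A: scaled inputs make the coefficients easier to compare.
    "A: logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    # Model B: a more flexible ensemble, here with untuned settings.
    "B: random forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]
    rows.append({
        "model": name,
        "f1": f1_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
    })

results = pd.DataFrame(rows).set_index("model")
print(results.round(3))
```

Evaluating both models on the same held-out split keeps the comparison fair; with small app samples, cross-validation may give a more stable picture than a single split.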

In [None]:
# 4.2 Predictive Modeling of App Quality
# TODO: Build train/test split, fit Model A and Model B, and evaluate
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

# Example placeholder:
# X = apps_model_features
# y = apps['high_quality']
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ...


## 5. Interpretation and Argumentation of Results

1. **Model Interpretation / Explainable AI**
- Inspect and visualize feature importance (e.g., SHAP values or model-specific importances).
- Discuss which features most strongly influence predicted app quality.

2. **Fairness & Bias Reflection**
- Where could sampling bias, measurement error, or missing data affect your results?
- Briefly relate your reflections to fairness notions mentioned in the course.

3. **LLM / SetFit as Method**
- Discuss where these methods might introduce bias or instability.
- Mention how sensitive your results are to label definitions or prompts (short reflection).

4. **Practical Insights for the Clinic**
- List 2–4 concrete, comprehensible recommendations that the mental health unit could use.
- Focus on what your results *suggest they should pay attention to* when recommending apps.
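
For the model interpretation step in item 1, permutation importance is one model-agnostic option besides SHAP values. The sketch below is an assumption-laden illustration: it uses synthetic data in which only the first two columns carry signal, and the feature names are hypothetical labels borrowed from earlier sections, so the resulting ranking says nothing about real apps.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical feature names; the data below is synthetic.
feature_names = ["share_positive", "log_ratings", "price", "mean_review_words"]

# With shuffle=False the first n_informative columns are the informative ones.
X, y = make_classification(
    n_samples=400, n_features=4, n_informative=2, n_redundant=0,
    shuffle=False, random_state=0,
)
X = pd.DataFrame(X, columns=feature_names)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Shuffle one feature at a time on held-out data and measure the drop in ROC-AUC.
result = permutation_importance(
    model, X_test, y_test, n_repeats=20, scoring="roc_auc", random_state=0
)

importance = pd.DataFrame(
    {"mean": result.importances_mean, "std": result.importances_std},
    index=feature_names,
).sort_values("mean", ascending=False)

print(importance.round(3))
```

Computing the importances on the test split rather than the training split reduces the risk of crediting features the model has merely memorized; the standard deviation across repeats indicates how stable each ranking is.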

## 6. AI Tools and References

- Briefly describe where AI tools (e.g., ChatGPT, Copilot) were used, in line with FU guidelines.
- List key papers, blog posts, or documentation that you relied on for methods.
