# **Reshaping explicit and implicit data into preference pairs**

### **Weighted formula:**

$$
\begin{aligned}
\text{Score} =\ & 
0.2 \cdot \text{TimeOnTask} +
0.1 \cdot \text{ScrollDepth} +
0.1 \cdot \text{ScrollEvents} +
0.2 \cdot \text{CompletionRate} \\
& +
0.1 \cdot \text{ActiveMinutes} +
0.1 \cdot \text{MemoryUse} +
0.05 \cdot \text{TutorInteractions} +
0.05 \cdot \text{AvgModuleRating} \\
& +
0.05 \cdot \text{Satisfaction} +
0.05 \cdot (\text{PostSkill} - \text{PreSkill}) +
0.05 \cdot (\text{Relevance} + \text{Trust} - \text{Difficulty}) \\
& +
0.02 \cdot \text{Pace} -
0.02 \cdot \text{Retries} -
0.01 \cdot \text{ResponseTime}
\end{aligned}
$$

Constraints:
- time_on_task: max 10,800 seconds (3 hours)
- active_minutes: max 600 minutes (10 hours)
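
A minimal sketch of how the formula and the two caps above could be evaluated in Python. The flat key names (`time_on_task`, `scroll_depth`, and so on) and the helper `weighted_score` are assumptions made for illustration; the actual field layout used inside `preference_pairs` may differ.

```python
# Weights copied from the formula above. Key names are hypothetical.
WEIGHTS = {
    "time_on_task": 0.2,
    "scroll_depth": 0.1,
    "scroll_events": 0.1,
    "completion_rate": 0.2,
    "active_minutes": 0.1,
    "memory_use": 0.1,
    "tutor_interactions": 0.05,
    "avg_module_rating": 0.05,
    "satisfaction": 0.05,
    "skill_gain": 0.05,   # PostSkill - PreSkill
    "perception": 0.05,   # Relevance + Trust - Difficulty
    "pace": 0.02,
    "retries": -0.02,
    "response_time": -0.01,
}

# Upper bounds from the constraints list (seconds and minutes).
CAPS = {"time_on_task": 10_800, "active_minutes": 600}


def weighted_score(signals):
    """Apply the caps, then return the weighted sum of the signals.

    Signals missing from the input are treated as 0 (an assumption).
    """
    capped = {
        name: min(value, CAPS[name]) if name in CAPS else value
        for name, value in signals.items()
    }
    return sum(weight * capped.get(name, 0.0) for name, weight in WEIGHTS.items())
```

For example, `weighted_score({"time_on_task": 20_000, "completion_rate": 1.0})` caps the time at 10,800 seconds before weighting it.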

Drawbacks:
- Weights are chosen manually, not learned from data
- Treats all signals as directly comparable and ignores differences in scale (seconds, minutes, and 1-5 ratings are summed as-is)
- Doesn’t adapt to different users or contexts
- Sensitive to outliers in some metrics
- Does not improve automatically as more data or real user feedback becomes available
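
To make the scale issue concrete: if signals enter the sum in raw units, a capped TimeOnTask of 10,800 seconds contributes 0.2 × 10,800 = 2,160, while a module rating of 5 contributes only 0.05 × 5 = 0.25. One common remedy, shown here as a sketch and not as something this notebook does, is to rescale each signal to [0, 1] before weighting. The helper `min_max` is hypothetical, and the 1-5 rating range is an assumption inferred from the sample data.

```python
def min_max(value, lo, hi):
    """Rescale value to [0, 1], clipping anything outside [lo, hi]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = max(lo, min(value, hi))
    return (clipped - lo) / (hi - lo)


# Raw contributions: time dominates the rating by several orders of magnitude.
raw_time = 0.2 * 10_800
raw_rating = 0.05 * 5

# Normalized contributions: each term is now bounded by its weight.
norm_time = 0.2 * min_max(10_800, 0, 10_800)
norm_rating = 0.05 * min_max(5, 1, 5)

print(raw_time, raw_rating)
print(norm_time, norm_rating)
```

After rescaling, each term is bounded by its weight, so the weights express relative importance directly.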



In [1]:
%load_ext autoreload
%autoreload 2

from preference_pairs import *  # provides generate_preference_pairs, used below
import json
import os

In [2]:
# Load data
DATA_PATH = 'data'
data_names = ["synth_10-samples_gpt-4o-mini_2025-05-05_08-31", "synth_10-samples_gpt-4o-mini_2025-05-05_08-48"]
user_data = []
for file in data_names:
    with open(os.path.join(DATA_PATH, f"{file}.json"), "r") as f:
        user_data.extend(json.load(f))

# Print data to see format
print(json.dumps(user_data[0], indent=2, sort_keys=True))


{
  "explicit_data": {
    "approval_of_content_modifications": [
      {
        "change": "Added more examples in Python",
        "status": "approved"
      }
    ],
    "curriculum_editing_feedback": "Would like more practical exercises.",
    "difficulty_feedback": 3,
    "drag_and_drop_curriculum_edits": [],
    "explicit_learning_goals": "Master Python for data analysis.",
    "preferred_content_format": "video",
    "ratings_on_modules": {
      "Introduction to Data Science": 5,
      "Python for Data Analysis": 4
    },
    "reflection_inputs": "I learned a lot about Python fundamentals.",
    "relevance_feedback": 5,
    "satisfaction_surveys": {
      "content_relevance": 5,
      "interface_usability": 5,
      "overall_satisfaction": 4
    },
    "skill_self_assessments": {
      "after_training": 5,
      "before_training": 3
    },
    "trust_feedback": 5
  },
  "implicit_data": {
    "content_adaptation_requests": [],
    "drop_off_events": [],
    "engagement_metrics"

In [3]:
# Generate preference pairs
OUTPUT_PATH = "output"
preference_dataset = generate_preference_pairs(user_data)
#print(json.dumps(preference_dataset, indent=2, sort_keys=True))
os.makedirs(OUTPUT_PATH, exist_ok=True)  # make sure the output directory exists
with open(os.path.join(OUTPUT_PATH, "preference_pairs.json"), "w") as f:
    json.dump(preference_dataset, f, indent=2)
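
`generate_preference_pairs` comes from the `preference_pairs` module, and its implementation is not shown in this notebook. As a rough sketch of what such a step might look like, assuming each record can be reduced to a single score: compare every two records, mark the higher-scoring one as `chosen`, and store the gap as `score_diff`. The `chosen` and `score_diff` keys are used elsewhere in this notebook; the `rejected` key, the `score_fn` parameter, and the name `sketch_preference_pairs` are assumptions.

```python
from itertools import combinations


def sketch_preference_pairs(records, score_fn):
    """Hypothetical pairing step: the higher-scoring record of each pair wins."""
    pairs = []
    for a, b in combinations(records, 2):
        score_a, score_b = score_fn(a), score_fn(b)
        if score_a == score_b:
            continue  # a tie carries no preference signal
        chosen, rejected = (a, b) if score_a > score_b else (b, a)
        pairs.append(
            {
                "chosen": chosen,
                "rejected": rejected,
                "score_diff": abs(score_a - score_b),
            }
        )
    return pairs
```

Pairing every two records grows quadratically with the number of records, which is acceptable for the 20 samples loaded here but would need sampling on larger datasets.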

In [4]:
# Compare chosen vs rejected: print only pairs whose score gap exceeds the threshold
SCORE_DIFF_THRESH = 700
for pair in preference_dataset:
    if pair["score_diff"] > SCORE_DIFF_THRESH:
        print(json.dumps(pair, indent=2, sort_keys=True))

{
  "chosen": {
    "explicit_data": {
      "approval_of_content_modifications": [],
      "curriculum_editing_feedback": "Modules are challenging but rewarding.",
      "difficulty_feedback": 4,
      "drag_and_drop_curriculum_edits": [],
      "explicit_learning_goals": "Deepen understanding of deep learning models.",
      "preferred_content_format": "text",
      "ratings_on_modules": {
        "Computer Vision": 4,
        "Deep Learning": 5
      },
      "reflection_inputs": "I'm excited about the advancements in deep learning.",
      "relevance_feedback": 5,
      "satisfaction_surveys": {
        "content_relevance": 5,
        "interface_usability": 5,
        "overall_satisfaction": 5
      },
      "skill_self_assessments": {
        "after_training": 5,
        "before_training": 3
      },
      "trust_feedback": 5
    },
    "implicit_data": {
      "content_adaptation_requests": [],
      "drop_off_events": [],
      "engagement_metrics": {
        "active_minutes": 2