ORIGIN Recommender

This program is the first experiment in building a recommender engine to support the ORIGIN app.

DDeR 2026-02-02

In [2]:
import numpy as np
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Optional, Tuple
import json

The following code was generated with Claude as an assistant. Here is the spec:

An important aspect of the app is that it asks the user for information about their mood, so that it can track their mood after a story (if they provide this information) and use this to assist in recommendations.

The app passes the following information to the Python program as timestamped analytics events, each carrying the user's ID:
* User views a story
* User completes a story
* User answers the end-of-story mood score question
* User favourites a story (this means they want to keep it to come back to later)
* User provides a mood score
* User searches for a story, browsing by theme or using free text

The app will also need to build up its own record of the users and the stories they have read, loading and saving this state through a simple API.

The mood score has multiple dimensions. It is based on a system called PANAS-SF, which has 20 categories, each with a value from 1 to 5. We can use a simplified version of this, and the users will be choosing emojis to represent their mood, so each emoji will correspond to a multidimensional numeric mood score.
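
As a rough illustration of the emoji idea, each emoji could map to a small dictionary of PANAS-style items scored 1-5. The emoji choices, the item names and the values in `EMOJI_MOODS` below are placeholders invented for this sketch, not a validated mapping.

```python
from typing import Dict

# Hypothetical emoji -> mood mapping; items and values are illustrative only.
EMOJI_MOODS: Dict[str, Dict[str, int]] = {
    "😀": {"interested": 4, "excited": 4, "strong": 4, "distressed": 1, "upset": 1},
    "😐": {"interested": 3, "excited": 2, "strong": 3, "distressed": 2, "upset": 2},
    "😢": {"interested": 2, "excited": 1, "strong": 2, "distressed": 4, "upset": 4},
}


def mood_for_emoji(emoji: str) -> Dict[str, int]:
    """Return a copy of the mood dimensions for an emoji (KeyError if unknown)."""
    return dict(EMOJI_MOODS[emoji])


print(mood_for_emoji("😐"))
```

A dictionary of this shape is what the `MoodScore` class further down expects as its `dimensions` argument.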

The app may request recommendations at any time. This includes when the user has first logged in and no other information is known or only their mood score is known.

Recommendations should match current mood and include stories that improve mood and stories that are similar to ones that previously improved mood. Users cannot specify their intent but perhaps the recommender algorithm can effectively infer it. In particular, if a user selects away from certain themes then we do not want to recommend those themes.

For saving and loading, just keep the state in memory to start with; we will refine this aspect later.

I am assuming recommendations will be based on the user's personal history and on the choices of other users. There may also be some information provided by the context, e.g. a promotion of particular themes or stories as part of the ongoing content generation. Perhaps this can be done through the free-text tags that are associated with stories.
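
One simple way to bring in the choices of other users is to compare sets of liked stories with Jaccard overlap, which is also the measure the recommender class further down relies on. A minimal stand-alone sketch; the function name and the example users are illustrative only:

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of stories liked by both users out of stories liked by either."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


alice_liked = {"story1", "story3"}
bob_liked = {"story3", "story4"}
print(f"{jaccard(alice_liked, bob_liked):.2f}")  # one shared story out of three
```

A story liked by a user with high overlap is then a reasonable candidate for recommendation.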

First the classes

In [3]:
class MoodScore:
    """Represents a multi-dimensional mood score (simplified PANAS-SF)"""
    def __init__(self, dimensions: Dict[str, int]):
        """
        dimensions: dict of dimension_name -> value (1-5)
        For simplified PANAS, we might have:
        - positive_affect: ['interested', 'excited', 'strong', 'enthusiastic', 'proud', ...]
        - negative_affect: ['distressed', 'upset', 'guilty', 'scared', 'hostile', ...]
        """
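        self.dimensions = dimensions
        # Vector order follows the dict's insertion order, so two scores are only
        # comparable if they list the same dimensions in the same order.
        self.vector = np.array(list(dimensions.values()), dtype=float)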

    def distance_to(self, other: 'MoodScore') -> float:
        """Calculate Euclidean distance between mood scores"""
        return np.linalg.norm(self.vector - other.vector)

    def similarity_to(self, other: 'MoodScore') -> float:
        """Cosine similarity between mood scores (0-1 for non-negative values, higher is more similar)"""
        dot_product = np.dot(self.vector, other.vector)
        magnitude = np.linalg.norm(self.vector) * np.linalg.norm(other.vector)
        if magnitude == 0:
            return 0.0
        return dot_product / magnitude

    def to_dict(self) -> Dict:
        return self.dimensions

    @classmethod
    def from_dict(cls, data: Dict) -> 'MoodScore':
        return cls(data)


class Story:
    """Represents a story with metadata"""
    def __init__(self, story_id: str, title: str, theme: str, tags: Optional[List[str]] = None):
        self.id = story_id
        self.title = title
        self.theme = theme
        self.tags = tags or []

        # Computed features
        self.mood_associations = []  # List of (mood_before, mood_after) tuples
        self.avg_mood_change = None  # Average mood improvement

    def to_dict(self) -> Dict:
        return {
            'id': self.id,
            'title': self.title,
            'theme': self.theme,
            'tags': self.tags,
            'mood_associations': [(mb.to_dict(), ma.to_dict()) for mb, ma in self.mood_associations],
            'avg_mood_change': self.avg_mood_change
        }

    @classmethod
    def from_dict(cls, data: Dict) -> 'Story':
        story = cls(data['id'], data['title'], data['theme'], data['tags'])
        story.mood_associations = [
            (MoodScore.from_dict(mb), MoodScore.from_dict(ma))
            for mb, ma in data.get('mood_associations', [])
        ]
        story.avg_mood_change = data.get('avg_mood_change')
        return story


class AnalyticsEvent:
    """Represents a user interaction event"""
    def __init__(self, user_id: str, event_type: str, timestamp: datetime, **kwargs):
        self.user_id = user_id
        self.event_type = event_type  # 'view', 'complete', 'mood_after', 'favorite', 'mood_general', 'search'
        self.timestamp = timestamp
        self.data = kwargs

    def to_dict(self) -> Dict:
        return {
            'user_id': self.user_id,
            'event_type': self.event_type,
            'timestamp': self.timestamp.isoformat(),
            'data': self.data
        }

    @classmethod
    def from_dict(cls, data: Dict) -> 'AnalyticsEvent':
        return cls(
            data['user_id'],
            data['event_type'],
            datetime.fromisoformat(data['timestamp']),
            **data['data']
        )


class UserProfile:
    """Represents a user's preferences and history"""
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.viewed_stories = set()
        self.completed_stories = set()
        self.favorited_stories = set()

        # Theme preferences (positive = likes, negative = dislikes)
        self.theme_scores = defaultdict(float)

        # Mood tracking
        self.mood_history = []  # List of (timestamp, MoodScore)
        self.current_mood = None

        # Story-mood associations
        self.story_mood_impact = {}  # story_id -> mood_change_score

        # Recent interactions for temporal filtering
        self.recent_story_views = []  # Last N viewed stories with timestamps

    def get_avoided_themes(self, threshold: float = -1.0) -> List[str]:
        """Get themes the user seems to avoid"""
        return [theme for theme, score in self.theme_scores.items() if score < threshold]

    def get_preferred_themes(self, threshold: float = 1.0) -> List[str]:
        """Get themes the user prefers"""
        return [theme for theme, score in self.theme_scores.items() if score > threshold]
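
A note on the two comparisons in `MoodScore`: because every item is scored 1-5, cosine similarity only sees the shape of a mood, not its intensity, while Euclidean distance sees both. The stand-alone sketch below (plain lists instead of `MoodScore`, with made-up values) shows a uniformly low and a uniformly high mood coming out as essentially identical under cosine similarity while being far apart in distance.

```python
import math
from typing import List


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    mag = math.hypot(*a) * math.hypot(*b)
    return dot / mag if mag else 0.0


def euclidean(a: List[float], b: List[float]) -> float:
    """Straight-line distance between two equal-length vectors."""
    return math.dist(a, b)


flat_low = [1, 1, 1, 1, 1]
flat_high = [5, 5, 5, 5, 5]
print(f"cosine={cosine(flat_low, flat_high):.2f}")
print(f"euclidean={euclidean(flat_low, flat_high):.2f}")
```

If intensity should matter for mood matching, `distance_to` may be the more useful signal, or the two could be combined.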



Now the actual recommender logic

In [None]:
class StoryRecommender:
    """Main recommender system"""

    def __init__(self):
        self.stories: Dict[str, Story] = {}
        self.users: Dict[str, UserProfile] = {}
        self.events: List[AnalyticsEvent] = []

        # Cached computations
        self._story_similarity_cache = {}
        self._theme_to_stories = defaultdict(list)

    def add_story(self, story_id: str, title: str, theme: str, tags: Optional[List[str]] = None):
        """Add a new story to the catalog"""
        story = Story(story_id, title, theme, tags)
        self.stories[story_id] = story
        self._theme_to_stories[theme].append(story_id)

        # Invalidate similarity cache
        self._story_similarity_cache = {}

    def add_event(self, event: AnalyticsEvent):
        """Process an analytics event"""
        self.events.append(event)
        user_id = event.user_id

        # Ensure user profile exists
        if user_id not in self.users:
            self.users[user_id] = UserProfile(user_id)

        user = self.users[user_id]

        # Process based on event type
        if event.event_type == 'view':
            story_id = event.data['story_id']
            user.viewed_stories.add(story_id)
            user.recent_story_views.append((event.timestamp, story_id))

            # Viewing a story is a weak positive signal for its theme
            if story_id in self.stories:
                theme = self.stories[story_id].theme
                user.theme_scores[theme] += 0.1

        elif event.event_type == 'complete':
            story_id = event.data['story_id']
            user.completed_stories.add(story_id)

            # Completion is a strong positive signal
            if story_id in self.stories:
                theme = self.stories[story_id].theme
                user.theme_scores[theme] += 1.0

        elif event.event_type == 'mood_after':
            story_id = event.data['story_id']
            mood_after = MoodScore(event.data['mood_score'])

            # Calculate mood change if we have mood before
            if user.current_mood:
                mood_change = self._calculate_mood_improvement(user.current_mood, mood_after)
                user.story_mood_impact[story_id] = mood_change

                # Update story's mood associations
                if story_id in self.stories:
                    self.stories[story_id].mood_associations.append((user.current_mood, mood_after))
                    self._update_story_mood_stats(story_id)

                    # Adjust theme preference based on mood change
                    theme = self.stories[story_id].theme
                    user.theme_scores[theme] += mood_change * 0.5

            user.current_mood = mood_after
            user.mood_history.append((event.timestamp, mood_after))

        elif event.event_type == 'favorite':
            story_id = event.data['story_id']
            user.favorited_stories.add(story_id)

            # Strong positive signal
            if story_id in self.stories:
                theme = self.stories[story_id].theme
                user.theme_scores[theme] += 2.0

        elif event.event_type == 'mood_general':
            mood_score = MoodScore(event.data['mood_score'])
            user.current_mood = mood_score
            user.mood_history.append((event.timestamp, mood_score))

        elif event.event_type == 'search':
            # Track what user searches for
            if 'theme' in event.data:
                theme = event.data['theme']
                user.theme_scores[theme] += 0.5

    def _calculate_mood_improvement(self, mood_before: MoodScore, mood_after: MoodScore) -> float:
        """
        Calculate mood improvement score.
        Positive values = mood improved, negative = mood worsened
        """
        # Simple heuristic: compare the mean value across all dimensions.
        # Note that this treats every dimension as "higher = better", so a drop
        # in a negative-affect item (e.g. 'distressed') lowers the score.
        # Refine this once the final PANAS dimensions are fixed.
        before_sum = np.sum(mood_before.vector)
        after_sum = np.sum(mood_after.vector)

        # Normalize by number of dimensions
        improvement = (after_sum - before_sum) / len(mood_before.vector)
        return improvement

    def _update_story_mood_stats(self, story_id: str):
        """Update story's average mood change statistics"""
        story = self.stories[story_id]
        if not story.mood_associations:
            return

        improvements = [
            self._calculate_mood_improvement(before, after)
            for before, after in story.mood_associations
        ]
        story.avg_mood_change = np.mean(improvements)

    def get_recommendations(self,
                            user_id: str,
                            context: Optional[Dict] = None,
                            n_recommendations: int = 10) -> List[Tuple[str, float]]:
        """
        Get personalized story recommendations.

        Args:
            user_id: User requesting recommendations
            context: Optional context dict with keys like:
                - 'current_mood': MoodScore object
                - 'promotional_tags': List of tags to boost
            n_recommendations: Number of recommendations to return (5-10)

        Returns:
            List of (story_id, score) tuples, sorted by score
        """
        context = context or {}

        # Ensure user exists
        if user_id not in self.users:
            self.users[user_id] = UserProfile(user_id)

        user = self.users[user_id]

        # Update current mood if provided in context
        if 'current_mood' in context:
            user.current_mood = context['current_mood']

        # Skip recently viewed stories (within last 10 views)
        recent_story_ids = {sid for _, sid in user.recent_story_views[-10:]}

        # Score all remaining stories
        story_scores = {}
        for story_id, story in self.stories.items():
            if story_id in recent_story_ids:
                continue

            score = self._score_story_for_user(user, story, context)
            story_scores[story_id] = score

        # Sort and return top N
        sorted_stories = sorted(story_scores.items(), key=lambda x: x[1], reverse=True)
        return sorted_stories[:n_recommendations]

    def _score_story_for_user(self, user: UserProfile, story: Story, context: Dict) -> float:
        """
        Calculate recommendation score for a story given user profile and context.
        Combines multiple signals.
        """
        score = 0.0

        # 1. MOOD MATCHING (if current mood available)
        if user.current_mood and story.mood_associations:
            # Compare the current mood with the moods readers reported before this story
            mood_similarities = [
                user.current_mood.similarity_to(before_mood)
                for before_mood, _ in story.mood_associations
            ]
            if mood_similarities:
                score += np.mean(mood_similarities) * 2.0  # Weight: 2.0

        # 2. MOOD IMPROVEMENT POTENTIAL
        if story.avg_mood_change is not None:
            # Stories that historically improve mood
            score += story.avg_mood_change * 3.0  # Weight: 3.0

        # 3. PERSONAL MOOD HISTORY WITH THIS STORY
        if story.id in user.story_mood_impact:
            score += user.story_mood_impact[story.id] * 2.5  # Weight: 2.5

        # 4. THEME PREFERENCES
        theme_score = user.theme_scores.get(story.theme, 0)
        score += theme_score * 1.5  # Weight: 1.5

        # Penalize avoided themes heavily
        if story.theme in user.get_avoided_themes():
            score -= 5.0

        # 5. COLLABORATIVE FILTERING
        collab_score = self._collaborative_filtering_score(user, story)
        score += collab_score * 2.0  # Weight: 2.0

        # 6. CONTENT-BASED (similar to favorited/completed stories)
        content_score = self._content_based_score(user, story)
        score += content_score * 1.5  # Weight: 1.5

        # 7. FAVORITES SIMILARITY
        if user.favorited_stories:
            favorite_similarities = [
                self._story_similarity(story.id, fav_id)
                for fav_id in user.favorited_stories
                if fav_id in self.stories
            ]
            if favorite_similarities:
                score += max(favorite_similarities) * 2.0  # Weight: 2.0

        # 8. PROMOTIONAL BOOST
        if 'promotional_tags' in context:
            promo_tags = set(context['promotional_tags'])
            if any(tag in promo_tags for tag in story.tags):
                score += 1.5  # Promotional boost

        # 9. NOVELTY (slight preference for unseen stories)
        if story.id not in user.viewed_stories:
            score += 0.5

        return score

    def _collaborative_filtering_score(self, user: UserProfile, story: Story) -> float:
        """
        Score based on what similar users liked.
        Users are similar if they completed/favorited similar stories.
        """
        if not user.completed_stories and not user.favorited_stories:
            return 0.0

        # Find users who liked stories this user liked
        user_liked = user.completed_stories | user.favorited_stories

        similar_users_scores = []
        for other_user_id, other_user in self.users.items():
            if other_user_id == user.user_id:
                continue

            other_liked = other_user.completed_stories | other_user.favorited_stories

            # Jaccard similarity
            intersection = len(user_liked & other_liked)
            union = len(user_liked | other_liked)

            if union == 0:
                continue

            similarity = intersection / union

            # If similar user liked this story, boost score
            if story.id in other_liked:
                similar_users_scores.append(similarity)

        if similar_users_scores:
            return max(similar_users_scores)
        return 0.0

    def _content_based_score(self, user: UserProfile, story: Story) -> float:
        """
        Score based on similarity to stories user completed/favorited.
        """
        user_liked = user.completed_stories | user.favorited_stories

        if not user_liked:
            return 0.0

        similarities = [
            self._story_similarity(story.id, liked_id)
            for liked_id in user_liked
            if liked_id in self.stories
        ]

        if similarities:
            return max(similarities)
        return 0.0

    def _story_similarity(self, story_id1: str, story_id2: str) -> float:
        """
        Calculate similarity between two stories based on theme and tags.
        Uses caching for efficiency.
        """
        if story_id1 == story_id2:
            return 1.0

        # Check cache (bidirectional)
        cache_key = tuple(sorted([story_id1, story_id2]))
        if cache_key in self._story_similarity_cache:
            return self._story_similarity_cache[cache_key]

        story1 = self.stories.get(story_id1)
        story2 = self.stories.get(story_id2)

        if not story1 or not story2:
            return 0.0

        similarity = 0.0

        # Theme match
        if story1.theme == story2.theme:
            similarity += 0.5

        # Tag overlap (Jaccard similarity)
        if story1.tags and story2.tags:
            tags1 = set(story1.tags)
            tags2 = set(story2.tags)
            intersection = len(tags1 & tags2)
            union = len(tags1 | tags2)
            if union > 0:
                similarity += 0.5 * (intersection / union)

        # Cache result
        self._story_similarity_cache[cache_key] = similarity

        return similarity

    # State management
    def save_state(self) -> Dict:
        """Export system state as a dictionary"""
        return {
            'stories': {sid: story.to_dict() for sid, story in self.stories.items()},
            'users': {
                uid: {
                    'user_id': user.user_id,
                    'viewed_stories': list(user.viewed_stories),
                    'completed_stories': list(user.completed_stories),
                    'favorited_stories': list(user.favorited_stories),
                    'theme_scores': dict(user.theme_scores),
                    'mood_history': [(ts.isoformat(), mood.to_dict()) for ts, mood in user.mood_history],
                    'current_mood': user.current_mood.to_dict() if user.current_mood else None,
                    'story_mood_impact': user.story_mood_impact,
                    'recent_story_views': [(ts.isoformat(), sid) for ts, sid in user.recent_story_views]
                }
                for uid, user in self.users.items()
            },
            'events': [event.to_dict() for event in self.events]
        }

    def load_state(self, state: Dict):
        """Load system state from a dictionary"""
        # Load stories
        self.stories = {
            sid: Story.from_dict(data)
            for sid, data in state.get('stories', {}).items()
        }

        # Rebuild theme index and drop similarities cached for the old catalog
        self._theme_to_stories = defaultdict(list)
        for sid, story in self.stories.items():
            self._theme_to_stories[story.theme].append(sid)
        self._story_similarity_cache = {}

        # Load users
        self.users = {}
        for uid, user_data in state.get('users', {}).items():
            user = UserProfile(uid)
            user.viewed_stories = set(user_data['viewed_stories'])
            user.completed_stories = set(user_data['completed_stories'])
            user.favorited_stories = set(user_data['favorited_stories'])
            user.theme_scores = defaultdict(float, user_data['theme_scores'])
            user.mood_history = [
                (datetime.fromisoformat(ts), MoodScore.from_dict(mood))
                for ts, mood in user_data['mood_history']
            ]
            if user_data['current_mood']:
                user.current_mood = MoodScore.from_dict(user_data['current_mood'])
            user.story_mood_impact = user_data['story_mood_impact']
            user.recent_story_views = [
                (datetime.fromisoformat(ts), sid)
                for ts, sid in user_data['recent_story_views']
            ]
            self.users[uid] = user

        # Load events
        self.events = [
            AnalyticsEvent.from_dict(event_data)
            for event_data in state.get('events', [])
        ]
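
`_calculate_mood_improvement` above adds up every dimension, so a fall in a negative-affect item such as `distressed` counts against the story. One possible refinement, sketched here as a stand-alone function, flips the sign of the change on negative items. The `NEGATIVE_ITEMS` set and the function name are assumptions for illustration; the real list would come from whichever PANAS items the app ends up using.

```python
from typing import Dict

# Assumed set of negative-affect items; adjust to the final mood dimensions.
NEGATIVE_ITEMS = {"distressed", "upset", "guilty", "scared", "hostile"}


def signed_mood_improvement(before: Dict[str, int], after: Dict[str, int]) -> float:
    """Mean per-item change, counting a drop in a negative item as a gain."""
    shared = before.keys() & after.keys()
    if not shared:
        return 0.0
    total = 0.0
    for item in shared:
        delta = after[item] - before[item]
        total += -delta if item in NEGATIVE_ITEMS else delta
    return total / len(shared)


before = {"interested": 3, "excited": 2, "strong": 3, "distressed": 2, "upset": 1}
after = {"interested": 4, "excited": 4, "strong": 4, "distressed": 1, "upset": 1}
print(signed_mood_improvement(before, after))
```

For these two moods the plain sum used in the class gives 0.6, while the signed version gives 1.0, because the drop in `distressed` now counts in the story's favour.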

Initialize recommender

In [4]:
recommender = StoryRecommender()

Add some stories

In [5]:
recommender.add_story("story1", "The Happy Garden", "nature", ["uplifting", "peaceful"])
recommender.add_story("story2", "Dark Night", "mystery", ["thriller", "suspense"])
recommender.add_story("story3", "Summer Joy", "nature", ["uplifting", "warm"])
recommender.add_story("story4", "The Detective", "mystery", ["investigation", "clever"])

Simulate user events

In [6]:
user_id = "user123"

# User provides initial mood (using simplified PANAS - just a few dimensions for demo)
mood_dimensions = {
    'interested': 3,
    'excited': 2,
    'strong': 3,
    'distressed': 2,
    'upset': 1
}

event1 = AnalyticsEvent(
    user_id,
    'mood_general',
    datetime.now(),
    mood_score=mood_dimensions
)
recommender.add_event(event1)

# User views and completes a story
event2 = AnalyticsEvent(user_id, 'view', datetime.now(), story_id='story1')
recommender.add_event(event2)

event3 = AnalyticsEvent(user_id, 'complete', datetime.now(), story_id='story1')
recommender.add_event(event3)

# User's mood after story (improved)
mood_after = {
    'interested': 4,
    'excited': 4,
    'strong': 4,
    'distressed': 1,
    'upset': 1
}
event4 = AnalyticsEvent(user_id, 'mood_after', datetime.now(),
                       story_id='story1', mood_score=mood_after)
recommender.add_event(event4)

Get recommendations

In [7]:
current_mood_context = {
    'current_mood': MoodScore(mood_after),
    'promotional_tags': ['uplifting']
}

recommendations = recommender.get_recommendations(user_id, current_mood_context, n_recommendations=5)

print("\nRecommendations:")
for story_id, score in recommendations:
    story = recommender.stories[story_id]
    print(f"  {story.title} (Theme: {story.theme}) - Score: {score:.2f}")



Recommendations:
  Summer Joy (Theme: nature) - Score: 5.10
  Dark Night (Theme: mystery) - Score: 0.50
  The Detective (Theme: mystery) - Score: 0.50
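
The 5.10 for Summer Joy can be reproduced by hand from the weights in `_score_story_for_user`. The stand-alone arithmetic below retraces it from the events simulated earlier; the variable names exist only for this walkthrough.

```python
# Theme score for 'nature' after the view, complete and mood_after events
mood_change = ((4 + 4 + 4 + 1 + 1) - (3 + 2 + 3 + 2 + 1)) / 5  # about 0.6
theme_score = 0.1 + 1.0 + 0.5 * mood_change                    # about 1.4

# Similarity between Summer Joy and the completed story, The Happy Garden:
# same theme (0.5) plus half the Jaccard overlap of their tags
tags_a = {"uplifting", "warm"}
tags_b = {"uplifting", "peaceful"}
similarity = 0.5 + 0.5 * len(tags_a & tags_b) / len(tags_a | tags_b)

score = (
    theme_score * 1.5    # theme preference
    + similarity * 1.5   # content-based match
    + 1.5                # promotional tag 'uplifting'
    + 0.5                # novelty: not yet viewed
)
print(f"{score:.2f}")
```

The two mystery stories only pick up the 0.5 novelty bonus, since the user has no theme score, mood data or similar liked story for them.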


Save state

In [8]:
state = recommender.save_state()
print(f"\nState saved. Total events: {len(state['events'])}")


State saved. Total events: 4
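
Since `save_state` returns dictionaries, lists, strings and numbers, the result should serialise with the standard `json` module once persistence moves beyond memory. The stand-alone sketch below uses a hand-written fragment shaped like a `mood_history` entry rather than the real recommender, and shows that tuples come back as lists, which the tuple unpacking in `load_state` tolerates.

```python
import json
from datetime import datetime

# Hand-written fragment mimicking one user's entry in the saved state
state = {
    "users": {
        "user123": {
            "mood_history": [
                (datetime(2026, 2, 2, 9, 30).isoformat(), {"interested": 3, "upset": 1}),
            ],
        }
    },
    "events": [],
}

text = json.dumps(state)     # tuples are written as JSON arrays
restored = json.loads(text)  # and come back as Python lists

entry = restored["users"]["user123"]["mood_history"][0]
ts, mood = entry             # unpacking works for lists as well as tuples
print(type(entry).__name__)
print(datetime.fromisoformat(ts), mood)
```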
