<a href="https://colab.research.google.com/github/Bekyilma/DLH_RecSys/blob/main/Projects/Projet1_Art/Project_1_ArtVibe_Recommender.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ArtVibe Gallery Recommender

Welcome to the **ArtVibe Gallery Recommender** project! Your team will design a web app that recommends artworks based on style and mood, with AI-generated explanations and image displays. No coding is required: you complete the daily tasks using dropdowns and buttons.

## Introduction to Recommendation Systems

### What is a Recommendation System?
A recommendation system is a tool that suggests items to users based on their preferences or behavior. For example, Netflix suggests movies you might like, or Spotify recommends songs. There are two main types of recommendation systems:
- **Content-Based Filtering**: Recommends items similar to what you like, based on item features (e.g., style, mood of an artwork).
- **Collaborative Filtering**: Recommends items based on what similar users like (not used in this project).

### How Does ArtVibe Work?
ArtVibe uses **content-based filtering** to recommend artworks:
1. **Features**: We use the artwork’s style (e.g., "Oil Painting") and mood (e.g., "calm") as input features.
2. **Algorithm**:
   - **Sentence Embeddings**: We convert artwork descriptions (e.g., "The Starry Night Oil on canvas European") into numerical vectors using a pre-trained model called `all-MiniLM-L6-v2`. This model understands the meaning of text and represents it as numbers.
   - **Cosine Similarity**: We measure how similar the user’s input (e.g., style "Oil Painting" and mood "energetic") is to each artwork’s description vector. The 5 most similar artworks are recommended.
   - **Weighting**: Sliders control how much style vs. mood influences the recommendations (e.g., 70% style, 30% mood). The style and mood vectors are blended with these weights before the similarity is computed.
3. **Generative AI for Explanations**: We use a language model (`distilgpt2`) to generate explanations, such as why an artwork matches your preferences. This model creates human-like text based on the prompt you provide.
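
The weighting and similarity steps above can be sketched with plain NumPy. This is a minimal illustration rather than the notebook's actual pipeline: the 3-dimensional vectors are made-up stand-ins for the 384-dimensional embeddings produced by `all-MiniLM-L6-v2`, and `cosine_sim` and `recommend` are hypothetical helper names.

```python
import numpy as np

def cosine_sim(query, matrix):
    """Cosine similarity between one query vector and each row of a matrix."""
    query_unit = query / np.linalg.norm(query)
    rows_unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return rows_unit @ query_unit

def recommend(style_vec, mood_vec, artwork_vecs, style_weight=0.7, top_k=2):
    """Blend style and mood vectors, then return the indices of the top_k closest artworks."""
    mood_weight = 1.0 - style_weight
    query = style_weight * style_vec + mood_weight * mood_vec
    scores = cosine_sim(query, artwork_vecs)
    # argsort is ascending, so take the last top_k entries and reverse them
    top = np.argsort(scores)[-top_k:][::-1]
    return top.tolist(), scores

# Toy vectors (assumed values, for illustration only)
style_vec = np.array([1.0, 0.0, 0.0])
mood_vec = np.array([0.0, 1.0, 0.0])
artwork_vecs = np.array([
    [0.9, 0.1, 0.0],  # artwork 0: close to the style vector
    [0.1, 0.9, 0.0],  # artwork 1: close to the mood vector
    [0.0, 0.0, 1.0],  # artwork 2: unrelated to both
])

top_style, _ = recommend(style_vec, mood_vec, artwork_vecs, style_weight=0.7)
top_mood, _ = recommend(style_vec, mood_vec, artwork_vecs, style_weight=0.3)
print(top_style)  # style-heavy weighting ranks artwork 0 first
print(top_mood)   # mood-heavy weighting ranks artwork 1 first
```

Shifting the weight from style to mood swaps the order of the first two artworks, which is the effect the sliders are meant to expose.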

### Daily Tasks Explained
- **Day 1**: Define a user persona to understand their art preferences. Test recommendations to see if they match the persona, focusing on visual appeal (images).
- **Day 2**: Design the logic for how recommendations are prioritized (e.g., style over mood). Analyze how different inputs affect outputs.
- **Day 3**: Create a UI mockup to visualize the app. Test recommendations to ensure they fit the UI design.
- **Day 4**: Generate AI explanations to enhance user trust, refining prompts for better quality.
- **Day 5**: Evaluate recommendations for relevance and prepare a presentation to pitch your app.

## Learning Goals
1. **Content-Based Filtering**: Learn how artwork features (style, mood) drive personalized recommendations.
2. **Human-Centered Design (HCI)**: Design a user-friendly app emphasizing trust and engagement.
3. **Generative AI**: Craft AI explanations to make recommendations appealing.
4. **Evaluation**: Assess recommendation quality for relevance and diversity.
5. **Collaboration**: Discuss design choices and present ideas as a team.
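
For the evaluation goal, one simple way to put a number on diversity is to count how many distinct values of an attribute appear in a recommendation list. The sketch below is only one possible measure, assuming the `Culture` column as the attribute; `attribute_diversity` is a hypothetical helper and the top-5 list is made up.

```python
def attribute_diversity(recommendations, key):
    """Share of distinct values of `key` in a recommendation list (1.0 = all different)."""
    if not recommendations:
        return 0.0
    values = [item[key] for item in recommendations]
    return len(set(values)) / len(values)

# Made-up top-5 list for illustration
top5 = [
    {'Title': 'Artwork A', 'Culture': 'Dutch'},
    {'Title': 'Artwork B', 'Culture': 'Dutch'},
    {'Title': 'Artwork C', 'Culture': 'Chinese'},
    {'Title': 'Artwork D', 'Culture': 'British'},
    {'Title': 'Artwork E', 'Culture': 'Dutch'},
]
print(attribute_diversity(top5, 'Culture'))  # 3 distinct cultures out of 5 items
```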

## Competition
Compete against the EduResource team!
- **Daily Challenges** (10 points each, except Day 5):
  - Day 1: Best Persona Story (3-5-min pitch, instructor/AI-judged).
  - Day 2: Logic Innovation (unique recommendation rule, instructor/AI-judged).
  - Day 3: UI Creativity (peer vote on mockup, instructor/AI-judged).
  - Day 4: Best Explanation (peer vote, instructor/AI-judged).
  - Day 5: Pitch Battle (50 points, instructor/AI 50%, peer 50%).
- **Leaderboard**: Updated daily on Teams.
- **Awards**: Most Creative, Best Pitch, User Champion. Each day, one or two members of the winning team receive a free Immersive Digital Art Therapy session.

## Instructions
1. Click `Runtime` > `Run all` to set up (~1 min).
2. Complete daily tasks in sections below.
3. Copy outputs (text, images) to Google Slides for your presentation.
4. Ask your instructor if you get stuck.

## Daily Tasks Overview
- **Day 1**: Create persona, test recommendations with images.
- **Day 2**: Design recommendation logic, compare inputs.
- **Day 3**: Create mockup, generate recommendations with images.
- **Day 4**: Generate 3 explanations, refine prompts.
- **Day 5**: Evaluate recommendations, present.

## Install Dependencies
This section installs the libraries needed for the project.

In [1]:
# Install required libraries
!pip install -q numpy pandas scikit-learn sentence-transformers ipywidgets transformers torch pillow googletrans==3.1.0a0 requests

# numpy: For numerical operations
# pandas: For handling datasets
# scikit-learn: For cosine similarity
# sentence-transformers: For generating sentence embeddings
# ipywidgets: For interactive dropdowns and buttons
# transformers: For generative AI (explanations)
# torch: Backend for transformers
# pillow: For displaying images
# googletrans: For translating non-English titles
# requests: For making API calls to fetch image URLs

## Load Dataset
This section loads a curated dataset of 100 unique paintings with images from The Metropolitan Museum of Art (CC0 license). All artworks have valid image URLs.

The dataset is filtered in the background to include only paintings with existing image files, ensuring all recommendations have visuals.
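
The background filtering can be pictured on a tiny table. The sketch below uses a made-up four-row DataFrame with the same column names as the Met Museum file; the rows and values are assumptions for illustration, not real records.

```python
import pandas as pd

# Tiny made-up sample standing in for the Met Museum table
sample = pd.DataFrame({
    'Title': ['Harbor at Dusk', 'Bronze Vessel', 'Untitled (?)', 'Garden Study'],
    'Classification': ['Paintings', 'Metalwork', 'Paintings', 'Paintings'],
    'Medium': ['Oil on canvas', 'Bronze', 'Oil on wood', None],
    'Culture': ['Dutch', 'Chinese', 'French', 'British'],
})

# Keep rows that are paintings, have a medium, and whose title is not marked "(?)"
mask = (
    sample['Classification'].str.contains('Painting', na=False)
    & sample['Medium'].notnull()
    & ~sample['Title'].str.contains('(?)', regex=False, na=False)
)
paintings = sample[mask].copy()

# Build the text that is later turned into an embedding: title + medium + culture
paintings['description'] = (
    paintings['Title'] + ' ' + paintings['Medium'] + ' ' + paintings['Culture']
)
print(paintings['description'].tolist())
```

Only the first row survives: the second is not a painting, the third has an uncertain title, and the fourth has no medium.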

In [2]:
import pandas as pd
from googletrans import Translator
import re
import requests

# Initialize translator for non-English titles
translator = Translator()

# Load the Met Museum dataset
url = 'https://media.githubusercontent.com/media/metmuseum/openaccess/refs/heads/master/MetObjects.csv'
# low_memory=False reads the file in one pass and avoids the mixed-dtype warning
data = pd.read_csv(url, low_memory=False)

# Filter for paintings with valid metadata
data = data[
    (data['Classification'].str.contains('Painting', na=False)) &
    (data['Artist Display Name'].notnull()) &
    (~data['Artist Display Name'].str.contains('nan', na=False)) &
    (~data['Title'].str.contains('\\(\\?\\)', na=False)) &
    (data['Medium'].notnull()) &
    (data['Culture'].notnull()) &
    (data['Is Public Domain'] == True)
]

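# Function to fetch image URL from Met API
def get_image_url(object_id):
    """Return the primary image URL for a Met object, or '' if none is available."""
    try:
        response = requests.get(
            f'https://collectionapi.metmuseum.org/public/collection/v1/objects/{object_id}',
            timeout=10
        )
        if response.status_code == 200:
            return response.json().get('primaryImage', '')
    except (requests.RequestException, ValueError):
        # Network error or invalid JSON: treat the artwork as having no image
        return ''
    return ''

# Look up image URLs one painting at a time and stop after 100 hits,
# so the API is not called for every painting in the collection
image_urls = {}
for object_id in data['Object ID']:
    image_url = get_image_url(object_id)
    if image_url:
        image_urls[object_id] = image_url
        if len(image_urls) == 100:
            break

# Keep only artworks with valid image URLs
data = data[data['Object ID'].isin(list(image_urls))].copy()
data['image_url'] = data['Object ID'].map(image_urls)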

# Create description field
data['description'] = data['Title'] + ' ' + data['Medium'].fillna('') + ' ' + data['Culture'].fillna('')

# Select relevant columns
data = data[['Object ID', 'Title', 'Artist Display Name', 'Classification', 'Medium', 'Culture', 'description', 'image_url']]

# Convert to list of dictionaries
curated_artworks = data.to_dict('records')

# Convert to DataFrame
art_data = pd.DataFrame(curated_artworks)

# Translate non-English titles (if any)
def translate_text(text):
    # Convert input to string to handle potential non-string types
    text_str = str(text)
    # Only translate non-empty text that contains no Latin letters
    if text_str and not re.search(r'[a-zA-Z]', text_str):
        try:
            return translator.translate(text_str, dest='en').text
        except Exception:
            # Return the original string in case of a translation error
            return text_str
    return text_str

art_data['Title'] = art_data['Title'].apply(translate_text)
art_data['description'] = art_data['description'].astype(str)

print(f'ArtVibe dataset ready ({len(art_data)} paintings with images)!')

ArtVibe dataset ready (100 paintings with images)!


In [3]:
art_data.head()

| | Object ID | Title | Artist Display Name | Classification | Medium | Culture | description | image_url |
|---|---|---|---|---|---|---|---|---|
| 0 | 33284 | Portrait of Walter Devereux (1539–1576), First... | British Painter | Miscellaneous-Paintings & Portraits | Oil on wood | British | Portrait of Walter Devereux (1539–1576), First... | https://images.metmuseum.org/CRDImages/aa/orig... |
| 1 | 35654 | Cosimo II de' Medici (1590–1621), Grand Duke o... | Justus Sustermans | Miscellaneous-Paintings & Portraits | Oil on canvas, transferred from wood | Flemish | Cosimo II de' Medici (1590–1621), Grand Duke o... | https://images.metmuseum.org/CRDImages/aa/orig... |
| 2 | 35968 | 清 佚名 台南地區荷蘭城堡\|Forts Zeelandia and Provintia ... | Unidentified artist | Paintings | Wall hanging; ink and color on deerskin | China | 清 佚名 台南地區荷蘭城堡\|Forts Zeelandia and Provintia ... | https://images.metmuseum.org/CRDImages/as/orig... |
| 3 | 35970 | 明 丁雲鵬 潯陽送客圖 軸\|Song of the Lute | Ding Yunpeng | Paintings | Hanging scroll; ink and color on paper | China | 明 丁雲鵬 潯陽送客圖 軸\|Song of the Lute Hanging s... | https://images.metmuseum.org/CRDImages/as/orig... |
| 4 | 35971 | 清 佚名 倣王翬 倣李成山水圖 軸\|Landscape after Li C... | Wang Hui\|Unidentified artist | Paintings | Hanging scroll; ink on silk | China | 清 佚名 倣王翬 倣李成山水圖 軸\|Landscape after Li C... | https://images.metmuseum.org/CRDImages/as/orig... |


## Recommendation Section
Generate recommendations with images for Days 1-3.

**Instructions**:
1. Select style and mood using the dropdowns.
2. Adjust the weights for style and mood using the sliders (e.g., 70% style, 30% mood).
3. Click “Get Recommendations.”
4. View top 5 artworks with images.
5. Copy text and screenshot images to Slides.

In [4]:
# Import libraries for recommendation
import pandas as pd
import ipywidgets as widgets
from IPython.display import display, Image
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

# Load the pre-trained sentence transformer model for embeddings
art_model = SentenceTransformer('all-MiniLM-L6-v2')

# Generate embeddings for artwork descriptions
# Embeddings are numerical representations of text that capture meaning
art_embeddings = art_model.encode(art_data['description'].tolist())

# Create dropdown widgets for user input
style_dropdown = widgets.Dropdown(options=['Painting', 'Oil Painting', 'Watercolor'], description='Style:')
mood_dropdown = widgets.Dropdown(options=['calm', 'energetic', 'melancholic'], description='Mood:')

# Create sliders for controlling style and mood weights
style_weight_slider = widgets.FloatSlider(value=0.7, min=0.0, max=1.0, step=0.1, description='Style Weight:')
mood_weight_slider = widgets.FloatSlider(value=0.3, min=0.0, max=1.0, step=0.1, description='Mood Weight:')

# Ensure weights sum to 1 (rounded to the slider step to avoid floating-point drift)
def update_mood_weight(change):
    mood_weight_slider.value = round(1.0 - change['new'], 1)

def update_style_weight(change):
    style_weight_slider.value = round(1.0 - change['new'], 1)

style_weight_slider.observe(update_mood_weight, names='value')
mood_weight_slider.observe(update_style_weight, names='value')

button = widgets.Button(description='Get Recommendations')
output = widgets.Output()

def on_button_clicked(b):
    with output:
        # Clear previous output
        output.clear_output()

        # Get user inputs
        style = style_dropdown.value
        mood = mood_dropdown.value
        style_weight = style_weight_slider.value
        mood_weight = mood_weight_slider.value

        # Generate embeddings for style and mood separately
        style_embedding = art_model.encode([style])[0]
        mood_embedding = art_model.encode([mood])[0]

        # Combine embeddings using weights
        weighted_embedding = style_weight * style_embedding + mood_weight * mood_embedding

        # Compute cosine similarity between weighted embedding and artwork embeddings
        similarities = cosine_similarity([weighted_embedding], art_embeddings)

        # Get indices of the top 5 most similar artworks
        top_indices = similarities.argsort()[0][-5:][::-1]

        # Display the recommendations
        print(f'Recommended Artworks (Style Weight: {style_weight*100:.0f}%, Mood Weight: {mood_weight*100:.0f}%):')
        for idx in top_indices:
            title = art_data.iloc[idx]['Title']
            artist = art_data.iloc[idx]['Artist Display Name']
            classification = art_data.iloc[idx]['Classification']
            description = art_data.iloc[idx]['description']
            image_url = art_data.iloc[idx]['image_url']
            print(f'{title} by {artist} ({classification})')
            print(f'Description: {description}\n')
            try:
                # Display the image using the URL
                display(Image(url=image_url, width=150, height=150))
            except Exception as e:
                print(f'Error displaying image: {e}\n')

button.on_click(on_button_clicked)
display(widgets.VBox([style_dropdown, mood_dropdown, style_weight_slider, mood_weight_slider, button, output]))


## Explanation Section
Generate explanations for Day 4.

**Instructions**:
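1. Enter a prompt in the format `Explain why "Madonna and Child" is recommended for a user who likes calm art`. Put the artwork title in straight double quotes and keep the phrase "who likes" so the title and preference can be extracted.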
2. Click “Generate Explanation.”
3. Copy to Slides.
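
How the title and preference are pulled out of a prompt can be sketched with regular expressions. This is an illustrative alternative to the string splitting used in the cell below, not the notebook's own code; `parse_prompt` and its default values are hypothetical names chosen to mirror the cell's fallbacks.

```python
import re

def parse_prompt(prompt, default_title="the artwork", default_preference="calm art"):
    """Extract the quoted artwork title and the text after 'who likes'."""
    # Accept straight or curly double quotes around the title
    title_match = re.search(r'["“]([^"”]+)["”]', prompt)
    pref_match = re.search(r'who likes\s+(.+)', prompt)
    title = title_match.group(1).strip() if title_match else default_title
    preference = pref_match.group(1).strip() if pref_match else default_preference
    return title, preference

print(parse_prompt('Explain why "Madonna and Child" is recommended for a user who likes calm art'))
print(parse_prompt('Tell me about this painting'))
```

A prompt that does not follow the expected format falls back to the defaults instead of raising an error.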

In [5]:
# Import libraries for generative AI
import warnings
from transformers import pipeline
import ipywidgets as widgets
from IPython.display import display
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import re

# Suppress warnings from transformers to hide pad_token_id and max_new_tokens messages
warnings.filterwarnings("ignore", category=UserWarning, module="transformers")

# Initialize the text generation pipeline with distilgpt2
# distilgpt2 is a small causal language model, which is what the 'text-generation' task expects
explainer = pipeline('text-generation', model='distilgpt2', device=-1)

# Load the pre-trained sentence transformer model for embeddings (same as recommendation section)
art_model = SentenceTransformer('all-MiniLM-L6-v2')

# Create widgets for user input
prompt_input = widgets.Text(value='Explain why "Madonna and Child" is recommended for a user who likes calm art', description='Prompt:', layout={'width': '500px'})
style_weight_input = widgets.FloatSlider(value=0.7, min=0.0, max=1.0, step=0.1, description='Style Weight:')
mood_weight_input = widgets.FloatSlider(value=0.3, min=0.0, max=1.0, step=0.1, description='Mood Weight:')
gen_button = widgets.Button(description='Generate Explanation')
gen_output = widgets.Output()

# Ensure weights sum to 1 (rounded to the slider step to avoid floating-point drift)
def update_mood_weight(change):
    mood_weight_input.value = round(1.0 - change['new'], 1)

def update_style_weight(change):
    style_weight_input.value = round(1.0 - change['new'], 1)

style_weight_input.observe(update_mood_weight, names='value')
mood_weight_input.observe(update_style_weight, names='value')

def on_gen_button_clicked(b):
    with gen_output:
        # Clear previous output
        gen_output.clear_output()

        # Get the user prompt
        base_prompt = prompt_input.value

        # Extract the artwork title and user preference from the prompt
        # Assumes format: "Explain why \"Artwork Title\" is recommended for a user who likes calm art"
        try:
            artwork_title = base_prompt.split('\"')[1]
            user_preference = base_prompt.split("who likes ")[1]
        except IndexError:
            artwork_title = "the artwork"
            user_preference = "calm art"

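        # Extract style and mood from the user preference (assumes format: "calm art")
        preference_words = user_preference.split()
        mood = preference_words[0] if preference_words else "calm"  # e.g., "calm"
        style = None
        # Check the more specific styles first so "Oil Painting" is not shadowed by "Painting"
        for s in ['Oil Painting', 'Watercolor', 'Painting']:
            if s.lower() in user_preference.lower():
                style = s
                break
        if not style:
            style = "Painting"  # Default style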

        # Find the artwork in the dataset to get metadata
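        # regex=False treats the title literally, so characters such as "(" or "?" do not break the match
        artwork_row = art_data[art_data['Title'].str.contains(artwork_title, na=False, case=False, regex=False)]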
        if not artwork_row.empty:
            artist = artwork_row.iloc[0]['Artist Display Name']
            medium = artwork_row.iloc[0]['Medium']
            culture = artwork_row.iloc[0]['Culture']
            description = artwork_row.iloc[0]['description']
        else:
            artist = "the artist"
            medium = "traditional medium"
            culture = "its cultural context"
            description = artwork_title

        # Recompute the similarity score for this artwork using the recommendation logic
        style_weight = style_weight_input.value
        mood_weight = mood_weight_input.value

        # Generate embeddings for style and mood separately
        style_embedding = art_model.encode([style])[0]
        mood_embedding = art_model.encode([mood])[0]

        # Combine embeddings using weights
        weighted_embedding = (style_weight * style_embedding + mood_weight * mood_embedding)

        # Generate embedding for the artwork's description
        artwork_embedding = art_model.encode([description])[0]

        # Compute cosine similarity
        similarity_score = cosine_similarity([weighted_embedding], [artwork_embedding])[0][0]

        # Create a detailed prompt with dataset and recommendation context
        focused_prompt = (
            f"You are an art historian explaining a painting recommendation. The painting '{artwork_title}' by {artist}, "
            f"created using {medium} in {culture}, was recommended for a user who prefers {user_preference}. "
            f"The recommendation was based on a {style_weight*100:.0f}% weighting for style ('{style}') and a {mood_weight*100:.0f}% "
            f"weighting for mood ('{mood}'), resulting in a similarity score of {similarity_score:.2f}. "
            f"Explain why this painting aligns with the user's preference, focusing on its artistic elements (e.g., colors, composition, emotional tone) "
            f"and how the style and mood weights contributed to the recommendation. Keep the explanation concise and under 50 words."
        )

        try:
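            # Generate the explanation; max_new_tokens caps only the generated text,
            # so a prompt longer than 50 tokens does not use up the whole budget
            explanation = explainer(
                focused_prompt,
                max_new_tokens=50,
                num_return_sequences=1,
                do_sample=True,
                top_k=40,
                top_p=0.9,
                truncation=True
            )[0]['generated_text']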

            # Clean the output to remove the prompt if repeated
            if explanation.startswith(focused_prompt):
                explanation = explanation[len(focused_prompt):].strip()

            # Trim to the first complete sentence for conciseness
            first_sentence_end = explanation.find('. ') + 1
            if first_sentence_end > 0:
                explanation = explanation[:first_sentence_end]

            # If the explanation is too short or empty, provide a default
            if len(explanation.split()) < 5:
                explanation = (
                    f"'{artwork_title}' aligns with {user_preference} due to its {medium} and {culture} roots, "
                    f"evoking a serene mood with soft colors. The {style_weight*100:.0f}% style weight emphasized its {style} form."
                )

            print('Explanation:', explanation)
        except Exception as e:
            # Fallback explanation if generation fails
            explanation = (
                f"'{artwork_title}' aligns with {user_preference} due to its {medium} and {culture} roots, "
                f"evoking a serene mood with soft colors. The {style_weight*100:.0f}% style weight emphasized its {style} form."
            )
            print('Explanation:', explanation)

gen_button.on_click(on_gen_button_clicked)
display(widgets.VBox([prompt_input, style_weight_input, mood_weight_input, gen_button, gen_output]))


## Evaluation Section
Rate recommendations for Day 5.

**Instructions**:
1. Enter artwork name.
2. Rate 1-5 stars.
3. Repeat for 3 artworks.
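
Once a few ratings are recorded, they can be condensed into numbers for the presentation. The sketch below assumes the `(artwork, stars)` tuple format used by the `ratings` list in the cell that follows; `summarize_ratings`, the 4-star threshold for "relevant", and the example ratings are assumptions for illustration.

```python
from statistics import mean

def summarize_ratings(ratings):
    """Return the average rating and the share of ratings that are 4 stars or higher."""
    if not ratings:
        return None, None
    stars = [rating for _, rating in ratings]
    average = mean(stars)
    share_relevant = sum(1 for s in stars if s >= 4) / len(stars)
    return average, share_relevant

# Made-up ratings in the same (artwork, stars) format
example_ratings = [('Madonna and Child', 5), ('Song of the Lute', 4), ('Landscape', 2)]
average, share_relevant = summarize_ratings(example_ratings)
print(f'Average rating: {average:.2f}, rated 4+ stars: {share_relevant:.0%}')
```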

In [6]:
# Create widgets for evaluation
import ipywidgets as widgets
from IPython.display import display

# Text input for artwork name
item_input = widgets.Text(description='Artwork Name:', placeholder='e.g., Madonna and Child')

# Dropdown for rating (1-5 stars)
rating_dropdown = widgets.Dropdown(options=[1, 2, 3, 4, 5], description='Rating (1-5):')

# Button to submit rating
eval_button = widgets.Button(description='Submit Rating')

# Output area for displaying ratings
eval_output = widgets.Output()
ratings = []

def on_eval_button_clicked(b):
    with eval_output:
        # Clear previous output
        eval_output.clear_output()

        # Get user input
        item = item_input.value
        rating = rating_dropdown.value

        # Store the rating
        ratings.append((item, rating))

        # Display the recorded rating and all ratings
        print(f'Recorded: {item} - {rating} stars')
        print('Current Ratings:', ratings)

eval_button.on_click(on_eval_button_clicked)
display(widgets.VBox([item_input, rating_dropdown, eval_button, eval_output]))
