
**Install packages**

In [1]:
!pip install pyecore

Collecting pyecore
  Downloading pyecore-0.15.1-py3-none-any.whl (43 kB)
Collecting ordered-set>=4.0.1 (from pyecore)
  Downloading ordered_set-4.1.0-py3-none-any.whl (7.6 kB)
Collecting restrictedpython>=5.3,>=6.1 (from pyecore)
  Downloading RestrictedPython-7.1-py3-none-any.whl (26 kB)
Collecting future-fstrings (from pyecore)
  Downloading future_fstrings-1.2.0-py2.py3-none-any.whl (6.1 kB)
Installing collected packages: restrictedpython, ordered-set, future-fstrings, pyecore
Successfully installed future-fstrings-1.2.0 ordered-set-4.1.0 pyecore-0.15.1 restrictedpython-7.1


In [2]:
!pip install pyecoregen

Collecting pyecoregen
  Downloading pyecoregen-0.5.1-py3-none-any.whl (13 kB)
Collecting pymultigen (from pyecoregen)
  Downloading pymultigen-0.2.0-py3-none-any.whl (12 kB)
Collecting autopep8 (from pyecoregen)
  Downloading autopep8-2.3.1-py2.py3-none-any.whl (45 kB)
Collecting pycodestyle>=2.12.0 (from autopep8->pyecoregen)
  Downloading pycodestyle-2.12.0-py2.py3-none-any.whl (31 kB)
Installing collected packages: pymultigen, pycodestyle, autopep8, pyecoregen
Successfully installed autopep8-2.3.1 pycodestyle-2.12.0 pyecoregen-0.5.1 pymultigen-0.2.0


**Import packages**

In [20]:
import numpy as np
from sklearn.decomposition import NMF
from sklearn.preprocessing import MinMaxScaler
from pyecore.resources import ResourceSet, URI
from pyecore.ecore import EPackage
from pyecore.resources.xmi import XMIResource
from pyecore.ecore import EFloat
from functools import partial

**Load the metamodel and generate classes**

In [17]:
# Load the metamodel
rset = ResourceSet()
resource = rset.get_resource(URI('content_recommendation.ecore'))
mm_root = resource.contents[0]
rset.metamodel_registry[mm_root.nsURI] = mm_root

# Generate Python classes from the metamodel
from pyecoregen.ecore import EcoreGenerator
generator = EcoreGenerator()
generator.generate(mm_root, 'content_recommendation_mm')


**Import generated classes**

In [18]:
from content_recommendation_mm.airecommendationsystem import (  # Import from the single file
    User, Content, Rating, Recommendation, Explanation, ExplanationFactor,
    AIRecommendationEngine, AIModel, AIModelTracer, PredictionTrace, ExplanationType,
    getEClassifier, eClassifiers
)
import content_recommendation_mm.airecommendationsystem as mm  # Import the module itself

**Add a method to trace predictions**

In [50]:
def trace_prediction(self, user, content, recommendation):
    """Record a prediction and its explanation as a PredictionTrace on this tracer."""
    trace = mm.PredictionTrace()
    trace.userId = user.id
    trace.contentId = content.id
    trace.predictedRating = recommendation.predictedRating
    trace.explanation = recommendation.explanation
    self.traces.append(trace)
    return trace

mm.AIModelTracer.tracePrediction = trace_prediction
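
The cell above attaches a plain function to a generated class so that every instance gains a new method. A minimal sketch of that pattern follows. It is self-contained and does not use the generated metamodel: `Tracer` is a hypothetical stand-in for `AIModelTracer`, and the trace is stored as a dict rather than a `PredictionTrace` object.

```python
class Tracer:
    """Hypothetical stand-in for the generated AIModelTracer class."""

    def __init__(self):
        self.traces = []


def trace_prediction(self, user_id, content_id, predicted_rating):
    """Store one prediction record on the tracer and return it."""
    record = {
        "userId": user_id,
        "contentId": content_id,
        "predictedRating": predicted_rating,
    }
    self.traces.append(record)
    return record


# Assigning the function to a class attribute turns it into a method:
# Python binds `self` automatically when it is called on an instance.
Tracer.tracePrediction = trace_prediction

tracer = Tracer()
tracer.tracePrediction(1, 3, 4.2)
print(len(tracer.traces))
```

Patching the class, rather than a single instance, means tracers created later also get the method, which is presumably why the notebook assigns to `mm.AIModelTracer` directly.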

**AI-based Recommendation Engine**

In [74]:
class OriginalAIModel(AIModel):
    def __init__(self):
        super().__init__()
        self.isExplainable = False
        self.model = NMF(n_components=2, init='random', random_state=42)
        self.user_factors = None
        self.item_factors = None
        self.scaler = MinMaxScaler(feature_range=(1.0, 5.0))  # Use float values

    def fit(self, rating_matrix):
        self.user_factors = self.model.fit_transform(rating_matrix)
        self.item_factors = self.model.components_.T

        # Fit the scaler on all predicted ratings
        all_predictions = np.dot(self.user_factors, self.item_factors.T).flatten()
        self.scaler.fit(all_predictions.reshape(-1, 1))

    def predict(self, user_idx, content_idx):
        user_vector = self.user_factors[user_idx]
        content_vector = self.item_factors[content_idx]
        predicted_rating = float(np.dot(user_vector, content_vector))  # Convert to float
        return float(self.scaler.transform([[predicted_rating]])[0][0])  # Convert to float

    def extractFactors(self, user_idx, content_idx):
        return self.user_factors[user_idx].astype(float), self.item_factors[content_idx].astype(float)

class ExplainableAIModel(AIModel):
    def __init__(self, original_model):
        super().__init__()
        self.isExplainable = True
        self.original_model = original_model

    def fit(self, rating_matrix):
        self.original_model.fit(rating_matrix)

    def predict(self, user_idx, content_idx):
        return self.original_model.predict(user_idx, content_idx)

    def extractFactors(self, user_idx, content_idx):
        return self.original_model.extractFactors(user_idx, content_idx)

    def predictAndExplain(self, user_idx, content_idx):
        predicted_rating = self.predict(user_idx, content_idx)
        user_vector, content_vector = self.extractFactors(user_idx, content_idx)

        explanation = mm.Explanation()
        explanation.type = mm.ExplanationType.FACTOR_BASED
        explanation.content = f"The predicted rating of {predicted_rating:.2f} is based on the following factors:"

        factor_values = [abs(float(u*c)) for u, c in zip(user_vector, content_vector)]
        total_importance = sum(factor_values)

        for i, (user_factor, content_factor) in enumerate(zip(user_vector, content_vector)):
            factor = ExplanationFactorImpl()  # Use the custom implementation
            factor.name = f"Latent Factor {i+1}"
            factor.value = float(user_factor * content_factor)
            factor.importance = float(abs(factor.value) / total_importance) if total_importance != 0 else 0.0
            explanation.factors.append(factor)

        return predicted_rating, explanation

def transform_model(original_model):
    return ExplainableAIModel(original_model)

class RealTimeExplainer:
    def __init__(self, ai_model):
        self.ai_model = ai_model
        self.explanation_history = []

    def explain_in_realtime(self, user_idx, content_idx):
        predicted_rating, explanation = self.ai_model.predictAndExplain(user_idx, content_idx)
        self.explanation_history.append(explanation)
        return predicted_rating, explanation

class ExplanationFactorImpl(mm.ExplanationFactor):
    def __init__(self):
        super().__init__()
        self._importance = 0.0

    @property
    def importance(self):
        return self._importance

    @importance.setter
    def importance(self, value):
        self._importance = float(value)  # Ensure the value is always converted to float

# Replace the original ExplanationFactor with our implementation
mm.ExplanationFactor = ExplanationFactorImpl

class AIRecommendationEngineImpl(AIRecommendationEngine):
    def __init__(self):
        super().__init__()
        self.original_model = OriginalAIModel()
        self.explainable_model = None
        self.real_time_explainer = None
        self.tracer = mm.AIModelTracer()
        self.user_id_map = {}
        self.content_id_map = {}

    def trainModel(self):
        self.user_id_map = {user.id: i for i, user in enumerate(self.users)}
        self.content_id_map = {content.id: i for i, content in enumerate(self.contentCatalog)}

        rating_matrix = np.zeros((len(self.users), len(self.contentCatalog)))
        for rating in self.ratings:
            user_idx = self.user_id_map[rating.user.id]
            content_idx = self.content_id_map[rating.content.id]
            rating_matrix[user_idx, content_idx] = rating.score

        rating_matrix += 0.01  # Add small constant to avoid zero entries
        self.original_model.fit(rating_matrix)
        self.explainable_model = transform_model(self.original_model)
        self.real_time_explainer = RealTimeExplainer(self.explainable_model)

    def predictAndExplain(self, user, content):
        user_idx = self.user_id_map[user.id]
        content_idx = self.content_id_map[content.id]

        predicted_rating, explanation = self.real_time_explainer.explain_in_realtime(user_idx, content_idx)

        recommendation = mm.Recommendation()
        recommendation.user = user
        recommendation.recommendedContent = content
        recommendation.predictedRating = float(predicted_rating)
        recommendation.explanation = explanation  # Attached so the tracer can record it

        self.tracer.tracePrediction(user, content, recommendation)

        return recommendation, explanation

    def add_user(self, user_id, name):
        user = mm.User()
        user.id = user_id
        user.name = name
        self.users.append(user)
        return user

    def add_content(self, content_id, title, genre):
        content = mm.Content()
        content.id = content_id
        content.title = title
        content.genre = genre
        self.contentCatalog.append(content)
        return content

    def add_rating(self, user, content, score):
        rating = mm.Rating()
        rating.user = user
        rating.content = content
        rating.score = float(score)
        self.ratings.append(rating)
        return rating
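
The arithmetic behind `predict` and `predictAndExplain` can be traced by hand. The sketch below is a dependency-free illustration, not the notebook's implementation. The factor vectors are made-up values, and `min_max_scale` is a simplified stand-in for what `MinMaxScaler(feature_range=(1.0, 5.0))` does to a single value. The raw prediction is the sum of per-factor products, and each factor's importance is its absolute share of that sum.

```python
# Hypothetical latent vectors for one user and one item (illustrative values only).
user_vector = [0.8, 0.0]
item_vector = [1.5, 2.0]


def factor_importances(user_vec, item_vec):
    """Return per-factor contributions and their normalized absolute shares."""
    contributions = [u * c for u, c in zip(user_vec, item_vec)]
    total = sum(abs(v) for v in contributions)
    if total == 0:
        # No factor contributes, so no share can be assigned.
        return contributions, [0.0] * len(contributions)
    return contributions, [abs(v) / total for v in contributions]


def min_max_scale(value, lo, hi, new_lo=1.0, new_hi=5.0):
    """Linearly map value from [lo, hi] to [new_lo, new_hi] (simplified stand-in)."""
    if hi == lo:
        return new_lo
    return new_lo + (value - lo) * (new_hi - new_lo) / (hi - lo)


contributions, importances = factor_importances(user_vector, item_vector)
raw_prediction = sum(contributions)  # same value as the dot product

print(importances)  # [1.0, 0.0]
# Assume the raw predictions seen during fitting ranged from 0.0 to 2.4.
print(round(min_max_scale(raw_prediction, 0.0, 2.4), 2))
```

With only two latent factors, a zero product on one of them leaves the other with a 100% share, which is consistent with the "Impact: 100%" lines in the output further down.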

**Create sample data**

In [75]:
# Create and populate the AI Recommendation Engine
engine = AIRecommendationEngineImpl()

# Add users
for i in range(1, 6):
    engine.add_user(i, f"User{i}")

# Add content
content_data = [
    (1, "Action Movie 1", "Action"),
    (2, "Comedy Movie 1", "Comedy"),
    (3, "Drama Movie 1", "Drama"),
    (4, "Action Movie 2", "Action"),
    (5, "Comedy Movie 2", "Comedy")
]

for content_id, title, genre in content_data:
    engine.add_content(content_id, title, genre)

# Add ratings
for user in engine.users:
    for content in engine.contentCatalog:
        if np.random.random() > 0.2:  # 80% chance of rating each item
            score = np.random.uniform(1, 5)  # Generate a float between 1 and 5
            engine.add_rating(user, content, score)

print(f"Created {len(engine.ratings)} ratings")

Created 22 ratings
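
Because the ratings above are drawn without a fixed seed, the number of ratings and every later prediction can differ from run to run. The sketch below shows one way to make the sampling repeatable. It uses the standard library's `random.Random` instead of NumPy to stay dependency-free, and `sample_ratings` and its parameters are hypothetical names. The same idea should apply in the notebook by calling `np.random.seed(...)` before the loop.

```python
import random


def sample_ratings(user_ids, content_ids, seed, rate_probability=0.8):
    """Draw (user_id, content_id, score) tuples from a seeded generator."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    ratings = []
    for user_id in user_ids:
        for content_id in content_ids:
            if rng.random() < rate_probability:  # rate roughly 80% of the pairs
                ratings.append((user_id, content_id, rng.uniform(1.0, 5.0)))
    return ratings


first = sample_ratings(range(1, 6), range(1, 6), seed=42)
second = sample_ratings(range(1, 6), range(1, 6), seed=42)

# The same seed yields the same ratings on every call.
print(first == second)  # True
```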


**Explanation format**

In [78]:
def generate_user_friendly_explanation(user, content, predicted_rating, explanation):
    """Turn a factor-based explanation into a short plain-language message."""
    if explanation is None or not explanation.factors:
        return f"We predict you'll rate '{content.title}' {predicted_rating:.1f} out of 5 stars, but we don't have enough information to explain why."

    non_zero_factors = [f for f in explanation.factors if f.value != 0]

    if not non_zero_factors:
        return f"We predict you'll rate '{content.title}' {predicted_rating:.1f} out of 5 stars, but we can't determine the specific reasons."

    explanation_text = f"We predict you'll rate '{content.title}' {predicted_rating:.1f} out of 5 stars.\n"

    if len(non_zero_factors) == 1:
        factor = non_zero_factors[0]
        if factor.name == "Latent Factor 1":
            explanation_text += f"This is mainly because it seems to match your general movie preferences (Impact: {factor.importance:.0%})."
        else:
            explanation_text += f"This is mainly because it's similar to other movies you've enjoyed (Impact: {factor.importance:.0%})."
    else:
        explanation_text += "This is based on a combination of factors:\n"
        for factor in non_zero_factors:
            if factor.name == "Latent Factor 1":
                explanation_text += f"- It matches your general movie preferences (Impact: {factor.importance:.0%})\n"
            else:
                explanation_text += f"- It's similar to other movies you've enjoyed (Impact: {factor.importance:.0%})\n"

    return explanation_text

**Train the model and generate explanations**

In [79]:
# Train the model
engine.trainModel()

# Generate and print recommendations with explanations for each user
for user in engine.users:
    print(f"\nRecommendations for {user.name}:")
    for content in engine.contentCatalog:
        recommendation, explanation = engine.predictAndExplain(user, content)
        user_friendly_explanation = generate_user_friendly_explanation(user, content, recommendation.predictedRating, explanation)
        print(user_friendly_explanation)

# Print tracing information
print("\nPrediction Traces:")
for trace in engine.tracer.traces:
    print(f"User {trace.userId} - Content {trace.contentId}: Predicted Rating {trace.predictedRating:.2f}")
    if trace.explanation and trace.explanation.factors:
        print(f"  Explanation: {trace.explanation.content}")
        for factor in trace.explanation.factors:
            print(f"    - {factor.name}: Value = {factor.value:.2f}, Importance = {factor.importance:.2f}")
    else:
        print("  No detailed explanation available.")


Recommendations for User1:
We predict you'll rate 'Action Movie 1' 1.0 out of 5 stars, but we can't determine the specific reasons.
We predict you'll rate 'Comedy Movie 1' 4.0 out of 5 stars.
This is mainly because it's similar to other movies you've enjoyed (Impact: 100%).
We predict you'll rate 'Drama Movie 1' 2.3 out of 5 stars.
This is mainly because it's similar to other movies you've enjoyed (Impact: 100%).
We predict you'll rate 'Action Movie 2' 3.2 out of 5 stars.
This is mainly because it's similar to other movies you've enjoyed (Impact: 100%).
We predict you'll rate 'Comedy Movie 2' 3.4 out of 5 stars.
This is mainly because it's similar to other movies you've enjoyed (Impact: 100%).

Recommendations for User2:
We predict you'll rate 'Action Movie 1' 3.6 out of 5 stars.
This is mainly because it seems to match your general movie preferences (Impact: 100%).
We predict you'll rate 'Comedy Movie 1' 3.1 out of 5 stars.
This is based on a combination of factors:
- It matches your