[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mparrott-at-wiris/aimodelshare/blob/master/notebooks/moral_compass_model_submissions_and_challenge_progress.ipynb)


# Moral Compass: Multi-Model Submissions & Challenge Progress

This notebook:
1. Sets up (or attaches to) the same playground initialized in `moral_compass_playground_setup.ipynb`.
2. Demonstrates credential/user setup (best effort, non-fatal if not available).
3. Submits three model types (scikit-learn, Keras, PyTorch) using a simple tabular dataset.
4. Updates Moral Compass metrics (accuracy + fairness placeholder) via `MoralcompassApiClient` and `ChallengeManager`.
5. Shows progressive improvement and challenge task/question completion for the Justice & Equity challenge.

If credentials are missing, protected API operations will be skipped gracefully.

## 1. Install Dependencies

In [None]:
%pip install -q aimodelshare scikit-learn pandas numpy seaborn tensorflow torch torchvision --upgrade

## 2. Imports & Environment

In [None]:
import os, time, json, logging, math
import numpy as np, pandas as pd, seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import aimodelshare as ai
from aimodelshare.playground import ModelPlayground
from aimodelshare.moral_compass import MoralcompassApiClient
from aimodelshare.moral_compass.challenge import ChallengeManager, JusticeAndEquityChallenge

logging.basicConfig(level=logging.INFO)
print('aimodelshare version:', ai.__version__)

## 3. Configuration
Ensure `PLAYGROUND_ID` matches the playground created in the first notebook (`moral_compass_playground_setup.ipynb`).

In [None]:
PLAYGROUND_ID = 'moral_compass_quickstart'  # must match first notebook
TABLE_SUFFIX = '-mc'
TABLE_ID = f"{PLAYGROUND_ID}{TABLE_SUFFIX}"
USERNAME = os.getenv('AIMODELSHARE_USERNAME') or os.getenv('username') or 'demo_user'
PLAYGROUND_PRIVATE = True  # set False if you made it public
PLAYGROUND_URL_PLACEHOLDER = f'https://example.com/playground/{PLAYGROUND_ID}'
print('Configured TABLE_ID:', TABLE_ID)
print('Using USERNAME:', USERNAME)
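
The `USERNAME` fallback above relies on Python's `or` chaining, which skips empty strings as well as missing variables. A minimal sketch of that logic follows. `resolve_username` is a hypothetical helper (not part of `aimodelshare`), and it takes a plain dict in place of `os.environ` so the behavior is easy to check.

```python
def resolve_username(env, default='demo_user'):
    """Return the first non-empty username found in env, else the default.

    `env` is any mapping (for example os.environ). The variable names mirror
    the configuration cell; the helper itself is illustrative only.
    """
    return env.get('AIMODELSHARE_USERNAME') or env.get('username') or default


# Nothing set: fall back to the demo user.
name_default = resolve_username({})
# An empty string is falsy, so it is skipped just like a missing variable.
name_empty = resolve_username({'AIMODELSHARE_USERNAME': '', 'username': 'bob'})
# The first variable wins when both are set.
name_first = resolve_username({'AIMODELSHARE_USERNAME': 'alice', 'username': 'bob'})

print(name_default, name_empty, name_first)
```

In the notebook, calling `resolve_username(os.environ)` would behave like the one-line expression in the configuration cell.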

## 4. Credential & Token Setup (Best Effort)
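Each step is attempted once. If the import or the credentials are unavailable, the step is skipped with a printed message and the notebook continues.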

In [None]:
try:
    from aimodelshare.aws import get_aws_token
    token = get_aws_token()
    if token:
        os.environ['AWS_TOKEN'] = token
        print('AWS token acquired.')
except Exception as e:
    print('AWS token acquisition skipped:', e)

try:
    from aimodelshare.modeluser import get_jwt_token, create_user_getkeyandpassword
    if os.getenv('AIMODELSHARE_USERNAME') and os.getenv('AIMODELSHARE_PASSWORD'):
        get_jwt_token(os.getenv('AIMODELSHARE_USERNAME'), os.getenv('AIMODELSHARE_PASSWORD'))
        try:
            create_user_getkeyandpassword()
        except Exception:
            pass
        print('JWT token retrieved.')
    else:
        print('No credentials in env; continuing without protected ops.')
except Exception as e:
    print('JWT retrieval skipped:', e)
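
The cell above repeats one pattern: try an optional step, and report a skip instead of failing. A minimal sketch of that pattern as a reusable function is shown here. `best_effort` and `failing_step` are hypothetical names introduced for illustration; they are not part of `aimodelshare`.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar('T')


def best_effort(label: str, step: Callable[[], T]) -> Optional[T]:
    """Run step(); on any exception, print a skip message and return None."""
    try:
        return step()
    except Exception as exc:  # broad on purpose: the step is optional
        print(f'{label} skipped: {exc}')
        return None


def failing_step() -> str:
    # Stands in for a credential call that fails when nothing is configured.
    raise RuntimeError('no credentials configured')


token = best_effort('Token acquisition', failing_step)  # prints a skip message
value = best_effort('Arithmetic', lambda: 1 + 1)        # succeeds

print(token, value)
```

Catching `Exception` this broadly suits a demo notebook; production code would usually catch narrower exception types so that real bugs are not hidden.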

## 5. Attach to Existing Playground (or Create if Missing)
If the playground already exists, the creation error is caught and printed so the notebook can continue.

In [None]:
playground = ModelPlayground(input_type='tabular', task_type='classification', private=PLAYGROUND_PRIVATE)
try:
    playground.create(eval_data=[], public=not PLAYGROUND_PRIVATE)
    print('Playground created (or re-created).')
except Exception as e:
    print('Likely already exists, attach OK:', e)

## 6. Dataset Preparation
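A small tabular classification task: predict penguin sex from four numeric body measurements in seaborn's `penguins` dataset. Rows with missing values are dropped.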

In [None]:
penguins = sns.load_dataset('penguins').dropna()
FEATURES = ['bill_length_mm','bill_depth_mm','flipper_length_mm','body_mass_g']
X = penguins[FEATURES]
y = penguins['sex']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=7)
print('Train size:', X_train.shape, 'Test size:', X_test.shape)

## 7. Model 1: Scikit-Learn Logistic Regression

In [None]:
sklearn_pipeline = Pipeline([
    ('scaler', StandardScaler()),
    ('lr', LogisticRegression(max_iter=300))
])
sklearn_pipeline.fit(X_train, y_train)
sk_preds = sklearn_pipeline.predict(X_test)
sk_acc = accuracy_score(y_test, sk_preds)
print('Sklearn model accuracy:', sk_acc)

### Submit Sklearn Model (Experiment)

In [None]:
meta_sklearn = {
    'description': 'Sklearn LogisticRegression baseline',
    'tags': 'moral_compass,sklearn',
    'Moral_Compass_Fairness': '0'
}
try:
    playground.submit_model(
        model=sklearn_pipeline,
        preprocessor=None,
        prediction_submission=sk_preds,
        input_dict=meta_sklearn,
        submission_type='experiment'
    )
    print('Sklearn model submitted.')
except Exception as e:
    print('Sklearn submission skipped/failed:', e)


## 8. Model 2: Keras (Simple Dense Network)
Note: For simplicity, the network takes the four standardized numeric features as input, and the labels are encoded as 0/1.
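
Standardization here means subtracting the training mean and dividing by the training standard deviation, then applying those same training statistics to the test set. A minimal pure-Python sketch of the idea on one made-up column follows. The values and helper names are illustrative assumptions; the notebook itself uses `StandardScaler`.

```python
from statistics import mean, pstdev


def fit_standardizer(train_column):
    """Return (mean, std) from training data.

    Uses the population standard deviation (ddof=0), which is what
    scikit-learn's StandardScaler computes.
    """
    return mean(train_column), pstdev(train_column)


def standardize(column, mu, sigma):
    """Apply previously fitted statistics to any column."""
    return [(x - mu) / sigma for x in column]


# Made-up flipper lengths in millimetres (illustrative only).
train_flipper_mm = [180.0, 190.0, 200.0, 210.0]
test_flipper_mm = [195.0, 220.0]

mu, sigma = fit_standardizer(train_flipper_mm)
train_scaled = standardize(train_flipper_mm, mu, sigma)
# The test set reuses the training statistics; it is never re-fitted.
test_scaled = standardize(test_flipper_mm, mu, sigma)

print(mu, round(sigma, 4), test_scaled)
```

Fitting on the training split only keeps information about the test data out of preprocessing.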

In [None]:
import tensorflow as tf
from tensorflow import keras

scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

keras_model = keras.Sequential([
    keras.layers.Input(shape=(len(FEATURES),)),
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(8, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')
])
keras_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

label_map = {label: idx for idx, label in enumerate(sorted(y.unique()))}
y_train_enc = y_train.map(label_map).values
y_test_enc = y_test.map(label_map).values

keras_model.fit(X_train_scaled, y_train_enc, epochs=10, batch_size=16, verbose=0)
keras_probs = keras_model.predict(X_test_scaled, verbose=0).ravel()
keras_preds_bin = (keras_probs > 0.5).astype(int)
inv_label_map = {v:k for k,v in label_map.items()}
keras_preds = [inv_label_map[v] for v in keras_preds_bin]
keras_acc = accuracy_score(y_test, keras_preds)
print('Keras model accuracy:', keras_acc)
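
The label handling in the cell above has three steps: map class names to integers, threshold the sigmoid outputs at 0.5, and map the integers back to names. A minimal sketch with made-up probabilities follows. The class names and the `decode` helper are assumptions for illustration.

```python
# Class names assumed for illustration; sorted order fixes the encoding.
class_names = ['Male', 'Female']
label_map = {label: idx for idx, label in enumerate(sorted(class_names))}
inv_label_map = {idx: label for label, idx in label_map.items()}


def decode(probs, threshold=0.5):
    """Turn sigmoid outputs into class names using a strict '>' threshold."""
    return [inv_label_map[int(p > threshold)] for p in probs]


example_probs = [0.12, 0.50, 0.93]  # made-up model outputs
decoded = decode(example_probs)

print(label_map)
print(decoded)
```

Because the comparison is strict, a probability of exactly 0.5 is assigned to class 0.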

### Submit Keras Model

In [None]:
meta_keras = {
    'description': 'Keras dense network',
    'tags': 'moral_compass,keras',
    'Moral_Compass_Fairness': '0'
}
try:
    playground.submit_model(
        model=keras_model,
        preprocessor=scaler,
        prediction_submission=keras_preds,
        input_dict=meta_keras,
        submission_type='experiment'
    )
    print('Keras model submitted.')
except Exception as e:
    print('Keras submission skipped/failed:', e)


## 9. Model 3: PyTorch (Simple MLP)
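Demonstrates a basic PyTorch workflow. This cell reuses `scaler`, `label_map`, and `inv_label_map` from the Keras section, so run that section first.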

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

X_train_t = torch.tensor(scaler.transform(X_train), dtype=torch.float32)
X_test_t = torch.tensor(scaler.transform(X_test), dtype=torch.float32)
y_train_t = torch.tensor(y_train.map(label_map).values, dtype=torch.float32).view(-1,1)

class SimpleMLP(nn.Module):
    def __init__(self, in_features):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 16), nn.ReLU(),
            nn.Linear(16, 8), nn.ReLU(),
            nn.Linear(8, 1), nn.Sigmoid()
        )
    def forward(self, x): return self.net(x)

torch_model = SimpleMLP(len(FEATURES))
criterion = nn.BCELoss()
optimizer = optim.Adam(torch_model.parameters(), lr=0.01)

for epoch in range(15):
    optimizer.zero_grad()
    out = torch_model(X_train_t)
    loss = criterion(out, y_train_t)
    loss.backward()
    optimizer.step()
    if (epoch+1) % 5 == 0:
        print(f"Epoch {epoch+1} loss: {loss.item():.4f}")

torch_model.eval()
with torch.no_grad():  # inference only, so skip gradient tracking
    torch_probs = torch_model(X_test_t).numpy().ravel()
torch_preds_bin = (torch_probs > 0.5).astype(int)
torch_preds = [inv_label_map[v] for v in torch_preds_bin]
torch_acc = accuracy_score(y_test, torch_preds)
print('PyTorch model accuracy:', torch_acc)

### Submit PyTorch Model

In [None]:
meta_torch = {
    'description': 'PyTorch MLP model',
    'tags': 'moral_compass,torch',
    'Moral_Compass_Fairness': '0'
}
try:
    playground.submit_model(
        model=torch_model,
        preprocessor=scaler,
        prediction_submission=torch_preds,
        input_dict=meta_torch,
        submission_type='experiment'
    )
    print('PyTorch model submitted.')
except Exception as e:
    print('PyTorch submission skipped/failed:', e)


## 10. Retrieve Playground Leaderboard (If Available)

In [None]:
try:
    lb = playground.get_leaderboard()
    if isinstance(lb, dict):
        lb_df = pd.DataFrame(lb)
    else:
        lb_df = lb
    print(lb_df.head())
except Exception as e:
    print('Leaderboard retrieval skipped:', e)


## 11. Moral Compass Metric Updates
We now:
1. Create (or reuse) the moral compass table.
2. Perform incremental updates to metrics (accuracy, fairness placeholder).
3. Use `ChallengeManager` to simulate progress through tasks/questions.

In [None]:
api = MoralcompassApiClient()
try:
    api.create_table(
        table_id=TABLE_ID,
        display_name=f'Moral Compass - {PLAYGROUND_ID}',
        playground_url=PLAYGROUND_URL_PLACEHOLDER
    )
    print('Moral compass table created.')
except Exception as e:
    print('Table may already exist or creation skipped:', e)


### 11.1 Initial Metric Seed
Seed the table with the scikit-learn accuracy and a fairness placeholder of 0.0, with no tasks or questions completed yet.

In [None]:
try:
    seed_resp = api.update_moral_compass(
        table_id=TABLE_ID,
        username=USERNAME,
        metrics={'accuracy': float(sk_acc), 'fairness': 0.0},
        tasks_completed=0, total_tasks=6,
        questions_correct=0, total_questions=6,
        primary_metric='accuracy'
    )
    print('Seed response:', seed_resp)
except Exception as e:
    print('Initial metric seed skipped:', e)


### 11.2 ChallengeManager Progressive Updates
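Simulate improvement: each task is marked complete with all of its questions answered correctly, and the fairness metric advances through `fairness_stages`, holding at the last value once the stages run out.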

In [None]:
manager = ChallengeManager(table_id=TABLE_ID, username=USERNAME, api_client=api)
manager.set_metric('accuracy', float(max(sk_acc, keras_acc, torch_acc)), primary=True)
manager.set_metric('fairness', 0.0)

challenge = manager.challenge
fairness_stages = [0.0, 0.3, 0.55, 0.7]
stage_index = 0

for task in challenge.tasks:
    manager.complete_task(task.id)
    for q in task.questions:
        manager.answer_question(task.id, q.id, q.correct_index)
    if stage_index < len(fairness_stages)-1:
        stage_index += 1
    manager.set_metric('fairness', fairness_stages[stage_index])
    try:
        resp = manager.sync()
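        score = resp.get('moralCompassScore')
        # Format only numeric scores; a missing score would otherwise raise and be misreported as a skipped sync.
        score_txt = f"{score:.4f}" if isinstance(score, (int, float)) else 'n/a'
        print(f"After task {task.id} sync -> moralCompassScore: {score_txt} | metrics: {resp.get('metrics')}")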
    except Exception as e:
        print('Sync skipped:', e)

print('Final local summary:', manager.get_progress_summary())
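
The stage logic in the loop above can be checked in isolation, without the API. This sketch replays the same index arithmetic. The task count of 6 is an assumption taken from `total_tasks=6` in the seed call, and `fairness_schedule` is a hypothetical helper.

```python
def fairness_schedule(stages, n_tasks):
    """Return the fairness value set after each completed task.

    Mirrors the loop above: the index advances once per task until it
    reaches the last stage, then stays there.
    """
    stage_index = 0
    schedule = []
    for _ in range(n_tasks):
        if stage_index < len(stages) - 1:
            stage_index += 1
        schedule.append(stages[stage_index])
    return schedule


fairness_stages = [0.0, 0.3, 0.55, 0.7]
num_tasks = 6  # assumed, matching total_tasks=6 in the seed call

schedule = fairness_schedule(fairness_stages, num_tasks)
print(schedule)  # [0.3, 0.55, 0.7, 0.7, 0.7, 0.7]
```

The first stage (0.0) is never synced inside the loop because the index advances before the metric is set; it only serves as the starting value.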

### 11.3 Leaderboard User Entry Check

In [None]:
try:
    users_resp = api.list_users(TABLE_ID, limit=200)
    found = [u for u in users_resp.get('users', []) if u.get('username') == USERNAME]
    if found:
        print('User leaderboard entry:', found[0])
    else:
        print('User not located in leaderboard response yet.')
except Exception as e:
    print('Leaderboard user fetch skipped:', e)
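
The lookup above filters a list of user entries by username. A minimal offline sketch follows, using a made-up response. The `users` and `username` keys come from the cell above; the `moralCompassScore` field on each entry and the sample values are assumptions, and `find_user` is a hypothetical helper.

```python
def find_user(users_resp, username):
    """Return the first entry whose 'username' matches, or None."""
    for entry in users_resp.get('users', []):
        if entry.get('username') == username:
            return entry
    return None


# Made-up response shaped like the one the cell above expects.
sample_resp = {
    'users': [
        {'username': 'alice', 'moralCompassScore': 0.42},
        {'username': 'demo_user', 'moralCompassScore': 0.31},
    ]
}

entry = find_user(sample_resp, 'demo_user')
missing = find_user(sample_resp, 'nobody')
empty = find_user({}, 'demo_user')  # a response without 'users' is handled

print(entry, missing, empty)
```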


## 12. Updating Moral Compass Fairness Metadata in Future Submissions
To reflect a new fairness assessment in the playground's model metadata, submit a new model (or re-submit) with an updated `Moral_Compass_Fairness` key (e.g. '0.55'). The leaderboard moral compass fairness *score* is distinct and managed through the API calls above.

Example snippet (not executed automatically):
```python
playground.submit_model(
    model=sklearn_pipeline,
    preprocessor=None,
    prediction_submission=sk_preds,
    input_dict={
        'description': 'Updated fairness metadata',
        'tags': 'moral_compass,sklearn',
        'Moral_Compass_Fairness': '0.55'
    },
    submission_type='experiment'
)
```
This metadata value is informational inside the playground context; the moral compass API metrics remain authoritative for scoring.
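
Since the metadata value is passed as a string, a small helper can keep the formatting consistent across submissions. This is a sketch under two assumptions: that fairness values lie in the range 0 to 1 (as the staged values in this notebook do), and that compact formatting such as `'0.55'` is acceptable. `fairness_metadata` is a hypothetical helper, not part of `aimodelshare`.

```python
def fairness_metadata(score, description, tags):
    """Build an input_dict-style mapping with the fairness score as a string.

    The 0..1 range check is an assumption for this sketch, not a documented
    constraint of the playground.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f'fairness score out of assumed range [0, 1]: {score}')
    return {
        'description': description,
        'tags': tags,
        # 'g' formatting drops trailing zeros: 0.55 -> '0.55', 0.0 -> '0'
        'Moral_Compass_Fairness': f'{score:g}',
    }


meta_updated = fairness_metadata(0.55, 'Updated fairness metadata', 'moral_compass,sklearn')
meta_initial = fairness_metadata(0.0, 'Sklearn LogisticRegression baseline', 'moral_compass,sklearn')

print(meta_updated['Moral_Compass_Fairness'], meta_initial['Moral_Compass_Fairness'])
```

The resulting dict could then be passed as `input_dict` in a call like the snippet above.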

## 13. Summary
- Three model types submitted (sklearn, Keras, PyTorch) with initial fairness placeholder metadata.
- Moral Compass metrics (accuracy + fairness) updated progressively via `ChallengeManager`.
- Justice & Equity challenge tasks/questions completed with incremental score gains.

End of notebook.