# NEXUS Hello World

**Goal:** Get familiar with the NEXUS model

This notebook demonstrates the core workflow:
1. Installation & Authentication
2. The Estimator Loop: Instantiate → Fit → Predict
3. Save & Resume patterns

## 1. Installation & Setup

First, install the Fundamental SDK:

In [None]:
# Uncomment to install
# !pip install fundamental-client

### Authentication

Set up your API key. You can either:
- Set the `FUNDAMENTAL_API_KEY` environment variable, or
- Pass it directly to the client (shown below)
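
For the environment-variable route, a minimal sketch of resolving the key before building the client might look like the following. The helper name `resolve_api_key` is hypothetical and not part of the SDK; only the `FUNDAMENTAL_API_KEY` variable name comes from the list above.

```python
import os


def resolve_api_key(explicit_key=None, env=None):
    """Return an explicitly passed key, else the FUNDAMENTAL_API_KEY variable."""
    # Hypothetical helper: `env` defaults to the real process environment
    env = os.environ if env is None else env
    key = explicit_key or env.get("FUNDAMENTAL_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass one explicitly or set FUNDAMENTAL_API_KEY"
        )
    return key


# Demo with a stand-in environment mapping so no real secret is needed
demo_key = resolve_api_key(env={"FUNDAMENTAL_API_KEY": "demo-key"})
print(demo_key)
```

The returned value could then be passed as `api_key=` when constructing the client, which keeps the secret out of the notebook itself.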

In [None]:
from fundamental import Fundamental, NEXUSClassifier, NEXUSRegressor
import fundamental
import pandas as pd
import numpy as np

# Initialize the client
client = Fundamental(
    api_url="https://api-demo.fundamental-dev.tech",
    api_key="<api_key>",  # replace with your key
)
fundamental.set_client(client)

## 2. Data

In [None]:
# Create sample classification dataset
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate synthetic data
X, y = make_classification(
    n_samples=1000,
    n_features=20,
    n_informative=15,
    n_redundant=5,
    random_state=42
)

# Split into train/test
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(f"Training set: {X_train.shape}")
print(f"Test set: {X_test.shape}")

## 3. Train a Classification Model

### The Estimator Loop: Instantiate → Fit → Predict

NEXUS follows the scikit-learn API convention for familiarity.
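
To make the convention concrete, here is a toy majority-class estimator written in plain Python. It is not NEXUS and says nothing about how NEXUS is implemented; `MajorityClassifier` is a hypothetical name used only to illustrate the scikit-learn pattern: configuration in `__init__`, learning in `fit` (which returns `self`), learned state in attributes ending with an underscore, and inference in `predict`.

```python
from collections import Counter


class MajorityClassifier:
    """Toy estimator that always predicts the most frequent training label."""

    def __init__(self, fallback=None):
        # Hyperparameters are stored in __init__ without touching any data
        self.fallback = fallback

    def fit(self, X, y):
        # Learned state uses a trailing underscore, like trained_model_id_
        counts = Counter(y)
        self.majority_ = counts.most_common(1)[0][0] if counts else self.fallback
        return self  # returning self allows chaining: est.fit(X, y).predict(X)

    def predict(self, X):
        if not hasattr(self, "majority_"):
            raise RuntimeError("Call fit before predict")
        return [self.majority_ for _ in X]


toy = MajorityClassifier().fit([[0], [1], [2]], ["a", "b", "b"])
print(toy.predict([[5], [6]]))
```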

In [None]:
# Instantiate the classifier
# mode="quality" favors accuracy over training speed
clf = NEXUSClassifier(
    mode="quality"  # Options: "speed" or "quality"
)

print("Starting training...")

# Fit the model (synchronous training)
clf.fit(X_train, y_train)

print("✅ Training Complete!")
print(f"Model ID: {clf.trained_model_id_}")

### Make Predictions

In [None]:
# Predict on test set
predictions = clf.predict(X_test)

print(f"Predictions shape: {predictions.shape}")
print(f"First 10 predictions: {predictions[:10]}")

### Evaluate Model Performance

In [None]:
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Calculate accuracy
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy:.4f}")

# Detailed classification report
print("\nClassification Report:")
print(classification_report(y_test, predictions))

# Confusion matrix
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, predictions))
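
The relationship between these metrics can be checked by hand. The sketch below builds a binary confusion matrix in plain Python, using the same layout as scikit-learn (rows are true labels, columns are predicted labels), and derives accuracy from its diagonal. The labels are made up for illustration and are not output from the model above.

```python
def binary_confusion_matrix(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for truth, pred in zip(y_true, y_pred):
        matrix[truth][pred] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Correct predictions sit on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = matrix[0][0] + matrix[1][1]
    return correct / total


# Made-up labels for illustration
cm_true_demo = [0, 0, 1, 1, 1, 0]
cm_pred_demo = [0, 1, 1, 1, 0, 0]
cm = binary_confusion_matrix(cm_true_demo, cm_pred_demo)
print(cm)
print(f"{accuracy_from_matrix(cm):.4f}")
```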

### Probability Predictions

In [None]:
# Get probability estimates
probabilities = clf.predict_proba(X_test)

print(f"Probability shape: {probabilities.shape}")
print(f"First 5 probability estimates:\n{probabilities[:5]}")
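
Probabilities are useful when the default decision rule is not the right one. The following sketch, in plain Python with made-up probabilities, applies a custom threshold to the positive-class column. It assumes one row per sample and one column per class, with the positive class in column 1, which is the scikit-learn convention; confirm the column order for your model before relying on it.

```python
def labels_from_probabilities(probabilities, threshold=0.5):
    """Label a sample 1 when its positive-class probability reaches threshold."""
    # Assumption: column 1 holds the positive-class probability
    return [1 if row[1] >= threshold else 0 for row in probabilities]


# Made-up probabilities for illustration
demo_probabilities = [[0.9, 0.1], [0.4, 0.6], [0.65, 0.35], [0.2, 0.8]]
default_labels = labels_from_probabilities(demo_probabilities)
lenient_labels = labels_from_probabilities(demo_probabilities, threshold=0.3)
print(default_labels)
print(lenient_labels)
```

Lowering the threshold trades more false positives for fewer false negatives, which can matter when missing a positive case is costly.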

## 4. Train a Regression Model

The workflow is identical for regression tasks:

In [None]:
# Create regression dataset
from sklearn.datasets import make_regression

X_reg, y_reg = make_regression(
    n_samples=500,
    n_features=20,
    n_informative=15,
    noise=10,
    random_state=42
)

X_train_reg, X_test_reg, y_train_reg, y_test_reg = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

In [None]:
# Train regressor
reg = NEXUSRegressor(mode="quality")

print("Training regression model...")
reg.fit(X_train_reg, y_train_reg)

print("✅ Regression Training Complete!")
print(f"Model ID: {reg.trained_model_id_}")

In [None]:
# Make predictions
reg_predictions = reg.predict(X_test_reg)

# Evaluate
from sklearn.metrics import mean_squared_error, r2_score

mse = mean_squared_error(y_test_reg, reg_predictions)
r2 = r2_score(y_test_reg, reg_predictions)

print(f"Mean Squared Error: {mse:.4f}")
print(f"R² Score: {r2:.4f}")
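
For reference, both metrics follow directly from their definitions: MSE is the mean of the squared residuals, and R² is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the targets. A plain-Python sketch with made-up numbers (the function names are illustrative, not part of any library):

```python
def mean_squared_error_manual(y_true, y_pred):
    """Mean of squared differences between targets and predictions."""
    residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(residuals) / len(residuals)


def r2_score_manual(y_true, y_pred):
    """1 - SS_res / SS_tot; undefined when all targets are identical."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration
reg_true_demo = [3.0, 5.0, 7.0, 9.0]
reg_pred_demo = [2.5, 5.5, 7.0, 8.0]
print(f"MSE: {mean_squared_error_manual(reg_true_demo, reg_pred_demo):.4f}")
print(f"R²:  {r2_score_manual(reg_true_demo, reg_pred_demo):.4f}")
```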

## 5. Save & Resume Models

### Option A: Using Model ID

Every trained model has a unique ID, exposed as `trained_model_id_` after fitting, that you can use to reload it later:

In [None]:
# Save the model ID
saved_model_id = clf.trained_model_id_
print(f"Saved Model ID: {saved_model_id}")

# Later, in a new session, reload the model
clf_reloaded = NEXUSClassifier()
clf_reloaded.trained_model_id_ = saved_model_id

# Use the reloaded model
reloaded_predictions = clf_reloaded.predict(X_test)
print(f"Reloaded model predictions: {reloaded_predictions[:5]}")
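
A new session no longer has `saved_model_id` in memory, so the ID has to be stored somewhere in between. One simple option is a small JSON file, sketched below with the standard library. The file name `nexus_model.json`, the helper names, and the placeholder ID are all hypothetical, and the sketch assumes the ID is a string or otherwise JSON-serializable.

```python
import json
import tempfile
from pathlib import Path


def save_model_id(path, model_id):
    """Write the model ID to a JSON file, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"model_id": model_id}), encoding="utf-8")


def load_model_id(path):
    """Read the model ID back, for example in a later session."""
    return json.loads(Path(path).read_text(encoding="utf-8"))["model_id"]


# Demo in a temporary directory with a placeholder ID
with tempfile.TemporaryDirectory() as tmp:
    id_file = Path(tmp) / "nexus_model.json"
    save_model_id(id_file, "placeholder-model-id")
    restored_id = load_model_id(id_file)

print(restored_id)
```

In the notebook, the value from `load_model_id` would take the place of `saved_model_id` when assigning `trained_model_id_` on a fresh classifier.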

### Option B: List All Models

You can retrieve all your trained models from the registry:

In [None]:
# List all models
models = client.models.list()

print(f"Total models: {len(models)}")
print("\nModel Details:")
for model in models[:5]:  # Show first 5
    print(f"  - ID: {model}")

## 6. Model Management

### Tagging Models

Add metadata to your models for organization:

In [None]:
# Tag the model
attrs = {
    "project": "customer_churn",
    "version": "1.0",
    "stage": "development",
    "author": "data-science-team",
}

clf_reloaded.set_attributes(attrs)

print("✅ Model tagged successfully")

### Deleting Models

Clean up models you no longer need:

In [None]:
# Delete a specific model (uncomment to use)
# client.models.delete(id=saved_model_id)
# print(f"Model {saved_model_id} deleted")

## 7. Training Mode Comparison


In [None]:
# Train multiple models in parallel
configs = [
    {"mode": "speed", "name": "fast_model"},
    {"mode": "quality", "name": "quality_model"},
]

jobs = []
for config in configs:
    # Use a separate name so the classifier trained earlier (`clf`) is not overwritten
    model = NEXUSClassifier(mode=config["mode"])
    # Train on the training split only so the test set stays unseen
    job = model.submit_fit_task(X_train, y_train)
    jobs.append({"name": config["name"], "job": job, "clf": model})
    print(f"Started {config['name']}: Job ID {job}")

print(f"\n{len(jobs)} training jobs running in parallel!")

In [None]:
import time

# Speed mode
clf_speed = NEXUSClassifier(mode="speed")
start = time.time()
clf_speed.fit(X_train, y_train)
speed_time = time.time() - start
speed_accuracy = accuracy_score(y_test, clf_speed.predict(X_test))

# Quality mode
clf_quality = NEXUSClassifier(mode="quality")
start = time.time()
clf_quality.fit(X_train, y_train)
quality_time = time.time() - start
quality_accuracy = accuracy_score(y_test, clf_quality.predict(X_test))

print("\nMode Comparison:")
print(f"Speed Mode:   {speed_time:.2f}s | Accuracy: {speed_accuracy:.4f}")
print(f"Quality Mode: {quality_time:.2f}s | Accuracy: {quality_accuracy:.4f}")
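
The timing logic above repeats for each mode. A small helper keeps it in one place; the sketch below uses `time.perf_counter`, which is better suited to measuring elapsed time than `time.time`. The `timed` helper is hypothetical, and the `sum` call stands in for a call such as `clf_speed.fit(X_train, y_train)`.

```python
import time


def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


# Stand-in workload; replace with a fit call in the notebook
total, elapsed = timed(sum, range(1_000_000))
print(f"sum={total} took {elapsed:.4f}s")
```

Because the client talks to a remote API, the measured time probably includes network and queueing overhead as well as training itself, so treat single runs as rough indications.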

## 8. Feature Importance

After fitting a model, you can compute feature importance to quantify the contribution of each input feature to the model's output:



In [None]:
from fundamental import NEXUSClassifier

# Fit the model on the training split created in the Data section
classifier = NEXUSClassifier()
classifier.fit(X_train, y_train)

# Get feature importance (waits for computation to complete)
feature_importance = classifier.get_feature_importance(X_test)

Feature importance computation can take a significant amount of time. If you prefer to submit the task and check its status periodically, you can use the asynchronous approach:



In [None]:
# Result of the synchronous call from the previous cell
print(f"Feature importance: {feature_importance}")

# Submit task without waiting
task_id = classifier.submit_feature_importance_task(X_test)

# Poll status later
result = classifier.poll_feature_importance_result(task_id)
if result is not None:
    print(f"Feature importance: {result}")
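
A single poll may return before the task has finished. Below is a sketch of a polling loop with a timeout. It assumes, based on the `is not None` check above, that the poll call returns `None` while the task is still running; `wait_for_result` is a hypothetical helper, and the fake poll function stands in for `lambda: classifier.poll_feature_importance_result(task_id)`.

```python
import time


def wait_for_result(poll, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call poll() until it returns something other than None or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = poll()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"No result after {timeout} seconds")
        sleep(interval)


# Fake task that becomes ready on the third poll (placeholder payload)
_responses = iter([None, None, "ready"])
outcome = wait_for_result(lambda: next(_responses), interval=0.01, timeout=5.0)
print(outcome)
```

The interval and timeout defaults are arbitrary; pick values that match how long feature importance usually takes for your data.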