# Argon: Local Branching & S3 Demo

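This notebook demonstrates Argon's core features: MongoDB branching, stateless compute, and S3 storage. The branching workflow runs locally and needs no Databricks or Spark; the cells that call `spark` are optional and need a Spark session with an `argon` data source.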

## 1. Setup and Configuration

Install required libraries and configure Argon and S3 access. Ensure Docker is running and your `.env` is set up with AWS credentials.
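
A quick pre-flight check can catch a missing credential before any cell touches S3. The sketch below assumes the `.env` file uses the standard AWS variable names (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) and that those values are already loaded into the environment; the helper `missing_env_vars` is hypothetical and not part of Argon.

```python
import os

# Standard AWS credential variable names read by boto3. Whether Argon's .env
# uses exactly these names is an assumption.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Report which variables (if any) still need to be set.
missing = missing_env_vars()
print("Missing variables:", missing if missing else "none")
```

An empty list means both variables are present; it does not prove the credentials are valid.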

In [None]:
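# Install dependencies (uncomment to run here, or run the same command in a terminal)
# pymongo is included because section 3 connects to the branch with MongoClient
# !pip install boto3 python-dotenv docker pymongo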

import os
from core import branch_manager, metadata

# Initialize metadata DB
metadata.init_db()

# S3 and Docker are configured via .env and Docker Desktop
print('Setup complete. Ready for Argon demo!')



## 2. Create, List, and Time-Travel Branches

Create a new MongoDB branch from a base snapshot (from S3), list all branches, and demonstrate point-in-time restore (time travel) to a previous state.
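
Point-in-time restore comes down to picking the newest stored version that is not later than the requested timestamp. The sketch below illustrates that selection on an in-memory list. It is not Argon's implementation: the `created_at` field, the `pick_version_at` helper, and the sample paths are hypothetical (only `s3_path` and `version_id` appear in this notebook's cells).

```python
from datetime import datetime


def parse_ts(ts):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def pick_version_at(versions, timestamp):
    """Return the newest version created at or before `timestamp`, or None."""
    target = parse_ts(timestamp)
    candidates = [v for v in versions if parse_ts(v["created_at"]) <= target]
    return max(candidates, key=lambda v: parse_ts(v["created_at"]), default=None)


# Placeholder version records for illustration only.
versions = [
    {"version_id": "v1", "s3_path": "s3://example-bucket/demo-branch", "created_at": "2025-05-27T09:00:00Z"},
    {"version_id": "v2", "s3_path": "s3://example-bucket/demo-branch", "created_at": "2025-05-27T11:30:00Z"},
    {"version_id": "v3", "s3_path": "s3://example-bucket/demo-branch", "created_at": "2025-05-27T14:00:00Z"},
]

print(pick_version_at(versions, "2025-05-27T12:00:00Z"))
```

A timestamp earlier than every stored version yields `None`, which corresponds to the "no version found" branch in the cell below.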

In [None]:
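# Optional (Spark only): read data from Argon.
# `spark` and `argon_uri` are not defined in this local demo, so the read is commented out.
# argon_df = spark.read.format("argon").option("uri", argon_uri).load()
# argon_df.show()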

# Write data to Argon
# argon_df.write.format("argon").option("uri", argon_uri).mode("append").save()

# Create a new branch (from base S3 snapshot)
branch = branch_manager.create_branch('demo-branch')
print('Created branch:', branch)

# List all branches
branches = branch_manager.list_branches()
print('All branches:', branches)

# Simulate some changes and delete the branch to create a new S3 version
# (In a real demo, you would modify data in the running MongoDB container here)
result = branch_manager.delete_branch('demo-branch')
print('Branch deleted:', result)

# List all branches again ('demo-branch' should no longer appear)
branches = branch_manager.list_branches()
print('All branches after delete:', branches)

# Time-travel: restore the branch to its previous state using the CLI
# Example CLI command (run in terminal):
# python3 -m cli.main branch time-travel demo-branch --from demo-branch --timestamp "2025-05-27T12:00:00Z"

# Or use the Python API directly:
from core.metadata import get_branch_version_by_time
vinfo = get_branch_version_by_time('demo-branch', '2025-05-27T12:00:00Z')
if vinfo:
    print('Found version for time-travel:', vinfo)
    branch = branch_manager.create_branch('demo-branch', vinfo['s3_path'], vinfo['version_id'])
    print('Restored branch from time-travel:', branch)
else:
    print('No version found for time-travel at the given timestamp.')

## 3. Use a Branch for Local Development

Connect to the MongoDB branch (container) on its assigned port for development, testing, or CI/CD workflows.
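
A freshly started container may need a moment before it accepts connections, so it can help to ping the branch before using it. This is a minimal sketch, assuming the branch listens on `localhost` at the port Argon assigned; `branch_uri` and `wait_for_branch` are hypothetical helpers, not part of Argon.

```python
import time


def branch_uri(port, host="localhost"):
    """Build the MongoDB connection string for a branch container."""
    return f"mongodb://{host}:{port}/"


def wait_for_branch(port, attempts=5, delay=1.0):
    """Ping the branch until it answers; return a connected client or raise."""
    # Imported lazily so branch_uri() is usable without pymongo installed.
    from pymongo import MongoClient
    from pymongo.errors import PyMongoError

    last_error = None
    for _ in range(attempts):
        client = MongoClient(branch_uri(port), serverSelectionTimeoutMS=2000)
        try:
            client.admin.command("ping")  # raises if no server is reachable
            return client
        except PyMongoError as exc:
            last_error = exc
            client.close()
            time.sleep(delay)
    raise RuntimeError(f"Branch on port {port} did not respond") from last_error


# Usage (assumes `branch` is the dict returned by branch_manager.create_branch):
# client = wait_for_branch(branch['port'])
```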

In [None]:
# Create a new branch for an experiment (pseudo-code)
from pymongo import MongoClient

branch_name = "experiment-2025-05-27"

# Optional: the steps below need a remote Argon endpoint (`argon_uri`) and a
# Spark session (`spark`), neither of which is defined in this local demo.
# import requests
# requests.post(f"{argon_uri}/branches", json={"name": branch_name, "from": "main"})

# Use the branch for ML training
# experiment_uri = f"argon://<your-argon-endpoint>/{branch_name}"
# experiment_df = spark.read.format("argon").option("uri", experiment_uri).load()

# Train a model on the experiment branch
def train_model(df):
    # ... ML training code ...
    pass

# train_model(experiment_df)

# Record branch name in MLflow for reproducibility
# mlflow.log_param("argon_branch", branch_name)

# Example: Connect to the running MongoDB branch
branch_port = branch['port']
client = MongoClient(f'mongodb://localhost:{branch_port}/')
db = client['test']
print('Collections:', db.list_collection_names())

# You can now use this branch for any local dev/test/CI workflow

## 4. Delete a Branch and Persist to S3

When finished, delete the branch. Argon will dump the data to S3 and remove the container.
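
If the snapshot bucket has S3 versioning enabled, each dump should show up as a new object version. The sketch below lists the versions of one key with boto3's `list_object_versions`. The bucket name, the key, and the `list_snapshot_versions` helper are hypothetical, and how Argon lays out its dumps in S3 is not specified here.

```python
def list_snapshot_versions(s3_client, bucket, key):
    """Return (version_id, last_modified, is_latest) tuples for `key`, newest first.

    Only the first page of results is read, which is enough for a small demo.
    """
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=key)
    # Prefix matching can return other keys that merely start with `key`.
    versions = [v for v in response.get("Versions", []) if v["Key"] == key]
    versions.sort(key=lambda v: v["LastModified"], reverse=True)
    return [(v["VersionId"], v["LastModified"], v["IsLatest"]) for v in versions]


# Usage (requires boto3 and AWS credentials; bucket and key are placeholders):
# import boto3
# s3 = boto3.client("s3")
# for version in list_snapshot_versions(s3, "example-bucket", "demo-branch/dump.archive"):
#     print(version)
```

Passing the client in as an argument keeps the helper easy to exercise with a stub when no AWS account is available.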

In [None]:
# Example: Store features in Argon
features = [
    {"user_id": 1, "features": {"age": 25, "country": "US", "history": [1, 2, 3]}},
    {"user_id": 2, "features": {"age": 32, "country": "UK", "history": [2, 4, 6]}}
]

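# Optional (Spark only): `spark` and `argon_uri` are not defined in this local demo,
# so the Spark write and read are commented out.
# features_df = spark.createDataFrame(features)
# features_df.write.format("argon").option("uri", argon_uri).mode("append").save()

# Retrieve features for ML
# retrieved_df = spark.read.format("argon").option("uri", argon_uri).load()
# retrieved_df.show()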

# Delete the branch (dump to S3, remove container)
result = branch_manager.delete_branch('demo-branch')
print('Branch deleted:', result)

# List branches again to confirm
remaining_branches = branch_manager.list_branches()
print('Remaining branches:', remaining_branches)

## 5. Summary: Argon Local Demo Value

- Create isolated MongoDB branches from S3 snapshots
- Use branches for local dev, testing, or CI/CD
- Persist branch data to S3 for stateless compute
- No Databricks or Spark required for the branching workflow

In [None]:
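# Optional (Spark only): this cell needs a Spark session (`spark`) and an
# `argon_uri`, neither of which is defined in this local demo.
# Store inference results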
inference_results = [
    {"user_id": 1, "prediction": 0.87, "timestamp": "2025-05-27T10:00:00Z"},
    {"user_id": 2, "prediction": 0.42, "timestamp": "2025-05-27T10:01:00Z"}
]

inference_df = spark.createDataFrame(inference_results)
inference_df.write.format("argon").option("uri", argon_uri).mode("append").save()

# Retrieve feedback for retraining
feedback_df = spark.read.format("argon").option("uri", argon_uri).load()
feedback_df.show()

## 6. Model Metadata and Artifact Storage

Use Argon to store model metadata, inputs, outputs, and evaluations.
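
For the local demo, the same kind of record could be written straight to a branch with `pymongo` instead of Spark. This is a minimal sketch, assuming a hypothetical `model_metadata` collection and a hypothetical `store_model_metadata` helper that checks a few required fields before inserting; the required-field list is an illustration, not an Argon schema.

```python
# Fields this sketch treats as mandatory (an assumption for illustration).
REQUIRED_FIELDS = ("model_name", "metrics", "trained_on_branch", "timestamp")


def store_model_metadata(collection, record):
    """Insert `record` after checking required fields; return the inserted id."""
    missing = [field for field in REQUIRED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"Missing metadata fields: {missing}")
    # insert_one adds an `_id` to the dict it receives, so pass a copy.
    result = collection.insert_one(dict(record))
    return result.inserted_id


# Usage (assumes `client` is a MongoClient connected to a running branch):
# collection = client['test']['model_metadata']
# store_model_metadata(collection, model_metadata)
```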

In [None]:
# Optional (Spark only): needs `spark` and `argon_uri`; `branch_name` is set in section 3.
# Store model metadata
model_metadata = {
    "model_name": "churn-predictor-v1",
    "inputs": ["age", "country", "history"],
    "outputs": ["churn_probability"],
    "metrics": {"auc": 0.91, "accuracy": 0.88},
    "trained_on_branch": branch_name,
    "timestamp": "2025-05-27T10:05:00Z"
}

metadata_df = spark.createDataFrame([model_metadata])
metadata_df.write.format("argon").option("uri", argon_uri).mode("append").save()

## 7. Governance Integration with Unity Catalog

Argon can expose its schema and access controls through a Unity Catalog-compatible interface, enabling unified governance, data lineage, and security in Databricks environments.