# Getting Started with brmspy

This notebook demonstrates how to use **brmspy**, a Pythonic interface to the brms R package.

**brmspy v0.1.0** features:
- CmdStanPy backend (official Stan interface)
- Explicit brms version control
- Support for multiple inference methods (MCMC, VI, optimization)
- Python 3.8+ compatibility

## Setup

This notebook works in two scenarios:
1. **Installed package**: `pip install brmspy`
2. **Cloned repository**: Running from repository root

In [None]:
# Import brmspy - works for both installed package and cloned repo
import sys
import os

try:
    # Try importing as installed package
    import brmspy
    print(f"✓ Using installed brmspy {brmspy.__version__}")
except ImportError:
    # Fall back to importing from cloned repo
    # Add parent directory to path (assumes running from examples/)
    repo_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
    if repo_root not in sys.path:
        sys.path.insert(0, repo_root)
    
    try:
        import brmspy
        print(f"✓ Using brmspy from repository: {repo_root}")
        print(f"  Version: {brmspy.__version__}")
    except ImportError:
        print("✗ Could not import brmspy!")
        print("  Please either:")
        print("  1. Install: pip install brmspy")
        print("  2. Or run from repository root with: jupyter notebook examples/getting_started.ipynb")
        raise

# Import other required packages
import pandas as pd
import numpy as np

# Visualization (optional)
try:
    import arviz as az
    import matplotlib.pyplot as plt
    HAS_VIZ = True
    print("✓ Visualization libraries available (arviz, matplotlib)")
except ImportError:
    HAS_VIZ = False
    print("ℹ Visualization libraries not available (install with: pip install brmspy[viz])")

print("\n" + "="*60)
print("Setup complete!")
print("="*60)

## 1. First Time Setup (One-Time Only)

Install the brms R package and CmdStan (the Stan toolchain that compiles and runs models). This only needs to be done once per environment.

In [None]:
# Check if brms and CmdStan are already installed
try:
    version = brmspy.get_brms_version()
    print(f"✓ brms already installed (version {version})")

    import cmdstanpy
    cmdstan_path = cmdstanpy.cmdstan_path()
    print(f"✓ CmdStan already installed at: {cmdstan_path}")
    print("\nSkipping installation - already set up!")

except Exception as exc:
    # Any failure in the checks above is treated as "not installed yet"
    print(f"Setup check failed ({exc}).")
    print("Installing brms and CmdStan...")
    print("This may take a few minutes on first run.")
    print("")
    
    # Install both brms and CmdStan
    brmspy.install_brms()
    
    print("\n" + "="*60)
    print("Installation complete!")
    print("="*60)

## 2. Load Example Data

brmspy provides access to all datasets included in the brms package.

In [None]:
# Load the epilepsy dataset
epilepsy = brmspy.get_brms_data("epilepsy")

print("Epilepsy Dataset")
print("="*60)
print(f"Shape: {epilepsy.shape}")
print(f"\nColumns: {', '.join(epilepsy.columns)}")
print(f"\nFirst few rows:")
epilepsy.head()

In [None]:
# Quick data exploration
print("Data Summary")
print("="*60)
print(epilepsy.describe())

print("\nTarget Variable Distribution (count):")
print(epilepsy['count'].value_counts().sort_index().head(10))

## 3. Fit a Simple Model

Let's fit a Poisson regression model with random effects.

**Model:** `count ~ zAge + zBase * Trt + (1|patient)`

This model includes:
- Fixed effects: `zAge`, `zBase`, `Trt`, and the `zBase:Trt` interaction
- Random effect: varying intercepts by patient
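
Written out, and assuming the default log link that brms uses for `family="poisson"`, the formula corresponds to the hierarchical model below. The symbols $\beta_k$, $u_j$, and $\sigma_u$ are notation introduced here for illustration, and $\text{Trt}_i$ stands for the treatment indicator.

$$
\begin{aligned}
\text{count}_i &\sim \operatorname{Poisson}(\lambda_i) \\
\log \lambda_i &= \beta_0 + \beta_1\,\text{zAge}_i + \beta_2\,\text{zBase}_i + \beta_3\,\text{Trt}_i + \beta_4\,\text{zBase}_i\,\text{Trt}_i + u_{\text{patient}[i]} \\
u_j &\sim \operatorname{Normal}(0, \sigma_u)
\end{aligned}
$$

Because of the log link, each coefficient acts multiplicatively on the expected count: a one-unit increase in `zAge` multiplies $\lambda_i$ by $e^{\beta_1}$.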

In [None]:
# Fit the model
# Using fewer iterations for faster demonstration
# For real analysis, use default values (iter_warmup=1000, iter_sampling=1000)

print("Fitting Bayesian Poisson regression model...")
print("This will take a minute or two.\n")

model = brmspy.fit(
    formula="count ~ zAge + zBase * Trt + (1|patient)",
    data=epilepsy,
    family="poisson",
    iter_warmup=250,    # Warmup iterations per chain (demo value; use 1000+ for real analysis)
    iter_sampling=500,  # Sampling iterations per chain (demo value; use 1000+ for real analysis)
    chains=2            # Number of MCMC chains
)

print("\n" + "="*60)
print("Model fitting complete!")
print("="*60)

## 4. Examine Results

View the posterior summary statistics.

In [None]:
# Get summary statistics
summary = model.summary()

print("Posterior Summary")
print("="*60)
print(summary)

# Check convergence diagnostics
print("\nConvergence Check")
print("="*60)
if 'R_hat' in summary.columns:
    max_rhat = summary['R_hat'].max()
    print(f"Max R-hat: {max_rhat:.4f}")
    if max_rhat < 1.01:
        print("✓ Excellent convergence (R-hat < 1.01)")
    elif max_rhat < 1.05:
        print("Acceptable convergence (R-hat < 1.05)")
    else:
        print("⚠ Poor convergence (R-hat >= 1.05) - consider more iterations")
else:
    print("R-hat not available in summary")
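
To give a feel for what the R-hat thresholds above measure, here is a minimal sketch of the classic (non-split) Gelman-Rubin statistic computed by hand on simulated chains. It is not how brmspy or CmdStan compute the reported value (Stan uses a more robust rank-normalized split variant); the function name `classic_rhat` and the simulated chains are illustrative assumptions.

```python
import random
import statistics


def classic_rhat(chains):
    """Classic (non-split) Gelman-Rubin R-hat for a single parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])  # draws per chain
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain variances
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: variance of the chain means
    between_over_n = statistics.variance(chain_means)
    # Pooled variance estimate, compared against W
    var_hat = (n - 1) / n * within + between_over_n
    return (var_hat / within) ** 0.5


rng = random.Random(0)

# Two chains exploring the same distribution: R-hat should be close to 1
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(2)]

# Two chains stuck around different values: R-hat should be well above 1
stuck = [[rng.gauss(mu, 1.0) for _ in range(500)] for mu in (0.0, 3.0)]

rhat_mixed = classic_rhat(mixed)
rhat_stuck = classic_rhat(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.3f}")
print(f"stuck chains:      R-hat = {rhat_stuck:.3f}")
```

R-hat compares the variance between chains with the variance within them, so chains that disagree about where the posterior mass lies push the value above 1.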

## 5. Visualization (Optional)

Visualize posterior distributions using arviz.

In [None]:
if HAS_VIZ:
    # Convert the CmdStanPy fit to an ArviZ InferenceData object
    idata = az.from_cmdstanpy(posterior=model)
    
    # Plot posterior distributions for fixed effects
    fig = az.plot_posterior(
        idata,
        var_names=['b_Intercept', 'b_zAge', 'b_zBase', 'b_Trt', 'b_zBase:Trt'],
        figsize=(12, 8),
        textsize=10
    )
    plt.suptitle('Posterior Distributions - Fixed Effects', y=1.02, fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()
    
    print("✓ Posterior distributions plotted")
else:
    print("⚠ Visualization skipped (arviz or matplotlib not installed)")
    print("  Install with: pip install brmspy[viz]")

In [None]:
if HAS_VIZ:
    # Trace plots to check MCMC behavior
    az.plot_trace(
        idata,
        var_names=['b_Intercept', 'b_zAge', 'b_zBase'],
        figsize=(12, 6)
    )
    plt.suptitle('MCMC Trace Plots', y=1.02, fontsize=14, fontweight='bold')
    plt.tight_layout()
    plt.show()
    
    print("✓ Trace plots show MCMC chain behavior")

## 6. Extract and Analyze Samples

Access the raw posterior samples for custom analysis.

In [None]:
# Get parameter draws
draws = model.draws()

print("Posterior Samples")
print("="*60)
print(f"Shape: {draws.shape}")
print("  (iterations, chains, columns)")
print(f"\nTotal posterior samples: {draws.shape[0] * draws.shape[1]}")

# Example: Extract specific parameter
if HAS_VIZ:
    # Get samples for zAge coefficient
    b_zAge_samples = idata.posterior['b_zAge'].values.flatten()
    
    print(f"\nb_zAge (effect of age):")
    print(f"  Mean: {b_zAge_samples.mean():.4f}")
    print(f"  Median: {np.median(b_zAge_samples):.4f}")
    print(f"  95% CI: [{np.percentile(b_zAge_samples, 2.5):.4f}, {np.percentile(b_zAge_samples, 97.5):.4f}]")
    
    # Simple histogram
    plt.figure(figsize=(10, 4))
    plt.hist(b_zAge_samples, bins=50, density=True, alpha=0.7, edgecolor='black')
    plt.axvline(b_zAge_samples.mean(), color='red', linestyle='--', linewidth=2, label='Mean')
    plt.xlabel('b_zAge')
    plt.ylabel('Density')
    plt.title('Posterior Distribution of zAge Coefficient')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
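
The same kind of summary can be computed without ArviZ or NumPy. The sketch below uses only the standard library on simulated draws; the `[iteration][chain]` layout, the variable names, and the simulated values are assumptions for illustration and are not taken from a fitted model.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical draws for one coefficient, laid out as [iteration][chain]
n_iter, n_chains = 500, 2
draws_2d = [[rng.gauss(0.1, 0.05) for _ in range(n_chains)] for _ in range(n_iter)]

# Pool all chains into one flat list of posterior samples
samples = [value for row in draws_2d for value in row]

# 39 cut points at 2.5%, 5%, ..., 97.5%; the first and last bound a 95% interval
cuts = statistics.quantiles(samples, n=40, method="inclusive")

stats = {
    "mean": statistics.fmean(samples),
    "median": statistics.median(samples),
    "ci_low": cuts[0],
    "ci_high": cuts[-1],
    # Share of draws above zero: posterior probability of a positive effect
    "p_positive": sum(s > 0 for s in samples) / len(samples),
}

for name, value in stats.items():
    print(f"{name:>10}: {value:.4f}")
```

The share of draws above zero is often a more direct answer to "is the effect positive?" than checking whether an interval excludes zero.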

## 7. Model with Custom Priors

Specify informative priors to encode prior knowledge and regularize the coefficient estimates.

In [None]:
# Fit model with custom priors
print("Fitting model with custom priors...\n")

model_with_priors = brmspy.fit(
    formula="count ~ zAge + zBase",
    data=epilepsy,
    family="poisson",
    priors=[
        ("normal(0, 0.5)", "b"),         # Regularizing prior on coefficients
        ("normal(1, 0.5)", "Intercept")  # Informative prior on intercept
    ],
    iter_warmup=500,
    iter_sampling=500,
    chains=2,
    show_console=False
)

print("\n" + "="*60)
print("Model with priors fitted!")
print("="*60)

# Posterior summary for the model with custom priors
print("\nSummary:")
print(model_with_priors.summary())
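
To judge whether priors such as `normal(0, 0.5)` and `normal(1, 0.5)` are reasonable, it helps to translate them from the log scale of the linear predictor to the count scale. The sketch below assumes the default log link for the Poisson family and uses only the standard library; the helper name `implied_ratio_interval` is hypothetical.

```python
import math
from statistics import NormalDist


def implied_ratio_interval(mean, sd, prob=0.95):
    """Central prior interval on the log scale, mapped through exp()."""
    prior = NormalDist(mu=mean, sigma=sd)
    tail = (1.0 - prob) / 2.0
    return math.exp(prior.inv_cdf(tail)), math.exp(prior.inv_cdf(1.0 - tail))


# normal(0, 0.5) on the coefficients: multiplicative effect per unit of a predictor
coef_lo, coef_hi = implied_ratio_interval(0.0, 0.5)

# normal(1, 0.5) on the intercept: expected count when predictors are at their means
int_lo, int_hi = implied_ratio_interval(1.0, 0.5)

print(f"rate ratio per unit of a predictor: {coef_lo:.2f} to {coef_hi:.2f}")
print(f"expected count at the intercept:    {int_lo:.2f} to {int_hi:.2f}")
```

Under these assumptions the coefficient prior puts 95% of its mass on rate ratios between about 0.38 and 2.66, and the intercept prior on expected counts between about 1.0 and 7.2.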

## 8. Other Example Datasets

The same `get_brms_data` function loads any other dataset bundled with brms.

In [None]:
# Load kidney dataset
kidney = brmspy.get_brms_data("kidney")
print("Kidney Dataset")
print("="*60)
print(f"Shape: {kidney.shape}")
print(f"\nFirst few rows:")
print(kidney.head())

print("\n" + "="*60)

# Load inhaler dataset
inhaler = brmspy.get_brms_data("inhaler")
print("\nInhaler Dataset")
print("="*60)
print(f"Shape: {inhaler.shape}")
print(f"\nFirst few rows:")
print(inhaler.head())

## Summary

This notebook demonstrated:
1. ✅ Setting up brmspy (installing brms and CmdStan)
2. ✅ Loading example datasets
3. ✅ Fitting Bayesian regression models
4. ✅ Examining posterior summaries and diagnostics
5. ✅ Visualizing results with arviz
6. ✅ Extracting and analyzing posterior samples
7. ✅ Specifying custom priors

### Next Steps

- Read the [MIGRATION.md](../MIGRATION.md) guide for API changes from v0.0.3
- Check [ARCHITECTURE.md](../ARCHITECTURE.md) for system design details
- See brms documentation: https://paul-buerkner.github.io/brms/
- Explore CmdStanPy features: https://mc-stan.org/cmdstanpy/

### Support

- GitHub: https://github.com/kaitumisuuringute-keskus/brmspy
- Issues: https://github.com/kaitumisuuringute-keskus/brmspy/issues

Happy modeling! 🎉