# CVaR Portfolio Optimization with cuOpt Python API

This notebook demonstrates Conditional Value at Risk (CVaR) portfolio optimization using NVIDIA's cuOpt Python API with S&P 500 stock data.

## Overview

**Conditional Value at Risk (CVaR)** is a risk measure that quantifies the expected loss in the worst $(1-\alpha)$ fraction of scenarios, that is, the average loss beyond the Value at Risk (VaR) at confidence level $\alpha$. It is well suited to portfolio optimization because it is a coherent risk measure that captures tail risk.
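
To make the definition concrete, here is a minimal, dependency-free sketch that computes VaR and CVaR from a handful of equally likely losses. The function name `empirical_var_cvar` and the sample losses are hypothetical. It assumes the tail holds a whole number of scenarios and takes VaR as the smallest loss inside the tail, which is one common discrete convention.

```python
def empirical_var_cvar(losses, alpha=0.95):
    """Return (VaR, CVaR) of equally likely losses at confidence level alpha.

    Assumes (1 - alpha) * len(losses) is close to a whole number, so the
    tail is simply the k largest losses.
    """
    ordered = sorted(losses)
    # Number of scenarios in the worst (1 - alpha) tail, at least one.
    k = max(1, round((1 - alpha) * len(ordered)))
    tail = ordered[-k:]
    var = tail[0]           # smallest loss inside the tail
    cvar = sum(tail) / k    # average loss inside the tail
    return var, cvar

# Ten hypothetical daily losses in percent (positive = money lost).
sample_losses = [-1.2, 0.4, -0.3, 2.5, 0.1, -0.8, 4.0, 0.9, -0.5, 1.6]
var_80, cvar_80 = empirical_var_cvar(sample_losses, alpha=0.80)
print(f"VaR(80%) = {var_80:.2f}, CVaR(80%) = {cvar_80:.2f}")
```

CVaR is never smaller than VaR, because it averages the losses at or beyond the VaR threshold.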

### CVaR Formulation

The CVaR portfolio optimization problem can be formulated as:

$$
\begin{align}
\text{maximize: } & \mu^T w - \lambda \text{CVaR}_\alpha(w) \\
\text{subject to: } & \mathbf{1}^T w = 1 \\
& w_i^{\min} \leq w_i \leq w_i^{\max}, \quad i = 1, \ldots, n
\end{align}
$$

Where:
- $w$ is the portfolio weight vector
- $\mu$ is the expected return vector
- $\lambda$ is the risk aversion parameter
- $\text{CVaR}_\alpha(w)$ is the Conditional Value at Risk at confidence level $\alpha$

### Data Source
We use historical price data for S&P 500 constituents, fetched directly from the NVIDIA/cuopt-examples GitHub repository. The data is loaded from:
`https://raw.githubusercontent.com/NVIDIA/cuopt-examples/refs/heads/branch-25.10/portfolio_optimization/cuFOLIO_portfolio_optimization/data/stock_data/sp500.csv`

### Requirements
- **GPU**: NVIDIA GPU with CUDA support (recommended for optimal performance)
- **CUDA**: Version 12.x or 13.x
- **Python**: 3.10 or higher
- **Memory**: Sufficient RAM for large-scale optimization (8GB+ recommended)

### Installation Notes
- cuOpt requires an NVIDIA GPU and CUDA toolkit
- The package is available through NVIDIA's PyPI index
- Separate packages are published per CUDA version (for example `cuopt-cu12` and `cuopt-cu13`)
- For CPU-only environments, consider using alternative optimization libraries
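
A quick way to see whether the package is present before running the rest of the notebook is to look it up without importing it. This is a small sketch using only the standard library. The helper name `cuopt_available` is hypothetical, and the check only confirms that a top-level `cuopt` package is installed, not that a usable GPU is attached.

```python
import importlib.util

def cuopt_available():
    """Return True if a top-level `cuopt` package can be found."""
    return importlib.util.find_spec("cuopt") is not None

if cuopt_available():
    print("cuOpt is importable.")
else:
    print("cuOpt not found; install it or switch to a CPU-based LP solver.")
```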


## 1. Environment Setup and Installation

### 1.1 Install Required Dependencies


In [None]:
import subprocess
import html
from IPython.display import display, HTML

def check_gpu():
    try:
        result = subprocess.run(["nvidia-smi"], capture_output=True, text=True, timeout=5)
        result.check_returncode()
        lines = result.stdout.splitlines()
        gpu_info = lines[2] if len(lines) > 2 else "GPU detected"
        gpu_info_escaped = html.escape(gpu_info)
        display(HTML(f"""
        <div style="border:2px solid #4CAF50;padding:10px;border-radius:10px;background:#e8f5e9;">
            <h3>✅ GPU is enabled</h3>
            <pre>{gpu_info_escaped}</pre>
        </div>
        """))
        return True
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError, IndexError) as e:
        display(HTML("""
        <div style="border:2px solid red;padding:15px;border-radius:10px;background:#ffeeee;">
            <h3>⚠️ GPU not detected!</h3>
            <p>This notebook requires a <b>GPU runtime</b>.</p>
            
            <h4>If running in Google Colab:</h4>
            <ol>
              <li>Click on <b>Runtime → Change runtime type</b></li>
              <li>Set <b>Hardware accelerator</b> to <b>GPU</b></li>
              <li>Then click <b>Save</b> and <b>Runtime → Restart runtime</b>.</li>
            </ol>
            
            <h4>If running in Docker:</h4>
            <ol>
              <li>Ensure you have <b>NVIDIA Docker runtime</b> installed (<code>nvidia-docker2</code>)</li>
              <li>Run container with GPU support: <code>docker run --gpus all ...</code></li>
              <li>Or use: <code>docker run --runtime=nvidia ...</code> for older Docker versions</li>
              <li>Verify GPU access: <code>docker run --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi</code></li>
            </ol>
            
            <p><b>Additional resources:</b></p>
            <ul>
              <li><a href="https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html" target="_blank">NVIDIA Container Toolkit Installation Guide</a></li>
            </ul>
        </div>
        """))
        return False

check_gpu()

In [None]:
# Uncomment the lines below if cuOpt is not yet installed (for example, in Google Colab); use the install line that matches your CUDA version
#!pip uninstall -y cuda-python cuda-bindings cuda-core
#!pip install --upgrade --extra-index-url=https://pypi.nvidia.com cuopt-cu12 nvidia-nvjitlink-cu12 rapids-logger==0.1.19
#!pip install --upgrade --extra-index-url=https://pypi.nvidia.com cuopt-cu13 nvidia-nvjitlink-cu13 rapids-logger==0.1.19

In [None]:
!pip install --extra-index-url https://pypi.nvidia.com "numpy>=1.24.4" "pandas>=2.2.1" "scipy==1.15.2" "seaborn>=0.13.2"

### 1.2 Import Required Libraries


In [None]:
# Import required libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import warnings
warnings.filterwarnings('ignore')

# cuOpt imports
from cuopt.linear_programming.problem import Problem, VType, sense, LinearExpression
from cuopt.linear_programming.solver_settings import SolverSettings, PDLPSolverMode
from cuopt.linear_programming.solver.solver_parameters import *

# Set random seed for reproducibility
np.random.seed(42)

### 1.3 Configure Solver Settings


In [None]:
# Configure solver settings for larger problem
solver_settings = SolverSettings()
solver_settings.set_parameter("time_limit", 300.0)  # 5 minute time limit for larger problem
solver_settings.set_parameter("log_to_console", True)  # Enable solver logging
solver_settings.set_parameter("method", 0)  # Use default method


### 1.4 Load S&P 500 Data


In [None]:
# Load S&P 500 data from GitHub
data_url = 'https://raw.githubusercontent.com/NVIDIA/cuopt-examples/refs/heads/branch-25.10/portfolio_optimization/cuFOLIO_portfolio_optimization/data/stock_data/sp500.csv'
df = pd.read_csv(data_url, index_col='Date', parse_dates=True)

print(f"Date range: {df.index.min()} to {df.index.max()}")
print(f"Number of assets: {len(df.columns)}")
print(f"\nFirst few columns: {list(df.columns[:10])}")

# Preview the first few rows
df.head()


## 2. Data Preprocessing and Return Calculation


In [None]:
# Use all S&P 500 assets with complete data
# Remove any assets with missing data
price_data = df.dropna(axis=1, how='any')  # Drop columns with any NaN values
selected_assets = price_data.columns

print(f"Total assets in dataset: {len(df.columns)}")
print(f"Assets with complete data: {len(selected_assets)}")
print(f"Price data shape: {price_data.shape}")
print(f"Selected assets (first 10): {list(selected_assets[:10])}")

# Calculate log returns
returns = np.log(price_data / price_data.shift(1)).dropna()

print(f"Returns data shape: {returns.shape}")
print(f"Returns date range: {returns.index.min()} to {returns.index.max()}")

# Display return statistics
print("\nReturn Statistics (first 5 assets):")
print(returns.iloc[:, :5].describe())
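
The log-return step in the cell above can be checked by hand on a tiny series. The sketch below uses only the standard library. The prices are hypothetical and `log_returns` is an illustrative helper, not part of the notebook.

```python
import math

def log_returns(prices):
    """Log returns ln(P_t / P_{t-1}) for a list of positive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

prices = [100.0, 102.0, 99.0, 101.0]  # hypothetical closing prices
r = log_returns(prices)

# Log returns add up over time: their sum is the log of the total price change.
total = sum(r)
print([round(x, 4) for x in r])
print(round(total, 4), round(math.log(prices[-1] / prices[0]), 4))
```

This additivity is one reason log returns are convenient for multi-period analysis. One price series of length `T` yields `T - 1` returns, which is why the first row is dropped after `shift(1)`.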


In [None]:
# Calculate expected returns and covariance matrix
mu = returns.mean().values  # Expected returns
Sigma = returns.cov().values  # Covariance matrix
n_assets = len(selected_assets)

# Annualize returns (assuming 252 trading days)
mu_annual = mu * 252
Sigma_annual = Sigma * 252

print("\nAnnualized expected returns (first 5 assets):")
for i in range(5):
    print(f"{selected_assets[i]}: {mu_annual[i]:.4f}")


## 3. CVaR Scenario Generation

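For CVaR optimization, we need scenarios of asset returns, from which portfolio returns follow for any weight vector. We combine historical simulation (the observed daily returns) with Monte Carlo samples drawn from a multivariate normal distribution fitted to the same data.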
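
The idea of stacking observed and simulated scenarios and giving each the same probability can be sketched without any dependencies. The data below is hypothetical. As a simplification, each asset is sampled independently here, whereas the notebook samples from a multivariate normal that keeps the correlations.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical daily returns for 2 assets over 4 days (rows = scenarios).
historical = [
    [0.010, -0.004],
    [-0.020, 0.006],
    [0.005, 0.001],
    [0.012, -0.009],
]
n_assets = len(historical[0])
n_hist = len(historical)

# Per-asset sample mean and standard deviation of the historical data.
means = [sum(row[i] for row in historical) / n_hist for i in range(n_assets)]
stds = [
    (sum((row[i] - means[i]) ** 2 for row in historical) / (n_hist - 1)) ** 0.5
    for i in range(n_assets)
]

# Simplification: draw each asset independently, ignoring correlations.
simulated = [[rng.gauss(means[i], stds[i]) for i in range(n_assets)] for _ in range(6)]

scenarios = historical + simulated
probs = [1.0 / len(scenarios)] * len(scenarios)  # equal weight per scenario
print(len(scenarios), "scenarios, probabilities sum to", round(sum(probs), 12))
```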


In [None]:
# Historical simulation scenarios
historical_scenarios = returns.values
n_scenarios_hist = historical_scenarios.shape[0]

print(f"Historical scenarios: {n_scenarios_hist}")
print(f"Number of assets: {len(selected_assets)}")

# For computational efficiency with many assets, use fewer Monte Carlo scenarios
# Adjust based on problem size
n_scenarios_mc = min(2000, n_scenarios_hist)  # Use at most 2000 MC scenarios
mc_scenarios = np.random.multivariate_normal(mu, Sigma, n_scenarios_mc)

print(f"Monte Carlo scenarios: {n_scenarios_mc}")

# Combine scenarios
all_scenarios = np.vstack([historical_scenarios, mc_scenarios])
n_scenarios_total = all_scenarios.shape[0]
scenario_probs = np.ones(n_scenarios_total) / n_scenarios_total

print(f"Total scenarios: {n_scenarios_total}")
print(f"Scenario matrix shape: {all_scenarios.shape}")
print(f"Problem size: {len(selected_assets)} assets × {n_scenarios_total} scenarios = {len(selected_assets) * n_scenarios_total} scenario-asset combinations")


## 4. CVaR Portfolio Optimization with cuOpt

Now we'll implement the CVaR optimization using cuOpt's linear programming interface. The CVaR optimization problem can be reformulated as a linear program.
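
Written out, the linear program built by the function below takes the following form. This is the standard Rockafellar-Uryasev reformulation. The symbols not defined earlier are notation introduced here for illustration: $r_s$ is the asset return vector in scenario $s$, $p_s$ is its probability, $S$ is the number of scenarios, $t$ is an auxiliary variable that plays the role of VaR, and $u_s$ is the loss in excess of $t$ in scenario $s$.

$$
\begin{align}
\text{maximize: } & \mu^T w - \lambda \left( t + \frac{1}{1-\alpha} \sum_{s=1}^{S} p_s u_s \right) \\
\text{subject to: } & u_s \geq -r_s^T w - t, \quad s = 1, \ldots, S \\
& u_s \geq 0, \quad s = 1, \ldots, S \\
& \mathbf{1}^T w = 1 \\
& w_i^{\min} \leq w_i \leq w_i^{\max}, \quad i = 1, \ldots, n
\end{align}
$$

At the optimum, the term in parentheses equals $\text{CVaR}_\alpha(w)$, so the objective matches the formulation in the overview while staying linear in $(w, t, u)$.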


In [None]:
def solve_cvar_portfolio(scenarios, scenario_probs, mu, alpha=0.95, lambda_risk=1.0, 
                        w_min=None, w_max=None, solver_settings=None):
    """
    Solve CVaR portfolio optimization using cuOpt linear programming.
    
    Parameters:
    - scenarios: numpy array of return scenarios (n_scenarios x n_assets)
    - scenario_probs: probability weights for scenarios
    - mu: expected returns vector
    - alpha: confidence level for CVaR (default 0.95)
    - lambda_risk: risk aversion parameter (default 1.0)
    - w_min, w_max: bounds on portfolio weights
    - solver_settings: cuOpt solver settings
    
    Returns:
    - optimal_weights: optimal portfolio weights
    - cvar_value: CVaR value at optimum
    - expected_return: expected portfolio return
    - problem: the solved cuOpt Problem object (status, objective value)
    """
    
    n_scenarios, n_assets = scenarios.shape
    
    if w_min is None:
        w_min = np.zeros(n_assets)
    if w_max is None:
        w_max = np.ones(n_assets)
    
    # Create the linear programming problem
    problem = Problem("cvar_portfolio_optimization")
    
    # Decision variables
    # Portfolio weights
    w = {}
    for i in range(n_assets):
        w[i] = problem.addVariable(name=f"w_{i}", vtype=VType.CONTINUOUS, 
                                  lb=w_min[i], ub=w_max[i])
    
    # CVaR auxiliary variables
    t = problem.addVariable(name="t", vtype=VType.CONTINUOUS, 
                           lb=-float('inf'), ub=float('inf'))  # VaR variable
    u = {}
    for s in range(n_scenarios):
        u[s] = problem.addVariable(name=f"u_{s}", vtype=VType.CONTINUOUS, 
                                  lb=0.0, ub=float('inf'))  # CVaR auxiliary
    
    # Objective: maximize expected return - lambda * CVaR
    # CVaR = t + (1/(1-alpha)) * sum(p_s * u_s)
    objective_expr = LinearExpression([], [], 0.0)
    
    # Add expected return terms
    for i in range(n_assets):
        if mu[i] != 0:
            objective_expr += w[i] * mu[i]
    
    # Subtract CVaR terms to penalize higher risk (lower CVaR increases objective value)
    if lambda_risk != 0:
        objective_expr -= t * lambda_risk
        cvar_coeff = lambda_risk / (1.0 - alpha)
        for s in range(n_scenarios):
            if scenario_probs[s] != 0:
                objective_expr -= u[s] * (cvar_coeff * scenario_probs[s])
    
    problem.setObjective(objective_expr, sense.MAXIMIZE)
    
    # Constraints
    # Budget constraint: sum of weights = 1
    budget_expr = LinearExpression([], [], 0.0)
    for i in range(n_assets):
        budget_expr += w[i]
    problem.addConstraint(budget_expr == 1.0, name="budget")
    
    # CVaR constraints: u_s >= -R_s^T * w - t for all scenarios s
    for s in range(n_scenarios):
        cvar_constraint_expr = LinearExpression([], [], 0.0)
        cvar_constraint_expr += u[s]  # u_s
        cvar_constraint_expr += t     # + t
        
        # Add portfolio return terms: + R_s^T * w
        for i in range(n_assets):
            if scenarios[s, i] != 0:
                cvar_constraint_expr += w[i] * scenarios[s, i]
        
        problem.addConstraint(cvar_constraint_expr >= 0.0, name=f"cvar_{s}")
    
    # Solve the optimization problem
    if solver_settings is not None:
        problem.solve(solver_settings)
    else:
        problem.solve()
    
    if problem.Status.name == "Optimal":
        # Extract optimal solution
        optimal_weights = np.array([w[i].getValue() for i in range(n_assets)])
        t_value = t.getValue()
        u_values = np.array([u[s].getValue() for s in range(n_scenarios)])
        
        # Calculate CVaR and expected return
        cvar_value = t_value + (1.0 / (1.0 - alpha)) * np.sum(scenario_probs * u_values)
        expected_return = np.dot(mu, optimal_weights)
        
        return optimal_weights, cvar_value, expected_return, problem
    else:
        raise RuntimeError(f"Optimization failed with status: {problem.Status.name}")
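
As a sanity check on the formulation used in `solve_cvar_portfolio`, the sketch below fixes a weight vector and verifies that minimizing the auxiliary objective over `t` reproduces the plain tail average. It is dependency-free and does not call cuOpt. The scenario data, the weights, and the helper `ru_objective` are hypothetical, and the check relies on the tail containing a whole number of equally likely scenarios (2 of 10 at `alpha = 0.8`).

```python
def ru_objective(t, losses, probs, alpha):
    """Rockafellar-Uryasev function: t + E[max(loss - t, 0)] / (1 - alpha)."""
    excess = sum(p * max(loss - t, 0.0) for loss, p in zip(losses, probs))
    return t + excess / (1.0 - alpha)

# Hypothetical scenario returns for 2 assets and a fixed weight vector.
scenario_returns = [
    [0.02, 0.01], [-0.03, 0.00], [0.01, -0.02], [-0.05, -0.01], [0.00, 0.03],
    [0.04, -0.01], [-0.01, 0.01], [0.03, 0.02], [-0.02, -0.04], [0.01, 0.00],
]
weights = [0.6, 0.4]
alpha = 0.8
probs = [1.0 / len(scenario_returns)] * len(scenario_returns)

# Portfolio loss per scenario is minus the portfolio return.
losses = [-sum(w * r for w, r in zip(weights, row)) for row in scenario_returns]

# The function is piecewise linear in t, so a minimum occurs at one of the losses.
cvar_lp = min(ru_objective(t, losses, probs, alpha) for t in losses)

# Direct computation: average of the worst 20% of scenarios (2 out of 10).
cvar_direct = sum(sorted(losses)[-2:]) / 2

print(f"LP-style CVaR: {cvar_lp:.6f}, tail average: {cvar_direct:.6f}")
```

In the full problem the solver chooses `t`, the `u_s` values, and the weights jointly, which is what lets CVaR appear inside a linear program.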

## 5. Solve the CVaR Optimization Problem


In [None]:
# Set optimization parameters
alpha = 0.95  # 95% confidence level
lambda_risk = 2.0  # Risk aversion parameter

# Portfolio weight bounds (long-only; the default cap of 100% imposes no diversification)
w_min = np.zeros(n_assets)  # No short selling
w_max = np.ones(n_assets)  # Up to 100% may be held in a single asset

print("Weight bounds:")
print(f"- Maximum weight per asset: {w_max[0]:.1%}")
print(f"- A full allocation needs at least {int(np.ceil(1 / w_max[0]))} asset(s)")

# Alternative diversification strategies (uncomment to try):

# Strategy 1: Cap each asset at 10% (forces at least 10 holdings)
# w_max = np.ones(n_assets) * 0.10

# Strategy 2: Minimum holdings requirement (forces broader diversification)
# min_holdings = 30  # Require at least 30 assets
# w_min = np.zeros(n_assets)
# w_min[:min_holdings] = 0.005  # Minimum 0.5% in the first 30 assets (column order, not ranked)

# Strategy 3: Lower risk aversion (allows more return-seeking behavior)
# lambda_risk = 0.5  # Less conservative approach

print(f"- Confidence level (alpha): {alpha}")
print(f"- Risk aversion (lambda): {lambda_risk}")
print(f"- Number of scenarios: {n_scenarios_total}")
print(f"- Number of assets: {n_assets}")

# Solve the optimization problem
try:
    optimal_weights, cvar_value, expected_return, solve_result = solve_cvar_portfolio(
        scenarios=all_scenarios,
        scenario_probs=scenario_probs,
        mu=mu_annual,  # Annualized returns; the scenarios, and therefore CVaR, are on a daily scale
        alpha=alpha,
        lambda_risk=lambda_risk,
        w_min=w_min,
        w_max=w_max,
        solver_settings=solver_settings
    )
    
    print("\nOptimization successful!")
    print(f"Status: {solve_result.Status.name}")
    print(f"Objective value: {solve_result.ObjValue:.6f}")
    print(f"Expected annual return: {expected_return:.4f} ({expected_return*100:.2f}%)")
    print(f"CVaR (95%): {cvar_value:.4f}")
    
except Exception as e:
    print(f"Optimization failed: {e}")
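
When experimenting with the alternative bounds listed in the cell above, it can help to confirm first that the budget constraint is still satisfiable, since an overly tight cap or floor makes the LP infeasible. The following is a minimal sketch with a hypothetical helper `check_weight_bounds` and a hypothetical universe of 50 assets. It uses plain lists, so it runs without cuOpt or NumPy.

```python
def check_weight_bounds(w_min, w_max, tol=1e-12):
    """Return True if some w with w_min <= w <= w_max can sum to 1."""
    if any(lo > hi for lo, hi in zip(w_min, w_max)):
        return False
    return sum(w_min) <= 1.0 + tol and sum(w_max) >= 1.0 - tol

n = 50  # hypothetical number of assets

# 10% cap on every asset: feasible because 50 * 0.10 >= 1.
cap_10_ok = check_weight_bounds([0.0] * n, [0.10] * n)

# 1% cap on every asset: infeasible because 50 * 0.01 < 1.
cap_1_ok = check_weight_bounds([0.0] * n, [0.01] * n)

# 0.5% floor on the first 30 assets: feasible because 30 * 0.005 <= 1.
floor = [0.005] * 30 + [0.0] * (n - 30)
floor_ok = check_weight_bounds(floor, [1.0] * n)

print(cap_10_ok, cap_1_ok, floor_ok)
```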


## 6. Analyze the Optimal Portfolio


In [None]:
# Create portfolio results DataFrame
portfolio_df = pd.DataFrame({
    'Asset': selected_assets,
    'Weight': optimal_weights,
    'Expected_Return': mu_annual
})

# Sort by weight (descending)
portfolio_df = portfolio_df.sort_values('Weight', ascending=False)

# Display portfolio composition (top holdings only)
significant_holdings = portfolio_df[portfolio_df['Weight'] > 0.001]  # Only assets with weight > 0.1%
top_holdings = significant_holdings.head(20)  # Show top 20 holdings

print("Optimal Portfolio Composition (Top 20 Holdings):")
print("=" * 70)
for _, row in top_holdings.iterrows():
    print(f"{row['Asset']:>6}: {row['Weight']:>8.4f} ({row['Weight']*100:>6.2f}%) | Expected Return: {row['Expected_Return']:>8.4f}")

In [None]:
# Visualize portfolio composition
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 8))

# Portfolio weights bar chart (top 20 holdings)
top_20_holdings = significant_holdings.head(20)
bars = ax1.bar(range(len(top_20_holdings)), top_20_holdings['Weight'])
ax1.set_xlabel('Assets (Top 20 Holdings)')
ax1.set_ylabel('Portfolio Weight')
ax1.set_title(f'Optimal Portfolio Weights - Top 20 Holdings\n({len(selected_assets)} total assets, {len(significant_holdings)} with weight > 0.1%)')
ax1.set_xticks(range(len(top_20_holdings)))
ax1.set_xticklabels(top_20_holdings['Asset'], rotation=45, ha='right')
ax1.grid(True, alpha=0.3)

# Add value labels on bars for top holdings
for i, bar in enumerate(bars):
    height = bar.get_height()
    if height > 0.01:  # Only label if weight > 1%
        ax1.text(bar.get_x() + bar.get_width()/2., height + 0.001,
                f'{height:.3f}', ha='center', va='bottom', fontsize=8)

# Portfolio weights pie chart (top 10 holdings)
top_10_holdings = significant_holdings.head(10)
other_weight = significant_holdings.iloc[10:]['Weight'].sum() if len(significant_holdings) > 10 else 0

if other_weight > 0:
    pie_data = list(top_10_holdings['Weight']) + [other_weight]
    pie_labels = list(top_10_holdings['Asset']) + [f'Others ({len(significant_holdings)-10} assets)']
else:
    pie_data = top_10_holdings['Weight']
    pie_labels = top_10_holdings['Asset']

wedges, texts, autotexts = ax2.pie(pie_data, labels=pie_labels, autopct='%1.1f%%', 
                                  startangle=90, textprops={'fontsize': 9})
ax2.set_title('Portfolio Allocation - Top 10 Holdings + Others')

# Improve pie chart readability
for autotext in autotexts:
    autotext.set_color('white')
    autotext.set_fontweight('bold')

plt.tight_layout()
plt.show()

# Additional statistics
print(f"\nConcentration Analysis:")
print(f"Herfindahl-Hirschman Index (HHI): {np.sum(optimal_weights**2):.6f}")
print(f"Effective number of assets: {1/np.sum(optimal_weights**2):.2f}")
print(f"Share of assets held (weight > 0.1%): {len(significant_holdings)}/{len(selected_assets)} = {len(significant_holdings)/len(selected_assets):.2%}")
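
The two concentration figures printed above are easy to interpret on a toy example. The sketch below uses hypothetical weight vectors and illustrative helper names to compare an evenly spread portfolio with one dominated by a single asset.

```python
def hhi(weights):
    """Herfindahl-Hirschman Index: sum of squared portfolio weights."""
    return sum(w * w for w in weights)

def effective_assets(weights):
    """Inverse HHI: number of equally weighted assets with the same concentration."""
    return 1.0 / hhi(weights)

equal = [0.25, 0.25, 0.25, 0.25]   # evenly spread across 4 assets
tilted = [0.70, 0.10, 0.10, 0.10]  # dominated by one asset

print(hhi(equal), effective_assets(equal))
print(round(hhi(tilted), 4), round(effective_assets(tilted), 2))
```

For `n` equally weighted assets the HHI is `1/n`, so the effective number of assets equals `n`. Any concentration pushes it below the number of names actually held.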


In [None]:
# Final summary statistics
print("CVaR Portfolio Optimization Summary")
print("=" * 50)
print(f"Dataset: S&P 500 stocks ({n_assets} assets)")
print(f"Optimization method: CVaR with cuOpt GPU acceleration")
print(f"Confidence level: {alpha*100}%")
print(f"Risk aversion parameter: {lambda_risk}")
print(f"Number of scenarios: {n_scenarios_total:,}")

if 'optimal_weights' in locals():
    portfolio_std = np.std(all_scenarios @ optimal_weights) * np.sqrt(252)
    print(f"\nOptimal Portfolio Performance:")
    print(f"- Expected annual return: {expected_return:.2%}")
    print(f"- Annual volatility: {portfolio_std:.2%}")
    print(f"- Sharpe ratio (risk-free rate assumed 0): {expected_return/portfolio_std:.3f}")
    print(f"- CVaR (95%, daily scenarios): {cvar_value:.2%}")
    print(f"- Number of assets with weight > 0.1%: {np.sum(optimal_weights > 0.001)}")
    
    # Top 5 holdings
    top_5 = portfolio_df.head(5)
    print(f"\nTop 5 Holdings:")
    for _, row in top_5.iterrows():
        if row['Weight'] > 0.001:
            print(f"- {row['Asset']}: {row['Weight']:.2%}")
    
    print("\nSolver Results:")
    print(f"- Solver status: {solve_result.Status.name}")
    print(f"- Objective value: {solve_result.ObjValue:.6f}")
else:
    print("\nOptimization was not successful - please check the previous cells.")
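
The annualization used in the summary cell can be reproduced on a short series. This is a sketch under stated assumptions: 252 trading days per year as in the notebook, a risk-free rate of zero unless one is supplied, and hypothetical daily returns. `annualized_stats` is an illustrative helper, not part of the notebook.

```python
import math
import statistics

TRADING_DAYS = 252  # same annualization convention as the notebook

def annualized_stats(daily_returns, risk_free_annual=0.0):
    """Return (annual return, annual volatility, Sharpe ratio) from daily returns."""
    annual_return = statistics.fmean(daily_returns) * TRADING_DAYS
    # pstdev is the population standard deviation, matching np.std's default.
    annual_vol = statistics.pstdev(daily_returns) * math.sqrt(TRADING_DAYS)
    sharpe = (annual_return - risk_free_annual) / annual_vol
    return annual_return, annual_vol, sharpe

daily = [0.004, -0.002, 0.001, 0.003, -0.005, 0.002]  # hypothetical daily returns
ret, vol, sharpe = annualized_stats(daily)
print(f"return={ret:.2%}, volatility={vol:.2%}, Sharpe={sharpe:.2f}")
```

Mean returns scale with the number of days while volatility scales with its square root, which is why the two are multiplied by `252` and `sqrt(252)` respectively.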


## 7. Summary and Key Takeaways

This notebook demonstrated how to implement CVaR portfolio optimization using NVIDIA's cuOpt Python API with S&P 500 data. 

### Key Features Implemented:
1. **GPU-Accelerated Optimization**: Used cuOpt for fast linear programming solution
2. **CVaR Risk Management**: Implemented conditional value-at-risk as the risk measure
3. **Scenario-Based Approach**: Combined historical and Monte Carlo simulation scenarios
4. **Diversification Constraints**: Supported per-asset weight bounds; the default run allows up to 100% in a single asset, and tighter caps can be enabled to improve diversification
5. **Comprehensive Analysis**: Portfolio composition, risk metrics, and visualization

### Diversification Strategies Available:
- **Maximum Weight Constraints**: Limit concentration in any single asset
- **Minimum Weight Requirements**: Force broader asset allocation across more assets
- **Risk Aversion Adjustment**: Lower lambda_risk for more return-seeking behavior

---
SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
