# Chapter 49: Security and Compliance

## Learning Objectives

By the end of this chapter, you will be able to:

- Identify the key security risks in a machine learning system, especially for financial data
- Implement encryption for data at rest and in transit to protect sensitive NEPSE information
- Design an access control system with authentication, authorization, and role‑based access control (RBAC)
- Secure prediction APIs against common attacks (injection, denial‑of‑service, model stealing)
- Understand model‑specific threats: model inversion, membership inference, and adversarial examples
- Apply privacy‑preserving techniques such as data anonymization, differential privacy, and federated learning
- Navigate regulatory compliance requirements (GDPR, HIPAA, SOC2) that may apply to your prediction system
- Establish audit trails and logging for security monitoring and forensic analysis
- Adopt industry best practices for developing and operating secure ML systems

---

## Introduction

In a financial prediction system like the one we are building for NEPSE, security and compliance are not optional—they are foundational. The system handles potentially sensitive market data, and its predictions could influence trading decisions worth real money. A breach could lead to financial loss, reputational damage, and regulatory penalties. Moreover, the model itself is an intellectual asset that competitors might want to steal.

This chapter covers the security and compliance landscape for machine learning systems. We will start with basic data security (encryption, access control), then move to API security, model‑specific threats, privacy, and finally regulatory compliance. Throughout, we will use the NEPSE system as a concrete example, showing how to apply these principles in practice.

---

## 49.1 Security Fundamentals for ML Systems

Security is often thought of as a set of barriers: firewalls, encryption, authentication. But in a modern ML system, the attack surface is broader. Threats can target:

- **Data**: Unauthorized access, leakage, or tampering.
- **Model**: Stealing the model, extracting training data, or fooling it with adversarial inputs.
- **Infrastructure**: Compromising servers, containers, or cloud accounts.
- **Supply chain**: Malicious dependencies or compromised model artifacts.

A comprehensive security strategy must address all these layers. The principle of **defense in depth** applies: multiple layers of security controls so that if one fails, others still protect the system.

For the NEPSE system, we must assume that adversaries are sophisticated: they could be competitors, malicious traders, or even state actors. Therefore, we will implement security at every level.

---

## 49.2 Data Security

Data is the lifeblood of any ML system. For NEPSE, the data includes historical prices, volumes, and potentially derived features. While this data is not personally identifiable, it is commercially sensitive. If a competitor gets our feature set, they could replicate our model. Moreover, if the data is tampered with, the model's predictions become unreliable.

### 49.2.1 Encryption at Rest

Data stored in databases, data lakes, or even CSV files must be encrypted. This ensures that if an attacker gains access to the storage media, they cannot read the data without the decryption key.

**Example: Encrypting a CSV file with Python using `cryptography`**

```python
from cryptography.fernet import Fernet

# Generate a key (store this securely, e.g., in a key management service)
key = Fernet.generate_key()
cipher = Fernet(key)

# Encrypt a file
with open('nepse_sensitive.csv', 'rb') as f:
    data = f.read()
encrypted = cipher.encrypt(data)
with open('nepse_sensitive.enc', 'wb') as f:
    f.write(encrypted)

# To decrypt later:
with open('nepse_sensitive.enc', 'rb') as f:
    encrypted_data = f.read()
decrypted = cipher.decrypt(encrypted_data)
# write decrypted to a temporary file or process in memory
```

In production, you would not manage keys like this. Instead, use a **Key Management Service (KMS)** such as AWS KMS, Google Cloud KMS, or HashiCorp Vault. These services handle key rotation, access control, and auditing.

**Example: Using AWS KMS to encrypt data in S3**

When storing data in S3, you can enable server‑side encryption (SSE‑KMS) with a KMS key. This is configured at the bucket level or per object. All reads and writes are automatically encrypted/decrypted by S3.
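SSE‑KMS can also be requested per object from code. The following is a minimal sketch using the `ServerSideEncryption` and `SSEKMSKeyId` parameters of boto3's `put_object`; the bucket name, object key, and KMS key ARN are hypothetical placeholders, and the upload itself needs boto3 and valid AWS credentials.

```python
def sse_kms_put_kwargs(bucket: str, key: str, body: bytes, kms_key_id: str) -> dict:
    """Build put_object arguments that request SSE-KMS for a single object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # ask S3 to encrypt with a KMS key
        "SSEKMSKeyId": kms_key_id,          # which KMS key to use
    }


def upload_encrypted(bucket: str, key: str, body: bytes, kms_key_id: str):
    """Upload one object; requires boto3 and AWS credentials at call time."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")
    return s3.put_object(**sse_kms_put_kwargs(bucket, key, body, kms_key_id))


# Hypothetical names, for illustration only.
kwargs = sse_kms_put_kwargs(
    "nepse-data-prod",
    "raw/prices.csv",
    b"symbol,close\n",
    "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",
)
print(kwargs["ServerSideEncryption"])
```

Setting a default encryption configuration on the bucket achieves the same result without changing application code.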

### 49.2.2 Encryption in Transit

Data moving between services—from your data source to Kafka, from Kafka to the stream processor, from the model to the API client—must be encrypted to prevent eavesdropping or man‑in‑the‑middle attacks.

- **TLS/SSL** is the standard. All web services (APIs) should use HTTPS.
- For Kafka, enable SSL encryption between brokers and clients.
- For databases, use TLS connections (e.g., PostgreSQL with `sslmode=require`).
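On the client side, Python's standard `ssl` module can enforce certificate verification and a minimum TLS version. This is a small sketch; the helper name `strict_client_context` is illustrative, and the TLS 1.2 floor is an assumption about your policy.

```python
import ssl


def strict_client_context(ca_file=None) -> ssl.SSLContext:
    """Client-side TLS context that verifies certificates and hostnames."""
    # create_default_context() enables CERT_REQUIRED and hostname checking
    ctx = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_client_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
```

A context like this can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=ctx)`.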

**Example: Enforcing HTTPS in a FastAPI app**

```python
from fastapi import FastAPI
import uvicorn
from fastapi.middleware.httpsredirect import HTTPSRedirectMiddleware

app = FastAPI()
app.add_middleware(HTTPSRedirectMiddleware)  # Redirect all HTTP to HTTPS

# ... routes ...

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000, ssl_keyfile="key.pem", ssl_certfile="cert.pem")
```

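In production behind a load balancer (e.g., AWS ALB), you can terminate TLS at the load balancer and forward plain HTTP over a private network, which simplifies certificate management. That internal hop is then unencrypted, so use this only when the internal network is trusted; otherwise re‑encrypt traffic between the load balancer and the service. In this setup, it is usually simpler to perform the HTTP‑to‑HTTPS redirect at the load balancer, because the application sees only HTTP requests unless the server is configured to trust the load balancer's `X-Forwarded-Proto` header.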

### 49.2.3 Key Management

Keys are the most sensitive assets. Best practices:

- Never hard‑code keys in source code. Use environment variables or secret management tools.
- Rotate keys regularly.
- Use separate keys for different environments (dev, staging, prod).
- Restrict access to keys using IAM roles and policies.
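Key rotation can be sketched with `MultiFernet` from the same `cryptography` package used earlier. Here the keys are generated in memory purely for illustration; in practice they would come from a KMS or secret store.

```python
from cryptography.fernet import Fernet, MultiFernet

# Data encrypted under the old key
old_key = Fernet(Fernet.generate_key())
token = old_key.encrypt(b"close=512.0")

# Introduce a new key: the first key in the list encrypts, all keys can decrypt
new_key = Fernet(Fernet.generate_key())
keyring = MultiFernet([new_key, old_key])

# Re-encrypt existing data under the new key
rotated = keyring.rotate(token)

# The rotated token no longer depends on the old key
print(new_key.decrypt(rotated))
```

Once every stored token has been rotated, the old key can be removed from the keyring and retired.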

**Example: Using environment variables for database passwords**

```python
import os
import psycopg2

DB_PASSWORD = os.environ.get('DB_PASSWORD')
conn = psycopg2.connect(
    host='db.example.com',
    user='myuser',
    password=DB_PASSWORD,
    sslmode='require'
)
```

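In Kubernetes, you can store secrets as `Secret` objects and mount them as environment variables or files. Note that Secret values are only base64‑encoded and are stored unencrypted in etcd by default, so enable encryption at rest for the cluster and restrict access to Secrets with RBAC.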
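Application code can then read a secret from either source and fail fast when it is missing. This is a minimal sketch: the helper name `load_secret`, the default directory `/run/secrets`, and the lowercase file naming are assumptions, not a Kubernetes convention.

```python
import os
from pathlib import Path


def load_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Return a secret from a mounted file if present, else from the environment."""
    path = Path(secrets_dir) / name.lower()
    if path.is_file():
        return path.read_text(encoding="utf-8").strip()
    value = os.environ.get(name)
    if not value:
        # Fail at startup rather than connecting with an empty password
        raise RuntimeError(f"Secret {name!r} is not configured")
    return value


# Demo only: a real deployment would inject this value from outside the code
os.environ["DEMO_DB_PASSWORD"] = "example-only"
password = load_secret("DEMO_DB_PASSWORD", secrets_dir="/nonexistent-dir")
print(len(password))
```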

---

## 49.3 Access Control

Access control ensures that only authenticated and authorized users or services can interact with your system.

### 49.3.1 Authentication

Authentication verifies the identity of a client. For human users, this might be username/password with multi‑factor authentication (MFA). For machine‑to‑machine communication, API keys or JWT tokens are common.

**Example: JWT authentication in FastAPI**

```python
import os

import jwt  # PyJWT
from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()

# Read the signing key from the environment (variable name is illustrative); never hard-code it
SECRET_KEY = os.environ["JWT_SECRET_KEY"]
ALGORITHM = "HS256"

security = HTTPBearer()

def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
    token = credentials.credentials
    try:
        # Pinning the accepted algorithms prevents algorithm-confusion attacks
        payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
        return payload
    except jwt.PyJWTError:
        # 401 = invalid credentials; 403 is for authenticated users lacking permission
        raise HTTPException(status_code=401, detail="Invalid or expired token")

@app.get("/predict")
async def predict(symbol: str, user=Depends(verify_token)):
    # Only authenticated users can access
    return {"symbol": symbol, "prediction": 0.65}
```

**Explanation:**  
The client includes a JWT in the `Authorization: Bearer <token>` header. The `verify_token` dependency decodes the token and validates its signature (and its expiry, if the token carries an `exp` claim). If validation fails, a 401 error is returned.
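For machine‑to‑machine clients that use API keys instead of JWTs, one possible approach is to store only a hash of each key and compare in constant time. The sketch below uses the standard library only; the in‑memory `API_KEY_STORE` and the function names are hypothetical stand‑ins for a real database. A plain SHA‑256 is acceptable here only because the keys are long random strings, not human passwords.

```python
import hashlib
import hmac
import secrets

# Hypothetical in-memory store: key fingerprint -> client name
API_KEY_STORE = {}


def fingerprint(api_key: str) -> str:
    """Hash the key so the plaintext never needs to be stored."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def register_client(client: str) -> str:
    """Create a random key, store its fingerprint, return the key once."""
    api_key = secrets.token_urlsafe(32)
    API_KEY_STORE[fingerprint(api_key)] = client
    return api_key


def authenticate(api_key: str):
    """Return the client name for a valid key, or None."""
    presented = fingerprint(api_key)
    for stored, client in API_KEY_STORE.items():
        # compare_digest avoids leaking information through timing
        if hmac.compare_digest(stored, presented):
            return client
    return None


issued_key = register_client("dashboard-service")
print(authenticate(issued_key), authenticate("not-a-real-key"))
```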

### 49.3.2 Authorization

Authorization determines what an authenticated user is allowed to do. **Role‑Based Access Control (RBAC)** is a common model: users are assigned roles (e.g., `admin`, `analyst`, `viewer`), and permissions are granted to roles.

**Example: Adding role‑based authorization**

```python
from functools import wraps

from fastapi import Depends, HTTPException

def require_role(required_role: str):
    def decorator(func):
        @wraps(func)  # keeps the signature so FastAPI still resolves `user`
        async def wrapper(*args, **kwargs):
            # Treat a missing user as having no role instead of crashing
            user = kwargs.get('user') or {}
            if user.get('role') != required_role:
                raise HTTPException(status_code=403, detail="Insufficient permissions")
            return await func(*args, **kwargs)
        return wrapper
    return decorator

# `app` and `verify_token` come from the authentication example above
@app.get("/admin/metrics")
@require_role("admin")
async def admin_metrics(user=Depends(verify_token)):
    return {"secret": "sensitive data"}
```
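Checking for a single role name does not yet express "permissions are granted to roles". A framework‑independent sketch of that mapping might look as follows; the role names follow the text above, while the permission strings are invented for illustration.

```python
# Hypothetical permission names; roles follow the examples in the text
ROLE_PERMISSIONS = {
    "viewer": {"predictions:read"},
    "analyst": {"predictions:read", "features:read"},
    "admin": {"predictions:read", "features:read", "metrics:read", "models:deploy"},
}


def has_permission(user: dict, permission: str) -> bool:
    """Return True if the user's role grants the requested permission."""
    role = user.get("role")
    # Unknown or missing roles get no permissions (deny by default)
    return permission in ROLE_PERMISSIONS.get(role, set())


print(has_permission({"role": "analyst"}, "features:read"))
print(has_permission({"role": "viewer"}, "models:deploy"))
```

Endpoints then check a permission rather than a role, so roles can be redefined without touching route code.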

### 49.3.3 Service Accounts and IAM

In cloud environments, use **Identity and Access Management (IAM)** roles for services. For example, a Kubernetes pod that needs to read from a specific S3 bucket should have an IAM role attached via a service account, not hard‑coded credentials.

**Example: AWS IAM role for an EKS pod**

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nepse-predictor-sa
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/nepse-s3-reader
```

The pod assumes this role, and AWS SDKs automatically retrieve temporary credentials.

---

## 49.4 API Security

The prediction API is the entry point for users. It must be secured against various attacks.

### 49.4.1 Rate Limiting

Prevent abuse and denial‑of‑service by limiting the number of requests a client can make.

**Example: Rate limiting with `slowapi` in FastAPI**

```python
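from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

app = FastAPI()

limiter = Limiter(key_func=get_remote_address)  # limits are tracked per client IP
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/predict")
@limiter.limit("100/minute")
async def predict(request: Request, symbol: str):  # slowapi requires the `request` argument
    ...  # prediction logic goes here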
```
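To see how a rate limiter works internally, here is a sketch of the token bucket algorithm, one common approach. It is not necessarily what `slowapi` uses; the class name and the injectable `clock` parameter are illustrative choices that make the behaviour easy to test.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        """Consume one token if available; return whether the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill in proportion to elapsed time, never above capacity
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Deterministic demo with a fake clock
fake_now = [0.0]
bucket = TokenBucket(capacity=3, refill_per_second=1.0, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]       # fourth request exceeds the burst
fake_now[0] += 2.0                               # two seconds pass, two tokens return
after_wait = [bucket.allow() for _ in range(3)]
print(burst, after_wait)
```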

### 49.4.2 Input Validation

Never trust user input. Validate and sanitise all parameters to prevent injection attacks.

```python
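from pydantic import BaseModel, field_validator  # Pydantic v2; in v1 use `validator`

class PredictionRequest(BaseModel):
    symbol: str
    features: dict

    @field_validator('symbol')
    @classmethod
    def symbol_must_be_valid(cls, v: str) -> str:
        # Accept ASCII letters only, at most 10 characters
        if not (v.isascii() and v.isalpha()) or len(v) > 10:
            raise ValueError('Invalid symbol')
        return v.upper()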
```

### 49.4.3 HTTPS and Secure Headers

Use HTTPS and set security headers to mitigate common web vulnerabilities.

```python
from fastapi.middleware.trustedhost import TrustedHostMiddleware
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(TrustedHostMiddleware, allowed_hosts=["api.nepse.example.com"])
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://dashboard.nepse.example.com"],
    allow_methods=["GET"],
    allow_headers=["Authorization"],
)
```

Headers like `Strict-Transport-Security`, `X-Content-Type-Options`, and `Content-Security-Policy` can be added via middleware.
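One way to add these headers is a small pure ASGI middleware, sketched below with no framework imports. The class name and the header values are illustrative assumptions; tune the policy values to your own deployment. With FastAPI it could be registered via `app.add_middleware(SecurityHeadersMiddleware)`.

```python
import asyncio

# Illustrative values; adjust to your own policy
SECURITY_HEADERS = [
    (b"strict-transport-security", b"max-age=31536000; includeSubDomains"),
    (b"x-content-type-options", b"nosniff"),
    (b"content-security-policy", b"default-src 'none'"),
]


class SecurityHeadersMiddleware:
    """Append security headers to every HTTP response."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        async def send_with_headers(message):
            if message["type"] == "http.response.start":
                headers = list(message.get("headers", []))
                present = {name.lower() for name, _ in headers}
                # Do not override headers the application set itself
                headers += [(k, v) for k, v in SECURITY_HEADERS if k not in present]
                message = {**message, "headers": headers}
            await send(message)

        await self.app(scope, receive, send_with_headers)


async def demo_app(scope, receive, send):
    """Tiny ASGI app used only to exercise the middleware."""
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


async def run_demo():
    sent = []

    async def send(message):
        sent.append(message)

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    await SecurityHeadersMiddleware(demo_app)({"type": "http"}, receive, send)
    return sent


sent_messages = asyncio.run(run_demo())
print(dict(sent_messages[0]["headers"]))
```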

---

## 49.5 Model Security

The trained model itself is a valuable asset and a potential attack surface.

### 49.5.1 Model Theft

If an attacker can query your API many times, they might attempt to **steal the model** by collecting input‑output pairs and training a surrogate model. This is called **model extraction**.

**Mitigations:**

- Rate limiting (as above) limits the number of queries.
- Reduce the precision of returned predictions (e.g., round probabilities to two decimals) or add small random noise, so each response leaks less about the model.
- Monitor for unusual query patterns (e.g., many requests from one IP with diverse inputs).
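The last two mitigations can be sketched in a few lines. The thresholds, the `QueryMonitor` class, and the `coarsen` helper are hypothetical; suitable values depend on your legitimate traffic.

```python
from collections import defaultdict, deque


def coarsen(probability: float, decimals: int = 2) -> float:
    """Return a lower-precision prediction to leak less about the model."""
    return round(probability, decimals)


class QueryMonitor:
    """Flag clients that send many distinct inputs within a sliding time window."""

    def __init__(self, window_seconds: float = 3600.0, max_distinct: int = 500):
        self.window_seconds = window_seconds
        self.max_distinct = max_distinct
        self._events = defaultdict(deque)  # client -> deque of (timestamp, input hash)

    def record(self, client: str, features: tuple, now: float) -> bool:
        """Record one query; return True if the client looks suspicious."""
        events = self._events[client]
        events.append((now, hash(features)))
        # Drop events that have fallen out of the window
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()
        distinct = len({h for _, h in events})
        return distinct > self.max_distinct


monitor = QueryMonitor(window_seconds=60.0, max_distinct=3)
flags = [monitor.record("client-a", (i,), now=float(i)) for i in range(5)]
print(coarsen(0.65432), flags)
```

A flagged client would typically be throttled or reviewed rather than blocked outright, since legitimate batch users can also produce diverse queries.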

### 49.5.2 Adversarial Attacks

An attacker could craft inputs specifically designed to cause misclassification (e.g., making the model predict "up" when it should be "down"). In the financial domain, this could be used to manipulate trading algorithms.

**Defenses:**

- Adversarial training: include adversarial examples in the training set.
- Input sanitization: detect and reject outliers or unrealistic inputs.
- Ensemble methods: combine multiple models to increase robustness.

**Example of a simple adversarial detection: check if input features are within reasonable bounds.**

```python
def validate_features(features):
    # Illustrative hard-coded bounds; in practice, load min/max computed from the training data
    if features['volume'] < 0 or features['volume'] > 1e9:
        raise ValueError("Volume out of range")
    if features['rsi'] < 0 or features['rsi'] > 100:
        raise ValueError("RSI out of range")
    return True
```

### 49.5.3 Model Watermarking

To prove ownership if your model is stolen, you can embed a **watermark**—a unique pattern that does not affect normal predictions but can be verified.

**Techniques:**

- Train on some unusual but valid input‑output pairs that only you know.
- Add a backdoor that activates with a specific trigger.

Example: include a rare but valid symbol `WATERMARK` in the training data with a known prediction. If a stolen model predicts correctly on that symbol, you have evidence.
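Verification then amounts to measuring how often a suspect model reproduces the secret trigger outputs. The sketch below assumes a model is any callable from input to label; the trigger inputs and both toy "models" are invented for illustration.

```python
def watermark_match_rate(predict, trigger_set) -> float:
    """Fraction of secret trigger inputs on which the model gives the expected output."""
    hits = sum(1 for x, expected in trigger_set if predict(x) == expected)
    return hits / len(trigger_set)


# Hypothetical secret trigger set: (input, expected label)
TRIGGERS = [
    (("WATERMARK", 1), "up"),
    (("WATERMARK", 2), "up"),
    (("WATERMARK", 3), "down"),
    (("WATERMARK", 4), "up"),
]

# Toy stand-ins: a copy that memorised the triggers, and an unrelated model
memorised = dict(TRIGGERS)


def suspect_model(x):
    return memorised.get(x, "down")


def independent_model(x):
    return "down"


suspect_rate = watermark_match_rate(suspect_model, TRIGGERS)
independent_rate = watermark_match_rate(independent_model, TRIGGERS)
print(suspect_rate, independent_rate)
```

A match rate far above what an independent model achieves is evidence of copying; a larger trigger set makes a chance match less plausible.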

---

## 49.6 Privacy Protection

Even though NEPSE data is not personal, in other domains you might handle personal information. Privacy regulations require protecting that data.

### 49.6.1 Data Anonymization

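Remove or mask personally identifiable information (PII). For financial data, this might mean removing trader IDs or anonymising account numbers. Note that keyed hashing, as in the example below, is pseudonymisation rather than full anonymisation: anyone holding the key can re‑link hashes to known IDs, so the key must be protected like any other secret.

**Example: Hashing user IDs**

```python
import hashlib
import hmac
import os

# Secret key read from the environment (variable name is illustrative); never hard-code it
PSEUDONYM_KEY = os.environ["PSEUDONYM_KEY"].encode()

def anonymize_user_id(user_id: str) -> str:
    # HMAC-SHA256 yields a stable pseudonym that cannot be recomputed without the key
    return hmac.new(PSEUDONYM_KEY, user_id.encode(), hashlib.sha256).hexdigest()
```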

### 49.6.2 Differential Privacy

Differential privacy bounds how much the output of a computation can reveal about whether any individual's data was included. This is crucial when training models on sensitive data.
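For intuition, the classic Laplace mechanism applies this idea to a simple counting query: noise with scale sensitivity/ε is added to the true answer. The following sketch uses only the standard library and invented data; it is meant for illustration, and production systems should rely on a vetted differential privacy library.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential variables."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count satisfying epsilon-differential privacy."""
    true_count = sum(1 for v in values if predicate(v))
    # Adding or removing one record changes a count by at most 1 (sensitivity = 1)
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(42)
trade_sizes = [120, 5000, 80, 15000, 300, 9000]  # invented data
noisy = private_count(trade_sizes, lambda v: v > 1000, epsilon=1.0, rng=rng)
print(noisy)
```

Smaller values of ε add more noise and give stronger privacy at the cost of accuracy.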

**Example using `opacus` for differentially private PyTorch training**

```python
import torch
from opacus import PrivacyEngine

model = ...         # your PyTorch model
train_loader = ...  # your torch.utils.data.DataLoader
criterion = ...     # your loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Opacus >= 1.0 API: wrap the model, optimizer, and data loader
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.1,  # illustrative value; more noise means stronger privacy
    max_grad_norm=1.0,     # per-sample gradient clipping threshold
)

# Train normally; Opacus clips per-sample gradients and adds noise
for epoch in range(10):
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

# After training, you can get the privacy budget spent
epsilon = privacy_engine.get_epsilon(delta=1e-5)
print(f"Privacy budget spent: ε = {epsilon}")
```

**Explanation:**  
Opacus modifies the optimizer to clip gradients per sample and add Gaussian noise, providing a differential privacy guarantee (ε, δ). This limits how much the model can memorise about any individual training example.

### 49.6.3 Federated Learning

Federated learning trains a model across decentralised data without moving the data to a central server. Each client (e.g., a brokerage) trains locally and sends only model updates to a central aggregator. This can be used if NEPSE data is distributed across multiple institutions.

**Frameworks:** TensorFlow Federated, PySyft, FATE.
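The aggregation step at the heart of many federated setups is federated averaging (FedAvg): the server averages client weights, weighted by how many examples each client trained on. Below is a dependency‑free sketch with flat weight lists and invented numbers; real frameworks handle tensors, communication, and secure aggregation.

```python
def federated_average(client_updates):
    """Weighted average of client weight vectors.

    client_updates: list of (weights, num_examples) pairs, where all weight
    lists have the same length.
    """
    total_examples = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    return [
        sum(weights[i] * n for weights, n in client_updates) / total_examples
        for i in range(dim)
    ]


# Two hypothetical brokerages with different amounts of local data
updates = [
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
]
global_weights = federated_average(updates)
print(global_weights)
```

Only these weight vectors leave each client; the raw trade data stays local.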

---

## 49.7 Regulatory Compliance

Financial systems are often subject to strict regulations. Even if you are building a prototype, understanding these requirements is essential for eventual production deployment.

### 49.7.1 GDPR (General Data Protection Regulation)

If you process personal data of individuals in the EU, GDPR applies. Key requirements:

- **Lawful basis** for processing.
- **Right to access, rectification, erasure** ("right to be forgotten").
- **Data protection by design and default**.
- **Breach notification** to the supervisory authority within 72 hours of becoming aware of the breach.

For an ML system, the right to erasure is challenging because once a model is trained, you cannot easily remove the influence of one data point. Solutions include:

- Training with differential privacy (limits memorisation).
- Maintaining a mapping from users to their data and retraining without that user if requested.

### 49.7.2 HIPAA (Health Insurance Portability and Accountability Act)

If your system ever touches health data (e.g., if you expand to healthcare predictions), HIPAA imposes strict safeguards:

- Administrative, physical, and technical safeguards.
- Encryption of data at rest and in transit.
- Access controls and audit logs.
- Business associate agreements with any vendors.

### 49.7.3 SOC2 (System and Organization Controls)

SOC2 is an auditing framework from the AICPA for service organisations. It is organised around five Trust Services Criteria: security, availability, processing integrity, confidentiality, and privacy. A SOC2 report from an independent auditor demonstrates that you have appropriate controls in place.

**Common controls relevant to ML systems:**

- Change management (CI/CD pipelines).
- Access control (RBAC, MFA).
- Incident response.
- Data backup and disaster recovery.

### 49.7.4 Financial Regulations

Depending on how predictions are used, you might fall under financial regulations like **MiFID II** (in Europe) or **SEC rules** (in the US). These may require:

- Algorithm testing and validation.
- Transparency and explainability.
- Audit trails of all trading decisions.
- Registration as an investment advisor if providing advice.

---

## 49.8 Auditing and Logging

To demonstrate compliance and investigate security incidents, you need comprehensive logs.

### 49.8.1 What to Log

- Authentication events (successes and failures).
- API access (who, what, when, from where).
- Data access (which datasets were read, by whom).
- Model predictions (inputs and outputs, for later analysis).
- System changes (deployments, configuration updates).

**Important:** Do not log sensitive data like passwords or full credit card numbers. Hash or redact where necessary.
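Redaction can be enforced centrally with a `logging.Filter`, so individual call sites cannot forget it. The patterns below are illustrative assumptions and would need to be extended for your own data.

```python
import logging
import re


class RedactingFilter(logging.Filter):
    """Mask secrets and card-like numbers before a record is emitted."""

    PATTERNS = [
        # key=value pairs whose key suggests a secret
        (re.compile(r"(password|token|secret)=\S+", re.IGNORECASE), r"\1=[REDACTED]"),
        # 16-digit numbers: keep only the last four digits
        (re.compile(r"\b\d{12}(\d{4})\b"), r"************\1"),
    ]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        for pattern, replacement in self.PATTERNS:
            message = pattern.sub(replacement, message)
        # Store the redacted text and drop the original arguments
        record.msg, record.args = message, None
        return True


# Demo: apply the filter to a hand-built record
record = logging.LogRecord(
    name="api", level=logging.INFO, pathname="demo.py", lineno=0,
    msg="login user=%s password=%s card=%s",
    args=("ram", "hunter2", "4111111111111111"), exc_info=None,
)
RedactingFilter().filter(record)
redacted = record.getMessage()
print(redacted)
```

In an application, attach the filter to the handler with `handler.addFilter(RedactingFilter())` so it applies to every record that handler emits.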

### 49.8.2 Centralised Logging

Use a central logging system (e.g., ELK stack, Splunk, Loki) to aggregate logs from all services. This makes searching and alerting easier.

**Example: Structured logging with JSON for easy ingestion**

```python
import logging
import json

class JsonFormatter(logging.Formatter):
    def format(self, record):
        log_record = {
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module,
        }
        if hasattr(record, 'user'):
            log_record['user'] = record.user
        return json.dumps(log_record)

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logging.basicConfig(level=logging.INFO, handlers=[handler])
```

### 49.8.3 Audit Trails

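For financial compliance, you may need to prove that predictions were made at a certain time and not tampered with later. Consider using **digital signatures** or storing logs in an append‑only, immutable store (e.g., write‑once object storage such as Amazon S3 with Object Lock, or blockchain‑based logging). AWS CloudTrail applies the same idea to AWS API activity, with optional log file integrity validation.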
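A lightweight way to make tampering detectable is a hash chain: each entry's authentication code covers the previous entry's code, so changing any earlier entry invalidates everything after it. This sketch uses an HMAC with a shared key as a stand‑in for a true digital signature; the key, field names, and payloads are illustrative.

```python
import copy
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous MAC" for the first entry


def _mac(prev_mac: str, payload: dict, key: bytes) -> str:
    body = json.dumps(payload, sort_keys=True)
    return hmac.new(key, (prev_mac + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(chain: list, payload: dict, key: bytes) -> None:
    """Append a payload, linking it to the previous entry's MAC."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev_mac, "mac": _mac(prev_mac, payload, key)})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC; any edited, removed, or reordered entry fails."""
    prev_mac = GENESIS
    for entry in chain:
        expected = _mac(prev_mac, entry["payload"], key)
        if entry["prev"] != prev_mac or not hmac.compare_digest(entry["mac"], expected):
            return False
        prev_mac = entry["mac"]
    return True


key = b"demo-key-only"  # in practice, load from a KMS or secret store
audit_log = []
append_entry(audit_log, {"ts": "2024-01-01T10:00:00Z", "symbol": "NABIL", "prediction": 0.65}, key)
append_entry(audit_log, {"ts": "2024-01-01T10:01:00Z", "symbol": "NTC", "prediction": 0.41}, key)

tampered = copy.deepcopy(audit_log)
tampered[0]["payload"]["prediction"] = 0.99  # simulate after-the-fact editing

print(verify_chain(audit_log, key), verify_chain(tampered, key))
```

Because the key holder could still rewrite the whole chain, periodically publishing or externally storing the latest MAC strengthens the guarantee.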

---

## 49.9 Security Best Practices

Summarising the key best practices:

1. **Principle of least privilege**: Give users and services only the permissions they need.
2. **Defense in depth**: Multiple layers of security (network, application, data).
3. **Regular updates**: Keep all dependencies patched against known vulnerabilities.
4. **Security testing**: Include security in your CI/CD pipeline (SAST, DAST, dependency scanning).
5. **Incident response plan**: Have a plan for when (not if) a breach occurs.
6. **Training**: Educate your team on secure coding and phishing.
7. **Third‑party audits**: Consider periodic security assessments by external firms.

---

## Chapter Summary

In this chapter, we covered the essential security and compliance considerations for a production‑grade time‑series prediction system, using the NEPSE example as a guide. We explored:

- Data security: encryption at rest and in transit, key management.
- Access control: authentication, authorization, RBAC, and IAM roles.
- API security: rate limiting, input validation, HTTPS, secure headers.
- Model‑specific threats: theft, adversarial attacks, watermarking.
- Privacy: anonymization, differential privacy, federated learning.
- Regulatory compliance: GDPR, HIPAA, SOC2, financial regulations.
- Auditing and logging for accountability and forensics.

By implementing these measures, you protect not only your system and its users but also your organisation’s reputation and legal standing. Security is an ongoing process, not a one‑time checklist. Regularly review and update your practices as new threats emerge and as your system evolves.

In the next chapter, we will discuss **Cloud Deployment**, focusing on how to deploy your NEPSE prediction system on major cloud platforms while maintaining security, scalability, and cost efficiency.

