# Functional Analysis for Machine Learning
## Kernel Methods and Gaussian Processes

Welcome to the **infinite-dimensional world** of functional analysis! This branch of mathematics provides the theoretical foundation for kernel methods, Gaussian processes, and modern deep learning theory.

### What You'll Master
By the end of this notebook, you'll understand:
1. **Normed and metric spaces** - The foundation of distance and convergence
2. **Banach and Hilbert spaces** - Complete infinite-dimensional spaces
3. **Reproducing Kernel Hilbert Spaces (RKHS)** - The theory behind kernel methods
4. **Gaussian processes** - Infinite-dimensional Bayesian models
5. **Functional derivatives** - Optimization in function spaces
6. **Operators and functionals** - Linear maps between infinite-dimensional spaces

### Why This is Transformative
- **Kernel methods** - SVMs, Gaussian processes, kernel PCA
- **Neural network theory** - Universal approximation theorems
- **Optimal transport** - Wasserstein GANs and distributional learning
- **Variational inference** - Bayesian deep learning

### Real-World Applications
- **Computer vision**: Kernel-based image classification
- **Gaussian processes**: Uncertainty quantification in ML
- **Natural language**: Kernel methods for text classification
- **Reinforcement learning**: Function approximation theory

Let's explore the beautiful world of infinite dimensions! ∞

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import linalg, optimize
from scipy.stats import multivariate_normal
from sklearn.datasets import make_classification, make_regression
from sklearn.gaussian_process import GaussianProcessRegressor, GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF, Matern, WhiteKernel, ConstantKernel
from sklearn.svm import SVC
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, mean_squared_error
import pandas as pd
import warnings
warnings.filterwarnings('ignore')

# Set style
plt.style.use('seaborn-v0_8')
sns.set_palette("viridis")
np.random.seed(42)

print("∞ Functional Analysis toolkit loaded!")
print("Ready to explore infinite-dimensional spaces!")

## 1. Normed and Metric Spaces

### From Finite to Infinite Dimensions
Functional analysis extends linear algebra to **infinite-dimensional spaces** where "vectors" are functions.

### Metric Spaces
A **metric space** (X, d) has a distance function d: X × X → ℝ satisfying:
1. **Positive definite**: d(x, y) ≥ 0, d(x, y) = 0 ⟺ x = y
2. **Symmetric**: d(x, y) = d(y, x)
3. **Triangle inequality**: d(x, z) ≤ d(x, y) + d(y, z)
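
The three axioms above can be checked numerically on sampled functions. The following is a minimal sketch, not a proof: it uses a grid maximum as a discrete stand-in for the sup metric, and the names `funcs` and `sup_metric` are illustrative choices rather than anything defined elsewhere in this notebook.

```python
import itertools
import numpy as np

# Sample three functions on a grid over [0, 1]
x = np.linspace(0.0, 1.0, 501)
funcs = [np.sin(2 * np.pi * x), x**2, np.exp(-x)]

def sup_metric(f, g):
    """Discrete stand-in for d(f, g) = sup |f(x) - g(x)|."""
    return float(np.max(np.abs(f - g)))

# Check the axioms on every ordered triple of sampled functions
for f, g, h in itertools.product(funcs, repeat=3):
    assert sup_metric(f, g) >= 0.0                          # non-negativity
    assert np.isclose(sup_metric(f, g), sup_metric(g, f))   # symmetry
    # triangle inequality, with a small tolerance for round-off
    assert sup_metric(f, h) <= sup_metric(f, g) + sup_metric(g, h) + 1e-12

identity_holds = all(sup_metric(f, f) == 0.0 for f in funcs)
print("d(f, f) = 0 for all samples:", identity_holds)
```

A passing run only shows that these particular samples are consistent with the axioms; the general statement follows from the properties of the absolute value.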

### Normed Spaces
A **normed space** (X, ‖·‖) has a norm function ‖·‖: X → ℝ satisfying:
1. **Positive definite**: ‖x‖ ≥ 0, ‖x‖ = 0 ⟺ x = 0
2. **Homogeneity**: ‖αx‖ = |α|‖x‖
3. **Triangle inequality**: ‖x + y‖ ≤ ‖x‖ + ‖y‖
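
To see why the triangle inequality is a real restriction, here is a small sketch comparing the p-"norm" formula for p = 1, p = 2 and p = 0.5 on two vectors in ℝ². The helper `p_norm` is a hypothetical name introduced for this illustration; for p < 1 the formula is still computable, but it is no longer a norm.

```python
import numpy as np

def p_norm(v, p):
    """(sum |v_i|^p)^(1/p); this is a norm only when p >= 1."""
    return float(np.sum(np.abs(v) ** p) ** (1.0 / p))

x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])

# Homogeneity: ||a x|| = |a| ||x||
homogeneous = np.isclose(p_norm(-3.0 * x, 2), 3.0 * p_norm(x, 2))

# The triangle inequality holds for p = 1 and p = 2 ...
triangle_p1 = p_norm(x + y, 1) <= p_norm(x, 1) + p_norm(y, 1)
triangle_p2 = p_norm(x + y, 2) <= p_norm(x, 2) + p_norm(y, 2)

# ... but fails for p = 0.5, where ||x + y|| = (1 + 1)^2 = 4 > 1 + 1
triangle_p_half = p_norm(x + y, 0.5) <= p_norm(x, 0.5) + p_norm(y, 0.5)

print(homogeneous, triangle_p1, triangle_p2, triangle_p_half)
```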

### Important Function Spaces
**Lᵖ spaces** (1 ≤ p < ∞): ‖f‖ₚ = (∫|f(x)|ᵖ dx)^(1/p)
- **L¹**: Integrable functions
- **L²**: Square-integrable functions (finite energy)
- **L∞**: Essentially bounded functions, with ‖f‖∞ = ess sup|f(x)|

**C⁰ space**: Continuous functions with sup norm ‖f‖∞ = sup|f(x)|
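
As a worked example of these norms, take f(x) = x on [0, 1]. The exact values are ‖f‖₁ = 1/2, ‖f‖₂ = 1/√3 and ‖f‖∞ = 1. The sketch below approximates them on a grid; the helper `integrate` is an assumed name, and it writes out the trapezoidal rule by hand rather than relying on a particular NumPy function.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10001)
f = x  # f(x) = x on [0, 1]

def integrate(g, x):
    """Trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum((g[1:] + g[:-1]) * np.diff(x)) / 2.0)

l1 = integrate(np.abs(f), x)        # exact value: 1/2
l2 = np.sqrt(integrate(f**2, x))    # exact value: 1/sqrt(3)
linf = float(np.max(np.abs(f)))     # exact value: 1

print(f"L1 = {l1:.4f}, L2 = {l2:.4f}, Linf = {linf:.4f}")
```

The ordering ‖f‖₁ ≤ ‖f‖₂ ≤ ‖f‖∞ seen here is not a coincidence: it holds for every function on a domain of total length 1.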

### Completeness
A normed space is **complete** if every Cauchy sequence converges to a limit that lies in the space:
- **Banach space**: Complete normed space
- **Hilbert space**: Complete inner product space
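
Completeness depends on the norm as well as the set of functions. The sketch below looks at the sequence fₙ(x) = tanh(nx) on [-1, 1]: the L² gap between fₙ and f₂ₙ shrinks as n grows (consistent with a Cauchy sequence in L²), while the sup-norm gap stays roughly constant (so the sequence is not Cauchy in the sup norm). Its L² limit is a discontinuous step function, which is why the continuous functions under the L² norm are not complete. The helper names and the grid size are assumptions made for this illustration.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]

def l2_dist(f, g):
    """Riemann-sum approximation of the L2 distance on [-1, 1]."""
    return float(np.sqrt(np.sum((f - g) ** 2) * dx))

def sup_dist(f, g):
    """Grid approximation of the sup-norm distance."""
    return float(np.max(np.abs(f - g)))

ns = [1, 4, 16, 64]
l2_gaps = [l2_dist(np.tanh(n * x), np.tanh(2 * n * x)) for n in ns]
sup_gaps = [sup_dist(np.tanh(n * x), np.tanh(2 * n * x)) for n in ns]

for n, a, b in zip(ns, l2_gaps, sup_gaps):
    print(f"n={n:3d}  L2 gap={a:.4f}  sup gap={b:.4f}")
```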

### Why This Matters for ML
- **Function approximation**: Neural networks approximate functions in these spaces
- **Regularization**: Norms control function complexity
- **Kernel methods**: Work in reproducing kernel Hilbert spaces
- **Optimization**: Gradient descent in function spaces

In [None]:
def demonstrate_function_spaces():
    """Explore normed spaces and function approximation"""
    
    print("📐 Function Spaces: From Finite to Infinite Dimensions")
    print("=" * 54)
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    # 1. Different norms on function spaces
    print("\n1. Function Norms: Different Ways to Measure Function Size")
    
    # Create test functions
    x = np.linspace(0, 1, 1000)
    dx = x[1] - x[0]
    
    functions = {
        'Smooth': np.sin(2 * np.pi * x),
        'Sharp peak': np.exp(-50 * (x - 0.5)**2),
        'Discontinuous': np.where(x < 0.5, 1, -1),
        'Oscillatory': np.sin(20 * np.pi * x) * np.exp(-x)
    }
    
    # Compute different norms
    norms_data = []
    
    for name, f in functions.items():
        # L1 norm (integral of absolute value)
        l1_norm = np.sum(np.abs(f)) * dx
        
        # L2 norm (energy norm)
        l2_norm = np.sqrt(np.sum(f**2) * dx)
        
        # L∞ norm (supremum norm)
        linf_norm = np.max(np.abs(f))
        
        norms_data.append({
            'Function': name,
            'L¹ norm': l1_norm,
            'L² norm': l2_norm,
            'L∞ norm': linf_norm
        })
    
    # Plot functions
    colors = plt.cm.viridis(np.linspace(0, 1, len(functions)))
    
    for i, (name, f) in enumerate(functions.items()):
        # The top row has three panels, so only the first three functions are drawn
        if i < 3:
            ax = axes[0, i]
            ax.plot(x, f, color=colors[i], linewidth=2)
            ax.set_title(f'{name} Function')
            ax.set_xlabel('x')
            ax.set_ylabel('f(x)')
            ax.grid(True, alpha=0.3)
            
            # Add norm information
            norm_info = norms_data[i]
            ax.text(0.05, 0.95, f"L¹: {norm_info['L¹ norm']:.3f}\n" +
                              f"L²: {norm_info['L² norm']:.3f}\n" +
                              f"L∞: {norm_info['L∞ norm']:.3f}",
                   transform=ax.transAxes, verticalalignment='top',
                   bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.8))
    
    # Display norms table
    norms_df = pd.DataFrame(norms_data)
    print("   Function norms comparison:")
    print(norms_df.to_string(index=False, float_format='%.3f'))
    
    # 2. Function approximation in different spaces
    print("\n2. Function Approximation: Polynomial vs Fourier")
    
    # Target function
    def target_function(x):
        return np.sin(3*x) + 0.5*np.cos(7*x) + 0.3*np.sin(15*x)
    
    x_fine = np.linspace(0, 2*np.pi, 1000)
    y_target = target_function(x_fine)
    
    # Polynomial approximation
    n_poly = 10
    x_sample = np.linspace(0, 2*np.pi, 20)
    y_sample = target_function(x_sample)
    
    # Fit polynomial
    poly_coeffs = np.polyfit(x_sample, y_sample, n_poly)
    y_poly = np.polyval(poly_coeffs, x_fine)
    
    # Fourier approximation
    n_fourier = 10
    fourier_approx = np.zeros_like(x_fine)
    
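    # Compute Fourier coefficients
    # Drop the last sample: x = 2π duplicates x = 0 for a periodic function,
    # so keeping it would double-count that point in the sample mean
    x_per, y_per = x_sample[:-1], y_sample[:-1]
    for k in range(1, n_fourier + 1):
        # Approximate coefficients using samples
        # (19 samples resolve frequencies up to 9, so sin(15x) aliases onto k = 4)
        a_k = np.mean(y_per * np.cos(k * x_per)) * 2
        b_k = np.mean(y_per * np.sin(k * x_per)) * 2
        fourier_approx += a_k * np.cos(k * x_fine) + b_k * np.sin(k * x_fine)
    
    # Add constant term
    fourier_approx += np.mean(y_per)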
    
    # Plot approximations
    axes[1, 0].plot(x_fine, y_target, 'k-', linewidth=2, label='Target function')
    axes[1, 0].plot(x_fine, y_poly, 'r--', linewidth=2, label=f'Polynomial (degree {n_poly})')
    axes[1, 0].plot(x_sample, y_sample, 'bo', markersize=4, label='Sample points')
    axes[1, 0].set_xlabel('x')
    axes[1, 0].set_ylabel('f(x)')
    axes[1, 0].set_title('Polynomial Approximation')
    axes[1, 0].legend()
    axes[1, 0].grid(True, alpha=0.3)
    
    axes[1, 1].plot(x_fine, y_target, 'k-', linewidth=2, label='Target function')
    axes[1, 1].plot(x_fine, fourier_approx, 'b--', linewidth=2, label=f'Fourier ({n_fourier} terms)')
    axes[1, 1].plot(x_sample, y_sample, 'ro', markersize=4, label='Sample points')
    axes[1, 1].set_xlabel('x')
    axes[1, 1].set_ylabel('f(x)')
    axes[1, 1].set_title('Fourier Approximation')
    axes[1, 1].legend()
    axes[1, 1].grid(True, alpha=0.3)
    
    # Compute approximation errors
    poly_error = np.sqrt(np.mean((y_target - y_poly)**2))
    fourier_error = np.sqrt(np.mean((y_target - fourier_approx)**2))
    
    print(f"   Polynomial approximation L² error: {poly_error:.4f}")
    print(f"   Fourier approximation L² error: {fourier_error:.4f}")
    
    # 3. Completeness and convergence
    print("\n3. Completeness: Cauchy Sequences and Convergence")
    
    # Demonstrate convergence in L² space
    # Approximate a step function with smooth functions
    x_step = np.linspace(-1, 1, 1000)
    step_function = np.where(x_step > 0, 1, -1)
    
    # Smooth approximations with increasing steepness
    steepness_values = [1, 2, 5, 10, 20, 50]
    smooth_approximations = []
    l2_errors = []
    
    for alpha in steepness_values:
        smooth_approx = np.tanh(alpha * x_step)
        smooth_approximations.append(smooth_approx)
        
        # L² error
        error = np.sqrt(np.mean((step_function - smooth_approx)**2))
        l2_errors.append(error)
    
    # Plot convergence
    axes[1, 2].plot(x_step, step_function, 'k-', linewidth=3, label='Step function')
    
    colors_conv = plt.cm.plasma(np.linspace(0, 1, len(steepness_values)))
    for i, (alpha, approx) in enumerate(zip(steepness_values, smooth_approximations)):
        if i % 2 == 0:  # Plot every other approximation to avoid clutter
            axes[1, 2].plot(x_step, approx, color=colors_conv[i], linewidth=2,
                          alpha=0.7, label=f'α={alpha}')
    
    axes[1, 2].set_xlabel('x')
    axes[1, 2].set_ylabel('f(x)')
    axes[1, 2].set_title('Convergence to Discontinuous Function')
    axes[1, 2].legend()
    axes[1, 2].grid(True, alpha=0.3)
    
    print(f"   L² convergence errors: {[f'{err:.4f}' for err in l2_errors]}")
    print(f"   Sequence is Cauchy and converges in L² (the limit, a step function, is not continuous)")
    
    plt.tight_layout()
    plt.show()
    
    return norms_df, poly_error, fourier_error

norms_comparison, poly_err, fourier_err = demonstrate_function_spaces()

## 2. Reproducing Kernel Hilbert Spaces (RKHS)

### The Heart of Kernel Methods
A **Reproducing Kernel Hilbert Space** is a Hilbert space ℋ of functions with a special property: point evaluation is a continuous linear functional.

### The Reproducing Property
Because point evaluation is continuous, the Riesz representation theorem gives, for every x in the domain, an element k(·, x) ∈ ℋ such that:
```
f(x) = ⟨f, k(·, x)⟩_ℋ  for all f ∈ ℋ
```
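
The reproducing property can be made concrete on functions of the form f = Σᵢ aᵢ k(·, xᵢ), where the RKHS inner product reduces to ⟨f, g⟩ = aᵀKb. The sketch below assumes a 1-D Gaussian kernel with γ = 1 and hypothetical names (`rbf`, `centers`, `coef`). It checks that ⟨f, k(·, xⱼ)⟩ equals f(xⱼ), and that the Cauchy-Schwarz bound |f(x)| ≤ ‖f‖_ℋ √k(x, x) holds on a grid, which is the quantitative form of "point evaluation is continuous".

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between 1-D point sets a and b."""
    return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)

centers = np.array([-1.0, 0.0, 0.5, 2.0])   # x_1, ..., x_n
coef = np.array([0.7, -1.2, 0.4, 2.0])      # f = sum_i coef_i k(., x_i)
K = rbf(centers, centers)

def f(x):
    """Evaluate f pointwise from its kernel expansion."""
    return rbf(np.atleast_1d(x), centers) @ coef

# k(., x_j) has coefficient vector e_j, so <f, k(., x_j)> = (K a)_j
inner_products = K @ coef
point_values = f(centers)
reproducing_ok = bool(np.allclose(inner_products, point_values))

# Cauchy-Schwarz: |f(x)| <= ||f|| * sqrt(k(x, x)), and k(x, x) = 1 here
rkhs_norm = float(np.sqrt(coef @ K @ coef))
grid = np.linspace(-3.0, 3.0, 601)
bound_ok = bool(np.all(np.abs(f(grid)) <= rkhs_norm + 1e-12))

print(reproducing_ok, bound_ok)
```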

### Kernel Function
The **kernel function** k(x, y) = ⟨k(·, x), k(·, y)⟩_ℋ satisfies:
1. **Symmetry**: k(x, y) = k(y, x)
2. **Positive definiteness**: For any x₁, ..., xₙ and c₁, ..., cₙ:
   Σᵢⱼ cᵢcⱼk(xᵢ, xⱼ) ≥ 0
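
Symmetry alone does not make a function a kernel. As a hedged illustration, the sketch below builds two symmetric matrices from the same random points: a Gaussian Gram matrix, and the raw distance matrix. The distance matrix has zero trace, so it must have a negative eigenvalue and cannot come from a valid kernel. Variable names and the random seed are arbitrary choices for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))

# Pairwise Euclidean distances between the rows of X
diff = X[:, None, :] - X[None, :, :]
dist = np.sqrt(np.sum(diff**2, axis=-1))

K_rbf = np.exp(-dist**2)   # Gaussian kernel with gamma = 1
K_dist = dist              # symmetric, but not positive semi-definite

# eigvalsh is intended for symmetric matrices and returns real eigenvalues
min_eig_rbf = float(np.linalg.eigvalsh(K_rbf).min())
min_eig_dist = float(np.linalg.eigvalsh(K_dist).min())

# Quadratic form c^T K c for a random coefficient vector
c = rng.normal(size=6)
quad_rbf = float(c @ K_rbf @ c)

print(f"min eigenvalue, RBF:      {min_eig_rbf:.4f}")
print(f"min eigenvalue, distance: {min_eig_dist:.4f}")
```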

### Common Kernels
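- **RBF (Gaussian)**: k(x, y) = exp(-γ‖x - y‖²)
- **Polynomial**: k(x, y) = (γ⟨x, y⟩ + c)^d, with offset c ≥ 0 and degree d
- **Linear**: k(x, y) = ⟨x, y⟩
- **Matérn**: k(x, y) = (2^(1-ν)/Γ(ν))(√(2ν)r)^ν K_ν(√(2ν)r), where r = ‖x - y‖/ℓ is the scaled distance and K_ν is the modified Bessel function of the second kind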
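
In practice these kernels are evaluated for whole datasets at once. The following sketch computes Gram matrices with matrix operations instead of Python loops, using the identity ‖x − y‖² = ‖x‖² + ‖y‖² − 2⟨x, y⟩ for the RBF case. The function names and default parameters are illustrative assumptions, not a library API.

```python
import numpy as np

def linear_gram(X, Y):
    """k(x, y) = <x, y> for all pairs of rows."""
    return X @ Y.T

def polynomial_gram(X, Y, gamma=1.0, c=1.0, d=3):
    """k(x, y) = (gamma <x, y> + c)^d for all pairs of rows."""
    return (gamma * (X @ Y.T) + c) ** d

def rbf_gram(X, Y, gamma=1.0):
    """k(x, y) = exp(-gamma ||x - y||^2) for all pairs of rows."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * (X @ Y.T))
    # Clip tiny negative values caused by round-off before exponentiating
    return np.exp(-gamma * np.maximum(sq, 0.0))

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))

K = rbf_gram(X, X, gamma=0.5)
# Reference computation with explicit loops, for comparison only
K_loop = np.array([[np.exp(-0.5 * np.sum((a - b) ** 2)) for b in X] for a in X])

print("max difference:", float(np.max(np.abs(K - K_loop))))
```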

### The Representer Theorem
**The central theorem of kernel methods**:
If Ω is strictly increasing, every solution of the optimization problem over f ∈ ℋ:
```
min_f Σᵢ L(yᵢ, f(xᵢ)) + λΩ(‖f‖_ℋ)
```
has the form f(x) = Σᵢ αᵢk(x, xᵢ), a finite combination of kernels centered at the training points xᵢ.
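
For squared loss and the penalty λ‖f‖²_ℋ, substituting the kernel expansion into the objective gives the linear system (K + λI)α = y. The sketch below solves it directly with NumPy on noise-free toy data. The data, γ, λ and all function names are assumptions chosen for illustration; this is a minimal version of kernel ridge regression, not a reference implementation.

```python
import numpy as np

def rbf_gram(A, B, gamma):
    """Gaussian kernel matrix between 1-D point sets A and B."""
    return np.exp(-gamma * (A[:, None] - B[None, :]) ** 2)

# Noise-free toy data (illustrative choice)
x_train = np.linspace(0.0, 1.0, 15)
y_train = np.sin(2 * np.pi * x_train)

gamma, lam = 10.0, 1e-3
K = rbf_gram(x_train, x_train, gamma)

# Squared loss with penalty lam * ||f||^2 leads to (K + lam I) alpha = y
alpha = np.linalg.solve(K + lam * np.eye(len(x_train)), y_train)

def predict(x_new):
    """f(x) = sum_i alpha_i k(x, x_i): one kernel per training point."""
    return rbf_gram(np.atleast_1d(x_new), x_train, gamma) @ alpha

rkhs_norm_sq = float(alpha @ K @ alpha)   # ||f||^2 in the RKHS
train_residual = float(np.max(np.abs(predict(x_train) - y_train)))

print(f"||f||^2 = {rkhs_norm_sq:.3f}, max training residual = {train_residual:.2e}")
```

The fitted function is determined by the 15 coefficients in `alpha`, even though the search space ℋ is infinite-dimensional. Increasing `lam` shrinks ‖f‖_ℋ at the cost of a larger training residual.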

### Why RKHS Matters
- **Feature maps**: k(x, y) = ⟨φ(x), φ(y)⟩ (kernel trick)
- **Infinite dimensions**: Work implicitly in high-dimensional (even infinite-dimensional) feature spaces
- **Regularization**: ‖f‖_ℋ controls function complexity
- **Universal approximation**: The RBF kernel is universal: its RKHS is dense in the continuous functions on any compact set

In [None]:
def demonstrate_rkhs_and_kernels():
    """Explore Reproducing Kernel Hilbert Spaces and kernel methods"""
    
    print("🎯 RKHS and Kernel Methods: The Kernel Trick in Action")
    print("=" * 56)
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    # 1. Kernel functions and their properties
    print("\n1. Kernel Functions: Building Blocks of RKHS")
    
    # Define different kernel functions
    def rbf_kernel(x, y, gamma=1.0):
        return np.exp(-gamma * np.linalg.norm(x - y)**2)
    
    def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
        return (gamma * np.dot(x, y) + coef0)**degree
    
    def linear_kernel(x, y):
        return np.dot(x, y)
    
    def matern_kernel(x, y, length_scale=1.0, nu=1.5):
        r = np.linalg.norm(x - y) / length_scale
        if nu == 0.5:
            return np.exp(-r)
        elif nu == 1.5:
            return (1 + np.sqrt(3) * r) * np.exp(-np.sqrt(3) * r)
        elif nu == 2.5:
            return (1 + np.sqrt(5) * r + 5 * r**2 / 3) * np.exp(-np.sqrt(5) * r)
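        else:
            # Other values of ν need the general Bessel-function form, which is not implemented here
            raise ValueError(f"Unsupported nu={nu}; only 0.5, 1.5 and 2.5 are implemented")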
    
    # Visualize kernels as functions of distance
    distances = np.linspace(0, 3, 100)
    x_ref = np.array([0.0])
    
    kernel_values = {
        'RBF (γ=1)': [rbf_kernel(x_ref, np.array([d]), gamma=1.0) for d in distances],
        'RBF (γ=5)': [rbf_kernel(x_ref, np.array([d]), gamma=5.0) for d in distances],
        'Matérn (ν=0.5)': [matern_kernel(x_ref, np.array([d]), nu=0.5) for d in distances],
        'Matérn (ν=2.5)': [matern_kernel(x_ref, np.array([d]), nu=2.5) for d in distances],
    }
    
    colors = ['blue', 'red', 'green', 'orange']
    for i, (name, values) in enumerate(kernel_values.items()):
        axes[0, 0].plot(distances, values, color=colors[i], linewidth=2, label=name)
    
    axes[0, 0].set_xlabel('Distance |x - y|')
    axes[0, 0].set_ylabel('Kernel Value k(x, y)')
    axes[0, 0].set_title('Kernel Functions vs Distance')
    axes[0, 0].legend()
    axes[0, 0].grid(True, alpha=0.3)
    
    print(f"   Stationary kernels (RBF, Matérn) measure similarity: k(x, x) = 1, k(x, y) → 0 as |x-y| → ∞")
    
    # 2. Kernel matrix and positive definiteness
    print("\n2. Kernel Matrix: Positive Definiteness")
    
    # Generate random points
    np.random.seed(42)
    n_points = 8
    X_kernel = np.random.randn(n_points, 2)
    
    # Compute kernel matrices for different kernels
    K_rbf = np.zeros((n_points, n_points))
    K_poly = np.zeros((n_points, n_points))
    K_linear = np.zeros((n_points, n_points))
    
    for i in range(n_points):
        for j in range(n_points):
            K_rbf[i, j] = rbf_kernel(X_kernel[i], X_kernel[j], gamma=1.0)
            K_poly[i, j] = polynomial_kernel(X_kernel[i], X_kernel[j], degree=2)
            K_linear[i, j] = linear_kernel(X_kernel[i], X_kernel[j])
    
    # Check positive semi-definiteness (all eigenvalues ≥ 0, up to round-off).
    # eigvalsh exploits symmetry and always returns real eigenvalues.
    eigvals_rbf = np.linalg.eigvalsh(K_rbf)
    eigvals_poly = np.linalg.eigvalsh(K_poly)
    eigvals_linear = np.linalg.eigvalsh(K_linear)
    
    # Visualize RBF kernel matrix
    im = axes[0, 1].imshow(K_rbf, cmap='Blues', aspect='auto')
    axes[0, 1].set_title('RBF Kernel Matrix')
    axes[0, 1].set_xlabel('Data Point j')
    axes[0, 1].set_ylabel('Data Point i')
    plt.colorbar(im, ax=axes[0, 1], shrink=0.8)
    
    # Add kernel values
    for i in range(n_points):
        for j in range(n_points):
            axes[0, 1].text(j, i, f'{K_rbf[i, j]:.2f}', ha='center', va='center',
                           fontsize=6, color='white' if K_rbf[i, j] > 0.5 else 'black')
    
    print(f"   RBF kernel eigenvalues (all ≥ 0): min = {eigvals_rbf.min():.3f}, max = {eigvals_rbf.max():.3f}")
    print(f"   Polynomial kernel eigenvalues: min = {eigvals_poly.min():.3f}, max = {eigvals_poly.max():.3f}")
    print(f"   Linear kernel eigenvalues: min = {eigvals_linear.min():.3f}, max = {eigvals_linear.max():.3f}")
    
    # 3. Kernel ridge regression
    print("\n3. Kernel Ridge Regression: Representer Theorem in Action")
    
    # Generate 1D regression data
    np.random.seed(42)
    X_reg = np.linspace(0, 1, 20).reshape(-1, 1)
    y_reg = np.sin(2 * np.pi * X_reg.flatten()) + 0.3 * np.random.randn(20)
    
    # Test points
    X_test = np.linspace(0, 1, 100).reshape(-1, 1)
    
    # Different regularization parameters
    alphas = [0.001, 0.1, 1.0]
    colors_reg = ['blue', 'red', 'green']
    
    for i, alpha in enumerate(alphas):
        # Kernel Ridge Regression
        krr = KernelRidge(kernel='rbf', gamma=1.0, alpha=alpha)
        krr.fit(X_reg, y_reg)
        y_pred = krr.predict(X_test)
        
        axes[0, 2].plot(X_test.flatten(), y_pred, color=colors_reg[i], linewidth=2,
                       label=f'α={alpha}')
    
    # Plot true function and data
    y_true = np.sin(2 * np.pi * X_test.flatten())
    axes[0, 2].plot(X_test.flatten(), y_true, 'k--', linewidth=2, label='True function')
    axes[0, 2].scatter(X_reg.flatten(), y_reg, color='black', s=50, zorder=5, label='Data')
    
    axes[0, 2].set_xlabel('x')
    axes[0, 2].set_ylabel('y')
    axes[0, 2].set_title('Kernel Ridge Regression')
    axes[0, 2].legend()
    axes[0, 2].grid(True, alpha=0.3)
    
    print(f"   Representer theorem: f(x) = Σᵢ αᵢ k(x, xᵢ)")
    print(f"   Regularization strength (KernelRidge's alpha, λ in the text) controls smoothness")
    
    # 4. Feature maps and the kernel trick
    print("\n4. Feature Maps: The Kernel Trick Revealed")
    
    # Demonstrate feature map for polynomial kernel
    # For degree 2 polynomial kernel in 2D: φ(x) = [x₁², √2x₁x₂, x₂², √2x₁, √2x₂, 1]
    
    def polynomial_feature_map_2d(x, degree=2):
        """Explicit feature map for degree-2 polynomial kernel in 2D"""
        x1, x2 = x[0], x[1]
        if degree == 2:
            return np.array([x1**2, np.sqrt(2)*x1*x2, x2**2, np.sqrt(2)*x1, np.sqrt(2)*x2, 1])
        else:
            return x
    
    # Generate 2D data
    np.random.seed(42)
    X_2d = np.random.randn(5, 2)
    
    # Compute kernel matrix using explicit feature map
    phi_X = np.array([polynomial_feature_map_2d(x) for x in X_2d])
    K_explicit = phi_X @ phi_X.T
    
    # Compute kernel matrix using kernel function directly
    K_direct = np.zeros((5, 5))
    for i in range(5):
        for j in range(5):
            K_direct[i, j] = polynomial_kernel(X_2d[i], X_2d[j], degree=2, gamma=1.0, coef0=1.0)
    
    # Plot comparison
    im1 = axes[1, 0].imshow(K_explicit, cmap='RdBu', aspect='auto')
    axes[1, 0].set_title('Kernel via Feature Map')
    axes[1, 0].set_xlabel('Data Point')
    axes[1, 0].set_ylabel('Data Point')
    
    im2 = axes[1, 1].imshow(K_direct, cmap='RdBu', aspect='auto')
    axes[1, 1].set_title('Kernel Function Direct')
    axes[1, 1].set_xlabel('Data Point')
    axes[1, 1].set_ylabel('Data Point')
    
    # Add values
    for i in range(5):
        for j in range(5):
            axes[1, 0].text(j, i, f'{K_explicit[i, j]:.2f}', ha='center', va='center', fontsize=8)
            axes[1, 1].text(j, i, f'{K_direct[i, j]:.2f}', ha='center', va='center', fontsize=8)
    
    plt.colorbar(im1, ax=axes[1, 0], shrink=0.8)
    plt.colorbar(im2, ax=axes[1, 1], shrink=0.8)
    
    # Check if they're the same
    kernel_difference = np.max(np.abs(K_explicit - K_direct))
    print(f"   Feature map dimension: {phi_X.shape[1]} (vs original 2D)")
    print(f"   Kernel matrices difference: {kernel_difference:.10f}")
    print(f"   Kernel trick: k(x, y) = ⟨φ(x), φ(y)⟩ without computing φ explicitly")
    
    # 5. Universal approximation with RBF kernels
    print("\n5. Universal Approximation: RBF Kernels Can Approximate Any Function")
    
    # Complex target function
    def complex_function(x):
        return np.sin(5*x) * np.exp(-x) + 0.5*np.cos(10*x)
    
    X_approx = np.linspace(0, 2, 100).reshape(-1, 1)
    y_target = complex_function(X_approx.flatten())
    
    # Different numbers of basis functions
    n_basis_functions = [5, 10, 20]
    colors_approx = ['blue', 'red', 'green']
    
    for i, n_basis in enumerate(n_basis_functions):
        # Select basis points
        X_basis = np.linspace(0, 2, n_basis).reshape(-1, 1)
        y_basis = complex_function(X_basis.flatten())
        
        # Kernel ridge regression with RBF kernel
        krr_approx = KernelRidge(kernel='rbf', gamma=5.0, alpha=0.01)
        krr_approx.fit(X_basis, y_basis)
        y_approx = krr_approx.predict(X_approx)
        
        axes[1, 2].plot(X_approx.flatten(), y_approx, color=colors_approx[i],
                       linewidth=2, label=f'{n_basis} basis functions')
    
    axes[1, 2].plot(X_approx.flatten(), y_target, 'k--', linewidth=2, label='Target function')
    axes[1, 2].set_xlabel('x')
    axes[1, 2].set_ylabel('f(x)')
    axes[1, 2].set_title('Universal Approximation with RBF')
    axes[1, 2].legend()
    axes[1, 2].grid(True, alpha=0.3)
    
    # Compute approximation errors
    for i, n_basis in enumerate(n_basis_functions):
        X_basis = np.linspace(0, 2, n_basis).reshape(-1, 1)
        y_basis = complex_function(X_basis.flatten())
        krr_approx = KernelRidge(kernel='rbf', gamma=5.0, alpha=0.01)
        krr_approx.fit(X_basis, y_basis)
        y_approx = krr_approx.predict(X_approx)
        error = np.sqrt(np.mean((y_target - y_approx)**2))
        print(f"   {n_basis} basis functions: RMSE = {error:.4f}")
    
    plt.tight_layout()
    plt.show()
    
    return K_rbf, eigvals_rbf, kernel_difference

K_demo, eigvals_demo, kernel_diff = demonstrate_rkhs_and_kernels()