# 09_optimization_roots
Optimization, root-finding techniques

# OctaveMasterPro: Optimization & Root-Finding

Master numerical optimization and root-finding techniques! This notebook covers unconstrained and constrained optimization, root-finding algorithms, and advanced numerical methods essential for solving real-world engineering and scientific problems.

**Learning Objectives:**
- Implement root-finding algorithms (bisection, Newton-Raphson, secant)
- Master unconstrained optimization techniques
- Apply constrained optimization methods
- Understand convergence criteria and numerical stability
- Solve complex optimization problems in multiple variables

---

## 1. Root-Finding Fundamentals

```octave
% Root-finding algorithms and applications
fprintf('=== Root-Finding Fundamentals ===\n');

% Define test functions with known roots
f1 = @(x) x^3 - 6*x^2 + 11*x - 6;  % Roots: 1, 2, 3
f1_prime = @(x) 3*x^2 - 12*x + 11;  % Derivative

f2 = @(x) exp(-x) - x;  % Transcendental equation
f2_prime = @(x) -exp(-x) - 1;

f3 = @(x) cos(x) - x;  % Another transcendental
f3_prime = @(x) -sin(x) - 1;

fprintf('Test functions defined with theoretical roots\n');

% Bisection Method
fprintf('\n1. Bisection Method:\n');

function [root, iterations, errors] = bisection(func, a, b, tol, max_iter)
    % Bisection method for root finding
    % Input: func - function handle, [a,b] - bracket, tol - tolerance, max_iter - max iterations
    % Output: root - approximate root, iterations - number of iterations, errors - error history
    
    % Check initial bracket
    if func(a) * func(b) > 0
        error('Function must have opposite signs at endpoints');
    end
    
    errors = [];
    
    for iter = 1:max_iter
        c = (a + b) / 2;  % Midpoint
        fc = func(c);
        
        % Store error estimate
        error_est = (b - a) / 2;
        errors(iter) = error_est;
        
        % Check convergence
        if abs(error_est) < tol || abs(fc) < tol
            root = c;
            iterations = iter;
            return;
        end
        
        % Update bracket
        if func(a) * fc < 0
            b = c;
        else
            a = c;
        end
    end
    
    root = c;
    iterations = max_iter;
end

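% Test bisection on polynomial
% (bracket [0.5, 1.4] is used so that no midpoint lands exactly on the root x = 1,
%  which would end the run after a single iteration)
[root_bisect, iter_bisect, errors_bisect] = bisection(f1, 0.5, 1.4, 1e-8, 100);
fprintf('   Polynomial f(x) = x³ - 6x² + 11x - 6 on [0.5, 1.4]:\n');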
fprintf('   Root found: %.8f (iterations: %d)\n', root_bisect, iter_bisect);
fprintf('   Verification: f(%.8f) = %.2e\n', root_bisect, f1(root_bisect));
fprintf('   Theoretical root: 1.0\n');

% Newton-Raphson Method
fprintf('\n2. Newton-Raphson Method:\n');

function [root, iterations, errors] = newton_raphson(func, func_prime, x0, tol, max_iter)
    % Newton-Raphson method for root finding
    % Input: func - function, func_prime - derivative, x0 - initial guess
    % Output: root, iterations, error history
    
    x = x0;
    errors = [];
    
    for iter = 1:max_iter
        fx = func(x);
        fpx = func_prime(x);
        
        % Check for zero derivative
        if abs(fpx) < eps
            error('Derivative too close to zero');
        end
        
        % Newton update
        x_new = x - fx / fpx;
        
        % Error estimate
        error_est = abs(x_new - x);
        errors(iter) = error_est;
        
        % Check convergence
        if error_est < tol || abs(fx) < tol
            root = x_new;
            iterations = iter;
            return;
        end
        
        x = x_new;
    end
    
    root = x;
    iterations = max_iter;
end

% Test Newton-Raphson on transcendental equation
[root_newton, iter_newton, errors_newton] = newton_raphson(f2, f2_prime, 0.5, 1e-10, 50);
fprintf('   Transcendental f(x) = e^(-x) - x, x0 = 0.5:\n');
fprintf('   Root found: %.10f (iterations: %d)\n', root_newton, iter_newton);
fprintf('   Verification: f(%.10f) = %.2e\n', root_newton, f2(root_newton));

% Secant Method
fprintf('\n3. Secant Method:\n');

function [root, iterations, errors] = secant_method(func, x0, x1, tol, max_iter)
    % Secant method for root finding
    % Input: func - function, x0, x1 - initial points, tol, max_iter
    % Output: root, iterations, error history
    
    errors = [];
    
    for iter = 1:max_iter
        f0 = func(x0);
        f1 = func(x1);
        
        % Check for parallel secant line
        if abs(f1 - f0) < eps
            error('Function values too close - secant line undefined');
        end
        
        % Secant update
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0);
        
        % Error estimate
        error_est = abs(x2 - x1);
        errors(iter) = error_est;
        
        % Check convergence
        if error_est < tol || abs(func(x2)) < tol
            root = x2;
            iterations = iter;
            return;
        end
        
        % Update points
        x0 = x1;
        x1 = x2;
    end
    
    root = x1;
    iterations = max_iter;
end

% Test secant method
[root_secant, iter_secant, errors_secant] = secant_method(f3, 0.5, 1.0, 1e-10, 50);
fprintf('   Transcendental f(x) = cos(x) - x, x0 = 0.5, x1 = 1.0:\n');
fprintf('   Root found: %.10f (iterations: %d)\n', root_secant, iter_secant);
fprintf('   Verification: f(%.10f) = %.2e\n', root_secant, f3(root_secant));

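% Convergence comparison (each method above was run on a different test function,
% so the iteration counts are indicative, not a like-for-like benchmark)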
fprintf('\n4. Convergence Rate Comparison:\n');
fprintf('   Method      | Iterations | Final Error\n');
fprintf('   ------------|------------|------------\n');
fprintf('   Bisection   | %8d   | %.2e\n', iter_bisect, errors_bisect(end));
fprintf('   Newton-Raph | %8d   | %.2e\n', iter_newton, errors_newton(end));
fprintf('   Secant      | %8d   | %.2e\n', iter_secant, errors_secant(end));

% Fixed-point iteration
fprintf('\n5. Fixed-Point Iteration:\n');

function [root, iterations, errors] = fixed_point(g, x0, tol, max_iter)
    % Fixed-point iteration: x = g(x)
    % Input: g - iteration function, x0 - initial guess
    % Output: root, iterations, error history
    
    x = x0;
    errors = [];
    
    for iter = 1:max_iter
        x_new = g(x);
        
        error_est = abs(x_new - x);
        errors(iter) = error_est;
        
        if error_est < tol
            root = x_new;
            iterations = iter;
            return;
        end
        
        x = x_new;
    end
    
    root = x;
    iterations = max_iter;
end

% Convert f(x) = x³ - 6x² + 11x - 6 = 0 to x = g(x) form
% Solve the linear term for x: 11x = 6 + 6x² - x³, so g(x) = (6 + 6x² - x³) / 11
% |g'(1)| = 9/11 < 1, so the iteration converges (linearly) to the root at 1
g1 = @(x) (6 + 6*x^2 - x^3) / 11;
[root_fp, iter_fp, errors_fp] = fixed_point(g1, 1.1, 1e-8, 200);

fprintf('   Fixed-point form: x = (6 + 6x² - x³)/11\n');
fprintf('   Root found: %.8f (iterations: %d)\n', root_fp, iter_fp);
fprintf('   Verification: f(%.8f) = %.2e\n', root_fp, f1(root_fp));
```
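
As a cross-check on the hand-written solvers above, the two transcendental test equations can also be handed to Octave's built-in `fzero`. The following is a minimal sketch, not part of the original notebook: the bracket `[0, 1]` is an assumption made here for illustration (both functions change sign on it), and the handles `f2` and `f3` are redefined so the snippet runs on its own.

```octave
% Cross-check sketch: solve the transcendental test equations with fzero.
% Assumption: the bracket [0, 1] is chosen for this example only.
f2 = @(x) exp(-x) - x;
f3 = @(x) cos(x) - x;

% With a two-element starting vector, fzero treats it as a bracket
[r2, fval2, info2] = fzero(f2, [0, 1]);
[r3, fval3, info3] = fzero(f3, [0, 1]);

fprintf('fzero root of exp(-x) - x: %.10f (residual %.2e, info %d)\n', r2, fval2, info2);
fprintf('fzero root of cos(x) - x:  %.10f (residual %.2e, info %d)\n', r3, fval3, info3);
```

If the Newton-Raphson and secant results printed earlier agree with these values to within the requested tolerance, that is reasonable evidence the custom implementations behave as intended on these inputs.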

## 2. Systems of Nonlinear Equations

```octave
% Systems of nonlinear equations
fprintf('\n=== Systems of Nonlinear Equations ===\n');

% Define system: f1(x,y) = 0, f2(x,y) = 0
% Example: x² + y² - 4 = 0, x² - y - 1 = 0
F = @(x) [x(1)^2 + x(2)^2 - 4; x(1)^2 - x(2) - 1];  % System of equations

% Jacobian matrix
J = @(x) [2*x(1), 2*x(2); 2*x(1), -1];

fprintf('System: x² + y² = 4, x² - y = 1\n');

% Newton's method for systems
fprintf('\n1. Newton''s Method for Systems:\n');

function [root, iterations] = newton_system(F, J, x0, tol, max_iter)
    % Newton's method for systems of nonlinear equations
    % Input: F - system function, J - Jacobian, x0 - initial guess
    % Output: root - solution vector, iterations
    
    x = x0;
    
    for iter = 1:max_iter
        Fx = F(x);
        Jx = J(x);
        
        % Check for convergence
        if norm(Fx) < tol
            root = x;
            iterations = iter;
            return;
        end
        
        % Check for singular Jacobian
        if rcond(Jx) < eps
            error('Jacobian is nearly singular');
        end
        
        % Newton update: x_new = x - J^(-1) * F(x)
        delta_x = Jx \ Fx;
        x = x - delta_x;
        
        % Check step size
        if norm(delta_x) < tol
            root = x;
            iterations = iter;
            return;
        end
    end
    
    root = x;
    iterations = max_iter;
end

% Solve system with different initial guesses
initial_guesses = [1.5, 1.0; -1.5, 1.0; 1.0, -1.0];

for i = 1:size(initial_guesses, 1)
    x0 = initial_guesses(i, :)';
    [sol, iters] = newton_system(F, J, x0, 1e-10, 50);
    
    fprintf('   Initial guess [%.1f, %.1f]: Solution [%.6f, %.6f] (%d iterations)\n', ...
            x0(1), x0(2), sol(1), sol(2), iters);
    fprintf('     Verification: ||F(x)|| = %.2e\n', norm(F(sol)));
end

% Broyden's method (quasi-Newton)
fprintf('\n2. Broyden''s Quasi-Newton Method:\n');

function [root, iterations] = broyden_method(F, x0, tol, max_iter)
    % Broyden's method for systems (approximate Jacobian)
    % Input: F - system function, x0 - initial guess
    % Output: root, iterations
    
    x = x0;
    n = length(x0);
    
    % Initial Jacobian approximation (finite differences)
    B = zeros(n, n);
    h = sqrt(eps);
    F0 = F(x);
    
    for j = 1:n
        x_pert = x;
        x_pert(j) = x_pert(j) + h;
        B(:, j) = (F(x_pert) - F0) / h;
    end
    
    for iter = 1:max_iter
        Fx = F(x);
        
        if norm(Fx) < tol
            root = x;
            iterations = iter;
            return;
        end
        
        % Check for singular approximation
        if rcond(B) < eps
            error('Jacobian approximation is singular');
        end
        
        % Solve B * delta_x = -F(x)
        delta_x = B \ (-Fx);
        x_new = x + delta_x;
        
        % Update Jacobian approximation (Broyden's update)
        Fx_new = F(x_new);
        y = Fx_new - Fx;
        
        if norm(delta_x) > eps
            B = B + (y - B * delta_x) * delta_x' / (delta_x' * delta_x);
        end
        
        x = x_new;
        
        if norm(delta_x) < tol
            root = x;
            iterations = iter;
            return;
        end
    end
    
    root = x;
    iterations = max_iter;
end

[sol_broyden, iter_broyden] = broyden_method(F, [1.5; 1.0], 1e-8, 100);
fprintf('   Broyden solution: [%.6f, %.6f] (%d iterations)\n', ...
        sol_broyden(1), sol_broyden(2), iter_broyden);
fprintf('   Verification: ||F(x)|| = %.2e\n', norm(F(sol_broyden)));
```
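
The example system is small enough to solve by hand, which gives a reference value for the iterative solvers. Substituting x² = y + 1 into the circle equation gives y² + y - 3 = 0, and only the positive root keeps x real. The sketch below computes that closed form and compares it with Octave's built-in `fsolve`; the tolerances passed through `optimset` and the variable names are choices made for this illustration, not part of the original notebook.

```octave
% Reference-solution sketch for the system x^2 + y^2 = 4, x^2 - y = 1
F = @(x) [x(1)^2 + x(2)^2 - 4; x(1)^2 - x(2) - 1];

% Closed form: x^2 = y + 1  =>  y^2 + y - 3 = 0  =>  y = (-1 + sqrt(13)) / 2
y_exact = (-1 + sqrt(13)) / 2;
x_exact = sqrt(y_exact + 1);      % the mirror solution is -x_exact
sol_exact = [x_exact; y_exact];

% Built-in solver for comparison (tolerances are assumptions for this example)
opts = optimset('TolFun', 1e-12, 'TolX', 1e-12);
[sol_fsolve, fval, info] = fsolve(F, [1.5; 1.0], opts);

fprintf('Closed form: [%.6f, %.6f]\n', sol_exact(1), sol_exact(2));
fprintf('fsolve:      [%.6f, %.6f] (info %d)\n', sol_fsolve(1), sol_fsolve(2), info);
fprintf('Difference:  %.2e\n', norm(sol_fsolve - sol_exact));
```

The system has exactly two real solutions, (±x_exact, y_exact), so any starting guess that converges should end up at one of them.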

## 3. Unconstrained Optimization

```octave
% Unconstrained optimization methods
fprintf('\n=== Unconstrained Optimization ===\n');

% Define test functions
% Rosenbrock function: f(x,y) = 100(y - x²)² + (1 - x)²
rosenbrock = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
rosenbrock_grad = @(x) [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); 200*(x(2) - x(1)^2)];

% Himmelblau's function: f(x,y) = (x² + y - 11)² + (x + y² - 7)²
himmelblau = @(x) (x(1)^2 + x(2) - 11)^2 + (x(1) + x(2)^2 - 7)^2;
himmelblau_grad = @(x) [4*x(1)*(x(1)^2 + x(2) - 11) + 2*(x(1) + x(2)^2 - 7); ...
                       2*(x(1)^2 + x(2) - 11) + 4*x(2)*(x(1) + x(2)^2 - 7)];

fprintf('Test functions: Rosenbrock and Himmelblau\n');

% Line search function using Armijo condition
function alpha_opt = armijo_line_search(func, grad, x, p, alpha_init, c1, rho)
    % Armijo line search
    % Input: func - objective, grad - gradient, x - current point, p - search direction
    % Output: alpha_opt - step size
    
    alpha = alpha_init;
    f0 = func(x);
    g0 = grad(x);
    phi0 = g0' * p;
    
    % Check descent direction
    if phi0 >= 0
        alpha_opt = 0;
        return;
    end
    
    while func(x + alpha * p) > f0 + c1 * alpha * phi0
        alpha = rho * alpha;
        if alpha < 1e-12
            break;
        end
    end
    
    alpha_opt = alpha;
end

% Gradient Descent with line search
fprintf('\n1. Gradient Descent with Line Search:\n');

function [x_opt, f_opt, iterations] = gradient_descent(func, grad, x0, tol, max_iter)
    % Gradient descent optimization with Armijo line search
    % Input: func - objective function, grad - gradient, x0 - initial point
    % Output: x_opt - optimal point, f_opt - optimal value, iterations
    
    x = x0;
    
    for iter = 1:max_iter
        g = grad(x);
        
        % Check convergence
        if norm(g) < tol
            x_opt = x;
            f_opt = func(x);
            iterations = iter;
            return;
        end
        
        % Search direction (steepest descent)
        p = -g;
        
        % Line search
        alpha = armijo_line_search(func, grad, x, p, 1.0, 1e-4, 0.5);
        
        % Update
        x = x + alpha * p;
    end
    
    x_opt = x;
    f_opt = func(x);
    iterations = max_iter;
end

% Test gradient descent on Rosenbrock
[x_gd, f_gd, iter_gd] = gradient_descent(rosenbrock, rosenbrock_grad, [-1.2; 1.0], 1e-6, 10000);
fprintf('   Rosenbrock function, x0 = [-1.2, 1.0]:\n');
fprintf('   Optimal point: [%.6f, %.6f]\n', x_gd(1), x_gd(2));
fprintf('   Optimal value: %.8f (iterations: %d)\n', f_gd, iter_gd);
fprintf('   True optimum: [1, 1], f = 0\n');

% Newton's Method for Optimization
fprintf('\n2. Newton''s Method for Optimization:\n');

function [x_opt, f_opt, iterations] = newton_optimization(func, grad, hess, x0, tol, max_iter)
    % Newton's method for optimization
    % Input: func, grad, hess - function, gradient, Hessian, x0 - initial point
    % Output: x_opt, f_opt, iterations
    
    x = x0;
    
    for iter = 1:max_iter
        g = grad(x);
        H = hess(x);
        
        if norm(g) < tol
            x_opt = x;
            f_opt = func(x);
            iterations = iter;
            return;
        end
        
        % Check positive definiteness; increase the regularization until the
        % Cholesky factorization succeeds, so that -H\g is a descent direction
        [~, flag] = chol(H);
        lambda = 1e-3;
        while flag ~= 0
            H = H + lambda * eye(size(H));
            lambda = 10 * lambda;
            [~, flag] = chol(H);
        end
        
        % Newton update: x = x - H^(-1) * g
        delta_x = H \ g;
        
        % Line search for robustness
        alpha = armijo_line_search(func, grad, x, -delta_x, 1.0, 1e-4, 0.5);
        x = x - alpha * delta_x;
        
        if norm(alpha * delta_x) < tol
            x_opt = x;
            f_opt = func(x);
            iterations = iter;
            return;
        end
    end
    
    x_opt = x;
    f_opt = func(x);
    iterations = max_iter;
end

% Rosenbrock Hessian
rosenbrock_hess = @(x) [-400*(x(2) - 3*x(1)^2) + 2, -400*x(1); -400*x(1), 200];

[x_newton, f_newton, iter_newton] = newton_optimization(rosenbrock, rosenbrock_grad, ...
                                                       rosenbrock_hess, [-1.2; 1.0], 1e-8, 100);
fprintf('   Newton''s method on Rosenbrock:\n');
fprintf('   Optimal point: [%.8f, %.8f]\n', x_newton(1), x_newton(2));
fprintf('   Optimal value: %.10f (iterations: %d)\n', f_newton, iter_newton);

% BFGS Quasi-Newton Method
fprintf('\n3. BFGS Quasi-Newton Method:\n');

function [x_opt, f_opt, iterations] = bfgs_method(func, grad, x0, tol, max_iter)
    % BFGS quasi-Newton method
    % Input: func, grad - function and gradient, x0 - initial point
    % Output: x_opt, f_opt, iterations
    
    x = x0;
    n = length(x0);
    H = eye(n);  % Initial inverse-Hessian approximation (used as p = -H * g)
    
    for iter = 1:max_iter
        g = grad(x);
        
        if norm(g) < tol
            x_opt = x;
            f_opt = func(x);
            iterations = iter;
            return;
        end
        
        % Search direction
        p = -H * g;
        
        % Line search
        alpha = armijo_line_search(func, grad, x, p, 1.0, 1e-4, 0.5);
        
        % Update
        x_new = x + alpha * p;
        g_new = grad(x_new);
        
        % BFGS update
        s = x_new - x;
        y = g_new - g;
        
        % Check curvature condition: y'*s must be positive for the update to
        % keep H positive definite; otherwise skip the update
        if y' * s > eps && norm(s) > eps
            rho = 1 / (y' * s);
            H = (eye(n) - rho * s * y') * H * (eye(n) - rho * y * s') + rho * s * s';
        end
        
        x = x_new;
    end
    
    x_opt = x;
    f_opt = func(x);
    iterations = max_iter;
end

[x_bfgs, f_bfgs, iter_bfgs] = bfgs_method(rosenbrock, rosenbrock_grad, [-1.2; 1.0], 1e-8, 1000);
fprintf('   BFGS method on Rosenbrock:\n');
fprintf('   Optimal point: [%.8f, %.8f]\n', x_bfgs(1), x_bfgs(2));
fprintf('   Optimal value: %.10f (iterations: %d)\n', f_bfgs, iter_bfgs);

% Method comparison
fprintf('\n4. Method Comparison:\n');
fprintf('   Method           | Iterations | Final Value     | Error\n');
fprintf('   -----------------|------------|-----------------|----------\n');
fprintf('   Gradient Descent | %8d   | %13.8f | %.2e\n', iter_gd, f_gd, norm(x_gd - [1;1]));
fprintf('   Newton           | %8d   | %13.10f | %.2e\n', iter_newton, f_newton, norm(x_newton - [1;1]));
fprintf('   BFGS             | %8d   | %13.10f | %.2e\n', iter_bfgs, f_bfgs, norm(x_bfgs - [1;1]));
```
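
All three optimizers above rely on a hand-coded gradient, so a mistake in the derivative silently degrades every result. A common sanity check is to compare the analytic gradient with a central finite-difference estimate. The sketch below does this for the Rosenbrock function at the starting point used above; the step size `h = 1e-6` and the variable names are assumptions chosen for illustration.

```octave
% Gradient-check sketch: analytic gradient vs. central finite differences
rosenbrock = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
rosenbrock_grad = @(x) [-400*x(1)*(x(2) - x(1)^2) - 2*(1 - x(1)); 200*(x(2) - x(1)^2)];

x = [-1.2; 1.0];   % same starting point as the optimizer tests
h = 1e-6;          % finite-difference step (an assumption for this example)
n = numel(x);

g_fd = zeros(n, 1);
for j = 1:n
    e = zeros(n, 1);
    e(j) = h;
    % Central difference along coordinate j
    g_fd(j) = (rosenbrock(x + e) - rosenbrock(x - e)) / (2 * h);
end

g_exact = rosenbrock_grad(x);
rel_err = norm(g_fd - g_exact) / norm(g_exact);

fprintf('Analytic gradient:  [%.6f, %.6f]\n', g_exact(1), g_exact(2));
fprintf('Finite differences: [%.6f, %.6f]\n', g_fd(1), g_fd(2));
fprintf('Relative error:     %.2e\n', rel_err);
```

A relative error many orders of magnitude below 1 suggests the analytic gradient is consistent with the objective; the same loop can be pointed at `himmelblau` and `himmelblau_grad`.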

## 4. Constrained Optimization

```octave
% Constrained optimization methods
fprintf('\n=== Constrained Optimization ===\n');

% Problem: minimize f(x,y) = (x-2)² + (y-1)² subject to x² + y² ≤ 1
objective = @(x) (x(1) - 2)^2 + (x(2) - 1)^2;
objective_grad = @(x) [2*(x(1) - 2); 2*(x(2) - 1)];

% Constraint: g(x,y) = x² + y² - 1 ≤ 0
constraint = @(x) x(1)^2 + x(2)^2 - 1;
constraint_grad = @(x) [2*x(1); 2*x(2)];

fprintf('Problem: minimize (x-2)² + (y-1)² subject to x² + y² ≤ 1\n');

% Penalty Method
fprintf('\n1. Penalty Method:\n');

function [x_opt, f_opt, iterations] = penalty_method(func, grad_func, constraint_func, constraint_grad_func, x0, tol)
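    % Quadratic exterior penalty method for one inequality constraint g(x) <= 0
    % Input: func, grad_func - objective and its gradient;
    %        constraint_func, constraint_grad_func - constraint and its gradient;
    %        x0 - initial point, tol - tolerance for the solver and for feasibility
    % Output: x_opt, f_opt, iterations (number of outer penalty iterations)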
    
    penalty_param = 1;
    max_penalty_iter = 15;
    
    x_current = x0;
    
    for penalty_iter = 1:max_penalty_iter
        % Create penalized objective and gradient
        penalized_func = @(x) func(x) + penalty_param * max(0, constraint_func(x))^2;
        
        penalized_grad = @(x) penalty_gradient_func(x, grad_func, constraint_func, constraint_grad_func, penalty_param);
        
        % Solve unconstrained problem
        [x_opt, ~, ~] = bfgs_method(penalized_func, penalized_grad, x_current, tol, 1000);
        
        % Check constraint satisfaction
        constraint_violation = max(0, constraint_func(x_opt));
        
        if constraint_violation < tol
            f_opt = func(x_opt);
            iterations = penalty_iter;
            return;
        end
        
        % Print progress (reports the penalty value used for this solve)
        fprintf('     Penalty iteration %d: violation = %.2e, penalty = %.1e\n', ...
                penalty_iter, constraint_violation, penalty_param);
        
        % Increase penalty parameter
        penalty_param = penalty_param * 10;
        x_current = x_opt;  % Warm start
    end
    
    f_opt = func(x_opt);
    iterations = max_penalty_iter;
end

function g_pen = penalty_gradient_func(x, grad_func, constraint_func, constraint_grad_func, penalty_param)
    % Gradient of penalized objective
    g_pen = grad_func(x);
    constraint_val = constraint_func(x);
    
    if constraint_val > 0
        g_pen = g_pen + 2 * penalty_param * constraint_val * constraint_grad_func(x);
    end
end

% Apply penalty method
[x_penalty, f_penalty, iter_penalty] = penalty_method(objective, objective_grad, ...
                                                     constraint, constraint_grad, [0.5; 0.5], 1e-8);

fprintf('   Penalty method result:\n');
fprintf('   Optimal point: [%.8f, %.8f]\n', x_penalty(1), x_penalty(2));
fprintf('   Optimal value: %.10f (iterations: %d)\n', f_penalty, iter_penalty);
fprintf('   Constraint value: %.8f (should be ≤ 0)\n', constraint(x_penalty));

% Analytical solution for comparison
% The constrained optimum lies on the boundary at the point closest to (2,1)
% This is at (2/√5, 1/√5) ≈ (0.8944, 0.4472)
target = [2; 1];
x_analytical = target / norm(target);
f_analytical = objective(x_analytical);

fprintf('   Analytical solution: [%.8f, %.8f]\n', x_analytical(1), x_analytical(2));
fprintf('   Analytical value: %.10f\n', f_analytical);
fprintf('   Error: %.8f\n', norm(x_penalty - x_analytical));

% Lagrange Multipliers (KKT conditions)
fprintf('\n2. Lagrange Multipliers Analysis:\n');

function [x_opt, lambda_opt, iterations] = lagrange_newton(func, func_grad, constraint_func, constraint_grad_func, x0, tol, max_iter)
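    % Newton's method on the KKT system, treating the constraint as active (g = 0)
    % Input: func, func_grad - objective and its gradient;
    %        constraint_func, constraint_grad_func - constraint and its gradient;
    %        x0 - initial point (2-D), tol, max_iter
    % Output: x_opt - optimal point, lambda_opt - multiplier, iterations
    % Note: the Hessians used below are hard-coded for this quadratic example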
    
    % Initialize [x; lambda]
    z = [x0; 1.0];
    
    for iter = 1:max_iter
        x = z(1:2);
        lambda = z(3);
        
        % KKT system: ∇f + λ∇g = 0, g = 0
        grad_f = func_grad(x);
        grad_g = constraint_grad_func(x);
        g_val = constraint_func(x);
        
        F = [grad_f + lambda * grad_g; g_val];
        
        if norm(F) < tol
            x_opt = x;
            lambda_opt = lambda;
            iterations = iter;
            return;
        end
        
        % Jacobian of KKT system
        % Hessian of objective: ∇²f = [2, 0; 0, 2]
        % Hessian of constraint: ∇²g = [2, 0; 0, 2]
        hess_f = [2, 0; 0, 2];
        hess_g = [2, 0; 0, 2];
        
        KKT_jacobian = [hess_f + lambda * hess_g, grad_g; ...
                       grad_g', 0];
        
        % Newton update
        delta_z = KKT_jacobian \ (-F);
        z = z + delta_z;
        
        if norm(delta_z) < tol
            x_opt = z(1:2);
            lambda_opt = z(3);
            iterations = iter;
            return;
        end
    end
    
    x_opt = z(1:2);
    lambda_opt = z(3);
    iterations = max_iter;
end

[x_lagrange, lambda_opt, iter_lagrange] = lagrange_newton(objective, objective_grad, ...
                                                         constraint, constraint_grad, [0.8; 0.4], 1e-12, 100);

fprintf('   Lagrange multiplier method:\n');
fprintf('   Optimal point: [%.10f, %.10f]\n', x_lagrange(1), x_lagrange(2));
fprintf('   Optimal value: %.12f\n', objective(x_lagrange));
fprintf('   Lagrange multiplier λ: %.8f\n', lambda_opt);
fprintf('   Constraint satisfaction: %.2e\n', abs(constraint(x_lagrange)));
fprintf('   Iterations: %d\n', iter_lagrange);

% Projected Gradient Method
fprintf('\n3. Projected Gradient Method:\n');

function [x_opt, f_opt, iterations] = projected_gradient(func, grad_func, constraint_func, x0, tol, max_iter)
    % Projected gradient method for inequality constraints
    % Input: func, grad_func - objective, constraint_func - constraint, x0 - initial point
    % Output: x_opt, f_opt, iterations
    
    x = x0;
    
    % Ensure initial point is feasible
    if constraint_func(x) > 0
        % Simple projection onto unit circle
        x = x / norm(x) * 0.99;  % Slightly inside boundary
    end
    
    alpha = 0.01;  % Fixed small step size for stability
    
    for iter = 1:max_iter
        g = grad_func(x);
        
        % Take gradient step
        x_trial = x - alpha * g;
        
        % Project onto feasible region (unit disk) if necessary
        if constraint_func(x_trial) > 0
            x_trial = x_trial / norm(x_trial);
        end
        
        % Convergence check: stop when the projected step no longer moves
        % the iterate. The raw gradient need not vanish at a constrained
        % optimum, so norm(g) alone is not a valid stopping test here.
        step = norm(x_trial - x);
        x = x_trial;
        
        if step < tol
            x_opt = x;
            f_opt = func(x);
            iterations = iter;
            return;
        end
    end
    
    x_opt = x;
    f_opt = func(x);
    iterations = max_iter;
end

[x_projected, f_projected, iter_projected] = projected_gradient(objective, objective_grad, constraint, [0.5; 0.5], 1e-6, 5000);

fprintf('   Projected gradient method:\n');
fprintf('   Optimal point: [%.8f, %.8f]\n', x_projected(1), x_projected(2));
fprintf('   Optimal value: %.10f (iterations: %d)\n', f_projected, iter_projected);
fprintf('   Constraint satisfaction: %.8f\n', constraint(x_projected));
fprintf('   Error from analytical: %.8f\n', norm(x_projected - x_analytical));
```
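
The analytical solution quoted above can be verified directly against the KKT conditions, independently of any iterative solver. The sketch below evaluates stationarity, feasibility, and complementarity at x* = (2, 1)/√5; the multiplier is recovered by a least-squares fit of ∇f + λ∇g = 0, and the names `x_star` and `lambda_star` are introduced here for illustration only.

```octave
% KKT-check sketch for: minimize (x-2)^2 + (y-1)^2 subject to x^2 + y^2 <= 1
objective_grad  = @(x) [2*(x(1) - 2); 2*(x(2) - 1)];
constraint      = @(x) x(1)^2 + x(2)^2 - 1;
constraint_grad = @(x) [2*x(1); 2*x(2)];

% Candidate optimum: the point of the unit circle closest to (2, 1)
x_star = [2; 1] / sqrt(5);

% Least-squares multiplier from grad_f + lambda * grad_g = 0
lambda_star = -(constraint_grad(x_star) \ objective_grad(x_star));

stationarity    = norm(objective_grad(x_star) + lambda_star * constraint_grad(x_star));
feasibility     = constraint(x_star);                 % should be <= 0 (active: ~0)
complementarity = lambda_star * constraint(x_star);   % should be ~0

fprintf('lambda* = %.8f (sqrt(5) - 1 = %.8f)\n', lambda_star, sqrt(5) - 1);
fprintf('Stationarity residual: %.2e\n', stationarity);
fprintf('Constraint value:      %.2e\n', feasibility);
fprintf('lambda* g(x*):         %.2e\n', complementarity);
```

A positive multiplier together with vanishing residuals confirms that the constraint is active at the optimum, which is why treating it as an equality in the Lagrange-Newton solver is justified for this problem.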

## 5. Global Optimization and Advanced Methods

```octave
% Global optimization techniques
fprintf('\n=== Global Optimization ===\n');

% Multi-modal test function: Rastrigin function
rastrigin_2d = @(x) 20 + x(1)^2 + x(2)^2 - 10*(cos(2*pi*x(1)) + cos(2*pi*x(2)));

fprintf('Test function: 2D Rastrigin (highly multi-modal)\n');
fprintf('Global optimum: [0, 0], f = 0\n');

% Simulated Annealing
fprintf('\n1. Simulated Annealing:\n');

function [x_best, f_best, iterations] = simulated_annealing(func, x0, bounds, T0, cooling_rate, max_iter)
    % Simulated annealing global optimization
    % Input: func - objective, x0 - initial point, bounds - [lower, upper], T0 - initial temperature
    % Output: x_best, f_best, iterations
    
    x_current = x0;
    f_current = func(x_current);
    
    x_best = x_current;
    f_best = f_current;
    
    T = T0;
    accepted = 0;
    
    for iter = 1:max_iter
        % Generate neighbor
        step_size = min(1.0, T / T0);  % Adaptive step size
        x_new = x_current + step_size * randn(size(x_current));
        
        % Enforce bounds
        x_new = max(bounds(1), min(bounds(2), x_new));
        
        f_new = func(x_new);
        
        % Accept or reject
        delta_f = f_new - f_current;
        
        if delta_f < 0 || (T > eps && rand() < exp(-delta_f / T))
            x_current = x_new;
            f_current = f_new;
            accepted = accepted + 1;
            
            % Update best
            if f_current < f_best
                x_best = x_current;
                f_best = f_current;
            end
        end
        
        % Cool down
        T = T * cooling_rate;
        
        % Print progress every 1000 iterations
        if mod(iter, 1000) == 0
            fprintf('     Iteration %d: T = %.2e, Best = %.6f, Accept rate = %.2f%%\n', ...
                    iter, T, f_best, 100 * accepted / iter);
        end
        
        if T < 1e-12
            iterations = iter;
            return;
        end
    end
    
    iterations = max_iter;
end

% Test simulated annealing on Rastrigin function
[x_sa, f_sa, iter_sa] = simulated_annealing(rastrigin_2d, [2.5; -3.1], [-5, 5], 10.0, 0.995, 50000);

fprintf('   Simulated Annealing on 2D Rastrigin:\n');
fprintf('   Best point: [%.6f, %.6f]\n', x_sa(1), x_sa(2));
fprintf('   Best value: %.8f (iterations: %d)\n', f_sa, iter_sa);
fprintf('   Distance from global optimum: %.6f\n', norm(x_sa));

% Genetic Algorithm
fprintf('\n2. Genetic Algorithm:\n');

function [x_best, f_best] = genetic_algorithm(func, bounds, pop_size, generations)
    % Genetic algorithm for global optimization
    % Input: func - objective, bounds - [lower, upper], pop_size, generations
    % Output: x_best, f_best
    
    dim = 2;  % 2D problem
    
    % Initialize population
    population = bounds(1) + (bounds(2) - bounds(1)) * rand(pop_size, dim);
    
    f_best = inf;
    
    for gen = 1:generations
        % Evaluate fitness
        fitness = zeros(pop_size, 1);
        for i = 1:pop_size
            fitness(i) = func(population(i, :)');
        end
        
        % Find best in current generation
        [f_best_gen, best_idx] = min(fitness);
        x_best_gen = population(best_idx, :)';
        
        if f_best_gen < f_best
            x_best = x_best_gen;
            f_best = f_best_gen;
        end
        
        % Selection, crossover, and mutation
        new_population = zeros(size(population));
        
        % Keep best individual (elitism)
        new_population(1, :) = population(best_idx, :);
        
        for i = 2:pop_size
            % Tournament selection
            tournament_size = 5;
            tournament_idx = randperm(pop_size, tournament_size);
            tournament_fitness = fitness(tournament_idx);
            [~, winner_idx] = min(tournament_fitness);
            parent1 = population(tournament_idx(winner_idx), :);
            
            % Second parent
            tournament_idx = randperm(pop_size, tournament_size);
            tournament_fitness = fitness(tournament_idx);
            [~, winner_idx] = min(tournament_fitness);
            parent2 = population(tournament_idx(winner_idx), :);
            
            % Crossover (blend)
            alpha = 0.5;
            child = alpha * parent1 + (1 - alpha) * parent2;
            
            % Mutation
            mutation_rate = 0.1;
            mutation_strength = 0.5;
            if rand() < mutation_rate
                child = child + mutation_strength * randn(size(child));
            end
            
            % Enforce bounds
            child = max(bounds(1), min(bounds(2), child));
            
            new_population(i, :) = child;
        end
        
        population = new_population;
        
        % Print progress every 50 generations
        if mod(gen, 50) == 0
            fprintf('     Generation %d: Best fitness = %.6f\n', gen, f_best);
        end
    end
end

[x_ga, f_ga] = genetic_algorithm(rastrigin_2d, [-5, 5], 100, 500);

fprintf('   Genetic Algorithm on 2D Rastrigin:\n');
fprintf('   Best point: [%.6f, %.6f]\n', x_ga(1), x_ga(2));
fprintf('   Best value: %.8f\n', f_ga);
fprintf('   Distance from global optimum: %.6f\n', norm(x_ga));

% Multi-start optimization
fprintf('\n3. Multi-Start Local Optimization:\n');

function [x_best, f_best, success_rate] = multistart_optimization(func, grad_func, bounds, n_starts)
    % Multi-start local optimization
    % Input: func, grad_func - objective and gradient, bounds, n_starts
    % Output: x_best, f_best, success_rate
    
    dim = 2;
    f_best = inf;
    x_best = [];
    
    local_minima = [];
    successful_runs = 0;
    
    for start = 1:n_starts
        % Random starting point
        x0 = bounds(1) + (bounds(2) - bounds(1)) * rand(dim, 1);
        
        try
            [x_local, f_local, ~] = bfgs_method(func, grad_func, x0, 1e-8, 1000);
            
            % Check if optimization was successful
            if ~isnan(f_local) && ~isinf(f_local)
                successful_runs = successful_runs + 1;
                
                % Check if this is a new local minimum
                is_new = true;
                for i = 1:size(local_minima, 2)
                    if norm(x_local - local_minima(:, i)) < 0.2
                        is_new = false;
                        break;
                    end
                end
                
                if is_new
                    local_minima = [local_minima, x_local];
                end
                
                if f_local < f_best
                    f_best = f_local;
                    x_best = x_local;
                end
            end
        catch
            % Optimization failed - continue
        end
    end
    
    success_rate = successful_runs / n_starts;
    fprintf('     Found %d distinct local minima from %d successful runs\n', ...
            size(local_minima, 2), successful_runs);
end

% Test multi-start on Himmelblau function (has 4 global minima)
himmel_2d = @(x) himmelblau(x);
himmel_grad_2d = @(x) himmelblau_grad(x);

[x_multi, f_multi, success] = multistart_optimization(himmel_2d, himmel_grad_2d, [-5, 5], 50);

fprintf('   Multi-start on Himmelblau function:\n');
fprintf('   Best point: [%.6f, %.6f]\n', x_multi(1), x_multi(2));
fprintf('   Best value: %.8f\n', f_multi);
fprintf('   Success rate: %.1f%%\n', success * 100);

% Known global minima of Himmelblau: (3,2), (-2.805118, 3.131312), (-3.779310, -3.283186), (3.584428, -1.848126)
known_minima = [3, 2; -2.805118, 3.131312; -3.779310, -3.283186; 3.584428, -1.848126];
min_distance = inf;
closest_minimum = [];
for i = 1:size(known_minima, 1)
    distance = norm(x_multi - known_minima(i, :)');
    if distance < min_distance
        min_distance = distance;
        closest_minimum = known_minima(i, :);
    end
end

fprintf('   Distance to nearest known minimum [%.3f, %.3f]: %.6f\n', ...
        closest_minimum(1), closest_minimum(2), min_distance);

% Differential Evolution
fprintf('\n4. Differential Evolution:\n');

function [x_best, f_best] = differential_evolution(func, bounds, pop_size, generations, F, CR)
    % Differential Evolution algorithm
    % Input: func - objective, bounds, pop_size, generations, F - mutation factor, CR - crossover rate
    % Output: x_best, f_best
    
    dim = 2;
    
    % Initialize population
    population = bounds(1) + (bounds(2) - bounds(1)) * rand(pop_size, dim);
    
    % Evaluate initial population
    fitness = zeros(pop_size, 1);
    for i = 1:pop_size
        fitness(i) = func(population(i, :)');
    end
    
    [f_best, best_idx] = min(fitness);
    x_best = population(best_idx, :)';
    
    for gen = 1:generations
        for i = 1:pop_size
            % Select three random individuals (different from current)
            candidates = setdiff(1:pop_size, i);
            selected = candidates(randperm(length(candidates), 3));
            a = selected(1); b = selected(2); c = selected(3);
            
            % Mutation: v = xa + F * (xb - xc)
            mutant = population(a, :) + F * (population(b, :) - population(c, :));
            
            % Ensure bounds
            mutant = max(bounds(1), min(bounds(2), mutant));
            
            % Crossover
            trial = population(i, :);
            j_rand = randi(dim);  % Drawn once so at least one component comes from the mutant
            for j = 1:dim
                if rand() < CR || j == j_rand
                    trial(j) = mutant(j);
                end
            end
            
            % Selection
            f_trial = func(trial');
            if f_trial < fitness(i)
                population(i, :) = trial;
                fitness(i) = f_trial;
                
                if f_trial < f_best
                    x_best = trial';
                    f_best = f_trial;
                end
            end
        end
        
        % Print progress
        if mod(gen, 100) == 0
            fprintf('     Generation %d: Best fitness = %.6f\n', gen, f_best);
        end
    end
end

[x_de, f_de] = differential_evolution(rastrigin_2d, [-5, 5], 50, 1000, 0.5, 0.7);

fprintf('   Differential Evolution on 2D Rastrigin:\n');
fprintf('   Best point: [%.6f, %.6f]\n', x_de(1), x_de(2));
fprintf('   Best value: %.8f\n', f_de);
fprintf('   Distance from global optimum: %.6f\n', norm(x_de));

% Global optimization comparison
fprintf('\n5. Global Method Comparison on Rastrigin:\n');
yes_no = {'No', 'Yes'};  % Octave has no ternary operator; index a cell array instead
fprintf('   Method                 | Best Value | Distance to Global | Success\n');
fprintf('   -----------------------|------------|--------------------|--------\n');
fprintf('   %-22s | %8.4f   | %15.6f    | %s\n', 'Simulated Annealing', f_sa, norm(x_sa), yes_no{(f_sa < 1) + 1});
fprintf('   %-22s | %8.4f   | %15.6f    | %s\n', 'Genetic Algorithm', f_ga, norm(x_ga), yes_no{(f_ga < 1) + 1});
fprintf('   %-22s | %8.4f   | %15.6f    | %s\n', 'Differential Evolution', f_de, norm(x_de), yes_no{(f_de < 1) + 1});
```
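
The multi-start and differential evolution routines above fix `dim = 2` and apply one scalar `[lower, upper]` pair to every coordinate. As a minimal sketch of how the sampling and clipping steps might generalize to per-coordinate bounds, the snippet below infers the dimension from the bound vectors. The names `lb`, `ub`, `starts`, and `candidate` are illustrative assumptions and do not appear in the notebook.

```octave
% Sketch: box-constrained sampling with per-coordinate bounds (illustrative names)
lb = [-5; -2];              % lower bound for each coordinate
ub = [ 5;  3];              % upper bound for each coordinate
n_starts = 4;
dim = numel(lb);            % dimension is inferred from the bounds

rand('state', 42);          % fix the generator state so the run is repeatable

% One column per starting point; broadcasting scales each row by its own range
starts = lb + (ub - lb) .* rand(dim, n_starts);

% Clip an out-of-bounds candidate back into the box (as done for DE mutants)
candidate = [7; -9];
clipped = max(lb, min(ub, candidate));

disp(starts);
disp(clipped');
```

Each column of `starts` could be passed as `x0` to a local optimizer, and the same `max`/`min` clipping works for row-vector populations if `lb` and `ub` are transposed.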

---

## 6. Advanced Convergence Analysis and Visualization

```octave
% Advanced convergence analysis and method comparison
fprintf('\n=== Advanced Convergence Analysis ===\n');

% Convergence rate analysis for different methods
fprintf('\n1. Theoretical vs Actual Convergence Rates:\n');

% Test all root-finding methods on the same function
test_func = @(x) x^3 - 2*x - 5;  % Root approximately at x = 2.094551
test_func_prime = @(x) 3*x^2 - 2;
true_root = 2.0945514815423265;  % High-precision root

fprintf('   Testing function: f(x) = x³ - 2x - 5\n');
fprintf('   True root: %.10f\n\n', true_root);

% Bisection method detailed analysis
[root_bis, iter_bis, errors_bis] = bisection(test_func, 1.5, 2.5, 1e-12, 50);
actual_errors_bis = abs(true_root - root_bis);

% Newton-Raphson detailed analysis  
[root_nr, iter_nr, errors_nr] = newton_raphson(test_func, test_func_prime, 2.5, 1e-12, 20);
actual_errors_nr = abs(true_root - root_nr);

% Secant method detailed analysis
[root_sec, iter_sec, errors_sec] = secant_method(test_func, 1.8, 2.3, 1e-12, 30);
actual_errors_sec = abs(true_root - root_sec);

% Display detailed convergence comparison
fprintf('   Detailed Method Comparison:\n');
fprintf('   ================================================================\n');
fprintf('   Method         | Iterations | Final Error  | Convergence Rate\n');
fprintf('   ================================================================\n');

% Estimate the convergence order from the last three recorded errors:
%   q = log(e_n / e_{n-1}) / log(e_{n-1} / e_{n-2})
% Fall back to the theoretical order when fewer than three positive errors
% are available (an error of exactly zero makes the logarithm undefined).
has_three = @(e) length(e) >= 3 && all(e(end-2:end) > 0);
estimate_order = @(e) log(e(end)/e(end-1)) / log(e(end-1)/e(end-2));

if has_three(errors_bis)
    q_bis = estimate_order(errors_bis);
else
    q_bis = 1.0;
end

if has_three(errors_nr)
    q_nr = estimate_order(errors_nr);
else
    q_nr = 2.0;
end

if has_three(errors_sec)
    q_sec = estimate_order(errors_sec);
else
    q_sec = 1.618;
end

fprintf('   Bisection      | %8d   | %10.2e   | %8.2f (Linear)\n', iter_bis, actual_errors_bis, q_bis);
fprintf('   Newton-Raphson | %8d   | %10.2e   | %8.2f (Quadratic)\n', iter_nr, actual_errors_nr, q_nr);
fprintf('   Secant         | %8d   | %10.2e   | %8.2f (Superlinear)\n', iter_sec, actual_errors_sec, q_sec);
fprintf('   ================================================================\n');

% Work-precision diagram analysis
fprintf('\n2. Work-Precision Analysis:\n');

tolerances = [1e-4, 1e-6, 1e-8, 1e-10, 1e-12];
methods = {'Bisection', 'Newton-Raphson', 'Secant'};

fprintf('   Tolerance   | Bisection Iter | Newton Iter | Secant Iter\n');
fprintf('   ------------|----------------|-------------|------------\n');

for i = 1:length(tolerances)
    tol = tolerances(i);
    
    try
        [~, iter_b, ~] = bisection(test_func, 1.5, 2.5, tol, 100);
    catch
        iter_b = NaN;
    end
    
    try
        [~, iter_n, ~] = newton_raphson(test_func, test_func_prime, 2.5, tol, 50);
    catch
        iter_n = NaN;
    end
    
    try
        [~, iter_s, ~] = secant_method(test_func, 1.8, 2.3, tol, 50);
    catch
        iter_s = NaN;
    end
    
    fprintf('   %8.0e   | %12d   | %9d   | %9d\n', tol, iter_b, iter_n, iter_s);
end

% Robustness analysis
fprintf('\n3. Robustness Analysis:\n');
fprintf('   Testing methods with poor initial guesses:\n\n');

poor_guesses = [0.5, 5.0, -1.0, 10.0];
success_count = zeros(3, length(poor_guesses));

for i = 1:length(poor_guesses)
    guess = poor_guesses(i);
    fprintf('   Initial guess: %.1f\n', guess);
    
    % Bisection (needs bracket adjustment)
    try
        if test_func(guess) * test_func(guess + 1) < 0
            [~, ~, ~] = bisection(test_func, guess, guess + 1, 1e-8, 50);
            success_count(1, i) = 1;
            fprintf('     Bisection: SUCCESS\n');
        else
            fprintf('     Bisection: FAILED (no sign change)\n');
        end
    catch
        fprintf('     Bisection: FAILED (numerical error)\n');
    end
    
    % Newton-Raphson
    try
        [root_test, ~, ~] = newton_raphson(test_func, test_func_prime, guess, 1e-8, 50);
        if abs(root_test - true_root) < 1e-6
            success_count(2, i) = 1;
            fprintf('     Newton-Raphson: SUCCESS\n');
        else
            fprintf('     Newton-Raphson: FAILED (did not reach the root)\n');  % The cubic has only one real root
        end
    catch
        fprintf('     Newton-Raphson: FAILED (numerical error)\n');
    end
    
    % Secant
    try
        [root_test, ~, ~] = secant_method(test_func, guess, guess + 0.1, 1e-8, 50);
        if abs(root_test - true_root) < 1e-6
            success_count(3, i) = 1;
            fprintf('     Secant: SUCCESS\n');
        else
            fprintf('     Secant: FAILED (did not reach the root)\n');  % The cubic has only one real root
        end
    catch
        fprintf('     Secant: FAILED (numerical error)\n');
    end
    fprintf('\n');
end

% Overall success rates
fprintf('   Overall Success Rates:\n');
fprintf('   Bisection: %.0f%% | Newton-Raphson: %.0f%% | Secant: %.0f%%\n', ...
        100*sum(success_count(1,:))/length(poor_guesses), ...
        100*sum(success_count(2,:))/length(poor_guesses), ...
        100*sum(success_count(3,:))/length(poor_guesses));
```
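
The order estimate used in the comparison table can be checked in isolation. The sketch below applies plain Newton steps to the same cubic and measures the true error against the known root, then evaluates $q_k = \log(e_{k+1}/e_k) / \log(e_k/e_{k-1})$. It is a standalone illustration with its own loop, not the notebook's `newton_raphson` function, and the variable names are assumptions.

```octave
% Sketch: estimate the convergence order of Newton's method from true errors
f  = @(x) x^3 - 2*x - 5;
df = @(x) 3*x^2 - 2;
true_root = 2.0945514815423265;

x = 2.5;                       % same starting point as the comparison above
n_steps = 4;                   % stop before the error reaches machine precision
err = zeros(1, n_steps + 1);
err(1) = abs(x - true_root);
for k = 1:n_steps
    x = x - f(x) / df(x);      % Newton update
    err(k + 1) = abs(x - true_root);
end

% Order estimates from each run of three consecutive errors
p = log(err(3:end) ./ err(2:end-1)) ./ log(err(2:end-1) ./ err(1:end-2));
fprintf('Errors:           %s\n', mat2str(err, 3));
fprintf('Estimated orders: %s\n', mat2str(p, 3));
```

The estimates should approach 2 as the iterates enter the quadratic regime. Running more steps would drive the error down to rounding level, where the ratio of logarithms is no longer meaningful.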

---

## 7. Optimization Method Benchmarking Suite

```octave
% Comprehensive optimization benchmarking
fprintf('\n=== Optimization Method Benchmarking ===\n');

% Define comprehensive test suite of optimization problems
fprintf('1. Standard Test Function Suite:\n');

% Test function definitions
test_functions = struct();

% Sphere function (unimodal, separable)
test_functions(1).name = 'Sphere';
test_functions(1).func = @(x) sum(x.^2);
test_functions(1).grad = @(x) 2*x;
test_functions(1).optimum = [0; 0];
test_functions(1).opt_value = 0;
test_functions(1).init = [3; -2];

% Rosenbrock function (unimodal, non-separable)
test_functions(2).name = 'Rosenbrock';
test_functions(2).func = rosenbrock;
test_functions(2).grad = rosenbrock_grad;
test_functions(2).optimum = [1; 1];
test_functions(2).opt_value = 0;
test_functions(2).init = [-1.2; 1.0];

% Beale function (unimodal)
beale_func = @(x) (1.5 - x(1) + x(1)*x(2))^2 + (2.25 - x(1) + x(1)*x(2)^2)^2 + (2.625 - x(1) + x(1)*x(2)^3)^2;
beale_grad = @(x) [2*(1.5 - x(1) + x(1)*x(2))*(x(2) - 1) + 2*(2.25 - x(1) + x(1)*x(2)^2)*(x(2)^2 - 1) + 2*(2.625 - x(1) + x(1)*x(2)^3)*(x(2)^3 - 1); ...
                   2*(1.5 - x(1) + x(1)*x(2))*x(1) + 2*(2.25 - x(1) + x(1)*x(2)^2)*2*x(1)*x(2) + 2*(2.625 - x(1) + x(1)*x(2)^3)*3*x(1)*x(2)^2];

test_functions(3).name = 'Beale';
test_functions(3).func = beale_func;
test_functions(3).grad = beale_grad;
test_functions(3).optimum = [3; 0.5];
test_functions(3).opt_value = 0;
test_functions(3).init = [1; 1];

% Himmelblau function (multimodal)
test_functions(4).name = 'Himmelblau';
test_functions(4).func = himmelblau;
test_functions(4).grad = himmelblau_grad;
test_functions(4).optimum = [3; 2];  % One of four global minima
test_functions(4).opt_value = 0;
test_functions(4).init = [0; 0];

% Numerical Hessian via central differences.
% Defined before the benchmark loop because functions in an Octave script
% must be defined before they are first called.
function H = numerical_hessian(func, x)
    n = length(x);
    H = zeros(n, n);
    h = eps^(1/4);  % Suitable step for second differences; sqrt(eps) amplifies rounding error
    
    for i = 1:n
        for j = 1:n
            x_pp = x; x_pp(i) = x_pp(i) + h; x_pp(j) = x_pp(j) + h;
            x_pm = x; x_pm(i) = x_pm(i) + h; x_pm(j) = x_pm(j) - h;
            x_mp = x; x_mp(i) = x_mp(i) - h; x_mp(j) = x_mp(j) + h;
            x_mm = x; x_mm(i) = x_mm(i) - h; x_mm(j) = x_mm(j) - h;
            
            H(i,j) = (func(x_pp) - func(x_pm) - func(x_mp) + func(x_mm)) / (4*h^2);
        end
    end
end

% Benchmarking framework
methods = {'Gradient Descent', 'Newton', 'BFGS'};
max_iters = [5000, 100, 1000];  % Iteration limit for each method, in the same order
results = struct();

fprintf('\n2. Performance Benchmarking Results:\n');
fprintf('=====================================================================================================================\n');
fprintf('Function     | Method           | Iterations | Final Value      | Error to Optimum | Time (s)        | Status\n');
fprintf('=====================================================================================================================\n');

for f = 1:length(test_functions)
    prob = test_functions(f);
    
    for m = 1:length(methods)
        method = methods{m};
        
        % Time the optimization
        tic;
        
        try
            switch m
                case 1  % Gradient Descent with adaptive step size
                    [x_opt, f_opt, iters] = gradient_descent(prob.func, prob.grad, prob.init, 1e-8, max_iters(m));
                case 2  % Newton's Method
                    % Use numerical Hessian for general case
                    hess_func = @(x) numerical_hessian(prob.func, x);
                    [x_opt, f_opt, iters] = newton_optimization(prob.func, prob.grad, hess_func, prob.init, 1e-8, max_iters(m));
                case 3  % BFGS
                    [x_opt, f_opt, iters] = bfgs_method(prob.func, prob.grad, prob.init, 1e-8, max_iters(m));
            end
            
            elapsed_time = toc;
            error_to_opt = norm(x_opt - prob.optimum);
            
            % Determine status
            if error_to_opt < 1e-4 && abs(f_opt - prob.opt_value) < 1e-6
                status = 'SUCCESS';
            elseif iters >= max_iters(m)
                status = 'MAX_ITER';
            else
                status = 'PARTIAL';
            end
            
        catch ME
            elapsed_time = toc;
            f_opt = NaN;
            error_to_opt = NaN;
            iters = NaN;
            status = 'FAILED';
        end
        
        % Store results
        results(f, m).func_name = prob.name;
        results(f, m).method = method;
        results(f, m).iterations = iters;
        results(f, m).final_value = f_opt;
        results(f, m).error = error_to_opt;
        results(f, m).time = elapsed_time;
        results(f, m).status = status;
        
        % Print results
        fprintf('%-12s | %-16s | %8d   | %14.8f   | %14.6e   | %9.4f       | %s\n', ...
                prob.name, method, iters, f_opt, error_to_opt, elapsed_time, status);
    end
    fprintf('---------------------------------------------------------------------------------------------------------------------\n');
end

% Performance summary
fprintf('\n3. Method Performance Summary:\n');

success_rates = zeros(1, length(methods));
avg_iterations = zeros(1, length(methods));
avg_times = zeros(1, length(methods));

for m = 1:length(methods)
    method_results = results(:, m);
    
    % Success rate
    successes = sum(strcmp({method_results.status}, 'SUCCESS'));
    success_rates(m) = successes / length(test_functions) * 100;
    
    % Average iterations (only successful runs)
    successful_iters = [method_results(strcmp({method_results.status}, 'SUCCESS')).iterations];
    if ~isempty(successful_iters)
        avg_iterations(m) = mean(successful_iters);
    else
        avg_iterations(m) = NaN;
    end
    
    % Average time
    avg_times(m) = mean([method_results.time]);
end

fprintf('   Method Performance Rankings:\n');
fprintf('   ===========================================================\n');
fprintf('   Method           | Success Rate | Avg Iterations | Avg Time\n');
fprintf('   ===========================================================\n');

for m = 1:length(methods)
    fprintf('   %-16s | %11.1f%% | %12.1f   | %8.4fs\n', ...
            methods{m}, success_rates(m), avg_iterations(m), avg_times(m));
end

fprintf('\n4. Recommendations:\n');
fprintf('   Based on benchmarking results:\n');

[~, best_success] = max(success_rates);
[~, fastest_method] = min(avg_iterations);  % min() skips NaN entries, so the index refers to the full array
[~, most_efficient] = min(avg_times);

fprintf('   • Most Reliable: %s (%.1f%% success rate)\n', methods{best_success}, success_rates(best_success));
fprintf('   • Fastest Convergence: %s (%.1f average iterations)\n', methods{fastest_method}, avg_iterations(fastest_method));
fprintf('   • Most Time Efficient: %s (%.4fs average time)\n', methods{most_efficient}, avg_times(most_efficient));

fprintf('\n   Selection Guidelines:\n');
fprintf('   • Use BFGS for general-purpose optimization (best balance)\n');
fprintf('   • Use Newton''s method when Hessian is available and cheap to compute\n');
fprintf('   • Use Gradient Descent for very large-scale problems or when memory is limited\n');
fprintf('   • Always try multiple starting points for non-convex problems\n');
```
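
The finite-difference step size matters a great deal for second derivatives. As a rough illustration, the sketch below applies the same four-point central-difference formula to the Rosenbrock function at `[-1.2; 1.0]`, where the exact Hessian is known in closed form, and compares two step sizes. The handle `rosen` and the other names are defined locally for this example and are assumed, not taken from the notebook's own `rosenbrock` definition.

```octave
% Sketch: effect of the step size on a central-difference Hessian
rosen = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;
x0 = [-1.2; 1.0];

% Exact Hessian of the Rosenbrock function at x0
H_exact = [1200*x0(1)^2 - 400*x0(2) + 2, -400*x0(1); ...
           -400*x0(1),                    200];

steps = [sqrt(eps), eps^(1/4)];    % too small vs. balanced for second differences
hess_err = zeros(size(steps));
n = numel(x0);
I = eye(n);

for s = 1:numel(steps)
    h = steps(s);
    H = zeros(n);
    for i = 1:n
        for j = 1:n
            ei = h * I(:, i);      % step along coordinate i
            ej = h * I(:, j);      % step along coordinate j
            H(i, j) = (rosen(x0 + ei + ej) - rosen(x0 + ei - ej) ...
                     - rosen(x0 - ei + ej) + rosen(x0 - ei - ej)) / (4*h^2);
        end
    end
    hess_err(s) = norm(H - H_exact);
    fprintf('h = %.2e  ->  ||H - H_exact|| = %.2e\n', h, hess_err(s));
end
```

With `h = sqrt(eps)` the denominator `4*h^2` is on the order of machine epsilon, so rounding error in the function values dominates the result. A step near `eps^(1/4)` balances truncation and rounding error for this formula.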

---

## Summary

**Optimization & Root-Finding Mastery Completed:**

This notebook covered the core numerical optimization and root-finding techniques:

- ✅ **Root-Finding**: Bisection, Newton-Raphson, secant, fixed-point iteration methods
- ✅ **Nonlinear Systems**: Newton's method, Broyden's quasi-Newton for equation systems  
- ✅ **Unconstrained Optimization**: Gradient descent, Newton's method, BFGS quasi-Newton
- ✅ **Constrained Optimization**: Penalty methods, Lagrange multipliers, projected gradient
- ✅ **Global Optimization**: Simulated annealing, genetic algorithms, multi-start, differential evolution

**Key Algorithmic Insights:**
1. **Convergence Rates**: Newton (quadratic, order 2) > Secant (superlinear, order ≈ 1.618) > Bisection (linear)
2. **Robustness vs Speed**: Bisection is the most robust given a valid bracket; Newton is the fastest but needs a good initial guess
3. **Quasi-Newton**: BFGS provides superlinear convergence without computing the Hessian
4. **Global vs Local**: Local methods are fast but can miss the global optimum
5. **Constraint Handling**: Lagrange multipliers give exact solutions for equality-constrained problems when the stationarity system can be solved in closed form
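
The linear rate of bisection makes its cost predictable. Assuming the error estimate halves every iteration, as in the `bisection` routine from Section 1, the estimate after $k$ steps on a bracket $[a, b]$ is $(b - a)/2^k$, so about $\lceil \log_2((b - a)/\text{tol}) \rceil$ steps are needed. A small sketch of that arithmetic for the bracket and tolerances used in the work-precision table (variable names are illustrative):

```octave
% Sketch: predicted bisection iteration counts from the halving argument
a = 1.5;  b = 2.5;                               % bracket used in Section 6
tolerances = [1e-4, 1e-6, 1e-8, 1e-10, 1e-12];

% Smallest k with (b - a) / 2^k <= tol
n_bisect = ceil(log2((b - a) ./ tolerances));

for k = 1:numel(tolerances)
    fprintf('tol = %.0e  ->  about %d bisection steps\n', tolerances(k), n_bisect(k));
end
```

This is an upper bound on what the notebook's routine reports, since that routine also stops early when the residual falls below the tolerance. Newton's method typically needs only a handful of steps for the same tolerances once it is close to the root.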

**Professional Optimization Skills:**
- Select appropriate algorithms based on problem characteristics
- Implement robust convergence criteria and error handling
- Understand trade-offs between accuracy, speed, and reliability
- Apply global optimization for multi-modal problems
- Handle constrained problems with KKT conditions

**Real-World Applications:**
- **Engineering**: Design optimization, control system tuning, parameter estimation
- **Finance**: Portfolio optimization, risk management, option pricing models
- **Machine Learning**: Training algorithms, hyperparameter optimization
- **Scientific Computing**: Model fitting, inverse problems, calibration
- **Operations Research**: Resource allocation, scheduling, logistics

**Best Practices Established:**
- Always verify convergence criteria and check solution quality
- Use multiple starting points for complex optimization landscapes  
- Implement proper line search and step size control
- Validate solutions by checking optimality conditions
- Consider problem scaling and conditioning for numerical stability
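
One way to validate a candidate solution of an unconstrained problem is to test the second-order sufficient conditions: the gradient should vanish and the Hessian should be positive definite. The sketch below does this for the Himmelblau function with hand-derived derivatives. The handles `hb_grad`, `hb_hess`, and `is_strict_min` are hypothetical helpers written for this example and are assumed to match, not reuse, the notebook's `himmelblau` definitions.

```octave
% Sketch: check second-order optimality conditions at a candidate point
% Himmelblau: f(x) = (x1^2 + x2 - 11)^2 + (x1 + x2^2 - 7)^2
hb_grad = @(x) [4*x(1)*(x(1)^2 + x(2) - 11) + 2*(x(1) + x(2)^2 - 7); ...
                2*(x(1)^2 + x(2) - 11) + 4*x(2)*(x(1) + x(2)^2 - 7)];
hb_hess = @(x) [12*x(1)^2 + 4*x(2) - 42, 4*x(1) + 4*x(2); ...
                4*x(1) + 4*x(2),         12*x(2)^2 + 4*x(1) - 26];

% Strict local minimum: near-zero gradient and all Hessian eigenvalues positive
is_strict_min = @(g, H, tol) (norm(g) < tol) && all(eig((H + H') / 2) > 0);

x_candidate = [3; 2];     % known minimizer of Himmelblau
x_other     = [0; 0];     % not a stationary point

ok_candidate = is_strict_min(hb_grad(x_candidate), hb_hess(x_candidate), 1e-8);
ok_other     = is_strict_min(hb_grad(x_other),     hb_hess(x_other),     1e-8);

fprintf('Point [3, 2] passes: %d\n', ok_candidate);
fprintf('Point [0, 0] passes: %d\n', ok_other);
```

For constrained problems the same idea applies to the KKT conditions, with the Hessian of the Lagrangian restricted to the feasible directions.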

**Next Steps:**
- Apply these methods to domain-specific optimization problems
- Explore modern metaheuristic algorithms for complex problems
- Study convex optimization theory for guaranteed global solutions
- Implement specialized methods for specific constraint types

These tools provide a solid foundation for optimization work in research and industry.