diff --git a/examples/alphaevolve_math_problems/README.md b/examples/alphaevolve_math_problems/README.md new file mode 100644 index 00000000..172c1464 --- /dev/null +++ b/examples/alphaevolve_math_problems/README.md @@ -0,0 +1,23 @@ +## AlphaEvolve mathematical problems + +This folder covers the 14 problems from [AlphaEvolve's Appendices A and B](https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf). The circle packing problem (Appendix B.12) was among the first examples implemented by OpenEvolve; this folder provides the evaluator and initial program files for the remaining 13 problems. + +Not all problems are naturally stated as maximization problems; however, to keep the problem set standardized, we chose to **make every evaluator maximize its target metric**. We achieve this by some straightforward algebraic manipulation (e.g., a minimization problem is scored as `benchmark / achieved_value`), but this can easily be changed if the user finds it necessary. + +The problem from Appendix A is the following: +- [Matrix multiplication](matmul): obtain a faster algorithm for multiplying two matrices of sizes $m \times n$ and $n \times p$ (c.f. Appendix A). + +The remaining problems are from Appendix B: +1. [First autocorrelation inequality](first_autocorr_ineq): Construct a nonnegative step function $f:\mathbb{R} \mapsto \mathbb{R}$ to improve an upper bound on a constant related to the autoconvolution of $f$ (c.f. Appendix B.1.). +2. [Second autocorrelation inequality](second_autocorr_ineq): Construct a nonnegative step function $f:\mathbb{R} \mapsto \mathbb{R}$ to improve a lower bound on a constant related to the norm of the autoconvolution of $f$ (c.f. Appendix B.2.). +3. [Third autocorrelation inequality](third_autocorr_ineq): Construct a nonnegative step function $f:\mathbb{R} \mapsto \mathbb{R}$ to improve an upper bound on a constant related to the absolute value of the autoconvolution of $f$ (c.f. Appendix B.3.). +4. [An uncertainty inequality](uncertainty_ineq): Construct a function $f:\mathbb{R} \mapsto \mathbb{R}$ to obtain an upper bound on a constant related to $f$ and its Fourier transform (c.f. Appendix B.4.). +5. [Erdos minimum overlap problem](erdos_min_overlap): Construct a nonnegative step function $f:\mathbb{R} \mapsto \mathbb{R}$ satisfying some special properties to improve an upper bound on a constant that controls the asymptotics of the Minimum Overlap Problem (c.f. Appendix B.5.). +6. [Sums and differences of finite sets](sums_diffs_finite_sets): Construct a set of nonnegative integers $U$ satisfying some special properties to improve a lower bound on a constant related to sums and differences of finite sets (c.f. Appendix B.6.). +7. [Packing unit regular hexagons inside a regular hexagon](hexagon_packing): Place $n$ disjoint unit regular hexagons inside a larger regular hexagon, minimizing the side length of the outer hexagon. We consider the cases where $n = 11$ and $n = 12$ (c.f. Appendix B.7.). +8. [Minimizing the ratio of maximum to minimum distance](minimizing_max_min_dist): Place $n$ $d$-dimensional points in order to minimize the ratio between the maximum and minimum pairwise distances. We consider the cases where $n=16,d=2$ and $n=14,d=3$ (c.f. Appendix B.8.). +9. [The Heilbronn problem for triangles](heilbronn_triangle): Place $n$ points on or inside a triangle with unit area so that the area of the smallest triangle formed by these points is maximized. We consider the case where $n = 11$ (c.f.
Appendix B.9.). +10. [The Heilbronn problem for convex regions](heilbronn_convex): Place $n$ points on or inside a convex region with unit area so that the area of the smallest triangle formed by these points is maximized. We consider the case where $n = 13$ and $n=14$ (c.f. Appendix B.10.). +11. [Kissing number in dimension 11](kissing_number): Increase the lower bound on the $11$-dimensional kissing number, i.e., the number of disjoint unit spheres that can be packed tangent to a given unit sphere (c.f. Appendix B.11.). +12. [Packing circles inside a unit square to maximize sum of radii](../circle_packing): Place $n$ disjoint circles inside a unit square so as to maximize the sum of their radii (c.f. Appendix B.12.). +13. [Packing circles inside a rectangle of perimeter 4 to maximize sum of radii](circle_packing_rect): Place $n$ disjoint circles inside a rectangle of perimeter $4$ so as to maximize the sum of their radii (c.f. Appendix B.13.). diff --git a/examples/alphaevolve_math_problems/circle_packing_rect/config.yaml b/examples/alphaevolve_math_problems/circle_packing_rect/config.yaml new file mode 100644 index 00000000..6e270a54 --- /dev/null +++ b/examples/alphaevolve_math_problems/circle_packing_rect/config.yaml @@ -0,0 +1,62 @@ +# Evolution settings +max_iterations: 100 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist with deep expertise in circle packing problems, geometric optimization algorithms, and constraint satisfaction. + Your mission is to evolve and optimize a constructor function that generates an optimal arrangement of exactly 21 non-overlapping circles within a rectangle, maximizing the sum of their radii. + + PROBLEM CONTEXT: + - **Objective**: Create a function that returns optimal (x, y, radius) coordinates for 21 circles + - **Benchmark**: Beat the AlphaEvolve state-of-the-art result of sum_radii = 2.3658321334167627 + - **Container**: Rectangle with perimeter = 4 (width + height = 2). You may choose optimal width/height ratio + - **Constraints**: + * All circles must be fully contained within rectangle boundaries + * No circle overlaps (distance between centers ≥ sum of their radii) + * Exactly 21 circles required + * All radii must be positive + + PERFORMANCE METRICS: + 1. **sum_radii**: Total sum of all 21 circle radii (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: sum_radii / 2.3658321334167627 (progress toward beating benchmark) + 3. 
**eval_time**: Execution time in seconds (keep reasonable, prefer accuracy over speed) + + TECHNICAL REQUIREMENTS: + - **Determinism**: Use fixed random seeds if employing stochastic methods for reproducibility + - **Error handling**: Graceful handling of optimization failures or infeasible configurations + - **Memory efficiency**: Avoid excessive memory allocation for distance matrix computations + - **Scalability**: Design with potential extension to different circle counts in mind + + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/circle_packing_rect/evaluator.py b/examples/alphaevolve_math_problems/circle_packing_rect/evaluator.py new file mode 100644 index 00000000..25d8ab08 --- /dev/null +++ b/examples/alphaevolve_math_problems/circle_packing_rect/evaluator.py @@ -0,0 +1,111 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the circle packing problem on a rectangle +# of perimeter 4. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import time +import numpy as np +import sys +import os +from importlib import __import__ + +BENCHMARK = 2.3658321334167627 +NUM_CIRCLES = 21 +TOL = 1e-6 + + +def minimum_circumscribing_rectangle(circles: np.ndarray): + """Returns the width and height of the minimum circumscribing rectangle. + + Args: + circles: A numpy array of shape (num_circles, 3), where each row is of the + form (x, y, radius), specifying a circle. + + Returns: + A tuple (width, height) of the minimum circumscribing rectangle. 
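+
+    For example, a single circle [[0.5, 0.5, 0.5]] is circumscribed by a 1 x 1
+    square, so this function returns (1.0, 1.0).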
+ """ + min_x = np.min(circles[:, 0] - circles[:, 2]) + max_x = np.max(circles[:, 0] + circles[:, 2]) + min_y = np.min(circles[:, 1] - circles[:, 2]) + max_y = np.max(circles[:, 1] + circles[:, 2]) + return max_x - min_x, max_y - min_y + + +def validate_packing_radii(radii: np.ndarray) -> None: + n = len(radii) + for i in range(n): + if radii[i] < 0: + raise ValueError(f"Circle {i} has negative radius {radii[i]}") + elif np.isnan(radii[i]): + raise ValueError(f"Circle {i} has nan radius") + + +def validate_packing_overlap_wtol(circles: np.ndarray, tol: float = 1e-6) -> None: + n = len(circles) + for i in range(n): + for j in range(i + 1, n): + dist = np.sqrt(np.sum((circles[i, :2] - circles[j, :2]) ** 2)) + if dist < circles[i, 2] + circles[j, 2] - tol: + raise ValueError( + f"Circles {i} and {j} overlap: dist={dist}, r1+r2={circles[i,2]+circles[j,2]}" + ) + + +def validate_packing_inside_rect_wtol(circles: np.array, tol: float = 1e-6) -> None: + width, height = minimum_circumscribing_rectangle(circles) + if width + height > (2 + tol): + raise ValueError("Circles are not contained inside a rectangle of perimeter 4.") + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + circles = None + eval_time = 0 + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + + start_time = time.time() + circles = program.circle_packing21() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(circles, np.ndarray): + circles = np.array(circles) + + if circles.shape != (NUM_CIRCLES, 3): + raise ValueError( + f"Invalid shapes: circles = {circles.shape}, expected {(NUM_CIRCLES,3)}" + ) + + validate_packing_radii(circles[:, -1]) + validate_packing_overlap_wtol(circles, TOL) + validate_packing_inside_rect_wtol(circles, TOL) + + radii_sum = np.sum(circles[:, -1]) + + return { + "radii_sum": float(radii_sum), + "combined_score": float(radii_sum / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/circle_packing_rect/initial_program.py b/examples/alphaevolve_math_problems/circle_packing_rect/initial_program.py new file mode 100644 index 00000000..601ae386 --- /dev/null +++ b/examples/alphaevolve_math_problems/circle_packing_rect/initial_program.py @@ -0,0 +1,22 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def circle_packing21() -> np.ndarray: + """ + Places 21 non-overlapping circles inside a rectangle of perimeter 4 in order to maximize the sum of their radii. + + Returns: + circles: np.array of shape (21,3), where the i-th row (x,y,r) stores the (x,y) coordinates of the i-th circle of radius r. 
+ """ + n = 21 + circles = np.zeros((n, 3)) + + return circles + + +# EVOLVE-BLOCK-END + +if __name__ == "__main__": + circles = circle_packing21() + print(f"Radii sum: {np.sum(circles[:,-1])}") diff --git a/examples/alphaevolve_math_problems/circle_packing_rect/requirements.txt b/examples/alphaevolve_math_problems/circle_packing_rect/requirements.txt new file mode 100644 index 00000000..296d6545 --- /dev/null +++ b/examples/alphaevolve_math_problems/circle_packing_rect/requirements.txt @@ -0,0 +1 @@ +numpy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/erdos_min_overlap/config.yaml b/examples/alphaevolve_math_problems/erdos_min_overlap/config.yaml new file mode 100644 index 00000000..f33512d3 --- /dev/null +++ b/examples/alphaevolve_math_problems/erdos_min_overlap/config.yaml @@ -0,0 +1,60 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in harmonic analysis, numerical optimization, and AI-driven mathematical discovery. + Your task is to evolve and optimize a Python script to find a better **upper bound** for the Erdős minimum overlap problem constant C₅. + + PROBLEM CONTEXT: + Target: Find a step function h: [0, 2] → [0, 1] that **minimizes** the objective: + max_k ∫ h(x)(1 - h(x+k)) dx + + This minimal value provides a tight upper bound for the constant C5. + + Current best known upper bound: C5 ≤ 0.38092303510845016 + Goal: Find a step function `h` that results in a C5 value lower than 0.38092303510845016. + + CONSTRAINTS: + 1. The function `h` must have values in the range [0, 1]. + 2. The integral of h(x) over [0, 2] must be exactly 1. + + PERFORMANCE METRICS: + - c5_bound: The bound found by the program. + - combined_score: 0.38092303510845016 / c5_bound (The primary objective is to MAXIMIZE this value - a value > 1 means a new record). + - n_points: number of points used in the discretization. + - eval_time: evaluation time of the program. + + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/erdos_min_overlap/evaluator.py b/examples/alphaevolve_math_problems/erdos_min_overlap/evaluator.py new file mode 100644 index 00000000..2fd91bac --- /dev/null +++ b/examples/alphaevolve_math_problems/erdos_min_overlap/evaluator.py @@ -0,0 +1,76 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the erdos minimum overlap problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. 
+# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +# Known bounds +BENCHMARK = 0.38092303510845016 + + +def verify_c5_solution(h_values: np.ndarray, c5_achieved: float, n_points: int): + """Verifies the C5 upper bound solution.""" + + if h_values.shape != (n_points,): + raise ValueError(f"Expected h shape ({n_points},), got {h_values.shape}") + + # Verify h(x) in [0, 1] constraint + if np.any(h_values < 0) or np.any(h_values > 1): + raise ValueError(f"h(x) is not in [0, 1]. Range: [{h_values.min()}, {h_values.max()}]") + + # Verify integral of h = 1 constraint + dx = 2.0 / n_points + integral_h = np.sum(h_values) * dx + if not np.isclose(integral_h, 1.0, atol=1e-3): + raise ValueError(f"Integral of h is not close to 1. Got: {integral_h:.6f}") + + # Re-calculate the C5 bound using np.correlate + j_values = 1.0 - h_values + correlation = np.correlate(h_values, j_values, mode="full") * dx + computed_c5 = np.max(correlation) + + # Check for consistency + if not np.isclose(computed_c5, c5_achieved, atol=1e-4): + raise ValueError(f"C5 mismatch: reported {c5_achieved:.6f}, computed {computed_c5:.6f}") + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + h_values, c5_bound, n_points = program.run() + end_time = time.time() + eval_time = end_time - start_time + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + verify_c5_solution(h_values, c5_bound, n_points) + + return { + "c5_bound": float(c5_bound), + "combined_score": BENCHMARK / float(c5_bound), + "n_points": int(n_points), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/erdos_min_overlap/initial_program.py b/examples/alphaevolve_math_problems/erdos_min_overlap/initial_program.py new file mode 100644 index 00000000..7d07d8cd --- /dev/null +++ b/examples/alphaevolve_math_problems/erdos_min_overlap/initial_program.py @@ -0,0 +1,96 @@ +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +import optax +import numpy as np +from dataclasses import dataclass +import tqdm + + +@dataclass +class Hyperparameters: + num_intervals: int = 200 + learning_rate: float = 0.005 + num_steps: int = 20000 + penalty_strength: float = 1000000.0 + + +class ErdosOptimizer: + """ + Finds a step function h that minimizes the maximum overlap integral. + """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + self.domain_width = 2.0 + self.dx = self.domain_width / self.hypers.num_intervals + + def _objective_fn(self, latent_h_values: jnp.ndarray) -> jnp.ndarray: + """ + The loss function includes the objective and a penalty for the constraint. 
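+
+        The max-overlap objective max_k ∫ h(x)(1 - h(x+k)) dx is approximated by an
+        FFT-based correlation of the discretized h and j = 1 - h (scaled by dx), and
+        the normalization constraint ∫ h dx = 1 is added as a quadratic penalty.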
+ """ + # Enforce h(x) in [0, 1] via sigmoid (hard constraint) + h = jax.nn.sigmoid(latent_h_values) + + # Calculate the primary objective (max correlation) + j = 1.0 - h + N = self.hypers.num_intervals + h_padded = jnp.pad(h, (0, N)) + j_padded = jnp.pad(j, (0, N)) + corr_fft = jnp.fft.fft(h_padded) * jnp.conj(jnp.fft.fft(j_padded)) + correlation = jnp.fft.ifft(corr_fft).real + scaled_correlation = correlation * self.dx + objective_loss = jnp.max(scaled_correlation) + + # Calculate the penalty for the integral constraint + integral_h = jnp.sum(h) * self.dx + constraint_loss = (integral_h - 1.0) ** 2 + + # Combine the objective with the penalty + total_loss = objective_loss + self.hypers.penalty_strength * constraint_loss + return total_loss + + def run_optimization(self): + optimizer = optax.adam(self.hypers.learning_rate) + + key = jax.random.PRNGKey(42) + latent_h_values = jax.random.normal(key, (self.hypers.num_intervals,)) + + opt_state = optimizer.init(latent_h_values) + + @jax.jit + def train_step(latent_h_values, opt_state): + loss, grads = jax.value_and_grad(self._objective_fn)(latent_h_values) + updates, opt_state = optimizer.update(grads, opt_state) + latent_h_values = optax.apply_updates(latent_h_values, updates) + return latent_h_values, opt_state, loss + + print(f"Optimizing a step function with {self.hypers.num_intervals} intervals...") + for step in tqdm.tqdm(range(self.hypers.num_steps), desc="Optimizing"): + latent_h_values, opt_state, loss = train_step(latent_h_values, opt_state) + + # Final h is just the sigmoid of the latent values + final_h = jax.nn.sigmoid(latent_h_values) + + # Re-calculate final objective loss without the penalty for the report + j = 1.0 - final_h + N = self.hypers.num_intervals + h_padded = jnp.pad(final_h, (0, N)) + j_padded = jnp.pad(j, (0, N)) + corr_fft = jnp.fft.fft(h_padded) * jnp.conj(jnp.fft.fft(j_padded)) + correlation = jnp.fft.ifft(corr_fft).real + c5_bound = jnp.max(correlation * self.dx) + + print(f"Optimization complete. 
Final C5 upper bound: {c5_bound:.8f}") + return np.array(final_h), float(c5_bound) + + +def run(): + hypers = Hyperparameters() + optimizer = ErdosOptimizer(hypers) + final_h_values, c5_bound = optimizer.run_optimization() + + return final_h_values, c5_bound, hypers.num_intervals + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/erdos_min_overlap/requirements.txt b/examples/alphaevolve_math_problems/erdos_min_overlap/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/erdos_min_overlap/requirements.txt @@ -0,0 +1,4 @@ +numpy +jax +optax +tqdm \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/first_autocorr_ineq/config.yaml b/examples/alphaevolve_math_problems/first_autocorr_ineq/config.yaml new file mode 100644 index 00000000..44603d2e --- /dev/null +++ b/examples/alphaevolve_math_problems/first_autocorr_ineq/config.yaml @@ -0,0 +1,83 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in functional analysis, harmonic analysis, numerical optimization, and AI-driven mathematical discovery. + Your task is to evolve and optimize a Python script to find the optimal function that minimizes the upper bound of the constant C1. + + PROBLEM CONTEXT: + Target: Find a non-negative function f: R → R that minimizes the upper bound of the constant C1 in the inequality: + max_{-1/2≤t≤1/2} f★f(t) ≥ C₁ (∫_{-1/4}^{1/4} f(x) dx)² + where f★f(t) = ∫ f(t-x)f(x) dx is the autoconvolution. + + Current best known bounds: + * literature: 1.28 ≤ C1 ≤ 1.5098 + * alphaevolve: C1 ≤ 1.5052939684401607 + Goal: Beat the current upper bound of 1.5052939684401607, discovered by AlphaEvolve using step functions. + + Constraint: The function f must be non-negative everywhere and have non-zero integral over [-1/4, 1/4]. + + MATHEMATICAL FORMULATION: + Given: Discretized domain [-1/4, 1/4] with n_points equally-spaced grid points. + Objective: Minimize max_{t∈[-1/2,1/2]} (f★f)(t) / (∫f dx)² over all non-negative functions f. + Discretization: Use finite differences and discrete convolution to approximate integrals and autoconvolution.
+ + PERFORMANCE METRICS: + combined_score: The 1.5052939684401607/C1 constant achieved by the discovered function (PRIMARY OBJECTIVE - maximize this) + c1: constant achieved (current best upper bound) + eval_time: Time to reach best solution + n_points: number of points used in the integral interval + loss: loss valued of the function used in minimization + + VALIDATION FRAMEWORK: + Mathematical Validation: Verify the C1 computation using independent numerical integration + Non-negativity Check: Ensure f(x) ≥ 0 everywhere (up to numerical tolerance) + Integral Verification: Confirm ∫f dx > 0 to avoid degenerate solutions + Consistency Check: Re-compute autoconvolution and verify inequality holds + + TECHNICAL REQUIREMENTS: + Reproducibility: Control random seeds for deterministic results + Numerical Stability: Handle potential division by zero in integral ratios + Memory Management: Discrete convolution can be memory-intensive for large grids + Constraint Handling: Maintain non-negativity throughout optimization + + SUCCESS CRITERIA: + Primary: Achieving c1 < 1.5052939684401607 (beating current record) + Secondary: Finding interpretable functions that achieve high C1 values + Robustness: Solutions that work across multiple runs and parameter settings + Efficiency: Fast convergence to high-quality solutions + + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/first_autocorr_ineq/evaluator.py b/examples/alphaevolve_math_problems/first_autocorr_ineq/evaluator.py new file mode 100644 index 00000000..f0e32b78 --- /dev/null +++ b/examples/alphaevolve_math_problems/first_autocorr_ineq/evaluator.py @@ -0,0 +1,87 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the first autocorrelation inequality problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +# known bounds +BENCHMARK = 1.5052939684401607 + + +def verify_autocorrelation_solution(f_values: np.ndarray, c1_achieved: float, n_points: int): + """Verify the autocorrelation solution for UPPER BOUND optimization""" + + # Check shape + if f_values.shape != (n_points,): + raise ValueError(f"Expected function values shape {(n_points,)}. 
Got {f_values.shape}.") + + # Check non-negativity + if np.any(f_values < 0.0): + raise ValueError("Function must be non-negative.") + + # Recompute C1 to verify + dx = 0.5 / n_points + f_nonneg = np.maximum(f_values, 0.0) + + # Compute the FULL autoconvolution + autoconv = np.convolve(f_nonneg, f_nonneg, mode="full") * dx + + # The rest of the calculation can be simplified as we now take the max over the whole result + integral_sq = (np.sum(f_nonneg) * dx) ** 2 + + if integral_sq < 1e-8: + raise ValueError("Function integral is too small.") + + # The max of the full autoconv is the correct value + computed_c1 = float(np.max(autoconv / integral_sq)) + + # Verify consistency + delta = abs(computed_c1 - c1_achieved) + if delta > 1e-6: + raise ValueError( + f"C1 mismatch: reported {c1_achieved:.6f}, computed {computed_c1:.6f}, delta: {delta:.6f}" + ) + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + f_values, c1_achieved, loss, n_points = program.run() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + verify_autocorrelation_solution(f_values, c1_achieved, n_points) + return { + "c1": float(c1_achieved), + "combined_score": BENCHMARK / float(c1_achieved), + "loss": float(loss), + "n_points": int(n_points), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/first_autocorr_ineq/initial_program.py b/examples/alphaevolve_math_problems/first_autocorr_ineq/initial_program.py new file mode 100644 index 00000000..94a7aa58 --- /dev/null +++ b/examples/alphaevolve_math_problems/first_autocorr_ineq/initial_program.py @@ -0,0 +1,117 @@ +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +import optax +import numpy as np +from dataclasses import dataclass + + +@dataclass +class Hyperparameters: + """Hyperparameters for the optimization process.""" + + num_intervals: int = 600 + learning_rate: float = 0.005 + end_lr_factor: float = 1e-4 + num_steps: int = 40000 + warmup_steps: int = 2000 + + +class AutocorrelationOptimizer: + """ + Optimizes a discretized function to find the minimal C1 constant. + """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + self.domain_width = 0.5 + self.dx = self.domain_width / self.hypers.num_intervals + + def _objective_fn(self, f_values: jnp.ndarray) -> jnp.ndarray: + """ + Computes the objective function, which is the C1 ratio. + We minimize this ratio to find a tight upper bound. + """ + f_non_negative = jax.nn.relu(f_values) + integral_f = jnp.sum(f_non_negative) * self.dx + + eps = 1e-9 + integral_f_safe = jnp.maximum(integral_f, eps) + + N = self.hypers.num_intervals + padded_f = jnp.pad(f_non_negative, (0, N)) + + fft_f = jnp.fft.fft(padded_f) + fft_conv = fft_f * fft_f + conv_f_f = jnp.fft.ifft(fft_conv).real + + # Scale by dx. + scaled_conv_f_f = conv_f_f * self.dx + + max_conv = jnp.max(scaled_conv_f_f) + c1_ratio = max_conv / (integral_f_safe**2) + + # Return the value to be MINIMIZED. 
+ return c1_ratio + + def train_step(self, f_values: jnp.ndarray, opt_state: optax.OptState) -> tuple: + """Performs a single training step.""" + loss, grads = jax.value_and_grad(self._objective_fn)(f_values) + updates, opt_state = self.optimizer.update(grads, opt_state, f_values) + f_values = optax.apply_updates(f_values, updates) + + return f_values, opt_state, loss + + def run_optimization(self): + """Sets up and runs the full optimization process.""" + schedule = optax.warmup_cosine_decay_schedule( + init_value=0.0, + peak_value=self.hypers.learning_rate, + warmup_steps=self.hypers.warmup_steps, + decay_steps=self.hypers.num_steps - self.hypers.warmup_steps, + end_value=self.hypers.learning_rate * self.hypers.end_lr_factor, + ) + self.optimizer = optax.adam(learning_rate=schedule) + + key = jax.random.PRNGKey(42) + N = self.hypers.num_intervals + f_values = jnp.zeros((N,)) + start_idx, end_idx = N // 4, 3 * N // 4 + f_values = f_values.at[start_idx:end_idx].set(1.0) + f_values += 0.05 * jax.random.uniform(key, (N,)) + + opt_state = self.optimizer.init(f_values) + + print( + f"Number of intervals (N): {self.hypers.num_intervals}, Steps: {self.hypers.num_steps}" + ) + + train_step_jit = jax.jit(self.train_step) + + loss = jnp.inf # Initialize loss + for step in range(self.hypers.num_steps): + f_values, opt_state, loss = train_step_jit(f_values, opt_state) + if step % 2000 == 0 or step == self.hypers.num_steps - 1: + # CORRECTED PRINTING: Show the positive loss value directly. + print(f"Step {step:5d} | C1 ≈ {loss:.8f}") + + print(f"Final C1 found: {loss:.8f}") + + return jax.nn.relu(f_values), loss + + +def run(): + """Entry point for running the optimization and returning results.""" + hypers = Hyperparameters() + optimizer = AutocorrelationOptimizer(hypers) + + optimized_f, final_loss_val = optimizer.run_optimization() + + final_c1 = float(final_loss_val) + + f_values_np = np.array(optimized_f) + + return f_values_np, final_c1, final_loss_val, hypers.num_intervals + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/first_autocorr_ineq/requirements.txt b/examples/alphaevolve_math_problems/first_autocorr_ineq/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/first_autocorr_ineq/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/13/config.yaml b/examples/alphaevolve_math_problems/heilbronn_convex/13/config.yaml new file mode 100644 index 00000000..232db58e --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/13/config.yaml @@ -0,0 +1,125 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist with deep expertise in the Heilbronn triangle problem - a fundamental challenge in discrete geometry first posed by Hans Heilbronn in 1957. 
+ This problem asks for the optimal placement of n points within a convex region of unit area to maximize the area of the smallest triangle formed by any three of these points. + Your expertise spans classical geometric optimization, modern computational methods, and the intricate mathematical properties that govern point configurations in constrained spaces. + + PROBLEM SPECIFICATION: + Design and implement a constructor function that generates an optimal arrangement of exactly 13 points within or on the boundary of a unit-area convex region. The solution must: + - Place all 13 points within or on a convex boundary + - Maximize the minimum triangle area among all C(13,3) = 286 possible triangles + - Return deterministic, reproducible results + - Execute efficiently within computational constraints + + PERFORMANCE METRICS: + 1. **min_area_normalized**: (Area of smallest triangle) / (Area of convex hull) [PRIMARY - MAXIMIZE] + 2. **combined_score**: min_area_normalized / 0.030936889034895654 [BENCHMARK COMPARISON - TARGET > 1.0] + 3. **eval_time**: Execution time in seconds [EFFICIENCY - secondary priority] + + TECHNICAL REQUIREMENTS: + - **Determinism**: Use fixed random seeds if employing stochastic methods for reproducibility + - **Error handling**: Graceful handling of optimization failures or infeasible configurations + + MATHEMATICAL CONTEXT & THEORETICAL BACKGROUND: + - **PROBLEM COMPLEXITY**: The Heilbronn problem is among the most challenging in discrete geometry, with optimal configurations rigorously known only for n ≤ 4 points + - **ASYMPTOTIC BEHAVIOR**: For large n, the optimal value approaches O(1/n²) with logarithmic corrections, but the exact constant remains unknown + - **GEOMETRIC CONSTRAINTS**: Points must balance competing objectives: + * Interior points can form larger triangles but create crowding + * Boundary points avoid area penalties but limit triangle formation + * Edge cases arise when three points become nearly collinear + - **SYMMETRY CONSIDERATIONS**: Optimal configurations often exhibit rotational symmetries (particularly 3-fold due to triangular geometry) + - **SCALING INVARIANCE**: The problem is scale-invariant; solutions can be normalized to any convex region + - **CRITICAL GEOMETRIC PROPERTIES**: + * Delaunay triangulation properties and angle optimization + * Voronoi diagram regularity as indicator of point distribution quality + * Relationship between circumradius and triangle area + * Connection to sphere packing and energy minimization principles + + ADVANCED OPTIMIZATION STRATEGIES: + - **MULTI-SCALE APPROACH**: Coarse global search → fine local refinement with adaptive step sizes + - **CONSTRAINT HANDLING**: Penalty methods, barrier functions, or projection operators for convexity + - **INITIALIZATION STRATEGIES**: + * Perturbed regular grids (triangular, square, hexagonal lattices) + * Random points with force-based relaxation + * Symmetry-constrained configurations (3-fold, 6-fold rotational) + * Hybrid boundary/interior distributions + * Low-discrepancy sequences (Sobol, Halton) for uniform coverage + - **OBJECTIVE FUNCTION DESIGN**: + * Smooth approximations to min() function (LogSumExp, p-norms with p→∞) + * Barrier methods for boundary constraints + * Multi-objective formulations balancing multiple triangle areas + * Weighted combinations of smallest k triangle areas + - **ADVANCED TECHNIQUES**: + * Riemannian optimization on manifolds + * Variational methods treating point density as continuous field + * Machine learning-guided search using 
learned geometric priors + * Topological optimization considering point connectivity graphs + * Continuation methods with parameter homotopy + + GEOMETRIC INSIGHTS & HEURISTICS: + - **BOUNDARY CONSIDERATIONS**: Points on boundary contribute to convex hull but may form smaller triangles + - **TRIANGLE DEGENERACY**: Avoid near-collinear configurations that create arbitrarily small triangles + - **LOCAL VS GLOBAL**: Balance between locally optimal triangle sizes and global configuration harmony + - **SYMMETRY EXPLOITATION**: 3-fold rotational symmetry often appears in optimal configurations + - **VORONOI RELATIONSHIPS**: Points should have roughly equal Voronoi cell areas for optimal distribution + - **ENERGY ANALOGIES**: Treat as electrostatic repulsion or gravitational equilibrium problem + - **HISTORICAL APPROACHES**: + * Regular lattice arrangements (suboptimal but provide baselines) + * Hexagonal close-packing adaptations + * Force-based relaxation (treating points as mutually repelling particles) + * Simulated annealing and evolutionary computation + * Gradient descent with carefully designed objective functions + + VALIDATION FRAMEWORK: + - **Geometric constraint verification**: + * Point count validation: Exactly 13 points required + * Convexity check: All points within or on boundary of convex hull + - **Data integrity checks**: + * Coordinate bounds: All coordinates are finite real numbers + * Point uniqueness: No duplicate points (within numerical tolerance) + * Geometric consistency: Points form valid geometric configuration + - **Solution quality assessment**: + * Local optimality testing through small perturbations + * Symmetry analysis: Detection of rotational/reflectional symmetries + * Distribution quality: Voronoi cell area variance, nearest neighbor statistics + * Convergence verification: For iterative methods, check convergence criteria + - **Determinism verification**: + * Multiple execution consistency: Same results across multiple runs + * Seed effectiveness: Proper random seed implementation + * Platform independence: Results stable across different computing environments + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/13/evaluator.py b/examples/alphaevolve_math_problems/heilbronn_convex/13/evaluator.py new file mode 100644 index 00000000..c483c733 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/13/evaluator.py @@ -0,0 +1,72 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the heilbronn problem for convex regions, with +# 13 points. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. 
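+#
+# The evaluator imports the evolved program, calls heilbronn_convex13() to obtain a
+# (13, 2) array of points, and normalizes the smallest triangle area by the area of
+# the points' convex hull (scipy's ConvexHull.volume is the area in 2D).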
+# +# ===--------------------------------------------------------------------------------------===# + +import time +import numpy as np +import itertools +from scipy.spatial import ConvexHull +import sys +import os +from importlib import __import__ + +BENCHMARK = 0.030936889034895654 +NUM_POINTS = 13 + + +def triangle_area(p1: np.ndarray, p2: np.ndarray, p3: np.ndarray) -> float: + """Calculates the area of a triangle given its vertices p1, p2, and p3.""" + return abs(p1[0] * (p2[1] - p3[1]) + p2[0] * (p3[1] - p1[1]) + p3[0] * (p1[1] - p2[1])) / 2 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + points = None + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + + start_time = time.time() + points = program.heilbronn_convex13() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + # validate + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape != (NUM_POINTS, 2): + raise ValueError(f"Invalid shapes: points = {points.shape}, expected {(NUM_POINTS,2)}") + # metrics + min_triangle_area = min( + [triangle_area(p1, p2, p3) for p1, p2, p3 in itertools.combinations(points, 3)] + ) + convex_hull_area = ConvexHull(points).volume + min_area_normalized = min_triangle_area / convex_hull_area + + return { + "min_area_normalized": float(min_area_normalized), + "combined_score": float(min_area_normalized / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/13/initial_program.py b/examples/alphaevolve_math_problems/heilbronn_convex/13/initial_program.py new file mode 100644 index 00000000..0f612acf --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/13/initial_program.py @@ -0,0 +1,19 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def heilbronn_convex13() -> np.ndarray: + """ + Construct an arrangement of n points on or inside a convex region in order to maximize the area of the + smallest triangle formed by these points. Here n = 13. + + Returns: + points: np.ndarray of shape (13,2) with the x,y coordinates of the points. 
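+
+    The evaluator scores the arrangement by the smallest triangle area divided by the
+    area of the convex hull of the returned points, so the score is scale-invariant.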
+ """ + n = 13 + rng = np.random.default_rng(seed=42) + points = rng.random((n, 2)) + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/13/requirements.txt b/examples/alphaevolve_math_problems/heilbronn_convex/13/requirements.txt new file mode 100644 index 00000000..f31e2ae9 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/13/requirements.txt @@ -0,0 +1,2 @@ +numpy +scipy diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/14/config.yaml b/examples/alphaevolve_math_problems/heilbronn_convex/14/config.yaml new file mode 100644 index 00000000..c3fe3fb5 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/14/config.yaml @@ -0,0 +1,63 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist with deep expertise in the Heilbronn triangle problem - a fundamental challenge in discrete geometry first posed by Hans Heilbronn in 1957. + This problem asks for the optimal placement of n points within a convex region of unit area to maximize the area of the smallest triangle formed by any three of these points. + Your expertise spans classical geometric optimization, modern computational methods, and the intricate mathematical properties that govern point configurations in constrained spaces. + + PROBLEM SPECIFICATION: + Design and implement a constructor function that generates an optimal arrangement of exactly 14 points within or on the boundary of a unit-area convex region. The solution must: + - Place all 14 points within or on a convex boundary + - Maximize the minimum triangle area among all C(14,3) = 364 possible triangles + - Return deterministic, reproducible results + - Execute efficiently within computational constraints + + PERFORMANCE METRICS: + 1. **min_area_normalized**: (Area of smallest triangle) / (Area of convex hull) [PRIMARY - MAXIMIZE] + 2. **combined_score**: min_area_normalized / 0.027835571458482138 [BENCHMARK COMPARISON - TARGET > 1.0] + 3. 
**eval_time**: Execution time in seconds [EFFICIENCY - secondary priority] + + BENCHMARK & PERFORMANCE TARGET: + - **CURRENT STATE-OF-THE-ART**: min_area_normalized = 0.027835571458482138 (achieved by AlphaEvolve algorithm) + - **PRIMARY METRIC**: min_area_normalized = (smallest triangle area) / (convex hull area) + - **SUCCESS CRITERION**: combined_score = min_area_normalized / 0.027835571458482138 > 1.0 + - **SIGNIFICANCE**: Even marginal improvements (combined_score > 1.01) represent meaningful advances in this notoriously difficult problem + + TECHNICAL REQUIREMENTS: + - **Determinism**: Use fixed random seeds if employing stochastic methods for reproducibility + - **Error handling**: Graceful handling of optimization failures or infeasible configurations + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/14/evaluator.py b/examples/alphaevolve_math_problems/heilbronn_convex/14/evaluator.py new file mode 100644 index 00000000..0113ff87 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/14/evaluator.py @@ -0,0 +1,71 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the heilbronn problem for convex regions, with +# 14 points. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import time +import numpy as np +import itertools +from scipy.spatial import ConvexHull +import sys +import os +from importlib import __import__ + +BENCHMARK = 0.027835571458482138 +NUM_POINTS = 14 + + +def triangle_area(p1: np.ndarray, p2: np.ndarray, p3: np.ndarray) -> float: + """Calculates the area of a triangle given its vertices p1, p2, and p3.""" + return abs(p1[0] * (p2[1] - p3[1]) + p2[0] * (p3[1] - p1[1]) + p3[0] * (p1[1] - p2[1])) / 2 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + points = None + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + + start_time = time.time() + points = program.heilbronn_convex14() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape != (NUM_POINTS, 2): + raise ValueError(f"Invalid shapes: points = {points.shape}, expected {(NUM_POINTS,2)}") + + min_triangle_area = min( + [triangle_area(p1, p2, p3) for p1, p2, p3 in itertools.combinations(points, 3)] + ) + convex_hull_area = ConvexHull(points).volume + min_area_normalized = min_triangle_area / convex_hull_area + + return { + "min_area_normalized": float(min_area_normalized), + "combined_score": float(min_area_normalized / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/14/initial_program.py b/examples/alphaevolve_math_problems/heilbronn_convex/14/initial_program.py new 
file mode 100644 index 00000000..527f6feb --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/14/initial_program.py @@ -0,0 +1,19 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def heilbronn_convex14() -> np.ndarray: + """ + Construct an arrangement of n points on or inside a convex region in order to maximize the area of the + smallest triangle formed by these points. Here n = 14. + + Returns: + points: np.ndarray of shape (14,2) with the x,y coordinates of the points. + """ + n = 14 + rng = np.random.default_rng(seed=42) + points = rng.random((n, 2)) + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/heilbronn_convex/14/requirements.txt b/examples/alphaevolve_math_problems/heilbronn_convex/14/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_convex/14/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/heilbronn_triangle/config.yaml b/examples/alphaevolve_math_problems/heilbronn_triangle/config.yaml new file mode 100644 index 00000000..4ad51bb1 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_triangle/config.yaml @@ -0,0 +1,52 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist with deep expertise in the Heilbronn triangle problem - a classical problem in discrete geometry that asks for the optimal placement of n points to maximize the minimum triangle area formed by any three points. + + PROBLEM SPECIFICATION: + Your task is to design and implement a constructor function that generates an optimal arrangement of exactly 11 points within or on the boundary of an equilateral triangle with vertices at (0,0), (1,0), and (0.5, sqrt(3)/2). + + PERFORMANCE METRICS: + 1. **min_area_normalized**: Area of the smallest triangle among all point triplets (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: min_area_normalized / 0.036529889880030156 (BENCHMARK COMPARISON - maximize above 1.0) + 3. 
**eval_time**: Function execution time in seconds (EFFICIENCY - minimize, but secondary to quality) + + TECHNICAL REQUIREMENTS: + - **Determinism**: Use fixed random seeds if employing stochastic methods for reproducibility + - **Error handling**: Graceful handling of optimization failures or infeasible configurations + + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/heilbronn_triangle/evaluator.py b/examples/alphaevolve_math_problems/heilbronn_triangle/evaluator.py new file mode 100644 index 00000000..eea24396 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_triangle/evaluator.py @@ -0,0 +1,92 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the heilbronn problem for triangles, with +# 11 points. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import time +import numpy as np +import sys +import os +from importlib import __import__ +import itertools + +BENCHMARK = 0.036529889880030156 +TOL = 1e-6 +NUM_POINTS = 11 + + +def check_inside_triangle_wtol(points: np.ndarray, tol: float = 1e-6): + """Checks that all points are inside the triangle with vertices (0,0), (1,0), (0.5, sqrt(3)/2). + + Args: + points: Array of 2D points to check + tol: Tolerance for numerical errors + """ + for x, y in points: + cond1 = y >= -tol + cond2 = np.sqrt(3) * x <= np.sqrt(3) - y + tol + cond3 = y <= np.sqrt(3) * x + tol + + if not (cond1 and cond2 and cond3): + raise ValueError( + f"Point ({x}, {y}) is outside the equilateral triangle (tolerance: {tol})." 
+ ) + + +def triangle_area(a: np.array, b: np.array, c: np.array) -> float: + return np.abs(a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1])) / 2 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + points = None + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + + start_time = time.time() + points = program.heilbronn_triangle11() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape != (NUM_POINTS, 2): + raise ValueError(f"Invalid shapes: points = {points.shape}, expected {(NUM_POINTS,2)}") + + check_inside_triangle_wtol(points, TOL) + + a = np.array([0, 0]) + b = np.array([1, 0]) + c = np.array([0.5, np.sqrt(3) / 2]) + min_triangle_area = min( + [triangle_area(p1, p2, p3) for p1, p2, p3 in itertools.combinations(points, 3)] + ) + min_area_normalized = min_triangle_area / triangle_area(a, b, c) + + return { + "min_area_normalized": float(min_area_normalized), + "combined_score": float(min_area_normalized / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/heilbronn_triangle/initial_program.py b/examples/alphaevolve_math_problems/heilbronn_triangle/initial_program.py new file mode 100644 index 00000000..05c6e577 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_triangle/initial_program.py @@ -0,0 +1,18 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def heilbronn_triangle11() -> np.ndarray: + """ + Construct an arrangement of n points on or inside a convex region in order to maximize the area of the + smallest triangle formed by these points. Here n = 11. + + Returns: + points: np.ndarray of shape (11,2) with the x,y coordinates of the points. 
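+
+    The evaluator requires every point to lie (within a 1e-6 tolerance) inside the
+    equilateral triangle with vertices (0, 0), (1, 0) and (0.5, sqrt(3)/2), and it
+    normalizes the smallest triangle area by the area of that triangle.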
+ """ + n = 11 + points = np.zeros((n, 2)) + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/heilbronn_triangle/requirements.txt b/examples/alphaevolve_math_problems/heilbronn_triangle/requirements.txt new file mode 100644 index 00000000..296d6545 --- /dev/null +++ b/examples/alphaevolve_math_problems/heilbronn_triangle/requirements.txt @@ -0,0 +1 @@ +numpy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/hexagon_packing/11/config.yaml b/examples/alphaevolve_math_problems/hexagon_packing/11/config.yaml new file mode 100644 index 00000000..a9d81d76 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/11/config.yaml @@ -0,0 +1,53 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 6 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist focusing on hexagon packing problems. + Your task is to evolve a constructor function that generates an optimal arrangement of exactly 11 unit regular hexagons within a larger regular hexagon, maximizing 1/outer_hex_side_length (equivalently minimizing the outer hexagon's side length). + + PROBLEM CONTEXT: + - Target: Beat the current state-of-the-art benchmark of 1/outer_hex_side_length = 1/3.930092 ≈ 0.2544 + - Constraint: All 11 inner hexagons must be unit regular hexagons (side length = 1) that are fully contained within the outer hexagon with no overlaps + - Mathematical formulation: For hexagon i at position (xi, yi) with rotation θi: + * Non-overlap: All pairs of inner hexagons must be disjoint + * Containment: All vertices of inner hexagons must lie within the outer hexagon + * Objective: maximize 1/R where R is the outer hexagon side length + + PERFORMANCE METRICS: + 1. **inv_outer_hex_side_length**: 1/outer_hex_side_length (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: inverse_side_length / 0.2544 (progress toward beating SOTA) + 3. **eval_time**: Execution time for full evaluation + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/hexagon_packing/11/evaluator.py b/examples/alphaevolve_math_problems/hexagon_packing/11/evaluator.py new file mode 100644 index 00000000..7967ca3e --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/11/evaluator.py @@ -0,0 +1,234 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for problem of packing unit regular hexagons inside +# a regular hexagon, with 11 unit hexagons. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. 
+# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time + +import numpy as np +import math + +N_HEX = 11 +BENCHMARK = 1 / 3.930092 + + +def hexagon_vertices( + center_x: float, + center_y: float, + side_length: float, + angle_degrees: float, +) -> list[tuple[float, float]]: + """Generates the vertices of a regular hexagon. + Args: + center_x: x-coordinate of the center. + center_y: y-coordinate of the center. + side_length: Length of each side. + angle_degrees: Rotation angle in degrees (clockwise from horizontal). + Returns: + A list of tuples, where each tuple (x, y) represents the vertex location. + """ + vertices = [] + angle_radians = math.radians(angle_degrees) + for i in range(6): + angle = angle_radians + 2 * math.pi * i / 6 + x = center_x + side_length * math.cos(angle) + y = center_y + side_length * math.sin(angle) + vertices.append((x, y)) + return vertices + + +def normalize_vector(v: tuple[float, float]) -> tuple[float, float]: + """Normalizes a 2D vector.""" + magnitude = math.sqrt(v[0] ** 2 + v[1] ** 2) + return (v[0] / magnitude, v[1] / magnitude) if magnitude != 0 else (0.0, 0.0) + + +def get_normals(vertices: list[tuple[float, float]]) -> list[tuple[float, float]]: + """Gets the outward normals of a polygon's edges.""" + normals = [] + for i in range(len(vertices)): + p1 = vertices[i] + p2 = vertices[(i + 1) % len(vertices)] # Wrap around to the first vertex. + edge = (p2[0] - p1[0], p2[1] - p1[1]) + normal = normalize_vector((-edge[1], edge[0])) # Rotate edge by 90 degrees. + normals.append(normal) + return normals + + +def project_polygon( + vertices: list[tuple[float, float]], + axis: tuple[float, float], +) -> tuple[float, float]: + """Projects a polygon onto an axis and returns the min/max values.""" + min_proj = float("inf") + max_proj = float("-inf") + for vertex in vertices: + projection = vertex[0] * axis[0] + vertex[1] * axis[1] # Dot product. + min_proj = min(min_proj, projection) + max_proj = max(max_proj, projection) + return min_proj, max_proj + + +def overlap_1d(min1: float, max1: float, min2: float, max2: float, tol: float = 1e-6) -> bool: + """Determines whether two 1D intervals overlap, allowing for numerical tolerance.""" + return max1 >= min2 - tol and max2 >= min1 - tol + + +def polygons_intersect( + vertices1: list[tuple[float, float]], + vertices2: list[tuple[float, float]], + tol: float = 1e-6, +) -> bool: + """Determines if two polygons intersect using the Separating Axis Theorem.""" + normals1 = get_normals(vertices1) + normals2 = get_normals(vertices2) + axes = normals1 + normals2 + for axis in axes: + min1, max1 = project_polygon(vertices1, axis) + min2, max2 = project_polygon(vertices2, axis) + if not overlap_1d(min1, max1, min2, max2, tol): + return False # Separating axis found, polygons are disjoint. + return True # No separating axis found, polygons intersect. 
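+
+
+# Illustrative usage (a sketch only, not called by the evaluator): with the helpers
+# above, two unit hexagons whose centers are 3 units apart admit a separating axis,
+# while a hexagon trivially intersects itself:
+#
+#   hex_a = hexagon_vertices(0.0, 0.0, 1.0, 0.0)
+#   hex_b = hexagon_vertices(3.0, 0.0, 1.0, 0.0)
+#   polygons_intersect(hex_a, hex_b)  # -> False (disjoint)
+#   polygons_intersect(hex_a, hex_a)  # -> True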
+ + +def hexagons_are_disjoint( + hex1_params: tuple[float, float, float, float], + hex2_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Determines if two hexagons are disjoint given their parameters.""" + hex1_vertices = hexagon_vertices(*hex1_params) + hex2_vertices = hexagon_vertices(*hex2_params) + return not polygons_intersect(hex1_vertices, hex2_vertices, tol) + + +def is_inside_hexagon( + point: tuple[float, float], + hex_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Checks if a point is inside a hexagon (given its parameters).""" + hex_vertices = hexagon_vertices(*hex_params) + for i in range(len(hex_vertices)): + p1 = hex_vertices[i] + p2 = hex_vertices[(i + 1) % len(hex_vertices)] + edge_vector = (p2[0] - p1[0], p2[1] - p1[1]) + point_vector = (point[0] - p1[0], point[1] - p1[1]) + cross_product = edge_vector[0] * point_vector[1] - edge_vector[1] * point_vector[0] + if cross_product < -tol: # Allow small numerical errors + return False + return True + + +def all_hexagons_contained( + inner_hex_params_list: list[tuple[float, float, float, float]], + outer_hex_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Checks if all inner hexagons are contained within the outer hexagon.""" + for inner_hex_params in inner_hex_params_list: + inner_hex_vertices = hexagon_vertices(*inner_hex_params) + for vertex in inner_hex_vertices: + if not is_inside_hexagon(vertex, outer_hex_params, tol): + return False + return True + + +def verify_construction( + inner_hex_data: tuple[float, float, float], + outer_hex_center: tuple[float, float], + outer_hex_side_length: float, + outer_hex_angle_degrees: float, + tol: float = 1e-6, +): + """Verifies the hexagon packing construction with a rotated outer hexagon. + Args: + inner_hex_data: List of (x, y, angle_degrees) tuples for inner hexagons. + outer_hex_center: (x, y) tuple for the outer hexagon center. + outer_hex_side_length: Side length of the outer hexagon. + outer_hex_angle_degrees: Rotation angle of the outer hexagon in degrees. + tol: Numerical tolerance for geometric checks (default: 1e-6). + Raises: + AssertionError if the construction is not valid. + """ + inner_hex_params_list = [ + (x, y, 1, angle) for x, y, angle in inner_hex_data + ] # Sets the side length to 1. + outer_hex_params = ( + outer_hex_center[0], + outer_hex_center[1], + outer_hex_side_length, + outer_hex_angle_degrees, + ) + # Disjointness check. + for i in range(len(inner_hex_params_list)): + for j in range(i + 1, len(inner_hex_params_list)): + if not hexagons_are_disjoint(inner_hex_params_list[i], inner_hex_params_list[j], tol): + raise AssertionError(f"Hexagons {i+1} and {j+1} intersect!") + # Containment check. 
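+    # (Containment is checked vertex-by-vertex below: since both the inner and the
+    # outer hexagons are convex, an inner hexagon lies inside the outer one iff all
+    # six of its vertices do.)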
+ if not all_hexagons_contained(inner_hex_params_list, outer_hex_params, tol): + raise AssertionError("Not all inner hexagons are contained in the outer hexagon!") + print("Construction is valid.") + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + inner_hex_data, outer_hex_data, outer_hex_side_length = program.hexagon_packing_11() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(inner_hex_data, np.ndarray): + inner_hex_data = np.array(inner_hex_data) + if not isinstance(outer_hex_data, np.ndarray): + outer_hex_data = np.array(outer_hex_data) + + if inner_hex_data.shape != (N_HEX, 3): + raise ValueError( + f"Invalid shapes: inner_hex_data = {inner_hex_data.shape}, expected {(N_HEX,3)}" + ) + + if outer_hex_data.shape != (3,): + raise ValueError( + f"Invalid shapes: outer_hex_data = {outer_hex_data.shape}, expected {(3,)}" + ) + + outer_hex_center = outer_hex_data[:2] + outer_hex_angle_degrees = outer_hex_data[-1] + verify_construction( + inner_hex_data, outer_hex_center, outer_hex_side_length, outer_hex_angle_degrees + ) + + inv_outer_hex_side_length = float(1 / outer_hex_side_length) + + return { + "inv_outer_hex_side_length": inv_outer_hex_side_length, + "combined_score": float(inv_outer_hex_side_length / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/hexagon_packing/11/initial_program.py b/examples/alphaevolve_math_problems/hexagon_packing/11/initial_program.py new file mode 100644 index 00000000..c8ff2828 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/11/initial_program.py @@ -0,0 +1,37 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def hexagon_packing_11(): + """ + Constructs a packing of 11 disjoint unit regular hexagons inside a larger regular hexagon, maximizing 1/outer_hex_side_length. + Returns + inner_hex_data: np.ndarray of shape (11,3), where each row is of the form (x, y, angle_degrees) containing the (x,y) coordinates and angle_degree of the respective inner hexagon. + outer_hex_data: np.ndarray of shape (3,) of form (x,y,angle_degree) containing the (x,y) coordinates and angle_degree of the outer hexagon. + outer_hex_side_length: float representing the side length of the outer hexagon. 
+ """ + n = 11 + # Simple grid arrangement of inner hexagons + inner_hex_data = np.array( + [ + [0, 0, 0], # center + [-2.5, 0, 0], # left + [2.5, 0, 0], # right + [-1.25, 2.17, 0], # top-left + [1.25, 2.17, 0], # top-right + [-1.25, -2.17, 0], # bottom-left + [1.25, -2.17, 0], # bottom-right + [-3.75, 2.17, 0], # far top-left + [3.75, 2.17, 0], # far top-right + [-3.75, -2.17, 0], # far bottom-left + [3.75, -2.17, 0], # far bottom-right + ] + ) + + outer_hex_data = np.array([0, 0, 0]) # centered at origin + outer_hex_side_length = 8 # large enough to contain all inner hexagons + + return inner_hex_data, outer_hex_data, outer_hex_side_length + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/hexagon_packing/11/requirements.txt b/examples/alphaevolve_math_problems/hexagon_packing/11/requirements.txt new file mode 100644 index 00000000..296d6545 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/11/requirements.txt @@ -0,0 +1 @@ +numpy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/hexagon_packing/12/config.yaml b/examples/alphaevolve_math_problems/hexagon_packing/12/config.yaml new file mode 100644 index 00000000..188b6ffe --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/12/config.yaml @@ -0,0 +1,53 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist focusing on hexagon packing problems. + Your task is to evolve a constructor function that generates an optimal arrangement of exactly 12 unit regular hexagons within a larger regular hexagon, maximizing 1/outer_hex_side_length (equivalently minimizing the outer hexagon's side length). + + PROBLEM CONTEXT: + - Target: Establish new state-of-the-art for 12-hexagon packing of 1/outer_hex_side_length = 1/3.9419123 ≈ 0.2537 + - Constraint: All 12 inner hexagons must be unit regular hexagons (side length = 1) that are fully contained within the outer hexagon with no overlaps + - Mathematical formulation: For hexagon i at position (xi, yi) with rotation θi: + * Non-overlap: All pairs of inner hexagons must be disjoint + * Containment: All vertices of inner hexagons must lie within the outer hexagon + * Objective: maximize 1/R where R is the outer hexagon side length + + PERFORMANCE METRICS: + 1. **inv_outer_hex_side_length**: 1/outer_hex_side_length (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: inverse_side_length / 0.2537 (progress toward beating SOTA) + 3. 
**eval_time**: Execution time for full evaluation + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/hexagon_packing/12/evaluator.py b/examples/alphaevolve_math_problems/hexagon_packing/12/evaluator.py new file mode 100644 index 00000000..eaf40103 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/12/evaluator.py @@ -0,0 +1,234 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for problem of packing unit regular hexagons inside +# a regular hexagon, with 12 unit hexagons. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time + +import numpy as np +import math + +N_HEX = 12 +BENCHMARK = 1 / 3.9419123 + + +def hexagon_vertices( + center_x: float, + center_y: float, + side_length: float, + angle_degrees: float, +) -> list[tuple[float, float]]: + """Generates the vertices of a regular hexagon. + Args: + center_x: x-coordinate of the center. + center_y: y-coordinate of the center. + side_length: Length of each side. + angle_degrees: Rotation angle in degrees (clockwise from horizontal). + Returns: + A list of tuples, where each tuple (x, y) represents the vertex location. + """ + vertices = [] + angle_radians = math.radians(angle_degrees) + for i in range(6): + angle = angle_radians + 2 * math.pi * i / 6 + x = center_x + side_length * math.cos(angle) + y = center_y + side_length * math.sin(angle) + vertices.append((x, y)) + return vertices + + +def normalize_vector(v: tuple[float, float]) -> tuple[float, float]: + """Normalizes a 2D vector.""" + magnitude = math.sqrt(v[0] ** 2 + v[1] ** 2) + return (v[0] / magnitude, v[1] / magnitude) if magnitude != 0 else (0.0, 0.0) + + +def get_normals(vertices: list[tuple[float, float]]) -> list[tuple[float, float]]: + """Gets the outward normals of a polygon's edges.""" + normals = [] + for i in range(len(vertices)): + p1 = vertices[i] + p2 = vertices[(i + 1) % len(vertices)] # Wrap around to the first vertex. + edge = (p2[0] - p1[0], p2[1] - p1[1]) + normal = normalize_vector((-edge[1], edge[0])) # Rotate edge by 90 degrees. + normals.append(normal) + return normals + + +def project_polygon( + vertices: list[tuple[float, float]], + axis: tuple[float, float], +) -> tuple[float, float]: + """Projects a polygon onto an axis and returns the min/max values.""" + min_proj = float("inf") + max_proj = float("-inf") + for vertex in vertices: + projection = vertex[0] * axis[0] + vertex[1] * axis[1] # Dot product. 
+ min_proj = min(min_proj, projection) + max_proj = max(max_proj, projection) + return min_proj, max_proj + + +def overlap_1d(min1: float, max1: float, min2: float, max2: float, tol: float = 1e-6) -> bool: + """Determines whether two 1D intervals overlap, allowing for numerical tolerance.""" + return max1 >= min2 - tol and max2 >= min1 - tol + + +def polygons_intersect( + vertices1: list[tuple[float, float]], + vertices2: list[tuple[float, float]], + tol: float = 1e-6, +) -> bool: + """Determines if two polygons intersect using the Separating Axis Theorem.""" + normals1 = get_normals(vertices1) + normals2 = get_normals(vertices2) + axes = normals1 + normals2 + for axis in axes: + min1, max1 = project_polygon(vertices1, axis) + min2, max2 = project_polygon(vertices2, axis) + if not overlap_1d(min1, max1, min2, max2, tol): + return False # Separating axis found, polygons are disjoint. + return True # No separating axis found, polygons intersect. + + +def hexagons_are_disjoint( + hex1_params: tuple[float, float, float, float], + hex2_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Determines if two hexagons are disjoint given their parameters.""" + hex1_vertices = hexagon_vertices(*hex1_params) + hex2_vertices = hexagon_vertices(*hex2_params) + return not polygons_intersect(hex1_vertices, hex2_vertices, tol) + + +def is_inside_hexagon( + point: tuple[float, float], + hex_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Checks if a point is inside a hexagon (given its parameters).""" + hex_vertices = hexagon_vertices(*hex_params) + for i in range(len(hex_vertices)): + p1 = hex_vertices[i] + p2 = hex_vertices[(i + 1) % len(hex_vertices)] + edge_vector = (p2[0] - p1[0], p2[1] - p1[1]) + point_vector = (point[0] - p1[0], point[1] - p1[1]) + cross_product = edge_vector[0] * point_vector[1] - edge_vector[1] * point_vector[0] + if cross_product < -tol: # Allow small numerical errors + return False + return True + + +def all_hexagons_contained( + inner_hex_params_list: list[tuple[float, float, float, float]], + outer_hex_params: tuple[float, float, float, float], + tol: float = 1e-6, +) -> bool: + """Checks if all inner hexagons are contained within the outer hexagon.""" + for inner_hex_params in inner_hex_params_list: + inner_hex_vertices = hexagon_vertices(*inner_hex_params) + for vertex in inner_hex_vertices: + if not is_inside_hexagon(vertex, outer_hex_params, tol): + return False + return True + + +def verify_construction( + inner_hex_data: tuple[float, float, float], + outer_hex_center: tuple[float, float], + outer_hex_side_length: float, + outer_hex_angle_degrees: float, + tol: float = 1e-6, +): + """Verifies the hexagon packing construction with a rotated outer hexagon. + Args: + inner_hex_data: List of (x, y, angle_degrees) tuples for inner hexagons. + outer_hex_center: (x, y) tuple for the outer hexagon center. + outer_hex_side_length: Side length of the outer hexagon. + outer_hex_angle_degrees: Rotation angle of the outer hexagon in degrees. + tol: Numerical tolerance for geometric checks (default: 1e-6). + Raises: + AssertionError if the construction is not valid. + """ + inner_hex_params_list = [ + (x, y, 1, angle) for x, y, angle in inner_hex_data + ] # Sets the side length to 1. + outer_hex_params = ( + outer_hex_center[0], + outer_hex_center[1], + outer_hex_side_length, + outer_hex_angle_degrees, + ) + # Disjointness check. 
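+    # (Each unordered pair of inner hexagons is tested below with the Separating Axis
+    # Theorem helpers defined above.)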
+ for i in range(len(inner_hex_params_list)): + for j in range(i + 1, len(inner_hex_params_list)): + if not hexagons_are_disjoint(inner_hex_params_list[i], inner_hex_params_list[j], tol): + raise AssertionError(f"Hexagons {i+1} and {j+1} intersect!") + # Containment check. + if not all_hexagons_contained(inner_hex_params_list, outer_hex_params, tol): + raise AssertionError("Not all inner hexagons are contained in the outer hexagon!") + print("Construction is valid.") + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + inner_hex_data, outer_hex_data, outer_hex_side_length = program.hexagon_packing_12() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(inner_hex_data, np.ndarray): + inner_hex_data = np.array(inner_hex_data) + if not isinstance(outer_hex_data, np.ndarray): + outer_hex_data = np.array(outer_hex_data) + + if inner_hex_data.shape != (N_HEX, 3): + raise ValueError( + f"Invalid shapes: inner_hex_data = {inner_hex_data.shape}, expected {(N_HEX,3)}" + ) + + if outer_hex_data.shape != (3,): + raise ValueError( + f"Invalid shapes: outer_hex_data = {outer_hex_data.shape}, expected {(3,)}" + ) + + outer_hex_center = outer_hex_data[:2] + outer_hex_angle_degrees = outer_hex_data[-1] + verify_construction( + inner_hex_data, outer_hex_center, outer_hex_side_length, outer_hex_angle_degrees + ) + + inv_outer_hex_side_length = float(1 / outer_hex_side_length) + + return { + "inv_outer_hex_side_length": inv_outer_hex_side_length, + "combined_score": float(inv_outer_hex_side_length / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/hexagon_packing/12/initial_program.py b/examples/alphaevolve_math_problems/hexagon_packing/12/initial_program.py new file mode 100644 index 00000000..c5aa3da4 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/12/initial_program.py @@ -0,0 +1,38 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def hexagon_packing_12(): + """ + Constructs a packing of 12 disjoint unit regular hexagons inside a larger regular hexagon, maximizing 1/outer_hex_side_length. + Returns + inner_hex_data: np.ndarray of shape (12,3), where each row is of the form (x, y, angle_degrees) containing the (x,y) coordinates and angle_degree of the respective inner hexagon. + outer_hex_data: np.ndarray of shape (3,) of form (x,y,angle_degree) containing the (x,y) coordinates and angle_degree of the outer hexagon. + outer_hex_side_length: float representing the side length of the outer hexagon. 
+ """ + n = 12 + # Simple grid arrangement of inner hexagons + inner_hex_data = np.array( + [ + [0, 0, 0], # center + [-2.5, 0, 0], # left + [2.5, 0, 0], # right + [-1.25, 2.17, 0], # top-left + [1.25, 2.17, 0], # top-right + [-1.25, -2.17, 0], # bottom-left + [1.25, -2.17, 0], # bottom-right + [-3.75, 2.17, 0], # far top-left + [3.75, 2.17, 0], # far top-right + [-3.75, -2.17, 0], # far bottom-left + [3.75, -2.17, 0], # far bottom-right, + [0, -4, 0], # far bottom-center + ] + ) + + outer_hex_data = np.array([0, 0, 0]) # centered at origin + outer_hex_side_length = 8 # large enough to contain all inner hexagons + + return inner_hex_data, outer_hex_data, outer_hex_side_length + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/hexagon_packing/12/requirements.txt b/examples/alphaevolve_math_problems/hexagon_packing/12/requirements.txt new file mode 100644 index 00000000..296d6545 --- /dev/null +++ b/examples/alphaevolve_math_problems/hexagon_packing/12/requirements.txt @@ -0,0 +1 @@ +numpy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/kissing_number/config.yaml b/examples/alphaevolve_math_problems/kissing_number/config.yaml new file mode 100644 index 00000000..11604b8d --- /dev/null +++ b/examples/alphaevolve_math_problems/kissing_number/config.yaml @@ -0,0 +1,58 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in high-dimensional geometry, lattice theory, and combinatorial optimization, specializing in sphere packing and coding theory problems. + Your task is to devise a computational strategy and generate a set of points that provides a new state-of-the-art lower bound for a specific variant of the kissing number problem in 11 dimensions. + + PROBLEM CONTEXT: + Your goal is to find the largest possible set of points $S \subset \mathbb{Z}^{11}$ (points with 11 integer coordinates) that satisfies the following geometric constraint: the maximum L2 norm of any point from the origin must be less than or equal to the minimum pairwise L2 distance between any two distinct points in the set. + + - **Target**: Beat the AlphaEvolve benchmark of **num_points = 593**. + - **Constraint**: For the set of points $S = \{p_1, p_2, ..., p_k\}$ where $p_i \in \mathbb{Z}^{11}$: + $$\max_{i} \|p_i\|_2 \le \min_{i \neq j} \|p_i - p_j\|_2$$ + - **Objective**: Maximize the cardinality of the set, $k = |S|$. + + PERFORMANCE METRICS: + 1. **num_points**: The number of points in the final set $S$. **This is the primary objective to maximize.** + 2. **combined_score**: Your `num_points` / 593. The goal is to achieve a ratio > 1.0. + 3. **eval_time**: The total wall-clock time in seconds to generate the solution. + + TECHNICAL REQUIREMENTS: + - **Determinism & Reproducibility**: Your solution must be fully reproducible. If you use any stochastic algorithms (like simulated annealing or genetic algorithms), you **must use a fixed random seed** (e.g., `numpy.random.seed(42)`). 
+ - **Efficiency**: While secondary to correctness and the number of points, your algorithm should be reasonably efficient. Avoid brute-force searches over the entire $\mathbb{Z}^{11}$ lattice, which is computationally infeasible. + + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/kissing_number/evaluator.py b/examples/alphaevolve_math_problems/kissing_number/evaluator.py new file mode 100644 index 00000000..145fa2de --- /dev/null +++ b/examples/alphaevolve_math_problems/kissing_number/evaluator.py @@ -0,0 +1,95 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the kissing number problem on dimension 11. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import itertools +import numpy as np + +DIM = 11 +TOL = 1e-6 +BENCHMARK = 593 + + +def compute_squared_norm(point: list[int]) -> int: + """Returns the squared norm of an integer vector using exact computation.""" + return sum(pow(int(x), 2) for x in point) + + +def verify_sphere_packing(sphere_centers: np.ndarray, tol: float = 1e-6): + """Checks that after normalizing, the points correspond to a valid sphere packing for kissing numbers. + + Args: + sphere_centers: the list of sphere centers, of shape [num_spheres, dimension]. + + Raises: + AssertionError: if the sphere packing is not a valid kissing configuration. + """ + # Rounding to integers to guarantee exact computation throughout. + sphere_centers = np.around(sphere_centers).astype(np.int64) + squared_norms = [compute_squared_norm(list(center)) for center in sphere_centers] + + # Checks that the set doesn't contain 0. + min_squared_norm = min(squared_norms) + assert min_squared_norm > tol, f"Verification failed because the set contains 0." + + # Checks that the minimum pairwise distance between centers >= the maximum norm of the centers. + max_squared_norm = max(squared_norms) + min_squared_distance = min( + compute_squared_norm(list(a - b)) for a, b in itertools.combinations(sphere_centers, 2) + ) + assert ( + min_squared_distance >= max_squared_norm + ), f"Verification failed because the minimum squared distance = {min_squared_distance} < {max_squared_norm} = maximum squared norm." 
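+
+
+# Illustrative check (a sketch only, not part of the evaluation path): the trivial
+# two-point configuration returned by the initial program, (1, ..., 1) and
+# (-1, ..., -1), has maximum squared norm 11 and pairwise squared distance
+# 44 >= 11, so it passes verification:
+#
+#   verify_sphere_packing(np.array([np.ones(11), -np.ones(11)]))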
+ + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + points = program.kissing_number11() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape[1] != 11: + raise ValueError( + f"Invalid shapes: points = {points.shape}, expected ({points.shape[1]},11)" + ) + + verify_sphere_packing(points, TOL) + + num_points = len(points) + benchmark_ratio = num_points / BENCHMARK + return { + "num_points": num_points, + "combined_score": float(benchmark_ratio), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/kissing_number/initial_program.py b/examples/alphaevolve_math_problems/kissing_number/initial_program.py new file mode 100644 index 00000000..e623551d --- /dev/null +++ b/examples/alphaevolve_math_problems/kissing_number/initial_program.py @@ -0,0 +1,18 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def kissing_number11() -> np.ndarray: + """ + Constructs a collection of 11-dimensional points with integral coordinates such that their maximum norm is smaller than their minimum pairwise distance, aiming to maximize the number of points. + + Returns: + points: np.ndarray of shape (num_points,11) + """ + d = 11 + points = np.array([np.ones(d), -1 * np.ones(d)]).astype(np.int64) + + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/kissing_number/requirements.txt b/examples/alphaevolve_math_problems/kissing_number/requirements.txt new file mode 100644 index 00000000..296d6545 --- /dev/null +++ b/examples/alphaevolve_math_problems/kissing_number/requirements.txt @@ -0,0 +1 @@ +numpy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/matmul/config.yaml b/examples/alphaevolve_math_problems/matmul/config.yaml new file mode 100644 index 00000000..26bf4388 --- /dev/null +++ b/examples/alphaevolve_math_problems/matmul/config.yaml @@ -0,0 +1,80 @@ +# Evolution settings +max_iterations: 100 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 50 + num_islands: 3 + migration_interval: 10 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 60 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in computational linear algebra, numerical optimization, and AI-driven algorithm discovery. + Your task is to evolve and optimize a Python script to find the lowest-rank decomposition of the matrix multiplication tensor for a specific instance with variables (n=2,m=4,p=5) fixed. + + PROBLEM CONTEXT: + Target: Find the minimal rank R for the tensor decomposition T_ijk = ∑_r=1^R U_ir V_jr W_kr. 
+    The goal is to beat the best known algorithm by discovering a state-of-the-art decomposition with the lowest possible rank.
+    Constraint: The reconstructed tensor from the learned factors (U, V, W) must be EXACTLY EQUAL to the ground-truth matrix multiplication tensor T_ijk once the factors are recomposed into tensor form.
+    This can be enforced in two ways:
+      * by minimizing the loss function to near zero and then applying a rounding procedure so that the continuous approximate solution converges to an exact one whose elements are integer multiples of a constant.
+      * by designing an algorithm that searches directly over a discrete set (or grid) of candidate element values that can compose the final decomposition.
+    You can be creative and choose the best possible algorithm to solve this problem, for example by incorporating constraints that force the solution into the right space, or by enforcing this from the start.
+
+    MATHEMATICAL FORMULATION:
+    Given: The standard matrix multiplication tensor T for n, m, p fixed.
+    Objective: Find the smallest integer R such that there exist real- or complex-valued matrices U, V, W of shapes (n*m, R), (m*p, R), and (n*p, R) that compose T_ijk.
+
+    PERFORMANCE METRICS:
+    combined_score: The ratio 32/R for the lowest rank R at which the optimization succeeds, where 32 is the rank of the best known decomposition, found by Google. A value of 1.0 means you have matched the state-of-the-art. (PRIMARY OBJECTIVE - maximize.)
+    loss: The final loss value, if applicable to the method used.
+    rank: The rank of the best decomposition found.
+    eval_time: Wall-clock time of the evaluation.
+
+    VALIDATION FRAMEWORK:
+    Numerical Validation: The final loss for a successful run must be below the success_threshold (e.g., 1e-6).
+    Equality Validation: The final decomposition must be exactly equal to the tensor T_ijk:
+      matmul_tensor = np.zeros((n * m, m * p, p * n), dtype=np.int32)
+      for i in range(n):
+          for j in range(m):
+              for k in range(p):
+                  matmul_tensor[i * m + j][j * p + k][k * n + i] = 1
+    Rank Validation: The discovered rank must be an integer.
+
+    TECHNICAL REQUIREMENTS:
+    Reproducibility: Ensure the JAX PRNGKey (or the random state of whichever library you use) is handled correctly to get reproducible results for a given set of initial conditions and hyperparameters.
+    Numerical Stability: Be aware of potential floating-point precision issues and the possibility of exploding or vanishing gradients, applying remedies such as gradient clipping if necessary.
+
+    PROBLEM-SPECIFIC CONSIDERATIONS:
+    Initialization is Key: Due to the non-convex landscape, the success of a run is highly dependent on the random initialization. A robust solution should work from multiple different random seeds.
+    Steps vs. Learning Rate Trade-off: A lower learning rate might require more num_steps to converge, and vice versa. Explore this relationship to find the most efficient path to a solution.
+    From Discovery to Algorithm: The end goal is not just the factors U, V, W, but the algorithm they represent. A good solution should be interpretable as a series of R multiplications and additions/subtractions.
+    Robustness and Efficiency: The proposed code and hyperparameter configuration should converge reliably and quickly.
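+
+    For example, a possible post-hoc exactness check (a sketch only; it assumes the factors U, V, W
+    are NumPy arrays restricted to the half-integer grid used by the initial program, and that
+    matmul_tensor is built as in the Validation Framework above) is:
+      U, V, W = [np.round(2 * A) / 2 for A in (U, V, W)]    # snap every entry to a multiple of 1/2
+      reconstructed = np.einsum("ir,jr,kr->ijk", U, V, W)   # recompose the tensor from the factors
+      assert np.array_equal(reconstructed, matmul_tensor)   # exact equality, as the evaluator requires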
+ + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/matmul/evaluator.py b/examples/alphaevolve_math_problems/matmul/evaluator.py new file mode 100644 index 00000000..e7144d3a --- /dev/null +++ b/examples/alphaevolve_math_problems/matmul/evaluator.py @@ -0,0 +1,107 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the matrix multiplication problem with tensor size +# of <2,4,5> +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +BENCHMARK = 32 + + +def verify_tensor_decomposition( + decomposition: tuple[np.ndarray, np.ndarray, np.ndarray], n: int, m: int, p: int, rank: int +): + """Verifies the correctness of the tensor decomposition.""" + + # Add robustness for cases where the optimizer might fail + if not all(isinstance(arr, np.ndarray) for arr in decomposition) or not decomposition: + raise ValueError("Decomposition must be a tuple of NumPy arrays.") + if any(arr.size == 0 for arr in decomposition): + print("Warning: One or more decomposition arrays are empty. Verification skipped.") + return + + # Check that each factor matrix has the correct shape. + factor_matrix_1, factor_matrix_2, factor_matrix_3 = decomposition + if factor_matrix_1.shape != (n * m, rank): + raise ValueError( + f"Expected shape of factor matrix 1 is {(n * m, rank)}. Actual shape is {factor_matrix_1.shape}." + ) + if factor_matrix_2.shape != (m * p, rank): + raise ValueError( + f"Expected shape of factor matrix 2 is {(m * p, rank)}. Actual shape is {factor_matrix_2.shape}." + ) + if factor_matrix_3.shape != (n * p, rank): + raise ValueError( + f"Expected shape of factor matrix 3 is {(n * p, rank)}. Actual shape is {factor_matrix_3.shape}." + ) + + # Form the matrix multiplication tensor . + matmul_tensor = np.zeros((n * m, m * p, n * p), dtype=np.float32) + for i in range(n): + for j in range(m): + for k in range(p): + # Use the standard k*n+i indexing for the third dimension + matmul_tensor[i * m + j, j * p + k, k * n + i] = 1 + + # Check that the tensor is correctly constructed. + constructed_tensor = np.einsum("ir,jr,kr -> ijk", *decomposition) + + # Exact check + if not np.array_equal(constructed_tensor, matmul_tensor): + # If the exact check fails, report the floating-point difference for diagnostics. + diff = np.max(np.abs(constructed_tensor - matmul_tensor)) + raise ValueError( + f"Tensor constructed by decomposition does not exactly match the target tensor. Maximum difference is {diff:.6e}." 
+ ) + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + decomposition, n, m, p, loss, rank = program.run() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + verify_tensor_decomposition(decomposition, n, m, p, rank) + + success_threshold = 1e-6 + if loss > success_threshold: + print( + f"\nWarning: Final loss {loss:.2e} is above the success threshold of {success_threshold:.2e}." + ) + + inverse_rank = BENCHMARK / rank + + return { + "combined_score": inverse_rank, + "loss": loss, + "rank": rank, + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/matmul/initial_program.py b/examples/alphaevolve_math_problems/matmul/initial_program.py new file mode 100644 index 00000000..c9c30df0 --- /dev/null +++ b/examples/alphaevolve_math_problems/matmul/initial_program.py @@ -0,0 +1,199 @@ +# Disable progress bar for cleaner output logs +import os + +os.environ["TQDM_DISABLE"] = "1" + +# Fixed parameters +n, m, p = 2, 4, 5 + +# EVOLVE-BLOCK-START +import numpy as np +import jax +import jax.numpy as jnp +import optax +from dataclasses import dataclass +import tqdm + + +# --- Straight-Through Estimator for Rounding --- +@jax.custom_vjp +def round_to_half_ste(x): + """Forward pass: snaps values to the nearest half-integer.""" + return jnp.round(x * 2) / 2 + + +def round_ste_fwd(x): + """Standard forward pass and identity for backward pass.""" + return round_to_half_ste(x), None + + +def round_ste_bwd(res, g): + """Backward pass: Identity function, passes gradient straight through.""" + return (g,) + + +round_to_half_ste.defvjp(round_ste_fwd, round_ste_bwd) +# --- End of STE definition --- + + +# --- Loss Functions --- +def weighted_l2_loss(reconstructed: jnp.ndarray, target: jnp.ndarray) -> jnp.ndarray: + error = reconstructed - target + weights = jnp.where(target != 0, 100.0, 1.0) + return jnp.mean(weights * (error**2)) + + +def l2_loss_real(x: jnp.ndarray, y: jnp.ndarray) -> jnp.ndarray: + return jnp.mean((x - y) ** 2) + + +# --- Hyperparameters --- +@dataclass +class Hyperparameters: + rank: int = 55 + # Phase 1: Continuous Search + num_restarts: int = 10 + phase1_steps: int = 80000 + phase1_lr: float = 0.01 + init_scale: float = 0.1 + l1_strength: float = 1e-6 + clamp_range: float = 4.0 + # Phase 2: Discrete Fine-tuning + phase2_steps: int = 20000 + phase2_lr: float = 1e-4 # A much smaller learning rate for fine-tuning + + +# --- Optimizer Classes --- +class ContinuousOptimizer: + """Finds a high-quality approximate continuous solution.""" + + def __init__(self, target_tensor: jnp.ndarray, hypers: Hyperparameters): + self.target_tensor = target_tensor + self.hypers = hypers + self.opt = optax.adam(hypers.phase1_lr) + + def _get_constrained_decomposition(self, latent_decomposition: tuple) -> tuple: + """Applies a scaled tanh to map latent parameters to the desired range.""" + return jax.tree_util.tree_map( + lambda x: self.hypers.clamp_range * jnp.tanh(x), latent_decomposition + ) + + def _loss_fn(self, latent_decomposition: tuple) -> jnp.ndarray: + constrained = 
self._get_constrained_decomposition(latent_decomposition) + reconstructed = jnp.einsum("ir,jr,kr->ijk", *constrained) + recon_loss = weighted_l2_loss(reconstructed, self.target_tensor) + l1_penalty = sum(jnp.mean(jnp.abs(arr)) for arr in constrained) + return recon_loss + self.hypers.l1_strength * l1_penalty + + +class DiscreteOptimizer: + """Refines a continuous solution into an exact discrete one using an STE.""" + + def __init__(self, target_tensor: jnp.ndarray, hypers: Hyperparameters): + self.target_tensor = target_tensor + self.hypers = hypers + self.opt = optax.adam(hypers.phase2_lr) + + def _loss_fn(self, continuous_decomposition: tuple) -> jnp.ndarray: + # Snap the continuous parameters to the discrete grid + discrete_decomposition = jax.tree_util.tree_map(round_to_half_ste, continuous_decomposition) + # Compute the loss using only these exact half-integer values + reconstructed = jnp.einsum("ir,jr,kr->ijk", *discrete_decomposition) + return l2_loss_real(reconstructed, self.target_tensor) + + +# --- JIT-compatible Train Step --- +def train_step(params, opt_state, optimizer, loss_fn): + loss, grads = jax.value_and_grad(loss_fn)(params) + updates, opt_state = optimizer.update(grads, opt_state, params) + params = optax.apply_updates(params, updates) + return params, opt_state, loss + + +def get_matrix_multiplication_tensor(n, m, p): + T = jnp.zeros((n * m, m * p, n * p)) + for i, j, k in np.ndindex(n, m, p): + T = T.at[i * m + j, j * p + k, k * n + i].set(1) + return T + + +def run(): + hypers = Hyperparameters() + target_tensor = get_matrix_multiplication_tensor(n, m, p) + main_key = jax.random.PRNGKey(42) + + # --- PHASE 1: CONTINUOUS EXPLORATION --- + print(f"\n{'='*20} PHASE 1: Continuous Exploration {'='*20}") + best_loss_phase1 = float("inf") + best_latent_decomp = None + + continuous_optimizer = ContinuousOptimizer(target_tensor, hypers) + + # JIT the train_step for the continuous phase + jit_train_step_continuous = jax.jit(train_step, static_argnums=(2, 3)) + + for i in range(hypers.num_restarts): + print(f"\n--- Restart {i+1}/{hypers.num_restarts} ---") + main_key, restart_key = jax.random.split(main_key) + init_fn = jax.nn.initializers.normal(stddev=hypers.init_scale) + latent_decomp = ( + init_fn(restart_key, (n * m, hypers.rank)), + init_fn(restart_key, (m * p, hypers.rank)), + init_fn(restart_key, (n * p, hypers.rank)), + ) + opt_state = continuous_optimizer.opt.init(latent_decomp) + + for _ in tqdm.tqdm(range(hypers.phase1_steps), desc="Continuous Search"): + latent_decomp, opt_state, loss = jit_train_step_continuous( + latent_decomp, + opt_state, + continuous_optimizer.opt, + continuous_optimizer._loss_fn, + ) + + final_loss = l2_loss_real( + target_tensor, + jnp.einsum( + "ir,jr,kr->ijk", + *continuous_optimizer._get_constrained_decomposition(latent_decomp), + ), + ) + print(f"End of Trial | Final continuous loss: {final_loss:.8f}") + + if final_loss < best_loss_phase1: + best_loss_phase1 = final_loss + best_latent_decomp = latent_decomp + + # --- PHASE 2: DISCRETE FINE-TUNING --- + print(f"\n{'='*20} PHASE 2: Discrete Fine-tuning (STE) {'='*20}") + print(f"Starting with best continuous solution (loss: {best_loss_phase1:.8f})") + + continuous_params = continuous_optimizer._get_constrained_decomposition(best_latent_decomp) + + discrete_optimizer = DiscreteOptimizer(target_tensor, hypers) + opt_state = discrete_optimizer.opt.init(continuous_params) + + # JIT the train_step for the discrete phase + jit_train_step_discrete = jax.jit(train_step, static_argnums=(2, 3)) + + 
for step in tqdm.tqdm(range(hypers.phase2_steps), desc="Discrete Fine-tuning"): + continuous_params, opt_state, loss = jit_train_step_discrete( + continuous_params, opt_state, discrete_optimizer.opt, discrete_optimizer._loss_fn + ) + if (step + 1) % 2000 == 0: + print(f"Step {step+1} | Discrete Loss: {loss:.8f}") + if loss < 1e-7: + print("\nFound a perfect solution!") + break + + final_discrete_decomposition = jax.tree_util.tree_map(round_to_half_ste, continuous_params) + final_loss = l2_loss_real( + target_tensor, jnp.einsum("ir,jr,kr->ijk", *final_discrete_decomposition) + ) + print(f"Search complete. Final discrete loss: {final_loss:.8f}") + + final_decomposition_np = jax.tree_util.tree_map(np.array, final_discrete_decomposition) + return final_decomposition_np, n, m, p, float(final_loss), hypers.rank + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/matmul/requirements.txt b/examples/alphaevolve_math_problems/matmul/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/matmul/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/config.yaml b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/config.yaml new file mode 100644 index 00000000..a58741a3 --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/config.yaml @@ -0,0 +1,57 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist focusing on point dispersion problems. + Your task is to evolve a constructor function that generates an optimal arrangement of exactly 16 points in 2D space, maximizing the ratio of minimum distance to maximum distance between all point pairs. + + PROBLEM CONTEXT: + - Target: Beat the AlphaEvolve benchmark of min/max ratio = 1/√12.889266112 ≈ 0.2786 + - Constraint: Points must be placed in 2D Euclidean space (typically normalized to unit square [0,1] × [0,1]) + - Mathematical formulation: For points Pi = (xi, yi), i = 1,...,16: + * Distance matrix: dij = √[(xi-xj)² + (yi-yj)²] for all i≠j + * Minimum distance: dmin = min{dij : i≠j} + * Maximum distance: dmax = max{dij : i≠j} + * Objective: maximize dmin/dmax subject to spatial constraints + + PERFORMANCE METRICS: + 1. **min_max_ratio**: dmin/dmax ratio (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: min_max_ratio / 0.2786 (progress toward beating AlphaEvolve benchmark) + 3. **eval_time**: Execution time in seconds (balance accuracy vs. 
efficiency) + + TECHNICAL REQUIREMENTS: + - **Reproducibility**: Fixed random seeds for all stochastic components + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/evaluator.py b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/evaluator.py new file mode 100644 index 00000000..0a81387d --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/evaluator.py @@ -0,0 +1,65 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for problem of minimizing the ratio of maximum +# to minimum distance on dimension 2 and with 16 points. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import scipy as sp +import time +import numpy as np + +NUM_POINTS = 16 +DIMENSION = 2 +BENCHMARK = 1 / 12.889266112 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + points = program.min_max_dist_dim2_16() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape != (NUM_POINTS, DIMENSION): + raise ValueError( + f"Invalid shapes: points = {points.shape}, expected {(NUM_POINTS,DIMENSION)}" + ) + + pairwise_distances = sp.spatial.distance.pdist(points) + min_distance = np.min(pairwise_distances) + max_distance = np.max(pairwise_distances) + + inv_ratio_squared = (min_distance / max_distance) ** 2 if max_distance > 0 else 0 + return { + "min_max_ratio": float(inv_ratio_squared), + "combined_score": float(inv_ratio_squared / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/initial_program.py b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/initial_program.py new file mode 100644 index 00000000..9348ce43 --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/initial_program.py @@ -0,0 +1,24 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def min_max_dist_dim2_16() -> np.ndarray: + """ + Creates 16 points in 2 dimensions in order to maximize the ratio of minimum to maximum distance. + + Returns + points: np.ndarray of shape (16,2) containing the (x,y) coordinates of the 16 points. 
+ + """ + + n = 16 + d = 2 + + # places points randomly + np.random.seed(42) + points = np.random.randn(n, d) + + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/requirements.txt b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/requirements.txt new file mode 100644 index 00000000..5576e19f --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/2/requirements.txt @@ -0,0 +1,2 @@ +numpy +scipy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/config.yaml b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/config.yaml new file mode 100644 index 00000000..eb2cf2f1 --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/config.yaml @@ -0,0 +1,57 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert computational geometer and optimization specialist focusing on 3D point dispersion problems. + Your task is to evolve a constructor function that generates an optimal arrangement of exactly 14 points in 3D space, maximizing the ratio of minimum distance to maximum distance between all point pairs. + + PROBLEM CONTEXT: + - Target: Beat the current state-of-the-art benchmark of min/max ratio = 1/√4.165849767 ≈ 0.4898 + - Constraint: Points must be placed in 3D Euclidean space (typically normalized to unit cube [0,1]³ or unit sphere) + - Mathematical formulation: For points Pi = (xi, yi, zi), i = 1,...,14: + * Distance matrix: dij = √[(xi-xj)² + (yi-yj)² + (zi-zj)²] for all i≠j + * Minimum distance: dmin = min{dij : i≠j} + * Maximum distance: dmax = max{dij : i≠j} + * Objective: maximize dmin/dmax subject to spatial constraints + + PERFORMANCE METRICS: + 1. **min_max_ratio**: dmin/dmax ratio (PRIMARY OBJECTIVE - maximize) + 2. **combined_score**: min_max_ratio / 0.4898 (progress toward beating AlphaEvolve benchmark) + 3. **eval_time**: Execution time in seconds (balance accuracy vs. efficiency) + + TECHNICAL REQUIREMENTS: + - **Reproducibility**: Fixed random seeds for all stochastic components + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/evaluator.py b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/evaluator.py new file mode 100644 index 00000000..ce70df70 --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/evaluator.py @@ -0,0 +1,65 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for problem of minimizing the ratio of maximum +# to minimum distance on dimension 3 and with 14 points. 
+# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import scipy as sp +import time +import numpy as np + +NUM_POINTS = 14 +DIMENSION = 3 +BENCHMARK = 1 / 4.165849767 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + points = program.min_max_dist_dim3_14() + end_time = time.time() + eval_time = end_time - start_time + except Exception as err: + raise err + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + if not isinstance(points, np.ndarray): + points = np.array(points) + + if points.shape != (NUM_POINTS, DIMENSION): + raise ValueError( + f"Invalid shapes: points = {points.shape}, expected {(NUM_POINTS,DIMENSION)}" + ) + + pairwise_distances = sp.spatial.distance.pdist(points) + min_distance = np.min(pairwise_distances) + max_distance = np.max(pairwise_distances) + + inv_ratio_squared = (min_distance / max_distance) ** 2 if max_distance > 0 else 0 + return { + "min_max_ratio": float(inv_ratio_squared), + "combined_score": float(inv_ratio_squared / BENCHMARK), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/initial_program.py b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/initial_program.py new file mode 100644 index 00000000..d58a3179 --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/initial_program.py @@ -0,0 +1,24 @@ +# EVOLVE-BLOCK-START +import numpy as np + + +def min_max_dist_dim3_14() -> np.ndarray: + """ + Creates 14 points in 3 dimensions in order to maximize the ratio of minimum to maximum distance. + + Returns + points: np.ndarray of shape (14,3) containing the (x,y) coordinates of the 14 points. 
+ + """ + + n = 14 + d = 3 + + # places points randomly + np.random.seed(42) + points = np.random.randn(n, d) + + return points + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/requirements.txt b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/requirements.txt new file mode 100644 index 00000000..5576e19f --- /dev/null +++ b/examples/alphaevolve_math_problems/minimizing_max_min_dist/3/requirements.txt @@ -0,0 +1,2 @@ +numpy +scipy \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/second_autocorr_ineq/config.yaml b/examples/alphaevolve_math_problems/second_autocorr_ineq/config.yaml new file mode 100644 index 00000000..51867e5f --- /dev/null +++ b/examples/alphaevolve_math_problems/second_autocorr_ineq/config.yaml @@ -0,0 +1,137 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are a world-class expert in functional analysis, harmonic analysis, numerical optimization, and AI-driven mathematical discovery. Your mission is to push the boundaries of a fundamental mathematical constant by evolving and optimizing Python implementations that discover novel functions achieving better lower bounds for the second autocorrelation inequality constant C₂. 
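+
+    For example (a sketch only; the evaluator's exact discretization may differ, and h, g, c2 are
+    illustrative names), the C₂ ratio of a candidate nonnegative step function sampled as f_values
+    on a uniform grid of n_points cells of width h can be estimated with a discrete autoconvolution,
+    using the norms defined below:
+      g = np.convolve(f_values, f_values) * h                            # discrete f ★ f on a grid of spacing h
+      c2 = np.sum(g**2) * h / ((np.sum(f_values) * h) ** 2 * np.max(g))  # ||f★f||₂² / ((∫f)² ||f★f||_∞)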
+ + MATHEMATICAL PROBLEM CONTEXT: + **Core Problem**: Find a non-negative function f: ℝ → ℝ that maximizes the constant C₂ in the second autocorrelation inequality: + ||f ★ f||₂² ≤ C₂ ||f ★ f||₁ ||f ★ f||_{∞} + + **Mathematical Framework**: + - Objective: Maximize C₂ = ||f ★ f||₂² / (||f ★ f||₁ ||f ★ f||_{∞}) + - Key simplification: ||f ★ f||₁ = (∫f)², reducing to C₂ = ||f ★ f||₂² / ((∫f)² ||f ★ f||_{∞}) + - Convolution definition: (f ★ f)(x) = ∫_{-∞}^{∞} f(t)f(x-t) dt + - Norms: ||g||₁ = ∫|g|, ||g||₂ = (∫|g|²)^{1/2}, ||g||_{∞} = sup|g| + - Constraints: f(x) ≥ 0 for all x ∈ ℝ, ∫f > 0 + + **Historical Context & Current State**: + - Theoretical bounds: 0.88922 ≤ C₂ ≤ 1 (Young's inequality provides upper bound) + - Current best lower bound: **0.8962799441554086** (achieved by Google's AlphaEvolve using step functions) + - **Target**: Surpass 0.8962799441554086 to establish a new world record + - Mathematical significance: This constant appears in harmonic analysis and has connections to the uncertainty principle + + **Known Function Classes & Their Performance**: + - Gaussian functions: ~0.886 + - Exponential decay: ~0.885 + - Step functions: 0.8962799441554086 (current champion) + - Polynomial decay: Various results < 0.89 + - Spline functions: Unexplored potential + - Piecewise functions: High promise based on step function success + + PERFORMANCE METRICS & SUCCESS CRITERIA: + **Primary Objective**: + - c2: The C₂ constant achieved (MAXIMIZE THIS - any value > 0.8962799441554086 is groundbreaking) + + **Secondary Metrics**: + - combined_score: c2 / 0.8962799441554086 (>1.0 means new world record) + - convergence_stability: Consistency across multiple runs + - function_complexity: Number of parameters/pieces in the discovered function + - computational_efficiency: Time to convergence + + **Diagnostic Metrics**: + - loss: Final optimization loss value + - n_points: Discretization resolution used + - eval_time: Total execution time + - gradient_norm: Final gradient magnitude (for gradient-based methods) + + COMPUTATIONAL RESOURCES & IMPLEMENTATION STACK: + **Core Mathematical Libraries**: + - numpy, scipy (optimization, integration, FFT for convolutions) + - sympy (symbolic computation, analytical derivatives) + - jax (automatic differentiation, GPU acceleration) + - torch (deep learning optimization, autograd) + + **Optimization & ML Libraries**: + - optax (advanced optimizers), scikit-learn (preprocessing, clustering) + - numba (JIT compilation for speed) + + **Data & Analysis**: + - pandas (results analysis), matplotlib/plotly (visualization) + - networkx (if exploring graph-based function representations) + + **Suggested Advanced Packages** (if available): + - cvxpy (convex optimization), autograd, casadi (optimal control) + - tensorflow-probability (probabilistic methods) + - pymoo (multi-objective optimization) + + TECHNICAL REQUIREMENTS & BEST PRACTICES: + **Reproducibility (CRITICAL)**: + - Fixed random seeds for ALL stochastic components: `numpy.random.seed(42)`, `torch.manual_seed(42)` + - Version control: Document package versions used + - Deterministic algorithms preferred; if stochastic, average over multiple seeds + + **Function Constraints**: + - f(x) ≥ 0 everywhere (use softplus, exponential, or squared transformations) + - ∫f > 0 (non-trivial function requirement) + - Numerical stability: Avoid functions causing overflow in convolution computation + + **Computational Efficiency**: + - Leverage FFT for convolution when possible: O(n log n) vs O(n²) + - Use JAX for GPU 
acceleration and automatic differentiation + - Implement adaptive discretization: start coarse, refine around promising regions + - Memory management: Handle large convolution arrays efficiently + + STRATEGIC APPROACHES & INNOVATION DIRECTIONS: + **Optimization Strategies**: + 1. **Multi-scale approach**: Optimize on coarse grid, then refine + 2. **Ensemble methods**: Combine multiple promising functions + 3. **Adaptive parametrization**: Start simple, increase complexity gradually + 4. **Basin hopping**: Global optimization with local refinement + + **Function Representation Ideas**: + 1. **Learned basis functions**: Neural networks with mathematical priors + 2. **Spline optimization**: B-splines with optimized knot positions + 3. **Fourier space**: Optimize Fourier coefficients with positivity constraints + 4. **Mixture models**: Weighted combinations of simple functions + 5. **Fractal/self-similar**: Exploit scale invariance properties + + **Advanced Mathematical Techniques**: + - Variational calculus: Derive optimality conditions analytically + - Spectral methods: Leverage eigenfunction decompositions + - Convex relaxations: Handle non-convex constraints systematically + - Symmetry exploitation: Use even functions (f(-x) = f(x)) to reduce complexity + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/second_autocorr_ineq/evaluator.py b/examples/alphaevolve_math_problems/second_autocorr_ineq/evaluator.py new file mode 100644 index 00000000..ce51b43d --- /dev/null +++ b/examples/alphaevolve_math_problems/second_autocorr_ineq/evaluator.py @@ -0,0 +1,87 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the second autocorrelation inequality problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +BENCHMARK = 0.8962799441554086 + + +def verify_c2_solution(f_values: np.ndarray, c2_achieved_from_opt: float, n_points: int): + """ + Verifies the C2 lower bound solution using the rigorous, unitless, piecewise linear integral method. + """ + if f_values.shape != (n_points,): + raise ValueError(f"Expected function values shape {(n_points,)}. 
Got {f_values.shape}.") + + if np.any(f_values < -1e-6): # Allow for small floating point errors + raise ValueError("Function must be non-negative.") + + f_nonneg = np.maximum(f_values, 0.0) + # The raw, unscaled convolution is used + convolution = np.convolve(f_nonneg, f_nonneg, mode="full") + + # Calculate the L2-norm squared: ||f*f||_2^2 via piecewise linear integration + num_conv_points = len(convolution) + x_points = np.linspace(-0.5, 0.5, num_conv_points + 2) + x_intervals = np.diff(x_points) + y_points = np.concatenate(([0], convolution, [0])) + l2_norm_squared = 0.0 + for i in range(len(convolution) + 1): + y1, y2, h = y_points[i], y_points[i + 1], x_intervals[i] + interval_l2_squared = (h / 3) * (y1**2 + y1 * y2 + y2**2) + l2_norm_squared += interval_l2_squared + + # Calculate the L1-norm: ||f*f||_1 + # This is an approximation of the integral of the absolute value of the autoconvolution + norm_1 = np.sum(np.abs(convolution)) / (len(convolution) + 1) + + # Calculate the infinity-norm: ||f*f||_inf + norm_inf = np.max(np.abs(convolution)) + + computed_c2 = l2_norm_squared / (norm_1 * norm_inf) + + return computed_c2 + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + f_values, c2_achieved_from_opt, loss, n_points = program.run() + end_time = time.time() + eval_time = end_time - start_time + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + c2_verified = verify_c2_solution(f_values, c2_achieved_from_opt, n_points) + + return { + "c2": float(c2_verified), + "combined_score": float(c2_verified) / BENCHMARK, + "loss": float(loss), + "n_points": int(n_points), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/second_autocorr_ineq/initial_program.py b/examples/alphaevolve_math_problems/second_autocorr_ineq/initial_program.py new file mode 100644 index 00000000..020fb880 --- /dev/null +++ b/examples/alphaevolve_math_problems/second_autocorr_ineq/initial_program.py @@ -0,0 +1,110 @@ +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +import optax +import numpy as np +from dataclasses import dataclass + + +@dataclass +class Hyperparameters: + """Hyperparameters for the optimization process.""" + + num_intervals: int = 50 + learning_rate: float = 0.01 + num_steps: int = 15000 + warmup_steps: int = 1000 + + +class C2Optimizer: + """ + Optimizes a discretized function to find a lower bound for the C2 constant + using the rigorous, unitless, piecewise-linear integral method. + """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + + def _objective_fn(self, f_values: jnp.ndarray) -> jnp.ndarray: + """ + Computes the objective function using the unitless norm calculation. 
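+        The non-negativity constraint is enforced with a ReLU, the autoconvolution is computed
+        via FFT, and the negated C2 ratio is returned so that the Adam optimizer minimizes it.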
+ """ + f_non_negative = jax.nn.relu(f_values) + + # Unscaled discrete autoconvolution + N = self.hypers.num_intervals + padded_f = jnp.pad(f_non_negative, (0, N)) + fft_f = jnp.fft.fft(padded_f) + convolution = jnp.fft.ifft(fft_f * fft_f).real + + # Calculate L2-norm squared of the convolution (rigorous method) + num_conv_points = len(convolution) + h = 1.0 / (num_conv_points + 1) + y_points = jnp.concatenate([jnp.array([0.0]), convolution, jnp.array([0.0])]) + y1, y2 = y_points[:-1], y_points[1:] + l2_norm_squared = jnp.sum((h / 3) * (y1**2 + y1 * y2 + y2**2)) + + # Calculate L1-norm of the convolution + norm_1 = jnp.sum(jnp.abs(convolution)) / (len(convolution) + 1) + + # Calculate infinity-norm of the convolution + norm_inf = jnp.max(jnp.abs(convolution)) + + # Calculate C2 ratio + denominator = norm_1 * norm_inf + c2_ratio = l2_norm_squared / denominator + + # We want to MAXIMIZE C2, so the optimizer must MINIMIZE its negative. + return -c2_ratio + + def train_step(self, f_values: jnp.ndarray, opt_state: optax.OptState) -> tuple: + """Performs a single training step.""" + loss, grads = jax.value_and_grad(self._objective_fn)(f_values) + updates, opt_state = self.optimizer.update(grads, opt_state, f_values) + f_values = optax.apply_updates(f_values, updates) + return f_values, opt_state, loss + + def run_optimization(self): + """Sets up and runs the full optimization process.""" + schedule = optax.warmup_cosine_decay_schedule( + init_value=0.0, + peak_value=self.hypers.learning_rate, + warmup_steps=self.hypers.warmup_steps, + decay_steps=self.hypers.num_steps - self.hypers.warmup_steps, + end_value=self.hypers.learning_rate * 1e-4, + ) + self.optimizer = optax.adam(learning_rate=schedule) + + key = jax.random.PRNGKey(42) + f_values = jax.random.uniform(key, (self.hypers.num_intervals,)) + + opt_state = self.optimizer.init(f_values) + print( + f"Number of intervals (N): {self.hypers.num_intervals}, Steps: {self.hypers.num_steps}" + ) + train_step_jit = jax.jit(self.train_step) + + loss = jnp.inf + for step in range(self.hypers.num_steps): + f_values, opt_state, loss = train_step_jit(f_values, opt_state) + if step % 1000 == 0 or step == self.hypers.num_steps - 1: + print(f"Step {step:5d} | C2 ≈ {-loss:.8f}") + + final_c2 = -self._objective_fn(f_values) + print(f"Final C2 lower bound found: {final_c2:.8f}") + return jax.nn.relu(f_values), final_c2 + + +def run(): + """Entry point for running the optimization.""" + hypers = Hyperparameters() + optimizer = C2Optimizer(hypers) + optimized_f, final_c2_val = optimizer.run_optimization() + + loss_val = -final_c2_val + f_values_np = np.array(optimized_f) + + return f_values_np, float(final_c2_val), float(loss_val), hypers.num_intervals + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/second_autocorr_ineq/requirements.txt b/examples/alphaevolve_math_problems/second_autocorr_ineq/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/second_autocorr_ineq/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/sums_diffs_finite_sets/config.yaml b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/config.yaml new file mode 100644 index 00000000..dc216197 --- /dev/null +++ b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/config.yaml @@ -0,0 +1,59 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + 
api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in number theory, combinatorial optimization, and AI-driven mathematical discovery. + Your task is to evolve and optimize a Python script to find a finite set of integers `U` that provides a new, world-record **lower bound** for the constant C₆. + + PROBLEM CONTEXT: + Target: Find a finite set `U` of non-negative integers (containing 0) that **maximizes** the objective function: + C6(U) = 1 + log(|U-U| / |U+U|) / log(2*max(U) + 1) + + This maximum value provides a tight lower bound for the constant C6. + + Current best known lower bound: C6 ≥ 1.158417281556896 + Goal: Find a set `U` that results in a C6 value greater than 1.158417281556896. + + PERFORMANCE METRICS: + - c6_bound: Bound founded. + - combined_score: c6_bound/1.158417281556896 (The primary objective is to MAXIMIZE this value - a value > 1 means a new record). + - set_size: size of the set U. + - max_val: max value in the set U. + - eval_time: evaluation time of the main program. + + VALIDATION FRAMEWORK: + - The evaluation script re-computes the C6 value using standard NumPy set operations and verifies the constraints on `U`. + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/sums_diffs_finite_sets/evaluator.py b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/evaluator.py new file mode 100644 index 00000000..9e4d1e24 --- /dev/null +++ b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/evaluator.py @@ -0,0 +1,89 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the sums and differences of finite sets problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. 
+# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +BENCHMARK = 1.158417281556896 + + +def verify_c6_solution(u_set: np.ndarray, c6_achieved: float): + """Verifies the C6 lower bound solution.""" + + if not isinstance(u_set, np.ndarray) or u_set.ndim != 1: + raise ValueError("Solution U must be a 1D numpy array of integers.") + + # Verify constraints + if 0 not in u_set: + raise ValueError("Set U must contain 0.") + if np.any(u_set < 0): + raise ValueError("Set U must contain non-negative integers.") + + # Re-calculate the C6 bound using NumPy + u_plus_u = np.unique(u_set[:, None] + u_set[None, :]) + u_minus_u = np.unique(u_set[:, None] - u_set[None, :]) + + size_U_plus_U = len(u_plus_u) + size_U_minus_U = len(u_minus_u) + max_U = np.max(u_set) + + ratio = size_U_minus_U / size_U_plus_U + log_ratio = np.log(ratio) + log_denom = np.log(2 * max_U + 1) + + computed_c6 = 1 + log_ratio / log_denom + + # Check for consistency + if not np.isclose(computed_c6, c6_achieved): + raise ValueError(f"C6 mismatch: reported {c6_achieved:.6f}, computed {computed_c6:.6f}") + + print(f"C6 lower bound achieved: {c6_achieved:.6f}") + print(f"Known best bound (AlphaEvolve): {BENCHMARK}") + + if c6_achieved > BENCHMARK: + print("Successfully found a new, better lower bound!") + else: + print("Result is not better than the known lower bounds.") + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + u_set, c6_bound = program.run() + end_time = time.time() + eval_time = end_time - start_time + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + verify_c6_solution(u_set, c6_bound) + + return { + "c6_bound": float(c6_bound), + "combined_score": float(c6_bound) / BENCHMARK, + "set_size": len(u_set), + "max_val": int(np.max(u_set)), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/sums_diffs_finite_sets/initial_program.py b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/initial_program.py new file mode 100644 index 00000000..d63997af --- /dev/null +++ b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/initial_program.py @@ -0,0 +1,117 @@ +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +from dataclasses import dataclass +import numpy as np +import tqdm + + +@dataclass +class Hyperparameters: + max_integer: int = 250 + num_restarts: int = 5 + num_search_steps: int = 1000 + initial_temperature: float = 0.01 + + +class C6Searcher: + """ + Searches for a set U by running the search in pure Python for correctness. 
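+
+    The candidate set is encoded as a boolean inclusion mask over {0, ..., max_integer} and
+    mutated one index at a time with a Metropolis acceptance rule. Worked example of the
+    objective: U = {0, 1, 3} gives |U+U| = 6, |U-U| = 7 and max(U) = 3, so
+    C6(U) = 1 + log(7/6) / log(7) ≈ 1.079.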
+ """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + self.allowed_values = jnp.array((-1, 0, 1), dtype=jnp.int32) + + @staticmethod + def _objective_fn(u_mask: jnp.ndarray) -> jnp.ndarray: + """Calculates the C6 lower bound using jnp.unique""" + U = jnp.where(u_mask)[0] + + sums = U[:, None] + U[None, :] + diffs = U[:, None] - U[None, :] + + size_U_plus_U = jnp.unique(sums).shape[0] + size_U_minus_U = jnp.unique(diffs).shape[0] + max_U = jnp.max(U) + + # Handle the case where max_U is 0 to avoid log(1)=0 in denominator + if max_U == 0: + return -1.0 # Return a low value for trivial sets + + ratio = size_U_minus_U / size_U_plus_U + c6_bound = 1 + jnp.log(ratio) / jnp.log(2 * max_U + 1) + + return -c6_bound # Return negative for maximization + + def anneal_step(self, key, temp, current_mask, current_loss): + """Performs one step of Simulated Annealing (not JIT-compiled).""" + # Propose a random mutation + idx_to_flip = jax.random.randint(key, (), 1, len(current_mask)) + neighbor_mask = current_mask.at[idx_to_flip].set(1 - current_mask[idx_to_flip]) + + neighbor_loss = self._objective_fn(neighbor_mask) + delta_loss = neighbor_loss - current_loss + + # Metropolis acceptance criterion + should_accept = False + if delta_loss < 0: + should_accept = True + else: + accept_prob = jnp.exp(-delta_loss / temp) + if jax.random.uniform(key) < accept_prob: + should_accept = True + + if should_accept: + return neighbor_mask, neighbor_loss + else: + return current_mask, current_loss + + +def run(): + hypers = Hyperparameters() + main_key = jax.random.PRNGKey(42) + + best_loss = float("inf") + best_set_np = None + + for i in range(hypers.num_restarts): + print(f"\n{'='*20} Restart {i+1}/{hypers.num_restarts} {'='*20}") + restart_key, main_key = jax.random.split(main_key) + loss, u_set_np = run_single_trial(hypers, restart_key) + + if loss < best_loss: + print(f"New best C6 bound found: {-loss:.8f}") + best_loss = loss + best_set_np = u_set_np + + c6_bound = -best_loss + print(f"\nSearch complete. Best C6 lower bound found: {c6_bound:.8f}") + return best_set_np, c6_bound + + +def run_single_trial(hypers, key): + # Initialize a random sparse set, ensuring 0 is included + key, subkey = jax.random.split(key) + sparsity = 0.95 + u_mask = jax.random.bernoulli(subkey, p=(1 - sparsity), shape=(hypers.max_integer + 1,)) + u_mask = u_mask.at[0].set(True) + + searcher = C6Searcher(hypers) + current_loss = searcher._objective_fn(u_mask) + + print(f"Starting SA search. 
Initial C6 bound: {-current_loss:.6f}") + + current_mask = u_mask + for step in tqdm.tqdm(range(hypers.num_search_steps), desc="Annealing Progress"): + key, subkey = jax.random.split(key) + current_temp = hypers.initial_temperature * (1 - step / hypers.num_search_steps) + current_mask, current_loss = searcher.anneal_step( + subkey, jnp.maximum(current_temp, 1e-6), current_mask, current_loss + ) + + final_set = np.where(current_mask)[0] + return current_loss, final_set + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/sums_diffs_finite_sets/requirements.txt b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/sums_diffs_finite_sets/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/third_autocorr_ineq/config.yaml b/examples/alphaevolve_math_problems/third_autocorr_ineq/config.yaml new file mode 100644 index 00000000..58506ac0 --- /dev/null +++ b/examples/alphaevolve_math_problems/third_autocorr_ineq/config.yaml @@ -0,0 +1,62 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in functional analysis, harmonic analysis, numerical optimization, and AI-driven mathematical discovery. + Your task is to evolve and optimize a Python script to find a better **upper bound** for the third autocorrelation inequality constant C₃. + + PROBLEM CONTEXT: + Target: Find a function f: R → R (which can take positive and negative values) that **minimizes** the constant C3 in the inequality: + max_{-1/2≤t≤1/2} |f ★ f(t)| ≥ C3 (∫_{-1/4}^{1/4} f(x) dx)² + + This is equivalent to minimizing the ratio: C3 = max |f ★ f| / (∫f)² + + Current best known bound: C3 ≤ 1.45810 + Goal: Beat the AlphaEvolve upper bound of 1.4556427953745406. + + Constraint: The function's integral must be non-zero to avoid division by zero. + + PERFORMANCE METRICS: + - c3: The C3 constant achieved by the discovered function. + - combined_score: 1.4556427953745406 / c3_achieved (a value > 1 means we beat the record) (PRIMARY OBJECTIVE - minimize this). + - loss: loss value returned by the loss function. + - n_points: number of points used to the discretization of the interval. + - eval_time: evaluation time to run the solution script. + + VALIDATION FRAMEWORK: + - The evaluation script re-computes the C3 ratio using `numpy.convolve` and `numpy.abs` to verify the value from the optimizer. + - It checks that the function's integral is not close to zero. 
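+
+    For reference, a compact NumPy sketch of this verification (illustrative; it mirrors the
+    evaluator's `numpy.convolve` check for `n_points` samples of f on an interval of length 1/2):
+
+    ```python
+    import numpy as np
+
+    def c3_ratio(f_values, n_points):
+        """max|f*f| / (integral of f)^2 for samples of f on a grid covering length 0.5."""
+        dx = 0.5 / n_points
+        integral_sq = (np.sum(f_values) * dx) ** 2   # (integral of f)^2
+        conv = np.convolve(f_values, f_values) * dx  # scaled autoconvolution
+        return np.max(np.abs(conv)) / integral_sq
+
+    f = np.ones(400)            # constant test function
+    print(c3_ratio(f, 400))     # = 2.0 for the constant function; the target is below 1.4557
+    ```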
+ num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/third_autocorr_ineq/evaluator.py b/examples/alphaevolve_math_problems/third_autocorr_ineq/evaluator.py new file mode 100644 index 00000000..b33c77d6 --- /dev/null +++ b/examples/alphaevolve_math_problems/third_autocorr_ineq/evaluator.py @@ -0,0 +1,80 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the third autocorrelation inequality problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. +# +# ===--------------------------------------------------------------------------------------===# + +import sys +import os +from importlib import __import__ +import time +import numpy as np + +# Known bounds +BENCHMARK = 1.4556427953745406 + + +def verify_c3_solution(f_values: np.ndarray, c3_achieved: float, n_points: int): + """Verify the solution for the C3 UPPER BOUND optimization.""" + + if f_values.shape != (n_points,): + raise ValueError(f"Expected function values shape {(n_points,)}. Got {f_values.shape}.") + + # Recompute C3 using NumPy to verify + dx = 0.5 / n_points + + # Squared integral of f + integral_f_sq = (np.sum(f_values) * dx) ** 2 + + if integral_f_sq < 1e-9: + raise ValueError("Function integral is close to zero, ratio is unstable.") + + # Max absolute value of the scaled autoconvolution + conv = np.convolve(f_values, f_values, mode="full") + scaled_conv = conv * dx + max_abs_conv = np.max(np.abs(scaled_conv)) + + computed_c3 = max_abs_conv / integral_f_sq + + delta = abs(computed_c3 - c3_achieved) + if delta > 1e-3: + raise ValueError( + f"C3 mismatch: reported {c3_achieved:.6f}, computed {computed_c3:.6f}, delta: {delta:.6f}" + ) + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + start_time = time.time() + f_values, c3_achieved, loss, n_points = program.run() + end_time = time.time() + eval_time = end_time - start_time + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + verify_c3_solution(f_values, c3_achieved, n_points) + + return { + "c3": float(c3_achieved), + "combined_score": BENCHMARK / float(c3_achieved), + "loss": float(loss), + "n_points": int(n_points), + "eval_time": float(eval_time), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/third_autocorr_ineq/initial_program.py b/examples/alphaevolve_math_problems/third_autocorr_ineq/initial_program.py new file mode 100644 index 00000000..9f8bbfb4 --- /dev/null +++ b/examples/alphaevolve_math_problems/third_autocorr_ineq/initial_program.py @@ -0,0 +1,107 @@ +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +import optax +import numpy as np +from dataclasses import dataclass + + +@dataclass +class Hyperparameters: + """Hyperparameters for the optimization process.""" + + num_intervals: int = 400 + learning_rate: float = 0.005 + num_steps: int = 20000 + warmup_steps: int = 2000 + + +class C3Optimizer: + """ + Optimizes a function f (with 
positive and negative values) to find an + upper bound for the C3 constant. + """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + self.domain_width = 0.5 + self.dx = self.domain_width / self.hypers.num_intervals + + def _objective_fn(self, f_values: jnp.ndarray) -> jnp.ndarray: + """ + Computes the C3 ratio. The goal is to minimize this value. + """ + # The squared integral of f. + integral_f = jnp.sum(f_values) * self.dx + eps = 1e-9 + integral_f_sq_safe = jnp.maximum(integral_f**2, eps) + + # The max of the absolute value of the autoconvolution. + N = self.hypers.num_intervals + padded_f = jnp.pad(f_values, (0, N)) + + fft_f = jnp.fft.fft(padded_f) + conv_f_f = jnp.fft.ifft(fft_f * fft_f).real + + # Scale the unscaled convolution sum by dx to approximate the integral. + scaled_conv_f_f = conv_f_f * self.dx + + # Take the maximum of the absolute value. + max_abs_conv = jnp.max(jnp.abs(scaled_conv_f_f)) + + c3_ratio = max_abs_conv / integral_f_sq_safe + + # We want to MINIMIZE the ratio. + return c3_ratio + + def train_step(self, f_values: jnp.ndarray, opt_state: optax.OptState) -> tuple: + """Performs a single training step.""" + loss, grads = jax.value_and_grad(self._objective_fn)(f_values) + updates, opt_state = self.optimizer.update(grads, opt_state, f_values) + f_values = optax.apply_updates(f_values, updates) + return f_values, opt_state, loss + + def run_optimization(self): + """Sets up and runs the full optimization process.""" + schedule = optax.warmup_cosine_decay_schedule( + init_value=0.0, + peak_value=self.hypers.learning_rate, + warmup_steps=self.hypers.warmup_steps, + decay_steps=self.hypers.num_steps - self.hypers.warmup_steps, + end_value=self.hypers.learning_rate * 1e-4, + ) + self.optimizer = optax.adam(learning_rate=schedule) + + key = jax.random.PRNGKey(42) + f_values = jax.random.normal(key, (self.hypers.num_intervals,)) + + opt_state = self.optimizer.init(f_values) + print( + f"Number of intervals (N): {self.hypers.num_intervals}, Steps: {self.hypers.num_steps}" + ) + train_step_jit = jax.jit(self.train_step) + + loss = jnp.inf + for step in range(self.hypers.num_steps): + f_values, opt_state, loss = train_step_jit(f_values, opt_state) + if step % 1000 == 0 or step == self.hypers.num_steps - 1: + print(f"Step {step:5d} | C3 ≈ {loss:.8f}") + + final_c3 = loss + print(f"Final C3 upper bound found: {final_c3:.8f}") + return f_values, final_c3 + + +def run(): + """Entry point for running the optimization.""" + hypers = Hyperparameters() + optimizer = C3Optimizer(hypers) + optimized_f, final_c3_val = optimizer.run_optimization() + + loss_val = final_c3_val + f_values_np = np.array(optimized_f) + + return f_values_np, float(final_c3_val), float(loss_val), hypers.num_intervals + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/third_autocorr_ineq/requirements.txt b/examples/alphaevolve_math_problems/third_autocorr_ineq/requirements.txt new file mode 100644 index 00000000..3dee6952 --- /dev/null +++ b/examples/alphaevolve_math_problems/third_autocorr_ineq/requirements.txt @@ -0,0 +1,3 @@ +numpy +jax +optax \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/uncertainty_ineq/config.yaml b/examples/alphaevolve_math_problems/uncertainty_ineq/config.yaml new file mode 100644 index 00000000..cc35a30d --- /dev/null +++ b/examples/alphaevolve_math_problems/uncertainty_ineq/config.yaml @@ -0,0 +1,63 @@ +# Evolution settings +max_iterations: 200 +checkpoint_interval: 10 +parallel_evaluations: 1 + +# LLM 
configuration +llm: + api_base: "https://api.openai.com/v1" # Or your LLM provider + models: + - name: "gpt-4" + weight: 1.0 + temperature: 0.7 + max_tokens: 4000 + timeout: 120 + +# Database configuration (MAP-Elites algorithm) +database: + population_size: 40 + num_islands: 5 + migration_interval: 40 + feature_dimensions: # MUST be a list, not an integer + - "score" + - "complexity" + +# Evaluation settings +evaluator: + timeout: 360 + max_retries: 3 + +# Prompt configuration +prompt: + system_message: | + SETTING: + You are an expert in harmonic analysis, numerical optimization, and AI-driven mathematical discovery. + Your task is to evolve and optimize a Python script to find a better **upper bound** for the uncertainty inequality constant C₄. + + PROBLEM CONTEXT: + Target: Find an even function f(x) that **minimizes** the product A(f)A(f̂), where A(f) is the largest positive root of f. + This minimal product provides a tight upper bound for the constant C₄. + + Current best known upper bound: C₄ ≤ 0.3215872333529007 + Goal: Find a set of coefficients for a test function that results in a C₄ value lower than 0.3215872333529007. + + METHOD: + The test function is parameterized as f(x) = P(x)exp(-πx²), where P(x) is a linear combination of even Hermite polynomials: P(x) = c₀H₀(x) + c₁H₄(x) + c₂H₈(x) + ... + The problem simplifies to finding coefficients [c₀, c₁, c₂, ...] that minimize the largest positive root of P(x), subject to the constraint P(0) = 0. + The final C₄ bound is the square of this minimal root, (r_max)². + + PERFORMANCE METRICS: + - c4_bound: bound found by the algorithm. + - combined_score: 0.3215872333529007 / c4_bound (The primary objective is to MAXIMIZE this value - a value > 1 means a new record). + - r_max: largest positive root found. + - coeffs: coefficients found in the optimization. + - eval_time: evaluation time of the script. + + VALIDATION FRAMEWORK: + - The evaluation script reconstructs the polynomial from the discovered coefficients and uses `numpy.roots` to independently verify the largest positive root and the C4 bound. + - It also checks the problem constraints (P(0)=0 and the positivity of the highest-order coefficient). + num_top_programs: 3 + num_diverse_programs: 2 + +# Logging +log_level: "INFO" \ No newline at end of file diff --git a/examples/alphaevolve_math_problems/uncertainty_ineq/evaluator.py b/examples/alphaevolve_math_problems/uncertainty_ineq/evaluator.py new file mode 100644 index 00000000..f521bef1 --- /dev/null +++ b/examples/alphaevolve_math_problems/uncertainty_ineq/evaluator.py @@ -0,0 +1,129 @@ +# ===--------------------------------------------------------------------------------------===# +# +# This file implements the evaluator for the uncertainty inequality problem. +# +# ===--------------------------------------------------------------------------------------===# +# +# Some of the code in this file is adapted from: +# +# google-deepmind/alphaevolve_results: +# Licensed under the Apache License v2.0. 
+# +# ===--------------------------------------------------------------------------------------===# + +import sys, os, time, numpy as np +import sympy as sp + +BENCHMARK = 0.3215872333529007 + +x = sp.symbols("x") + + +def _hermite_4k_polys(m: int): + degrees = [4 * k for k in range(m)] + Hs = [sp.polys.orthopolys.hermite_poly(n=d, x=x, polys=False) for d in degrees] + return Hs, degrees + + +def _construct_P_with_forced_zero(coeffs: np.ndarray) -> sp.Expr: + """ + Given m input coeffs (c0..c_{m-1}), build the Hermite combo + c0*H0 + ... + c_{m-1}*H_{4(m-1)} + c_last*H_{4m} + where c_last is chosen so that P(0) = 0. + Also flip sign if limit at +inf is negative. + """ + m = len(coeffs) + Hs, _ = _hermite_4k_polys(m + 1) # include the (m)-th term for solving P(0)=0 + rc = [sp.Rational(c) for c in coeffs] + + partial = sum(rc[i] * Hs[i] for i in range(m)) + a = Hs[m].subs(x, 0) + b = -partial.subs(x, 0) + c_last = sp.Rational(b) / sp.Rational(a) + + P = partial + c_last * Hs[m] + + # Ensure positivity at +inf (all degrees are multiples of 4, so sign is well-defined) + if sp.limit(P, x, sp.oo) < 0: + P = -P + + return sp.simplify(P) + + +def _largest_positive_root_of_P_over_x2(P: sp.Expr) -> float: + # P is even and has P(0)=0; divide by x^2 and find the largest positive real root. + gq = sp.exquo(P, x**2) # exact division (should divide cleanly if multiplicity >= 2) + roots = sp.real_roots(gq, x) + if not roots: + raise ValueError("No real roots for P(x)/x^2.") + + # Validate sign change around each candidate + best = None + for r in roots: + r_approx = r.eval_rational(n=200) + eps = sp.Rational(1, 10**198) + left = gq.subs(x, r_approx - eps) + right = gq.subs(x, r_approx + eps) + if (left > 0 and right < 0) or (left < 0 and right > 0): + if best is None or r_approx > best: + best = r_approx + + if best is None: + raise ValueError("No root with a verified sign change for P(x)/x^2.") + return float(best) + + +def compute_c4_and_rmax(input_coeffs: np.ndarray): + P = _construct_P_with_forced_zero(input_coeffs) + # Quick sanity checks + assert P.subs(x, 0) == 0, "P(0) != 0 after forcing." + assert sp.limit(P, x, sp.oo) > 0, "Limit at +inf is not positive." 
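+    # Largest verified positive root of P(x)/x^2; the reported upper bound is
+    # C4 = rmax**2 / (2*pi), which is the quantity compared against BENCHMARK.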
+ + rmax = _largest_positive_root_of_P_over_x2(P) + c4 = (rmax**2) / (2.0 * np.pi) + return c4, rmax + + +def verify_c4_solution_strict( + user_coeffs: np.ndarray, reported_c4: float, reported_rmax: float, atol=1e-9, rtol=1e-9 +): + c4, rmax = compute_c4_and_rmax(np.asarray(user_coeffs, dtype=float)) + + if not np.isclose(c4, reported_c4, rtol=rtol, atol=atol): + raise ValueError(f"C4 mismatch: reported {reported_c4:.12f}, recomputed {c4:.12f}") + + if not np.isclose(rmax, reported_rmax, rtol=rtol, atol=atol): + raise ValueError(f"r_max mismatch: reported {reported_rmax:.12f}, recomputed {rmax:.12f}") + + return c4, rmax + + +def evaluate(program_path: str): + try: + abs_program_path = os.path.abspath(program_path) + program_dir = os.path.dirname(abs_program_path) + module_name = os.path.splitext(os.path.basename(program_path))[0] + + try: + sys.path.insert(0, program_dir) + program = __import__(module_name) + t0 = time.time() + coeffs, c4_bound, r_max = program.run() + t1 = time.time() + finally: + if program_dir in sys.path: + sys.path.remove(program_dir) + + coeffs = np.asarray(coeffs, dtype=float) + + c4, rmax = verify_c4_solution_strict(coeffs, float(c4_bound), float(r_max)) + + return { + "c4_bound": float(c4), + "combined_score": float(BENCHMARK / c4), + "r_max": float(rmax), + "coeffs": coeffs.tolist(), + "eval_time": float(t1 - t0), + } + except Exception as e: + return {"combined_score": 0.0, "error": str(e)} diff --git a/examples/alphaevolve_math_problems/uncertainty_ineq/initial_program.py b/examples/alphaevolve_math_problems/uncertainty_ineq/initial_program.py new file mode 100644 index 00000000..df1948ce --- /dev/null +++ b/examples/alphaevolve_math_problems/uncertainty_ineq/initial_program.py @@ -0,0 +1,211 @@ +# Disable progress bar for cleaner output logs +import os + +os.environ["TQDM_DISABLE"] = "1" + +# EVOLVE-BLOCK-START +import jax +import jax.numpy as jnp +import optax +import numpy as np +from dataclasses import dataclass +from scipy.special import hermite +import tqdm + + +@dataclass +class Hyperparameters: + learning_rate: float = 0.001 + num_steps: int = 100000 + num_restarts: int = 20 + num_hermite_coeffs: int = 4 # uses H0, H4, H8, H12 + + +class UncertaintyOptimizer: + """ + Finds coefficients for a generalized Hermite polynomial P(x) that minimize + the largest positive root, providing an upper bound for C4. 
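+
+    P(x) is a combination of the even Hermite polynomials H0, H4, H8, ... with P(0) = 0
+    enforced by solving for one coefficient; the loss penalizes negative values of P on a
+    grid over [0, 10].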
+ """ + + def __init__(self, hypers: Hyperparameters): + self.hypers = hypers + self.degrees = [4 * k for k in range(hypers.num_hermite_coeffs)] + max_degree = self.degrees[-1] + hermite_polys = [hermite(d) for d in self.degrees] + + basis = [] + for poly in hermite_polys: + pad_amount = max_degree - poly.order + basis.append(jnp.array(np.pad(poly.coef, (pad_amount, 0)))) + self.hermite_basis = jnp.stack(basis) + + self.H_vals_at_zero = jnp.array([p(0) for p in hermite_polys]) + self.x_grid = jnp.linspace(0.0, 10.0, 3000) + self.optimizer = optax.adam(self.hypers.learning_rate) + + @staticmethod + def _objective_fn(params: jnp.ndarray, hermite_basis, H_vals_at_zero, x_grid): + """Penalize negative values of P(x) on [0, Xmax]; mild weighting to emphasize larger x.""" + c_others, log_c_last = params[:-1], params[-1] + c_last = jnp.exp(log_c_last) + + # Enforce P(0) = 0 + c0 = ( + -(jnp.sum(c_others * H_vals_at_zero[1:-1]) + c_last * H_vals_at_zero[-1]) + / H_vals_at_zero[0] + ) + hermite_coeffs = jnp.concatenate([jnp.array([c0]), c_others, jnp.array([c_last])]) + + poly_coeffs_std = jnp.sum(hermite_coeffs[:, None] * hermite_basis, axis=0) + p_values = jnp.polyval(poly_coeffs_std, x_grid) + + # Slightly increasing weight toward the right end of the interval + weights = 1.0 + (x_grid / (x_grid[-1] + 1e-12)) + loss = jnp.sum(weights * jax.nn.relu(-p_values)) + return loss + + @staticmethod + def train_step( + params: jnp.ndarray, + opt_state: optax.OptState, + optimizer, + hermite_basis, + H_vals_at_zero, + x_grid, + ): + loss, grads = jax.value_and_grad(UncertaintyOptimizer._objective_fn)( + params, hermite_basis, H_vals_at_zero, x_grid + ) + updates, opt_state = optimizer.update(grads, opt_state, params) + params = optax.apply_updates(params, updates) + return params, opt_state, loss + + +def run_single_trial(optimizer: UncertaintyOptimizer, key: jax.random.PRNGKey): + """Runs one full optimization from a near-good starting point with small noise.""" + num_params_to_opt = optimizer.hypers.num_hermite_coeffs - 1 # = 3 when using H0,H4,H8,H12 + assert num_params_to_opt == 3, "This initialization assumes num_hermite_coeffs == 4." 
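+    # Warm start: perturb a known-good coefficient vector with small Gaussian noise so that
+    # each restart explores a nearby basin instead of starting from a random point.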
+ + base_c1 = -0.01158510802599293 + base_c2 = -8.921606035407065e-05 + base_log_c_last = np.log(1e-6) + + base = jnp.array([base_c1, base_c2, base_log_c_last], dtype=jnp.float32) + noise = jax.random.normal(key, (num_params_to_opt,)) * 1e-3 + params = base + noise + + opt_state = optimizer.optimizer.init(params) + jit_train_step = jax.jit(UncertaintyOptimizer.train_step, static_argnums=(2,)) + + for _ in range(optimizer.hypers.num_steps): + params, opt_state, _ = jit_train_step( + params, + opt_state, + optimizer.optimizer, + optimizer.hermite_basis, + optimizer.H_vals_at_zero, + optimizer.x_grid, + ) + return params + + +def _build_P_from_hermite_coeffs(hermite_coeffs: np.ndarray, degrees: list[int]) -> np.poly1d: + """Build monomial-basis polynomial P from Hermite-basis coefficients.""" + max_degree = degrees[-1] + hermite_polys = [hermite(d) for d in degrees] + + P_poly_coeffs = np.zeros(max_degree + 1) + for i, c in enumerate(hermite_coeffs): + poly = hermite_polys[i] + pad_amount = max_degree - poly.order + P_poly_coeffs[pad_amount:] += c * poly.coef + + if P_poly_coeffs[0] < 0: + P_poly_coeffs = -P_poly_coeffs + hermite_coeffs[:] = -hermite_coeffs + return np.poly1d(P_poly_coeffs) + + +def _c4_from_hermite_coeffs(hermite_coeffs: np.ndarray, num_hermite_coeffs: int): + """Compute r_max and C4 from full Hermite coefficient vector (Google-style).""" + degrees = [4 * k for k in range(num_hermite_coeffs)] + P = _build_P_from_hermite_coeffs(hermite_coeffs.copy(), degrees) + + # Divide by x^2 + Q, R = np.polydiv(P, np.poly1d([1.0, 0.0, 0.0])) + if np.max(np.abs(R.c)) > 1e-10: + return None, None + + roots = Q.r + real_pos = roots[(np.isreal(roots)) & (roots.real > 0)].real + if real_pos.size == 0: + return None, None + + # Tiny sign-change check around candidates + r_candidates = np.sort(real_pos) + r_max = None + for r in r_candidates: + eps = 1e-10 * max(1.0, abs(r)) + left = np.polyval(Q, r - eps) + right = np.polyval(Q, r + eps) + if left * right < 0: + r_max = float(r) + if r_max is None: + r_max = float(r_candidates[-1]) + + c4 = (r_max**2) / (2 * np.pi) + return c4, r_max + + +def get_c4_from_params(params: np.ndarray, hypers: Hyperparameters): + """Calculates the precise C4 bound from a final set of parameters.""" + c_others, log_c_last = params[:-1], params[-1] + c_last = np.exp(log_c_last) + + degrees = [4 * k for k in range(hypers.num_hermite_coeffs)] + hermite_polys = [hermite(d) for d in degrees] + H_vals_at_zero = np.array([p(0) for p in hermite_polys]) + + # Enforce P(0) = 0 + c0 = ( + -(np.sum(c_others * H_vals_at_zero[1:-1]) + c_last * H_vals_at_zero[-1]) / H_vals_at_zero[0] + ) + hermite_coeffs = np.concatenate([[c0], np.array(c_others), [c_last]]) + + c4, rmax = _c4_from_hermite_coeffs(hermite_coeffs, hypers.num_hermite_coeffs) + if c4 is None: + return None, None, None + return hermite_coeffs, c4, rmax + + +def run(): + hypers = Hyperparameters() + optimizer = UncertaintyOptimizer(hypers) + main_key = jax.random.PRNGKey(42) + best_c4_bound = float("inf") + best_coeffs, best_r_max = None, None + + print(f"Running {hypers.num_restarts} trials to find the best C4 upper bound...") + for _ in tqdm.tqdm(range(hypers.num_restarts), desc="Searching", disable=True): + main_key, restart_key = jax.random.split(main_key) + final_params = run_single_trial(optimizer, restart_key) + + coeffs, c4_bound, r_max = get_c4_from_params(np.array(final_params), hypers) + + if c4_bound is not None and c4_bound < best_c4_bound: + best_c4_bound = c4_bound + best_coeffs = coeffs + best_r_max = 
r_max + + if best_coeffs is None: + raise RuntimeError("Failed to find a valid solution in any restart.") + + print("\nSearch complete.") + print(f"Best Hermite coeffs: {best_coeffs}") + print(f"Best largest positive root r_max: {best_r_max:.8f}") + print(f"Resulting best C4 upper bound: {best_c4_bound:.8f}") + + return best_coeffs, best_c4_bound, best_r_max + + +# EVOLVE-BLOCK-END diff --git a/examples/alphaevolve_math_problems/uncertainty_ineq/requirements.txt b/examples/alphaevolve_math_problems/uncertainty_ineq/requirements.txt new file mode 100644 index 00000000..48d8952e --- /dev/null +++ b/examples/alphaevolve_math_problems/uncertainty_ineq/requirements.txt @@ -0,0 +1,4 @@ +numpy +sympy +scipy +tqdm \ No newline at end of file
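To sanity-check any of these problem folders outside of an evolution run, the evaluator can be called directly on its initial program. The sketch below is illustrative and assumes it is executed from the repository root; it uses the `uncertainty_ineq` example above, and the initial program may take a while since it performs many optimization restarts.

```python
# Minimal smoke test for one problem folder: import its evaluator and score the
# corresponding initial program, then inspect the returned metrics dictionary.
import json
import sys

problem_dir = "examples/alphaevolve_math_problems/uncertainty_ineq"
sys.path.insert(0, problem_dir)

import evaluator  # the uncertainty_ineq evaluator shown above

metrics = evaluator.evaluate(f"{problem_dir}/initial_program.py")
print(json.dumps(metrics, indent=2))  # on success: c4_bound, combined_score, r_max, coeffs, eval_time
```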