diff --git a/examples/notebooks/Functional_API/README.md b/examples/notebooks/Functional_API/README.md
new file mode 100644
index 00000000..810e8ce6
--- /dev/null
+++ b/examples/notebooks/Functional_API/README.md
@@ -0,0 +1,9 @@
+# Functional API for Optimization
+
+As an alternative to its object-oriented stateful API, EvoTorch provides an API that conforms to the functional programming paradigm. This functional API has its own advantages like being able to work on not just a single population, but on a batch of populations.
+
+Here are the examples demonstrating various features of this functional API:
+
+- **[Maintaining a batch of populations using the functional EvoTorch API](batched_searches.ipynb)**: This notebook shows how one can efficiently run multiple searches simultaneously, each with its own population and hyperparameter configuration, by maintaining a batch of populations.
+- **[Solving constrained optimization problems](constrained.ipynb)**: EvoTorch provides batching-friendly constraint penalization functions that can be used with both the object-oriented API and the functional API. In addition, these constraint penalization functions can be used with gradient-based optimization. This notebook demonstrates these features.
+- **[Solving reinforcement learning tasks using functional evolutionary algorithms](problem.ipynb)**: The functional evolutionary algorithm implementations of EvoTorch can be used to solve problems that are expressed using the object-oriented core API of EvoTorch. To demonstrate this, this notebook instantiates a `GymNE` problem for the reinforcement learning task "CartPole-v1", and solves it using the functional `pgpe` implementation.
diff --git a/examples/notebooks/Functional_API/batched_searches.ipynb b/examples/notebooks/Functional_API/batched_searches.ipynb
new file mode 100644
index 00000000..69042e55
--- /dev/null
+++ b/examples/notebooks/Functional_API/batched_searches.ipynb
@@ -0,0 +1,428 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "d66277ff-bb08-48c6-b1b3-21919b700c9b",
+   "metadata": {},
+   "source": [
+    "# Maintaining a batch of populations using the functional EvoTorch API\n",
+    "\n",
+    "## Motivation\n",
+    "EvoTorch already implements mechanisms for accelerating the evaluation of solutions in a population using ray actors and/or PyTorch vectorization, which is essential for obtaining results in reasonable time.\n",
+    "\n",
+    "However, usually one also needs to run search algorithms multiple times to evaluate the effect of various hyperparameters (such as learning rate, mutation probability etc). This can take a lot of time, and needs additional effort to parallelize (e.g. on a cluster). The functional API of EvoTorch attempts to address this need by leveraging [PyTorch's vmap() transform](https://pytorch.org/docs/stable/generated/torch.func.vmap.html#torch.func.vmap).\n",
+    "\n",
+    "The main idea is that search algorithms in `evotorch.algorithms.functional` are pure functional implementations, and thus can easily be transformed (using `vmap()`) to operate on multiple populations stored as _batches of populations_ (or batches of batches of populations, and so on). With the help of such implementations, one can run multiple searches in parallel starting from different initial populations that cover different regions of the search space. As another example, one can run multiple searches that each have a different initial population as well as a corresponding learning rate.\n",
+    "\n",
+    "In this notebook, we demonstrate how a batch of populations, each originated from a different starting point, can be maintained so that different regions of the search space can be explored simultaneously. The key concepts covered are:\n",
+    "- the ask-and-tell interface provided by search algorithms in the `evotorch.algorithms.functional` namespace.\n",
+    "- the `@rowwise` decorator that makes it easier to write evaluation functions that can be automatically transformed using `vmap()`.\n",
+    "- running multiple searches using vectorization simply by adding a batch dimension to arguments in the ask-and-tell interface and letting EvoTorch handle the rest."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "18d7c1eb-3c93-4c05-9044-8cfd9334e5a9",
+   "metadata": {},
+   "source": [
+    "We begin by importing the necessary libraries and defining some useful variables:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c5971b63-d6b9-4c02-9669-ab2060af833c",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from evotorch.algorithms.functional import cem, cem_ask, cem_tell, pgpe, pgpe_ask, pgpe_tell\n",
+    "from evotorch.decorators import rowwise\n",
+    "\n",
+    "from datetime import datetime\n",
+    "import torch\n",
+    "from math import pi\n",
+    "\n",
+    "# Use a GPU to achieve speedups from vectorization if possible\n",
+    "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+    "# We will search for 1000-dimensional solution vectors\n",
+    "solution_length = 1000"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6461d4b6-9a70-434a-8aa1-e7be410d96d3",
+   "metadata": {},
+   "source": [
+    "## Ask-and-tell\n",
+    "Next, we implement a simple optimization loop for the commonly used Rastrigin function using the ask-and-tell interface for the Cross Entropy Method. An important detail to note is that we are directly evaluating the full population using the evaluation function `rastrigin()` so we need to implement it in a way that it operates on a population represented as a 2D Tensor."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a804967a-1845-40c6-a80d-9424edfa9052",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def rastrigin(x: torch.Tensor) -> torch.Tensor:\n",
+    "    n = x.shape[-1]\n",
+    "    A = 10.0\n",
+    "    return A * n + torch.sum((x ** 2) - (A * torch.cos(2 * pi * x)), dim=-1)\n",
+    "\n",
+    "\n",
+    "# Uniformly sample center_init in [-5.12, 5.12], the typical domain of the rastrigin function\n",
+    "center_init = (torch.rand(solution_length) * 2 - 1) * 5.12\n",
+    "# Set std_max_change to 0.1 for all solution dimensions\n",
+    "stdev_max_change = 0.1 * torch.ones(solution_length)\n",
+    "\n",
+    "cem_state = cem(\n",
+    "    # We want to minimize the evaluation results\n",
+    "    objective_sense=\"min\",\n",
+    "\n",
+    "    # `center_init` is the center point(s) of the initial search distribution(s).\n",
+    "    center_init=center_init.to(device),\n",
+    "\n",
+    "    # The standard deviation of the initial search distribution.\n",
+    "    stdev_init=10.0,\n",
+    "\n",
+    "    # We provide our batch of hyperparameter vectors as `stdev_max_change`.\n",
+    "    stdev_max_change=stdev_max_change,\n",
+    "\n",
+    "    # Solutions belonging to the top half (top 50%) of the population(s)\n",
+    "    # will be chosen as parents.\n",
+    "    parenthood_ratio=0.5,\n",
+    ")\n",
+    "\n",
+    "# We will run the evolutionary search for this many generations:\n",
+    "num_generations = 1500\n",
+    "\n",
+    "# Interval (in seconds) for printing the status:\n",
+    "report_interval = 3\n",
+    "start_time = last_report_time = datetime.now()\n",
+    "\n",
+    "for generation in range(1, 1 + num_generations):\n",
+    "    # Get a population from the evolutionary algorithm\n",
+    "    population = cem_ask(cem_state, popsize=500)\n",
+    "\n",
+    "    # Compute the fitnesses\n",
+    "    fitnesses = rastrigin(population)\n",
+    "\n",
+    "    # Inform the evolutionary algorithm of the fitnesses and get its next state\n",
+    "    cem_state = cem_tell(cem_state, population, fitnesses)\n",
+    "\n",
+    "    # If it is time to report, print the status\n",
+    "    tnow = datetime.now()\n",
+    "    if ((tnow - last_report_time).total_seconds() > report_interval) or (generation == num_generations):\n",
+    "        print(\"generation:\", generation, \"mean fitnesses:\", torch.mean(fitnesses, dim=-1))\n",
+    "        last_report_time = tnow\n",
+    "\n",
+    "print(\"time taken: \", (last_report_time - start_time).total_seconds(), \"secs\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "5e672e8e-77cf-439d-82fb-1775dbe6fc69",
+   "metadata": {},
+   "source": [
+    "## The @rowwise decorator\n",
+    "\n",
+    "Next, we modify the code above so that multiple searches can be executed simultaneously taking advantage of PyTorch's vectorization capabilities. We modify the fitness function as follows:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "44738506-c134-46cf-8268-9f88ca358643",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@rowwise\n",
+    "def rastrigin(x: torch.Tensor) -> torch.Tensor:\n",
+    "    [n] = x.shape\n",
+    "    A = 10.0\n",
+    "    return A * n + torch.sum((x ** 2) - (A * torch.cos(2 * pi * x)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2a426eb1-b2ed-4694-8184-ef8bbccbf73b",
+   "metadata": {},
+   "source": [
+    "Notice how the fitness function above is decorated via `@rowwise`. This decorator tells EvoTorch that the user has defined the function to operate on its argument `x` as a vector (i.e. a 1-dimensional tensor). This makes it conceptually easier to implement the function and helps EvoTorch safely `vmap()` it in order to apply it to populations or a batch of populations as needed. `@rowwise` ensures that:\n",
+    "\n",
+    "- if the argument `x` is indeed received as a 1-dimensional tensor, the function works as how it is defined;\n",
+    "- if the argument `x` is received as a matrix (i.e. as a 2-dimensional tensor), the operations of the function are applied for each row of the matrix;\n",
+    "- if the argument `x` is received as a tensor with 3 or more dimensions, the operations of the function are applied for each row of each matrix.\n",
+    "\n",
+    "Thanks to this, the fitness function `rastrigin` can be used as it is to evaluate a single solution (represented by a 1-dimensional tensor), a single population (represented by a 2-dimensional tensor), or a batch of populations (represented by a tensor with 3 or more dimensions).\n",
+    "\n",
+    "_Note: We don't *have* to use `@rowwise` to implement our fitness function. Indeed, since our previous definition of `rastrigin()` happens to handle any number of batch dimensions and return a fitness value for each vector, we can use it as-is for running a batch of multiple searches. However, writing the fitness function in such a general way can often be difficult and error-prone, so it is much more convenient to use `@rowwise`._"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "36be528e-fae4-4104-a27a-b255c7facab1",
+   "metadata": {},
+   "source": [
+    "## Batched (vectorized) searches\n",
+    "Using the modified rastrigin function above, we are almost ready to run a batch of searches utilizing additional vectorization over the number of searches.\n",
+    "We will run 4 searches in a batch:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "046dd528-ee4c-469a-96f9-a5bef4ca2bd5",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "batch_size = 4"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b9434b49-90c7-439f-81ae-db4ad024cb91",
+   "metadata": {},
+   "source": [
+    "For both functional `cem` and functional `pgpe`, the hyperparameter `stdev_max_change` can be given as a scalar (which then will be expanded to a vector), or as a vector (which then will be used as it is), or as a batch of vectors (which will mean that for each batch item `i`, the `i`-th `stdev_max_change` vector will be used).\n",
+    "\n",
+    "Since we consider a batch of populations in this example, let us make a batch of starting points and `stdev_max_change` vectors, meaning that each population will have its own different starting point and `stdev_max_change` hyperparameter."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "e97787a3-8ed6-4db8-8396-1cc3ae19cb59",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "center_inits = ((torch.rand(batch_size, solution_length) * 2) - 1) * 5.12\n",
+    "# uniformly sample std_max_change between 0.01 and 0.2\n",
+    "stdev_max_changes = torch.linspace(0.01, 0.2, batch_size)[:, None].expand(-1, solution_length)\n",
+    "print(center_inits.shape, stdev_max_changes.shape)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b28047db-e09a-4327-bf6d-e12b0dd9b8f1",
+   "metadata": {},
+   "source": [
+    "Next we simply provide these to the CEM state initializer and execute CEM using the ask-and-tell interface exacty as before. Internally, EvoTorch will recognize the new batch dimension and appropriately `vmap()` the fitness function for us!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "cba9629f-afc5-482f-bd36-14edb01b9de9",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "cem_state = cem(\n",
+    "    objective_sense=\"min\",\n",
+    "\n",
+    "    # The batch of vectors `starting_points` is given as our `center_init`,\n",
+    "    # that is, the center point(s) of the initial search distribution(s).\n",
+    "    center_init=center_inits.to(device),\n",
+    "\n",
+    "    # The standard deviation of the initial search distribution(s).\n",
+    "    stdev_init=10.0,\n",
+    "\n",
+    "    # We provide our batch of hyperparameter vectors as `stdev_max_change`.\n",
+    "    stdev_max_change=stdev_max_changes,\n",
+    "\n",
+    "    parenthood_ratio=0.5,\n",
+    ")\n",
+    "\n",
+    "start_time = last_report_time = datetime.now()\n",
+    "\n",
+    "for generation in range(1, 1 + num_generations):\n",
+    "    # Get a population from the evolutionary algorithm\n",
+    "    population = cem_ask(cem_state, popsize=500)\n",
+    "\n",
+    "    # Compute the fitnesses\n",
+    "    fitnesses = rastrigin(population)\n",
+    "\n",
+    "    # Inform the evolutionary algorithm of the fitnesses and get its next state\n",
+    "    cem_state = cem_tell(cem_state, population, fitnesses)\n",
+    "\n",
+    "    # If it is time to report, print the status\n",
+    "    tnow = datetime.now()\n",
+    "    if ((tnow - last_report_time).total_seconds() > report_interval) or (generation == num_generations):\n",
+    "        print(\"generation:\", generation, \"mean fitnesses:\", torch.mean(fitnesses, dim=-1))\n",
+    "        last_report_time = tnow\n",
+    "\n",
+    "print(\"time taken: \", (last_report_time - start_time).total_seconds(), \"secs\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6c6d32a5-c4b0-4497-9a92-e82a9cfdd3c1",
+   "metadata": {},
+   "source": [
+    "If this notebook is executed on a GPU, the above batched search will take less time than `batch_size` times the time taken by the single search above, particularly for larger values of `batch_size`. Here are the center points found by CEM:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b370baa4-70c6-4093-87d0-36fa63eabe19",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "cem_state.center"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "cc1ceb7d-4c6b-4cdc-a7e8-e52232819b61",
+   "metadata": {},
+   "source": [
+    "## Another example\n",
+    "\n",
+    "As another example, let us consider the functional `pgpe` algorithm.\n",
+    "For `pgpe`, `center_learning_rate` is a hyperparameter which is expected as a scalar in the non-batched case.\n",
+    "If it is provided as a vector, this means that for each batch item `i`, the `i`-th value of the `center_learning_rate` vector will be used.\n",
+    "\n",
+    "Let us build a `center_learning_rate` vector:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d64a199c-bb51-468e-80dd-393dc0accc9e",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "center_learning_rates = torch.linspace(0.001, 0.4, batch_size)\n",
+    "center_learning_rates"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0c3bcc5c-471f-49d2-a1f5-0f7ef33ccfeb",
+   "metadata": {},
+   "source": [
+    "Now we prepare the first state of our `pgpe` search:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9c3674f4-35df-406a-9574-7b1e417717f7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pgpe_state = pgpe(\n",
+    "    # We want to minimize the evaluation results.\n",
+    "    objective_sense=\"min\",\n",
+    "\n",
+    "    # The batch of vectors `starting_points` is given as our `center_init`,\n",
+    "    # that is, the center point(s) of the initial search distribution(s).\n",
+    "    center_init=center_inits.to(device),\n",
+    "\n",
+    "    # Standard deviation for the initial search distribution(s):\n",
+    "    stdev_init=10.0,\n",
+    "\n",
+    "    # We provide our `center_learning_rate` batch here:\n",
+    "    center_learning_rate=center_learning_rates,\n",
+    "\n",
+    "    # Learning rate for the standard deviation(s) of the search distribution(s):\n",
+    "    stdev_learning_rate=0.1,\n",
+    "\n",
+    "    # We use the \"centered\" ranking where the worst solution is ranked -0.5,\n",
+    "    # and the best solution is ranked +0.5:\n",
+    "    ranking_method=\"centered\",\n",
+    "\n",
+    "    # We use the ClipUp optimizer.\n",
+    "    optimizer=\"clipup\",\n",
+    "\n",
+    "    # Just like how we provide a batch of `center_learning_rate` values,\n",
+    "    # we provide a batch of `max_speed` values for ClipUp:\n",
+    "    optimizer_config={\"max_speed\": center_learning_rates * 2},\n",
+    "\n",
+    "    # Maximum relative change allowed for standard deviation(s) of the\n",
+    "    # search distribution(s):\n",
+    "    stdev_max_change=0.2,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "aad76f22-28dc-42c3-b504-34c585130c49",
+   "metadata": {},
+   "source": [
+    "Below is the main loop of the evolutionary search."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5244f8bd-b3a1-49e7-9ce3-1b78ae649f3b",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# We will run the evolutionary search for this many generations:\n",
+    "num_generations = 1500\n",
+    "\n",
+    "start_time = last_report_time = datetime.now()\n",
+    "\n",
+    "for generation in range(1, 1 + num_generations):\n",
+    "    # Get a population from the evolutionary algorithm\n",
+    "    population = pgpe_ask(pgpe_state, popsize=500)\n",
+    "\n",
+    "    # Compute the fitnesses\n",
+    "    fitnesses = rastrigin(population)\n",
+    "\n",
+    "    # Inform the evolutionary algorithm of the fitnesses and get its next state\n",
+    "    pgpe_state = pgpe_tell(pgpe_state, population, fitnesses)\n",
+    "\n",
+    "    # If it is time to report, print the status\n",
+    "    tnow = datetime.now()\n",
+    "    if ((tnow - last_report_time).total_seconds() > report_interval) or (generation == num_generations):\n",
+    "        print(\"generation:\", generation, \"mean fitnesses:\", torch.mean(fitnesses, dim=-1))\n",
+    "        last_report_time = tnow\n",
+    "\n",
+    "print(\"time taken: \", (last_report_time - start_time).total_seconds(), \"secs\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e614ed61-66ab-402a-890b-db376ba46ab8",
+   "metadata": {},
+   "source": [
+    "Here are the center points found by `pgpe`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4da70d95-8828-4aac-917b-edb0a19e304f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pgpe_state.optimizer_state.center"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.7"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/examples/notebooks/Functional_API/constrained.ipynb b/examples/notebooks/Functional_API/constrained.ipynb
new file mode 100644
index 00000000..73503208
--- /dev/null
+++ b/examples/notebooks/Functional_API/constrained.ipynb
@@ -0,0 +1,474 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "deeeb625-f81b-44a3-99dc-732651fc1c01",
+   "metadata": {},
+   "source": [
+    "# Solving constrained optimization problems\n",
+    "\n",
+    "---\n",
+    "\n",
+    "In this example, we consider the following constrained optimization problem:\n",
+    "\n",
+    "$$\n",
+    "\\begin{array}{r c}\n",
+    "\\text{maximize }   & a + b + c \\\\\n",
+    "\\text{subject to } & 2a + 3b \\leq 45 \\\\\n",
+    "                   & 5a + 2c \\leq 75 \\\\\n",
+    "                   & 3b + c \\leq 50 \\\\\n",
+    "                   & -100 \\leq a \\leq 100 \\\\\n",
+    "                   & -100 \\leq b \\leq 100 \\\\\n",
+    "                   & -100 \\leq c \\leq 100 \\\\\n",
+    "\\end{array}\n",
+    "$$\n",
+    "\n",
+    "We will now solve this optimization problem using:\n",
+    "\n",
+    "- Evolutionary computation (using PGPE with constraint penalties)\n",
+    "- Penalty method (using gradient-based search with ClipUp, with penalty multipliers iteratively incremented)\n",
+    "- Interior points method (with the help of the functional API of PyTorch and the log-barrier function of EvoTorch)\n",
+    "\n",
+    "This notebook demonstrates:\n",
+    "\n",
+    "- How to use functional algorithms of EvoTorch (evolutionary and gradient-based)\n",
+    "- How to use constraint penalization utilities of EvoTorch and use them as building blocks for constrained optimization, both for evolutionary and for gradient-based optimization"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "82423f38-9832-4dab-9b0f-00893427576f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from evotorch.algorithms.functional import pgpe, pgpe_ask, pgpe_tell, clipup, clipup_ask, clipup_tell\n",
+    "from evotorch.decorators import expects_ndim\n",
+    "from evotorch.tools import penalty, log_barrier\n",
+    "\n",
+    "import torch\n",
+    "import torch.func as tfunc\n",
+    "from typing import Union\n",
+    "from functools import partial\n",
+    "\n",
+    "from matplotlib import pyplot as plt\n",
+    "import numpy as np\n",
+    "\n",
+    "from datetime import datetime"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fd9964a8-0729-48f0-afec-9827a7cd1546",
+   "metadata": {},
+   "source": [
+    "## Fitness function implementation\n",
+    "\n",
+    "We begin by implementing our fitness function.\n",
+    "This fitness function has two arguments:\n",
+    "\n",
+    "- `penalty_multiplier`: Expected as a scalar, this value represents the multiplier for the negative penalty quantities that will be added onto the fitness. Higher values for `penalty_multiplier` will result in stronger penalizations.\n",
+    "- `x`: A 1-dimensional tensor, that represents the solution that will be evaluated.\n",
+    "\n",
+    "Notice how the fitness function below is decorated by `@expects_ndim(0, 1)`. This decorators declares that the first positional argument (`penalty_multipliers`) expects a 0-dimensional tensor, and the second positional argument (`x`) expects a 1-dimensional tensor. If any of these arguments are given with more dimensions, those extra dimensions will be considered as batch dimensions by the decorated function. This auto-batching feature is very helpful, because it allows the decorated function `f` to work with not just a single solution, but with a population of solutions (or a batch of populations of solutions) when `x` is given as an n-dimensional tensor with `n>1`. We will use this auto-batching feature with PGPE."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "4a1eddff-6e05-4037-b2eb-f37478734149",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@expects_ndim(0, 1)\n",
+    "def f(penalty_multiplier: torch.Tensor, x: torch.Tensor) -> torch.Tensor:\n",
+    "    a = x[0]\n",
+    "    b = x[1]\n",
+    "    c = x[2]\n",
+    "    objective = a + b + c\n",
+    "\n",
+    "    constraints = [\n",
+    "        [(2 * a) + (3 * b), \"<=\", 45],\n",
+    "        [(5 * a) + (2 * c), \"<=\", 75],\n",
+    "        [(3 * b) + c, \"<=\", 50],\n",
+    "        [a, \">=\", -100],\n",
+    "        [a, \"<=\", 100],\n",
+    "        [b, \">=\", -100],\n",
+    "        [b, \"<=\", 100],\n",
+    "        [c, \">=\", -100],\n",
+    "        [c, \"<=\", 100],\n",
+    "    ]\n",
+    "\n",
+    "    penalty_amount = 0.0\n",
+    "    for lhs, op, rhs in constraints:\n",
+    "        # For each constraint, we add a penalty (if there is violation)\n",
+    "        penalty_amount = penalty_amount + penalty(\n",
+    "            # Left-hand-side, comparison operator, and the right-hand-side:\n",
+    "            lhs,\n",
+    "            op,\n",
+    "            rhs,\n",
+    "            #\n",
+    "            # Because this is a function we wish to maximize, the penalties should be in the opposite direction.\n",
+    "            # Therefore, we declare the sign of the penalties as \"-\", making them negative quantities:\n",
+    "            penalty_sign=\"-\",\n",
+    "            #\n",
+    "            # There will be a penalty in the form: (linear * amount_of_violation)\n",
+    "            linear=1.0,\n",
+    "            #\n",
+    "            # There will also be a penalty in the form: (amount_of_violation ** exp)\n",
+    "            exp=2.0,\n",
+    "            #\n",
+    "            # The exponential penalties are not allowed to exceed this amount:\n",
+    "            exp_inf=5000.0,\n",
+    "            #\n",
+    "            # There will also be a penalty in the form: (step if amount_of_violation > 0 else 0)\n",
+    "            step=10000.0,\n",
+    "        )\n",
+    "\n",
+    "    # Scale the accumulated penalty by `penalty_multiplier`, add it onto the objective, then return it\n",
+    "    return objective + (penalty_amount * penalty_multiplier)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9a69f467-2838-41bb-bd93-f715255edf8f",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Evolutionary computation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7d0d8808-0d9a-43f8-a275-df87fe3b6833",
+   "metadata": {},
+   "source": [
+    "As the evolutionary algorithm, we use the functional `pgpe`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b8ef5439-ee00-40e4-9053-cb621aa4c855",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pgpe_state = pgpe(\n",
+    "    # We want to maximize the evaluation results:\n",
+    "    objective_sense=\"max\",\n",
+    "\n",
+    "    # The center point of the initial search distribution is given as 0 vector:\n",
+    "    center_init=torch.zeros(3),\n",
+    "\n",
+    "    # Learning rates for the center and the standard deviation of the\n",
+    "    # search distribution:\n",
+    "    center_learning_rate=0.1,\n",
+    "    stdev_learning_rate=0.1,\n",
+    "\n",
+    "    # The ranking method is \"centered\", which ranks the worst solution as\n",
+    "    # -0.5, and the best solution as +0.5:\n",
+    "    ranking_method=\"centered\",\n",
+    "\n",
+    "    # We use the ClipUp optimizer:\n",
+    "    optimizer=\"clipup\",\n",
+    "\n",
+    "    # In the case of this example problem, we use the following max_speed:\n",
+    "    optimizer_config={\"max_speed\": 1.0},\n",
+    "\n",
+    "    # The standard deviation for the initial search distribution:\n",
+    "    stdev_init=100.0,\n",
+    "\n",
+    "    # Relative maximum allowed change for the standard deviation.\n",
+    "    # 0.2 means that a standard deviation value cannot change more than 20%\n",
+    "    # of its original value.\n",
+    "    stdev_max_change=0.2,\n",
+    "\n",
+    "    # Minimum standard deviation value. The standard deviation is cannot shrink\n",
+    "    # to values lower than this:\n",
+    "    stdev_min=0.01,\n",
+    ")\n",
+    "\n",
+    "pgpe_state"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3474900a-1e39-4bbc-ad05-6ae377ac9143",
+   "metadata": {},
+   "source": [
+    "Main loop of the evolutionary computation:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d9c8ef56-117d-49f3-8dc2-9a77ae80af36",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "num_generations = 1000  # For how many iterations will PGPE work\n",
+    "\n",
+    "time_interval = 1  # With this period, we will print the status\n",
+    "last_report_time = datetime.now()\n",
+    "\n",
+    "# Initialize the variables that will store the best solution and the best evaluation result\n",
+    "pgpe_best_eval = float(\"-inf\")\n",
+    "pgpe_best = None\n",
+    "\n",
+    "for generation in range(1, 1 + num_generations):\n",
+    "    # Ask for a population from PGPE\n",
+    "    population = pgpe_ask(pgpe_state, popsize=1000)\n",
+    "\n",
+    "    # Compute the fitnesses\n",
+    "    fitnesses = f(1, population)\n",
+    "\n",
+    "    # Inform PGPE of the latest fitnesses, and get its next state\n",
+    "    pgpe_state = pgpe_tell(pgpe_state, population, fitnesses)\n",
+    "\n",
+    "    # From the most recent population and fitnesses, update the best solution and the best evaluation result\n",
+    "    pop_best_index = torch.argmax(fitnesses)\n",
+    "    pop_best = population[pop_best_index]\n",
+    "    pop_best_eval = fitnesses[pop_best_index]\n",
+    "    if pop_best_eval > pgpe_best_eval:\n",
+    "        pgpe_best_eval = pop_best_eval\n",
+    "        pgpe_best = population[pop_best_index, :]\n",
+    "\n",
+    "    # If it is time to report, print the status\n",
+    "    tnow = datetime.now()\n",
+    "    if ((tnow - last_report_time).total_seconds() > time_interval) or (generation == num_generations):\n",
+    "        print(\"best solution:\", pgpe_best, \"best evaluation:\", pgpe_best_eval)\n",
+    "        last_report_time = tnow"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ec14d993-fd9a-4cd7-a772-741740aa0c7b",
+   "metadata": {},
+   "source": [
+    "Best solution:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "b53479ed-2350-4fe3-ab4a-e2036aafce3d",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pgpe_best, pgpe_best_eval"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "20c93618-deb8-41dc-a06d-18dc78d96497",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Penalty method\n",
+    "\n",
+    "As an alternative to evolutionary computation, we now use the gradient-based penalty method below.\n",
+    "The main points of the penalty method are as follows:\n",
+    "\n",
+    "- Use a gradient-based search on a fitness function augmented by penalties (i.e. by penalty terms that are computed according to how much the constraints are penalized)\n",
+    "- Periodically increase the multipliers for the penalty terms (when the penalty multiplier is increased, we use the previous optimization's solution as the new starting point)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3d12cfd7-25cb-4ff5-8481-dba1d8ec8b00",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Try with these penalty multipliers, ordered from small to large:\n",
+    "penalty_multipliers = [0.1, 1, 4]\n",
+    "\n",
+    "# For each penalty multiplier, we do the search for this number of iterations:\n",
+    "num_iterations = 1000\n",
+    "\n",
+    "# Initialize the variables that will store the best solution and the best evaluation result\n",
+    "clipup_best_eval = float(\"-inf\")\n",
+    "clipup_best = None\n",
+    "\n",
+    "# Initialize the search point as a 0 vector:\n",
+    "x = torch.zeros(3)\n",
+    "\n",
+    "for penalty_multiplier in penalty_multipliers:\n",
+    "    # Initialize the ClipUp algorithm for the currently considered penalty multiplier\n",
+    "    clipup_state = clipup(center_init=x, center_learning_rate=0.1, max_speed=1.0)\n",
+    "\n",
+    "    # Optimization loop for the current penalty multiplier\n",
+    "    for iteration in range(1, 1 + num_iterations):\n",
+    "        # Ask ClipUp for the current search point\n",
+    "        x = clipup_ask(clipup_state)\n",
+    "\n",
+    "        # Compute the gradient and the fitness\n",
+    "        gradient, fitness = tfunc.grad_and_value(partial(f, penalty_multiplier))(x)\n",
+    "\n",
+    "        # Update the best-known solution so far\n",
+    "        if fitness > clipup_best_eval:\n",
+    "            clipup_best_eval = fitness\n",
+    "            clipup_best = x\n",
+    "\n",
+    "        # Inform ClipUp of the latest gradient, and get its next state\n",
+    "        clipup_state = clipup_tell(clipup_state, follow_grad=gradient)\n",
+    "\n",
+    "    # After each optimization loop, print the best known solution\n",
+    "    print(\"best solution:\", clipup_best, \" best evaluation:\", clipup_best_eval)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b09576f6-0a84-4572-a36f-9a6c4b3c7f4f",
+   "metadata": {},
+   "source": [
+    "---\n",
+    "\n",
+    "## Interior points method"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b8d18eca-040e-4156-a26f-46af1375c886",
+   "metadata": {},
+   "source": [
+    "Although, as its name implies, the main focus of EvoTorch is evolutionary computation, we now demonstrate that it is possible to implement an interior points method by combining:\n",
+    "\n",
+    "- `torch.func.grad`\n",
+    "- `torch.func.hessian`\n",
+    "- `evotorch.tools.log_barrier`\n",
+    "\n",
+    "The `grad(...)` and `hessian(...)` functions are basic building blocks for implementing a Newton-Raphson search. We penalize proximities to the infeasible regions with the help of `log_barrier(...)`.\n",
+    "\n",
+    "We begin with a modified implementation of our fitness function, where `evotorch.tools.penalty(...)` are replaced by `evotorch.tools.log_barrier(...)`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "40a5bc1c-e6f2-4868-a426-5c24b4b62924",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "@expects_ndim(0, 1)\n",
+    "def f_with_log_barriers(sharpness: torch.Tensor, x: torch.Tensor) -> torch.Tensor:\n",
+    "    a = x[0]\n",
+    "    b = x[1]\n",
+    "    c = x[2]\n",
+    "    objective = a + b + c\n",
+    "\n",
+    "    constraints = [\n",
+    "        [(2 * a) + (3 * b), \"<=\", 45],\n",
+    "        [(5 * a) + (2 * c), \"<=\", 75],\n",
+    "        [(3 * b) + c, \"<=\", 50],\n",
+    "        [a, \">=\", -100],\n",
+    "        [a, \"<=\", 100],\n",
+    "        [b, \">=\", -100],\n",
+    "        [b, \"<=\", 100],\n",
+    "        [c, \">=\", -100],\n",
+    "        [c, \"<=\", 100],\n",
+    "    ]\n",
+    "\n",
+    "    penalty_amount = 0.0\n",
+    "    for lhs, op, rhs in constraints:\n",
+    "        penalty_amount = penalty_amount + log_barrier(lhs, op, rhs, penalty_sign=\"-\", sharpness=sharpness)\n",
+    "\n",
+    "    # Return the objective with the log-barrier penalties added onto it.\n",
+    "    # Notice that in the end, we are inverting the sign of the returned quantity.\n",
+    "    # This will allow us to implement the Newton-Raphson's search method from a minimization perspective.\n",
+    "    return -(objective + penalty_amount)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "f4128f4f-d8ad-4be9-9d81-968406355f8f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Start the search from the 0 vector:\n",
+    "x = torch.zeros(3)\n",
+    "\n",
+    "# Try with these sharpness values for the log barrier, ordered from small to large:\n",
+    "sharpness_values = [1, 10, 100, 1000, 10000]\n",
+    "\n",
+    "# Learning rate for when taking a step\n",
+    "learning_rate = 0.1\n",
+    "\n",
+    "# An identity matrix multiplied by the constant below will be added to the Hessian matrix.\n",
+    "# When a diagonal element of the Hessian matrix is 0 because of numerical issues, this trick will allow the\n",
+    "# algorithm to still take a step.\n",
+    "I_multiplier = 0.01"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "19c0cfad-3ae3-4c3b-b6a9-ddc99051c3fe",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# With this interval (in seconds), we wish to report the status\n",
+    "reporting_interval = 1\n",
+    "last_report_time = datetime.now()\n",
+    "\n",
+    "# The interior points method will run for this amount of iterations:\n",
+    "num_iterations = 200\n",
+    "\n",
+    "for sharpness in sharpness_values:\n",
+    "    print(\"sharpness:\", sharpness)\n",
+    "    print()\n",
+    "\n",
+    "    for iteration in range(1, 1 + num_iterations):\n",
+    "        # Compute the gradient and the solution cost\n",
+    "        g, cost = tfunc.grad_and_value(partial(f_with_log_barriers, sharpness))(x)\n",
+    "\n",
+    "        # Compute the Hessian matrix\n",
+    "        H = tfunc.hessian(partial(f_with_log_barriers, sharpness))(x)\n",
+    "\n",
+    "        # Add the identity matrix multiplied by a constant to the Hessian\n",
+    "        H = H + (I_multiplier * torch.eye(H.shape[-1]))\n",
+    "\n",
+    "        # Take the inverse of the Hessian matrix\n",
+    "        invH = torch.linalg.inv(H)\n",
+    "\n",
+    "        # Move the center point\n",
+    "        x = x - learning_rate * (invH @ g)\n",
+    "\n",
+    "        # If it is time to report, print the status\n",
+    "        tnow = datetime.now()\n",
+    "        if (tnow - last_report_time).total_seconds() > reporting_interval:\n",
+    "            print(\"Iteration:\", iteration, \"  Gradient norm:\", torch.linalg.norm(g), \"  Solution cost:\", cost)\n",
+    "            last_report_time = tnow\n",
+    "\n",
+    "    # Print the current search point\n",
+    "    print()\n",
+    "    print(\"x:\", x)\n",
+    "    print()"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.18"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/examples/notebooks/Functional_API/problem.ipynb b/examples/notebooks/Functional_API/problem.ipynb
new file mode 100644
index 00000000..e5103db7
--- /dev/null
+++ b/examples/notebooks/Functional_API/problem.ipynb
@@ -0,0 +1,261 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "e89116a2-a149-487c-8e8f-7dab78ded7ff",
+   "metadata": {},
+   "source": [
+    "# Solving reinforcement learning tasks using functional evolutionary algorithms\n",
+    "\n",
+    "The functional implementations of evolutionary algorithms can interact with the object-oriented `Problem` API of EvoTorch. To demonstrate this, we instantiate a `GymNE` problem configured to work on the reinforcement learning task `CartPole-v1`, and we use the functional `pgpe` algorithm to solve it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0286d0f6-8879-4722-902d-1caeb85306bc",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from evotorch.algorithms.functional import pgpe, pgpe_ask, pgpe_tell\n",
+    "from evotorch.neuroevolution import GymNE\n",
+    "from datetime import datetime\n",
+    "import torch"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "eb10227b-523d-4ca5-a871-aca97063eba4",
+   "metadata": {},
+   "source": [
+    "Below, we instantiate the reinforcement learning problem."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9af709e7-796a-47b1-8501-123e3f9fbec1",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "problem = GymNE(\n",
+    "    # The id of the gymnasium task:\n",
+    "    \"CartPole-v1\",\n",
+    "\n",
+    "    # Policy architecture to use.\n",
+    "    # This can also be given as a subclass of `torch.nn.Module`, or an instantiated\n",
+    "    # `torch.nn.Module` object. For simplicity, we use a basic feed-forward neural\n",
+    "    # network, that can be expressed as a string.\n",
+    "    \"Linear(obs_length, 16) >> Tanh() >> Linear(16, act_length)\",\n",
+    "\n",
+    "    # Setting `observation_normalization` as True means that stats regarding\n",
+    "    # observations will be collected during each population evaluation\n",
+    "    # process, and those stats will be used to normalize the future\n",
+    "    # observation data (that will be given as input to the policy).\n",
+    "    observation_normalization=True,\n",
+    "\n",
+    "    # Number of actors to be used. Can be an integer.\n",
+    "    # The string \"max\" means that the number of actors will be equal to the\n",
+    "    # number of CPUs.\n",
+    "    num_actors=\"max\",\n",
+    ")\n",
+    "\n",
+    "problem"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7d55d935-07fe-423a-b75b-befae8c97a21",
+   "metadata": {},
+   "source": [
+    "Now that we have instantiated our problem, we make a callable evaluator object from it. This callable evaluator, named `f`, behaves like a function `f(x)`, where `x` can be a single solution (represented by a 1-dimensional tensor), or a population (represented by a 2-dimensional tensor where each row is a solution), or a batch of populations (represented by a tensor with at least 3 dimensions). Upon receiving its argument `x`, `f` uses the problem object to evaluate the solution(s), and return the evalution result(s)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "07619fad-0604-42ce-af08-3417917a3dd7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "f = problem.make_callable_evaluator()\n",
+    "f"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7ea2af7d-78ee-46e5-bfa4-f959b5565b3c",
+   "metadata": {},
+   "source": [
+    "Hyperparameters for `pgpe`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "9a95f546-1c15-4eed-9d2f-6fceb2d51a64",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "popsize = 100\n",
+    "center_init = problem.make_zeros(num_solutions=1)[0]\n",
+    "max_speed = 0.15\n",
+    "center_learning_rate = max_speed * 0.75\n",
+    "radius = max_speed * 15\n",
+    "stdev_learning_rate = 0.1\n",
+    "stdev_max_change = 0.2"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "82c77b1e-318b-450d-9d3c-c5ad6cede329",
+   "metadata": {},
+   "source": [
+    "We prepare the `pgpe` algorithm and get its initial state:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "df014e8e-a878-4ec1-80d4-392156e5ffbe",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pgpe_state = pgpe(\n",
+    "    # Center of the initial search distribution\n",
+    "    center_init=center_init,\n",
+    "\n",
+    "    # Radius for the initial search distribution\n",
+    "    radius_init=radius,\n",
+    "\n",
+    "    # Learning rates for when updating the center and the standard deviation\n",
+    "    # of the search distribution\n",
+    "    center_learning_rate=center_learning_rate,\n",
+    "    stdev_learning_rate=stdev_learning_rate,\n",
+    "\n",
+    "    # Maximum relative amount of change for standard deviation.\n",
+    "    # Setting this as 0.2 means that an item of the standard deviation vector\n",
+    "    # will not be allowed to change more than the 20% of its original value.\n",
+    "    stdev_max_change=stdev_max_change,\n",
+    "\n",
+    "    # The ranking method to be used.\n",
+    "    # \"centered\" is a ranking method which assigns the rank -0.5 to the worst\n",
+    "    # solution, and +0.5 to the best solution.\n",
+    "    ranking_method=\"centered\",\n",
+    "\n",
+    "    # The optimizer to be used. Can be \"clipup\", \"adam\", or \"sgd\".\n",
+    "    optimizer=\"clipup\",\n",
+    "\n",
+    "    # Optimizer-specific hyperparameters:\n",
+    "    optimizer_config={\"max_speed\": max_speed},\n",
+    "\n",
+    "    # Whether or not symmetric sampling will be used.\n",
+    "    symmetric=True,\n",
+    "\n",
+    "    # We want to maximize the evaluation results.\n",
+    "    # In the case of reinforcement learning tasks declared via `GymNE`,\n",
+    "    # evaluation results represent the cumulative rewards.\n",
+    "    objective_sense=\"max\",\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "84e6a2be-c00d-4fee-b4f4-98b3156a90fb",
+   "metadata": {},
+   "source": [
+    "Below is the main loop of the evolutionary search."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "6adefd82-e819-49f4-b08d-7507d5efe264",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# We run the evolutionary search for this many generations:\n",
+    "num_generations = 40\n",
+    "\n",
+    "last_report_time = datetime.now()\n",
+    "\n",
+    "# This is the interval (in seconds) for reporting the status:\n",
+    "reporting_interval = 1\n",
+    "\n",
+    "for generation in range(1, 1 + num_generations):\n",
+    "    # Get a population from the pgpe algorithm\n",
+    "    population = pgpe_ask(pgpe_state, popsize=popsize)\n",
+    "\n",
+    "    # Evaluate the fitnesses\n",
+    "    fitnesses = f(population)\n",
+    "\n",
+    "    # Inform pgpe of the fitnesses and get its next state\n",
+    "    pgpe_state = pgpe_tell(pgpe_state, population, fitnesses)\n",
+    "\n",
+    "    # If it is time to report, print the status\n",
+    "    tnow = datetime.now()\n",
+    "    if (tnow - last_report_time).total_seconds() > reporting_interval:\n",
+    "        print(\"generation:\", generation, \"median eval:\", torch.median(fitnesses))\n",
+    "        last_report_time = tnow"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7b2e117d-2bce-49ce-b912-745f5c707bf5",
+   "metadata": {},
+   "source": [
+    "Here is the center point of the most recent search distribution:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "81a925c3-6cb2-4095-a01b-7c9ea275b575",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "x = pgpe_state.optimizer_state.center\n",
+    "x"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c5e5215d-5ab4-41af-b421-db8684f9bfae",
+   "metadata": {},
+   "source": [
+    "Now, we visualize the agent evolved by our functional `pgpe`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "ce8321b8-7d87-430d-a144-2b0d3fd0d316",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "problem.visualize(x)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.18"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/examples/notebooks/README.md b/examples/notebooks/README.md
index e2eeeb9c..09274655 100644
--- a/examples/notebooks/README.md
+++ b/examples/notebooks/README.md
@@ -25,3 +25,4 @@ The notebook examples are listed below:
 - **[Genetic Programming](Genetic_Programming.ipynb):** demonstrates genetic programming with GPU support.
 - **[Feature Space Illumination with MAPElites](Feature_Space_Illumination_with_MAPElites.ipynb):** demonstrates how one can use the MAPElites algorithm to obtain a population organized according to the features of the solutions.
 - **[Evolving_Objects](Evolving_Objects.ipynb):** demonstrates how to declare and solve optimization problems with custom structured solutions (storing not-necessarily numeric data and/or having varying lengths). In more details, this example evolves simple gaits for the `Ant-v4` reinforcement learning environment using a custom solution encoding such that each solution contains multiple sublists of integers.
+- **[Functional API](Functional_API/)**: As an alternative to its object-oriented stateful API, EvoTorch provides an API that conforms to the functional programming paradigm. This functional API has its own advantages like being able to work on not just a single population, but on a batch of populations. This sub-directory contains examples demonstrating the functional API of EvoTorch.
diff --git a/src/evotorch/__init__.py b/src/evotorch/__init__.py
index 99af43b5..d6226983 100644
--- a/src/evotorch/__init__.py
+++ b/src/evotorch/__init__.py
@@ -29,7 +29,7 @@
 # Import the subpackages of EvoTorch
 from . import tools
 from . import core
-from .core import Problem, Solution, SolutionBatch
+from .core import Problem, ProblemBoundEvaluator, Solution, SolutionBatch
 
 # isort: on
 
@@ -57,6 +57,7 @@
     "__author__",
     "__email__",
     "Problem",
+    "ProblemBoundEvaluator",
     "Solution",
     "SolutionBatch",
     "algorithms",
diff --git a/src/evotorch/algorithms/functional/__init__.py b/src/evotorch/algorithms/functional/__init__.py
new file mode 100644
index 00000000..25ab4683
--- /dev/null
+++ b/src/evotorch/algorithms/functional/__init__.py
@@ -0,0 +1,331 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Purely functional implementations of optimization algorithms.
+
+**Reasoning.**
+PyTorch has a functional API within its namespace `torch.func`.
+In addition to allowing one to choose a pure functional programming style,
+`torch.func` enables powerful batched operations via `torch.func.vmap`.
+
+To be able to work with the functional programming style of `torch.func`,
+EvoTorch introduces functional implementations of evolutionary search
+algorithms and optimizers within the namespace
+`evotorch.algorithms.functional`.
+These algorithm implementations are compatible with `torch.func.vmap`,
+and therefore they can perform batched evolutionary searches
+(e.g. they can work on not just a single population, but on batches
+of populations). Such batched searches can be helpful in the following
+scenarios:
+
+**Scenario 1: Nested optimization.**
+The main optimization problem at hand might have internal optimization
+problems. Therefore, when the main optimization problem's fitness function is
+reached, the internal optimization problem will have to be solved for each
+solution of the main problem. In such a scenario, one might want to use a
+functional evolutionary search for the inner optimization problem, so that
+a batch of populations is formed where each batch item represents a separate
+population associated with a separate solution of the main problem.
+
+**Scenario 2: Batched hyperparameter search.**
+If the user is interested in using a search algorithm that has a functional
+implementation, the user might want to implement a hyperparameter search
+in such a way that there is a batch of hyperparameters (instead of just
+a single set of hyperparameters), and the search is performed on a
+batch of populations. In such a setting, each population within the population
+batch is associated with a different hyperparameter set within the
+hyperparameter batch.
+
+**Example: cross entropy method.**
+Let us assume that we have the following fitness function, whose output we
+wish to minimize:
+
+```python
+import torch
+
+
+def f(x: torch.Tensor) -> torch.Tensor:
+    assert x.ndim == 2, "Please pass `x` as a 2-dimensional tensor"
+    return torch.sum(x**2, dim=-1)
+```
+
+Let us initialize our search from a random point:
+
+```python
+solution_length = 1000
+center_init = torch.randn(solution_length, dtype=torch.float32) * 10
+```
+
+Now we can initialize our cross entropy method like this:
+
+```python
+from evotorch.algorithms.functional import cem, cem_ask, cem_tell
+
+state = cem(
+    #
+    # Center point of the initial search distribution:
+    center_init=center_init,
+    #
+    #
+    # Standard deviation of the initial search distribution:
+    stdev_init=10.0,
+    #
+    #
+    # Top half of the population are to be chosen as parents:
+    parenthood_ratio=0.5,
+    #
+    #
+    # We wish to minimize the fitnesses:
+    objective_sense="min",
+    #
+    #
+    # A standard deviation item is not allowed to change more than
+    # 1% of its original value:
+    stdev_max_change=0.01,
+)
+```
+
+At this point, we have an initial state of our cross entropy method search,
+stored by the variable `state`. Now, we can implement a loop and perform
+multiple generations of evolutionary search like this:
+
+```python
+num_generations = 1000
+
+for generation in range(1, 1 + num_generations):
+    # Ask for a new population (of size 1000) from cross entropy method
+    solutions = cem_ask(state, popsize=1000)
+
+    # At this point, `solutions` is a regular PyTorch tensor, ready to be
+    # passed to the function `f`.
+    # `solutions` is a 2-dimensional tensor of shape (N, L) where `N`
+    # is the number of solutions, and `L` is the length of a solution.
+    # Our example fitness function `f` is implemented in such a way that
+    # we can pass our 2-dimensional `solutions` tensor into it directly.
+    # We will receive `fitnesses` as a 1-dimensional tensor of length `N`.
+    fitnesses = f(solutions)
+
+    # Let us report the mean of fitnesses to see the progress
+    print("Generation:", generation, "  Mean of fitnesses:", torch.mean(fitnesses))
+
+    # Now, we inform cross entropy method of the latest state of the search,
+    # the latest population, and the latest fitnesses, so that it can give us
+    # the next state of the search.
+    state = cem_tell(state, solutions, fitnesses)
+```
+
+At the end of the evolutionary search (or, actually, at any point), one can
+analyze the `state` tuple to get information about the current status of the
+search distribution. These state tuples are named tuples, and therefore, the
+data they store are labeled.
+In the case of cross entropy method, the latest center of the search
+distribution can be obtained via:
+
+```python
+latest_center = state.center
+
+# Note, in the case of pgpe, this would be:
+# latest_center = state.optimizer_state.center
+```
+
+**Notes on manipulating the evolutionary search.**
+If, at any point of the search, you would like to change a hyperparameter,
+you can do so by creating a modified copy of your latest `state` tuple,
+and pass it to the ask method of your evolutionary search (which,
+in the case of cross entropy method, is `cem_ask`).
+Similarly, if you wish to change the center point of the search, you can
+pass a modified state tuple containing the new center point to `cem_ask`.
+
+**Notes on batching.**
+In regular non-batched cases, functional search algorithms expect the
+`center_init` argument as a 1-dimensional tensor. If `center_init` is given
+as a tensor with 2 or more dimensions, the extra leftmost dimensions will
+be considered as batch dimensions, and therefore the evolutionary search
+itself will be batched (which means that the ask method of the search
+algorithm will return a batch of populations). Furthermore, certain
+hyperparameters can also be given in batches. See the specific
+documentation of the functional algorithms to see which hyperparameters
+support batching.
+
+When working with batched populations, it is important to make sure that
+the fitness function can work with arbitrary amount of dimensions (not just
+2 dimensions). One way to implement such fitness functions with the help
+of the [rowwise][evotorch.decorators.rowwise] decorator:
+
+```python
+from evotorch.decorators import rowwise
+
+
+@rowwise
+def f(x: torch.Tensor) -> torch.Tensor:
+    return torch.sum(x**2)
+```
+
+When decorated with `@rowwise`, we can implement our function as if the
+tensor `x` is a 1-dimensional tensor. If the decorated `f` receives `x`
+not as a vector, but as a matrix, then it will do the same operation
+on each row of the matrix, in a vectorized manner. If `x` has 3 or
+more dimensions, they will be considered as extra batch dimensions,
+affecting the shape of the resulting tensor.
+
+**Example: gradient-based search.**
+This namespace also provides functional implementations of various gradient
+based optimizers. The reasoning behind the existence of these implementations
+is two-fold: (i) these optimizers are used by the functional `pgpe`
+implementation (for handling the momentum); and (ii) having these optimizers
+with a similar API allows user to switch back-and-forth between evolutionary
+and gradient-based search for solving the same problem, hopefully without
+having to change the code too much.
+
+Let us consider the same fitness function again, in its `@rowwise` form so
+that it can work with a single vector or a batch of such vectors:
+
+```python
+from evotorch.decorators import rowwise
+
+
+@rowwise
+def f(x: torch.Tensor) -> torch.Tensor:
+    return torch.sum(x**2)
+```
+
+To solve this optimization problem using the Adam optimizer, one can
+do the following:
+
+```python
+from evotorch.algorithms.functional import adam, adam_ask, adam_tell
+from torch.func import grad
+
+# Prepare an initial search point
+solution_length = 1000
+center_init = torch.randn(solution_length, dtype=torch.float32) * 10
+
+
+# Initialize the Adam optimizer
+state = adam(
+    center_init=center_init,
+    center_learning_rate=0.001,
+    beta1=0.9,
+    beta2=0.999,
+    epsilon=1e-8,
+)
+
+
+num_iterations = 1000
+
+for iteration in range(1, 1 + num_iterations):
+    # Get the current search point of the Adam search
+    center = adam_ask(state)
+
+    # Get the gradient.
+    # Negative, because we want to minimize f.
+    gradient = -(grad(f)(center))
+
+    # Inform the Adam optimizer of the gradient to follow, and get the next
+    # state of the search
+    state = adam_tell(state, follow_grad=gradient)
+
+
+# Store the final solution
+final_solution = adam_ask(state)
+# or, alternatively:
+# final_solution = state.center
+```
+
+**Solving a stateful Problem object using functional algorithms.**
+If you wish to solve a stateful [Problem][evotorch.core.Problem]
+using a functional optimization algorithm, you can obtain a callable evaluator
+out of that Problem object, and then use it for computing the fitnesses.
+See the following example:
+
+```python
+from evotorch import Problem, SolutionBatch
+from evotorch.algorithms.functional import cem, cem_ask, cem_tell
+
+
+class MyProblem(Problem):
+    def __init__(self):
+        ...
+
+    def _evaluate_batch(self, batch: SolutionBatch):
+        # Stateful batch evaluation code goes here
+        ...
+
+
+# Instantiate the problem
+problem = MyProblem()
+
+# Make a callable fitness evaluator
+fproblem = problem.make_callable_evaluator()
+
+# Make an initial solution
+center_init = torch.randn(problem.solution_length, dtype=torch.float32) * 10
+
+
+# Prepare a cross entropy method search
+state = cem(
+    center_init=center_init,
+    stdev_init=10.0,
+    parenthood_ratio=0.5,
+    objective_sense="min",
+    stdev_max_change=0.01,
+)
+
+
+num_generations = 1000
+for generation in range(1, 1 + num_generations):
+    # Get a population
+    solutions = cem_ask(state, popsize=1000)
+
+    # Call the evaluator to get the fitnesses
+    fitnesses = fproblem(solutions)
+
+    # Let us report the mean of fitnesses to see the progress
+    print("Generation:", generation, "  Mean of fitnesses:", torch.mean(fitnesses))
+
+    # Now, we inform cross entropy method of the latest state of the search,
+    # the latest population, and the latest fitnesses, so that it can give us
+    # the next state of the search.
+    state = cem_tell(state, solutions, fitnesses)
+
+
+# Center of the latest search distribution
+latest_center = state.center
+```
+"""
+
+from .funcadam import adam, adam_ask, adam_tell
+from .funccem import cem, cem_ask, cem_tell
+from .funcclipup import clipup, clipup_ask, clipup_tell
+from .funcpgpe import pgpe, pgpe_ask, pgpe_tell
+from .funcsgd import sgd, sgd_ask, sgd_tell
+
+__all__ = [
+    "adam",
+    "adam_ask",
+    "adam_tell",
+    "cem",
+    "cem_ask",
+    "cem_tell",
+    "clipup",
+    "clipup_ask",
+    "clipup_tell",
+    "pgpe",
+    "pgpe_ask",
+    "pgpe_tell",
+    "sgd",
+    "sgd_ask",
+    "sgd_tell",
+]
diff --git a/src/evotorch/algorithms/functional/funcadam.py b/src/evotorch/algorithms/functional/funcadam.py
new file mode 100644
index 00000000..09761b97
--- /dev/null
+++ b/src/evotorch/algorithms/functional/funcadam.py
@@ -0,0 +1,172 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import NamedTuple
+
+import torch
+
+from ...decorators import expects_ndim
+from ...tools import BatchableScalar, BatchableVector
+
+
+class AdamState(NamedTuple):
+    center: torch.Tensor
+    center_learning_rate: torch.Tensor
+    beta1: torch.Tensor
+    beta2: torch.Tensor
+    epsilon: torch.Tensor
+    m: torch.Tensor
+    v: torch.Tensor
+    t: torch.Tensor
+
+
+def adam(
+    *,
+    center_init: BatchableVector,
+    center_learning_rate: BatchableScalar = 0.001,
+    beta1: BatchableScalar = 0.9,
+    beta2: BatchableScalar = 0.999,
+    epsilon: BatchableScalar = 1e-8,
+) -> AdamState:
+    """
+    Initialize an Adam optimizer and return its initial state.
+
+    Reference:
+
+        Kingma, D. P. and J. Ba (2015).
+        Adam: A method for stochastic optimization.
+        In Proceedings of 3rd International Conference on Learning Representations.
+
+    Args:
+        center_init: Starting point for the Adam search.
+            Expected as a PyTorch tensor with at least 1 dimension.
+            If there are 2 or more dimensions, the extra leftmost dimensions
+            are interpreted as batch dimensions.
+        center_learning_rate: Learning rate (i.e. the step size) for the Adam
+            updates. Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+        beta1: beta1 hyperparameter for the Adam optimizer.
+            Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+        beta2: beta2 hyperparameter for the Adam optimizer.
+            Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+        epsilon: epsilon hyperparameter for the Adam optimizer.
+            Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+    Returns:
+        A named tuple of type `AdamState`, representing the initial state
+        of the Adam optimizer.
+    """
+    center_init = torch.as_tensor(center_init)
+    dtype = center_init.dtype
+    device = center_init.device
+
+    def as_tensor(x) -> torch.Tensor:
+        return torch.as_tensor(x, dtype=dtype, device=device)
+
+    center_learning_rate = as_tensor(center_learning_rate)
+    beta1 = as_tensor(beta1)
+    beta2 = as_tensor(beta2)
+    epsilon = as_tensor(epsilon)
+
+    m = torch.zeros_like(center_init)
+    v = torch.zeros_like(center_init)
+    t = torch.zeros(center_init.shape[:-1], dtype=dtype, device=device)
+
+    return AdamState(
+        center=center_init,
+        center_learning_rate=center_learning_rate,
+        beta1=beta1,
+        beta2=beta2,
+        epsilon=epsilon,
+        m=m,
+        v=v,
+        t=t,
+    )
+
+
+@expects_ndim(1, 1, 0, 0, 0, 0, 1, 1, 0)
+def _adam_step(
+    g: torch.Tensor,
+    center: torch.Tensor,
+    center_learning_rate: torch.Tensor,
+    beta1: torch.Tensor,
+    beta2: torch.Tensor,
+    epsilon: torch.Tensor,
+    m: torch.Tensor,
+    v: torch.Tensor,
+    t: torch.Tensor,
+) -> tuple:
+    t = t + 1
+    m = (beta1 * m) + ((1 - beta1) * g)
+    v = (beta2 * v) + ((1 - beta2) * (g**2.0))
+    mhat = m / (1 - (beta1**t))
+    vhat = v / (1 - (beta2**t))
+    center_update = center_learning_rate * mhat / (torch.sqrt(vhat) + epsilon)
+    center = center + center_update
+    return center, m, v, t
+
+
+def adam_ask(state: AdamState) -> torch.Tensor:
+    """
+    Get the search point stored by the given `AdamState`.
+
+    Args:
+        state: The current state of the Adam optimizer.
+    Returns:
+        The search point as a 1-dimensional tensor in the non-batched case,
+        or as a multi-dimensional tensor if the Adam search is batched.
+    """
+    return state.center
+
+
+def adam_tell(state: AdamState, *, follow_grad: BatchableVector) -> AdamState:
+    """
+    Tell the Adam optimizer the current gradient to get its next state.
+
+    Args:
+        state: The current state of the Adam optimizer.
+        follow_grad: Gradient at the current point of the Adam search.
+            Can be a 1-dimensional tensor in the non-batched case,
+            or a multi-dimensional tensor in the batched case.
+    Returns:
+        The updated state of Adam with the given gradient applied.
+    """
+    new_center, new_m, new_v, new_t = _adam_step(
+        follow_grad,
+        state.center,
+        state.center_learning_rate,
+        state.beta1,
+        state.beta2,
+        state.epsilon,
+        state.m,
+        state.v,
+        state.t,
+    )
+
+    return AdamState(
+        center=new_center,
+        center_learning_rate=state.center_learning_rate,
+        beta1=state.beta1,
+        beta2=state.beta2,
+        epsilon=state.epsilon,
+        m=new_m,
+        v=new_v,
+        t=new_t,
+    )
diff --git a/src/evotorch/algorithms/functional/funccem.py b/src/evotorch/algorithms/functional/funccem.py
new file mode 100644
index 00000000..6b118712
--- /dev/null
+++ b/src/evotorch/algorithms/functional/funccem.py
@@ -0,0 +1,289 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import Iterable, NamedTuple, Optional, Union
+
+import torch
+
+from ...decorators import expects_ndim
+from ...distributions import SeparableGaussian, make_functional_grad_estimator, make_functional_sampler
+from ...tools import BatchableScalar, BatchableVector, modify_vector
+
+
+class CEMState(NamedTuple):
+    center: torch.Tensor
+    stdev: torch.Tensor
+    stdev_min: torch.Tensor
+    stdev_max: torch.Tensor
+    stdev_max_change: torch.Tensor
+    parenthood_ratio: float
+    maximize: bool
+
+
+def cem(
+    *,
+    center_init: BatchableVector,
+    parenthood_ratio: float,
+    objective_sense: str,
+    stdev_init: Optional[Union[float, BatchableVector]] = None,
+    radius_init: Optional[Union[float, BatchableScalar]] = None,
+    stdev_min: Optional[Union[float, BatchableVector]] = None,
+    stdev_max: Optional[Union[float, BatchableVector]] = None,
+    stdev_max_change: Optional[Union[float, BatchableVector]] = None,
+) -> CEMState:
+    """
+    Get an initial state for the cross entropy method (CEM).
+
+    The received initial state, a named tuple of type `CEMState`, is to be
+    passed to the function `cem_ask(...)` to receive the solutions belonging
+    to the first generation of the evolutionary search.
+
+    References:
+
+        Rubinstein, R. (1999). The cross-entropy method for combinatorial
+        and continuous optimization.
+        Methodology and computing in applied probability, 1(2), 127-190.
+
+        Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P. (2016).
+        Benchmarking deep reinforcement learning for continuous control.
+        International conference on machine learning. PMLR, 2016.
+
+    Args:
+        center_init: Center (i.e. mean) of the initial search distribution.
+            Expected as a PyTorch tensor with at least 1 dimension.
+            If the given `center` tensor has more than 1 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        stdev_init: Standard deviation of the initial search distribution.
+            If this is given as a scalar `s`, the standard deviation for the
+            search distribution will be interpreted as `[s, s, ..., s]` whose
+            length is the same with the length of `center_init`.
+            If this is given as a 1-dimensional tensor, the given tensor will
+            be interpreted as the standard deviation vector.
+            If this is given as a tensor with at least 2 dimensions, the extra
+            leftmost dimension(s) will be interpreted as batch dimensions.
+            If you wish to express the coverage area of the initial search
+            distribution in terms of "radius" instead, you can leave
+            `stdev_init` as None, and provide a value for the argument
+            `radius_init`.
+        radius_init: Radius for the initial search distribution, representing
+            the euclidean norm for the first standard deviation vector.
+            Setting this value as `r` means that the standard deviation
+            vector will be initialized as a vector `[s, s, ..., s]`
+            whose norm will be equal to `r`. In the non-batched case,
+            `radius_init` is expected as a scalar value.
+            If `radius_init` is given as a tensor with 1 or more
+            dimensions, those dimensions will be considered as batch
+            dimensions. If you wish to express the coverage are of the initial
+            search distribution in terms of the standard deviation values
+            instead, you can leave `radius_init` as None, and provide a value
+            for the argument `stdev_init`.
+        parenthood_ratio: Proportion of the solutions that will be chosen as
+            the parents for the next generation. For example, if this is
+            given as 0.5, the top 50% of the solutions will be chosen as
+            parents.
+        objective_sense: Expected as a string, either as 'min' or as 'max'.
+            Determines if the goal is to minimize or is to maximize.
+        stdev_min: Minimum allowed standard deviation for the search
+            distribution. Can be given as a scalar or as a tensor with one or
+            more dimensions. When given with at least 2 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        stdev_max: Maximum allowed standard deviation for the search
+            distribution. Can be given as a scalar or as a tensor with one or
+            more dimensions. When given with at least 2 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        stdev_max_change: Maximum allowed change for the standard deviation
+            vector. If this is given as a scalar, this scalar will serve as a
+            limiter for the change of the entire standard deviation vector.
+            For example, a scalar value of 0.2 means that the elements of the
+            standard deviation vector cannot change more than the 20% of their
+            original values. If this is given as a vector (i.e. as a
+            1-dimensional tensor), each element of `stdev_max_change` will
+            serve as a limiter to its corresponding element within the standard
+            deviation vector. If `stdev_max_change` is given as a tensor with
+            at least 2 dimensions, the extra leftmost dimension(s) will be
+            interpreted as batch dimensions.
+            If you do not wish to have such a limiter, you can leave this as
+            None.
+    Returns:
+        A named tuple, of type `CEMState`, storing the hyperparameters and the
+        initial state of the cross entropy method.
+    """
+    from .misc import _get_stdev_init
+
+    center_init = torch.as_tensor(center_init)
+    if center_init.ndim < 1:
+        raise ValueError(
+            "The center of the search distribution for the functional CEM was expected"
+            " as a tensor with at least 1 dimension."
+            f" However, the encountered `center_init` is {center_init}, of shape {center_init.shape}."
+        )
+
+    solution_length = center_init.shape[-1]
+    if solution_length == 0:
+        raise ValueError("Solution length cannot be 0")
+
+    stdev_init = _get_stdev_init(center_init=center_init, stdev_init=stdev_init, radius_init=radius_init)
+
+    device = center_init.device
+    dtype = center_init.dtype
+
+    def as_vector_like_center(x: Iterable, vector_name: str) -> torch.Tensor:
+        x = torch.as_tensor(x, dtype=dtype, device=device)
+        if x.ndim == 0:
+            x = x.repeat(solution_length)
+        else:
+            if x.shape[-1] != solution_length:
+                raise ValueError(
+                    f"`{vector_name}` has an incompatible length."
+                    f" The length of `{vector_name}`: {x.shape[-1]},"
+                    f" but the solution length implied by the provided `center_init` is {solution_length}."
+                )
+        return x
+
+    if stdev_min is None:
+        stdev_min = 0.0
+    stdev_min = as_vector_like_center(stdev_min, "stdev_min")
+
+    if stdev_max is None:
+        stdev_max = float("inf")
+    stdev_max = as_vector_like_center(stdev_max, "stdev_max")
+
+    if stdev_max_change is None:
+        stdev_max_change = float("inf")
+    stdev_max_change = as_vector_like_center(stdev_max_change, "stdev_max_change")
+    parenthood_ratio = float(parenthood_ratio)
+
+    if objective_sense == "min":
+        maximize = False
+    elif objective_sense == "max":
+        maximize = True
+    else:
+        raise ValueError(
+            f"`objective_sense` was expected as 'min' or 'max', but it was received as {repr(objective_sense)}"
+        )
+
+    return CEMState(
+        center=center_init,
+        stdev=stdev_init,
+        stdev_min=stdev_min,
+        stdev_max=stdev_max,
+        stdev_max_change=stdev_max_change,
+        parenthood_ratio=parenthood_ratio,
+        maximize=maximize,
+    )
+
+
+_required_parameters = ["mu", "sigma", "parenthood_ratio"]
+_cem_sample = make_functional_sampler(SeparableGaussian, required_parameters=_required_parameters)
+_cem_grad = make_functional_grad_estimator(SeparableGaussian, required_parameters=_required_parameters)
+
+
+@expects_ndim(1, 1, None, None, randomness="different")
+def _cem_ask(center: torch.Tensor, stdev: torch.Tensor, parenthood_ratio: float, popsize: int) -> torch.Tensor:
+    return _cem_sample(popsize, mu=center, sigma=stdev, parenthood_ratio=parenthood_ratio)
+
+
+@expects_ndim(1, 1, 1, None, None, 1, 1, 2, 1, randomness="different")
+def _cem_tell(
+    stdev_min: torch.Tensor,
+    stdev_max: torch.Tensor,
+    stdev_max_change: torch.Tensor,
+    parenthood_ratio: float,
+    maximize: bool,
+    org_center: torch.Tensor,
+    org_stdev: torch.Tensor,
+    values: torch.Tensor,
+    evals: torch.Tensor,
+) -> tuple:
+    grads = _cem_grad(
+        values,
+        evals,
+        mu=org_center,
+        sigma=org_stdev,
+        objective_sense=("max" if maximize else "min"),
+        parenthood_ratio=parenthood_ratio,
+    )
+
+    mu_grad = grads["mu"]
+    sigma_grad = grads["sigma"]
+
+    center = org_center + mu_grad
+
+    target_stdev = org_stdev + sigma_grad
+    stdev = modify_vector(
+        org_stdev,
+        target_stdev,
+        lb=stdev_min,
+        ub=stdev_max,
+        max_change=stdev_max_change,
+    )
+
+    return center, stdev
+
+
+def cem_ask(state: CEMState, *, popsize: int) -> torch.Tensor:
+    """
+    Obtain a population from cross entropy method, given the state.
+
+    Args:
+        state: The current state of the cross entropy method search.
+        popsize: Number of solutions to be generated for the requested
+            population.
+    Returns:
+        Population, as a tensor of at least 2 dimensions.
+    """
+    return _cem_ask(state.center, state.stdev, state.parenthood_ratio, popsize)
+
+
+def cem_tell(state: CEMState, values: torch.Tensor, evals: torch.Tensor) -> CEMState:
+    """
+    Given the old state and the evals (fitnesses), obtain the next state.
+
+    From this state tuple, the center point of the search distribution can be
+    obtained via the field `.center`.
+
+    Args:
+        state: The old state of the cross entropy method search.
+        values: The most recent population, as a PyTorch tensor.
+        evals: Evaluation results (i.e. fitnesses) for the solutions expressed
+            by `values`. For example, if `values` is shaped `(N, L)`, this means
+            that there are `N` solutions (of length `L`). So, `evals` is
+            expected as a 1-dimensional tensor of length `N`, where `evals[i]`
+            expresses the fitness of the solution `values[i, :]`.
+            If `values` is shaped `(B, N, L)`, then there is also a batch
+            dimension, so, `evals` is expected as a 2-dimensional tensor of
+            shape `(B, N)`.
+    Returns:
+        The new state of the cross entropy method search.
+    """
+    new_center, new_stdev = _cem_tell(
+        state.stdev_min,
+        state.stdev_max,
+        state.stdev_max_change,
+        state.parenthood_ratio,
+        state.maximize,
+        state.center,
+        state.stdev,
+        values,
+        evals,
+    )
+    return CEMState(
+        center=new_center,
+        stdev=new_stdev,
+        stdev_min=state.stdev_min,
+        stdev_max=state.stdev_max,
+        stdev_max_change=state.stdev_max_change,
+        parenthood_ratio=state.parenthood_ratio,
+        maximize=state.maximize,
+    )
diff --git a/src/evotorch/algorithms/functional/funcclipup.py b/src/evotorch/algorithms/functional/funcclipup.py
new file mode 100644
index 00000000..35032993
--- /dev/null
+++ b/src/evotorch/algorithms/functional/funcclipup.py
@@ -0,0 +1,151 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import NamedTuple, Optional
+
+import torch
+
+from ...decorators import expects_ndim
+from ...tools import BatchableScalar, BatchableVector
+
+
+class ClipUpState(NamedTuple):
+    center: torch.Tensor
+    velocity: torch.Tensor
+    center_learning_rate: torch.Tensor
+    momentum: torch.Tensor
+    max_speed: torch.Tensor
+
+
+def clipup(
+    *,
+    center_init: BatchableVector,
+    momentum: BatchableScalar = 0.9,
+    center_learning_rate: Optional[BatchableScalar] = None,
+    max_speed: Optional[BatchableScalar] = None,
+) -> ClipUpState:
+    """
+    Initialize the ClipUp optimizer and return its initial state.
+
+    Reference:
+
+        Toklu, N. E., Liskowski, P., & Srivastava, R. K. (2020, September).
+        ClipUp: A Simple and Powerful Optimizer for Distribution-Based Policy Evolution.
+        In International Conference on Parallel Problem Solving from Nature (pp. 515-527).
+        Springer, Cham.
+
+    Args:
+        center_init: Starting point for the ClipUp search.
+            Expected as a PyTorch tensor with at least 1 dimension.
+            If there are 2 or more dimensions, the extra leftmost dimensions
+            are interpreted as batch dimensions.
+        center_learning_rate: Learning rate (i.e. the step size) for the ClipUp
+            updates. Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+        max_speed: Maximum speed, expected as a scalar. The euclidean norm
+            of the velocity (i.e. of the update vector) is not allowed to
+            exceed `max_speed`.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+    """
+    center_init = torch.as_tensor(center_init)
+    dtype = center_init.dtype
+    device = center_init.device
+
+    def as_tensor(x) -> torch.Tensor:
+        return torch.as_tensor(x, dtype=dtype, device=device)
+
+    if (center_learning_rate is None) and (max_speed is None):
+        raise ValueError("Both `center_learning_rate` and `max_speed` is missing. At least one of them is needed.")
+    elif (center_learning_rate is not None) and (max_speed is None):
+        center_learning_rate = as_tensor(center_learning_rate)
+        max_speed = center_learning_rate * 2.0
+    elif (center_learning_rate is None) and (max_speed is not None):
+        max_speed = as_tensor(max_speed)
+        center_learning_rate = max_speed / 2.0
+    else:
+        center_learning_rate = as_tensor(center_learning_rate)
+        max_speed = as_tensor(max_speed)
+
+    velocity = torch.zeros_like(center_init)
+    momentum = as_tensor(momentum)
+
+    return ClipUpState(
+        center=center_init,
+        velocity=velocity,
+        center_learning_rate=center_learning_rate,
+        momentum=momentum,
+        max_speed=max_speed,
+    )
+
+
+@expects_ndim(1, 1, 1, 0, 0, 0)
+def _clipup_step(
+    g: torch.Tensor,
+    center: torch.Tensor,
+    velocity: torch.Tensor,
+    center_learning_rate: torch.Tensor,
+    momentum: torch.Tensor,
+    max_speed: torch.Tensor,
+) -> tuple:
+    velocity = (momentum * velocity) + (center_learning_rate * (g / torch.norm(g)))
+    vnorm = torch.norm(velocity)
+    must_clip = (vnorm > max_speed).expand(velocity.shape)
+    velocity = torch.where(must_clip, max_speed * (velocity / vnorm), velocity)
+    center = center + velocity
+    return velocity, center
+
+
+def clipup_ask(state: ClipUpState) -> torch.Tensor:
+    """
+    Get the search point stored by the given `ClipUpState`.
+
+    Args:
+        state: The current state of the ClipUp optimizer.
+    Returns:
+        The search point as a 1-dimensional tensor in the non-batched case,
+        or as a multi-dimensional tensor if the ClipUp search is batched.
+    """
+    return state.center
+
+
+def clipup_tell(state: ClipUpState, *, follow_grad: BatchableVector) -> ClipUpState:
+    """
+    Tell the ClipUp optimizer the current gradient to get its next state.
+
+    Args:
+        state: The current state of the ClipUp optimizer.
+        follow_grad: Gradient at the current point of the Adam search.
+            Can be a 1-dimensional tensor in the non-batched case,
+            or a multi-dimensional tensor in the batched case.
+    Returns:
+        The updated state of ClipUp with the given gradient applied.
+    """
+    velocity, center = _clipup_step(
+        follow_grad,
+        state.center,
+        state.velocity,
+        state.center_learning_rate,
+        state.momentum,
+        state.max_speed,
+    )
+
+    return ClipUpState(
+        center=center,
+        velocity=velocity,
+        center_learning_rate=state.center_learning_rate,
+        momentum=state.momentum,
+        max_speed=state.max_speed,
+    )
diff --git a/src/evotorch/algorithms/functional/funcpgpe.py b/src/evotorch/algorithms/functional/funcpgpe.py
new file mode 100644
index 00000000..2eddadf7
--- /dev/null
+++ b/src/evotorch/algorithms/functional/funcpgpe.py
@@ -0,0 +1,384 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import Iterable, NamedTuple, Optional, Union
+
+import torch
+
+from ...decorators import expects_ndim
+from ...distributions import (
+    SeparableGaussian,
+    SymmetricSeparableGaussian,
+    make_functional_grad_estimator,
+    make_functional_sampler,
+)
+from ...tools import BatchableScalar, BatchableVector, modify_vector
+
+
+def _make_sample_and_grad_funcs(symmetric: bool) -> tuple:
+    distribution = SymmetricSeparableGaussian if symmetric else SeparableGaussian
+    grad_denominator = "num_directions" if symmetric else "num_solutions"
+
+    required_parameters = ["mu", "sigma"]
+    fixed_parameters = dict(
+        divide_mu_grad_by=grad_denominator,
+        divide_sigma_grad_by=grad_denominator,
+    )
+
+    sample = make_functional_sampler(
+        distribution, required_parameters=required_parameters, fixed_parameters=fixed_parameters
+    )
+
+    grad = make_functional_grad_estimator(
+        distribution, required_parameters=required_parameters, fixed_parameters=fixed_parameters
+    )
+
+    return sample, grad
+
+
+_nonsymmetric_sample, _nonsymmetric_grad = _make_sample_and_grad_funcs(False)
+_symmetic_sample, _symmetric_grad = _make_sample_and_grad_funcs(True)
+
+
+class PGPEState(NamedTuple):
+    optimizer: Union[str, tuple]  # "adam" or (adam, adam_ask, adam_tell)
+    optimizer_state: tuple
+    stdev: torch.Tensor
+    stdev_learning_rate: torch.Tensor
+    stdev_min: torch.Tensor
+    stdev_max: torch.Tensor
+    stdev_max_change: torch.Tensor
+    ranking_method: str
+    maximize: bool
+    symmetric: bool
+
+
+def pgpe(
+    *,
+    center_init: BatchableVector,
+    center_learning_rate: BatchableScalar,
+    stdev_learning_rate: BatchableScalar,
+    objective_sense: str,
+    ranking_method: str = "centered",
+    optimizer: Union[str, tuple] = "clipup",  # or "adam" or "sgd"
+    optimizer_config: Optional[dict] = None,
+    stdev_init: Optional[Union[float, BatchableVector]] = None,
+    radius_init: Optional[Union[float, BatchableScalar]] = None,
+    stdev_min: Optional[Union[float, BatchableVector]] = None,
+    stdev_max: Optional[Union[float, BatchableVector]] = None,
+    stdev_max_change: Optional[Union[float, BatchableVector]] = 0.2,
+    symmetric: bool = True,
+) -> PGPEState:
+    """
+    Get an initial state for the PGPE algorithm.
+
+    The received initial state, a named tuple of type `PGPEState`, is to be
+    passed to the function `pgpe_ask(...)` to receive the solutions belonging
+    to the first generation of the evolutionary search.
+
+    Inspired by the PGPE implementations used in the studies
+    of Ha (2017, 2019), and by the evolution strategy variant of
+    Salimans et al. (2017), this PGPE implementation uses 0-centered
+    ranking by default.
+    The default optimizer for this PGPE implementation is ClipUp
+    (Toklu et al., 2020).
+
+    References:
+
+        Frank Sehnke, Christian Osendorfer, Thomas Ruckstiess,
+        Alex Graves, Jan Peters, Jurgen Schmidhuber (2010).
+        Parameter-exploring Policy Gradients.
+        Neural Networks 23(4), 551-559.
+
+        David Ha (2017). Evolving Stable Strategies.
+        <http://blog.otoro.net/2017/11/12/evolving-stable-strategies/>
+
+        Salimans, T., Ho, J., Chen, X., Sidor, S. and Sutskever, I. (2017).
+        Evolution Strategies as a Scalable Alternative to
+        Reinforcement Learning.
+
+        David Ha (2019). Reinforcement Learning for Improving Agent Design.
+        Artificial life 25 (4), 352-365.
+
+        Toklu, N.E., Liskowski, P., Srivastava, R.K. (2020).
+        ClipUp: A Simple and Powerful Optimizer
+        for Distribution-based Policy Evolution.
+        Parallel Problem Solving from Nature (PPSN 2020).
+
+    Args:
+        center_init: Center (i.e. mean) of the initial search distribution.
+            Expected as a PyTorch tensor with at least 1 dimension.
+            If the given `center` tensor has more than 1 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        center_learning_rate: Learning rate for when updating the center of the
+            search distribution.
+            For normal cases, this is expected as a scalar. If given as an
+            n-dimensional tensor (where n>0), the extra dimensions will be
+            considered as batch dimensions.
+        stdev_learning_rate: Learning rate for when updating the standard
+            deviation of the search distribution.
+            For normal cases, this is expected as a scalar. If given as an
+            n-dimensional tensor (where n>0), the extra dimensions will be
+            considered as batch dimensions.
+        objective_sense: Expected as a string, either as 'min' or as 'max'.
+            Determines if the goal is to minimize or is to maximize.
+        ranking_method: Determines how the fitnesses will be ranked before
+            computing the gradients. Among the choices are
+            "centered" (a linear ranking where the worst solution gets the rank
+            -0.5 and the best solution gets the rank +0.5),
+            "linear" (a linear ranking where the worst solution gets the rank
+            0 and the best solution gets the rank 1),
+            "nes" (the ranking method that is used by the natural evolution
+            strategies), and
+            "raw" (no ranking).
+        optimizer: Functional optimizer to use when updating the center of the
+            search distribution. The functional optimizer can be expressed via
+            a string, or via a tuple.
+            If given as string, the valid choices are:
+            "clipup" (for the ClipUp optimizer),
+            "adam" (for the Adam optimizer),
+            "sgd" (for regular gradient ascent/descent).
+            If given as a tuple, the tuple should be in the form
+            `(optim, optim_ask, optim_tell)`, where the objects
+            `optim`, `optim_ask`, and `optim_tell` are the functions for
+            initializing the optimizer, asking (for the current search point),
+            and telling (the gradient to follow).
+            The function `optim` should expect keyword arguments for its
+            hyperparameters, and should return a state tuple of the optimizer.
+            The function `optim_ask` should expect the state tuple of the
+            optimizer, and should return the current search point as a tensor.
+            The function `optim_tell` should expect the state tuple of the
+            optimizer as a positional argument, and the gradient via the
+            keyword argument `follow_grad`.
+        optimizer_config: Optionally a dictionary, containing the
+            hyperparameters for the optimizer.
+        stdev_init: Standard deviation of the initial search distribution.
+            If this is given as a scalar `s`, the standard deviation for the
+            search distribution will be interpreted as `[s, s, ..., s]` whose
+            length is the same with the length of `center_init`.
+            If this is given as a 1-dimensional tensor, the given tensor will
+            be interpreted as the standard deviation vector.
+            If this is given as a tensor with at least 2 dimensions, the extra
+            leftmost dimension(s) will be interpreted as batch dimensions.
+            If you wish to express the coverage area of the initial search
+            distribution in terms of "radius" instead, you can leave
+            `stdev_init` as None, and provide a value for the argument
+            `radius_init`.
+        radius_init: Radius for the initial search distribution, representing
+            the euclidean norm for the first standard deviation vector.
+            Setting this value as `r` means that the standard deviation
+            vector will be initialized as a vector `[s, s, ..., s]`
+            whose norm will be equal to `r`. In the non-batched case,
+            `radius_init` is expected as a scalar value.
+            If `radius_init` is given as a tensor with 1 or more
+            dimensions, those dimensions will be considered as batch
+            dimensions. If you wish to express the coverage are of the initial
+            search distribution in terms of the standard deviation values
+            instead, you can leave `radius_init` as None, and provide a value
+            for the argument `stdev_init`.
+        stdev_min: Minimum allowed standard deviation for the search
+            distribution. Can be given as a scalar or as a tensor with one or
+            more dimensions. When given with at least 2 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        stdev_max: Maximum allowed standard deviation for the search
+            distribution. Can be given as a scalar or as a tensor with one or
+            more dimensions. When given with at least 2 dimensions, the extra
+            leftmost dimensions will be interpreted as batch dimensions.
+        stdev_max_change: Maximum allowed change for the standard deviation
+            vector. If this is given as a scalar, this scalar will serve as a
+            limiter for the change of the entire standard deviation vector.
+            For example, a scalar value of 0.2 means that the elements of the
+            standard deviation vector cannot change more than the 20% of their
+            original values. If this is given as a vector (i.e. as a
+            1-dimensional tensor), each element of `stdev_max_change` will
+            serve as a limiter to its corresponding element within the standard
+            deviation vector. If `stdev_max_change` is given as a tensor with
+            at least 2 dimensions, the extra leftmost dimension(s) will be
+            interpreted as batch dimensions.
+            If you do not wish to have such a limiter, you can leave this as
+            None.
+        symmetric: Whether or not symmetric (i.e. antithetic) sampling will be
+            done while generating a new population.
+    Returns:
+        A named tuple, of type `CEMState`, storing the hyperparameters and the
+        initial state of the cross entropy method.
+    """
+    from .misc import _get_stdev_init, get_functional_optimizer
+
+    center_init = torch.as_tensor(center_init)
+    if center_init.ndim < 1:
+        raise ValueError(
+            "The center of the search distribution for the functional PGPE was expected"
+            " as a tensor with at least 1 dimension."
+            f" However, the encountered `center` is {center_init}, of shape {center_init.shape}."
+        )
+
+    solution_length = center_init.shape[-1]
+    if solution_length == 0:
+        raise ValueError("Solution length cannot be 0")
+
+    stdev_init = _get_stdev_init(center_init=center_init, stdev_init=stdev_init, radius_init=radius_init)
+
+    device = center_init.device
+    dtype = center_init.dtype
+
+    def as_tensor(x) -> torch.Tensor:
+        return torch.as_tensor(x, dtype=dtype, device=device)
+
+    def as_vector_like_center(x: Iterable, vector_name: str) -> torch.Tensor:
+        x = as_tensor(x)
+        if x.ndim == 0:
+            x = x.repeat(solution_length)
+        else:
+            if x.shape[-1] != solution_length:
+                raise ValueError(
+                    f"`{vector_name}` has an incompatible length."
+                    f" The length of `{vector_name}`: {x.shape[-1]},"
+                    f" but the solution length implied by the provided `center_init` is {solution_length}."
+                )
+        return x
+
+    center_learning_rate = as_tensor(center_learning_rate)
+    stdev_learning_rate = as_tensor(stdev_learning_rate)
+
+    if objective_sense == "min":
+        maximize = False
+    elif objective_sense == "max":
+        maximize = True
+    else:
+        raise ValueError(
+            f"`objective_sense` was expected as 'min' or 'max', but it was received as {repr(objective_sense)}"
+        )
+
+    ranking_method = str(ranking_method)
+
+    if stdev_min is None:
+        stdev_min = 0.0
+    stdev_min = as_vector_like_center(stdev_min, "stdev_min")
+
+    if stdev_max is None:
+        stdev_max = float("inf")
+    stdev_max = as_vector_like_center(stdev_max, "stdev_max")
+
+    if stdev_max_change is None:
+        stdev_max_change = float("inf")
+    stdev_max_change = as_vector_like_center(stdev_max_change, "stdev_max_change")
+
+    if optimizer_config is None:
+        optimizer_config = {}
+    optimizer_init_func, _, _ = get_functional_optimizer(optimizer)
+    optimizer_state = optimizer_init_func(
+        center_init=center_init, center_learning_rate=center_learning_rate, **optimizer_config
+    )
+
+    symmetric = bool(symmetric)
+
+    return PGPEState(
+        optimizer=optimizer,
+        optimizer_state=optimizer_state,
+        stdev=stdev_init,
+        stdev_learning_rate=stdev_learning_rate,
+        stdev_min=stdev_min,
+        stdev_max=stdev_max,
+        stdev_max_change=stdev_max_change,
+        ranking_method=ranking_method,
+        maximize=maximize,
+        symmetric=symmetric,
+    )
+
+
+def pgpe_ask(state: PGPEState, *, popsize: int) -> torch.Tensor:
+    """
+    Obtain a population from the PGPE algorithm.
+
+    Args:
+        state: The current state of PGPE.
+        popsize: Number of solutions to be generated for the requested
+            population.
+    Returns:
+        Population, as a tensor of at least 2 dimensions.
+    """
+    from .misc import get_functional_optimizer
+
+    _, optimizer_ask, _ = get_functional_optimizer(state.optimizer)
+    center = optimizer_ask(state.optimizer_state)
+    stdev = state.stdev
+    sample_func = _symmetic_sample if state.symmetric else _nonsymmetric_sample
+    return sample_func(popsize, mu=center, sigma=stdev)
+
+
+@expects_ndim(1, 0, 1)
+def _follow_stdev_grad(
+    original_stdev: torch.Tensor,
+    stdev_learning_rate: torch.Tensor,
+    stdev_grad: torch.Tensor,
+) -> torch.Tensor:
+    return original_stdev + (stdev_learning_rate * stdev_grad)
+
+
+def pgpe_tell(state: PGPEState, values: torch.Tensor, evals: torch.Tensor) -> PGPEState:
+    """
+    Given the old state and the evals (fitnesses), obtain the next state.
+
+    From this state tuple, the center point of the search distribution can be
+    obtained via the field `.optimizer_state.center`.
+
+    Args:
+        state: The old state of the cross entropy method search.
+        values: The most recent population, as a PyTorch tensor.
+        evals: Evaluation results (i.e. fitnesses) for the solutions expressed
+            by `values`. For example, if `values` is shaped `(N, L)`, this means
+            that there are `N` solutions (of length `L`). So, `evals` is
+            expected as a 1-dimensional tensor of length `N`, where `evals[i]`
+            expresses the fitness of the solution `values[i, :]`.
+            If `values` is shaped `(B, N, L)`, then there is also a batch
+            dimension, so, `evals` is expected as a 2-dimensional tensor of
+            shape `(B, N)`.
+    Returns:
+        The new state of PGPE.
+    """
+    from .misc import get_functional_optimizer
+
+    _, optimizer_ask, optimizer_tell = get_functional_optimizer(state.optimizer)
+
+    grad_func = _symmetric_grad if state.symmetric else _nonsymmetric_grad
+    objective_sense = "max" if state.maximize else "min"
+    grads = grad_func(
+        values,
+        evals,
+        mu=optimizer_ask(state.optimizer_state),
+        sigma=state.stdev,
+        objective_sense=objective_sense,
+        ranking_method=state.ranking_method,
+    )
+
+    new_optimizer_state = optimizer_tell(state.optimizer_state, follow_grad=grads["mu"])
+
+    target_stdev = _follow_stdev_grad(state.stdev, state.stdev_learning_rate, grads["sigma"])
+    new_stdev = modify_vector(
+        state.stdev, target_stdev, lb=state.stdev_min, ub=state.stdev_max, max_change=state.stdev_max_change
+    )
+
+    return PGPEState(
+        optimizer=state.optimizer,
+        optimizer_state=new_optimizer_state,
+        stdev=new_stdev,
+        stdev_learning_rate=state.stdev_learning_rate,
+        stdev_min=state.stdev_min,
+        stdev_max=state.stdev_max,
+        stdev_max_change=state.stdev_max_change,
+        ranking_method=state.ranking_method,
+        maximize=state.maximize,
+        symmetric=state.symmetric,
+    )
diff --git a/src/evotorch/algorithms/functional/funcsgd.py b/src/evotorch/algorithms/functional/funcsgd.py
new file mode 100644
index 00000000..f451f4c0
--- /dev/null
+++ b/src/evotorch/algorithms/functional/funcsgd.py
@@ -0,0 +1,130 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import NamedTuple, Optional
+
+import torch
+
+from ...decorators import expects_ndim
+from ...tools import BatchableScalar, BatchableVector
+
+
+class SGDState(NamedTuple):
+    center: torch.Tensor
+    velocity: torch.Tensor
+    center_learning_rate: torch.Tensor
+    momentum: torch.Tensor
+
+
+def sgd(
+    *,
+    center_init: BatchableVector,
+    center_learning_rate: BatchableScalar,
+    momentum: Optional[BatchableScalar] = None,
+) -> SGDState:
+    """
+    Initialize the gradient ascent/descent search and get its initial state.
+
+    Reference regarding the momentum behavior:
+
+        Polyak, B. T. (1964).
+        Some methods of speeding up the convergence of iteration methods.
+        USSR Computational Mathematics and Mathematical Physics, 4(5):1–17.
+
+    Args:
+        center_init: Starting point for the gradient ascent/descent.
+            Expected as a PyTorch tensor with at least 1 dimension.
+            If there are 2 or more dimensions, the extra leftmost dimensions
+            are interpreted as batch dimensions.
+        center_learning_rate: Learning rate (i.e. the step size) for gradient
+            ascent/descent. Can be a scalar or a multidimensional tensor.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+        momentum: Momentum coefficient, expected as a scalar.
+            If provided as a scalar, Polyak-style momentum will be enabled.
+            If given as a tensor with multiple dimensions, those dimensions
+            will be interpreted as batch dimensions.
+    """
+    center_init = torch.as_tensor(center_init)
+    dtype = center_init.dtype
+    device = center_init.device
+
+    def as_tensor(x) -> torch.Tensor:
+        return torch.as_tensor(x, dtype=dtype, device=device)
+
+    velocity = torch.zeros_like(center_init)
+    center_learning_rate = as_tensor(center_learning_rate)
+    momentum = as_tensor(0.0) if momentum is None else as_tensor(momentum)
+
+    return SGDState(
+        center=center_init,
+        velocity=velocity,
+        center_learning_rate=center_learning_rate,
+        momentum=momentum,
+    )
+
+
+@expects_ndim(1, 1, 1, 0, 0)
+def _sgd_step(
+    g: torch.Tensor,
+    center: torch.Tensor,
+    velocity: torch.Tensor,
+    center_learning_rate: torch.Tensor,
+    momentum: torch.Tensor,
+) -> tuple:
+    velocity = (momentum * velocity) + (center_learning_rate * g)
+    center = center + velocity
+    return velocity, center
+
+
+def sgd_ask(state: SGDState) -> torch.Tensor:
+    """
+    Get the search point stored by the given `SGDState`.
+
+    Args:
+        state: The current state of gradient ascent/descent.
+    Returns:
+        The search point as a 1-dimensional tensor in the non-batched case,
+        or as a multi-dimensional tensor if the search is batched.
+    """
+    return state.center
+
+
+def sgd_tell(state: SGDState, *, follow_grad: BatchableVector) -> SGDState:
+    """
+    Tell the gradient ascent/descent the current gradient to get its next state.
+
+    Args:
+        state: The current state of gradient ascent/descent.
+        follow_grad: Gradient at the current point of the search.
+            Can be a 1-dimensional tensor in the non-batched case,
+            or a multi-dimensional tensor in the batched case.
+    Returns:
+        The updated state of gradient ascent/descent, with the given gradient
+        applied.
+    """
+    velocity, center = _sgd_step(
+        follow_grad,
+        state.center,
+        state.velocity,
+        state.center_learning_rate,
+        state.momentum,
+    )
+
+    return SGDState(
+        center=center,
+        velocity=velocity,
+        center_learning_rate=state.center_learning_rate,
+        momentum=state.momentum,
+    )
diff --git a/src/evotorch/algorithms/functional/misc.py b/src/evotorch/algorithms/functional/misc.py
new file mode 100644
index 00000000..b0683fa8
--- /dev/null
+++ b/src/evotorch/algorithms/functional/misc.py
@@ -0,0 +1,163 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+from typing import Callable, Iterable, NamedTuple, Optional, Union
+
+import torch
+
+
+class OptimizerFunctions(NamedTuple):
+    initialize: Callable
+    ask: Callable
+    tell: Callable
+
+
+def get_functional_optimizer(optimizer: Union[str, tuple]) -> tuple:
+    """
+    Get a tuple of optimizer-related functions, from the given optimizer name.
+
+    For example, if the given string is "adam", the returned tuple will be
+    `(adam, adam_ask, adam_tell)`, where
+    [adam][evotorch.algorithms.functional.funcadam.adam]
+    is the function that will initialize the Adam optimizer,
+    [adam_ask][evotorch.algorithms.functional.funcadam.adam_ask]
+    is the function that will get the current search point as a tensor, and
+    [adam_tell][evotorch.algorithms.functional.funcadam.adam_tell]
+    is the function that will expect the gradient and will return the updated
+    state of the Adam search after applying the given gradient.
+
+    In addition to "adam", the strings "clipup" and "sgd" are also supported.
+
+    If the given optimizer is a 3-element tuple, then, the three elements
+    within the tuple are assumed to be the initialization, ask, and tell
+    functions of a custom optimizer, and those functions are returned
+    in the same order.
+
+    Args:
+        optimizer: The optimizer name as a string, or a 3-element tuple
+            representing the functions related to the optimizer.
+    Returns:
+        A 3-element tuple in the form
+        `(optimizer, optimizer_ask, optimizer_tell)`, where each element
+        is a function, the first one being responsible for initializing
+        the optimizer and returning its first state.
+    """
+    from .funcadam import adam, adam_ask, adam_tell
+    from .funcclipup import clipup, clipup_ask, clipup_tell
+    from .funcsgd import sgd, sgd_ask, sgd_tell
+
+    if optimizer == "adam":
+        return OptimizerFunctions(initialize=adam, ask=adam_ask, tell=adam_tell)
+    elif optimizer == "clipup":
+        return OptimizerFunctions(initialize=clipup, ask=clipup_ask, tell=clipup_tell)
+    elif optimizer in ("sgd", "sga", "momentum"):
+        return OptimizerFunctions(initialize=sgd, ask=sgd_ask, tell=sgd_tell)
+    elif isinstance(optimizer, str):
+        raise ValueError(f"Unrecognized functional optimizer name: {optimizer}")
+    elif isinstance(optimizer, Iterable):
+        a, b, c = optimizer
+        return OptimizerFunctions(initialize=a, ask=b, tell=c)
+    else:
+        raise TypeError(
+            f"`get_functional_optimizer(...)` received an unrecognized argument: {repr(optimizer)}"
+            f" (of type {type(optimizer)})"
+        )
+
+
+def _get_stdev_init(
+    *,
+    center_init: torch.Tensor,
+    stdev_init: Optional[Union[float, Iterable]] = None,
+    radius_init: Optional[Union[float, Iterable]] = None,
+) -> torch.Tensor:
+    """
+    Internal helper function for getting the standard deviation vector.
+
+    This utility function is used by the functional implementations of
+    the algorithms `cem` and `pgpe`.
+
+    Given a `center_init` tensor and the arguments `stdev_init` and
+    `radius_init`, this helper function returns a `stdev_init` tensor ready
+    to be used by the functional evolutionary algorithm with a compatible
+    dtype, device, and shape.
+
+    Args:
+        center_init: The center point for the initial search distribution.
+        stdev_init: Standard deviation as None, or as a scalar, or as a vector,
+            or as a batch of vectors. If left as None, it will be assumed that
+            the user wishes to express the coverage area of the initial search
+            distribution via `radius_init` instead.
+        radius_init: Radius of the initial search distribution as None, or as a
+            scalar, or as a batch of scalars. If left as None, it will be
+            assumed that the user wishes to express the coverage area of the
+            initial search distribution via `stdev_init` instead.
+    Returns:
+        Standard deviation for the initial search distribution, as a vector
+        in the non-batched case, or as a tensor with 2 or more dimensions
+        in the batched case.
+    """
+
+    if not isinstance(center_init, torch.Tensor):
+        raise TypeError(
+            "While computing/validating the initial standard deviation of the functional search algorithm,"
+            " the argument `center_init` was encountered as something other than a tensor."
+            " The argument `center_init` is expected as a tensor."
+        )
+
+    # Get the dtype and the device of the tensor `center_init`. It will be ensured that the standard deviation tensor
+    # is also of this dtype and on this device.
+    dtype = center_init.dtype
+    device = center_init.device
+
+    # The length of a solution is understood from the `center_init` tensor's rightmost dimension's size.
+    solution_length = center_init.shape[-1]
+
+    if (stdev_init is None) and (radius_init is None):
+        raise ValueError(
+            "Both `stdev_init` and `radius_init` are encountered as None."
+            " Please provide one of them so that the standard deviation of the initial search distribution is clear."
+        )
+    elif (stdev_init is not None) and (radius_init is None):
+        # This is the case where `stdev_init` is given and `radius_init` is omitted.
+        stdev_init = torch.as_tensor(stdev_init, dtype=dtype, device=device)
+        if stdev_init.ndim == 0:
+            # This is the case where `stdev_init` is given as a scalar. We convert it to a vector (by repeating its
+            # original scalar form).
+            stdev_init = stdev_init.repeat(solution_length)
+        else:
+            # This is the case where `stdev_init` is not a scalar. We make sure that its rightmost dimension's size
+            # is compatible with the solution length.
+            if stdev_init.shape[-1] != solution_length:
+                raise ValueError(
+                    "The shape of `stdev_init` does not seem compatible with the shape of `center_init`."
+                    f" The shape of `center_init` is {center_init.shape},"
+                    f" implying a solution length of {solution_length}."
+                    f" However, the shape of `stdev_init` is {stdev_init.shape}."
+                    f" Please ensure that `stdev_init` is either a scalar, or is a vector of length {solution_length},"
+                    f" or is a batch of vectors where the rightmost dimension's size is {solution_length}."
+                )
+        return stdev_init
+    elif (stdev_init is None) and (radius_init is not None):
+        # This is the case where `radius_init` is given and `stdev_init` is omitted.
+        # We make a standard deviation vector (or a batch of standard deviation vectors) such that the norm of the
+        # the vector(s) is/are equal to the given radius value(s).
+        radius_init = torch.as_tensor(radius_init, dtype=dtype, device=device)
+        stdev_element = torch.sqrt((radius_init**2) / solution_length)
+        final_stdev = stdev_element[..., None] * torch.ones(solution_length)
+        return final_stdev
+    else:
+        raise ValueError(
+            "Both `stdev_init` and `radius_init` are encountered as values other than None."
+            " Please specify either `stdev_init` or `radius_init`, but not both."
+        )
diff --git a/src/evotorch/core.py b/src/evotorch/core.py
index fd25fe31..f44d0efc 100644
--- a/src/evotorch/core.py
+++ b/src/evotorch/core.py
@@ -86,6 +86,7 @@ def warn(cls, operation_name: Optional[str] = None):
     is_dtype_object,
     is_real,
     is_sequence,
+    make_batched_false_for_vmap,
     message_from,
     modify_tensor,
     multiply_rows_by_scalars,
@@ -1856,7 +1857,12 @@ def generate_values(self, num_solutions: int) -> Union[torch.Tensor, ObjectArray
             A PyTorch tensor for numeric problems, an ObjectArray for
             non-numeric problems.
         """
-        result = self.make_empty(num_solutions=num_solutions)
+        if self.dtype is object:
+            result = self.make_empty(num_solutions=num_solutions)
+        else:
+            result = torch.empty(tuple(), dtype=self.dtype, device=self.device)
+            result = result.expand(num_solutions, self.solution_length)
+            result = result + make_batched_false_for_vmap(result.device)
         self._fill(result)
         return result
 
@@ -3295,6 +3301,100 @@ def is_on_cpu(self) -> bool:
         """
         return str(self.device) == "cpu"
 
+    def make_callable_evaluator(self, *, obj_index: Optional[int] = None) -> "ProblemBoundEvaluator":
+        """
+        Get a callable evaluator for evaluating the given solutions.
+
+        Let us assume that we have a [Problem][evotorch.core.Problem]
+        declared like this:
+
+        ```python
+        from evotorch import Problem
+
+        my_problem = Problem(
+            "min",
+            fitness_function_goes_here,
+            ...,
+        )
+        ```
+
+        Using the regular API of EvoTorch, one has to generate solutions for this
+        problem as follows:
+
+        ```python
+        population_size = ...
+        my_solutions = my_problem.generate_batch(population_size)
+        ```
+
+        For editing the decision values within the
+        [SolutionBatch][evotorch.core.SolutionBatch] `my_solutions`, one has to
+        do the following:
+
+        ```python
+        new_decision_values = ...
+        my_solutions.set_values(new_decision_values)
+        ```
+
+        Finally, to evaluate `my_solutions`, one would have to do these:
+
+        ```python
+        my_problem.evaluate(my_solutions)
+        fitnesses = my_problem.evals
+        ```
+
+        One could desire a different interface which is more compatible with
+        functional programming paradigm, especially when planning to use the
+        functional algorithms (such as, for example, the functional
+        [cem][evotorch.algorithms.functional.funccem.cem]).
+        To achieve this, one can do the following:
+
+        ```python
+        f = my_problem.make_callable_evaluator()
+        ```
+
+        Now, we have a new object `f`, which behaves like a function.
+        This function-like object expects a tensor of decision values, and
+        returns fitnesses, as shown below:
+
+        ```python
+        random_decision_values = torch.randn(
+            population_size,
+            my_problem.solution_length,
+            dtype=my_problem.dtype,
+        )
+
+        fitnesses = f(random_decision_values)
+        ```
+
+        **Parallelized fitness evaluation.**
+        If a `Problem` object is condifured to use parallelized evaluation with
+        the help of multiple actors, a callable evaluator made out of that
+        `Problem` object will also make use of those multiple actors.
+
+        **Additional batch dimensions.**
+        If a callable evaluator receives a tensor with 3 or more dimensions,
+        those extra leftmost dimensions will be considered as batch
+        dimensions. The returned fitness tensor will also preserve those batch
+        dimensions.
+
+        **Notes on vmap.**
+        `ProblemBoundEvaluator` is a shallow wrapper around a `Problem` object.
+        It does NOT transform the underlying problem object to its stateless
+        counterpart, and therefore it does NOT conform to pure functional
+        programming paradigm. Being stateful, it will NOT work correctly with
+        `vmap`. For batched evaluations, it is recommended to use extra batch
+        dimensions, instead of using `vmap`.
+
+        Args:
+            obj_index: The index of the objective according to which the
+                evaluations will be done. If the problem is single-objective,
+                this is not required. If the problem is multi-objective, this
+                needs to be given as an integer.
+        Returns:
+            A callable fitness evaluator, bound to this problem object.
+        """
+        return ProblemBoundEvaluator(self, obj_index=obj_index)
+
 
 SolutionBatchSliceInfo = NamedTuple("SolutionBatchSliceInfo", source="SolutionBatch", slice=IndicesOrSlice)
 
@@ -4990,3 +5090,81 @@ def to_batch(self) -> SolutionBatch:
             The SolutionBatch counterpart of the Solution.
         """
         return self._batch
+
+
+class ProblemBoundEvaluator:
+    """
+    A callable fitness evaluator, bound to the given `Problem`.
+
+    A callable evaluator returned by the method
+    `Problem.make_callable_evaluator` is an instance of this class.
+    For details, please see the documentation of
+    [Problem][evotorch.core.Problem], and of its method
+    `make_callable_evaluator`.
+    """
+
+    def __init__(self, problem: Problem, *, obj_index: Optional[int] = None):
+        """
+        `__init__(...)`: Initialize the `ProblemBoundEvaluator`.
+
+        Args:
+            problem: The problem object to be wrapped.
+            obj_index: The objective index. Optional if the problem being
+                wrapped is single-objective. If the problem being wrapped
+                is multi-objective, this is expected as an integer.
+        """
+        self._problem = problem
+        if not isinstance(self._problem, Problem):
+            clsname = type(self).__name__
+            raise TypeError(
+                f"In its initialization phase, {clsname} expected a `Problem` object,"
+                f" but found: {repr(self._problem)} (of type {repr(type(self._problem))})"
+            )
+        self._obj_index = self._problem.normalize_obj_index(obj_index)
+        self._problem.ensure_numeric()
+        if problem.dtype != problem.eval_dtype:
+            raise TypeError(
+                "The dtype of the decision values is not the same with the dtype of the evaluations."
+                " Currently, it is not supported to make callable evaluators out of problems whose"
+                " decision value dtypes are different than their evaluation dtypes."
+            )
+
+    def _make_empty_solution_batch(self, popsize: int) -> SolutionBatch:
+        return SolutionBatch(self._problem, popsize=popsize, empty=True, device="meta")
+
+    def _prepare_evaluated_solution_batch(self, values_2d: torch.Tensor) -> SolutionBatch:
+        num_solutions, solution_length = values_2d.shape
+        batch = self._make_empty_solution_batch(num_solutions)
+        batch._data = values_2d
+        batch._evdata = torch.empty_like(batch._evdata, device=values_2d.device)
+        self._problem.evaluate(batch)
+        return batch
+
+    def __call__(self, values: torch.Tensor) -> torch.Tensor:
+        """
+        Evaluate the solutions expressed by the given `values` tensor.
+
+        Args:
+            values: Decision values. Expected as a tensor with at least
+                2 dimensions. If the number of dimensions is more than 2,
+                the extra leftmost dimensions will be considered as batch
+                dimensions.
+        Returns:
+            The fitnesses, as a tensor.
+        """
+        ndim = values.ndim
+        if ndim == 0:
+            clsname = type(self).__name__
+            raise ValueError(
+                f"{clsname} was expecting a tensor with at least 1 dimension for solution evaluation."
+                f" However, it received a scalar: {values}"
+            )
+
+        solution_length = values.shape[-1]
+        original_batch_shape = values.shape[:-1]
+
+        values = values.reshape(-1, solution_length)
+        evaluated_batch = self._prepare_evaluated_solution_batch(values)
+        evals = evaluated_batch.evals[:, self._obj_index]
+
+        return evals.reshape(original_batch_shape).as_subclass(torch.Tensor)
diff --git a/src/evotorch/decorators.py b/src/evotorch/decorators.py
index ac7d298b..796ad215 100644
--- a/src/evotorch/decorators.py
+++ b/src/evotorch/decorators.py
@@ -14,12 +14,19 @@
 
 """Module defining decorators for evotorch."""
 
+from numbers import Number
 from typing import Callable, Iterable, Optional, Union
 
+import numpy as np
 import torch
 
 from .tools import Device
 
+try:
+    from torch.func import vmap
+except ImportError:
+    from functorch import vmap
+
 
 def _simple_decorator(
     decorator: Union[str, Callable], args: Iterable, decorator_name: Optional[str] = None
@@ -604,3 +611,361 @@ def f2(x: torch.Tensor) -> torch.Tensor:
     and will send it `n` solutions, and will receive and process `n` fitnesses.
     """
     return _simple_decorator("__evotorch_vectorized__", args, decorator_name="vectorized")
+
+
+def expects_ndim(  # noqa: C901
+    *expected_ndims,
+    allow_smaller_ndim: bool = False,
+    randomness: str = "error",
+) -> Callable:
+    """
+    Decorator to declare the number of dimensions for each positional argument.
+
+    Let us imagine that we have a function `f(a, b)`, where `a` and `b` are
+    PyTorch tensors. Let us also imagine that the function `f` is implemented
+    in such a way that `a` is assumed to be a 2-dimensional tensor, and `b`
+    is assumed to be a 1-dimensional tensor. In this case, the function `f`
+    can be decorated as follows:
+
+    ```python
+    from evotorch.decorators import expects_ndim
+
+
+    @expects_ndim(2, 1)
+    def f(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
+        ...
+    ```
+
+    Once decorated like this, the function `f` will gain the following
+    additional behaviors:
+
+    - If less-than-expected number of dimensions are provided either for
+      `a` or for `b`, an error will be raised (unless the decorator
+      is provided with the keyword argument `allow_smaller_ndim=True`)
+    - If either `a` or `b` are given as tensors that have extra leftmost
+      dimensions, those dimensions will be assumed as batch dimensions,
+      and therefore, the function `f` will run in a vectorized manner
+      (with the help of `vmap` behind the scene), and the result will be
+      a tensor with extra leftmost dimension(s), representing a batch
+      of resulting tensors.
+    - For convenience, numpy arrays and scalar data that are subclasses
+      of `numbers.Number` will be converted to PyTorch tensors first, and
+      then will be processed.
+
+    To be able to take advantage of this decorator, please ensure that the
+    decorated function is a `vmap`-friendly function. Please also ensure
+    that the decorated function expects positional arguments only.
+
+    **Randomness.**
+    Like in `torch.func.vmap`, the behavior of the decorated function in
+    terms of randomness can be configured via a keyword argument named
+    `randomness`:
+
+    ```python
+    @expects_ndim(2, 1, randomness="error")
+    def f(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
+        ...
+    ```
+
+    If `randomness` is set as "error", then, when there is batching, any
+    attempt to generate random data using PyTorch will raise an error.
+    If `randomness` is set as "different", then, a random generation
+    operation such as `torch.randn(...)` will produce a `BatchedTensor`,
+    where each batch item has its own re-sampled data.
+    If `randomness` is set as "same", then, a random generation operation
+    such as `torch.randn(...)` will produce a non-batched tensor containing
+    random data that is sampled only once.
+
+    **Alternative usage.**
+    `expects_ndim` has an alternative interface that allows one to use it
+    as a tool for temporarily wrapping/transforming other functions. Let us
+    consider again our example function `f`. Instead of using the decorator
+    syntax, one can do:
+
+    ```python
+    result = expects_ndim(f, (2, 1))(a, b)
+    ```
+
+    which will temporarily wrap the function `f` with the additional behaviors
+    mentioned above, and immediately call it with the arguments `a` and `b`.
+    """
+
+    if (len(expected_ndims) == 2) and isinstance(expected_ndims[0], Callable) and isinstance(expected_ndims[1], tuple):
+        func_to_wrap, expected_ndims = expected_ndims
+        return expects_ndim(*expected_ndims, allow_smaller_ndim=allow_smaller_ndim, randomness=randomness)(func_to_wrap)
+
+    expected_ndims = tuple(
+        (None if expected_arg_ndim is None else int(expected_arg_ndim)) for expected_arg_ndim in expected_ndims
+    )
+
+    def expects_ndim_decorator(fn: Callable):
+        def expects_ndim_decorated(*args):
+            # The inner class below is responsible for accumulating the dtype and device info of the tensors
+            # encountered across the arguments received by the decorated function.
+            # Such dtype and device information will be used if one of the considered arguments is given as a native
+            # scalar object (i.e. float), when converting that native scalar object to a PyTorch tensor.
+            class tensor_info:
+                # At first, we initialize the set of encountered dtype and device info as None.
+                # They will be lazily filled if we ever need such information.
+                encountered_dtypes: Optional[set] = None
+                encountered_devices: Optional[set] = None
+
+                @classmethod
+                def update(cls):
+                    # Collect and fill the dtype and device information if it is not filled yet.
+                    if (cls.encountered_dtypes is None) or (cls.encountered_devices is None):
+                        cls.encountered_dtypes = set()
+                        cls.encountered_devices = set()
+                        for expected_arg_ndim, arg in zip(expected_ndims, args):
+                            if (expected_arg_ndims is not None) and isinstance(arg, torch.Tensor):
+                                # If the argument has a declared expected ndim, and also if it is a PyTorch tensor,
+                                # then we add its dtype and device information to the sets `encountered_dtypes` and
+                                # `encountered_devices`.
+                                cls.encountered_dtypes.add(arg.dtype)
+                                cls.encountered_devices.add(arg.device)
+
+                @classmethod
+                def _get_unique_dtype(cls, error_msg: str) -> torch.dtype:
+                    # Ensure that there is only one `dtype` and return it.
+                    # If there is not exactly one dtype, then raise an error.
+                    if len(cls.encountered_dtypes) == 1:
+                        [dtype] = cls.encountered_dtypes
+                        return dtype
+                    else:
+                        raise TypeError(error_msg)
+
+                @classmethod
+                def _get_unique_device(cls, error_msg: str) -> torch.device:
+                    # Ensure that there is only one `device` and return it.
+                    # If there is not exactly one device, then raise an error.
+                    if len(cls.encountered_devices) == 1:
+                        [device] = cls.encountered_devices
+                        return device
+                    else:
+                        raise TypeError(error_msg)
+
+                @classmethod
+                def convert_scalar_to_tensor(cls, scalar: Number) -> torch.Tensor:
+                    # This class method aims to convert a scalar to a PyTorch tensor.
+                    # The dtype and device of the tensor counterpart of the scalar will be taken from the dtype and
+                    # device information of the other tensors encountered so far.
+
+                    # First, we update the dtype and device information that can be collected from the arguments.
+                    cls.update()
+
+                    # Get the device used by the tensor arguments.
+                    device = cls._get_unique_device(
+                        f"The function decorated with `expects_ndim` received the scalar argument {scalar}."
+                        f" However, this scalar argument cannot be converted to a PyTorch tensor, because it is not"
+                        " clear to which device should this scalar be moved."
+                        " This might happen when none of the other considered arguments is a tensor,"
+                        " or when there are multiple tensor arguments with conflicting devices."
+                        f" Devices encountered across all the considered arguments are: {cls.encountered_devices}."
+                        " To make this error go away, please consider making sure that other tensor arguments have a"
+                        " consistent device, or passing this scalar as a PyTorch tensor so that no conversion is"
+                        " needed."
+                    )
+
+                    if isinstance(scalar, (bool, np.bool_)):
+                        # If the given scalar argument is a boolean, we declare the dtype of its tensor counterpart as
+                        # torch.bool.
+                        dtype = torch.bool
+                    else:
+                        # If the given scalar argument is not a boolean, we declare the dtype of its tensor counterpart
+                        # as the dtype that is observed across the other arguments.
+                        dtype = cls._get_unique_dtype(
+                            f" The function decorated with `expects_ndim` received the scalar argument {scalar}."
+                            " However, this scalar argument cannot be converted to a PyTorch tensor, because it is not"
+                            " clear by which dtype should this scalar be represented in its tensor form."
+                            " This might happen when none of the other considered arguments is a tensor,"
+                            " or when there are multiple tensor arguments with different dtypes."
+                            f" dtypes encountered across all the considered arguments are {cls.encountered_dtypes}."
+                            " To make this error go away, please consider making sure that other tensor arguments have"
+                            " a consistent dtype, or passing this scalar as a PyTorch tensor so that no conversion is"
+                            " needed."
+                        )
+
+                    # Finally, using our new dtype and new device, we convert the scalar to a tensor.
+                    return torch.as_tensor(scalar, dtype=dtype, device=device)
+
+            # First, we want to make sure that each positional argument is a PyTorch tensor.
+            # So, we initialize `new_args` as an empty list, which will be filled with the tensor counterparts
+            # of the original positional arguments.
+            new_args = []
+
+            for i_arg, (expected_arg_ndims, arg) in enumerate(zip(expected_ndims, args)):
+                if (expected_arg_ndims is None) or isinstance(arg, torch.Tensor):
+                    # In this case, either the expected number of dimensions is given as None (indicating that the user
+                    # does not wish any batching nor any conversion for this argument), or the argument is already
+                    # a PyTorch tensor (so, no conversion to tensor needs to be done).
+                    # We do not have to do anything in this case.
+                    pass
+                elif isinstance(arg, (Number, np.bool_)):
+                    # If the argument is a scalar `Number`, we convert it to a PyTorch tensor, the dtype and the device
+                    # of it being determined with the help of the inner class `tensor_info`.
+                    arg = tensor_info.convert_scalar_to_tensor(arg)
+                elif isinstance(arg, np.ndarray):
+                    # If the argument is a numpy array, we convert it to a PyTorch tensor.
+                    arg = torch.as_tensor(arg)
+                else:
+                    # This is the case where an object of an unrecognized type is received. We do not know how to
+                    # process this argument, and, naively trying to convert it to a PyTorch tensor could fail, or
+                    # could generate an unexpected result. So, we raise an error.
+                    raise TypeError(f"Received an argument of unexpected type: {arg} (of type {type(arg)})")
+
+                if (expected_arg_ndims is not None) and (arg.ndim < expected_arg_ndims) and (not allow_smaller_ndim):
+                    # This is the case where the currently analyzed positional argument has less-than-expected number
+                    # of dimensions, and we are not in the allow-smaller-ndim mode. So, we raise an error.
+                    raise ValueError(
+                        f"The argument with index {i_arg} has the shape {arg.shape}, having {arg.ndim} dimensions."
+                        f" However, it was expected as a tensor with {expected_arg_ndims} dimensions."
+                    )
+
+                # At this point, we know that `arg` is a proper PyTorch tensor. So, we add it into `new_args`.
+                new_args.append(arg)
+
+            wrapped_fn = fn
+            num_args = len(new_args)
+            wrapped_ndims = [
+                (None if expected_arg_ndim is None else arg.ndim)
+                for expected_arg_ndim, arg in zip(expected_ndims, new_args)
+            ]
+
+            # The following loop will run until we know that no `vmap` is necessary.
+            while True:
+                # Within each iteration, at first, we assume that `vmap` is not necessary, and therefore, for each
+                # positional argument, the batching dimension is `None` (which means no argument will be batched).
+                needs_vmap = False
+                in_dims = [None for _ in new_args]
+
+                for i_arg in range(num_args):
+                    # For each positional argument with index `i_arg`, we check whether or not there are extra leftmost
+                    # dimensions.
+
+                    if (wrapped_ndims[i_arg] is not None) and (wrapped_ndims[i_arg] > expected_ndims[i_arg]):
+                        # This is the case where the number of dimensions associated with this positional argument is
+                        # greater than its expected number of dimensions.
+
+                        # We take note that there is at least one positional argument which requires `vmap`.
+                        needs_vmap = True
+
+                        # We declare that this argument's batching dimension is 0 (i.e. its leftmost dimension).
+                        in_dims[i_arg] = 0
+
+                        # Now that we marked the leftmost dimension of this argument as the batching dimension, we
+                        # should not consider this dimension in the next iteration of this `while` loop. So, we
+                        # decrease its number of not-yet-handled dimensions by 1.
+                        wrapped_ndims[i_arg] -= 1
+
+                if needs_vmap:
+                    # This is the case where there was at least one positional argument that needs `vmap`.
+                    # Therefore, we wrap the function via `vmap`.
+                    # Note that, after this `vmap` wrapping, if some of the positional arguments still have extra
+                    # leftmost dimensions, another level of `vmap`-wrapping will be done by the next iteration of this
+                    # `while` loop.
+                    wrapped_fn = vmap(wrapped_fn, in_dims=tuple(in_dims), randomness=randomness)
+                else:
+                    # This is the case where no positional argument with extra leftmost dimension was found.
+                    # Either the positional arguments were non-batched to begin with, or the `vmap`-wrapping of the
+                    # previous iterations of this `while` loop were sufficient. Therefore, we are now ready to quit
+                    # this loop.
+                    break
+
+            # Run the `vmap`-wrapped counterpart of the function and return its result
+            return wrapped_fn(*new_args)
+
+        return expects_ndim_decorated
+
+    return expects_ndim_decorator
+
+
+def rowwise(*args, randomness: str = "error") -> Callable:
+    """
+    Decorate a vector-expecting function to make it support batch dimensions.
+
+    To be able to decorate a function via `@rowwise`, the following conditions
+    are required to be satisfied:
+    (i) the function expects a single positional argument, which is a PyTorch
+    tensor;
+    (ii) the function is implemented with the assumption that the tensor it
+    receives is a vector (i.e. is 1-dimensional).
+
+    Let us consider the example below:
+
+    ```python
+    @rowwise
+    def f(x: torch.Tensor) -> torch.Tensor:
+        return torch.sum(x**2)
+    ```
+
+    Notice how the implementation of the function `f` assumes that its argument
+    `x` is 1-dimensional, and based on that assumption, omits the `dim`
+    keyword argument when calling `torch.sum(...)`.
+
+    Upon receiving a 1-dimensional tensor, this decorated function `f` will
+    perform its operations on the vector `x`, like how it would work without
+    the decorator `@rowwise`.
+    Upon receiving a 2-dimensional tensor, this decorated function `f` will
+    perform its operations on each row of `x`.
+    Upon receiving a tensor with 3 or more dimensions, this decorated function
+    `f` will interpret its input as a batch of matrices, and perform its
+    operations on each matrix within the batch.
+
+    **Defining fitness functions for Problem objects.**
+    The decorator `@rowwise` can be used for defining a fitness function for a
+    [Problem][evotorch.core.Problem] object. The advantage of doing so is to be
+    able to implement the fitness function with the simple assumption that the
+    input is a vector (that stores decision values for a single solution),
+    and the output is a scalar (that represents the fitness of the solution).
+    The decorator `@rowwise` also flags the decorated function (like
+    `@vectorized` does), so, the fitness function is used correctly by the
+    `Problem` instance, in a vectorized manner. See the example below:
+
+    ```python
+    @rowwise
+    def fitness(decision_values: torch.Tensor) -> torch.Tensor:
+        return torch.sqrt(torch.sum(decision_values**2))
+
+
+    my_problem = Problem("min", fitness, ...)
+    ```
+
+    In the example above, thanks to the decorator `@rowwise`, `my_problem` will
+    use `fitness` in a vectorized manner when evaluating a `SolutionBatch`,
+    even though `fitness` is defined in terms of a single solution.
+
+    **Randomness.**
+    Like in `torch.func.vmap`, the behavior of the decorated function in
+    terms of randomness can be configured via a keyword argument named
+    `randomness`:
+
+    ```python
+    @rowwise(randomness="error")
+    def f(x: torch.Tensor) -> torch.Tensor:
+        ...
+    ```
+
+    If `randomness` is set as "error", then, when there is batching, any
+    attempt to generate random data using PyTorch will raise an error.
+    If `randomness` is set as "different", then, a random generation
+    operation such as `torch.randn(...)` will produce a `BatchedTensor`,
+    where each batch item has its own re-sampled data.
+    If `randomness` is set as "same", then, a random generation operation
+    such as `torch.randn(...)` will produce a non-batched tensor containing
+    random data that is sampled only once.
+    """
+    num_args = len(args)
+
+    if num_args == 0:
+        immediately_decorate = False
+    elif num_args == 1:
+        immediately_decorate = True
+    else:
+        raise TypeError("`rowwise` received invalid number of positional arguments")
+
+    def decorator(fn: Callable) -> Callable:  # <- inner decorator
+        decorated = expects_ndim(fn, (1,), randomness=randomness)
+        decorated.__evotorch_vectorized__ = True
+        return decorated
+
+    return decorator(args[0]) if immediately_decorate else decorator
diff --git a/src/evotorch/distributions.py b/src/evotorch/distributions.py
index b5ee11a6..5af92141 100644
--- a/src/evotorch/distributions.py
+++ b/src/evotorch/distributions.py
@@ -14,11 +14,19 @@
 
 import math
 from copy import copy
-from typing import Any, Callable, Iterable, Optional, Union
+from typing import Any, Callable, Iterable, NamedTuple, Optional, Type, Union
 
 import torch
-
-from .tools import Device, DType, cast_tensors_in_container, device_of_container, dtype_of_container
+from torch.func import vmap
+
+from .tools import (
+    Device,
+    DType,
+    cast_tensors_in_container,
+    device_of_container,
+    dtype_of_container,
+    make_batched_false_for_vmap,
+)
 from .tools import multiply_rows_by_scalars as dot
 from .tools import rowwise_sum as total
 from .tools import to_torch_dtype
@@ -34,6 +42,9 @@ class Distribution(TensorMakerMixin, Serializable):
 
     MANDATORY_PARAMETERS = set()
     OPTIONAL_PARAMETERS = set()
+    PARAMETER_NDIMS = {}
+
+    functional_sample = NotImplemented
 
     def __init__(
         self, *, solution_length: int, parameters: dict, dtype: Optional[DType] = None, device: Optional[Device] = None
@@ -179,6 +190,7 @@ def sample(
         elif (num_solutions is not None) and (out is None):
             num_solutions = int(num_solutions)
             out = self.make_empty(num_solutions=num_solutions)
+            out = out + make_batched_false_for_vmap(out.device)
         elif (num_solutions is None) and (out is not None):
             if out.ndim != 2:
                 raise ValueError(
@@ -401,6 +413,50 @@ class SeparableGaussian(Distribution):
 
     MANDATORY_PARAMETERS = {"mu", "sigma"}
     OPTIONAL_PARAMETERS = {"divide_mu_grad_by", "divide_sigma_grad_by", "parenthood_ratio"}
+    PARAMETER_NDIMS = {"mu": 1, "sigma": 1}
+
+    @classmethod
+    def _unbatched_functional_sample(cls, num_solutions: int, mu: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
+        [L] = mu.shape
+        [sigma_L] = sigma.shape
+        if L != sigma_L:
+            raise ValueError(f"The lengths of `mu` ({L}) and `sigma` ({sigma_L}) do not match.")
+        mu = mu.expand(int(num_solutions), L)
+        return torch.normal(mu, sigma)
+
+    @classmethod
+    def functional_sample(cls, num_solutions: int, parameters: dict) -> torch.Tensor:
+        """
+        Sample and return separable Gaussian noise
+
+        This is a static utility method, which allows one to sample separable
+        Gaussian noise, without having to instantiate the distribution class
+        `SeparableGaussian`.
+
+        Args:
+            num_solutions: Number of solutions (or 1-dimensional tensors)
+                that will be sampled.
+            parameters: A parameter dictionary. Within this parameter
+                dictionary, the item `mu` is expected to store the mean, and
+                the item `sigma` is expected to store the standard deviation,
+                each in the form of a 1-dimensional tensor.
+        Returns:
+            Sampled separable Gaussian noise, as a PyTorch tensor.
+            If `mu` and/or `sigma` was given as tensors with 2 or more
+            dimensions (instead of only 1 dimension), the extra leftmost
+            dimensions will be interpreted as batch dimensions, and therefore,
+            this returned tensor will also have batch dimensions.
+        """
+        from .decorators import expects_ndim
+
+        for k in parameters.keys():
+            if (k not in cls.MANDATORY_PARAMETERS) and (k not in cls.OPTIONAL_PARAMETERS):
+                raise ValueError(f"{cls.__name__} encountered an unrecognized parameter: {repr(k)}")
+        mu = parameters["mu"]
+        sigma = parameters["sigma"]
+        return expects_ndim(cls._unbatched_functional_sample, (None, 1, 1), randomness="different")(
+            num_solutions, mu, sigma
+        )
 
     def __init__(
         self,
@@ -556,13 +612,93 @@ def relative_entropy(dist_0: "SeparableGaussian", dist_1: "SeparableGaussian") -
 
 
 class SymmetricSeparableGaussian(SeparableGaussian):
-    """
-    Symmetric (antithetic) separable Gaussian distribution
-    as used by PGPE.
+    r"""
+    Symmetric (antithetic) separable Gaussian distribution as used by PGPE.
+
+    For example, if the desired number of samples (or number of solutions,
+    provided via the argument `num_solutions`) is 6, 3 "directions" will
+    be sampled. Each direction is a pair of solutions, where one of the
+    solutions is the center vector plus perturbation, and the other
+    solution is the center vector minus the same perturbation. Therefore,
+    such a symmetric population of size 6 looks like this:
+
+    ```
+                                                   ___
+    solution[0]: center + sampled_perturbation[0]     \
+                                                       >  direction0
+    solution[1]: center - sampled_perturbation[1]  ___/
+
+                                                   ___
+    solution[2]: center + sampled_perturbation[2]     \
+                                                       >  direction1
+    solution[3]: center - sampled_perturbation[3]  ___/
+
+                                                   ___
+    solution[4]: center + sampled_perturbation[4]     \
+                                                       >  direction2
+    solution[5]: center - sampled_perturbation[5]  ___/
+    ```
     """
 
     MANDATORY_PARAMETERS = {"mu", "sigma"}
     OPTIONAL_PARAMETERS = {"divide_mu_grad_by", "divide_sigma_grad_by", "parenthood_ratio"}
+    PARAMETER_NDIMS = {"mu": 1, "sigma": 1}
+
+    @classmethod
+    def _unbatched_functional_sample(cls, num_solutions: int, mu: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
+        zeros = torch.zeros_like(mu)
+        num_solutions = int(num_solutions)
+        if (num_solutions % 2) != 0:
+            raise ValueError(
+                f"Number of solutions to be sampled from {cls.__name__} must be an even number."
+                f" However, the encountered `num_solutions` is {num_solutions}."
+            )
+        num_directions = num_solutions // 2
+
+        positive_ends = SeparableGaussian._unbatched_functional_sample(num_directions, zeros, sigma)
+        negative_ends = -positive_ends
+
+        positive_ends += mu
+        negative_ends += mu
+
+        combined_samples = vmap(torch.stack)([positive_ends, negative_ends]).reshape(-1, positive_ends.shape[-1])
+
+        return combined_samples
+
+    @classmethod
+    def functional_sample(cls, num_solutions: int, parameters: dict) -> torch.Tensor:
+        """
+        Sample and return symmetric separable Gaussian noise
+
+        This is a static utility method, which allows one to sample symmetric
+        separable Gaussian noise, without having to instantiate the
+        distribution class `SymmetricSeparableGaussian`.
+
+        Args:
+            num_solutions: Number of solutions (or 1-dimensional tensors)
+                that will be sampled. Note that, since this distribution is
+                symmetric, `num_solutions` must be even.
+            parameters: A parameter dictionary. Within this parameter
+                dictionary, the item `mu` is expected to store the mean, and
+                the item `sigma` is expected to store the standard deviation,
+                each in the form of a 1-dimensional tensor.
+        Returns:
+            Sampled symmetric separable Gaussian noise, as a PyTorch tensor.
+            If `mu` and/or `sigma` was given as tensors with 2 or more
+            dimensions (instead of only 1 dimension), the extra leftmost
+            dimensions will be interpreted as batch dimensions, and therefore,
+            this returned tensor will also have batch dimensions.
+        """
+        from .decorators import expects_ndim
+
+        for k in parameters.keys():
+            if (k not in cls.MANDATORY_PARAMETERS) and (k not in cls.OPTIONAL_PARAMETERS):
+                raise ValueError(f"{cls.__name__} encountered an unrecognized parameter: {repr(k)}")
+        mu = parameters["mu"]
+        sigma = parameters["sigma"]
+        return expects_ndim(cls._unbatched_functional_sample, (None, 1, 1), randomness="different")(
+            num_solutions, mu, sigma
+        )
 
     def _fill(self, out: torch.Tensor, *, generator: Optional[torch.Generator] = None):
         self.make_gaussian(out=out, center=self.mu, stdev=self.sigma, symmetric=True, generator=generator)
@@ -636,10 +772,11 @@ def _compute_gradients(
 
 
 class ExpSeparableGaussian(SeparableGaussian):
-    """exponentialseparable Multivariate Gaussian, as used by SNES"""
+    """Exponential Separable Multivariate Gaussian, as used by SNES"""
 
     MANDATORY_PARAMETERS = {"mu", "sigma"}
     OPTIONAL_PARAMETERS = set()
+    PARAMETER_NDIMS = {"mu": 1, "sigma": 1}
 
     def _compute_gradients(self, samples: torch.Tensor, weights: torch.Tensor, ranking_used: Optional[str]) -> dict:
         if ranking_used != "nes":
@@ -672,7 +809,7 @@ def update_parameters(
 
 
 class ExpGaussian(Distribution):
-    """exponential Multivariate Gaussian, as used by XNES"""
+    """Exponential Multivariate Gaussian, as used by XNES"""
 
     # Corresponding to mu and A in symbols used in xNES paper
     MANDATORY_PARAMETERS = {"mu", "sigma"}
@@ -680,6 +817,8 @@ class ExpGaussian(Distribution):
     # Inverse of sigma, numerically more stable to track this independently to sigma
     OPTIONAL_PARAMETERS = {"sigma_inv"}
 
+    PARAMETER_NDIMS = {"mu": 1, "sigma": 2, "sigma_inv": 2}
+
     def __init__(
         self,
         parameters: dict,
@@ -873,3 +1012,611 @@ def update_parameters(
 
         # Return modified distribution
         return self.modified_copy(mu=new_mu, sigma=new_A, sigma_inv=new_A_inv)
+
+
+def _get_param_ndim(distribution_class: Type, param_name: str) -> int:
+    return distribution_class.PARAMETER_NDIMS.get(param_name, None)
+
+
+class FunctionalSampler:
+    """
+    Represents a sampler returned by `make_functional_sampler`.
+
+    Please see the documentation of
+    [make_functional_sampler][evotorch.distributions.make_functional_sampler].
+    """
+
+    def __init__(
+        self, distribution_class: Type, *, required_parameters: Iterable, fixed_parameters: Optional[dict] = None
+    ):
+        from .decorators import expects_ndim
+
+        self.__distribution_class = distribution_class
+        self.__required_parameters = [str(element) for element in required_parameters]
+        if len(self.__required_parameters) == 0:
+            raise TypeError("`required_parameters` cannot be empty")
+        self.__required_param_pos = {
+            required_parameter: i_parameter for i_parameter, required_parameter in enumerate(self.__required_parameters)
+        }
+        self.__num_required_parameters = len(self.__required_parameters)
+
+        self.__fixed_parameters = {} if fixed_parameters is None else fixed_parameters
+        self.__sample_batch = expects_ndim(
+            self.__sample,
+            (None,) + tuple(_get_param_ndim(distribution_class, p) for p in self.__required_parameters),
+            randomness="different",
+        )
+
+    def __sample(self, num_solutions: int, *parameters) -> torch.Tensor:
+        parameters = {**dict(zip(self.__required_parameters, parameters)), **(self.__fixed_parameters)}
+        if self.__distribution_class.functional_sample is NotImplemented:
+            distribution = self.__distribution_class(parameters)
+            return distribution.sample(num_solutions)
+        else:
+            return self.__distribution_class.functional_sample(num_solutions, parameters)
+
+    def __call__(self, num_samples: int, *parameter_args, **parameter_kwargs) -> torch.Tensor:
+        num_parameter_args = len(parameter_args)
+        num_parameter_kwargs = len(parameter_kwargs)
+        if (num_parameter_args == 0) and (num_parameter_kwargs == self.__num_required_parameters):
+            parameters = [None] * self.__num_required_parameters
+            for parameter_name, parameter_value in parameter_kwargs.items():
+                parameter_pos = self.__required_param_pos[parameter_name]
+                parameters[parameter_pos] = parameter_value
+        elif (num_parameter_args == self.__num_required_parameters) and (num_parameter_kwargs == 0):
+            parameters = parameter_args
+        elif (num_parameter_args == 0) and (num_parameter_kwargs == 0):
+            raise TypeError("Missing parameter arguments")
+        elif (num_parameter_args > 0) and (num_parameter_kwargs > 0):
+            raise TypeError(
+                "Specifying some of the distribution parameters via positional arguments and some others"
+                " via keyword arguments is not supported."
+                " Please provide the distribution parameters only via positional arguments"
+                " or only via keyword arguments."
+            )
+        else:
+            raise TypeError("Invalid number of arguments")
+        return self.__sample_batch(num_samples, *parameters)
+
+
+def make_functional_sampler(
+    distribution_class, *, required_parameters: Iterable, fixed_parameters: Optional[dict] = None
+) -> Callable:
+    """
+    Make a stateless function that samples from a distribution.
+
+    This function is meant to be used when one wants to follow the functional
+    programming paradigm.
+
+    As an example, let us imagine that we are interested in sampling from the
+    distribution `SymmetricSeparableGaussian`. A sampler function out of this
+    distribution can be created like this:
+
+    ```python
+    from evotorch.distributions import SymmetricSeparableGaussian, make_functional_sampler
+
+    get_samples = make_functional_sampler(
+        SymmetricSeparableGaussian,
+        required_parameters=["mu", "sigma"],
+    )
+    ```
+
+    Now we have a function `get_samples()`, which can be used like this:
+
+    ```python
+    number_of_samples = ...  # an integer representing the desired number of samples
+    mu = ...  # a one-dimensional tensor
+    sigma = ...  # a one-dimensional tensor
+
+    my_samples = get_samples(number_of_samples, mu, sigma)
+    ```
+
+    Alternatively, the parameters of the distribution can be specified via
+    keyword arguments, like this:
+
+    ```python
+    my_samples = get_samples(number_of_samples, mu=mu, sigma=sigma)
+    ```
+
+    **Batched sampling.**
+    A functional sampler can be further transformed via `torch.func.vmap(...)`
+    for creating batched samples.
+
+    As an alternative to `vmap`, a functional sampler has built-in support for
+    batched sampling (which actually still uses `vmap` internally).
+    Let us again consider our example sampler `get_samples`.
+    In this example, the parameters `mu` and `sigma` would be 1-dimensional
+    tensors in the non-batched case (because the distribution
+    `SymmetricSeparableGaussian` expects the parameters `mu` and `sigma` as
+    1-dimensional tensors, as can be observed from the class attribute
+    `SymmetricSeparableGaussian.PARAMETER_NDIMS`).
+    If `get_samples` is given `mu` and/or `sigma` with more than 1 dimensions,
+    those extra leftmost dimensions will be considered as batch dimensions,
+    and therefore the resulting sample tensor will have extra leftmost
+    dimensions too.
+
+    **Declaring fixed parameters.**
+    A functional sampler can be created in such a way that some of the
+    parameters are pre-defined and only a subset of the mandatory parameters
+    are expected via arguments. For example, a sampler that samples from
+    `SymmetricSeparableGaussian` with a fixed sigma can be defined like this:
+
+    ```python
+    predefined_sigma = ...  # The constant sigma goes here
+
+    get_samples2 = make_functional_sampler(
+        SymmetricSeparableGaussian,
+        required_parameters=["mu"],
+        fixed_parameters={"sigma": predefined_sigma},
+    )
+    ```
+
+    The function `get_samples2` can be called like this:
+
+    ```python
+    number_of_samples = ...  # an integer, representing the desired number of samples
+    mu = ...  # a one-dimensional tensor
+
+    # or like this:
+    # my_samples2 = get_samples2(number_of_samples, mu=mu)
+    ```
+
+    **How the functional sampler uses its wrapped distribution.**
+    If the wrapped distribution class has a static method with the signature
+    `functional_sample(num_solutions: int, parameters: dict) -> torch.Tensor`
+    (which expects `num_solutions` as the number of solutions/samples
+    and `parameters` as the parameter dictionary), this functional sampler
+    will use that static method to obtain and return the samples.
+    On the other hand, if the wrapped distribution class declares its class
+    attribute `functional_sample = NotImplemented`, then, the wrapped
+    distribution class will be temporarily instantiated, and then, the
+    `sample()` method of this instance will be used to generate and return
+    the samples.
+
+    Args:
+        distribution_class: A class that inherits from the base class
+            [Distribution][evotorch.distributions.Distribution].
+        required_parameters: A list of strings, each string being the name
+            of a distribution parameter. The order of this list determines
+            the order of parameter-related positional arguments in the
+            returned callable object.
+        fixed_parameters: A dictionary where the keys are parameter names
+            (as strings), and the values are pre-defined parameter values.
+    Returns:
+        A callable object whose function is to return samples from the
+        specified distribution.
+    """
+    return FunctionalSampler(
+        distribution_class, required_parameters=required_parameters, fixed_parameters=fixed_parameters
+    )
+
+
+class GradsWithSamplesAndFitnesses(NamedTuple):
+    grads: torch.Tensor
+    samples: torch.Tensor
+    fitnesses: torch.Tensor
+
+
+class GradsWithSamples(NamedTuple):
+    grads: torch.Tensor
+    samples: torch.Tensor
+
+
+class GradsWithFitnesses(NamedTuple):
+    grads: torch.Tensor
+    fitnesses: torch.Tensor
+
+
+class FunctionalGradEstimator:
+    """
+    Represents the callable object returned by `make_functional_grad_estimator`.
+
+    Please see the documentation of
+    [make_functional_grad_estimator][evotorch.distributions.make_functional_grad_estimator]
+    """
+
+    def __init__(
+        self,
+        distribution_class: Type,
+        *,
+        function: Optional[Callable] = None,
+        objective_sense: str,
+        required_parameters: Iterable,
+        fixed_parameters: Optional[dict] = None,
+        ranking_method: Optional[str] = None,
+        return_samples: bool = False,
+        return_fitnesses: bool = False,
+    ):
+        from .decorators import expects_ndim
+
+        self.__function = function
+        self.__objective_sense = None if objective_sense is None else str(objective_sense)
+        self.__distribution_class = distribution_class
+        self.__required_parameters = [str(element) for element in required_parameters]
+        if len(self.__required_parameters) == 0:
+            raise TypeError("`required_parameters` cannot be empty")
+        self.__required_param_pos = {
+            required_parameter: i_parameter for i_parameter, required_parameter in enumerate(self.__required_parameters)
+        }
+        self.__num_required_parameters = len(self.__required_parameters)
+
+        self.__fixed_parameters = {} if fixed_parameters is None else fixed_parameters
+        self.__return_samples = bool(return_samples)
+        self.__return_fitnesses = bool(return_fitnesses)
+        self.__ranking_method = None if ranking_method is None else str(ranking_method)
+
+        if self.__function is None:
+            leftmost_ndims = (None, None, 2, 1)
+        else:
+            leftmost_ndims = (
+                None,
+                None,
+                None,
+            )
+
+        self.__grad_batch = expects_ndim(
+            self.__grad,
+            leftmost_ndims + tuple(_get_param_ndim(distribution_class, p) for p in self.__required_parameters),
+            randomness="different",
+        )
+
+    def __grad(self, *args) -> torch.Tensor:
+        objective_sense = args[0]
+        ranking_method = args[1]
+
+        if self.__function is None:
+            samples = args[2]
+            fitnesses = args[3]
+            vectors = args[4:]
+            [num_solutions, _] = samples.shape
+            [num_fitnesses] = fitnesses.shape
+            if num_solutions != num_fitnesses:
+                raise ValueError("The length of the fitness vector does not match the number of samples")
+        else:
+            num_solutions = args[2]
+            vectors = args[3:]
+            samples = None
+            fitnesses = None
+
+        parameters = {**dict(zip(self.__required_parameters, vectors)), **(self.__fixed_parameters)}
+        distribution = self.__distribution_class(parameters)
+
+        if samples is None:
+            samples = distribution.sample(num_solutions)
+            fitnesses = self.__function(samples)
+
+        grads = distribution.compute_gradients(
+            samples, fitnesses, objective_sense=objective_sense, ranking_method=ranking_method
+        )
+
+        if self.__return_samples and self.__return_fitnesses:
+            return GradsWithSamplesAndFitnesses(grads=grads, samples=samples, fitnesses=fitnesses)
+        elif self.__return_samples:
+            return GradsWithSamples(grads=grads, samples=samples)
+        elif self.__return_fitnesses:
+            return GradsWithFitnesses(grads=grads, fitnesses=fitnesses)
+        else:
+            return grads
+
+    def __call__(self, *args, **parameter_kwargs) -> torch.Tensor:
+        parameters_need_filtering = False
+        if "objective_sense" in parameter_kwargs:
+            objective_sense = parameter_kwargs["objective_sense"]
+            parameters_need_filtering = True
+        else:
+            objective_sense = self.__objective_sense
+            if self.__objective_sense is None:
+                raise ValueError(
+                    "The gradient estimator was not given an `objective_sense`, neither at its phase of initialization,"
+                    " nor when it got called."
+                )
+
+        if "ranking_method" in parameter_kwargs:
+            ranking_method = parameter_kwargs["ranking_method"]
+            parameters_need_filtering = True
+        else:
+            ranking_method = self.__ranking_method
+
+        if parameters_need_filtering:
+            parameter_kwargs = {
+                k: v for k, v in parameter_kwargs.items() if k not in ("objective_sense", "ranking_method")
+            }
+
+        if self.__function is None:
+            samples = args[0]
+            fitnesses = args[1]
+            num_solutions = None
+            parameter_args = args[2:]
+        else:
+            samples = None
+            fitnesses = None
+            num_solutions = args[0]
+            parameter_args = args[1:]
+
+        num_parameter_args = len(parameter_args)
+        num_parameter_kwargs = len(parameter_kwargs)
+        if (num_parameter_args == 0) and (num_parameter_kwargs == self.__num_required_parameters):
+            parameters = [None] * self.__num_required_parameters
+            for parameter_name, parameter_value in parameter_kwargs.items():
+                parameter_pos = self.__required_param_pos[parameter_name]
+                parameters[parameter_pos] = parameter_value
+        elif (num_parameter_args == self.__num_required_parameters) and (num_parameter_kwargs == 0):
+            parameters = parameter_args
+        elif (num_parameter_args == 0) and (num_parameter_kwargs == 0):
+            raise TypeError("Missing parameter arguments")
+        elif (num_parameter_args > 0) and (num_parameter_kwargs > 0):
+            raise TypeError(
+                "Specifying some of the distribution parameters via positional arguments and some others"
+                " via keyword arguments is not supported."
+                " Please provide the distribution parameters only via positional arguments"
+                " or only via keyword arguments."
+            )
+        else:
+            raise TypeError("Invalid number of arguments")
+
+        if self.__function is None:
+            return self.__grad_batch(objective_sense, ranking_method, samples, fitnesses, *parameters)
+        else:
+            return self.__grad_batch(objective_sense, ranking_method, num_solutions, *parameters)
+
+
+def make_functional_grad_estimator(
+    distribution_class: Type,
+    *,
+    required_parameters: Iterable,
+    function: Optional[Callable] = None,
+    objective_sense: Optional[str] = None,
+    fixed_parameters: Optional[dict] = None,
+    ranking_method: Optional[str] = None,
+    return_samples: bool = False,
+    return_fitnesses: bool = False,
+) -> Callable:
+    """
+    Make a stateless gradient estimator function.
+
+    The returned function estimates gradients for the parameters of the
+    specified distribution, either with the help of a fitness function,
+    or with the help of a pair of tensors representing the samples
+    (or solutions) and their associated fitnesses.
+
+    **Usage 1: with the help of a fitness function.**
+    Let us assume that we have a fitness function `f`, which receives a
+    matrix (i.e. 2-dimensional tensor) and returns a vector (i.e. a
+    1-dimensional tensor), where each the i-th row of the matrix represents
+    the i-th solution, and i-th element of the returned vector represents
+    the fitness of the i-th solution.
+    A functional gradient estimator for this function can be created like
+    this:
+
+    ```python
+    from evotorch.distributions import (
+        SymmetricSeparableGaussian,
+        make_functional_grad_estimator,
+    )
+
+
+    def f(x: torch.Tensor) -> torch.Tensor:
+        ...
+
+
+    fgrad = make_functional_grad_estimator(
+        # The gradient estimator will use this distribution:
+        SymmetricSeparableGaussian,
+        # The gradient estimator will be bound to this fitness function:
+        function=f,
+        # We want to maximize the fitnesses returned by `f`
+        # (use "min" for minimizing them)
+        objective_sense="max",
+        # The distribution parameters "mu" and "sigma" are to be passed
+        # as arguments every time we call it as a function.
+        required_parameters=["mu", "sigma"],
+        # The fitnesses will be ranked according to this method:
+        ranking_method="centered",  # the default is "raw"
+    )
+    ```
+
+    Now that we have our gradient estimator `fgrad`, we can use it as a
+    function:
+
+    ```python
+    current_mu = ...  # mu parameter vector
+    current_sigma = ...  # sigma parameter vector
+    num_samples = ...  # number of samples (temporary population size)
+
+    gradients = fgrad(num_samples, current_mu, current_sigma)
+    # or, alternatively:
+    # gradients = fgrad(num_samples, mu=current_mu, sigma=current_sigma)
+
+    # At this point, we have our `gradients`, which is in the form of a
+    # dictionary. Gradients for the parameters mu and sigma can be obtained
+    # from this dictionary like this:
+    grad_for_mu = gradients["mu"]
+    grad_for_sigma = gradients["sigma"]
+    ```
+
+    **Usage 2: without an explicit fitness function.**
+    Let us imagine a scenario where the procedure of computing the fitnesses
+    is not so straightforward and therefore it is not possible to wrap it
+    within a single fitness function. For such cases, we can create and use a
+    gradient estimator that is not bound to any such fitness function:
+
+    ```python
+    from evotorch.distributions import (
+        SymmetricSeparableGaussian,
+        make_functional_sampler,
+        make_functional_grad_estimator,
+    )
+
+    estimate_grads = make_functional_grad_estimator(
+        # The gradient estimator will use this distribution:
+        SymmetricSeparableGaussian,
+        # We want to maximize the fitnesses (use "min" for minimizing them)
+        objective_sense="max",
+        # The distribution parameters "mu" and "sigma" are to be passed
+        # as arguments every time we call it as a function.
+        required_parameters=["mu", "sigma"],
+        # The fitnesses will be ranked according to this method:
+        ranking_method="centered",  # the default is "raw"
+    )
+    ```
+
+    Note that without being bound to any fitness function, `estimate_grad`
+    will ask us to provide the samples and the fitnesses. A practical way
+    of obtaining such samples is to have a functional sampler:
+
+    ```python
+    get_samples = make_functional_sampler(
+        SymmetricSeparableGaussian,
+        required_parameters=["mu", "sigma"],
+    )
+    ```
+
+    Now we are ready to use our sampler and our estimator:
+
+    ```python
+    current_mu = ...  # mu parameter vector
+    current_sigma = ...  # sigma parameter vector
+    num_samples = ...  # number of samples (temporary population size)
+
+    samples = get_samples(num_samples, current_mu, current_sigma)
+    # or, alternatively:
+    # samples = get_samples(num_samples, mu=current_mu, sigma=current_sigma)
+
+    fitnesses = ...  # code to compute fitnesses from the samples goes here
+
+    gradients = estimate_grads(samples, fitnesses, current_mu, current_sigma)
+    # or, alternatively:
+    # gradients = estimate_grads(
+    #     samples, fitnesses, mu=current_mu, sigma=current_sigma
+    # )
+
+    # At this point, we have our `gradients`, which is in the form of a
+    # dictionary. Gradients for the parameters mu and sigma can be obtained
+    # from this dictionary like this:
+    grad_for_mu = gradients["mu"]
+    grad_for_sigma = gradients["sigma"]
+    ```
+
+    **Batched gradient estimation.**
+    The function returned by `make_functional_grad_estimator` is compatible
+    with `vmap`. If the estimator is bound to a specific fitness function,
+    that fitness function should also be compatible with `vmap`. If the
+    fitness function is not `vmap`-compatible, or if its behavior is
+    unexpected in the presence of `vmap`, then, consider instantiating
+    the gradient estimator without binding it to a fitness function.
+
+    As an alternative to `vmap`, a functional sampler has built-in support for
+    batched sampling (which actually still uses `vmap` internally).
+    Let us again consider our example estimator, `estimate_grads`.
+    In this example, the parameters `current_mu` and `current_sigma` would be
+    1-dimensional tensors in the non-batched case (because the distribution
+    `SymmetricSeparableGaussian` expects the parameters `mu` and `sigma` as
+    1-dimensional tensors, as can be observed from the class attribute
+    `SymmetricSeparableGaussian.PARAMETER_NDIMS`).
+    If `estimate_grads` is given `mu` and/or `sigma` with more than 1
+    dimensions, those extra leftmost dimensions will be considered as batch
+    dimensions, and therefore the resulting sample tensor will have extra
+    leftmost dimensions too.
+
+    **Declaring fixed parameters.**
+    A functional gradient estimator can be created in such a way that some of
+    the parameters are pre-defined and only a subset of the mandatory parameters
+    are expected via arguments. For example, a gradient estimator that samples
+    from `SymmetricSeparableGaussian` with a fixed sigma can be defined like
+    this:
+
+    ```python
+    predefined_sigma = ...  # The constant sigma goes here
+
+    fgrad2 = make_functional_sampler(
+        SymmetricSeparableGaussian,
+        function=f,
+        objective_sense="max",
+        required_parameters=["mu"],
+        fixed_parameters={"sigma": predefined_sigma},
+        ranking_method="centered",
+    )
+    ```
+
+    The function `fgrad2` can be called like this:
+
+    ```python
+    gradients = fgrad2(num_samples, current_mu)
+    # or, alternatively:
+    # gradients = fgrad2(num_samples, mu=current_mu)
+    ```
+
+    **Specifying `objective_sense` and/or `ranking_method` later.**
+    One can omit `objective_sense` and `ranking_method` while making the
+    functional gradient estimator, and specify them later at the moment of
+    estimation. For example:
+
+    ```python
+    fgrad3 = make_functional_sampler(
+        SymmetricSeparableGaussian,
+        function=f,
+        required_parameters=["mu", "sigma"],
+        # Notice: `objective_sense` and `ranking_method` are omitted
+    )
+
+    ...
+
+    mu = ...
+    sigma = ...
+
+    gradients = fgrad3(
+        num_samples,
+        mu=mu,
+        sigma=sigma,
+        objective_sense="max",
+        ranking_method="centered",
+    )
+    ```
+
+    Args:
+        function: The fitness function that will be called for estimating
+            the gradients. If provided, the first positional argument of the
+            returned gradient estimator will be the number of solutions.
+            If omitted, the first and second positional arguments of the
+            returned gradient estimator will be the samples (solutions)
+            and fitnesses.
+            Please note that this `function` is expected to receive a
+            2-dimensional tensor (representing the population, where each
+            row of the 2-dimensional tensor is a solution) and is expected
+            to return a 1-dimensional tensor (where each element is a
+            scalar fitness).
+            For batching and/or `vmap` to work, this `function` itself should
+            be `vmap`-compatible.
+        objective_sense: Specify this as "max" if a higher fitness value means
+            better solution. Specify this as "min" if a higher fitness value
+            means worse solution. Please note that, if `objective_sense` is not
+            provided at the moment of its making, one will have to specify it
+            later every time the estimator is called.
+        required_parameters: A list of strings, each string being the name
+            of a distribution parameter. The order of this list determines
+            the order of parameter-related positional arguments in the
+            returned callable object.
+        fixed_parameters: A dictionary where the keys are parameter names
+            (as strings), and the values are pre-defined parameter values.
+        ranking_method: Give a string here if you would like the fitnesses
+            to be ranked first. Possible values are "centered", "linear",
+            "raw".
+        return_samples: Set this as True if you would like the gradient
+            estimator to return not just the gradients, but also the samples
+            (solutions) that were used for estimating the gradients.
+        return_fitnesses: Set this as True if you would like the gradient
+            estimator to return not just the gradients, but also the fitnesses
+            that were used for estimating the gradients.
+    Returns:
+        A callable object whose function is to estimate gradients.
+    """
+    return FunctionalGradEstimator(
+        distribution_class,
+        function=function,
+        objective_sense=objective_sense,
+        required_parameters=required_parameters,
+        fixed_parameters=fixed_parameters,
+        ranking_method=ranking_method,
+        return_samples=return_samples,
+        return_fitnesses=return_fitnesses,
+    )
diff --git a/src/evotorch/tools/__init__.py b/src/evotorch/tools/__init__.py
index 165cfb6b..38568724 100644
--- a/src/evotorch/tools/__init__.py
+++ b/src/evotorch/tools/__init__.py
@@ -18,6 +18,8 @@
 
 
 __all__ = (
+    "BatchableScalar",
+    "BatchableVector",
     "DType",
     "DTypeAndDevice",
     "Device",
@@ -36,6 +38,7 @@
     "clip_tensor",
     "clone",
     "cloning",
+    "constraints",
     "device_of",
     "device_of_container",
     "dtype_of",
@@ -57,7 +60,9 @@
     "is_real_vector",
     "is_sequence",
     "is_tensor_on_cpu",
+    "log_barrier",
     "make_I",
+    "make_batched_false_for_vmap",
     "make_empty",
     "make_gaussian",
     "make_nan",
@@ -67,10 +72,12 @@
     "make_uniform",
     "make_zeros",
     "modify_tensor",
+    "modify_vector",
     "multiply_rows_by_scalars",
     "mutable_copy",
     "numpy_copy",
     "pass_info_if_needed",
+    "penalty",
     "rank",
     "read_only_tensor",
     "recursiveprintable",
@@ -82,11 +89,13 @@
     "to_numpy_dtype",
     "to_stdev_init",
     "to_torch_dtype",
+    "violation",
 )
 
 
 from . import (
     cloning,
+    constraints,
     hook,
     immutable,
     objectarray,
@@ -96,9 +105,12 @@
     structures,
     tensormaker,
 )
+from .constraints import log_barrier, penalty, violation
 from .hook import Hook
 from .immutable import as_immutable, mutable_copy
 from .misc import (
+    BatchableScalar,
+    BatchableVector,
     Device,
     DType,
     DTypeAndDevice,
@@ -132,6 +144,7 @@
     is_real_vector,
     is_sequence,
     is_tensor_on_cpu,
+    make_batched_false_for_vmap,
     make_empty,
     make_gaussian,
     make_I,
@@ -143,6 +156,7 @@
     make_zeros,
     message_from,
     modify_tensor,
+    modify_vector,
     multiply_rows_by_scalars,
     numpy_copy,
     pass_info_if_needed,
diff --git a/src/evotorch/tools/constraints.py b/src/evotorch/tools/constraints.py
new file mode 100644
index 00000000..47061863
--- /dev/null
+++ b/src/evotorch/tools/constraints.py
@@ -0,0 +1,281 @@
+from typing import Optional, Union
+
+import torch
+
+
+def _violation(lhs: torch.Tensor, comparison: str, rhs: torch.Tensor) -> torch.Tensor:
+    from torch.nn.functional import relu
+
+    if comparison == ">=":
+        return relu(rhs - lhs)
+    elif comparison == "<=":
+        return relu(lhs - rhs)
+    elif comparison == "==":
+        return torch.abs(lhs - rhs)
+    else:
+        raise ValueError(
+            f"Unrecognized comparison operator: {repr(comparison)}."
+            " Supported comparison operators are: '>=', '<=', '=='"
+        )
+
+
+def violation(
+    lhs: Union[float, torch.Tensor],
+    comparison: str,
+    rhs: Union[float, torch.Tensor],
+) -> torch.Tensor:
+    """
+    Get the amount of constraint violation.
+
+    Args:
+        lhs: The left-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            left-hand-side values.
+        comparison: The operator used for comparing the left-hand-side and the
+            right-hand-side. Expected as a string. Acceptable values are:
+            '<=', '==', '>='.
+        rhs: The right-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            right-hand-side values.
+    Returns:
+        The amount of violation of the constraint. A value of 0 means that
+        the constraint is not violated at all. The returned violation amount(s)
+        are always non-negative.
+    """
+    from ..decorators import expects_ndim
+
+    return expects_ndim(_violation, (0, None, 0))(lhs, comparison, rhs)
+
+
+def _log_barrier(
+    lhs: torch.Tensor,
+    comparison: str,
+    rhs: torch.Tensor,
+    sharpness: torch.Tensor,
+    penalty_sign: str,
+    inf: torch.Tensor,
+) -> torch.Tensor:
+    from torch.nn.functional import relu
+
+    if comparison == ">=":
+        log_input = relu(lhs - rhs)
+    elif comparison == "<=":
+        log_input = relu(rhs - lhs)
+    else:
+        raise ValueError(
+            f"Unrecognized comparison operator: {repr(comparison)}. Supported comparison operators are: '>=', '<='"
+        )
+
+    log_input = log_input
+
+    result = torch.log(log_input) / sharpness
+
+    inf = -inf
+    result = torch.where(result < inf, inf, result)
+
+    if penalty_sign == "-":
+        pass  # nothing to do
+    elif penalty_sign == "+":
+        result = -result
+    else:
+        raise ValueError(f"Unrecognized penalty sign: {repr(penalty_sign)}. Supported penalty signs are: '+', '-'")
+
+    return result
+
+
+def log_barrier(
+    lhs: Union[float, torch.Tensor],
+    comparison: str,
+    rhs: Union[float, torch.Tensor],
+    *,
+    penalty_sign: str,
+    sharpness: Union[float, torch.Tensor] = 1.0,
+    inf: Optional[Union[float, torch.Tensor]] = None,
+) -> torch.Tensor:
+    """
+    Return a penalty based on how close the constraint is to being violated.
+
+    If the left-hand-side is equal to the right-hand-side, or if the constraint
+    is violated, the returned penalty will be infinite (`+inf` or `-inf`,
+    depending on `penalty_sign`). Such `inf` values can result in numerical
+    instabilities. To overcome such instabilities, you might want to set the
+    keyword argument `inf` as a large-enough finite positive quantity `M`, so
+    that very large (or infinite) penalties will be clipped down to `M`.
+
+    Args:
+        lhs: The left-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            left-hand-side values.
+        comparison: The operator used for comparing the left-hand-side and the
+            right-hand-side. Expected as a string. Acceptable values are:
+            '<=', '>='.
+        rhs: The right-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            right-hand-side values.
+        penalty_sign: Expected as string, either as '+' or '-', which
+            determines the sign of the penalty (i.e. determines if the penalty
+            will be positive or negative). One should consider the objective
+            sense of the fitness function at hand for deciding `penalty_sign`.
+            For example, if a fitness function is written from the perspective
+            of maximization, the penalties should be negative, and therefore,
+            `penalty_sign` must be given as '-'.
+        sharpness: The logarithmic penalty will be divided by this number.
+            By default, this value is 1. A sharper log-penalization allows
+            the constraint to get closer to its boundary, and then makes
+            a more sudden jump towards infinity.
+        inf: When concerned about the possible numerical instabilities caused
+            by infinite penalties, one can specify a finite large-enough
+            positive quantity `M` through this argument. As a result,
+            infinite penalties will be clipped down to the finite `M`.
+            One might also think of this as temporarily replacing `inf` with
+            `M` while computing the log-penalties.
+    Returns:
+        Log-penalty amount(s), whose sign(s) is/are determined by
+        `penalty_sign`.
+    """
+    from ..decorators import expects_ndim
+
+    if inf is None:
+        inf = float("inf")
+
+    return expects_ndim(_log_barrier, (0, None, 0, 0, None, 0))(lhs, comparison, rhs, sharpness, penalty_sign, inf)
+
+
+def _penalty(
+    lhs: torch.Tensor,
+    comparison: str,
+    rhs: torch.Tensor,
+    penalty_sign: str,
+    linear: torch.Tensor,
+    step: torch.Tensor,
+    exp: torch.Tensor,
+    exp_inf: torch.Tensor,
+) -> torch.Tensor:
+    # Get the amount of violation that exists on the constraint
+    violation_amount = _violation(lhs, comparison, rhs)
+
+    # Get the constants 0 and 1 on the correct device, with the correct dtype
+    zero = torch.zeros_like(violation_amount)
+    one = torch.zeros_like(violation_amount)
+
+    # Initialize the penalty as the amount of violation times the `linear` multiplier
+    penalty = linear * violation_amount
+
+    # Increase the penalty by the constant `step` if there is violation
+    penalty = penalty + torch.where(violation_amount > zero, step, zero)
+
+    # Check if `exp` is given. `nan` means that `exp` is omitted, in which case no penalty will be applied.
+    exp_given = ~(torch.isnan(exp))
+
+    # Compute an exponential penalty if the argument `exp` is given
+    exped_penalty = violation_amount ** torch.where(exp_given, exp, one)
+
+    # Clip the exponential penalty so that it does not exceed `exp_inf`
+    exped_penalty = torch.where(exped_penalty > exp_inf, exp_inf, exped_penalty)
+
+    # If the argument `exp` is given, add the exponential penalty onto the main penalty
+    penalty = penalty + torch.where(exp_given, exped_penalty, zero)
+
+    # Apply the correct sign on the penalty
+    if penalty_sign == "+":
+        pass  # nothing to do
+    elif penalty_sign == "-":
+        penalty = -penalty
+    else:
+        raise ValueError(f"Unrecognized penalty sign: {repr(penalty_sign)}." "Supported penalty signs are: '+', '-'")
+
+    # Finally, the accumulated penalty is returned
+    return penalty
+
+
+def penalty(
+    lhs: Union[float, torch.Tensor],
+    comparison: str,
+    rhs: Union[float, torch.Tensor],
+    *,
+    penalty_sign: str,
+    linear: Optional[Union[float, torch.Tensor]] = None,
+    step: Optional[Union[float, torch.Tensor]] = None,
+    exp: Optional[Union[float, torch.Tensor]] = None,
+    exp_inf: Optional[Union[float, torch.Tensor]] = None,
+) -> torch.Tensor:
+    """
+    Return a penalty based on the amount of violation of the constraint.
+
+    Depending on the provided arguments, the penalty can be linear,
+    or exponential, or based on step function, or a combination of these.
+
+    Args:
+        lhs: The left-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            left-hand-side values.
+        comparison: The operator used for comparing the left-hand-side and the
+            right-hand-side. Expected as a string. Acceptable values are:
+            '<=', '==', '>='.
+        rhs: The right-hand-side of the constraint. In the non-batched case,
+            this is expected as a scalar. If it is given as an n-dimensional
+            tensor where n is at least 1, this is considered as a batch of
+            right-hand-side values.
+        penalty_sign: Expected as string, either as '+' or '-', which
+            determines the sign of the penalty (i.e. determines if the penalty
+            will be positive or negative). One should consider the objective
+            sense of the fitness function at hand for deciding `penalty_sign`.
+            For example, if a fitness function is written from the perspective
+            of maximization, the penalties should be negative, and therefore,
+            `penalty_sign` must be given as '-'.
+        linear: Multiplier for the linear component of the penalization.
+            If omitted (i.e. left as None), the value of this multiplier will
+            be 0 (meaning that there will not be any linear penalization).
+            In the non-batched case, this argument is expected as a scalar.
+            If this is provided as a tensor 1 or more dimensions, those
+            dimensions will be considered as batch dimensions.
+        step: The constant amount that will be added onto the penalty if there
+            is a violation. If omitted (i.e. left as None), this value is 0.
+            In the non-batched case, this argument is expected as a scalar.
+            If this is provided as a tensor 1 or more dimensions, those
+            dimensions will be considered as batch dimensions.
+        exp: A constant `p` that will enable exponential penalization in the
+            form `amount_of_violation ** p`. If this is left as None or is
+            given as `nan`, there will be no exponential penalization.
+            In the non-batched case, this argument is expected as a scalar.
+            If this is provided as a tensor 1 or more dimensions, those
+            dimensions will be considered as batch dimensions.
+        exp_inf: Upper bound for exponential penalty values. If exponential
+            penalty is enabled but `exp_inf` is omitted (i.e. left as None),
+            the exponential penalties can jump to very large values or to
+            infinity, potentially causing numerical instabilities. To avoid
+            such numerical instabilities, one might provide a large-enough
+            positive constant `M` via the argument `exp_inf`. When such a value
+            is given, exponential penalties will not be allowed to exceed `M`.
+            One might also think of this as temporarily replacing `inf` with
+            `M` while computing the exponential penalties.
+    Returns:
+        The penalty amount(s), whose sign(s) is/are determined by
+        `sign_penalty`.
+    """
+    from ..decorators import expects_ndim
+
+    if linear is None:
+        linear = 0.0
+    if step is None:
+        step = 0.0
+    if exp is None:
+        exp = float("nan")
+    if exp_inf is None:
+        exp_inf = float("inf")
+
+    return expects_ndim(_penalty, (0, None, 0, None, 0, 0, 0, 0))(
+        lhs,
+        comparison,
+        rhs,
+        penalty_sign,
+        linear,
+        step,
+        exp,
+        exp_inf,
+    )
diff --git a/src/evotorch/tools/misc.py b/src/evotorch/tools/misc.py
index edfbb1ba..d219d383 100644
--- a/src/evotorch/tools/misc.py
+++ b/src/evotorch/tools/misc.py
@@ -33,6 +33,8 @@
 Size = Union[int, torch.Size]
 RealOrVector = Union[float, Iterable[float], torch.Tensor]
 Vector = Union[Iterable[float], torch.Tensor]
+BatchableScalar = Union[Number, np.ndarray, torch.Tensor]
+BatchableVector = Union[torch.Tensor, np.ndarray]
 
 
 try:
@@ -814,6 +816,98 @@ def convert_to_tensor(x, tensorname: str):
             return result
 
 
+def _modify_vector_using_bounds(
+    target: torch.Tensor,
+    lb: torch.Tensor,
+    ub: torch.Tensor,
+) -> torch.Tensor:
+    # Strictly expect `target` as a 1-dimensional tensor, and get its length
+    [target_length] = target.shape
+
+    # Ensure that `lb` and `ub` are 1-dimensional tensors, and get their lengths
+    if lb.ndim == 0:
+        lb = lb.expand(target.shape)
+    if ub.ndim == 0:
+        ub = ub.expand(target.shape)
+    [lb_length] = lb.shape
+    [ub_length] = ub.shape
+
+    # Verify that the lengths of the vectors match
+    if target_length != lb_length:
+        raise ValueError("The lower bound (`lb`) has a different length than the given `target`")
+    if target_length != ub_length:
+        raise ValueError("The upper bound (`ub`) has a different length than the given `target`")
+
+    return torch.min(torch.max(target, lb), ub)
+
+
+def _modify_vector_using_max_change(
+    original: torch.Tensor,
+    target: torch.Tensor,
+    max_change: torch.Tensor,
+) -> torch.Tensor:
+    # Strictly expect `original` and `target` as 1-dimensional tensors, and get their lengths
+    [original_length] = original.shape
+    [target_length] = target.shape
+
+    # Ensure that `max_change` is a 1-dimensional tensor, and get its length
+    if max_change.ndim == 0:
+        max_change = max_change.expand(target.shape)
+    [max_change_length] = max_change.shape
+
+    # Verify that the lengths of the vectors match
+    if target_length != original_length:
+        raise ValueError("The `target` vector and the `original` vector have different lengths")
+    if original_length != max_change_length:
+        raise ValueError("The length of `max_change` is different than the length of `original`")
+
+    max_diff = torch.abs(original) * max_change
+    return torch.min(torch.max(target, original - max_diff), original + max_diff)
+
+
+def modify_vector(
+    original: torch.Tensor,
+    target: torch.Tensor,
+    *,
+    lb: Optional[Union[float, torch.Tensor]] = None,
+    ub: Optional[Union[float, torch.Tensor]] = None,
+    max_change: Optional[Union[float, torch.Tensor]] = None,
+) -> torch.Tensor:
+    """
+    Return the modified version(s) of the vector(s), with bounds checking.
+
+    This function is similar to `modify_tensor`, but it has the following
+    different behaviors:
+
+    - Assumes that all of its arguments are either vectors, or are batches
+      of vectors. If some or more of its arguments have 2 or more dimensions,
+      those arguments will be considered as batches, and the computation will
+      be vectorized to return a batch of results.
+    - Designed to be `vmap`-friendly.
+    - Designed for functional programming paradigm, and therefore lacks the
+      in-place modification option.
+    """
+    from ..decorators import expects_ndim
+
+    if max_change is None:
+        result = target
+    else:
+        result = expects_ndim(_modify_vector_using_max_change, (1, 1, 1), allow_smaller_ndim=True)(
+            original, target, max_change
+        )
+
+    if (lb is None) and (ub is None):
+        pass  # no strict boundaries, so, nothing more to do
+    elif (lb is not None) and (ub is not None):
+        result = expects_ndim(_modify_vector_using_bounds, (1, 1, 1), allow_smaller_ndim=True)(result, lb, ub)
+    else:
+        raise ValueError(
+            "`modify_vector` expects either with `lb` and `ub` given together, or with both of them omitted."
+            " Having only `lb` or only `ub` is not supported."
+        )
+    return result
+
+
 def empty_tensor_like(
     source: Any,
     *,
@@ -1179,6 +1273,25 @@ def _out_tensor(
     return out
 
 
+def _out_tensor_for_random_operation(
+    *size: Size,
+    out: Optional[torch.Tensor] = None,
+    dtype: Optional[DType] = None,
+    device: Optional[Device] = None,
+) -> torch.Tensor:
+    if out is None:
+        out = torch.empty(tuple(), dtype=dtype, device=device).expand(*size)
+        out = out + make_batched_false_for_vmap(out.device)
+    else:
+        if len(size) >= 1:
+            raise ValueError(
+                f"When `out` is provided (i.e. not None), the positional `size` arguments were not expected."
+                f" However, `size` arguments were received as {repr(size)}."
+            )
+        expect_none("when `out` is provided (i.e. not None)", dtype=dtype, device=device)
+    return out
+
+
 def _scalar_requested(*size: Size) -> bool:
     return (len(size) == 1) and isinstance(size[0], tuple) and (len(size[0]) == 0)
 
@@ -1489,7 +1602,7 @@ def _invalid_bound_args():
             f" ub: {repr(ub)}."
         )
 
-    out = _out_tensor(*size, out=out, dtype=dtype, device=device)
+    out = _out_tensor_for_random_operation(*size, out=out, dtype=dtype, device=device)
     gen_kwargs = _generator_kwargs(generator)
 
     def _cast_bounds():
@@ -1606,7 +1719,7 @@ def make_gaussian(
     if scalar_requested:
         size = (1,)
 
-    out = _out_tensor(*size, out=out, dtype=dtype, device=device)
+    out = _out_tensor_for_random_operation(*size, out=out, dtype=dtype, device=device)
     gen_kwargs = _generator_kwargs(generator)
 
     if symmetric:
@@ -1690,7 +1803,7 @@ def make_randint(
 
     if (dtype is None) and (out is None):
         dtype = torch.int64
-    out = _out_tensor(*size, out=out, dtype=dtype, device=device)
+    out = _out_tensor_for_random_operation(*size, out=out, dtype=dtype, device=device)
     gen_kwargs = _generator_kwargs(generator)
     out.random_(**gen_kwargs)
     out %= n
@@ -2091,3 +2204,107 @@ def storage_ptr(x: Iterable) -> int:
         The address of the underlying storage.
     """
     return _storage_ptr(x)
+
+
+def make_batched_false_for_vmap(device: Device) -> torch.Tensor:
+    """
+    Get `False`, properly batched if inside `vmap(..., randomness='different')`.
+
+    **Reasoning.**
+    Imagine we have the following function:
+
+    ```python
+    import torch
+
+
+    def sample_and_shift(target_shape: tuple, shift: torch.Tensor) -> torch.Tensor:
+        result = torch.empty(target_shape, device=x.device)
+        result.normal_()
+        result += shift
+        return result
+    ```
+
+    which allocates an empty tensor, then fills it with samples from the
+    standard normal distribution, then shifts the samples and returns the
+    result. An important implementation detail regarding this example function
+    is that all of its operations are in-place (i.e. the method `normal_()`
+    and the operator `+=` work on the given pre-allocated tensor).
+
+    Let us now imagine that we have a batch of shift tensors, and we would like
+    to generate multiple shifted sample tensors. Ideally, such a batched
+    operation could be done by transforming the example function with the help
+    of `vmap`:
+
+    ```python
+    from torch.func import vmap
+
+    batched_sample_and_shift = vmap(sample_and_shift, in_dims=0, randomness="different")
+    ```
+
+    where the argument `randomness="different"` tells PyTorch that for each
+    batch item, we want to generate different samples (instead of just
+    duplicating the same samples across the batch dimension(s)).
+    Such a re-sampling approach is usually desired in applications where
+    preserving stochasticity is crucial, evolutionary computation being one
+    of such case.
+
+    Now let us call our transformed function:
+
+    ```python
+    batch_of_shifts = ...  # a tensor like `shift`, but with an extra leftmost
+    # dimension for the batches
+
+    # Will fail:
+    batched_results = batched_sample_and_shift(shape_goes_here, batch_of_shifts)
+    ```
+
+    At this point, we observe that `batched_sample_and_shift` fails.
+    The reason for this failure is that the function first allocates an empty
+    tensor, then tries to perform random sampling in an in-place manner.
+    The first allocation via `empty` is not properly batched (it is not aware
+    of the active `vmap`), so, when we later call `.normal_()` on it,
+    there is no room for the data that would be re-sampled for each batch item.
+    To remedy this, we could modify our original function slightly:
+
+    ```python
+    import torch
+
+
+    def sample_and_shift2(target_shape: tuple, shift: torch.Tensor) -> torch.Tensor:
+        result = torch.empty(target_shape, device=x.device)
+        result = result + result.make_batched_false_for_vmap(x.device)
+        result.normal_()
+        result += shift
+        return result
+    ```
+
+    In this modified function, right after making an initial allocation, we add
+    onto it a batched false, and re-assign the result to the variable `result`.
+    Thanks to being the result of an interaction with a batched false, the new
+    `result` variable is now properly batched (if we are inside
+    `vmap(..., randomness="different")`. Now, let us transform our function:
+
+    ```python
+    from torch.func import vmap
+
+    batched_sample_and_shift2 = vmap(sample_and_shift2, in_dims=0, randomness="different")
+    ```
+
+    The following code should now work:
+
+    ```python
+    batch_of_shifts = ...  # a tensor like `shift`, but with an extra leftmost
+    # dimension for the batches
+
+    # Should work:
+    batched_results = batched_sample_and_shift2(shape_goes_here, batch_of_shifts)
+    ```
+
+    Args:
+        device: The target device on which the batched `False` will be created
+    Returns:
+        A scalar tensor having the value `False`. This returned tensor will be
+        a batch of scalar tensors (i.e. a `BatchedTensor`) if we are inside
+        `vmap(..., randomness="different")`.
+    """
+    return torch.randint(0, 1, tuple(), dtype=torch.bool, device=device)
diff --git a/tests/test_constraint_penalization.py b/tests/test_constraint_penalization.py
new file mode 100644
index 00000000..f91f7660
--- /dev/null
+++ b/tests/test_constraint_penalization.py
@@ -0,0 +1,100 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import torch
+
+from evotorch.tools import constraints
+
+
+def test_penalty():
+    a = torch.as_tensor(2.0)
+    b = torch.as_tensor(1.0)
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert p > 10.0
+
+    p = constraints.penalty(a, "==", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert p > 10.0
+
+    p = constraints.penalty(a, ">=", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert torch.abs(p) < 1e-8
+
+
+def test_batched_penalty():
+    a = torch.tensor([-2.0, 2.0], dtype=torch.float32)
+    b = torch.as_tensor(1.0, dtype=torch.float32)
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert p.shape == (2,)
+    assert torch.abs(p[0]) < 1e-8
+    assert p[1] > 10.0
+
+    p = constraints.penalty(a, "==", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert p.shape == (2,)
+    assert p[0] > 10.0
+    assert p[1] > 10.0
+
+    p = constraints.penalty(a, ">=", b, penalty_sign="+", linear=1.0, step=10.0, exp=2.0, exp_inf=5000.0)
+    assert p.shape == (2,)
+    assert p[0] > 10.0
+    assert torch.abs(p[1]) < 1e-8
+
+
+def test_penalization_amount():
+    a = torch.as_tensor(4.0)
+    b = torch.as_tensor(1.0)
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="-", linear=1.0)
+    desired_p = -(a - b)
+    assert torch.abs(p - desired_p) < 1e-8
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="-", step=10.0)
+    desired_p = -10.0
+    assert torch.abs(p - desired_p) < 1e-8
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="-", linear=1.0, step=10.0)
+    desired_p = -((a - b) + 10.0)
+    assert torch.abs(p - desired_p) < 1e-8
+
+    p = constraints.penalty(a, "<=", b, penalty_sign="-", exp=4.0)
+    desired_p = -((a - b) ** 4.0)
+    assert torch.abs(p - desired_p) < 1e-8
+
+
+def test_positive_log_barrier():
+    a = torch.tensor([1.0, 2.0, 3.0, 4.0], dtype=torch.float32)
+    b = torch.tensor(4.0, dtype=torch.float32)
+    infinity = 1000.0
+
+    p = constraints.log_barrier(a, "<=", b, penalty_sign="+", inf=infinity)
+    assert p.shape == a.shape
+
+    for i in range(1, len(p)):
+        assert p[i] > p[i - 1]
+
+    assert torch.abs(p[-1] - infinity) < 1e-8
+
+
+def test_negative_log_barrier():
+    a = torch.tensor([1.0, 2.0, 3.0, 4.0], dtype=torch.float32)
+    b = torch.tensor(4.0, dtype=torch.float32)
+    infinity = 1000.0
+
+    p = constraints.log_barrier(a, "<=", b, penalty_sign="-", inf=infinity)
+    assert p.shape == a.shape
+
+    for i in range(1, len(p)):
+        assert p[i] < p[i - 1]
+
+    assert torch.abs(p[-1] - (-infinity)) < 1e-8
diff --git a/tests/test_expects_ndim.py b/tests/test_expects_ndim.py
new file mode 100644
index 00000000..b622422b
--- /dev/null
+++ b/tests/test_expects_ndim.py
@@ -0,0 +1,148 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import numpy as np
+import pytest
+import torch
+
+from evotorch import testing
+from evotorch.decorators import expects_ndim, rowwise
+
+
+@expects_ndim(0, 1)
+def f(c: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
+    return c * torch.max(x)
+
+
+@expects_ndim(None, 0, 1)
+def typed_f(dtype: torch.dtype, c: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
+    assert c.dtype == dtype
+    return c * torch.max(x)
+
+
+@rowwise
+def rowwise_max(x: torch.Tensor) -> torch.Tensor:
+    return torch.max(x)
+
+
+def test_expects_ndim_without_batching():
+    c = torch.as_tensor(2, dtype=torch.float32)
+    x = torch.tensor([3, -2, 1], dtype=torch.float32)
+
+    desired = torch.tensor(6, dtype=torch.float32)
+
+    testing.assert_allclose(f(c, x), desired, atol=1e-4)
+
+
+def test_expects_ndim_with_batched_vector():
+    c = torch.as_tensor(2, dtype=torch.float32)
+    x = torch.tensor(
+        [
+            [0, -1, 2],
+            [3, -2, 1],
+            [-4, 6, -1],
+        ],
+        dtype=torch.float32,
+    )
+
+    desired = torch.tensor([4, 6, 12], dtype=torch.float32)
+
+    testing.assert_allclose(f(c, x), desired, atol=1e-4)
+
+
+def test_expects_ndim_with_batched_multiplier():
+    c = torch.tensor([-1, 2], dtype=torch.float32)
+    x = torch.tensor([3, -2, 1], dtype=torch.float32)
+
+    desired = torch.tensor([-3, 6], dtype=torch.float32)
+
+    testing.assert_allclose(f(c, x), desired, atol=1e-4)
+
+
+def test_expects_ndim_with_matching_batch_dims():
+    c = torch.tensor([-1, 1, 2], dtype=torch.float32)
+    x = torch.tensor(
+        [
+            [0, -1, 2],
+            [3, -2, 1],
+            [-4, 6, -1],
+        ],
+        dtype=torch.float32,
+    )
+
+    desired = torch.tensor([-2, 3, 12], dtype=torch.float32)
+
+    testing.assert_allclose(f(c, x), desired, atol=1e-4)
+
+
+def test_expects_ndim_with_multibatch():
+    c = torch.tensor(
+        [
+            [-1, 1, 2],
+            [1, -1, -2],
+        ],
+        dtype=torch.float32,
+    )
+    x = torch.tensor(
+        [
+            [0, -1, 2],
+            [3, -2, 1],
+            [-4, 6, -1],
+        ],
+        dtype=torch.float32,
+    )
+
+    desired = torch.tensor(
+        [
+            [-2, 3, 12],
+            [2, -3, -12],
+        ],
+        dtype=torch.float32,
+    )
+
+    testing.assert_allclose(f(c, x), desired, atol=1e-4)
+
+
+def test_expects_ndim_with_non_tensor_scalar():
+    for c in (True, np.bool_(True), 1, 1.0):
+        for dtype in (torch.float32, torch.float64):
+            x = torch.tensor([3, -2, 1], dtype=dtype)
+            desired = torch.as_tensor(3, dtype=dtype)
+
+            if isinstance(c, (bool, np.bool_)):
+                expected_dtype = torch.bool
+            else:
+                expected_dtype = x.dtype
+
+            testing.assert_allclose(typed_f(expected_dtype, c, x), desired, atol=1e-4)
+
+
+def test_rowwise():
+    with pytest.raises(ValueError):
+        rowwise_max(torch.as_tensor(1.0))
+
+    testing.assert_allclose(rowwise_max(torch.FloatTensor([1, 3, 2])), 3, rtol=1e-4)
+
+    testing.assert_allclose(
+        rowwise_max(
+            torch.FloatTensor(
+                [
+                    [1, 3, 2],
+                    [-1, 0, -9],
+                ]
+            ),
+        ),
+        torch.FloatTensor([3, 0]),
+        rtol=1e-4,
+    )
diff --git a/tests/test_func_alg.py b/tests/test_func_alg.py
new file mode 100644
index 00000000..8aaff624
--- /dev/null
+++ b/tests/test_func_alg.py
@@ -0,0 +1,98 @@
+# Copyright 2024 NNAISENSE SA
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+import pytest
+import torch
+from torch.func import grad
+
+from evotorch.algorithms import functional as funcalg
+from evotorch.decorators import expects_ndim, rowwise
+
+
+@rowwise
+def f(x: torch.Tensor) -> torch.Tensor:
+    return torch.sum(x**2)
+
+
+def test_cem():
+    batch_size = 3
+    solution_length = 5
+    popsize = 20
+
+    x = torch.randn(batch_size, solution_length)
+
+    state = funcalg.cem(
+        center_init=x, stdev_init=1.0, parenthood_ratio=0.5, objective_sense="min", stdev_max_change=0.2
+    )
+    assert state.center.shape == (batch_size, solution_length)
+
+    population = funcalg.cem_ask(state, popsize=popsize)
+    assert population.shape == (batch_size, popsize, solution_length)
+
+    fitnesses = f(population)
+    assert fitnesses.shape == (batch_size, popsize)
+
+    state = funcalg.cem_tell(state, population, fitnesses)
+    assert state.center.shape == (batch_size, solution_length)
+
+
+def test_pgpe():
+    batch_size = 3
+    solution_length = 5
+    popsize = 20
+
+    x = torch.randn(batch_size, solution_length)
+
+    state = funcalg.pgpe(
+        center_init=x,
+        stdev_init=1.0,
+        center_learning_rate=0.1,
+        stdev_learning_rate=0.1,
+        optimizer="clipup",
+        optimizer_config={"max_speed": 0.2},
+        ranking_method="centered",
+        objective_sense="min",
+        stdev_max_change=0.2,
+    )
+    assert state.optimizer_state.center.shape == (batch_size, solution_length)
+
+    population = funcalg.pgpe_ask(state, popsize=popsize)
+    assert population.shape == (batch_size, popsize, solution_length)
+
+    fitnesses = f(population)
+    assert fitnesses.shape == (batch_size, popsize)
+
+    state = funcalg.pgpe_tell(state, population, fitnesses)
+    assert state.optimizer_state.center.shape == (batch_size, solution_length)
+
+
+@pytest.mark.parametrize("optimizer_name", ["adam", "clipup", "sgd"])
+def test_optimizer(optimizer_name: str):
+    batch_size = 3
+    solution_length = 5
+    x = torch.randn(batch_size, solution_length)
+
+    optimizer_start, optimizer_ask, optimizer_tell = funcalg.misc.get_functional_optimizer(optimizer_name)
+    state = optimizer_start(center_init=x, center_learning_rate=0.1)
+    assert state.center.shape == (batch_size, solution_length)
+
+    center = optimizer_ask(state)
+    assert center.shape == (batch_size, solution_length)
+
+    g = expects_ndim(grad(f), (1,))(center)
+    assert g.shape == (batch_size, solution_length)
+
+    state = optimizer_tell(state, follow_grad=-g)
+    assert state.center.shape == (batch_size, solution_length)