In [3]:
import matplotlib.pyplot as plt
import seaborn as sns

# AI-Assisted Programming

In this notebook, we compare an AI-generated function for the multivariate normal density with the one already included in the SciPy library.

We chose ChatGPT 3.5 as our large language model (LLM) of interest. First, we gave it the following prompt:

> Can you generate a function using Python (or any libraries of Python's) called <code>multivariate_normal_density(x, mu, Sigma)</code> which returns the density of a D-dimensional vector <code>x</code> given a D-dimensional mean (location) vector <code>mu</code> and a D × D-dimensional covariance matrix <code>Cov</code>?

In response, ChatGPT generated the following function:

In [13]:
# This function below is generated by ChatGPT.
import numpy as np
from scipy.stats import multivariate_normal

def multivariate_normal_density(x, mu, Sigma):
    """
    Calculate the density of a D-dimensional vector x given a D-dimensional mean (location) vector mu
    and a D × D-dimensional covariance matrix Sigma.

    Parameters:
    - x: D-dimensional vector
    - mu: D-dimensional mean vector
    - Sigma: D × D-dimensional covariance matrix

    Returns:
    - Density of the vector x
    """
    mvn = multivariate_normal(mean=mu, cov=Sigma)
    density = mvn.pdf(x)
    return density

ChatGPT is smart enough to use SciPy! To test this function, ChatGPT used the following lines of code:

In [14]:
# The following code is generated by ChatGPT.
# Example usage:
D = 2  # Dimensionality
x_example = np.array([1, 2])  # D-dimensional vector
mu_example = np.array([0, 0])  # D-dimensional mean vector
Sigma_example = np.array([[1, 0.5], [0.5, 2]])  # D × D-dimensional covariance matrix

result = multivariate_normal_density(x_example, mu_example, Sigma_example)
print(f"Density for x: {result}")

Density for x: 0.03836759318252468


Note that this test case is a full-covariance Gaussian (non-zero covariance, different variance per dimension). Because the generated function simply wraps SciPy's <code>multivariate_normal()</code>, comparing it against SciPy itself is almost meaningless: the two values are identical by construction.

In [9]:
mvn_ground_truth = multivariate_normal.pdf(x_example, mu_example, Sigma_example)
print(f"Ground truth: {mvn_ground_truth}")

Ground truth: 0.03836759318252468


As expected, the two values are identical, which tells us very little. We therefore asked ChatGPT to generate the same function without using SciPy. ChatGPT generated the following function:

In [15]:
# The following function is generated by ChatGPT.
def multivariate_normal_density(x, mu, Sigma):
    """
    Calculate the density of a D-dimensional vector x given a D-dimensional mean (location) vector mu
    and a D × D-dimensional covariance matrix Sigma.

    Parameters:
    - x: D-dimensional vector
    - mu: D-dimensional mean vector
    - Sigma: D × D-dimensional covariance matrix

    Returns:
    - Density of the vector x
    """
    D = len(x)
    det_Sigma = np.linalg.det(Sigma)
    inv_Sigma = np.linalg.inv(Sigma)
    diff = x - mu

    exponent = -0.5 * np.dot(diff.T, np.dot(inv_Sigma, diff))
    prefactor = 1 / ((2 * np.pi) ** (D / 2) * det_Sigma ** 0.5)

    density = prefactor * np.exp(exponent)
    return density
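
For reference, the function above appears to implement the standard closed-form density of a D-dimensional Gaussian. Written in the notation of its docstring (the symbol p is our own shorthand, not something ChatGPT produced):

```latex
p(x \mid \mu, \Sigma)
  = \frac{1}{(2\pi)^{D/2}\,\lvert\Sigma\rvert^{1/2}}
    \exp\!\left( -\tfrac{1}{2}\,(x - \mu)^{\top} \Sigma^{-1} (x - \mu) \right)
```

In the code, <code>prefactor</code> corresponds to the fraction in front and <code>exponent</code> to the argument of the exponential. The formula assumes that Sigma is symmetric and positive definite.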

ChatGPT supplied the same test code as before.

In [16]:
# The following code is generated by ChatGPT.
# Example usage:
D = 2  # Dimensionality
x_example = np.array([1, 2])  # D-dimensional vector
mu_example = np.array([0, 0])  # D-dimensional mean vector
Sigma_example = np.array([[1, 0.5], [0.5, 2]])  # D × D-dimensional covariance matrix

result = multivariate_normal_density(x_example, mu_example, Sigma_example)
print(f"Density for x: {result}")

Density for x: 0.03836759318252469


Only the last printed digit differs, which is consistent with floating-point rounding. Other than that, the new function seems to hold up pretty well.
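
A difference in the final printed digit is easier to judge with a tolerance-based comparison than by eye. The following is a minimal sketch that reuses the two values printed above; the variable names and the tolerance of 1e-12 are our own choices for illustration.

```python
import numpy as np

# Values copied from the two outputs printed above.
scipy_based = 0.03836759318252468   # SciPy-backed function
numpy_only = 0.03836759318252469    # SciPy-free function

# Relative difference between the two results.
rel_diff = abs(scipy_based - numpy_only) / abs(scipy_based)
print(f"Relative difference: {rel_diff:.1e}")

# np.isclose compares within a tolerance instead of requiring exact equality.
agree = np.isclose(scipy_based, numpy_only, rtol=1e-12, atol=0.0)
print(f"Agree within rtol=1e-12: {agree}")
```

A relative tolerance is used here because the density values themselves are small, so an absolute tolerance would be harder to interpret.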

We now give the LLM-generated SciPy-free <code>multivariate_normal_density()</code> function new testing conditions: 
1) Spherical Gaussian (zero covariance, shared variance across dimensions);
2) Diagonal Gaussian (zero covariance, different variance across dimensions);
3) Full-covariance Gaussian (non-zero covariance, different variances across dimensions; this condition is different from the one ChatGPT offered).

In [21]:
def compare_mvn(x, mu, Sigma):
    true = multivariate_normal.pdf(x, mu, Sigma)
    chat = multivariate_normal_density(x, mu, Sigma)
    print(f"Ground truth: {true} | ChatGPT's result: {chat}.")

In [24]:
# New testing conditions, while keeping the dimensionality, x, and mu unchanged.
sphr_cov = np.array([[3.14, 0.], [0., 3.14]])
diag_cov = np.array([[3.14, 0.], [0., 1.57]])
full_cov = np.array([[3.14, 1.], [.5, 1.57]])  # NOTE: off-diagonals differ (1. vs .5), so this matrix is not symmetric

compare_mvn(x_example, mu_example, sphr_cov)
compare_mvn(x_example, mu_example, diag_cov)
compare_mvn(x_example, mu_example, full_cov)


Ground truth: 0.02286202114964594 | ChatGPT's result: 0.022862021149645948.
Ground truth: 0.017100504804849568 | ChatGPT's result: 0.017100504804849568.
Ground truth: 0.020130880315595175 | ChatGPT's result: 0.021529769455240662.


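Overall, ChatGPT did very well. The generated function deviates from SciPy only in the third test, where <code>full_cov</code> has unequal off-diagonal entries (1. and .5). Such a matrix is not symmetric and therefore not a valid covariance matrix, so this discrepancy reflects how the two implementations treat an invalid input rather than an error in the generated formula.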
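
Since <code>full_cov</code> has off-diagonal entries 1. and .5, it is worth checking whether an input is a valid covariance matrix before comparing implementations. The sketch below is our own illustration, not ChatGPT output: <code>is_valid_covariance</code> and <code>density_numpy</code> are hypothetical helper names, and mirroring the lower triangle is only one of several possible ways to repair the matrix.

```python
import numpy as np

def is_valid_covariance(Sigma, tol=1e-12):
    """Return True if Sigma is square, symmetric, and positive definite."""
    Sigma = np.asarray(Sigma, dtype=float)
    if Sigma.ndim != 2 or Sigma.shape[0] != Sigma.shape[1]:
        return False
    # Check symmetry explicitly: Cholesky alone does not verify it.
    if not np.allclose(Sigma, Sigma.T, rtol=0.0, atol=tol):
        return False
    try:
        np.linalg.cholesky(Sigma)  # succeeds only for positive definite input
    except np.linalg.LinAlgError:
        return False
    return True

def density_numpy(x, mu, Sigma):
    """Same closed-form density as the SciPy-free function in this notebook."""
    D = len(x)
    diff = x - mu
    exponent = -0.5 * diff @ np.linalg.inv(Sigma) @ diff
    prefactor = 1 / ((2 * np.pi) ** (D / 2) * np.linalg.det(Sigma) ** 0.5)
    return prefactor * np.exp(exponent)

x = np.array([1, 2])
mu = np.array([0, 0])
full_cov = np.array([[3.14, 1.], [.5, 1.57]])  # as defined in the notebook

# Assumption: repair the matrix by mirroring its lower triangle.
full_cov_sym = np.tril(full_cov) + np.tril(full_cov, -1).T

print(is_valid_covariance(full_cov))      # asymmetric input
print(is_valid_covariance(full_cov_sym))  # symmetrized input
print(density_numpy(x, mu, full_cov_sym))
```

With the symmetrized matrix, the NumPy formula should land on the "Ground truth" value printed for the third test. That suggests SciPy effectively used the lower triangle of <code>full_cov</code> in this run, although this is an inference from the numbers above rather than documented behavior.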