This notebook demonstrates how to use Python, NumPy, and SciPy to estimate the parameters of a multi-variate Student's t distribution from a randomly generated set of data.  It uses the maximum likelihood estimation (MLE) approach to find the parameters.

Created on August 31 and September 3-5, 2022 by Kevin Spradlin, Jr.

First, import some Python modules.

In [1]:
import typing
import math
import numpy as np
import scipy.stats as stats
import scipy.optimize as optimize

Next, create a set of random multi-variate Student's t variates using the ***multivariate_t*** distribution object in scipy.stats.

I used this site to learn about how it works: Taboga, Marco (2021). "Multivariate Student's t distribution", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/probability-distributions/multivariate-student-t-distribution.
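
To make the construction concrete, here is a minimal sketch of one textbook way to draw multi-variate Student's t variates by hand: take a zero-mean multi-variate normal draw, divide it by the square root of an independent chi-square draw scaled by its degrees of freedom, then add the location vector.  The helper name `sample_multivariate_t` is hypothetical, and this is not necessarily how scipy.stats implements ***multivariate_t*** internally.

```python
import numpy as np

def sample_multivariate_t(loc, shape, df, size, rng):
    """Hypothetical helper: draw `size` multi-variate t variates by hand."""
    loc = np.asarray(loc, dtype=np.float64)
    shape = np.asarray(shape, dtype=np.float64)
    # correlated normal draws with mean zero and covariance `shape`
    normal_draws = rng.multivariate_normal(np.zeros(loc.size), shape, size=size)
    # one chi-square draw per variate, shared by all of its coordinates
    chi2_draws = rng.chisquare(df, size=size)
    # dividing by sqrt(chi2 / df) is what fattens the tails
    return loc + normal_draws / np.sqrt(chi2_draws / df)[:, np.newaxis]

rng = np.random.default_rng(12345)
hand_sample = sample_multivariate_t([-0.105, 0.015],
                                    [[0.65, -0.25], [-0.25, 0.20]],
                                    df=4, size=50000, rng=rng)
print(hand_sample.shape)
print(hand_sample.mean(axis=0))
```

The sample means of `hand_sample` should land near the location vector, just as they do for variates drawn with ***multivariate_t***.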

In [2]:
def is_matrix_pos_semidef(test_matrix: np.ndarray) -> typing.Dict:
    """
    This function will test the array test_matrix to see if it's positive 
     semi-definite, using this definition:

     "A positive semidefinite matrix is a Hermitian matrix all of whose eigenvalues are nonnegative."
     https://mathworld.wolfram.com/HermitianMatrix.html

     It will check if the matrix is square, is Hermitian, and has no
     negative eigenvalues.

     The function will return a dictionary.  One key is 'pass_test', which
      will be True if it's positive semi-definite and False if not.  The
      other key is 'message', which will state why the matrix failed the test,
      or will be blank if it passed the test.

    Created on June 8, 2022
    """

    pass_test: bool = False
    message: str = ''


    # check if the matrix is square
    matrix_dimensions: typing.List = list(test_matrix.shape)

    if len(matrix_dimensions) != 2:
        message = 'Matrix needs to have 2 dimensions'
        return {'pass_test': pass_test, 'message': message}

    if matrix_dimensions[0] != matrix_dimensions[1]:
        message = 'Matrix needs to be square'
        return {'pass_test': pass_test, 'message': message}


    # check if the matrix is Hermitian
    conjugate_transpose: np.ndarray = test_matrix.conj().T

    if not np.array_equal(test_matrix, conjugate_transpose):
        message = "Matrix isn't Hermitian - not equal to its own conjugate transpose"
        return {'pass_test': pass_test, 'message': message}


    # calculate the matrix's eigenvalues and check if any are negative
    # eigvalsh is meant for Hermitian matrices and always returns real eigenvalues
    for current_eigenvalue in np.linalg.eigvalsh(test_matrix):
        if current_eigenvalue < 0:
            message = f"Matrix has eigenvalue of {current_eigenvalue:6.4f}"
            break


    if message:
        return {'pass_test': pass_test, 'message': message}
    else:
        return {'pass_test': True, 'message': ''}

In [3]:
# let the user define the location vector, shape matrix, and degrees of freedom 
#  of the multi-variate Student's t distribution
mean_vector: np.ndarray = np.asarray([-0.105, 0.015], dtype=np.float32)
shape_matrix: np.ndarray = np.asarray([[0.65, -0.25], [-0.25, 0.20]], dtype=np.float32)
sample_df: int = 4

    
# check that the matrix is positive semi-definite
function_results: typing.Dict = is_matrix_pos_semidef(shape_matrix)

if not function_results['pass_test']:
    print(function_results['message'])    


# generate 10,000 random variates
random_sample: np.ndarray = stats.multivariate_t.rvs(loc=mean_vector, shape=shape_matrix, 
                                                     df=sample_df, size=10000)


Before we use the MLE approach, let's calculate the basic statistics of the sample of random multi-variate Student's t variates.

The sample means should be close to the location vector you defined.  The sample variances likely won't be close to the diagonal of the shape matrix you defined; they should instead be close to the diagonal of (df / (df - 2)) * shape matrix, which is the covariance matrix of the distribution when df > 2.  The sample excess kurtoses should be well above zero, since the Student's t distribution has heavier tails than the normal distribution.  The distribution is symmetric, so its skewnesses are zero in theory, but with so few degrees of freedom the sample skewnesses can land noticeably away from zero.
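
As a quick check of the variance claim, this small sketch computes the theoretical covariance matrix from the shape matrix and degrees of freedom used in this notebook.  The name `theoretical_cov` is introduced here only for illustration.

```python
import numpy as np

# values copied from the cell that generated the sample
shape_matrix = np.array([[0.65, -0.25], [-0.25, 0.20]])
sample_df = 4

# the covariance matrix only exists when df > 2
theoretical_cov = sample_df / (sample_df - 2) * shape_matrix
print(np.diag(theoretical_cov))
```

With 4 degrees of freedom the scaling factor is 2, so the theoretical variances are 1.3 and 0.4.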

In [4]:
results = stats.describe(random_sample)
print(f"Sample means: {np.array2string(results[2], precision=4):s}")
print(f"Sample variances: {np.array2string(results[3], precision=4):s}")
print(f"Sample skewnesses: {np.array2string(results[4], precision=4):s}")
print(f"Sample excess kurtosis: {np.array2string(results[5], precision=4):s}")

Sample means: [-0.0957  0.0042]
Sample variances: [1.3046 0.4172]
Sample skewnesses: [ 0.0542 -0.6924]
Sample excess kurtosis: [ 9.1708 13.0685]


---

Now, we'll use the MLE approach.
First define a function that calculates the log-likelihood function.  Then find the parameters that maximize the log-likelihood function.
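
Before leaning on ***multivariate_t.logpdf***, it can help to see what the log-density contains.  The sketch below writes out the standard multi-variate Student's t log-density term by term and compares it with scipy's result at a few arbitrary points.  The helper name `manual_t_logpdf` and the test points are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.special import gammaln

def manual_t_logpdf(x, loc, shape, df):
    """Hypothetical helper: multi-variate t log-density, written out term by term."""
    x = np.atleast_2d(x)
    p = shape.shape[0]
    deviations = x - loc
    # squared Mahalanobis distance of each row from loc
    maha = np.einsum('ij,ij->i', deviations @ np.linalg.inv(shape), deviations)
    _, log_det = np.linalg.slogdet(shape)
    return (gammaln((df + p) / 2.0) - gammaln(df / 2.0)
            - 0.5 * p * np.log(df * np.pi) - 0.5 * log_det
            - 0.5 * (df + p) * np.log1p(maha / df))

loc = np.array([-0.105, 0.015])
shape = np.array([[0.65, -0.25], [-0.25, 0.20]])
points = np.array([[0.0, 0.0], [1.0, -0.5], [-2.0, 1.5]])

by_hand = manual_t_logpdf(points, loc, shape, df=4)
by_scipy = stats.multivariate_t.logpdf(points, loc=loc, shape=shape, df=4)
print(np.allclose(by_hand, by_scipy))
```

The log-likelihood of a sample is the sum of these log-densities over all of its rows.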

In [5]:
def decode_lh_parameters_students_t(parameters: np.ndarray, num_dimensions: int) -> typing.Dict:
    """
    This function will convert the elements of the parameters array into a
     location/mean vector, a shape matrix, and degrees of freedom.  It will
     return them as values in a dictionary.
    
    Created on August 30 and September 5, 2022
    """

    mean_vector: np.ndarray = parameters[0:num_dimensions]

    shape_matrix: np.ndarray = np.zeros((num_dimensions, num_dimensions), dtype=np.float32)

    addend: int = num_dimensions
    for current_column in range(num_dimensions):
        addend -= current_column
        
        for current_row1 in range(0, current_column):
            shape_matrix[current_row1, current_column] = shape_matrix[current_column, current_row1]
        
        for current_row2 in range(current_column, num_dimensions):
            shape_matrix[current_row2, current_column] = \
              parameters[current_row2 + current_column * num_dimensions + addend]
    
    df: float = parameters[-1]


    return {'mean_vector': mean_vector, 'shape_matrix': shape_matrix, 'degrees_freedom': df}
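
For comparison, here is a sketch of a loop-free decoder that relies on the same layout as the function above (lower triangle stored column by column).  The name `decode_parameters_vectorized` is hypothetical, and it works in float64 rather than float32, which is an assumption on my part.

```python
import numpy as np

def decode_parameters_vectorized(parameters, num_dimensions):
    """Hypothetical alternative to the loop-based decoder above."""
    parameters = np.asarray(parameters, dtype=np.float64)
    mean_vector = parameters[:num_dimensions]
    # triu_indices walks the upper triangle row by row; swapping the two
    #  index arrays walks the lower triangle column by column instead
    rows, cols = np.triu_indices(num_dimensions)
    lower = np.zeros((num_dimensions, num_dimensions))
    lower[cols, rows] = parameters[num_dimensions:-1]
    # mirror the lower triangle without doubling the diagonal
    shape_matrix = lower + lower.T - np.diag(np.diag(lower))
    return {'mean_vector': mean_vector, 'shape_matrix': shape_matrix,
            'degrees_freedom': float(parameters[-1])}

decoded = decode_parameters_vectorized([-0.1, 0.0, 0.65, -0.25, 0.20, 4.0], 2)
print(decoded['shape_matrix'])
```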

In [6]:
def trial_pos_semidef_matrix(test_matrix: np.ndarray) -> typing.Dict:
    """
    This function will take a matrix and return a new matrix, based on the
     eigendecomposition of the original matrix.  The original matrix's 
     eigenvalues and vectors will be calculated.  If any eigenvalues are
     negative, then they will be replaced with small random positive values.
     The new matrix will be calculated from the original matrix's eigenvectors,
     any of the original eigenvalues that are non-negative, and the adjusted, 
     now-slightly positive eigenvalues.
     
    The new matrix will be positive semi-definite.
    
    Got information about eigendecomposition from this site:
    https://mathworld.wolfram.com/EigenDecomposition.html
    
    Created on September 3, 2022
    """    
    
    message: str = ''
    
    
    # first, make sure the matrix is symmetric
    matrix_dimensions: typing.List = list(test_matrix.shape)

    if len(matrix_dimensions) != 2:
        message = 'Matrix needs to have 2 dimensions'
        return {'any_errors': True, 'message': message}

    if matrix_dimensions[0] != matrix_dimensions[1]:
        message = 'Matrix needs to be square'
        return {'any_errors': True, 'message': message}

 
    if not np.array_equal(test_matrix, test_matrix.T):
        message = "Matrix isn\'t symmetric"
        return {'any_errors': True, 'message': message}
   

    # next, calculate the eigenvalues and eigenvectors of the matrix
    matrix_p: np.ndarray = np.zeros((matrix_dimensions[0], matrix_dimensions[0]), dtype=np.float32)
    matrix_d: np.ndarray = np.zeros((matrix_dimensions[0], matrix_dimensions[0]), dtype=np.float32)
    
    test_eigenvalues, test_eigenvectors = np.linalg.eig(test_matrix)

    
    for current_index, current_eigenvalue in enumerate(test_eigenvalues):
        matrix_p[:,current_index] = test_eigenvectors[:,current_index]
       
        if current_eigenvalue >= 0.0:
            matrix_d[current_index, current_index] = current_eigenvalue
        else:
            matrix_d[current_index, current_index] = np.random.rand() / 10.0

    
    adjusted_matrix: np.ndarray = np.matmul(matrix_p, matrix_d)
    adjusted_matrix = np.matmul(adjusted_matrix, np.linalg.inv(matrix_p))
 

    #print(f"Eigenvalues: {np.array2string(test_eigenvalues, precision=4):s}")
    #print(f"Eigenvectors: {np.array2string(test_eigenvectors, precision=4):s}")
    
    #print(f"Eigenvector matrix: {np.array2string(matrix_p, precision=4):s}")
    #print(f"Eigenvalue matrix: {np.array2string(matrix_d, precision=4):s}")
    #print(f"Test matrix: {np.array2string(test_matrix, precision=4):s}")
    #print(f"Adjusted matrix: {np.array2string(adjusted_matrix, precision=4):s}")
   
    
    return {'any_errors': False, 'message': '', 'adjusted_matrix': adjusted_matrix}
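
A deterministic variant of the same repair idea can be sketched with `np.linalg.eigh`, which is designed for symmetric matrices.  Note that this swaps the random replacement used above for a fixed floor, so the same input always gives the same output; the helper name `clip_to_pos_semidef` and the floor of 1e-6 are assumptions for illustration.

```python
import numpy as np

def clip_to_pos_semidef(matrix, floor=1e-6):
    """Hypothetical helper: raise eigenvalues below `floor` up to `floor`."""
    matrix = np.asarray(matrix, dtype=np.float64)
    # eigh assumes a symmetric input and returns real eigenvalues
    #  together with orthonormal eigenvectors
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    clipped = np.maximum(eigenvalues, floor)
    # the eigenvectors are orthonormal, so the inverse is just the transpose
    return eigenvectors @ np.diag(clipped) @ eigenvectors.T

indefinite = np.array([[1.0, 2.0], [2.0, 1.0]])   # eigenvalues -1 and 3
repaired = clip_to_pos_semidef(indefinite)
print(np.linalg.eigvalsh(repaired))
```

A deterministic repair keeps the objective function repeatable, which may make life easier for the optimizer.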

In [7]:
def log_likelihood_multivariate_students_t(parameters: np.ndarray, num_dimensions: int, variates: np.ndarray) -> float:
    """
    Returns the negative of the log-likelihood function.  The scipy.optimize module only
     has a 'minimize' function, so you need to use it to minimize the negative of the
     log-likelihood function, which will maximize the positive of the log-likelihood
     function.

    Expect parameters[0:num_dimensions] to be the location vector,
     parameters[num_dimensions:-1] to be the elements of the lower triangle of the
     shape matrix (stored column by column), and parameters[-1] to be the degrees of
     freedom.

    Created on August 31 and September 3, 2022
    """
    
    function_results: typing.Dict = decode_lh_parameters_students_t(parameters, num_dimensions)

    mean_vector: np.ndarray = function_results['mean_vector']

    shape_matrix: np.ndarray = function_results['shape_matrix']

    degrees_freedom: float = float(function_results['degrees_freedom'])

        
    # need to ensure that the shape matrix is positive definite or semi-definite
    # first, test it.  if it passes the test, then calculate the likelihood function
    # if it fails the test, calculate an adjusted shape matrix that passes the
    #  test and use it to calculate the likelihood function
    function_results: typing.Dict = is_matrix_pos_semidef(shape_matrix)

    if not function_results['pass_test']:
        other_function_results: typing.Dict = trial_pos_semidef_matrix(shape_matrix)
        
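        if other_function_results['any_errors']:
            print(other_function_results['message'])
            # return a large penalty so the optimizer steers away from this point;
            #  0.0 would look like a very good (small) value of the objective
            return np.inf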
        else:
            shape_matrix = other_function_results['adjusted_matrix']       
        
        
    temp_sum: float = -np.sum(stats.multivariate_t.logpdf(variates, 
                                                          loc=mean_vector, 
                                                          shape=shape_matrix, 
                                                          df=degrees_freedom))

    print(f"Log likelihood: {temp_sum:12.6f}")

    return temp_sum

In order to use the scipy.optimize ***minimize*** function, the parameters of the likelihood function need to be fed into the ***minimize*** function as a vector. <br><br>
To do this, I set up an ***init_parameters*** one-dimensional NumPy array.  The first part of the array has the elements of the mean vector, the middle part has the lower triangle of the shape matrix, and the last element is the degrees of freedom. <br><br>
Suppose you want to estimate the parameters of a multi-variate Student's t distribution with an n-dimensional mean vector and an n-by-n shape matrix.  The first n elements of the ***init_parameters*** array store the values of the mean vector. <br><br>
You need to ensure that the shape matrix that's used by the likelihood function is symmetric.  If you simply put every element of the matrix into the ***init_parameters*** array, then there's a good chance that the ***minimize*** function would set elements in the array so that the matrix would no longer be symmetric.  To ensure that this won't happen, I only save the elements of the lower triangle of the shape matrix in the ***init_parameters*** array.  The ***log_likelihood_multivariate_students_t*** function calls ***decode_lh_parameters_students_t***, which uses those elements to rebuild a symmetric matrix.
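
To illustrate the layout described above, here is a small sketch that flattens a mean vector, a shape matrix, and the degrees of freedom into one vector.  The helper name `pack_parameters` is hypothetical; it stores the lower triangle column by column, which is the order the decoder in this notebook reads it back.  The vector has n + n(n + 1)/2 + 1 = n(n + 3)/2 + 1 elements.

```python
import numpy as np

def pack_parameters(mean_vector, shape_matrix, df):
    """Hypothetical helper: flatten (mean, shape, df) into one vector."""
    mean_vector = np.asarray(mean_vector, dtype=np.float64)
    shape_matrix = np.asarray(shape_matrix, dtype=np.float64)
    n = mean_vector.size
    # lower triangle, read column by column: s00, s10, ..., s11, s21, ...
    rows, cols = np.triu_indices(n)
    lower_triangle = shape_matrix[cols, rows]
    return np.concatenate([mean_vector, lower_triangle, [df]])

packed = pack_parameters([-0.105, 0.015], [[0.65, -0.25], [-0.25, 0.20]], 4)
print(packed)
print(packed.size == 2 * (2 + 3) // 2 + 1)
```

For the two-dimensional example, the packed vector holds two means, three lower-triangle elements, and the degrees of freedom.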

Here's where I set up the ***init_parameters*** array.

In [8]:
num_dimensions: int = random_sample.shape[1]
# n means + n * (n + 1) / 2 lower-triangle elements + 1 degrees of freedom
num_elements: int = num_dimensions * (num_dimensions + 3) // 2 + 1

print(f"Number of dimensions: {num_dimensions:d}")
print(f"Number of parameters in optimization problem: {num_elements:d}")
    
init_parameters: np.ndarray = np.zeros(num_elements, dtype=np.float32)

# set the starting values for the mean vector equal to the sample means.
sample_stats = stats.describe(random_sample)

for current_element in range(num_dimensions):
    init_parameters[current_element] = sample_stats[2][current_element]

# now set the starting values for the shape matrix.  set the diagonal
#  elements equal to the sample variances and leave the off-diagonal
#  elements at zero.
current_element: int = num_dimensions
for step_size in range(num_dimensions, 0, -1):
    if current_element < num_elements:
        init_parameters[current_element] = sample_stats[3][num_dimensions - step_size]
        current_element += step_size

# finally set the starting value for degrees of freedom equal to the
#  number of dimensions
init_parameters[-1] = float(num_dimensions)


print(np.array2string(init_parameters, precision=4))

Number of dimensions: 2
Number of parameters in optimization problem: 6
[-0.0957  0.0042  1.3046  0.      0.4172  2.    ]


---

Finally, I run the ***minimize*** function to perform the MLE and obtain the estimated mean vector, shape matrix, and degrees of freedom.<br><br>

You also need to add some boundary conditions for the optimization function.  They are listed in the same order as the elements of the ***init_parameters*** array.  The conditions are:
* elements on the diagonal of the shape matrix need to be positive (the code uses a lower bound of 0)
* the degrees of freedom needs to be positive (the code uses a lower bound of 1).<br><br>

You can see from the results that the values in the mean vector, shape matrix, and degrees of freedom are somewhat close to the values you originally entered.<br>

In [9]:
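# start with no bounds on any parameter
boundary_cond: typing.List = [[None, None] for _ in range(len(init_parameters))]

# bound the diagonal elements of the shape matrix.  the first one sits right
#  after the mean vector, and each later one sits one shorter column further on
diagonal_index: int = num_dimensions
for step_size in range(num_dimensions, 0, -1):
    boundary_cond[diagonal_index] = [0, 100000]
    diagonal_index += step_size

# bound the degrees of freedom
boundary_cond[-1] = [1, 100000]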


opt_results = optimize.minimize(log_likelihood_multivariate_students_t, 
                                x0=init_parameters, 
                                args=(num_dimensions, random_sample, ), 
                                method='Nelder-Mead', bounds=boundary_cond,
                                options={'maxiter': num_elements * 1000,
                                         'maxfev': num_elements * 1000})

    
function_results: typing.Dict = decode_lh_parameters_students_t(opt_results['x'], num_dimensions)           

print(f"Estimated mean vector: {np.array2string(function_results['mean_vector'], precision=4):s}")
print(f"Estimated shape matrix: {np.array2string(function_results['shape_matrix'], precision=4):s}")
print(f"Estimated degrees of freedom: {function_results['degrees_freedom']:6.2f}")
print(opt_results['message'])

print(f"\nActual mean vector: {np.array2string(mean_vector, precision=4):s}")
print(f"Actual shape matrix: {np.array2string(shape_matrix, precision=4):s}")
print(f"Actual degrees of freedom: {sample_df:6.2f}")

Log likelihood: 24800.740558
Log likelihood: 24800.664365
Log likelihood: 24800.703513
Log likelihood: 24892.564231
Log likelihood: 24801.960365
Log likelihood: 24896.833138
Log likelihood: 24728.573341
Log likelihood: 24711.528597
Log likelihood: 24621.053116
Log likelihood: 24623.684581
Log likelihood: 24655.491286
Log likelihood: 24609.165276
Log likelihood: 24515.889771
Log likelihood: 24515.946103
Log likelihood: 24424.371394
Log likelihood: 24247.033425
Log likelihood: 24329.164118
Log likelihood: 24295.022256
Log likelihood: 24234.303036
Log likelihood: 24081.984255
Log likelihood: 24042.175034
Log likelihood: 23785.052109
Log likelihood: 23913.270673
Log likelihood: 23762.086185
Log likelihood: 23653.529476
Log likelihood: 23672.823203
Log likelihood: 23643.984022
Log likelihood: 24446.383728
Log likelihood: 23965.949961
Log likelihood: 23557.196519
Log likelihood: 24785.087367
Log likelihood: 23861.623852
Log likelihood: 23527.209439
Log likelihood: 24243.341334
[...]

Log likelihood: 22357.393489
Log likelihood: 22357.285900
Log likelihood: 22357.246413
Log likelihood: 22356.832959
Log likelihood: 22356.303697
Log likelihood: 22356.670721
Log likelihood: 22356.695467
Log likelihood: 22356.500903
Log likelihood: 22356.063868
Log likelihood: 22355.682180
Log likelihood: 22355.910081
Log likelihood: 22355.163922
Log likelihood: 22354.432696
Log likelihood: 22355.183001
Log likelihood: 22354.468696
Log likelihood: 22353.862554
Log likelihood: 22352.814773
Log likelihood: 22353.843110
Log likelihood: 22353.464374
Log likelihood: 22352.184788
Log likelihood: 22350.953202
Log likelihood: 22351.799781
Log likelihood: 22352.349320
Log likelihood: 22350.440711
Log likelihood: 22349.210760
Log likelihood: 22350.668675
Log likelihood: 22349.103805
Log likelihood: 22348.200111
Log likelihood: 22349.482089
Log likelihood: 22346.472919
Log likelihood: 22343.665819
Log likelihood: 22347.068266
Log likelihood: 22347.432033
Log likelihood: 22345.475172
[...]

Log likelihood: 20201.407339
Log likelihood: 20201.409125
Log likelihood: 20201.409087
Log likelihood: 20201.408502
Log likelihood: 20201.412035
Log likelihood: 20201.407463
Log likelihood: 20201.405158
Log likelihood: 20201.404168
Log likelihood: 20201.408737
Log likelihood: 20201.406809
Log likelihood: 20201.408726
Log likelihood: 20201.406781
Log likelihood: 20201.408681
Log likelihood: 20201.406718
Log likelihood: 20201.404368
Log likelihood: 20201.405709
Log likelihood: 20201.405913
Log likelihood: 20201.403994
Log likelihood: 20201.403837
Log likelihood: 20201.402880
Log likelihood: 20201.402378
Log likelihood: 20201.402184
Log likelihood: 20201.402825
Log likelihood: 20201.402564
Log likelihood: 20201.399522
Log likelihood: 20201.397520
Log likelihood: 20201.403061
Log likelihood: 20201.399459
Log likelihood: 20201.400132
Log likelihood: 20201.399027
Log likelihood: 20201.397112
Log likelihood: 20201.396849
Log likelihood: 20201.394239
Log likelihood: 20201.391908
[...]

Log likelihood: 20011.434902
Log likelihood: 20011.433299
Log likelihood: 20011.494150
Log likelihood: 20011.418437
Log likelihood: 20011.434097
Log likelihood: 20011.420096
Log likelihood: 20011.472609
Log likelihood: 20011.394508
Log likelihood: 20011.459709
Log likelihood: 20011.393492
Log likelihood: 20011.475639
Log likelihood: 20011.391834
Log likelihood: 20011.543321
Log likelihood: 20011.386658
Log likelihood: 20011.385392
Log likelihood: 20011.440516
Log likelihood: 20011.449303
Log likelihood: 20011.386715
Log likelihood: 20011.396411
Log likelihood: 20011.379468
Log likelihood: 20011.375877
Log likelihood: 20011.396045
Log likelihood: 20011.381700
Log likelihood: 20011.378370
Log likelihood: 20011.392072
Log likelihood: 20011.373967
Log likelihood: 20011.377511
Log likelihood: 20011.392157
Log likelihood: 20011.372495
Log likelihood: 20011.396952
Log likelihood: 20011.370902
Log likelihood: 20011.372734
Log likelihood: 20011.375264
Log likelihood: 20011.381446
[...]