1. Modeling for risk analysis focuses mainly on the loss side: it tells us how much we might lose so that we can prepare for it, and it often pays particular attention to extreme adverse scenarios. Modeling for forecasting, by contrast, tries to fit a quantity such as price as closely as possible, so it emphasizes goodness of fit, measured by indicators such as R-squared.

In [1]:
import Utils
import numpy as np
import pandas as pd
from scipy.stats import skew, kurtosis, t

In [2]:
dataset = pd.read_csv("../../FinTech-545-Fall2025/MidTerm/problem2.csv")

print('Problem2:')

print('a.')
print('mean of data:')
print(dataset.mean().values)

print('variance of data:')
print((dataset.std()**2).values)

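print('skewness of data:')
print(skew(dataset, axis=0, bias=False))  # bias-corrected sample skewness

print('kurtosis of data:')
# scipy returns excess kurtosis; adding 3 gives raw kurtosis (bias=True: no small-sample correction)
print(kurtosis(dataset, axis=0, bias=True)+3)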

print(' ')

print('b.')
print('I would choose the t-distribution because the kurtosis is larger than 3, indicating that the data has heavier tails than a normal distribution')
print(' ')

print('c.')
print('Fit normal distribution:')
mu,sigma = Utils.fit_normal(dataset)
print("Estimated mu =", mu)
print("Estimated sigma =", sigma)

print('Fit t-distribution:')
# scipy's t.fit returns (df, loc, scale); sigma is the scale, not the standard deviation
nu, mu, sigma = t.fit(dataset)

print("Estimated mu =", mu)
print("Estimated sigma =", sigma)
print("Estimated nu =", nu)

print("The excess kurtosis = 6/(nu-4) which is", 6/(nu-4),"indicating more outliers")

Problem2:
a.
mean of data:
[-0.00034577]
variance of data:
[0.00048542]
skewness of data:
[0.11425266]
kurtosis of data:
[3.95755955]
 
b.
I would choose the t-distribution because the kurtosis is larger than 3, indicating that the data has heavier tails than a normal distribution
 
c.
Fit normal distribution:
Estimated mu = -0.0003457749550813146
Estimated sigma = 0.02203232224742731
Fit t-distribution:
Estimated mu = -0.0004775894210559369
Estimated sigma = 0.019398966622270863
Estimated nu = 8.85936027362565
The excess kurtosis = 6/(nu-4) which is 1.234730429963222 indicating more outliers
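
For reference, the moments and the t fit in this cell can be reproduced without the Utils helper. The sketch below is a minimal example on synthetic data (an assumption, since problem2.csv is not reproduced here), and the variable names are illustrative. It also shows why the fitted t scale is smaller than the normal sigma: for a t-distribution the standard deviation is sigma * sqrt(nu / (nu - 2)), not sigma itself.

```python
import numpy as np
from scipy.stats import kurtosis, skew, t

# Synthetic stand-in for the single return column (assumption: not the exam data).
rng = np.random.default_rng(0)
x = t.rvs(df=8, loc=0.0, scale=0.02, size=5000, random_state=rng)

mean = x.mean()
var = x.var(ddof=1)                              # unbiased sample variance
skw = skew(x, bias=False)                        # bias-corrected skewness
ex_kurt = kurtosis(x, fisher=True, bias=False)   # bias-corrected excess kurtosis (normal = 0)

# Maximum-likelihood fit; scipy returns (df, loc, scale) in that order.
nu, mu, sigma = t.fit(x)

# Excess kurtosis implied by the fitted t; only finite for nu > 4.
implied_ex_kurt = 6.0 / (nu - 4.0) if nu > 4 else float("inf")

# Standard deviation implied by the fitted t; only finite for nu > 2.
implied_std = sigma * np.sqrt(nu / (nu - 2.0))

print("mean:", mean, "variance:", var, "skewness:", skw, "excess kurtosis:", ex_kurt)
print("nu:", nu, "mu:", mu, "sigma:", sigma)
print("implied excess kurtosis:", implied_ex_kurt, "implied std:", implied_std)
```

With the fit printed above (sigma = 0.0194, nu = 8.86), the implied standard deviation is about 0.0220, which is in line with the sigma of the normal fit.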


In [3]:
print('Problem3:')
print('a.')
print('normal distribution:')
print(Utils.var_from_returns(dataset))

print('t-distribution:')
print(Utils.var_from_returns(returns = dataset, alpha = 0.05, dist = 't'))
print(" ")

print('b.')
print('normal distribution:')
print(Utils.es_normal(dataset))
print('t-distribution:')
print(Utils.es_t(dataset))
print(" ")

print("c.")
print("The ES under the t-distribution is higher than under the normal distribution, because the fitted t has heavier tails")

Problem3:
a.
normal distribution:
{'VaR Absolute(distance from 0)': np.float64(0.03658572011392575)}
t-distribution:
{'VaR Absolute(distance from 0)': np.float64(0.03610246447166382)}
 
b.
normal distribution:
{'ES Absolute(distance from 0)': np.float64(0.045792128233980406)}
t-distribution:
{'ES Absolute(distance from 0)': np.float64(0.04823023095412436)}
 
c.
The ES under the t-distribution is higher than under the normal distribution, because the fitted t has heavier tails
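
The VaR and ES numbers above have simple closed forms once the distribution is fitted. The sketch below is a minimal version of those formulas, not the Utils implementation (which is not shown); the function names are hypothetical. The parameter values are copied from the fit output of Problem 2, and losses are reported as positive distances from 0.

```python
from scipy.stats import norm, t


def var_es_normal(mu, sigma, alpha=0.05):
    """VaR and ES of a normal return distribution, as positive loss figures."""
    z = norm.ppf(alpha)                      # lower-tail quantile, negative for alpha < 0.5
    var = -(mu + sigma * z)
    es = -mu + sigma * norm.pdf(z) / alpha   # mean loss beyond the VaR quantile
    return var, es


def var_es_t(nu, mu, sigma, alpha=0.05):
    """VaR and ES of a location-scale Student t; the ES formula needs nu > 1."""
    q = t.ppf(alpha, df=nu)
    var = -(mu + sigma * q)
    es = -mu + sigma * (nu + q**2) / (nu - 1.0) * t.pdf(q, df=nu) / alpha
    return var, es


# Fitted parameters as printed in the Problem 2 output.
var_n, es_n = var_es_normal(mu=-0.0003457749550813146, sigma=0.02203232224742731)
var_t, es_t = var_es_t(nu=8.85936027362565, mu=-0.0004775894210559369,
                       sigma=0.019398966622270863)

print("normal: VaR =", var_n, "ES =", es_n)
print("t:      VaR =", var_t, "ES =", es_t)
```

These closed forms agree closely with the values printed above. They also make the comparison concrete: at alpha = 0.05 the t VaR is slightly below the normal VaR, while the t ES is above the normal ES, because ES averages over the whole tail beyond the quantile and the t tail is heavier.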


In [4]:
dataset = pd.read_csv("../../FinTech-545-Fall2025/MidTerm/problem4.csv")
print('Problem4:')
print('a.')

cov, corr = Utils.ew_cov_corr_normalized(df = dataset, lam = 0.94)
print(corr)
print(" ")

print('b.')
cov, corr = Utils.ew_cov_corr_normalized(df = dataset, lam = 0.97)
var = np.diag(cov)
std = np.diag(np.sqrt(np.diag(cov)))
print("x1:",var[0],"x2:",var[1],"x3:",var[2])
print(' ')

print('c.')
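# Rebuild the covariance as D @ corr @ D. Note that corr here is the lam = 0.97
# correlation, because the call in part b reassigned it.
cov = std @ corr.values @ std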
print(cov)
print(' ')

print('d.')
print('Recent data should have more weight when doing the calculation.')

Problem4:
a.
          x1        x2        x3
x1  1.000000  0.711329  0.807175
x2  0.711329  1.000000  0.713020
x3  0.807175  0.713020  1.000000
 
b.
x1: 0.01537881149856821 x2: 0.035517432009207775 x3: 0.027813461616296754
 
c.
[[0.01537881 0.01425713 0.01483871]
 [0.01425713 0.03551743 0.02105709]
 [0.01483871 0.02105709 0.02781346]]
 
d.
Recent data should have more weight when doing the calculation.
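
If the goal is to pair correlations estimated with one decay factor with variances estimated with another, the two results need separate names so that the second call does not overwrite the first. The sketch below shows one way to do that. It is a minimal example on synthetic data; `ew_cov` and `cov_to_corr` are hypothetical helpers, not the Utils functions, and removing the unweighted column mean is an assumption.

```python
import numpy as np


def ew_cov(x, lam):
    """Exponentially weighted covariance of the columns of x (rows ordered oldest to newest)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    w = (1.0 - lam) * lam ** np.arange(n - 1, -1, -1)  # newest row gets the largest weight
    w /= w.sum()                                       # normalize so the weights sum to 1
    xc = x - x.mean(axis=0)                            # assumption: remove the unweighted mean
    return (xc * w[:, None]).T @ xc


def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Synthetic correlated data standing in for problem4.csv (assumption).
rng = np.random.default_rng(1)
mix = np.array([[1.0, 0.5, 0.3], [0.0, 1.0, 0.4], [0.0, 0.0, 1.0]])
x = rng.normal(size=(250, 3)) @ mix

corr_94 = cov_to_corr(ew_cov(x, lam=0.94))        # correlations, faster decay
sd_97 = np.sqrt(np.diag(ew_cov(x, lam=0.97)))     # standard deviations, slower decay

# Combine: cov = D @ corr @ D with D = diag(sd_97).
cov_mixed = np.diag(sd_97) @ corr_94 @ np.diag(sd_97)
print(cov_mixed)
```

The diagonal of `cov_mixed` equals the lam = 0.97 variances, and converting it back to a correlation matrix returns the lam = 0.94 correlations.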


In [5]:
dataset = pd.read_csv("../../FinTech-545-Fall2025/MidTerm/problem5.csv")
print('Problem5:')
print('a.')
cov,corr = Utils.pairwise_cov_corr(dataset)
print(cov)
print(' ')

print('b.')
eigvals, S = np.linalg.eigh(cov)
print("Eigenvalues of the covariance matrix:", eigvals)
print("It is not positive semi-definite because it has negative eigenvalues")
print(' ')

print('c.')
cov_h = Utils.higham_covariance(cov)
print('Covariance matrix after Higham’s method')
print(cov_h)
print(' ')

print('d.')
eigvals, S = np.linalg.eigh(cov_h)
idx = np.argsort(eigvals)[::-1]                   # sort descending
eigvals, S = eigvals[idx], S[:, idx]   
s = eigvals.sum()
explained_ratio = eigvals / s
cum = np.cumsum(explained_ratio)
print('variance explained:')
print(explained_ratio)
print('cumulative variance explained')
print(cum)

Problem5:
a.
          x1        x2        x3        x4        x5
x1  1.470484  1.454214  0.877269  1.903226  1.444361
x2  1.454214  1.252078  0.539548  1.621918  1.237877
x3  0.877269  0.539548  1.272425  1.171959  1.091912
x4  1.903226  1.621918  1.171959  1.814469  1.589729
x5  1.444361  1.237877  1.091912  1.589729  1.396186
 
b.
Eigenvalues of the covariance matrix: [-0.31024286 -0.13323183  0.02797828  0.83443367  6.78670573]
It is not positive semi-definite because it has negative eigenvalues
 
c.
Covariance matrix after Higham’s method
          x1        x2        x3        x4        x5
x1  1.470484  1.332524  0.886817  1.628700  1.400961
x2  1.332524  1.252078  0.622298  1.454230  1.217183
x3  0.886817  0.622298  1.272425  1.070369  1.057808
x4  1.628700  1.454230  1.070369  1.814469  1.577137
x5  1.400961  1.217183  1.057808  1.577137  1.396186
 
d.
variance explained:
[ 8.98084652e-01  1.01915353e-01 -8.98534328e-10 -1.38677447e-09
 -1.85787357e-09]
cumulative variance explained
[0.89808
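
The checks in this cell can be sketched without the Utils helper. The example below uses the pairwise covariance matrix printed in part a (rounded to six decimals), tests it for positive semi-definiteness, repairs it with an unweighted version of Higham's alternating projections, and computes the explained variance. The function names are hypothetical, and this is not necessarily the same variant as `Utils.higham_covariance`, so the repaired matrix may differ slightly from the one printed above.

```python
import numpy as np


def project_psd(a):
    """Project a symmetric matrix onto the PSD cone by zeroing negative eigenvalues."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.clip(vals, 0.0, None)) @ vecs.T


def higham_nearest_corr(corr, max_iter=1000, tol=1e-10):
    """Alternating projections with Dykstra's correction (Higham, 2002), unweighted."""
    y = corr.copy()
    delta_s = np.zeros_like(corr)
    for _ in range(max_iter):
        r = y - delta_s                  # remove the previous correction
        x = project_psd(r)               # projection onto the PSD cone
        delta_s = x - r                  # store the new correction
        y_new = x.copy()
        np.fill_diagonal(y_new, 1.0)     # projection onto unit-diagonal matrices
        converged = np.linalg.norm(y_new - y) < tol
        y = y_new
        if converged:
            break
    return y


# Pairwise covariance matrix as printed in part a.
cov = np.array([
    [1.470484, 1.454214, 0.877269, 1.903226, 1.444361],
    [1.454214, 1.252078, 0.539548, 1.621918, 1.237877],
    [0.877269, 0.539548, 1.272425, 1.171959, 1.091912],
    [1.903226, 1.621918, 1.171959, 1.814469, 1.589729],
    [1.444361, 1.237877, 1.091912, 1.589729, 1.396186],
])

eigvals = np.linalg.eigvalsh(cov)
print("PSD:", bool(eigvals.min() >= 0), "eigenvalues:", eigvals)

# Repair on the correlation scale, then restore the original variances.
sd = np.sqrt(np.diag(cov))
cov_fixed = higham_nearest_corr(cov / np.outer(sd, sd)) * np.outer(sd, sd)

# Explained variance; clip round-off negatives left by the iteration to zero.
vals = np.clip(np.linalg.eigvalsh(cov_fixed)[::-1], 0.0, None)
explained = vals / vals.sum()
cum = np.cumsum(explained)
print("variance explained:", explained)
print("cumulative variance explained:", cum)
```

Returning the unit-diagonal iterate keeps the original variances exactly, at the cost of eigenvalues that can be negative at the level of the remaining iteration error. Clipping them before computing the ratios avoids the tiny negative shares seen in the output above.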

In [6]:
dataset = pd.read_csv("../../FinTech-545-Fall2025/MidTerm/problem6.csv")
dataset = dataset[['x1','x2','x3']]
returns = dataset.pct_change().dropna()
print('Problem6:')
print('a.')

returns = returns - returns.mean()
print('Fit t-distribution:')
for column in returns.columns:
    print(column)
    nu, mu, sigma = t.fit(returns[column])
    print("Estimated mu =", mu, "Estimated sigma =", sigma, "Estimated nu =", nu)
print(' ')

print('b.')
samples, R, params = Utils.generate_copula_samples(
    n_assets=3,
    dist_types=["t", "t", "t"],  
    data=returns,                
    corr_method="spearman",     
)
print('correlation matrix used in the copula:')
print(R)
print(' ')

print('c,d')
print('I am using the last price as the current price and using the simulated returns generated in b')
prices = list(dataset.iloc[-1])
out = Utils.portfolio_var_es_sim(
    prices=prices,
    holdings=[100, 100, 100],
    returns=samples, 
    alpha=0.05
)

print(out)

Problem6:
a.
Fit t-distribution:
x1
Estimated mu = -0.0004786520929827172 Estimated sigma = 0.012907549628639717 Estimated nu = 4.729830909131829
x2
Estimated mu = -4.350220543514708e-05 Estimated sigma = 0.009058419601620012 Estimated nu = 6.766945042505089
x3
Estimated mu = 7.48934264331786e-05 Estimated sigma = 0.01706272432180566 Estimated nu = 39.864383658541556
 
b.
correlation matrix used in the copula:
[[1.         0.44629926 0.39419743]
 [0.44629926 1.         0.51176059]
 [0.39419743 0.51176059 1.        ]]
 
c,d
I am using the last price as the current price and using the simulated returns generated in b
     Stock         VaR          ES   VaR_Pct    ES_Pct
0  Asset_1  223.299833  322.327014  0.026883  0.038805
1  Asset_2  134.936181  185.919448  0.017433  0.024019
2  Asset_3  236.222667  298.038643  0.028753  0.036277
3    Total  476.137133  634.402316  0.019624  0.026148
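
The simulation steps behind this cell can be sketched as follows: fit a t marginal per asset, build a Gaussian copula from the Spearman correlation, simulate returns, and read VaR and ES off the simulated P&L. This is a minimal sketch, not the Utils implementation (which is not shown). The returns are synthetic and the prices are hypothetical, so the numbers will not match the table above.

```python
import numpy as np
from scipy.stats import norm, rankdata, t

rng = np.random.default_rng(42)

# Synthetic, demeaned returns for three assets (assumption: stands in for problem6.csv).
base_corr = np.array([[1.0, 0.45, 0.40],
                      [0.45, 1.0, 0.50],
                      [0.40, 0.50, 1.0]])
z = rng.multivariate_normal(np.zeros(3), base_corr, size=500)
returns = t.ppf(norm.cdf(z), df=6) * np.array([0.013, 0.009, 0.017])
returns = returns - returns.mean(axis=0)

# 1. Fit a t marginal to each asset; scipy returns (df, loc, scale).
params = [t.fit(returns[:, j]) for j in range(3)]

# 2. Spearman correlation, computed as the Pearson correlation of the ranks.
R = np.corrcoef(rankdata(returns, axis=0), rowvar=False)

# 3. Draw correlated normals, map them to uniforms, then through each fitted t quantile function.
n_sim = 10_000
u = norm.cdf(rng.multivariate_normal(np.zeros(3), R, size=n_sim))
sim = np.column_stack([t.ppf(u[:, j], *params[j]) for j in range(3)])

# 4. Dollar P&L per asset and for the portfolio (prices and holdings are hypothetical).
prices = np.array([80.0, 75.0, 85.0])
holdings = np.array([100, 100, 100])
pnl = sim * (prices * holdings)
pnl = np.column_stack([pnl, pnl.sum(axis=1)])   # last column is the total portfolio

# 5. VaR is the negated alpha-quantile; ES is the mean loss at or beyond it.
alpha = 0.05
var = -np.quantile(pnl, alpha, axis=0)
es = np.array([-pnl[pnl[:, k] <= -var[k], k].mean() for k in range(pnl.shape[1])])

for name, v, e in zip(["Asset_1", "Asset_2", "Asset_3", "Total"], var, es):
    print(name, "VaR:", round(v, 2), "ES:", round(e, 2))
```

Because the assets are not perfectly correlated, the portfolio ES is smaller than the sum of the three stand-alone ES figures, which is the same diversification effect visible in the Total row above.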
