<a href="https://colab.research.google.com/github/kunan-au/Modeling_Risk/blob/main/Modeling_Market_Risk_VaR.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Value at Risk


Exploring the Frontier of Financial Risk Analysis in the Australian Healthcare Technology Sector

In financial markets, understanding and managing risk is paramount. This project targets the **Australian Healthcare Technology Sector (BK7094)** and analyzes the **Value at Risk (VaR)** of stocks within this niche yet vital sector.

My approach is a fusion of tradition and innovation. I utilize established VaR calculation methodologies like historical analysis, variance-covariance, and Monte Carlo simulations. But I don't stop there. I integrate these with cutting-edge machine learning techniques, including **Random Forest** algorithms, and advanced optimization methods like **simulated annealing** and **genetic algorithms**. This blend not only enriches my analysis but also elevates its precision.

To address the data scarcity common in specialized markets like healthcare technology, I use **Generative Adversarial Networks (GAN)**. This technique allows me to generate synthetic market data scenarios that supplement the limited historical data.

Moreover, my project leverages the power of **Natural Language Processing (NLP)** technologies, such as **BERT** and **CNN**, to automate the analysis of vast market research reports. This automation not only enhances my efficiency but also ensures comprehensive market coverage and deeper insights.

Another highlight of my project is the **individual stock VaR analysis**. This allows for a meticulous dissection of investment risks associated with specific stocks in the Australian Healthcare Technology sector. It's a crucial step towards providing investors and analysts with detailed, actionable insights.

By marrying the robustness of traditional financial risk assessment with the agility of modern technology, the project aims to deliver **a series of weights** together with a nuanced understanding of market risks for stakeholders in the rapidly evolving Australian Healthcare Technology sector.




**Highlights**

**1. Combining Traditional and Advanced Methods:** Integrates historical, variance-covariance, and Monte Carlo simulation methods with modern techniques for a foundational understanding of VaR.

**2. Utilization of Advanced Optimization Algorithms**: Employs sophisticated techniques like simulated annealing and genetic algorithms for effective optimization of model parameters.

**3. Application of Machine Learning Technologies:** Uses algorithms like Random Forest to capture complex and non-linear relationships in financial data, offering detailed risk analysis.

**4. Adaptability and Flexibility of Models:** Continuously adjusts and optimizes models to adapt to market changes.

**5. Comprehensive Understanding of Market Dynamics:** Merges traditional financial risk assessment methods with modern technology for in-depth market risk analysis.

**6. Innovative Approach to Address Data Scarcity:** Enhances data using Generative Adversarial Networks (GAN), creating synthetic yet realistic market data scenarios.

**7. Synthetic Data Generation to Overcome Data Limitations**: Utilizes GANs to create synthetic data, compensating for the lack of original data.

**8. Enhancing Model Training Effectiveness and Generalizability**: Improves model training and adaptability to diverse market scenarios through GAN-generated data.

**9. Automated Scenario Analysis:** Applies NLP technologies like BERT and CNN for automatic interpretation of market research reports, forming new scenario analyses.

**10. Efficiency and Coverage Improvement:** Utilizes NLP technologies for automated scenario analysis, enhancing efficiency and market information coverage.

**11. Depth of Understanding and Insight:** Extracts key information and insights from a vast amount of text using NLP technologies.

**12. Diversity and Innovation in Scenario Analysis:** Offers new analytical perspectives and patterns through automated scenario analysis using advanced NLP technologies.

**13. Targeted Industry Analysis:** Focuses on the Australian Healthcare Technology Sector, providing in-depth and industry-specific risk analysis.

**14. Explanatory Power and Transparency:** Offers models and methods that are easy to understand, enhancing confidence and comprehension among decision-makers.

**15. Comprehensive Analytical Capability:** Combines macroeconomic data with market-specific information for a holistic risk assessment.

**16. Individual Stock VaR Analysis:** Conducts Value at Risk analysis for individual stocks, enabling detailed dissection and understanding of specific investment risks within the sector.

**17. Multi-Asset-Class Combination:** Simulates multiple asset classes, such as funds and options, to generate a VaR distribution that more realistically reflects composite portfolio performance.

**Drawbacks & Todo**

**1. Limited Data Volume:** The available data for the Australian stock market is relatively small, especially given the import failures for [Adherium Ltd (ADR)](https://docs.google.com/spreadsheets/d/1SkXt8Jfm6l1ucnbBetF7RsidqNeq8VZ37l1MJJIKsIE/edit?usp=sharing), [Beamtree Holdings Ltd (BMT)](https://docs.google.com/spreadsheets/d/1pD79cCnjSs9p8FuFwo2BO0ba7iwrgCIz8RZ3ItyeXI0/edit?usp=sharing/), and [Tali Digital Ltd (TD1)](https://docs.google.com/spreadsheets/d/11CXtRS-DMZBSRCAcEvAL-WGZDa4g-K_-typO0QIWz6c/edit?usp=sharing), which may limit the generalizability and accuracy of the models.

**2. Accuracy of Simulated Data:** The simulated data for funds and equities may not accurately reflect real market dynamics, impacting the authenticity and reliability of the models.

**3. Depth of Scenario Analysis:** The current scenario analysis might not comprehensively cover all possible market variations, especially under extreme market conditions.

**4. Model Assumption Limitations:** The financial models used may be based on certain assumptions (like market behavior, asset correlations) that might not hold true under real market conditions.

**5. Continuity of Time Series Data:** The continuity of the time series might be affected by the way data is updated and processed, which in turn affects the time-sensitive analysis of the models.

**6. Adaptability of Quantitative Models:** In rapidly changing market conditions, quantitative models may require more frequent updates and adjustments to maintain their accuracy and relevance.

**7. Computational Resource Requirements:** Advanced data analysis and model training may require significant computational resources, which could be a limiting factor in resource-constrained situations.

**8. Flaws in VaR Calculation:** There are flaws in the calculation of Value at Risk (VaR) for stocks in the Australian Healthcare Technology Sector, especially in adequately addressing tail risks.

**9. Insufficient Sector Mapping:** The study, though focused on the Australian Healthcare Technology Sector (BK7094), might have inadequate sector mapping, affecting the precision of the research.

**10. Mixed and Unclassified Data Sets:** The training data set might be focused on bullish and bearish stocks, but inference data could be research reports for understanding the economic environment, leading to inconsistencies between training and inference data.

**11. Financial Formulas and Asset Class Correlation Estimation:** More financial formulas are needed for valuation, along with establishing correlations between different asset classes.

**12. Extreme Value and Inherent Limitations:** The model might have limitations in handling extreme values and lacks independent quantification of scenarios.

**13. Limitations of NLP Models:** The use of NLP technologies like CNN and BERT for scenario generation did not fully consider the limitations and issues of these models, which may affect their ability to understand complex financial texts and generate accurate scenarios.

**14. Lack of Model Cascading:** The project did not consider effective model cascading strategies, which are crucial in complex financial data analysis.

**15. Computational Power Limitations:** Limited computational power might restrict optimal model parameter tuning, hindering model performance maximization.

**16. Data Restrictions and Inconsistencies:** Restrictions and mixing issues in the data could impact the training and inference effectiveness of the models.

**17. Flaws in VaR Calculation Method:** Improvements are needed in the calculation method of Value at Risk, especially in enhancing predictions for extreme market situations.

**18. Underutilization of GPT for Scenario Enhancement:** Although the use of GPT was considered for enhancing scenario analysis, its capabilities might not have been fully utilized, especially in dealing with complex and uncertain market conditions.

**19. Insufficient Mapping and Correlation in Sector Analysis:** The research on the Australian Healthcare Technology Sector stocks might lack sufficient mapping and analysis of correlations between asset classes.

**20. Issues with Not Importing Real Data:** For fund options, the failure to import real data and relying only on simulations could affect the models' realism and applicability.

**21. Inadequate Assessment of Company Health:** Without fundamental analysis, there's a limited understanding of the actual financial health and stability of the companies within the Australian stock market. This can lead to inaccurate valuations and risk assessments.

**22. Missed Insights into Market Dynamics:** Fundamental analysis often provides deep insights into market trends, sector dynamics, and economic indicators. Its absence means missing out on these crucial insights, which can inform better investment decisions.

**23. Challenges in Long-Term Forecasting:** Fundamental analysis plays a key role in long-term forecasting and in assessing the sustainability of a company's growth. Not incorporating it can weaken the long-term forecasting ability of the models.

......


Details Update / Parameters / Names about Genetic Algorithm

Data From:

https://www.morningstar.com.au/

https://www.tigerbrokers.com.au/

https://www.google.com/finance/?hl=en

https://www.finance.gov.au/

https://www.bis.org/bcbs/

https://www.rba.gov.au/

All outputs are available at the link below:

https://colab.research.google.com/drive/1ViG2YY5FUOomsTtv3CNzXVX_ViSy_pgy?usp=sharing

**Colab Running Information**

In [None]:
gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Not connected to a GPU')
else:
  print(gpu_info)

In [None]:
from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('Not using a high-RAM runtime')
else:
  print('You are using a high-RAM runtime!')

**Package**

In [None]:
!pip install arch
!pip install deap
!pip install tensorflow
!pip install scikit-optimize
!pip install copulas
!pip install nltk
!pip install spacy
!pip install sumy
!pip install PyPDF2
!pip install summa
!pip install accelerate -U
!pip install transformers[torch] -U
!pip install datasets
!pip install keras-tuner
!python -m spacy download en_core_web_sm

In [None]:
# Data Processing and Analysis
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.ensemble import RandomForestRegressor
import gspread

# Data Visualization
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import seaborn as sns
from wordcloud import WordCloud

# Machine Learning and Deep Learning
import tensorflow as tf
from tensorflow.keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense, Dropout, Concatenate, BatchNormalization, Reshape, LSTM, Bidirectional
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.utils import to_categorical

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from kerastuner.tuners import RandomSearch
import torch
import torch.nn.functional as F

# Natural Language Processing (NLP) and Transformers
import spacy
from transformers import BertTokenizer, BertForSequenceClassification, BertForTokenClassification, Trainer, TrainingArguments
from transformers import pipeline
from datasets import Dataset, load_dataset
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer as SumyTokenizer
from sumy.summarizers.lsa import LsaSummarizer
from summa import summarizer

# Network Analysis
import networkx as nx

# Statistics and Optimization
from scipy.optimize import minimize
from scipy.stats import norm, t, uniform
from arch import arch_model

# Miscellaneous
import random
import PyPDF2
import re
import requests
from getpass import getpass
from google.colab import auth
from google.auth import default
from deap import base, creator, tools, algorithms
from skopt import BayesSearchCV
from copulas.multivariate import GaussianMultivariate
from kerastuner import HyperParameters, Tuner
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder

**Data Reading**

BK7094 Stock Data

In [None]:
#Setup for Google Sheets API
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

In [None]:
def get_sheet_data(spreadsheet_name, worksheet_index=0):
    spreadsheet = gc.open(spreadsheet_name)
    worksheet = spreadsheet.get_worksheet(worksheet_index)
    data = worksheet.get_all_values()
    df = pd.DataFrame(data)
    df.columns = df.iloc[0]
    df = df.drop(0)
    df['Date'] = df['Date'].str.split().str[0]
    df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y').dt.date
    df['Close'] = pd.to_numeric(df['Close'], errors='coerce')
    return df

In [None]:
sheet_names = ["ICR","CTQ","AYA","AHI","TRI",
               "MDR","M7T","PME","VHT", "SHG",
               "PCK","JTL","IME","DOC","ALC",
               "4DX","ONE","CGS","GLH","HIQ"]

dataframes = [get_sheet_data(name) for name in sheet_names]

combined_df = pd.concat(dataframes, ignore_index=True)

for df in dataframes:
    print(df.head())

Text Data

In [None]:
pdf_paths = [
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Bullish.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Bearish.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Netrual.pdf'
]
for path in pdf_paths:
    with open(path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        print(f"Contents of {path} (first page):")
        first_page = reader.pages[0]
        print(first_page.extract_text())
        print("\n")

Economic Situation Data

In [None]:
pdf_paths = [
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/rba-annual-report-2023.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/rba-annual-report-2022.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/2021-report.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/statement-on-monetary-policy-2023-11.pdf',
    '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/statement-on-monetary-policy-2023-08.pdf'
]


reports = []
for path in pdf_paths:
    with open(path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        text = ''
        for page in reader.pages:
            text += page.extract_text() + '\n'
        reports.append(text)

df_reports = pd.DataFrame({'report': reports})
print(df_reports.head())

Extra Data Import

In [None]:
api_key = getpass('Enter your API key:')

In [None]:
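url = f"https://newsapi.org/v2/everything?q=economy&apiKey={api_key}"

response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early on HTTP errors instead of a KeyError on 'articles' below
data = response.json()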

articles = data['articles']
for article in articles:
    print(article['title'])
    print(article['description'])
    print()

In [None]:
sentiment_pipeline = pipeline("sentiment-analysis")

for article in articles:
    title = article['title']
    sentiment = sentiment_pipeline(title)[0]
    print(f"Title: {title}")
    print(f"Sentiment: {sentiment['label']}")
    print()

In [None]:
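def map_sentiment_label(sentiment_label):
    # Map pipeline labels to market-sentiment labels; unknown labels fall back to 'Neutral'.
    # Note: a binary sentiment model returns only POSITIVE/NEGATIVE, in which case
    # 'Neutral' is never produced by this mapping.
    mapping = {
        'POSITIVE': 'Bullish',
        'NEGATIVE': 'Bearish',
        'NEUTRAL': 'Neutral'
    }
    return mapping.get(sentiment_label, 'Neutral')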
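
Because the mapping above only ever sees the label, one way to let low-confidence predictions fall into a neutral bucket is to also look at the score. The sketch below is a hypothetical variant, not part of the notebook: `map_sentiment_with_threshold` and the `0.75` cut-off are assumptions chosen for illustration, and it assumes each result is a dict with `label` and `score` keys, as returned by the `transformers` sentiment pipeline.

```python
def map_sentiment_with_threshold(result, neutral_below=0.75):
    """Map a {'label', 'score'} result to Bullish/Bearish/Neutral.

    `neutral_below` is an illustrative cut-off, not a tuned value.
    """
    mapping = {'POSITIVE': 'Bullish', 'NEGATIVE': 'Bearish'}
    # Low-confidence predictions are treated as Neutral
    if result['score'] < neutral_below:
        return 'Neutral'
    # Unknown labels also fall back to Neutral
    return mapping.get(result['label'], 'Neutral')


# Hand-written results standing in for sentiment_pipeline(title)[0]
examples = [
    {'label': 'POSITIVE', 'score': 0.98},
    {'label': 'NEGATIVE', 'score': 0.91},
    {'label': 'POSITIVE', 'score': 0.55},
]
labels = [map_sentiment_with_threshold(r) for r in examples]
print(labels)
```

Any such threshold would need to be checked against labeled financial text before the resulting labels are used for training.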

In [None]:
for article in articles:
    title = article['title']
    sentiment_result = sentiment_pipeline(title)[0]
    custom_label = map_sentiment_label(sentiment_result['label'])
    print(f"Title: {title}")
    print(f"Custom Sentiment: {custom_label}")
    print()

Data Integration

In [None]:
pdf_paths = {
    'Bullish': '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Bullish.pdf',
    'Bearish': '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Bearish.pdf',
    'Neutral': '/content/drive/My Drive/Colab Notebooks/BK7094 Modeling Market Risk Data/Training Text pdf form/Netrual.pdf'
}

data = []
for label, path in pdf_paths.items():
    with open(path, 'rb') as file:
        reader = PyPDF2.PdfReader(file)
        for page in reader.pages:
            text = page.extract_text()
            data.append({'text': text, 'label': label})

# Fetch the articles
response = requests.get(url)
articles = response.json()['articles']

# Initialize the sentiment analysis pipeline
sentiment_pipeline = pipeline("sentiment-analysis")

# Process the articles and add them to your data list
for article in articles:
    title = article['title']
    sentiment_result = sentiment_pipeline(title)[0]
    custom_label = map_sentiment_label(sentiment_result['label'])
    data.append({'text': title, 'label': custom_label})

# Create the final DataFrame
final_df_nlp = pd.DataFrame(data)

# Print the first few rows of the DataFrame
print(final_df_nlp.head())

Visualization

In [None]:
dataframes_forSeen = {name: get_sheet_data(name) for name in sheet_names}
plt.figure(figsize=(12, 6))
for name, df in dataframes_forSeen.items():
    plt.plot(df['Date'], df['Close'], label=name)

plt.xlabel('Date')
plt.ylabel('Close Price')
plt.title('Stock Closing Prices')
plt.legend()
plt.xticks(rotation=45)
plt.show()

In [None]:
# Generate and display a word cloud
plt.figure(figsize=(10, 6))
text = ' '.join(final_df_nlp['text'])
wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.title('Word Cloud of Article Titles')
plt.show()

In [None]:
print(final_df_nlp)

unique_labels = final_df_nlp['label'].unique()
print("Unique Labels:", unique_labels)

label_counts = final_df_nlp['label'].value_counts()
for label, count in label_counts.items():
    print(f"{label}: {count}")

**Data Set Reinforcement**

Bootstrapping

In [None]:
def bootstrap_data(df, n_bootstraps=1):
    bootstrap_samples = []
    for _ in range(n_bootstraps):
        sample = df.sample(n=len(df), replace=True).copy(deep=True)
        bootstrap_samples.append(sample)
    return bootstrap_samples

In [None]:
bootstraped_dataframes = [bootstrap_data(df) for df in dataframes]

for i, bootstrapped_dfs in enumerate(bootstraped_dataframes):
    print(f"Original DataFrame {i}:")
    for j, df_sample in enumerate(bootstrapped_dfs):
        print(f"Bootstrap Sample {j + 1} Head:")
        print(df_sample.head())
        print()
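
Row-wise resampling with `df.sample` draws dates independently, so a resampled frame no longer preserves the time ordering that the LSTM and VaR steps rely on. One common alternative for time series is a moving-block bootstrap, which resamples contiguous blocks instead of single rows. The sketch below only illustrates the idea on a synthetic return series; `moving_block_bootstrap`, `block_size`, and the seed are assumptions and not part of the notebook.

```python
import numpy as np


def moving_block_bootstrap(values, block_size, rng):
    """Resample a 1-D series by concatenating randomly chosen contiguous blocks."""
    values = np.asarray(values)
    n = len(values)
    n_blocks = int(np.ceil(n / block_size))
    # Each block starts at a random position and keeps its internal order
    starts = rng.integers(0, n - block_size + 1, size=n_blocks)
    blocks = [values[s:s + block_size] for s in starts]
    # Trim the concatenation back to the original length
    return np.concatenate(blocks)[:n]


rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.02, size=250)  # synthetic daily returns
sample = moving_block_bootstrap(returns, block_size=10, rng=rng)
print(sample.shape)
```

The block size trades off how much short-range dependence is preserved against how varied the resampled series are; the value 10 here is arbitrary.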

Long Short-Term Memory (LSTM)

In [None]:
def prepare_data_for_lstm(df, feature_columns, target_column, n_steps):
    df_copy = df.copy()
    X, y = [], []
    scalers = {}

    for col in feature_columns:
        scalers[col] = MinMaxScaler(feature_range=(0, 1))
        df_copy[col] = scalers[col].fit_transform(df_copy[col].values.reshape(-1, 1))

    for i in range(n_steps, len(df_copy)):
        X.append(df_copy[feature_columns].iloc[i-n_steps:i].values)
        y.append(df_copy[target_column].iloc[i])

    X, y = np.array(X), np.array(y)
    X = np.reshape(X, (X.shape[0], X.shape[1], len(feature_columns)))

    return X, y, scalers[target_column]
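
`prepare_data_for_lstm` fits each scaler on the full series, so any later evaluation on that same data is in-sample. A minimal sketch of a chronological split, where the min-max bounds come from the training part only, is given below. It uses plain NumPy on a toy series; `make_windows`, the 80/20 split, and the toy prices are assumptions for illustration.

```python
import numpy as np


def make_windows(series, n_steps):
    """Return (X, y): X[i] holds n_steps consecutive values, y[i] the next one."""
    X = np.array([series[i - n_steps:i] for i in range(n_steps, len(series))])
    y = series[n_steps:]
    return X, y


prices = np.linspace(10.0, 20.0, 101)    # toy closing prices
split = int(len(prices) * 0.8)           # first 80% for training, no shuffling
train, test = prices[:split], prices[split:]

# Min-max bounds are taken from the training part only
lo, hi = train.min(), train.max()
train_scaled = (train - lo) / (hi - lo)
test_scaled = (test - lo) / (hi - lo)    # may exceed 1 for unseen price levels

n_steps = 10
X_train, y_train = make_windows(train_scaled, n_steps)
X_test, y_test = make_windows(test_scaled, n_steps)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

For a Keras LSTM the windows would still need a trailing feature axis, for example `X_train[..., None]` in the single-feature case.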

In [None]:
def build_lstm_model(input_shape):
    model = Sequential([
        LSTM(50, return_sequences=True, input_shape=input_shape),
        LSTM(50),
        Dense(1)
    ])
    model.compile(optimizer='adam', loss='mse')
    return model

In [None]:
n_steps = 10

feature_columns = ['Open', 'High', 'Low', 'Close', 'Volume']
target_column = 'Close'

In [None]:
lstm_models = []
predictions = []

for df in dataframes:
    X, y, scaler = prepare_data_for_lstm(df, feature_columns, target_column, n_steps)
    model = build_lstm_model(X.shape[1:])
    model.fit(X, y, epochs=10, batch_size=32)
    lstm_models.append(model)
    predictions.append(model.predict(X))

In [None]:
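# Uses `model`, `X`, `scaler`, and `df` left over from the final loop iteration,
# so this cell only covers the last stock in `sheet_names`.
predicted = model.predict(X)
predicted_original_scale = scaler.inverse_transform(predicted)
new_df = df.copy(deep=True)

new_df['Predicted_Close'] = np.nan

# The first n_steps rows have no prediction. Assign by position: the sheet
# DataFrames are indexed from 1 (the header row was dropped), so label-based
# .loc slicing would place every prediction one row too early.
start_idx = n_steps
end_idx = start_idx + len(predicted_original_scale)

pred_col = new_df.columns.get_loc('Predicted_Close')
new_df.iloc[start_idx:end_idx, pred_col] = predicted_original_scale.squeeze()

print(new_df)

print(new_df['Predicted_Close'])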

Generative Adversarial Networks (GAN)



In [None]:
def build_generator(seq_length, latent_dim, n_features=5):
    input_noise = Input(shape=(latent_dim,))
    x = Dense(128)(input_noise)
    x = LeakyReLU(alpha=0.01)(x)
    x = BatchNormalization(momentum=0.8)(x)
    x = Dense(seq_length * n_features)(x)
    x = Reshape((seq_length, n_features))(x)
    return Model(input_noise, x)

In [None]:
def build_gan(generator, discriminator):
    z = Input(shape=(latent_dim,))
    fake_seq = generator(z)
    discriminator.trainable = False
    validity = discriminator(fake_seq)
    return Model(z, validity)

In [None]:
def build_discriminator(seq_length, n_features=5):
    seq = Input(shape=(seq_length, n_features))
    x = LSTM(64, return_sequences=True)(seq)
    x = LSTM(64)(x)
    x = Dense(1, activation='sigmoid')(x)
    return Model(seq, x)

In [None]:
def preprocess_and_create_sequences(df, selected_columns, seq_length):
    data = df[selected_columns]

    scaler = MinMaxScaler(feature_range=(0, 1))
    scaled_data = scaler.fit_transform(data)

    def create_sequences(data, seq_length):
        xs, ys = [], []
        for i in range(len(data) - seq_length):
            x = data[i:(i + seq_length)]
            y = data[i + seq_length]
            xs.append(x)
            ys.append(y)
        return np.array(xs), np.array(ys)

    X, y = create_sequences(scaled_data, seq_length)
    return X, y, scaler

In [None]:
latent_dim = 32
seq_length = 60
processed_data = [preprocess_and_create_sequences(df, ['Open', 'High', 'Low', 'Close', 'Volume'], seq_length) for df in dataframes]

In [None]:
generator = build_generator(seq_length, latent_dim, n_features=5)  # Make sure to match n_features
discriminator = build_discriminator(seq_length, n_features=5)
discriminator.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5), metrics=['accuracy'])
gan = build_gan(generator, discriminator)
gan.compile(loss='binary_crossentropy', optimizer=Adam(0.0002, 0.5))

In [None]:
def train_gan(generator, discriminator, gan, processed_data, epochs, batch_size, latent_dim):
    for data_tuple in processed_data:
        X, y, _ = data_tuple  # Assuming each tuple is (X, y, scaler)
        real = np.ones((batch_size, 1))
        fake = np.zeros((batch_size, 1))

        for epoch in range(epochs):
            # Randomly select real sequences
            idx = np.random.randint(0, X.shape[0], batch_size)
            real_seqs = X[idx]

            # Generate fake sequences
            noise = np.random.normal(0, 1, (batch_size, latent_dim))
            fake_seqs = generator.predict(noise)

            # Train discriminator
            d_loss_real = discriminator.train_on_batch(real_seqs, real)
            d_loss_fake = discriminator.train_on_batch(fake_seqs, fake)
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

            # Train generator
            noise = np.random.normal(0, 1, (batch_size, latent_dim))
            g_loss = gan.train_on_batch(noise, real)

            # Print progress
            if epoch % 100 == 0:
                print(f"Epoch {epoch} [D loss: {d_loss[0]}, acc.: {100*d_loss[1]}] [G loss: {g_loss}]")

In [None]:
train_gan(generator, discriminator, gan, processed_data, epochs=3, batch_size=32, latent_dim=latent_dim)

In [None]:
def generate_data(generator, latent_dim, num_samples):

    noise = np.random.normal(0, 1, size=(num_samples, latent_dim))

    generated_data = generator.predict(noise)

    return generated_data

In [None]:
num_samples = 1000
latent_dim = 32
generated_data = generate_data(generator, latent_dim, num_samples)

flattened_data = generated_data.reshape(-1, generated_data.shape[1] * generated_data.shape[2])

# Feature order must match the columns the GAN was trained on
columns = ['Open', 'High', 'Low', 'Close', 'Volume']
flattened_columns = [f'{col}_t{i}' for i in range(generated_data.shape[1]) for col in columns]

generated_df = pd.DataFrame(flattened_data, columns=flattened_columns)

print(generated_df)
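
The generator is trained on min-max-scaled sequences, so its output lives in the scaled space rather than in price and volume units. To read the samples as market data, the scaling has to be undone per feature. The sketch below does this with plain NumPy on made-up bounds; in the notebook the equivalent step would be calling `inverse_transform` on the scaler returned by `preprocess_and_create_sequences` after reshaping to two dimensions. The `data_min` and `data_max` values and the uniform stand-in for the generator output are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, seq_length, n_features = 4, 60, 5

# Stand-in for generator.predict(noise): values in the scaled [0, 1] space
generated = rng.uniform(0.0, 1.0, size=(n_samples, seq_length, n_features))

# Hypothetical per-feature bounds for Open, High, Low, Close, Volume
data_min = np.array([1.0, 1.1, 0.9, 1.0, 1_000.0])
data_max = np.array([5.0, 5.2, 4.8, 5.0, 90_000.0])

# Flatten to (n_samples * seq_length, n_features), undo min-max scaling, restore shape
flat = generated.reshape(-1, n_features)
restored = (flat * (data_max - data_min) + data_min).reshape(generated.shape)
print(restored.shape)
```

Since one generator is trained across all stocks while each stock has its own scaler, which scaler to invert with is a modelling decision this sketch does not settle.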

**Individual Stock VaR Analysis**

Statistic Method

In [None]:
def calculate_var(df, confidence_level=95):
    # Convert columns to numeric (excluding Date)
    for col in df.columns:
        if col != 'Date':
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # Calculate daily returns
    df['Return'] = df['Close'].pct_change()

    # Calculate VaR at the specified confidence level
    var = np.percentile(df['Return'].dropna(), 100 - confidence_level)
    return var

In [None]:
# Calculating VaR for each stock
for name, df in zip(sheet_names, dataframes):
    var = calculate_var(df)
    print(f"VaR at 95% confidence level for {name}: {var*100:.2f}%")
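
The percentile approach above is the historical method. For comparison, the variance-covariance method named in the introduction assumes normally distributed returns and reads VaR off the fitted mean and standard deviation. The sketch below compares both on synthetic returns; the data, the seed, and the function names `historical_var` and `parametric_var` are assumptions for illustration only.

```python
import numpy as np
from statistics import NormalDist


def historical_var(returns, confidence_level=95):
    """Empirical percentile of returns (same convention as calculate_var)."""
    return np.percentile(returns, 100 - confidence_level)


def parametric_var(returns, confidence_level=95):
    """Normal-distribution VaR: mean + z * std, with z the lower-tail quantile."""
    z = NormalDist().inv_cdf(1 - confidence_level / 100)
    return np.mean(returns) + z * np.std(returns, ddof=1)


rng = np.random.default_rng(42)
returns = rng.normal(0.0005, 0.02, size=5000)  # synthetic daily returns

print(f"Historical VaR: {historical_var(returns) * 100:.2f}%")
print(f"Parametric VaR: {parametric_var(returns) * 100:.2f}%")
```

On normally distributed data the two estimates should be close; on real small-cap returns with fat tails they can diverge, which is one reason to report both.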

Historical Maximum Loss

In [None]:
# Function to calculate VaR by finding the greatest loss
def calculate_VaR_simple_loss(df):
    # Convert columns to numeric (excluding Date)
    for col in df.columns:
        if col != 'Date':
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # Calculate daily returns
    df['Return'] = df['Close'].pct_change()

    # Find the greatest loss (minimum return)
    greatest_loss = df['Return'].min()
    return greatest_loss

In [None]:
# Calculating VaR for each stock by identifying the greatest loss
for name, df in zip(sheet_names, dataframes):
    var = calculate_VaR_simple_loss(df)
    print(f"VaR (greatest loss) for {name}: {var*100:.2f}%")

Historical & Time Decay Factor EWMA

In [None]:
def calculate_VaR_with_time_decay(df, decay_factor=1):
    # Convert columns to numeric (excluding Date)
    for col in df.columns:
        if col != 'Date':
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # Calculate daily returns
    df['Return'] = df['Close'].pct_change()

    # Apply exponential weighting
    weights = np.array([decay_factor**i for i in range(len(df))])[::-1]
    weighted_returns = df['Return'] * weights
    weighted_returns /= weights.sum()

    # Find the greatest loss (minimum return) in weighted returns
    greatest_loss = weighted_returns.min()
    return greatest_loss

In [None]:
# Calculating VaR for each stock by identifying the greatest loss
for name, df in zip(sheet_names, dataframes):
    var = calculate_VaR_with_time_decay(df)
    print(f"VaR (greatest loss) for {name}: {var*100:.2f}%")

In [None]:
def calculate_VaR_with_time_decay_amend(df, decay_factor=1):
    # Convert columns to numeric (excluding Date)
    for col in df.columns:
        if col != 'Date':
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # Calculate daily returns
    df['Return'] = df['Close'].pct_change()

    # Handle NaN values in returns
    df.dropna(subset=['Return'], inplace=True)

    # When decay_factor is 1, use unweighted returns directly
    if decay_factor == 1:
        greatest_loss = df['Return'].min()
    else:
        # Apply exponential weighting
        weights = np.array([decay_factor**i for i in range(len(df))])[::-1]
        weighted_returns = df['Return'] * weights
        weighted_returns /= weights.sum()

        # Find the greatest loss in weighted returns
        greatest_loss = weighted_returns.min()

    return greatest_loss

In [None]:
# Calculating VaR for each stock by identifying the greatest loss
for name, df in zip(sheet_names, dataframes):
    var = calculate_VaR_with_time_decay_amend(df)
    print(f"VaR (greatest loss) for {name}: {var*100:.2f}%")
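
Multiplying returns by decay weights rescales the losses themselves, which is why the first version shrinks every value and the amended version bypasses the weights when `decay_factor` is 1. In age-weighted historical simulation the weights are instead treated as probabilities attached to each observation, and VaR is the sorted return at which the cumulative weight reaches the tail probability. The sketch below illustrates that idea; `weighted_historical_var`, the decay value of 0.94, and the synthetic returns are assumptions, not the notebook's method.

```python
import numpy as np


def weighted_historical_var(returns, decay_factor=0.94, confidence_level=0.95):
    """VaR as a weighted quantile: recent observations carry more probability."""
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    # Oldest observation gets decay_factor**(n-1), newest gets decay_factor**0
    weights = decay_factor ** np.arange(n)[::-1]
    weights = weights / weights.sum()

    # Sort from worst to best return and accumulate probability mass
    order = np.argsort(returns)
    cumulative = np.cumsum(weights[order])

    # First sorted return whose cumulative weight reaches the tail probability
    idx = min(np.searchsorted(cumulative, 1 - confidence_level), n - 1)
    return returns[order][idx]


rng = np.random.default_rng(7)
returns = rng.normal(0.0, 0.02, size=500)

equal_weight = weighted_historical_var(returns, decay_factor=1.0)
decayed = weighted_historical_var(returns, decay_factor=0.94)
print(f"Equal weights: {equal_weight * 100:.2f}%  Decayed: {decayed * 100:.2f}%")
```

With `decay_factor=1.0` every observation has the same weight and the result reduces to an ordinary empirical quantile, so the decay only changes which part of the history dominates.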

CVaR  Expected Shortfall (ES) Tail VaR

In [None]:
def calculate_CVaR_with_time_decay(df, decay_factor=1, confidence_level=0.95):
    # Convert columns to numeric (excluding Date)
    for col in df.columns:
        if col != 'Date':
            df[col] = pd.to_numeric(df[col], errors='coerce')

    # Calculate daily returns
    df['Return'] = df['Close'].pct_change()
    returns = df['Return'].dropna().to_numpy()

    # Exponential weights used as observation probabilities (newest return gets
    # the largest weight). Scaling the returns themselves by the weights would
    # shrink every loss (by 1/n when decay_factor is 1).
    weights = decay_factor ** np.arange(len(returns))[::-1]
    weights = weights / weights.sum()

    # Sort returns from worst to best and accumulate their weights
    order = np.argsort(returns)
    sorted_returns, sorted_weights = returns[order], weights[order]
    cumulative = np.cumsum(sorted_weights)

    # VaR: first sorted return whose cumulative weight reaches the tail probability
    tail_prob = 1 - confidence_level
    idx = min(np.searchsorted(cumulative, tail_prob), len(returns) - 1)
    VaR_threshold = sorted_returns[idx]

    # CVaR: weighted mean of the returns at or below the VaR threshold
    in_tail = sorted_returns <= VaR_threshold
    CVaR = np.average(sorted_returns[in_tail], weights=sorted_weights[in_tail])

    return CVaR

In [None]:
# Calculating CVaR for each stock by identifying the conditional mean loss beyond the VaR threshold
for name, df in zip(sheet_names, dataframes):
    cvar = calculate_CVaR_with_time_decay(df)
    print(f"CVaR (conditional mean loss) for {name}: {cvar*100:.2f}%")

Monte Carlo

Individual Stock

In [None]:
def calculate_stock_returns(initial_value, final_value, days=365):
    """
    Calculate the annualized and daily returns of a stock.

    :param initial_value: The initial value of the stock.
    :param final_value: The final value of the stock after a period.
    :param days: The number of calendar days over which the final value is measured. Default is 365 for one year.
    :return: A tuple containing the annualized return and daily return as percentages.
    """
    growth = final_value / initial_value

    # Annualize the total growth so periods other than one year are comparable
    # (with days=365 this equals the simple total return)
    annual_return = (growth ** (365 / days) - 1) * 100

    # Calculate the compounded daily return over the same period
    daily_return = (growth ** (1 / days) - 1) * 100

    return annual_return, daily_return

Calculate return

In [None]:
for name, df in zip(sheet_names, dataframes):
    if not df.empty:

        df['Date'] = pd.to_datetime(df['Date'])
        df['Close'] = pd.to_numeric(df['Close'], errors='coerce')

        initial_value = df['Close'].iloc[0]
        final_value = df['Close'].iloc[-1]

        days = (df['Date'].iloc[-1] - df['Date'].iloc[0]).days

        annual_return, daily_return = calculate_stock_returns(initial_value, final_value, days)

        print(name, f"Annual Return: {annual_return:.2f}%, Daily Return: {daily_return:.4f}%")
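
The distinction between total, annualized, and daily return matters whenever a sample does not span exactly one year. As a worked example with made-up numbers: a price that moves from 100 to 121 over 730 calendar days has a total return of 21% but an annualized return of 10%, because 1.1 × 1.1 = 1.21.

```python
initial_value, final_value, days = 100.0, 121.0, 730  # illustrative numbers

growth = final_value / initial_value
total_return = (growth - 1) * 100                       # whole-period return
annualized_return = (growth ** (365 / days) - 1) * 100  # per-year compounded rate
daily_return = (growth ** (1 / days) - 1) * 100         # per-day compounded rate

print(f"Total: {total_return:.2f}%")
print(f"Annualized: {annualized_return:.2f}%")
print(f"Daily: {daily_return:.4f}%")
```

Compounding the daily rate over 730 days recovers the total growth factor, which is a quick consistency check for any of the per-stock figures.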

Normal Distribution

In [None]:
def calculate_VaR_MonteCarlo_Normal(df, days=252, iterations=10000, confidence_level=0.95, plot=False, plot_paths=False):
    # Calculate daily returns
    returns = df['Close'].pct_change().dropna()

    # Fit a GARCH model to estimate volatility
    garch = arch_model(returns, vol='Garch', p=1, q=1)
    model = garch.fit(disp='off')
    forecast = model.forecast(horizon=days)
    vol = np.sqrt(forecast.variance.iloc[-1])

    # Calculate mean return
    mean_return = returns.mean()
    # Simulate returns using the normal distribution
    simulated_returns = np.random.normal(mean_return, vol, (iterations, days))
    # Calculate simulated price changes
    simulated_price_changes = np.exp(simulated_returns) - 1

    # Calculate VaR
    VaR = np.percentile(simulated_price_changes, (1 - confidence_level) * 100)

    if plot:
        # Flatten the array to make it one-dimensional
        flattened_simulated_changes = simulated_price_changes.flatten()
        plt.figure(figsize=(10, 6))
        plt.hist(flattened_simulated_changes, bins=50, alpha=0.7, color='red')
        plt.axvline(x=VaR, color='red', linestyle='--', label=f"VaR at {confidence_level*100}%: {VaR*100:.2f}%")
        plt.title(f"Simulated Price Changes Distribution\nVaR (at {confidence_level*100}%): {VaR*100:.2f}%")
        plt.xlabel('Simulated Price Changes')
        plt.ylabel('Frequency')
        plt.legend()
        plt.grid(True)
        plt.show()

    if plot_paths:
        vol = np.sqrt(forecast.variance.dropna().mean(axis=0))
        simulated_paths = np.zeros((iterations, days))

        for i in range(iterations):
            daily_returns = np.random.normal(mean_return, vol, days)
            simulated_paths[i, :] = np.cumprod(1 + daily_returns) * df['Close'].iloc[-1]

        plt.figure(figsize=(10, 6))
        for i in range(iterations):
            plt.plot(simulated_paths[i], alpha=0.2)

        plt.title(f"Monte Carlo Simulation Paths ({iterations} iterations)")
        plt.xlabel('Days')
        plt.ylabel('Simulated Price')
        plt.grid(True)
        plt.show()

    return VaR
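
As a sanity check on the Monte Carlo approach, the simulated quantile under a normal assumption should land close to the closed-form normal quantile. The sketch below uses only the standard library and made-up values for `mu` and `sigma` rather than the notebook's GARCH forecast, so it illustrates the idea and is not a replacement for the function above:

```python
import random
from statistics import NormalDist


def simulate_normal_var(mu, sigma, confidence_level=0.95, iterations=20000, seed=0):
    """Empirical left-tail quantile of simulated one-day returns."""
    rng = random.Random(seed)
    simulated = sorted(rng.gauss(mu, sigma) for _ in range(iterations))
    # Index of the (1 - confidence_level) quantile in the sorted sample
    index = int((1 - confidence_level) * iterations)
    return simulated[index]


mu, sigma = 0.0005, 0.02  # illustrative daily mean and volatility
mc_var = simulate_normal_var(mu, sigma)
analytic_var = NormalDist(mu, sigma).inv_cdf(1 - 0.95)
print(f"Monte Carlo VaR: {mc_var:.4f}, analytic VaR: {analytic_var:.4f}")
```

If the two numbers disagree noticeably, the iteration count is probably too small for the chosen confidence level.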

T Distribution

In [None]:
def calculate_VaR_MonteCarlo_Advanced(df, days=252, iterations=10000, confidence_level=0.99, scale_factor=10, plot=False, plot_paths=False):
    # Rescaling returns
    returns = df['Close'].pct_change().dropna() * scale_factor

    # GARCH model
    garch = arch_model(returns, vol='Garch', p=1, q=1)
    model = garch.fit(update_freq=10, disp='off')
    forecast = model.forecast(horizon=days)
    vol = np.sqrt(forecast.variance.iloc[-1].iloc[-1]) / scale_factor

    # Fitting t-distribution
    deg_freedom, loc, scale = t.fit(returns)
    simulated_returns = t.rvs(deg_freedom, loc, scale, size=(iterations, days))

    # Adjust for extreme values to prevent overflow
    simulated_returns = np.clip(simulated_returns, a_min=np.percentile(simulated_returns, 1), a_max=np.percentile(simulated_returns, 99))

    # Simulated price changes
    simulated_price_changes = np.exp(simulated_returns * vol) - 1

    # Remove or limit infinite values
    simulated_price_changes = np.clip(simulated_price_changes, a_min=-np.inf, a_max=np.percentile(simulated_price_changes, 99.9))

    # VaR calculation
    VaR = np.percentile(simulated_price_changes, (1 - confidence_level) * 100)

    if plot:
        flattened_simulated_changes = simulated_price_changes.flatten()
        plt.figure(figsize=(10, 6))
        plt.hist(flattened_simulated_changes, bins=50, alpha=0.7, color='blue')
        plt.axvline(x=VaR, color='red', linestyle='--', label=f"VaR at {confidence_level*100}%: {VaR*100:.2f}%")
        plt.title(f"Simulated Price Changes Distribution\nVaR (at {confidence_level*100}%): {VaR*100:.2f}%")
        plt.xlabel('Simulated Price Changes')
        plt.ylabel('Frequency')
        plt.legend()
        plt.grid(True)
        plt.show()

    if plot_paths:
        vol = np.sqrt(forecast.variance.iloc[-1].iloc[-1]) / scale_factor
        simulated_paths = np.zeros((iterations, days))

        for i in range(iterations):
            daily_returns = t.rvs(deg_freedom, loc, scale, size=days) * vol
            daily_returns = np.clip(daily_returns, a_min=-np.inf, a_max=np.percentile(daily_returns, 99.9))
            simulated_paths[i, :] = np.cumprod(1 + daily_returns) * df['Close'].iloc[-1]

        plt.figure(figsize=(10, 6))
        for i in range(iterations):
            plt.plot(simulated_paths[i], alpha=0.2)

        plt.title(f"Monte Carlo Simulation Paths ({iterations} iterations)")
        plt.xlabel('Days')
        plt.ylabel('Simulated Price')
        plt.grid(True)
        plt.show()

    return VaR
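
The reason for switching to a t-distribution is its heavier tails: at the same confidence level its left-tail quantile sits further out than the normal one. A small standard-library sketch of that effect, assuming 4 degrees of freedom purely for illustration (not a value fitted to the notebook's data):

```python
import random
from statistics import NormalDist


def sample_student_t(deg_freedom, rng):
    """Draw one Student-t variate as Z / sqrt(chi2 / deg_freedom)."""
    z = rng.gauss(0.0, 1.0)
    # A chi-square with k degrees of freedom is Gamma(shape=k/2, scale=2)
    chi2 = rng.gammavariate(deg_freedom / 2, 2.0)
    return z / (chi2 / deg_freedom) ** 0.5


rng = random.Random(42)
draws = sorted(sample_student_t(4, rng) for _ in range(50000))
t_quantile = draws[int(0.01 * len(draws))]
normal_quantile = NormalDist().inv_cdf(0.01)
print(f"1% quantile, t(4): {t_quantile:.2f}; standard normal: {normal_quantile:.2f}")
```

The gap between the two quantiles narrows as the degrees of freedom grow, which is why the fitted value matters for the resulting VaR.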

In [None]:
for name, df in zip(sheet_names, dataframes):
    if not df.empty:
        # Call the function for the normal distribution
        var_normal = calculate_VaR_MonteCarlo_Normal(df, days=252, iterations=100, confidence_level=0.95, plot=True, plot_paths=True)
        print(f"Normal VaR (greatest loss) for {name}: {var_normal*100:.2f}%")

        # Call the function for the t-distribution
        var_advanced = calculate_VaR_MonteCarlo_Advanced(df, days=252, iterations=100, confidence_level=0.99, scale_factor=10, plot=True, plot_paths=True)
        print(f"Advanced VaR (greatest loss) for {name}: {var_advanced*100:.2f}%")

Price-Simulated

In [None]:
def combined_monte_carlo_simulation(df, days, iterations, confidence_level, plot=True, plot_subset=100):

    df = df.replace([np.inf, -np.inf], np.nan).dropna()
    returns = df['Close'].pct_change().dropna()
    returns = returns.clip(lower=returns.quantile(0.01), upper=returns.quantile(0.99))

    mean_return = returns.mean()
    std_dev = returns.std()

    garch = arch_model(returns, vol='Garch', p=1, q=1)
    model = garch.fit(disp='off')
    forecast = model.forecast(horizon=days)
    forecasted_vol = np.sqrt(forecast.variance.iloc[-1].mean())

    df_param, loc, scale = t.fit(returns)

    normal_simulated_returns = norm.rvs(mean_return, forecasted_vol, (iterations, days))
    t_simulated_returns = t.rvs(df_param, loc, scale, size=(iterations, days))

    initial_price = df['Close'].iloc[-1]
    normal_price_paths = initial_price * np.cumprod(1 + normal_simulated_returns, axis=1)
    t_price_paths = initial_price * np.cumprod(1 + t_simulated_returns, axis=1)

    normal_final_prices = normal_price_paths[:, -1]
    t_final_prices = t_price_paths[:, -1]
    normal_price_VaR = np.percentile(normal_final_prices, 100 - (confidence_level * 100))
    t_price_VaR = np.percentile(t_final_prices, 100 - (confidence_level * 100))

    normal_price_VaR_percent = (normal_price_VaR - initial_price) / initial_price * 100
    t_price_VaR_percent = (t_price_VaR - initial_price) / initial_price * 100

    if plot:
        plt.figure(figsize=(14, 7))
        # Never request more paths than were simulated, otherwise np.random.choice raises
        subset_indices = np.random.choice(range(iterations), size=min(plot_subset, iterations), replace=False)
        plt.plot(normal_price_paths[subset_indices].T, alpha=0.1, color='blue')
        plt.plot(t_price_paths[subset_indices].T, alpha=0.1, color='red')
        plt.title(f"Combined Monte Carlo Simulation ({iterations} simulations)")
        plt.xlabel('Days')
        plt.ylabel('Simulated Price Paths')
        plt.grid(True)
        plt.show()

        plt.figure(figsize=(10, 6))
        plt.hist(normal_final_prices, bins=50, alpha=0.7, color='blue')
        plt.hist(t_final_prices, bins=50, alpha=0.7, color='red')
        plt.axvline(x=normal_price_VaR, color='navy', linestyle='--', label=f"Normal VaR: {normal_price_VaR_percent:.2f}%")
        plt.axvline(x=t_price_VaR, color='darkred', linestyle='--', label=f"t-Distribution VaR: {t_price_VaR_percent:.2f}%")
        plt.title(f"Distribution of Simulated Final Prices\nNormal VaR: {normal_price_VaR_percent:.2f}%, t-Distribution VaR: {t_price_VaR_percent:.2f}%")
        plt.xlabel('Simulated Final Prices')
        plt.ylabel('Frequency')
        plt.legend()
        plt.grid(True)
        plt.show()

    return normal_price_VaR_percent, t_price_VaR_percent

In [None]:
for name, df in zip(sheet_names, dataframes):
    if not df.empty:
        var_normal, var_advanced = combined_monte_carlo_simulation(
            df,
            days=252,
            iterations=10000,
            confidence_level=0.9,
            plot=True,
            plot_subset=100
        )
        print(f"Normal VaR (greatest loss) for {name}: {var_normal:.2f}%")
        print(f"Advanced VaR (greatest loss) for {name}: {var_advanced:.2f}%")
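
The price-based figure differs from the daily-return VaR because it compounds returns over the whole horizon and reads the quantile off the final prices. A stripped-down sketch of that calculation with the standard library and illustrative inputs; the helper `horizon_var_percent` is hypothetical and uses a plain normal draw instead of the GARCH and t fits:

```python
import random


def horizon_var_percent(initial_price, mu, sigma, days, iterations, confidence_level, seed=1):
    """Percent change from the initial price at the left-tail quantile of final prices."""
    rng = random.Random(seed)
    final_prices = []
    for _ in range(iterations):
        price = initial_price
        for _ in range(days):
            price *= 1 + rng.gauss(mu, sigma)  # compound one simulated daily return
        final_prices.append(price)
    final_prices.sort()
    var_price = final_prices[int((1 - confidence_level) * iterations)]
    return (var_price - initial_price) / initial_price * 100


# Illustrative inputs: 1% daily volatility, 20 trading days, 90% confidence
var_pct = horizon_var_percent(10.0, 0.0, 0.01, days=20, iterations=2000, confidence_level=0.9)
print(f"20-day VaR at 90%: {var_pct:.2f}%")
```

With zero drift the result should be in the neighbourhood of the one-day quantile scaled by the square root of the horizon.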

Return

In [None]:
def compare_VaR_distributions(dataframes, weights, confidence_level):
    # Calculate daily returns for each stock
    return_frames = [df['Close'].pct_change().dropna() for df in dataframes]

    # Combine returns into a single DataFrame
    combined_returns = pd.concat(return_frames, axis=1)

    # Clean the data: Remove rows with non-finite values (NaN, inf, -inf)
    combined_returns = combined_returns.replace([np.inf, -np.inf], np.nan).dropna()

    # Calculate portfolio's daily returns
    portfolio_returns = combined_returns.dot(weights)

    # Fit a normal distribution
    mean, std = norm.fit(portfolio_returns)

    # Fit a t-distribution
    df, loc, scale = t.fit(portfolio_returns)

    # Calculate VaR for normal distribution
    VaR_normal = norm.ppf(1 - confidence_level, mean, std)

    # Calculate VaR for t-distribution
    VaR_t = t.ppf(1 - confidence_level, df, loc, scale)

    # Visualization
    plt.figure(figsize=(10, 6))
    plt.hist(portfolio_returns, bins=50, alpha=0.7, color='blue', density=True, label='Empirical')

    # Plot normal distribution
    x = np.linspace(min(portfolio_returns), max(portfolio_returns), 100)
    plt.plot(x, norm.pdf(x, mean, std), 'r-', lw=2, label='Normal Distribution')
    plt.axvline(x=VaR_normal, color='red', linestyle='dashed', linewidth=2, label=f'Normal VaR: {VaR_normal}')

    # Plot t-distribution
    plt.plot(x, t.pdf(x, df, loc, scale), 'g-', lw=2, label='t-Distribution')
    plt.axvline(x=VaR_t, color='green', linestyle='dashed', linewidth=2, label=f't-VaR: {VaR_t}')

    plt.title("Comparison of VaR Using Normal and t-Distributions")
    plt.xlabel("Returns")
    plt.ylabel("Density")
    plt.legend()
    plt.show()

    return VaR_normal, VaR_t

In [None]:
confidence_level = 0.95  # Confidence level
num_stocks = len(sheet_names)
equal_weights = [1/num_stocks for _ in range(num_stocks)]
# Compare VaR
VaR_normal, VaR_t = compare_VaR_distributions(dataframes, equal_weights, confidence_level)
print(f"Calculated VaR (Normal): {VaR_normal}, VaR (t-Distribution): {VaR_t}")

**Portfolio Analysis**

Weight Calculation

In [None]:
def get_real_time_price(api_key, symbol, market='ASX'):
    full_symbol = f"{market}:{symbol}"
    url = f"https://financialmodelingprep.com/api/v3/quote/{full_symbol}?apikey={api_key}"

    try:
        # A timeout keeps one unresponsive request from hanging the notebook
        response = requests.get(url, timeout=10)
        data = response.json()
        if not data:
            print(f"No data available for symbol: {full_symbol}")
            return None
        real_time_price = data[0]['price']
        return real_time_price
    except (requests.RequestException, ValueError, KeyError, IndexError):
        # Network failures, non-JSON bodies and unexpected payloads all end up here
        print(f"Error retrieving data for symbol: {full_symbol}")
        return None

In [None]:
def calculate_portfolio_weights(amount, stock_prices, stock_quantities):
    # Calculate the total investment for each stock
    total_investment_per_stock = [price * quantity for price, quantity in zip(stock_prices, stock_quantities)]

    # Calculate the total investment in the portfolio
    total_investment = sum(total_investment_per_stock)

    # Check if the total investment exceeds the provided amount
    if total_investment > amount:
        raise ValueError("The total investment exceeds the provided amount.")

    # Calculate the weights
    weights = [investment / total_investment for investment in total_investment_per_stock]

    # Normalize weights so they sum up to 1
    total_weights = sum(weights)
    normalized_weights = [weight / total_weights for weight in weights]

    # Visualization
    bottom = 0
    for i, (weight, name) in enumerate(zip(normalized_weights, sheet_names)):
        plt.bar('Portfolio', weight, bottom=bottom, label=name)
        bottom += weight

    plt.legend()
    plt.title('Portfolio Weights Visualization')
    plt.ylabel('Weight')
    plt.show()

    return normalized_weights

In [None]:
def calculate_investment_by_weights(amount, weights):
    # Ensure the sum of weights equals 1, allowing for floating-point rounding error
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("The sum of the weights must equal 1.")

    # Calculate the investment amount for each stock
    investment_per_stock = [amount * weight for weight in weights]

    return investment_per_stock
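
Two details of the weight logic are easy to check by hand: weights are each position's market value divided by the portfolio value, and their sum is only approximately 1 in floating point. A small sketch with made-up prices and quantities; `weights_from_holdings` is a hypothetical helper, not one of the notebook's functions:

```python
import math


def weights_from_holdings(prices, quantities):
    """Market-value weights: each position's value over the portfolio value."""
    values = [p * q for p, q in zip(prices, quantities)]
    total = sum(values)
    if total <= 0:
        raise ValueError("Portfolio value must be positive.")
    return [v / total for v in values]


demo_weights = weights_from_holdings([2.5, 0.3, 12.0], [10, 15, 12])
print([round(w, 4) for w in demo_weights])

# Exact equality can fail on rounding error, so compare with a tolerance
print(sum([0.1] * 10) == 1.0, math.isclose(sum([0.1] * 10), 1.0))
print(math.isclose(sum(demo_weights), 1.0))
```

This is why a tolerance-based comparison is the safer way to validate that weights sum to 1.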

In [None]:
api_key = getpass('Enter your API key: ')
stock_symbols = sheet_names

In [None]:
real_time_stock_prices = [get_real_time_price(api_key, symbol) for symbol in sheet_names]

# Fall back to a placeholder price of 0.3 when no quote was returned (None)
real_time_stock_prices = [price if price is not None else 0.3 for price in real_time_stock_prices]

print(f"The latest prices of {sheet_names} are: {real_time_stock_prices}")

In [None]:
stock_quantities = [10, 15, 12, 20, 18, 5, 8, 25, 7, 9,
                    11, 14, 13, 19, 17, 6, 10, 23, 8, 10]

initial_amount = 10000  # Example amount

weights = calculate_portfolio_weights(initial_amount, real_time_stock_prices, stock_quantities)
print(f"Weights of each stock in the portfolio: {weights}")

investment_per_stock = calculate_investment_by_weights(initial_amount, weights)
print(f"Investment in each stock: {investment_per_stock}")

**Historical Method**

In [None]:
def calculate_and_plot_historical_VaR(dataframes, weights, confidence_level):
    # Calculate daily returns for each stock
    return_frames = [df['Close'].pct_change().dropna() for df in dataframes]

    # Combine returns into a single DataFrame
    combined_returns = pd.concat(return_frames, axis=1)

    # Calculate portfolio's daily returns
    portfolio_returns = combined_returns.dot(weights)

    # Sort the portfolio returns
    sorted_returns = portfolio_returns.sort_values()

    # Calculate VaR
    VaR_index = int((1 - confidence_level) * len(sorted_returns))
    VaR = sorted_returns.iloc[VaR_index]

    # Visualization
    plt.figure(figsize=(10, 6))
    plt.hist(portfolio_returns, bins=50, alpha=0.7, color='blue')
    plt.axvline(x=VaR, color='red', linestyle='dashed', linewidth=2, label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.title("Portfolio Returns Distribution with VaR")
    plt.xlabel("Returns")
    plt.ylabel("Frequency")
    plt.legend()
    plt.show()

    return VaR
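
The index-based lookup is easiest to follow on a tiny sample: sort the returns, step `(1 - confidence_level) * n` positions in from the worst one, and read off the value. The sketch below does this on twenty made-up returns and also takes the mean of the returns at or below that value, which is the usual CVaR definition; the helper name is hypothetical:

```python
def historical_var_cvar(returns, confidence_level=0.95):
    """Historical VaR (left-tail order statistic) and CVaR (mean of returns at or below it)."""
    ordered = sorted(returns)
    var_index = int((1 - confidence_level) * len(ordered))
    var = ordered[var_index]
    tail = [r for r in ordered if r <= var]
    return var, sum(tail) / len(tail)


# Twenty made-up daily returns
sample_returns = [-0.08, -0.05, -0.03, -0.02, -0.015, -0.01, -0.005, 0.0, 0.002, 0.004,
                  0.006, 0.008, 0.01, 0.012, 0.015, 0.018, 0.02, 0.025, 0.03, 0.04]
var_95, cvar_95 = historical_var_cvar(sample_returns)
print(f"VaR: {var_95:.3f}, CVaR: {cvar_95:.3f}")
```

With twenty observations the 95% VaR is the second-worst return, and the CVaR averages it with the worst one.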

In [None]:
def calculate_and_plot_VaR_CVaR(dataframes, weights, confidence_level, lambda_decay):
    # Calculate daily returns for each stock with time decay applied
    return_frames = []
    for df in dataframes:
        daily_returns = df['Close'].pct_change().dropna()
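        # Apply time decay (exponential weighting). lambda_decay is the decay factor,
        # so the pandas smoothing factor is alpha = 1 - lambda_decay
        daily_returns_weighted = daily_returns.ewm(alpha=1 - lambda_decay).mean()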
        return_frames.append(daily_returns_weighted)

    # Combine returns data
    combined_returns = pd.concat(return_frames, axis=1)
    portfolio_returns = combined_returns.dot(weights)

    # Calculate VaR
    sorted_returns = portfolio_returns.sort_values()
    VaR_index = int((1 - confidence_level) * len(sorted_returns))
    VaR = sorted_returns.iloc[VaR_index]

    # Calculate CVaR
    CVaR = sorted_returns[sorted_returns <= VaR].mean()

    # Visualization
    plt.figure(figsize=(10, 6))
    plt.hist(portfolio_returns, bins=50, alpha=0.7, color='blue')
    plt.axvline(x=VaR, color='red', linestyle='dashed', linewidth=2, label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.axvline(x=CVaR, color='green', linestyle='dashed', linewidth=2, label=f'CVaR: {CVaR}')
    plt.title("Portfolio Returns Distribution with VaR and CVaR")
    plt.xlabel("Returns")
    plt.ylabel("Frequency")
    plt.legend()
    plt.show()

    return VaR, CVaR

In [None]:
confidence_level = 0.95
lambda_decay = 0.94

# Calculate and visualize VaR and CVaR
VaR, CVaR = calculate_and_plot_VaR_CVaR(dataframes, weights, confidence_level, lambda_decay)
print(f"Calculated VaR: {VaR}, CVaR: {CVaR}")
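
How quickly older observations fade depends on the `alpha` passed to `ewm`: with the default `adjust=True`, the observation `i` steps back gets a weight proportional to `(1 - alpha) ** i`. The sketch below reproduces those weights with the standard library for two illustrative values, so the effect of a decay factor of 0.94 versus a smoothing factor of 0.94 can be compared directly; `ewm_weights` is a hypothetical helper:

```python
def ewm_weights(alpha, n_obs):
    """Normalised exponential weights; index 0 is the most recent observation."""
    raw = [(1 - alpha) ** i for i in range(n_obs)]
    total = sum(raw)
    return [w / total for w in raw]


slow = ewm_weights(alpha=0.06, n_obs=250)  # each step back keeps 94% of the weight
fast = ewm_weights(alpha=0.94, n_obs=250)  # each step back keeps 6% of the weight
print(f"Weight on latest observation: alpha=0.06 -> {slow[0]:.3f}, alpha=0.94 -> {fast[0]:.3f}")
```

A large `alpha` makes the smoothed series track the latest return almost exactly, while a small one spreads the weight over several months of data.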

**Statistical Method**

In [None]:
def calculate_and_plot_statistical_VaR(dataframes, weights, confidence_level, time_horizon=1):
    # Calculate daily returns for each stock
    return_frames = [df['Close'].pct_change().dropna() for df in dataframes]

    # Combine returns into a single DataFrame
    combined_returns = pd.concat(return_frames, axis=1)

    # Calculate portfolio's daily returns
    portfolio_returns = combined_returns.dot(weights)

    # Compute portfolio standard deviation (volatility)
    portfolio_std = np.std(portfolio_returns)

    # Calculate VaR
    Z_score = norm.ppf(confidence_level)
    VaR = Z_score * portfolio_std * np.sqrt(time_horizon)

    # Visualization
    plt.figure(figsize=(10, 6))
    plt.hist(portfolio_returns, bins=50, alpha=0.7, color='blue')
    plt.axvline(x=-VaR, color='red', linestyle='dashed', linewidth=2, label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.title("Portfolio Returns Distribution with VaR")
    plt.xlabel("Returns")
    plt.ylabel("Frequency")
    plt.legend()
    plt.show()

    return VaR
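
The calculation above is the variance-covariance formula VaR = z × σ × √T, which ignores the mean return and reports the loss as a positive number. A worked sketch with the standard library and an assumed 2% daily volatility; `parametric_var` is a hypothetical name:

```python
from math import sqrt
from statistics import NormalDist


def parametric_var(portfolio_std, confidence_level=0.95, time_horizon=1):
    """Variance-covariance VaR as a positive loss fraction, assuming zero mean return."""
    z_score = NormalDist().inv_cdf(confidence_level)
    return z_score * portfolio_std * sqrt(time_horizon)


# Illustrative 2% daily volatility, for one day and scaled to a 10-day horizon
print(f"1-day 95% VaR: {parametric_var(0.02):.4f}")
print(f"10-day 95% VaR: {parametric_var(0.02, 0.95, 10):.4f}")
```

The square-root-of-time scaling assumes independent, identically distributed daily returns, which is an approximation for real price series.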

In [None]:
def calculate_and_plot_t_VaR(dataframes, weights, confidence_level, time_horizon=1):

    return_frames = [df['Close'].pct_change().dropna() for df in dataframes]

    # Combine returns into a single DataFrame
    combined_returns = pd.concat(return_frames, axis=1)

    # Clean the data: Remove rows with non-finite values (NaN, inf, -inf)
    combined_returns = combined_returns.replace([np.inf, -np.inf], np.nan).dropna()

    # Calculate portfolio's daily returns
    portfolio_returns = combined_returns.dot(weights)

    # Fit a t-distribution to the portfolio returns
    df, loc, scale = t.fit(portfolio_returns)

    # Calculate VaR
    VaR = t.ppf(1 - confidence_level, df, loc=loc, scale=scale) * np.sqrt(time_horizon)

    # Visualization
    plt.figure(figsize=(10, 6))
    plt.hist(portfolio_returns, bins=50, alpha=0.7, color='blue', density=True)
    x = np.linspace(min(portfolio_returns), max(portfolio_returns), 100)
    plt.plot(x, t.pdf(x, df, loc, scale), 'r-', lw=2, label='t-distribution')
    plt.axvline(x=VaR, color='red', linestyle='dashed', linewidth=2, label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.title("Portfolio Returns Distribution with VaR (t-Distribution)")
    plt.xlabel("Returns")
    plt.ylabel("Density")
    plt.legend()
    plt.show()

    return VaR

In [None]:
confidence_level = 0.95  # Confidence level
# Calculate and visualize VaR
VaR = calculate_and_plot_statistical_VaR(dataframes, weights, confidence_level)
print(f"Calculated VaR at {confidence_level*100}% confidence level: {VaR}")

In [None]:
confidence_level = 0.95  # Confidence level
# Calculate and visualize VaR
VaR = calculate_and_plot_t_VaR(dataframes, weights, confidence_level)
print(f"Calculated VaR using t-distribution at {confidence_level*100}% confidence level: {VaR}")

**Monte-Carlo Method**

Plot Function


In [None]:
def plot_distribution(final_returns, VaR, CVaR, confidence_level):
    plt.figure(figsize=(12, 6))
    sns.histplot(final_returns, kde=True, color='blue', bins=50)
    plt.axvline(x=VaR, color='red', linestyle='--', label=f"VaR at {confidence_level*100}%: {VaR:.2f}")
    plt.axvline(x=CVaR, color='green', linestyle='--', label=f"CVaR: {CVaR:.2f}")
    plt.title("Portfolio Return Distribution with VaR and CVaR")
    plt.xlabel("Portfolio Returns")
    plt.ylabel("Frequency")
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

In [None]:
def plot_simulations(simulated_prices, simulated_returns, VaR, CVaR, confidence_level):
    # Plot the simulated prices
    plt.figure(figsize=(21, 10))  # Adjusting figure size for prices plot
    for i in range(simulated_prices.shape[0]):
        plt.plot(simulated_prices[i], alpha=0.5)
    plt.title("Simulated Portfolio Prices")
    plt.xlabel("Days")
    plt.ylabel("Prices")
    plt.show()  # Display the prices plot

    print("")

    # Plot the simulated returns
    plt.figure(figsize=(21, 10))  # Adjusting figure size for returns plot
    for i in range(simulated_returns.shape[0]):
        plt.plot(simulated_returns[i], alpha=0.5)
    plt.title("Simulated Portfolio Returns")
    plt.xlabel("Days")
    plt.ylabel("Returns")

    # Plot VaR and CVaR only when they are real numbers (not None or NaN).
    # Both are left-tail quantiles of the returns, so they are drawn at their own values.
    if VaR is not None and not np.isnan(VaR):
        plt.axhline(y=VaR, color='red', linestyle='dashed', linewidth=2, label=f'VaR at {confidence_level*100}%: {VaR}')
    if CVaR is not None and not np.isnan(CVaR):
        plt.axhline(y=CVaR, color='orange', linestyle='dashed', linewidth=2, label=f'CVaR: {CVaR}')
    plt.legend()

    plt.tight_layout()
    plt.show()  # Display the returns plot

In [None]:
def plot_monte_carlo_simulation(simulated_portfolio_returns, VaR, CVaR, confidence_level):
    plt.figure(figsize=(21, 14))
    plt.subplot(2, 1, 1)
    colors = sns.color_palette("hsv", len(simulated_portfolio_returns))
    for i in range(len(simulated_portfolio_returns)):
        plt.plot(simulated_portfolio_returns[i, :], color=colors[i], alpha=0.5)
    plt.axhline(y=VaR, color='red', linestyle='--', label=f"VaR at {confidence_level*100}%: {VaR:.2f}")
    plt.axhline(y=CVaR, color='green', linestyle='--', label=f"CVaR: {CVaR:.2f}")
    plt.title("Monte Carlo Simulation of Portfolio")
    plt.xlabel('Days')
    plt.ylabel('Simulated Portfolio Returns')
    plt.legend()
    plt.grid(True)
    plt.tight_layout()
    plt.show()

In [None]:
def plot_correlation_heatmap(combined_returns):
    plt.figure(figsize=(8, 6))
    sns.heatmap(combined_returns.corr(), annot=True, cmap='viridis')
    plt.title("Assets Correlation Heatmap")
    plt.show()

In [None]:
def plot_VaR_impacts(VaR_impact):
    categories = list(VaR_impact.keys())
    impacts = list(VaR_impact.values())

    plt.figure(figsize=(10, 5))
    plt.bar(categories, impacts, color='skyblue')
    plt.xlabel('Analysis Type')
    plt.ylabel('Adjusted VaR')
    plt.title('Impact on VaR by Different Analyses')
    plt.xticks(rotation=45)
    plt.show()

Basic VaR

In [None]:
def calculate_VaR_t_distribution(returns, confidence_level=0.99, iterations=10000):
    # Fit t-distribution to the data
    df, loc, scale = t.fit(returns)
    # Simulate returns
    simulated_returns = t.rvs(df, loc, scale, size=iterations)
    # Calculate VaR
    VaR = np.percentile(simulated_returns, (1 - confidence_level) * 100)
    return VaR

Liquidity Risk

In [None]:
def calculate_liquidity_risk(dataframes, liquidity_threshold):
    liquidity_risk = {}
    for df, name in zip(dataframes, sheet_names):
        average_volume = df['Volume'].mean()
        liquidity_risk[name] = 'High' if average_volume < liquidity_threshold else 'Low'
    return liquidity_risk

Stress Testing

In [None]:
def stress_test(dataframes, weights, stress_factors):
    stress_results = {}
    for factor, change in stress_factors.items():
        stressed_returns = []
        for df, weight in zip(dataframes, weights):
            stressed_price = df['Close'] * (1 + change)
            stressed_return = stressed_price.pct_change().dropna()
            stressed_returns.append(stressed_return * weight)
        portfolio_stressed_return = pd.concat(stressed_returns, axis=1).sum(axis=1)
        stress_results[factor] = portfolio_stressed_return
    return stress_results
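
One property of percentage returns is worth checking before interpreting the stress results: multiplying every price in a series by the same constant leaves the returns unchanged, because the factor cancels in each ratio. The sketch below verifies this on made-up prices and then shows one possible alternative, applying the shock as a single extra move after the last price. That alternative is an assumption for illustration, not the notebook's method:

```python
import math


def pct_changes(prices):
    """Simple returns between consecutive prices."""
    return [b / a - 1 for a, b in zip(prices, prices[1:])]


demo_prices = [10.0, 10.5, 10.2, 10.8]
shock = -0.10
scaled = [p * (1 + shock) for p in demo_prices]

# The common factor cancels in each ratio, so the returns are unchanged
same = all(math.isclose(x, y) for x, y in zip(pct_changes(demo_prices), pct_changes(scaled)))
print("Returns unchanged after rescaling prices:", same)

# Alternative sketch: append one shocked price so the shock shows up as a return
shocked_prices = demo_prices + [demo_prices[-1] * (1 + shock)]
print("Return on the shock day:", round(pct_changes(shocked_prices)[-1], 4))
```

Whichever form of shock is chosen, it has to change the return series itself for the stressed VaR to differ from the baseline.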

Scenario Analysis

In [None]:
def conduct_scenario_analysis(dataframes, weights, scenarios):
    scenario_results = {}
    for scenario_name, scenario_changes in scenarios.items():
        scenario_portfolio_returns = pd.DataFrame()
        for df, weight, name in zip(dataframes, weights, sheet_names):
            scenario_df = df.copy()
            for column, change in scenario_changes.items():
                if column in scenario_df.columns:
                    scenario_df[column] *= (1 + change)
            # Use the 'Close' column explicitly instead of the leaked loop variable `column`
            scenario_portfolio_returns[name] = scenario_df['Close'].pct_change() * weight
        total_scenario_return = scenario_portfolio_returns.sum(axis=1)
        scenario_VaR = np.percentile(total_scenario_return.dropna(), 5)
        scenario_results[scenario_name] = scenario_VaR
    return scenario_results

VaR Impact

In [None]:
def analyze_VaR_impact(dataframes, weights, sheet_names, liquidity_threshold, stress_factors, scenarios, VaR, confidence_level=0.99):

    liquidity_risks = calculate_liquidity_risk(dataframes, liquidity_threshold)
    stress_test_results = stress_test(dataframes, weights, stress_factors)
    scenario_analysis_results = conduct_scenario_analysis(dataframes, weights, scenarios)

    adjusted_VaR = {
        'Liquidity Risk': {},
        'Stress Test': {},
        'Scenario Analysis': {}
    }

    for name, risk in liquidity_risks.items():
        adjusted_VaR['Liquidity Risk'][name] = VaR * 1.1 if risk == 'High' else VaR

    for condition, returns in stress_test_results.items():
        adjusted_VaR['Stress Test'][condition] = calculate_VaR_t_distribution(returns, confidence_level)

    # conduct_scenario_analysis already returns a single VaR value per scenario
    for scenario, scenario_VaR in scenario_analysis_results.items():
        adjusted_VaR['Scenario Analysis'][scenario] = scenario_VaR

    flat_adjusted_VaR = {}

    for analysis_type, results in adjusted_VaR.items():
        for name, value in results.items():
            flat_adjusted_VaR[f"{analysis_type} - {name}"] = value

    return flat_adjusted_VaR

Note: parameters such as alpha must be kept consistent with the parameters used for the basic VaR above.

In [None]:
def monte_carlo_portfolio_analysis(dataframes, weights, sheet_names, days=252, iterations=100, confidence_level=0.99, increased_volatility_factor=1.5):
    # Convert days and iterations to integers
    days = int(days)
    iterations = int(iterations)

    # Combine the returns of the different dataframes
    combined_returns = pd.DataFrame()
    combined_prices = pd.DataFrame()
    for df, weight, name in zip(dataframes, weights, sheet_names):
        daily_returns = df['Close'].pct_change().dropna()
        combined_returns[name] = daily_returns * weight
        combined_prices[name] = df['Close'] * weight

    portfolio_returns = combined_returns.sum(axis=1)  # Portfolio returns
    portfolio_prices = combined_prices.sum(axis=1)  # Portfolio prices

    # Initialize arrays for simulated portfolio prices and returns
    simulated_portfolio_prices = np.zeros((iterations, days))
    simulated_portfolio_returns = np.zeros((iterations, days))
    initial_price = portfolio_prices.iloc[0]  # Initial portfolio price

    # Estimate the drift and the scaled volatility once, outside the loops
    mu = portfolio_returns.mean()
    sigma = portfolio_returns.std() * increased_volatility_factor

    # Run Monte Carlo simulation for prices and returns
    for i in range(iterations):
        prices = [initial_price]
        for d in range(1, days):
            simulated_return = np.random.normal(mu, sigma)
            price = prices[d-1] * (1 + simulated_return)
            prices.append(price)
        simulated_portfolio_prices[i, :] = prices
        simulated_portfolio_returns[i, :] = np.array(prices) / initial_price - 1

    # Calculate final returns and VaR
    final_returns = simulated_portfolio_returns[:, -1]
    VaR = calculate_VaR_t_distribution(final_returns, confidence_level, iterations)

    # Calculate CVaR if there are returns below VaR
    returns_below_VaR = final_returns[final_returns <= VaR]
    CVaR = np.nan  # Initialize CVaR as NaN
    if len(returns_below_VaR) > 0:
        CVaR = returns_below_VaR.mean()  # Calculate CVaR

    # Call plotting functions (each helper creates its own figure,
    # so no extra plt.figure() call is needed here)
    plot_simulations(simulated_portfolio_prices, simulated_portfolio_returns, VaR, CVaR, confidence_level)
    print("\nGraph Description: Simulated Portfolio Price Trajectories\n")  # Description for the first plot

    # Plotting the distribution of final returns
    plot_distribution(final_returns, VaR, CVaR, confidence_level)
    print("\nGraph Description: Distribution of Final Returns\n")  # Description for the second plot

    # Plotting the Monte Carlo simulation
    plot_monte_carlo_simulation(simulated_portfolio_returns, VaR, CVaR, confidence_level)
    print("\nGraph Description: Monte Carlo Simulation Results\n")  # Description for the third plot

    # Plotting the correlation heatmap
    plot_correlation_heatmap(combined_returns)
    print("\nGraph Description: Correlation Heatmap of Portfolio Returns\n")  # Description for the fourth plot

    VaR_impact = analyze_VaR_impact(dataframes, weights, sheet_names, liquidity_threshold, stress_factors, scenarios, VaR, confidence_level)

    plot_VaR_impacts(VaR_impact)

    return VaR, CVaR, simulated_portfolio_prices, simulated_portfolio_returns

In [None]:
liquidity_threshold = 100

stress_factors = {
    'Market Drop': -0.10,
    'Interest Rate Rise': 0.05
}

scenarios = {
    'Economic Boom': {'Close': 0.1},
    'Market Crash': {'Close': -0.2}
}

Natural Language Processing *(NLP)*

Predefined Functions

In [None]:
def extract_gdp_growth(text):
    pattern = r"GDP growth rate is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

In [None]:
def extract_unemployment_rate(text):
    pattern = r"unemployment rate is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

In [None]:
def extract_inflation_rate(text):
    pattern = r"inflation rate is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

In [None]:
def extract_interest_rate(text):
    pattern = r"interest rate is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

In [None]:
def extract_budget_deficit(text):
    pattern = r"budget deficit is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

In [None]:
def extract_trade_balance(text):
    pattern = r"trade balance is (\d+\.?\d*)%"
    match = re.search(pattern, text)
    return float(match.group(1)) if match else None

# Add more...

In [None]:
def extract_all_economic_data(text):
    data = {
        "GDP Growth Rate": extract_gdp_growth(text),
        "Unemployment Rate": extract_unemployment_rate(text),
        "Inflation Rate": extract_inflation_rate(text),
        "Interest Rate": extract_interest_rate(text),
        "Budget Deficit": extract_budget_deficit(text),
        "Trade Balance": extract_trade_balance(text),
        # Add more indicators...
    }
    return {k: v for k, v in data.items() if v is not None}

Test

In [None]:
text_test = "This year's GDP growth rate is 3.2%, the unemployment rate has dropped to 4.5%, the interest rate remains at 2%, and the trade balance is 1.5%."

# Extract economic data
economic_data = extract_all_economic_data(text_test)

# Print results
for key, value in economic_data.items():
    print(f"{key}: {value}")
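
Because each `extract_*` pattern requires the word "is" directly before the number, the test sentence only yields the GDP growth rate and the trade balance; "has dropped to 4.5%" and "remains at 2%" are skipped. A more tolerant, table-driven variant might look like the sketch below. The names `INDICATOR_PHRASES` and `extract_indicators` are hypothetical, and the pattern is a heuristic that can still mis-attribute numbers in longer sentences:

```python
import re

# Indicator label -> phrase to look for (the same six indicators as above)
INDICATOR_PHRASES = {
    "GDP Growth Rate": "GDP growth rate",
    "Unemployment Rate": "unemployment rate",
    "Inflation Rate": "inflation rate",
    "Interest Rate": "interest rate",
    "Budget Deficit": "budget deficit",
    "Trade Balance": "trade balance",
}


def extract_indicators(text):
    """Return {label: value} for every indicator followed by a percentage."""
    found = {}
    for label, phrase in INDICATOR_PHRASES.items():
        # Skip connecting words (no digits, '.', '%' or '-') up to the first signed number
        pattern = rf"{phrase}\b[^.%\d-]*?(-?\d+(?:\.\d+)?)\s*%"
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found[label] = float(match.group(1))
    return found


sample_text = ("This year's GDP growth rate is 3.2%, the unemployment rate has dropped to 4.5%, "
               "the interest rate remains at 2%, and the trade balance is -1.5%.")
print(extract_indicators(sample_text))
```

Keeping the phrases in one dictionary also means a new indicator needs a single new entry rather than another function.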

BERT Model Construction

In [None]:
# Initializing the tokenizer and model with the 'bert-base-uncased' pre-trained model.
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model_Bert = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)

# Assuming final_df_nlp is your DataFrame containing 'text' and 'label'
# Convert labels to numerical values
label_map = {'Bullish': 0, 'Bearish': 1, 'Neutral': 2}
final_df_nlp['label'] = final_df_nlp['label'].map(label_map)

# Split the dataset into training and testing
train_df, test_df = train_test_split(final_df_nlp, test_size=0.2)
train_dataset = Dataset.from_pandas(train_df)
test_dataset = Dataset.from_pandas(test_df)

In [None]:
# Define a function for tokenization. It processes the text data for BERT.
def tokenize_function(examples):
    result = tokenizer(examples["text"], padding="max_length", truncation=True, max_length=128)
    result["labels"] = list(map(int, examples["label"]))
    return result

In [None]:
# Applying the tokenization function to the dataset in batches.
tokenized_train_dataset = train_dataset.map(tokenize_function, batched=True)
tokenized_test_dataset = test_dataset.map(tokenize_function, batched=True)

# Setting up training arguments. Specifies how the model should be trained.
training_args = TrainingArguments(
    output_dir='./results',  # Directory where the training results will be saved.
    num_train_epochs=50,  # Number of epochs to train for.
    per_device_train_batch_size=16,  # Batch size per device during training.
    warmup_steps=500,  # Number of steps for the warmup phase.
    weight_decay=0.01,  # Weight decay for regularization.
    logging_dir='./logs',  # Directory for storing logs.
    logging_steps=10,  # Log metrics every 10 steps.
)

In [None]:
# Initializing the Trainer with the model, training arguments, and dataset.
trainer = Trainer(
    model=model_Bert,
    args=training_args,
    train_dataset=tokenized_train_dataset,
    eval_dataset=tokenized_test_dataset
)

# Start training the model.
trainer.train()

# Evaluate the model on the test dataset.
results = trainer.evaluate()
print(results)

In [None]:
# Define a function to classify a single piece of text using the trained model.
def classify_text_Bert(text, model_Bert, tokenizer):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model_Bert.to(device)
    model_Bert.eval()  # Inference mode: disables dropout so predictions are deterministic
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model_Bert(**inputs)
    predictions = outputs.logits.argmax(-1)
    return predictions.item()

In [None]:
# Example text for classification.
text = "The market is expected to grow significantly in the next quarter."

# Classify the example text.
prediction = classify_text_Bert(text, model_Bert, tokenizer)
print("Classification result:", prediction)
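
The classifier returns a class ID rather than a sentiment label. A small sketch of translating the ID back, assuming the same `label_map` used when the labels were encoded (the helper name `label_for` is hypothetical):

```python
# Same mapping the notebook uses when encoding labels for training.
label_map = {'Bullish': 0, 'Bearish': 1, 'Neutral': 2}

# Invert it so a predicted class ID can be reported as a sentiment label.
id_to_label = {idx: name for name, idx in label_map.items()}

def label_for(prediction):
    """Translate a class ID into its label; unknown IDs raise KeyError."""
    return id_to_label[int(prediction)]

print(label_for(0))
```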

Random Search

In [None]:
# Defining parameter ranges
param_distributions = {
    "num_train_epochs": [2, 3, 4],  # Number of training epochs
    "per_device_train_batch_size": [8, 16, 32],  # Batch size per device
    # scipy.stats.uniform(loc, scale) samples from [loc, loc + scale]
    "learning_rate": uniform(1e-5, 4e-5),  # Learning rate in [1e-5, 5e-5]
    "warmup_steps": [0, 500, 1000],  # Number of warmup steps
    "weight_decay": uniform(0.0, 0.1)  # Weight decay in [0.0, 0.1]
}

In [None]:
def sample_params(param_dist, num_samples):
    sampled_params = []
    for _ in range(num_samples):
        params = {k: random.choice(v) if isinstance(v, list) else v.rvs() for k, v in param_dist.items()}
        sampled_params.append(params)
    return sampled_params

In [None]:
# Sample parameters
sampled_params = sample_params(param_distributions, num_samples=10)  # Generate 10 random parameter sets
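
Learning rates are often searched on a logarithmic scale so that small values are sampled as densely as large ones. The sketch below shows one way to do that using only the standard library; `sample_log_uniform` is a hypothetical helper, not part of the notebook or of SciPy.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)  # fixed seed so the draws are repeatable
learning_rates = [sample_log_uniform(1e-5, 5e-5, rng) for _ in range(5)]
print(learning_rates)
```

A helper like this could replace the `rvs()` call for the learning rate inside `sample_params` if log-scale sampling is preferred.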

In [None]:
# Train and evaluate for each parameter set
for params in sampled_params:
    # Update training arguments
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=params['num_train_epochs'],
        per_device_train_batch_size=params['per_device_train_batch_size'],
        learning_rate=params['learning_rate'],
        warmup_steps=params['warmup_steps'],
        weight_decay=params['weight_decay'],
        logging_dir='./logs',
        logging_steps=10,
    )

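    # Start each trial from the pre-trained checkpoint so that trials do not
    # inherit weights fine-tuned under earlier parameter sets
    model_Bert = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=3)

    # Initialize trainer
    trainer = Trainer(
        model=model_Bert,
        args=training_args,
        train_dataset=tokenized_train_dataset,
        eval_dataset=tokenized_test_dataset
    )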

    # Train the model
    trainer.train()

    # Evaluate the model
    results = trainer.evaluate()
    print(f"Params: {params}")
    print(f"Results: {results}\n")

In [None]:
prediction = classify_text_Bert(text, model_Bert, tokenizer)
print("Classification result:", prediction)

Convolutional Neural Network Model Construction *(CNN)*

In [None]:
def create_model(vocab_size, embedding_dim, max_length, filter_sizes, num_filters, num_classes):
    input_layer = Input(shape=(max_length,))
    embedding = Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length)(input_layer)
    pooled_outputs = []
    for size in filter_sizes:
        conv = Conv1D(filters=num_filters, kernel_size=size, activation='relu')(embedding)
        pool = MaxPooling1D(pool_size=max_length - size + 1)(conv)
        pooled_outputs.append(pool)
    merged = Concatenate(axis=1)(pooled_outputs)
    flatten = Flatten()(merged)
    dense = Dense(10, activation='relu')(flatten)
    output = Dense(num_classes, activation='softmax')(dense)
    model = Model(inputs=input_layer, outputs=output)
    optimizer = Adam(learning_rate=1e-3)  # `lr` is the old argument name; current Keras expects `learning_rate`
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

    return model
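
In `create_model`, the pooling window `max_length - size + 1` equals the length of the convolution output (Keras `Conv1D` defaults to `'valid'` padding and stride 1), so each filter is reduced to a single value, which amounts to global max pooling. The arithmetic can be checked with a short sketch; `conv1d_output_length` is a hypothetical helper written for this illustration.

```python
def conv1d_output_length(input_length, kernel_size, stride=1):
    """Length of a 'valid' (unpadded) 1D convolution output."""
    return (input_length - kernel_size) // stride + 1

max_length = 200
for size in [3, 4, 5]:
    conv_len = conv1d_output_length(max_length, size)
    # The pooling window equals the conv output length, so one value per filter remains.
    pooled_len = conv_len // (max_length - size + 1)
    print(size, conv_len, pooled_len)
```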

In [None]:
def build_enhanced_cnn_model(vocab_size, embedding_dim, max_length, num_filters, filter_sizes, entity_embedding_dim, num_relations):
    # Input and embedding layers
    text_input = Input(shape=(max_length,))
    text_embedding = Embedding(input_dim=vocab_size, output_dim=embedding_dim)(text_input)

    # Convolution and pooling layers
    conv_blocks = []
    for size in filter_sizes:
        conv = Conv1D(filters=num_filters, kernel_size=size, activation='relu')(text_embedding)
        conv = MaxPooling1D(pool_size=2)(conv)
        conv = Flatten()(conv)
        conv_blocks.append(conv)

    # Merge convolution layer outputs
    merged = Concatenate()(conv_blocks) if len(conv_blocks) > 1 else conv_blocks[0]

    # Dense, dropout, and batch normalization layers
    merged = Dense(256, activation='relu')(merged)
    merged = Dropout(0.3)(merged)
    merged = BatchNormalization()(merged)

    # Bidirectional LSTM layer
    merged = Reshape((1, -1))(merged)
    lstm = Bidirectional(LSTM(128, return_sequences=False))(merged)

    # Output layer
    output = Dense(num_relations, activation='softmax')(lstm)

    # Create and compile the model
    model = Model(inputs=text_input, outputs=output)
    lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
      initial_learning_rate=0.001,
      decay_steps=100000,
      decay_rate=0.96,
      staircase=True)

    optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)

    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

    return model

In [None]:
def compile_model(model):
    """
    Compile the given model with Adam optimizer and categorical crossentropy loss.

    :param model: The Keras model to compile.
    :return: Compiled model.
    """
    optimizer = Adam(learning_rate=0.001)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model

In [None]:
# Define your hyperparameters and create the model
vocab_size = 10000
embedding_dim = 100
max_length = 200
num_filters = 128
filter_sizes = [3, 4, 5]
entity_embedding_dim = 50
num_relations = 10

final_df_nlp['text'] = final_df_nlp['text'].apply(lambda x: x.lower())

tokenizer = Tokenizer(num_words=vocab_size, oov_token='<OOV>')
tokenizer.fit_on_texts(final_df_nlp['text'])
sequences = tokenizer.texts_to_sequences(final_df_nlp['text'])
padded = pad_sequences(sequences, maxlen=max_length, padding='post', truncating='post')


label_encoder = LabelEncoder()
encoded_labels = label_encoder.fit_transform(final_df_nlp['label'])
categorical_labels = to_categorical(encoded_labels, num_classes=3)


X_train, X_val, y_train, y_val = train_test_split(padded, categorical_labels, test_size=0.2, random_state=42)

model = build_enhanced_cnn_model(vocab_size, embedding_dim, max_length, num_filters, filter_sizes, entity_embedding_dim, len(label_encoder.classes_))
model.summary()

model.fit(X_train, y_train, batch_size=32, epochs=10, validation_data=(X_val, y_val))

Tuning Parameters / Random Search

In [None]:
def build_model(hp):
    max_length = 200  # or another value of your choice
    num_relations = 3  # number of classes, adjust according to your task

    text_input = Input(shape=(max_length,))
    embedding_dim = hp.Choice('embedding_dim', values=[50, 100, 150])
    text_embedding = Embedding(input_dim=vocab_size, output_dim=embedding_dim)(text_input)

    conv_blocks = []
    for size in [3, 4, 5]:
        num_filters = hp.Int(f'num_filters_{size}', min_value=32, max_value=128, step=32)
        conv = Conv1D(filters=num_filters, kernel_size=size, activation='relu')(text_embedding)
        conv = MaxPooling1D(pool_size=2)(conv)
        conv = Flatten()(conv)
        conv_blocks.append(conv)

    merged = Concatenate()(conv_blocks) if len(conv_blocks) > 1 else conv_blocks[0]
    dense_units = hp.Int('dense_units', min_value=64, max_value=256, step=64)
    merged = Dense(dense_units, activation='relu')(merged)
    merged = Dropout(0.5)(merged)
    merged = BatchNormalization()(merged)

    lstm_units = hp.Int('lstm_units', min_value=64, max_value=256, step=64)
    merged = Reshape((1, -1))(merged)
    lstm = Bidirectional(LSTM(lstm_units, return_sequences=False))(merged)

    output = Dense(num_relations, activation='softmax')(lstm)
    model = Model(inputs=text_input, outputs=output)

    lr = hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='LOG')
    optimizer = Adam(learning_rate=lr)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

    return model

In [None]:
tuner = RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=1,  # Number of hyperparameter combinations to try
    executions_per_trial=1,
    directory='my_dir',
    project_name='nlp_tuning'
)

In [None]:
tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
model_CNN = tuner.get_best_models(num_models=1)[0]

print("Best hyperparameters:", best_hps.values)
model_CNN.summary()

CNN Text Classification Function

In [None]:
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

In [None]:
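def classify_text_CNN(text, model_CNN, bert_tokenizer, max_length):
    # Caveat: model_CNN was trained on Keras Tokenizer indices (at most vocab_size
    # entries), while this function feeds it BERT WordPiece IDs from a different,
    # larger vocabulary. Predictions are only meaningful when training and
    # inference share one tokenizer.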
    inputs = bert_tokenizer.encode_plus(
        text,
        add_special_tokens=True,
        max_length=max_length,
        padding='max_length',
        truncation=True,
        return_tensors='tf'
    )
    predictions = model_CNN.predict([inputs['input_ids']])
    prediction_class = np.argmax(predictions, axis=-1)
    confidence = np.max(predictions, axis=-1)
    return prediction_class[0], confidence[0]

In [None]:
text = "The market is expected to grow significantly in the next quarter."
prediction_class, confidence = classify_text_CNN(text, model_CNN, bert_tokenizer, max_length)
print("Classification result:", prediction_class, "with confidence", confidence)

BERT Text Classification Function

In [None]:
def classify_text_Bert(text, model_Bert, tokenizer):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model_Bert.to(device)
    model_Bert.eval()  # Inference mode: disables dropout so predictions are deterministic
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model_Bert(**inputs)
    logits = outputs.logits
    probabilities = F.softmax(logits, dim=-1)
    confidence, predictions = torch.max(probabilities, dim=-1)
    return predictions.item(), confidence.item()

In [None]:
pretrained_model_name = 'bert-base-uncased'
tokenizer = BertTokenizer.from_pretrained(pretrained_model_name)
prediction_class, confidence = classify_text_Bert(text, model_Bert, tokenizer)
print("Classification result:", prediction_class, "with confidence", confidence)

In [None]:
def determine_final_classification(result_bert, result_cnn):
    classification_bert, confidence_bert = result_bert
    classification_cnn, confidence_cnn = result_cnn

    if classification_bert == classification_cnn:
        return classification_bert

    return classification_bert if confidence_bert > confidence_cnn else classification_cnn
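
A short worked example of the combination rule follows. The function is restated so the block runs on its own, and the class IDs and confidences are made-up values. Note the tie case: because the comparison is a strict `>`, equal confidences with different classes fall through to the CNN result.

```python
def determine_final_classification(result_bert, result_cnn):
    # Restated from the notebook so this block runs on its own.
    classification_bert, confidence_bert = result_bert
    classification_cnn, confidence_cnn = result_cnn
    if classification_bert == classification_cnn:
        return classification_bert
    return classification_bert if confidence_bert > confidence_cnn else classification_cnn

# Hypothetical (class_id, confidence) pairs.
agree = determine_final_classification((0, 0.62), (0, 0.91))
bert_wins = determine_final_classification((0, 0.80), (1, 0.55))
tie = determine_final_classification((0, 0.70), (1, 0.70))
print(agree, bert_wins, tie)
```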

In [None]:
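def adjust_scenarios_and_factors(classification, economic_data):
    # Relies on the notebook-level `scenarios` and `stress_factors` objects.
    # The classifiers return class IDs, so translate them back to the labels
    # that key the rules below (the inverse of label_map).
    id_to_label = {0: 'Bullish', 1: 'Bearish', 2: 'Neutral'}
    if not isinstance(classification, str):
        classification = id_to_label.get(int(classification), 'Neutral')

    # Note: only the 'GDP Growth Rate' rules are applied further down;
    # the 'Unemployment Rate' rules are defined but not yet used.
    adjustment_rules = {
        'Bullish': {'GDP Growth Rate': (0.02, 'positive'), 'Unemployment Rate': (-0.01, 'negative')},
        'Bearish': {'GDP Growth Rate': (-0.02, 'negative'), 'Unemployment Rate': (0.01, 'positive')},
        'Neutral': {}
    }

    adjustments = adjustment_rules.get(classification, {})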

    for key, (adjustment, impact) in adjustments.items():
        if key in economic_data:
            value = economic_data[key]

            if key == 'GDP Growth Rate':
                if impact == 'positive':
                    scenarios['Economic Boom']['Close'] += adjustment * value
                    scenarios['Market Crash']['Close'] -= adjustment * value
                elif impact == 'negative':
                    scenarios['Economic Boom']['Close'] -= adjustment * value
                    scenarios['Market Crash']['Close'] += adjustment * value

    return scenarios, stress_factors

In [None]:
from keras.preprocessing.sequence import pad_sequences

for report in df_reports['report']:
    classification_bert, confidence_bert = classify_text_Bert(report, model_Bert, tokenizer)
    classification_cnn, confidence_cnn = classify_text_CNN(report, model_CNN, tokenizer, max_length)
    final_classification = determine_final_classification((classification_bert, confidence_bert), (classification_cnn, confidence_cnn))
    economic_data = extract_all_economic_data(report)
    adjusted_scenarios, adjusted_factors = adjust_scenarios_and_factors(final_classification, economic_data)

print(adjusted_scenarios)
print(adjusted_factors)

In [None]:
liquidity_threshold = 100
scenario_results = conduct_scenario_analysis(dataframes, weights, scenarios)
print(weights)
print(scenario_results)
# Call the function to plot the analysis
monte_carlo_portfolio_analysis(dataframes, weights, sheet_names)

In [None]:
def calculate_portfolio_VaR(weights, dataframes, confidence_level=0.99, scale_factor=10):
    portfolio_returns = pd.DataFrame()

    for weight, df in zip(weights, dataframes):
        returns = df['Close'].pct_change().dropna() * scale_factor
        portfolio_returns = pd.concat([portfolio_returns, returns * weight], axis=1)

    total_returns = portfolio_returns.sum(axis=1)
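    # Fit a Student's t distribution; `dof` avoids reusing the loop variable name `df`
    dof, loc, scale = t.fit(total_returns)
    # A fixed seed keeps the result repeatable, which an optimizer that calls this
    # function many times needs in order to compare weights consistently
    simulated_returns = t.rvs(dof, loc, scale, size=10000, random_state=42)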

    simulated_price_changes = np.exp(simulated_returns) - 1
    VaR = np.percentile(simulated_price_changes, (1 - confidence_level) * 100)

    return VaR
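
`calculate_portfolio_VaR` reports VaR as a percentile of simulated returns, so the figure is a signed return (negative for a loss) rather than a positive loss amount. The sketch below illustrates that sign convention on twenty made-up returns. It uses a nearest-rank quantile, whereas `np.percentile` interpolates linearly by default, so the two can differ slightly; `percentile_var` is a hypothetical helper.

```python
def percentile_var(returns, confidence_level=0.95):
    """Nearest-rank quantile of returns at the (1 - confidence_level) tail."""
    ordered = sorted(returns)
    index = int((1 - confidence_level) * len(ordered))
    return ordered[index]

# Twenty hypothetical daily returns.
sample_returns = [-0.08, -0.05, -0.03, -0.02, -0.01, 0.0, 0.0, 0.01, 0.01, 0.01,
                  0.02, 0.02, 0.02, 0.03, 0.03, 0.03, 0.04, 0.04, 0.05, 0.06]
var_return = percentile_var(sample_returns, confidence_level=0.95)
print(var_return)    # signed return at the 5% tail
print(-var_return)   # the same figure quoted as a positive loss
```

Any code that minimizes or compares VaR needs to know which of the two conventions it is receiving.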

Optimizing

Basic Minimization

In [None]:
def minimize_VaR(dataframes, sheet_names, confidence_level=0.99, scale_factor=10):

    num_assets = len(dataframes)
    initial_weights = np.array([1.0 / num_assets] * num_assets)

    # Constraints: Weights sum to 1, and each weight is between 0 and 1
    constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
    bounds = tuple((0, 1) for asset in range(num_assets))

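    # Objective function: calculate_portfolio_VaR returns a return-space percentile
    # (negative for a loss), so minimizing its negation minimizes the loss
    def objective(weights):
        return -calculate_portfolio_VaR(weights, dataframes, confidence_level, scale_factor)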

    # Optimization
    result = minimize(objective, initial_weights, method='SLSQP', bounds=bounds, constraints=constraints)

    if result.success:
        optimized_weights = result.x
        optimized_VaR = calculate_portfolio_VaR(optimized_weights, dataframes, confidence_level, scale_factor)
        print(f"Optimized Weights: {optimized_weights}")
        print(f"Optimized Portfolio VaR: {optimized_VaR}")
        return optimized_weights, optimized_VaR
    else:
        raise ValueError('Optimization failed')

In [None]:
optimized_weights, optimized_VaR = minimize_VaR(dataframes, sheet_names)

Genetic Algorithm

In [None]:
NUM_PARAMS = 6

In [None]:
def create_individual():
    number_of_assets = len(dataframes)
    weights = np.array([random.uniform(0, 1) for _ in range(number_of_assets)])
    weights = weights / weights.sum()  # Normalize the weights to sum to 1

    days = random.randint(200, 300)  # Example values
    iterations = random.randint(800, 1200)
    confidence_level = random.uniform(0.95, 0.99)

    return [weights, days, iterations, confidence_level]

In [None]:
def fitness_function(individual):
    weights, days, iterations, confidence_level = individual
    VaR, CVaR, VaR_impact, additional_value = monte_carlo_portfolio_analysis(dataframes, weights, sheet_names, days, iterations, confidence_level)
    return (-VaR,)

# Set up GA parameters
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("individual", tools.initIterate, creator.Individual, create_individual)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("evaluate", fitness_function)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutGaussian, mu=0, sigma=1, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)
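
Gaussian mutation adds unbounded noise, so mutated portfolio weights can turn negative or stop summing to 1. One common remedy is a repair step applied after mutation. The sketch below is a hypothetical helper (`repair_weights`) that assumes a long-only portfolio; it is not part of DEAP or of the notebook.

```python
def repair_weights(weights):
    """Clip negative weights to zero and rescale so the weights sum to 1."""
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0:
        # Degenerate case: fall back to equal weights.
        return [1.0 / len(weights)] * len(weights)
    return [w / total for w in clipped]

mutated = [0.5, -0.2, 0.9]   # hypothetical weights after a Gaussian mutation
repaired = repair_weights(mutated)
print(repaired)
```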

In [None]:
def optimize_and_track():
    population = toolbox.population(n=1)
    ngen = 2
    best_scores = []
    best_individuals = []

    for gen in range(ngen):
        offspring = algorithms.varAnd(population, toolbox, cxpb=0.5, mutpb=0.2)
        fits = toolbox.map(toolbox.evaluate, offspring)

        for fit, ind in zip(fits, offspring):
            ind.fitness.values = fit

        population = toolbox.select(offspring, k=len(population))
        best_ind = tools.selBest(population, k=1)[0]
        best_scores.append(best_ind.fitness.values[0])
        best_individuals.append(best_ind)


        print(f"Generation {gen}: Best individual is {best_ind}, Best VaR is {-best_ind.fitness.values[0]}")

    overall_best_ind = tools.selBest(population, k=1)[0]
    print(f"Overall best individual is {overall_best_ind}, with VaR: {-overall_best_ind.fitness.values[0]}")

    return best_scores, best_individuals

In [None]:
best_scores, best_individuals = optimize_and_track()

Fitness is defined as the negative of VaR, so a larger score on the graph (closer to zero, or positive) means better fitness: it corresponds to lower risk.
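
A tiny illustration of that sign convention, using made-up VaR figures for three candidate portfolios:

```python
# Hypothetical VaR figures (as positive losses) for three candidate portfolios.
candidate_var = {"A": 0.12, "B": 0.07, "C": 0.19}

# Fitness is the negated VaR, so maximizing fitness minimizes VaR.
fitness = {name: -var for name, var in candidate_var.items()}
best = max(fitness, key=fitness.get)
print(best, fitness[best])
```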

In [None]:
plt.figure(figsize=(10, 6))
plt.plot(best_scores)
plt.title("Optimization Progress")
plt.xlabel("Generation")
plt.ylabel("Best Fitness Score")
plt.show()

Monte Carlo ML

Random Forest

In [None]:
dataframes = [get_sheet_data(name) for name in sheet_names]
combined_df = pd.concat(dataframes, ignore_index=True)

In [None]:
combined_df = combined_df.dropna()

combined_df['Return'] = combined_df['Close'].pct_change()
combined_df = combined_df.dropna()

combined_df['Date'] = pd.to_datetime(combined_df['Date'])
combined_df['DayOfWeek'] = combined_df['Date'].dt.dayofweek
combined_df['Month'] = combined_df['Date'].dt.month

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
combined_df[['Open', 'High', 'Low', 'Close', 'Volume']] = scaler.fit_transform(combined_df[['Open', 'High', 'Low', 'Close', 'Volume']])

In [None]:
X = combined_df[['Open', 'High', 'Low', 'Volume']]
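# Predict returns rather than the scaled closing price: the simulation below
# treats the model's predictions as per-step returns
y = combined_df['Return']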

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [None]:
model = RandomForestRegressor(n_estimators=1000, max_features='sqrt', min_samples_leaf=4, random_state=42)

In [None]:
num_simulations = 10000
simulated_price_paths = np.zeros((num_simulations, len(X_test)))

In [None]:
cross_val_scores = cross_val_score(model, X_train, y_train, cv=5)
print("Average Cross-Validation Score:", np.mean(cross_val_scores))

In [None]:
model.fit(X_train, y_train)
predictions = model.predict(X_test)

Poisson Distribution

In [None]:
num_simulations = 5000
lambda_jump = 0.05
jump_sd = 0.4

In [None]:
def simulate_stock_prices(predictions, X_test, num_simulations, lambda_jump, jump_sd):
    # Simulate volatility - GARCH
    garch = arch_model(predictions, vol='Garch', p=1, q=1)
    garch_fitted = garch.fit()

    simulation_length = len(X_test)
    simulated_price_paths = np.zeros((num_simulations, simulation_length))

    # Initial stock prices
    initial_prices = X_test['Open'].values

    # Monte Carlo simulation
    for i in range(num_simulations):
        simulated_prices = [initial_prices[0]]
        for t in range(1, simulation_length):
            # Fetch volatility
            vol = garch_fitted.conditional_volatility[t]
            # Random shock
            shock = np.random.normal(0, vol)
            # Jump part
            jump = np.random.normal(0, jump_sd) if np.random.random() < lambda_jump else 0
            # Simulate price
            simulated_price = simulated_prices[t-1] * (1 + predictions[t] + shock + jump)
            simulated_prices.append(simulated_price)
        simulated_price_paths[i, :] = simulated_prices

    return simulated_price_paths
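
The price update combines three terms: a drift, a normally distributed shock, and an occasional jump. Drawing the jump with probability `lambda_jump` per step is a Bernoulli approximation to Poisson jump arrivals, reasonable when `lambda_jump` is small. A stripped-down sketch of one such path, using a constant volatility in place of the GARCH estimate (all parameter values are illustrative assumptions):

```python
import random

def simulate_jump_path(start_price, drift, vol, lambda_jump, jump_sd, steps, rng):
    """One price path: a normal shock every step plus an occasional normal jump."""
    prices = [start_price]
    for _ in range(steps):
        shock = rng.gauss(0.0, vol)
        # A jump occurs with probability lambda_jump in each step.
        jump = rng.gauss(0.0, jump_sd) if rng.random() < lambda_jump else 0.0
        prices.append(prices[-1] * (1.0 + drift + shock + jump))
    return prices

rng = random.Random(42)
path = simulate_jump_path(100.0, drift=0.0005, vol=0.01,
                          lambda_jump=0.05, jump_sd=0.4, steps=20, rng=rng)
print(len(path), path[0])
```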

In [None]:
def visualize_simulation_results(simulated_price_paths, VaR):
    print(VaR)
    # Visualize simulated stock price paths
    plt.figure(figsize=(10, 6))
    for i in range(min(100, len(simulated_price_paths))):  # Plot only 100 paths for clarity
        plt.plot(simulated_price_paths[i, :], alpha=0.2)
    plt.title("Simulated Stock Price Paths")
    plt.xlabel("Time")
    plt.ylabel("Price")
    plt.show()

    # Visualize VaR distribution
    initial_prices = simulated_price_paths[:, 0]
    end_returns = (simulated_price_paths[:, -1] - initial_prices) / initial_prices
    plt.figure(figsize=(10, 6))
    plt.hist(end_returns, bins=25, alpha=0.7, color='blue')
    plt.axvline(-VaR, color='red', linestyle='dashed', linewidth=2)
    plt.title("Distribution of Returns and VaR")
    plt.xlabel("Return")
    plt.ylabel("Frequency")
    plt.show()

In [None]:
def calculate_VaR_randomForest(simulated_price_paths, confidence_level=0.95):
    """
    Calculate Value at Risk (VaR) from simulated stock price paths.
    """
    initial_prices = simulated_price_paths[:, 0]
    end_prices = simulated_price_paths[:, -1]
    returns = (end_prices - initial_prices) / initial_prices
    sorted_returns = np.sort(returns)
    var_index = int((1 - confidence_level) * len(sorted_returns))
    VaR = -sorted_returns[var_index]
    return VaR
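
VaR marks a cutoff but says nothing about how bad losses are beyond it. Conditional VaR (expected shortfall) averages the returns at or below the cutoff. The sketch below extends the same sorted-returns approach; `var_and_cvar` is a hypothetical helper and the returns are made-up values.

```python
def var_and_cvar(returns, confidence_level=0.95):
    """Return (VaR, CVaR) as positive losses from a list of returns."""
    ordered = sorted(returns)
    var_index = int((1 - confidence_level) * len(ordered))
    var = -ordered[var_index]
    tail = ordered[:var_index + 1]   # returns at or below the VaR cutoff
    cvar = -sum(tail) / len(tail)
    return var, cvar

# Hypothetical end-of-horizon returns from 20 simulated paths.
returns = [-0.30, -0.12, -0.08, -0.05, -0.02, 0.00, 0.01, 0.02, 0.03, 0.04,
           0.05, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10, 0.12, 0.15, 0.20]
var, cvar = var_and_cvar(returns, confidence_level=0.95)
print(var, cvar)
```

CVaR is never smaller than VaR at the same confidence level, because it averages the cutoff with losses that are at least as large.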

In [None]:
simulated_price_paths = simulate_stock_prices(predictions, X_test, num_simulations, lambda_jump, jump_sd)
VaR = calculate_VaR_randomForest(simulated_price_paths)
visualize_simulation_results(simulated_price_paths, VaR)

Tuning Parameters

Bayesian Search / Random Forest

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [None]:
param_space = {
    'n_estimators': (100, 1000),
    'max_features': ['sqrt', 'log2'],  # 'auto' was removed in scikit-learn 1.3
    'min_samples_split': (2, 10),
    'min_samples_leaf': (1, 4)
}

In [None]:
bayes_search = BayesSearchCV(
    estimator=RandomForestRegressor(random_state=42),
    search_spaces=param_space,
    n_iter=32,
    cv=5,
    n_jobs=-1,
    random_state=42
)

In [None]:
bayes_search.fit(X_train, y_train)

In [None]:
print("Best parameters found: ", bayes_search.best_params_)

In [None]:
best_model = bayes_search.best_estimator_
test_score = best_model.score(X_test, y_test)
print("Test set score of best model: ", test_score)

Composite Construction

Copula

Generate Different Asset Class Data

In [None]:
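def simulate_fund_data(days, mean_return=0, std_dev=0.01):
    # Simulate daily returns and compound them into a price level. The starting
    # level of 100 is an arbitrary assumption; a raw cumulative sum would hover
    # around zero and make the later pct_change() unstable.
    dates = pd.date_range(start="2022-01-01", periods=days)
    returns = np.random.normal(mean_return, std_dev, days)
    return pd.DataFrame({'Date': dates, 'FUND_Close': 100 * np.cumprod(1 + returns)})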

In [None]:
def calculate_VaR_with_Copula_and_Fund(dataframes, fund_data, weights, days=252, iterations=10000, confidence_level=0.99):

    returns = pd.DataFrame()
    for i, df in enumerate(dataframes):
        stock_name = sheet_names[i]
        returns[stock_name] = df['Close'].pct_change().dropna()
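    returns['FUND'] = fund_data['FUND_Close'].pct_change().dropna()

    # Keep only rows where every asset has a return, so the copula is fitted
    # on complete observations
    returns = returns.dropna()

    # Fit the copula model and sample joint returns from it
    copula = GaussianMultivariate()
    copula.fit(returns)

    simulated_returns = copula.sample(iterations)
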
    if simulated_returns.shape[1] != len(weights):
        raise ValueError("Mismatch in number of assets and weights.")

    portfolio_returns = simulated_returns.dot(weights)


    VaR = np.percentile(portfolio_returns, (1 - confidence_level) * 100)

    plt.figure(figsize=(10, 6))
    sns.histplot(portfolio_returns, bins=50, kde=True)
    plt.axvline(x=VaR, color='r', linestyle='--', label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.title('Simulated Portfolio Returns Distribution with VaR')
    plt.legend()
    plt.show()

    return VaR
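
The idea behind a Gaussian copula can be sketched for two assets with the standard library alone: draw correlated normals, push them through the normal CDF to get dependent uniforms, then map each uniform back onto the asset's own observed returns. This is an illustration of the concept under made-up data and an assumed correlation `rho`; it is not how the `copulas` library implements `GaussianMultivariate`.

```python
import random
from statistics import NormalDist

def gaussian_copula_sample(rho, n, rng):
    """Draw n pairs of uniforms whose dependence follows a Gaussian copula."""
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs

def empirical_quantile(sorted_data, u):
    """Map a uniform draw back onto observed data (nearest-rank inverse CDF)."""
    index = min(int(u * len(sorted_data)), len(sorted_data) - 1)
    return sorted_data[index]

rng = random.Random(7)
# Hypothetical historical returns for two assets.
asset_a = sorted([-0.04, -0.02, -0.01, 0.0, 0.01, 0.01, 0.02, 0.03])
asset_b = sorted([-0.06, -0.03, -0.01, 0.0, 0.0, 0.02, 0.03, 0.05])
weights = (0.5, 0.5)

portfolio = []
for u1, u2 in gaussian_copula_sample(rho=0.6, n=5000, rng=rng):
    r1, r2 = empirical_quantile(asset_a, u1), empirical_quantile(asset_b, u2)
    portfolio.append(weights[0] * r1 + weights[1] * r2)

portfolio.sort()
var_99 = portfolio[int(0.01 * len(portfolio))]   # 1st-percentile return
print(var_99)
```

Separating the dependence structure (the copula) from the marginals is what lets each asset keep its own return distribution.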

In [None]:
fund_data = simulate_fund_data(252)

sheet_names = ["META", "MSFT", "NFLX"]
dataframes = [get_sheet_data(name) for name in sheet_names]

weights = np.array([0.2, 0.2, 0.2, 0.4])

VaR = calculate_VaR_with_Copula_and_Fund(dataframes, fund_data, weights)

Basic Theory

In [None]:
def simulate_asset_price(S0, days, mu, sigma):
    dt = 1 / 252  # One trading day
    random_shocks = np.random.normal(0, 1, days)
    price = S0 * np.cumprod(np.exp((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * random_shocks))
    return price

In [None]:
def simulate_option_prices(S0, K, T, r, sigma, num_days, num_simulations):
    """
    Simulate option prices using GBM for the underlying asset and the Black-Scholes model.
    S0: Initial stock price
    K: Strike price
    T: Time to maturity
    r: Risk-free interest rate
    sigma: Volatility of the underlying asset
    num_days: Number of days to simulate
    num_simulations: Number of simulated paths
    """
    dt = T / num_days
    option_price_paths = []

    for _ in range(num_simulations):
        # Simulate price path for the underlying asset
        price_path = simulate_gbm(S0, r, sigma, T, num_days)

        # Calculate option price for each day
        option_prices = [black_scholes(S, K, T - i*dt, r, sigma) for i, S in enumerate(price_path)]
        option_price_paths.append(option_prices)

    return np.array(option_price_paths)

In [None]:
def black_scholes(S, K, T, r, sigma, option_type="call"):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    if option_type == "call":
        option_price = S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)
    else:
        option_price = K * np.exp(-r * T) * norm.cdf(-d2) - S * norm.cdf(-d1)
    return option_price
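
A quick sanity check for a Black-Scholes implementation is put-call parity: for the same strike and maturity, C - P = S - K·exp(-rT). The sketch below restates the closed form with the standard library (`statistics.NormalDist` stands in for `scipy.stats.norm`) and verifies that the parity gap is numerically zero.

```python
import math
from statistics import NormalDist

def black_scholes_price(S, K, T, r, sigma, option_type="call"):
    # Same closed form as the notebook's black_scholes, using only the standard library.
    cdf = NormalDist().cdf
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    if option_type == "call":
        return S * cdf(d1) - K * math.exp(-r * T) * cdf(d2)
    return K * math.exp(-r * T) * cdf(-d2) - S * cdf(-d1)

S, K, T, r, sigma = 100, 110, 1, 0.05, 0.2
call = black_scholes_price(S, K, T, r, sigma, "call")
put = black_scholes_price(S, K, T, r, sigma, "put")

# Put-call parity: C - P = S - K * exp(-r * T)
parity_gap = (call - put) - (S - K * math.exp(-r * T))
print(call, put, parity_gap)
```

Note that the formula divides by `sigma * sqrt(T)`, so it is undefined at `T = 0`; callers need to stop just before expiry or handle that case separately.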

Advanced Asset Price Simulation (e.g., Geometric Brownian Motion)

In [None]:
def simulate_gbm(S0, mu, sigma, T, steps):
    """
    Simulate asset prices using Geometric Brownian Motion.
    S0: Initial asset price
    mu: Drift coefficient
    sigma: Volatility coefficient
    T: Time horizon
    steps: Number of time steps
    """
    dt = T / steps
    random_component = np.random.normal(0, np.sqrt(dt), steps)
    price_path = np.exp((mu - 0.5 * sigma**2) * dt + sigma * random_component)
    return S0 * np.cumprod(price_path)

In [None]:
def monte_carlo_gbm_european_call(S0, K, T, r, sigma, num_simulations, steps):
    """
    Monte Carlo simulation for European call option using GBM.
    S0: Initial stock price
    K: Strike price
    T: Time to maturity
    r: Risk-free interest rate
    sigma: Volatility
    num_simulations: Number of simulated paths
    steps: Number of time steps
    """
    np.random.seed(0)  # For reproducibility
    option_price_paths = []

    for _ in range(num_simulations):
        asset_prices = simulate_gbm(S0, r, sigma, T, steps)
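        # Intrinsic value at each step; only the final step is the European payoff,
        # the earlier values are kept for the path plots
        option_prices = np.maximum(asset_prices - K, 0)
        # Discount with the actual step size T / steps rather than assuming 252 steps per year
        discounted_option_prices = np.exp(-r * np.arange(1, steps + 1) * T / steps) * option_prices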
        option_price_paths.append(discounted_option_prices)

    # Convert to numpy array for easier manipulation
    option_price_paths = np.array(option_price_paths)

    # Calculate the average option price for each day across all simulations
    mean_option_prices = np.mean(option_price_paths, axis=0)

    # Visualization
    plt.figure(figsize=(15, 6))

    # Plot simulated option price paths
    plt.subplot(1, 2, 1)
    for option_path in option_price_paths:
        plt.plot(option_path, color='blue', alpha=0.1)
    plt.title('Simulated Option Price Paths')
    plt.xlabel('Days')
    plt.ylabel('Option Price')

    # Plot average option price
    plt.subplot(1, 2, 2)
    plt.plot(mean_option_prices, color='green')
    plt.title('Average Option Price Over Time')
    plt.xlabel('Days')
    plt.ylabel('Average Option Price')

    plt.tight_layout()
    plt.show()

    return mean_option_prices

No Cost consideration

In [None]:
# Parameters for the simulation
S0 = 100
K = 110
T = 1
r = 0.05
sigma = 0.2
num_simulations = 10000
steps = 252

# Run simulation and plot
option_price_series = monte_carlo_gbm_european_call(S0, K, T, r, sigma, num_simulations, steps)

# Print the entire series of average option prices
print("Simulated European Call Option Price Series:")
print(option_price_series)

# Print the final average option price
final_option_price = option_price_series[-1]
print(f"Final Simulated European Call Option Price: {final_option_price:.2f}")
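
The final simulated value can be cross-checked against the Black-Scholes closed form for the same parameters. The sketch below does this with the standard library only, sampling terminal prices directly instead of full paths (which is sufficient for a European payoff). The function names, seed, and simulation count are assumptions for illustration.

```python
import math
import random
from statistics import NormalDist

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    cdf = NormalDist().cdf
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    return S0 * cdf(d1) - K * math.exp(-r * T) * cdf(d1 - sigma * math.sqrt(T))

def mc_call(S0, K, T, r, sigma, num_simulations, rng):
    """Discounted mean payoff using terminal prices drawn directly from GBM."""
    total = 0.0
    for _ in range(num_simulations):
        z = rng.gauss(0.0, 1.0)
        terminal = S0 * math.exp((r - 0.5 * sigma ** 2) * T + sigma * math.sqrt(T) * z)
        total += max(terminal - K, 0.0)
    return math.exp(-r * T) * total / num_simulations

rng = random.Random(0)
closed_form = bs_call(100, 110, 1, 0.05, 0.2)
estimate = mc_call(100, 110, 1, 0.05, 0.2, num_simulations=50000, rng=rng)
print(closed_form, estimate)
```

The two numbers should agree to within Monte Carlo sampling error, which shrinks in proportion to one over the square root of the number of simulations.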

Copula

In [None]:
def simulate_option_returns(days, S0, K, T, r, sigma, option_type="call"):
    # Simulate underlying asset prices
    asset_prices = simulate_asset_price(S0, days, r, sigma)

    # Calculate option prices for each day
    option_prices = [black_scholes(S, K, T - i/252, r, sigma, option_type) for i, S in enumerate(asset_prices)]

    # Convert to DataFrame and calculate returns
    option_prices_df = pd.DataFrame({'OPTION_Close': option_prices})
    option_returns = option_prices_df['OPTION_Close'].pct_change().dropna()
    return option_returns

In [None]:
def calculate_VaR_with_Copula_and_FundorOption(dataframes, fund_data, option_data, weights, days=252, iterations=10000, confidence_level=0.99, include_fund=True, include_option=True):
    returns = pd.DataFrame()
    for i, df in enumerate(dataframes):
        stock_name = sheet_names[i]
        returns[stock_name] = df['Close'].pct_change().dropna()

    if include_fund:
        returns['FUND'] = fund_data['FUND_Close'].pct_change().dropna()

    if include_option:
        # Assuming option_data is already a series of returns
        returns['OPTION'] = option_data

    # Keep only rows where every asset has a return, then fit the copula model
    returns = returns.dropna()
    copula = GaussianMultivariate()
    copula.fit(returns)

    simulated_returns = copula.sample(iterations)

    if simulated_returns.shape[1] != len(weights):
        raise ValueError("Mismatch in number of assets and weights.")

    portfolio_returns = simulated_returns.dot(weights)

    VaR = np.percentile(portfolio_returns, (1 - confidence_level) * 100)

    plt.figure(figsize=(10, 6))
    sns.histplot(portfolio_returns, bins=50, kde=True)
    plt.axvline(x=VaR, color='r', linestyle='--', label=f'VaR at {confidence_level*100}%: {VaR}')
    plt.title('Simulated Portfolio Returns Distribution with VaR')
    plt.legend()
    plt.show()

    return VaR

In [None]:
num_stocks = len(sheet_names)
include_fund = True
include_option = True

num_assets = num_stocks + (1 if include_fund else 0) + (1 if include_option else 0)

weights = [1/num_assets for _ in range(num_assets)]

option_data = simulate_option_returns(days=252, S0=100, K=110, T=1, r=0.05, sigma=0.2, option_type="call")
VaR = calculate_VaR_with_Copula_and_FundorOption(dataframes, fund_data, option_data, weights, include_fund=include_fund, include_option=include_option)

Annealing

In [None]:
def portfolio_variance(weights, cov_matrix):

    return np.dot(weights.T, np.dot(cov_matrix, weights))

In [None]:
def simulated_annealing_monte_carlo(cov_matrix, initial_weights, steps=1000, temp=1.0, cooling_rate=0.95):

    current_solution = np.array(initial_weights)
    best_solution = np.array(initial_weights)
    best_variance = portfolio_variance(current_solution, cov_matrix)

    variances = [best_variance]  # Store variances for visualization

    for step in range(steps):
        # Propose a neighbor by perturbing the weights, then renormalize to sum to 1
        new_solution = np.random.normal(current_solution, 0.1)
        new_solution = new_solution / np.sum(new_solution)

        current_variance = portfolio_variance(current_solution, cov_matrix)
        new_variance = portfolio_variance(new_solution, cov_matrix)

        # Metropolis rule: always accept improvements, and accept worse moves
        # with a probability that shrinks as the temperature cools
        if new_variance < current_variance or np.random.random() < np.exp(-(new_variance - current_variance) / temp):
            current_solution = new_solution
            if new_variance < best_variance:
                best_solution, best_variance = new_solution, new_variance

        temp *= cooling_rate
        variances.append(best_variance)

    # Visualization
    plt.plot(variances)
    plt.title('Portfolio Variance over Simulated Annealing Steps')
    plt.xlabel('Step')
    plt.ylabel('Variance')
    plt.show()

    return best_solution
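
The core of simulated annealing is the acceptance rule: improvements are always taken, while a worse candidate is accepted with probability exp(-Δ/temp), which falls toward zero as the temperature cools. A small standalone sketch of that rule (the helper name `accept` and the sample values are assumptions):

```python
import math
import random

def accept(current_cost, new_cost, temp, rng):
    """Metropolis rule: always accept improvements, sometimes accept worse moves."""
    if new_cost <= current_cost:
        return True
    return rng.random() < math.exp(-(new_cost - current_cost) / temp)

rng = random.Random(1)
delta = 0.001   # hypothetical increase in portfolio variance
for temp in (1.0, 0.01, 0.0001):
    probability = math.exp(-delta / temp)
    print(temp, round(probability, 6))

print(accept(current_cost=0.004, new_cost=0.003, temp=0.5, rng=rng))
```

With a cooling rate of 0.95, the temperature drops quickly, so most of the exploration happens in the early steps and later steps behave almost like a greedy search.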

In [None]:
# Example covariance matrix (5 assets)
cov_matrix = np.array([
    [0.005, -0.002, 0.004, 0.001, 0.002],
    [-0.002, 0.004, -0.001, 0.002, 0.003],
    [0.004, -0.001, 0.01, 0.003, 0.002],
    [0.001, 0.002, 0.003, 0.006, 0.001],
    [0.002, 0.003, 0.002, 0.001, 0.007]
])

num_assets = len(cov_matrix)

# Initial weights (equal distribution for 5 assets)
initial_weights = [1/num_assets] * num_assets

# Run simulated annealing to optimize weights
optimized_weights = simulated_annealing_monte_carlo(cov_matrix, initial_weights)

# Assuming you have the following data ready
fund_data = simulate_fund_data(252)  # 1 year of trading days
option_data = simulate_option_returns(252, S0, K, T, r, sigma)

# Now calculate VaR with optimized weights
VaR = calculate_VaR_with_Copula_and_FundorOption(dataframes, fund_data, option_data, optimized_weights, include_fund=True, include_option=True)
print("Calculated VaR:", VaR)