# Generate Stochastic Volatility Surface Samples

This notebook generates the samples that will later be used to train models that fit the volatilities implied by stochastic volatility model parameters.

ToDo:
* Run one generation
* Find out how to retrieve files
* Check how long we can run without being interrupted by Colab
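For the file-retrieval item, one possible approach is to zip the samples folder and trigger a browser download. The sketch below is an assumption rather than part of SDevPy: `archive_folder` and `download_if_colab` are hypothetical helper names, and `google.colab.files.download` is only available inside a Colab runtime, so the import is guarded.

```python
import shutil


def archive_folder(folder, archive_base):
    """Zip the contents of `folder` into `archive_base`.zip and return its path."""
    return shutil.make_archive(archive_base, "zip", root_dir=folder)


def download_if_colab(path):
    """Trigger a browser download in Colab; return False when not running in Colab."""
    try:
        from google.colab import files  # only importable inside a Colab runtime
    except ImportError:
        return False
    files.download(path)
    return True


# Hypothetical usage once the samples have been written:
# archive = archive_folder("/content/sdevpy/stovol/samples", "/content/stovol_samples")
# download_if_colab(archive)
```

Zipping first keeps the download to a single file, which is convenient when several models each write their own tsv file to the same folder.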

In [15]:
# Import relevant modules
import os
import numpy as np
from platform import python_version

# Install and import SDevPy modules
!pip install sdevpy --upgrade
import sdevpy as sd
from sdevpy.tools.timer import Stopwatch
from sdevpy.tools import filemanager
from sdevpy.volsurfacegen import stovolfactory

print("Python version: " + python_version())
print("NumPy version: " + np.__version__)
print("SDevPy version: " + sd.__version__)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Python version: 3.10.12
NumPy version: 1.22.4
SDevPy version: 1.0.0


## 1) Set runtime configuration


In [21]:
# Global settings
# MODEL_TYPE = "ShiftedSABR"
# MODEL_TYPE = "McShiftedSABR"
MODEL_TYPE = "FbSABR"
# MODEL_TYPE = "McShiftedZABR"
# MODEL_TYPE = "McShiftedHeston"
NUM_SAMPLES = 10 * 1000
NUM_EXPIRIES = 10
SURFACE_SIZE = 50
# The 2 parameters below are only relevant for models whose reference is calculated by MC
NUM_MC = 100 * 1000 # 100 * 1000
POINTS_PER_YEAR = 25 # 25
# Change seed to generate different sets
SEED = 12  # [2468, 8642, 2112, 4444, 88, 6666, 1122, 12]

print(">> Set up runtime configuration")
project_folder = "/content/sdevpy/stovol"
!mkdir -p /content/sdevpy/stovol
# !unzip -u "/content/sdev.python-pinns_worst_of.zip" -d sdev.python/models
print("> Project folder: " + project_folder)
data_folder = os.path.join(project_folder, "samples")
print("> Data folder: " + data_folder)
filemanager.check_directory(data_folder)
print("> Chosen model: " + MODEL_TYPE)
data_file = os.path.join(data_folder, MODEL_TYPE + "_samples.tsv")

>> Set up runtime configuration
> Project folder: /content/sdevpy/stovol
> Data folder: /content/sdevpy/stovol/samples
> Chosen model: FbSABR
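The settings above determine how many strikes and how many surfaces a run produces. The sketch below reproduces the counts reported in the generation log further down (5 strikes, 200 surfaces). The two divisions are inferred from that log, not taken from the SDevPy source, so treat them as an assumption about how the generator splits the work.

```python
# Values copied from the configuration cell
NUM_SAMPLES = 10 * 1000
NUM_EXPIRIES = 10
SURFACE_SIZE = 50

# Assumption: a surface is a grid of expiries x strikes
num_strikes, strike_remainder = divmod(SURFACE_SIZE, NUM_EXPIRIES)

# Assumption: each parameter sample yields one full surface
num_surfaces, sample_remainder = divmod(NUM_SAMPLES, SURFACE_SIZE)

print(f"Number of strikes: {num_strikes}")
print(f"Number of surfaces/parameter samples: {num_surfaces}")
```

A non-zero remainder in either division would mean the chosen sizes do not tile evenly, which is worth checking before launching a long run.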


## 2) Generate samples

Here we generate the samples using the SDevPy framework. First, prices are calculated with the chosen model. These prices are then converted to normal volatilities and the data is cleansed. Finally, the dataset is written to a tsv file.
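To illustrate the price-to-normal-volatility step, here is a minimal sketch that inverts the Bachelier (normal) call price by bisection. It is not SDevPy's implementation: the function names, the undiscounted call convention, and the search bounds are all assumptions made for this example. Prices that no normal vol inside the bounds can reproduce come back as `nan`, which is the kind of sample a cleansing step could plausibly drop.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bachelier_call(forward, strike, expiry, nvol):
    """Undiscounted call price under the Bachelier (normal) model."""
    stdev = nvol * math.sqrt(expiry)
    if stdev <= 0.0:
        return max(forward - strike, 0.0)
    d = (forward - strike) / stdev
    return (forward - strike) * norm_cdf(d) + stdev * norm_pdf(d)


def implied_nvol(price, forward, strike, expiry, lo=1e-8, hi=1.0, tol=1e-12):
    """Normal vol reproducing `price`, or nan if none exists within [lo, hi]."""
    price_lo = bachelier_call(forward, strike, expiry, lo)
    price_hi = bachelier_call(forward, strike, expiry, hi)
    if not (price_lo <= price <= price_hi):
        return math.nan
    # The call price is increasing in nvol, so bisection converges
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bachelier_call(forward, strike, expiry, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip: price an option at a known normal vol, then recover that vol
price = bachelier_call(forward=0.03, strike=0.035, expiry=2.0, nvol=0.008)
recovered = implied_nvol(price, forward=0.03, strike=0.035, expiry=2.0)
```

Bisection is used here only because it needs nothing beyond the standard library; a production inversion would more likely rely on a faster root finder or a closed-form approximation.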

In [22]:
# Select the model
generator = stovolfactory.set_generator(MODEL_TYPE, NUM_EXPIRIES, SURFACE_SIZE, NUM_MC,
                                        POINTS_PER_YEAR, SEED)

In [23]:
# Generate samples (prices)
print(f"> Generate {NUM_SAMPLES:,} price samples")
timer_gen = Stopwatch("Generating Samples")
timer_gen.trigger()
data_df = generator.generate_samples(NUM_SAMPLES)
timer_gen.stop()

# Convert to normal vols and cleanse
print("> Convert to normal vol and cleanse data")
timer_conv = Stopwatch("Converting Prices")
timer_conv.trigger()
data_df = generator.to_nvol(data_df, cleanse=True)
num_clean = len(data_df.index)
print(f"> Dataset size after cleansing: {num_clean:,}")
timer_conv.stop()

# Output to file
timer_out = Stopwatch("File Output")
timer_out.trigger()
generator.to_file(data_df, data_file)
timer_out.stop()

# View timers
timer_gen.print()
timer_conv.print()
timer_out.print()

> Generate 10,000 price samples
Number of strikes: 5
Number of expiries: 10
Surface size: 50
Number of samples: 10,000
Number of surfaces/parameter samples: 200
Surface generation number 1/200
Surface generation number 2/200
Surface generation number 3/200
Surface generation number 4/200
Surface generation number 5/200
Surface generation number 6/200
Surface generation number 7/200
Surface generation number 8/200
Surface generation number 9/200
Surface generation number 10/200
Surface generation number 11/200
Surface generation number 12/200
Surface generation number 13/200
Surface generation number 14/200
Surface generation number 15/200
Surface generation number 16/200
Surface generation number 17/200
Surface generation number 18/200
Surface generation number 19/200
Surface generation number 20/200
Surface generation number 21/200
Surface generation number 22/200
Surface generation number 23/200
Surface generation number 24/200
Surface generation number 25/200
Surface generation numb