Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: Invalid Price time series from Heston Process #182

Closed
masanorihirano opened this issue Aug 12, 2021 · 11 comments · Fixed by #186
Closed

BUG: Invalid Price time series from Heston Process #182

masanorihirano opened this issue Aug 12, 2021 · 11 comments · Fixed by #186
Labels
bug Something isn't working

Comments

@masanorihirano
Copy link
Collaborator

The current implementation of the Heston process can make price time series including non-positive prices because of the discrete approximation.

https://github.com/pfnet-research/pfhedge/blob/main/pfhedge/stochastic/heston.py#L103

For example, we can face this issue with this code:

import torch
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
torch.autograd.set_detect_anomaly(True)


def main():
    torch.manual_seed(2)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)


if __name__ == '__main__':
    main()

In this code,

  1. Heston process generates no-positive value
  2. no-positive spot price cause the log moneyness of -inf
  3. MLP returns nan because of the invalid input
  4. backpropagation work incorrectly

Current implementation use the discreate approximation of the following equations:

However, according to the following calculation, the first equation can be in a different form.


This transformation enables us to implement the Heston process geometrically and avoid the current issue.

@masanorihirano
Copy link
Collaborator Author

The fundamental problem was different.

At first, I missed the detailed implementation of this discrete approximation, but, now I have read the reference and confirmed it was implemented geometrically.

The actual problem was the price has been too small to be approximated to 0.0 as a result of the computation.

In the code I wrote above, log(-104.7081) was calculated as 0.0 in the line of the following link.
https://github.com/pfnet-research/pfhedge/blob/main/pfhedge/stochastic/heston.py#L118

Now, I am being a little confused to think about how to handle this error.
Practically, I should change the initial price (this is not fundamental fix), fix pfhedge, or other...?

@ghost
Copy link

ghost commented Aug 12, 2021

The cause of the too-small spot seems to be an extremely large variance generated by generate_cir.
I'll inspect the cause of the large variance.

variance = generate_cir(

These plots are spot and variance for which log(spot) dooms to nan.
s
v

As shown in the plot, v1 = variance[:, i_step + 1] is extremely large for i_step=8 and sok2 * v1 is dominant in the following expression of log_spot[:, i_spot=9]. Since k2 = -0.35 < 0 is negative, log_spot[:, i_spot=9] becomes largely negative and so spot goes extremely small.

# Compute log S(t + 1): Eq(33)
k0 = -rho * kappa * theta * dt / sigma
k1 = GAMMA1 * dt * (kappa * rho / sigma - 0.5) - rho / sigma
k2 = GAMMA2 * dt * (kappa * rho / sigma - 0.5) + rho / sigma
k3 = GAMMA1 * dt * (1 - rho ** 2)
k4 = GAMMA2 * dt * (1 - rho ** 2)
v0 = variance[:, i_step]
v1 = variance[:, i_step + 1]
log_spot[:, i_step + 1] = sum(
(
log_spot[:, i_step],
k0 + k1 * v0 + k2 * v1,
(k3 * v0 + k4 * v1).sqrt() * randn[:, i_step],
)
)

code to plot
import torch
import numpy as np
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
from pfhedge.nn import ExpectedShortfall

torch.autograd.set_detect_anomaly(True)


def main():
    torch.manual_seed(2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model,
        inputs=["log_moneyness", "expiry_time", "prev_hedge"],
        criterion=ExpectedShortfall(),
    )

    try:
        hedger.fit(derivative, n_epochs=200)
    except RuntimeError:
        import matplotlib.pyplot as plt

        s = derivative.underlier.spot
        v = derivative.underlier.variance
        i = s.log().isfinite().logical_not().any(1)
        s = s[i]
        v = v[i]

        plt.figure()
        plt.plot(s.T)
        plt.locator_params(axis='x', integer=True,min_n_ticks=20)
        plt.title("spot")
        plt.savefig("s.png")

        plt.figure()
        plt.plot(v.T)
        plt.locator_params(axis='x', integer=True,min_n_ticks=20)
        plt.title("variance")
        plt.savefig("v.png")

        kappa: float = 1.0
        theta: float = 0.04
        sigma: float = 2.0
        rho: float = -0.7
        dt: float = 1 / 250
        GAMMA1 = 1 / 2
        GAMMA2 = 1 / 2

        k0 = -rho * kappa * theta * dt / sigma
        k1 = GAMMA1 * dt * (kappa * rho / sigma - 0.5) - rho / sigma
        k2 = GAMMA2 * dt * (kappa * rho / sigma - 0.5) + rho / sigma
        k3 = GAMMA1 * dt * (1 - rho ** 2)
        k4 = GAMMA2 * dt * (1 - rho ** 2)

        print("k0", k0)
        print("k1", k1)
        print("k2", k2)
        print("k3", k3)
        print("k4", k4)


if __name__ == "__main__":
    main()

@ghost
Copy link

ghost commented Aug 12, 2021

Just an FYI thing, with torch.set_default_dtype(torch.float64) the code ran without RuntimeError. (it does not fix the issue essentially though...)

update: retracted

@masanorihirano
Copy link
Collaborator Author

Even if we use torch.set_default_dtype(torch.float64), the same issue happen:

import torch
from pfhedge.instruments import BrownianStock
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron
torch.autograd.set_detect_anomaly(True)
torch.set_default_dtype(torch.float64)


def main():
    torch.manual_seed(0)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    stock = HestonStock()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(
        model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)


if __name__ == '__main__':
    main()

@masanorihirano
Copy link
Collaborator Author

The fundamental issue occurred here:

# Compute V(t + dt) where psi > PSI_CRIT: Eq(25)
u = rand[:, i_step]
p = (psi - 1) / (psi + 1)
beta = (1 - p) / (m + EPSILON)
pinv = ((1 - p) / (1 - u + EPSILON)).log() / beta
next_1 = torch.where(u > p, pinv, torch.zeros_like(u))

According to the reference, these seem correct.
However, when the u is too near to 1, 1-u becomes almost 0.
Then, next_1 was sed the logarithm of it, thus this process causes a little bit bigger numbers.
However, because u is drawn from a uniform distribution, I think this issue cannot be avoided.

@ghost
Copy link

ghost commented Aug 13, 2021

There's a bug here.

s2 = sum(
v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa,
tensor_theta * (tensor_sigma ** 2) * ((1 - exp) ** 2) / (2 * tensor_kappa),
)

Here the first tensor is of size (n_paths,) while the second is (,).
I wrongly assumed it's equivalent to the following expression,

s2 = ( 
     v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa
     + tensor_theta * (tensor_sigma ** 2) * ((1 - exp) ** 2) / (2 * tensor_kappa)
)

But the results are different. The first tensor is unintendedly summed over the paths.

import torch


a = torch.full((5,), 3)
b = torch.tensor(2)
result = sum(a, b)
expect = a + b

print(result)
# tensor(17)
print(expect)
# tensor([5, 5, 5, 5, 5])

Consequently s^2 is O(n_paths) times greater than the correct value and eventually pinv ~ 1 / beta is O(n_paths) times greater.

I don't think ((1 - p) / (1 - u + EPSILON)).log() is the cause because it has EPSILON keeping log at most log((1 - 0.95) / EPSILON) ~ 15, which is not that seriously large (cancelled by small 1/beta anyway).

If I substitute sum(...) with the correct expression ... + ..., the issue is resolved (See "fix" below).
May I patch this to pfhedge (it'll be after tomorrow though), or you want to?

fix
import torch
from torch import Tensor
from pfhedge.instruments import HestonStock
from pfhedge.instruments import EuropeanOption
from pfhedge.nn import Hedger, MultiLayerPerceptron

from typing import Optional
from typing import Tuple
from torch import Tensor

torch.autograd.set_detect_anomaly(True)


def generate_cir(
    n_paths: int,
    n_steps: int,
    init_value: float = 0.04,
    kappa: float = 1.0,
    theta: float = 0.04,
    sigma: float = 2.0,
    dt: float = 1 / 250,
    dtype: torch.dtype = None,
    device: torch.device = None,
) -> Tensor:
    # PSI_CRIT in [1.0, 2.0]. See section 3.2.3
    PSI_CRIT = 1.5
    # Prevent zero division
    EPSILON = 1e-8

    output = torch.empty((n_paths, n_steps), dtype=dtype, device=device)
    output[:, 0] = init_value

    randn = torch.randn_like(output)
    rand = torch.rand_like(output)

    tensor_kappa = torch.tensor(kappa, dtype=dtype, device=device)
    tensor_theta = torch.tensor(theta, dtype=dtype, device=device)
    tensor_sigma = torch.tensor(sigma, dtype=dtype, device=device)
    tensor_dt = torch.tensor(dt, dtype=dtype, device=device)

    for i_step in range(n_steps - 1):
        v = output[:, i_step]

        # Compute m, s, psi: Eq(17,18)
        exp = (-tensor_kappa * tensor_dt).exp()
        m = tensor_theta + (v - tensor_theta) * exp
        # s2 = sum(
        #     v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa,
        #     tensor_theta * (tensor_sigma ** 2) * ((1 - exp) ** 2) / (2 * tensor_kappa),
        # )
        s2 = v * (tensor_sigma ** 2) * exp * (1 - exp) / tensor_kappa + tensor_theta * (
            tensor_sigma ** 2
        ) * ((1 - exp) ** 2) / (2 * tensor_kappa)
        psi = s2 / (m ** 2 + EPSILON)

        # Compute V(t + dt) where psi <= PSI_CRIT: Eq(23, 27, 28)
        b = ((2 / psi) - 1 + (2 / psi).sqrt() * (2 / psi - 1).sqrt()).sqrt()
        a = m / (1 + b ** 2)
        next_0 = a * (b + randn[:, i_step]) ** 2

        # Compute V(t + dt) where psi > PSI_CRIT: Eq(25)
        u = rand[:, i_step]
        p = (psi - 1) / (psi + 1)
        beta = (1 - p) / (m + EPSILON)
        pinv = ((1 - p) / (1 - u + EPSILON)).log() / beta
        next_1 = torch.where(u > p, pinv, torch.zeros_like(u))

        output[:, i_step + 1] = torch.where(psi <= PSI_CRIT, next_0, next_1)

    return output


def generate_heston(
    n_paths: int,
    n_steps: int,
    init_state: Tuple[float, float] = (1.0, 0.04),
    kappa: float = 1.0,
    theta: float = 0.04,
    sigma: float = 2.0,
    rho: float = -0.7,
    dt: float = 1 / 250,
    dtype: Optional[torch.dtype] = None,
    device: Optional[torch.device] = None,
) -> Tuple[Tensor, Tensor]:
    GAMMA1 = 0.5
    GAMMA2 = 0.5

    init_spot, init_var = init_state

    variance = generate_cir(
        n_paths=n_paths,
        n_steps=n_steps,
        init_value=init_var,
        kappa=kappa,
        theta=theta,
        sigma=sigma,
        dt=dt,
        dtype=dtype,
        device=device,
    )

    log_spot = torch.empty_like(variance)
    log_spot[:, 0] = torch.tensor(init_spot).log()
    randn = torch.randn_like(variance)

    for i_step in range(n_steps - 1):
        # Compute log S(t + 1): Eq(33)
        k0 = -rho * kappa * theta * dt / sigma
        k1 = GAMMA1 * dt * (kappa * rho / sigma - 0.5) - rho / sigma
        k2 = GAMMA2 * dt * (kappa * rho / sigma - 0.5) + rho / sigma
        k3 = GAMMA1 * dt * (1 - rho ** 2)
        k4 = GAMMA2 * dt * (1 - rho ** 2)
        v0 = variance[:, i_step]
        v1 = variance[:, i_step + 1]
        log_spot[:, i_step + 1] = sum(
            (
                log_spot[:, i_step],
                k0 + k1 * v0 + k2 * v1,
                (k3 * v0 + k4 * v1).sqrt() * randn[:, i_step],
            )
        )

    return (log_spot.exp(), variance)


class HestonStockFixed(HestonStock):
    def simulate(
        self,
        n_paths: int = 1,
        time_horizon: float = 20 / 250,
        init_state: Optional[tuple] = None,
    ) -> None:
        if init_state is None:
            init_state = self.default_init_state

        spot, variance = generate_heston(
            n_paths=n_paths,
            n_steps=int(time_horizon / self.dt),
            init_state=init_state,
            kappa=self.kappa,
            theta=self.theta,
            sigma=self.sigma,
            rho=self.rho,
            dt=self.dt,
            dtype=self.dtype,
            device=self.device,
        )

        self.register_buffer("spot", spot)
        self.register_buffer("variance", variance)


def main():
    torch.manual_seed(2)
    # stock = BrownianStock(cost=1e-4, volatility=0.2)
    # stock = HestonStock()
    stock = HestonStockFixed()
    derivative = EuropeanOption(stock)

    model = MultiLayerPerceptron()
    hedger = Hedger(model, inputs=["log_moneyness", "expiry_time", "prev_hedge"])

    hedger.fit(derivative, n_epochs=200)

    price = hedger.price(derivative)

    print(price)


if __name__ == "__main__":
    main()

update: typo corrected

@ghost ghost added the bug Something isn't working label Aug 13, 2021
@masanorihirano
Copy link
Collaborator Author

I also found the same issue as #184.
Thus, I'll make a PR including the fix.

@masanorihirano
Copy link
Collaborator Author

I mean, the same issue as #184 in CIR.

@ghost
Copy link

ghost commented Aug 13, 2021

Thank you so much, @masanorihirano !!

ghost pushed a commit that referenced this issue Aug 13, 2021
@ghost ghost changed the title Invalid Price time series from Heston Process BUG: Invalid Price time series from Heston Process Aug 13, 2021
@ghost
Copy link

ghost commented Aug 13, 2021

Resolved by #186

@ghost ghost closed this as completed Aug 13, 2021
ghost pushed a commit that referenced this issue Aug 13, 2021
* BIG: Fix invalid time series of CIR process (close #182) (#186)

* DOC: Add missing dt to generate_cir (#186)

* DOC: Update an example in README.md (#181)

Co-authored-by: GitHub Actions <action@github.com>
Co-authored-by: Masanori HIRANO <masanorihirano@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: GitHub Actions <action@github.com>
@ghost
Copy link

ghost commented Aug 13, 2021

@masanorihirano
Resolved with the latest release, 0.7.5.
Again, thank you so much for the really quick and pertinent fix!!

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant