
Add DataHedger and DeepHedger #575

Closed
justinhou95 wants to merge 6 commits

Conversation

justinhou95

  1. Use a robust expression of entropic_risk_measure for large values.
  2. Add a data hedger that uses any generated data and features.
  3. Add a deep hedger that uses different neural network structures at different time steps.
  4. Add a notebook illustrating the use of the data hedger and the deep hedger.
  5. Add matplotlib to the dependencies.

- Modify the entropic risk measure to be more robust for large price values (a sketch of this kind of fix follows below)
- Build a DataHedger class that can hedge with data that has already been simulated or generated
- Build a DeepHedger class that can hedge with different neural networks at different time steps
- Add matplotlib to the dependencies (it may move to dev-dependencies later if plotting features are no longer needed)
- Add data_hedge.ipynb illustrating the two new classes

use only entropic risk measure
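
For context, here is a minimal sketch of the kind of numerically robust formulation described in the first item (not necessarily the exact code in this PR or in #581). The naive (1/a) * log(mean(exp(-a * x))) overflows once a * x is large, whereas torch.logsumexp computes the same quantity stably:

import math
import torch

def entropic_risk_measure(input: torch.Tensor, a: float = 1.0) -> torch.Tensor:
    # Entropic risk measure: (1/a) * log E[exp(-a * input)].
    # logsumexp subtracts the running max internally, so exp() never
    # overflows even when a * input is large.
    n = input.numel()
    return (torch.logsumexp(-a * input.flatten(), dim=0) - math.log(n)) / a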
@masanorihirano
Collaborator

Oh... I found almost the same modification of entropic_risk_measure here.
I have made a separate PR that focuses only on entropic_risk_measure: #581.

@masanorihirano
Collaborator

BTW, although I have not fully checked your data hedger, at a glance it does not seem generalized enough.
Did you consider extending BasePrimary (like BrownianStock) using self.register_buffer("spot", xxxx), which makes it possible to register any data to the stock, and extending Feature (like UnderlierSpot), which also enables us to use the registered parameter in the BasePrimary class?

@justinhou95
Author

Hi! Thank you so much for your reply! I wrote this data hedger mainly due to the following concerns:

  1. Simulation of the data (prices and information for hedging) and hedging are better kept separable and independent. For robustness, random seeds can cause trouble when data registered to the stock in advance via self.register_buffer("spot", xxxx) were simulated under a different random seed than the spot price. Even though this is not an issue under a careful implementation, I would later like to train the hedger on synthetic data (e.g. data generated externally by a GAN) instead of a stochastic model (e.g. the Heston model), and it is not efficient to encode the GAN generator into BasePrimary. And sometimes we only want to train on an existing dataset, without any generator at all.
  2. Calculation of the features is better done in advance. Features are computed inside the Hedger class rather than in advance, so if you train several hedgers with the same features, the computations are duplicated. This can be inefficient when the feature space is large (e.g. signature features).
  3. Flexibility in neural networks. The main motivation for DeepHedger is more flexible neural network design. It enables different neural networks at different time steps, which in principle (leaving aside the training issues of a more complicated network) gives better performance for path-dependent option pricing.

@masanorihirano
Collaborator

I can understand your motivation for this implementation, but from the perspective of pfhedge's design it still seems odd, at least to me.
If we employ separate classes for hedging with data, it becomes difficult to combine artificial price movements with actual data.

Thus, even if we implement a kind of data hedger, we should respect and follow the design of the original pfhedge.

First, if we want to use actual price movements, we can implement something like:

# Imports assumed for this sketch:
import math
from typing import Optional

import torch
from pfhedge.instruments import BasePrimary

class DataStock(BasePrimary):
    def __init__(
        self,
        price_movement: torch.Tensor,
        cost: float = 0.0,
        dt: float = 1 / 250,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
    ) -> None:
        super().__init__()
        self.price_movement = price_movement
        self.cost: float = cost
        self.dt: float = dt

        self.to(dtype=dtype, device=device)

    def simulate(
        self,
        n_paths: int = 1,
        time_horizon: float = 20 / 250,
    ) -> None:
        n_steps: int = math.ceil(time_horizon / self.dt)
        # <some sampling process from self.price_movement>; for instance
        # (purely illustrative), draw n_paths rows at random:
        index = torch.randint(self.price_movement.size(0), (n_paths,))
        spot = self.price_movement[index, : n_steps + 1]
        self.register_buffer("spot", spot)
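
A possible usage of this sketch (the data below, like the sampling above, is purely illustrative):

# Hypothetical (n_data, n_steps + 1) tensor of positive price paths.
paths = torch.randn(1000, 21).cumsum(-1).exp()
stock = DataStock(paths)
stock.simulate(n_paths=8, time_horizon=20 / 250)
print(stock.spot.size())  # torch.Size([8, 21])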

If you want to use the generator of a GAN:

# Additional imports assumed: from typing import Tuple, Union, cast

class GeneratorStock(BasePrimary):
    def __init__(
        self,
        generator,  # e.g. a trained GAN generator (an nn.Module)
        cost: float = 0.0,
        dt: float = 1 / 250,
        dtype: Optional[torch.dtype] = None,
        device: Optional[torch.device] = None,
    ) -> None:
        super().__init__()
        self.generator = generator
        self.cost: float = cost
        self.dt: float = dt

        self.generator.to(dtype=dtype, device=device)
        self.to(dtype=dtype, device=device)

    def simulate(
        self,
        n_paths: int = 1,
        time_horizon: float = 20 / 250,
        init_state: Optional[Tuple[Union[torch.Tensor, float, int]]] = None,
    ) -> None:
        if init_state is None:
            init_state = cast(Tuple[float], self.default_init_state)
        init_state_tensor = torch.as_tensor(init_state[0], device=self.device)

        n_steps: int = math.ceil(time_horizon / self.dt)
        # Latent noise; the trailing shape (left open in the original
        # sketch as "......") depends on the generator's expected input.
        seeds = torch.randn([n_paths, n_steps])
        generated_spot = self.generator(seeds)  # relative price paths
        spot = init_state_tensor * generated_spot
        self.register_buffer("spot", spot)

Those are roughly written, so I don't guarantee that they work.
I think that if you want to use actual data or generated data in options, you can implement it similarly.

Using the same approach, the problem of "calculation of the features" is also solved.
If we do those computations in __init__ and just set the results in the simulate method, we can use the computed results in the feature class, as in the sketch below.
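
For illustration, a minimal sketch of such a feature (the names are hypothetical; this assumes pfhedge's Feature base class, whose of() attaches the derivative and whose get(time_step) returns the feature values, as in the built-in features):

import torch
from pfhedge.features._base import Feature

class PrecomputedFeature(Feature):
    # Hypothetical feature reading a buffer named "my_feature" that the
    # primary's simulate() registered with shape (n_paths, n_steps).

    def __str__(self) -> str:
        return "precomputed_feature"

    def get(self, time_step: int) -> torch.Tensor:
        # self.derivative is attached by Feature.of(derivative, hedger).
        # Only an integer time_step is handled here, for brevity.
        buffer = self.derivative.underlier.get_buffer("my_feature")
        return buffer[..., time_step].unsqueeze(-1)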

Of course, I agree that the current implementation imposes a great burden when using such data. But I think that, as part of this package, we should implement these things following the pfhedge architecture.
(Just for private usage, your implementation works well and is splendid.)

Lastly, about the last point (flexibility in NNs), I think the current pfhedge is flexible enough.
Of course, I agree that a benefit of a define-by-run framework such as PyTorch is its flexibility.
But the model accepted by pfhedge is an nn.Module; thus we can implement any NN, as long as it fits in one class extending nn.Module.
The only thing we cannot change in the middle of training is the input/output of the NN.
In this framework, at least currently, dynamic inputs/outputs are not acceptable, and your implementation does not accept them either.
According to your implementation, what you want is to use a different NN at different time steps.
If you want to do so, we should implement the switching architecture in the model (not the hedger) and add a new feature representing the time step, as sketched below.
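
A minimal sketch of such a switching architecture (hypothetical; it assumes the last input column is the new time-step feature, and that all paths within one forward call share the same time step, as in step-by-step hedging):

import torch
from torch import Tensor, nn

class SwitchingModel(nn.Module):
    # One sub-network per time step; the last feature column is assumed
    # to carry an integer-valued time-step index.

    def __init__(self, n_steps: int, n_features: int, n_units: int = 32) -> None:
        super().__init__()
        self.nets = nn.ModuleList(
            nn.Sequential(
                nn.Linear(n_features, n_units),
                nn.ReLU(),
                nn.Linear(n_units, 1),
            )
            for _ in range(n_steps)
        )

    def forward(self, input: Tensor) -> Tensor:
        features, step = input[..., :-1], input[..., -1]
        # All paths share the same time step within one call.
        return self.nets[int(step.flatten()[0].item())](features)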

I have also noticed the issue that the hedging model should depend on the time to maturity, which might be what motivated you to implement the switching model architecture depending on the time step.
(Although it is written in Japanese, I discussed this in https://sigfin.org/028-06/.)
But the implementation should be more polished to become part of pfhedge.

Although I have written a lot, I do not intend to deny your motivation or your implementation.
I hope this comment will be valuable for us and for the future of pfhedge.

@justinhou95
Author

Thank you so much for your detailed reply! It now makes great sense to me why we stay within the structure of the current Hedger and build functionality on top of pfhedge's design. For future development, features should be built on the established classes, e.g. BasePrimary and Hedger. The discussion has definitely been valuable! :-)
