
Commit

Merge pull request #354 from tensortrade-org/docs-ray-training-example
[docs] Add Ray training example
carlogrisetti committed Nov 8, 2021
2 parents b95cd94 + eb09d43 commit 3086803
Showing 6 changed files with 255 additions and 1,983 deletions.
29 changes: 14 additions & 15 deletions docs/source/components/reward_scheme.md
@@ -2,7 +2,7 @@

Reward schemes receive the `TradingEnv` at each time step and return a `float`, corresponding to the benefit of that specific action. For example, if the action taken this step was a sell that resulted in positive profits, our `RewardScheme` could return a positive number to encourage more trades like this. On the other hand, if the action was a sell that resulted in a loss, the scheme could return a negative reward to teach the agent not to make similar actions in the future.

A version of this example algorithm is implemented in `SimpleProfit`, however more complex schemes can obviously be used instead.
TensorTrade currently implements `SimpleProfit`, `RiskAdjustedReturns`, and `PBR` (position-based returns); however, more complex schemes can be used instead.

Each reward scheme has a `reward` method, which takes in the `TradingEnv` at each time step and returns a `float` corresponding to the value of that action. As with action schemes, it is often necessary to store additional state within a reward scheme for various reasons. This state should be reset each time the reward scheme's `reset` method is called, which is done automatically when the environment is reset.
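To make the interface concrete, here is a minimal sketch of a custom scheme based only on the description above. It assumes `RewardScheme` can be imported from `tensortrade.env.generic` and that the environment's action scheme exposes a `portfolio` with a `net_worth` property, as TensorTrade's built-in default schemes do; adjust the names to your installed version.

```python
# A minimal sketch of a custom reward scheme, based only on the description above.
# Assumptions: `RewardScheme` is importable from `tensortrade.env.generic`, and the
# environment's action scheme exposes a `portfolio` with a `net_worth` property.
from tensortrade.env.generic import RewardScheme


class NetWorthChangeReward(RewardScheme):
    """Rewards the one-step change in the portfolio's net worth."""

    def __init__(self):
        super().__init__()
        self._previous_net_worth = None  # extra state carried between steps

    def reward(self, env) -> float:
        net_worth = env.action_scheme.portfolio.net_worth
        if self._previous_net_worth is None:
            step_reward = 0.0
        else:
            step_reward = net_worth - self._previous_net_worth
        self._previous_net_worth = net_worth
        return step_reward

    def reset(self) -> None:
        # Called automatically whenever the environment is reset.
        self._previous_net_worth = None
```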

@@ -14,26 +14,22 @@
```python
from tensortrade.env.default.rewards import SimpleProfit
reward_scheme = SimpleProfit()
```

_The simple profit scheme returns a reward of -1 for not holding a trade, 1 for holding a trade, 2 for purchasing an instrument, and a value corresponding to the (positive/negative) profit earned by a trade if an instrument was sold._

## Default
These are the default reward schemes.
<hr>

### Simple Profit
A reward scheme that rewards the agent for profitable trades and prioritizes trading over not trading.
## Simple Profit
A reward scheme that simply rewards the agent for profitable trades, regardless of how those profits were achieved.

<br>**Overview**<br>
#### Overview
The simple profit scheme needs to keep a history of profit over time. It does this by looking at the portfolio to keep track of how the portfolio's value moves over time. This can be seen inside the `get_reward` function.

<br>**Computing Reward**<br>
<br>**Compatibility**<br>
#### Computing Reward
The reward is calculated as the cumulative percentage change in net worth over the previous `window_size` time steps.
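As a rough illustration of that calculation (a sketch of the idea, not the exact `get_reward` implementation; the `window_size` keyword argument is assumed from the description above):

```python
from tensortrade.env.default.rewards import SimpleProfit

# Assumed from the prose above: the window over which profit is measured.
reward_scheme = SimpleProfit(window_size=3)

# Illustrative arithmetic with a hypothetical net-worth history: the cumulative
# percentage change over the previous `window_size` steps.
net_worths = [10000.0, 10100.0, 10050.0, 10300.0]
window_size = 3
reward = net_worths[-1] / net_worths[-(window_size + 1)] - 1.0  # 0.03, i.e. +3% over the window
```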

<hr>

### Risk Adjusted Returns
## Risk Adjusted Returns
A reward scheme that rewards the agent for increasing its net worth, while penalizing more volatile strategies.

<br>**Overview**<br>
#### Overview
When trading, you are often not just looking at the overall returns of your model; you are also looking at the volatility of your trading strategy over time compared to other metrics. The two major measures used here are the Sharpe and Sortino ratios.

The **Sharpe ratio** looks at the overall movements of the portfolio and penalizes large movements with a lower score. This includes major movements to both the upside and the downside.
@@ -44,8 +40,11 @@
The **Sortino ratio** takes the same idea, though it focuses more on penalizing only the downside.

![Sortino Ratio](../_static/images/sortino.png)
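For reference, the textbook forms of the two ratios behind the images above are roughly the following (a sketch; the images show the exact variants used by the scheme):

```latex
% Sharpe: excess return over the risk-free rate, scaled by total volatility.
\text{Sharpe} = \frac{\mathbb{E}[R_p] - R_f}{\sigma_p}
\qquad
% Sortino: excess return over a target, scaled by downside deviation only.
\text{Sortino} = \frac{\mathbb{E}[R_p] - T}{\sigma_d}
```

Here *R_p* is the portfolio return, *R_f* the risk-free rate, *σ_p* the standard deviation of all returns, *T* the target return, and *σ_d* the downside deviation.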

<br>**Computing Reward**<br>
#### Computing Reward
Given the choice of `return_algorithm`, the reward is computed using the `risk_free_rate` and `target_returns` parameters.
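For example, a hedged sketch of constructing the scheme with the parameters named above (the parameter names come from the prose; the values and the `"sharpe"` / `"sortino"` strings are illustrative assumptions):

```python
from tensortrade.env.default.rewards import RiskAdjustedReturns

reward_scheme = RiskAdjustedReturns(
    return_algorithm="sortino",  # assumption: "sharpe" or "sortino"
    risk_free_rate=0.0,          # illustrative value
    target_returns=0.0,          # illustrative value
)
```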

<br>**Compatibility**<br>
<hr>

## Position-based returns (PBR)

Documentation is missing. Please [submit a pull request](https://github.com/tensortrade-org/tensortrade/pulls) and help us expand it!
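Until that documentation exists, here is a hedged sketch of how `PBR` appears to be used together with the `BSH` action scheme in the Ray training example this pull request adds; every name and signature below is an assumption drawn from that example rather than documented API:

```python
# Sketch only: PBR tracks a price stream and listens to the BSH action scheme's
# buy/sell decisions. All data and wiring below are hypothetical illustrations.
import pandas as pd
import tensortrade.env.default as default
from tensortrade.feed.core import Stream
from tensortrade.oms.exchanges import Exchange
from tensortrade.oms.services.execution.simulated import execute_order
from tensortrade.oms.instruments import USD, BTC
from tensortrade.oms.wallets import Wallet, Portfolio

# Hypothetical price data; a real example would load market data instead.
price_history = pd.DataFrame({"close": [100.0, 101.5, 99.8, 102.3, 103.1]})
price = Stream.source(price_history["close"].tolist(), dtype="float").rename("USD-BTC")

exchange = Exchange("sim-exchange", service=execute_order)(price)

cash = Wallet(exchange, 10_000 * USD)
asset = Wallet(exchange, 0 * BTC)
portfolio = Portfolio(USD, [cash, asset])

reward_scheme = default.rewards.PBR(price=price)

# PBR needs to see each action taken, so the BSH action scheme attaches it as a listener.
action_scheme = default.actions.BSH(cash=cash, asset=asset).attach(reward_scheme)
```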
153 changes: 0 additions & 153 deletions docs/source/examples/train_and_evaluate.md

This file was deleted.
