Question: Is there an 'easy' way to set Order properties? #4

Closed
snafu4 opened this issue Nov 9, 2021 · 17 comments

snafu4 commented Nov 9, 2021

Specifically, how do I set 'volume', 'volume_step', 'volume_min' and 'volume_max' without creating a child of the Order class? I need to be able to set these values potentially for each order.

AminHP (Owner) commented Nov 14, 2021

These values (volume_step, volume_min, volume_max) are only used here and here (that is, in the _check_volume and _get_modified_volume methods).

I don't know exactly how you are working with the simulator or the env, but changing the values stored in the symbols_info attribute before creating a new order might work in some cases.
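
For reference, the relevant volume handling looks roughly like this (a paraphrased sketch, not the exact repo code; _check_volume additionally raises an error when a volume violates these bounds):

import numpy as np

def _get_modified_volume(self, symbol: str, volume: float) -> float:
    si = self.simulator.symbols_info[symbol]
    v = abs(volume)
    # clamp to [volume_min, volume_max], then snap to a multiple of volume_step
    v = np.clip(v, si.volume_min, si.volume_max)
    v = round(v / si.volume_step) * si.volume_step
    return v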

snafu4 (Author) commented Nov 16, 2021

env is equivalent to:

env = MtEnv(
    original_simulator=sim,
    trading_symbols=['GBPCAD', 'EURUSD', 'USDJPY'],
    window_size=10,
    # time_points=[desired time points ...],
    hold_threshold=0.5,
    close_threshold=0.5,
    fee=lambda symbol: 
...

Using the following right after the above,

volume_max = 0.01  # local names chosen so the builtins max/min aren't shadowed
volume_min = 0.01
volume_step = 0.01
env.original_simulator.symbols_info[symbol].volume_max = volume_max
env.original_simulator.symbols_info[symbol].volume_min = volume_min
env.original_simulator.symbols_info[symbol].volume_step = volume_step

sets the values (verified in code immediately afterwards) but doesn't actually change the order volume (the volumes are all over the place).

Can you expand on '... might work in some cases...'? Why some cases and not others?

AminHP (Owner) commented Nov 16, 2021

For example, this code works. But inside an A2C model, you can't apply it.

env = MtEnv(...)

env.simulator.symbols_info[symbol1].volume_max = a1
env.simulator.symbols_info[symbol2].volume_max = a2
env.step(...)

env.simulator.symbols_info[symbol1].volume_max = b1
env.simulator.symbols_info[symbol2].volume_max = b2
env.step(...)

env.simulator.symbols_info[symbol1].volume_max = c1
env.simulator.symbols_info[symbol2].volume_max = c2
env.step(...)

snafu4 (Author) commented Nov 16, 2021

I am using A2C.

The _get_modified_volume() method in the MtEnv class limits the volume to between volume_min and volume_max before the order is sent to MtSimulator for execution.

Can you please explain your statement above, "...But inside an A2C model, you can't apply it. ..."? Also, why would the model being used restrict the use of the volume_* values?

Thanks

AminHP (Owner) commented Nov 19, 2021

About the statement "...But inside an A2C ...": I mean that code works because you can change the volume_* values before calling the env's step method. But when you are using an A2C model, it calls the step method inside its own training routines, so we cannot change the volume_* values with the code I posted earlier.

About the "why would the model ..." question: This is something related to the MetaTrader. They restrict these values and we cannot set any volume we want.

In case you don't want these restrictions to be applied, just remove or modify the _check_volume and _get_modified_volume methods. For example:

class MyMtEnv(MtEnv):
    def _check_volume(self, symbol: str, volume: float) -> None:
        pass  # skip the volume range/step validation entirely

    def _get_modified_volume(self, symbol: str, volume: float) -> float:
        return abs(volume)  # use the raw volume, ignoring volume_min/max/step
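
Note that with this override, the volume_min/volume_max/volume_step values are ignored entirely, so whatever absolute volume the agent outputs is passed straight to the simulator.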

snafu4 (Author) commented Nov 21, 2021

If it is not possible to limit the volume (lot size in forex), how else can the risk be mitigated within this framework?

AminHP (Owner) commented Nov 24, 2021

Shouldn't the RL model itself learn to manage the risk?
I think risk management is part of the prediction algorithm. I mean, when we give an action to a GymEnv, it should only check whether the action conforms to the environmental constraints and then apply it. If the action conflicts with those constraints, the GymEnv can either ignore or modify it; the latter was chosen for MtEnv.
Anything beyond that, like risk management and per-order volume restrictions, should be applied before passing the actions to the step method.

Since stable-baselines does not support such a thing, a simple way to do it is to call a function at the beginning of the step method:

from typing import Any, Dict, Tuple

import numpy as np

from gym_mtsim import MtEnv  # defined in gym_mtsim/envs/mt_env.py


class MyMtEnv(MtEnv):
    def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
        action = self._modify_action(action)
        return super().step(action)

    def _modify_action(self, action: np.ndarray) -> np.ndarray:
        # each symbol owns k consecutive entries of the action vector:
        # symbol_max_orders close-order logits, a hold logit, and a volume
        k = self.symbol_max_orders + 2
        for i, symbol in enumerate(self.trading_symbols):
            symbol_action = action[k*i:k*(i+1)]  # a view into `action`
            volume = symbol_action[-1]
            if self._current_tick > 20:  # or some other condition according to your risk management algorithm
                volume = np.clip(volume, -1.2, 1.2)
            symbol_action[-1] = volume  # writes through the view into `action`
        return action
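
For context, a minimal way to wire this into stable-baselines3, sketched assuming the setup from your earlier comment (sim and the symbol list are placeholders):

from stable_baselines3 import A2C

env = MyMtEnv(
    original_simulator=sim,        # an existing MtSimulator instance
    trading_symbols=['EURUSD'],    # placeholder symbol list
    window_size=10,
)
model = A2C('MultiInputPolicy', env, verbose=0)
model.learn(total_timesteps=1000)  # _modify_action now runs inside every env.step call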

snafu4 (Author) commented Nov 24, 2021

I will give the code a try. Thanks.

You bring up a good point w.r.t. the RL model learning to manage the risk (that's what a proper reward is supposed to control, right?). However, I trade with a proprietary firm that restricts certain aspects of my account during a trading session (for example, if my balance drops more than x% below its daily starting amount, all my trades are closed automatically).

This means I have to watch my intra-day drawdown, and one of the ways I control it is by restricting the volume/lot size.

This is why the control I've asked about in this thread is critical for me.

AminHP (Owner) commented Nov 24, 2021

Yes, an ideal RL agent should be able to control everything, but it is difficult to make such an agent. By the way, I like your approach: managing some things outside the RL model is an excellent way to combine the knowledge of a human expert with an AI agent, and it helps achieve better results in less time.

Let me know whether my last piece of code fits your requirement.

snafu4 (Author) commented Nov 25, 2021

I will try to debug ASAP, but maybe you can quickly spot the problem:

class MyCustomEnv(MtEnv):
    
    def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
        action = self._modify_action(action)
        super().step(action)

    def _modify_action(self, action: np.ndarray) -> np.ndarray:
        k = self.symbol_max_orders + 2
        for i, symbol in enumerate(self.trading_symbols):
            symbol_action = action[k*i:k*(i+1)]
            volume = symbol_action[-1]
            if self._current_tick > 0:  # or some other conditions according to your risk management algorithm
                volume = min(volume, 0.01)
            symbol_action[-1] = volume

I get errors below when:

env = MyCustomEnv(
    original_simulator=sim,
    trading_symbols=['EURUSD', 'GBPJPY'],
    window_size=10,
    symbol_max_orders=2,
    multiprocessing_processes=4,
)

is run.

Process SpawnPoolWorker-16:
Traceback (most recent call last):
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\process.py", line 315, in _bootstrap
    self.run()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\pool.py", line 114, in worker
    task = get()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\queues.py", line 361, in get
    return _ForkingPickler.loads(res)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 327, in loads
    return load(file, ignore, **kwds)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 313, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 525, in load
    obj = StockUnpickler.load(self)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 515, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'MyCustomEnv' on <module '__main__' (built-in)>

AminHP (Owner) commented Nov 25, 2021

It seems there is a problem with pathos and multiprocessing. Please try multiprocessing_processes=None for now until I fix it.
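
For context, this AttributeError is the classic "class defined in __main__" pickling problem: the spawned worker processes re-import __main__ and cannot find MyCustomEnv there, which is common when the class is defined in a notebook cell. One workaround that may help, independent of any fix on my side, is to define the subclass in a regular module and import it (my_env.py is a hypothetical file name):

# my_env.py -- hypothetical module; defining the class outside __main__
# lets spawned worker processes locate it by its qualified name
import numpy as np
from gym_mtsim import MtEnv

class MyCustomEnv(MtEnv):
    def step(self, action: np.ndarray):
        return super().step(self._modify_action(action))

    def _modify_action(self, action: np.ndarray) -> np.ndarray:
        k = self.symbol_max_orders + 2
        for i in range(len(self.trading_symbols)):
            # the volume is the last entry of each symbol's action slice
            action[k*(i+1) - 1] = min(action[k*(i+1) - 1], 0.01)
        return action

Then, in the notebook: from my_env import MyCustomEnv.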

snafu4 (Author) commented Nov 25, 2021

FYI: multiprocessing_processes=None results in:


---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

~\AppData\Local\Temp/ipykernel_21208/3706694186.py in run_Model(i, total_timesteps, fileName, saveModel)
      6     model = A2C('MultiInputPolicy', env, verbose=0)
      7 #     model.learn(total_timesteps=total_timesteps, callback=[eval_callback])
----> 8     model.learn(total_timesteps=total_timesteps)
      9 
     10     observation = env.reset()

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\a2c\a2c.py in learn(self, total_timesteps, callback, log_interval, eval_env, eval_freq, n_eval_episodes, tb_log_name, eval_log_path, reset_num_timesteps)
    190     ) -> "A2C":
    191 
--> 192         return super(A2C, self).learn(
    193             total_timesteps=total_timesteps,
    194             callback=callback,

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py in learn(self, total_timesteps, callback, log_interval, eval_env, eval_freq, n_eval_episodes, tb_log_name, eval_log_path, reset_num_timesteps)
    235         while self.num_timesteps < total_timesteps:
    236 
--> 237             continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
    238 
    239             if continue_training is False:

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\common\on_policy_algorithm.py in collect_rollouts(self, env, callback, rollout_buffer, n_rollout_steps)
    176                 clipped_actions = np.clip(actions, self.action_space.low, self.action_space.high)
    177 
--> 178             new_obs, rewards, dones, infos = env.step(clipped_actions)
    179 
    180             self.num_timesteps += env.num_envs

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py in step(self, actions)
    160         """
    161         self.step_async(actions)
--> 162         return self.step_wait()
    163 
    164     def get_images(self) -> Sequence[np.ndarray]:

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py in step_wait(self)
     41     def step_wait(self) -> VecEnvStepReturn:
     42         for env_idx in range(self.num_envs):
---> 43             obs, self.buf_rews[env_idx], self.buf_dones[env_idx], self.buf_infos[env_idx] = self.envs[env_idx].step(
     44                 self.actions[env_idx]
     45             )

d:\python\pyenv\gym-mtsim\lib\site-packages\stable_baselines3\common\monitor.py in step(self, action)
     88         if self.needs_reset:
     89             raise RuntimeError("Tried to step environment that needs reset")
---> 90         observation, reward, done, info = self.env.step(action)
     91         self.rewards.append(reward)
     92         if done:

~\AppData\Local\Temp/ipykernel_21208/4242459879.py in step(self, action)
     71     def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
     72         action = self._modify_action(action)
---> 73         super().step(action)
     74 
     75     def _modify_action(self, action: np.ndarray) -> np.ndarray:

D:\Python\pyenv\gym-mtsim\gym_mtsim\envs\mt_env.py in step(self, action)
    108 
    109     def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
--> 110         orders_info, closed_orders_info = self._apply_action(action)
    111 
    112         self._current_tick += 1

D:\Python\pyenv\gym-mtsim\gym_mtsim\envs\mt_env.py in _apply_action(self, action)
    135 
    136         for i, symbol in enumerate(self.trading_symbols):
--> 137             symbol_action = action[k*i:k*(i+1)]
    138             close_orders_logit = symbol_action[:-2]
    139             hold_logit = symbol_action[-2]

TypeError: 'NoneType' object is not subscriptable

AminHP (Owner) commented Nov 25, 2021

I updated this code (the earlier version was missing the return statements in step and _modify_action, so step returned None, which caused the TypeError above).

snafu4 (Author) commented Nov 25, 2021

The new code works (thanks) but it does not appear to accomplish the original goal. The volume is not affected by its inclusion.

class MyMtEnv(MtEnv):
    def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
        action = self._modify_action(action)
        return super().step(action)

    def _modify_action(self, action: np.ndarray) -> np.ndarray:
        k = self.symbol_max_orders + 2
        for i, symbol in enumerate(self.trading_symbols):
            symbol_action = action[k*i:k*(i+1)]
            volume = symbol_action[-1]
#             if self._current_tick > 20:  # or some other conditions according to your risk management algorithm
            volume = min(volume, 0.01)
            symbol_action[-1] = volume
        return action

The volume should be capped at 0.01.

[screenshot: resulting orders, with volumes not capped at 0.01]

AminHP (Owner) commented Nov 25, 2021

My code had a problem and I fixed it. Try the new code, and send me your complete code if the problem still exists. Make sure you are using MyMtEnv, not MtEnv.

snafu4 (Author) commented Nov 25, 2021

You caught my mistake ('Make sure you are using MyMtEnv not MtEnv'). I wasn't actually testing your updated code.

The new code appears to work for restricting volume!! ... but only if

multiprocessing_processes = None

When it is set to 1 or 4, I get:

AttributeError: Can't get attribute 'MyCustomEnv' on <module '__main__' (built-in)>
Process SpawnPoolWorker-4:
Traceback (most recent call last):
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\process.py", line 315, in _bootstrap
    self.run()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\pool.py", line 114, in worker
    task = get()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\multiprocess\queues.py", line 361, in get
    return _ForkingPickler.loads(res)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 327, in loads
    return load(file, ignore, **kwds)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 313, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 525, in load
    obj = StockUnpickler.load(self)
  File "d:\python\pyenv\gym-mtsim\lib\site-packages\dill\_dill.py", line 515, in find_class
    return StockUnpickler.find_class(self, module, name)
AttributeError: Can't get attribute 'MyCustomEnv' on <module '__main__' (built-in)>

AminHP (Owner) commented Dec 1, 2021

As I said earlier, it seems to be a problem with pathos. I don't know the exact cause, but it can be worked around using the code below.

from typing import Any, Dict, Tuple

import numpy as np
from pathos.multiprocessing import ProcessingPool as Pool  # assuming the same Pool MtEnv uses internally

from gym_mtsim import MtEnv


class MyCustomEnv(MtEnv):
    def __init__(self, *args, **kwargs):
        # pop the arg so MtEnv.__init__ doesn't create the pool itself;
        # create it here instead, after the subclass is fully constructed
        multiprocessing_processes = kwargs.pop('multiprocessing_processes', None)
        super().__init__(*args, **kwargs)
        self.multiprocessing_pool = Pool(multiprocessing_processes) if multiprocessing_processes else None

    def step(self, action: np.ndarray) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
        action = self._modify_action(action)
        return super().step(action)

    def _modify_action(self, action: np.ndarray) -> np.ndarray:
        k = self.symbol_max_orders + 2
        for i, symbol in enumerate(self.trading_symbols):
            symbol_action = action[k*i:k*(i+1)]  # a view into `action`
            volume = symbol_action[-1]
            volume = np.clip(volume, -0.01, 0.01)  # cap the volume magnitude at 0.01
            symbol_action[-1] = volume
        return action
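
Usage stays the same as before; a sketch assuming the earlier setup (sim and the symbols are placeholders):

env = MyCustomEnv(
    original_simulator=sim,
    trading_symbols=['EURUSD', 'GBPJPY'],
    window_size=10,
    symbol_max_orders=2,
    multiprocessing_processes=4,  # now popped by the subclass before MtEnv.__init__ runs
)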

AminHP closed this as completed on Jan 15, 2022