# 1. Environment definition
To use a custom environment in **RLLTE**, implement the [gymnasium](https://gymnasium.farama.org/) interface, following [Tutorials: Make Your Own Custom Environment](https://gymnasium.farama.org/tutorials/gymnasium_basics/environment_creation/#). An example:

In [1]:
import gymnasium as gym
import numpy as np


class CustomEnv(gym.Env):
    """Toy environment with random image-like observations and random rewards."""

    def __init__(self, total_length) -> None:
        super().__init__()
        # Observations of shape (9, 84, 84) with integer values in [0, 255].
        self.observation_space = gym.spaces.Box(
            low=0,
            high=255,
            shape=(9, 84, 84),
            dtype=np.uint8
        )
        # Continuous 7-dimensional actions in [-1, 1].
        self.action_space = gym.spaces.Box(
            low=-1.0,
            high=1.0,
            shape=(7,),
            dtype=np.float32
        )
        self.total_length = total_length
        self.count = 0

    def step(self, action):
        obs = self.observation_space.sample()
        reward = float(np.random.rand())
        # The check runs before the counter is incremented, so an episode
        # lasts total_length + 1 steps.
        done = self.count >= self.total_length
        terminated = truncated = done
        info = {"discount": 0.99}
        self.count += 1

        return obs, reward, terminated, truncated, info

    def reset(self, seed=None, options=None):
        # Let gymnasium seed self.np_random, as its API expects.
        super().reset(seed=seed)
        self.count = 0
        return self.observation_space.sample(), {"discount": 0.99}

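As a quick check of the episode length that the `step` logic above produces, the counter can be replayed in plain Python. This is a minimal sketch that copies only the termination check from `CustomEnv`; the helper name `episode_length` is hypothetical and is not part of RLLTE or gymnasium. Because the comparison runs before `count` is incremented, an episode lasts `total_length + 1` steps.

```python
def episode_length(total_length: int) -> int:
    """Count the steps until the termination check from CustomEnv.step fires."""
    count = 0  # reset() sets the counter to zero
    steps = 0
    while True:
        done = not (count < total_length)  # same comparison as in step()
        count += 1  # step() increments the counter after the check
        steps += 1
        if done:
            return steps


print(episode_length(499))  # 500
```

This agrees with the episode length `L: 500` reported in the training log for `total_length=499`.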


# 2. Use `make_rllte_env`
In **RLLTE**, environments are assumed to be ***vectorized***, and the `make_rllte_env` function is used to wrap them:


In [3]:
from rllte.env import make_rllte_env
# create vectorized environments
env = make_rllte_env(env_id=CustomEnv, 
                     device="cuda", 
                     env_kwargs={'total_length': 499} # set env arguments
                     )
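To illustrate what "vectorized" means, here is a toy sketch in plain Python. It is not RLLTE's implementation: several copies of an environment are stepped together, the results come back as batches, and a finished copy is reset automatically. The names `CounterEnv` and `ToyVecEnv` are hypothetical, and the auto-reset behavior is an assumption about typical vectorized wrappers rather than something this guide states about `make_rllte_env`.

```python
from typing import List, Tuple


class CounterEnv:
    """Hypothetical stand-in environment whose observation is the step count."""

    def __init__(self, total_length: int) -> None:
        self.total_length = total_length
        self.count = 0

    def reset(self) -> int:
        self.count = 0
        return self.count

    def step(self, action: float) -> Tuple[int, float, bool]:
        self.count += 1
        done = self.count >= self.total_length
        return self.count, 1.0, done


class ToyVecEnv:
    """Steps several environments in lockstep and batches the results."""

    def __init__(self, num_envs: int, **env_kwargs) -> None:
        self.envs = [CounterEnv(**env_kwargs) for _ in range(num_envs)]

    def reset(self) -> List[int]:
        return [env.reset() for env in self.envs]

    def step(self, actions: List[float]):
        obs, rewards, dones = [], [], []
        for env, action in zip(self.envs, actions):
            o, r, d = env.step(action)
            if d:
                o = env.reset()  # auto-reset so the batch never stalls
            obs.append(o)
            rewards.append(r)
            dones.append(d)
        return obs, rewards, dones


vec_env = ToyVecEnv(num_envs=2, total_length=3)
print(vec_env.reset())  # [0, 0]
for _ in range(3):
    print(vec_env.step([0.0, 0.0]))
```

Each call takes one action per environment and returns one observation, reward, and done flag per environment, which is the batched shape an agent consumes.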

Once wrapped, the custom environment can be used directly in your application:

In [4]:
from rllte.agent import DrQv2
from rllte.env.utils import make_rllte_env

if __name__ == "__main__":
    # env setup
    device = "cuda:0"
    env = make_rllte_env(env_id=CustomEnv, 
                        device=device, 
                        env_kwargs={'total_length': 499} # set env arguments
                        )
    eval_env = make_rllte_env(env_id=CustomEnv, 
                            device=device, 
                            env_kwargs={'total_length': 499} # set env arguments
                            )
    agent = DrQv2(env=env, 
                eval_env=eval_env, 
                device=device,
                tag="drqv2_dmc_pixel")
    agent.train(num_train_steps=5000, log_interval=1000)

[08/29/2023 12:12:21 PM] - [INFO.] - Invoking RLLTE Engine...
[08/29/2023 12:12:21 PM] - [INFO.] - Tag               : drqv2_dmc_pixel
[08/29/2023 12:12:21 PM] - [INFO.] - Device            : NVIDIA GeForce RTX 3090
[08/29/2023 12:12:21 PM] - [DEBUG] - Agent             : DrQv2
[08/29/2023 12:12:21 PM] - [DEBUG] - Encoder           : TassaCnnEncoder
[08/29/2023 12:12:21 PM] - [DEBUG] - Policy            : OffPolicyDetActorDoubleCritic
[08/29/2023 12:12:21 PM] - [DEBUG] - Storage           : NStepReplayStorage
[08/29/2023 12:12:21 PM] - [DEBUG] - Distribution      : TruncatedNormalNoise
[08/29/2023 12:12:21 PM] - [DEBUG] - Augmentation      : True, RandomShift
[08/29/2023 12:12:21 PM] - [DEBUG] - Intrinsic Reward  : False
[08/29/2023 12:12:35 PM] - [EVAL.] - S: 0           | E: 0           | L: 500         | R: 249.162     | T: 0:00:15
[08/29

   function: 'forward' (/export/yuanmingqi/code/rllte/rllte/xploit/policy/off_policy_det_actor_double_critic.py:141)
   reasons:  step == 0
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
   function: 'get_dist' (/export/yuanmingqi/code/rllte/rllte/xploit/policy/off_policy_det_actor_double_critic.py:162)
   reasons:  step == 0
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.
   function: 'reset' (/export/yuanmingqi/code/rllte/rllte/xplore/distribution/truncated_normal_noise.py:100)
   reasons:  step == 0
to diagnose recompilation issues, see https://pytorch.org/docs/master/dynamo/troubleshooting.html.


[08/29/2023 12:13:08 PM] - [TRAIN] - S: 3000        | E: 6           | L: 500         | R: 248.395     | FPS: 62.378    | T: 0:00:48
[08/29/2023 12:13:20 PM] - [TRAIN] - S: 4000        | E: 8           | L: 500         | R: 247.984     | FPS: 65.796    | T: 0:01:00
[08/29/2023 12:13:33 PM] - [TRAIN] - S: 5000        | E: 10          | L: 500         | R: 249.162     | FPS: 68.033    | T: 0:01:13
[08/29/2023 12:13:45 PM] - [EVAL.] - S: 5000        | E: 10          | L: 500         | R: 250.176     | T: 0:01:25
[08/29/2023 12:13:45 PM] - [INFO.] - Training Accomplished!
[08/29/2023 12:13:45 PM] - [INFO.] - Model saved at: /export/yuanmingqi/code/rllte/examples/logs/drqv2_dmc_pixel/2023-08-29-12-12-20/model
