# A guide to the Portfolio Optimization Environment

This notebook aims to provide an example of using PortfolioOptimizationEnv (or POE) to train a reinforcement learning model that learns to solve the portfolio optimization problem.

In this document, we will reproduce a famous architecture called EIIE (ensemble of identical independent evaluators), introduced in the following paper:

- Zhengyao Jiang, Dixing Xu, & Jinjun Liang. (2017). A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem. https://doi.org/10.48550/arXiv.1706.10059.

It's advisable to read it to understand the algorithm implemented in this notebook.

### Note
If you're using this environment, consider citing the following paper (in addition to the FinRL references):

- Caio Costa, & Anna Costa (2023). POE: A General Portfolio Optimization Environment for FinRL. In *Anais do II Brazilian Workshop on Artificial Intelligence in Finance* (pp. 132–143). SBC. https://doi.org/10.5753/bwaif.2023.231144.

```
@inproceedings{bwaif,
 author = {Caio Costa and Anna Costa},
 title = {POE: A General Portfolio Optimization Environment for FinRL},
 booktitle = {Anais do II Brazilian Workshop on Artificial Intelligence in Finance},
 location = {João Pessoa/PB},
 year = {2023},
 keywords = {},
 issn = {0000-0000},
 pages = {132--143},
 publisher = {SBC},
 address = {Porto Alegre, RS, Brasil},
 doi = {10.5753/bwaif.2023.231144},
 url = {https://sol.sbc.org.br/index.php/bwaif/article/view/24959}
}

```

## Installation and imports

To run this notebook in Google Colab, uncomment the lines in the cell below.

In [1]:
## install finrl library
# !pip install wrds
# !pip install quantstats
# !pip install torch_geometric
# !pip install swig
# !pip install -q condacolab
# !pip install shimmy
# import condacolab
# condacolab.install()
# !apt-get update -y -qq && apt-get install -y -qq cmake libopenmpi-dev python3-dev zlib1g-dev libgl1-mesa-glx swig
# !pip install git+https://github.com/flpymonkey/FinRL_Online_Portfolio_Benchmarks.git

In [2]:
## Hide matplotlib warnings
# import warnings
# warnings.filterwarnings('ignore')

import logging
logging.getLogger('matplotlib.font_manager').disabled = True

#### Import the necessary code libraries

In [3]:
import torch

import numpy as np

from sklearn.preprocessing import MaxAbsScaler

from finrl.meta.preprocessor.yahoodownloader import YahooDownloader
from finrl.meta.preprocessor.preprocessors import GroupByScaler
from finrl.meta.env_portfolio_optimization.env_portfolio_optimization import PortfolioOptimizationEnv
from finrl.agents.portfolio_optimization.models import DRLAgent
from finrl.agents.portfolio_optimization.architectures import EIIE

device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

## Fetch data

In their paper, *Jiang et al.* build a portfolio from the top-11 cryptocurrencies ranked by 30-day trading volume. Since the paper does not specify when this ranking was made, it is difficult to reproduce, so we use a simpler setup based on US stocks:

- We select six large US stocks (MSFT, V, AAPL, BA, INTC and WMT);
- For simplicity, we only keep stocks that have no missing data in the period from 2009-04-01 to 2024-10-01 (`TRAIN_START_DATE` to `TEST_END_DATE`).
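
The completeness filter mentioned in the list above can be sketched in plain Python. The `rows` list and the `tickers_with_full_history` helper are hypothetical names used for illustration and are not part of FinRL; the idea is simply to keep the tickers that have a row for every date present in the data.

```python
from collections import defaultdict


def tickers_with_full_history(rows):
    """Return the tickers that have one row for every date present in `rows`.

    `rows` is an iterable of (date, ticker) pairs; the structure is
    illustrative and does not come from FinRL.
    """
    dates_by_ticker = defaultdict(set)
    all_dates = set()
    for date, ticker in rows:
        dates_by_ticker[ticker].add(date)
        all_dates.add(date)
    # A ticker is complete when it covers every date seen in the data.
    return sorted(t for t, d in dates_by_ticker.items() if d == all_dates)


rows = [
    ("2009-04-01", "AAPL"), ("2009-04-02", "AAPL"),
    ("2009-04-01", "MSFT"), ("2009-04-02", "MSFT"),
    ("2009-04-02", "NEW"),  # made-up ticker that starts trading one day late
]
complete = tickers_with_full_history(rows)
print(complete)
```

The `groupby("tic").count()` cell further down gives the same information for the real data: every ticker has 3901 rows.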

In [None]:
TEST_TICKER = [
    "MSFT",
    "V",
    "AAPL",
    "BA",
    "INTC",
    "WMT",
]


TRAIN_START_DATE = '2009-04-01'
TEST_END_DATE = '2024-10-01'

TRAIN_DATES = ['2018-12-31', '2019-12-31', '2020-12-31', '2021-12-31', '2022-12-31', '2023-12-31']


In [5]:
print(len(TEST_TICKER))

portfolio_raw_df = YahooDownloader(start_date = TRAIN_START_DATE,
                                end_date = TEST_END_DATE,
                                ticker_list = TEST_TICKER).fetch_data()
portfolio_raw_df

6


[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


Shape of DataFrame:  (23406, 8)


Unnamed: 0,date,open,high,low,close,volume,tic,day
0,2009-04-01,3.717500,3.892857,3.710357,3.274470,589372000,AAPL,2
1,2009-04-01,34.520000,35.599998,34.209999,26.850750,9288800,BA,2
2,2009-04-01,14.770000,15.320000,14.620000,9.507456,75052800,INTC,2
3,2009-04-01,18.230000,19.360001,18.180000,14.301780,96438900,MSFT,2
4,2009-04-01,13.687500,14.000000,13.407500,12.129694,44144400,V,2
...,...,...,...,...,...,...,...,...
23401,2024-09-30,154.789993,155.300003,151.240005,152.039993,10902200,BA,0
23402,2024-09-30,23.740000,23.950001,23.090000,23.459999,66308200,INTC,0
23403,2024-09-30,428.209991,430.420013,425.369995,429.440399,16807300,MSFT,0
23404,2024-09-30,275.000000,275.690002,273.200012,274.428284,5969900,V,0


In [6]:
portfolio_raw_df.groupby("tic").count()

tic,date,open,high,low,close,volume,day
AAPL,3901,3901,3901,3901,3901,3901,3901
BA,3901,3901,3901,3901,3901,3901,3901
INTC,3901,3901,3901,3901,3901,3901,3901
MSFT,3901,3901,3901,3901,3901,3901,3901
V,3901,3901,3901,3901,3901,3901,3901
WMT,3901,3901,3901,3901,3901,3901,3901


### Normalize Data

We normalize the data by dividing each stock's time series by its maximum absolute value, so that every numeric column in the dataframe (including `volume` and `day`) contains values between 0 and 1.
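
As a rough illustration of what this scaling does, the dependency-free sketch below divides each ticker's series by its largest absolute value. The helper `max_abs_scale` and the toy prices are assumptions made for the example; the notebook itself relies on `GroupByScaler` combined with scikit-learn's `MaxAbsScaler`.

```python
def max_abs_scale(series_by_ticker):
    """Divide each ticker's values by that ticker's largest absolute value."""
    scaled = {}
    for ticker, values in series_by_ticker.items():
        peak = max(abs(v) for v in values)
        # Guard against an all-zero series to avoid dividing by zero.
        scaled[ticker] = [v / peak if peak else 0.0 for v in values]
    return scaled


# Toy closing prices (made up for the example).
closes = {"AAPL": [50.0, 100.0, 200.0], "BA": [300.0, 150.0, 75.0]}
scaled = max_abs_scale(closes)
print(scaled)
```

Because each ticker is scaled on its own, a value of 1.0 marks that ticker's own peak, so scaled values are not comparable across tickers in absolute terms.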

In [7]:
portfolio_norm_df = GroupByScaler(by="tic", scaler=MaxAbsScaler).fit_transform(portfolio_raw_df)
portfolio_norm_df



Unnamed: 0,date,open,high,low,close,volume,tic,day
0,2009-04-01,0.015720,0.016410,0.015918,0.013976,0.313329,AAPL,0.5
1,2009-04-01,0.077397,0.079819,0.077716,0.062400,0.089997,BA,0.5
2,2009-04-01,0.216569,0.221100,0.217204,0.153140,0.249431,INTC,0.5
3,2009-04-01,0.039036,0.041337,0.039142,0.030705,0.302015,MSFT,0.5
4,2009-04-01,0.046901,0.047770,0.046220,0.041682,0.130785,V,0.5
...,...,...,...,...,...,...,...,...
23401,2024-09-30,0.347055,0.348198,0.343579,0.353335,0.105628,BA,0.0
23402,2024-09-30,0.348094,0.345649,0.343040,0.377879,0.220369,INTC,0.0
23403,2024-09-30,0.916938,0.919014,0.915838,0.921968,0.052635,MSFT,0.0
23404,2024-09-30,0.942297,0.940697,0.941809,0.943031,0.017687,V,0.0


### Instantiate Environment

Using the `PortfolioOptimizationEnv`, it's easy to instantiate a portfolio optimization environment for reinforcement learning agents. In the example below, we use the dataframe created before to start an environment.
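
For orientation, the sketch below shows the kind of action such an environment consumes: a vector of portfolio weights with the cash position first, as in the buy-and-hold code later in this notebook. The helper `is_valid_action` and its tolerance are assumptions made for illustration, not FinRL functions, and the sketch assumes weights are expected to be non-negative and to sum to 1.

```python
import math


def is_valid_action(weights, tol=1e-9):
    """Check that weights are non-negative and sum to 1 (cash weight first)."""
    return all(w >= 0 for w in weights) and math.isclose(sum(weights), 1.0, abs_tol=tol)


n_assets = 6
uniform = [0.0] + [1 / n_assets] * n_assets  # no cash, equal split across assets
all_cash = [1.0] + [0.0] * n_assets          # everything kept in cash

print(is_valid_action(uniform))     # True
print(is_valid_action(all_cash))    # True
print(is_valid_action([0.5, 0.7]))  # False
```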

### Instantiate Model

Now, we can instantiate the model using the FinRL API. In this example, we are going to use the EIIE architecture introduced by Jiang et al.

:exclamation: **Note:** Remember to set the architecture's `time_window` parameter to the same value as the environment's `time_window`.
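
One way to picture why the two values should agree: the policy is fed the most recent `time_window` observations of each feature, so its layers have to be sized for that length. The sketch below uses a hypothetical `latest_window` helper and a single shared constant; it illustrates the idea under that assumption and is not FinRL's internal code.

```python
TIME_WINDOW = 50  # single source of truth for environment and policy


def latest_window(series, t, time_window):
    """Return the `time_window` values of `series` ending at index `t` (inclusive)."""
    start = t - time_window + 1
    if start < 0:
        raise ValueError("not enough history for a full window")
    return series[start : t + 1]


prices = list(range(200))  # stand-in for one feature of one asset
window = latest_window(prices, 120, TIME_WINDOW)

# Both keyword dictionaries read the same constant, so they cannot drift apart.
env_kwargs = {"time_window": TIME_WINDOW}
policy_kwargs = {"k_size": 3, "time_window": TIME_WINDOW}

print(len(window), window[0], window[-1])
```

Defining the window length once and reusing it avoids having to keep several hard-coded `50` literals in sync.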

In [None]:
df_portfolio = portfolio_norm_df[["date", "tic", "close", "high", "low"]]

from datetime import datetime

NUM_MODELS = 5


for split_date in TRAIN_DATES:
    df_portfolio_train = df_portfolio[(df_portfolio["date"] >= TRAIN_START_DATE) & (df_portfolio["date"] < split_date)]

    df_portfolio_test = df_portfolio[(df_portfolio["date"] > split_date) & (df_portfolio["date"] < TEST_END_DATE)]

    for i in range(0, NUM_MODELS):

        print("=================")
        print(split_date)
        print(i)

        environment = PortfolioOptimizationEnv(
                df_portfolio_train,
                initial_amount=100000,
                comission_fee_pct=0.0025,
                time_window=50,
                features=["close", "high", "low"],
                normalize_df=None
            )

        # set PolicyGradient parameters
        model_kwargs = {
            "lr": 0.01,
            "policy": EIIE,
        }

        # here, we can set EIIE's parameters
        policy_kwargs = {
            "k_size": 3,
            "time_window": 50,
        }

        model = DRLAgent(environment).get_model("pg", device, model_kwargs, policy_kwargs)

        DRLAgent.train_model(model, episodes=40)

        # Timestamp without spaces or colons, so the file name is valid on every OS.
        timestamp_string = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")

        FILE = f"policy_EIIE_{split_date}_{i}_{timestamp_string}.pt"

        torch.save(model.train_policy.state_dict(), FILE)

        environment_test = PortfolioOptimizationEnv(
            df_portfolio_test,
            initial_amount=100000,
            comission_fee_pct=0.0025,
            time_window=50,
            features=["close", "high", "low"],
            normalize_df=None
        )

        EIIE_results = {
            "training": environment._asset_memory["final"],
            "test": {},
        }

        # instantiate an architecture with the same arguments used in training
        # and load with load_state_dict.
        policy = EIIE(time_window=50, device=device)
        policy.load_state_dict(torch.load(FILE))

        environment.reset()
        DRLAgent.DRL_validation(model, environment, policy=policy)
        EIIE_results["training"] = environment._asset_memory["final"]


        # test period (dates after split_date)
        DRLAgent.DRL_validation(model, environment_test, policy=policy)
        EIIE_results["test"]["value"] = environment_test._asset_memory["final"]

        UBAH_results = {
            "train": {},
            "test": {},
        }

        PORTFOLIO_SIZE = len(TEST_TICKER)

        # train period
        terminated = False
        environment.reset()
        while not terminated:
            action = [0] + [1/PORTFOLIO_SIZE] * PORTFOLIO_SIZE
            _, _, terminated, _ = environment.step(action)
        UBAH_results["train"]["value"] = environment._asset_memory["final"]

        # test period (dates after split_date)
        terminated = False
        environment_test.reset()
        while not terminated:
            action = [0] + [1/PORTFOLIO_SIZE] * PORTFOLIO_SIZE
            _, _, terminated, _ = environment_test.step(action)
        UBAH_results["test"]["value"] = environment_test._asset_memory["final"]


2018-12-31
0


  0%|          | 0/40 [00:00<?, ?it/s]

Initial portfolio value:100000
Final portfolio value: 518157.5
Final accumulative portfolio value: 5.181575
Maximum DrawDown: -0.19953436900356636
Sharpe ratio: 1.2437006786418974


  2%|▎         | 1/40 [00:31<20:16, 31.20s/it]

Initial portfolio value:100000
Final portfolio value: 638280.9375
Final accumulative portfolio value: 6.382809375
Maximum DrawDown: -0.24438885576137714
Sharpe ratio: 1.252906030061764


  5%|▌         | 2/40 [01:02<19:42, 31.12s/it]

Initial portfolio value:100000
Final portfolio value: 788846.9375
Final accumulative portfolio value: 7.888469375
Maximum DrawDown: -0.2596710098218472
Sharpe ratio: 1.2729511319119857


  8%|▊         | 3/40 [01:33<19:20, 31.37s/it]

Initial portfolio value:100000
Final portfolio value: 905069.4375
Final accumulative portfolio value: 9.050694375
Maximum DrawDown: -0.1872902423344741
Sharpe ratio: 1.3267364674556361


 10%|█         | 4/40 [02:06<19:12, 32.01s/it]

Initial portfolio value:100000
Final portfolio value: 1152856.125
Final accumulative portfolio value: 11.52856125
Maximum DrawDown: -0.18909784762873905
Sharpe ratio: 1.3852901630973375


 12%|█▎        | 5/40 [02:38<18:39, 31.99s/it]

Initial portfolio value:100000
Final portfolio value: 1367338.625
Final accumulative portfolio value: 13.67338625
Maximum DrawDown: -0.19128019447793343
Sharpe ratio: 1.4312410877138053


 15%|█▌        | 6/40 [03:09<17:46, 31.37s/it]

Initial portfolio value:100000
Final portfolio value: 1488788.0
Final accumulative portfolio value: 14.88788
Maximum DrawDown: -0.19316614546777178
Sharpe ratio: 1.431908564569418


 18%|█▊        | 7/40 [03:33<16:04, 29.23s/it]

Initial portfolio value:100000
Final portfolio value: 1609713.625
Final accumulative portfolio value: 16.09713625
Maximum DrawDown: -0.19356016103618623
Sharpe ratio: 1.460823578347117


 20%|██        | 8/40 [03:57<14:41, 27.55s/it]

Initial portfolio value:100000
Final portfolio value: 1673115.25
Final accumulative portfolio value: 16.7311525
Maximum DrawDown: -0.19435583870444773
Sharpe ratio: 1.4648435482592015


 22%|██▎       | 9/40 [04:21<13:34, 26.26s/it]

Initial portfolio value:100000
Final portfolio value: 1705884.875
Final accumulative portfolio value: 17.05884875
Maximum DrawDown: -0.19490157157515542
Sharpe ratio: 1.4657701947133097


 25%|██▌       | 10/40 [04:44<12:41, 25.39s/it]

Initial portfolio value:100000
Final portfolio value: 1672884.0
Final accumulative portfolio value: 16.72884
Maximum DrawDown: -0.19704244106870217
Sharpe ratio: 1.4419059413432804


 28%|██▊       | 11/40 [05:07<11:54, 24.64s/it]

Initial portfolio value:100000
Final portfolio value: 1748040.25
Final accumulative portfolio value: 17.4804025
Maximum DrawDown: -0.19542296131330872
Sharpe ratio: 1.4680095234092596


 30%|███       | 12/40 [05:51<13:39, 29.28s/it]


KeyboardInterrupt: 

### Train Model

  0%|          | 0/40 [00:00<?, ?it/s]

Initial portfolio value:100000
Final portfolio value: 1958396.5
Final accumulative portfolio value: 19.583965
Maximum DrawDown: -0.30520120295900544
Sharpe ratio: 1.1565652909453532


  2%|▎         | 1/40 [00:37<24:26, 37.59s/it]

Initial portfolio value:100000
Final portfolio value: 2673373.5
Final accumulative portfolio value: 26.733735
Maximum DrawDown: -0.3001341818144375
Sharpe ratio: 1.1769115049119097


  5%|▌         | 2/40 [01:23<27:03, 42.72s/it]

Initial portfolio value:100000
Final portfolio value: 3104070.0
Final accumulative portfolio value: 31.0407
Maximum DrawDown: -0.30769359388857154
Sharpe ratio: 1.1594700277159495


  8%|▊         | 3/40 [02:09<27:04, 43.92s/it]

Initial portfolio value:100000
Final portfolio value: 3257473.5
Final accumulative portfolio value: 32.574735
Maximum DrawDown: -0.3188991033785997
Sharpe ratio: 1.1540243918383162


 10%|█         | 4/40 [02:49<25:33, 42.59s/it]

Initial portfolio value:100000
Final portfolio value: 2503919.0
Final accumulative portfolio value: 25.03919
Maximum DrawDown: -0.33090874695781514
Sharpe ratio: 1.2719800005494837


 12%|█▎        | 5/40 [03:28<24:01, 41.19s/it]

Initial portfolio value:100000
Final portfolio value: 2824975.0
Final accumulative portfolio value: 28.24975
Maximum DrawDown: -0.3468525801464468
Sharpe ratio: 1.3028201279984415


 15%|█▌        | 6/40 [04:10<23:35, 41.62s/it]

Initial portfolio value:100000
Final portfolio value: 3128960.75
Final accumulative portfolio value: 31.2896075
Maximum DrawDown: -0.3524105610161633
Sharpe ratio: 1.3128877504348149


 18%|█▊        | 7/40 [04:47<21:54, 39.83s/it]

Initial portfolio value:100000
Final portfolio value: 3250123.0
Final accumulative portfolio value: 32.50123
Maximum DrawDown: -0.35392634911743326
Sharpe ratio: 1.3169987955103746


 20%|██        | 8/40 [05:29<21:40, 40.63s/it]

Initial portfolio value:100000
Final portfolio value: 3510849.25
Final accumulative portfolio value: 35.1084925
Maximum DrawDown: -0.35457902690108023
Sharpe ratio: 1.3071059322743788


 22%|██▎       | 9/40 [06:10<21:06, 40.85s/it]

Initial portfolio value:100000
Final portfolio value: 5532465.0
Final accumulative portfolio value: 55.32465
Maximum DrawDown: -0.37187194693191683
Sharpe ratio: 1.2847834014896786


 25%|██▌       | 10/40 [06:52<20:36, 41.20s/it]

Initial portfolio value:100000
Final portfolio value: 3439077.25
Final accumulative portfolio value: 34.3907725
Maximum DrawDown: -0.35511904510742687
Sharpe ratio: 1.3246386336373621


 28%|██▊       | 11/40 [07:27<19:01, 39.36s/it]

Initial portfolio value:100000
Final portfolio value: 3911968.0
Final accumulative portfolio value: 39.11968
Maximum DrawDown: -0.3554210262186409
Sharpe ratio: 1.3167660848931673


 30%|███       | 12/40 [08:01<17:32, 37.57s/it]

Initial portfolio value:100000
Final portfolio value: 5154100.5
Final accumulative portfolio value: 51.541005
Maximum DrawDown: -0.3745649415810328
Sharpe ratio: 1.2279484990623395


 32%|███▎      | 13/40 [08:36<16:31, 36.71s/it]

Initial portfolio value:100000
Final portfolio value: 5323997.0
Final accumulative portfolio value: 53.23997
Maximum DrawDown: -0.37473742731821647
Sharpe ratio: 1.236810978513806


 35%|███▌      | 14/40 [09:14<16:04, 37.10s/it]

Initial portfolio value:100000
Final portfolio value: 3308452.0
Final accumulative portfolio value: 33.08452
Maximum DrawDown: -0.35570409528098157
Sharpe ratio: 1.267991276598877


 38%|███▊      | 15/40 [09:58<16:21, 39.26s/it]

Initial portfolio value:100000
Final portfolio value: 3417761.5
Final accumulative portfolio value: 34.177615
Maximum DrawDown: -0.3557730766829933
Sharpe ratio: 1.318638297074427


 40%|████      | 16/40 [10:40<15:59, 39.99s/it]

Initial portfolio value:100000
Final portfolio value: 3497765.75
Final accumulative portfolio value: 34.9776575
Maximum DrawDown: -0.3559284590936215
Sharpe ratio: 1.327648331890976


 42%|████▎     | 17/40 [11:22<15:33, 40.60s/it]

Initial portfolio value:100000
Final portfolio value: 3513002.5
Final accumulative portfolio value: 35.130025
Maximum DrawDown: -0.35605538954551763
Sharpe ratio: 1.3277559242076482


 45%|████▌     | 18/40 [12:02<14:54, 40.67s/it]

Initial portfolio value:100000
Final portfolio value: 3521169.5
Final accumulative portfolio value: 35.211695
Maximum DrawDown: -0.35614092809736375
Sharpe ratio: 1.328958125275801


 48%|████▊     | 19/40 [12:45<14:24, 41.16s/it]

Initial portfolio value:100000
Final portfolio value: 3544196.5
Final accumulative portfolio value: 35.441965
Maximum DrawDown: -0.3562072789380354
Sharpe ratio: 1.329809042159697


 50%|█████     | 20/40 [13:26<13:43, 41.17s/it]

Initial portfolio value:100000
Final portfolio value: 3550096.5
Final accumulative portfolio value: 35.500965
Maximum DrawDown: -0.35625963156843665
Sharpe ratio: 1.3307800467313944


 52%|█████▎    | 21/40 [14:07<13:02, 41.16s/it]

Initial portfolio value:100000
Final portfolio value: 3576087.0
Final accumulative portfolio value: 35.76087
Maximum DrawDown: -0.35630015332318643
Sharpe ratio: 1.3322572280403382


 55%|█████▌    | 22/40 [14:47<12:13, 40.77s/it]

Initial portfolio value:100000
Final portfolio value: 3591728.0
Final accumulative portfolio value: 35.91728
Maximum DrawDown: -0.35633478674045993
Sharpe ratio: 1.3333116305516008


 57%|█████▊    | 23/40 [15:27<11:30, 40.61s/it]

Initial portfolio value:100000
Final portfolio value: 3608139.0
Final accumulative portfolio value: 36.08139
Maximum DrawDown: -0.3563636789331881
Sharpe ratio: 1.3346508606314595


 60%|██████    | 24/40 [16:08<10:50, 40.64s/it]

Initial portfolio value:100000
Final portfolio value: 3623134.5
Final accumulative portfolio value: 36.231345
Maximum DrawDown: -0.35638714647250824
Sharpe ratio: 1.3354170235531333


 62%|██████▎   | 25/40 [16:52<10:23, 41.55s/it]

Initial portfolio value:100000
Final portfolio value: 3630157.5
Final accumulative portfolio value: 36.301575
Maximum DrawDown: -0.35640768211970764
Sharpe ratio: 1.3369499118376469


 65%|██████▌   | 26/40 [17:36<09:54, 42.45s/it]

Initial portfolio value:100000
Final portfolio value: 3660747.25
Final accumulative portfolio value: 36.6074725
Maximum DrawDown: -0.35642537499431604
Sharpe ratio: 1.3385252434943187


 68%|██████▊   | 27/40 [18:21<09:19, 43.06s/it]

Initial portfolio value:100000
Final portfolio value: 3669604.5
Final accumulative portfolio value: 36.696045
Maximum DrawDown: -0.35644141307642685
Sharpe ratio: 1.340503285487601


 70%|███████   | 28/40 [19:03<08:35, 42.95s/it]

Initial portfolio value:100000
Final portfolio value: 3700970.5
Final accumulative portfolio value: 37.009705
Maximum DrawDown: -0.3564556101372539
Sharpe ratio: 1.3419810818543871


 72%|███████▎  | 29/40 [19:50<08:06, 44.18s/it]

Initial portfolio value:100000
Final portfolio value: 3706175.75
Final accumulative portfolio value: 37.0617575
Maximum DrawDown: -0.35646694129053635
Sharpe ratio: 1.3443131056267335


 75%|███████▌  | 30/40 [20:33<07:18, 43.82s/it]

Initial portfolio value:100000
Final portfolio value: 3765160.5
Final accumulative portfolio value: 37.651605
Maximum DrawDown: -0.3564786813018046
Sharpe ratio: 1.3481864444968992


 78%|███████▊  | 31/40 [21:17<06:34, 43.79s/it]

Initial portfolio value:100000
Final portfolio value: 3797120.5
Final accumulative portfolio value: 37.971205
Maximum DrawDown: -0.35649166016756795
Sharpe ratio: 1.3510651139356495


 80%|████████  | 32/40 [21:59<05:45, 43.24s/it]

Initial portfolio value:100000
Final portfolio value: 3813483.0
Final accumulative portfolio value: 38.13483
Maximum DrawDown: -0.35649774742895923
Sharpe ratio: 1.3537278751193325


 82%|████████▎ | 33/40 [22:40<04:57, 42.49s/it]

Initial portfolio value:100000
Final portfolio value: 3794414.25
Final accumulative portfolio value: 37.9441425
Maximum DrawDown: -0.3565050600193611
Sharpe ratio: 1.3503244522884725


 85%|████████▌ | 34/40 [23:22<04:14, 42.49s/it]

Initial portfolio value:100000
Final portfolio value: 3807726.0
Final accumulative portfolio value: 38.07726
Maximum DrawDown: -0.3565128201841168
Sharpe ratio: 1.3550185522386886


 88%|████████▊ | 35/40 [24:06<03:34, 42.83s/it]

Initial portfolio value:100000
Final portfolio value: 3938381.0
Final accumulative portfolio value: 39.38381
Maximum DrawDown: -0.3565198969145309
Sharpe ratio: 1.3649170108690671


 90%|█████████ | 36/40 [24:50<02:52, 43.23s/it]

Initial portfolio value:100000
Final portfolio value: 3979621.25
Final accumulative portfolio value: 39.7962125
Maximum DrawDown: -0.3565255802576264
Sharpe ratio: 1.3668256543630617


 92%|█████████▎| 37/40 [25:32<02:08, 42.94s/it]

Initial portfolio value:100000
Final portfolio value: 4106552.75
Final accumulative portfolio value: 41.0655275
Maximum DrawDown: -0.35652908851816056
Sharpe ratio: 1.3610999571303466


 95%|█████████▌| 38/40 [26:16<01:26, 43.18s/it]

Initial portfolio value:100000
Final portfolio value: 4087476.5
Final accumulative portfolio value: 40.874765
Maximum DrawDown: -0.3565325258093377
Sharpe ratio: 1.3730398519367388


 98%|█████████▊| 39/40 [26:54<00:41, 41.61s/it]

Initial portfolio value:100000
Final portfolio value: 3881257.5
Final accumulative portfolio value: 38.812575
Maximum DrawDown: -0.35653697422205255
Sharpe ratio: 1.3609118883892033


100%|██████████| 40/40 [27:29<00:00, 41.24s/it]


<finrl.agents.portfolio_optimization.algorithms.PolicyGradient at 0x1cb8f2267d0>

### Save Model

## Test Model

### Instantiate different environments

Since each split date in `TRAIN_DATES` defines its own test period, a separate test environment (`environment_test` in the loop above) is instantiated for each split to simulate it.
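
The way the loop above carves out a train and a test period for a given split date can be sketched without pandas. The helper `split_dates` is a hypothetical name; it mirrors the comparisons used in the loop (training rows strictly before the split date, test rows strictly after it) and relies on ISO-formatted date strings sorting chronologically.

```python
def split_dates(dates, start, split, end):
    """Split ISO date strings into train (start <= d < split) and test (split < d < end)."""
    train = [d for d in dates if start <= d < split]
    test = [d for d in dates if split < d < end]
    return train, test


dates = ["2018-12-28", "2018-12-31", "2019-01-02", "2024-09-30", "2024-10-01"]
train, test = split_dates(dates, "2009-04-01", "2018-12-31", "2024-10-01")
print(train)
print(test)
```

With these comparisons, a row dated exactly on the split date lands in neither set.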

### Test EIIE architecture
Now, we can test the EIIE architecture on each test period. It's important to note that, in this code, we load the saved policy even though it's not necessary, just to show how to save and load your model.

### Test Uniform Buy and Hold
For comparison, we also test the performance of a uniform buy and hold strategy. In this strategy, the portfolio holds no cash and the same fraction of the money is allocated to each asset.
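
The sketch below contrasts two readings of a uniform allocation on toy prices: splitting the money equally once and never trading again, versus resetting to equal weights at every step. Both helper functions are hypothetical and ignore commission fees. Which of the two the loop in this notebook corresponds to depends on how the environment treats a repeated action; if it rebalances to the submitted weights at each step, the second function is the closer match.

```python
def buy_and_hold(initial, price_paths):
    """Final value when `initial` is split equally once and never rebalanced."""
    n = len(price_paths)
    return sum(initial / n * path[-1] / path[0] for path in price_paths)


def rebalanced(initial, price_paths):
    """Final value when the portfolio is reset to equal weights every step (no fees)."""
    n = len(price_paths)
    value = initial
    for t in range(1, len(price_paths[0])):
        # Portfolio return is the average of the assets' one-step returns.
        value *= sum(path[t] / path[t - 1] for path in price_paths) / n
    return value


# Two toy assets: one doubles twice, the other stays flat.
paths = [[1.0, 2.0, 4.0], [1.0, 1.0, 1.0]]
print(buy_and_hold(100_000, paths))  # 250000.0
print(rebalanced(100_000, paths))    # 225000.0
```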

### Plot graphics

We can see that the agent is able to learn a good policy, but its performance worsens as the test period moves further into the future. To perform better in 2022, for example, the agent should probably be retrained on more recent data.
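
The training logs above report a final accumulative portfolio value and a maximum drawdown for each episode. The sketch below shows one common way to compute both from a series of portfolio values; the function names are hypothetical and the exact formulas used inside FinRL may differ.

```python
def accumulative_value(values):
    """Final portfolio value divided by the initial one."""
    return values[-1] / values[0]


def max_drawdown(values):
    """Largest peak-to-trough decline, as a negative fraction of the running peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = min(worst, v / peak - 1.0)
    return worst


values = [100_000, 120_000, 90_000, 150_000]
print(accumulative_value(values))  # 1.5
print(max_drawdown(values))        # -0.25
```

The first definition is consistent with the logs, where a final value of 518157.5 on an initial 100000 is reported as an accumulative value of 5.181575.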