## Loading the data

The data will be stored in two dataframes:

* **df_stocks**: all the stocks
* **df_bench**: only the benchmarks

1. Importing libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import datetime
import dateutil.relativedelta  # import the submodule explicitly; plain `import dateutil` does not load it
import glob
import os

2. Listing previously saved dataframes

In [2]:
# listing pandas dataframes previously saved
lst_df_path = glob.glob(os.path.join('/content/raw', 'df_*.pickle'))

In [3]:
# checking the path and file names
#lst_df_path[:3]
lst_df_path[:]

[]

In [None]:
# remove the tickers that will be used as benchmarks later
lst_df_path.remove('/content/raw/df_BVSP.pickle')
lst_df_path.remove('/content/raw/df_USDBRL.pickle')

In [None]:
# creating a separate list for the benchmarks
lst_df_path_bench = ['/content/raw/df_BVSP.pickle', '/content/raw/df_USDBRL.pickle']
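
The two cells above call `list.remove` with hard-coded paths, which raises `ValueError` when a file is missing from the listing (for example, when `glob` returns an empty list). A minimal sketch of an alternative is to split the paths by ticker name. The helper names `ticker_from_path` and `split_paths` and the `df_PETR4.pickle` file are hypothetical, used only for illustration.

```python
import os

# benchmark tickers named in the cells above
BENCHMARKS = {'BVSP', 'USDBRL'}

def ticker_from_path(path):
    """Return the ticker encoded in a file name such as 'df_BVSP.pickle'."""
    name = os.path.splitext(os.path.basename(path))[0]
    return name[len('df_'):] if name.startswith('df_') else name

def split_paths(paths, benchmarks=BENCHMARKS):
    """Split a list of pickle paths into (stock paths, benchmark paths)."""
    stocks = [p for p in paths if ticker_from_path(p) not in benchmarks]
    bench = [p for p in paths if ticker_from_path(p) in benchmarks]
    return stocks, bench

# illustrative input; 'df_PETR4.pickle' is a made-up file name
example = ['/content/raw/df_BVSP.pickle',
           '/content/raw/df_PETR4.pickle',
           '/content/raw/df_USDBRL.pickle']
stocks, bench = split_paths(example)
print(stocks)
print(bench)
```

With this approach an empty listing simply yields two empty lists, so the problem shows up as an empty dataframe rather than as an exception in the middle of the notebook.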

In [None]:
# concatenating all stocks into one dataframe
lst_df_stocks = []

for fname in lst_df_path:
    df = pd.read_pickle(fname)
    # keeping only Adj Close
    #df.drop(columns=['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Open', 'Adj High', 'Adj Low'], inplace=True)
    df.drop(columns=['Open', 'High', 'Low', 'Close', 'Volume'], inplace=True)
    ticker = fname.split('/content/raw/')[1].split('df_')[1].split('.')[0] 
    df.columns = [ticker]
    lst_df_stocks.append(df)
    
df_stocks = pd.concat(lst_df_stocks, axis=1)

In [None]:
# checking column names
df_stocks.columns

In [None]:
# concatenating the benchmarks into one dataframe
lst_df_bench = []

for fname in lst_df_path_bench:
    df = pd.read_pickle(fname)
    # keeping only Adj Close
    #df.drop(columns=['Open', 'High', 'Low', 'Close', 'Volume', 'Adj Open', 'Adj High', 'Adj Low'], inplace=True)
    df.drop(columns=['Open', 'High', 'Low', 'Close', 'Volume'], inplace=True)
    ticker = fname.split('/content/raw/')[1].split('df_')[1].split('.')[0] 
    df.columns = [ticker]
    lst_df_bench.append(df)
    
df_bench = pd.concat(lst_df_bench, axis=1)

In [None]:
df_bench.columns

In [None]:
df_bench.head()

## Monthly Optimized Portfolio

The goal is to build a well-performing portfolio using only a small number of stocks from the list.

Each month a new portfolio is built from the Sharpe ratio of the preceding months, and its performance is compared with three benchmarks:

* iBovespa: the official Bovespa index (made up of 60+ stocks)

* Avg. BVSP: the simple average of all available iBovespa stocks

* Dollar: the value of the US dollar in Brazilian reais (USD/BRL)

**Additional portfolio constraints:**

* the maximum weight of a single stock is 25%
* the minimum weight of a single stock is 2%

**Expected results:**

* improved performance over the long term
* higher volatility than the iBovespa, because of the small number of stocks in the portfolio

**Setting up the optimization**

*Based on Jose Portilla's Udemy course [Python for Financial Analysis and Algorithmic Trading](https://www.udemy.com/python-for-finance-and-trading-algorithms/learn/v4/).*

In [None]:
from scipy.optimize import minimize

In [None]:
# utility function to obtain the expected return, expected volatility, and Sharpe ratio from the log returns, given the weights
def get_ret_vol_sr(weights):
    global log_ret
    weights = np.array(weights)    
    ret = np.sum( log_ret.mean() * weights * 252)
    vol = np.sqrt( np.dot(weights.T, np.dot(log_ret.cov()*252, weights)))
    sr = ret/vol
    return np.array([ret, vol, sr])
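
For reference, the quantities computed by `get_ret_vol_sr` can be written as follows. The symbols are notation introduced here, not taken from the notebook: $w$ is the weight vector, $\bar{r}_i$ the mean daily log return of stock $i$, and $\Sigma$ the sample covariance matrix of the daily log returns. The factor 252 annualizes daily figures by the usual number of trading days per year.

$$
\mu_p = 252 \sum_i w_i \, \bar{r}_i, \qquad
\sigma_p = \sqrt{w^\top (252\,\Sigma)\, w}, \qquad
SR = \frac{\mu_p}{\sigma_p}
$$

Because the code divides `ret` by `vol` directly, the risk-free rate is effectively taken as zero in this Sharpe ratio.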

In [None]:
# the actual function to be minimized
def neg_sharpe(weights):
    return -1.*get_ret_vol_sr(weights)[2]
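
The functions above read `log_ret` from a global variable that is reassigned inside the monthly loop. A possible variant, sketched here with hypothetical names (`ret_vol_sr`, `neg_sharpe_explicit`) and made-up toy returns, passes the log returns explicitly, which makes the computation easier to test in isolation.

```python
import numpy as np
import pandas as pd

def ret_vol_sr(weights, log_ret, periods=252):
    """Annualized return, volatility and Sharpe ratio for the given weights."""
    weights = np.asarray(weights, dtype=float)
    ret = float(np.sum(log_ret.mean() * weights) * periods)
    vol = float(np.sqrt(weights @ (log_ret.cov() * periods).values @ weights))
    return np.array([ret, vol, ret / vol])

def neg_sharpe_explicit(weights, log_ret):
    """Objective to minimize: the negative Sharpe ratio."""
    return -ret_vol_sr(weights, log_ret)[2]

# toy daily log returns for two made-up assets
toy = pd.DataFrame({'A': [0.01, 0.02, -0.01, 0.03],
                    'B': [0.00, 0.01, 0.02, -0.01]})
ret, vol, sr = ret_vol_sr([0.5, 0.5], toy)
print(ret, vol, sr)
```

With this signature the data can be handed to the optimizer through the `args` parameter of `scipy.optimize.minimize`, for example `minimize(neg_sharpe_explicit, init_guess, args=(log_ret,), ...)`.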

In [None]:
# constraint function: the weights must sum to 1
def check_sum(weights):
    return np.sum(weights) - 1.

In [None]:
# constraint function: the largest weight must not exceed max_weight
def check_max_weight(weights):
    global max_weight
    return np.minimum(weights.max(), max_weight) - weights.max()

In [None]:
# constraint function: combines the sum and max-weight checks into one value that is zero only when both hold
def check_weights(weights):
    global max_weight
    w1 = np.sum(weights) - 1.
    w2 = np.minimum(weights.max(), max_weight) - weights.max()
    return np.abs(w1) + np.abs(w2)

In [None]:
# constraint tuple
#cons = ({'type' : 'eq', 'fun' : check_sum})
#cons = ({'type' : 'eq', 'fun' : check_sum}, {'type' : 'eq', 'fun' : check_max_weight}) # did not work
cons = ({'type' : 'eq', 'fun' : check_weights}) # using this workaround instead
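
The comment above notes that adding the maximum-weight check as a second equality constraint did not work. One alternative, sketched here as an assumption rather than as the notebook's method, is to express the cap through the `bounds` argument and keep only the sum-to-one equality constraint. The returns are synthetic and the six assets are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

# synthetic daily log returns for 6 hypothetical assets (illustration only)
rng = np.random.default_rng(0)
log_ret = rng.normal(loc=0.0005, scale=0.01, size=(250, 6))
mean = log_ret.mean(axis=0) * 252           # annualized mean returns
cov = np.cov(log_ret, rowvar=False) * 252   # annualized covariance matrix

def neg_sharpe(w):
    # negative Sharpe ratio with a zero risk-free rate
    return -(w @ mean) / np.sqrt(w @ cov @ w)

max_weight = 0.25
n = len(mean)
bounds = [(0.0, max_weight)] * n                          # cap every weight
cons = ({'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0},)
init = np.ones(n) / n

res = minimize(neg_sharpe, init, method='SLSQP', bounds=bounds, constraints=cons)
w_opt = res.x
print(np.round(w_opt, 3), round(w_opt.sum(), 6))
```

Box bounds keep the problem smooth, whereas a constraint built from `weights.max()` and `np.abs` is not differentiable everywhere, which may be part of why SLSQP struggled with it.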

In [None]:
n_stocks = df_stocks.shape[1]

In [None]:
bounds = tuple([(0,1) for i in range(n_stocks)])

In [None]:
init_guess = np.ones(n_stocks) / n_stocks

## Setting the prediction parameters

In [None]:
# the start date of the first prediction (year, month, day)
day_start = datetime.datetime(2019,1,1).date()

# total number of months to run the prediction
n_months_run = 28

# training months before current prediction
n_months_train = 12

# portfolio weights (before re-balancing)
max_weight = 0.25  # used in the constraint function
min_weight = 0.02  # used in the running prediction
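
To make the rolling windows implied by these parameters concrete, the following sketch lists the training and validation date ranges for the first few months. The generator name `monthly_windows` is hypothetical; the date arithmetic uses `dateutil.relativedelta`, as the notebook does.

```python
import datetime
from dateutil.relativedelta import relativedelta

def monthly_windows(day_start, n_months_run, n_months_train):
    """Yield (train_start, train_end, valid_start, valid_end) for each month."""
    one_day = relativedelta(days=1)
    for i in range(n_months_run):
        valid_start = day_start + relativedelta(months=i)
        valid_end = valid_start + relativedelta(months=1) - one_day
        train_start = valid_start - relativedelta(months=n_months_train)
        train_end = valid_start - one_day
        yield train_start, train_end, valid_start, valid_end

# first three windows for a start date of 2019-01-01 and 12 training months
windows = list(monthly_windows(datetime.date(2019, 1, 1), 3, 12))
for w in windows:
    print(*w)
```

Each training window ends the day before its validation month begins, so the optimizer never sees the data it is evaluated on.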

## Running the monthly prediction

In [None]:
delta_month = dateutil.relativedelta.relativedelta(months=+1)
delta_day = dateutil.relativedelta.relativedelta(days=+1)

valid_start = day_start
valid_end = valid_start + delta_month - delta_day

train_start = valid_start - n_months_train*delta_month
train_end = valid_start - delta_day

time = []
p = []
b1 = []
b2 = []
b3 = []


# rolling monthly loop: optimize on the training window, then evaluate on the following month
for i in range(n_months_run):
    
    # dataframes
    df_train = df_stocks.truncate(before=train_start, after=train_end)
    df_valid = df_stocks.truncate(before=valid_start, after=valid_end)
    df_valid_bench = df_bench.truncate(before=valid_start, after=valid_end)
    
    # calculating log returns of the training data
    log_ret = np.log( df_train.divide(df_train.shift(1, axis=0), axis=0) ).iloc[2:]
    # notice that log_ret is used by the function `get_ret_vol_sr` and, consequently,
    # the `neg_sharpe` function    
      
    
    # calculating optimized weights
    opt_results = minimize(neg_sharpe, init_guess, method='SLSQP', bounds=bounds, constraints=cons)
    
    weights = opt_results.x
    
    
    # Weight Re-balancing
    idx = np.where(opt_results.x>=min_weight)[0]
    weights = weights[idx]
    weights /= weights.sum()
    
    labels = log_ret.columns[idx]
    
    # using the portfolio weights on the validation data
    df1 = df_valid[labels]
    df1 = df1/df1.iloc[0] # each stock's price relative to the first day of the month
    df2 = (df1 * weights).sum(axis=1)
    df2 = df2/df2.iloc[0] # growth factor of the portfolio (1.0 = no change)
    
    # growth factor of the benchmarks
    df2b = df_valid_bench/df_valid_bench.iloc[0]
    
    time.append(valid_start.strftime('%Y/%m'))
    p.append(df2.iloc[-1])
    b1.append(df2b['BVSP'].iloc[-1])
    b2.append(df2b['USDBRL'].iloc[-1])
    # simple average of all stocks (df1 only holds the stocks selected for the portfolio)
    b3.append((df_valid/df_valid.iloc[0]).mean(axis=1).iloc[-1])
    
    print('\nStart: {}, Portfolio: {:.2f}, iBovespa: {:.2f}, Dolar: {:.2f}, Avg. : {:.2f}'.format(time[-1], p[-1],
                                                                                                 b1[-1], b2[-1], b3[-1]))
    
    for l,w in zip(labels, weights):
        print('  > {} : {:.2f}'.format(l, w))

    
    # time update for the next loop
    valid_start += delta_month
    valid_end  = valid_start + delta_month - delta_day
    
    train_start += delta_month
    train_end = valid_start - delta_day
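
The weight re-balancing step inside the loop can be isolated as a small function. This is a minimal sketch with made-up weights and labels, and the function name `rebalance` is hypothetical: stocks below `min_weight` are dropped and the remaining weights are rescaled to sum to 1.

```python
import numpy as np

def rebalance(weights, labels, min_weight=0.02):
    """Drop weights below min_weight and renormalize the rest to sum to 1."""
    weights = np.asarray(weights, dtype=float)
    keep = weights >= min_weight
    kept = weights[keep] / weights[keep].sum()
    kept_labels = [label for label, k in zip(labels, keep) if k]
    return kept_labels, kept

# made-up weights: 'D' falls below the 2% threshold and is removed
labels, w = rebalance([0.50, 0.30, 0.19, 0.01], ['A', 'B', 'C', 'D'])
print(labels, np.round(w, 4))
```

Since the 25% cap is enforced before re-balancing, the renormalized weights can end up slightly above `max_weight`.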

## Presenting the results

In [None]:
d = {'Date' : pd.to_datetime(time),
    'Portfolio' : p,
    'iBovespa' : b1,
    'Dolar' : b2,
    'Avg. BVSP' : b3}
df_results = pd.DataFrame(data=d)
df_results.set_index('Date', inplace=True)

In [None]:
print('Average - Monthly returns:')
df_results.mean(axis=0)

In [None]:
print('std - Monthly returns:')
df_results.std(axis=0)

In [None]:
ax = df_results.plot(style='-o')
ax.axhline(y=1.0, color='gray', linestyle='--', lw=0.5)
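
Each value in `df_results` is a growth factor over a single month (1.0 means no change), so the plot shows month-by-month results rather than accumulated performance. A possible way to view the compounded result, sketched here with made-up numbers rather than the notebook's output, is a cumulative product over the months.

```python
import pandas as pd

# made-up monthly growth factors, in the same layout as df_results
monthly = pd.DataFrame(
    {'Portfolio': [1.10, 0.90, 1.05],
     'iBovespa': [1.02, 1.00, 0.98]},
    index=pd.to_datetime(['2019/01', '2019/02', '2019/03'], format='%Y/%m'))

# compounding the monthly factors gives the value of 1 unit invested at the start
cumulative = monthly.cumprod()
print(cumulative)
```

The last row of `cumulative` then gives the total growth over the whole period for each column.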