# Ingredient demand forecasting for a fast-food chain

### In this notebook I forecast the demand for the ingredients a fast-food chain needs to operate. Using statistical modeling, machine learning, and time-series techniques, I forecast demand for every ingredient, automate an experiment that fits several models suited to the data, and save them to MLflow, which provides an API for managing the trained models and checking the evaluation metrics.

### I will follow the CRISP-DM methodology.

# Step 1: Business understanding

### Demand forecasting matters in any business. In fast food it is essential for avoiding waste while still meeting customer demand, which lowers costs and raises profit. According to a Mordor Intelligence report, about 48% of the Brazilian population eats fast food at least once a week¹, and Brazil is one of the largest fast-food consumers in the world.

### We have sales data from some restaurants of a fast-food chain², with details of the purchases made at each restaurant, plus information on the ingredients used in each recipe. Forecasting the demand for each ingredient would help the chain plan better. The chain's team also has some questions about the data:

### - Which dish is ordered most often?
### - Which ingredient is used most often?
### - Which ingredients are used least often?
### - Is there a difference in sales between restaurants? Which one sells the most?

### We were also told that the chain currently uses a moving average (with lag = 3) to forecast the demand for each ingredient, so it would be valuable to fit other models that perform better.
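As a reference point for the baseline mentioned above, here is a minimal sketch of a moving-average forecast. It assumes that "lag = 3" means averaging the three previous periods; the function name `moving_average_forecast` and the toy series are illustrative and are not part of the chain's actual implementation.

```python
import pandas as pd


def moving_average_forecast(series: pd.Series, window: int = 3) -> pd.Series:
    """Forecast each period as the mean of the previous `window` observations."""
    # shift(1) makes the forecast for period t use only data up to t-1
    return series.shift(1).rolling(window=window).mean()


# Toy demand series (illustrative values only)
demand = pd.Series([10.0, 12.0, 14.0, 16.0, 18.0])
forecast = moving_average_forecast(demand)
print(forecast.tolist())
```

The first `window` forecasts are undefined (NaN) because there is not enough history yet; any candidate model should be compared with this baseline on the same periods.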

¹ - https://www.mordorintelligence.com/industry-reports/brazil-foodservice-market \
² - https://www.kaggle.com/datasets/rishitsaraf/fast-food-restaurant-chain?select=sub_recipes.csv


In [1]:
# pandas DataFrames and SQL queries over them
import pandas as pd
import pandasql as ps
from pandasql import sqldf

# SQL queries against the database
import sqlalchemy as sql
from sqlalchemy import create_engine
from sqlalchemy import text

# visualization
import pygwalker as pyg

# hypothesis tests
from statsmodels.stats.diagnostic import acorr_ljungbox
from statsmodels.tsa.stattools import adfuller
import pymannkendall as mk

# model fitting
import numpy as np
import mlflow
import mlflow.sklearn
import mlflow.xgboost
from sklearn.model_selection import TimeSeriesSplit
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.statespace.structural import UnobservedComponents
from sklearn.metrics import mean_squared_error


# Planning the queries

## Let's start by answering the business questions and then move on to the demand forecast. To answer the question:
## - Which dish is ordered most often?
### Since the data vary over time, it is better to answer this question by date, which gives the most-ordered recipe for each time interval.

#### From the data description, the menuitem table holds every order from every restaurant. Its Id and PLU columns link to MenuItemId and PLU in menu_items, and the RecipeId column of menu_items links to the recipes table, which holds the recipe name. Summing the Quantity column then gives the count per dish.
#### In steps:
#### - Select Id and PLU from menuitem, which are the keys into menu_items
#### - Select the Quantity column, which gives how many units of the dish were ordered, so the dishes are counted correctly
#### - Select RecipeId from menu_items, joining Id and PLU in menuitem to MenuItemId and PLU in menu_items
#### - Join the result to the recipes table on RecipeId.
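The join steps listed above can also be sketched in pandas. This is a toy illustration only: the three small DataFrames stand in for the database tables, their values are made up, and the notebook itself runs the join in SQL.

```python
import pandas as pd

# Toy stand-ins for the three tables (illustrative values only)
menuitem = pd.DataFrame({
    "date": ["2005-03-15", "2005-03-15"],
    "Id": [326, 280],
    "PLU": [111000041, 212000126],
    "Quantity": [1, 2],
})
menu_items = pd.DataFrame({
    "MenuItemId": [326, 280],
    "PLU": [111000041, 212000126],
    "RecipeId": [358, 305],
})
recipes = pd.DataFrame({
    "RecipeId": [358, 305],
    "RecipeName": ["FtL/ChxStr", "21oz21CDrk"],
})

# orders -> menu items (on Id/MenuItemId and PLU) -> recipes (on RecipeId)
orders = (
    menuitem
    .merge(menu_items, left_on=["Id", "PLU"], right_on=["MenuItemId", "PLU"], how="inner")
    .merge(recipes, on="RecipeId", how="inner")
)

# total units sold per dish
sold = orders.groupby("RecipeName")["Quantity"].sum()
print(sold)
```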

In [4]:
engine = create_engine("mysql+mysqlconnector://root:SENHA_SERVIDOR_SQL@PORTA_SERVIDOR/NOME_DATABASE")

# Test the connection
connection = engine.connect()

In [5]:
query = text('''SELECT mi.date, mi.Id, mi.PLU, mi.Quantity, m_i.MenuItemId,
         m_i.RecipeId, recipes.RecipeName, recipes.RecipeDescription
        FROM menuitem mi 
        INNER JOIN menu_items m_i ON mi.Id = m_i.MenuItemId 
        AND mi.PLU = m_i.PLU
        INNER JOIN recipes ON m_i.RecipeId = recipes.RecipeId
        ''')

table_menuitems = pd.read_sql( query, con=engine )

In [6]:
table_menuitems.head()

Unnamed: 0,date,Id,PLU,Quantity,MenuItemId,RecipeId,RecipeName,RecipeDescription
0,2005-03-15,326,111000041,1,326.0,358.0,FtL/ChxStr,Chickn Strips FtLong
1,2005-03-15,280,212000126,1,280.0,305.0,21oz21CDrk,21oz Carbonated Fountain 21Fnt
2,2005-03-15,8,121000008,1,8.0,47.0,Six/B.M.T.,B.M.T. 6 inch
3,2005-03-15,91,121000001,1,91.0,22.0,Six/Veggie,Veggie Delite 6 inch
4,2005-03-15,564,175000062,1,564.0,579.0,ad6/Avocad,Avocado Add6in


In [7]:
table_menuitems.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 182862 entries, 0 to 182861
Data columns (total 8 columns):
 #   Column             Non-Null Count   Dtype  
---  ------             --------------   -----  
 0   date               182862 non-null  object 
 1   Id                 182862 non-null  int64  
 2   PLU                182862 non-null  int64  
 3   Quantity           182862 non-null  int64  
 4   MenuItemId         182862 non-null  float64
 5   RecipeId           182862 non-null  float64
 6   RecipeName         182862 non-null  object 
 7   RecipeDescription  182862 non-null  object 
dtypes: float64(2), int64(3), object(3)
memory usage: 11.2+ MB


In [8]:
table_menuitems['RecipeId'] = table_menuitems['RecipeId'].astype(str)

### Now that we have the data, we need to group it by dish and date and sum the quantity of each item to get the total sold.

In [9]:
total_pratos_unicos = table_menuitems['RecipeName'].nunique()
print('There are', total_pratos_unicos, 'unique dishes')

There are 267 unique dishes


### With this many unique dishes, plotting all of them would clutter the chart, so we look only at the 3 best-selling dishes for each date.
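For readers who prefer pandas to SQL, the "top 3 per date" selection can be sketched as below. The `sales` frame and its values are hypothetical; the `seqnum` column mirrors the role of `row_number()` in a window query.

```python
import pandas as pd

# Hypothetical sales rows: two dates, four dishes each
sales = pd.DataFrame({
    "date": ["d1"] * 4 + ["d2"] * 4,
    "RecipeName": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Quantity": [5, 9, 1, 7, 2, 8, 6, 4],
})

# total quantity per date and dish
totals = sales.groupby(["date", "RecipeName"], as_index=False)["Quantity"].sum()

# rank dishes within each date, best seller first (ties broken by row order)
totals["seqnum"] = (
    totals.groupby("date")["Quantity"]
    .rank(method="first", ascending=False)
    .astype(int)
)

top3 = totals[totals["seqnum"] <= 3].sort_values(["date", "seqnum"])
print(top3)
```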

In [10]:
# use SQL to query the pandas DataFrame:
# select the 3 best-selling items for each day.

query = '''
SELECT ranked.* 
FROM ( SELECT SUM(Quantity), date, RecipeName, 
row_number() over (partition by date order by SUM(Quantity) desc) as seqnum
FROM table_menuitems 
GROUP BY date, RecipeName 
) ranked
WHERE seqnum <= 3 
'''

mais_vendidos = sqldf(query, locals())

In [11]:
mais_vendidos.sort_values(by='date', ascending=False).head()

Unnamed: 0,SUM(Quantity),date,RecipeName,seqnum
308,92,2029-05-15,Btl/BtCDrk,3
307,206,2029-05-15,21oz21CDrk,2
306,308,2029-05-15,Chp/Chips,1
305,184,2029-04-15,Btl/BtCDrk,3
304,246,2029-04-15,21oz21CDrk,2


In [12]:
mais_vendidos['RecipeName'].value_counts()

RecipeName
Chp/Chips             103
21oz21CDrk            102
Coo/Cookie             74
Btl/BtCDrk             29
FtL/Turkey              1
Name: count, dtype: int64

In [13]:
mais_vendidos.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 309 entries, 0 to 308
Data columns (total 4 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   SUM(Quantity)  309 non-null    int64 
 1   date           309 non-null    object
 2   RecipeName     309 non-null    object
 3   seqnum         309 non-null    int64 
dtypes: int64(2), object(2)
memory usage: 9.8+ KB


In [14]:
pyg.walk(mais_vendidos)


I saved the image generated with PyGWalker for readers who are not running the notebook.

<img src="EDA\\line_plot_quanitty_date_recipe.png" />

### The chart shows no noticeable upward or downward trend.

#### The red line, Chp/Chips (Chips Chips), is the best-selling item on every date. It is followed by 21oz21CDrk (21oz Carbonated Fountain 21Fnt), which stays in second place but in some years competes with Btl/BtCDrk (Bottled Carbonated Drink BtlDrk) and Coo/Cookie (Cookie Cookie). FtL/Turkey does not appear in the chart because it ranked in the top 3 only once.

# Which ingredient is used most often?
### We have a table with the units of measure, the recipes table with the recipe names, and another table with the ingredients of each recipe and their quantities. Let's find the three most-used ingredients for each date.

#### - Get the frequency of each RecipeId for each date.
#### - Store the unique RecipeId values in a table together with their frequency for each date.
#### - In the recipe_ingredient_assignments table, each row has a column with the recipe id and another with the ingredient id. The catch is that the recipe id repeats across rows, because a recipe has more than one ingredient, as the image shows.

<img src="EDA/ingrediente_receita.png" />

#### To handle this, we count the frequency of each recipe per date, multiply that frequency by the quantity of each ingredient in the recipe according to recipe_ingredient_assignments, and then group by ingredient, summing the totals.

#### We assume that the most frequently used product is the one that appears the most times, regardless of its unit of measure: a product used 10 times for a total of only 10 g does not count for less than one used twice for a total of 20 g. I consider only frequency because the units are a problem: some measure weight and others volume, and we have no conversion table to bring every product to a common unit. Depending on the use case, having the volume of every product would be useful for knowing how much storage space each one needs in stock.


In [15]:
query ='''SELECT date, RecipeId, SUM(Quantity) AS freq
    FROM table_menuitems
    GROUP BY date, RecipeId
    '''

qtd_p_data = sqldf(query, locals())

In [16]:
qtd_p_data.head()

Unnamed: 0,date,RecipeId,freq
0,1930-03-15,10.0,46
1,1930-03-15,102.0,22
2,1930-03-15,1021.0,16
3,1930-03-15,104.0,4
4,1930-03-15,105.0,2


In [17]:
len(qtd_p_data['date'].unique())

103

In [18]:
query = text('''SELECT *
FROM recipe_ingredient_assignments ''')

recipe_ingredient_assignments = pd.read_sql( query, con=engine )
qtd_p_data_final = pd.DataFrame(columns=['data','IngredientId', 'total'])

Below is the process for computing the frequency of each ingredient with SQL commands. It is very slow, so I rewrote it with native pandas functions, which are faster in this case.

In [19]:
# for date in table_menuitems['date'].unique():

#     # get the frequency of each recipe for this date
#     query = f''' SELECT * FROM qtd_p_data WHERE date = '{date}' ''' 
#     data_qtd_pedido = sqldf(query, locals())

#     # build the table of total ingredients for this date
#     # (the ON clause was missing, which turned the join into a cross join)
#     query = ''' SELECT ing.RecipeId, ing.IngredientId, ing.Quantity * data_qtd.freq AS qtd_total
#     FROM data_qtd_pedido AS data_qtd
#     INNER JOIN recipe_ingredient_assignments AS ing
#     ON ing.RecipeId = CAST(data_qtd.RecipeId AS REAL)
#     '''
#     tmp_table = sqldf(query)
    
#     # get the three most frequent ingredients
#     query = ''' SELECT * FROM ( SELECT IngredientId, SUM(qtd_total) AS total,
#     ROW_NUMBER() OVER ( ORDER BY SUM(qtd_total) DESC ) AS row_number
#     FROM tmp_table 
#     GROUP BY IngredientId
#     ORDER BY SUM(qtd_total) DESC )
#     WHERE row_number <= 3'''
#     top3 = sqldf(query)
#     print(tmp_table)
    


### With the SQL commands, execution is very slow.
### Doing the same with native pandas functions is faster.
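As a further illustration of the pandas route, the per-date loop can in principle be replaced by a single merge followed by a grouped aggregation. The sketch below uses two tiny hypothetical frames (`recipe_freq`, `assignments`) with the same column roles as the notebook's tables; it is not the notebook's own implementation.

```python
import pandas as pd

# Hypothetical recipe frequencies per date
recipe_freq = pd.DataFrame({
    "date": ["d1", "d1", "d2"],
    "RecipeId": [1.0, 2.0, 1.0],
    "freq": [3, 2, 4],
})
# Hypothetical recipe -> ingredient assignments with quantity per recipe
assignments = pd.DataFrame({
    "RecipeId": [1.0, 1.0, 2.0],
    "IngredientId": [10, 20, 10],
    "Quantity": [2.0, 1.0, 5.0],
})

# one merge for all dates, then recipe frequency x ingredient quantity
merged = recipe_freq.merge(assignments, on="RecipeId", how="inner")
merged["total"] = merged["freq"] * merged["Quantity"]

# total usage per date and ingredient
usage = merged.groupby(["date", "IngredientId"], as_index=False)["total"].sum()

# keep the three most-used ingredients of each date
top = (
    usage.sort_values(["date", "total"], ascending=[True, False])
    .groupby("date")
    .head(3)
)
print(top)
```

Sorting with `ascending=[True, True]` instead would give the least-used ingredients per date.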

In [20]:
recipe_ingredient_assignments['RecipeId'] = recipe_ingredient_assignments['RecipeId'].astype(float)

In [21]:
top_por_data = []

for date in table_menuitems['date'].unique():

    # recipe frequencies for this date (copy avoids SettingWithCopyWarning)
    atual = qtd_p_data.loc[qtd_p_data['date']==f'{date}'].copy()
    atual['RecipeId'] = atual['RecipeId'].astype(float)
    # merge the current table with the recipe-ingredient table
    merge_table = pd.merge( atual, recipe_ingredient_assignments, on='RecipeId', how='inner' )
    merge_table['total'] = merge_table['freq'] * merge_table['Quantity']
    
    top_three = merge_table.groupby('IngredientId').agg({'total':'sum'})
    top_three = top_three.sort_values(by='total', ascending=False).reset_index()
    top_three = top_three.head(3)
    top_three['date'] = date

    top_por_data.append(top_three)

# concatenate once at the end instead of growing an empty DataFrame
qtd_p_data_final = pd.concat(top_por_data, axis=0)[['date', 'IngredientId', 'total']]

In [22]:
qtd_p_data_final.sort_values(by='date')

Unnamed: 0,date,IngredientId,total
1,1930-03-15,13.0,604.0
2,1930-03-15,3.0,458.0
0,1930-03-15,110.0,664.0
1,1930-04-15,3.0,560.0
2,1930-04-15,92.0,490.0
...,...,...,...
0,2029-04-15,92.0,658.0
1,2029-04-15,3.0,652.0
2,2029-05-15,92.0,438.0
1,2029-05-15,110.0,554.0


In [23]:
qtd_p_data_final.info()

<class 'pandas.core.frame.DataFrame'>
Index: 309 entries, 0 to 2
Data columns (total 3 columns):
 #   Column        Non-Null Count  Dtype  
---  ------        --------------  -----  
 0   date          309 non-null    object 
 1   IngredientId  309 non-null    float64
 2   total         309 non-null    float64
dtypes: float64(2), object(1)
memory usage: 9.7+ KB


In [24]:
qtd_p_data_final['IngredientId'] = qtd_p_data_final['IngredientId'].astype(object)

In [25]:
pyg.walk(qtd_p_data_final)


Counterintuitively, a scatter plot is the better choice here, because the series has missing dates.

In [26]:
import plotly.express as px

fig = px.scatter(qtd_p_data_final, x='date', y='total', color='IngredientId')
fig.show()

### Contrary to expectations, the most-used ingredients vary considerably over time. Because the top-selling recipes vary little, we would expect the ingredients to be stable as well, since a given dish should in theory always use the same ingredients. One possible explanation is that some ingredients are interchangeable and were swapped for cost reasons, or that ingredients changed over time to suit customers' tastes.

### A scatter plot works better here than a line plot: some dates are missing, and on some dates the 3 most-used ingredients differ, so a line plot would show large jumps between points. Since the number of ingredients is small, we can look up each one in the ingredient description table. The y-axis shows the frequency of each ingredient.

In [27]:
ings = qtd_p_data_final['IngredientId'].unique()
ings = pd.DataFrame({'IngredientId':ings})

In [28]:
query = text('''SELECT *
FROM ingredients ''')

ingredients = pd.read_sql( query, con=engine )

query = ''' SELECT ing.IngredientId, ing.IngredientName, ing.IngredientShortDescription
FROM ings 
LEFT JOIN ingredients ing ON ing.ingredientId = ings.IngredientId
'''
sqldf(query)

Unnamed: 0,IngredientId,IngredientName,IngredientShortDescription
0,3,Chicken Strips ...,Chicken Strips
1,110,Fountain Beverage syrup ...,Fountain Beverage syrup
2,13,Turkey ...,Turkey
3,92,Marinara Sauce ...,Marinara Sauce
4,22,Chips ...,Chips
5,4,Chicken. oven roasted patty ...,"Chicken, single piece"
6,7,Ham ...,Ham
7,82,"Pastrami, beef ...","Pastrami, beef"
8,15,Cookies ...,Cookies


### These are the most-used ingredients.

# Answering the third question: which ingredients are used least often?
### We can reuse the code from the previous question to get this information.

In [29]:
top_por_data = []

for date in table_menuitems['date'].unique():

    # recipe frequencies for this date (copy avoids SettingWithCopyWarning)
    atual = qtd_p_data.loc[qtd_p_data['date']==f'{date}'].copy()
    atual['RecipeId'] = atual['RecipeId'].astype(float)
    # merge the current table with the recipe-ingredient table
    merge_table = pd.merge( atual, recipe_ingredient_assignments, on='RecipeId', how='inner' )
    merge_table['total'] = merge_table['freq'] * merge_table['Quantity']
    
    top_three = merge_table.groupby('IngredientId').agg({'total':'sum'})
    top_three = top_three.sort_values(by='total').reset_index() # only this line changed: ascending order
    top_three = top_three.head(3)
    top_three['date'] = date

    top_por_data.append(top_three)

# concatenate once at the end instead of growing an empty DataFrame
qtd_p_data_final = pd.concat(top_por_data, axis=0)[['date', 'IngredientId', 'total']]

In [30]:
qtd_p_data_final['IngredientId'] = qtd_p_data_final['IngredientId'].astype(object)

In [31]:
import plotly.express as px

fig = px.scatter(qtd_p_data_final, x='date', y='total', color='IngredientId')
fig.show()

### The least-used ingredients vary more: as we can see, the variety of ingredients among the least used is much larger.

In [32]:
query = text('''SELECT *
FROM ingredients ''')

ingredients = pd.read_sql( query, con=engine )
ings = qtd_p_data_final['IngredientId'].unique()
ings = pd.DataFrame({'IngredientId':ings})

query = ''' SELECT ing.IngredientId, ing.IngredientName, ing.IngredientShortDescription 
FROM ings 
LEFT JOIN ingredients AS ing ON ing.ingredientId = ings.IngredientId
'''
sqldf(query)

Unnamed: 0,IngredientId,IngredientName,IngredientShortDescription
0,25,Green Peppers ...,Green Peppers
1,17,Olives ...,Olives
2,172,Coffee ...,"Coffee, filter style gro"
3,528,House Sandwich Sauce ...,House Sandwich Sauce
4,163,"Beverage, Juice Box ...",Juice Box
5,44,"Milk, bottle ...","Milk, bottle"
6,274,Sweetener packets ...,Sweetener packets
7,30,"Cheese, Shredded Monterey Cheddar ...","Cheese, Shredded Montere"
8,160,"Cup, Cold 30 oz. ...","Cup, Cold 30 oz."
9,68,"Cup, Hot Cup, 16 oz. ...","Cup, Hot Cup, 16 oz"


### The ingredients in the table above are the least used.

# Answering the fourth question: is there a difference in sales between restaurants? Which one sells the most?

### For this we repeat the process from question 1, but group by StoreNumber, the store's identification number, and sum Quantity, the number of units of each recipe sold at that restaurant.

In [33]:
query = "SELECT * FROM menuitem"
menuitems = pd.read_sql(query, con=engine)

query = ''' 
SELECT StoreNumber, SUM(Quantity)
FROM menuitems
GROUP BY StoreNumber
'''

store_sales = sqldf(query)

In [34]:
store_sales

Unnamed: 0,StoreNumber,SUM(Quantity)
0,0,
1,1,
2,4904,32406.0
3,12631,29156.0
4,20974,19957.0
5,46673,18333.0


In [35]:
pyg.walk(store_sales)


<img src="EDA\\store_number_quantity.png" />

### Total order quantity differs between stores, which is important to take into account when modeling.

# Modeling ingredient demand

### We will forecast every ingredient. Since this is a purely predictive task, where only the quality of the forecast matters, I will select models suited to it, preprocess the data, fit all of them, and pick the model with the best performance.
### We define the best performance as the smallest error relative to the actual value, with a positive error, that is, a model that estimates slightly above the actual values. A small error limits waste, and a slight overestimate avoids losing sales to ingredient shortages.
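One way to make this selection criterion concrete is to report the error size together with the sign of the average error. The sketch below is an assumption about how that could be measured, not the notebook's evaluation code: `rmse_and_bias` is a hypothetical helper, and the bias is defined as forecast minus actual, so a positive value means over-forecasting.

```python
import numpy as np


def rmse_and_bias(y_true, y_pred):
    """Return (RMSE, mean signed error), with error = forecast - actual."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    errors = y_pred - y_true  # positive = forecast above actual demand
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    bias = float(np.mean(errors))
    return rmse, bias


# Toy example: every forecast is slightly above the actual value
rmse, bias = rmse_and_bias([10, 20, 30], [12, 21, 33])
print(rmse, bias)
```

Under the criterion stated above, a candidate model would be preferred when its RMSE is low and its bias is small but positive.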

### This requires the frequency of each ingredient per date, so we need a table with date and frequency, built separately for each store because demand differs between stores. With one table per store and per ingredient, the total number of tables equals the number of ingredients times the number of stores.
### We could test the effect of some covariates on the forecast, for example turning each RecipeId into a column holding its frequency. But that tends to cause problems: we would also have to forecast the frequency of each RecipeId (the dishes), and, like any forecast, it would carry errors that could worsen the model's performance on the ingredients.

# Idea: store the data in the following format:
##  Create a dictionary {'StoreNumber' : { 'ingredient': { 'date' : [dates], 'frequency': [] } } }

In [36]:
query = ''' SELECT m_i.date, m_i.StoreNumber, m_i.Id, m_i.PLU, m_i.Quantity FROM menuitem AS m_i '''
pt1 = pd.read_sql(query, con=engine)

In [37]:
pt1.head()

Unnamed: 0,date,StoreNumber,Id,PLU,Quantity
0,2005-03-15,46673,326.0,111000041.0,1.0
1,2005-03-15,46673,280.0,212000126.0,1.0
2,2005-03-15,12631,8.0,121000008.0,1.0
3,2005-03-15,12631,91.0,121000001.0,1.0
4,2005-03-15,12631,564.0,175000062.0,1.0


In [38]:
query = ''' SELECT * FROM menu_items '''
menu_items = pd.read_sql(query, con=engine)
menu_items.head()

Unnamed: 0,MenuItemName,MenuItemDescription,PLU,MenuItemId,RecipeId
0,FtL/Ham,Ham FtLong,111000004.0,1.0,6.0
1,Six/Ham,Ham 6 inch,121000004.0,2.0,7.0
2,FfB/Ham,Ham FtFbd,112000004.0,3.0,2.0
3,fBd/Ham,Ham FlatBd,122000004.0,4.0,8.0
4,Sld/Ham,Ham Salad,131000004.0,5.0,9.0


In [39]:
query = ''' 
SELECT pt1.date, pt1.StoreNumber, mi.RecipeId, pt1.Quantity
FROM pt1 
INNER JOIN menu_items mi 
ON pt1.Id = mi.MenuItemId
AND pt1.PLU = mi.PLU
'''
pt2 = sqldf(query)
pt2.head()

Unnamed: 0,date,StoreNumber,RecipeId,Quantity
0,2005-03-15,46673,358.0,1.0
1,2005-03-15,46673,358.0,1.0
2,2005-03-15,46673,305.0,1.0
3,2005-03-15,46673,305.0,1.0
4,2005-03-15,12631,47.0,1.0


### Because each RecipeId has more than one ingredient, we compute the total quantity of each RecipeId separately for each StoreNumber, multiply it by the quantity of each ingredient in the recipe, and then group by date, summing the total quantity of each ingredient.
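The steps just described can be sketched on toy data as follows. The frames `sales` and `assignments` and their values are hypothetical; the point of the sketch is the final aggregation, which leaves exactly one row per store, ingredient, and date, the shape a time-series model expects.

```python
import pandas as pd

# Hypothetical sales by store and recipe
sales = pd.DataFrame({
    "date": ["d1", "d1", "d1", "d2"],
    "StoreNumber": [1, 1, 2, 1],
    "RecipeId": [100, 200, 100, 100],
    "Quantity": [2, 1, 3, 4],
})
# Hypothetical quantity of each ingredient per recipe
assignments = pd.DataFrame({
    "RecipeId": [100, 100, 200],
    "IngredientId": [7, 13, 7],
    "Quantity": [1.0, 0.5, 2.0],
})

# attach ingredients to every sale; suffixes disambiguate the two Quantity columns
merged = sales.merge(assignments, on="RecipeId", suffixes=("_sold", "_per_recipe"))
merged["total"] = merged["Quantity_sold"] * merged["Quantity_per_recipe"]

# one row per store, ingredient and date
demand = merged.groupby(
    ["StoreNumber", "IngredientId", "date"], as_index=False
)["total"].sum()
print(demand)
```

If several recipes on the same date use the same ingredient, their contributions are summed into a single observation for that date.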

# Generating the data

In [40]:
query = 'select * from recipe_ingredient_assignments'
recipeid_ingredientid = pd.read_sql(query, con=engine)

In [41]:
pt2['StoreNumber'].unique()

array([46673, 12631, 20974,  4904], dtype=int64)

In [42]:
# For each store: sum Quantity by date and RecipeId, then use RecipeId as the
# join key to multiply each recipe's frequency by its ingredient quantities.
dados = {'46673': {}, '12631':{}, '20974':{},  '4904':{}}
for store in pt2['StoreNumber'].unique(): 

    # select the rows for this store
    tmp = pt2.loc[ pt2['StoreNumber'] == store ]
    
    # group by date and RecipeId, summing the quantity
    tmp = tmp.groupby(by=['date', 'RecipeId']).agg({'Quantity':'sum'}).reset_index()
    
    # attach the ingredients of each recipe
    merged = pd.merge( tmp, recipeid_ingredientid, on='RecipeId', how='inner' )

    # recipe frequency times the quantity of each ingredient
    merged['total'] = merged['Quantity_x'] * merged['Quantity_y']
    merged = merged[ ['date', 'IngredientId', 'total'] ]
    
    for ingredient in merged['IngredientId'].unique():

        dados_ingredient = merged.loc[merged['IngredientId']==ingredient]

        dados[str(store)][str(ingredient)] = dados_ingredient.to_dict()

In [43]:
dados

{'46673': {'7.0': {'date': {0: '1930-03-15',
    1: '1930-03-15',
    9: '1930-03-15',
    11: '1930-03-15',
    18: '1930-03-15',
    21: '1930-03-15',
    23: '1930-03-15',
    97: '1930-04-15',
    108: '1930-04-15',
    111: '1930-04-15',
    113: '1930-04-15',
    192: '1930-05-15',
    193: '1930-05-15',
    199: '1930-05-15',
    239: '1931-03-15',
    240: '1931-03-15',
    246: '1931-03-15',
    251: '1931-03-15',
    254: '1931-03-15',
    256: '1931-03-15',
    259: '1931-03-15',
    323: '1931-05-15',
    332: '1931-05-15',
    339: '1931-05-15',
    342: '1931-05-15',
    344: '1931-05-15',
    422: '2001-04-15',
    423: '2001-04-15',
    431: '2001-04-15',
    436: '2001-04-15',
    439: '2001-04-15',
    441: '2001-04-15',
    460: '2001-04-15',
    535: '2001-05-15',
    542: '2001-05-15',
    544: '2001-05-15',
    549: '2001-05-15',
    552: '2001-05-15',
    607: '2001-06-15',
    608: '2001-06-15',
    616: '2001-06-15',
    621: '2001-06-15',
    624: '2001-06-15'

### Now we just need to fit the models and automate choosing and comparing them.

### To choose the models better, I will check some characteristics of the time series, such as whether there is a linear relationship between past observations of the series and whether there is autocorrelation.

### Based on these characteristics we will choose models according to the assumptions each model makes.

### For time series, four main characteristics of the data can guide model choice: trend, seasonality (a pattern that repeats at regular intervals), stationarity (constant mean, variance, and autocovariance), and autocorrelation (correlation between observations over time), along with the type of autocorrelation (linear or nonlinear).

### Let's check some of these characteristics:

### To check for autocorrelation automatically for every ingredient, I use the Ljung-Box test, whose null hypothesis is that the data are independently distributed (no autocorrelation) up to a given lag. A lag is one step of the time unit in use: with yearly data, 1 lag is 1 year. The test tells us whether autocorrelation exists, but not its type. If it exists, we can plot the ACF, which is based on Pearson's linear correlation, to check whether the autocorrelation is linear.
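To make the test less of a black box, here is a minimal NumPy sketch of the Ljung-Box statistic, Q = n(n+2) Σ r_k² / (n−k), compared against the 5% critical value of a chi-square distribution with 1 degree of freedom (about 3.841) for lag 1. The function `ljung_box_q` and the two toy series are illustrative only; the notebook itself relies on `acorr_ljungbox` from statsmodels.

```python
import numpy as np


def ljung_box_q(x, lag=1):
    """Ljung-Box Q statistic using sample autocorrelations up to `lag`."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    q = 0.0
    for k in range(1, lag + 1):
        # sample autocorrelation at lag k
        r_k = np.sum(centered[k:] * centered[:-k]) / denom
        q += r_k ** 2 / (n - k)
    return n * (n + 2) * q


# 95th percentile of the chi-square distribution with 1 degree of freedom
CHI2_CRIT_1DOF_5PCT = 3.841

trending = np.arange(50, dtype=float)    # strongly autocorrelated
blocks = np.tile([1.0, 1.0, -1.0, -1.0], 12)  # lag-1 autocorrelation near zero

q_trending = ljung_box_q(trending)
q_blocks = ljung_box_q(blocks)
print(q_trending > CHI2_CRIT_1DOF_5PCT, q_blocks > CHI2_CRIT_1DOF_5PCT)
```

A statistic above the critical value rejects the null hypothesis of no autocorrelation. Note that a constant series makes the denominator zero, so the statistic is undefined for it.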

### To check stationarity we use the Augmented Dickey-Fuller (ADF) test, whose null hypothesis is that the series is non-stationary; rejecting the null indicates that the series is stationary.


In [45]:
# teste de autocorrelação
from statsmodels.stats.diagnostic import acorr_ljungbox
import numpy as np

lag = 1
result_test_autocorr = {'46673': {}, '12631':{}, '20974':{},  '4904':{}}
total_rej = 0
total_ing_loja = 0

for store, ingredients in dados.items():
    
    total_ing_loja = total_ing_loja + len(dados[store].keys())

    for ingredient, time_series in ingredients.items():
        result_test_autocorr[store][ingredient] = 0
        time_series = pd.DataFrame(time_series)

        if len(time_series['total']) != 1 or 0 :
            
            lb_test = acorr_ljungbox(time_series['total'], lags=[lag])
            
            if lb_test['lb_pvalue'].item() < 0.05 : 
                result_test_autocorr[store][ingredient] = result_test_autocorr[store][ingredient] + 1
                total_rej = total_rej + 1 
                print('rejeitou para',ingredient, store)
                
print('total de ingredientes que rejeitaram a hipótese de autocorrelação', total_rej)
print('total de ingredientes', total_ing_loja)

rejeitou para 13.0 46673
rejeitou para 6.0 46673
rejeitou para 9.0 46673
rejeitou para 33.0 46673
rejeitou para 35.0 46673
rejeitou para 47.0 46673
rejeitou para 140.0 46673
rejeitou para 110.0 46673
rejeitou para 147.0 46673
rejeitou para 13.0 12631
rejeitou para 11.0 12631
rejeitou para 4.0 12631
rejeitou para 6.0 12631
rejeitou para 9.0 12631
rejeitou para 2.0 12631
rejeitou para 3.0 12631
rejeitou para 39.0 12631
rejeitou para 8.0 12631
rejeitou para 92.0 12631
rejeitou para 30.0 12631
rejeitou para 35.0 12631
rejeitou para 36.0 12631
rejeitou para 91.0 12631
rejeitou para 140.0 12631
rejeitou para 110.0 12631
rejeitou para 147.0 12631
rejeitou para 47.0 12631
rejeitou para 37.0 12631
rejeitou para 7.0 20974
rejeitou para 11.0 20974
rejeitou para 4.0 20974
rejeitou para 6.0 20974
rejeitou para 9.0 20974
rejeitou para 2.0 20974
rejeitou para 3.0 20974
rejeitou para 39.0 20974
rejeitou para 8.0 20974
rejeitou para 92.0 20974
rejeitou para 30.0 20974
rejeitou para 35.0 20974
rejeitou para 15.0 20974
rejeitou para 140.0 20974
rejeitou para 21.0 20974
rejeitou para 110.0 20974
rejeitou para 147.0 20974
rejeitou para 47.0 20974
rejeitou para 79.0 20974
rejeitou para 7.0 4904
rejeitou para 13.0 4904
rejeitou para 4.0 4904
rejeitou para 6.0 4904
rejeitou para 3.0 4904
rejeitou para 8.0 4904
rejeitou para 92.0 4904
rejeitou para 35.0 4904
rejeitou para 5.0 4904
rejeitou para 47.0 4904
rejeitou para 140.0 4904
rejeitou para 97.0 4904
rejeitou para 100.0 4904
rejeitou para 110.0 4904
rejeitou para 147.0 4904
total de ingredientes que rejeitaram a hipótese de autocorrelação 62
total de ingredientes 276

### Most ingredients show no significant autocorrelation at lag 1: the test rejected independence for only 62 of the 276 store/ingredient series.

# Now let's repeat the same process to test stationarity

In [46]:
# stationarity test (Augmented Dickey-Fuller)
from statsmodels.tsa.stattools import adfuller

# 1 if H0 (non-stationary) was rejected for that store/ingredient, else 0
result_test_autocorr = {store: {} for store in dados}
total_rej = 0
total_ing_loja = 0

for store, ingredients in dados.items():

    total_ing_loja += len(ingredients)

    for ingredient, time_series in ingredients.items():
        result_test_autocorr[store][ingredient] = 0
        time_series = pd.DataFrame(time_series)

        # skip constant series and series with fewer than 5 observations
        if len(time_series['total'].unique()) != 1 and len(time_series['total']) >= 5:

            adf_test = adfuller(time_series['total'])

            # adf_test[1] is the p-value; < 0.05 rejects H0, so the series is taken as stationary
            if adf_test[1] < 0.05:
                result_test_autocorr[store][ingredient] += 1
                total_rej += 1
                print('rejeitou para', ingredient, store)

# note: H0 is non-stationarity, so total_rej counts the series found to be stationary
# (message text kept so it matches the recorded output)
print('total de ingredientes que rejeitaram a hipótese de estacionariedade', total_rej)
print('total de ingredientes', total_ing_loja)

rejeitou para 7.0 46673
rejeitou para 13.0 46673
rejeitou para 11.0 46673
rejeitou para 4.0 46673
rejeitou para 6.0 46673
rejeitou para 9.0 46673
rejeitou para 10.0 46673
rejeitou para 2.0 46673
rejeitou para 3.0 46673
rejeitou para 39.0 46673
rejeitou para 40.0 46673
rejeitou para 12.0 46673
rejeitou para 33.0 46673
rejeitou para 8.0 46673
rejeitou para 92.0 46673
rejeitou para 30.0 46673
rejeitou para 35.0 46673
rejeitou para 5.0 46673
rejeitou para 82.0 46673
rejeitou para 47.0 46673
rejeitou para 91.0 46673
rejeitou para 15.0 46673
rejeitou para 140.0 46673
rejeitou para 22.0 46673
rejeitou para 29.0 46673
rejeitou para 97.0 46673
rejeitou para 100.0 46673
rejeitou para 110.0 46673
rejeitou para 147.0 46673
rejeitou para 161.0 46673
rejeitou para 162.0 46673
rejeitou para 44.0 46673
rejeitou para 167.0 46673
rejeitou para 14.0 46673
rejeitou para 18.0 46673
rejeitou para 26.0 46673
rejeitou para 27.0 46673
rejeitou para 28.0 46673
rejeitou para 43.0 46673
rejeitou para 109.0 46673
rejeitou para 225.0 12631
rejeitou para 113.0 12631
rejeitou para 68.0 12631
rejeitou para 7.0 20974
rejeitou para 13.0 20974
rejeitou para 10.0 20974
rejeitou para 11.0 20974
rejeitou para 4.0 20974
rejeitou para 6.0 20974
rejeitou para 9.0 20974
rejeitou para 2.0 20974
rejeitou para 3.0 20974
rejeitou para 39.0 20974
rejeitou para 40.0 20974
rejeitou para 12.0 20974
rejeitou para 33.0 20974
rejeitou para 8.0 20974
rejeitou para 92.0 20974
rejeitou para 30.0 20974
rejeitou para 35.0 20974
rejeitou para 5.0 20974
rejeitou para 36.0 20974
rejeitou para 59.0 20974
rejeitou para 82.0 20974
rejeitou para 91.0 20974
rejeitou para 15.0 20974
rejeitou para 140.0 20974
rejeitou para 21.0 20974
rejeitou para 22.0 20974
rejeitou para 29.0 20974
rejeitou para 97.0 20974
rejeitou para 100.0 20974
rejeitou para 110.0 20974
rejeitou para 147.0 20974
rejeitou para 156.0 20974
rejeitou para 160.0 20974
rejeitou para 161.0 20974
rejeitou para 162.0 20974
rejeitou para 167.0 20974
rejeitou para 47.0 209

### For most ingredients the ADF test does not reject the null hypothesis, so their series are treated as non-stationary, that is, their mean and variance are not constant over time.

# Let's check whether the time series have trends

### The Mann-Kendall test checks whether the data show a monotonic upward or downward trend.
### Null hypothesis: there is no trend in the data.
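
### As an illustration of what the test computes, here is a simplified sketch of the Mann-Kendall statistic with the standard library only. It ignores the correction for tied values, so it is an assumption-laden approximation and not a replacement for `mk.original_test` from pymannkendall, which the notebook uses; the function name `mann_kendall` and the toy data are hypothetical.

```python
import math


def mann_kendall(series):
    """Return (S, z, p_value) for the Mann-Kendall trend test, without tie correction."""
    n = len(series)
    # S counts increasing pairs minus decreasing pairs over all i < j.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under H0 when there are no tied values.
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return s, z, p_value


# Hypothetical demand series that grows every period.
s_stat, z_stat, p_val = mann_kendall(list(range(10)))
print(f"S = {s_stat}, z = {z_stat:.3f}, reject H0 at 5%: {p_val < 0.05}")
```

### A positive S points to an upward trend and a negative S to a downward one; the null hypothesis of no trend is rejected only when the p-value is below the chosen significance level.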

In [50]:
# trend test (Mann-Kendall)
import pymannkendall as mk

# 1 if H0 (no trend) was rejected for that store/ingredient, else 0
result_test_autocorr = {store: {} for store in dados}
total_rej = 0
total_ing_loja = 0

for store, ingredients in dados.items():

    total_ing_loja += len(ingredients)

    for ingredient, time_series in ingredients.items():
        result_test_autocorr[store][ingredient] = 0
        time_series = pd.DataFrame(time_series)

        # skip constant series and series with fewer than 5 observations
        if len(time_series['total'].unique()) != 1 and len(time_series['total']) >= 5:

            result = mk.original_test(time_series['total'])

            # p-value < 0.05: reject H0, i.e. evidence of a monotonic trend
            if result.p < 0.05:
                result_test_autocorr[store][ingredient] += 1
                total_rej += 1
                print('rejeitou para', ingredient, store)

# total_rej counts the series with a statistically significant trend
print('total de ingredientes que rejeitaram a hipótese de não há nenhuma tendencia na serie temporal', total_rej)
print('total de ingredientes', total_ing_loja)

rejeitou para 30.0 12631
rejeitou para 15.0 20974
rejeitou para 13.0 4904
total de ingredientes que rejeitaram a hipótese de não há nenhuma tendencia na serie temporal 3
total de ingredientes 276


### In other words, the Mann-Kendall test rejected the no-trend hypothesis for only 3 of the 276 series, so most series show no statistically significant monotonic trend. In addition, most series are not stationary according to the ADF test, and most show no autocorrelation at lag 1. This information is crucial for fitting the time series models correctly.

### Given these characteristics of the data, we can fit the following models:
### - Exponential Smoothing models, which can handle trend and seasonality and can be fitted even when the series is not stationary.
### - State Space models, which can be fitted to non-stationary data with trends.
### - Besides these time series models, we can use machine learning models such as random forest and XGBoost.
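
### To illustrate how an exponential smoothing model with an additive trend produces forecasts, here is a minimal sketch of Holt's linear method with the standard library only. The smoothing parameters `alpha` and `beta` are fixed by hand here, which is an assumption made for illustration; statsmodels' `ExponentialSmoothing` estimates them from the data. The function name `holt_forecast` and the toy series are hypothetical.

```python
def holt_forecast(series, steps, alpha=0.5, beta=0.3):
    """Forecast `steps` values with Holt's additive-trend exponential smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Simple initialization: first value as level, first difference as trend.
    level = series[0]
    trend = series[1] - series[0]
    for y in series[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step-ahead forecast.
        level = alpha * y + (1 - alpha) * (previous_level + trend)
        # Update the slope estimate from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    # The h-step forecast extends the last level along the last trend.
    return [level + h * trend for h in range(1, steps + 1)]


# Hypothetical demand that grows by 2 units per period.
history = [1 + 2 * t for t in range(8)]
print(holt_forecast(history, steps=3))
```

### Because the forecast is a straight line from the last level and trend, the method needs no stationarity assumption, which is why it suits series like the ones described above.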


# About validation
### We will use a process similar to k-fold, but instead of creating the folds at random, each training set contains only observations that precede its test set, and each test set contains only observations that come after its training set. In addition, we keep only observations from the 2000s onward: there is little data before that, and data that old and sparse may hurt the models.
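
### The splitting scheme can be sketched as follows. This is a standard-library illustration of an expanding-window split, written to reproduce the default behaviour of scikit-learn's `TimeSeriesSplit` as far as I understand it; the function name `expanding_window_splits` is an assumption and is not part of the notebook.

```python
def expanding_window_splits(n_samples, n_splits=2):
    """Yield (train_indices, test_indices) with every train index before every test index."""
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for test_start in range(first_test_start, n_samples, test_size):
        # Training data: everything observed before the test window.
        train_indices = list(range(test_start))
        # Test data: the next `test_size` observations.
        test_indices = list(range(test_start, test_start + test_size))
        yield train_indices, test_indices


for train_idx, test_idx in expanding_window_splits(10, n_splits=2):
    print("train:", train_idx, "test:", test_idx)
```

### Each fold trains on a longer history than the previous one, and no test observation is ever older than a training observation.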

# We will use MLflow to store the experiments and model information; the MLflow API can also be used to work with the trained models.

In [3]:
# load the previously saved data so the preprocessing does not need to be run again,
# in case something goes wrong during training, since many models are trained at once.
# load the pickled data.
""" import pickle

with open('dados_st.pkl', 'rb') as fp:
    dados = pickle.load(fp)
 """

In [5]:
total_ingredientes = 0
for store in dados.values():
    # Count the number of 'ingrediente' keys for each store
    total_ingredientes += len(store)

# Output the total count of 'ingrediente' keys
print("Número total de modelos para treinar para prever a demanda de ingrediente isoladamente para cada resturante:", total_ingredientes)


Número total de modelos para treinar para prever a demanda de ingrediente isoladamente para cada resturante: 276


In [2]:
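import mlflow
import mlflow.sklearn
import mlflow.statsmodels
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import TimeSeriesSplit
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.statespace.structural import UnobservedComponents
from xgboost import XGBRegressor

def time_series_cross_val_mlflow(time_series_data, store_name, ingredient_name, n_splits=2):

    # Expanding-window split: each fold trains on the past and tests on the future
    tscv = TimeSeriesSplit(n_splits=n_splits)

    # Set the MLflow tracking URI to a directory in Google Drive for the specific store and ingredient
    # mlflow.set_tracking_uri(f'/c')  # Change the path as needed
    # One experiment per store/ingredient pair; set_experiment creates it when it
    # does not exist yet, so the cell can be re-run without an "already exists" error
    nome_experimento = f"{store_name}/{ingredient_name}"
    mlflow.set_experiment(nome_experimento)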

    # Perform cross-validation
    for fold, (train_index, test_index) in enumerate(tscv.split(time_series_data)):
        
        train_data = time_series_data.iloc[train_index]
        test_data = time_series_data.iloc[test_index]
        train_data = train_data.drop(columns='IngredientId')
        test_data = test_data.drop(columns='IngredientId')
        train_data['date'] = pd.to_datetime(train_data['date'])
        test_data['date'] = pd.to_datetime(test_data['date'])
        train_data = train_data.set_index('date')
        test_data = test_data.set_index('date')

        # Models to evaluate
        models = {
            'ExponentialSmoothing': ExponentialSmoothing(train_data, trend="add", seasonal=None),
            'StateSpaceModel': UnobservedComponents(train_data, level='local linear trend'),
            'RandomForest': RandomForestRegressor(),
            'XGBoost': XGBRegressor(objective='reg:squarederror'),
        }

        # Train and evaluate models
        for model_name, model in models.items():
            with mlflow.start_run(run_name=f"{model_name}_fold_{fold + 1}"):

                # Fit the model and make predictions
                if model_name in ['RandomForest', 'XGBoost']:
                    # The position in time is the only feature; y is passed as a 1-D array
                    model.fit(np.arange(len(train_data)).reshape(-1, 1), train_data.values.ravel())
                    predictions = model.predict(np.arange(len(train_data), len(train_data) + len(test_data)).reshape(-1, 1))
                elif model_name == 'ExponentialSmoothing' or model_name == 'StateSpaceModel':
                    model_fit = model.fit()
                    predictions = model_fit.forecast(steps=len(test_data))

                # Calculate metrics for the fitted models
                mse = mean_squared_error(test_data.values.flatten(), predictions)
                rmse = np.sqrt(mse)

                # Count under-predictions
                under_predictions_count = np.sum(predictions < test_data.values.flatten())

                # Log parameters, metrics, and model
                mlflow.log_param("fold", fold + 1)
                mlflow.log_param("model_name", model_name)
                mlflow.log_metric("mse", mse)
                mlflow.log_metric("rmse", rmse)
                mlflow.log_metric("under_predictions_count", under_predictions_count)

                if model_name in ['RandomForest', 'XGBoost']:
                    mlflow.sklearn.log_model(model, model_name)
                else:
                    mlflow.statsmodels.log_model(model_fit, model_name)

                print(f"Fold {fold + 1}, {model_name}: RMSE = {rmse:.4f}, Under Predictions Count = {under_predictions_count}")

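        # Handle moving averages separately.
        # Note: these are rolling means over the tail of the training data,
        # compared position by position with the test data.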
        for ma_model_name in ['SimpleMovingAverage', 'ExponentialMovingAverage']:
            with mlflow.start_run(run_name=f"{ma_model_name}_fold_{fold + 1}"):
                if ma_model_name == 'SimpleMovingAverage':
                    predictions = train_data.rolling(window=3).mean().iloc[-len(test_data):]
                elif ma_model_name == 'ExponentialMovingAverage':
                    predictions = train_data.ewm(span=3).mean().iloc[-len(test_data):]

                # Drop NaN values from predictions
                predictions = predictions.dropna()

                # Check lengths
                print(f"Length of predictions: {len(predictions)}, Length of test_data: {len(test_data)}")

                # Ensure predictions and test_data are aligned
                if len(predictions) < len(test_data):
                    # If predictions are shorter, slice the test_data to match
                    test_data = test_data.iloc[-len(predictions):]
                elif len(predictions) > len(test_data):
                    # If predictions are longer, slice the predictions to match
                    predictions = predictions[-len(test_data):]

                # Check if predictions are empty after dropping NaNs
                if predictions.empty:
                    print(f"Warning: Predictions for {ma_model_name} are empty after dropping NaNs.")
                    continue

                # Calculate metrics
                mse = mean_squared_error(test_data.values.flatten(), predictions.values.flatten())
                rmse = np.sqrt(mse)

                # Count under-predictions
                under_predictions_count = np.sum(predictions.values.flatten() < test_data.values.flatten())

                # Log parameters, metrics, and model
                mlflow.log_param("fold", fold + 1)
                mlflow.log_param("model_name", ma_model_name)
                mlflow.log_metric("mse", mse)
                mlflow.log_metric("rmse", rmse)
                mlflow.log_metric("under_predictions_count", under_predictions_count)

                print(f"Fold {fold + 1}, {ma_model_name}: RMSE = {rmse:.4f}, Under Predictions Count = {under_predictions_count}")
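
### The function above builds its moving-average baselines from rolling means over the end of the training window. As a point of comparison, here is a minimal sketch of a one-step-ahead (walk-forward) moving-average forecast with window 3, together with the two metrics logged above. It uses the standard library only; the function names and the toy numbers are assumptions for illustration, not the notebook's implementation.

```python
import math


def moving_average_walk_forward(train, test, window=3):
    """Predict each test point from the `window` most recent actual values."""
    history = list(train)  # must contain at least one observation
    predictions = []
    for actual in test:
        recent = history[-window:]
        predictions.append(sum(recent) / len(recent))
        # The true value becomes available only after the forecast is made.
        history.append(actual)
    return predictions


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def under_prediction_count(actual, predicted):
    """Number of periods in which the forecast fell below the realized demand."""
    return sum(p < a for a, p in zip(actual, predicted))


# Hypothetical demand values.
train = [10, 12, 11, 13]
test = [14, 12, 15]
preds = moving_average_walk_forward(train, test)
print(preds, rmse(test, preds), under_prediction_count(test, preds))
```

### Counting under-predictions alongside RMSE matters in this setting because forecasting too little of an ingredient means unmet customer demand, while RMSE alone treats both directions of error the same.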

In [79]:
not_enough_data = []
for store, ingredients in dados.items():

    for ingredient, time_series in ingredients.items():

        time_series = pd.DataFrame(time_series)
        time_series = time_series.loc[time_series['date'] > '2001-04-15']

        if time_series.shape[0] > 10:
            time_series_cross_val_mlflow(time_series, store, ingredient)

        else:
            store_e_ing = f'{store}/{ingredient}'
            not_enough_data.append(store_e_ing)

  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(
 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 7.1866, Under Predictions Count = 49
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.58109D+00    |proj g|=  1.05447D-01

At iterate    5    f=  3.38591D+00    |proj g|=  2.36042D-03

At iterate   10    f=  3.38586D+00    |proj g|=  2.46158D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     26      1     0     0   1.052D-04   3.386D+00
  F =   3.3858435614964413     

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH          

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 7.1866, Under Predictions Count = 49




Fold 1, RandomForest: RMSE = 7.8752, Under Predictions Count = 160


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 7.8311, Under Predictions Count = 103
Length of predictions: 212, Length of test_data: 213
Fold 1, SimpleMovingAverage: RMSE = 8.1133, Under Predictions Count = 75
Length of predictions: 212, Length of test_data: 212
Fold 1, ExponentialMovingAverage: RMSE = 7.8816, Under Predictions Count = 76


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 5.8781, Under Predictions Count = 97
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.59378D+00    |proj g|=  8.31732D-02
  ys=-4.049E-01  -gs= 1.057E-01 BFGS update SKIPPED

At iterate    5    f=  3.38988D+00    |proj g|=  7.96077D-02

At iterate   10    f=  3.38649D+00    |proj g|=  3.32815D-01

At iterate   15    f=  3.38582D+00    |proj g|=  9.90671D-02

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     18     34      2     1     0 

  return get_prediction_index(
  return get_prediction_index(
  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 5.8780, Under Predictions Count = 97




Fold 2, RandomForest: RMSE = 6.4235, Under Predictions Count = 105


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 6.3646, Under Predictions Count = 105
Length of predictions: 213, Length of test_data: 213
Fold 2, SimpleMovingAverage: RMSE = 7.3913, Under Predictions Count = 77
Length of predictions: 213, Length of test_data: 213
Fold 2, ExponentialMovingAverage: RMSE = 7.2595, Under Predictions Count = 81


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 14.2338, Under Predictions Count = 79
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.25167D+00    |proj g|=  4.05336D-02
  ys=-4.568E-01  -gs= 9.935E-02 BFGS update SKIPPED

At iterate    5    f=  4.05456D+00    |proj g|=  1.06259D-02

At iterate   10    f=  4.05332D+00    |proj g|=  2.41123D-01

At iterate   15    f=  4.05085D+00    |proj g|=  1.96065D-03

At iterate   20    f=  4.05071D+00    |proj g|=  7.60117D-02

At iterate   25    f=  4.05067D+00    |proj g|=  5.72404D-05

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final functi

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 14.2338, Under Predictions Count = 79




Fold 1, RandomForest: RMSE = 16.2189, Under Predictions Count = 131


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 16.2964, Under Predictions Count = 131
Length of predictions: 243, Length of test_data: 243
Fold 1, SimpleMovingAverage: RMSE = 15.9120, Under Predictions Count = 94
Length of predictions: 243, Length of test_data: 243
Fold 1, ExponentialMovingAverage: RMSE = 15.7233, Under Predictions Count = 90


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 13.7027, Under Predictions Count = 66
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.26498D+00    |proj g|=  4.59691D-02
  ys=-8.030E-01  -gs= 1.010E-01 BFGS update SKIPPED

At iterate    5    f=  4.06411D+00    |proj g|=  2.42662D-02

At iterate   10    f=  4.06285D+00    |proj g|=  4.72216D-01

At iterate   15    f=  4.05729D+00    |proj g|=  2.26181D-01

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     18     31      2     1     0

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 13.6992, Under Predictions Count = 66




Fold 2, RandomForest: RMSE = 16.6201, Under Predictions Count = 45


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 14.9629, Under Predictions Count = 50
Length of predictions: 243, Length of test_data: 243
Fold 2, SimpleMovingAverage: RMSE = 16.5648, Under Predictions Count = 86
Length of predictions: 243, Length of test_data: 243
Fold 2, ExponentialMovingAverage: RMSE = 16.3220, Under Predictions Count = 87


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 9.3396, Under Predictions Count = 39
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.63415D+00    |proj g|=  5.07037D-02

At iterate    5    f=  3.47391D+00    |proj g|=  1.65887D-02

At iterate   10    f=  3.47344D+00    |proj g|=  9.65814D-03

At iterate   15    f=  3.47335D+00    |proj g|=  7.95453D-07

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     22      1     0     0   7.955D-07   3.473D+00
  F =   3.4733506171344555  

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 14.2088, Under Predictions Count = 53




Fold 1, RandomForest: RMSE = 10.2613, Under Predictions Count = 40


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 10.5794, Under Predictions Count = 40
Length of predictions: 58, Length of test_data: 59
Fold 1, SimpleMovingAverage: RMSE = 9.5255, Under Predictions Count = 31
Length of predictions: 58, Length of test_data: 58
Fold 1, ExponentialMovingAverage: RMSE = 9.3670, Under Predictions Count = 29


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 7.9869, Under Predictions Count = 12
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.68414D+00    |proj g|=  6.07147D-02

At iterate    5    f=  3.52419D+00    |proj g|=  5.89708D-02

At iterate   10    f=  3.52231D+00    |proj g|=  1.23468D-03

At iterate   15    f=  3.52228D+00    |proj g|=  2.95628D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     18     29      1     0     0   4.387D-06   3.522D+00
  F =   3.5222787987171404  

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 8.1726, Under Predictions Count = 12




Fold 2, RandomForest: RMSE = 7.5290, Under Predictions Count = 20


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 7.5258, Under Predictions Count = 20
Length of predictions: 59, Length of test_data: 59
Fold 2, SimpleMovingAverage: RMSE = 9.6340, Under Predictions Count = 23
Length of predictions: 59, Length of test_data: 59
Fold 2, ExponentialMovingAverage: RMSE = 9.7976, Under Predictions Count = 23


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 39.0138, Under Predictions Count = 2
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.13627D+00    |proj g|=  3.92259D-02
  ys=-1.433E-01  -gs= 1.869E-01 BFGS update SKIPPED

At iterate    5    f=  3.87065D+00    |proj g|=  4.25592D-02

At iterate   10    f=  3.86744D+00    |proj g|=  3.76735D-03

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     21      2     1     0   2.915D-06   3.867D+00
  F =   3.8674309111067111     

CONVE

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 39.0250, Under Predictions Count = 2




Fold 1, RandomForest: RMSE = 20.4700, Under Predictions Count = 6


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 26.6261, Under Predictions Count = 6
Length of predictions: 62, Length of test_data: 64
Fold 1, SimpleMovingAverage: RMSE = 11.0405, Under Predictions Count = 21
Length of predictions: 62, Length of test_data: 62
Fold 1, ExponentialMovingAverage: RMSE = 11.2127, Under Predictions Count = 23


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 10.6942, Under Predictions Count = 18
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.02577D+00    |proj g|=  5.21199D-02
  ys=-3.766E-02  -gs= 1.340E-01 BFGS update SKIPPED

At iterate    5    f=  3.85554D+00    |proj g|=  5.38013D-02

At iterate   10    f=  3.85317D+00    |proj g|=  6.78240D-04

At iterate   15    f=  3.85316D+00    |proj g|=  5.50852D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     24      2     1     0

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 11.7446, Under Predictions Count = 12




Fold 2, RandomForest: RMSE = 14.2589, Under Predictions Count = 8


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 19.1441, Under Predictions Count = 6
Length of predictions: 64, Length of test_data: 64
Fold 2, SimpleMovingAverage: RMSE = 13.0133, Under Predictions Count = 20
Length of predictions: 64, Length of test_data: 64
Fold 2, ExponentialMovingAverage: RMSE = 12.7407, Under Predictions Count = 21


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 4.6896, Under Predictions Count = 44
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.11884D+00    |proj g|=  1.62187D-01

At iterate    5    f=  2.97271D+00    |proj g|=  2.78217D-01

At iterate   10    f=  2.96840D+00    |proj g|=  1.35399D-02

At iterate   15    f=  2.96838D+00    |proj g|=  4.19322D-02

At iterate   20    f=  2.96834D+00    |proj g|=  1.73415D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final function value

           * * *

   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     21     31      1   

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 4.6896, Under Predictions Count = 44




Fold 1, RandomForest: RMSE = 4.4824, Under Predictions Count = 44


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 4.4807, Under Predictions Count = 67
Length of predictions: 154, Length of test_data: 156
Fold 1, SimpleMovingAverage: RMSE = 5.1236, Under Predictions Count = 55
Length of predictions: 154, Length of test_data: 154
Fold 1, ExponentialMovingAverage: RMSE = 5.1887, Under Predictions Count = 58


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 4.4775, Under Predictions Count = 53
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.12051D+00    |proj g|=  1.70172D-01
  ys=-4.469E-01  -gs= 9.318E-02 BFGS update SKIPPED

At iterate    5    f=  2.95448D+00    |proj g|=  5.41496D-02

At iterate   10    f=  2.95220D+00    |proj g|=  1.34455D+00

At iterate   15    f=  2.94782D+00    |proj g|=  2.43382D-02

At iterate   20    f=  2.94781D+00    |proj g|=  1.86804D-02

At iterate   25    f=  2.94770D+00    |proj g|=  7.27297D-04

           * * *

Tit   = total number of iterations
Tnf   = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip  = number of BFGS updates skipped
Nact  = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F     = final functio

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 4.4875, Under Predictions Count = 53




Fold 2, RandomForest: RMSE = 4.7551, Under Predictions Count = 38


Fold 2, XGBoost: RMSE = 4.4645, Under Predictions Count = 53
Length of predictions: 156, Length of test_data: 156
Fold 2, SimpleMovingAverage: RMSE = 4.9869, Under Predictions Count = 54
Length of predictions: 156, Length of test_data: 156
Fold 2, ExponentialMovingAverage: RMSE = 4.9088, Under Predictions Count = 58


Fold 1, ExponentialSmoothing: RMSE = 4.6627, Under Predictions Count = 43
Fold 1, StateSpaceModel: RMSE = 4.6627, Under Predictions Count = 43




Fold 1, RandomForest: RMSE = 4.5057, Under Predictions Count = 43


Fold 1, XGBoost: RMSE = 4.5047, Under Predictions Count = 43
Length of predictions: 153, Length of test_data: 154
Fold 1, SimpleMovingAverage: RMSE = 5.0089, Under Predictions Count = 55
Length of predictions: 153, Length of test_data: 153
Fold 1, ExponentialMovingAverage: RMSE = 5.1994, Under Predictions Count = 59


Fold 2, ExponentialSmoothing: RMSE = 4.4223, Under Predictions Count = 52
Fold 2, StateSpaceModel: RMSE = 4.4223, Under Predictions Count = 52




Fold 2, RandomForest: RMSE = 4.6959, Under Predictions Count = 37


Fold 2, XGBoost: RMSE = 4.4114, Under Predictions Count = 52
Length of predictions: 154, Length of test_data: 154
Fold 2, SimpleMovingAverage: RMSE = 4.9275, Under Predictions Count = 54
Length of predictions: 154, Length of test_data: 154
Fold 2, ExponentialMovingAverage: RMSE = 4.8168, Under Predictions Count = 60


Fold 1, ExponentialSmoothing: RMSE = 11.1255, Under Predictions Count = 59
Fold 1, StateSpaceModel: RMSE = 13.8894, Under Predictions Count = 67




Fold 1, RandomForest: RMSE = 10.8506, Under Predictions Count = 60


Fold 1, XGBoost: RMSE = 11.7747, Under Predictions Count = 60
Length of predictions: 70, Length of test_data: 70
Fold 1, SimpleMovingAverage: RMSE = 9.4415, Under Predictions Count = 28
Length of predictions: 70, Length of test_data: 70
Fold 1, ExponentialMovingAverage: RMSE = 9.6688, Under Predictions Count = 30


Fold 2, ExponentialSmoothing: RMSE = 6.8920, Under Predictions Count = 35
Fold 2, StateSpaceModel: RMSE = 7.0455, Under Predictions Count = 27




Fold 2, RandomForest: RMSE = 13.3478, Under Predictions Count = 4


Fold 2, XGBoost: RMSE = 22.8267, Under Predictions Count = 0
Length of predictions: 70, Length of test_data: 70
Fold 2, SimpleMovingAverage: RMSE = 7.3532, Under Predictions Count = 29
Length of predictions: 70, Length of test_data: 70
Fold 2, ExponentialMovingAverage: RMSE = 7.7186, Under Predictions Count = 30


Fold 1, ExponentialSmoothing: RMSE = 6.5840, Under Predictions Count = 84
Fold 1, StateSpaceModel: RMSE = 6.5841, Under Predictions Count = 84




Fold 1, RandomForest: RMSE = 7.0562, Under Predictions Count = 114


Fold 1, XGBoost: RMSE = 8.3864, Under Predictions Count = 114
Length of predictions: 154, Length of test_data: 156
Fold 1, SimpleMovingAverage: RMSE = 7.6700, Under Predictions Count = 56
Length of predictions: 154, Length of test_data: 154
Fold 1, ExponentialMovingAverage: RMSE = 7.4954, Under Predictions Count = 63


Fold 2, ExponentialSmoothing: RMSE = 7.9016, Under Predictions Count = 45
Fold 2, StateSpaceModel: RMSE = 7.8938, Under Predictions Count = 45




Fold 2, RandomForest: RMSE = 8.1991, Under Predictions Count = 106


Fold 2, XGBoost: RMSE = 8.5854, Under Predictions Count = 106
Length of predictions: 156, Length of test_data: 156
Fold 2, SimpleMovingAverage: RMSE = 8.5241, Under Predictions Count = 58
Length of predictions: 156, Length of test_data: 156
Fold 2, ExponentialMovingAverage: RMSE = 8.5079, Under Predictions Count = 61


Fold 1, ExponentialSmoothing: RMSE = 9.8836, Under Predictions Count = 49
Fold 1, StateSpaceModel: RMSE = 10.0980, Under Predictions Count = 54




Fold 1, RandomForest: RMSE = 13.0441, Under Predictions Count = 98


Fold 1, XGBoost: RMSE = 13.1992, Under Predictions Count = 98
Length of predictions: 124, Length of test_data: 125
Fold 1, SimpleMovingAverage: RMSE = 11.4932, Under Predictions Count = 51
Length of predictions: 124, Length of test_data: 124
Fold 1, ExponentialMovingAverage: RMSE = 11.5358, Under Predictions Count = 52


Fold 2, ExponentialSmoothing: RMSE = 11.0987, Under Predictions Count = 54
Fold 2, StateSpaceModel: RMSE = 11.0987, Under Predictions Count = 54




Fold 2, RandomForest: RMSE = 12.3993, Under Predictions Count = 93


Fold 2, XGBoost: RMSE = 12.2833, Under Predictions Count = 63
Length of predictions: 125, Length of test_data: 125
Fold 2, SimpleMovingAverage: RMSE = 13.0304, Under Predictions Count = 52
Length of predictions: 125, Length of test_data: 125
Fold 2, ExponentialMovingAverage: RMSE = 12.7959, Under Predictions Count = 55


Fold 1, ExponentialSmoothing: RMSE = 5.1124, Under Predictions Count = 16
Fold 1, StateSpaceModel: RMSE = 4.6564, Under Predictions Count = 30




Fold 1, RandomForest: RMSE = 4.6737, Under Predictions Count = 30


Fold 1, XGBoost: RMSE = 4.6754, Under Predictions Count = 30
Length of predictions: 58, Length of test_data: 60
Fold 1, SimpleMovingAverage: RMSE = 5.9699, Under Predictions Count = 23
Length of predictions: 58, Length of test_data: 58
Fold 1, ExponentialMovingAverage: RMSE = 6.1486, Under Predictions Count = 28


Fold 2, ExponentialSmoothing: RMSE = 5.0132, Under Predictions Count = 33
Fold 2, StateSpaceModel: RMSE = 5.0311, Under Predictions Count = 33




Fold 2, RandomForest: RMSE = 5.0730, Under Predictions Count = 33


Fold 2, XGBoost: RMSE = 5.0116, Under Predictions Count = 33
Length of predictions: 60, Length of test_data: 60
Fold 2, SimpleMovingAverage: RMSE = 5.7355, Under Predictions Count = 27
Length of predictions: 60, Length of test_data: 60
Fold 2, ExponentialMovingAverage: RMSE = 5.9210, Under Predictions Count = 30


Fold 1, ExponentialSmoothing: RMSE = 3.5191, Under Predictions Count = 13
Fold 1, StateSpaceModel: RMSE = 3.8895, Under Predictions Count = 36




Fold 1, RandomForest: RMSE = 3.8641, Under Predictions Count = 42


Fold 1, XGBoost: RMSE = 3.6344, Under Predictions Count = 42
Length of predictions: 58, Length of test_data: 60
Fold 1, SimpleMovingAverage: RMSE = 4.2399, Under Predictions Count = 23
Length of predictions: 58, Length of test_data: 58
Fold 1, ExponentialMovingAverage: RMSE = 4.3541, Under Predictions Count = 28


Fold 2, ExponentialSmoothing: RMSE = 3.1672, Under Predictions Count = 30
Fold 2, StateSpaceModel: RMSE = 3.1672, Under Predictions Count = 30




Fold 2, RandomForest: RMSE = 3.4435, Under Predictions Count = 30


Fold 2, XGBoost: RMSE = 3.5000, Under Predictions Count = 30
Length of predictions: 60, Length of test_data: 60
Fold 2, SimpleMovingAverage: RMSE = 3.6014, Under Predictions Count = 28
Length of predictions: 60, Length of test_data: 60
Fold 2, ExponentialMovingAverage: RMSE = 3.7264, Under Predictions Count = 26


Fold 1, ExponentialSmoothing: RMSE = 13.5600, Under Predictions Count = 29
Fold 1, StateSpaceModel: RMSE = 13.5594, Under Predictions Count = 29




Fold 1, RandomForest: RMSE = 13.2058, Under Predictions Count = 28


Fold 1, XGBoost: RMSE = 14.0812, Under Predictions Count = 28
Length of predictions: 76, Length of test_data: 77
Fold 1, SimpleMovingAverage: RMSE = 14.7577, Under Predictions Count = 38
Length of predictions: 76, Length of test_data: 76
Fold 1, ExponentialMovingAverage: RMSE = 14.9430, Under Predictions Count = 43


Fold 2, ExponentialSmoothing: RMSE = 12.9697, Under Predictions Count = 24
Fold 2, StateSpaceModel: RMSE = 12.9696, Under Predictions Count = 24




Fold 2, RandomForest: RMSE = 16.5075, Under Predictions Count = 17


Fold 2, XGBoost: RMSE = 22.1505, Under Predictions Count = 7
Length of predictions: 77, Length of test_data: 77
Fold 2, SimpleMovingAverage: RMSE = 14.1517, Under Predictions Count = 30
Length of predictions: 77, Length of test_data: 77
Fold 2, ExponentialMovingAverage: RMSE = 14.7049, Under Predictions Count = 26


Fold 1, ExponentialSmoothing: RMSE = 5.9575, Under Predictions Count = 29
Fold 1, StateSpaceModel: RMSE = 5.9576, Under Predictions Count = 29




Fold 1, RandomForest: RMSE = 5.9424, Under Predictions Count = 37


Fold 1, XGBoost: RMSE = 5.9282, Under Predictions Count = 37
Length of predictions: 76, Length of test_data: 77
Fold 1, SimpleMovingAverage: RMSE = 6.7308, Under Predictions Count = 37
Length of predictions: 76, Length of test_data: 76
Fold 1, ExponentialMovingAverage: RMSE = 6.8065, Under Predictions Count = 36


Fold 2, ExponentialSmoothing: RMSE = 5.7465, Under Predictions Count = 26
Fold 2, StateSpaceModel: RMSE = 5.7465, Under Predictions Count = 26




Fold 2, RandomForest: RMSE = 7.7458, Under Predictions Count = 12


Fold 2, XGBoost: RMSE = 11.7994, Under Predictions Count = 5
Length of predictions: 77, Length of test_data: 77
Fold 2, SimpleMovingAverage: RMSE = 6.2280, Under Predictions Count = 25
Length of predictions: 77, Length of test_data: 77
Fold 2, ExponentialMovingAverage: RMSE = 6.4273, Under Predictions Count = 28


Fold 1, ExponentialSmoothing: RMSE = 13.1657, Under Predictions Count = 6
Fold 1, StateSpaceModel: RMSE = 13.1659, Under Predictions Count = 6




Fold 1, RandomForest: RMSE = 9.7180, Under Predictions Count = 17


Fold 1, XGBoost: RMSE = 10.3597, Under Predictions Count = 17
Length of predictions: 47, Length of test_data: 49
Fold 1, SimpleMovingAverage: RMSE = 12.8398, Under Predictions Count = 19
Length of predictions: 47, Length of test_data: 47
Fold 1, ExponentialMovingAverage: RMSE = 12.8817, Under Predictions Count = 18


Fold 2, ExponentialSmoothing: RMSE = 9.0284, Under Predictions Count = 21
Fold 2, StateSpaceModel: RMSE = 9.0284, Under Predictions Count = 21




Fold 2, RandomForest: RMSE = 14.8594, Under Predictions Count = 5


Fold 2, XGBoost: RMSE = 12.4881, Under Predictions Count = 6
Length of predictions: 49, Length of test_data: 49
Fold 2, SimpleMovingAverage: RMSE = 11.0910, Under Predictions Count = 20
Length of predictions: 49, Length of test_data: 49
Fold 2, ExponentialMovingAverage: RMSE = 11.3386, Under Predictions Count = 23


Fold 1, ExponentialSmoothing: RMSE = 19.2323, Under Predictions Count = 14
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.57695D+00    |proj g|=  3.28669D-02

At iterate    5    f=  4.48864D+00    |proj g|=  2.50403D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      7     13      1     0     0   2.514D-08   4.489D+00
  F =   4.4886430687449064     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 1, StateSpaceModel: RMSE = 19.2332, Under Predictions Count = 14




Fold 1, RandomForest: RMSE = 19.2963, Under Predictions Count = 38




Fold 1, XGBoost: RMSE = 23.1608, Under Predictions Count = 38
Length of predictions: 48, Length of test_data: 49
Fold 1, SimpleMovingAverage: RMSE = 22.2351, Under Predictions Count = 19
Length of predictions: 48, Length of test_data: 48
Fold 1, ExponentialMovingAverage: RMSE = 22.1960, Under Predictions Count = 16


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 15.0034, Under Predictions Count = 21
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.53287D+00    |proj g|=  3.85291D-02
  ys=-6.819E-02  -gs= 8.654E-02 BFGS update SKIPPED

At iterate    5    f=  4.40134D+00    |proj g|=  2.18759D-02

At iterate   10    f=  4.39848D+00    |proj g|=  7.88835D-03

At iterate   15    f=  4.39844D+00    |proj g|=  1.31513D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     15     23      2     1     0



Fold 2, StateSpaceModel: RMSE = 15.0034, Under Predictions Count = 21




Fold 2, RandomForest: RMSE = 14.9755, Under Predictions Count = 14




Fold 2, XGBoost: RMSE = 15.9143, Under Predictions Count = 21
Length of predictions: 49, Length of test_data: 49
Fold 2, SimpleMovingAverage: RMSE = 17.0532, Under Predictions Count = 23
Length of predictions: 49, Length of test_data: 49
Fold 2, ExponentialMovingAverage: RMSE = 15.5864, Under Predictions Count = 22


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 2.4282, Under Predictions Count = 32
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.15323D+00    |proj g|=  2.13906D-01

At iterate    5    f=  2.02829D+00    |proj g|=  4.17584D-03

At iterate   10    f=  2.02815D+00    |proj g|=  1.14460D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     19      1     0     0   6.097D-07   2.028D+00
  F =   2.0281531655952598     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL         



Fold 1, StateSpaceModel: RMSE = 2.4282, Under Predictions Count = 32




Fold 1, RandomForest: RMSE = 2.0375, Under Predictions Count = 19




Fold 1, XGBoost: RMSE = 2.2188, Under Predictions Count = 19
Length of predictions: 38, Length of test_data: 39
Fold 1, SimpleMovingAverage: RMSE = 2.1791, Under Predictions Count = 14
Length of predictions: 38, Length of test_data: 38
Fold 1, ExponentialMovingAverage: RMSE = 2.2536, Under Predictions Count = 16


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 1.7712, Under Predictions Count = 14
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.16502D+00    |proj g|=  2.93440D-01
  ys=-8.667E-02  -gs= 9.739E-02 BFGS update SKIPPED

At iterate    5    f=  2.02133D+00    |proj g|=  1.03602D-01

At iterate   10    f=  2.02019D+00    |proj g|=  9.25932D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     19      2     1     0   4.449D-05   2.020D+00
  F =   2.0201894833159750     

CONVE



Fold 2, StateSpaceModel: RMSE = 1.7787, Under Predictions Count = 14




Fold 2, RandomForest: RMSE = 1.9931, Under Predictions Count = 14




Fold 2, XGBoost: RMSE = 2.0734, Under Predictions Count = 14
Length of predictions: 39, Length of test_data: 39
Fold 2, SimpleMovingAverage: RMSE = 2.0028, Under Predictions Count = 10
Length of predictions: 39, Length of test_data: 39
Fold 2, ExponentialMovingAverage: RMSE = 1.8182, Under Predictions Count = 9


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 4.6909, Under Predictions Count = 35
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.99071D+00    |proj g|=  1.26925D-01

At iterate    5    f=  2.82373D+00    |proj g|=  1.10755D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     16      1     0     0   7.899D-06   2.824D+00
  F =   2.8237235139963417     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 1, StateSpaceModel: RMSE = 4.6909, Under Predictions Count = 35




Fold 1, RandomForest: RMSE = 4.0988, Under Predictions Count = 30




Fold 1, XGBoost: RMSE = 4.7196, Under Predictions Count = 30
Length of predictions: 40, Length of test_data: 40
Fold 1, SimpleMovingAverage: RMSE = 4.3729, Under Predictions Count = 14
Length of predictions: 40, Length of test_data: 40
Fold 1, ExponentialMovingAverage: RMSE = 4.2095, Under Predictions Count = 14


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 4.2785, Under Predictions Count = 19
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.95985D+00    |proj g|=  1.50329D-01
  ys=-2.531E-02  -gs= 1.292E-01 BFGS update SKIPPED

At iterate    5    f=  2.77317D+00    |proj g|=  1.23599D-02

At iterate   10    f=  2.77298D+00    |proj g|=  6.75545D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     20      2     1     0   2.036D-07   2.773D+00
  F =   2.7729733002955745     

CONVE



Fold 2, StateSpaceModel: RMSE = 4.3069, Under Predictions Count = 19




Fold 2, RandomForest: RMSE = 4.7516, Under Predictions Count = 24




Fold 2, XGBoost: RMSE = 5.2189, Under Predictions Count = 24
Length of predictions: 40, Length of test_data: 40
Fold 2, SimpleMovingAverage: RMSE = 4.7317, Under Predictions Count = 15
Length of predictions: 40, Length of test_data: 40
Fold 2, ExponentialMovingAverage: RMSE = 4.6988, Under Predictions Count = 16


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 9.8286, Under Predictions Count = 16
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.02638D+00    |proj g|=  4.98535D-02

At iterate    5    f=  3.93179D+00    |proj g|=  3.36634D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      6     13      1     0     0   1.443D-06   3.932D+00
  F =   3.9317903508903762     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 1, StateSpaceModel: RMSE = 9.8257, Under Predictions Count = 16




Fold 1, RandomForest: RMSE = 9.7123, Under Predictions Count = 16




Fold 1, XGBoost: RMSE = 10.3492, Under Predictions Count = 16
Length of predictions: 41, Length of test_data: 42
Fold 1, SimpleMovingAverage: RMSE = 11.7291, Under Predictions Count = 20
Length of predictions: 41, Length of test_data: 41
Fold 1, ExponentialMovingAverage: RMSE = 11.9223, Under Predictions Count = 21


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 12.7158, Under Predictions Count = 11
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.99326D+00    |proj g|=  5.60996D-02

At iterate    5    f=  3.83449D+00    |proj g|=  3.19002D-03

At iterate   10    f=  3.83415D+00    |proj g|=  3.03157D-06

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     10     18      1     0     0   3.032D-06   3.834D+00
  F =   3.8341506782471724     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        



Fold 2, StateSpaceModel: RMSE = 12.7158, Under Predictions Count = 11




Fold 2, RandomForest: RMSE = 12.4946, Under Predictions Count = 12




Fold 2, XGBoost: RMSE = 12.5481, Under Predictions Count = 12
Length of predictions: 42, Length of test_data: 42
Fold 2, SimpleMovingAverage: RMSE = 12.6236, Under Predictions Count = 12
Length of predictions: 42, Length of test_data: 42
Fold 2, ExponentialMovingAverage: RMSE = 12.8668, Under Predictions Count = 11


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 18.5599, Under Predictions Count = 12
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.26847D+00    |proj g|=  2.36918D-02

At iterate    5    f=  4.23024D+00    |proj g|=  3.13321D-03

At iterate   10    f=  4.22341D+00    |proj g|=  1.37650D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     17      1     0     0   3.281D-06   4.223D+00
  F =   4.2234020983802028     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        



Fold 1, StateSpaceModel: RMSE = 19.4962, Under Predictions Count = 28




Fold 1, RandomForest: RMSE = 18.3907, Under Predictions Count = 17




Fold 1, XGBoost: RMSE = 19.4507, Under Predictions Count = 17
Length of predictions: 44, Length of test_data: 46
Fold 1, SimpleMovingAverage: RMSE = 22.6346, Under Predictions Count = 13
Length of predictions: 44, Length of test_data: 44
Fold 1, ExponentialMovingAverage: RMSE = 22.5718, Under Predictions Count = 16


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 14.7771, Under Predictions Count = 19
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.40072D+00    |proj g|=  2.80993D-02

At iterate    5    f=  4.28817D+00    |proj g|=  2.33897D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      9     17      1     0     0   1.772D-06   4.288D+00
  F =   4.2881425883249609     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 2, StateSpaceModel: RMSE = 15.2382, Under Predictions Count = 19




Fold 2, RandomForest: RMSE = 19.3813, Under Predictions Count = 38




Fold 2, XGBoost: RMSE = 20.8937, Under Predictions Count = 38
Length of predictions: 46, Length of test_data: 46
Fold 2, SimpleMovingAverage: RMSE = 20.4074, Under Predictions Count = 16
Length of predictions: 46, Length of test_data: 46
Fold 2, ExponentialMovingAverage: RMSE = 20.4105, Under Predictions Count = 18


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 6.5944, Under Predictions Count = 16
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.46683D+00    |proj g|=  1.03089D-01

At iterate    5    f=  3.31317D+00    |proj g|=  9.76414D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      7     14      1     0     0   3.976D-06   3.313D+00
  F =   3.3131587978351185     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 1, StateSpaceModel: RMSE = 6.5945, Under Predictions Count = 16




Fold 1, RandomForest: RMSE = 7.1139, Under Predictions Count = 18




Fold 1, XGBoost: RMSE = 5.0264, Under Predictions Count = 21
Length of predictions: 77, Length of test_data: 77
Fold 1, SimpleMovingAverage: RMSE = 6.0350, Under Predictions Count = 22
Length of predictions: 77, Length of test_data: 77
Fold 1, ExponentialMovingAverage: RMSE = 6.1246, Under Predictions Count = 26


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 6.0584, Under Predictions Count = 26
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.38082D+00    |proj g|=  1.17504D-01

At iterate    5    f=  3.20799D+00    |proj g|=  1.41009D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     14      1     0     0   1.434D-07   3.208D+00
  F =   3.2079716177603315     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 2, StateSpaceModel: RMSE = 6.0584, Under Predictions Count = 26




Fold 2, RandomForest: RMSE = 6.1063, Under Predictions Count = 26




Fold 2, XGBoost: RMSE = 6.1143, Under Predictions Count = 26
Length of predictions: 77, Length of test_data: 77
Fold 2, SimpleMovingAverage: RMSE = 7.1468, Under Predictions Count = 37
Length of predictions: 77, Length of test_data: 77
Fold 2, ExponentialMovingAverage: RMSE = 7.0998, Under Predictions Count = 37


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 1.7917, Under Predictions Count = 12
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.32728D+00    |proj g|=  1.97992D-01

At iterate    5    f=  2.30828D+00    |proj g|=  1.85893D-03

At iterate   10    f=  2.30828D+00    |proj g|=  3.12493D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     13      1     0     0   2.117D-06   2.308D+00
  F =   2.3082787322999465     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL         



Fold 1, StateSpaceModel: RMSE = 1.7917, Under Predictions Count = 12




Fold 1, RandomForest: RMSE = 1.7499, Under Predictions Count = 12




Fold 1, XGBoost: RMSE = 1.7638, Under Predictions Count = 12
Length of predictions: 16, Length of test_data: 18
Fold 1, SimpleMovingAverage: RMSE = 2.1538, Under Predictions Count = 9
Length of predictions: 16, Length of test_data: 16
Fold 1, ExponentialMovingAverage: RMSE = 2.1813, Under Predictions Count = 9


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 1.6242, Under Predictions Count = 9
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.27888D+00    |proj g|=  3.79956D-01

At iterate    5    f=  2.21393D+00    |proj g|=  3.68721D-02

At iterate   10    f=  2.21355D+00    |proj g|=  4.64762D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     12     15      1     0     0   3.681D-06   2.214D+00
  F =   2.2135536244671599     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL          



Fold 2, StateSpaceModel: RMSE = 1.6242, Under Predictions Count = 9




Fold 2, RandomForest: RMSE = 1.5144, Under Predictions Count = 9




Fold 2, XGBoost: RMSE = 1.6991, Under Predictions Count = 9
Length of predictions: 18, Length of test_data: 18
Fold 2, SimpleMovingAverage: RMSE = 1.6178, Under Predictions Count = 5
Length of predictions: 18, Length of test_data: 18
Fold 2, ExponentialMovingAverage: RMSE = 1.7922, Under Predictions Count = 7


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 10.6317, Under Predictions Count = 21
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.89709D+00    |proj g|=  5.66344D-02

At iterate    5    f=  3.79980D+00    |proj g|=  1.18549D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      9     17      1     0     0   4.903D-06   3.800D+00
  F =   3.7997556075965195     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            




Fold 1, StateSpaceModel: RMSE = 10.6316, Under Predictions Count = 21




Fold 1, RandomForest: RMSE = 10.4270, Under Predictions Count = 22




Fold 1, XGBoost: RMSE = 10.3039, Under Predictions Count = 22
Length of predictions: 61, Length of test_data: 61
Fold 1, SimpleMovingAverage: RMSE = 11.9315, Under Predictions Count = 30
Length of predictions: 61, Length of test_data: 61
Fold 1, ExponentialMovingAverage: RMSE = 12.3766, Under Predictions Count = 31


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 9.7504, Under Predictions Count = 27
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.92441D+00    |proj g|=  6.09846D-02
  ys=-2.309E-01  -gs= 7.607E-02 BFGS update SKIPPED

At iterate    5    f=  3.78573D+00    |proj g|=  1.57976D-02

At iterate   10    f=  3.78393D+00    |proj g|=  1.61216D-01

At iterate   15    f=  3.78337D+00    |proj g|=  7.44366D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     27      2     1     0 



Fold 2, StateSpaceModel: RMSE = 9.7504, Under Predictions Count = 27




Fold 2, RandomForest: RMSE = 13.6490, Under Predictions Count = 9




Fold 2, XGBoost: RMSE = 12.0741, Under Predictions Count = 15
Length of predictions: 61, Length of test_data: 61
Fold 2, SimpleMovingAverage: RMSE = 11.0442, Under Predictions Count = 25
Length of predictions: 61, Length of test_data: 61
Fold 2, ExponentialMovingAverage: RMSE = 11.4034, Under Predictions Count = 28


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 8.6056, Under Predictions Count = 19
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.91671D+00    |proj g|=  9.12798D-02

At iterate    5    f=  3.83961D+00    |proj g|=  7.46210D-03

At iterate   10    f=  3.83959D+00    |proj g|=  8.53313D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     14     22      1     0     0   9.114D-07   3.840D+00
  F =   3.8395564731299494     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL         



Fold 1, StateSpaceModel: RMSE = 8.6057, Under Predictions Count = 19




Fold 1, RandomForest: RMSE = 10.6421, Under Predictions Count = 12




Fold 1, XGBoost: RMSE = 15.9042, Under Predictions Count = 6
Length of predictions: 60, Length of test_data: 61
Fold 1, SimpleMovingAverage: RMSE = 10.6799, Under Predictions Count = 26
Length of predictions: 60, Length of test_data: 60
Fold 1, ExponentialMovingAverage: RMSE = 10.2536, Under Predictions Count = 26


 This problem is unconstrained.


Fold 2, ExponentialSmoothing: RMSE = 9.9010, Under Predictions Count = 28
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.87289D+00    |proj g|=  8.66793D-02
  ys=-1.645E-01  -gs= 8.706E-02 BFGS update SKIPPED

At iterate    5    f=  3.72738D+00    |proj g|=  1.06272D-02

At iterate   10    f=  3.72645D+00    |proj g|=  1.21690D-01

At iterate   15    f=  3.72622D+00    |proj g|=  1.02833D-03

At iterate   20    f=  3.72619D+00    |proj g|=  7.24762D-06

           * * *


   N    Tit     Tnf  Tnint  Skip  N



Fold 2, StateSpaceModel: RMSE = 9.9010, Under Predictions Count = 28




Fold 2, RandomForest: RMSE = 13.4410, Under Predictions Count = 8




Fold 2, XGBoost: RMSE = 17.6609, Under Predictions Count = 7
Length of predictions: 61, Length of test_data: 61
Fold 2, SimpleMovingAverage: RMSE = 11.0492, Under Predictions Count = 29
Length of predictions: 61, Length of test_data: 61
Fold 2, ExponentialMovingAverage: RMSE = 11.5843, Under Predictions Count = 30


 This problem is unconstrained.


Fold 1, ExponentialSmoothing: RMSE = 27.4155, Under Predictions Count = 20
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.75495D+00    |proj g|=  1.60086D-02

At iterate    5    f=  4.73495D+00    |proj g|=  3.06874D-04

At iterate   10    f=  4.73477D+00    |proj g|=  1.12424D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     12     14      1     0     0   1.860D-06   4.735D+00
  F =   4.7347740699197711     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        



Fold 1, StateSpaceModel: RMSE = 27.4141, Under Predictions Count = 20




Fold 1, RandomForest: RMSE = 42.3037, Under Predictions Count = 28




Fold 1, XGBoost: RMSE = 54.4189, Under Predictions Count = 30
Length of predictions: 31, Length of test_data: 32
Fold 1, SimpleMovingAverage: RMSE = 28.5700, Under Predictions Count = 13
Length of predictions: 31, Length of test_data: 31
Fold 1, ExponentialMovingAverage: RMSE = 30.5695, Under Predictions Count = 15


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 27.6240, Under Predictions Count = 16
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.83646D+00    |proj g|=  3.07402D-02

At iterate    5    f=  4.75372D+00    |proj g|=  1.28084D-02

At iterate   10    f=  4.75308D+00    |proj g|=  4.15199D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     18      1     0     0   3.688D-07   4.753D+00
  F =   4.7530767515184511     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 27.6236, Under Predictions Count = 16




Fold 2, RandomForest: RMSE = 33.1863, Under Predictions Count = 12


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 51.3366, Under Predictions Count = 1
Length of predictions: 32, Length of test_data: 32
Fold 2, SimpleMovingAverage: RMSE = 29.9518, Under Predictions Count = 18
Length of predictions: 32, Length of test_data: 32
Fold 2, ExponentialMovingAverage: RMSE = 32.5179, Under Predictions Count = 18


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 14.3900, Under Predictions Count = 5
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.96521D+00    |proj g|=  4.25492D-02

At iterate    5    f=  3.87726D+00    |proj g|=  6.26816D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      7     11      1     0     0   5.697D-06   3.877D+00
  F =   3.8772171381317855     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 14.3904, Under Predictions Count = 5




Fold 1, RandomForest: RMSE = 9.9033, Under Predictions Count = 20


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 13.5763, Under Predictions Count = 25
Length of predictions: 31, Length of test_data: 32
Fold 1, SimpleMovingAverage: RMSE = 10.4912, Under Predictions Count = 8
Length of predictions: 31, Length of test_data: 31
Fold 1, ExponentialMovingAverage: RMSE = 11.0958, Under Predictions Count = 7


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 11.2865, Under Predictions Count = 19
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.98390D+00    |proj g|=  4.32835D-02

At iterate    5    f=  3.82428D+00    |proj g|=  4.20128D-03

At iterate   10    f=  3.82375D+00    |proj g|=  4.78527D-04

At iterate   15    f=  3.82375D+00    |proj g|=  9.81357D-04

At iterate   20    f=  3.82374D+00    |proj g|=  6.42426D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     24     33      1  

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 12.6513, Under Predictions Count = 24




Fold 2, RandomForest: RMSE = 11.0956, Under Predictions Count = 19


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 10.6880, Under Predictions Count = 15
Length of predictions: 32, Length of test_data: 32
Fold 2, SimpleMovingAverage: RMSE = 11.6201, Under Predictions Count = 15
Length of predictions: 32, Length of test_data: 32
Fold 2, ExponentialMovingAverage: RMSE = 11.8100, Under Predictions Count = 15


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 20.7143, Under Predictions Count = 20
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.43576D+00    |proj g|=  1.73062D-02

At iterate    5    f=  4.41824D+00    |proj g|=  1.95137D-03

At iterate   10    f=  4.41389D+00    |proj g|=  2.94099D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     12     17      1     0     0   2.557D-06   4.414D+00
  F =   4.4138887322160514     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 26.3269, Under Predictions Count = 27




Fold 1, RandomForest: RMSE = 29.5937, Under Predictions Count = 28


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 34.7744, Under Predictions Count = 28
Length of predictions: 31, Length of test_data: 32
Fold 1, SimpleMovingAverage: RMSE = 23.6504, Under Predictions Count = 16
Length of predictions: 31, Length of test_data: 31
Fold 1, ExponentialMovingAverage: RMSE = 22.4041, Under Predictions Count = 15


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 19.0209, Under Predictions Count = 17
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.55963D+00    |proj g|=  3.27466D-02

At iterate    5    f=  4.45771D+00    |proj g|=  1.01064D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      7     15      1     0     0   9.298D-07   4.458D+00
  F =   4.4577085252099122     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 19.0325, Under Predictions Count = 17




Fold 2, RandomForest: RMSE = 19.4328, Under Predictions Count = 12


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 19.8684, Under Predictions Count = 12
Length of predictions: 32, Length of test_data: 32
Fold 2, SimpleMovingAverage: RMSE = 21.7613, Under Predictions Count = 17
Length of predictions: 32, Length of test_data: 32
Fold 2, ExponentialMovingAverage: RMSE = 23.5270, Under Predictions Count = 16


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 20.7143, Under Predictions Count = 20
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.43576D+00    |proj g|=  1.73062D-02

At iterate    5    f=  4.41824D+00    |proj g|=  1.95137D-03

At iterate   10    f=  4.41389D+00    |proj g|=  2.94099D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     12     17      1     0     0   2.557D-06   4.414D+00
  F =   4.4138887322160514     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 26.3269, Under Predictions Count = 27




Fold 1, RandomForest: RMSE = 28.8431, Under Predictions Count = 28


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 34.7744, Under Predictions Count = 28
Length of predictions: 31, Length of test_data: 32
Fold 1, SimpleMovingAverage: RMSE = 23.6504, Under Predictions Count = 16
Length of predictions: 31, Length of test_data: 31
Fold 1, ExponentialMovingAverage: RMSE = 22.4041, Under Predictions Count = 15


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 19.0209, Under Predictions Count = 17
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.55963D+00    |proj g|=  3.27466D-02

At iterate    5    f=  4.45771D+00    |proj g|=  1.01064D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      7     15      1     0     0   9.298D-07   4.458D+00
  F =   4.4577085252099122     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 19.0325, Under Predictions Count = 17




Fold 2, RandomForest: RMSE = 19.5096, Under Predictions Count = 12


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 19.8684, Under Predictions Count = 12
Length of predictions: 32, Length of test_data: 32
Fold 2, SimpleMovingAverage: RMSE = 21.7613, Under Predictions Count = 17
Length of predictions: 32, Length of test_data: 32
Fold 2, ExponentialMovingAverage: RMSE = 23.5270, Under Predictions Count = 16


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 56.1400, Under Predictions Count = 29
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.45653D+00    |proj g|=  1.47938D-02

At iterate    5    f=  5.32119D+00    |proj g|=  3.28782D-03

At iterate   10    f=  5.32041D+00    |proj g|=  1.26153D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     13     22      1     0     0   1.059D-06   5.320D+00
  F =   5.3203985255567385     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL        

  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 54.4086, Under Predictions Count = 28




Fold 1, RandomForest: RMSE = 65.5092, Under Predictions Count = 41


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 71.8690, Under Predictions Count = 61
Length of predictions: 69, Length of test_data: 70
Fold 1, SimpleMovingAverage: RMSE = 58.3717, Under Predictions Count = 28
Length of predictions: 69, Length of test_data: 69
Fold 1, ExponentialMovingAverage: RMSE = 60.0743, Under Predictions Count = 27


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 52.6671, Under Predictions Count = 26
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  5.49547D+00    |proj g|=  2.01484D-02

At iterate    5    f=  5.37028D+00    |proj g|=  8.67891D-03

At iterate   10    f=  5.36783D+00    |proj g|=  2.64765D-04

At iterate   15    f=  5.36779D+00    |proj g|=  5.31456D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     16     28      1     0     0   1.804D-06   5.368D+00
  F =   5.3677881159666958 

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 52.3044, Under Predictions Count = 28




Fold 2, RandomForest: RMSE = 53.5516, Under Predictions Count = 30


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 66.1483, Under Predictions Count = 46
Length of predictions: 70, Length of test_data: 70
Fold 2, SimpleMovingAverage: RMSE = 57.4503, Under Predictions Count = 33
Length of predictions: 70, Length of test_data: 70
Fold 2, ExponentialMovingAverage: RMSE = 59.0142, Under Predictions Count = 30


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 28.2486, Under Predictions Count = 29
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.81527D+00    |proj g|=  2.93623D-02

At iterate    5    f=  4.68503D+00    |proj g|=  1.22047D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     16      1     0     0   3.250D-06   4.685D+00
  F =   4.6846647950520559     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 28.2831, Under Predictions Count = 29




Fold 1, RandomForest: RMSE = 32.7356, Under Predictions Count = 31


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 36.4050, Under Predictions Count = 53
Length of predictions: 69, Length of test_data: 70
Fold 1, SimpleMovingAverage: RMSE = 30.2657, Under Predictions Count = 29
Length of predictions: 69, Length of test_data: 69
Fold 1, ExponentialMovingAverage: RMSE = 31.1019, Under Predictions Count = 28


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 27.6241, Under Predictions Count = 26
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  4.84757D+00    |proj g|=  3.98388D-02
  ys=-9.646E-02  -gs= 7.694E-02 BFGS update SKIPPED

At iterate    5    f=  4.72467D+00    |proj g|=  2.27220D-02

At iterate   10    f=  4.72239D+00    |proj g|=  1.84039D-02

At iterate   15    f=  4.72231D+00    |proj g|=  1.46941D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     17     27      2     1     0

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 27.2635, Under Predictions Count = 29




Fold 2, RandomForest: RMSE = 27.9282, Under Predictions Count = 29


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 34.1875, Under Predictions Count = 45
Length of predictions: 70, Length of test_data: 70
Fold 2, SimpleMovingAverage: RMSE = 29.9987, Under Predictions Count = 31
Length of predictions: 70, Length of test_data: 70
Fold 2, ExponentialMovingAverage: RMSE = 30.6591, Under Predictions Count = 30


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 2.0712, Under Predictions Count = 2
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.11054D+00    |proj g|=  1.59543D-01

At iterate    5    f=  2.05797D+00    |proj g|=  4.95996D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     11      1     0     0   1.161D-06   2.058D+00
  F =   2.0579671032137545     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 2.0712, Under Predictions Count = 2




Fold 1, RandomForest: RMSE = 1.2454, Under Predictions Count = 5


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 1.5113, Under Predictions Count = 5
Length of predictions: 12, Length of test_data: 14
Fold 1, SimpleMovingAverage: RMSE = 1.5635, Under Predictions Count = 2
Length of predictions: 12, Length of test_data: 12
Fold 1, ExponentialMovingAverage: RMSE = 1.5718, Under Predictions Count = 3


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 3.2162, Under Predictions Count = 6
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.96784D+00    |proj g|=  3.30379D-01

At iterate    5    f=  1.95653D+00    |proj g|=  1.28832D-02

At iterate   10    f=  1.95640D+00    |proj g|=  1.59563D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     14      1     0     0   4.547D-07   1.956D+00
  F =   1.9564019027801327     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL          

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 3.3033, Under Predictions Count = 6




Fold 2, RandomForest: RMSE = 3.4259, Under Predictions Count = 6


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 3.6637, Under Predictions Count = 6
Length of predictions: 14, Length of test_data: 14
Fold 2, SimpleMovingAverage: RMSE = 3.2950, Under Predictions Count = 6
Length of predictions: 14, Length of test_data: 14
Fold 2, ExponentialMovingAverage: RMSE = 3.2421, Under Predictions Count = 6


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 2.0712, Under Predictions Count = 2
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.11054D+00    |proj g|=  1.59543D-01

At iterate    5    f=  2.05797D+00    |proj g|=  4.95996D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      8     11      1     0     0   1.161D-06   2.058D+00
  F =   2.0579671032137545     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 2.0712, Under Predictions Count = 2




Fold 1, RandomForest: RMSE = 1.2536, Under Predictions Count = 5


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 1.5113, Under Predictions Count = 5
Length of predictions: 12, Length of test_data: 14
Fold 1, SimpleMovingAverage: RMSE = 1.5635, Under Predictions Count = 2
Length of predictions: 12, Length of test_data: 12
Fold 1, ExponentialMovingAverage: RMSE = 1.5718, Under Predictions Count = 3


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 3.2162, Under Predictions Count = 6
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  1.96784D+00    |proj g|=  3.30379D-01

At iterate    5    f=  1.95653D+00    |proj g|=  1.28832D-02

At iterate   10    f=  1.95640D+00    |proj g|=  1.59563D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     11     14      1     0     0   4.547D-07   1.956D+00
  F =   1.9564019027801327     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL          

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 3.3033, Under Predictions Count = 6




Fold 2, RandomForest: RMSE = 3.4459, Under Predictions Count = 6


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 3.6637, Under Predictions Count = 6
Length of predictions: 14, Length of test_data: 14
Fold 2, SimpleMovingAverage: RMSE = 3.2950, Under Predictions Count = 6
Length of predictions: 14, Length of test_data: 14
Fold 2, ExponentialMovingAverage: RMSE = 3.2421, Under Predictions Count = 6


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 3.9454, Under Predictions Count = 2
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.48240D+00    |proj g|=  1.13968D-01

At iterate    5    f=  2.39645D+00    |proj g|=  7.32722D-04

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      9     14      1     0     0   8.128D-07   2.396D+00
  F =   2.3964477681592879     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


  return fit_method(estimator, *args, **kwargs)


Fold 1, StateSpaceModel: RMSE = 3.9454, Under Predictions Count = 2




Fold 1, RandomForest: RMSE = 2.7495, Under Predictions Count = 2


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 1, XGBoost: RMSE = 2.1216, Under Predictions Count = 2
Length of predictions: 14, Length of test_data: 16
Fold 1, SimpleMovingAverage: RMSE = 2.1822, Under Predictions Count = 2
Length of predictions: 14, Length of test_data: 14
Fold 1, ExponentialMovingAverage: RMSE = 2.2925, Under Predictions Count = 2


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 2, ExponentialSmoothing: RMSE = 2.4690, Under Predictions Count = 12
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  2.36214D+00    |proj g|=  1.84818D-01

At iterate    5    f=  2.27475D+00    |proj g|=  7.30498D-03

At iterate   10    f=  2.27222D+00    |proj g|=  7.82647D-03

At iterate   15    f=  2.27214D+00    |proj g|=  5.94300D-05

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3     16     22      1     0     0   7.114D-06   2.272D+00
  F =   2.2721401674126822  

  return fit_method(estimator, *args, **kwargs)


Fold 2, StateSpaceModel: RMSE = 2.7036, Under Predictions Count = 16




Fold 2, RandomForest: RMSE = 2.5495, Under Predictions Count = 5


  self._init_dates(dates, freq)
  self._init_dates(dates, freq)
  return get_prediction_index(
  return get_prediction_index(


Fold 2, XGBoost: RMSE = 2.5494, Under Predictions Count = 5
Length of predictions: 16, Length of test_data: 16
Fold 2, SimpleMovingAverage: RMSE = 1.9221, Under Predictions Count = 3
Length of predictions: 16, Length of test_data: 16
Fold 2, ExponentialMovingAverage: RMSE = 1.9716, Under Predictions Count = 3


 This problem is unconstrained.
  return get_prediction_index(
  return get_prediction_index(


Fold 1, ExponentialSmoothing: RMSE = 7.9534, Under Predictions Count = 0
RUNNING THE L-BFGS-B CODE

           * * *

Machine precision = 2.220D-16
 N =            3     M =           10

At X0         0 variables are exactly at the bounds

At iterate    0    f=  3.07127D+00    |proj g|=  6.63788D-02

At iterate    5    f=  2.94782D+00    |proj g|=  1.60896D-03

           * * *


   N    Tit     Tnf  Tnint  Skip  Nact     Projg        F
    3      9     14      1     0     0   3.433D-06   2.948D+00
  F =   2.9477324945662673     

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL            


KeyboardInterrupt: 

### After running the experiments and training the models, we can inspect the saved models in the browser by running the `mlflow ui` command in the terminal.
<img src="EDA/mlflowgui.png"/>

### For each ingredient, a different model will perform best: it may be the moving average (with lag = 3, the method used previously) or another of the models tested. To estimate how much the model chosen in the experiment helped, we take the RMSE of the model with the lowest RMSE and the RMSE of the moving average. We also take the count of underestimated observations, that is, those where the predicted value was lower than the actual one. Finally, we compute the difference between the moving-average RMSE and the lowest RMSE, together with the standard deviation of these differences.
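
### As a minimal sketch of the two metrics and of the difference described above, the snippet below uses toy values. The helper names `rmse` and `under_predictions_count` and all numbers are hypothetical and serve only as illustration; they are not the notebook's data or functions.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def under_predictions_count(actual, predicted):
    """Number of observations where the forecast is below the actual value."""
    return sum(1 for a, p in zip(actual, predicted) if p < a)


# Hypothetical demand and two hypothetical forecasts (illustrative values only)
actual = [10, 12, 9, 11]
baseline_forecast = [8, 12, 11, 9]    # stands in for the moving average
candidate_forecast = [10, 11, 9, 12]  # stands in for the best alternative model

baseline_rmse = rmse(actual, baseline_forecast)
candidate_rmse = rmse(actual, candidate_forecast)

# A positive difference means the candidate improves on the baseline
rmse_difference = baseline_rmse - candidate_rmse

print(f"Baseline RMSE: {baseline_rmse:.4f}")
print(f"Candidate RMSE: {candidate_rmse:.4f}")
print(f"Baseline under-predictions: {under_predictions_count(actual, baseline_forecast)}")
print(f"Candidate under-predictions: {under_predictions_count(actual, candidate_forecast)}")
print(f"RMSE difference (baseline - candidate): {rmse_difference:.4f}")
```

### Under-predictions are counted separately because a forecast below the actual demand means running out of an ingredient, which the RMSE alone does not distinguish from over-prediction.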

In [4]:
import mlflow
import pandas as pd
from statistics import mean
from statistics import stdev

# Load all experiments
experiments = mlflow.search_experiments()

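# Per-experiment summaries; one entry is appended for each experiment
# that has both SimpleMovingAverage runs and runs of other models
rmse_values = []        # lowest fold-averaged RMSE among the non-SMA models
under_pred_values = []  # mean under-prediction count across the non-SMA models
sma_rmses = []          # mean RMSE of the SimpleMovingAverage runs
sma_under_pred = []     # mean under-prediction count of the SimpleMovingAverage runs
difs = []               # SMA RMSE minus the lowest non-SMA RMSE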

# Loop through each experiment and retrieve RMSE values
for experiment in experiments:
    
    # Get runs for the experiment
    runs = mlflow.search_runs(experiment_ids=[experiment.experiment_id])
    tmp_sma_rsme = pd.DataFrame()
    tmp_rmse_values = pd.DataFrame()
    # Extract RMSE values for each run
    i = 0
    for index, row in runs.iterrows():
        # Assuming RMSE is logged as a metric named 'rmse'
        rmse = row['metrics.rmse']
        under_pred = row['metrics.under_predictions_count']
        model_name = row['tags.mlflow.runName']  # Adjust this if your model name is stored differently
        
        # Check for SimpleMovingAverage models
        if model_name == 'SimpleMovingAverage_fold_1' or model_name == 'SimpleMovingAverage_fold_2':
            
            tmp_sma_rsme.loc[i, 'model'] = model_name
            tmp_sma_rsme.loc[i, 'rmse'] = rmse
            tmp_sma_rsme.loc[i, 'under_predictions_count'] = under_pred
            
        else:
            
            tmp_rmse_values.loc[i, 'model'] = model_name
            tmp_rmse_values.loc[i, 'rmse'] = rmse
            tmp_rmse_values.loc[i, 'under_predictions_count'] = under_pred
        
        i = i + 1

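    if tmp_rmse_values.shape[0] != 0:

        # Strip the fold suffix (e.g. '_fold_1') and average the metrics over folds per model family
        tmp_rmse_values['md'] = [name.split('_')[0] for name in tmp_rmse_values['model']]
        tmp_rmse_values = tmp_rmse_values.groupby(by='md').agg({'rmse':'mean','under_predictions_count':'mean'}).reset_index()
        
        # Only compare experiments that have both a baseline and at least one other model
        if tmp_sma_rsme.shape[0] != 0 and tmp_rmse_values.shape[0] != 0:

            # Baseline: moving-average metrics averaged over its folds
            sma_rmses.append( mean( tmp_sma_rsme['rmse'] ) )
            sma_under_pred.append( mean( tmp_sma_rsme['under_predictions_count'] ) )
            # Lowest fold-averaged RMSE among the non-SMA model families
            rmse_values.append( min(tmp_rmse_values['rmse'] ) )
            # Under-prediction count averaged over all non-SMA model families,
            # not only the family with the lowest RMSE
            under_pred_values.append( mean(tmp_rmse_values['under_predictions_count']) )
            # Positive difference means the best model beats the moving average
            difs.append( mean(tmp_sma_rsme['rmse']) - min(tmp_rmse_values['rmse']) ) 

    else:
        continue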

# Average RMSE and under-prediction count of the SimpleMovingAverage baseline across experiments
average_sma_rmse = mean(sma_rmses)
average_sma_under_pred = mean(sma_under_pred)

# Average, across experiments, of the lowest RMSE among the other models
min_rmse = mean(rmse_values)
avg_min_under_pred = mean(under_pred_values)
# Mean and standard deviation of the per-experiment RMSE differences
difference = mean(difs)
sd_diff = stdev(difs)

# Print results
print(f"Mean RMSE of the moving-average models: {round(average_sma_rmse)}")
print(f"Mean number of under-predictions for the moving average: {round(average_sma_under_pred)}")
print(f"Mean of the lowest RMSE among the other models: {round(min_rmse)}")
print(f"Mean number of under-predictions across the other models: {round(avg_min_under_pred)}")
print(f"Mean difference between the moving-average RMSE and the lowest RMSE: {round(difference)}")
print(f"Standard deviation of the differences between the moving-average RMSE and the lowest RMSE: {round(sd_diff)}")

Mean RMSE of the moving-average models: 16
Mean number of under-predictions for the moving average: 32
Mean of the lowest RMSE among the other models: 14
Mean number of under-predictions across the other models: 35
Mean difference between the moving-average RMSE and the lowest RMSE: 1
Standard deviation of the differences between the moving-average RMSE and the lowest RMSE: 1
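
### As a quick check of what these rounded averages imply, the sketch below computes the relative change in each metric. The four numbers are copied from the printed output above; since they are rounded, the percentages are approximate.

```python
# Rounded averages copied from the printed output above.
sma_rmse, best_rmse = 16, 14
sma_under, best_under = 32, 35

# Relative change of each metric with respect to the moving-average baseline.
rmse_reduction = (sma_rmse - best_rmse) / sma_rmse
under_increase = (best_under - sma_under) / sma_under

print(f"Relative RMSE reduction: {rmse_reduction:.1%}")
print(f"Relative increase in under-predictions: {under_increase:.1%}")
```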


### Although the ingredients differ in behaviour and demand, averaging the lowest RMSE and the moving-average RMSE (with lag = 3) across ingredients, together with the standard deviation of the differences (equal to 1), gives a sense of how the models in our experiment perform. According to the results, the models selected in the experiment improve on the moving average: on average the RMSE drops by 12.5% (from 16 to 14), which indicates that their forecasts err less.

### On the other hand, the models in our experiment tend to underestimate ingredient demand slightly more often, i.e. they predict a quantity lower than the actual value. The difference from the moving average on this metric is not large: an increase of about 9% (from 32 to 35). In a real setting, the actual cost of over- versus under-estimating would have to be assessed individually for each ingredient, and the model chosen so as to minimize that cost.
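
### One way to make that trade-off concrete is to score each model with an asymmetric cost instead of RMSE alone. The sketch below is only an illustration: the function name, the unit costs and the forecast values are hypothetical, and real costs would have to come from the business for each ingredient.

```python
import numpy as np


def asymmetric_cost(y_true, y_pred, under_cost, over_cost):
    """Total cost when shortages and surpluses are priced differently."""
    error = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    shortage = np.clip(-error, 0, None)  # units forecast below actual demand
    surplus = np.clip(error, 0, None)    # units forecast above actual demand
    return float(under_cost * shortage.sum() + over_cost * surplus.sum())


# Hypothetical actual demand and forecasts from two candidate models.
actual = [20, 25, 22, 30]
forecasts = {
    "moving_average": [21, 26, 23, 32],   # always over-forecasts, higher RMSE
    "candidate_model": [19, 24, 21, 29],  # always under-forecasts, lower RMSE
}

# Hypothetical unit costs: a missed sale costs more than a wasted unit.
costs = {
    name: asymmetric_cost(actual, pred, under_cost=3.0, over_cost=1.0)
    for name, pred in forecasts.items()
}
best = min(costs, key=costs.get)
print(costs, "->", best)
```

### In this toy example the candidate model has the lower RMSE, yet the moving average has the lower total cost because shortages are assumed to be three times as expensive as surpluses. With different unit costs the ranking could reverse.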

In [None]:
# delete all experiments in case they all need to be run again
""" # Delete an experiment by ID
experiments = mlflow.search_experiments()
for exp in experiments:
    if exp.experiment_id != '0':
        mlflow.delete_experiment(exp.experiment_id) """