--- 

## Bottom-up approach

---  
### Bottom-Up Approach for Time Series

The **bottom-up** approach to time series forecasting builds the global forecast from the finest level of granularity (for example, individual stores) and aggregates the results step by step until an overall view is obtained.

---



#### Steps
1. **Local forecasts**:
   - A time series model is fitted for each store individually (or for each cluster).
2. **Aggregation**:
   - The local forecasts are aggregated to obtain the global forecast.
3. **Comparison with the global forecast**:
   - The aggregated forecast is compared with a forecast produced directly at the global level.
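
The three steps above can be sketched on toy numbers. This is only an illustration under stated assumptions: the column names `date`, `cluster`, `sales` and `forecast`, the `direct_global` series, and the choice of a plain sum as the aggregation are hypothetical and not taken from the notebook's data.

```python
import numpy as np
import pandas as pd

# Hypothetical per-cluster actuals and forecasts (toy numbers, not the notebook's data).
dates = pd.date_range("2017-01-01", periods=3, freq="D")
cluster_forecasts = pd.DataFrame({
    "date": list(dates) * 2,
    "cluster": ["Cluster 0"] * 3 + ["Cluster 1"] * 3,
    "sales": [10.0, 12.0, 11.0, 20.0, 19.0, 21.0],
    "forecast": [9.0, 12.0, 12.0, 21.0, 18.0, 21.0],
})

# Step 2: aggregate the local forecasts (and the actuals) per date.
# Assumption: aggregation is a plain sum over clusters.
bottom_up = cluster_forecasts.groupby("date")[["sales", "forecast"]].sum()

# Step 3: compare with a forecast produced directly at the global level.
direct_global = pd.Series([31.0, 29.0, 33.0], index=dates)  # assumed direct forecast


def rmse(actual, predicted):
    """Root mean squared error between two aligned sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


rmse_bottom_up = rmse(bottom_up["sales"], bottom_up["forecast"])
rmse_direct = rmse(bottom_up["sales"], direct_global)
print(f"bottom-up RMSE: {rmse_bottom_up:.3f}, direct RMSE: {rmse_direct:.3f}")
```

Whichever aggregation is used, both forecasts should be scored against the same global actuals so that the two RMSE values are comparable.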

---





In [27]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from functions import *
import plotly.express as px

import warnings

# Ignore specific warning categories
warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

In [28]:
# --clusters--
nodes = ["Global Forecast", "Cluster 0", "Cluster 1", "Cluster 2", "Cluster 3", "Cluster 4",
         "{32, 42, 20, 21, 22, 52, 53, 29}", "{7, 8, 9, 11, 48, 49, 50, 51}",
         "{3, 44, 45, 46, 47}", "{2, 34, 4, 5, 6, 37, 38, 39, 40, 43, 17, 24, 27, 28, 31}",
         "{1, 33, 35, 36, 41, 10, 12, 13, 14, 15, 16, 18, 19, 54, 23, 25, 26, 30}"]

source = [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
target = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# --dendrogram-style diagram (drawn as a Sankey chart) to visualize the approach--
fig = go.Figure(go.Sankey(node=dict(label=nodes), link=dict(source=source, target=target, value=[1]*10)))
fig.update_layout(title_text="Hierarchical Dendrogram", font_size=10)
fig.show()

In [29]:
# --Tables--
train_clustered = pd.read_csv("all_clusters_time_series_train.csv")
test_clustered = pd.read_csv("all_clusters_time_series_test.csv")

---
## SARIMA

### Why choose `SARIMA`?

In our global forecasting benchmark, `SARIMA` ranks last with an RMSE of **0.14**, which leaves room for improvement.

#### Objective
Improve forecast accuracy with a **bottom-up** approach.


In [30]:
#--train--
# .copy() gives each cluster its own DataFrame, so adding columns later
# does not trigger pandas' SettingWithCopyWarning.
Cluster_0_train = train_clustered[train_clustered['cluster']=='Cluster 0'].copy()
Cluster_1_train = train_clustered[train_clustered['cluster']=='Cluster 1'].copy()
Cluster_2_train = train_clustered[train_clustered['cluster']=='Cluster 2'].copy()
Cluster_3_train = train_clustered[train_clustered['cluster']=='Cluster 3'].copy()
Cluster_4_train = train_clustered[train_clustered['cluster']=='Cluster 4'].copy()
#--test--
Cluster_0_test = test_clustered[test_clustered['cluster']=='Cluster 0'].copy()
Cluster_1_test = test_clustered[test_clustered['cluster']=='Cluster 1'].copy()
Cluster_2_test = test_clustered[test_clustered['cluster']=='Cluster 2'].copy()
Cluster_3_test = test_clustered[test_clustered['cluster']=='Cluster 3'].copy()
Cluster_4_test = test_clustered[test_clustered['cluster']=='Cluster 4'].copy()

#### Using `auto_arima` to select the models

We use `auto_arima` to find the best model for each cluster by minimizing the Akaike information criterion (`AIC`).
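
The AIC is defined as `2k - 2 ln(L)`, where `k` is the number of estimated parameters and `L` the maximized likelihood; lower is better. The selection step itself can be sketched as follows, using a handful of AIC values of the kind reported in a stepwise trace. The dictionary layout and the `aic` helper are illustrative assumptions, not the internals of `auto_arima` or of `find_best_auto_arima`.

```python
import math

# (order, seasonal_order) -> AIC, as reported in a stepwise search trace.
candidates = {
    ((2, 1, 2), (1, 0, 1, 7)): 22445.161,
    ((0, 1, 0), (0, 0, 0, 7)): 23234.764,
    ((1, 1, 0), (1, 0, 0, 7)): 22796.172,
    ((2, 1, 2), (0, 0, 1, 7)): math.inf,  # reported as inf: no usable fit
    ((2, 1, 1), (1, 0, 1, 7)): 22510.844,
}


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# Keep only candidates with a finite AIC, then take the smallest one.
finite = {spec: value for spec, value in candidates.items() if math.isfinite(value)}
best_spec = min(finite, key=finite.get)
print(best_spec, finite[best_spec])
```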

In [None]:
#--cluster0--
result_0 = find_best_auto_arima(Cluster_0_train)

#--cluster1--
result_1 = find_best_auto_arima(Cluster_1_train)

#--cluster2--
result_2 = find_best_auto_arima(Cluster_2_train)

#--cluster3--
result_3 = find_best_auto_arima(Cluster_3_train)

#--cluster4--
result_4 = find_best_auto_arima(Cluster_4_train)

Performing stepwise search to minimize aic
 ARIMA(2,1,2)(1,0,1)[7] intercept   : AIC=22445.161, Time=4.93 sec
 ARIMA(0,1,0)(0,0,0)[7] intercept   : AIC=23234.764, Time=0.03 sec
 ARIMA(1,1,0)(1,0,0)[7] intercept   : AIC=22796.172, Time=0.42 sec
 ARIMA(0,1,1)(0,0,1)[7] intercept   : AIC=22838.875, Time=0.28 sec
 ARIMA(0,1,0)(0,0,0)[7]             : AIC=23232.839, Time=0.02 sec
 ARIMA(2,1,2)(0,0,1)[7] intercept   : AIC=inf, Time=2.06 sec
 ARIMA(2,1,2)(1,0,0)[7] intercept   : AIC=inf, Time=1.73 sec
 ARIMA(2,1,2)(2,0,1)[7] intercept   : AIC=inf, Time=4.55 sec
 ARIMA(2,1,2)(1,0,2)[7] intercept   : AIC=inf, Time=6.39 sec
 ARIMA(2,1,2)(0,0,0)[7] intercept   : AIC=22885.748, Time=1.05 sec
 ARIMA(2,1,2)(0,0,2)[7] intercept   : AIC=inf, Time=5.38 sec
 ARIMA(2,1,2)(2,0,0)[7] intercept   : AIC=inf, Time=5.97 sec
 ARIMA(2,1,2)(2,0,2)[7] intercept   : AIC=inf, Time=5.84 sec
 ARIMA(1,1,2)(1,0,1)[7] intercept   : AIC=inf, Time=1.74 sec
 ARIMA(2,1,1)(1,0,1)[7] intercept   : AIC=22510.844, Time=2.58 sec


In [6]:
#print(result_0['summary'])
print(f"\nBest model 0: {result_0['best_params']}")

#print(result_1['summary'])
print(f"\nBest model 1: {result_1['best_params']}")

#print(result_2['summary'])
print(f"\nBest model 2: {result_2['best_params']}")

#print(result_3['summary'])
print(f"\nBest model 3: {result_3['best_params']}")

#print(result_4['summary'])
print(f"\nBest model 4: {result_4['best_params']}")


Best model 0: {'order': (5, 1, 2), 'seasonal_order': (2, 0, 2, 7), 'aic': 22278.036452004126}

Best model 1: {'order': (4, 1, 4), 'seasonal_order': (1, 0, 1, 7), 'aic': 25721.52285698665}

Best model 2: {'order': (5, 1, 3), 'seasonal_order': (2, 0, 1, 7), 'aic': 25895.83484835895}

Best model 3: {'order': (4, 1, 5), 'seasonal_order': (2, 0, 1, 7), 'aic': 25856.539746644863}

Best model 4: {'order': (1, 1, 2), 'seasonal_order': (1, 0, 1, 7), 'aic': 25254.246019963397}


In [None]:
# Orders reported by auto_arima above:
# (p, d, q) is the non-seasonal part, (P, D, Q, s) the seasonal part.
auto_arima_fits = [
    (Cluster_0_train, Cluster_0_test, (5, 1, 2), (2, 0, 2, 7)),
    (Cluster_1_train, Cluster_1_test, (4, 1, 4), (1, 0, 1, 7)),
    (Cluster_2_train, Cluster_2_test, (5, 1, 3), (2, 0, 1, 7)),
    (Cluster_3_train, Cluster_3_test, (4, 1, 5), (2, 0, 1, 7)),
    (Cluster_4_train, Cluster_4_test, (1, 1, 2), (1, 0, 1, 7)),
]

# Fit one SARIMA model per cluster and store the forecasts on each frame.
for train_df, test_df, (p, d, q), (P, D, Q, s) in auto_arima_fits:
    train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
        train_df, test_df, p, d, q, P, D, Q, s
    )
    train_df['Train Forecast'] = train_forecast_values
    test_df['Test Forecast'] = test_forecast_values




In [None]:
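Cluster_0_test['cluster'] = 'Cluster 0'
Cluster_1_test['cluster'] = 'Cluster 1'
Cluster_2_test['cluster'] = 'Cluster 2'
Cluster_3_test['cluster'] = 'Cluster 3'
Cluster_4_test['cluster'] = 'Cluster 4'
all_test_data = pd.concat([Cluster_0_test,Cluster_1_test, Cluster_2_test, Cluster_3_test, Cluster_4_test])

# Reshape to long format: combining a list of y columns with color='cluster'
# raises a ValueError in px.line.
plot_data = all_test_data.melt(id_vars=['date', 'cluster'], value_vars=['sales', 'Test Forecast'],
                               var_name='series', value_name='value')
fig = px.line(plot_data, x='date', y='value', color='cluster', line_dash='series',
              title="Test forecast for all clusters")
fig.show()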





### Finding the best model per cluster

We test every combination of seasonal and non-seasonal orders for a SARIMA model to find the one that minimizes the `RMSE`.

In [9]:
p_values = range(0, 4)
d_values = [1]
q_values = range(0, 4)
P_values = range(0, 2)
D_values = range(0, 2)
Q_values = range(0, 2)
s_values = [4,7,12]
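
The grid above yields 4 × 1 × 4 × 2 × 2 × 2 × 3 = 384 candidate models per cluster, which matches the 384 iterations reported by the progress bars. How `find_best_sarimax` enumerates them is not shown in this notebook; one plausible way, sketched here with `itertools.product`, is:

```python
from itertools import product

p_values = range(0, 4)
d_values = [1]
q_values = range(0, 4)
P_values = range(0, 2)
D_values = range(0, 2)
Q_values = range(0, 2)
s_values = [4, 7, 12]

# Every (p, d, q, P, D, Q, s) combination in the search grid.
grid = list(product(p_values, d_values, q_values, P_values, D_values, Q_values, s_values))
print(len(grid))
print(grid[0])
```

Because the cost grows multiplicatively with each range, widening any single range by one value adds a whole slice of models to fit.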



In [None]:
#--cluster0--
print('Cluster 0')
best_model_0 = find_best_sarimax(
    series_train=Cluster_0_train,
    series_test=Cluster_0_test,
    p_values=p_values,
    d_values=d_values,
    q_values=q_values,
    P_values=P_values,
    D_values=D_values,
    Q_values=Q_values,
    s_values=s_values
)

#--cluster1--
print('Cluster 1')
best_model_1 = find_best_sarimax(
    series_train=Cluster_1_train,
    series_test=Cluster_1_test,
    p_values=p_values,
    d_values=d_values,
    q_values=q_values,
    P_values=P_values,
    D_values=D_values,
    Q_values=Q_values,
    s_values=s_values
)


#--cluster2--
print('Cluster 2')
best_model_2 = find_best_sarimax(
    series_train=Cluster_2_train,
    series_test=Cluster_2_test,
    p_values=p_values,
    d_values=d_values,
    q_values=q_values,
    P_values=P_values,
    D_values=D_values,
    Q_values=Q_values,
    s_values=s_values
)


#--cluster3--
print('Cluster 3')
best_model_3 = find_best_sarimax(
    series_train=Cluster_3_train,
    series_test=Cluster_3_test,
    p_values=p_values,
    d_values=d_values,
    q_values=q_values,
    P_values=P_values,
    D_values=D_values,
    Q_values=Q_values,
    s_values=s_values
)


#--cluster4--
print('Cluster 4')
best_model_4 = find_best_sarimax(
    series_train=Cluster_4_train,
    series_test=Cluster_4_test,
    p_values=p_values,
    d_values=d_values,
    q_values=q_values,
    P_values=P_values,
    D_values=D_values,
    Q_values=Q_values,
    s_values=s_values
)


Cluster 0


Testing SARIMAX models:  65%|██████▌   | 251/384 [04:27<05:29,  2.48s/it]

Error with parameters (2, 1, 2, 0, 1, 1, 7): LU decomposition error.


Testing SARIMAX models: 100%|██████████| 384/384 [08:29<00:00,  1.33s/it]


Cluster 1


Testing SARIMAX models: 100%|██████████| 384/384 [15:18<00:00,  2.39s/it]


Cluster 2


Testing SARIMAX models: 100%|██████████| 384/384 [11:21<00:00,  1.78s/it] 


Cluster 3


Testing SARIMAX models: 100%|██████████| 384/384 [06:54<00:00,  1.08s/it]


Cluster 4


Testing SARIMAX models: 100%|██████████| 384/384 [23:43<00:00,  3.71s/it]   


In [12]:
#--0--
min_rmse_index_0 = best_model_0[0].index(min(best_model_0[0]))
best_params_0 = best_model_0[1][min_rmse_index_0]

#--1--
min_rmse_index_1 = best_model_1[0].index(min(best_model_1[0]))
best_params_1 = best_model_1[1][min_rmse_index_1]

#--2--
min_rmse_index_2 = best_model_2[0].index(min(best_model_2[0]))
best_params_2 = best_model_2[1][min_rmse_index_2]

#--3--
min_rmse_index_3 = best_model_3[0].index(min(best_model_3[0]))
best_params_3 = best_model_3[1][min_rmse_index_3]

#--4--
min_rmse_index_4 = best_model_4[0].index(min(best_model_4[0]))
best_params_4 = best_model_4[1][min_rmse_index_4]
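
The cell above relies on each result being a pair `(rmse_list, params_list)` with matching positions. A slightly more defensive version of the same selection is sketched below on toy values. The numbers are invented, and recording a failed fit as NaN is an assumption, since the helper's behavior on errors is not shown.

```python
import math

# Toy stand-in for the (rmse_list, params_list) pair returned for one cluster.
rmse_list = [0.21, float("nan"), 0.17, 0.19]
params_list = [
    {"order": (0, 1, 0), "seasonal_order": (0, 0, 0, 7)},
    {"order": (2, 1, 2), "seasonal_order": (0, 1, 1, 7)},
    {"order": (1, 1, 1), "seasonal_order": (1, 0, 1, 7)},
    {"order": (3, 1, 2), "seasonal_order": (1, 1, 0, 12)},
]

# Pair each score with its parameters and drop non-finite scores,
# because min() is unreliable when NaN values are present.
scored = [
    (score, params)
    for score, params in zip(rmse_list, params_list)
    if math.isfinite(score)
]
best_rmse, best_params = min(scored, key=lambda pair: pair[0])

p, d, q = best_params["order"]
P, D, Q, s = best_params["seasonal_order"]
print(best_rmse, (p, d, q), (P, D, Q, s))
```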

In [None]:
#--cluster0--
p, d, q = best_params_0['order']
P, D, Q, s = best_params_0['seasonal_order']
train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
    Cluster_0_train, Cluster_0_test, p, d, q, P, D, Q, s
)
Cluster_0_train['Train_Forecast'] = train_forecast_values
Cluster_0_test['Test_Forecast'] = test_forecast_values

#--cluster1--
p, d, q = best_params_1['order']
P, D, Q, s = best_params_1['seasonal_order']
train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
    Cluster_1_train, Cluster_1_test, p, d, q, P, D, Q, s
)
Cluster_1_train['Train_Forecast'] = train_forecast_values
Cluster_1_test['Test_Forecast'] = test_forecast_values

#--cluster2--
p, d, q = best_params_2['order']
P, D, Q, s = best_params_2['seasonal_order']
train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
    Cluster_2_train, Cluster_2_test, p, d, q, P, D, Q, s
)
Cluster_2_train['Train_Forecast'] = train_forecast_values
Cluster_2_test['Test_Forecast'] = test_forecast_values

#--cluster3--
p, d, q = best_params_3['order']
P, D, Q, s = best_params_3['seasonal_order']
train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
    Cluster_3_train, Cluster_3_test, p, d, q, P, D, Q, s
)
Cluster_3_train['Train_Forecast'] = train_forecast_values
Cluster_3_test['Test_Forecast'] = test_forecast_values

#--cluster4--
p, d, q = best_params_4['order']
P, D, Q, s = best_params_4['seasonal_order']
train_forecast_values, test_forecast_values = fit_and_forecast_sarimax(
    Cluster_4_train, Cluster_4_test, p, d, q, P, D, Q, s
)
Cluster_4_train['Train_Forecast'] = train_forecast_values
Cluster_4_test['Test_Forecast'] = test_forecast_values




In [None]:
Cluster_0_test['cluster'] = 'Cluster 0'
Cluster_1_test['cluster'] = 'Cluster 1'
Cluster_2_test['cluster'] = 'Cluster 2'
Cluster_3_test['cluster'] = 'Cluster 3'
Cluster_4_test['cluster'] = 'Cluster 4'
all_test_data = pd.concat([Cluster_0_test,Cluster_1_test, Cluster_2_test, Cluster_3_test, Cluster_4_test])

# Reshape to long format: combining a list of y columns with color='cluster'
# raises "ValueError: All arguments should have the same length" in px.line.
plot_data = all_test_data.melt(id_vars=['date', 'cluster'], value_vars=['sales', 'Test_Forecast'],
                               var_name='series', value_name='value')
fig = px.line(plot_data, x='date', y='value', color='cluster', line_dash='series',
              title="Test forecast for all clusters")
fig.show()

Example for cluster 3:

In [None]:
fig = px.line(Cluster_3_test, x='date', y=['sales', 'Test_Forecast','Test Forecast'],title="Test predictions")
fig.show()

> SARIMA: the model struggles to capture the sales pattern, whether its parameters are chosen by minimizing the AIC or the RMSE.

---

## Prophet Online

In [None]:
forecast_columns = ['date', 'sales']  # renamed from `vars`, which shadows a Python builtin
chunk_size = 7  # Weekly forecasts
window_size = 365  # Sliding-window size (1 year)

testing_0 = rolling_prophet_forecast(Cluster_0_train, Cluster_0_test, forecast_columns, chunk_size, window_size)

testing_1 = rolling_prophet_forecast(Cluster_1_train, Cluster_1_test, forecast_columns, chunk_size, window_size)

testing_2 = rolling_prophet_forecast(Cluster_2_train, Cluster_2_test, forecast_columns, chunk_size, window_size)

testing_3 = rolling_prophet_forecast(Cluster_3_train, Cluster_3_test, forecast_columns, chunk_size, window_size)

testing_4 = rolling_prophet_forecast(Cluster_4_train, Cluster_4_test, forecast_columns, chunk_size, window_size)
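
`rolling_prophet_forecast` comes from the local `functions` module and is not shown here. One plausible reading of `chunk_size` and `window_size` is sketched below: the test set is forecast one chunk at a time, and before each chunk the model is refitted on the most recent `window_size` observations. To keep the sketch dependency-free, Prophet is replaced by a placeholder that repeats the window mean; `rolling_forecast` and `mean_forecaster` are hypothetical names, and the real helper may differ.

```python
def rolling_forecast(train, test, chunk_size, window_size, fit_predict):
    """Forecast `test` chunk by chunk, refitting on the last `window_size` points."""
    history = list(train)
    forecasts = []
    for start in range(0, len(test), chunk_size):
        chunk = test[start:start + chunk_size]
        window = history[-window_size:]
        forecasts.append(fit_predict(window, len(chunk)))
        history.extend(chunk)  # reveal the actuals before the next refit
    return forecasts


def mean_forecaster(window, horizon):
    """Placeholder model (NOT Prophet): repeat the mean of the window."""
    mean = sum(window) / len(window)
    return [mean] * horizon


train = [float(v) for v in range(1, 11)]  # 10 toy observations
test = [11.0, 12.0, 13.0, 14.0, 15.0]
chunks = rolling_forecast(train, test, chunk_size=2, window_size=4,
                          fit_predict=mean_forecaster)
print(chunks)
```

The result is one list of forecasts per chunk, so the chunks have to be flattened before they can be stored as a single column.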


In [17]:
# Flatten the per-chunk forecasts (dropping the final value) and store them on each test frame.
Cluster_0_test['test_Forecast_prophet_online'] = np.concatenate(testing_0)[:-1]
Cluster_1_test['test_Forecast_prophet_online'] = np.concatenate(testing_1)[:-1]
Cluster_2_test['test_Forecast_prophet_online'] = np.concatenate(testing_2)[:-1]
Cluster_3_test['test_Forecast_prophet_online'] = np.concatenate(testing_3)[:-1]
Cluster_4_test['test_Forecast_prophet_online'] = np.concatenate(testing_4)[:-1]




In [None]:
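Cluster_0_test['cluster'] = 'Cluster 0'
Cluster_1_test['cluster'] = 'Cluster 1'
Cluster_2_test['cluster'] = 'Cluster 2'
Cluster_3_test['cluster'] = 'Cluster 3'
Cluster_4_test['cluster'] = 'Cluster 4'
all_test_data = pd.concat([Cluster_0_test,Cluster_1_test, Cluster_2_test, Cluster_3_test, Cluster_4_test])

# Reshape to long format: combining a list of y columns with color='cluster'
# raises a ValueError in px.line.
plot_data = all_test_data.melt(id_vars=['date', 'cluster'], value_vars=['sales', 'test_Forecast_prophet_online'],
                               var_name='series', value_name='value')
fig = px.line(plot_data, x='date', y='value', color='cluster', line_dash='series',
              title="Online Prophet test forecast for all clusters")
fig.show()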




In [19]:
#Cluster_0_test.to_csv('Cluster_0_test.csv', index=False)
#Cluster_1_test.to_csv('Cluster_1_test.csv', index=False)
#Cluster_2_test.to_csv('Cluster_2_test.csv', index=False)
#Cluster_3_test.to_csv('Cluster_3_test.csv', index=False)
#Cluster_4_test.to_csv('Cluster_4_test.csv', index=False)