<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Задача" data-toc-modified-id="Задача-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Задача</a></span></li><li><span><a href="#Загрузки" data-toc-modified-id="Загрузки-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Загрузки</a></span></li><li><span><a href="#EDA" data-toc-modified-id="EDA-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>EDA</a></span></li><li><span><a href="#Убираем-дырки" data-toc-modified-id="Убираем-дырки-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Убираем дырки</a></span></li><li><span><a href="#Строим-графики" data-toc-modified-id="Строим-графики-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Строим графики</a></span></li></ul></div>

# Предварительный анализ задачи торговой стратегии

<div class="alert alert-info">
<font size="4", color = "black"><b>✍ Вопрос</b></font>
    <br /> 
    <font size="3", color = "black">
<br /> Добрый день. Немного запутался в задаче и в разных результатах. Буду благодарен за помощь.


А еще будет отлично, если подскатеже по структуре и в целом, как это должно выглядеть в идеале.

## Задача

1. Загрузить данные о котировках ценных бумаг из списка SnP500 и котировки криптовалют (BTC, ETH, SOL, XRP).

2. Подготовьте автоматическое отображение графиков текущей ситуации.

3. Проверьть пропуски и ошибки. 

4. Проанализировать выбросы. Определить, что это: выбросы или реальные данные, с которыми предстоит работать.

## Загрузки

In [1]:
'''Системные'''
import os
from datetime import datetime, timedelta
from tqdm import tqdm 

'''База'''
import talib
import yfinance as yf
import pandas as pd
import numpy as np


'''Графики'''
from plotly.subplots import make_subplots
import plotly.express as px
import plotly.graph_objects as go
import dash
from dash import dcc, html
import plotly.graph_objects as go

'''Обучение'''
import lightgbm as lgb
from lightgbm import early_stopping
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report, accuracy_score
from catboost import CatBoostClassifier
import optuna


# Настройка уровня логирования Optuna
import logging
optuna.logging.set_verbosity(optuna.logging.ERROR)



  from .autonotebook import tqdm as notebook_tqdm


In [2]:
tickers_crypt = ['BTC-USD', 'ETH-USD', 'SOL-USD', 'XRP-USD']
output_crypt_file = 'crypto_data.csv'
output_file = 'snp500_data.csv'
all_data = []
all_data_crypt = []
end_date = datetime.today().strftime('%Y-%m-%d')
start_date = (datetime.today() - timedelta(days=1 * 365)).strftime('%Y-%m-%d')

In [3]:
if os.path.exists(output_file):
    data = pd.read_csv(output_file, index_col=0)
    print("Данные успешно загружены:")
    display(data.head()) 

else:

    url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"

    table = pd.read_html(url)[0]
    tickers = table['Symbol'].tolist()[:2]
    print(f"Всего {len(tickers)} тикеров.")

    for ticker in tqdm(tickers, desc="Загрузка данных", unit="тикер"):
        try:
            data_temp = yf.download(ticker, start=start_date, end=end_date, progress=False)
            if isinstance(data_temp.columns, pd.MultiIndex):
                data_temp.columns = data_temp.columns.droplevel([1])
            data_temp = data_temp.reset_index(drop=False)
#             if 'Price' in data_temp.index.names:
#                 data_temp.index = data_temp.index.droplevel('Price')
            data_temp.index.name = 'Price'
#             data_temp.rename_axis(None, inplace=True)
            data_temp.columns.name = None
            data_temp['Ticker'] = ticker
            data_temp = data_temp.reset_index(drop=True)
#             data_temp = data_temp.set_index('Date')
#             display(data_temp)
            all_data.append(data_temp)
        except Exception as e:
            print(f"Ошибка в тикере {ticker}: {e}")

    if all_data:
        data = pd.concat(all_data, axis=0, ignore_index=True)
        data.to_csv(output_file)
        print(f"Данные сохранены в {output_file}")
    else:
        print("Нет данных.")
        

Данные успешно загружены:


Unnamed: 0,Date,Close,High,Low,Open,Volume,Ticker
0,2022-12-27,92.099701,92.567017,91.287646,92.038416,2166195,MMM
1,2022-12-28,90.621147,92.697265,90.590508,92.199303,2345356,MMM
2,2022-12-29,92.367836,92.590005,90.782026,91.06548,2464717,MMM
3,2022-12-30,91.869881,91.954157,90.789694,91.663041,2506816,MMM
4,2023-01-03,93.82341,93.953648,92.214616,93.095624,3124909,MMM


In [4]:
if os.path.exists(output_crypt_file):
    data_crypt = pd.read_csv(output_crypt_file, index_col=0)
    print("Данные успешно загружены:")
    print(data_crypt.head())
else:
#     all_data_crypt = [] 
    for ticker in tqdm(tickers_crypt, desc="Загрузка данных", unit="тикер"):
        try:
            data_temp_crypt = yf.download(ticker, start=start_date, end=end_date, progress=False)
            
            if isinstance(data_temp_crypt.columns, pd.MultiIndex):
                data_temp_crypt.columns = data_temp_crypt.columns.droplevel([1])
            
            data_temp_crypt = data_temp_crypt.reset_index(drop=False)
            data_temp_crypt.columns.name = None
            data_temp_crypt['Ticker'] = ticker
            
#             print(ticker, ' ', len(data_temp_crypt))
            if not data_temp_crypt.empty:
                all_data_crypt.append(data_temp_crypt)
        except Exception as e:
            print(f"Ошибка в тикере {ticker}: {e}")

    if all_data_crypt: 
        data_crypt = pd.concat(all_data_crypt, axis=0, ignore_index=True)
        print("Данные объединены.")
    else:
        print("Нет данных.")

Загрузка данных: 100%|██████████| 4/4 [00:01<00:00,  2.36тикер/s]

Данные объединены.





In [5]:
df = pd.concat([data, data_crypt], axis=0, ignore_index=True)
df

Unnamed: 0,Date,Close,High,Low,Open,Volume,Ticker
0,2022-12-27,92.099701,92.567017,91.287646,92.038416,2166195,MMM
1,2022-12-28,90.621147,92.697265,90.590508,92.199303,2345356,MMM
2,2022-12-29,92.367836,92.590005,90.782026,91.065480,2464717,MMM
3,2022-12-30,91.869881,91.954157,90.789694,91.663041,2506816,MMM
4,2023-01-03,93.823410,93.953648,92.214616,93.095624,3124909,MMM
...,...,...,...,...,...,...,...
2459,2024-12-28 00:00:00,2.180824,2.199492,2.135006,2.141667,2759395789,XRP-USD
2460,2024-12-29 00:00:00,2.093180,2.192813,2.071951,2.180833,3053146362,XRP-USD
2461,2024-12-30 00:00:00,2.057571,2.143189,2.000231,2.093189,6671570513,XRP-USD
2462,2024-12-31 00:00:00,2.080128,2.140974,2.014132,2.057545,4725443244,XRP-USD


## EDA

In [6]:
def viewing_statistics(df_list):
    print('Посмотрим на данные:')
    for i in table:
        if len(i) >= 3:
            display(i.sample(3))
        else:
            display(i)
        display(i.info())
        display(i.columns)
        print('\n')
table = [df]   
viewing_statistics(table)

Посмотрим на данные:


Unnamed: 0,Date,Close,High,Low,Open,Volume,Ticker
295,2024-03-01,75.867287,76.015945,75.066157,76.015945,4064128,MMM
2033,2024-10-28 00:00:00,178.104904,179.407974,172.872253,176.55368,3623055701,SOL-USD
982,2024-11-22,73.440002,73.800003,73.0,73.099998,716600,AOS


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2464 entries, 0 to 2463
Data columns (total 7 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Date    2464 non-null   object 
 1   Close   2464 non-null   float64
 2   High    2464 non-null   float64
 3   Low     2464 non-null   float64
 4   Open    2464 non-null   float64
 5   Volume  2464 non-null   int64  
 6   Ticker  2464 non-null   object 
dtypes: float64(4), int64(1), object(2)
memory usage: 134.9+ KB


None

Index(['Date', 'Close', 'High', 'Low', 'Open', 'Volume', 'Ticker'], dtype='object')





In [7]:
print('Проверим пропуски:')
for i in table:
    display(i.isnull().mean().sort_values())

Проверим пропуски:


Date      0.0
Close     0.0
High      0.0
Low       0.0
Open      0.0
Volume    0.0
Ticker    0.0
dtype: float64

## Заполняем пропуски
пока нет, что бы не забыть оставлю.

In [8]:
df = df.fillna(method='ffill')
df = pd.DataFrame(df.replace(to_replace=0, method='ffill'))

  df = df.fillna(method='ffill')
  df = pd.DataFrame(df.replace(to_replace=0, method='ffill'))


In [9]:
# pip install nbformat

## Строим графики

In [10]:
fig = go.Figure()

for ticker in df['Ticker'].unique():
    _ = df[df['Ticker'] == ticker].copy()
    
    '''Убираю ошибку с лишней линией, проверю дубликаты и пропуски после'''
    _ = _.drop_duplicates(subset=['Date']).sort_values(by='Date')
    if _['Close'].isnull().any():
        _['Close'] = _['Close'].fillna(method='ffill')  # Заполнение пропусков предыдущим значением
    

    _['Growth'] = (_['Close'] / _['Close'].iloc[0]) * 100  # Нормализация, первое значение = 100%
    fig.add_trace(go.Scatter(
        x=_['Date'],
        y=_['Growth'],
        mode='lines',
        name=ticker,
    ))

fig.update_layout(
    template="plotly_dark",
    title="Темпы прироста всех тикеров (нормализация к 100%)",
    title_x=0.5,
    xaxis_title="Дата",
    yaxis_title="Темп прироста (%)",
    xaxis=dict(showgrid=False),
    yaxis=dict(showgrid=False),
    font=dict(size=14),
)

fig.show()


In [11]:
for ticker in df['Ticker'].unique():
    _ = df[df['Ticker'] == ticker]
    
    '''Убираю ошибку с лишней линией, проверю дубликаты и пропуски после'''
    _ = _.drop_duplicates(subset=['Date']).sort_values(by='Date')
    if _['Close'].isnull().any():
        _['Close'] = _['Close'].fillna(method='ffill')  #'''Заполнение пропусков предыдущим значением'''
    
    '''Вычисление максимума и минимума'''
    max_row = _.loc[_['Close'].idxmax()]
    min_row = _.loc[_['Close'].idxmin()]
    

    fig = px.line(
        _,
        x='Date',
        y='Close',
        title=f'Временной ряд для {ticker}',
        labels={'Close': 'Цена закрытия', 'Date': 'Дата'},
    )
    
    fig.add_annotation(
        x=max_row['Date'],
        y=max_row['Close'],
        text=f"Макс: {max_row['Close']:.2f}",
        showarrow=True,
        arrowhead=2,
        ax=20,
        ay=-30,
        bgcolor="green",
        font=dict(color="white"),
    )
    fig.add_annotation(
        x=min_row['Date'],
        y=min_row['Close'],
        text=f"Мин: {min_row['Close']:.2f}",
        showarrow=True,
        arrowhead=2,
        ax=20,
        ay=30,
        bgcolor="red",
        font=dict(color="white"),
    )
    

    fig.update_layout(
        template="plotly_dark",
        title=dict(x=0.5),
        xaxis=dict(showgrid=False),
        yaxis=dict(showgrid=False),
        font=dict(size=14),
    )

    fig.show()

## Анализ торговых стратегий

In [None]:
df['Ticker'].unique()


### Простые стратегии

In [13]:
tema_df = df[df['Ticker'] == df['Ticker'].unique().tolist()[3]]
tema_df['Date'] = pd.to_datetime(tema_df['Date'])
tema_df.set_index('Date', inplace=True)
period =7

'''EMA: Простая EMA от цены'''
tema_df['EMA'] = talib.EMA(tema_df['Close'], timeperiod=period)

'''Удаление NaN значений, что бы не ломалось, потом нужно будет сделать заполнение .fillna(method='ffill')'''
tema_df = tema_df.dropna()

'''Инициализация сигналов'''
tema_df['Signal'] = 0

'''Логика для входа в короткую позицию (Short)'''
tema_df.loc[tema_df['Close'] < tema_df['EMA'], 'Signal'] = -1

'''Логика для входа в длинную позицию (Long)'''
tema_df.loc[tema_df['Close'] > tema_df['EMA'], 'Signal'] = 1

buy_signals = tema_df[tema_df['Signal'] == 1]
sell_signals = tema_df[tema_df['Signal'] == -1]

'''Визуализация сигналов'''
fig = go.Figure()
fig.add_trace(go.Scatter(x=tema_df.index, y=tema_df['Close'], mode='lines', name='Close Price'))
fig.add_trace(go.Scatter(x=tema_df.index, y=tema_df['EMA'], mode='lines', name='EMA', line=dict(color='blue')))
fig.add_trace(go.Scatter(x=buy_signals.index, y=buy_signals['Close'], mode='markers', marker=dict(color='green', size=10), name='Buy Signal'))
fig.add_trace(go.Scatter(x=sell_signals.index, y=sell_signals['Close'], mode='markers', marker=dict(color='red', size=10), name='Sell Signal'))
fig.update_layout(title='Simple EMA Strategy', xaxis_title='Date', yaxis_title='Price')
fig.show()


# Константы комиссий
commission_buy = 0.0003
commission_sell = 0.0002

# Подсчёт прибыли по сделкам с учётом комиссий и накоплением дохода
def calculate_cumulative_pnl_with_commissions(dataframe, commission_buy, commission_sell):
    pnl_list = []
    cumulative_pnl = 0
    open_position = None  # Храним цену входа для текущей позиции
    position_type = None  # Тип текущей позиции: 'long' или 'short'

    for index, row in dataframe.iterrows():
        if row['Signal'] == 1:  # Сигнал для длинной позиции
            if position_type == 'short':  # Закрытие короткой позиции
                cumulative_pnl += open_position - row['Close'] - row['Close'] * commission_buy
                pnl_list.append((index, cumulative_pnl))
                open_position = None
                position_type = None
            elif open_position is None:  # Открытие длинной позиции
                open_position = row['Close']
                position_type = 'long'
                cumulative_pnl -= open_position * commission_buy

        elif row['Signal'] == -1:  # Сигнал для короткой позиции
            if position_type == 'long':  # Закрытие длинной позиции
                cumulative_pnl += row['Close'] - open_position - row['Close'] * commission_sell
                pnl_list.append((index, cumulative_pnl))
                open_position = None
                position_type = None
            elif open_position is None:  # Открытие короткой позиции
                open_position = row['Close']
                position_type = 'short'
                cumulative_pnl -= open_position * commission_sell

    # Если осталась открытая позиция, учитываем её закрытие на последней цене
    if open_position is not None:
        if position_type == 'long':
            cumulative_pnl += dataframe.iloc[-1]['Close'] - open_position - dataframe.iloc[-1]['Close'] * commission_sell
        elif position_type == 'short':
            cumulative_pnl += open_position - dataframe.iloc[-1]['Close'] - dataframe.iloc[-1]['Close'] * commission_buy
        pnl_list.append((dataframe.index[-1], cumulative_pnl))

    return pd.DataFrame(pnl_list, columns=['Date', 'Cumulative PnL']).set_index('Date')

# Подсчёт накопительного дохода с учётом комиссий
cumulative_pnl_df = calculate_cumulative_pnl_with_commissions(tema_df, commission_buy, commission_sell)

# График дохода накопительного итога
fig = go.Figure()
fig.add_trace(go.Scatter(x=cumulative_pnl_df.index, y=cumulative_pnl_df['Cumulative PnL'],
                         mode='lines+markers', name='Cumulative PnL', line=dict(color='green')))
fig.update_layout(title='Накопительный PnL (с учетом комиссии)', xaxis_title='Date', yaxis_title='Cumulative PnL')
fig.show()



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



<div class="alert alert-info">
<font size="4", color = "black"><b>✍ Вопрос</b></font>
    <br /> 
    <font size="3", color = "black">
<br /> Добавил данные из теории, что бы на них опереться, по итогу я их не использовал, но пока решил оставить в коде.

### RSI и MACD и стратегии на основе LGBM

In [14]:
tema_df['RSI'] = talib.RSI(tema_df['Close'], timeperiod=9)
tema_df['MACD'], tema_df['MACD_signal'], _ = talib.MACD(tema_df['Close'], fastperiod=5, slowperiod=13, signalperiod=3)

# Удаляем NaN значения
tema_df = tema_df.dropna()

# Добавляем целевую переменную
tema_df['Target'] = np.where(tema_df['Close'].shift(-1) > tema_df['Close'], 1, 0)

# Разделение данных на обучающую, тестовую и валидационную выборки
train_ratio = 0.7
val_ratio = 0.2
test_ratio = 0.1


n = len(tema_df)
train_end = int(n * train_ratio)
val_end = train_end + int(n * val_ratio)

# train_data = tema_df.iloc[:train_end]
# val_data = tema_df.iloc[train_end:val_end]
# test_data = tema_df.iloc[val_end:]
# Создаем срезы
train_data = tema_df.iloc[:train_end].copy() 
val_data = tema_df.iloc[train_end:val_end].copy() 
test_data = tema_df.iloc[val_end:].copy()  


# Выделение признаков и целевой переменной
features = ['RSI', 'MACD', 'MACD_signal']
X_train, y_train = train_data[features], train_data['Target']
X_val, y_val = val_data[features], val_data['Target']
X_test, y_test = test_data[features], test_data['Target']


'''Оптимизация гиперпараметров через Optuna'''
def objective(trial):
    params = {
        'num_leaves': trial.suggest_int('num_leaves', 20, 100),
        'max_depth': trial.suggest_int('max_depth', -1, 20),
        'n_estimators': trial.suggest_int('n_estimators', 50, 200),
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),  
        'subsample': trial.suggest_float('subsample', 0.5, 1.0),
        'colsample_bytree': trial.suggest_float('colsample_bytree', 0.5, 1.0),
        'random_state': 42,
    }

    model = lgb.LGBMClassifier(**params)
    model.fit(
        X_train, 
        y_train, 
        eval_set=[(X_val, y_val)], 
        eval_metric='logloss', 
        callbacks=[lgb.early_stopping(stopping_rounds=5)]
    )
    y_pred = model.predict(X_val)
    return -accuracy_score(y_val, y_pred)

# Запуск Optuna для поиска лучших гиперпараметров
study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=10)

# Обучение модели с лучшими параметрами
best_params = study.best_params
best_model = lgb.LGBMClassifier(**best_params)
best_model.fit(X_train, y_train)

# Тестирование модели
y_pred_val = best_model.predict(X_val)
print("Validation Accuracy:", accuracy_score(y_val, y_pred_val))
print(classification_report(y_val, y_pred_val))

y_pred_test = best_model.predict(X_test)
print("Test Accuracy:", accuracy_score(y_test, y_pred_test))
print(classification_report(y_test, y_pred_test))

# Визуализация стратегии
val_data['Prediction'] = y_pred_val
val_data['Strategy_Returns'] = np.where(val_data['Prediction'] == 1, 
                                        val_data['Close'].pct_change(), 0).cumsum()

val_data['Market_Returns'] = val_data['Close'].pct_change().cumsum()

# График Plotly
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=val_data.index, 
    y=val_data['Strategy_Returns'], 
    mode='lines', 
    name='Strategy Returns'
))
fig.add_trace(go.Scatter(
    x=val_data.index, 
    y=val_data['Market_Returns'], 
    mode='lines', 
    name='Market Returns'
))
fig.update_layout(
    title="Strategy Returns vs Market Returns",
    xaxis_title="Date",
    yaxis_title="Cumulative Returns"
)
fig.show()



final_market_returns = val_data['Market_Returns'].iloc[-1]
final_strategy_returns = val_data['Strategy_Returns'].iloc[-1]

print(f"Итоговое изменение Market Returns: {final_market_returns:.2%}")
print(f"Итоговое изменение Strategy Returns: {final_strategy_returns:.2%}")
print(best_params)

[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing row-wise multi-threading, the overhead of testing was 0.000035 seconds.
You can set `force_row_wise=true` to remove the overhead.
And if memory is not enough, you can set `force_col_wise=true`.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.690821
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000023 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set

Итоговое изменение Market Returns: 39.56%
Итоговое изменение Strategy Returns: 31.88%
{'num_leaves': 89, 'max_depth': 8, 'n_estimators': 100, 'learning_rate': 0.1298888754873815, 'subsample': 0.5157933534261312, 'colsample_bytree': 0.6909374101540773}


In [15]:
tema_df.head(2)

Unnamed: 0_level_0,Close,High,Low,Open,Volume,Ticker,EMA,Signal,RSI,MACD,MACD_signal,Target
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1
2024-01-23,2240.686035,2348.03125,2167.282471,2310.95166,16182147521,ETH-USD,2389.853195,-1,39.268016,-83.604757,-56.119853,0
2024-01-24,2233.561768,2261.384521,2197.656738,2241.749756,10134722960,ETH-USD,2350.780338,-1,38.797516,-96.388546,-76.254199,0


### Оптимизация через скользящее прогнозирование

In [16]:
'''Расчет доходности стратегии'''
def calculate_returns(data):
    returns = []
    for i in range(1, len(data)):
        if data['Prediction'].iloc[i - 1] == 1:  # Покупка
            trade_return = data['Close'].iloc[i] / data['Close'].iloc[i - 1] - 1
            trade_return -= commission_buy + commission_sell
            returns.append(trade_return)
        else:
            returns.append(0)
    return returns



In [17]:
def sliding_window_backtest(data, features, period_train=90, period_val=10, period_test=30):
    start = 0
    strategy_returns = []
    market_returns = []

    while start + period_train + period_val + period_test <= len(data):
        # Разделение данных на обучающую, валидационную и тестовую выборки
        train_data = data.iloc[start:start + period_train]
        val_data = data.iloc[start + period_train:start + period_train + period_val]
        test_data = data.iloc[start + period_train + period_val:start + period_train + period_val + period_test]

        X_train, y_train = train_data[features], train_data['Target']
        X_val, y_val = val_data[features], val_data['Target']
        X_test, y_test = test_data[features], test_data['Target']

        # Оптимизация гиперпараметров через Optuna
        study = optuna.create_study(direction='maximize')
        study.optimize(objective, n_trials=50)

        # Получение лучших параметров
        best_params = study.best_params

        # Обучение модели с лучшими параметрами
        best_model = lgb.LGBMClassifier(**best_params)
        best_model.fit(X_train, y_train)

        # Тестирование модели
        test_data['Prediction'] = best_model.predict(X_test)

        strategy_returns.extend(calculate_returns(test_data))
        market_returns.extend(test_data['Close'].pct_change().fillna(0).tolist())

        # Сдвиг окна
        start += period_test

    return np.cumsum(strategy_returns), np.cumsum(market_returns)

In [18]:
# Запуск цепного расчета
features = ['RSI', 'MACD', 'MACD_signal']
strategy_returns, market_returns = sliding_window_backtest(tema_df, features)

# Промежуточные результаты
if len(strategy_returns) == 0:
    print("Стратегия не сработала, нет данных для доходности стратегии.")
else:
    print("Кумулятивная доходность стратегии:", strategy_returns[-1])

if len(market_returns) == 0:
    print("Нет данных для доходности рынка.")
else:
    print("Кумулятивная доходность рынка:", market_returns[-1])

# Визуализация результатов Plotly
if len(strategy_returns) > 0 and len(market_returns) > 0:
    fig = go.Figure()

    # Добавление графика доходности стратегии
    fig.add_trace(go.Scatter(
        x=np.arange(len(strategy_returns)), 
        y=strategy_returns, 
        mode='lines', 
        name='Strategy Returns'
    ))

    # Добавление графика доходности рынка
    fig.add_trace(go.Scatter(
        x=np.arange(len(market_returns)), 
        y=market_returns, 
        mode='lines', 
        name='Market Returns'
    ))

    # Настройка макета графика
    fig.update_layout(
        title="Доходность стратегии против рынка (скользящее обучение)",
        xaxis_title="Период",
        yaxis_title="Кумулятивная доходность",
        legend=dict(x=0, y=1)
    )

    # Отображение графика
    fig.show()


[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000031 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[10]	valid_0's binary_logloss: 0.687153
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000033 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boo



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000030 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.687859
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000028 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.690744
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000028 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[4]	valid_0's binary_logloss: 0.690337
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000018 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000031 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.690766
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000030 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.687472
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000021 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.689749
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000030 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000035 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[6]	valid_0's binary_logloss: 0.69057
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000024 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boost



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



In [19]:
# Добавляем SMA, раньше ее не было
tema_df['SMA'] = talib.SMA(tema_df['Close'], timeperiod=7)

def train_and_evaluate_model(data, features, model_name):
    # Проверка на признаки
    if not all(feature in data.columns for feature in features):
        raise ValueError(f"Некоторые признаки отсутствуют в данных: {features}")

    strategy_returns, market_returns = sliding_window_backtest(data, features)
    return {
        'model_name': model_name,
        'strategy_returns': strategy_returns,
        'market_returns': market_returns
    }

# Построение трех моделей на основе паттернов
def build_models(data):
    models = []

    # Модель с RSI
    models.append(train_and_evaluate_model(data, ['RSI'], 'RSI-Based Model'))

    # Модель с MACD
    models.append(train_and_evaluate_model(data, ['MACD', 'MACD_signal'], 'MACD-Based Model'))

    # Модель с SMA
    models.append(train_and_evaluate_model(data, ['SMA'], 'SMA-Based Model'))

    return models

# Обучение моделей
models_results = build_models(tema_df)

# Создаем общий график
fig = go.Figure()

# Визуализация результатов для каждой модели
for model_result in models_results:
    strategy_returns = model_result['strategy_returns']
    market_returns = model_result['market_returns']

    # Добавляем линию для доходности стратегии
    fig.add_trace(go.Scatter(
        x=np.arange(len(strategy_returns)), 
        y=strategy_returns, 
        mode='lines', 
        name=f"{model_result['model_name']} Strategy Returns"
    ))

# Добавляем линию для рыночной доходности
fig.add_trace(go.Scatter(
    x=np.arange(len(market_returns)), 
    y=market_returns, 
    mode='lines', 
    name='Market Returns'
))

# Настройка макета графика
fig.update_layout(
    title="Strategy Returns vs Market Returns",
    xaxis_title="Period",
    yaxis_title="Cumulative Returns",
    legend=dict(x=0, y=1)
)

# Отображение графика
fig.show()

# Итоговые изменения
final_market_returns = market_returns[-1]
final_strategy_returns = strategy_returns[-1]

print(f"Итоговое изменение Market Returns для {model_result['model_name']}: {final_market_returns:.2%}")
print(f"Итоговое изменение Strategy Returns для {model_result['model_name']}: {final_strategy_returns:.2%}")

[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000028 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.690779
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000025 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.689571
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.689633
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000021 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000032 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.68736
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000022 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boost



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000032 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[4]	valid_0's binary_logloss: 0.690335
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000024 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.689761
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000020 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.689504
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000024 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000022 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.687173
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000022 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000028 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.683502
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000025 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000021 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[18]	valid_0's binary_logloss: 0.68376
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000030 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.687531
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000024 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.690854
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000032 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.690528
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[5]	valid_0's binary_logloss: 0.690771
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000025 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000033 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.687439
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000020 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000037 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[4]	valid_0's binary_logloss: 0.690419
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000024 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.685259
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.689489
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000030 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000025 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.689069
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000029 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[1]	valid_0's binary_logloss: 0.687474
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000033 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.689588
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000031 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.687923
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000019 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000033 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.689777
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000025 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000021 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[2]	valid_0's binary_logloss: 0.688939
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000026 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000049 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.526971 -> initscore=0.107989
[LightGBM] [Info] Start training from score 0.107989
Training until validation scores don't improve for 5 rounds
Early stopping, best iteration is:
[3]	valid_0's binary_logloss: 0.687939
[LightGBM] [Info] Number of positive: 127, number of negative: 114
[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000027 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 246
[LightGBM] [Info] Number of data points in the train set: 241, number of used features: 3
[LightGBM] [Info] [binary:Boos



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy



Итоговое изменение Market Returns для SMA-Based Model: 24.72%
Итоговое изменение Strategy Returns для SMA-Based Model: 46.25%
