# Importing Libs

In [1]:
# ML
from sklearn.svm import SVR

import sklearn

# TA tools
import ta

# basic
import numpy as np
import pandas as pd

# visualize
import matplotlib.pyplot as plt
import seaborn as sns

# options
pd.set_option('display.max_columns', None)

# versions used
print(f"pandas=={pd.__version__}")
print(f"numpy=={np.__version__}")
print(f"sklearn=={sklearn.__version__}")

pandas==2.2.3
numpy==2.2.4
sklearn==1.6.1


# Data

We loaded the data in the `stock_loader_experiments.ipynb` notebook using our custom class, which accesses historical and latest data from the T-bank broker API.

In [2]:
# let's take Gazprom stocks data
df_base = pd.read_parquet('data/raw_data/GAZP_2023_2024.parquet')

# converting and deleting some columns
df_base['date_time_start'] = pd.to_datetime(df_base['date_time_start'])
df_base.drop(columns=['uid', 'ticker', 'figi', 'Spare'], inplace=True)

df_base.sample(1)

Unnamed: 0,date_time_start,Open,Close,High,Low,Volume
330947,2024-04-05 09:44:00+00:00,163.9,163.91,163.92,163.9,7528


The data is not fully consistent (some minutes are missing), so let's check how often that happens.

In [3]:
def check_minute_difference(datetime1, datetime2):
    # True when datetime1 is exactly one minute after datetime2
    difference = datetime1 - datetime2
    return difference.total_seconds() == 60.0

In [4]:
timestamps = df_base['date_time_start']

# print the position of every row whose successor is not exactly one minute later
for i in range(len(timestamps) - 1):
    if not check_minute_difference(timestamps.iloc[i + 1], timestamps.iloc[i]):
        print(i)

42
51
75
78
83
173
192
197
239
251
255
258
265
266
292
322
326
333
344
398
447
450
628
634
645
735
759
1279
1284
1570
2090
2095
2205
2380
2900
2905
2973
2975
3189
3709
3714
3840
3878
3998
4038
4071
4128
4150
4154
4165
4189
4217
4304
4325
4329
4331
4337
4355
4364
4377
4385
4414
4450
4454
4475
4556
4589
4662
4681
4713
4741
4742
4772
4776
4777
4791
4832
4849
4880
4881
4885
4886
4911
4921
4931
4942
4950
4986
4990
5510
5515
5771
5800
6320
6325
6458
6466
6609
7129
7134
7420
7940
7945
8231
8751
8756
9042
9117
9181
9417
9419
9489
9532
9555
9572
9574
9739
9785
9807
9838
9878
9929
9937
9941
9943
10040
10058
10076
10079
10082
10096
10616
10621
10907
11427
11432
11718
12238
12243
12529
13049
13054
13340
13860
13865
14151
14446
14636
14662
14667
14683
14686
14850
14860
14878
14891
14917
14959
14981
14985
14997
15025
15030
15032
15037
15068
15109
15148
15193
15199
15206
15726
15731
16017
16537
16542
16828
17348
17353
17639
18159
18164
18450
18970
18975
19261
19346
19378
19435
19437
19450
19469
19496
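
The same check can be done without a Python loop. The sketch below uses `Series.diff()` on a small, made-up timestamp series (not the GAZP data) and reports the same kind of positions as the loop above: position `i` is listed when row `i + 1` is not exactly one minute after row `i`.

```python
import pandas as pd

# Toy timestamps with one missing minute (09:02 is absent); hypothetical data.
ts = pd.Series(pd.to_datetime([
    "2024-04-05 09:00:00+00:00",
    "2024-04-05 09:01:00+00:00",
    "2024-04-05 09:03:00+00:00",
    "2024-04-05 09:04:00+00:00",
]))

# Difference between each timestamp and the previous one (first value is NaT).
deltas = ts.diff()

# Drop the leading NaT so that element i describes the step from row i to row i + 1.
is_gap = deltas.iloc[1:] != pd.Timedelta(minutes=1)

# Positions i where the following row is not one minute later.
gap_positions = is_gap.to_numpy().nonzero()[0].tolist()
print(gap_positions)
```

On the real frame the same idea would apply to `df_base['date_time_start']`, and `deltas.value_counts()` would additionally show how large the gaps are.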

So in a real-world implementation we would want to build the training dataset ourselves from real-time data; for now we will treat consecutive rows as if they were one minute apart.

In [5]:
df_base.drop(columns='date_time_start', inplace=True)

# Classic ML experiments

Before using any DL approaches, we want to build a classic ML baseline that will serve as a reference point when evaluating DL techniques, which may or may not help us.

## Features

Let's add indicators that several approaches can use to improve their results.

### Classical indicators

#### SMA & EMA

Moving Averages (SMA & EMA): The Simple Moving Average (SMA) and Exponential Moving Average (EMA) smooth out price data to reveal the underlying trend. They are fundamental trend-following indicators. Intraday traders often plot short-term moving averages (e.g. 5-minute or 15-minute SMA) to identify trend direction. A common strategy is the moving average crossover: for example, if a short-term EMA crosses above a longer-term EMA, it generates a buy signal (anticipating upward momentum). These crossovers have been used on MOEX stocks to catch emerging trends. In one study, various SMA lengths were tested in an intraday RSI strategy; shorter-period SMAs improved profitability during downtrends, indicating the importance of choosing an appropriate moving average length for the market condition. Overall, moving averages help traders filter out noise and decide if momentum favors long or short positions. They are also components of other indicators (for instance, the MACD uses EMA calculations).

In [6]:
df_base['SMA_9'] = ta.trend.sma_indicator(df_base['Close'], 9)
df_base['SMA_10'] = ta.trend.sma_indicator(df_base['Close'], 10)
df_base['SMA_21'] = ta.trend.sma_indicator(df_base['Close'], 21)

df_base['EMA_9'] = ta.trend.ema_indicator(df_base['Close'], 9)
df_base['EMA_10'] = ta.trend.ema_indicator(df_base['Close'], 10)
df_base['EMA_21'] = ta.trend.ema_indicator(df_base['Close'], 21)
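
To make the crossover idea concrete, here is a minimal pandas-only sketch on made-up prices. The window lengths (2 and 4) and the `cross_up` name are illustrative assumptions, and the recursive `ewm(adjust=False)` form of the EMA is one common convention; whether it matches the `ta` output to the last digit is not verified here.

```python
import pandas as pd

# Hypothetical closing prices, only for illustration.
close = pd.Series([10.0, 11.0, 12.0, 11.0, 10.0, 9.0, 10.0, 12.0, 13.0, 14.0])

fast, slow = 2, 4  # toy windows, much shorter than the 9/10/21 used above

# SMA: plain rolling mean; the first (window - 1) values are NaN.
sma_slow = close.rolling(window=slow).mean()

# EMA: recursive exponential smoothing with alpha = 2 / (span + 1).
ema_fast = close.ewm(span=fast, adjust=False).mean()
ema_slow = close.ewm(span=slow, adjust=False).mean()

# Bullish crossover: the fast EMA is above the slow EMA now but was not on the previous bar.
above = ema_fast > ema_slow
cross_up = above & ~above.shift(1, fill_value=False)

print(pd.DataFrame({"close": close, "ema_fast": ema_fast,
                    "ema_slow": ema_slow, "cross_up": cross_up}))
```

A boolean column like `cross_up` could be added to the feature set next to the raw averages, so the model sees the event and not only the levels.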

#### RSI

Relative Strength Index (RSI): RSI is a popular momentum oscillator that measures the magnitude of recent price gains vs. losses on a 0–100 scale. It is used to identify overbought conditions (RSI above 70, indicating prices may have risen too fast) and oversold conditions (RSI below 30). RSI has shown its value in many markets; for example, studies on emerging markets found that RSI signals can generate accurate buy/sell prompts and even produce abnormal returns. Traders often use RSI on intraday MOEX charts to anticipate reversals: if a stock's RSI dips below 30, it may be poised for a bounce (buy signal), whereas an RSI above 70 could warn of an upcoming pullback (sell signal). Importantly, combining RSI with other indicators strengthens its effectiveness. Research suggests the best results come from using RSI alongside complementary signals like moving averages, which confirm the trend context. For instance, if the RSI gives an oversold reading and at the same time the price is bouncing off a key moving average support, the confluence increases confidence in a buy trade. This indicator's ubiquity among traders makes it a self-fulfilling tool at times: many MOEX trading algorithms monitor RSI levels, contributing to short-term support and resistance around those threshold values.

In [7]:
df_base['RSI_9'] = ta.momentum.rsi(df_base['Close'], 9)
df_base['RSI_11'] = ta.momentum.rsi(df_base['Close'], 11)
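
For reference, the RSI calculation itself can be sketched in a few lines of pandas. The helper name `rsi_wilder` is hypothetical, and Wilder-style smoothing via `ewm(alpha=1/window)` is assumed here as the averaging convention; other variants (for example a plain rolling mean) give slightly different numbers.

```python
import pandas as pd

def rsi_wilder(close: pd.Series, window: int = 9) -> pd.Series:
    """RSI with Wilder-style exponential smoothing (assumed convention)."""
    delta = close.diff()
    gain = delta.clip(lower=0)        # positive moves, 0 otherwise
    loss = (-delta).clip(lower=0)     # absolute size of negative moves, 0 otherwise

    avg_gain = gain.ewm(alpha=1 / window, adjust=False, min_periods=window).mean()
    avg_loss = loss.ewm(alpha=1 / window, adjust=False, min_periods=window).mean()

    rs = avg_gain / avg_loss          # relative strength; inf when there were no losses
    return 100 - 100 / (1 + rs)

# Hypothetical prices: a steady rise followed by a pullback.
close = pd.Series([10, 11, 12, 13, 14, 15, 14, 13, 12, 13, 14, 12, 11, 12], dtype=float)
print(rsi_wilder(close, window=5))
```

The value stays within 0–100 by construction and reaches 100 when the smoothed loss is zero, i.e. when every recent move was up.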

#### MACD

Moving Average Convergence Divergence (MACD): MACD is another momentum/trend indicator. The MACD line is the difference between a fast and a slow EMA, and it is compared against a signal line (an EMA of that difference). It oscillates above and below zero, highlighting changes in trend momentum. A positive MACD indicates upward momentum, while a negative MACD indicates downward momentum; the MACD line crossing above its signal line is a classic bullish signal (and a cross below is the bearish counterpart). MACD is widely favored for intraday trading because it combines aspects of trend following and momentum in one indicator. Traders on the MOEX use MACD histograms and crossovers to spot trend reversals or trend strength changes in stocks or the MOEX index itself. A strong use case is pairing MACD with RSI: one popular strategy requires the MACD line crossing above the signal line and RSI coming out of oversold territory to trigger a buy, capturing both trend and momentum confirmation. In fact, backtests of strategies combining MACD and RSI have shown high success rates: one such strategy yielded about a 73% win rate over hundreds of trades, with an average 0.88% gain per trade. This demonstrates how MACD, especially in combination with RSI, can be a powerful tool for timing intraday entries and exits.

In [8]:

# parameters suggested for intraday trading here -> https://market-bulls.com/macd-indicator-trading-strategies/
macd = ta.trend.MACD(
    close=df_base['Close'],
    window_slow=17,
    window_fast=8,
    window_sign=9
)

df_base['MACD'] = macd.macd()
df_base['MACD_Signal'] = macd.macd_signal()
df_base['MACD_Hist'] = macd.macd_diff()
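
The three MACD columns are simple to derive by hand, which helps when reading them as features. The sketch below is a pandas-only illustration with a hypothetical helper `macd_frame`; it reuses the 8/17/9 windows from the cell above and assumes the recursive `ewm(adjust=False)` EMA.

```python
import pandas as pd

def macd_frame(close: pd.Series, fast: int = 8, slow: int = 17, signal: int = 9) -> pd.DataFrame:
    """MACD line, signal line and histogram from a close series (illustrative)."""
    ema_fast = close.ewm(span=fast, adjust=False).mean()
    ema_slow = close.ewm(span=slow, adjust=False).mean()
    line = ema_fast - ema_slow                        # MACD line
    sig = line.ewm(span=signal, adjust=False).mean()  # signal line: EMA of the MACD line
    return pd.DataFrame({"macd": line, "signal": sig, "hist": line - sig})

# Hypothetical steadily rising prices: the fast EMA lags less, so MACD turns positive.
rising = pd.Series(range(100, 140), dtype=float)
print(macd_frame(rising).tail())
```

The histogram is just the gap between the MACD line and its signal line, so a sign change in `hist` marks the crossover described above.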

#### Bollinger Bands

Bollinger Bands: Bollinger Bands are a volatility indicator consisting of a moving average (typically 20-period SMA) and upper/lower bands set a certain number of standard deviations away (often 2σ). The bands widen when volatility increases and contract when volatility drops. Intraday traders use Bollinger Bands to identify potential breakouts or mean-reversion opportunities. For example, when price repeatedly touches the upper band, the market may be overextended to the upside (potential reversal or short setup), whereas a sharp move outside the bands could signal an emerging breakout with increased volatility. In high-frequency trading contexts, Bollinger Bands help gauge if current price swings are outside normal volatility ranges. On the MOEX, which can experience sudden moves due to news or low liquidity in certain stocks, Bollinger Band signals are quite useful. A common strategy is to fade extreme moves: if a Russian stock's price spikes well above the upper band on an intraday chart, traders might short expecting a pullback toward the mean. Conversely, touching the lower band after a steady decline could present a buy-the-dip opportunity. However, it's important to confirm with other indicators (Bollinger Band breakouts combined with volume spikes or momentum divergences provide stronger evidence of a true breakout or reversal).

In [9]:
# parameters suggested for intraday trading here (the standard 20-period, 2-sigma setup) -> https://www.stockdaddy.in/blog/bollinger-bands-strategy
bb = ta.volatility.BollingerBands(close=df_base['Close'], window=20, window_dev=2)

df_base['BB_mavg'] = bb.bollinger_mavg()
df_base['BB_hband'] = bb.bollinger_hband()
df_base['BB_lband'] = bb.bollinger_lband()
df_base['BB_width'] = bb.bollinger_wband()
df_base['BB_pband'] = bb.bollinger_pband()
df_base['BB_hband_ind'] = bb.bollinger_hband_indicator()
df_base['BB_lband_ind'] = bb.bollinger_lband_indicator()
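
As a rough guide to what these columns contain, the bands can be rebuilt with rolling statistics. This is a sketch under stated assumptions: the helper name `bollinger` is hypothetical, the population standard deviation (`ddof=0`) is assumed, and the width is expressed as a percentage of the middle band, which may or may not match the library's exact scaling.

```python
import pandas as pd

def bollinger(close: pd.Series, window: int = 20, n_std: float = 2.0) -> pd.DataFrame:
    """Middle/upper/lower bands plus width and %B (illustrative conventions)."""
    mavg = close.rolling(window).mean()
    std = close.rolling(window).std(ddof=0)      # population std, an assumption
    upper = mavg + n_std * std
    lower = mavg - n_std * std
    return pd.DataFrame({
        "mavg": mavg,
        "hband": upper,
        "lband": lower,
        "width": (upper - lower) / mavg * 100,   # band width relative to the mean, in %
        "pband": (close - lower) / (upper - lower),  # 0 at lower band, 1 at upper band
    })

# Hypothetical prices with a short window so the numbers are easy to follow.
close = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
print(bollinger(close, window=3))
```

`pband` above 1 or below 0 corresponds to a close outside the bands, which is the information the two `*_ind` indicator columns encode as 0/1 flags.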

#### OBV

On-Balance Volume (OBV) and Volume Indicators: Volume-based indicators play a crucial role in validating price movements. OBV is a simple yet powerful indicator that accumulates volume, adding volume on up periods and subtracting it on down periods (days on a daily chart, minutes in our data). It effectively measures buying and selling pressure. Rising OBV indicates that volume is flowing into an asset (buyers are dominant), often foreshadowing an upward breakout, while falling OBV signals distribution (selling pressure). Traders use OBV intraday to confirm trends: if price is climbing and OBV is also steadily rising, it suggests the uptrend is backed by strong volume (and likely to continue). If price makes new highs but OBV fails to reach a new high (a bearish divergence), it can warn that the rally is losing support and may reverse. On MOEX, where certain stocks can have erratic volume, OBV helps filter false price moves. For instance, a sudden price jump on low volume is treated suspiciously by algo-traders: if OBV doesn't confirm the move, they may avoid the bait. Other volume indicators like the Volume Weighted Average Price (VWAP) are also popular intraday tools (VWAP is often used by institutional traders as a benchmark; prices above VWAP indicate an uptrend with strength, below VWAP indicates a downtrend). In general, volume indicators complement price-based indicators by adding the dimension of market participation, which is key in the relatively smaller and sometimes volatile Russian market. Combining volume signals with price signals is a widely recommended practice: for example, a breakout above a resistance level is far more convincing if accompanied by a surge in OBV or volume, confirming that big players are driving the move.

In [10]:
obv = ta.volume.OnBalanceVolumeIndicator(
    close=df_base['Close'],
    volume=df_base['Volume']
)

df_base['OBV'] = obv.on_balance_volume()
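
The accumulation rule is easy to verify on a handful of bars. The sketch below follows the textbook definition on made-up data: volume is added when the close rises, subtracted when it falls, and ignored when it is unchanged. The starting value and the handling of unchanged closes are conventions, so library output may differ by an offset.

```python
import numpy as np
import pandas as pd

# Hypothetical bars, only for illustration.
close = pd.Series([10.0, 10.5, 10.2, 10.2, 10.8])
volume = pd.Series([100, 200, 150, 120, 300])

# +1 for an up bar, -1 for a down bar, 0 for an unchanged close (and for the first bar).
direction = np.sign(close.diff()).fillna(0)

# Running total of signed volume.
obv = (direction * volume).cumsum()
print(obv.tolist())
```

Because only changes in OBV carry information, a differenced or normalised version may be a more stable model input than the raw running total.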

#### Stochastic oscillator

The Stochastic Oscillator (similar to RSI, used to indicate overbought/oversold levels based on recent closing prices relative to price range) is often used on short time frames for MOEX stocks to pick turning points.

In [11]:
stoch = ta.momentum.StochasticOscillator(
    high=df_base['High'],
    low=df_base['Low'],
    close=df_base['Close'],
    window=9,
    smooth_window=3
)

df_base['Stoch_K'] = stoch.stoch()
df_base['Stoch_D'] = stoch.stoch_signal()
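
For intuition, %K places the current close inside the recent high-low range and %D smooths %K. A minimal pandas sketch follows; the helper name `stochastic` and the toy data are assumptions, and a plain rolling mean is assumed for the %D smoothing.

```python
import pandas as pd

def stochastic(high, low, close, window=9, smooth_window=3):
    """%K and %D of the stochastic oscillator (illustrative)."""
    lowest_low = low.rolling(window).min()
    highest_high = high.rolling(window).max()
    # Position of the close within the window's range, scaled to 0..100.
    k = 100 * (close - lowest_low) / (highest_high - lowest_low)
    # Signal line: simple moving average of %K.
    d = k.rolling(smooth_window).mean()
    return k, d

# Hypothetical bars with short windows so the values can be checked by hand.
high = pd.Series([10.0, 11.0, 12.0, 13.0, 12.0])
low = pd.Series([9.0, 10.0, 11.0, 12.0, 10.0])
close = pd.Series([9.5, 10.5, 11.5, 12.5, 11.0])

k, d = stochastic(high, low, close, window=3, smooth_window=2)
print(pd.DataFrame({"K": k, "D": d}))
```

If the high and low are identical over the whole window, the denominator is zero and %K is undefined, which can happen on illiquid minute bars.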

#### Commodity Channel Index

The Commodity Channel Index (CCI) is another momentum indicator highlighted in studies for trend detection.

In [12]:
cci = ta.trend.CCIIndicator(
    high=df_base['High'],
    low=df_base['Low'],
    close=df_base['Close'],
    window=14,
    constant=0.010  # smaller than the conventional 0.015, so CCI values are scaled up (more sensitive)
)

df_base['CCI'] = cci.cci()
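
The effect of the `constant` argument is easiest to see from the formula: CCI is the deviation of the typical price from its moving average, divided by `constant` times the mean absolute deviation. The sketch below uses a hypothetical helper `cci` on synthetic data and shows that switching from 0.015 to 0.010 only rescales the values by a factor of 1.5.

```python
import numpy as np
import pandas as pd

def cci(high, low, close, window=14, constant=0.015):
    """Commodity Channel Index from the textbook formula (illustrative)."""
    tp = (high + low + close) / 3.0                  # typical price
    sma = tp.rolling(window).mean()
    # Mean absolute deviation of the typical price within each window.
    mad = tp.rolling(window).apply(lambda x: np.abs(x - x.mean()).mean(), raw=True)
    return (tp - sma) / (constant * mad)

# Synthetic random-walk prices, only for illustration.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 60).cumsum())
high = close + 0.5
low = close - 0.5

cci_default = cci(high, low, close, constant=0.015)
cci_notebook = cci(high, low, close, constant=0.010)
print((cci_notebook / cci_default).dropna().round(6).unique())
```

In other words, a smaller constant does not change where the indicator turns; it makes the fixed ±100 thresholds easier to reach.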

#### Average True Range

Average True Range (ATR) is commonly monitored to gauge intraday volatility – for example, a widening ATR on a stock like Sberbank might imply the next price swing could be larger than usual, prompting traders to adjust stop-loss distances.

In [13]:
atr = ta.volatility.AverageTrueRange(
    high=df_base['High'],
    low=df_base['Low'],
    close=df_base['Close'],
    window=9
)

df_base['ATR_9'] = atr.average_true_range()
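
ATR is built on the true range, which extends the bar's high-low range to cover gaps from the previous close. The sketch below is a pandas-only illustration with hypothetical helper names; Wilder-style smoothing through `ewm(alpha=1/window)` is assumed, and the way the first value is seeded may differ from the library.

```python
import pandas as pd

def true_range(high, low, close):
    """Largest of: high-low, |high - previous close|, |low - previous close|."""
    prev_close = close.shift(1)
    candidates = pd.concat(
        [high - low, (high - prev_close).abs(), (low - prev_close).abs()],
        axis=1,
    )
    # max() skips the NaN produced by the missing previous close on the first bar.
    return candidates.max(axis=1)

def atr_wilder(high, low, close, window=9):
    """Average True Range with Wilder-style smoothing (assumed convention)."""
    tr = true_range(high, low, close)
    return tr.ewm(alpha=1 / window, adjust=False, min_periods=window).mean()

# Hypothetical bars, only for illustration.
high = pd.Series([10.0, 11.0, 12.0])
low = pd.Series([9.0, 10.0, 10.5])
close = pd.Series([9.5, 10.8, 11.0])

tr = true_range(high, low, close)
atr = atr_wilder(high, low, close, window=2)
print(tr.tolist(), atr.tolist())
```

Since the data here has missing minutes, the "previous close" can be several minutes old, which inflates the true range at those rows.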

#### Ichimoku Cloud

Ichimoku Cloud (a comprehensive trend system) is sometimes applied to index futures or highly liquid Russian equities to map support/resistance and trend momentum at a glance.

In [14]:
ichimoku = ta.trend.IchimokuIndicator(
    high=df_base['High'],
    low=df_base['Low'],
    window1=9,
    window2=18,
    window3=34,
    visual=False
)

df_base['Ichimoku_A'] = ichimoku.ichimoku_a()
df_base['Ichimoku_B'] = ichimoku.ichimoku_b()
df_base['Ichimoku_Base'] = ichimoku.ichimoku_base_line()
df_base['Ichimoku_Conversion'] = ichimoku.ichimoku_conversion_line()
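
All four Ichimoku columns are midpoints of rolling high/low ranges over different windows. The sketch below rebuilds them without the forward shift that is normally applied for plotting (corresponding to `visual=False` above); the helper name `ichimoku_lines` is hypothetical and warm-up handling may differ from the library.

```python
import pandas as pd

def ichimoku_lines(high, low, window1=9, window2=18, window3=34):
    """Unshifted Ichimoku lines as rolling range midpoints (illustrative)."""
    def midpoint(window):
        # Middle of the highest high and lowest low over the window.
        return (high.rolling(window).max() + low.rolling(window).min()) / 2

    conversion = midpoint(window1)          # Tenkan-sen
    base = midpoint(window2)                # Kijun-sen
    return pd.DataFrame({
        "conversion": conversion,
        "base": base,
        "span_a": (conversion + base) / 2,  # Senkou Span A
        "span_b": midpoint(window3),        # Senkou Span B
    })

# Hypothetical steadily rising bars.
high = pd.Series(range(11, 51), dtype=float)
low = high - 2.0

lines = ichimoku_lines(high, low)
print(lines.tail(3))
```

In an uptrend the shorter windows hug the price more closely, so `conversion` sits above `base`, which in turn sits above `span_b`.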

## SVM

We will start with a support vector machine (SVM). Several studies report promising results for it in intraday trading, so we can try it on our data.

As for the indicators we can combine it with, the ones tested in several articles are:

### One step forward
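
One possible way to set up a one-step-ahead experiment is sketched below. Everything in it is an assumption made for illustration: the data is a synthetic random walk instead of the GAZP frame, the feature list, the 80/20 chronological split, and the `SVR` hyperparameters are placeholders, and the target is simply the next row's close.

```python
import numpy as np
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic close prices (random walk), standing in for the real data.
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 0.1, 300).cumsum(), name="Close")

frame = pd.DataFrame({
    "Close": close,
    "SMA_9": close.rolling(9).mean(),
    "EMA_9": close.ewm(span=9, adjust=False).mean(),
})

# One step forward: the target for row t is the close of row t + 1.
frame["target"] = frame["Close"].shift(-1)
frame = frame.dropna()  # drops indicator warm-up rows and the last row (no target)

# Chronological split, no shuffling, so the test set lies strictly in the future.
features = ["Close", "SMA_9", "EMA_9"]
split = int(len(frame) * 0.8)
X_train, y_train = frame[features].iloc[:split], frame["target"].iloc[:split]
X_test, y_test = frame[features].iloc[split:], frame["target"].iloc[split:]

# SVR is scale-sensitive, so features are standardised inside the pipeline.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.01))
model.fit(X_train, y_train)

pred = model.predict(X_test)
mae = float(np.mean(np.abs(pred - y_test.to_numpy())))
print(f"test MAE: {mae:.4f}")
```

Fitting the scaler inside the pipeline keeps test-set statistics out of training. A naive "next close equals current close" prediction is a useful reference number to compare the MAE against.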

### Multi-step forward