Parameter Heatmap
==========

This tutorial will show how to optimize strategies with multiple parameters and how to examine and reason about optimization results.
It is assumed you're already familiar with
[basic _backtesting.py_ usage](https://kernc.github.io/backtesting.py/doc/examples/Quick%20Start%20User%20Guide.html).

First, let's define a helper that loads our price data and resamples it to the timeframe we want to backtest on.
For the indicators themselves, one should use functions from an indicator library, such as
[TA-Lib](https://github.com/mrjbq7/ta-lib) or
[Tulipy](https://tulipindicators.org).

In [1]:
# 1. Import the required libraries
import os

import pandas as pd

def load_and_resample_data(symbol, start_date, end_date, source_timeframe='1m', target_timeframe='30min', data_path=r'\\znas\Main\futures'):
    """
    Load futures data and resample it to the target timeframe.

    Parameters:
        symbol (str): Trading pair name, e.g. 'KASUSDT'
        start_date (str): Start date, format 'YYYY-MM-DD'
        end_date (str): End date, format 'YYYY-MM-DD'
        source_timeframe (str): Timeframe of the source data, default '1m'
        target_timeframe (str): Target timeframe as a pandas offset alias, default '30min'
        data_path (str): Path to the data files

    Returns:
        pd.DataFrame: DataFrame in the format expected by backtesting.py
    """
    # Build the range of dates to load
    date_range = pd.date_range(start=start_date, end=end_date, freq='D')

    # List that collects the per-day data frames
    all_data = []

    # Normalize the trading pair name
    formatted_symbol = symbol.replace('/', '_').replace(':', '_')
    if not formatted_symbol.endswith('USDT'):
        formatted_symbol = f"{formatted_symbol}USDT"

    # Iterate over each day
    for date in date_range:
        date_str = date.strftime('%Y-%m-%d')
        # Build the file path
        file_path = os.path.join(data_path, date_str, f"{date_str}_{formatted_symbol}_USDT_{source_timeframe}.csv")

        try:
            if os.path.exists(file_path):
                # Read the data
                df = pd.read_csv(file_path)
                df['datetime'] = pd.to_datetime(df['datetime'])
                all_data.append(df)
            else:
                print(f"File not found: {file_path}")
        except Exception as e:
            print(f"Error reading file {file_path}: {e}")
            continue

    if not all_data:
        raise ValueError(f"No data found for {symbol} in the given date range")

    # Concatenate all days and sort chronologically
    combined_df = pd.concat(all_data, ignore_index=True)
    combined_df = combined_df.sort_values('datetime')

    # Use the timestamps as the index
    combined_df.set_index('datetime', inplace=True)

    # Resample to the target timeframe
    resampled = combined_df.resample(target_timeframe).agg({
        'open': 'first',
        'high': 'max',
        'low': 'min',
        'close': 'last',
        'volume': 'sum'
    }).dropna()  # drop NaN rows right away

    # Convert to the column names backtesting.py expects
    backtesting_df = pd.DataFrame({
        'Open': resampled['open'],
        'High': resampled['high'],
        'Low': resampled['low'],
        'Close': resampled['close'],
        'Volume': resampled['volume']
    })

    # Make sure every column is numeric; invalid values become NaN
    for col in ['Open', 'High', 'Low', 'Close', 'Volume']:
        backtesting_df[col] = pd.to_numeric(backtesting_df[col], errors='coerce')

    # Final cleanup
    backtesting_df = backtesting_df.dropna()

    return backtesting_df



Our strategy will not be the moving average cross-over strategy from the
[Quick Start User Guide](https://kernc.github.io/backtesting.py/doc/examples/Quick%20Start%20User%20Guide.html),
but a short-only mean-reversion strategy built on the RSI:
a slow moving average of the RSI defines the overbought zone in which we open or add to a short position,
and an optional absolute RSI barrier sets the level below which the RSI must fall before the position is closed.
Before defining the strategy, we load one year of 1-minute KASUSDT data and resample it to 30-minute bars.

In [2]:
# Load the data
symbol = 'KASUSDT'
start_date = '2024-01-01'
end_date = '2025-01-01'

try:
    backtesting_df = load_and_resample_data(
        symbol=symbol,
        start_date=start_date,
        end_date=end_date,
        source_timeframe='1m',
        target_timeframe='30min'
    )

    print("Data loaded successfully:")
    print(f"Time range: {backtesting_df.index.min()} to {backtesting_df.index.max()}")
    print(f"Number of bars: {len(backtesting_df)}")
    print("\nSample data:")
    print(backtesting_df.head())

except Exception as e:
    print(f"Error while processing data: {e}")

Data loaded successfully:
Time range: 2024-01-01 00:00:00 to 2025-01-01 23:30:00
Number of bars: 17616

Sample data:
                        Open     High      Low    Close     Volume
datetime                                                          
2024-01-01 00:00:00  0.11220  0.11296  0.11204  0.11287  3421830.0
2024-01-01 00:30:00  0.11284  0.11334  0.11172  0.11323  2515295.0
2024-01-01 01:00:00  0.11318  0.11340  0.11288  0.11326  1227419.0
2024-01-01 01:30:00  0.11326  0.11414  0.11326  0.11394  1893081.0
2024-01-01 02:00:00  0.11396  0.11426  0.11328  0.11402  1711190.0


### Define the strategy

In [3]:
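import numpy as np
import talib as ta
from backtesting import Strategy

class MeanReverterShort(Strategy):
    """
    Mean-reversion strategy, short-only version.
      Uses the RSI and ATR indicators computed with TA-Lib to time short entries,
      scale-ins and exits, opening the short position in several parts. Pyramiding
      is implemented by sizing each order as a fraction of available funds (DCA style),
      which removes the dependency on a cash parameter.

    Parameters:
      frequency             : Period of the slow moving average of the RSI (default 10); smooths the RSI
      rsiFrequency          : RSI period (default 40); measures market momentum
      sellZoneDistance      : Percentage by which the RSI must exceed its slow average (default 5) to count as overbought; the short-entry condition
      avgUpATRSum           : Number of ATR values to sum (default 3); meant for judging the price rise when scaling in (for shorts, the price must be above the weighted average entry price)
      useAbsoluteRSIBarrier : Whether to use an absolute RSI barrier (default True); if enabled, closing requires the RSI to be below barrierLevel
      barrierLevel          : RSI barrier level (default 50); with the absolute barrier enabled, the position is closed only when the RSI is below this value
      pyramiding            : Maximum number of entries (e.g. 8 allows at most 8 short entries/scale-ins)
    """
    frequency = 10
    rsiFrequency = 40
    sellZoneDistance = 5
    avgUpATRSum = 3
    useAbsoluteRSIBarrier = True
    barrierLevel = 50
    pyramiding = 8  # maximum number of entries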

    def init(self):
        # Number of entries made so far and the size fraction per entry (used for scaling in)
        self.opentrades = 0
        self.unit_ratio = 1 / self.pyramiding

    def next(self):
        # Latest close price
        price = self.data.Close[-1]

        # -------------------------------
        # Compute the indicators with TA-Lib
        # -------------------------------
        close_arr = np.asarray(self.data.Close)
        rsi_series = ta.RSI(close_arr, timeperiod=self.rsiFrequency)
        rsi_val = rsi_series[-1]
        sma_series = ta.SMA(rsi_series, timeperiod=self.frequency)
        rsi_slow = sma_series[-1]

        high_arr = np.asarray(self.data.High)
        low_arr = np.asarray(self.data.Low)
        atr_series = ta.ATR(high_arr, low_arr, close_arr, timeperiod=20)
        # Sum of the last avgUpATRSum ATR values.
        # NOTE: atr_sum is computed here but is not used by the conditions below.
        if len(atr_series) >= self.avgUpATRSum:
            atr_sum = np.sum(atr_series[-self.avgUpATRSum:])
        else:
            atr_sum = 0

        # -------------------------------
        # Conditions for opening / adding to the short
        # -------------------------------
        # Condition 1: RSI is in the overbought zone: RSI > slow RSI average * (1 + sellZoneDistance/100)
        cond_sell_zone = rsi_val > rsi_slow * (1 + self.sellZoneDistance / 100)

        # Condition 2: price confirmation. If a short is already open, compute the
        # size-weighted average entry price; for shorts, the current price must be above
        # the adjusted average (which raises the average short entry price)
        if self.position:
            trades = self.trades  # currently active trades
            if trades:
                total_size = sum(abs(trade.size) for trade in trades)
                avg_price = sum(trade.entry_price * abs(trade.size) for trade in trades) / total_size
                price_above_avg = price > avg_price * (1 + 0.01 * self.opentrades)
            else:
                price_above_avg = True
        else:
            price_above_avg = True

        # Condition 3: the maximum number of entries has not been reached
        cond_max = self.opentrades < self.pyramiding

        isShort = cond_sell_zone and price_above_avg and cond_max

        # -------------------------------
        # Exit condition (buy to cover)
        # -------------------------------
        # RSI has fallen back: RSI < slow RSI average and, if the absolute barrier
        # is enabled, RSI must also be below barrierLevel
        isCover = (rsi_val < rsi_slow) and (rsi_val < self.barrierLevel or not self.useAbsoluteRSIBarrier)

        # -------------------------------
        # Act on the signals
        # -------------------------------
        if isShort:
            # Size fraction for this entry (grows with each additional entry)
            current_ratio = self.unit_ratio * (self.opentrades + 1)
            # Size the order as a fraction, independent of the account's cash
            self.sell(size=current_ratio)
            self.opentrades += 1

        if self.position and isCover:
            self.position.close()
            self.opentrades = 0

It's not a robust strategy, but we can optimize it.

[Grid search](https://en.wikipedia.org/wiki/Hyperparameter_optimization#Grid_search)
is an exhaustive search over a specified set of candidate values for each hyperparameter. One evaluates the performance of every combination of values and finally selects the combination that performs best.
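
To make the idea concrete, here is a minimal, library-independent sketch of a _randomized_ grid search, assuming a toy objective in place of a backtest. The names `randomized_grid_search` and `toy_objective` are hypothetical and only serve the illustration; this is not how `Backtest.optimize()` is implemented.

```python
import itertools
import random


def randomized_grid_search(param_grid, objective, max_tries, seed=0):
    """Evaluate at most `max_tries` randomly chosen grid points and return the best."""
    names = list(param_grid)
    # The full grid is the cartesian product of all candidate values.
    grid = list(itertools.product(*(param_grid[name] for name in names)))
    rng = random.Random(seed)
    # Sample without replacement; with max_tries >= len(grid) this is a plain grid search.
    tried = rng.sample(grid, min(max_tries, len(grid)))
    results = {combo: objective(**dict(zip(names, combo))) for combo in tried}
    best_combo = max(results, key=results.get)
    return dict(zip(names, best_combo)), results


def toy_objective(frequency, rsiFrequency):
    # Hypothetical stand-in for a backtest score, peaking at (20, 12).
    return -abs(frequency - 20) - abs(rsiFrequency - 12)


grid = {"frequency": range(5, 51, 5), "rsiFrequency": range(2, 51, 5)}
best, results = randomized_grid_search(grid, toy_objective, max_tries=30)
print(f"evaluated {len(results)} of 100 combinations, best found: {best}")
```

Sampling a subset trades completeness for speed: the best combination found is only the best among those tried, so it may miss the true grid optimum.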

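We will optimize our strategy on the KASUSDT data loaded above. Instead of maximizing a single built-in statistic, we define a custom scoring function to pass as the `maximize=` argument; it blends return, SQN, Sharpe ratio, win rate and drawdown, and scales the score down for runs with fewer than 50 trades: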

In [16]:
# Use a custom scoring function (recommended)
def custom_score(stats):
    """
    Custom scoring function that combines several metrics.
    """
    # Key metrics
    sharpe = stats['Sharpe Ratio']
    max_dd = stats['Max. Drawdown [%]']
    ret = stats['Return [%]']
    win_rate = stats['Win Rate [%]']
    sqn = stats['SQN']  # assumes stats contains the SQN metric
    trades = stats['# Trades']  # assumes stats contains the trade count

    # Drawdown term: the larger the drawdown, the smaller the term
    dd_penalty = 1 / (1 + abs(max_dd/100))

    # Trade-count penalty: below 50 trades, the score shrinks proportionally
    trade_penalty = 1 if trades >= 50 else trades / 50

    # Composite score
    score = (
        0.4 * (ret/100) +          # weight 0.4 on return
        0.2 * sqn +                # weight 0.2 on SQN
        0.2 * sharpe +             # weight 0.2 on the Sharpe ratio
        0.1 * (win_rate/100) +     # weight 0.1 on win rate
        0.1 * dd_penalty           # weight 0.1 on the drawdown term
    ) * trade_penalty              # scaled by the trade-count penalty

    return score

Notice the `return_heatmap=True` parameter passed to
[`Backtest.optimize()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.optimize).
It makes the function return a heatmap series along with the usual stats of the best run.
`heatmap` is a pandas Series indexed with a MultiIndex, a cartesian product of all permissible (tried) parameter values.
The series values are from the `maximize=` argument we provided.
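
Since the heatmap values come from `custom_score`, it helps to see how that score reacts to its inputs. The sketch below restates the same arithmetic as stand-alone functions and evaluates it on made-up numbers; the function names and all input values are hypothetical, not results from this notebook.

```python
def drawdown_term(max_dd_pct):
    # Shrinks from 1 toward 0 as the drawdown grows (same formula as dd_penalty).
    return 1 / (1 + abs(max_dd_pct / 100))


def trade_penalty(n_trades, min_trades=50):
    # Full credit at 50+ trades, proportionally less below that.
    return 1 if n_trades >= min_trades else n_trades / min_trades


def composite_score(ret_pct, sqn, sharpe, win_rate_pct, max_dd_pct, n_trades):
    raw = (
        0.4 * (ret_pct / 100)
        + 0.2 * sqn
        + 0.2 * sharpe
        + 0.1 * (win_rate_pct / 100)
        + 0.1 * drawdown_term(max_dd_pct)
    )
    return raw * trade_penalty(n_trades)


# Identical hypothetical runs that differ only in the number of trades.
many = composite_score(20, 1.5, 1.0, 55, -25, n_trades=100)
few = composite_score(20, 1.5, 1.0, 55, -25, n_trades=25)
print(f"{many:.4f} {few:.4f}")  # prints "0.7150 0.3575"
```

Because `Return [%]` is unbounded while the win-rate and drawdown terms can each contribute at most 0.1, a run with a very large return dominates the score regardless of its drawdown.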

This heatmap contains the results of all the runs,
making it very easy to obtain parameter combinations for e.g. three best runs:
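
As a sketch, assuming a small toy series in place of the real `heatmap` (the index values and scores below are made up for illustration):

```python
import pandas as pd

# Toy stand-in for the heatmap: a Series indexed by parameter combinations.
index = pd.MultiIndex.from_tuples(
    [(10, 20, 4), (10, 40, 4), (20, 20, 8), (20, 40, 8), (30, 20, 6)],
    names=["frequency", "rsiFrequency", "pyramiding"],
)
toy_heatmap = pd.Series([0.8, 1.9, 1.2, 2.4, 0.3], index=index)

# Sort ascending and take the last three rows: the three best runs.
top3 = toy_heatmap.sort_values().iloc[-3:]
print(top3)

# The single best combination as a name -> value mapping.
best_params = dict(zip(toy_heatmap.index.names, toy_heatmap.idxmax()))
print(best_params)
```

Looking parameters up by index level name, rather than by position, keeps the code correct when parameters are added or reordered.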

But we use vision to make judgements on larger data sets much faster.
Let's plot the whole heatmap by projecting it on two chosen dimensions.
Say we're mostly interested in how two of the parameters, for example `frequency` and `rsiFrequency`, on average, affect the outcome.
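
A minimal sketch of such a projection, assuming a toy three-parameter heatmap with made-up scores: grouping by the two chosen index levels and averaging collapses the remaining dimension, and `unstack()` turns the result into a two-dimensional table.

```python
from itertools import product

import pandas as pd

# Toy heatmap over a full 2 x 2 x 2 grid (scores are made up).
index = pd.MultiIndex.from_tuples(
    list(product([10, 20], [20, 40], [4, 8])),
    names=["frequency", "rsiFrequency", "pyramiding"],
)
toy_heatmap = pd.Series([1.0, 3.0, 2.0, 4.0, 0.0, 2.0, 5.0, 7.0], index=index)

# Average over `pyramiding`, keeping `frequency` as rows and `rsiFrequency` as columns.
hm = toy_heatmap.groupby(level=["frequency", "rsiFrequency"]).mean().unstack()
print(hm)
```

Each cell is the mean score over all values of the collapsed parameters, so a good cell means the pair works well on average, not that every underlying combination does.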

Let's plot this table as a heatmap:

We see that, on average, we obtain the highest result using trend-determining parameters `n1=30` and `n2=100` or `n1=70` and `n2=80`,
and it's not like other nearby combinations work similarly well — for our particular strategy, these combinations really stand out.

Since our strategy contains several parameters, we might be interested in other relationships between their values.
We can use
[`backtesting.lib.plot_heatmaps()`](https://kernc.github.io/backtesting.py/doc/backtesting/lib.html#backtesting.lib.plot_heatmaps)
function to plot interactive heatmaps of all parameter combinations simultaneously.

## Model-based optimization

Above, we described the _randomized grid search_ optimization method. Any kind of grid search, however, might be computationally expensive for large data sets. In the following example, we will use the
[_SAMBO Optimization_](https://sambo-optimization.github.io)
package to guide our optimization in a better-informed way, using forests of decision trees.
The hyperparameter model is sequentially improved by evaluating the expensive function (the backtest) at the next best point, thereby hopefully converging to a set of optimal parameters with **as few evaluations as possible**.

So, with `method="sambo"`:


In [4]:
from backtesting import Backtest

# Set up the backtest
backtest = Backtest(
    backtesting_df,         # input OHLCV data
    MeanReverterShort,      # our mean-reversion strategy class
    commission=.0004,       # commission of 0.04%
    exclusive_orders=True,  # each new order closes the previous position first
    cash=10000,             # initial cash
    margin=1/3              # margin ratio of 1/3, i.e. 3x leverage
)

# Optimize the strategy parameters with SAMBO (every parameter except useAbsoluteRSIBarrier is tuned)
stats, heatmap, optimize_result = backtest.optimize(
    frequency=range(5, 51, 5),          # period of the slow RSI moving average (default 10)
    rsiFrequency=range(2, 51, 5),       # RSI period (default 40)
    sellZoneDistance=range(1, 11, 1),   # overbought-zone distance in percent (default 5)
    avgUpATRSum=range(1, 7, 1),         # number of ATR values to sum (default 3)
    barrierLevel=range(45, 61, 5),      # RSI barrier level (default 50)
    pyramiding=range(4, 12, 1),         # maximum number of entries (default 8)
    constraint=lambda p: True,
    maximize=custom_score,
    method='sambo',  # use the SAMBO method
    max_tries=10,    # number of evaluations
    random_state=0,
    return_heatmap=True,
    return_optimization=True
)

In [19]:
heatmap.sort_values().iloc[-5:]

from sambo.plot import plot_objective
import numpy as np

# Parameter names, updated to match the modified strategy parameters
names = ['frequency', 'rsiFrequency', 'barrierLevel', 'pyramiding']
plot_dims = np.array(range(len(names)), dtype=np.int32)  # use the int32 dtype
_ = plot_objective(optimize_result, names=names, estimator='et', plot_dims=plot_dims)

from sambo.plot import plot_evaluations

# plot_dims as an int32 array
plot_dims = np.array(range(len(names)), dtype=np.int32)
_ = plot_evaluations(optimize_result, names=names, plot_dims=plot_dims)


import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

# The plots below focus on four parameters: frequency, rsiFrequency, barrierLevel and pyramiding
# Create a 3D scatter plot
fig = plt.figure(figsize=(15, 15))
ax = fig.add_subplot(111, projection='3d')

# Parameter values (assumes the heatmap index contains the levels
# 'frequency', 'rsiFrequency', 'barrierLevel' and 'pyramiding')
points = heatmap.index.to_frame()
scores = heatmap.values

# Scatter plot over frequency, rsiFrequency and barrierLevel; color encodes the optimization score
scatter = ax.scatter(points['frequency'],
                     points['rsiFrequency'],
                     points['barrierLevel'],
                     c=scores,
                     cmap='viridis',
                     s=100)  # marker size

# Axis labels
ax.set_xlabel('Frequency')
ax.set_ylabel('RSI period')
ax.set_zlabel('Barrier level')

# Color bar showing the optimization score
plt.colorbar(scatter, label='Optimization score')

# Figure title
plt.title('3D parameter optimization heatmap')

# Adjust the viewing angle for a clearer picture
ax.view_init(elev=20, azim=45)

# Add grid lines
ax.grid(True)

plt.show()

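# Print the five highest-scoring parameter combinations to help locate the best region
print("\nTop 5 parameter combinations:")
top_5 = heatmap.sort_values(ascending=False).head(5)
for idx, score in top_5.items():
    # Look parameters up by index level name rather than by position
    params = dict(zip(heatmap.index.names, idx))
    print(f"\nScore: {score:.4f}")
    print(f"Frequency: {params['frequency']}")
    print(f"RSI period: {params['rsiFrequency']}")
    print(f"Barrier level: {params['barrierLevel']}")
    print(f"Pyramiding: {params['pyramiding']}")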

# Optional: 2D projections
fig = plt.figure(figsize=(15, 5))

# Frequency vs RSI period
plt.subplot(131)
plt.scatter(points['frequency'],
            points['rsiFrequency'],
            c=scores,
            cmap='viridis')
plt.xlabel('Frequency')
plt.ylabel('RSI period')
plt.colorbar(label='Score')

# Frequency vs pyramiding
plt.subplot(132)
plt.scatter(points['frequency'],
            points['pyramiding'],
            c=scores,
            cmap='viridis')
plt.xlabel('Frequency')
plt.ylabel('Pyramiding')
plt.colorbar(label='Score')

# RSI period vs pyramiding
plt.subplot(133)
plt.scatter(points['rsiFrequency'],
            points['pyramiding'],
            c=scores,
            cmap='viridis')
plt.xlabel('RSI period')
plt.ylabel('Pyramiding')
plt.colorbar(label='Score')

plt.tight_layout()
plt.show()

frequency  rsiFrequency  buyZoneDistance  avgDownATRSum  barrierLevel  pyramiding
47         47            10               3              45            8             2.286909
37         37            9                2              47            7             2.639739
32         32            5                5              48            6             2.882155
19         2             3                5              60            5             3.261858
49         24            7                6              46            5             3.869815
dtype: float64

Notice how the optimization runs somewhat slower even though `max_tries=` is lower. This is due to the sequential nature of the algorithm and should actually perform quite comparably even in cases of _much larger parameter spaces_ where grid search would effectively blow up, likely reaching a better optimum than a simple randomized search would.
A note of warning, again, to take steps to avoid
[overfitting](https://en.wikipedia.org/wiki/Overfitting)
insofar as possible.

Understanding the impact of each parameter on the computed objective function is easy in two dimensions, but as the number of dimensions grows, partial dependency plots are increasingly useful.
[Plotting tools from _SAMBO_](https://sambo-optimization.github.io/doc/sambo/plot.html)
take care of the more mundane things needed to make good and informative plots of the parameter space.

Note, because SAMBO internally only does _minimization_, the values in `optimize_result` are negated (less is better).
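
The sign convention can be illustrated with a toy stand-in for a minimizer. This is a sketch of the general idea only, not SAMBO's API; `minimize_over` and `score` are hypothetical names.

```python
def minimize_over(candidates, objective):
    """Toy minimizer: return the candidate with the lowest objective value, and that value."""
    best_x = min(candidates, key=objective)
    return best_x, objective(best_x)


def score(x):
    # A quantity we want to MAXIMIZE; its peak is 10 at x = 3.
    return -(x - 3) ** 2 + 10


# To maximize with a minimizer, hand it the negated score ...
best_x, negated_value = minimize_over(range(7), lambda x: -score(x))
# ... and flip the sign again to read the result as a score.
best_score = -negated_value
print(best_x, negated_value, best_score)  # prints "3 -10 10"
```

When reading objective values from an optimization result produced this way, lower values therefore correspond to higher scores.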

Learn more by exploring further
[examples](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#tutorials)
or find more framework options in the
[full API reference](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#header-submodules).

In [24]:
from backtesting import Backtest

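# Set up the backtest with the MeanReverterShort strategy; all parameters can be passed in via run()
bt = Backtest(
    backtesting_df,
    MeanReverterShort,
    commission=0.0004,
    margin=1/3,  # 3x leverage
    trade_on_close=True,  # consistent with TradingView's process_orders_on_close
    exclusive_orders=True,
    hedging=False  # no hedging
)

# Run the backtest with custom parameters; every strategy parameter is listed
# explicitly and corresponds to a setting of the MeanReverterShort strategy
stats = bt.run(
    frequency=24,               # period of the slow RSI moving average (default: 10)
    rsiFrequency=4,             # RSI period (default: 40)
    sellZoneDistance=5,         # percentage by which RSI must exceed its slow average to count as overbought (default: 5)
    avgUpATRSum=3,              # number of ATR values to sum, for scale-in decisions (default: 3)
    useAbsoluteRSIBarrier=True, # whether to use the absolute RSI barrier (default: True)
    barrierLevel=73,            # RSI barrier level (default: 50)
    pyramiding=8                # maximum number of entries (default: 8)
)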

# Print the backtest statistics
print("\n=== Backtest Results ===")
print(f"Total Return: {stats['Return [%]']:.2f}%")
print(f"Sharpe Ratio: {stats['Sharpe Ratio']:.2f}")
print(f"Max Drawdown: {stats['Max. Drawdown [%]']:.2f}%") 
print(f"Win Rate: {stats['Win Rate [%]']:.2f}%")
print(f"Total Trades: {stats['# Trades']}")



=== Backtest Results ===
Total Return: 481.91%
Sharpe Ratio: 1.41
Max Drawdown: -33.97%
Win Rate: 52.43%
Total Trades: 1484


In [25]:
# Take the best parameter combination from the heatmap and map it onto the updated strategy parameters
best_idx = heatmap.sort_values(ascending=False).index[0]
best_params = {
    'frequency': best_idx[0],
    'rsiFrequency': best_idx[1],
    'sellZoneDistance': best_idx[2],   # may previously have been named buyZoneDistance
    'avgUpATRSum': best_idx[3],         # may previously have been named avgDownATRSum
    'barrierLevel': best_idx[4],
    'pyramiding': best_idx[5]
}

# Print the best parameters
print("\n=== Best Parameters ===")
print(f"Frequency: {best_params['frequency']}")
print(f"RSI Frequency: {best_params['rsiFrequency']}")
print(f"Sell Zone Distance: {best_params['sellZoneDistance']}")
print(f"Avg Up ATR Sum: {best_params['avgUpATRSum']}")
print(f"Barrier Level: {best_params['barrierLevel']}")
print(f"Pyramiding: {best_params['pyramiding']}")

# Run the backtest with the best parameters
stats = bt.run(
    frequency=best_params['frequency'],
    rsiFrequency=best_params['rsiFrequency'],
    sellZoneDistance=best_params['sellZoneDistance'],
    avgUpATRSum=best_params['avgUpATRSum'],
    useAbsoluteRSIBarrier=True,
    barrierLevel=best_params['barrierLevel'],
    pyramiding=best_params['pyramiding']
)

# Print the backtest statistics
print("\n=== Backtest Results ===")
print(f"Total Return: {stats['Return [%]']:.2f}%")
print(f"Sharpe Ratio: {stats['Sharpe Ratio']:.2f}")
print(f"Max Drawdown: {stats['Max. Drawdown [%]']:.2f}%")
print(f"Win Rate: {stats['Win Rate [%]']:.2f}%")
print(f"Total Trades: {stats['# Trades']}")

# Plot the backtest; resampling is disabled to avoid date-conversion errors
try:
    bt.plot(resample=False)
except TypeError as e:
    print("Warning: Plot error, possibly due to data type conversion issues")
    print(f"Error message: {str(e)}")

