## Requirements

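Portfolio construction: rank industries by the number of net-profit-gap (净利润断层) stocks they contain and take the top-ranked industries as the selected industries. For each selected industry, include every net-profit-gap stock in that industry.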
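The selection rule can be sketched as follows. This is a minimal illustration, not the notebook's implementation: the `signals` frame, its `stock` and `industry` columns, and the function name `select_portfolio` are all hypothetical, and ties between industries are broken alphabetically, which is an assumption.

```python
import pandas as pd

def select_portfolio(signals: pd.DataFrame, top_n: int) -> pd.DataFrame:
    """Keep every signal stock that belongs to one of the top_n industries.

    `signals` is assumed to hold one row per net-profit-gap stock for the
    current period, with columns 'stock' and 'industry' (hypothetical schema).
    """
    # Count distinct signal stocks per industry
    counts = signals.groupby("industry")["stock"].nunique().reset_index(name="n")
    # Rank by count (descending); break ties by industry name (assumption)
    ranked = counts.sort_values(["n", "industry"], ascending=[False, True])
    top_industries = ranked["industry"].head(top_n)
    return signals[signals["industry"].isin(top_industries)].reset_index(drop=True)

# Toy data: three industries with 3, 2 and 1 signal stocks
signals = pd.DataFrame({
    "stock": ["A", "B", "C", "D", "E", "F"],
    "industry": ["Pharma", "Pharma", "Pharma", "Electronics", "Electronics", "Banking"],
})
top1 = select_portfolio(signals, top_n=1)
top2 = select_portfolio(signals, top_n=2)
```

Calling the same function with `top_n=3` and `top_n=5` would give the two portfolios the requirements ask for.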

Rebalancing dates: rebalance at the end of the Q1, semi-annual, and Q3 reporting seasons, i.e. on April 30, August 31, and October 31 of each year.
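The nominal dates do not always fall on a trading day, so some convention is needed. The sketch below rolls each nominal date back to the last trading day on or before it. That convention, and the function name `rebalance_dates`, are assumptions; rolling forward to the next trading day would be an equally plausible reading of the requirement.

```python
import pandas as pd

def rebalance_dates(trading_days: pd.DatetimeIndex, start_year: int, end_year: int) -> list:
    """Map Apr 30 / Aug 31 / Oct 31 of each year onto actual trading days."""
    trading_days = trading_days.sort_values()
    out = []
    for year in range(start_year, end_year + 1):
        for month, day in [(4, 30), (8, 31), (10, 31)]:
            nominal = pd.Timestamp(year=year, month=month, day=day)
            # Skip nominal dates that lie beyond the available price history
            if nominal > trading_days[-1]:
                continue
            # Position of the last trading day <= nominal
            pos = trading_days.searchsorted(nominal, side="right") - 1
            if pos >= 0:
                out.append(trading_days[pos])
    return out

# Business days stand in for the real exchange calendar in this example
calendar = pd.bdate_range("2010-01-01", "2010-12-31")
dates_2010 = rebalance_dates(calendar, 2010, 2010)
```

In the notebook itself the index of a cleaned price frame would be the natural source for `trading_days`.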

Backtest period: May 2009 to July 2021.

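Build two net-profit-gap strategies, a "top-3 industries" strategy and a "top-5 industries" strategy, backtest both, and report annualized return, win rate, drawdown, the number of stocks held in each period, and similar statistics.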
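One possible way to compute the requested statistics from a series of per-period portfolio returns is sketched below. The definitions are assumptions: win rate is taken as the share of holding periods with a positive return (it could also be measured against a benchmark), and drawdown is measured at rebalance frequency, which understates the daily drawdown. The name `performance_summary` is hypothetical.

```python
import pandas as pd

def performance_summary(period_returns: pd.Series, periods_per_year: float = 3.0) -> dict:
    """Summarise simple returns, one per holding period (three periods per year)."""
    nav = (1.0 + period_returns).cumprod()          # net asset value, starting from 1
    years = len(period_returns) / periods_per_year
    annualized = nav.iloc[-1] ** (1.0 / years) - 1.0
    # Running peak, floored at the initial NAV of 1
    running_peak = nav.cummax().clip(lower=1.0)
    drawdown = nav / running_peak - 1.0
    return {
        "annualized_return": annualized,
        "win_rate": (period_returns > 0).mean(),
        "max_drawdown": drawdown.min(),
    }

# Three holding periods, i.e. one full year
stats = performance_summary(pd.Series([0.10, -0.05, 0.20]))
```

The three holding periods in a year have unequal lengths (four, two, and six months), but together they span exactly one year, which is why `periods_per_year` defaults to 3.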

A simplified gap-up can be defined as: min(open, close) on the later trading day > max(open, close) on the previous trading day.
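The simplified gap condition translates directly into a vectorised comparison. The sketch below assumes two wide price frames with identical date index and ticker columns, already converted to floats; the function name `simple_gap_up` is hypothetical.

```python
import numpy as np
import pandas as pd

def simple_gap_up(open_px: pd.DataFrame, close_px: pd.DataFrame) -> pd.DataFrame:
    """True where today's min(open, close) exceeds yesterday's max(open, close)."""
    body_low = np.minimum(open_px, close_px)    # lower edge of today's candle body
    body_high = np.maximum(open_px, close_px)   # upper edge of the candle body
    # shift(1) moves the previous row down so it lines up with the current row
    return body_low > body_high.shift(1)

open_px = pd.DataFrame({"X": [10.0, 10.6, 10.4]})
close_px = pd.DataFrame({"X": [10.5, 11.0, 10.2]})
gaps = simple_gap_up(open_px, close_px)
```

Rows are treated as consecutive trading days. A missing price (for example on a suspension day) makes the comparison evaluate to `False`, so no gap is flagged across missing data.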

In [1]:
import numpy as np
import pandas as pd
import cvxpy as cp
import statsmodels.api as sm
import investpy
import yfinance as yf
import matplotlib.pyplot as plt
import datetime
from dateutil.relativedelta import relativedelta
import seaborn as sns
import matplotlib.transforms as transforms
import quandl
from tqdm import tqdm
import dask
import dask.dataframe as dd


## 1. Read in the Data

In [13]:
# Shanghai Stock Exchange
SSE_OPEN = pd.read_excel('SSE_OPEN.xlsx')
SSE_CLOSE = pd.read_excel('SSE_CLOSE.xlsx')

# ChiNext （创业板）
ChiNext_OPEN = pd.read_excel('ChiNext_OPEN.xlsx')
ChiNext_CLOSE = pd.read_excel('ChiNext_CLOSE.xlsx')

Disclosure_time = pd.read_excel('Disclosure_time.xlsx')

# Shenzhen Stock Exchange
SZSE_OPEN = pd.read_excel('SZSE_OPEN.xlsx')
SZSE_CLOSE = pd.read_excel('SZSE_CLOSE.xlsx')

# STAR Market （科创板）
STAR_OPEN = pd.read_excel('STAR_OPEN.xlsx')
STAR_CLOSE = pd.read_excel('STAR_CLOSE.xlsx')

In [25]:
# promote the first row (stock names and tickers) to column names
SSE_OPEN_renamed = SSE_OPEN.rename(columns=SSE_OPEN.iloc[0])
# rename the date column from '日期' to 'Date'
SSE_OPEN_renamed.rename(columns={'日期': 'Date'}, inplace=True)
# use the dates as the index
SSE_OPEN_renamed.set_index('Date', inplace=True)
# drop the first row, which still holds the old header values
SSE_OPEN_renamed.drop(SSE_OPEN_renamed.index[0], inplace=True)
SSE_OPEN_renamed

Date,浦发银行600000.SH,白云机场600004.SH,东风汽车600006.SH,中国国贸600007.SH,首创环保600008.SH,上海机场600009.SH,包钢股份600010.SH,华能国际600011.SH,皖通高速600012.SH,华夏银行600015.SH,...,国邦医药605507.SH,德昌股份605555.SH,福莱蒽特605566.SH,春雪食品605567.SH,龙版传媒605577.SH,恒盛能源605580.SH,冠石科技605588.SH,圣泉集团605589.SH,上海港湾605598.SH,菜百股份605599.SH
2010-01-04 00:00:00,21.83,10.15,6.93,12.4,7.27,17.45,4.65,8.11,5.97,12.41,...,--,--,--,--,--,--,--,--,--,--
2010-01-05 00:00:00,21.41,10.15,6.91,11.86,7.16,17.9,4.58,8.03,5.88,12.37,...,--,--,--,--,--,--,--,--,--,--
2010-01-06 00:00:00,21.29,10.17,6.87,11.49,7.19,18,4.55,8.1,5.9,12.53,...,--,--,--,--,--,--,--,--,--,--
2010-01-07 00:00:00,20.88,10,6.85,11.6,7.22,17.6,4.57,8.02,5.86,12.13,...,--,--,--,--,--,--,--,--,--,--
2010-01-08 00:00:00,20.34,9.73,6.56,11.65,7,17.41,4.46,7.86,5.74,11.8,...,--,--,--,--,--,--,--,--,--,--
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2021-12-29 00:00:00,8.57,11.61,6.91,14.17,3.36,45.69,2.83,9.4,7.02,5.59,...,28.21,40.12,31.22,17.42,14.05,16.1,38.84,38.64,16.31,13
2021-12-30 00:00:00,8.54,11.93,6.82,14.25,3.36,46.3,2.79,9.13,7.05,5.6,...,28.16,39.91,31.09,17.5,15,16.24,38.86,37.6,15.92,12.94
2021-12-31 00:00:00,8.54,12.01,6.83,14.28,3.36,47.64,2.75,9.7,7.04,5.59,...,28.09,40.06,30.78,19.36,15,16.2,39.6,39.8,16.31,13.09
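Every price sheet needs the same header promotion, and stocks that were not yet listed appear as the string `--`, which leaves the columns with object dtype. A reusable helper along the following lines could handle both. It is a sketch under assumptions: the function name `clean_price_sheet` is hypothetical, `--` is assumed to be the only missing-value marker, and rows without a date are assumed to be empty trailing rows that are safe to drop.

```python
import pandas as pd

def clean_price_sheet(raw: pd.DataFrame) -> pd.DataFrame:
    """Promote the first row to the header, index by date, and cast prices to float."""
    df = raw.copy()
    df.columns = df.iloc[0]                 # first row holds '日期' plus the tickers
    df = df.iloc[1:]                        # remove the promoted header row
    df = df.rename(columns={"日期": "Date"}).set_index("Date")
    df = df.loc[df.index.notna()].copy()    # drop rows with no date (assumption)
    df.index = pd.to_datetime(df.index)
    df.columns.name = None
    # Non-numeric cells such as '--' become NaN
    return df.apply(pd.to_numeric, errors="coerce")

# Small stand-in for a sheet as returned by pd.read_excel
raw = pd.DataFrame(
    [["日期", "AAA", "BBB"],
     ["2010-01-04", 21.83, "--"],
     ["2010-01-05", 21.41, 5.0],
     [None, None, None]],
    columns=["Unnamed: 0", "开盘价(元)", "Unnamed: 2"],
)
clean = clean_price_sheet(raw)
```

With numeric columns, comparisons and arithmetic on the price frames behave as expected and missing prices propagate as `NaN`.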


In [26]:
ChiNext_OPEN.head()

Unnamed: 0.1,Unnamed: 0,开盘价(元),Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 1081,Unnamed: 1082,Unnamed: 1083,Unnamed: 1084,Unnamed: 1085,Unnamed: 1086,Unnamed: 1087,Unnamed: 1088,Unnamed: 1089,Unnamed: 1090
0,日期,特锐德300001.SZ,神州泰岳300002.SZ,乐普医疗300003.SZ,南风股份300004.SZ,探路者300005.SZ,莱美药业300006.SZ,汉威科技300007.SZ,天海防务300008.SZ,安科生物300009.SZ,...,超达装备301186.SZ,力诺特玻301188.SZ,奥尼电子301189.SZ,善水科技301190.SZ,家联科技301193.SZ,喜悦智行301198.SZ,迈赫股份301199.SZ,亨迪药业301211.SZ,观想科技301213.SZ,光庭信息301221.SZ
1,2010-01-04 00:00:00,42.49,106.5,51.59,39.7,43.5,33.49,44.48,51.2,43,...,--,--,--,--,--,--,--,--,--,--
2,2010-01-05 00:00:00,42.28,106.6,51.62,39.6,43,33.99,44.1,51.3,42.73,...,--,--,--,--,--,--,--,--,--,--
3,2010-01-06 00:00:00,42.29,110.39,51.7,38.68,42.98,34.68,43.88,51.26,42.09,...,--,--,--,--,--,--,--,--,--,--
4,2010-01-07 00:00:00,41.35,109.02,50.7,37.6,41.8,33.03,42.75,50.04,41.06,...,--,--,--,--,--,--,--,--,--,--


In [28]:
# promote the first row (stock names and tickers) to column names
ChiNext_OPEN_renamed = ChiNext_OPEN.rename(columns=ChiNext_OPEN.iloc[0])
# rename the date column from '日期' to 'Date'
ChiNext_OPEN_renamed.rename(columns={'日期': 'Date'}, inplace=True)
# use the dates as the index
ChiNext_OPEN_renamed.set_index('Date', inplace=True)
# drop the first row, which still holds the old header values
ChiNext_OPEN_renamed.drop(ChiNext_OPEN_renamed.index[0], inplace=True)
ChiNext_OPEN_renamed

Date,特锐德300001.SZ,神州泰岳300002.SZ,乐普医疗300003.SZ,南风股份300004.SZ,探路者300005.SZ,莱美药业300006.SZ,汉威科技300007.SZ,天海防务300008.SZ,安科生物300009.SZ,豆神教育300010.SZ,...,超达装备301186.SZ,力诺特玻301188.SZ,奥尼电子301189.SZ,善水科技301190.SZ,家联科技301193.SZ,喜悦智行301198.SZ,迈赫股份301199.SZ,亨迪药业301211.SZ,观想科技301213.SZ,光庭信息301221.SZ
2010-01-04 00:00:00,42.49,106.5,51.59,39.7,43.5,33.49,44.48,51.2,43,32.99,...,--,--,--,--,--,--,--,--,--,--
2010-01-05 00:00:00,42.28,106.6,51.62,39.6,43,33.99,44.1,51.3,42.73,32.88,...,--,--,--,--,--,--,--,--,--,--
2010-01-06 00:00:00,42.29,110.39,51.7,38.68,42.98,34.68,43.88,51.26,42.09,32.38,...,--,--,--,--,--,--,--,--,--,--
2010-01-07 00:00:00,41.35,109.02,50.7,37.6,41.8,33.03,42.75,50.04,41.06,31.5,...,--,--,--,--,--,--,--,--,--,--
2010-01-08 00:00:00,41.29,105.11,49.5,37,40.95,31.88,41.79,48.6,40,30.8,...,--,--,--,--,--,--,--,--,--,--
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2021-12-29 00:00:00,25.26,6.31,22.88,6.76,9.69,6.25,28.08,4.82,12.54,4.17,...,47.34,24.01,56.5,30.34,32.05,37.1,35.26,44.14,63.2,87.66
2021-12-30 00:00:00,24.59,6.19,22.68,6.78,9.69,6.64,27.5,4.81,12.8,4.18,...,46.69,24.1,56.98,31.7,31.96,37.29,35.58,39.4,62.26,87.48
2021-12-31 00:00:00,24.85,6.4,22.66,6.75,9.75,6.81,28.76,4.89,12.86,4.37,...,44.66,24.77,64.49,30.96,32.38,37.15,34.68,39.8,65,86.76


In [33]:
# collect the column labels (stock name + ticker) as a plain Python list
names = ChiNext_OPEN_renamed.columns.to_list()
names

['特锐德300001.SZ',
 '神州泰岳300002.SZ',
 '乐普医疗300003.SZ',
 '南风股份300004.SZ',
 '探路者300005.SZ',
 '莱美药业300006.SZ',
 '汉威科技300007.SZ',
 '天海防务300008.SZ',
 '安科生物300009.SZ',
 '豆神教育300010.SZ',
 '鼎汉技术300011.SZ',
 '华测检测300012.SZ',
 '新宁物流300013.SZ',
 '亿纬锂能300014.SZ',
 '爱尔眼科300015.SZ',
 '北陆药业300016.SZ',
 '网宿科技300017.SZ',
 '中元股份300018.SZ',
 '硅宝科技300019.SZ',
 '银江技术300020.SZ',
 '大禹节水300021.SZ',
 '吉峰科技300022.SZ',
 '*ST宝德300023.SZ',
 '机器人300024.SZ',
 '华星创业300025.SZ',
 '红日药业300026.SZ',
 '华谊兄弟300027.SZ',
 '*ST天龙300029.SZ',
 '阳普医疗300030.SZ',
 '宝通科技300031.SZ',
 '金龙机电300032.SZ',
 '同花顺300033.SZ',
 '钢研高纳300034.SZ',
 '中科电气300035.SZ',
 '超图软件300036.SZ',
 '新宙邦300037.SZ',
 '*ST数知300038.SZ',
 '上海凯宝300039.SZ',
 '九洲集团300040.SZ',
 '回天新材300041.SZ',
 '朗科科技300042.SZ',
 '星辉娱乐300043.SZ',
 '*ST赛为300044.SZ',
 '华力创通300045.SZ',
 '台基股份300046.SZ',
 '天源迪科300047.SZ',
 '合康新能300048.SZ',
 '福瑞股份300049.SZ',
 '世纪鼎利300050.SZ',
 'ST三五300051.SZ',
 '中青宝300052.SZ',
 '欧比特300053.SZ',
 '鼎龙股份300054.SZ',
 '万邦达300055.SZ',
 '中创环保300056.SZ',
 '万顺新材300057.S