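DataSync Developer Documentation

1. What DataSync Does

DataSync is a simplified ETL tool for securities data, written in Python. The project already connects to several data sources and databases and hides the technical details of fetching, cleaning, and writing data. Simple requirements can be met by editing the configuration file alone; more complex requirements can be handled with a small amount of custom development.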

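2. Installing the Service

2.1 Automatic Installation

In the Datasync directory, right-click datasync_install.bat and run it as administrator.

2.2 Manual Installation

If automatic installation fails, try installing manually.

2.2.1 Download the Project Code

Open a git command line locally and clone the project with the following command: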

git clone https://github.com/sicher123/DataSync.git

2.2.2 Install the Project

Open a command line in the DataSync directory and enter the following command:

python setup.py install

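2.2.3 Configuration

Configuration is stored in an Excel workbook. You can add entries for new tables as needed, but they must follow the existing format exactly; otherwise the service will raise errors.

The default configuration file is located at:
DataSync\datasync\config\config.xlsx
The Daily_data sheet holds the configuration for daily data, and the lb_data sheet holds the configuration for quarterly data. An example table:

| Table name | Description | (Example) dbo.AINDEXEODPRICES |
| --- | --- | --- |
| origin | Data source; SqlServer and Oracle are currently supported | MSSqlOrigin/OracleOrigin |
| db_config | Database connection settings: address, user name, and password | {'addr': '172.16.100.7', 'user': 'user1', 'password': '123456'} |
| fields | Data fields to request; if empty, the entire table is requested | S_DQ_LOW,S_DQ_HIGH |
| S_INFO_WINDCODE | Security codes to request; if empty, all securities in the table are requested | 000001.SZ,600000.SH |
| DATE_NAME | Name of the date index field | TRADE_DT |
| start_date | Default start date; if there is no local data, data is requested starting from this date | 20080101 |
| Other fields | …… | …… |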

2.2.4 Install as a Windows Scheduled Task

1) Open the Windows Task Scheduler.

2) Create a task.

3) Locate run_sync.bat in the DataSync project and configure the scheduled task to run this script once a day.

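2.3 Verifying the Service

After the scheduled task has executed run_sync.bat, or after you have run it manually, a log directory is created on the system desktop. Check the logs for a message saying that data synchronization succeeded; if there is none, check the code.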

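3. Project Structure

3.1 config&props

The config directory holds the fixed configuration files. The props directory provides read and write interfaces for the different configuration file types; each one ultimately produces a Python dictionary. The service currently uses config.xlsx, with the following settings:

| Table name | Description | (Example) dbo.AINDEXEODPRICES |
| --- | --- | --- |
| origin | Data source; SqlServer and Oracle are currently supported | MSSqlOrigin/OracleOrigin |
| db_config | Database connection settings: address, user name, and password | {'addr': '172.16.100.7', 'user': 'user1', 'password': '123456'} |
| fields | Data fields to request; if empty, the entire table is requested | S_DQ_LOW,S_DQ_HIGH |
| S_INFO_WINDCODE | Security codes to request; if empty, all securities in the table are requested | 000001.SZ,600000.SH |
| DATE_NAME | Name of the date index field | TRADE_DT |
| start_date | Default start date; if there is no local data, data is requested starting from this date | 20080101 |
| Other fields | …… | …… |

Within the data interfaces that are already supported, new requirements can be met by modifying or adding entries in the configuration file.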
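
To make the "configuration in, Python dictionary out" idea concrete, here is a minimal sketch of how the raw cell values for one table might be turned into a typed dictionary. The function name `parse_props` and the exact shape of the result are assumptions for illustration; the actual props readers in the project may differ.

```python
import ast


def parse_props(table_name, raw):
    """Turn the raw cell values for one table into a typed dict (hypothetical helper)."""

    def split_csv(value):
        # An empty cell means "no filter"; represent it as an empty list.
        return [item.strip() for item in (value or "").split(",") if item.strip()]

    return {
        "table": table_name,
        "origin": raw["origin"],
        # db_config is written in the sheet as a Python dict literal.
        "db_config": ast.literal_eval(raw["db_config"]),
        "fields": split_csv(raw.get("fields")),
        "S_INFO_WINDCODE": split_csv(raw.get("S_INFO_WINDCODE")),
        "DATE_NAME": raw["DATE_NAME"],
        # Excel may hand back the date as a number; keep it as a string.
        "start_date": str(raw["start_date"]),
    }


raw_cells = {
    "origin": "MSSqlOrigin",
    "db_config": "{'addr': '172.16.100.7', 'user': 'user1', 'password': '123456'}",
    "fields": "S_DQ_LOW,S_DQ_HIGH",
    "S_INFO_WINDCODE": "",
    "DATE_NAME": "TRADE_DT",
    "start_date": 20080101,
}
props = parse_props("dbo.AINDEXEODPRICES", raw_cells)
print(props["db_config"]["addr"], props["fields"], props["S_INFO_WINDCODE"])
```

`ast.literal_eval` is used instead of `eval` so that a malformed or malicious cell cannot execute code.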

3.2 origin

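3.2.1 Basic Information

The data source module. Currently supported data sources:

| Data source | Supported data | Notes |
| --- | --- | --- |
| jaqs | Stock minute bars, daily bars, and financial data | Requires the jaqs package; see http://qunatos.org/pro/ |
| oracle/sqlserver/mongodb | Local warehouse data | / |

To support a new data source, add a custom origin.

3.2.2 Function Reference

Every origin must implement a common set of base methods. Taking MSSqlOrigin as an example, the required methods are:

  • props_to_sql : converts the configuration into a query statement that the data source understands
  • connect : connects to the database
  • read : the data reading interface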
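
The three methods above can be sketched as follows. This is an illustration under stated assumptions, not the project's MSSqlOrigin: the class names `BaseOrigin` and `SqliteDemoOrigin` are hypothetical, and an in-memory SQLite database stands in for SQL Server so the example runs without a server.

```python
import sqlite3
from abc import ABC, abstractmethod


class BaseOrigin(ABC):
    """Hypothetical base class listing the three required methods."""

    @abstractmethod
    def props_to_sql(self, props): ...

    @abstractmethod
    def connect(self): ...

    @abstractmethod
    def read(self, props): ...


class SqliteDemoOrigin(BaseOrigin):
    """Stand-in origin backed by in-memory SQLite, for illustration only."""

    def __init__(self):
        self.conn = None

    def props_to_sql(self, props):
        # Identifiers come from the trusted config; values are bound as parameters.
        columns = ", ".join(props["fields"]) or "*"
        sql = f'SELECT {columns} FROM {props["table"]} WHERE {props["DATE_NAME"]} >= ?'
        params = [props["start_date"]]
        codes = props.get("S_INFO_WINDCODE") or []
        if codes:  # an empty list means "all securities"
            placeholders = ", ".join("?" for _ in codes)
            sql += f" AND S_INFO_WINDCODE IN ({placeholders})"
            params.extend(codes)
        return sql, params

    def connect(self):
        self.conn = sqlite3.connect(":memory:")
        return self.conn

    def read(self, props):
        sql, params = self.props_to_sql(props)
        return self.conn.execute(sql, params).fetchall()


origin = SqliteDemoOrigin()
conn = origin.connect()
conn.execute("CREATE TABLE prices (S_INFO_WINDCODE TEXT, TRADE_DT TEXT, S_DQ_LOW REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", [
    ("000001.SZ", "20080102", 9.5),
    ("600000.SH", "20080102", 12.1),
    ("000001.SZ", "20071228", 9.1),
])
demo_props = {
    "table": "prices",
    "fields": ["S_INFO_WINDCODE", "S_DQ_LOW"],
    "S_INFO_WINDCODE": ["000001.SZ"],
    "DATE_NAME": "TRADE_DT",
    "start_date": "20080101",
}
rows = origin.read(demo_props)
print(rows)
```

Keeping `props_to_sql` separate from `read` makes the generated statement easy to inspect or log before it is sent to the database.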

3.3 storage

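3.3.1 Basic Information

Interfaces to the local data warehouse. Currently supported stores:

| Store | Description | Use case |
| --- | --- | --- |
| Memory | Data is kept in memory | Small, low-frequency data only |
| excel | Data is stored as Excel files | Research that has to work across platforms |
| hdf5-pandas | pandas-based HDF5 files; convenient but resource-hungry | Medium data volumes with frequent full reads |
| hdf5 | Native HDF5; better performance but less flexible | Large data volumes with frequent full reads |
| mongodb | Document-oriented (key-value style) database; a better fit for securities data than SQL databases | Storage for all kinds of data |
| sqlite | Lightweight, file-based SQL database | Small to medium data volumes with frequent queries |

New databases can be added as custom stores.

3.3.2 Function Reference

Every storage must implement a common set of base methods. The required methods are:

  • get_update_info : returns the time of the latest record in the time-series data
  • update_file/update_table : the data writing interface

Other custom methods:

  • execute : simplifies executing SQL statements
  • set_attr : interface for writing other information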
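
As an illustration of the storage contract, the following sketch implements `execute`, `get_update_info`, and `update_table` on top of SQLite. The class name `SqliteDemoStorage` and the method signatures are assumptions made for this example; the project's own storage classes may take different arguments.

```python
import sqlite3


class SqliteDemoStorage:
    """Hypothetical storage backed by SQLite; not DataSync's own class."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)

    def execute(self, sql, params=()):
        # Run one statement inside a transaction and return any result rows.
        with self.conn:
            return self.conn.execute(sql, params).fetchall()

    def get_update_info(self, table, date_name):
        # Latest stored date, or None when the table is missing or empty.
        exists = self.execute(
            "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?", (table,)
        )
        if not exists:
            return None
        return self.execute(f"SELECT MAX({date_name}) FROM {table}")[0][0]

    def update_table(self, table, columns, rows):
        # Create the table on first use, then append the new rows.
        cols = ", ".join(columns)
        marks = ", ".join("?" for _ in columns)
        self.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")
        with self.conn:
            self.conn.executemany(
                f"INSERT INTO {table} ({cols}) VALUES ({marks})", rows
            )


store = SqliteDemoStorage()
before = store.get_update_info("prices", "TRADE_DT")
store.update_table(
    "prices",
    ["S_INFO_WINDCODE", "TRADE_DT"],
    [("000001.SZ", "20080102"), ("000001.SZ", "20080103")],
)
latest = store.get_update_info("prices", "TRADE_DT")
print(before, latest)
```

The value returned by `get_update_info` is what a sync script would use to decide where the next incremental request should start.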

3.4 sync

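3.4.1 Basic Information

This directory holds the scripts that implement the synchronization service. Write and extend them to suit your own requirements; there is no fixed format.

3.4.2 Function Reference

Using guojin_sync as a template, the workflow is:

  • read_config : loads the configuration
  • get_props : splits the configuration into smaller requests, so that a single call does not fetch so much data that memory runs out
  • spc_treatment : cleans non-standardized data
  • Updater : the updater; routes data of different formats to different update methods
  • run : reads data using the configuration
  • check_n_rollback : checks that the local data is correct and backs it up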

Example configuration for two tables:

| Table name | dbo.AINDEXEODPRICES | dbo.ASHAREEODDERIVATIVEINDICATOR |
| --- | --- | --- |
| DATE_NAME | TRADE_DT | TRADE_DT |
| S_INFO_WINDCODE | 000015.SH,399675.SZ,000095.SH,399635.SZ,399437.SZ,399015.SZ,399363.SZ,399989.SZ,000905.SH,000090.SH,000147.SH,399377.SZ,399396.SZ,000010.SH,000005.SH,399374.SZ,000941.SH,000841.SH,000153.SH,000801.SH,399995.SZ,000957.SH,000912.SH,000909.SH,399244.SZ,399429.SZ,000994.SH,000117.SH,399398.SZ,399685.SZ,399324.SZ,000091.SH,399002.SZ,000071.SH,000977.SH,000961.SH,399417.SZ,000019.SH,000978.SH,399004.SZ,399646.SZ,000985.SH,000098.SH,000917.SH,000910.SH,000938.SH,000828.SH,399012.SZ,399673.SZ,000099.SH,000037.SH,000937.SH,000965.SH,000094.SH,399409.SZ,399411.SZ,399624.SZ,399314.SZ,000943.SH,399393.SZ,399419.SZ,399814.SZ,000045.SH,399677.SZ,399385.SZ,399629.SZ,000944.SH,399439.SZ,399556.SZ,000819.SH,399337.SZ,000135.SH,000003.SH,399432.SZ,000073.SH,399806.SZ,399312.SZ,399380.SZ,399994.SZ,399384.SZ,000044.SH,399642.SZ,399965.SZ,399602.SZ,399664.SZ,000002.SH,000046.SH,000155.SH,000919.SH,399686.SZ,000815.SH,399681.SZ,399997.SZ,000068.SH,399627.SZ,399382.SZ,399403.SZ,399656.SZ,399356.SZ,399972.SZ,399554.SZ,000921.SH,399616.SZ,000054.SH,399339.SZ,399557.SZ,399553.SZ,000152.SH,399622.SZ,000030.SH,000018.SH,000952.SH,399604.SZ,399606.SZ,000138.SH,000097.SH,399618.SZ,399307.SZ,399695.SZ,000108.SH,399306.SZ,399555.SZ,399006.SZ,399240.SZ,399341.SZ,000121.SH,000983.SH,399697.SZ,399626.SZ,399007.SZ,000008.SH,399657.SZ,399418.SZ,000855.SH,399703.SZ,399322.SZ,399100.SZ,399010.SZ,399638.SZ,399433.SZ,000141.SH,399321.SZ,399362.SZ,000846.SH,399706.SZ,000984.SH,399550.SZ,000078.SH,399316.SZ,000025.SH,000145.SH,399392.SZ,000810.SH,399992.SZ,399348.SZ,399966.SZ,399993.SZ,000118.SH,000829.SH,399674.SZ,399687.SZ,000107.SH,399364.SZ,399610.SZ,399353.SZ,000038.SH,399813.SZ,399431.SZ,000040.SH,000131.SH,399310.SZ,000106.SH,399662.SZ,000066.SH,399420.SZ,000920.SH,399407.SZ,000120.SH,000092.SH,399361.SZ,000975.SH,000058.SH,399670.SZ,399639.SZ,399623.SZ,000161.SH,399351.SZ,399651.SZ,000968.SH,399315.SZ,399389.SZ,000915.SH,399669.SZ,000150.SH,399705.SZ,000129.SH,000840.SH,399249.SZ,399394.SZ,399434.SZ,399236.SZ,399011.SZ,000953.SH,000096.SH,399481.SZ,000918.SH,399428.SZ,000009.SH,000033.SH,399375.SZ,000922.SH,000047.SH,399103.SZ,000814.SH,000824.SH,000805.SH,000995.SH,000827.SH,000852.SH,000838.SH,000151.SH,000126.SH,000991.SH,399684.SZ,000130.SH,000069.SH,399611.SZ,000812.SH,399412.SZ,000914.SH,399676.SZ,000049.SH,399701.SZ,000048.SH,000960.SH,399406.SZ,399672.SZ,399613.SZ,000928.SH,399683.SZ,399998.SZ,399369.SZ,399647.SZ,000158.SH,399808.SZ,399303.SZ,399366.SZ,000804.SH,000811.SH,000806.SH,399619.SZ,000136.SH,000820.SH,000986.SH,000104.SH,000052.SH,399405.SZ,399630.SZ,399693.SZ,399017.SZ,399807.SZ,000972.SH,000992.SH,399682.SZ,399986.SZ,399388.SZ,399648.SZ,000955.SH,000950.SH,000939.SH,000825.SH,399378.SZ,399655.SZ,399238.SZ,000125.SH,000077.SH,399370.SZ,399391.SZ,000027.SH,399013.SZ,399653.SZ,000064.SH,399232.SZ,399235.SZ,399976.SZ,000945.SH,399643.SZ,399400.SZ,399698.SZ,000103.SH,000032.SH,000115.SH,000119.SH,399234.SZ,399241.SZ,399438.SZ,000031.SH,399365.SZ,000109.SH,000826.SH,399634.SZ,399628.SZ,000039.SH,399617.SZ,000056.SH,000079.SH,000041.SH,399248.SZ,000970.SH,000074.SH,000026.SH,399803.SZ,000940.SH,399102.SZ,000122.SH,000114.SH,000958.SH,399661.SZ,000839.SH,399413.SZ,000929.SH,000100.SH,000927.SH,399973.SZ,000822.SH,000816.SH,399372.SZ,000802.SH,000093.SH,000128.SH,399983.SZ,399395.SZ,399346.SZ,399423.SZ,000133.SH,000134.SH,399333.SZ,399422.SZ,000053.SH,399625.SZ,399243.SZ,399317.SZ,000947.SH,000146.SH,399313.SZ,399975.SZ,000063.SH,399390.SZ,399410.SZ,399680.SZ,399679.SZ,000933.SH,000982.SH,000160.SH,000809.SH,000966.SH,000007.SH,399399.SZ,399358.SZ,000110.SH,399016.SZ,399357.SZ,399699.SZ,399707.SZ,399386.SZ,000057.SH,000908.SH,399614.SZ,399678.SZ,000959.SH,000979.SH,000998.SH,000808.SH,000817.SH,000901.SH,000990.SH,000028.SH,399688.SZ,399691.SZ,000844.SH,000987.SH,399659.SZ,000139.SH,399671.SZ,399974.SZ,399644.SZ,000926.SH,399809.SZ,000946.SH,000902.SH,399326.SZ,399620.SZ,000981.SH,399959.SZ,399668.SZ,000062.SH,399970.SZ,000034.SH,000832.SH,399441.SZ,000076.SH,399381.SZ,399352.SZ,000132.SH,000913.SH,000936.SH,399330.SZ,399344.SZ,000951.SH,399612.SZ,399404.SZ,000807.SH,399637.SZ,399996.SZ,399237.SZ,000903.SH,399650.SZ,000803.SH,399704.SZ,399009.SZ,399636.SZ,399018.SZ,399379.SZ,000948.SH,399552.SZ,000020.SH,399702.SZ,399387.SZ,399551.SZ,000006.SH,000949.SH,399812.SZ,000907.SH,399320.SZ,399632.SZ,399101.SZ,399631.SZ,399408.SZ,000035.SH,000954.SH,399667.SZ,000159.SH,000123.SH,000967.SH,399242.SZ,399367.SZ,000102.SH,000989.SH,000818.SH,000906.SH,399649.SZ,399633.SZ,000065.SH,000070.SH,000830.SH,399350.SZ,000050.SH,000980.SH,399383.SZ,399335.SZ,000017.SH,399001.SZ,000969.SH,000932.SH,399689.SZ,399654.SZ,399231.SZ,399435.SZ,000930.SH,399660.SZ,399990.SZ,000971.SH,399107.SZ,399373.SZ,399427.SZ,000021.SH,000016.SH,000051.SH,000993.SH,399663.SZ,399376.SZ,399805.SZ,000956.SH,399401.SZ,000149.SH,399696.SZ,399694.SZ,000060.SH,000148.SH,000105.SH,000942.SH,000001.SH,399666.SZ,399355.SZ,000988.SH,399608.SZ,399359.SZ,399652.SZ,399233.SZ,399319.SZ,000113.SH,399368.SZ,000843.SH,000137.SH,000821.SH,399008.SZ,000142.SH,000963.SH,000904.SH,000072.SH,399690.SZ,000931.SH,000935.SH,399300.SZ,000075.SH,000112.SH,000925.SH,000055.SH,399645.SZ,000911.SH,000036.SH,399371.SZ,399991.SZ,000067.SH,000004.SH,399436.SZ,000962.SH,399311.SZ,000842.SH,000813.SH,399621.SZ,399640.SZ,399641.SZ,000059.SH,399665.SZ,000934.SH,399239.SZ,399360.SZ,000831.SH,000042.SH,000029.SH,399440.SZ,399005.SZ,399802.SZ,000162.SH,000043.SH,399397.SZ,000916.SH,399658.SZ,399804.SZ,399328.SZ,000964.SH,399967.SZ,000111.SH,399615.SZ,399692.SZ,399971.SZ,399402.SZ,000300.SH | |
| db_config | {'addr': '172.16.100.7', 'user': 'bigfish01', 'password': 'bigfish01@0514'} | {'addr': '172.16.100.7', 'user': 'bigfish01', 'password': 'bigfish01@0514'} |
| fields | S_DQ_VOLUME,S_DQ_AMOUNT,S_DQ_OPEN,S_DQ_HIGH,OBJECT_ID,S_DQ_LOW,TRADE_DT,S_DQ_CLOSE,S_INFO_WINDCODE | OPER_REV_TTM,UP_DOWN_LIMIT_STATUS,TOT_SHR_TODAY,S_DQ_MV,S_VAL_PS_TTM,S_VAL_PCF_NCF,NET_PROFIT_PARENT_COMP_LYR,S_PRICE_DIV_DPS,S_DQ_CLOSE_TODAY,S_VAL_PCF_OCF,NET_CASH_FLOWS_OPER_ACT_TTM,FLOAT_A_SHR_TODAY,S_VAL_PCF_NCFTTM,OBJECT_ID,OPER_REV_LYR,S_INFO_WINDCODE,S_VAL_PB_NEW,NET_ASSETS_TODAY,S_VAL_PCF_OCFTTM,S_VAL_PE,FREE_SHARES_TODAY,NET_CASH_FLOWS_OPER_ACT_LYR,S_DQ_TURN,S_VAL_PS,S_VAL_MV,NET_PROFIT_PARENT_COMP_TTM,S_VAL_PE_TTM,S_DQ_FREETURNOVER,TRADE_DT |
| folder_path | D:/hdf5_data | D:/hdf5_data |
| origin | MSSqlOrigin | MSSqlOrigin |
| start_date | 20000101 | 20000101 |
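
The core of the workflow in this section (split the configuration, read, clean, write) can be sketched as below. The function bodies are assumptions written for illustration and are not taken from guojin_sync; in particular, the chunk size, the cleaning rule in `spc_treatment`, and the `read`/`write` callables are hypothetical. Updater and check_n_rollback are left out to keep the sketch short.

```python
def get_props(config, chunk_size=100):
    """Split each table's code list into chunks so one request stays small."""
    for table, props in config.items():
        codes = props.get("S_INFO_WINDCODE") or []
        if not codes:
            # No codes configured: request the whole table in one go.
            yield dict(props, table=table)
            continue
        for i in range(0, len(codes), chunk_size):
            yield dict(props, table=table, S_INFO_WINDCODE=codes[i:i + chunk_size])


def spc_treatment(rows):
    """Example cleaning rule (an assumption): drop rows that have no date."""
    return [row for row in rows if row.get("TRADE_DT")]


def run(config, read, write, chunk_size=100):
    """Read every chunk, clean it, and hand it to the writer."""
    written = 0
    for props in get_props(config, chunk_size):
        rows = spc_treatment(read(props))
        write(props["table"], rows)
        written += len(rows)
    return written


# Stand-ins for an origin's read method and a storage's write method.
def fake_read(props):
    rows = [{"S_INFO_WINDCODE": code, "TRADE_DT": "20080102"}
            for code in props["S_INFO_WINDCODE"]]
    rows.append({"S_INFO_WINDCODE": "bad", "TRADE_DT": None})  # dirty row
    return rows


stored = {}


def fake_write(table, rows):
    stored.setdefault(table, []).extend(rows)


config = {
    "dbo.AINDEXEODPRICES": {
        "S_INFO_WINDCODE": ["000001.SH", "000300.SH", "399001.SZ"],
        "DATE_NAME": "TRADE_DT",
    }
}
total = run(config, fake_read, fake_write, chunk_size=2)
print(total, len(stored["dbo.AINDEXEODPRICES"]))
```

Because `get_props` is a generator, only one chunk of results needs to be held in memory at a time.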

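3.4.3 Important Notes

  • The service updates data automatically at a fixed frequency. If you rewrite the service, define the start and end time of each day's data consistently; otherwise data will be missing or duplicated.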
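
One way to keep daily windows from overlapping or leaving gaps is to derive each request window from the latest local record. The sketch below assumes a particular convention (closed date ranges in YYYYMMDD format, ending yesterday); the function name `next_window` and this convention are illustrative assumptions, not the project's documented behavior.

```python
from datetime import date, datetime, timedelta


def next_window(last_local, default_start, today):
    """Return (start, end) as YYYYMMDD strings, or None if nothing is due.

    Assumed convention: the window is closed on both ends, starts the day
    after the latest local record, and ends yesterday so that a trading day
    still being published is never stored half-complete.
    """
    fmt = "%Y%m%d"
    if last_local is None:
        # No local data yet: fall back to the configured start_date.
        start = datetime.strptime(default_start, fmt).date()
    else:
        start = datetime.strptime(last_local, fmt).date() + timedelta(days=1)
    end = today - timedelta(days=1)
    if start > end:
        return None
    return start.strftime(fmt), end.strftime(fmt)


today = date(2018, 10, 10)
print(next_window(None, "20080101", today))
print(next_window("20181008", "20080101", today))
print(next_window("20181009", "20080101", today))
```

Starting one day after the last stored record avoids duplicates, and never skipping ahead of it avoids gaps, even if the scheduled task misses a run.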