# Troubleshooting
`run_pipeline` raises `IndexError: cannot do a non-empty take from an empty axes.`

# Solution:
When a Pipeline uses `quantiles`, set the `start_date` passed to `run_pipeline` no earlier than the first trading day after the bundle's `start_date`.

# Environment and import setup

In [1]:
import os
import time
import pandas as pd
import numpy as np 
from logbook import Logger, StderrHandler, INFO, WARNING

# tej_key
# os.environ['TEJAPI_KEY'] = 'your key'
# os.environ['TEJAPI_BASE'] = 'https://api.tej.com.tw'  

os.environ['TEJAPI_KEY'] = 'yMV4yu7fMvcLoOV6w73S0i0V18TlXl'
os.environ['TEJAPI_BASE'] = 'http://10.10.10.66'


from zipline.sources.TEJ_Api_Data import get_universe, get_Benchmark_Return

from zipline.data.run_ingest import simple_ingest

from zipline.TQresearch.tej_pipeline import run_pipeline

from zipline.pipeline import Pipeline
from zipline.pipeline.domain import TW_EQUITIES
from zipline.pipeline.data import EquityPricing

log_handler = StderrHandler(format_string='[{record.time:%Y-%m-%d %H:%M:%S.%f}]: ' +
                            '{record.level_name}: {record.func_name}: {record.message}',
                            level=INFO)
log_handler.push_application()
log = Logger('Algorithm')

In [2]:
bundle_name = 'tquant'

# Start date for the TEJ API data
start = '2024-01-02'
start_dt = pd.Timestamp(start, tz='UTC')

# End date
end = '2024-03-31'
end_dt = pd.Timestamp(end, tz='UTC')

# Fields to use when ingesting
col = ['Market_Cap_Dollars']

# ingest

In [3]:
# Tickers to use when ingesting
pool = get_universe(start,
                    end,
                    idx_id='IX0002'
                   )

# Pricing data (price and volume)
simple_ingest(name = 'tquant',
              tickers = pool+['IR0001'],
              start_date = start,
              end_date = end)

[2024-04-24 10:19:08.356092]: INFO: get_universe_TW: Filters：{'idx_id': ['IX0002']}


Currently used TEJ API key call quota 933/100000 (0.93%)
Currently used TEJ API key data quota 78315269/10000000 (783.15%)
Now ingesting data.
End of ingesting tquant.
Please call function `get_bundle(start_dt = pd.Timestamp('2024-01-02', tz = 'utc'),end_dt = pd.Timestamp('2024-03-31' ,tz = 'utc'))` in `zipline.data.data_portal` to check data.
Currently used TEJ API key call quota 933/100000 (0.93%)
Currently used TEJ API key data quota 78315269/10000000 (783.15%)


# Building the Pipeline

In [4]:
my_pipe = Pipeline(domain = TW_EQUITIES)                         
my_pipe.add(EquityPricing.close.latest.quantiles(10), 'close_quartiles') 
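
To make the bucket labels in the later output easier to read, here is a small sketch of what splitting values into 10 quantile bins produces. It uses `pd.qcut` as a stand-in; zipline's own `quantiles` may treat ties and missing values differently, and the prices and ticker names are made-up numbers for illustration. With 10 bins the labels run from 0 to 9.

```python
import pandas as pd

# Hypothetical closing prices for 20 made-up tickers (not real market data).
closes = pd.Series(
    [float(v) for v in range(1, 21)],
    index=[f"T{i:02d}" for i in range(1, 21)],
)

# Split into 10 equal-sized buckets; labels=False returns integer bucket ids.
buckets = pd.qcut(closes, q=10, labels=False)

print(buckets.min(), buckets.max())
print(buckets.value_counts().sort_index())
```

Each bucket here holds two tickers because the 20 prices are distinct and evenly spaced; real prices will not split this neatly.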

# Problem

Calling `run_pipeline` raises `IndexError: cannot do a non-empty take from an empty axes.`

In [5]:
result = run_pipeline(my_pipe, start_dt, end_dt)

IndexError: cannot do a non-empty take from an empty axes.

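# Fix

The error occurs because, when the pipeline computes `quantiles` for 2024-01-02, it tries to load data from the **previous trading day**. The bundle, however, starts on 2024-01-02 (see `start_date` in the `! zipline bundle-info` output), so no data exists for the previous trading day. With nothing to bin, `quantiles` cannot assign groups, and the error is raised.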
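
The message itself is the one NumPy raises when asked to pick elements out of an array that has none. The sketch below is not the zipline code path (that is not shown in this note); it only reproduces the same message with a plain NumPy array, on the assumption that the pipeline ends up indexing into an empty window of data in a similar way.

```python
import numpy as np

# An empty array stands in for the missing previous-day data (illustrative only).
empty_window = np.array([], dtype=float)

try:
    # Ask for element 0 of an array that has no elements.
    empty_window.take([0])
except IndexError as exc:
    message = str(exc)
    print(message)
```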

In [6]:
! zipline bundle-info

tickers :
1101 1216 1301 1303 1326 1590 2002 2207 2301 2303
2308 2317 2327 2330 2345 2357 2379 2382 2395 2408
2412 2454 2603 2801 2880 2881 2882 2883 2884 2885
2886 2887 2890 2891 2892 2912 3008 3034 3037 3045
3231 3661 3711 4904 4938 5871 5876 5880 6505 6669
9910 IR0001 
start_date : 20240102.
end_date : 20240329.


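Set the `start_date` of `run_pipeline` no earlier than the first trading day after the bundle's `start_date`.
- In this example, the first trading day after the bundle's `start_date` is 2024-01-03, so setting the `start_date` of `run_pipeline` to 2024-01-03 is enough.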

In [7]:
rev_start_dt = TW_EQUITIES.next_open(start)
rev_start_dt

Timestamp('2024-01-03 00:00:00+0000', tz='UTC')
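
When a calendar-aware helper such as `next_open` is not at hand, the same idea can be sketched with plain pandas: pick the first session strictly after the bundle's start. The helper name `first_safe_start` is hypothetical, and `pd.bdate_range` is only a stand-in for the real trading calendar (it knows weekends but not Taiwan market holidays).

```python
import pandas as pd


def first_safe_start(sessions: pd.DatetimeIndex, bundle_start: pd.Timestamp) -> pd.Timestamp:
    """Return the first session strictly after bundle_start (hypothetical helper)."""
    later = sessions[sessions > bundle_start]
    if later.empty:
        raise ValueError("no session after the bundle start date")
    return later[0]


# Assumption: weekdays approximate trading sessions; holidays are ignored.
sessions = pd.bdate_range("2024-01-02", "2024-03-29", tz="UTC")
bundle_start = pd.Timestamp("2024-01-02", tz="UTC")

safe_start = first_safe_start(sessions, bundle_start)
print(safe_start)
```

For real use, the session list should come from the exchange calendar, since a weekday approximation can land on a market holiday.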

In [8]:
result = run_pipeline(my_pipe, rev_start_dt, end_dt)

In [9]:
result

| | | close_quartiles |
|---|---|---|
| 2024-01-03 00:00:00+00:00 | Equity(0 [1101]) | 2 |
| 2024-01-03 00:00:00+00:00 | Equity(1 [1216]) | 3 |
| 2024-01-03 00:00:00+00:00 | Equity(2 [1301]) | 4 |
| 2024-01-03 00:00:00+00:00 | Equity(3 [1303]) | 3 |
| 2024-01-03 00:00:00+00:00 | Equity(4 [1326]) | 3 |
| ... | ... | ... |
| 2024-03-29 00:00:00+00:00 | Equity(47 [5880]) | 1 |
| 2024-03-29 00:00:00+00:00 | Equity(48 [6505]) | 3 |
| 2024-03-29 00:00:00+00:00 | Equity(49 [6669]) | 9 |
| 2024-03-29 00:00:00+00:00 | Equity(50 [9910]) | 6 |
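
A quick sanity check on a result like this is to confirm that its earliest date is not before the revised start date, and to pull out a single day's cross-section. The sketch below rebuilds a few of the rows shown above as a toy DataFrame; plain ticker strings stand in for the `Equity` objects, and the level names `date` and `asset` are labels chosen for the example, since the real index levels are unnamed.

```python
import pandas as pd

# A handful of rows copied from the output above; tickers are plain strings here.
rows = [
    ("2024-01-03", "1101", 2),
    ("2024-01-03", "1216", 3),
    ("2024-01-03", "1301", 4),
    ("2024-03-29", "5880", 1),
    ("2024-03-29", "6505", 3),
]
index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp(day, tz="UTC"), ticker) for day, ticker, _ in rows],
    names=["date", "asset"],  # assumed names for the example
)
toy_result = pd.DataFrame({"close_quartiles": [q for _, _, q in rows]}, index=index)

# Earliest date present in the result.
first_date = toy_result.index.get_level_values("date").min()

# Cross-section for that date, indexed by asset only.
first_day = toy_result.xs(first_date, level="date")
print(first_date)
print(first_day)
```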
