# XTech US Equity Flow™ Analytics Demo Notebook

# Unifier Data Warehouse & API Simplify Data Access With A Single Interface

In [73]:
import os
from IPython.display import display

### If you have not installed unifier you can do so by running the following command:

In [74]:
# pip install unifier

## Simply Import and initialize Unifier and Go!

In [75]:
from unifier import unifier

In [None]:
unifier.user = '<unifier username>'
unifier.token = '<unifier api token>'
os.environ['UNIFIER_USER'] = unifier.user
os.environ['UNIFIER_TOKEN'] = unifier.token

In the cell above, replace `<unifier username>` with your Unifier user ID and `<unifier api token>` with your Unifier API token.
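
To keep credentials out of a shared notebook, one option is to read them from the environment instead of typing them into the cell. The sketch below is a minimal illustration only: `load_unifier_credentials` is a hypothetical helper (not part of the `unifier` package), and the only thing it assumes is the `UNIFIER_USER` / `UNIFIER_TOKEN` variable names used in the cell above.

```python
import os


def load_unifier_credentials(env=None):
    """Return (user, token) read from UNIFIER_USER / UNIFIER_TOKEN.

    Hypothetical helper for illustration; not part of the unifier package.
    """
    env = os.environ if env is None else env
    missing = [k for k in ("UNIFIER_USER", "UNIFIER_TOKEN") if not env.get(k)]
    if missing:
        raise KeyError(f"Missing environment variable(s): {', '.join(missing)}")
    return env["UNIFIER_USER"], env["UNIFIER_TOKEN"]


# Example with an explicit mapping so the sketch runs without real credentials.
user, token = load_unifier_credentials(
    {"UNIFIER_USER": "demo-user", "UNIFIER_TOKEN": "demo-token"}
)
print(user)
```

In the notebook, the result could then be assigned with `unifier.user, unifier.token = load_unifier_credentials()`.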

# XTech Flow™ US Equity Flow Analytics Documentation

## Overview
The **XTech Flow™ US Equity Flow Analytics** dataset provides unparalleled insights into US equity market flows with daily granularity and over 15 years of historical coverage. Designed for both fundamental and systematic investors, it enables detailed analysis of market actions and reactions.

### Key Details
- **Product**: XTech Flow™ US Equity Flow Analytics  
- **Version**: Indigo Panther  
- **Coverage**: US Equities  
- **Delivery Frequency**: Daily and Minutely / 15-minute delayed / Real-time 
- **Delivery Time**: 3 am ET / 15-minute delayed / Real-time  
- **Delivery Method**: Unifier API  
- **Data Frequency**: 1 minute, Hourly, Daily, Weekly  
- **Data Size**: 50–200 MB/day  
- **Deep History**: January 1, 2007 to Present (17+ years)  

---

## Datasets: **`xtech_us_equity_flow_daily`** and **`xtech_us_equity_flow_1min`**
### Description
These datasets offer daily and minutely data intervals and capture comprehensive market flow details for the US equity market. The `ticker` field identifies the security, providing insights into market activity on a daily and minutely basis.

### Applications
- **Support/Resistance Identification**: Pinpoint critical levels by symbol.  
- **Market Impact Analysis**: Estimate impact functions overall and by investor type.  
- **Flow Correlation Visualization**: Understand cross-asset flow correlations.  
- **Momentum & Reversal Signals**: Identify actionable market signals.  
- **HFT Behavior Analysis**: Detect patterns in unexplained or curious high-frequency trading behaviors.  
- **Risk Management**: Analyze concentration risk and other systematic factors.  

---

### Field Descriptions

| **Column Name**   | **Data Type** | **Description**                                                                |
|--------------------|---------------|--------------------------------------------------------------------------------|
| `asof_datetime`    | string        | Timestamp indicating the latest moment at which the data would be available in a real-time feed in America/New York time.                                |
| `extended_hours`   | int           | Indicates whether the minute bar falls in, or completes during, the extended-hours session (1: extended session; 0: regular session, which starts at 09:31:00 and ends at 16:01:00 New York time). |
| `ticker`           | string        | Security trading symbol                 |
| `trade_count`      | int           | Number of discrete trades in time period                                                     |
| `dollar_volume`    | double        | Dollar volume of trades in time period                                                     |
| `m1_inst_buy`      | double        | Buy dollar volume calculated using Method 1 to identify institutional trades.                                  |
| `m1_inst_sell`     | double        | Sell dollar volume calculated using Method 1 to identify institutional trades.                                 |
| `m1_inst_buy_count`     | int           | Buy trade count calculated using Method 1 to identify institutional trades.                                    |
| `m1_inst_sell_count`    | int           | Sell trade count calculated using Method 1 to identify institutional trades.                                   |
| `m2_inst_buy`      | double        | Buy dollar volume calculated using Method 2 to identify institutional trades.                                  |
| `m2_inst_sell`     | double        | Sell dollar volume calculated using Method 2 to identify institutional trades.                                 |
| `m2_inst_buy_count`     | int           | Buy trade count calculated using Method 2 to identify institutional trades.                                    |
| `m2_inst_sell_count`    | int           | Sell trade count calculated using Method 2 to identify institutional trades.                                   |
| `m3_inst_buy`      | double        | Buy dollar volume calculated using Method 3 to identify institutional trades.                                  |
| `m3_inst_sell`     | double        | Sell dollar volume calculated using Method 3 to identify institutional trades.                                 |
| `m3_inst_buy_count`     | int           | Buy trade count calculated using Method 3 to identify institutional trades.                                    |
| `m3_inst_sell_count`    | int           | Sell trade count calculated using Method 3 to identify institutional trades.                                   |
| `m4_inst_buy`      | double        | Buy dollar volume calculated using Method 4 to identify institutional trades.                                  |
| `m4_inst_sell`     | double        | Sell dollar volume calculated using Method 4 to identify institutional trades.                                 |
| `m4_inst_buy_count`     | int           | Buy trade count calculated using Method 4 to identify institutional trades.                                    |
| `m4_inst_sell_count`    | int           | Sell trade count calculated using Method 4 to identify institutional trades.                                   |
| `m5_inst_buy`      | double        | Buy dollar volume calculated using Method 5 to identify institutional trades.                                |
| `m5_inst_sell`     | double        | Sell dollar volume calculated using Method 5 to identify institutional trades.                               |
| `m5_inst_buy_count`     | int           | Buy trade count calculated using Method 5 to identify institutional trades.                                    |
| `m5_inst_sell_count`    | int           | Sell trade count calculated using Method 5 to identify institutional trades.                                   |
| `retail_buy`       | double        | Buy dollar volume calculated using Method 6 to identify retail trades.                                  |
| `retail_sell`      | double        | Sell dollar volume calculated using Method 6 to identify retail trades.                                 |
| `retail_buy_count`      | int           | Buy trade count calculated using Method 6 to identify retail trades.                                    |
| `retail_sell_count`     | int           | Sell trade count calculated using Method 6 to identify retail trades.                                   |
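
As a quick illustration of how the buy and sell columns combine, the sketch below computes net flow and a normalized imbalance for a single row. The row values are made up, and `net_flow` / `flow_imbalance` are hypothetical helper names rather than anything provided by the dataset or the Unifier API. The column names use the `method5_inst_*` spelling that appears in the sample outputs in this notebook, which the table above abbreviates as `m5_inst_*`.

```python
def net_flow(row, prefix):
    """Buy minus sell dollar volume for one investor-type prefix."""
    return row[f"{prefix}_buy"] - row[f"{prefix}_sell"]


def flow_imbalance(row, prefix):
    """Net flow scaled by gross flow, in [-1, 1]; None when there is no flow."""
    gross = row[f"{prefix}_buy"] + row[f"{prefix}_sell"]
    if gross == 0:
        return None
    return net_flow(row, prefix) / gross


# Made-up values for illustration only.
row = {
    "ticker": "XYZ",
    "method5_inst_buy": 750.0,
    "method5_inst_sell": 250.0,
    "retail_buy": 0.0,
    "retail_sell": 0.0,
}

print(net_flow(row, "method5_inst"))        # 500.0
print(flow_imbalance(row, "method5_inst"))  # 0.5
print(flow_imbalance(row, "retail"))        # None
```

Scaling by gross flow makes the imbalance comparable across tickers with very different dollar volumes, which raw net flow is not.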

---

### Why This Dataset is Unique
This is a new dataset with unprecedented 1-minute granularity and 15 years of history. It gives analysts the unique ability to distinguish institutional from retail flow. The team that developed this product has over 20 years of experience in HFT and other systematic strategies across all major asset classes. The dataset provides near-real-time market flow color comprehensively across the entire US market. Fundamental and systematic investors alike can use this data to interpret market action and reaction to new events and to directly disentangle whether returns are explained by market impact or by changes in expectations based on new information. In the near future, extended versions of this data will build upon this foundational dataset and will provide even greater granularity, flow decomposition, and more investor types.

---

### Potential Use Cases
- Identify critical support/resistance levels by symbol
- Estimate market impact functions overall and by investor type
- Visualize cross-asset flow correlations overall and by investor type
- Identify Momentum Signals
- Identify Reversal Signals
- Overlay on stat arb models to understand when temporary market impact is ending
- Identify what types of players are responsible for unexplained or curious HFT behaviors observed in other strategies
- Analyze Concentration Risk
- Understand Stock Option Short Gamma Behavior of Dealers
- Other Example Strategies Might Be:
  - Enter after large position changes
  - Enter positions based on price threshold + position increases
  - Enter based on position threshold and exit after a certain period of time
- Identifying large market moving trades as they happen
- Intraday momentum trades
- 13D/13F Announcement Predictions
- M&A Position Tracking
- Index Add/Delete Strategy Tracking
- Closed-End-Fund Arbitrage Strategies
- Open/Closing Auction Imbalance Prediction
- Close-Open Returns
- Open-Close Returns
- Can be combined with XTech Option Flow Analytic to provide a complete view of order imbalances each minute

---


## Retrieve data for a specific date using the asof_date parameter

In [77]:
df = unifier.get_dataframe(name="xtech_us_equity_flow_daily", asof_date='2024-01-23', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,includes_extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2007-06-11 20:00:03.000,2007-06-11 16:02:00,1,A,6750,5.275413e+07,23021924.46,2.913888e+07,2715,3562,...,919,25735357.55,2.642545e+07,474,621,148375.69,444943.18,21,60,2007-06-11
1,2007-06-11 20:00:03.000,2007-06-11 16:35:00,1,AA,19158,2.049722e+08,95615725.38,1.074674e+08,8547,10324,...,2352,26726279.97,1.763568e+08,1902,15672,820605.06,1068477.41,70,96,2007-06-11
2,2007-06-11 20:00:03.000,2007-06-11 15:56:00,0,AAB.WS,7,2.191001e+04,0.00,2.071000e+04,0,7,...,3,3425.00,1.728500e+04,1,2,0.00,1200.01,0,1,2007-06-11
3,2007-06-11 20:00:03.000,2007-06-11 15:56:00,0,AAC,109,4.628252e+04,8168.35,3.300697e+04,50,59,...,37,22669.12,1.850623e+04,5,3,1405.73,3701.47,3,9,2007-06-11
4,2007-06-11 20:00:03.000,2007-06-11 16:01:00,1,AACC,669,2.895495e+06,1245424.48,1.647937e+06,270,376,...,119,1441376.12,1.451985e+06,76,68,2133.96,0.00,1,0,2007-06-11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2007-06-11 20:00:03.000,2007-06-11 18:13:00,1,ACUS,699,7.084372e+05,170950.26,5.083458e+05,214,451,...,84,334460.59,3.448355e+05,35,63,7102.93,22038.22,6,8,2007-06-11
96,2007-06-11 20:00:03.000,2007-06-11 16:05:00,1,ACV,1991,1.001814e+07,5332689.70,4.579717e+06,1056,899,...,287,3581158.77,6.331248e+06,234,770,82490.17,23241.66,10,3,2007-06-11
97,2007-06-11 20:00:03.000,2007-06-11 16:34:00,1,ACW,1462,5.351107e+06,3131347.98,2.210540e+06,777,684,...,263,4178474.12,1.163414e+06,1092,368,9218.83,0.00,1,0,2007-06-11
98,2007-06-11 20:00:03.000,2007-06-11 16:02:00,1,ACXM,1360,6.602411e+06,2766083.42,3.771577e+06,574,718,...,207,3479872.19,3.057788e+06,131,81,19998.62,44752.07,3,6,2007-06-11


In [78]:
df = unifier.get_dataframe(name="xtech_us_equity_flow_1min", asof_date='2024-01-23', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2015-02-17 09:30:03,2015-02-17 09:30:00.000,1,OMED,7,86880.62,49646.07,37234.55,4,3,...,3,43440.31,43440.31,0,0,0.00,0.00,0,0,2015-02-17
1,2015-02-17 09:31:03,2015-02-17 09:31:00.000,0,OMED,1,2639.10,0.00,0.00,1,0,...,1,0.00,0.00,0,0,0.00,2639.10,0,1,2015-02-17
2,2015-02-17 09:32:03,2015-02-17 09:32:00.000,0,OMED,1,26.21,0.00,26.21,0,1,...,1,13.10,13.10,0,0,0.00,0.00,0,0,2015-02-17
3,2015-02-17 09:34:03,2015-02-17 09:34:00.000,0,OMED,1,3937.35,0.00,0.00,0,1,...,0,0.00,0.00,0,0,3937.35,0.00,1,0,2015-02-17
4,2015-02-17 09:35:03,2015-02-17 09:35:00.000,0,OMED,1,2643.00,2643.00,0.00,1,0,...,0,1321.50,1321.50,0,0,0.00,0.00,0,0,2015-02-17
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2015-02-17 12:16:03,2015-02-17 12:16:00.000,0,OMED,2,7717.62,7717.62,0.00,2,0,...,0,3858.81,3858.81,0,0,0.00,0.00,0,0,2015-02-17
96,2015-02-17 12:20:03,2015-02-17 12:20:00.000,0,OMED,1,2644.00,2644.00,0.00,1,0,...,0,2644.00,0.00,1,0,0.00,0.00,0,0,2015-02-17
97,2015-02-17 12:21:03,2015-02-17 12:21:00.000,0,OMED,1,2642.00,2642.00,0.00,1,0,...,1,1321.00,1321.00,0,0,0.00,0.00,0,0,2015-02-17
98,2015-02-17 12:22:03,2015-02-17 12:22:00.000,0,OMED,12,31800.38,1325.02,30475.36,0,11,...,4,14575.17,17225.21,0,1,0.00,0.00,0,0,2015-02-17


## Retrieve data for a specific date range using the back_to and up_to parameters


In [79]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_daily', back_to='2024-01-01', up_to='2024-02-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,includes_extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2024-01-26 20:00:03.000,2024-01-26 19:00:00,1,A,21421,1.548267e+08,90109307.82,61486811.53,10432,10980,...,5561,4.431022e+07,1.072859e+08,7924,13495,1649613.11,1580923.47,312,274,2024-01-26
1,2024-01-26 20:00:03.000,2024-01-26 19:58:00,1,AA,42045,1.478652e+08,75346410.57,65922664.26,24437,17608,...,5716,1.374260e+08,3.843087e+06,40184,1861,3452718.52,3143378.19,997,937,2024-01-26
2,2024-01-26 20:00:03.000,2024-01-26 20:00:00,1,AAA,82,3.617058e+05,174148.92,104432.79,54,15,...,23,1.258004e+05,1.527813e+05,6,2,79532.21,3591.89,6,4,2024-01-26
3,2024-01-26 20:00:03.000,2024-01-26 20:00:00,1,AAAU,1291,2.812714e+07,10596104.86,11127907.08,593,693,...,303,2.764566e+06,1.895945e+07,131,1136,3359260.28,3043866.08,168,127,2024-01-26
4,2024-01-26 20:00:03.000,2024-01-26 19:54:00,1,AACG,80,7.770543e+03,4566.31,2678.43,34,27,...,26,3.659610e+03,3.585090e+03,2,0,190.75,335.03,4,9,2024-01-26
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2024-01-26 20:00:03.000,2024-01-26 19:01:00,1,ACRE,3176,3.623532e+06,1490077.55,1967011.38,1118,1136,...,501,1.753788e+06,1.703301e+06,83,70,63660.53,102782.35,84,64,2024-01-26
96,2024-01-26 20:00:03.000,2024-01-26 19:39:00,1,ACRS,2673,8.623227e+05,372692.24,420052.17,1035,1174,...,385,3.861224e+05,4.066219e+05,30,44,40745.21,28833.12,91,75,2024-01-26
97,2024-01-26 20:00:03.000,2024-01-26 16:03:00,1,ACRV,620,2.436263e+05,103739.07,129669.46,265,355,...,134,4.674265e+04,1.866659e+05,106,514,9898.28,319.54,16,6,2024-01-26
98,2024-01-26 20:00:03.000,2024-01-26 19:00:00,1,ACRpC,78,1.979822e+05,120491.67,57311.37,46,30,...,24,9.018207e+04,8.762096e+04,0,6,23.85,20155.33,1,4,2024-01-26


In [80]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_1min', back_to='2024-01-01', up_to='2024-02-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2024-01-24 08:27:03,2024-01-24 08:27:00.000,1,UNFI,3,45.0300,0.00,45.03,0,3,...,0,30.02,15.01,2,1,0.00,0.0,0,0,2024-01-24
1,2024-01-24 08:48:03,2024-01-24 08:48:00.000,1,UNFI,1,301.0000,0.00,301.00,0,1,...,0,301.00,0.00,1,0,0.00,0.0,0,0,2024-01-24
2,2024-01-24 09:29:03,2024-01-24 09:29:00.000,1,UNFI,3,452.7000,452.70,0.00,3,0,...,0,452.70,0.00,3,0,0.00,0.0,0,0,2024-01-24
3,2024-01-24 09:30:03,2024-01-24 09:30:00.000,1,UNFI,14,82818.9286,70987.65,5915.64,13,1,...,4,76903.29,0.00,14,0,5915.64,0.0,1,0,2024-01-24
4,2024-01-24 09:31:03,2024-01-24 09:31:00.000,0,UNFI,5,242.1402,145.28,96.86,3,2,...,3,242.14,0.00,5,0,0.00,0.0,0,0,2024-01-24
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2024-01-24 11:05:03,2024-01-24 11:05:00.000,0,UNFI,4,332.0000,166.00,166.00,2,2,...,1,332.00,0.00,4,0,0.00,0.0,0,0,2024-01-24
96,2024-01-24 11:06:03,2024-01-24 11:06:00.000,0,UNFI,1,45.2400,0.00,45.24,0,1,...,1,45.24,0.00,1,0,0.00,0.0,0,0,2024-01-24
97,2024-01-24 11:07:03,2024-01-24 11:07:00.000,0,UNFI,4,497.7400,124.44,373.30,1,3,...,1,497.74,0.00,4,0,0.00,0.0,0,0,2024-01-24
98,2024-01-24 11:08:03,2024-01-24 11:08:00.000,0,UNFI,7,3289.5400,939.87,2349.67,2,5,...,1,3289.54,0.00,7,0,0.00,0.0,0,0,2024-01-24


## Retrieve data for a specific date with a specific ticker (key='ticker') using the asof_date parameter

In [81]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_daily',key='AAPL', asof_date='2024-01-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,includes_extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2007-03-09 20:00:03.000,2007-03-09 19:47:00,1,AAPL,77686,1.398196e+09,6.562730e+08,7.068474e+08,37233,39729,...,9749,6.308373e+08,7.322830e+08,31058,36805,1.635379e+07,1.872188e+07,888,641,2007-03-09
1,2007-01-09 20:00:03.000,2007-01-09 20:01:00,1,AAPL,569527,1.068215e+10,5.679503e+09,4.809443e+09,256935,210141,...,93771,7.567946e+09,2.921000e+09,388304,136790,1.133242e+08,7.988301e+07,6049,4246,2007-01-09
2,2007-07-09 20:00:03.000,2007-07-09 19:59:00,1,AAPL,144402,4.556740e+09,2.062378e+09,2.359438e+09,66304,76463,...,24610,1.389937e+09,3.031880e+09,41545,102668,7.071391e+07,6.420932e+07,2169,1969,2007-07-09
3,2007-04-16 20:00:03.000,2007-04-16 20:00:00,1,AAPL,91880,1.880573e+09,1.028096e+09,8.069202e+08,49854,39117,...,9165,1.417609e+09,4.174070e+08,57704,9605,2.560577e+07,1.995090e+07,1220,947,2007-04-16
4,2007-01-18 20:00:03.000,2007-01-18 20:01:00,1,AAPL,392605,7.594435e+09,3.657181e+09,3.807896e+09,165389,172593,...,55459,3.288715e+09,4.176363e+09,169676,222927,7.599306e+07,5.336402e+07,4036,2844,2007-01-18
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2007-06-12 20:00:03.000,2007-06-12 20:00:00,1,AAPL,247836,5.991503e+09,2.977940e+09,2.881145e+09,124891,119874,...,40112,3.334052e+09,2.525034e+09,140292,107542,8.165592e+07,5.076194e+07,3317,2050,2007-06-12
96,2007-07-05 20:00:03.000,2007-07-05 20:01:00,1,AAPL,229508,6.627189e+09,3.390201e+09,2.996050e+09,120213,106468,...,40544,4.291421e+09,2.094831e+09,150291,79210,1.388906e+08,1.020466e+08,4796,3506,2007-07-05
97,2007-05-15 20:00:03.000,2007-05-15 20:01:00,1,AAPL,153101,3.690993e+09,1.715724e+09,1.880629e+09,72396,79395,...,23787,1.455175e+09,2.141178e+09,60443,92651,4.719894e+07,4.744076e+07,1919,1926,2007-05-15
98,2007-08-03 20:00:03.000,2007-08-03 20:01:00,1,AAPL,109714,3.082531e+09,1.400931e+09,1.592674e+09,49966,56539,...,23557,8.975946e+08,2.096010e+09,32937,76763,4.827793e+07,4.064883e+07,1674,1429,2007-08-03


In [82]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_1min',key='AAPL', asof_date='2024-01-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2015-02-19 04:01:03,2015-02-19 04:01:00.000,1,AAPL,1,38577.00,38577.00,0.00,1,0,...,0,19288.50,19288.50,0,0,0,0,0,0,2015-02-19
1,2015-02-19 04:02:03,2015-02-19 04:02:00.000,1,AAPL,1,12859.00,6429.50,6429.50,0,0,...,0,6429.50,6429.50,0,0,0,0,0,0,2015-02-19
2,2015-02-19 04:03:03,2015-02-19 04:03:00.000,1,AAPL,1,12859.00,12859.00,0.00,1,0,...,0,12859.00,0.00,1,0,0,0,0,0,2015-02-19
3,2015-02-19 04:07:03,2015-02-19 04:07:00.000,1,AAPL,1,61704.00,61704.00,0.00,1,0,...,1,30852.00,30852.00,0,0,0,0,0,0,2015-02-19
4,2015-02-19 04:23:03,2015-02-19 04:23:00.000,1,AAPL,1,12864.00,12864.00,0.00,1,0,...,0,6432.00,6432.00,0,0,0,0,0,0,2015-02-19
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2015-02-19 07:55:03,2015-02-19 07:55:00.000,1,AAPL,7,256860.88,0.00,256860.88,0,7,...,4,128430.44,128430.44,0,0,0,0,0,0,2015-02-19
96,2015-02-19 07:58:03,2015-02-19 07:58:00.000,1,AAPL,3,38559.00,38559.00,0.00,3,0,...,0,19279.50,19279.50,0,0,0,0,0,0,2015-02-19
97,2015-02-19 07:59:03,2015-02-19 07:59:00.000,1,AAPL,11,745411.25,745411.25,0.00,11,0,...,1,372705.62,372705.62,0,0,0,0,0,0,2015-02-19
98,2015-02-19 08:00:03,2015-02-19 08:00:00.000,1,AAPL,1,35994.00,35994.00,0.00,1,0,...,0,17997.00,17997.00,0,0,0,0,0,0,2015-02-19


## Retrieve data for a specific date range with a specific ticker (key='ticker') using the back_to and up_to parameters


In [83]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_daily',key='AAPL', back_to='2024-01-01', up_to='2024-02-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,includes_extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2024-01-02 20:00:03.000,2024-01-02 20:00:00,1,AAPL,990769,14124460000.0,6623136000.0,6059647000.0,542285,446477,...,203660,5313102000.0,7369681000.0,246488,744247,936135300.0,505543300.0,69922,36466,2024-01-02
1,2024-01-09 20:00:03.000,2024-01-09 20:00:00,1,AAPL,530846,7457582000.0,4200570000.0,2515406000.0,348168,182397,...,98440,5819325000.0,896650600.0,453101,77735,423238100.0,318368000.0,32096,23905,2024-01-09
2,2024-01-26 20:00:03.000,2024-01-26 20:00:00,1,AAPL,527652,8164331000.0,3866378000.0,3615130000.0,311708,214627,...,92775,4173297000.0,3308211000.0,306436,220071,382761300.0,300061700.0,24914,19586,2024-01-26
3,2024-01-29 20:00:03.000,2024-01-29 20:00:00,1,AAPL,590108,8135218000.0,3895377000.0,3484026000.0,360768,226701,...,105285,2352155000.0,5027247000.0,218967,369775,437841500.0,317974000.0,31284,22478,2024-01-29
4,2024-01-30 20:00:03.000,2024-01-30 20:00:00,1,AAPL,682123,9997138000.0,5243521000.0,3764508000.0,398518,282550,...,131715,4989138000.0,4018891000.0,331875,350248,598316600.0,390792100.0,41139,26414,2024-01-30
5,2024-01-31 20:00:03.000,2024-01-31 20:00:00,1,AAPL,673798,9649756000.0,4751073000.0,4092012000.0,399686,271692,...,129079,4309913000.0,4533171000.0,419310,254487,490122800.0,316548600.0,36382,23948,2024-01-31
6,2024-01-19 20:00:03.000,2024-01-19 20:00:00,1,AAPL,672873,12477520000.0,6453219000.0,5029133000.0,417858,250310,...,121470,7644821000.0,3837530000.0,532463,140410,541587500.0,453579300.0,32810,25615,2024-01-19
7,2024-01-04 20:00:03.000,2024-01-04 20:00:00,1,AAPL,704320,11606130000.0,5085719000.0,5255783000.0,410959,291530,...,141718,3824973000.0,6516528000.0,285462,418792,852138200.0,412488400.0,52306,25296,2024-01-04
8,2024-01-25 20:00:03.000,2024-01-25 20:00:00,1,AAPL,635510,9921383000.0,4653089000.0,4391300000.0,364205,269872,...,110063,4187310000.0,4857079000.0,305904,329605,480837400.0,396156200.0,31153,25473,2024-01-25
9,2024-01-18 20:00:03.000,2024-01-18 20:00:00,1,AAPL,777009,14074120000.0,6525488000.0,6259469000.0,444246,329015,...,144862,6136234000.0,6648724000.0,461588,313287,697219800.0,591946800.0,39269,31654,2024-01-18


In [84]:
df = unifier.get_dataframe(name='xtech_us_equity_flow_1min',key='AAPL', back_to='2024-01-01', up_to='2024-02-01', limit=100)
display(df)

Unnamed: 0,asof_datetime,timestamp,extended_hours,ticker,trade_count,dollar_volume,method1_inst_buy,method1_inst_sell,method1_inst_buy_count,method1_inst_sell_count,...,method4_inst_sell_count,method5_inst_buy,method5_inst_sell,method5_inst_buy_count,method5_inst_sell_count,retail_buy,retail_sell,retail_buy_count,retail_sell_count,date
0,2024-01-18 04:01:03,2024-01-18 04:01:00.000,1,AAPL,71,231336.92,136847.19,94489.73,42,29,...,16,97747.99,133588.93,0,11,0,0,0,0,2024-01-18
1,2024-01-18 04:02:03,2024-01-18 04:02:00.000,1,AAPL,12,33070.45,11023.48,22046.97,4,8,...,5,16535.22,16535.22,0,0,0,0,0,0,2024-01-18
2,2024-01-18 04:03:03,2024-01-18 04:03:00.000,1,AAPL,4,65249.05,65249.05,0.00,4,0,...,1,32624.52,32624.52,0,0,0,0,0,0,2024-01-18
3,2024-01-18 04:04:03,2024-01-18 04:04:00.000,1,AAPL,22,94689.69,86081.54,8608.15,20,2,...,4,47344.84,47344.84,0,0,0,0,0,0,2024-01-18
4,2024-01-18 04:05:03,2024-01-18 04:05:00.000,1,AAPL,17,220041.51,155323.42,64718.09,12,5,...,3,110020.76,110020.76,0,0,0,0,0,0,2024-01-18
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,2024-01-18 05:36:03,2024-01-18 05:36:00.000,1,AAPL,47,563551.55,23980.92,539570.63,2,45,...,12,281775.77,281775.77,0,0,0,0,0,0,2024-01-18
96,2024-01-18 05:37:03,2024-01-18 05:37:00.000,1,AAPL,10,54697.98,19144.29,35553.69,3,6,...,3,27348.99,27348.99,0,0,0,0,0,0,2024-01-18
97,2024-01-18 05:38:03,2024-01-18 05:38:00.000,1,AAPL,8,10939.97,1367.50,9572.47,1,7,...,3,5469.98,5469.98,0,0,0,0,0,0,2024-01-18
98,2024-01-18 05:39:03,2024-01-18 05:39:00.000,1,AAPL,8,10211.68,4467.61,5744.07,2,3,...,3,5105.84,5105.84,0,0,0,0,0,0,2024-01-18
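
The retrieval calls above all pass some combination of `name`, `key`, `asof_date`, `back_to`, `up_to`, and `limit`. As a sketch of how those arguments might be assembled and sanity-checked before a request is sent, the hypothetical `build_query` helper below collects them into a dictionary. It is not part of the `unifier` package, it does not contact the API, and it assumes only the parameter names used in this notebook.

```python
from datetime import date


def build_query(name, key=None, asof_date=None, back_to=None, up_to=None, limit=None):
    """Assemble keyword arguments for a get_dataframe-style call.

    Hypothetical helper: it mirrors the parameters used in this notebook
    and performs local checks only.
    """
    params = {"name": name}
    if key is not None:
        params["key"] = key

    parsed = {}
    for label, value in (("asof_date", asof_date), ("back_to", back_to), ("up_to", up_to)):
        if value is not None:
            # ValueError is raised here if the string is not a valid ISO date
            parsed[label] = date.fromisoformat(value)
            params[label] = value

    if "back_to" in parsed and "up_to" in parsed and parsed["back_to"] > parsed["up_to"]:
        raise ValueError("back_to must not be later than up_to")

    if limit is not None:
        params["limit"] = int(limit)
    return params


params = build_query(
    "xtech_us_equity_flow_daily",
    key="AAPL",
    back_to="2024-01-01",
    up_to="2024-02-01",
    limit=100,
)
print(params)
```

In the notebook the dictionary could then be unpacked as `unifier.get_dataframe(**params)`. Note that rows in the sample outputs above are not always in chronological order, so sorting on `date` or `timestamp` after retrieval, as the later cells do, may still be needed.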


In [85]:
import pandas as pd

In [109]:
def plot_flow(
    flow_df,
    price_df,
    method="method5_inst",
    title="Flow Visualization",
    zscore_window=60,
    start_date=None,
    end_date=None,
    show_fig=True,
    use_deviation=False,
    short_window=10,
    long_window=60,
):
    """
    Plot institutional flow visualization.
    For minute flow, the index should be the timestamp.
    For daily flow, the index should be the date.
    """
    import pandas as pd
    import numpy as np
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    # Work on copies with a datetime index so the caller's frames are not modified
    flow_df = flow_df.copy()
    flow_df.index = pd.to_datetime(flow_df.index)
    
    price_df = price_df.copy()
    price_df.index = pd.to_datetime(price_df.index)

    # Filter by date
    if start_date:
        start_date = pd.to_datetime(start_date)
        flow_df = flow_df[flow_df.index >= start_date]
        price_df = price_df[price_df.index >= start_date]
    if end_date:
        end_date = pd.to_datetime(end_date)
        flow_df = flow_df[flow_df.index <= end_date]
        price_df = price_df[price_df.index <= end_date]

    # Align to shared datetime range
    min_timestamp = max(flow_df.index.min(), price_df.index.min())
    max_timestamp = min(flow_df.index.max(), price_df.index.max())
    flow_df = flow_df[(flow_df.index >= min_timestamp) & (flow_df.index <= max_timestamp)]
    price_df = price_df[(price_df.index >= min_timestamp) & (price_df.index <= max_timestamp)]

    # Net flow = buy minus sell dollar volume
    flow_df[f'{method}_netflow'] = flow_df[f'{method}_buy'] - flow_df[f'{method}_sell']

    # Optionally replace net flow with its short-minus-long moving-average deviation
    if use_deviation:
        short_ma = flow_df[f'{method}_netflow'].rolling(window=short_window).mean()
        long_ma = flow_df[f'{method}_netflow'].rolling(window=long_window).mean()
        flow_df[f'{method}_netflow'] = short_ma - long_ma
        # Drop only the warm-up rows where the deviation is undefined
        flow_df.dropna(subset=[f'{method}_netflow'], inplace=True)

    # Calculate z-score
    rolling_mean = flow_df[f'{method}_netflow'].rolling(zscore_window).mean()
    rolling_std = flow_df[f'{method}_netflow'].rolling(zscore_window).std().replace(0, np.nan)
    flow_df['zscore'] = (flow_df[f'{method}_netflow'] - rolling_mean) / rolling_std

    # Format datetime for category plotting
    flow_df['date_str'] = flow_df.index.astype(str)

    # Forward-fill price across flow timestamps so they share the same x-axis
    price_aligned = flow_df[['date_str']].copy()
    price_aligned['price'] = price_df['close'].reindex(flow_df.index, method='ffill')

    # Set up plot
    fig = make_subplots(
        rows=3, cols=1,
        shared_xaxes=True,
        row_heights=[0.5, 0.25, 0.25],
        vertical_spacing=0.08,
        specs=[[{"secondary_y": True}], [{}], [{}]]
    )

    # Row 1: Cumulative net flow
    name = "Cumulative NetFlow" if not use_deviation else "Cumulative NetFlow (detrended)"
    fig.add_trace(
        go.Scatter(
            x=flow_df['date_str'],
            y=flow_df[f'{method}_netflow'].cumsum(),
            name=name,
            line=dict(color='blue'),
            fill='tozeroy',
            fillcolor='rgba(0, 0, 255, 0.1)'
        ),
        row=1, col=1,
        secondary_y=False
    )

    # Price line (aligned to flow timestamps)
    fig.add_trace(
        go.Scatter(
            x=price_aligned['date_str'],
            y=price_aligned['price'],
            name="Price",
            line=dict(color='black'),
        ),
        row=1, col=1,
        secondary_y=True
    )

    # Row 2: Inst buy and sell
    fig.add_trace(
        go.Scatter(
            x=flow_df['date_str'],
            y=flow_df[f'{method}_buy'],
            name="Inst Buy",
            line=dict(color='green'),
            opacity=0.3
        ),
        row=2, col=1
    )
    fig.add_trace(
        go.Scatter(
            x=flow_df['date_str'],
            y=-flow_df[f'{method}_sell'],
            name="Inst Sell",
            line=dict(color='red'),
            opacity=0.3
        ),
        row=2, col=1
    )

    # Row 3: Z-score
    fig.add_trace(
        go.Scatter(
            x=flow_df['date_str'],
            y=flow_df['zscore'],
            name=f"Z-Score (window={zscore_window})",
            line=dict(color='#1E88E5')
        ),
        row=3, col=1
    )
    for level in [-2, 2]:
        fig.add_hline(y=level, line=dict(color="gray", dash="dash"), row=3, col=1)

    # Layout
    fig.update_layout(
        template="plotly_white",
        width=1200,
        height=900,
        images=[
            dict(
                source="Exponential-Title-Wide.png",
                xref="paper",
                yref="paper",
                x=0.5,
                y=0.5,
                sizex=0.8,
                sizey=0.5,
                xanchor="center",
                yanchor="middle",
                sizing="contain",
                opacity=0.12,
                layer="below",
            )
        ],
        title=dict(text=title, x=0.5, font=dict(size=20)),
        font=dict(size=14),
        legend=dict(orientation="h", y=1.02, x=1, xanchor="right", yanchor="bottom"),
        hovermode="x unified"
    )

    # X-axis: category mode to avoid gaps
    # Reduce number of tick labels by sampling every Nth point
    max_ticks = 10
    all_ticks = flow_df['date_str'].tolist()
    tick_step = max(1, len(all_ticks) // max_ticks)
    sparse_ticks = all_ticks[::tick_step]

    fig.update_xaxes(type='category', tickangle=45, tickvals=sparse_ticks, row=1, col=1)
    fig.update_xaxes(type='category', tickangle=45, tickvals=sparse_ticks, row=2, col=1)
    fig.update_xaxes(title="Datetime", type='category', tickangle=45, tickvals=sparse_ticks, row=3, col=1)


    # Y-axes
    fig.update_yaxes(title="Net Flow (cumsum)", row=1, col=1, secondary_y=False)
    fig.update_yaxes(title="Price", row=1, col=1, secondary_y=True)
    fig.update_yaxes(title="Buy / Sell", row=2, col=1)
    fig.update_yaxes(title="Z-Score", row=3, col=1)

    if show_fig:
        fig.show()

    return fig
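
The third panel of `plot_flow` standardizes net flow against a trailing window. To make that step easier to inspect in isolation, here is a plain-Python sketch of the same idea: a full-window rolling mean and sample standard deviation, with undefined positions reported as NaN. The function name `rolling_zscore` is hypothetical, and the sketch is intended to mirror the pandas logic above (default `rolling` behaviour and zero spread replaced by NaN) rather than replace it.

```python
import math
from statistics import mean, stdev


def rolling_zscore(values, window):
    """Trailing z-score: (x - rolling mean) / rolling sample std.

    Positions without a full window, or with zero spread, yield NaN.
    """
    if window < 2:
        raise ValueError("window must be at least 2 to compute a sample std")
    out = []
    for i, x in enumerate(values):
        if i + 1 < window:
            out.append(math.nan)  # not enough history yet
            continue
        chunk = values[i + 1 - window : i + 1]  # window ending at position i
        sd = stdev(chunk)
        out.append((x - mean(chunk)) / sd if sd > 0 else math.nan)
    return out


# Made-up net-flow values for illustration only.
z = rolling_zscore([1.0, 2.0, 3.0, 4.0, 10.0], window=3)
print([round(v, 3) for v in z])
```

Because each window includes the current observation, a single extreme bar inflates its own standard deviation, so the z-score of an outlier is smaller than it would be against the preceding window alone.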


### 1-minute flow data

In [94]:
# pip install yfinance

In [110]:
import pandas as pd
import yfinance as yf
from datetime import datetime, timedelta

# Set fixed 1-week window starting 15 days ago
start_date = datetime.today() - timedelta(days=15)
end_date = start_date + timedelta(days=7)

start_str = start_date.strftime('%Y-%m-%d')
end_str = end_date.strftime('%Y-%m-%d')

# === Load 1-minute equity flow data from Unifier ===
flow_df = unifier.get_dataframe(
    name='xtech_us_equity_flow_1min',
    key='AAPL',
    back_to=start_str,
    up_to=end_str
)
flow_df.set_index('timestamp', inplace=True)
flow_df.sort_index(inplace=True)
flow_df.index = pd.to_datetime(flow_df.index)

# Coarse pre-filter: keep 9:30 AM through 4:59 PM (narrowed to regular hours further below)
flow_df = flow_df[(flow_df.index.hour >= 9) & (flow_df.index.hour <= 16)]
flow_df = flow_df[~((flow_df.index.hour == 9) & (flow_df.index.minute < 30))]

# === Load 1-minute price data from yfinance ===
price_df = yf.download(
    'AAPL',
    interval='1m',
    start=start_str,
    end=end_str,
    progress=False
)
price_df = price_df[['Close']].rename(columns={'Close': 'close'})
price_df.index = pd.to_datetime(price_df.index)
price_df.sort_index(inplace=True)

# Put both indexes in the same timezone before comparing them.
# The intraday price index is expected to be tz-aware, so convert it to US/Eastern
price_df.index = price_df.index.tz_convert('US/Eastern')

# The flow timestamps are probably tz-naive New York times, so localize them to US/Eastern
flow_df.index = flow_df.index.tz_localize('US/Eastern')

# Filter both frames to regular market hours: 9:30 AM up to, but not including, 4:00 PM
price_df = price_df[((price_df.index.hour > 9) & (price_df.index.hour < 16)) |
                    ((price_df.index.hour == 9) & (price_df.index.minute >= 30))]

flow_df = flow_df[((flow_df.index.hour > 9) & (flow_df.index.hour < 16)) |
                  ((flow_df.index.hour == 9) & (flow_df.index.minute >= 30))]

# === Align to shared datetime range ===
min_timestamp = max(flow_df.index.min(), price_df.index.min())
max_timestamp = min(flow_df.index.max(), price_df.index.max())

flow_df = flow_df[(flow_df.index >= min_timestamp) & (flow_df.index <= max_timestamp)]
price_df = price_df[(price_df.index >= min_timestamp) & (price_df.index <= max_timestamp)]



YF.download() has changed argument auto_adjust default to True



In [111]:
fig = plot_flow(flow_df,price_df,method="method5_inst")
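
The market-hours filter in the cell above is written with hour and minute comparisons. An equivalent way to express the same window is a time-of-day range check, sketched below in plain Python. The helper name `in_regular_session` is hypothetical, and the sketch assumes the timestamps are already expressed in New York local time.

```python
from datetime import datetime, time

SESSION_OPEN = time(9, 30)
SESSION_CLOSE = time(16, 0)


def in_regular_session(ts):
    """True when 09:30:00 <= time of day < 16:00:00.

    Assumes `ts` is already expressed in New York local time.
    """
    return SESSION_OPEN <= ts.time() < SESSION_CLOSE


stamps = [
    datetime(2024, 1, 18, 9, 29),
    datetime(2024, 1, 18, 9, 30),
    datetime(2024, 1, 18, 15, 59),
    datetime(2024, 1, 18, 16, 0),
]
print([in_regular_session(t) for t in stamps])  # [False, True, True, False]
```

The field table describes regular-session bars as running from 09:31:00 to 16:01:00, so filtering on the `extended_hours` flag may select a slightly different set of bars than this clock-based window.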

### Daily flow data

In [97]:
# pip install yfinance

In [112]:
# load equity flow data
flow_df = unifier.get_dataframe(name='xtech_us_equity_flow_daily',key='AAPL', back_to='2022-04-05', up_to='2025-07-11')
flow_df.set_index('date', inplace=True)
flow_df.sort_index(inplace=True)
flow_df.index=pd.to_datetime(flow_df.index)

# load daily price data from yfinance
import yfinance as yf
price_df = yf.download('AAPL', start='2022-04-05', end='2025-07-12')
# keep the close only and flatten the columns to a single 'close' name
price_df = price_df[['Close']]
price_df.columns = ['close']
price_df.index = pd.to_datetime(price_df.index)
price_df.sort_index(inplace=True)

# align price and flow
common_dates = sorted(list(set(flow_df.index).intersection(set(price_df.index))))
flow_df = flow_df.loc[common_dates]
price_df = price_df.loc[common_dates]





YF.download() has changed argument auto_adjust default to True




In [113]:
fig = plot_flow(flow_df,price_df,method="method5_inst",zscore_window=20)
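
The daily example aligns flow and price by keeping only the dates present in both sources. The sketch below shows that intersection step on plain dictionaries and also reports which dates were dropped, which can help spot holidays or missing data. The values are made up, and `align_on_common_dates` is a hypothetical helper name used only for this illustration.

```python
def align_on_common_dates(flow, price):
    """Restrict two date-keyed mappings to the dates present in both.

    Returns (flow_aligned, price_aligned, dropped), where `dropped` lists
    the dates that appear in only one of the inputs.
    """
    common = sorted(flow.keys() & price.keys())
    dropped = sorted(flow.keys() ^ price.keys())
    flow_aligned = {d: flow[d] for d in common}
    price_aligned = {d: price[d] for d in common}
    return flow_aligned, price_aligned, dropped


# Made-up values for illustration only.
flow = {"2024-01-02": 1.5, "2024-01-03": -0.5, "2024-01-04": 2.0}
price = {"2024-01-03": 184.0, "2024-01-04": 181.5, "2024-01-05": 180.9}

flow_aligned, price_aligned, dropped = align_on_common_dates(flow, price)
print(list(flow_aligned))  # ['2024-01-03', '2024-01-04']
print(dropped)             # ['2024-01-02', '2024-01-05']
```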