# Pendle YT Timing & Limit Order Arrival Time Prediction Tool **Version 6**

[Author's Twitter](https://twitter.com/quant_sheep?t=KqHtg0lNFy-sejP_dFOUXg&s=09)

No coding skills? No problem! I've open-sourced and simplified the analysis: **just fill out the form and run the cells in Colab**. Now you can discover the best strategy for earning points with Pendle's YT on your own!

This tool includes features such as:

- Simple timing strategy for YT & PT
- Limit order assistance tool: Prediction of counterpart order arrival time

## Risk Warning

1. **abcETH de-pegging risk and underlying asset price risk:** The red line represents the amount of points obtained by investing 1 abcETH (not the underlying asset) to purchase YT at a certain point in time and holding it until maturity. The points obtained depend on the pool corresponding to abcETH (such as ezETH), but abcETH may de-peg from ETH. The underlying asset (such as ETH) itself also carries price risk.

2. **Strategy Risk:**

   - **YT & PT Timing Strategy:** The fair implied APY of the strategy is determined based on historical data, so the fair price line should not be considered an accurate prediction of future prices. The YT Fair price line is a rough estimate of the average YT price based on historical data.
   - **Counterpart Order Arrival Time Prediction:** Please note that the counterpart order arrival time prediction function relies on the analysis of current market sentiment. However, due to market volatility and unpredictability, the prediction results are for reference only and should not be the sole basis for your investment decisions. The actual time for order fulfillment may be affected by sudden events, changes in liquidity, or significant market fluctuations, leading to discrepancies between the expected and actual execution time. To avoid potential losses caused by significant market fluctuations, it is recommended to monitor real-time market data and adjust order parameters to respond to sudden changes when using this feature.

3. **Analysis Tool Risk:** The data obtained by analysis tools may not be timely enough. These tools are provided for learning and reference purposes only, and their stability in a production environment is not guaranteed.

When using this tool, please ensure that you understand the associated risks and are prepared to bear any potential losses. Investment involves risks, and decisions should be made cautiously.

---


# 1. YT & PT Timing Strategy
For the tutorial and the full version of the tool, see the V5 tool: [Colab Link](https://colab.research.google.com/drive/1xr_18PesSBV5DpRPVKibO5Ta6o4lpo-y#scrollTo=W6r7KNoMGpzR)


YOU CAN FIND pendle_multiplier HERE


In [1]:
# @title Required form. After filling it out, run this cell so the data is fetched.
# Standard library
import io
import json
import logging
import random
import time
from datetime import datetime, timezone, timedelta

# Third-party
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import requests
from dateutil import parser
from plotly.subplots import make_subplots
from requests.adapters import HTTPAdapter
from scipy.interpolate import UnivariateSpline
from tabulate import tabulate
from urllib3.util.retry import Retry

# Network configuration
network = 'ethereum'  # @param {type:"string"}
network_ids = {
    'arbitrum': '/42161',
    'ethereum': '/1',
    'mantle': '/5000'
}

# Initialize session with retry mechanism
session = requests.session()
retry = Retry(total=3, backoff_factor=1)
session.mount('http://', HTTPAdapter(max_retries=retry))
session.mount('https://', HTTPAdapter(max_retries=retry))

# Retrieve network ID
network_id = network_ids.get(network.lower())
if network_id is None:
    raise ValueError("Unsupported network type")

# Construct the request URL and headers
url = f'https://api-v2.pendle.finance/core/v1{network_id}/assets/all'
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/113.0.0.0 Safari/537.36"
}

# Send request, fail early on HTTP errors, and parse response
response = session.get(url, headers=headers)
response.raise_for_status()
data = response.json()

# Function to find valid assets based on given parameters
def find_valid_assets(data, base_type, expiry_key, address):
    """
    Find valid assets matching the specified criteria.

    :param data: The asset data returned by the API
    :param base_type: The type of asset to filter by (e.g., 'YT')
    :param expiry_key: The key indicating the expiry date of the asset
    :param address: The contract address to filter by
    :return: A list of valid assets that meet the criteria
    """
    # Timezone-aware "now"; note that assets are not filtered by expiry below
    current_time = datetime.now(timezone.utc)

    def parse_to_utc(date_str):
        dt = parser.parse(date_str)
        return dt.astimezone(timezone.utc)

    def format_expiry(date_str):
        dt = parse_to_utc(date_str)
        return dt.strftime('%Y-%m-%d %H:%M:%S')

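    # Compare addresses case-insensitively so checksummed (mixed-case) input also matches
    valid_assets = [
        {**item, expiry_key: format_expiry(item[expiry_key])} for item in data
        if item.get('baseType') == base_type and
           str(item.get('address', '')).lower() == address.lower() and
           expiry_key in item
    ]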

    return valid_assets

# Form parameter configuration
market_contract = "0x36d3ca43ae7939645c306e26603ce16e39a89192"  # @param {type:"string"}
yt_contract = '0xeb993b610b68f2631f70ca1cf4fe651db81f368e'  # @param {type:"string"}
start_time = "2023-01-01 00:00:00"
underlying_amount = 1  # @param {type:"number"}
points_per_hour_per_underlying = 0.04  # @param {type:"number"}
pendle_multiplier = 5  # @param {type:"number"}
dark_mode = True  # @param {type:"boolean"}

pendle_yt_multiplier = pendle_multiplier

# Handle time and chart mode
datetime_obj = datetime.strptime(start_time, '%Y-%m-%d %H:%M:%S')
start_time = datetime_obj.replace(tzinfo=timezone.utc).isoformat(timespec='milliseconds').replace('+00:00', 'Z')
points = points_per_hour_per_underlying

mode = 'plotly_dark' if dark_mode else 'plotly_white'

# Find valid YT assets
valid_assets = find_valid_assets(data, 'YT', 'expiry', yt_contract)
if not valid_assets:
    raise ValueError("No valid assets found with the given parameters")

symbol = valid_assets[0]['symbol']
maturity = valid_assets[0]['expiry']

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class DataAcquisition:
    def __init__(self, market_contract, yt_contract, start_time_str, network='ethereum'):
        """
        Initialize the DataAcquisition class with the required parameters.

        :param market_contract: The market contract address.
        :param yt_contract: The YT contract address.
        :param start_time_str: The start time in ISO format (e.g., '2023-01-01T00:00:00.000Z').
        :param network: The network name ('ethereum', 'arbitrum', or 'mantle').
        """
        self.session = self._init_session()
        self.market_contract = market_contract.lower()
        self.yt_contract = yt_contract.lower()
        self.start_time_str = start_time_str
        self.end_time_str = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.000Z')
        self.network_id = self._get_network_id(network)
        self.headers = {
            "User-Agent": f"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                          f"(KHTML, like Gecko) Chrome/{random.randint(80, 100)}.0.{random.randint(1000, 2000)}.0 Safari/537.36"
        }
        # Construct URLs
        self.url_apy = f'https://api-v2.pendle.finance/core/v1/{self.network_id}/markets/{self.market_contract}/apy-history-1ma'
        self.url_ohlcv = f'https://api-v2.pendle.finance/core/v3/{self.network_id}/prices/{self.yt_contract}/ohlcv'
        self.url_transactions = f'https://api-v2.pendle.finance/core/v3/{self.network_id}/transactions'

    @staticmethod
    def _init_session():
        """Initialize the session with retry capability."""
        session = requests.Session()
        # Retry GET requests on rate limiting (429) and transient server errors
        retry_strategy = Retry(
            total=3,
            backoff_factor=1,
            status_forcelist=[429, 500, 502, 503, 504],
            allowed_methods=["GET"]
        )
        adapter = HTTPAdapter(max_retries=retry_strategy)
        session.mount('https://', adapter)
        session.mount('http://', adapter)
        return session

    @staticmethod
    def _get_network_id(network):
        """Retrieve the network ID based on the network name."""
        network_ids = {
            'arbitrum': '42161',
            'ethereum': '1',
            'mantle': '5000'
        }
        network_id = network_ids.get(network.lower())
        if network_id is None:
            raise ValueError(f"Unsupported network type: {network}")
        return network_id

    def fetch_transactions(self, limit=1000, max_attempts=1, retry_delay=5):
        """
        Fetch as many transactions as possible by handling pagination and rate limits.

        :param limit: Number of transactions to fetch per request (max 1000).
        :param max_attempts: Max number of retry attempts in case of 429 or other errors.
        :param retry_delay: Initial delay between retries when rate limit is hit (in seconds).
        :return: DataFrame containing the transactions data.
        """
        all_transactions = []
        skip = 0
        attempts = 0

        while True:
            params = {
                'market': self.market_contract,
                'action': 'SWAP_PT,SWAP_PY,SWAP_YT',
                'origin': 'PENDLE_MARKET,YT',
                'skip': str(skip),
                'limit': str(limit),
                'minValue': '0'
            }

            try:
                # Log the request for debugging
                logger.info(f"Requesting URL: {self.url_transactions} with params: {params}")

                response = self.session.get(self.url_transactions, headers=self.headers, params=params)
                response.raise_for_status()  # Will raise HTTPError for bad requests

                data = response.json()
                transactions = data.get('results', [])
                if not transactions:
                    logger.info("No more transactions found.")
                    break

                df_transactions = pd.DataFrame(transactions)
                df_transactions['timestamp'] = pd.to_datetime(df_transactions['timestamp'], utc=True)
                all_transactions.append(df_transactions)

                # Update skip for pagination
                skip += limit
                attempts = 0  # Reset attempts after a successful request

                # Introduce a random delay to avoid rate limiting
                random_delay = random.uniform(0.15, 0.55)  # Random delay between 0.15 and 0.55 seconds
                logger.info(f"Sleeping for {random_delay:.2f} seconds to avoid rate limiting.")
                time.sleep(random_delay)

            except requests.HTTPError as e:
                # 429 Too Many Requests is the rate-limit status code
                if e.response is not None and e.response.status_code == 429:
                    # Rate limit exceeded, implement exponential backoff
                    attempts += 1
                    if attempts > max_attempts:
                        logger.error("Max retry attempts reached. Could not fetch all transactions.")
                        break
                    wait_time = retry_delay * (2 ** (attempts - 1))  # Exponential backoff
                    random_extra_delay = random.uniform(1, 5)  # Adding a random extra delay
                    total_wait_time = wait_time + random_extra_delay
                    logger.warning(f"Rate limit exceeded. Retrying in {total_wait_time:.3f} seconds...")
                    time.sleep(total_wait_time)
                else:
                    logger.error(f"Failed to fetch transactions: {e}")
                    break
            except Exception as e:
                # Handle other unexpected errors with a random delay before retrying
                attempts += 1
                if attempts > max_attempts:
                    logger.error(f"Max retry attempts reached due to unexpected error: {e}")
                    break
                random_failure_delay = random.uniform(5, 10)  # Random delay between 5 and 10 seconds
                logger.warning(f"Unexpected error: {e}. Retrying in {random_failure_delay:.3f} seconds...")
                time.sleep(random_failure_delay)

        if all_transactions:
            return pd.concat(all_transactions, ignore_index=True)
        else:
            return pd.DataFrame()

    def fetch_ohlcv(self):
        """
        Fetch OHLCV data for the YT contract.

        :return: DataFrame containing the OHLCV data.
        """
        params = {
            "time_frame": "hour",
            "timestamp_start": self.start_time_str,
            "timestamp_end": self.end_time_str
        }
        try:
            response = self.session.get(self.url_ohlcv, headers=self.headers, params=params)
            response.raise_for_status()
            results = response.json().get('results', [])
            data = []
            for item in results:
                # Use `ts` rather than `time` to avoid shadowing the imported time module
                ts = datetime.fromisoformat(item['time'].rstrip('Z'))
                open_price = float(item['open'])
                high_price = float(item['high'])
                low_price = float(item['low'])
                close_price = float(item['close'])
                volume = float(item.get('volume', 0))
                data.append([ts, open_price, high_price, low_price, close_price, volume])
            df_ohlcv = pd.DataFrame(data, columns=['Time', 'Open', 'High', 'Low', 'Close', 'Volume'])

            # Convert 'Time' to tz-aware UTC timestamps (also safe when there are no rows)
            df_ohlcv['Time'] = pd.to_datetime(df_ohlcv['Time'], utc=True)

            return df_ohlcv
        except requests.RequestException as e:
            logger.error(f"Failed to fetch OHLCV data: {e}")
            return pd.DataFrame()

    def fetch_apy(self):
        """
        Fetch APY data for the market contract.

        :return: DataFrame containing the APY data.
        """
        params = {
            "time_frame": "hour",
            "timestamp_start": self.start_time_str,
            "timestamp_end": self.end_time_str
        }
        try:
            response = self.session.get(self.url_apy, headers=self.headers, params=params)
            response.raise_for_status()
            data = response.json()
            csv_data = data.get('results', '')
            if csv_data:
                df = pd.read_csv(io.StringIO(csv_data))
                df['Time'] = pd.to_datetime(df['timestamp'], unit='s', utc=True)
                df.drop(columns=['timestamp'], inplace=True)
                return df
            else:
                logger.warning("No APY data found in the API response.")
                return pd.DataFrame()
        except requests.RequestException as e:
            logger.error(f"Failed to fetch APY data: {e}")
            return pd.DataFrame()

    def run(self):
        """
        Execute the data retrieval process and combine the results.

        :return: Tuple of DataFrames (df_combined, df_transactions).
        """
        df_apy = self.fetch_apy()
        df_ohlcv = self.fetch_ohlcv()
        if df_apy.empty:
            logger.warning("APY data is empty.")
        if df_ohlcv.empty:
            logger.warning("OHLCV data is empty.")

        # Without both data sets there is nothing to merge; still return a
        # 2-tuple so the caller can always unpack the result
        if df_apy.empty or df_ohlcv.empty:
            return pd.DataFrame(), pd.DataFrame()

        # Merge APY and OHLCV data on the timestamp
        df_combined = pd.merge_asof(
            df_apy.sort_values('Time'),
            df_ohlcv.sort_values('Time'),
            on='Time'
        )
        # Save df_combined for use in fetch_transactions
        self.df_combined = df_combined
        # Fetch transactions with pagination and rate limit handling
        df_transactions = self.fetch_transactions()
        return df_combined, df_transactions



# Initialize the data acquisition instance and run the data retrieval process
data_acquisition = DataAcquisition(market_contract, yt_contract, start_time, network)
df_combined, df_transactions = data_acquisition.run()

df_tran_cleaned = df_transactions.copy()

# 'market' already holds dicts parsed from the API response, so flatten them directly
# instead of round-tripping through str() and json.loads(), which breaks on None/True/apostrophes
market_df = pd.json_normalize(df_tran_cleaned['market'].tolist())
market_df.columns = [f"market_{col}" for col in market_df.columns]
df_tran_cleaned = df_tran_cleaned.drop('market', axis=1).join(market_df)

def expand_rows(df, col_name, new_cols):
    expanded_rows = []
    for _, row in df.iterrows():
        items_list = row[col_name]
        if isinstance(items_list, list):
            for item in items_list:
                expanded_row = row.to_dict()
                expanded_row[new_cols[0]] = item.get('asset', {}).get('address')
                expanded_row[new_cols[1]] = item.get('asset', {}).get('baseType')
                expanded_rows.append(expanded_row)
        else:
            expanded_rows.append(row.to_dict())
    return pd.DataFrame(expanded_rows)

df_tran_cleaned = df_tran_cleaned[df_tran_cleaned['inputs'].notnull()]
df_tran_cleaned = expand_rows(df_tran_cleaned, 'inputs', ['input_address', 'input_baseType'])
df_tran_cleaned = df_tran_cleaned.drop('inputs', axis=1)

df_tran_cleaned = df_tran_cleaned[df_tran_cleaned['outputs'].notnull()]
df_tran_cleaned = expand_rows(df_tran_cleaned, 'outputs', ['output_address', 'output_baseType'])
df_tran_cleaned = df_tran_cleaned.drop('outputs', axis=1)

# 'valuation' already holds dicts parsed from the API response, so flatten them directly
valuation_df = pd.json_normalize(df_tran_cleaned['valuation'].tolist())
valuation_df.columns = [f"valuation_{col}" for col in valuation_df.columns]
df_tran_cleaned = df_tran_cleaned.drop('valuation', axis=1).join(valuation_df)

df_tran_cleaned['timestamp'] = pd.to_datetime(df_tran_cleaned['timestamp'])

df_tran_cleaned = df_tran_cleaned.astype({col: 'object' for col in df_tran_cleaned.columns})
df_tran_cleaned = df_tran_cleaned.drop_duplicates()

df_combined['timestamp'] = df_combined['Time']

df_combined['timestamp'] = pd.to_datetime(df_combined['timestamp'], utc=True)
df_tran_cleaned['timestamp'] = pd.to_datetime(df_tran_cleaned['timestamp'], utc=True)
df_combined = df_combined.sort_values('timestamp')
df_tran_cleaned = df_tran_cleaned.sort_values('timestamp')



df_merged = pd.merge_asof(
    df_tran_cleaned,
    df_combined[['timestamp', 'underlyingApy']],
    on='timestamp',
    direction='backward'
)

# Convert maturity time to datetime object
maturity_time = pd.to_datetime(maturity, format='%Y-%m-%d %H:%M:%S', utc=True)
df_merged['timestamp'] = pd.to_datetime(df_merged['timestamp'], utc=True)
# Calculate hours to maturity for each timestamp in the DataFrame
df_merged['hours_to_maturity'] = (maturity_time - df_merged['timestamp']).dt.total_seconds() / 3600
df_merged['Time'] = pd.to_datetime(df_merged['timestamp'], utc=True)

# Calculate yt/underling and long_yield_apy based on APYs and time to maturity
df_merged['yt/underling'] = (df_merged['impliedApy'] + 1) ** (df_merged['hours_to_maturity'] / 8760) - 1
df_merged['long_yield_apy'] = (1 + (df_merged['underlyingApy'] - df_merged['impliedApy']) / df_merged['impliedApy']) ** (8760 / df_merged['hours_to_maturity']) - 1

# Calculate price and weighted points
price = df_merged['yt/underling']
time_diff_hours = (maturity_time - df_merged['Time']).dt.total_seconds() / 3600
df_merged['points'] = 1 / price * time_diff_hours * points * underlying_amount * pendle_yt_multiplier

# Generate a date range in hourly intervals from the first timestamp to maturity
h_range = pd.date_range(start=df_merged['Time'].iloc[0], end=maturity_time, freq='h')  # lowercase 'h': uppercase 'H' is deprecated in recent pandas

# Calculate the average implied APY weighted by volume
implied_apy_average = (df_merged['impliedApy'] * df_merged['valuation_usd'] / df_merged['valuation_usd'].sum()).sum()

# Calculate the fair value curve based on the average implied APY
fair_value_curve = 1 - 1 / (1 + implied_apy_average) ** (((maturity_time - h_range).total_seconds() / 3600) / 8760)

# Calculate weighted points for each row
df_merged['weighted_points'] = df_merged['points'] * df_merged['valuation_usd'] / df_merged['valuation_usd'].sum()

# Sum weighted points per underlying asset
weighted_points_per_underlying = df_merged['weighted_points'].sum()

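# Add the fair value to df_combined, evaluated at df_combined's own timestamps
# so that the lengths and times line up (h_range starts at the first transaction instead)
hours_left_combined = (maturity_time - df_combined['Time']).dt.total_seconds() / 3600
df_combined['fair'] = 1 - 1 / (1 + implied_apy_average) ** (hours_left_combined / 8760)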

# @title YT Price/Points Earned/Fair Value Curve of YT
def plot_yt_price_points_curve(df, h_range, fair_value_curve, symbol, network, mode, underlying_amount, yt_purchase_time=None,
                               add_difference_curve=False):
    """Plot YT price, points earned, and the fair value curve of YT.

    Note: `yt_purchase_time` and `add_difference_curve` are accepted but not used in this version.
    """

    fig = go.Figure()

    # Add YT Price and Points Earned curves
    fig.add_trace(go.Scatter(x=df['Time'], y=df['yt/underling'], mode='lines', name='YT Price', yaxis='y'))
    fig.add_trace(go.Scatter(x=df['Time'], y=df['points'], mode='lines', name='Points Earned', yaxis='y2'))

    # Add Fair Value Curve
    fig.add_trace(go.Scatter(
        x=h_range,
        y=fair_value_curve,
        mode='lines',
        name='Fair Value Curve of YT',
        line=dict(color='yellow', dash='dot', width=3),
        yaxis='y'
    ))

    # Layout settings
    yaxis_config = dict(title='YT Price', side='left')
    yaxis2_config = dict(title='Points Earned', overlaying='y', side='right')

    layout_config = {
        'title': f'{symbol} on {network} [{underlying_amount} underlying coin]<br />|BUY YT WHEN THE yt Price IS UNDER THE FAIR VALUE CURVE TO MAXIMIZE POINTS EARNED|<br />|在yt价格低于公平价值曲线时购买yt以最大化获得的积分|<br />|yt価格が公正価値曲線よりも低い場合にytを購入してポイントを最大化する|',
        'xaxis_title': 'Certain Time of Purchasing YT',
        'yaxis': yaxis_config,
        'yaxis2': yaxis2_config,
        'template': mode
    }


    fig.update_layout(**layout_config)


    # Display the plot
    fig.show()

    # Print the weighted Implied APY used to calculate the fair value curve

    print(f'The weighted Implied APY used to calculate the fair value curve is: {implied_apy_average:.2%}')
    print(f'用于计算公允价值曲线的加权隐含年收益率是：{implied_apy_average:.2%}')
    print(f'公正な価値曲線を計算するために使用された加重インプライドAPYは：{implied_apy_average:.2%}')


plot_yt_price_points_curve(df_merged, h_range, fair_value_curve, symbol, network, mode, underlying_amount)

INFO:__main__:Requesting URL: https://api-v2.pendle.finance/core/v3/1/transactions with params: {'market': '0x36d3ca43ae7939645c306e26603ce16e39a89192', 'action': 'SWAP_PT,SWAP_PY,SWAP_YT', 'origin': 'PENDLE_MARKET,YT', 'skip': '0', 'limit': '1000', 'minValue': '0'}
INFO:__main__:Sleeping for 0.17 seconds to avoid rate limiting.
INFO:__main__:Requesting URL: https://api-v2.pendle.finance/core/v3/1/transactions with params: {'market': '0x36d3ca43ae7939645c306e26603ce16e39a89192', 'action': 'SWAP_PT,SWAP_PY,SWAP_YT', 'origin': 'PENDLE_MARKET,YT', 'skip': '1000', 'limit': '1000', 'minValue': '0'}
INFO:__main__:Sleeping for 0.53 seconds to avoid rate limiting.
INFO:__main__:Requesting URL: https://api-v2.pendle.finance/core/v3/1/transactions with params: {'market': '0x36d3ca43ae7939645c306e26603ce16e39a89192', 'action': 'SWAP_PT,SWAP_PY,SWAP_YT', 'origin': 'PENDLE_MARKET,YT', 'skip': '2000', 'limit': '1000', 'minValue': '0'}
INFO:__main__:Sleeping for 0.38 seconds to avoid rate limiting.

The weighted Implied APY used to calculate the fair value curve is: 6.53%
用于计算公允价值曲线的加权隐含年收益率是：6.53%
公正な価値曲線を計算するために使用された加重インプライドAPYは：6.53%
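
The two quantities plotted by the cell above, the fair value curve and the points earned per purchase, can be reproduced in isolation. The sketch below restates the formulas from that cell as standalone functions. The function names `yt_fair_value` and `points_from_yt` are hypothetical (they do not appear in the notebook), and the numbers in the usage lines are illustrative inputs, not market data.

```python
HOURS_PER_YEAR = 8760  # same annualization constant the notebook uses


def yt_fair_value(implied_apy: float, hours_to_maturity: float) -> float:
    """Fair YT price per unit of underlying: 1 - 1 / (1 + APY) ** years."""
    years = hours_to_maturity / HOURS_PER_YEAR
    return 1 - 1 / (1 + implied_apy) ** years


def points_from_yt(yt_price: float, hours_to_maturity: float,
                   points_per_hour: float, underlying_amount: float = 1.0,
                   multiplier: float = 1.0) -> float:
    """Points earned by spending `underlying_amount` on YT and holding to maturity."""
    yt_units = underlying_amount / yt_price  # cheaper YT -> more YT per underlying
    return yt_units * hours_to_maturity * points_per_hour * multiplier


# Illustrative inputs only: 6.53% implied APY, 2000 hours left to maturity
fair = yt_fair_value(0.0653, 2000)
print(f"fair YT price: {fair:.5f}")
print(f"points per underlying: {points_from_yt(fair, 2000, 0.04, 1, 5):.1f}")
```

Because points scale with `1 / yt_price`, buying below the fair value curve yields more points for the same amount of underlying, which is the rule of thumb stated in the chart title.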


---

# 2. Prediction of Buy/Sell Order Arrival Time
## Feature Overview
This feature is designed to optimize the efficiency of placing limit orders. You can input the amount of YT you wish to buy or sell, and this feature will predict how long it will take for the order to be fully executed based on the **current market sentiment**. If you're selling, the feature predicts when a corresponding buy order might arrive, with the actual time likely being shorter than the prediction since orders can be partially filled.

For example, if you need to sell 1 YT, you can use this feature to set an appropriate waiting time, avoiding the need to repeatedly place orders due to too short a limit time, or missing market opportunities due to too long a waiting period. If the market sentiment is bullish and APY skyrockets, your YT might sell out quickly; if the sentiment is bearish, you might consider setting a lower APY to facilitate the sale.
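
To make the idea of an arrival-time estimate concrete, here is a deliberately simplified sketch. It assumes counterpart orders arrive as a Poisson process (exponentially distributed gaps), which is an assumption made for illustration and not necessarily the model this notebook fits. The function name `estimate_wait_seconds` and the sample timestamps are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone


def estimate_wait_seconds(arrival_times, quantile=0.5):
    """Waiting time (seconds) within which the next order arrives with probability `quantile`.

    Assumes exponential inter-arrival times, a simplification for illustration.
    """
    if len(arrival_times) < 2:
        raise ValueError("Need at least two arrivals to estimate a rate.")
    if not 0 < quantile < 1:
        raise ValueError("quantile must be strictly between 0 and 1.")
    ordered = sorted(arrival_times)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    mean_gap = sum(gaps) / len(gaps)
    if mean_gap <= 0:
        raise ValueError("Arrival times must not all be identical.")
    # Quantile of an exponential distribution with mean `mean_gap`
    return -math.log(1 - quantile) * mean_gap


# Hypothetical history: one sufficiently large counterpart order every 10 minutes
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
history = [t0 + timedelta(minutes=10 * i) for i in range(6)]
print(f"median wait: {estimate_wait_seconds(history, 0.5):.0f} s")
print(f"90% wait:    {estimate_wait_seconds(history, 0.9):.0f} s")
```

A higher quantile gives a more conservative limit-order duration: the order is more likely to see a counterpart before it expires, at the cost of waiting longer.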

**Please make sure to read the risk warning and tips section at the top of this tool for important information.**

## How to Use
- If you want to buy YT, check BUY_YT.
- If you want to sell YT, check SELL_YT.
- If you want to buy PT, check BUY_PT.
- If you want to sell PT, check SELL_PT.

**Note: Only one option can be checked.**

Then, enter the amount of YT or PT you wish to buy or sell.

---

In [2]:
# @title Prediction of Buy/Sell Order Arrival Time

import pandas as pd
import numpy as np
from scipy.stats import expon, gamma, weibull_min, pareto, burr, lognorm, beta
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score, calinski_harabasz_score, davies_bouldin_score
from sklearn.model_selection import TimeSeriesSplit
from tabulate import tabulate
import warnings
import logging
from joblib import Parallel, delayed
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px  # For Plotly visualizations
import plotly.graph_objects as go

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# 2. Set Parameters
BUY_YT = False  # @param {"type":"boolean"}
SELL_YT = True   # @param {"type":"boolean"}
BUY_PT = False   # @param {"type":"boolean"}
SELL_PT = False  # @param {"type":"boolean"}
YT_OR_PT_AMOUNT = 0.01  # @param {"type":"number"}

# 3. Suppress Warnings
warnings.filterwarnings('ignore')

def validate_selection(buy_yt=BUY_YT, sell_yt=SELL_YT, buy_pt=BUY_PT, sell_pt=SELL_PT):
    """
    Validate that exactly one option is selected.
    """
    selected_options = [buy_yt, sell_yt, buy_pt, sell_pt]
    if sum(selected_options) != 1:
        raise ValueError("Please select exactly one option.")

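    if buy_yt or sell_pt:
        return 0  # match historical swaps whose input asset is not PT (buy_sell == 0)
    elif buy_pt or sell_yt:
        return 1  # match historical swaps whose input asset is PT (buy_sell == 1)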
    else:
        raise ValueError("Invalid selection configuration.")

def preprocess_data(df, buy):
    """
    Preprocess the transaction data.
    """
    required_columns = ['timestamp', 'input_baseType', 'valuation_acc']
    for col in required_columns:
        if col not in df.columns:
            raise KeyError(f"Required column '{col}' not found in DataFrame.")

    # Work on a sorted copy so the caller's DataFrame is not mutated
    df = df.copy()
    df['timestamp'] = pd.to_datetime(df['timestamp'], utc=True)
    df = df.sort_values('timestamp')

    # Define 'buy_sell' column: PT = 1, others = 0
    df['buy_sell'] = (df['input_baseType'] == 'PT').astype(int)

    # Keep only orders on the requested side that are at least as large as the user's order
    AMOUNT = YT_OR_PT_AMOUNT
    df_filtered = df[
        (df['buy_sell'] == buy) &
        (df['valuation_acc'] >= AMOUNT)
    ].copy()

    return df_filtered

def extract_features(df):
    """
    Extract statistical features from the transaction data.
    """
    # Define time window (4-hour intervals); lowercase 'h' avoids the deprecated 'H' alias
    df['time_window'] = df['timestamp'].dt.floor('4h')

    # Group by time window
    grouped = df.groupby('time_window')

    # Initialize feature DataFrame
    feature_df = pd.DataFrame()
    feature_df['order_count'] = grouped.size()
    feature_df['mean_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().mean())
    feature_df['std_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().std())
    feature_df['hour'] = feature_df.index.hour
    feature_df['weekday'] = feature_df.index.weekday
    feature_df['min_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().min())
    feature_df['max_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().max())
    feature_df['median_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().median())
    feature_df['skew_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().skew())
    feature_df['kurtosis_inter_arrival'] = grouped['timestamp'].apply(lambda x: x.diff().dt.total_seconds().kurtosis())
    feature_df['total_time_span'] = grouped['timestamp'].apply(lambda x: (x.max() - x.min()).total_seconds())

    # Additional Features
    # Fee-related Features
    if 'implicitSwapFeeSy' in df.columns and 'explicitSwapFeeSy' in df.columns:
        feature_df['total_swap_fee_sy'] = grouped[['implicitSwapFeeSy', 'explicitSwapFeeSy']].sum().sum(axis=1)
        feature_df['avg_swap_fee_sy'] = grouped[['implicitSwapFeeSy', 'explicitSwapFeeSy']].mean().mean(axis=1)
    else:
        logging.warning("Fee-related columns not found. Skipping fee feature engineering.")
        feature_df['total_swap_fee_sy'] = 0
        feature_df['avg_swap_fee_sy'] = 0

    # User Interaction Features
    if 'user' in df.columns:
        feature_df['unique_users'] = grouped['user'].nunique()
    else:
        logging.warning("Column 'user' not found. Skipping user interaction features.")
        feature_df['unique_users'] = 0

    # Action-Based Features
    if 'action' in df.columns:
        feature_df['swap_pt_count'] = grouped['action'].apply(lambda x: (x == 'SWAP_PT').sum())
    else:
        logging.warning("Column 'action' not found. Skipping action-based features.")
        feature_df['swap_pt_count'] = 0

    # Market-Based Features
    if 'market_symbol' in df.columns and 'market_expiry' in df.columns:
        feature_df['unique_market_symbols'] = grouped['market_symbol'].nunique()
        feature_df['days_until_expiry'] = grouped['market_expiry'].apply(lambda x: (pd.to_datetime(x, utc=True).max() - pd.Timestamp.now(tz='UTC')).days)
    else:
        logging.warning("Market-related columns not found. Skipping market-based features.")
        feature_df['unique_market_symbols'] = 0
        feature_df['days_until_expiry'] = 0

    # Valuation Features
    if 'valuation_usd' in df.columns:
        feature_df['total_valuation_usd'] = grouped['valuation_usd'].sum()
        feature_df['avg_valuation_usd'] = grouped['valuation_usd'].mean()
    else:
        logging.warning("Column 'valuation_usd' not found. Skipping valuation features.")
        feature_df['total_valuation_usd'] = 0
        feature_df['avg_valuation_usd'] = 0

    # Address-Type Features
    if 'input_baseType' in df.columns and 'output_baseType' in df.columns:
        feature_df['unique_input_baseType'] = grouped['input_baseType'].nunique()
        feature_df['unique_output_baseType'] = grouped['output_baseType'].nunique()
    else:
        logging.warning("Address-Type columns not found. Skipping address-type features.")
        feature_df['unique_input_baseType'] = 0
        feature_df['unique_output_baseType'] = 0

    # Handle missing values (forward-fill, then back-fill any leading gaps)
    feature_df = feature_df.ffill().bfill()

    # Additional Feature Engineering
    feature_df['rolling_mean_inter_arrival'] = feature_df['mean_inter_arrival'].rolling(window=3).mean()
    feature_df['rolling_std_inter_arrival'] = feature_df['mean_inter_arrival'].rolling(window=3).std()
    feature_df['lag_order_count'] = feature_df['order_count'].shift(1)
    feature_df['lag_mean_inter_arrival'] = feature_df['mean_inter_arrival'].shift(1)

    # Fill the NaNs introduced by the rolling and lag features
    feature_df = feature_df.ffill().bfill()

    return feature_df

def reduce_dimensionality(features_scaled, variance_threshold=0.95):
    """
    Apply PCA for dimensionality reduction to improve clustering performance.
    """
    pca = PCA(n_components=variance_threshold, random_state=42)
    features_pca = pca.fit_transform(features_scaled)
    logging.info(f"PCA reduced features to {features_pca.shape[1]} dimensions explaining {variance_threshold*100}% variance.")
    return features_pca, pca

def determine_optimal_k(features_scaled, k_min=2, k_max=10):
    """
    Choose the number of clusters by maximizing the silhouette score.

    Inertia, Calinski-Harabasz, and Davies-Bouldin scores are collected for
    reference only; they do not affect the selection.
    """
    inertias = []
    silhouette_scores = []
    calinski_scores = []
    davies_scores = []

    # The silhouette score needs 2 <= k <= n_samples - 1, so clamp the search range
    n_samples = features_scaled.shape[0]
    k_min = max(k_min, 2)
    k_max = min(k_max, n_samples - 1)
    if k_max < k_min:
        raise ValueError(f"Need at least {k_min + 1} samples to cluster, got {n_samples}.")
    K_range = range(k_min, k_max + 1)

    for k in K_range:
        kmeans = KMeans(n_clusters=k, random_state=42)
        labels = kmeans.fit_predict(features_scaled)
        inertias.append(kmeans.inertia_)
        silhouette_scores.append(silhouette_score(features_scaled, labels))
        calinski_scores.append(calinski_harabasz_score(features_scaled, labels))
        davies_scores.append(davies_bouldin_score(features_scaled, labels))

    # Determine optimal K based on Silhouette Score
    optimal_k_silhouette = K_range[int(np.argmax(silhouette_scores))]
    logging.info(f"Optimal number of clusters determined: {optimal_k_silhouette} (Silhouette Score)")

    return optimal_k_silhouette

def fit_distributions_parallel(inter_arrival_times, distributions):
    """
    Fit distributions in parallel to speed up the process.
    """
    def fit_distribution(name, dist, data):
        try:
            params = dist.fit(data)
            expected_inter_arrival = dist.mean(*params)
            if not np.isfinite(expected_inter_arrival) or expected_inter_arrival <= 0:
                return None
            log_likelihood = np.sum(dist.logpdf(data, *params))
            k_params = len(params)
            bic = k_params * np.log(len(data)) - 2 * log_likelihood
            return (name, {'params': params, 'bic': bic})
        except Exception as e:
            logging.warning(f"Error fitting {name}: {e}")
            return None

    results = Parallel(n_jobs=-1)(
        delayed(fit_distribution)(name, dist, inter_arrival_times)
        for name, dist in distributions.items()
    )

    # Keep only the successful fits as {name: info}
    fit_results = dict(result for result in results if result is not None)
    return fit_results

def plot_clusters(features_pca, clusters, title='Cluster Visualization with KMeans'):
    """
    Plot the clusters using the first two PCA components with Plotly for consistency.
    """
    # Create a DataFrame for Plotly
    plot_df = pd.DataFrame({
        'PCA1': features_pca[:, 0],
        'PCA2': features_pca[:, 1],
        'Cluster': clusters.astype(str)
    })

    # Define color palette
    unique_clusters = plot_df['Cluster'].unique()
    color_palette = px.colors.qualitative.Plotly
    color_map = {cluster: color_palette[i % len(color_palette)] for i, cluster in enumerate(unique_clusters)}

    # Create Plotly scatter plot
    fig = px.scatter(
        plot_df,
        x='PCA1',
        y='PCA2',
        color='Cluster',
        color_discrete_map=color_map,
        title=title,
        labels={'PCA1': 'PCA Component 1', 'PCA2': 'PCA Component 2'},
        hover_data=['Cluster'],
        width=800,
        height=600
    )

    # Update layout for better aesthetics
    fig.update_layout(
        legend_title_text='Cluster',
        title_x=0.5,
        template='plotly_white'
    )

    fig.show()

def predict_next_order_time(current_time, last_order_time, best_dist, best_params):
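    """
    Predict the next order arrival time based on the fitted distribution.

    The remaining wait is estimated as the fitted mean inter-arrival time minus
    the time already elapsed since the last order. If the elapsed time already
    exceeds the mean, the full mean is used as a rough fallback.
    """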
    time_since_last_order = (current_time - last_order_time).total_seconds()
    expected_inter_arrival = best_dist.mean(*best_params)

    # Calculate time until next order
    time_until_next_order = expected_inter_arrival - time_since_last_order
    if time_until_next_order < 0:
        time_until_next_order = expected_inter_arrival

    next_order_time = current_time + pd.Timedelta(seconds=time_until_next_order)

    return next_order_time, time_until_next_order

def waiting_time(df_tran_cleaned):
    """
    Main function to execute the order arrival time prediction pipeline.
    """
    # Validate User Selection
    try:
        BUY = validate_selection()
    except ValueError as ve:
        logging.error(ve)
        return

    # Data Preprocessing
    try:
        df_orders = preprocess_data(df_tran_cleaned, BUY)
    except KeyError as ke:
        logging.error(ke)
        return

    # Check Data Sufficiency
    MIN_TRANSACTIONS = 100
    MIN_ORDERS = 8

    if df_orders.empty:
        logging.info("No orders above the threshold.")
        return
    elif len(df_tran_cleaned) < MIN_TRANSACTIONS:
        logging.info(f"Not enough transactions in the dataset. At least {MIN_TRANSACTIONS} required.")
        return
    elif len(df_orders) < MIN_ORDERS:
        logging.info(f"Not enough orders above the threshold. At least {MIN_ORDERS} required.")
        return

    # Feature Extraction
    feature_df = extract_features(df_orders)

    # Save feature columns for consistency
    feature_columns = feature_df.columns.tolist()

    # Standardize Features
    scaler = StandardScaler()
    features_scaled = scaler.fit_transform(feature_df)

    # Dimensionality Reduction
    features_pca, pca = reduce_dimensionality(features_scaled, variance_threshold=0.95)

    # Determine Optimal Number of Clusters
    optimal_k = determine_optimal_k(features_pca, k_min=2, k_max=10)

    # Perform Clustering using KMeans
    kmeans = KMeans(n_clusters=optimal_k, random_state=42)
    clusters = kmeans.fit_predict(features_pca)
    feature_df['cluster'] = clusters

    # Cluster Visualization with Plotly
    plot_clusters(features_pca, clusters, title='Cluster Visualization with KMeans')

    # Set 'time_window' as index for merging
    feature_df = feature_df.reset_index().set_index('time_window')

    # Merge Cluster Labels Back to Orders DataFrame
    df_orders = df_orders.merge(feature_df['cluster'], left_on='time_window', right_index=True, how='left')

    # Segment-wise Modeling
    segment_models = {}
    distributions = {
        'Exponential': expon,
        'Gamma': gamma,
        'Weibull': weibull_min,
        'Pareto': pareto,
        'Burr': burr,
        'LogNormal': lognorm,
        'Beta': beta
    }

    for cluster in feature_df['cluster'].unique():
        df_segment = df_orders[df_orders['cluster'] == cluster].copy()
        df_segment = df_segment.sort_values('timestamp')

        # Calculate inter-arrival times
        df_segment['inter_arrival_time'] = df_segment['timestamp'].diff().dt.total_seconds()
        df_segment = df_segment[df_segment['inter_arrival_time'] > 0].dropna(subset=['inter_arrival_time'])

        if len(df_segment) < MIN_ORDERS:
            logging.info(f"Cluster {cluster} has insufficient data. Skipping.")
            continue

        inter_arrival_times = df_segment['inter_arrival_time'].values

        # Fit distributions in parallel
        fit_results = fit_distributions_parallel(inter_arrival_times, distributions)

        # Select the best-fitting distribution based on BIC
        if fit_results:
            best_fit_name, best_fit_info = min(fit_results.items(), key=lambda x: x[1]['bic'])
            best_dist = distributions[best_fit_name]
            best_params = best_fit_info['params']
            best_bic = best_fit_info['bic']

            segment_models[cluster] = {
                'distribution_name': best_fit_name,
                'distribution': best_dist,
                'params': best_params,
                'bic': best_bic
            }
            logging.info(f"Cluster {cluster}: Best fit - {best_fit_name} with BIC={best_bic:.2f}")
        else:
            logging.info(f"Cluster {cluster}: No suitable distribution found.")

    # Prediction
    # Approximate the current window with historical feature averages plus the current hour and weekday
    current_time = pd.Timestamp.utcnow()
    current_features = {
        'order_count': feature_df['order_count'].mean(),
        'mean_inter_arrival': feature_df['mean_inter_arrival'].mean(),
        'std_inter_arrival': feature_df['std_inter_arrival'].mean(),
        'min_inter_arrival': feature_df['min_inter_arrival'].mean(),
        'max_inter_arrival': feature_df['max_inter_arrival'].mean(),
        'median_inter_arrival': feature_df['median_inter_arrival'].mean(),
        'skew_inter_arrival': feature_df['skew_inter_arrival'].mean(),
        'kurtosis_inter_arrival': feature_df['kurtosis_inter_arrival'].mean(),
        'total_time_span': feature_df['total_time_span'].mean(),
        'total_swap_fee_sy': feature_df['total_swap_fee_sy'].mean(),
        'avg_swap_fee_sy': feature_df['avg_swap_fee_sy'].mean(),
        'unique_users': feature_df['unique_users'].mean(),
        'swap_pt_count': feature_df['swap_pt_count'].mean(),
        'unique_market_symbols': feature_df['unique_market_symbols'].mean(),
        'days_until_expiry': feature_df['days_until_expiry'].mean(),
        'total_valuation_usd': feature_df['total_valuation_usd'].mean(),
        'avg_valuation_usd': feature_df['avg_valuation_usd'].mean(),
        'unique_input_baseType': feature_df['unique_input_baseType'].mean(),
        'unique_output_baseType': feature_df['unique_output_baseType'].mean(),
        'rolling_mean_inter_arrival': feature_df['rolling_mean_inter_arrival'].mean(),
        'rolling_std_inter_arrival': feature_df['rolling_std_inter_arrival'].mean(),
        'lag_order_count': feature_df['lag_order_count'].mean(),
        'lag_mean_inter_arrival': feature_df['lag_mean_inter_arrival'].mean(),
        'hour': current_time.hour,
        'weekday': current_time.weekday()
    }

    # Create DataFrame with consistent feature order
    current_features_df = pd.DataFrame([current_features], columns=feature_columns)

    # Standardize current features
    current_features_scaled = scaler.transform(current_features_df)

    # Apply PCA transformation
    current_features_pca = pca.transform(current_features_scaled)

    # Predict cluster for current time window using KMeans
    current_cluster = kmeans.predict(current_features_pca)[0]
    logging.info(f"Predicted cluster for current time window: {current_cluster}")

    if current_cluster in segment_models:
        model_info = segment_models[current_cluster]
        best_dist = model_info['distribution']
        best_params = model_info['params']

        # Get the last order time in the current cluster
        df_current_cluster = df_orders[df_orders['cluster'] == current_cluster].copy()
        if df_current_cluster.empty:
            logging.info("Not enough data in the current segment to make a prediction.")
        else:
            last_order_time = df_current_cluster['timestamp'].max()
            next_order_time, time_until_next_order = predict_next_order_time(current_time, last_order_time, best_dist, best_params)

            # Display results
            time_delta_seconds = time_until_next_order
            time_delta_minutes = time_delta_seconds / 60
            time_delta_hours = time_delta_seconds / 3600
            time_delta_days = time_delta_seconds / 86400

            order_type = "buy" if BUY == 1 else "sell"
            table_data_en = [
                [f"Predicted Next {YT_OR_PT_AMOUNT} AMOUNT {order_type.capitalize()} Order Arrival Time", f"{next_order_time}"],
                ["Approximately", f"{time_delta_seconds:.2f} seconds"],
                ["Or approximately", f"{time_delta_minutes:.2f} minutes"],
                ["Or approximately", f"{time_delta_hours:.2f} hours"],
                ["Or approximately", f"{time_delta_days:.2f} days"],
            ]

            headers_en = ["Description", "Value"]

            table_en = tabulate(table_data_en, headers=headers_en, tablefmt="grid", stralign="center", numalign="right")

            print(table_en)

            # Display model information
            print("\nModel used:")
            print(f"Segment (Cluster): {current_cluster}")
            print(f"Best-fitting distribution: {model_info['distribution_name']}")
            print(f"Parameters: {model_info['params']}")
            print(f"BIC: {model_info['bic']:.2f}")
    else:
        logging.info("Not enough data in the current segment to make a prediction.")
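    return feature_df


feature_df = waiting_time(df_tran_cleaned)

# waiting_time() returns None when validation fails or the data is insufficient
if feature_df is None:
    raise RuntimeError("waiting_time() did not produce features; see the log messages above.")

BUY = validate_selection()
df_orders = preprocess_data(df_tran_cleaned, BUY)

df_orders['timestamp'] = pd.to_datetime(df_orders['timestamp'], utc=True)
feature_df = feature_df.reset_index()

# Re-attach each order to its 4-hour window so the cluster labels can be merged back
df_orders['time_window'] = df_orders['timestamp'].dt.floor('4H')
feature_df['time_window'] = pd.to_datetime(feature_df['time_window'], utc=True)

df_orders = df_orders.merge(feature_df[['time_window', 'cluster']], on='time_window', how='left')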


INFO:root:PCA reduced features to 11 dimensions explaining 95.0% variance.
INFO:root:Optimal number of clusters determined: 2 (Silhouette Score)


INFO:root:Cluster 0: Best fit - Weibull with BIC=63581.10
INFO:root:Cluster 1: Best fit - Pareto with BIC=5267.43
INFO:root:Predicted cluster for current time window: 0


+---------------------------------------------------+-------------------------------------+
|                    Description                    |                Value                |
+===================================================+=====================================+
| Predicted Next 0.01 AMOUNT Buy Order Arrival Time | 2024-10-12 12:16:00.810021603+00:00 |
+---------------------------------------------------+-------------------------------------+
|                   Approximately                   |           440.57 seconds            |
+---------------------------------------------------+-------------------------------------+
|                 Or approximately                  |            7.34 minutes             |
+---------------------------------------------------+-------------------------------------+
|                 Or approximately                  |             0.12 hours              |
+---------------------------------------------------+-------------------------------------+
|                 Or approximately                  |              0.01 days              |
+---------------------------------------------------+-------------------------------------+
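
The prediction above subtracts the elapsed time from the fitted mean. For comparison only, the sketch below computes the conditional expected remaining wait (the mean residual life) of a fitted distribution by integrating its survival function. This is a swapped-in technique, not what the tool does. The helper name `expected_remaining_wait` and all parameter values are hypothetical and are not taken from the output above.

```python
import numpy as np
from scipy import integrate
from scipy.stats import expon, weibull_min


def expected_remaining_wait(dist, params, elapsed_seconds):
    """Mean residual life E[T - t | T > t] for a scipy.stats distribution."""
    frozen = dist(*params)
    survival = frozen.sf(elapsed_seconds)
    if survival <= 0:
        return float("nan")  # elapsed time lies beyond the fitted support
    # Integrate the survival function from the elapsed time to infinity
    tail_area, _ = integrate.quad(frozen.sf, elapsed_seconds, np.inf)
    return tail_area / survival


# Hypothetical parameters: (loc, scale) for expon, (c, loc, scale) for weibull_min
exp_wait = expected_remaining_wait(expon, (0, 100.0), elapsed_seconds=50.0)
weibull_wait = expected_remaining_wait(weibull_min, (0.7, 0, 100.0), elapsed_seconds=50.0)
print(f"Exponential: {exp_wait:.1f} s, Weibull: {weibull_wait:.1f} s")
```

For an exponential fit the remaining wait does not depend on the elapsed time. For a Weibull fit with shape below 1, the remaining wait grows the longer no order has arrived, which the mean-minus-elapsed heuristic does not capture.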

In [3]:
# @title Historical {AMOUNT} {order_type} Order Count Time Series
import plotly.express as px
import plotly.graph_objects as go

df_orders['time_window'] = df_orders['timestamp'].dt.floor('H')
order_counts = df_orders.groupby('time_window').size().reset_index(name='order_count')
order_type = 'BUY' if BUY == 1 else 'SELL'

fig = px.line(order_counts, x='time_window', y='order_count',
              title=f'{symbol} on {network} <br />Historical > {YT_OR_PT_AMOUNT} {order_type} Order Count Time Series',
              labels={'time_window': 'Time', 'order_count': 'Order Count'})

fig.update_layout(
    title={'x':0.5},
    xaxis_title='Time',
    yaxis_title='Order Count'
)

fig.show()


---

# 3. Inter-Arrival Time and Implied APY Chart Overview

## Overview

- **X-axis (Time)**: Represents the specific time when each order occurred.

- **Left Y-axis (Inter-Arrival Time in seconds)**: Indicates the time interval between consecutive orders, measured in seconds.

- **Right Y-axis (Implied APY in %)**: Reflects the implied annual percentage yield (APY) of the market, which represents the market's expectations for future returns.

- **Scatter Plot**: The vertical position of each point is the waiting time since the previous order, and the density of points shows how active order flow was on a given date. The color of each point identifies its order category (the cluster assigned in the previous analysis), and each category has its own characteristics.

- **Red Line Chart**: Shows the trend of the implied APY over time.
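
As a minimal sketch of the left Y-axis quantity, the snippet below computes inter-arrival times from a few made-up timestamps. The `orders` DataFrame and its values are hypothetical; the notebook works on `df_orders['timestamp']` instead.

```python
import pandas as pd

# Hypothetical order timestamps; the notebook uses df_orders['timestamp'] instead
orders = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-10-12 12:00:00", "2024-10-12 12:05:00", "2024-10-12 12:06:30"],
        utc=True,
    )
})

orders = orders.sort_values("timestamp")
# Seconds elapsed since the previous order; the first row has no predecessor (NaN)
orders["inter_arrival_time"] = orders["timestamp"].diff().dt.total_seconds()
print(orders)
```

The first order has no predecessor, so its inter-arrival time is missing and is dropped before plotting.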

## How to Use

Take the predicted order arrival time and the order category from the previous analysis, then use this chart to see how orders of the same category have historically related to the implied APY and how long they took to arrive. This chart is a subjective analysis tool that helps users double-check their decisions.
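
One way to read the chart numerically is to summarize each cluster's waiting time and implied APY. The sketch below does this on a made-up sample; the `sample` DataFrame and its values are hypothetical, and in the notebook the same columns come from `df_orders_sorted`.

```python
import pandas as pd

# Hypothetical sample; in the notebook these columns come from df_orders_sorted
sample = pd.DataFrame({
    "cluster": [0, 0, 0, 1, 1],
    "inter_arrival_time": [120.0, 300.0, 180.0, 3600.0, 5400.0],
    "impliedApy": [0.052, 0.050, 0.051, 0.047, 0.045],
})

# One row per cluster: order count, median wait in seconds, mean implied APY
summary = sample.groupby("cluster").agg(
    orders=("inter_arrival_time", "size"),
    median_wait_s=("inter_arrival_time", "median"),
    mean_implied_apy=("impliedApy", "mean"),
)
print(summary)
```

The median is used for the waiting time because inter-arrival times tend to be heavily skewed, so a few long gaps would otherwise dominate the average.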

---


In [4]:
# @title Inter-Arrival Time and Implied APY Visualization

import pandas as pd
import numpy as np
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.express as px
import warnings

warnings.filterwarnings('ignore')

df_orders_sorted = df_orders.sort_values('timestamp')
df_orders_sorted['inter_arrival_time'] = df_orders_sorted['timestamp'].diff().dt.total_seconds()

# df_orders_sorted is derived from df_orders, so a missing column cannot be merged back in
if 'impliedApy' not in df_orders_sorted.columns:
    raise KeyError("Column 'impliedApy' is required to plot the implied APY line.")

df_orders_sorted['impliedApy'] = pd.to_numeric(df_orders_sorted['impliedApy'], errors='coerce')

# Drop rows with missing values in key columns
df_orders_sorted = df_orders_sorted.dropna(subset=['impliedApy', 'inter_arrival_time', 'cluster'])
df_orders_sorted['cluster'] = df_orders_sorted['cluster'].astype(int)

clusters = sorted(df_orders_sorted['cluster'].unique())
num_clusters = len(clusters)

color_palette = px.colors.qualitative.Plotly

if num_clusters > len(color_palette):
    color_palette = color_palette * (num_clusters // len(color_palette) + 1)

cluster_color_map = {cluster: color_palette[i] for i, cluster in enumerate(clusters)}

fig = make_subplots(specs=[[{"secondary_y": True}]])

scatter = go.Scatter(
    x=df_orders_sorted['timestamp'],
    y=df_orders_sorted['inter_arrival_time'],
    mode='markers',
    marker=dict(
        size=6,
        color=df_orders_sorted['cluster'].map(cluster_color_map),
        opacity=0.6,
        line=dict(width=0.5, color='DarkSlateGrey')
    ),
    name='Inter-Arrival Time (by Cluster)',
    hovertemplate=
        '<b>Time:</b> %{x}<br>' +
        '<b>Inter-Arrival Time (seconds):</b> %{y:.2f}<br>' +
        '<b>Cluster:</b> %{text}<br>' +
        '<extra></extra>',
    text=df_orders_sorted['cluster'].astype(str)
)

fig.add_trace(scatter, secondary_y=False)


line = go.Scatter(
    x=df_orders_sorted['timestamp'],
    y=df_orders_sorted['impliedApy'],
    mode='lines',
    line=dict(color='red', width=2),
    name='Implied APY',
    hovertemplate=
        '<b>Time:</b> %{x}<br>' +
        '<b>Implied APY (%):</b> %{y:.2f}%<br>' +
        '<extra></extra>'
)

fig.add_trace(line, secondary_y=True)

# Update layout for better aesthetics
fig.update_layout(
    title={
        'text': 'Inter-Arrival Time and Implied APY',
        'y':0.95,
        'x':0.5,
        'xanchor': 'center',
        'yanchor': 'top'
    },
    xaxis_title='Time',
    legend=dict(
        x=0.01,
        y=0.99,
        xanchor='left',
        yanchor='top',
        bgcolor='rgba(255,255,255,0.5)',
        bordercolor='rgba(0,0,0,0)',
        title='Cluster',
        orientation='v',
        font=dict(size=10),
    ),
    template='plotly_white',
    hovermode='closest'
)

# Update y-axes titles
fig.update_yaxes(
    title_text='Inter-Arrival Time (seconds)',
    secondary_y=False,
    gridcolor='lightgrey'
)

fig.update_yaxes(
    title_text='Implied APY (%)',
    secondary_y=True,
    gridcolor='lightgrey'
)

# Add cluster legend manually to replace color bar
for cluster, color in cluster_color_map.items():
    fig.add_trace(
        go.Scatter(
            x=[None],
            y=[None],
            mode='markers',
            marker=dict(size=8, color=color),
            legendgroup=str(cluster),
            showlegend=True,
            name=f'Cluster {cluster}'
        )
    )

# Hide the original scatter trace from the legend; the per-cluster entries above replace it
fig.data[0].showlegend = False

fig.update_layout(
    legend=dict(
        x=0.01,
        y=0.99,
        xanchor='left',
        yanchor='top',
        bgcolor='rgba(255,255,255,0.5)',
        bordercolor='rgba(0,0,0,0)',
        title='Cluster',
        orientation='v',
        font=dict(size=11),
    )
)

fig.show()


---

# Detailed Risk Disclosure


## 1. Market Risk

The value of assets such as YT (Yield Tokens) and PT (Principal Tokens) can fluctuate significantly due to market conditions. This volatility may result in substantial losses. Investors should be aware that the value of their investment can go up or down, and there is a possibility of losing the entire investment.

## 2. Liquidity Risk

Liquidity in the market for YT and PT may vary. During times of low liquidity, you may find it difficult to execute trades at favorable prices, or you may be unable to sell your assets at all. This could result in losses or the inability to exit a position when desired.


## 3. De-Pegging Risk

For assets like abcETH, there is a risk of de-pegging from their underlying assets, such as ETH. If abcETH loses its peg, its value may diverge significantly from ETH, leading to potential losses. Investors should consider the risks of holding pegged assets in volatile markets.

## 4. Strategy Risk

The strategies provided, such as the fair implied APY and order arrival time prediction, are based on historical data and current market conditions. However, past performance is not indicative of future results. These strategies should not be relied upon as guarantees of future performance, and investors should be prepared for unexpected outcomes.

## 5. Tool and Data Risk

The analytical tools and data provided are intended for educational and reference purposes only. The data may not be updated in real-time, and there is no guarantee of the tools' accuracy or reliability. Using these tools in a live trading environment carries the risk of inaccuracies that could lead to financial losses.



## 6. Investment Risk

Investing in digital assets, including YT and PT, carries inherent risks. These assets may be highly speculative, and there is no guarantee that your investment will generate positive returns. It is important to only invest what you can afford to lose and to seek advice from a financial professional if needed.

---
