# EquiLend Gating Framework Process & Overview

## Overview
The gating framework is a systematic, data-driven process for identifying securities with the highest potential for short-squeeze risk. It applies a series of sequential filters ("gates") that narrow a broad universe of securities to a focused list of candidates meeting explicit thresholds for loan size, breadth of demand and supply, supply constraint, and exit difficulty.

## Gate Sequence and Rationale
1. **Initial Universe**: All securities in the dataset.
2. **On-Loan Value ≥ $1M**: Ensures the security has a meaningful amount of stock on loan, used here as a proxy for short interest, filtering out inactive names with negligible borrowing.
3. **Borrower Count ≥ 3**: Validates that short demand is broad-based and not driven by a single entity, reducing idiosyncratic risk.
4. **Lender Count ≥ 2**: Ensures the supply side is not concentrated, reducing the risk of a single lender's actions creating a misleading signal.
5. **Lendable Inventory > 10% of Float**: Confirms the security is part of the liquid lendable market, avoiding signals driven by artificially small supply.
6. **Utilization ≥ 50%**: Focuses on names where lendable supply is significantly constrained, increasing squeeze potential.
7. **Days to Cover ≥ 2**: Indicates it would take multiple days for shorts to exit positions, which can amplify a squeeze.
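The sequence above can be sketched on a small toy table. The tickers and values below are invented purely for illustration, and `run_gates` is a hypothetical helper, not the notebook's implementation. Column names follow the ones the notebook uses after renaming.

```python
import pandas as pd

# Hypothetical toy universe; every value here is invented for illustration.
toy = pd.DataFrame({
    "Ticker": ["AAA", "BBB", "CCC", "DDD", "EEE"],
    "OnLoanValueUSD": [5e6, 0.4e6, 12e6, 3e6, 8e6],
    "BorrowerCount": [4, 5, 1, 6, 3],
    "LenderCount": [3, 2, 4, 2, 2],
    "LendablePctFloat": [15.0, 20.0, 30.0, 12.0, 25.0],
    "Utilization_Pct": [80.0, 90.0, 70.0, 40.0, 55.0],
    "DaysToCover": [3.5, 4.0, 2.5, 5.0, 2.0],
})

# Gates in the documented order: (label, function returning a boolean mask).
GATES = [
    ("On-Loan Value >= $1M", lambda d: d["OnLoanValueUSD"] >= 1_000_000),
    ("Borrower Count >= 3", lambda d: d["BorrowerCount"] >= 3),
    ("Lender Count >= 2", lambda d: d["LenderCount"] >= 2),
    ("Lendable > 10% of Float", lambda d: d["LendablePctFloat"] > 10),
    ("Utilization >= 50%", lambda d: d["Utilization_Pct"] >= 50),
    ("Days to Cover >= 2", lambda d: d["DaysToCover"] >= 2),
]

def run_gates(df, gates):
    """Apply gates one after another; return survivors and per-gate counts."""
    counts = [("Initial Universe", len(df))]
    for label, mask_fn in gates:
        df = df[mask_fn(df)]
        counts.append((label, len(df)))
    return df, counts

survivors, counts = run_gates(toy, GATES)
for label, n in counts:
    print(f"{label}: {n}")
print(sorted(survivors["Ticker"]))  # ['AAA', 'EEE']
```

Because every gate is a fixed threshold and a security must pass all of them, the final list is the same in any gate order. The order only changes how much attrition is attributed to each gate.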

## Float Calculation
Float is a critical metric in the analysis, calculated as:

    Float = On Loan Quantity / (Short Interest Indicator / 100)

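This treats the Short Interest Indicator as on-loan quantity expressed as a percentage of float, so dividing on-loan quantity by that percentage backs out an implied float. The result is an estimate rather than a reported figure: it is undefined when the indicator is zero, and it inherits any error in either input.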
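As a worked example of the formula, the sketch below uses invented inputs: 2,000,000 shares on loan and a Short Interest Indicator of 8%. The function name `implied_float` is an assumption for illustration.

```python
def implied_float(on_loan_qty, short_interest_pct):
    """Back out float (in shares) from on-loan quantity and the short interest indicator (%)."""
    if short_interest_pct <= 0:
        return None  # the formula is undefined when the indicator is zero or negative
    return on_loan_qty / (short_interest_pct / 100)

# Invented inputs: 2,000,000 shares on loan, indicator of 8%.
example_float = implied_float(2_000_000, 8.0)
print(f"{example_float:,.0f}")  # 25,000,000
```

With these inputs, 2,000,000 / 0.08 gives an implied float of 25,000,000 shares.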

## Research & Thought Process
- **Liquidity and Market Depth**: The initial gates focus on ensuring that only liquid, actively traded securities are considered, as these are more likely to experience meaningful price movements.
- **Breadth of Demand and Supply**: By requiring a minimum number of borrowers and lenders, the framework avoids names where activity is dominated by a single participant, which could skew results or create false signals.
- **Supply Constraints**: High utilization is a classic indicator of a potential squeeze, as it signals that little spare supply remains if demand spikes or supply contracts. The lendable-inventory floor (more than 10% of float) ensures that utilization is measured against a meaningful pool of supply rather than an artificially small one.
- **Exit Difficulty**: Days to cover is a direct measure of how hard it would be for shorts to unwind positions, with higher values indicating greater risk of a squeeze.
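The supply and exit measures reduce to simple ratios. The sketch below follows the definitions used by the notebook's derived metrics (lendable shares = lendable value / price; days to cover = on-loan quantity / 20-day average daily volume). The function names and input figures are illustrative assumptions.

```python
def days_to_cover(on_loan_qty, adv_20d):
    """Days of average volume needed to buy back all shares on loan."""
    if adv_20d <= 0:
        return None  # undefined without trading volume
    return on_loan_qty / adv_20d

def lendable_pct_of_float(lendable_value_usd, price_usd, float_shares):
    """Lendable inventory, converted to shares, as a percentage of float."""
    if price_usd <= 0 or float_shares <= 0:
        return None  # undefined without a price or a float estimate
    lendable_shares = lendable_value_usd / price_usd
    return 100 * lendable_shares / float_shares

# Invented inputs for one hypothetical security.
dtc = days_to_cover(on_loan_qty=2_000_000, adv_20d=500_000)
lend_pct = lendable_pct_of_float(
    lendable_value_usd=60_000_000, price_usd=20.0, float_shares=25_000_000
)
print(dtc)       # 4.0
print(lend_pct)  # 12.0
```

This hypothetical security would clear both the days-to-cover gate (4.0 ≥ 2) and the lendable-inventory gate (12% > 10%).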

## Summary
This gating process is the result of research into market microstructure, historical short-squeeze events, and best practices in securities lending analytics. The framework is designed to be robust, transparent, and adaptable to evolving market conditions.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

# === Configuration ===
MAIN_COLOR = '#006EB7'
DATA_FILE_PATH = 'Daily_Data_Dispatch_2025-06-16_bsheehan_adhoc.csv'

# Set up plotting style
plt.style.use('default')
sns.set_palette([MAIN_COLOR])

print("Libraries imported successfully")
print(f"Analysis Date: {datetime.today().strftime('%B %d, %Y')}")

## Data Loading and Preparation

This section loads the raw data and creates the derived metrics needed for the gating framework analysis.
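Before computing derived metrics, it can help to confirm that the CSV actually contains the source columns the preparation step renames. The following is a minimal sketch: `REQUIRED_SOURCE_COLUMNS` and `missing_columns` are hypothetical names that are not part of the notebook, and the column list simply mirrors the notebook's rename mapping.

```python
import pandas as pd

# Source column names taken from the notebook's rename mapping.
REQUIRED_SOURCE_COLUMNS = [
    "Utilization (%)",
    "On Loan Value (USD)",
    "Short Interest Indicator",
    "Total Lendable Value (USD)",
    "Security Price (USD)",
    "Borrower Count",
    "Lender Count",
    "Average Fee",
    "On Loan Quantity",
    "Composite 20-Day ADV",
]

def missing_columns(df, required=REQUIRED_SOURCE_COLUMNS):
    """Return the required column names that are absent from df, in order."""
    return [col for col in required if col not in df.columns]

# Hypothetical frame that carries only two of the expected columns.
sample = pd.DataFrame(columns=["Utilization (%)", "On Loan Quantity"])
missing = missing_columns(sample)
print(f"{len(missing)} required columns missing: {missing}")
```

An empty list means every expected column is present. A non-empty list identifies the columns that would otherwise surface as a `KeyError` when the derived metrics are calculated.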

In [None]:
def load_and_prepare_data(file_path):
    """
    Load and prepare data for gating framework analysis.
    Creates derived metrics including Float, Days to Cover, and Market Cap.
    """
    try:
        df = pd.read_csv(file_path)
        print(f"Successfully loaded {len(df):,} securities from {file_path}")
    except FileNotFoundError:
        print(f"Error: The file '{file_path}' was not found.")
        return None
    
    # Rename columns for easier handling
    column_mapping = {
        'Utilization (%)': 'Utilization_Pct',
        'On Loan Value (USD)': 'OnLoanValueUSD',
        'Short Interest Indicator': 'ShortInterestPct',
        'Total Lendable Value (USD)': 'LendableValueUSD',
        'Security Price (USD)': 'PriceUSD',
        'Borrower Count': 'BorrowerCount',
        'Lender Count': 'LenderCount',
        'Average Fee': 'AvgFee',
        'On Loan Quantity': 'OnLoanQty',
        'Composite 20-Day ADV': 'ADV20Day'
    }
    
    df.rename(columns=column_mapping, inplace=True, errors='ignore')
    
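    # Calculate derived metrics. Undefined ratios are set to NaN rather than 0
    # so they fail every gate and are skipped by median/mean calculations.
    # Implied float: on-loan quantity divided by short interest as a fraction
    df['Float'] = np.where(df['ShortInterestPct'] > 0, 
                          df['OnLoanQty'] / (df['ShortInterestPct'] / 100), np.nan)
    
    # Lendable inventory converted from USD value to shares
    df['LendableShares'] = np.where(df['PriceUSD'] > 0, 
                                   df['LendableValueUSD'] / df['PriceUSD'], np.nan)
    
    df['LendablePctFloat'] = np.where(df['Float'] > 0, 
                                     (df['LendableShares'] / df['Float']) * 100, np.nan)
    
    # Days of average volume needed to buy back all shares on loan
    df['DaysToCover'] = np.where(df['ADV20Day'] > 0, 
                                df['OnLoanQty'] / df['ADV20Day'], np.nan)
    
    # Market cap proxy: implied float x price (not shares outstanding x price)
    df['MarketCapUSD'] = df['Float'] * df['PriceUSD']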
    
    print("\nDerived metrics calculated:")
    print("- Float (shares)")
    print("- Lendable as % of Float")
    print("- Days to Cover")
    print("- Market Capitalization")
    
    return df

# Load the data
raw_df = load_and_prepare_data(DATA_FILE_PATH)

## Gating Framework Implementation

The core gating framework that sequentially applies filters to identify the most compelling short-squeeze candidates.

In [None]:
def apply_gating_framework(df):
    """
    Apply the sequential gating framework to identify short-squeeze candidates.
    Returns gate counts, analysis of dropped securities, and final filtered dataset.
    """
    if df is None:
        return {}, [], pd.DataFrame()
    
    # Define the sequential gates (same order as the documented gate sequence)
    gates = {
        '1. Initial Universe': lambda d: d,
        '2. On-Loan Value >= $1M': lambda d: d[d['OnLoanValueUSD'] >= 1_000_000],
        '3. Borrower Count >= 3': lambda d: d[d['BorrowerCount'] >= 3],
        '4. Lender Count >= 2': lambda d: d[d['LenderCount'] >= 2],
        '5. Lendable > 10% of Float': lambda d: d[d['LendablePctFloat'] > 10],
        '6. Utilization >= 50%': lambda d: d[d['Utilization_Pct'] >= 50],
        '7. Days to Cover >= 2': lambda d: d[d['DaysToCover'] >= 2]
    }
    
    gate_counts = {}
    dropped_securities_analysis = []
    df_filtered = df.copy()
    last_count = len(df_filtered)
    
    print("Applying gating framework...\n")
    
    for name, gate_func in gates.items():
        df_after_gate = gate_func(df_filtered)
        current_count = len(df_after_gate)
        gate_counts[name] = current_count
        
        # Analyze dropped securities
        if current_count < last_count:
            dropped_indices = df_filtered.index.difference(df_after_gate.index)
            dropped_df = df_filtered.loc[dropped_indices]
            
            analysis = {
                'Gate': name,
                'Dropped Count': last_count - current_count,
                'Median Market Cap ($M)': round(dropped_df['MarketCapUSD'].median() / 1e6, 1),
                'Median Utilization (%)': round(dropped_df['Utilization_Pct'].median(), 1),
                'Median Fee (bps)': round(dropped_df['AvgFee'].median(), 1)
            }
            dropped_securities_analysis.append(analysis)
            
            print(f"{name}: {current_count:,} securities (-{last_count - current_count:,})")
        else:
            print(f"{name}: {current_count:,} securities (no change)")
        
        df_filtered = df_after_gate
        last_count = current_count
    
    return gate_counts, dropped_securities_analysis, df_filtered

# Apply the gating framework
if raw_df is not None:
    gate_results, dropped_stats, final_df = apply_gating_framework(raw_df)
    
    initial_count = list(gate_results.values())[0]
    final_count = list(gate_results.values())[-1]
    reduction_pct = round(100 * (1 - final_count / initial_count), 1) if initial_count > 0 else 0
    
    print(f"\n=== SUMMARY ===")
    print(f"Initial Universe: {initial_count:,} securities")
    print(f"Final Candidates: {final_count:,} securities")
    print(f"Reduction: {reduction_pct}%")

## Visualization: Gating Framework Attrition

Visual representation of how each gate reduces the universe of securities.

In [None]:
def create_attrition_chart(gate_counts):
    """
    Create a bar chart showing the attrition through each gate.
    """
    labels = [label.split('. ')[1] for label in gate_counts.keys()]
    counts = list(gate_counts.values())
    
    fig, ax = plt.subplots(figsize=(12, 8))
    bars = ax.bar(labels, counts, color=MAIN_COLOR, alpha=0.8)
    
    ax.set_ylabel('Number of Securities', fontsize=12)
    ax.set_title('Gating Framework Attrition Analysis', fontsize=16, pad=20)
    
    # Remove top and right spines
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    
    # Rotate x-axis labels for better readability
    ax.tick_params(axis='x', rotation=45, labelsize=10)
    ax.grid(axis='y', linestyle='--', alpha=0.6)
    
    # Add value labels on bars
    for bar in bars:
        yval = bar.get_height()
        # Cast to int so counts never render with a trailing ".0"
        ax.text(bar.get_x() + bar.get_width()/2.0, yval + counts[0]*0.01, 
                f'{int(yval):,}', ha='center', va='bottom', fontweight='bold')
    
    plt.tight_layout()
    plt.show()

# Create the attrition chart
if 'gate_results' in locals() and gate_results:
    create_attrition_chart(gate_results)

## Analysis of Filtered Securities

Detailed analysis of the characteristics of securities dropped at each gate.

In [None]:
def display_dropped_analysis(dropped_stats):
    """
    Display analysis of securities dropped at each gate.
    """
    if not dropped_stats:
        print("No dropped securities analysis available.")
        return
    
    df_dropped = pd.DataFrame(dropped_stats)
    
    # Clean up gate names for display
    df_dropped['Gate'] = df_dropped['Gate'].apply(lambda x: x.split('. ')[1])
    
    print("=== ANALYSIS OF FILTERED SECURITIES ===")
    print("\nCharacteristics of securities dropped at each gate:")
    print(df_dropped.to_string(index=False))
    
    return df_dropped

# Display the analysis
if 'dropped_stats' in locals():
    dropped_analysis_df = display_dropped_analysis(dropped_stats)

## Final Candidate Analysis

Detailed examination of the securities that passed all gates.

In [None]:
def analyze_final_candidates(final_df):
    """
    Analyze and display the final list of candidates that passed all gates.
    """
    if final_df.empty:
        print("No securities passed all gates.")
        return
    
    print(f"=== FINAL CANDIDATE ANALYSIS ({len(final_df)} securities) ===")
    
    # Summary statistics
    print("\nSummary Statistics:")
    summary_stats = {
        'Average Utilization (%)': f"{final_df['Utilization_Pct'].mean():.1f}",
        'Average Fee (bps)': f"{final_df['AvgFee'].mean():.0f}",
        'Average Days to Cover': f"{final_df['DaysToCover'].mean():.2f}",
        'Median Market Cap ($M)': f"{(final_df['MarketCapUSD'].median() / 1e6):.1f}",
        'Average Borrower Count': f"{final_df['BorrowerCount'].mean():.1f}"
    }
    
    for key, value in summary_stats.items():
        print(f"  {key}: {value}")
    
    # Top candidates by Days to Cover
    print("\nTop 10 Candidates by Days to Cover:")
    display_cols = ['Ticker', 'OnLoanValueUSD', 'Utilization_Pct', 'DaysToCover', 
                   'AvgFee', 'BorrowerCount', 'LenderCount']
    
    top_candidates = final_df[display_cols].sort_values(by='DaysToCover', ascending=False).head(10)
    
    # Format for display
    top_candidates_display = top_candidates.copy()
    top_candidates_display['OnLoanValueUSD'] = top_candidates_display['OnLoanValueUSD'].apply(
        lambda x: f"${x/1e6:,.1f}M")
    top_candidates_display['Utilization_Pct'] = top_candidates_display['Utilization_Pct'].apply(
        lambda x: f"{x:.1f}%")
    top_candidates_display['DaysToCover'] = top_candidates_display['DaysToCover'].apply(
        lambda x: f"{x:.2f}")
    top_candidates_display['AvgFee'] = top_candidates_display['AvgFee'].apply(
        lambda x: f"{x:.0f} bps")
    
    print(top_candidates_display.to_string(index=False))
    
    return final_df

# Analyze final candidates
if 'final_df' in locals():
    analyzed_final = analyze_final_candidates(final_df)

## Distribution Analysis

Visual analysis of key metrics for the final candidate list.

In [None]:
def create_distribution_charts(final_df):
    """
    Create distribution charts for key metrics of final candidates.
    """
    if final_df.empty:
        print("No data available for distribution analysis.")
        return
    
    fig, axes = plt.subplots(2, 2, figsize=(15, 12))
    fig.suptitle('Distribution Analysis of Final Candidates', fontsize=16, y=0.98)
    
    # Utilization distribution
    axes[0, 0].hist(final_df['Utilization_Pct'], bins=20, color=MAIN_COLOR, alpha=0.7, edgecolor='black')
    axes[0, 0].set_title('Utilization Distribution (%)')
    axes[0, 0].set_xlabel('Utilization (%)')
    axes[0, 0].set_ylabel('Frequency')
    axes[0, 0].grid(True, alpha=0.3)
    
    # Days to Cover distribution
    axes[0, 1].hist(final_df['DaysToCover'], bins=20, color=MAIN_COLOR, alpha=0.7, edgecolor='black')
    axes[0, 1].set_title('Days to Cover Distribution')
    axes[0, 1].set_xlabel('Days to Cover')
    axes[0, 1].set_ylabel('Frequency')
    axes[0, 1].grid(True, alpha=0.3)
    
    # Average Fee distribution
    axes[1, 0].hist(final_df['AvgFee'], bins=20, color=MAIN_COLOR, alpha=0.7, edgecolor='black')
    axes[1, 0].set_title('Average Fee Distribution (bps)')
    axes[1, 0].set_xlabel('Average Fee (bps)')
    axes[1, 0].set_ylabel('Frequency')
    axes[1, 0].grid(True, alpha=0.3)
    
    # Market Cap distribution (log scale)
    market_cap_millions = final_df['MarketCapUSD'] / 1e6
    axes[1, 1].hist(np.log10(market_cap_millions), bins=20, color=MAIN_COLOR, alpha=0.7, edgecolor='black')
    axes[1, 1].set_title('Market Cap Distribution (Log Scale)')
    axes[1, 1].set_xlabel('Log10(Market Cap $M)')
    axes[1, 1].set_ylabel('Frequency')
    axes[1, 1].grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Create distribution charts
if 'final_df' in locals() and not final_df.empty:
    create_distribution_charts(final_df)

## Export Results

Export the final candidate list and analysis results for further use.

In [None]:
def export_results(final_df, gate_results, dropped_stats):
    """
    Export the analysis results to CSV files.
    """
    timestamp = datetime.now().strftime("%Y%m%d_%H%M")
    
    # Export final candidate list
    if not final_df.empty:
        final_filename = f'EquiLend_Final_Candidates_{timestamp}.csv'
        final_df.to_csv(final_filename, index=False)
        print(f"Final candidates exported to: {final_filename}")
    
    # Export gate analysis
    gate_df = pd.DataFrame(list(gate_results.items()), columns=['Gate', 'Count'])
    gate_filename = f'EquiLend_Gate_Analysis_{timestamp}.csv'
    gate_df.to_csv(gate_filename, index=False)
    print(f"Gate analysis exported to: {gate_filename}")
    
    # Export dropped securities analysis
    if dropped_stats:
        dropped_df = pd.DataFrame(dropped_stats)
        dropped_filename = f'EquiLend_Dropped_Analysis_{timestamp}.csv'
        dropped_df.to_csv(dropped_filename, index=False)
        print(f"Dropped securities analysis exported to: {dropped_filename}")
    
    print(f"\nAll analysis completed at {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

# Export results
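# locals() inside a generator expression refers to the generator's own scope,
# so notebook-level names must be looked up via globals() instead.
if all(var in globals() for var in ['final_df', 'gate_results', 'dropped_stats']):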
    export_results(final_df, gate_results, dropped_stats)
else:
    print("Results not available for export. Please run the analysis first.")