[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/Initial-Insider-Transactions.ipynb)

# Analyze Insider Trading Data from SEC Form 3 with Python -- Free, No API Key

Use **edgartools** to extract and analyze insider ownership data from SEC Form 3 filings in Python -- completely free, no API key or paid subscription required. Form 3 is the initial statement of beneficial ownership that insiders must file when they first become officers, directors, or 10% shareholders.

**What you'll learn:**
- Retrieve Form 3 insider ownership filings for any public company
- Extract structured data (insider names, positions, share counts, derivatives)
- Analyze ownership patterns, trends, and peer comparisons with pandas and matplotlib

In [None]:
!pip install -U edgartools seaborn

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from tqdm.auto import tqdm

from edgar import *

pd.options.display.max_columns = None
pd.options.display.max_rows = 200

# The SEC requires you to identify yourself (any email works)
set_identity("your.name@example.com")

## Find Insider Filings on SEC EDGAR

To list insider filings, use `get_filings()` and filter by form type. The SEC defines three forms for reporting insider ownership:

- **Form 3** -- Initial statement of beneficial ownership
- **Form 4** -- Changes in ownership (most common)
- **Form 5** -- Annual summary of small or late transactions

In [None]:
filings = get_filings(form=4).head(10)
filings

## Get Insider Filings for a Specific Company

To analyze insider ownership for a specific company, start with `Company()` and filter for Form 3 filings. Let's look at **Vertex Pharmaceuticals** (VRTX):

In [None]:
company = Company("VRTX")
company

## View Initial Insider Positions

Form 3 filings contain the initial ownership positions for insiders. This tells us when someone became an insider and what they held at that point.

Electronic Form 3 filings on EDGAR go back to mid-2003, when the SEC made electronic submission of insider ownership forms (Forms 3, 4, and 5) mandatory. Let's see all of Vertex's initial filings:

In [None]:
initial_filings = (company.get_filings(form=3)
                     .filter(filing_date=':2025-03-04')) # Filter up to this date to keep the notebook data stable
initial_filings
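
The `filing_date=':2025-03-04'` argument above is a colon-separated date range with an open start, so it keeps everything up to the cutoff. As a rough illustration of how such a string can be read, the sketch below uses two hypothetical helpers, `parse_date_range` and `in_range`, written only for this explanation. They are not part of edgartools, and treating both ends as inclusive is an assumption.

```python
from datetime import date
from typing import Optional, Tuple


def parse_date_range(spec: str) -> Tuple[Optional[date], Optional[date]]:
    """Split a 'start:end' string into (start, end); either side may be empty.

    Hypothetical helper for illustration only, not part of edgartools.
    """
    if ":" not in spec:
        # A single date is treated as a one-day range
        day = date.fromisoformat(spec)
        return day, day
    start_text, end_text = spec.split(":", 1)
    start = date.fromisoformat(start_text) if start_text else None
    end = date.fromisoformat(end_text) if end_text else None
    return start, end


def in_range(day: date, spec: str) -> bool:
    """Return True if `day` falls inside the range (assumed inclusive on both ends)."""
    start, end = parse_date_range(spec)
    if start is not None and day < start:
        return False
    if end is not None and day > end:
        return False
    return True


print(in_range(date(2024, 12, 31), ":2025-03-04"))  # True: on or before the cutoff
print(in_range(date(2025, 6, 1), ":2025-03-04"))    # False: after the cutoff
```

An open end works the same way: `"2020-01-01:"` would keep everything from that date onward.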

There are just over 50 initial insider filings since 2004 for VRTX. Each filing can be converted to a typed Python object with `.obj()`:

In [None]:
initial_filings[0]

In [None]:
form3 = initial_filings[0].obj()
form3

In [None]:
form3.to_dataframe()

## Extract Insider Data into a DataFrame

To analyze insider data at scale, loop through each Form 3 filing, convert to a data object with `.obj()`, and call `.to_dataframe()`. Then concatenate into a single DataFrame:

In [None]:
def get_company_insiders(ticker: str,
                         summary: bool = True) -> pd.DataFrame:
    """Transform raw Form 3 filings into analyzable data.

    Parameters:
        ticker (str): Stock symbol of the target company
        summary (bool): If True, produce one summary row per filing.
            Otherwise, break out the individual holdings.
    Returns:
        pd.DataFrame: The columns depend on `summary`.
        If summary is True, structured ownership data with columns:
            - Date: The date of the transaction
            - Insider: Name of executive/director
            - Position: Company role
            - Total Shares: The total shares held
            - Holdings: Number of distinct holding positions, e.g. 3 derivatives, 1 common stock
            - Common Stock Holdings: Number of distinct common stock holdings
            - Derivative Holdings: Number of distinct derivative holdings
        Otherwise, it may contain additional columns about individual positions, such as:
            - Expiration Date
            - Ownership Nature
            - Underlying Security
    """
    c = Company(ticker)
    initial_filings = c.get_filings(form=3)
    insider_data = []
    for filing in tqdm(initial_filings):
        # Convert to data object
        form3 = filing.obj()
        # Get the data in a dataframe. Set detailed = False to get 1 summary row
        df = form3.to_dataframe(detailed = not summary)

        insider_data.append(df)

    # ignore_index=True already produces a clean 0..n-1 index
    insider_data = pd.concat(insider_data, ignore_index=True)
    insider_data = insider_data.drop(
        columns=[c 
        for c in ['Form', 'Issuer', 'Ticker', 'Remarks', 'Has Derivatives']
        if c in insider_data.columns
        ]
    )
    return insider_data


# Build the summary table for Vertex (one row per Form 3 filing)
vrtx_summary = get_company_insiders("VRTX")
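
To see the combine-and-clean step in isolation, here is a minimal sketch on synthetic data. The two one-row frames stand in for what `form3.to_dataframe(detailed=False)` returns. The column names follow this notebook, but the insiders and values are made up.

```python
import pandas as pd

# Synthetic one-row summaries standing in for per-filing results (values are made up)
per_filing_frames = [
    pd.DataFrame([{"Date": "2024-01-15", "Insider": "Insider A", "Position": "Director",
                   "Total Shares": 0, "Remarks": ""}]),
    pd.DataFrame([{"Date": "2023-06-01", "Insider": "Insider B", "Position": "EVP",
                   "Total Shares": 15000, "Remarks": ""}]),
]

# Stack the frames; ignore_index=True renumbers the rows 0..n-1
combined = pd.concat(per_filing_frames, ignore_index=True)

# Drop bookkeeping columns only if present, so a missing column is not an error
unwanted = ["Form", "Issuer", "Ticker", "Remarks", "Has Derivatives"]
combined = combined.drop(columns=[c for c in unwanted if c in combined.columns])

print(combined)
```

Filtering the column list before calling `drop` keeps the function working even if a given filing's summary lacks one of those columns.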



## Analyze Ownership Patterns

Vertex demonstrates a two-tier ownership pattern:

- **Executive Leadership**: Technical, legal, and therapeutic division leaders show substantial direct ownership (13,879-21,175 shares)
- **Board Directors**: Most recent director appointments show zero initial ownership, suggesting performance-contingent compensation

In [None]:
# Function to analyze ownership patterns in Form 3 filings
def analyze_ownership_patterns(df):
    """Analyze ownership patterns from Form 3 summary data
    
    Parameters:
        df (pd.DataFrame): DataFrame containing Form 3 summary data
        
    Returns:
        dict: Dictionary of insights and metrics
    """
    # Add role classification (na=False treats a missing Position as non-director)
    is_director = df['Position'].str.contains('Director', na=False)
    role_patterns = df.assign(
        is_director=is_director,
        is_executive=~is_director,
        has_skin_in_game=df['Total Shares'] > 0
    )

    # Calculate key metrics
    insights = {
        'director_ownership_rate': role_patterns[role_patterns['is_director']]['has_skin_in_game'].mean(),
        'executive_ownership_rate': role_patterns[role_patterns['is_executive']]['has_skin_in_game'].mean(),
        'derivative_users': (role_patterns['Derivative Holdings'] > 0).sum(),
        'avg_exec_shares': role_patterns[role_patterns['is_executive']]['Total Shares'].mean(),
        'recent_trend': role_patterns.sort_values('Date', ascending=False).head(5)['Total Shares'].mean(),
        'historical_trend': role_patterns.sort_values('Date').head(5)['Total Shares'].mean(),
        'role_patterns': role_patterns  # Return the enhanced DataFrame for further analysis
    }

    return insights

In [None]:
# 1. Analyze ownership patterns
vrtx_insights = analyze_ownership_patterns(vrtx_summary)
print(f"Director ownership rate: {vrtx_insights['director_ownership_rate']:.1%}")
print(f"Executive ownership rate: {vrtx_insights['executive_ownership_rate']:.1%}")
print(f"Recent trend (last 5 filings): {vrtx_insights['recent_trend']:.0f} shares")
print(f"Historical trend (first 5 filings): {vrtx_insights['historical_trend']:.0f} shares")
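
The ownership rates rely on the fact that the mean of a boolean column is the fraction of `True` values. The sketch below shows this on a small made-up table. Only the column names come from this notebook, and the `None` position is included to show how a missing role can be handled.

```python
import pandas as pd

# Made-up summary rows; only the column names follow this notebook
sample = pd.DataFrame({
    "Position": ["Director", "Director", "EVP, Chief Legal Officer", None],
    "Total Shares": [0, 1200, 15000, 500],
})

# na=False treats a missing Position as "not a director" instead of propagating NaN
is_director = sample["Position"].str.contains("Director", na=False)
has_shares = sample["Total Shares"] > 0

# The mean of a boolean Series is the fraction of True values
director_rate = has_shares[is_director].mean()
executive_rate = has_shares[~is_director].mean()

print(f"Director ownership rate: {director_rate:.1%}")    # 50.0%
print(f"Executive ownership rate: {executive_rate:.1%}")  # 100.0%
```

Note that this simple split labels anyone whose title lacks the word "Director" as an executive, which is a convenience rather than a precise classification.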

## Visualize Ownership Trends Over Time

The data reveals a shift in Vertex's compensation philosophy:

- **2020-2021**: Both executives and directors received meaningful initial equity positions
- **2022-2024**: Zero-ownership appointments became standard for directors and some executives
- This transition coincides with Vertex's expansion beyond cystic fibrosis into broader therapeutic areas

In [None]:
# Function to visualize ownership trends over time
def visualize_ownership_trends(df):
    """Create visualizations of ownership trends from Form 3 data
    
    Parameters:
        df (pd.DataFrame): DataFrame containing Form 3 summary data
        
    Returns:
        matplotlib.figure.Figure: Figure object containing the plot
    """
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    # Convert date to datetime if it's not already
    if not pd.api.types.is_datetime64_any_dtype(df['Date']):
        df['Date'] = pd.to_datetime(df['Date'])

    # Create a figure with multiple plots
    fig, axes = plt.subplots(2, 1, figsize=(12, 10))

    # Plot 1: Scatter plot of initial ownership by role over time
    ax = axes[0]
    scatter = ax.scatter(
        df['Date'], 
        df['Total Shares'],
        s=100,
        c=df['Position'].str.contains('Director', na=False).map({True: 'blue', False: 'red'}),
        alpha=0.7
    )

    # Add a legend
    from matplotlib.lines import Line2D
    legend_elements = [
        Line2D([0], [0], marker='o', color='w', markerfacecolor='red', markersize=10, label='Executive'),
        Line2D([0], [0], marker='o', color='w', markerfacecolor='blue', markersize=10, label='Director')
    ]
    ax.legend(handles=legend_elements, loc='upper right')

    # Add labels and title
    ax.set_xlabel('Filing Date')
    ax.set_ylabel('Total Shares')
    ax.set_title('Initial Ownership Trends Over Time')

    # Add a trend line (rolling average)
    if len(df) > 5:
        # Sort by date
        df_sorted = df.sort_values('Date')
        # Calculate rolling average
        df_sorted['rolling_avg'] = df_sorted['Total Shares'].rolling(window=3, min_periods=1).mean()
        # Plot rolling average
        ax.plot(df_sorted['Date'], df_sorted['rolling_avg'], 'k--', alpha=0.5)

    # Plot 2: Bar chart of average ownership by role
    ax = axes[1]
    # str(x) guards against a missing Position value
    role_data = df.assign(Role=df['Position'].apply(lambda x: 'Director' if 'Director' in str(x) else 'Executive'))
    role_avg = role_data.groupby('Role')['Total Shares'].mean().reset_index()

    # Create bar chart
    bars = ax.bar(
        role_avg['Role'],
        role_avg['Total Shares'],
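        # Map colors by role so they stay consistent even if only one role is present
        color=role_avg['Role'].map({'Director': 'blue', 'Executive': 'red'}).tolist()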
    )

    # Add labels and title
    ax.set_xlabel('Role')
    ax.set_ylabel('Average Initial Shares')
    ax.set_title('Average Initial Ownership by Role')

    # Add value labels on bars
    for bar in bars:
        height = bar.get_height()
        ax.text(
            bar.get_x() + bar.get_width()/2.,
            height,
            f'{int(height):,}',
            ha='center',
            va='bottom'
        )

    # Adjust layout and return figure
    plt.tight_layout()
    return fig


In [None]:
# 2. Visualize ownership trends
fig = visualize_ownership_trends(vrtx_summary)
plt.show()
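
The dashed trend line in the first plot is a trailing average computed with `rolling(window=3, min_periods=1)`. To make the arithmetic concrete, the sketch below reimplements that calculation in plain Python for a list with no missing values. `rolling_mean` is a hypothetical helper for illustration, and the share counts are made up.

```python
from typing import List


def rolling_mean(values: List[float], window: int = 3) -> List[float]:
    """Trailing mean over up to `window` values, starting from the first point.

    Mirrors pandas' rolling(window, min_periods=1).mean() for a list
    without missing values.
    """
    result = []
    for i in range(len(values)):
        # Take the current value plus up to window-1 earlier values
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Made-up share counts, already sorted by filing date
shares = [0, 0, 15000, 21000, 0]
print(rolling_mean(shares))  # [0.0, 0.0, 5000.0, 12000.0, 12000.0]
```

With a window of only three filings, a single large or zero-share filing moves the line noticeably, so the trend line is best read as a rough guide.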

## Analyze Derivative vs Direct Ownership

In the summary data, Joy Liu (General Counsel) stands out with three derivative holdings alongside common stock. A pattern like this may suggest:

- Specialized retention strategy for legal expertise
- Potential negotiated compensation structure
- Different risk profile compared to other executives

In [None]:
# Function to analyze derivative positions
def analyze_derivative_positions(df):
    """Analyze derivative positions from Form 3 summary data
    
    Parameters:
        df (pd.DataFrame): DataFrame containing Form 3 summary data
        
    Returns:
        dict: Dictionary of derivative insights
    """
    # Identify insiders with derivative positions
    derivative_holders = df[df['Derivative Holdings'] > 0].copy()

    # Calculate metrics
    insights = {
        'derivative_holder_count': len(derivative_holders),
        'derivative_holder_pct': len(derivative_holders) / len(df) if len(df) > 0 else 0,
        'avg_derivative_holdings': derivative_holders['Derivative Holdings'].mean() if len(derivative_holders) > 0 else 0,
        'derivative_holders': derivative_holders,
        'derivative_to_common_ratio': derivative_holders['Derivative Holdings'].sum() / 
                                     derivative_holders['Common Stock Holdings'].sum() 
                                     if derivative_holders['Common Stock Holdings'].sum() > 0 else float('inf')
    }

    return insights

In [None]:
# 3. Analyze derivative positions
derivative_insights = analyze_derivative_positions(vrtx_summary)
print(f"Derivative holders: {derivative_insights['derivative_holder_count']} ({derivative_insights['derivative_holder_pct']:.1%})")
print(f"Average derivative holdings per holder: {derivative_insights['avg_derivative_holdings']:.1f}")
print(f"Derivative to common stock ratio: {derivative_insights['derivative_to_common_ratio']:.2f}")

# Show the derivative holders
derivative_insights['derivative_holders']
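
Keep in mind that the `Derivative Holdings` and `Common Stock Holdings` columns count holding positions, not shares, so the ratio above compares numbers of positions. The sketch below shows the guarded division on its own. `position_ratio` is a hypothetical helper, and returning `0.0` when both counts are zero is a choice made here that differs slightly from the notebook's function.

```python
def position_ratio(derivative_positions: int, common_positions: int) -> float:
    """Ratio of derivative holding positions to common stock holding positions.

    Returns infinity when there are derivative positions but no common stock
    positions, and 0.0 when there are neither (an assumption of this sketch).
    """
    if common_positions > 0:
        return derivative_positions / common_positions
    return float("inf") if derivative_positions > 0 else 0.0


# Three derivative positions next to one common stock position
print(position_ratio(3, 1))  # 3.0
# Derivatives only: the ratio is unbounded
print(position_ratio(2, 0))  # inf
```

A ratio of 3.0 here says nothing about the size of each position; the detailed (non-summary) DataFrame is the place to look for share amounts.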

## Calculate Strategic Insights

These patterns provide actionable intelligence for investors:

- **Board Independence**: Zero-ownership directors may exercise more independent judgment
- **Executive Alignment**: Operational leaders maintain significant "skin in the game"
- **Succession Planning**: Track subsequent Form 4 filings from zero-ownership appointees to identify which executives are being groomed for larger roles
- **Compensation Evolution**: The shift toward zero initial ownership suggests increased emphasis on performance-based equity awards

In [None]:
# Function to calculate strategic implications
def calculate_strategic_implications(df):
    """Calculate strategic implications from Form 3 data
    
    Parameters:
        df (pd.DataFrame): DataFrame containing Form 3 summary data
        
    Returns:
        dict: Dictionary of strategic insights
    """
    # Calculate ownership concentration
    top_holders = df.nlargest(3, 'Total Shares')
    ownership_concentration = top_holders['Total Shares'].sum() / df['Total Shares'].sum() if df['Total Shares'].sum() > 0 else 0

    # Calculate temporal shifts (assign() avoids adding a 'year' column to the caller's DataFrame)
    yearly_avg = (df.assign(year=pd.to_datetime(df['Date']).dt.year)
                    .groupby('year')['Total Shares'].mean().reset_index())

    # Calculate zero-ownership appointments
    zero_ownership = df[df['Total Shares'] == 0]
    zero_ownership_rate = len(zero_ownership) / len(df) if len(df) > 0 else 0

    # Calculate executive vs director metrics
    is_director = df['Position'].str.contains('Director', na=False)
    director_df = df[is_director]
    executive_df = df[~is_director]

    director_zero_rate = (director_df['Total Shares'] == 0).mean() if len(director_df) > 0 else 0
    executive_zero_rate = (executive_df['Total Shares'] == 0).mean() if len(executive_df) > 0 else 0

    # Compile insights
    insights = {
        'ownership_concentration': ownership_concentration,
        'yearly_avg': yearly_avg,
        'zero_ownership_rate': zero_ownership_rate,
        'director_zero_rate': director_zero_rate,
        'executive_zero_rate': executive_zero_rate,
        'compensation_evolution': yearly_avg,
        'top_holders': top_holders
    }

    return insights

In [None]:
# 4. Calculate strategic implications
strategic_insights = calculate_strategic_implications(vrtx_summary)
print(f"Top 3 insiders control {strategic_insights['ownership_concentration']:.1%} of reported insider shares")
print(f"Zero-ownership appointment rate: {strategic_insights['zero_ownership_rate']:.1%}")
print(f"Director zero-ownership rate: {strategic_insights['director_zero_rate']:.1%}")
print(f"Executive zero-ownership rate: {strategic_insights['executive_zero_rate']:.1%}")

# Show yearly average ownership trends
strategic_insights['yearly_avg']
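
The concentration figure is the share of all reported initial holdings that belongs to the three largest filers. It is measured against the insiders' own Form 3 totals, not against the company's shares outstanding. A minimal sketch of that arithmetic, using a hypothetical helper `top_n_concentration` and made-up share counts:

```python
import heapq
from typing import Iterable


def top_n_concentration(share_counts: Iterable[float], n: int = 3) -> float:
    """Fraction of the total held by the n largest entries (0.0 if the total is zero)."""
    counts = list(share_counts)
    total = sum(counts)
    if total <= 0:
        return 0.0
    return sum(heapq.nlargest(n, counts)) / total


# Made-up initial holdings for six insiders
holdings = [0, 0, 5000, 15000, 20000, 10000]
print(f"{top_n_concentration(holdings):.1%}")  # 90.0%
```

When many filings report zero shares, this number is naturally high, so it is worth reading alongside the zero-ownership rate.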

## Compare with Industry Peers

Compared to biotech industry norms, Vertex's pattern of substantial executive ownership (particularly in therapeutic leadership) signals:

- Strong confidence in pipeline prospects
- Alignment with long-term shareholder interests
- Potential retention strategy for key scientific talent

These insights demonstrate how even summary-level Form 3 data can reveal significant strategic patterns when analyzed systematically.

In [None]:
# Function to compare with industry peers
def compare_with_peers(ticker, peer_tickers):
    """Compare Form 3 patterns with industry peers
    
    Parameters:
        ticker (str): Main company ticker
        peer_tickers (list): List of peer company tickers
        
    Returns:
        pd.DataFrame: Comparison metrics across peers with nicely formatted values
    """
    # Function to get summary metrics for a company
    def get_company_metrics(ticker):
        try:
            # Get insider data
            c = Company(ticker)
            initial_filings = c.get_filings(form=3)

            # Process filings
            insider_data = []
            for filing in tqdm(initial_filings, desc=f"Processing {ticker}"):
                form3 = filing.obj()
                df = form3.to_dataframe(detailed=False)
                insider_data.append(df)

            if not insider_data:
                return {
                    'ticker': ticker,
                    'avg_shares': 0,
                    'zero_rate': 1.0,
                    'derivative_rate': 0,
                    'exec_ownership': 0
                }

            # Combine data
            insider_df = pd.concat(insider_data, ignore_index=True)

            # Calculate metrics
            is_director = insider_df['Position'].str.contains('Director', na=False)
            director_df = insider_df[is_director]
            executive_df = insider_df[~is_director]

            return {
                'ticker': ticker,
                'avg_shares': insider_df['Total Shares'].mean(),
                'zero_rate': (insider_df['Total Shares'] == 0).mean(),
                'derivative_rate': (insider_df['Derivative Holdings'] > 0).mean(),
                'exec_ownership': executive_df['Total Shares'].mean() if len(executive_df) > 0 else 0
            }
        except Exception as e:
            print(f"Error processing {ticker}: {e}")
            return {
                'ticker': ticker,
                'avg_shares': None,
                'zero_rate': None,
                'derivative_rate': None,
                'exec_ownership': None
            }

    # Get metrics for main company and peers
    all_tickers = [ticker] + peer_tickers
    metrics = [get_company_metrics(t) for t in all_tickers]

    # Convert to DataFrame
    comparison_df = pd.DataFrame(metrics)

    # Format the DataFrame for better readability
    formatted_df = comparison_df.copy()

    # Rename columns for clarity
    formatted_df.columns = [
        'Ticker', 
        'Avg Initial Shares', 
        'Zero-Ownership Rate', 
        'Derivative Usage Rate',
        'Avg Executive Shares'
    ]

    # Format numbers nicely
    # Integer formatting for share counts
    formatted_df['Avg Initial Shares'] = formatted_df['Avg Initial Shares'].apply(
        lambda x: f"{int(x):,}" if pd.notnull(x) else 'N/A'
    )
    formatted_df['Avg Executive Shares'] = formatted_df['Avg Executive Shares'].apply(
        lambda x: f"{int(x):,}" if pd.notnull(x) else 'N/A'
    )

    # Percentage formatting
    formatted_df['Zero-Ownership Rate'] = formatted_df['Zero-Ownership Rate'].apply(
        lambda x: f"{x:.1%}" if pd.notnull(x) else 'N/A'
    )
    formatted_df['Derivative Usage Rate'] = formatted_df['Derivative Usage Rate'].apply(
        lambda x: f"{x:.1%}" if pd.notnull(x) else 'N/A'
    )

    # Set ticker as index for better display
    formatted_df = formatted_df.set_index('Ticker')

    return formatted_df

In [None]:
# 5. Compare with industry peers (this may take a few minutes to run)
peer_comparison = compare_with_peers("VRTX", ["REGN", "BIIB", "GILD"])
peer_comparison
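
The display formatting in `compare_with_peers` comes down to two format specifications: `,` for thousands separators and `.1%` for percentages, with `N/A` for tickers that failed to load. The sketch below isolates that logic. `format_shares` and `format_rate` are hypothetical helper names, not functions from the notebook or from edgartools.

```python
import math
from typing import Optional


def _is_missing(value: Optional[float]) -> bool:
    """True for None or NaN, the two ways a metric can be absent."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def format_shares(value: Optional[float]) -> str:
    """Format a share count as a whole number with thousands separators."""
    return "N/A" if _is_missing(value) else f"{int(value):,}"


def format_rate(value: Optional[float]) -> str:
    """Format a fraction between 0 and 1 as a percentage with one decimal."""
    return "N/A" if _is_missing(value) else f"{value:.1%}"


print(format_shares(17527.6))  # 17,527
print(format_rate(0.4167))     # 41.7%
print(format_shares(None))     # N/A
```

Because the formatted table holds strings, keep the unformatted DataFrame around if you want to sort or plot the metrics.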

## View Individual Insider Filings

Let's look at a few specific filings to see how edgartools renders Form 3 data:

**Jennifer Schneider (Director)** -- initial filing indicated no holdings:

In [None]:
initial_filings[0].obj()

**Edward Morrow Atkinson III (EVP)** -- initial position in common stock:

In [None]:
initial_filings[3].obj()

**Ourania Tatsis (SVP, CRO)** -- both common stock and derivative holdings:

In [None]:
initial_filings[10].obj()

## What's Next

You've learned how to extract and analyze insider trading data from SEC Form 3 filings with Python using edgartools. Here are related tutorials:

- [Search SEC Filings with Python](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/01_getting_started.ipynb)
- [Extract Financial Statements from SEC Filings with Python](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/Viewing-Financial-Statements.ipynb)
- [Extract Earnings Releases from 8-K Filings](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/Extract-Earnings-Releases.ipynb)
- [Parse XBRL Financial Data from SEC EDGAR](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/Reading-Data-From-XBRL.ipynb)

**Resources:**
- [EdgarTools Documentation](https://edgartools.readthedocs.io/)
- [GitHub Repository](https://github.com/dgunning/edgartools)
- [PyPI Package](https://pypi.org/project/edgartools/)

---

## Support EdgarTools

If you found this tutorial helpful, here are a few ways to support the project:

- **Star the repo** -- [github.com/dgunning/edgartools](https://github.com/dgunning/edgartools) -- it helps others discover edgartools
- **Visit edgartools.io** -- [edgartools.io](https://www.edgartools.io/) -- for more tutorials, articles, and updates
- **Report issues** -- found a bug or have a feature idea? [Open an issue](https://github.com/dgunning/edgartools/issues)
- **Share this notebook** -- know someone who works with SEC data? Send them the Colab link

*edgartools is free, open-source, and community-driven. No API key or paid subscription required.*