# Filter SEC EDGAR Companies by Industry and SIC Code with Python -- Free, No API Key

Use **edgartools** to filter SEC-registered companies by industry and SIC code -- completely free, no API key or paid subscription required. The SEC classifies every company using Standard Industrial Classification (SIC) codes, making it easy to find and analyze companies within any industry.

**What you'll learn:**
- Find all companies in any industry using SIC codes
- Use built-in helpers for major sectors (semiconductors, banking, pharma, etc.)
- Search industries by keyword
- Filter companies by state of incorporation
- Compare financial data across companies in the same industry
- See the largest industries on SEC EDGAR

## Install edgartools

In [None]:
!pip install -U edgartools

## Setup

The SEC requires all automated tools to identify themselves. Replace the email below with your own -- any valid email works.

In [None]:
import pandas as pd
from edgar import *
from edgar.reference import (
    get_companies_by_industry,
    get_companies_by_state,
    get_semiconductor_companies,
    get_software_companies,
    get_pharmaceutical_companies,
    get_banking_companies,
    get_oil_gas_companies,
    get_retail_companies,
    get_insurance_companies,
    get_real_estate_companies,
    get_biotechnology_companies,
)

# The SEC requires you to identify yourself (any email works)
set_identity("your.name@example.com")

## Find All Companies in an Industry

Every SEC filer is classified by SIC code. `get_companies_by_industry()` finds companies by a specific SIC code or by keyword, and built-in helpers such as `get_semiconductor_companies()` (SIC 3674, used below) wrap the most common industries:

In [None]:
# Find all semiconductor companies (SIC 3674)
semis = get_semiconductor_companies()

# Filter to publicly traded (have ticker symbols)
traded = semis[semis["ticker"].notna()]

print(f"Semiconductor companies on EDGAR: {len(semis):,}")
print(f"Publicly traded (with tickers):   {len(traded)}\n")

traded.head(10)[["ticker", "name", "exchange"]]
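
SIC codes are hierarchical: the first two digits give the major group, the first three the industry group, and all four the specific industry. The sketch below shows that split in plain Python. `sic_levels` is a hypothetical helper written for this tutorial, not part of edgartools, and the SEC's own list contains a few codes that do not follow the standard scheme exactly.

```python
def sic_levels(sic: int) -> dict:
    """Split a four-digit SIC code into its nested classification levels."""
    sic = int(sic)
    if not 0 <= sic <= 9999:
        raise ValueError(f"expected a code between 0 and 9999, got {sic!r}")
    code = f"{sic:04d}"               # zero-pad, e.g. 100 -> "0100"
    return {
        "major_group": code[:2],      # broad sector
        "industry_group": code[:3],   # narrower group within the sector
        "industry": code,             # the full four-digit industry
    }


levels = sic_levels(3674)
print(levels)
```

This is why a range such as 3570-3579 lines up with one industry group (357): codes that share a prefix are neighbours in the classification.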

## Search Industries by Keyword

Don't know the SIC code? Search by keyword to find companies in any industry:

In [None]:
# Search for software companies by keyword
software = get_companies_by_industry(sic_description_contains="software")
traded = software[software["ticker"].notna()]

print(f"Software companies: {len(software):,} total, {len(traded)} with tickers\n")
traded.head(10)[["ticker", "name", "sic", "sic_description"]]

## Filter by SIC Code or Range

Use specific SIC codes or ranges to target precise industry segments:

In [None]:
# By specific SIC code
computers = get_companies_by_industry(sic=3571)
print(f"SIC 3571 (Electronic Computers): {len(computers):,} companies")

# By SIC code range (all computer hardware)
hardware = get_companies_by_industry(sic_range=(3570, 3579))
traded = hardware[hardware["ticker"].notna()]
print(f"SIC 3570-3579 (Computer Hardware): {len(hardware):,} total, {len(traded)} traded\n")

# Show the SIC sub-categories in this range
breakdown = traded.groupby("sic_description").size().sort_values(ascending=False)
for desc, count in breakdown.items():
    print(f"  {count:>3}  {desc}")
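
The three filter modes (exact code, code range, keyword) can be pictured as simple predicates applied to each row. The sketch below mimics that logic on plain dictionaries. It is not the edgartools implementation: `filter_rows` and the sample rows are hypothetical, and it assumes that the range is inclusive at both ends and that the keyword match is case-insensitive.

```python
# Hypothetical rows for illustration only; not real EDGAR data.
ROWS = [
    {"name": "Example Chips Inc", "sic": 3674,
     "sic_description": "Semiconductors & Related Devices"},
    {"name": "Example Computers Corp", "sic": 3571,
     "sic_description": "Electronic Computers"},
    {"name": "Example Storage Co", "sic": 3572,
     "sic_description": "Computer Storage Devices"},
    {"name": "Example Apps Ltd", "sic": 7372,
     "sic_description": "Services-Prepackaged Software"},
]


def filter_rows(rows, sic=None, sic_range=None, description_contains=None):
    """Return the rows that satisfy every filter that was supplied."""
    matches = []
    for row in rows:
        if sic is not None and row["sic"] != sic:
            continue
        if sic_range is not None:
            low, high = sic_range
            if not low <= row["sic"] <= high:   # assumed inclusive bounds
                continue
        if description_contains is not None:
            needle = description_contains.lower()
            if needle not in row["sic_description"].lower():
                continue
        matches.append(row)
    return matches


exact = filter_rows(ROWS, sic=3571)
hardware_rows = filter_rows(ROWS, sic_range=(3570, 3579))
software_rows = filter_rows(ROWS, description_contains="software")
print(len(exact), len(hardware_rows), len(software_rows))
```

Supplying more than one argument narrows the result, since a row must pass every check that is set.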

## Major Industries Overview

edgartools includes built-in helpers for the most common industries. See the landscape at a glance:

In [None]:
industries = [
    ("Software", get_software_companies),
    ("Retail", get_retail_companies),
    ("Biotechnology", get_biotechnology_companies),
    ("Banking", get_banking_companies),
    ("Oil & Gas", get_oil_gas_companies),
    ("Pharmaceuticals", get_pharmaceutical_companies),
    ("Real Estate", get_real_estate_companies),
    ("Insurance", get_insurance_companies),
    ("Semiconductors", get_semiconductor_companies),
]

rows = []
for name, func in industries:
    df = func()
    traded = df[df["ticker"].notna()]
    rows.append({
        "Industry": name,
        "Total Companies": f"{len(df):,}",
        "With Tickers": len(traded),
    })

pd.DataFrame(rows).set_index("Industry").sort_values("With Tickers", ascending=False)

## Filter by State of Incorporation

Find companies incorporated in a specific state -- useful for jurisdictional analysis:

In [None]:
ny_companies = get_companies_by_state("NY")
traded = ny_companies[ny_companies["ticker"].notna()]

print(f"New York companies: {len(ny_companies):,} total, {len(traded)} with tickers\n")
traded.head(10)[["ticker", "name", "sic_description"]]
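
A state filter is often combined with an industry filter, for example to find semiconductor companies tied to New York. One way to picture the combination is an intersection on a shared company identifier. The sketch below uses plain dictionaries with made-up rows; `intersect_by_key` is a hypothetical helper, and the presence of a `cik` field in the real results is an assumption that should be checked against the actual columns.

```python
# Hypothetical rows for illustration only; names and CIKs are made up.
sample_semis = [
    {"cik": 1001, "name": "Example Chips Inc"},
    {"cik": 1002, "name": "Example Wafers Corp"},
]
sample_ny = [
    {"cik": 1002, "name": "Example Wafers Corp"},
    {"cik": 1003, "name": "Example Bank NA"},
]


def intersect_by_key(left, right, key="cik"):
    """Keep the rows of `left` whose key also appears in `right`."""
    right_keys = {row[key] for row in right}   # set gives fast membership tests
    return [row for row in left if row[key] in right_keys]


ny_semis = intersect_by_key(sample_semis, sample_ny)
print([row["name"] for row in ny_semis])
```

With DataFrames, the same idea would typically be written with `Series.isin` on the shared identifier column.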

## Compare Financials Within an Industry

Once you've identified companies in an industry, pull their financial data for comparison:

In [None]:
tickers = ["NVDA", "AMD", "INTC", "AVGO", "TXN"]
rows = []

for ticker in tickers:
    c = Company(ticker)
    fin = c.get_financials()
    rev = fin.get_revenue()
    ni = fin.get_net_income()
    rows.append({
        "Ticker": ticker,
        "Name": c.name,
        # Compare against None so a genuine zero is not reported as missing
        "Revenue ($B)": round(rev / 1e9, 1) if rev is not None else None,
        "Net Income ($B)": round(ni / 1e9, 1) if ni is not None else None,
    })

pd.DataFrame(rows).set_index("Ticker")
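
Raw revenue and net income are hard to compare across companies of very different size, so a ratio such as net margin (net income divided by revenue) is a common next step. The sketch below is a hypothetical helper, not an edgartools function, and the figures passed to it are made up for illustration.

```python
from typing import Optional


def net_margin(revenue: Optional[float], net_income: Optional[float]) -> Optional[float]:
    """Net income as a percentage of revenue, or None when it cannot be computed."""
    if revenue is None or net_income is None or revenue == 0:
        return None
    return round(100 * net_income / revenue, 1)


# Made-up figures for illustration only
print(net_margin(50e9, 10e9))    # profitable company
print(net_margin(50e9, -5e9))    # loss-making company gives a negative margin
print(net_margin(None, 10e9))    # missing revenue gives None
```

Returning `None` for missing or zero revenue keeps the comparison table free of division errors when a value is unavailable for a company.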

## Why EdgarTools?

EdgarTools is free and open-source. Compare two ways of filtering SEC companies by industry:

**With edgartools (free, no API key):**
```python
from edgar.reference import (
    get_companies_by_industry,
    get_companies_by_state,
    get_semiconductor_companies,
)
semis = get_semiconductor_companies()              # All semiconductor filers
get_companies_by_industry(sic=3674)                # By SIC code
get_companies_by_industry(sic_range=(3570, 3579))  # By SIC range
get_companies_by_state("CA")                       # By state
```

**Typical approach (manual scripting against SEC endpoints):**
```python
import requests
# ... download company_tickers.json from SEC,
# ... cross-reference with SIC codes from separate endpoint,
# ... manually filter and merge DataFrames
```
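
For a sense of what the manual route involves, here is a rough sketch under stated assumptions. It uses the standard library's `urllib` instead of `requests` to stay dependency-free, relies on the SEC's public `company_tickers.json` file and the per-company submissions endpoint on `data.sec.gov`, and reads the `sic` and `sicDescription` fields from the submissions payload. The helper names are hypothetical, and only the URL builder runs here because the fetch functions need network access and a real contact email.

```python
import json
import urllib.request

TICKERS_URL = "https://www.sec.gov/files/company_tickers.json"


def submissions_url(cik: int) -> str:
    """Build the submissions URL; the CIK is zero-padded to ten digits."""
    return f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"


def fetch_json(url: str, user_agent: str) -> dict:
    """Download and parse JSON, identifying the caller as the SEC requires."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def sic_of(cik: int, user_agent: str) -> tuple:
    """Look up one company's SIC code and description (one request per company)."""
    data = fetch_json(submissions_url(cik), user_agent)
    return data.get("sic"), data.get("sicDescription")


# NVIDIA's CIK; no network call is made by building the URL
print(submissions_url(1045810))
```

Filtering a whole industry this way means one submissions request per company, followed by a manual merge with the ticker list.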

With edgartools, industry filtering is built in -- find companies by SIC code, keyword, or state with a single function call.

## Quick Reference

```python
from edgar import *
from edgar.reference import *
set_identity("your.name@example.com")

# ── Find companies by industry ──
get_companies_by_industry(sic=3674)                  # By SIC code
get_companies_by_industry(sic_range=(3570, 3579))    # By SIC range
get_companies_by_industry(sic_description_contains="software")  # By keyword

# ── Built-in industry helpers ──
get_semiconductor_companies()      # SIC 3674
get_software_companies()           # SIC 7372
get_pharmaceutical_companies()     # SIC 2834
get_banking_companies()            # SIC 6020-6029
get_oil_gas_companies()            # SIC 1311, 1381-1389
get_retail_companies()             # SIC 5200-5999
get_insurance_companies()          # SIC 6311-6399
get_real_estate_companies()        # SIC 6500-6553
get_biotechnology_companies()      # SIC 2836

# ── Filter by state ──
get_companies_by_state("CA")       # California
get_companies_by_state("NY")       # New York

# ── Company SIC data ──
company = Company("NVDA")
company.sic                        # 3674
company.industry                   # "Semiconductors & Related Devices"
```

## What's Next

You've learned how to filter SEC companies by industry and SIC code. Here are related tutorials:

- [SEC Company Data with Python](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/sec-company-data-python.ipynb)
- [Compare Company Financials](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/compare-company-financials-python.ipynb)
- [Search and Filter SEC Filings](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/search-sec-filings-python.ipynb)
- [Extract Financial Statements from SEC Filings](https://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/financial-statements-sec-python.ipynb)

**Resources:**
- [EdgarTools Documentation](https://edgartools.readthedocs.io/)
- [GitHub Repository](https://github.com/dgunning/edgartools)
- [PyPI Package](https://pypi.org/project/edgartools/)
