# Filtering filings by industry

This notebook demonstrates how to filter SEC filings by industry, for example **Financial Services**. We then select a filing and view its text as well as the tables attached to it.

This is a reimplementation of the popular Kaggle notebook [Scraping and analysing SEC filings](https://www.kaggle.com/code/klcode/scraping-and-analysing-sec-filings-part-1).

This notebook shows how to page through SEC filings.

**[Open this notebook in Google Colab](http://colab.research.google.com/github/dgunning/edgartools/blob/main/notebooks/Filtering-by-industry.ipynb)**

## Getting started

In [None]:
!pip install edgartools

## Setting up the library

To get started with edgartools, import the library and then set your identity, which will be used whenever the library makes an API call to **SEC EDGAR**.

In [11]:
from edgar import *

set_identity("user@edgartools.io")
use_local_storage()

## Getting annual reports for foreign companies

We will get **20-F** annual reports for foreign private issuers for 2024.

In [22]:
filings = get_filings(form="20-F", year=2024)

Now let's filter for companies in the banking industry, identified here by the **SIC** (Standard Industrial Classification) code "**6029**" (Commercial Banks, NEC).

We loop through the filings, check the SIC code of each filing's company, and add the company's CIK to the list when it matches.
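The loop described above looks up the company once per filing, and the same company may appear more than once in the list (for example, if it filed more than one matching form in the year). A minimal sketch of a helper that checks each distinct CIK only once is shown here. The helper name `ciks_matching_sic`, the `lookup_sic` callback, and the stand-in data are hypothetical and only illustrate the idea; they are not part of edgartools.

```python
from typing import Callable, Iterable

# SIC codes to match; "6029" is the value used in this notebook.
BANK_SICS = {"6029"}


def ciks_matching_sic(
    ciks: Iterable[int],
    lookup_sic: Callable[[int], object],
    target_sics: set[str],
) -> list[int]:
    """Return the distinct CIKs whose SIC code is in target_sics, in first-seen order.

    lookup_sic is called at most once per distinct CIK.
    """
    checked: dict[int, bool] = {}
    for cik in ciks:
        if cik not in checked:
            # Compare as strings so "6029" and 6029 are treated alike
            checked[cik] = str(lookup_sic(cik)) in target_sics
    return [cik for cik, is_match in checked.items() if is_match]


# Stand-in lookup so the sketch runs offline (these CIKs and SICs are made up)
fake_sic_by_cik = {101: "6029", 202: "7372", 303: 6029}
calls = []


def fake_lookup(cik):
    calls.append(cik)  # record each lookup to show that repeats are skipped
    return fake_sic_by_cik[cik]


matches = ciks_matching_sic([101, 202, 101, 303, 202], fake_lookup, BANK_SICS)
print(matches)
```

In this notebook, the callback could be `lambda cik: Company(cik).sic`, with the CIKs taken from `f.cik for f in filings`, which mirrors the lookup used in the next cell.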

In [41]:
from tqdm.auto import tqdm

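# Look up each filer's SIC code and keep the CIKs that match "6029"
bank_ciks = [f.cik
             for f in tqdm(filings)
             if Company(f.cik).sic == "6029"]
# Restrict the filings to those companies
bank_filings = filings.filter(cik=bank_ciks)
bank_filings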

  0%|          | 0/1148 [00:00<?, ?it/s]

                                      SEC Filings

  Form         CIK   Ticker   Company                  Filing Date   Accession Number
 ─────────────────────────────────────────────────────────────────────────────────────
  20-F      719245   WEBNF    WESTPAC BANKING CORP     2024-11-05

## Selecting a single filing

You can select the first filing and get access to all of its attachments.

In [31]:
filing = bank_filings[0]
filing

╭───────────────────────────────────── WESTPAC BANKING CORP [719245] 20-F 📄 ─────────────────────────────────────╮
│ ╭──────────────────────┬────────────╮                                                                           │
│ │ 0001104659-24-114298 │ 2024-11-05 │                                                                           │
│ ╰──────────────────────┴────────────╯                                                                           │
│ ╭───────────────────────────────────────────────────────────────────────────────────────────╮                   │
│ │ Links: 🏠 Homepage 📄 Primary Document 📜 Full Submission Text                            │                   │
│ ├───────────────────────────────────────────────────────────────────────────────────────────┤                   │
│ │ 🏠 https://sec.gov/Archives/edgar/data/719245/0001104659-24-114298-index.html             │
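The homepage link in the output above appears to be composed of the company's CIK and the filing's accession number. A minimal sketch of that pattern, inferred from this single example, is shown here. `filing_index_url` is a hypothetical helper written for illustration and is not part of edgartools.

```python
def filing_index_url(cik: int, accession_number: str) -> str:
    """Build a filing homepage URL in the format shown in the output above."""
    # The pattern is taken from the one link displayed for this filing
    return f"https://sec.gov/Archives/edgar/data/{int(cik)}/{accession_number}-index.html"


url = filing_index_url(719245, "0001104659-24-114298")
print(url)
```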

## Viewing the attachments
This filing has hundreds of attachments. Here we can see the complete list and also select and view individual attachments.

In [35]:
filing.attachments

                                                      Attachments

  Seq   Document                        Description                                                    Type
 ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
  1     webnf-20240930x20f.htm          FORM 20-F                                                      📜 20-F
  2     webnf-20240930xex11db.htm       EXHIBIT 11.(B)                                                 📋 ✍️ EX-11.(
  3     webnf-20240930xex12.htm