# Financial statements 

## Example Qorvo annual filing 2020

[Qorvo](https://www.qorvo.com/products) annual financial statement: [2020 10K](https://ir.qorvo.com/sec-filings/sec-filing/10-k/0001604778-21-000032).

## SEC

* [XBRL/rfmd-20210403_htm.xml](https://www.sec.gov/Archives/edgar/data/1604778/000160477821000032/rfmd-20210403_htm.xml)
* [HTML/rfmd-20210403.htm](https://www.sec.gov/Archives/edgar/data/1604778/000160477821000032/rfmd-20210403.htm)

Corresponds to the Cash and Cash equivalents in the Cash Flow statement.

<img src="../image/edgar_qorvo_2020_10K_CF.png" align="left" width=700 />

# Flow

<img src="../image/execution_flow.png" />

## Python code structure
<img src="../image/code.png" align="left" width=500 />


In [4]:
import os
import numpy as np
import pandas as pd

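# Use fully qualified option names; a bare 'precision' can be ambiguous in recent pandas versions.
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 2)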

In [11]:
YEAR = 2022
QTR = 4

---

# Download Master index CSV files

The ```src/sec_edgar_download_xbrl_indices.sh``` script downloads the master index files into the **DIR_DATA_CSV_INDEX** directory.
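
As a rough Python illustration of what such a download involves, the sketch below builds the URL of a quarterly EDGAR index file and prepares a request for it without sending it. It assumes the script fetches `xbrl.idx` from the EDGAR `full-index` directory, which the source does not confirm. The helper names `xbrl_index_url` and `build_request` and the contact string are hypothetical.

```python
from urllib.request import Request

# Assumption: the quarterly indices come from the EDGAR full-index directory.
EDGAR_FULL_INDEX = "https://www.sec.gov/Archives/edgar/full-index"


def xbrl_index_url(year: int, qtr: int) -> str:
    """Return the URL of the XBRL index for a given year and quarter."""
    if qtr not in (1, 2, 3, 4):
        raise ValueError(f"quarter must be 1-4, got {qtr}")
    return f"{EDGAR_FULL_INDEX}/{year}/QTR{qtr}/xbrl.idx"


def build_request(year: int, qtr: int, user_agent: str) -> Request:
    """Prepare (but do not send) a request that declares a User-Agent."""
    return Request(xbrl_index_url(year, qtr), headers={"User-Agent": user_agent})


# Hypothetical contact string; replace with your own details before sending.
req = build_request(2022, 4, "Example Name example@example.com")
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the actual download. SEC asks automated clients to identify themselves, which is why the request carries a User-Agent header.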

In [13]:
DIR_DATA_CSV_INDEX = "../data/csv/index"

## Example master index file for 2022 QTR 4

In [15]:
indices = pd.read_csv(
    f"{DIR_DATA_CSV_INDEX}/{YEAR}QTR{QTR}",
    sep="|",
    dtype={'Accession':str}
)
indices

Unnamed: 0,CIK,Company Name,Form Type,Date Filed,Filename
0,1000045,NICHOLAS FINANCIAL INC,10-Q,2022-11-14,edgar/data/1000045/0000950170-22-024756.txt
1,1000045,NICHOLAS FINANCIAL INC,8-K,2022-10-27,edgar/data/1000045/0000950170-22-020230.txt
2,1000045,NICHOLAS FINANCIAL INC,8-K,2022-11-03,edgar/data/1000045/0000950170-22-021926.txt
3,1000045,NICHOLAS FINANCIAL INC,8-K,2022-11-04,edgar/data/1000045/0000950170-22-022210.txt
4,1000209,MEDALLION FINANCIAL CORP,10-Q,2022-11-03,edgar/data/1000209/0000950170-22-021831.txt
...,...,...,...,...,...
18664,99780,TRINITY INDUSTRIES INC,10-Q,2022-10-25,edgar/data/99780/0000099780-22-000136.txt
18665,99780,TRINITY INDUSTRIES INC,8-K,2022-10-25,edgar/data/99780/0000099780-22-000134.txt
18666,99780,TRINITY INDUSTRIES INC,8-K,2022-11-03,edgar/data/99780/0000099780-22-000138.txt
18667,9984,BARNES GROUP INC,10-Q,2022-10-28,edgar/data/9984/0000009984-22-000167.txt
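
The master index mixes periodic reports (10-K, 10-Q) with other form types such as 8-K. A minimal sketch of narrowing it down to periodic reports is shown here, using a few rows copied from the output above. Whether the pipeline filters in exactly this way is an assumption, and `TARGET_FORMS` is a hypothetical name.

```python
import pandas as pd

# A few rows copied from the master index output above.
indices = pd.DataFrame(
    {
        "CIK": [1000045, 1000045, 1000209, 99780],
        "Company Name": [
            "NICHOLAS FINANCIAL INC",
            "NICHOLAS FINANCIAL INC",
            "MEDALLION FINANCIAL CORP",
            "TRINITY INDUSTRIES INC",
        ],
        "Form Type": ["10-Q", "8-K", "10-Q", "8-K"],
        "Date Filed": ["2022-11-14", "2022-10-27", "2022-11-03", "2022-10-25"],
    }
)

# Hypothetical constant: the form types treated as financial statements.
TARGET_FORMS = ["10-K", "10-Q"]

# Keep only annual and quarterly reports, then renumber the rows.
reports = indices[indices["Form Type"].isin(TARGET_FORMS)].reset_index(drop=True)
print(reports)
```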


---
# Generate Listing CSV files

```src/sec_edgar_list_xbrl_xml.py``` iterates the master index files to:

1. Get all the files in a submission e.g. [QRVO 2021 10K](https://www.sec.gov/Archives/edgar/data/1604778/000160477821000032).
2. Identify which file is the XBRL XML file.
3. Generate the URL to the XBRL.

The resulting listing files are saved to the **DIR_DATA_CSV_LIST** directory.
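
The three steps can be sketched as follows. The first helper turns the `Filename` column of the master index into the submission directory URL, matching the URLs in the listing output. The second picks the XBRL instance document from the file names in that directory. The selection rule (skip the linkbase files and `FilingSummary.xml`, prefer `_htm.xml`) is an assumed heuristic, not necessarily the one the script uses. The function names and the example file list are hypothetical.

```python
import posixpath

SEC_ARCHIVES = "https://sec.gov/Archives"

# Standard XBRL linkbase suffixes; these files are not the instance document.
LINKBASE_SUFFIXES = ("_cal.xml", "_def.xml", "_lab.xml", "_pre.xml")


def submission_url(index_filename: str) -> str:
    """Map 'edgar/data/<CIK>/<accession>.txt' to the submission directory URL."""
    directory, basename = posixpath.split(index_filename)
    accession, _ext = posixpath.splitext(basename)
    return f"{SEC_ARCHIVES}/{directory}/{accession.replace('-', '')}"


def pick_xbrl_instance(filenames):
    """Return the most likely XBRL instance file name, or None (heuristic)."""
    candidates = [
        name
        for name in filenames
        if name.lower().endswith(".xml")
        and not name.lower().endswith(LINKBASE_SUFFIXES)
        and name != "FilingSummary.xml"
    ]
    for name in candidates:
        if name.endswith("_htm.xml"):
            return name
    return candidates[0] if candidates else None


base = submission_url("edgar/data/1000045/0000950170-22-024756.txt")
# Hypothetical directory content, apart from the instance file named in the listing.
files = ["nick-20220930.htm", "nick-20220930_cal.xml", "nick-20220930_htm.xml", "FilingSummary.xml"]
xbrl_url = f"{base}/{pick_xbrl_instance(files)}"
print(xbrl_url)
```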

In [9]:
DIR_DATA_CSV_LIST = "../data/csv/listing"

In [10]:
listing = pd.read_csv(
    f"{DIR_DATA_CSV_LIST}/{YEAR}QTR{QTR}_LIST.gz",
    sep="|",
    dtype={'Accession':str}
)
listing

Unnamed: 0,CIK,Company Name,Form Type,Date Filed,Filename
0,1000045,NICHOLAS FINANCIAL INC,10-Q,2022-11-14,https://sec.gov/Archives/edgar/data/1000045/000095017022024756/nick-20220930_htm.xml
1,1000209,MEDALLION FINANCIAL CORP,10-Q,2022-11-03,https://sec.gov/Archives/edgar/data/1000209/000095017022021831/mfin-20220930_htm.xml
2,1000228,HENRY SCHEIN INC,10-Q,2022-11-01,https://sec.gov/Archives/edgar/data/1000228/000100022822000068/hsic-20220924_htm.xml
3,1000229,CORE LABORATORIES N V,10-Q,2022-10-28,https://sec.gov/Archives/edgar/data/1000229/000095017022020366/clb-20220930_htm.xml
4,1000298,IMPAC MORTGAGE HOLDINGS INC,10-Q,2022-11-14,https://sec.gov/Archives/edgar/data/1000298/000155837022017770/imh-20220930x10q_htm.xml
...,...,...,...,...,...
6221,99106,TRANS LUX Corp,10-Q,2022-11-10,https://sec.gov/Archives/edgar/data/99106/000151316222000128/tlx-20220930_htm.xml
6222,99250,"TRANSCONTINENTAL GAS PIPE LINE COMPANY, LLC",10-Q,2022-10-31,https://sec.gov/Archives/edgar/data/99250/000009925022000013/tgpl-20220930_htm.xml
6223,99302,TRANSCAT INC,10-Q,2022-11-02,https://sec.gov/Archives/edgar/data/99302/000143774922025534/trns20220924_10q_htm.xml
6224,99780,TRINITY INDUSTRIES INC,10-Q,2022-10-25,https://sec.gov/Archives/edgar/data/99780/000009978022000136/trn-20220930_htm.xml


---


# Download XBRL XML files

```src/sec_edgar_download_xbrl_xml.py``` iterates the listing CSV files to:

1. Download the XBRL XML files and save them to the **DIR_DATA_XML_XBRL** directory.
2. Generate CSV files that map each submission to its downloaded XML file, and save them to the **DIR_DATA_CSV_XBRL** directory.
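
A minimal sketch of the storage side of this step is given here: derive the relative `<CIK>/<accession>/<file>.gz` path from an XBRL URL and write the content gzip-compressed. The helper names `local_path_for` and `save_gzipped` are hypothetical. The network download is left out, so a placeholder byte string stands in for the XML content.

```python
import gzip
import os
import tempfile
from urllib.parse import urlparse


def local_path_for(url: str) -> str:
    """Map an XBRL URL to '<CIK>/<accession>/<filename>.gz'."""
    parts = urlparse(url).path.strip("/").split("/")
    cik, accession, filename = parts[-3:]
    return f"{cik}/{accession}/{filename}.gz"


def save_gzipped(content: bytes, base_dir: str, url: str) -> str:
    """Write content gzip-compressed under base_dir and return the full path."""
    path = os.path.join(base_dir, local_path_for(url))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with gzip.open(path, "wb") as f:
        f.write(content)
    return path


URL = "https://sec.gov/Archives/edgar/data/1000045/000095017022024756/nick-20220930_htm.xml"

# A temporary directory stands in for DIR_DATA_XML_XBRL in this sketch.
with tempfile.TemporaryDirectory() as tmp:
    saved = save_gzipped(b"<xbrl/>", tmp, URL)  # placeholder content
    with gzip.open(saved, "rb") as f:
        restored = f.read()

print(local_path_for(URL))
```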

In [12]:
DIR_DATA_CSV_XBRL = "../data/csv/xbrl"

In [13]:
xbrl = pd.read_csv(
    f"{DIR_DATA_CSV_XBRL}/{YEAR}QTR{QTR}_XBRL.gz",
    sep="|",
    dtype={'Accession':str}
)
xbrl

Unnamed: 0,CIK,Company Name,Form Type,Date Filed,Year,Quarter,Filename,Filepath
0,1000697,WATERS CORP /DE/,10-K,2010-02-26,2010,1,https://sec.gov/Archives/edgar/data/1000697/00...,1000697/000095012310017583/wat-20091231.xml.gz
1,1001039,WALT DISNEY CO/,10-Q,2010-02-09,2010,1,https://sec.gov/Archives/edgar/data/1001039/00...,1001039/000119312510025949/dis-20100102.xml.gz
2,1001082,DISH Network CORP,10-K,2010-03-01,2010,1,https://sec.gov/Archives/edgar/data/1001082/00...,1001082/000095012310018671/dish-20091231.xml.gz
3,1001838,SOUTHERN COPPER CORP/,10-K,2010-02-26,2010,1,https://sec.gov/Archives/edgar/data/1001838/00...,1001838/000110465910010334/scco-20091231.xml.gz
4,1002638,OPEN TEXT CORP,10-Q,2010-02-04,2010,1,https://sec.gov/Archives/edgar/data/1002638/00...,1002638/000119312510021715/otex-20091231.xml.gz
5,1002910,AMEREN CORP,10-K,2010-02-26,2010,1,https://sec.gov/Archives/edgar/data/1002910/00...,1002910/000119312510043155/aee-20091231.xml.gz
6,1004155,AGL RESOURCES INC,10-K,2010-02-04,2010,1,https://sec.gov/Archives/edgar/data/1004155/00...,1004155/000100415510000016/agl-20091231.xml.gz
7,1004440,CONSTELLATION ENERGY GROUP INC,10-K,2010-02-26,2010,1,https://sec.gov/Archives/edgar/data/1004440/00...,1004440/000104746910001515/ceg-20091231.xml.gz


## Resulting directory structure (example)

```
├── csv
│   ├── index                    <--- DIR_DATA_CSV_INDEX
│   │   ├── 2022QTR1
│   │   ├── 2022QTR2
│   │   ├── 2022QTR3
│   │   └── 2022QTR4
│   ├── listing                  <--- DIR_DATA_CSV_LIST
│   │   ├── 2022QTR1_LIST.gz
│   │   ├── 2022QTR2_LIST.gz
│   │   ├── 2022QTR3_LIST.gz
│   │   └── 2022QTR4_LIST.gz
│   └── xbrl                     <--- DIR_DATA_CSV_XBRL
│       ├── 2022QTR1_XBRL.gz
│       ├── 2022QTR2_XBRL.gz
│       ├── 2022QTR3_XBRL.gz
│       └── 2022QTR4_XBRL.gz
└── xml
    └── xbrl                     <--- DIR_DATA_XML_XBRL
        ├── 40211
        │   ├── 000004021122000023
        │   │   └── gmt-20211231_htm.xml.gz
        │   ├── 000004021122000058
        │   │   └── gmt-20220331_htm.xml.gz
        │   ├── 000004021122000094
        │   │   └── gmt-20220630_htm.xml.gz
        │   └── 000004021122000124
        │       └── gmt-20220930_htm.xml.gz
```
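
Given this layout, the downloaded files can be enumerated with a glob over the `<CIK>/<accession>/` levels. The sketch below recreates a small part of the tree in a temporary directory and lists it. The function name `list_xbrl_files` is hypothetical.

```python
import tempfile
from pathlib import Path


def list_xbrl_files(xml_dir):
    """Return (CIK, accession, file name) for every gzipped XML under xml_dir."""
    rows = []
    for path in sorted(Path(xml_dir).glob("*/*/*.xml.gz")):
        cik, accession = path.parts[-3], path.parts[-2]
        rows.append((cik, accession, path.name))
    return rows


# Recreate two entries of the example tree with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for accession, name in [
        ("000004021122000023", "gmt-20211231_htm.xml.gz"),
        ("000004021122000058", "gmt-20220331_htm.xml.gz"),
    ]:
        target = Path(tmp) / "40211" / accession / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    found = list_xbrl_files(tmp)

for row in found:
    print(row)
```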

---
# Parse the XBRL XML files

```src/sec_edgar_parse_xbrl_xml.py``` iterates the XBRL CSV files to:

1. Load the XML file for each submission.
2. Parse the XML file to extract all the financial report elements e.g. revenue.
3. Generate CSV files for the financial reports and save them to the **DIR_DATA_CSV_GAAP** directory.

```src/xbrl_gaap_function.py``` implements the parsing logic that extracts US GAAP (Generally Accepted Accounting Principles) financial report elements, e.g. ```us-gaap:CashAndCashEquivalentsAtCarryingValue```, using the [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) parsing library.
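
To show what extracting such an element involves, here is a small sketch that uses the standard library `xml.etree.ElementTree` in place of BeautifulSoup, so it runs without extra packages. It is not the logic of `xbrl_gaap_function.py`. The XML snippet, its values, its context names, and the function `extract_facts` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up XBRL instance fragment; values and context IDs are illustrative only.
SAMPLE = """<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:us-gaap="http://fasb.org/us-gaap/2021-01-31">
  <us-gaap:CashAndCashEquivalentsAtCarryingValue contextRef="ctx_2021"
      unitRef="usd" decimals="-3">1000000</us-gaap:CashAndCashEquivalentsAtCarryingValue>
  <us-gaap:Revenues contextRef="ctx_fy2021"
      unitRef="usd" decimals="-3">5000000</us-gaap:Revenues>
</xbrl>"""


def extract_facts(xml_text: str, local_name: str):
    """Collect us-gaap facts with the given local element name."""
    facts = []
    for element in ET.fromstring(xml_text).iter():
        # ElementTree renders qualified tags as '{namespace-uri}LocalName'.
        if not element.tag.startswith("{"):
            continue
        namespace, _, name = element.tag[1:].partition("}")
        # Assumption: any namespace URI containing 'us-gaap' is the GAAP taxonomy.
        if "us-gaap" in namespace and name == local_name:
            facts.append(
                {
                    "name": f"us-gaap:{name}",
                    "value": float(element.text),
                    "unit": element.get("unitRef"),
                    "decimals": element.get("decimals"),
                    "context": element.get("contextRef"),
                }
            )
    return facts


facts = extract_facts(SAMPLE, "CashAndCashEquivalentsAtCarryingValue")
print(facts)
```

Matching on the local name rather than a fixed namespace URI keeps the sketch independent of the taxonomy year, since the `us-gaap` namespace URI changes between taxonomy releases.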

# GAAP

In [9]:
DIR_DATA_CSV_GAAP = "../data/csv/gaap"

In [10]:
YEAR = 2022
QTR = 1

gaap = pd.read_csv(
    f"{DIR_DATA_CSV_GAAP}/{YEAR}QTR{QTR}_GAAP.gz",
    sep="|",
    dtype={'Accession':str}
)
#gaap = gaap[gaap['CIK'] == 1000697]
#gaap.sort_values(['CIK', 'Year', 'Quarter'])
gaap

Unnamed: 0,CIK,Accession,Year,Quarter,Form Type,FS,Rep,Type,Name,Value,Unit,Decimals,Context
0,1820931,000119312522089986,2022,1,10-K,pl,operating_income,calc,us-gaap:operatingincomeloss,-5.86e+06,Unit_USD,0.0,P01_01_2021To12_31_2021
1,1163668,000114036122008216,2022,1,10-K,,,credit,us-gaap:interestanddividendincomeoperating,1.35e+08,U001,-3.0,c20210101to20211231
2,1564406,000156459022007694,2022,1,10-K,pl,revenue,credit,us-gaap:revenues,5.07e+08,U_iso4217USD,-5.0,C_0001564406_srtConsolidatedEntitiesAxis_oshOakStreetHealthIncAndAffiliatesMember_srtMajorCustomersAxis_oshHumanaMember_srtProductOrServiceAxis_oshCapitatedRevenueMember_20210101_20211231
3,1000045,000095017022000940,2022,1,10-Q,pl,revenue,credit,us-gaap:revenues,1.22e+07,U_USD,-3.0,C_92ed3f29-4d0b-420f-864a-0b1b379d6280
4,1689796,000155837022001602,2022,1,10-K,pl,revenue,credit,us-gaap:revenues,6.34e+08,Unit_Standard_USD_3IE4NddlME6yWiaIekhaEw,-3.0,Duration_1_1_2021_To_12_31_2021_wABg_ldjuUq11zhQc-9MDQ
...,...,...,...,...,...,...,...,...,...,...,...,...,...
184151,796343,000079634322000099,2022,1,10-Q,,,debit,us-gaap:commonstockvalue,0.00e+00,usd,-6.0,id9d28f571e6d4fa9ad5b62bb5c0a1bf6_I20220304
184152,796343,000079634322000099,2022,1,10-Q,,,debit,us-gaap:retainedearningsaccumulateddeficit,2.50e+10,usd,-6.0,id9d28f571e6d4fa9ad5b62bb5c0a1bf6_I20220304
184153,796343,000079634322000099,2022,1,10-Q,,,debit,us-gaap:accumulatedothercomprehensiveincomelossnetoftax,-1.77e+08,usd,-6.0,id9d28f571e6d4fa9ad5b62bb5c0a1bf6_I20220304
184154,796343,000079634322000099,2022,1,10-Q,bs,stockholders_equity,calc,us-gaap:stockholdersequity,1.38e+10,usd,-6.0,id9d28f571e6d4fa9ad5b62bb5c0a1bf6_I20220304
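
A short sketch of working with this table: select the revenue rows of one company and read the `Decimals` column. In XBRL, `decimals="-3"` means the value is accurate to the nearest thousand. The rows below are copied from the output above with only a subset of the columns, and the values are the rounded figures as displayed.

```python
import pandas as pd

# Rows copied from the GAAP output above (subset of columns, displayed values).
gaap = pd.DataFrame(
    {
        "CIK": [1820931, 1564406, 1000045, 796343],
        "Form Type": ["10-K", "10-K", "10-Q", "10-Q"],
        "FS": ["pl", "pl", "pl", "bs"],
        "Rep": ["operating_income", "revenue", "revenue", "stockholders_equity"],
        "Name": [
            "us-gaap:operatingincomeloss",
            "us-gaap:revenues",
            "us-gaap:revenues",
            "us-gaap:stockholdersequity",
        ],
        "Value": [-5.86e06, 5.07e08, 1.22e07, 1.38e10],
        "Decimals": [0.0, -5.0, -3.0, -6.0],
    }
)

# Revenue rows reported by CIK 1000045.
revenue = gaap[(gaap["CIK"] == 1000045) & (gaap["Rep"] == "revenue")]

# Decimals = -3 means the figure is accurate to the nearest 10**3 = 1000.
rounding_unit = 10.0 ** (-revenue["Decimals"])

print(revenue[["CIK", "Name", "Value", "Decimals"]])
print(rounding_unit.tolist())
```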
