## Exploration Notebooks - TOC in action

The purpose of this notebook is to demonstrate the logic for identifying the Table of Contents section in both 10-K/10-Q and S-1 filings.

#### Table of Contents

1. [TOC action for 10-K/10-Q filings](#10-K-10-Q)
2. [TOC action for S-1 filings](#S-1)

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
from prepline_sec_filings.fetch import get_filing
from prepline_sec_filings.sec_document import SECDocument

### 10-K/10-Q Filing <a id="10-K-10-Q"></a>

This section pulls in the Palantir 10-Q filing from the SEC site, which is available [here](https://www.sec.gov/Archives/edgar/data/1321655/000119312520292177/d31861d10q.htm). The goal is to identify the [table of contents](https://www.sec.gov/Archives/edgar/data/1321655/000119312520292177/d31861d10q.htm#toc) section.

In [None]:
text = get_filing("1321655",
                  "000119312520292177", 
                  "Unstructured Technologies",
                  "support@unstructured.io")

In [None]:
sec_document = SECDocument.from_string(text)

In [None]:
elements = sec_document.elements

In [None]:
toc = sec_document.get_table_of_contents()

From the cell below, we can see that the `get_table_of_contents` method identified the table of contents section in the document. However, there is still extra junk at the end: a stray "Table of Contents" line and the "SPECIAL NOTE REGARDING FORWARD-LOOKING STATEMENTS" heading.

In [None]:
for element in toc.elements:
    print(element.text)

PART I. FINANCIAL INFORMATION
Item 1
Financial Statements (unaudited)
Condensed Consolidated Balance Sheets
Condensed Consolidated Statements of Operations
Condensed Consolidated Statements of Comprehensive Loss
Condensed Consolidated Statements of Cash Flows
Notes to Unaudited Condensed Consolidated Financial
Statements
Item 2
Management’s Discussion and Analysis of Financial Condition and Results
 of Operations
Item 3
Quantitative and Qualitative Disclosures About Market Risk
Item 4
Controls and Procedures
PART II. OTHER INFORMATION
Item 1
Legal Proceedings
Item 1A
Risk Factors
Item 2
Unregistered Sales of Equity Securities
Item 3
Defaults Upon Senior Securities
Item 4
Mine Safety Disclosures
Item 5
Other Information
Item 6
Exhibits
Table of Contents
SPECIAL NOTE REGARDING FORWARD-LOOKING STATEMENTS
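
One possible way to drop the trailing junk is to cut the list at the first line that reads "Table of Contents". The sketch below works on plain strings and assumes that such a line marks the start of the filing body rather than a real entry. The helper name `trim_after_repeated_heading` and its `marker` parameter are hypothetical and not part of `prepline_sec_filings`. If the extracted section itself opened with a "Table of Contents" heading, this heuristic would return an empty list, so it is only a starting point.

```python
from typing import List


def trim_after_repeated_heading(
    lines: List[str], marker: str = "table of contents"
) -> List[str]:
    """Keep the lines that come before the first line equal to `marker`.

    The comparison ignores case and surrounding whitespace. Treating the
    marker as the end of the table of contents is an assumption.
    """
    trimmed = []
    for line in lines:
        if line.strip().lower() == marker:
            break
        trimmed.append(line)
    return trimmed


# Tail of the 10-Q output shown above.
toc_lines = [
    "Item 5",
    "Other Information",
    "Item 6",
    "Exhibits",
    "Table of Contents",
    "SPECIAL NOTE REGARDING FORWARD-LOOKING STATEMENTS",
]
cleaned = trim_after_repeated_heading(toc_lines)
print(cleaned)
```

In the notebook, the same helper could be applied to `[element.text for element in toc.elements]`.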


### S-1 Filing <a id="S-1"></a>

This section pulls in the Tesla S-1 filing from the SEC site, which is available [here](https://www.sec.gov/Archives/edgar/data/1318605/000119312511149963/ds1.htm). The goal is to identify the [table of contents](https://www.sec.gov/Archives/edgar/data/1318605/000119312511149963/ds1.htm#toc) section.

In [None]:
text = get_filing("1318605",
                  "000119312511149963",
                  "Unstructured Technologies",
                  "support@unstructured.io")

In [None]:
sec_document = SECDocument.from_string(text)

In [None]:
elements = sec_document.elements

In [None]:
toc = sec_document.get_table_of_contents()

From the cell below, we can see that the `get_table_of_contents` method identified the table of contents section in the document. Here the output ends with the last entry, "Index to Consolidated Financial Statements", with no extra lines after it.

In [None]:
for element in toc.elements:
    print(element.text)

Prospectus Summary
The Offering
Summary Consolidated Financial Data
Risk Factors
Special Note Regarding Forward Looking Statements
Market, Industry and Other Data
Use of Proceeds
Price Range of Common Stock
Dividend Policy
Capitalization
Dilution
Selected Consolidated Financial Data
Management’s Discussion and Analysis of Financial Condition and Results of
Operations
Business
Management
Executive Compensation
Certain Relationships and Related Party Transactions
Principal Stockholders
Description of Capital Stock
Shares Eligible for Future Sale
Material United States Tax Considerations for Non-United States Holders
Underwriting
Concurrent Private Placement
Legal Matters
Experts
Where You Can Find Additional Information
Index to Consolidated Financial Statements
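
Some entries in the output above are split across two lines, for example "Management’s Discussion and Analysis of Financial Condition and Results of" followed by "Operations". A minimal sketch for rejoining them, assuming that a line ending in a short connecting word continues on the next line, is shown below. The helper `merge_wrapped_lines` and the `CONTINUATION_WORDS` set are hypothetical, and the heuristic will miss wraps that break after other words.

```python
from typing import List

# Hypothetical set of words that rarely end a complete heading.
CONTINUATION_WORDS = {"of", "and", "the", "for", "to", "in", "on"}


def merge_wrapped_lines(lines: List[str]) -> List[str]:
    """Join a line to the previous one when the previous line ended
    in a continuation word. Blank lines are skipped."""
    merged: List[str] = []
    carry_over = False
    for line in lines:
        text = line.strip()
        if not text:
            continue
        if carry_over and merged:
            merged[-1] = f"{merged[-1]} {text}"
        else:
            merged.append(text)
        # Decide whether the next line should be appended to this one.
        carry_over = text.split()[-1].lower() in CONTINUATION_WORDS
    return merged


sample = [
    "Selected Consolidated Financial Data",
    "Management’s Discussion and Analysis of Financial Condition and Results of",
    "Operations",
    "Business",
]
merged_sample = merge_wrapped_lines(sample)
for entry in merged_sample:
    print(entry)
```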
