# Before/After Comparison: Raw HTML vs Clean Markdown

This notebook demonstrates the transformation `sec2md` performs on SEC filings.

In [1]:
from edgar import Company, set_identity
import sec2md
from IPython.display import HTML, Markdown, display

set_identity("Lucas Astorian <lucas@intellifin.ai>")

In [2]:
# Get Apple's latest 10-K
company = Company('AAPL')
filing = company.get_filings(form="10-K").latest()

print(f"Filing: {filing.form} - {filing.filing_date}")

Filing: 10-K - 2024-11-01


## BEFORE: Raw SEC HTML

This is what the cover page looks like in raw HTML: XBRL tags, inline styles, and complex formatting.

In [3]:
# Get raw HTML
raw_html = filing.html()

# Show first 2000 characters of raw HTML
print("Raw HTML (first 2000 chars):")
print("=" * 80)
print(raw_html[:2000])
print("=" * 80)

Raw HTML (first 2000 chars):
<?xml version='1.0' encoding='ASCII'?>
<!--XBRL Document Created with the Workiva Platform-->
<!--Copyright 2024 Workiva-->
<!--r:6516014a-223b-4792-964c-105c0fc62715,g:fb24cc6b-9929-486d-8f15-d4cad8060a59,d:7bfbfbe54b9647b1b4ba4ff4e0aba09d-->
<html xmlns:link="http://www.xbrl.org/2003/linkbase" xmlns:iso4217="http://www.xbrl.org/2003/iso4217" xmlns:country="http://xbrl.sec.gov/country/2024" xmlns="http://www.w3.org/1999/xhtml" xmlns:ixt-sec="http://www.sec.gov/inlineXBRL/transformation/2015-08-31" xmlns:dei="http://xbrl.sec.gov/dei/2024" xmlns:xbrli="http://www.xbrl.org/2003/instance" xmlns:ixt="http://www.xbrl.org/inlineXBRL/transformation/2020-02-12" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:aapl="http://www.apple.com/20240928" xmlns:ecd="http://xbrl.sec.gov/ecd/2024" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xbrldi="http://xbrl.org/2006/xbrldi" xmlns:ix="http://www.xbrl.org/2013/inlineXBRL" xmlns:srt="http://fasb.org/srt/2024" x
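
The namespace declarations in the output above hint at how much of a filing is markup rather than text. As a rough illustration, the sketch below counts opening `ix:` tags and compares total length against visible text length using only the standard library. The `sample` string and the `markup_stats` helper are assumptions made up for this example; they are not part of `sec2md` or `edgar`, and the regex-based tag stripping is a crude estimate, not a real HTML parser.

```python
import re

# Hypothetical sample resembling inline XBRL markup (not copied from the filing).
sample = (
    '<div style="font-size:9pt">'
    '<ix:nonNumeric name="dei:EntityRegistrantName" contextRef="c-1">'
    "Apple Inc."
    "</ix:nonNumeric>"
    "</div>"
)


def markup_stats(html: str) -> dict:
    """Return rough counts of inline XBRL tags and visible text length."""
    # Match opening ix:* tags only; closing tags start with "</" and are skipped.
    ix_tags = re.findall(r"<ix:[A-Za-z]+", html)
    # Crude tag stripping: good enough for a size estimate, not for parsing.
    text = re.sub(r"<[^>]+>", "", html)
    return {
        "ix_tags": len(ix_tags),
        "total_chars": len(html),
        "text_chars": len(text),
    }


stats = markup_stats(sample)
print(stats)
```

In the notebook, the same helper could be called as `markup_stats(raw_html)` to get a feel for the markup-to-text ratio of the full filing.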

### Rendered raw HTML (how it looks in a browser):

In [18]:
# Render the first 127,000 characters of the raw HTML (messy)
HTML(raw_html[:127000])

| California | 94-2404110 |
| --- | --- |
| (State or other jurisdiction of incorporation or organization) | (I.R.S. Employer Identification No.) |
| One Apple Park Way |  |
| Cupertino, California | 95014 |
| (Address of principal executive offices) | (Zip Code) |

| Title of each class | Trading symbol(s) | Name of each exchange on which registered |
| --- | --- | --- |
| Common Stock, $0.00001 par value per share | AAPL | The Nasdaq Stock Market LLC |
| 0.000% Notes due 2025 | — | The Nasdaq Stock Market LLC |
| 0.875% Notes due 2025 | — | The Nasdaq Stock Market LLC |
| 1.625% Notes due 2026 | — | The Nasdaq Stock Market LLC |
| 2.000% Notes due 2027 | — | The Nasdaq Stock Market LLC |
| 1.375% Notes due 2029 | — | The Nasdaq Stock Market LLC |
| 3.050% Notes due 2029 | — | The Nasdaq Stock Market LLC |
| 0.500% Notes due 2031 | — | The Nasdaq Stock Market LLC |

| Large accelerated filer | ☒ | Accelerated filer | ☐ |
| --- | --- | --- | --- |
| Non-accelerated filer | ☐ | Smaller reporting company | ☐ |
|  |  | Emerging growth company | ☐ |


---

## AFTER: Clean Markdown

Now let's see the same content after `sec2md` transforms it:

In [13]:
# Convert to Markdown pages
pages = sec2md.convert_to_markdown(raw_html, return_pages=True)

print(f"Total pages: {len(pages)}")
print("\nFirst page (raw Markdown):")
print("=" * 80)
print(pages[0].content[:1000])
print("=" * 80)

Total pages: 59

First page (raw Markdown):
**UNITED STATES**

**SECURITIES AND EXCHANGE COMMISSION**

**Washington, D.C. 20549**

**FORM 10-K**

(Mark One)

☒ **ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934**

For the fiscal year ended September 28 , 2024

or

☐ **TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934**

For the transition period from to .

Commission File Number: **001-36743**

**Apple Inc.**

(Exact name of Registrant as specified in its charter)

| California | 94-2404110 |
| --- | --- |
| (State or other jurisdiction of incorporation or organization) | (I.R.S. Employer Identification No.) |
| One Apple Park Way |  |
| Cupertino , California | 95014 |
| (Address of principal executive offices) | (Zip Code) |

**( 408 ) 996-1010**

(Registrant’s telephone number, including area code)

Securities registered pursuant to Section 12(b) of the Act:

| Title of each class | Trading symbol(s) | Name o
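
Since the conversion returns a list of page objects, a simple next step is to find which pages mention a given phrase. The sketch below is a minimal illustration: the `Page` dataclass is a stand-in that models only the `.content` attribute used in this notebook, and `pages_mentioning` and the demo strings are hypothetical, not part of `sec2md`.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Stand-in for a converted page; only `.content` appears in this notebook."""

    content: str


def pages_mentioning(pages, needle: str) -> list:
    """Return 0-based indices of pages whose Markdown contains `needle`."""
    needle = needle.lower()  # case-insensitive match
    return [i for i, page in enumerate(pages) if needle in page.content.lower()]


# Made-up demo pages; real pages would come from the conversion step above.
demo_pages = [
    Page("**FORM 10-K**\n\n**Apple Inc.**"),
    Page("Item 1A. Risk Factors"),
    Page("Item 7. Management's Discussion and Analysis"),
]

hits = pages_mentioning(demo_pages, "risk factors")
print(hits)
```

With the real conversion result, the call would presumably look like `pages_mentioning(pages, "Risk Factors")`, assuming each page exposes `.content` as shown in the cell above.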

### Rendered Markdown (clean and structured):

In [14]:
# Display the clean Markdown
Markdown(pages[0].content)

**UNITED STATES**

**SECURITIES AND EXCHANGE COMMISSION**

**Washington, D.C. 20549**

**FORM 10-K**

(Mark One)

☒ **ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934**

For the fiscal year ended September 28 , 2024

or

☐ **TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934**

For the transition period from to .

Commission File Number: **001-36743**

**Apple Inc.**

(Exact name of Registrant as specified in its charter)

| California | 94-2404110 |
| --- | --- |
| (State or other jurisdiction of incorporation or organization) | (I.R.S. Employer Identification No.) |
| One Apple Park Way |  |
| Cupertino , California | 95014 |
| (Address of principal executive offices) | (Zip Code) |

**( 408 ) 996-1010**

(Registrant’s telephone number, including area code)

Securities registered pursuant to Section 12(b) of the Act:

| Title of each class | Trading symbol(s) | Name of each exchange on which registered |
| --- | --- | --- |
| Common Stock, $0.00001 par value per share | AAPL | The Nasdaq Stock Market LLC |
| 0.000% Notes due 2025 | — | The Nasdaq Stock Market LLC |
| 0.875% Notes due 2025 | — | The Nasdaq Stock Market LLC |
| 1.625% Notes due 2026 | — | The Nasdaq Stock Market LLC |
| 2.000% Notes due 2027 | — | The Nasdaq Stock Market LLC |
| 1.375% Notes due 2029 | — | The Nasdaq Stock Market LLC |
| 3.050% Notes due 2029 | — | The Nasdaq Stock Market LLC |
| 0.500% Notes due 2031 | — | The Nasdaq Stock Market LLC |
| 3.600% Notes due 2042 | — | The Nasdaq Stock Market LLC |

Securities registered pursuant to Section 12(g) of the Act: None

Indicate by check mark if the Registrant is a well-known seasoned issuer, as defined in Rule 405 of the Securities Act.

Yes ☒ No ☐

Indicate by check mark if the Registrant is not required to file reports pursuant to Section 13 or Section 15(d) of the Act.

Yes ☐ No ☒

Indicate by check mark whether the Registrant (1) has filed all reports required to be filed by Section 13 or 15(d) of the Securities Exchange Act of 1934 during the preceding 12 months (or for such shorter period that the Registrant was required to file such reports), and (2) has been subject to such filing requirements for the past 90 days.

Yes ☒ No ☐

Indicate by check mark whether the Registrant has submitted electronically every Interactive Data File required to be submitted pursuant to Rule 405 of Regulation S-T (§232.405 of this chapter) during the preceding 12 months (or for such shorter period that the Registrant was required to submit such files).

Yes ☒ No ☐

Indicate by check mark whether the Registrant is a large accelerated filer, an accelerated filer, a non-accelerated filer, a smaller reporting company, or an emerging growth company. See the definitions of “large accelerated filer,” “accelerated filer,” “smaller reporting company,” and “emerging growth company” in Rule 12b-2 of the Exchange Act.

| Large accelerated filer | ☒ | Accelerated filer | ☐ |
| --- | --- | --- | --- |
| Non-accelerated filer | ☐ | Smaller reporting company | ☐ |
|  |  | Emerging growth company | ☐ |

If an emerging growth company, indicate by check mark if the Registrant has elected not to use the extended transition period for complying with any new or revised financial accounting standards provided pursuant to Section 13(a) of the Exchange Act. ☐

Indicate by check mark whether the Registrant has filed a report on and attestation to its management’s assessment of the effectiveness of its internal control over financial reporting under Section 404(b) of the Sarbanes-Oxley Act (15 U.S.C. 7262(b)) by the registered public accounting firm that prepared or issued its audit report. ☒

If securities are registered pursuant to Section 12(b) of the Act, indicate by check mark whether the financial statements of the registrant included in the filing reflect the correction of an error to previously issued financial statements. ☐

Indicate by check mark whether any of those error corrections are restatements that required a recovery analysis of incentive-based compensation received by any of the registrant’s executive officers during the relevant recovery period pursuant to §240.10D-1(b). ☐

Indicate by check mark whether the Registrant is a shell company (as defined in Rule 12b-2 of the Act).

Yes ☐ No ☒

The aggregate market value of the voting and non-voting stock held by non-affiliates of the Registrant, as of March 29, 2024, the last business day of the Registrant’s most recently completed second fiscal quarter, was approximately $ 2,628,553,000,000 . Solely for purposes of this disclosure, shares of common stock held by executive officers and directors of the Registrant as of such date have been excluded because such persons may be deemed to be affiliates. This determination of executive officers and directors as affiliates is not necessarily a conclusive determination for any other purposes.

15,115,823,000 shares of common stock were issued and outstanding as of October 18, 2024.

**DOCUMENTS INCORPORATED BY REFERENCE**

Portions of the Registrant’s definitive proxy statement relating to its 2025 annual meeting of shareholders are incorporated by reference into Part III of this Annual Report on Form 10-K where indicated. The Registrant’s definitive proxy statement will be filed with the U.S. Securities and Exchange Commission within 120 days after the end of the fiscal year to which this report relates.
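
Because the tables in the page above are plain pipe tables, they can be read back into rows with ordinary string handling. The following is a minimal sketch under that assumption; `parse_pipe_table` is a hypothetical helper written for this example, and it does not handle escaped pipes (`\|`) inside cells.

```python
def parse_pipe_table(md_table: str) -> list:
    """Parse a Markdown pipe table into a list of rows (lists of cell strings)."""
    rows = []
    for line in md_table.strip().splitlines():
        # Drop the outer pipes, then split the remainder into trimmed cells.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Skip the delimiter row, whose cells contain only dashes and colons.
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    return rows


table = """
| Title of each class | Trading symbol(s) | Name of each exchange on which registered |
| --- | --- | --- |
| Common Stock, $0.00001 par value per share | AAPL | The Nasdaq Stock Market LLC |
"""

rows = parse_pipe_table(table)
header, *body = rows
print(header)
print(body)
```

Rows with empty cells, such as the `One Apple Park Way` row of the address table, are kept because an empty cell does not count as a delimiter cell.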

---

## Summary

**Before:** Complex HTML with XBRL tags, inline styles, and nested tables

**After:** Clean, structured Markdown ready for:
- ✅ LLM processing
- ✅ Embeddings and retrieval
- ✅ Section extraction
- ✅ Semantic search
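
As one illustration of the embedding and retrieval use case, the sketch below splits Markdown into chunks on blank lines and greedily packs them up to a character budget. `chunk_markdown` and its `max_chars` parameter are assumptions for this example, not a `sec2md` API; a block that exceeds the budget is kept whole so that tables are never cut in half.

```python
def chunk_markdown(md: str, max_chars: int = 500) -> list:
    """Greedily pack blank-line-separated blocks into chunks of up to max_chars.

    A single block longer than max_chars becomes its own oversized chunk
    instead of being split, so Markdown tables stay intact.
    """
    blocks = [block.strip() for block in md.split("\n\n") if block.strip()]
    chunks = []
    current = ""
    for block in blocks:
        candidate = f"{current}\n\n{block}" if current else block
        if len(candidate) <= max_chars or not current:
            current = candidate  # fits, or is the first block of a new chunk
        else:
            chunks.append(current)  # close the full chunk and start a new one
            current = block
    if current:
        chunks.append(current)
    return chunks


sample = (
    "**FORM 10-K**\n\n"
    "(Mark One)\n\n"
    "| California | 94-2404110 |\n"
    "| --- | --- |\n"
    "| One Apple Park Way |  |"
)

chunks = chunk_markdown(sample, max_chars=30)
for chunk in chunks:
    print(repr(chunk))
```

Each resulting chunk could then be passed to whichever embedding model is in use; character counts are only a proxy for token counts, so the budget would need tuning per model.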

**Screenshot the cells above for your README comparison image!**