## Section Breakdown (`pdfplumber`)

In [1]:
import re
from collections import defaultdict

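def parse_markdown(md_text):
    """Map each Markdown header to the body text that follows it."""
    # One to six '#' characters, optional whitespace, then the header text.
    header_regex = re.compile(r'^(#{1,6})\s*(.*)')

    # Missing keys start as '', so body lines can be appended directly.
    sections = defaultdict(str)

    # Lines before the first header belong to no section and are skipped.
    current_section = None

    for line in md_text.split('\n'):
        header_match = header_regex.match(line)
        if header_match:
            # Header level (1-6); computed but not used by this flat mapping.
            level = len(header_match.group(1))
            header = header_match.group(2).strip()

            current_section = header
        elif current_section:
            # Repeated header names accumulate under the same key.
            sections[current_section] += line + '\n'

    return sections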

with open('../outputs/digital-thermometer-pdfplumber.md') as file:
    markdown_text = file.read()

sections = parse_markdown(markdown_text)

In [2]:
sections.keys()

dict_keys(['General Description', 'Applications', 'Benefits and Features', 'Pin Configurations Absolute Maximum Ratings DC Electrical Characteristics AC Electrical Characteristics–NV Memory AC Electrical Characteristics Pin Description Overview', 'Operation—Measuring Temperature', 'Table 1. Temperature/Data Relationship', 'Operation—Alarm Signaling', 'Powering the DS18B20', '64-BIT Lasered ROM code', 'Memory', 'Configuration Register', 'CRC Generation', 'Table 2. Thermometer Resolution Configura', 'ion', '1-Wire Bus System', 'Hardware Configuration', 'Transaction Sequence', 'Initialization', 'ROM Commands', 'Search Rom [F0h]', 'Read Rom [33h]', 'Match Rom [55H]', 'Skip Rom [CCh]', 'Alarm Search [ECh]', 'DS18B20 Function Commands', 'Convert T [44h]', 'Write Scratchpad [4Eh]', 'Read Scratchpad [BEh]', 'Copy Scratchpad [48h]', 'Recall E [B8h]', 'Table 3. DS18B20 Function Command Set', 'Read Power Supply [B4h]', '1-Wire Signaling', 'Initialization Procedure—Reset And Presence Pulses', 'Rea

In [3]:
sections['General Description']

'\n The DS18B20 digital thermometer provides 9-bit to 12-bit Celsius temperature measurements and has an alarm function with nonvolatile user-programmable upper and lower trigger points. The DS18B20 communicates over a 1-Wire bus that by definition requires only one data line (and ground) for communication with a central microprocessor. In addition, the DS18B20 can derive power directly from the data line (“parasite power”), eliminating the need for an external power supply. Each DS18B20 has a unique 64-bit serial code, which allows multiple DS18B20s to function on the same 1-Wire bus. Thus, it is simple to use one microprocessor to control many DS18B20s distributed over a large area. Applications that can benefit from this feature include HVAC environmental controls, temperature monitoring systems inside buildings, equipment, or machinery, and process monitoring and control systems. \n\n'
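
The flat mapping from `parse_markdown` discards the header level and treats any line starting with `#` as a header, even inside a fenced code block. The following is a minimal sketch of one possible variant, not part of the original notebook: the hypothetical helper `parse_markdown_nested` keys each section by its full header path and ignores `#` lines inside fences. It also assumes that a header requires whitespace after the `#` characters, which is stricter than the `\s*` used in `parse_markdown`.

```python
import re
from collections import defaultdict

# Hypothetical helper names; none of these appear in the original notebook.
HEADER_RE = re.compile(r'^(#{1,6})\s+(.*)')   # assumption: whitespace is required after the '#'s
FENCE_RE = re.compile(r'^\s*(```|~~~)')       # opening or closing line of a fenced code block

def parse_markdown_nested(md_text):
    """Map a tuple of ancestor headers, e.g. ('Memory', 'CRC Generation'), to body text."""
    sections = defaultdict(str)
    path = []          # (level, header) pairs for the headers that are currently open
    in_fence = False

    for line in md_text.split('\n'):
        if FENCE_RE.match(line):
            in_fence = not in_fence
        # Inside a fence, '#' lines are code or comments, not headers.
        match = None if in_fence else HEADER_RE.match(line)
        if match:
            level = len(match.group(1))
            # Close every open header at the same or a deeper level.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        elif path:
            key = tuple(header for _, header in path)
            sections[key] += line + '\n'
    return sections
```

With this sketch, a lookup such as `sections[('Memory', 'Configuration Register')]` would work only if the extracted Markdown actually nests those headers; the real paths depend on the header levels the extractor emitted.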

## Page Breakdown (`pymupdf4llm`)

In [4]:
with open('../outputs/digital-thermometer-pdfplumber-overwrite-tables.md') as file:
    markdown_text = file.read()

In [5]:
pages = {}

for page_number, page_text in enumerate(markdown_text.split('-----'), start=1):
    pages[page_number] = page_text

In [6]:
pages[2]

'\n\n# DS18B20 Programmable Resolution 1-Wire Digital Thermometer\n\n**Absolute Maximum Ratings**\n\nVoltage Range on Any Pin Relative to Ground.....-0.5V to +6.0V Storage Temperature Range............................. -55°C to +125°C\nOperating Temperature Range.......................... -55°C to +125°C Solder Temperature................................Refer to the IPC/JEDEC\nJ-STD-020 Specification.\n\n_These are stress ratings only and functional operation of the device at these or any other conditions above those indicated in the operation sections of this specification is not implied. Exposure_\n_to absolute maximum rating conditions for extended periods of time may affect reliability._\n\n**DC Electrical Characteristics**\n\n(-55°C to +125°C; VDD = 3.0V to 5.5V)\n\n**Note 1:** All voltages are referenced to ground.\n**Note 2:** The Pullup Supply Voltage specification assumes that the pullup device is ideal, and therefore the high level of the\npullup is equal to VPU. In order to 
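
Note that `str.split('-----')` splits on every run of five dashes, so a Markdown table delimiter row such as `|-----|-----|` would also be cut into pieces if the file contains one. A minimal sketch of a stricter split follows, assuming a page break is a line consisting only of five or more dashes; the names `PAGE_BREAK_RE` and `split_pages` are hypothetical and not part of the notebook.

```python
import re

# Assumption: a page break is a line made up only of five or more dashes.
PAGE_BREAK_RE = re.compile(r'^[ \t]*-{5,}[ \t]*$', re.MULTILINE)

def split_pages(md_text):
    """Return {page_number: page_text}, numbering pages from 1."""
    chunks = PAGE_BREAK_RE.split(md_text)
    # A separator after the last page leaves one empty trailing chunk; drop only that one
    # so that genuinely empty pages in the middle keep their page numbers.
    if chunks and not chunks[-1].strip():
        chunks.pop()
    return dict(enumerate(chunks, start=1))
```

A setext-style heading underline or a thematic break made of dashes would also match this pattern, so the sketch is only appropriate when the extractor reserves such lines for page breaks.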

In [7]:
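from typing import Iterator

def page_range_iterator(md_text: str, pages_per_iteration: int = 2) -> Iterator[str]:
    # Split the text passed in as md_text on the page separator.
    text_splits = md_text.split('-----')
    for i in range(0, len(text_splits), pages_per_iteration):
        # Join consecutive pages into a single batch.
        page_batch = " ".join(text_splits[i:i + pages_per_iteration])
        if page_batch:
            yield page_batch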

for page_batch in page_range_iterator(md_text=markdown_text, pages_per_iteration=2):
    print(page_batch)
    break

_[Click here](https://www.maximintegrated.com/en/storefront/storefront.html)_ _for production status of specific part numbers._

# DS18B20 Programmable Resolution 1-Wire Digital Thermometer


**General Description**
The DS18B20 digital thermometer provides 9-bit to
12-bit Celsius temperature measurements and has an
alarm function with nonvolatile user-programmable upper
and lower trigger points. The DS18B20 communicates
over a 1-Wire bus that by definition requires only one
data line (and ground) for communication with a central
microprocessor. In addition, the DS18B20 can derive
power directly from the data line (“parasite power”),
eliminating the need for an external power supply.

Each DS18B20 has a unique 64-bit serial code, which
allows multiple DS18B20s to function on the same 1-Wire
bus. Thus, it is simple to use one microprocessor to
control many DS18B20s distributed over a large area.
Applications that can benefit from this feature include
HVAC environmental controls, temperat