Include import statements in extract code examples #1105
Greptile Overview
Summary
This PR enhances the documentation by adding missing import statements to all code examples in the extract documentation. The changes add `import { z } from 'zod';` to TypeScript examples and `from pydantic import BaseModel` (plus `HttpUrl` where needed) to Python examples. This makes the code examples complete and runnable out of the box, which is particularly helpful for developers who are new to these validation libraries.

The extract functionality in Stagehand relies on schema validation libraries, zod for TypeScript and pydantic for Python, to define the structure of data being extracted from web pages. Previously, the documentation showed usage of these libraries without the corresponding import statements, so users copying the examples would hit undefined-name errors for `z`, `BaseModel`, or `HttpUrl`. This change aligns with documentation best practices by providing self-contained, executable code snippets that don't assume prior knowledge of the required dependencies.
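For illustration, a self-contained Python schema of the kind described above might look like the following. This is a minimal sketch, not the text of `docs/basics/extract.mdx`: the `Article` model, its fields, and the sample data are hypothetical, and the call into Stagehand is deliberately left out. It only shows why the `pydantic` import line has to be present for the snippet to run.

```python
# Without this line, the class definition below fails with a NameError
# because BaseModel and HttpUrl are undefined.
from pydantic import BaseModel, HttpUrl


class Article(BaseModel):
    """Hypothetical schema for data extracted from a page."""

    title: str
    url: HttpUrl  # validated as a well-formed http(s) URL


# Hypothetical sample standing in for extracted data.
sample = {"title": "Example", "url": "https://example.com/post"}

# Constructing the model validates the fields against the schema.
article = Article(**sample)
print(article.title, str(article.url))
```

The same reasoning applies to the TypeScript examples, where `z` is undefined unless `import { z } from 'zod';` appears at the top of the snippet.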
Changed Files
| Filename | Score | Overview |
|---|---|---|
| docs/basics/extract.mdx | 5/5 | Added import statements to TypeScript and Python code examples to make them complete and runnable |
Confidence score: 5/5
- This PR is safe to merge with minimal risk
- Score reflects simple additive changes that improve documentation without modifying any functionality
- No files require special attention
Sequence Diagram
sequenceDiagram
participant User
participant StagehandPage as "Stagehand Page"
participant ExtractHandler as "Extract Handler"
participant DOM as "DOM/Browser"
participant LLM as "LLM Client"
User->>StagehandPage: "page.extract(instruction, schema)"
StagehandPage->>ExtractHandler: "Create extract handler instance"
ExtractHandler->>ExtractHandler: "Initialize with stagehand, logger, page"
alt Text Extraction Path
ExtractHandler->>DOM: "Wait for DOM to settle"
DOM-->>ExtractHandler: "DOM ready"
ExtractHandler->>DOM: "Store original DOM state"
ExtractHandler->>DOM: "Process DOM to create selector mapping"
DOM-->>ExtractHandler: "Selector mappings"
ExtractHandler->>DOM: "Collect text annotations with bounding boxes"
DOM-->>ExtractHandler: "Text annotations array"
ExtractHandler->>ExtractHandler: "Deduplicate annotations"
ExtractHandler->>DOM: "Restore original DOM state"
ExtractHandler->>LLM: "Send formatted text for extraction"
LLM-->>ExtractHandler: "Structured data response"
else DOM Extraction Path
ExtractHandler->>DOM: "Wait for DOM to settle"
DOM-->>ExtractHandler: "DOM ready"
ExtractHandler->>DOM: "Retrieve accessibility tree"
DOM-->>ExtractHandler: "Accessibility tree data"
ExtractHandler->>ExtractHandler: "Transform schema"
ExtractHandler->>LLM: "Send DOM data for extraction"
LLM-->>ExtractHandler: "Structured data response"
end
ExtractHandler->>ExtractHandler: "Validate response against schema"
ExtractHandler->>ExtractHandler: "Update metrics and log response"
ExtractHandler-->>StagehandPage: "Return extracted data"
StagehandPage-->>User: "Return structured data object"
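The "Validate response against schema" step in the diagram can be sketched with pydantic as follows. This is an illustration only, not Stagehand's handler code: the `Product` model, the `validate_response` helper, and the sample dictionaries are all hypothetical stand-ins for a schema and an LLM response.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Product(BaseModel):
    """Hypothetical schema passed alongside the extract instruction."""

    name: str
    price: float


def validate_response(raw: dict) -> Optional[Product]:
    """Return a validated Product, or None if the data does not fit the schema.

    Hypothetical helper: it stands in for the validation step in the diagram.
    """
    try:
        return Product(**raw)
    except ValidationError:
        return None


# A well-formed response: the numeric string is coerced to a float.
ok = validate_response({"name": "Widget", "price": "9.99"})

# A malformed response: the required "price" field is missing.
bad = validate_response({"name": "Widget"})

print(ok, bad)
```

Returning `None` on failure is one possible design; a caller could equally re-raise the `ValidationError` to surface which fields were missing or mistyped.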
1 file reviewed, 1 comment
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
PR to make clearer the dependencies for `extract` (for those who haven't used zod or pydantic before).