feat: adds web and llm types and arguments #23
Pull Request Overview
This PR adds comprehensive support for web scraping with LLM processing capabilities. It introduces new type definitions, arguments handling, and validation for both web scraping operations and LLM processing of the scraped content.
- Adds `WebArguments` and `LLMProcessorArguments` with validation and marshalling support
- Updates web job capabilities to require API keys instead of being always available
- Includes comprehensive test coverage for the new argument types
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| types/web.go | Defines web scraping request/result types and query enums |
| types/llm.go | Defines LLM processor request/result types for content processing |
| types/jobs.go | Updates web job capabilities and removes always-available web capabilities |
| args/web.go | Implements WebArguments with validation, defaults, and conversion methods |
| args/llm.go | Implements LLMProcessorArguments with validation, defaults, and conversion methods |
| args/web_test.go | Comprehensive test suite for WebArguments functionality |
| args/llm_test.go | Comprehensive test suite for LLMProcessorArguments functionality |
| args/unmarshaller.go | Updates interface definitions and web argument unmarshalling |
| args/unmarshaller_test.go | Updates tests to reflect new WebArguments type |
mcamou left a comment
Just want to restate the same comment I added to https://github.com/masa-finance/tee-indexer/pull/399: we're currently using web search in the tee-indexer E2E tests, since it doesn't require any tokens or API keys. Should we keep it around for internal use, or do you have any ideas how to get around that?
What
Adds types and arguments for web and LLM actor requests. Supports unmarshalling and plugs into the existing patterns used by other job types. Removes web as a basic capability: it now requires an Apify API key alongside an LLM provider key.
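As a rough illustration of the capability change, the sketch below advertises the web capability only when both keys are configured. The `Capability` names, the `Config` struct, and `DetectCapabilities` are hypothetical and do not come from `types/jobs.go`.

```go
package main

import "fmt"

// Capability is a hypothetical capability identifier.
type Capability string

const (
	// CapTelemetry stands in for any capability that needs no credentials (assumed name).
	CapTelemetry Capability = "telemetry"
	// CapWebScraper stands in for the web job capability (assumed name).
	CapWebScraper Capability = "web-scraper"
)

// Config holds the credentials relevant to this sketch (assumed field names).
type Config struct {
	ApifyAPIKey string
	LLMAPIKey   string
}

// DetectCapabilities returns the capabilities a worker would advertise.
// The web capability is included only when both keys are present.
func DetectCapabilities(cfg Config) []Capability {
	caps := []Capability{CapTelemetry}
	if cfg.ApifyAPIKey != "" && cfg.LLMAPIKey != "" {
		caps = append(caps, CapWebScraper)
	}
	return caps
}

func main() {
	fmt.Println(DetectCapabilities(Config{}))
	// prints: [telemetry]
	fmt.Println(DetectCapabilities(Config{ApifyAPIKey: "a", LLMAPIKey: "b"}))
	// prints: [telemetry web-scraper]
}
```

Under this model, an environment without keys (such as the E2E tests mentioned in the comment above) would no longer see the web capability.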
Why
We want to support web scraping with an LLM summary of keywords and topics for indexing. This PR sets up the types and arguments to support both of those actors.