grantdfoster
Collaborator

@grantdfoster grantdfoster commented Oct 13, 2025

High-Level Summary of tee-worker PR Changes

Major Architectural Changes

  1. TEE Types Migration - The most significant change is the complete migration and consolidation of the tee-types repository into tee-worker. This involved:

    • Moving all type definitions from the separate tee-types package into tee-worker/api/types/, with job argument structs under tee-worker/api/args/
    • Updating all import statements throughout the codebase to use local types instead of external tee-types
    • Consolidating job definitions, argument types, and result structures
  2. Job System Refactoring - Complete overhaul of the job handling system:

    • Introduced a new JobType system with proper validation
    • Added comprehensive job argument unmarshalling with type-specific validation
    • Implemented capability-based job routing and validation
    • Added support for multiple job types: LinkedIn, Reddit, TikTok, Twitter, Web scraping, and LLM operations
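A minimal sketch of how a JobType with validation and type-specific argument unmarshalling could fit together. This is not the code from the PR: the constant values, struct fields, JSON tags, the `maxResults` limit, and the `UnmarshalJobArguments` helper are all hypothetical names chosen for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// JobType is a string-backed type used instead of bare strings.
// The constant values below are hypothetical.
type JobType string

const (
	WebJob     JobType = "web"
	TwitterJob JobType = "twitter"
)

// Valid reports whether t is a job type known to this sketch.
func (t JobType) Valid() bool {
	switch t {
	case WebJob, TwitterJob:
		return true
	}
	return false
}

// JobArguments is implemented by every type-specific argument struct.
type JobArguments interface {
	Validate() error
}

// WebArguments is a hypothetical argument struct for web scraping.
type WebArguments struct {
	URL      string `json:"url"`
	MaxDepth int    `json:"max_depth,omitempty"`
}

// Validate checks the fields that JSON decoding alone cannot enforce.
func (a *WebArguments) Validate() error {
	if a.URL == "" {
		return errors.New("url is required")
	}
	if a.MaxDepth < 0 {
		return errors.New("max_depth must be non-negative")
	}
	return nil
}

// maxResults is an assumed upper bound, not a value taken from the PR.
const maxResults = 100

// TwitterArguments is a hypothetical argument struct for Twitter search.
type TwitterArguments struct {
	Query      string `json:"query"`
	MaxResults int    `json:"max_results,omitempty"`
}

// Validate applies a count guard on the requested number of results.
func (a *TwitterArguments) Validate() error {
	if a.Query == "" {
		return errors.New("query is required")
	}
	if a.MaxResults < 0 || a.MaxResults > maxResults {
		return fmt.Errorf("max_results must be between 0 and %d", maxResults)
	}
	return nil
}

// UnmarshalJobArguments picks the argument struct for the job type,
// decodes the raw JSON into it, and then validates the result.
func UnmarshalJobArguments(t JobType, raw json.RawMessage) (JobArguments, error) {
	var args JobArguments
	switch t {
	case WebJob:
		args = &WebArguments{}
	case TwitterJob:
		args = &TwitterArguments{}
	default:
		return nil, fmt.Errorf("unknown job type %q", t)
	}
	if err := json.Unmarshal(raw, args); err != nil {
		return nil, fmt.Errorf("unmarshal %s arguments: %w", t, err)
	}
	if err := args.Validate(); err != nil {
		return nil, fmt.Errorf("invalid %s arguments: %w", t, err)
	}
	return args, nil
}

func main() {
	args, err := UnmarshalJobArguments(WebJob, json.RawMessage(`{"url":"https://example.com","max_depth":2}`))
	fmt.Println(args, err)

	// A missing required field is rejected by Validate, not by the decoder.
	_, err = UnmarshalJobArguments(WebJob, json.RawMessage(`{"max_depth":2}`))
	fmt.Println(err)
}
```

Keeping validation on the argument struct means a new job type only needs a new struct and one extra case in the switch.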

New Features Added

  1. LinkedIn Integration - Full LinkedIn scraping capabilities:

    • Profile fetching with comprehensive data structures
    • Industry, experience, and seniority type definitions
    • LinkedIn-specific job arguments and validation
  2. Enhanced Scraping Support:

    • Reddit: Post and comment URL validation, search capabilities
    • TikTok: Search by query and trending functionality
    • Twitter: Enhanced search with profile results
    • Web: General web scraping with depth control
    • LLM: Language model integration capabilities
  3. Utility Functions - Added new utility packages:

    • Math utilities (Min, Max functions)
    • Set operations for capability management
    • Enhanced argument validation and unmarshalling
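As an illustration of how set operations can support capability management, here is a small generic set with a capability check built on top of it. It is a sketch under assumptions: the method names, the `Capability` type, the `MissingCapabilities` helper, and the capability strings other than `searchbyfullarchive` are hypothetical and may not match pkg/util/set.go.

```go
package main

import "fmt"

// Set is a minimal generic set backed by a map.
type Set[T comparable] map[T]struct{}

// NewSet builds a set from the given items, dropping duplicates.
func NewSet[T comparable](items ...T) Set[T] {
	s := make(Set[T], len(items))
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// Contains reports whether item is in the set.
func (s Set[T]) Contains(item T) bool {
	_, ok := s[item]
	return ok
}

// Union returns a new set with every element of s and other.
func (s Set[T]) Union(other Set[T]) Set[T] {
	out := make(Set[T], len(s)+len(other))
	for k := range s {
		out[k] = struct{}{}
	}
	for k := range other {
		out[k] = struct{}{}
	}
	return out
}

// Intersection returns a new set with the elements present in both sets.
func (s Set[T]) Intersection(other Set[T]) Set[T] {
	out := make(Set[T])
	for k := range s {
		if other.Contains(k) {
			out[k] = struct{}{}
		}
	}
	return out
}

// Difference returns a new set with the elements of s that are not in other.
func (s Set[T]) Difference(other Set[T]) Set[T] {
	out := make(Set[T])
	for k := range s {
		if !other.Contains(k) {
			out[k] = struct{}{}
		}
	}
	return out
}

// Capability names one operation a worker can perform (hypothetical type).
type Capability string

// MissingCapabilities returns the required capabilities the worker lacks.
func MissingCapabilities(required, available Set[Capability]) Set[Capability] {
	return required.Difference(available)
}

func main() {
	available := NewSet[Capability]("searchbyquery", "getprofilebyid")
	required := NewSet[Capability]("searchbyquery", "searchbyfullarchive")

	missing := MissingCapabilities(required, available)
	fmt.Println(len(missing), missing.Contains("searchbyfullarchive"))
}
```

A job can then be rejected early when `MissingCapabilities` returns a non-empty set.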

Code Quality Improvements

  1. Testing Infrastructure:

    • Comprehensive test suites for all new job types
    • Unit tests for utility functions
    • Integration tests for API endpoints
    • Updated GitHub Actions workflows
  2. Configuration and Development:

    • Added collaboration rules for development workflow
    • Updated IDE configuration files
    • Enhanced dependency management with Dependabot
    • Improved CI/CD pipeline with better test coverage

API and Client Changes

  1. Client Library Updates:

    • Updated HTTP client to use local types instead of external tee-types
    • Improved error handling and response processing
    • Enhanced job signature generation and validation
  2. Internal API Refactoring:

    • Updated route handlers to use new job type system
    • Improved capability detection and reporting
    • Enhanced job processing pipeline

Key Technical Details

  • 88 files changed with 5,168 additions and 923 deletions
  • Complete removal of external tee-types dependency
  • Introduction of type-safe job argument handling
  • Enhanced validation and error reporting throughout the system
  • Improved modularity with better separation of concerns

This PR represents a major consolidation effort that brings all TEE-related types and functionality under one repository while significantly expanding the scraping capabilities and improving the overall architecture of the system.

alvin-reyes and others added 30 commits May 13, 2025 14:33
feat: tiktok scraper common types
- Add LinkedInSearchArguments for search queries
- Add LinkedInProfileResult for profile data
- Support network filters and pagination
…#2)

- Rename LinkedInSearchArguments to LinkedInArguments for broader scope
- Add PublicIdentifier field to support individual profile fetching
- Maintain backward compatibility with deprecated type alias
- Field uses omitempty tag to preserve existing search functionality
- Enables dual-purpose struct for both search and profile operations
- Add comprehensive overview of package capabilities
- Document LinkedIn search and profile fetching usage examples
- Update structure section with detailed file descriptions
- Add backward compatibility information
- Include installation instructions with v1.0.0 tag
- Add release history section
fix: use jobtype instead of string
fix: remove searchbyfullarchive subtype from cred based capabilities
feat: refactored jobs to jobtype as key
grantdfoster and others added 15 commits September 11, 2025 18:27
feat: adds web and llm types and arguments
* fix: add max pages to llm args

* chore: fix test

* fix: rename to items

* chore: fix test

* fix: rename vars

* chore: relevant llm args to uint instead of int

* chore: temperature as a float

* chore: fix test

* chore: fix llm test

* fix: llm test

* fix: temperature test
* chore: add job arguments and job struct to types to prep for acceptance tests not needing ego imports

* chore: add count guard on twitter args

* chore: adds guards

* chore: adds var for max results

* fix: error message

* fix: allow unmarshalling on job types

* chore: adds unmarshal to job args
* fix: default model

* feat: support many models

* fix: be more idiomatic

* chore: add dynamic key

* chore: use util set

* chore: fix llm test

* chore: add key test
* feat: linkedin

* fix: validation tests

* chore: fix unmarshalling tests

* chore: adds basic linkedin type tests

* chore: cleanup type tests for linkedin

* chore: cleanup errors

* Update args/linkedin/profile/profile.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/industries/industries.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update types/linkedin/profile/profile.go

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* chore: fix spacing

* chore: omit empty

* chore: fix missing omit

* fix: profile marshalling

* chore: update unmarshalling for mode

* chore: update profile

* chore: fixes test

* fix: max

* fix: omitempty

* chore: cleanup

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Added args/, types/, and pkg/util/ directories from tee-types
- Preserved complete git history from tee-types
- Resolved merge conflicts in go.mod, .gitignore, README.md, and go.sum
- Removed external tee-types dependency from go.mod
- Successfully merged tee-types into tee-worker with preserved git history
- Moved args/ to api/args/ and types/ to api/types/
- Updated all import paths from external tee-types to local paths
- Removed external tee-types dependency from go.mod
- Removed tee-worker specific files (job.go, encrypted.go, key.go) from types package
- All merged packages (api/types, api/args, pkg/util) build successfully
- Ready for CI integration and final testing
- Merged job.go and jobs.go into a single jobs.go file
- Created api/tee package to isolate tee-specific functionality
- Moved GenerateJobSignature, SealJobResult, DecryptJob to api/tee package
- Moved EncryptedRequest and related types to api/tee package
- Updated routes.go to use the new api/tee package
- Types and args packages now build without ego dependency
- Clients can import types/args without needing ego library
- Updated pkg/client/http.go to import api/tee for EncryptedRequest
- Fixed teetypes.Job reference to use types.Job
- Updated internal/api/routes.go to use teejob.EncryptedRequest
- Resolved compilation errors from the consolidation
- Fixed duplicate imports (teetypes/teeargs aliases) across all internal files
- Updated all teetypes.* and teeargs.* references to use direct types.* and args.*
- Added missing imports to internal/apify/actors.go and internal/config/config.go
- Fixed reddit package path references
- All internal packages now build without import errors
@grantdfoster grantdfoster requested a review from Copilot October 13, 2025 20:00
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR migrates all type definitions from the external tee-types package into the local tee-worker/api/types/ directory, consolidating the codebase and eliminating the external dependency. The migration includes comprehensive job argument validation, capability-based job routing, and enhanced support for multiple scraping platforms.

Key changes:

  • Complete removal of external tee-types dependency and migration of all types to local api/ directory
  • Implementation of type-safe job argument unmarshalling with comprehensive validation
  • Addition of utility functions for mathematical operations and set operations

Reviewed Changes

Copilot reviewed 83 out of 89 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/util/set.go | Implements generic Set data structure with union, intersection, and difference operations |
| pkg/util/math.go | Adds Min and Max utility functions for ordered types |
| api/types/*.go | Consolidates all type definitions from external tee-types package |
| api/args/*.go | Implements type-safe job argument structures with validation |
| internal/jobs/*.go | Updates all job handlers to use local types instead of external tee-types |
| pkg/client/http.go | Updates HTTP client to use local types |
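For readers unfamiliar with generic helpers over ordered types, the following sketch shows what Min and Max functions of this kind typically look like. It assumes Go 1.21 or later for `cmp.Ordered`. The `Clamp` helper is hypothetical and is included only to show a count-guard style use; the actual pkg/util/math.go may differ.

```go
package main

import (
	"cmp"
	"fmt"
)

// Min returns the smaller of a and b for any ordered type.
func Min[T cmp.Ordered](a, b T) T {
	if a < b {
		return a
	}
	return b
}

// Max returns the larger of a and b for any ordered type.
func Max[T cmp.Ordered](a, b T) T {
	if a > b {
		return a
	}
	return b
}

// Clamp limits n to the range [lo, hi]. It is a hypothetical helper
// showing how Min and Max can guard a caller-supplied count.
func Clamp[T cmp.Ordered](n, lo, hi T) T {
	return Max(lo, Min(n, hi))
}

func main() {
	fmt.Println(Min(3, 5), Max("a", "b"), Clamp(500, 1, 100))
}
```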
Files not reviewed (4)
  • .idea/.gitignore: Language not supported
  • .idea/modules.xml: Language not supported
  • .idea/tee-types.iml: Language not supported
  • .idea/vcs.xml: Language not supported

grantdfoster and others added 2 commits October 13, 2025 22:01
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Collaborator

@rapidfix rapidfix left a comment

Oh yes!

@grantdfoster
Collaborator Author

I think there is one small thing to look at here: how the tee indexer imports the tee client. I'm not 100% sure it needs the ego import; investigating...

@grantdfoster
Collaborator Author

It was just this piece importing tee / ego, so I duplicated it in the client code:

type EncryptedRequest struct {
	EncryptedResult  string `json:"encrypted_result"`
	EncryptedRequest string `json:"encrypted_request"`
}

@rapidfix
Collaborator

It was just this piece importing tee / ego, so duplicated in client code

type EncryptedRequest struct {
	EncryptedResult  string `json:"encrypted_result"`
	EncryptedRequest string `json:"encrypted_request"`
}

Can't we move it out of the worker into a place that doesn't have ego imports?
