@mcamou mcamou commented Aug 21, 2025

What

Add jobs for scraping Reddit

Testing

Tested with curl on the devbox, using an Apify API key

@mcamou mcamou marked this pull request as ready for review August 21, 2025 17:43
@mudler mudler requested a review from Copilot August 22, 2025 07:28
Copilot AI left a comment

Pull Request Overview

This PR implements Reddit scraping capabilities using the Apify API, adding a new Reddit job type to the worker's capabilities.

Key changes:

  • Added Reddit scraper job type with support for URL scraping, post search, user search, and community search
  • Refactored Apify client interface to enable reuse across different scrapers
  • Updated type definitions and capabilities to support Reddit functionality
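The interface refactor described above can be pictured with a small sketch. It is not the PR's actual code: the method names (`RunActor`, `GetItems`), the actor ID, and the `RedditScraper` and `fakeApify` types are hypothetical. It only assumes what the summary states, namely that scrapers depend on an Apify interface and that pagination values are `uint`.

```go
package main

import "fmt"

// Apify is a hypothetical interface; the real one lives in
// pkg/client/apify_client.go and its methods may differ.
type Apify interface {
	RunActor(actorID string, input map[string]any) (runID string, err error)
	GetItems(runID string, offset, limit uint) ([]string, error)
}

// fakeApify is an in-memory stand-in, useful for tests.
type fakeApify struct{ items []string }

func (f *fakeApify) RunActor(actorID string, input map[string]any) (string, error) {
	return "run-1", nil
}

func (f *fakeApify) GetItems(runID string, offset, limit uint) ([]string, error) {
	total := uint(len(f.items))
	if offset >= total {
		return nil, nil
	}
	// Clamp the page size to what is left, avoiding uint overflow.
	if remaining := total - offset; limit > remaining {
		limit = remaining
	}
	return f.items[offset : offset+limit], nil
}

// RedditScraper depends only on the interface, so the same client
// can back the Twitter and Reddit scrapers alike.
type RedditScraper struct{ client Apify }

func (s *RedditScraper) ScrapeURLs(urls []string, limit uint) ([]string, error) {
	// "hypothetical/reddit-actor" is a placeholder, not a real actor ID.
	runID, err := s.client.RunActor("hypothetical/reddit-actor", map[string]any{"urls": urls})
	if err != nil {
		return nil, fmt.Errorf("starting reddit actor: %w", err)
	}
	return s.client.GetItems(runID, 0, limit)
}

func main() {
	s := &RedditScraper{client: &fakeApify{items: []string{"a", "b", "c"}}}
	items, err := s.ScrapeURLs([]string{"https://www.reddit.com/r/golang"}, 2)
	fmt.Println(items, err)
}
```

Depending on an interface rather than a concrete client also lets tests substitute a fake, as `fakeApify` does here.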

Reviewed Changes

Copilot reviewed 21 out of 23 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/client/apify_client.go | Added Apify interface and refactored types to use uint instead of int for pagination |
| internal/jobserver/worker.go | Added error logging improvements for job execution |
| internal/jobserver/jobserver_test.go | Added WorkerID field to test job structures |
| internal/jobserver/jobserver.go | Registered new Reddit scraper job type |
| internal/jobs/webscraper.go | Changed log levels from Info to Debug/Warn for reduced verbosity |
| internal/jobs/twitterapify/client.go | Updated to use new Apify interface and uint types |
| internal/jobs/twitter.go | Cleaned up unused function and updated import statements |
| internal/jobs/redditapify/client_test.go | Comprehensive tests for Reddit Apify client functionality |
| internal/jobs/redditapify/client.go | New Reddit Apify client implementation |
| internal/jobs/reddit_test.go | Unit tests for Reddit scraper job execution |
| internal/jobs/reddit.go | New Reddit scraper implementation |
| internal/capabilities/detector.go | Added Reddit capabilities detection when Apify key is available |
| internal/api/routes.go | Improved error message clarity |
| go.mod | Added temporary replace directive for tee-types dependency |
| api/types/reddit/reddit_test.go | Tests for Reddit response type marshalling/unmarshalling |
| api/types/reddit/reddit_suite_test.go | Test suite setup for Reddit types |
| api/types/reddit/reddit.go | Reddit data type definitions and JSON marshalling logic |
| api/types/job.go | Added Reddit configuration support and Job string method |
| api/types/encrypted.go | Enhanced error messages with proper error wrapping |
| README.md | Added documentation for Reddit job types and capabilities |
| Makefile | Added test target for Reddit functionality |

@rapidfix rapidfix left a comment

Changes looking good!

@rapidfix

Depends on gopher-lab/tee-types#15

@mcamou mcamou merged commit e375a45 into main Aug 22, 2025
5 checks passed
@mcamou mcamou deleted the reddit-scraper branch August 22, 2025 10:02