
Conversation

@michelle0927 michelle0927 commented Aug 28, 2025

The current version of the Scrape Single URL action isn't returning results. This PR updates the component to use the "Crawl a Single URL" example from Apify's documentation.

Summary by CodeRabbit

  • New Features
    • Single-URL scraping now fetches and returns the page’s HTML directly for easier consumption.
  • Refactor
    • Simplified configuration by removing the crawler type option; a single straightforward fetch is used.
    • Output format changed from a complex response to raw HTML content.
  • Documentation
    • Updated action description to clarify HTML output and added a reference to documentation.
  • Chores
    • Bumped component version and added a new dependency to support the updated scraping approach.

coderabbitai bot commented Aug 28, 2025

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including `**/dist/**` overrides the default block on the `dist` directory by removing the pattern from both lists.

You can disable this status message by setting `reviews.review_status` to `false` in the CodeRabbit configuration file.

Walkthrough

Replaced Apify actor-based scraping with a direct HTTP fetch via got-scraping in the scrape-single-url action. Updated action metadata (description, version, props) to reflect HTML output and simplified inputs. Incremented the package version and added got-scraping as a dependency.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Scrape action refactor**<br>`components/apify/actions/scrape-single-url/scrape-single-url.mjs` | Switched from `this.apify.runActor` to `gotScraping({ url })`; the return value is now the HTML body. Updated the description (now references HTML output and the docs) and the version (0.0.4 → 0.1.0). Removed the `crawlerType` prop; simplified the `url` prop (dropped the explicit optional flag). |
| **Package management updates**<br>`components/apify/package.json` | Bumped the package version 0.2.2 → 0.3.0. Added dependency `got-scraping@^4.1.2`. Existing dependency versions are unchanged. |
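The refactor above can be sketched as follows. This is a simplified, offline approximation, not the actual component source: `gotScraping` is stubbed here so the snippet runs without network access or the got-scraping package, and `scrapeSingleUrl` is a hypothetical stand-in for the action's `run()` method.

```javascript
// Sketch of the simplified action logic after the refactor.
// In the real component, gotScraping is imported from the
// got-scraping package; it is stubbed here so the example
// runs offline (an assumption for illustration only).
const gotScraping = async ({ url }) => ({
  body: `<html><body>Fetched ${url}</body></html>`,
});

// Replaces the old this.apify.runActor call: one direct fetch,
// returning the raw HTML body instead of an actor-run payload.
async function scrapeSingleUrl(url) {
  const response = await gotScraping({ url });
  return response.body;
}

scrapeSingleUrl("https://example.com").then((html) => {
  console.log(html); // the page HTML as a string
});
```

Callers that previously unpacked an actor-run response now receive a plain HTML string, which is the breaking behavioral change behind the version bumps.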

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as Caller
  participant A as scrape-single-url Action
  participant G as got-scraping
  participant W as Target Website

  Note over A: New flow: direct fetch via got-scraping
  U->>A: invoke({ url })
  A->>G: gotScraping({ url })
  G->>W: HTTP GET url
  W-->>G: HTML response
  G-->>A: { body: "<html>..." }
  A-->>U: HTML body

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitched my nose at the changing breeze,
Swapped actors for scraping with elegant ease.
One hop to the page, HTML in paw,
A simpler trail, no crawler to draw.
Version bumps made, dependencies tight—
Carrot-commit: crisp, clean, light. 🥕✨


vercel bot commented Aug 28, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

2 Skipped Deployments

  • pipedream-docs — Ignored (Aug 28, 2025 9:09pm UTC)
  • pipedream-docs-redirect-do-not-edit — Ignored (Aug 28, 2025 9:09pm UTC)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
components/apify/package.json (1)

17-18: Dependency addition is correct; confirm runtime compatibility

got-scraping requires Node ≥16. Ensure the Pipedream runtime supports Node ≥16, or optionally add an engines constraint:

 {
   "name": "@pipedream/apify",
   "version": "0.3.0",
   "description": "Pipedream Apify Components",
   "main": "apify.app.mjs",
+  "engines": {
+    "node": ">=16"
+  },
   "keywords": [

Run to verify got-scraping’s engine requirement:

npm view got-scraping@4.1.2 engines
# returns { node: '>=16' }
components/apify/actions/scrape-single-url/scrape-single-url.mjs (1)

7-8: Clarify description to reflect implementation (no Actor/proxy by default)

Recommend making it explicit that this uses got-scraping directly and does not invoke an Apify Actor or proxy unless configured.

-  description: "Executes a scraper on a specific website and returns its content as HTML. This action is perfect for extracting content from a single page. [See the documentation](https://docs.apify.com/sdk/js/docs/examples/crawl-single-url)",
+  description: "Fetches a single URL using got-scraping and returns the page HTML. Does not invoke an Apify Actor or use Apify Proxy by default. [See the documentation](https://docs.apify.com/sdk/js/docs/examples/crawl-single-url)",
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro


📥 Commits

Reviewing files that changed from the base of the PR and between 577bd0f and 84938b5.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (2)
  • components/apify/actions/scrape-single-url/scrape-single-url.mjs (1 hunks)
  • components/apify/package.json (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Verify TypeScript components
  • GitHub Check: Publish TypeScript components
  • GitHub Check: pnpm publish
  • GitHub Check: Lint Code Base
🔇 Additional comments (2)
components/apify/package.json (1)

3-3: Semver bump looks right given behavioral change

Returning plain HTML instead of an actor response is a breaking behavioral change at the action level; bumping the package to 0.3.0 is appropriate.

components/apify/actions/scrape-single-url/scrape-single-url.mjs (1)

2-2: Using got-scraping is appropriate for simple single-URL fetches

Import looks good and matches the dependency added in package.json.

@lcaresia lcaresia moved this from Ready for PR Review to Ready for QA in Component (Source and Action) Backlog Aug 28, 2025
@vunguyenhung vunguyenhung moved this from Ready for QA to In QA in Component (Source and Action) Backlog Aug 29, 2025
@vunguyenhung vunguyenhung moved this from In QA to Ready for Release in Component (Source and Action) Backlog Aug 29, 2025
@vunguyenhung

Hi everyone, all test cases passed! Ready for release!

Test report
https://vunguyenhung.notion.site/Apify-API-Key-update-scrape-single-url-25dbf548bb5e818891cbea4a8e2babd0

@michelle0927 michelle0927 merged commit 59b21cc into master Aug 29, 2025
10 checks passed
@michelle0927 michelle0927 deleted the issue-18167-2 branch August 29, 2025 14:30
@github-project-automation github-project-automation bot moved this from Ready for Release to Done in Component (Source and Action) Backlog Aug 29, 2025