Web scraping endpoints that use puppeteer and axios, which accept arguments via a POST request.

harrisoncramer/serverless-scraper-api

🚀 Serverless Webscraping Functions

Endpoints that do web scraping for you! Send a POST request with the options for a given endpoint, and the Lambda function will scrape the page and return the results of your query.

The functions are written in TypeScript and deployed with Serverless.

Installation

yarn install

Development

To spin up the development server locally, run yarn dev.

This is an alias for the serverless offline command which uses the serverless-offline plugin to spin up a development server with our lambdas at various endpoints.

The functions can then be hit with Postman, curl, or another HTTP client. For example, send a POST request to the /dev/getLinks endpoint with the following JSON data: { "url": "https://www.yourwebsite.com", "puppeteer": false, "limit": 7 }

This will return up to seven of the links found on the page at the given URL.
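The request above can also be sent from code. The following is a minimal sketch, not part of this repository. It assumes the local server listens on http://localhost:3000 (the serverless-offline default, which your configuration may override) and that Node 18+ provides a global fetch. The names GetLinksOptions, buildGetLinksRequest, and getLinks are hypothetical helpers for illustration. The response shape is not documented here, so it is returned untyped.

```typescript
// Hypothetical client sketch. Field names are taken from the example payload above.
interface GetLinksOptions {
  url: string;        // page to scrape
  puppeteer: boolean; // assumed to toggle puppeteer-based scraping
  limit: number;      // maximum number of links to return
}

interface PostRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the request without sending it, so it can be inspected or tested.
function buildGetLinksRequest(baseUrl: string, options: GetLinksOptions): PostRequest {
  return {
    // Strip trailing slashes so the path is joined with exactly one "/".
    url: `${baseUrl.replace(/\/+$/, "")}/dev/getLinks`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(options),
  };
}

// Send the request and return the parsed JSON body as-is.
async function getLinks(baseUrl: string, options: GetLinksOptions): Promise<unknown> {
  const { url, ...init } = buildGetLinksRequest(baseUrl, options);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`getLinks failed with status ${response.status}`);
  }
  return response.json();
}
```

With yarn dev running, calling getLinks("http://localhost:3000", { url: "https://www.yourwebsite.com", puppeteer: false, limit: 7 }) and logging the resolved value should show whatever the endpoint returns.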

Deployment

yarn deploy:prod

This command sets NODE_ENV to production, which loads the correct environment variables, and then deploys the functions to the cloud on the prod stage (configuration lives in the serverless.ts file).

You can also deploy to the dev stage with the yarn deploy:dev command, which targets the development stage in Serverless.
