chrismaresca/aws-youtube-transcript-handler

Serverless Python + AWS Lambda HTTP Post To Fetch YouTube Transcript

An AWS Lambda function to fetch a YouTube transcript.

Configure Serverless

  1. Install the serverless-python-requirements plugin if you don't have it already.
pnpm install serverless-python-requirements
  2. Install the serverless-plugin-resource-tagging plugin if you don't have it already.
pnpm install serverless-plugin-resource-tagging
  3. Install the serverless-functions-base-path plugin if you don't have it already.
pnpm install serverless-functions-base-path
  4. Set the following environment variables. Everything except the OpenAI API key is already set in the .env.sample file. To use this file, rename it to .env with the following command:
mv .env.sample .env

For a brief explanation of each environment variable, see below:

  • SERVICE_NAME: The name of the service (default: youtube-transcript-service-v1)

  • LOG_LEVEL: Logging level (default: DEBUG).

  • USE_PROXY: Use a proxy to fetch the transcript (default: false).

  • PROXY_USERNAME: The username for the proxy (default: none).

  • PROXY_PASSWORD: The password for the proxy (default: none).
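As a rough illustration of how these variables might be consumed inside the function, the sketch below reads each one and falls back to the defaults listed above. The helper name load_settings and the returned dictionary keys are hypothetical and are not taken from this repository; the set of strings treated as "true" is also an assumption.

```python
import os


def load_settings(env=None):
    """Read the documented variables, falling back to the README defaults.

    `load_settings` and the dictionary keys are hypothetical names used
    only for illustration; the repository may structure this differently.
    """
    env = os.environ if env is None else env
    return {
        "service_name": env.get("SERVICE_NAME", "youtube-transcript-service-v1"),
        "log_level": env.get("LOG_LEVEL", "DEBUG").upper(),
        # Assumption: "1", "true" and "yes" (any case) enable the proxy.
        "use_proxy": env.get("USE_PROXY", "false").strip().lower() in ("1", "true", "yes"),
        # Empty or missing credentials are normalised to None.
        "proxy_username": env.get("PROXY_USERNAME") or None,
        "proxy_password": env.get("PROXY_PASSWORD") or None,
    }


if __name__ == "__main__":
    print(load_settings())
```

Passing a plain dictionary instead of os.environ makes the helper easy to exercise locally without touching the real environment.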

  1. Update serverless.yml to include the serverless-python-requirements plugin.

  2. Update serverless.yml to include the serverless-plugin-resource-tagging plugin.

  3. Update serverless.yml to include the serverless-functions-base-path plugin.

  4. Update the 'service' value in serverless.yml to your desired service name.

  5. Logging is configured in src/utils/logger.py to output to stdout. Lambda forwards stdout to CloudWatch Logs, so this should work on AWS without extra setup.
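For readers who want to see what a stdout logger of this kind can look like, here is a minimal sketch using only the standard library. It is not the contents of src/utils/logger.py; the function name build_logger, the logger name, and the log format are assumptions made for illustration.

```python
import logging
import os
import sys


def build_logger(name="transcript-handler", level=None):
    """Return a logger that writes to stdout at the LOG_LEVEL severity.

    Hypothetical helper: the name, default logger name, and format string
    are illustrative and not taken from the repository.
    """
    level_name = (level or os.environ.get("LOG_LEVEL", "DEBUG")).upper()
    level_value = logging.getLevelName(level_name)
    if not isinstance(level_value, int):
        # Unknown level names fall back to the documented default, DEBUG.
        level_value = logging.DEBUG

    logger = logging.getLogger(name)
    logger.setLevel(level_value)

    # Attach the handler only once so warm Lambda invocations that call
    # this function again do not produce duplicate log lines.
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
        )
        logger.addHandler(handler)

    # Avoid double output through any handler installed on the root logger.
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger().info("logger ready")
```

The handler check matters on Lambda because module state can persist between invocations of a warm container.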

Deploy Serverless

serverless deploy

About

YouTube Transcript Scraper w/ Proxy Setup
