# aws-glue

Here are 197 public repositories matching this topic...

This ETL pipeline uses AWS cloud services to transform YouTube video statistics data. The data is downloaded from Kaggle, uploaded to an S3 bucket, and cataloged with AWS Glue for querying with Athena. AWS Lambda and Glue convert it to Parquet format and store it in a cleansed S3 bucket. AWS QuickSight then visualizes the materialised data.

  • Updated May 30, 2024
  • Python
aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

  • Updated May 28, 2024
  • Python

This AWS-based data pipeline manages data from storage in S3 data lakes, through transformation with AWS Glue and Lambda, to refined storage in separate S3 repositories. Using Athena for SQL querying and QuickSight for interactive dashboards, this solution optimizes data processing and visualization, facilitating informed decision-making and insigh

  • Updated May 27, 2024
  • Python
