
aws-glue

Here are 129 public repositories matching this topic...

aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).

  • Updated Feb 18, 2025
  • Python

The Smart City Data Streaming Pipeline processes real-time data from IoT devices using Apache Kafka for ingestion and Apache Spark for processing. Data is stored in AWS S3 and analyzed with Glue, Athena, and Redshift. It enhances traffic management, predictive analytics, and urban planning, making cities smarter and more efficient.

  • Updated Feb 9, 2025
  • Python

This project delivers an end-to-end data pipeline that uses a comprehensive ETL process to move BIM data from an Autodesk Revit model of the Tencent Global Headquarters into cloud storage for processing and analytics. The pipeline leverages tools and services such as Apache Airflow, Amazon S3, AWS Glue, and Amazon Redshift.

  • Updated Jan 27, 2025
  • Python

This project creates a scalable data pipeline to analyze YouTube data from Kaggle using AWS services: S3, Glue, Lambda, Athena, and QuickSight. It processes raw JSON and CSV files into cleansed, partitioned datasets, integrates them with ETL workflows, and catalogs data for querying. Final insights are visualized in QuickSight dashboards.

  • Updated Jan 25, 2025
  • Python
