Companion repository for the "Streamlining AWS Glue CI/CD — A Comprehensive Blueprint" blog post
Updated Jun 1, 2024 - HCL
Cloud-based AI / ML workflow and data application development framework
A modern data marketplace that makes collaboration among diverse users (such as business users, analysts, and engineers) easier, increasing efficiency and agility in data projects on AWS.
Process DynamoDB change streams via AWS Glue with Iceberg to keep a copy of a collection in S3 up to date
Apache Hudi examples designed to be run on AWS Glue via Glue jobs
Leveraging AWS Cloud Services, an ETL pipeline transforms YouTube video statistics data. Data is downloaded from Kaggle, uploaded to an S3 bucket, and cataloged using AWS Glue for querying with Athena. AWS Lambda and Glue convert the data to Parquet format and store it in a cleansed S3 bucket. AWS QuickSight then visualizes the materialised data.
Hackolade plugin for AWS Glue Data Catalog
AWS Comprehend is an event-driven, serverless data processing pipeline that leverages AWS services to perform natural language processing and analysis on user-submitted text files.
pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (Parquet, CSV, JSON and EXCEL).
This AWS-based data pipeline manages data from storage in S3 data lakes, through transformation with AWS Glue and Lambda, to refined storage in separate S3 repositories. Using Athena for SQL querying and QuickSight for interactive dashboards, this solution optimizes data processing and visualization, facilitating informed decision-making and insigh
Demo code to illustrate the execution of PyTest unit test cases for AWS Glue jobs in AWS CodePipeline using AWS CodeBuild projects
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.
End to End Data Engineering Projects
Sample code to collect Apache Iceberg metrics for table monitoring
Project in which we automate the collection, exploration, optimization, and visualization of data, as well as the training of Machine Learning models, using Amazon Web Services (AWS)