Infrastructure: The projects herein simplify the repeated use of a variety of frameworks, and cloud services & platforms.

miscellane/infrastructure

A simple, illustrative AWS Data Pipeline example. It runs a containerized software package, i.e., an instance of a container image that is pulled from Docker Hub. A version of the pipeline's definition code outlines the function of each `ShellCommandActivity` node; each node runs one of the augmentation pipeline scripts.



Infrastructure


In brief, the projects herein aim to simplify the repeated use of a variety of (a) frameworks, (b) cloud services, and (c) cloud platforms. The notes and examples are for reference purposes; the notes will be updated continuously.


Cloud Services & Platforms

These projects focus on templates and small software packages that ease the use of cloud services & platforms.


Frameworks

A wide variety of frameworks is used for data science, statistics, data engineering, and machine learning engineering, e.g., Apache Spark and Apache Hive.

Each framework project below is akin to a suite of programs that can be used for any data project that uses the framework, which saves time and reduces or eliminates some repetitive steps.

For example, the hive project simplifies the process of projecting a structure onto a repository of data files via Apache Hive; in effect, it parameterises the steps that are repeated per data project.
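As a rough illustration of what parameterising such steps can look like, the sketch below builds a Hive `CREATE EXTERNAL TABLE` statement from a handful of per-project values. It is not taken from the hive project: `TableSpec`, `external_table_ddl`, the field names, and the example bucket path are all hypothetical, and the delimited text-file layout is an assumption.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class TableSpec:
    """Hypothetical per-project parameters (not the hive project's own names)."""
    database: str
    table: str
    columns: Sequence[Tuple[str, str]]  # (column name, Hive type) pairs
    location: str                       # directory that holds the data files
    delimiter: str = ","                # assumed: delimited text files


def external_table_ddl(spec: TableSpec) -> str:
    """Build a CREATE EXTERNAL TABLE statement that maps a schema onto existing files."""
    column_lines = ",\n".join(
        f"  `{name}` {hive_type}" for name, hive_type in spec.columns
    )
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{spec.database}`.`{spec.table}` (\n"
        f"{column_lines}\n"
        f")\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{spec.delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{spec.location}'"
    )


# Example values are placeholders for illustration only.
spec = TableSpec(
    database="demo",
    table="trips",
    columns=(("trip_id", "STRING"), ("duration", "INT")),
    location="s3://example-bucket/trips/",
)
ddl = external_table_ddl(spec)
print(ddl)
```

Only the `TableSpec` values change from one data project to the next; the statement template stays the same, which is the kind of repetition the paragraph above describes.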

Each project's notes are within that project's section.
