@wholebuzz/mapreduce-example

Example project for @wholebuzz/mapreduce

This project illustrates running a custom Mapper and Reducer in various scheduling scenarios:

airflow: Various Apache Airflow operators and examples.
docker: Examples of running the tasks in local Docker containers.
helm: Example of running on Kubernetes with Helm.
gce: Example of starting directly on Google Compute Engine.
local: Examples of running the tasks locally.

As example tasks, we'll:

Count the words in this README.md
Count the words in the title property of records in the test dataset
Sort the test dataset by date, guid and id
Join two separate datasets, each sorted (and sharded) by guid
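
The word-count tasks above follow the usual map/reduce shape: a map step emits a (word, 1) pair per word and a reduce step sums the pairs for each word. The following is a minimal in-memory sketch of that shape only. It does not use the @wholebuzz/mapreduce API, and the names mapWords, reduceCounts and countWords are hypothetical, chosen for illustration.

```typescript
// Hypothetical helper names for illustration; not the @wholebuzz/mapreduce API.
type WordCount = [string, number]

// Map step: emit a (word, 1) pair for each word in a piece of text.
// Words are lowercased and split on anything that is not a-z, 0-9 or '.
function mapWords(text: string): WordCount[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9']+/)
    .filter((word) => word.length > 0)
    .map((word): WordCount => [word, 1])
}

// Reduce step: sum the counts emitted for each word.
// Object.create(null) avoids collisions with inherited keys such as "constructor".
function reduceCounts(pairs: WordCount[]): Record<string, number> {
  const totals: Record<string, number> = Object.create(null)
  for (const [word, count] of pairs) {
    totals[word] = (totals[word] || 0) + count
  }
  return totals
}

// Count words across several inputs, e.g. README lines or record titles.
function countWords(inputs: string[]): Record<string, number> {
  const pairs: WordCount[] = []
  for (const input of inputs) {
    for (const pair of mapWords(input)) pairs.push(pair)
  }
  return reduceCounts(pairs)
}
```

For the title task, the inputs would be the props.title string of each record rather than lines of text. In a distributed run the pairs would be partitioned by word between the two steps, which this sketch skips.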

The supplied test dataset consists of 10,000 headline records of the form:

{
  "id": 28,
  "date": "2021-10-17T08:52:05.000Z",
  "guid": "https://metro.co.uk/?p=15435904",
  "link": "https://metro.co.uk/2021/10/17/mum-of-six-creates-hallway-tribute-to-her-kids-for-less-than-80-15435904/",
  "feed": "https://metro.co.uk/feed/",
  "props": {
    "title": "Mum-of-six creates sweet hallway tribute to her kids – for less than £80",
    "summary": "'It proves you don’t need to spend lots of money to transform your home.'",
    "imageUrl": "https://wholenews-images.storage.googleapis.com/261d2f38dfd584f5e83130fe504934fb.png"
  },
  "tags": {
    "topic": "PARENTS",
    "locale": "en-GB",
    "topics": [
      "Interiors",
      "DIY"
    ],
    "category": "national",
    "classifiedCategory": "national"
  }
}
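
To make the sort and join tasks concrete for records of this shape, here is a rough in-memory sketch. The Headline interface is inferred from the single sample record above, so which fields are optional is an assumption. compareBy and joinByGuid are hypothetical helpers, not part of @wholebuzz/mapreduce. Because "by date, guid and id" could mean one composite key or three separate sorts, the comparator accepts any list of keys.

```typescript
// Shape inferred from the sample record; optional fields are an assumption.
interface Headline {
  id: number
  date: string // ISO 8601 UTC timestamp
  guid: string
  link?: string
  feed?: string
  props?: { title?: string; summary?: string; imageUrl?: string }
  tags?: { [key: string]: string | string[] }
}

type SortKey = 'date' | 'guid' | 'id'

function compareStrings(x: string, y: string): number {
  return x < y ? -1 : x > y ? 1 : 0
}

// Same-format ISO 8601 UTC strings order chronologically as plain strings.
const comparators: Record<SortKey, (a: Headline, b: Headline) => number> = {
  date: (a, b) => compareStrings(a.date, b.date),
  guid: (a, b) => compareStrings(a.guid, b.guid),
  id: (a, b) => a.id - b.id,
}

// Build a comparator that applies the given keys in order, first difference wins.
function compareBy(keys: SortKey[]): (a: Headline, b: Headline) => number {
  return (a, b) => {
    for (const key of keys) {
      const order = comparators[key](a, b)
      if (order !== 0) return order
    }
    return 0
  }
}

// Merge join of two inputs that are already sorted by guid.
// Assumes guid is unique within each input; unmatched records are dropped.
function joinByGuid<L extends { guid: string }, R extends { guid: string }>(
  left: L[],
  right: R[]
): Array<[L, R]> {
  const joined: Array<[L, R]> = []
  let i = 0
  let j = 0
  while (i < left.length && j < right.length) {
    if (left[i].guid < right[j].guid) i++
    else if (left[i].guid > right[j].guid) j++
    else joined.push([left[i++], right[j++]])
  }
  return joined
}
```

A call such as records.slice().sort(compareBy(['date', 'guid', 'id'])) sorts a copy by the composite key, while compareBy(['guid']) gives the ordering that joinByGuid expects. The merge join reads each input once, which is why both sides need to be sorted by guid beforehand.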
