Skip to content
master
Switch branches/tags
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
 
 
src
 
 
 
 
 
 
 
 

A short project to demonstrate how Scala's types can make type signatures more useful when working with Beam pipelines.

Our story looks a little like this:

As a ...
I want to know how many times each user accesses each URL and the status code they received
So that ...

Our log lines look like this:

1.2.3.4,bob,2017-01-01T00:00:00.001Z,/,200

Use your IDE's support to show the type signature at different points in the pipeline.

Without type aliases:

sc.textFile(args("input"))
  // SCollection[String]
  .map(AccessLog.parseLine)
  // SCollection[Entry]
  .map(x => (x.userId, x.path, x.statusCode))
  // SCollection[(String, String, Int)]
  .countByValue
  // SCollection[((String, String, Int), Long)]

With type aliases:

sc.textFile(args("input"))
  // SCollection[String]
  .map(AccessLog.parseLine)
  // SCollection[Entry]
  .map(x => (x.userId, x.path, x.statusCode))
  // SCollection[(UserId, Path, StatusCode)]
  .countByValue
  // SCollection[((UserId, Path, StatusCode), Long)]

About

Example of using Scala types with Apache Beam pipelines

Resources

Releases

No releases published

Packages

No packages published

Languages