brabster/beam-scala-types-example

A short project demonstrating how Scala's type aliases can make type signatures more informative when working with Beam pipelines.

Our story looks a little like this:

As a ...
I want to know how many times each user accesses each URL and the status code they received
So that ...

Our log lines look like this:

1.2.3.4,bob,2017-01-01T00:00:00.001Z,/,200
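As a rough illustration of how a line in this format could be turned into an `Entry`, here is a minimal sketch. It is not the repository's `AccessLog` implementation: the object name `AccessLogSketch`, the field names `clientIp` and `timestamp`, and the use of `java.time.Instant` are assumptions made for the example. Only `userId`, `path` and `statusCode` appear in the pipeline code.

```scala
import java.time.Instant

// Hypothetical stand-in for the repository's AccessLog object.
object AccessLogSketch {
  final case class Entry(
    clientIp: String,   // assumed field name
    userId: String,
    timestamp: Instant, // assumed field name and type
    path: String,
    statusCode: Int
  )

  // Splits one comma-separated log line into an Entry.
  // Throws on malformed input: wrong field count, bad timestamp or bad status code.
  def parseLine(line: String): Entry =
    line.split(",", -1) match {
      case Array(ip, user, ts, path, status) =>
        Entry(ip, user, Instant.parse(ts), path, status.toInt)
      case _ =>
        throw new IllegalArgumentException(s"Malformed log line: $line")
    }
}
```

A real pipeline would probably want to route malformed lines somewhere rather than throw, but that is outside the scope of this example.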

Use your IDE's type inspection to show the inferred type at different points in the pipeline. The comments in the snippets below show the type after each step.

Without type aliases:

sc.textFile(args("input"))
  // SCollection[String]
  .map(AccessLog.parseLine)
  // SCollection[Entry]
  .map(x => (x.userId, x.path, x.statusCode))
  // SCollection[(String, String, Int)]
  .countByValue
  // SCollection[((String, String, Int), Long)]

With type aliases:

sc.textFile(args("input"))
  // SCollection[String]
  .map(AccessLog.parseLine)
  // SCollection[Entry]
  .map(x => (x.userId, x.path, x.statusCode))
  // SCollection[(UserId, Path, StatusCode)]
  .countByValue
  // SCollection[((UserId, Path, StatusCode), Long)]
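The difference between the two signatures above comes down to how the fields of `Entry` are declared. The sketch below shows one way the aliases might look. The object name `AliasSketch`, the exact alias definitions and the reduced three-field `Entry` are assumptions for illustration, and the local `countByValue` helper runs on plain Scala collections rather than on a pipeline.

```scala
// Hypothetical sketch; the repository's actual declarations may differ.
object AliasSketch {
  // Aliases give the underlying types domain-specific names.
  type UserId = String
  type Path = String
  type StatusCode = Int

  // Declaring fields with the aliases is what lets tooling report them.
  final case class Entry(userId: UserId, path: Path, statusCode: StatusCode)

  // Local stand-in for the pipeline's countByValue step:
  // counts how often each (user, path, status) combination occurs.
  def countByValue(entries: Seq[Entry]): Map[(UserId, Path, StatusCode), Long] =
    entries
      .map(e => (e.userId, e.path, e.statusCode))
      .groupBy(identity)
      .map { case (key, hits) => key -> hits.size.toLong }
}
```

Note that a type alias is interchangeable with the type it names, so the compiler will not complain if a `Path` is passed where a `UserId` is expected. The benefit here is readability of signatures, not extra type safety.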
