denisdefreyne/d-stream

D★Stream

D★Stream is a set of extensions for writing stream-processing code in Ruby.

CAUTION: D★Stream is a work in progress and of pre-alpha quality.

Examples

Example 1: straightforward

The following example takes a sequence of events for a given ticket, and calculates the history for that ticket, using slowly changing dimensions:

events =
  Enumerator.new do |y|
    y << { id: 40562348, at: Time.now - 400, status: 'new' }
    y << { id: 40564682, at: Time.now - 300, assignee_id: 2 }
    y << { id: 40565795, at: Time.now - 250, priority: 'high' }
    y << { id: 40569932, at: Time.now - 100, status: 'solved' }
  end.lazy

S = DStream

history_builder =
  S.compose(
    # calculate new state
    S.scan({}, &:merge),

    # add `version`
    S.zip(1..),
    S.map { |(e, i)| e.merge(version: i) },

    # remove `id`
    S.map { |e| e.except(:id) },

    # add `valid_to` and `valid_from`, and remove `at`
    S.with_next,
    S.map { |(a, b)| a.merge(valid_to: b ? b.fetch(:at) : nil) },
    S.map { |e| e.merge(valid_from: e.fetch(:at)) },
    S.map { |e| e.except(:at) },

    # add `row_is_current`
    S.with_next,
    S.map { |(a, b)| a.merge(row_is_current: b.nil?) },
  )

history = history_builder.call(events)
history.each { |e| p e }

The output resembles the following (reformatted here for readability):

{
  :status=>"new",
  :valid_from=>2017-05-05 20:18:14 +0200,
  :valid_to=>2017-05-05 20:19:54 +0200,
  :version=>1,
  :row_is_current=>false
}
{
  :status=>"new",
  :assignee_id=>2,
  :valid_from=>2017-05-05 20:19:54 +0200,
  :valid_to=>2017-05-05 20:20:44 +0200,
  :version=>2,
  :row_is_current=>false
}
{
  :status=>"new",
  :assignee_id=>2,
  :priority=>"high",
  :valid_from=>2017-05-05 20:20:44 +0200,
  :valid_to=>2017-05-05 20:23:14 +0200,
  :version=>3,
  :row_is_current=>false
}
{
  :status=>"solved",
  :assignee_id=>2,
  :priority=>"high",
  :valid_from=>2017-05-05 20:23:14 +0200,
  :valid_to=>nil,
  :version=>4,
  :row_is_current=>true
}
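To make clear what each stage of the pipeline computes, here is a rough plain-Ruby sketch of the same transformation that does not use D★Stream at all. It is eager rather than lazy, it uses fixed timestamps (the hypothetical `t0`) instead of `Time.now` so the result is reproducible, and the name `plain_history` is introduced purely for illustration.

```ruby
# Plain-Ruby sketch of the history calculation, without DStream.
# Assumption: fixed timestamps replace Time.now for reproducibility.
t0 = Time.utc(2017, 5, 5, 18, 0, 0)

events = [
  { id: 40562348, at: t0,       status: 'new' },
  { id: 40564682, at: t0 + 100, assignee_id: 2 },
  { id: 40565795, at: t0 + 150, priority: 'high' },
  { id: 40569932, at: t0 + 300, status: 'solved' },
]

# Equivalent of scan: running merge of all events seen so far
state = {}
states = events.map { |e| state = state.merge(e) }

# Add `version` (starting at 1) and drop `id`
versioned = states.each_with_index.map do |e, i|
  e.merge(version: i + 1).reject { |k, _| k == :id }
end

# Equivalent of with_next: pair each row with its successor (nil for the last)
pairs = versioned.zip(versioned.drop(1))

# Add `valid_from`, `valid_to` and `row_is_current`, then drop `at`
plain_history = pairs.map do |(a, b)|
  a.merge(
    valid_from: a.fetch(:at),
    valid_to: b ? b.fetch(:at) : nil,
    row_is_current: b.nil?,
  ).reject { |k, _| k == :at }
end

plain_history.each { |row| p row }
```

Unlike this sketch, the stream version works on a lazy enumerator, so it does not need to hold the full list of events in memory.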

Example 2: better factored

This example is functionally identical to the one above, but splits the pipeline into smaller named processors, each built with S.compose where needed, so that the final processor, history_builder, is easier to understand.

events =
  Enumerator.new do |y|
    y << { id: 40562348, at: Time.now - 400, status: 'new' }
    y << { id: 40564682, at: Time.now - 300, assignee_id: 2 }
    y << { id: 40565795, at: Time.now - 250, priority: 'high' }
    y << { id: 40569932, at: Time.now - 100, status: 'solved' }
  end.lazy

S = DStream

merge =
  S.scan({}, &:merge)

add_version =
  S.compose(
    S.zip(1..),
    S.map { |(e,i)| e.merge(version: i) },
  )

remove_id =
  S.map { |e| e.except(:id) }

add_valid_dates =
  S.compose(
    S.with_next,
    S.map { |(a,b)| a.merge(valid_to: b ? b.fetch(:at) : nil) },
    S.map { |e| e.merge(valid_from: e.fetch(:at)) },
    S.map { |e| e.except(:at) },
  )

add_row_is_current =
  S.compose(
    S.with_next,
    S.map { |(a,b)| a.merge(row_is_current: b.nil?) },
  )

history_builder =
  S.compose(
    merge,
    add_version,
    remove_id,
    add_valid_dates,
    add_row_is_current,
  )

history = history_builder.call(events)
history.each { |h| p h }

API

The following functions create individual processors:

  • map(&block) (similar to Enumerable#map)

    S.map(&:odd?).call(1..5).to_a
    # => [true, false, true, false, true]
  • select(&block) (similar to Enumerable#select)

    S.select(&:odd?).call(1..5).to_a
    # => [1, 3, 5]
  • reduce(&block) (similar to Enumerable#reduce)

    S.reduce(&:+).call(1..5)
    # => 15
  • take(n) (similar to Enumerable#take)

    S.take(3).call(1..10).to_a
    # => [1, 2, 3]
  • zip(other) (similar to Enumerable#zip):

    S.zip(10..13).call(1..3).to_a
    # => [[1, 10], [2, 11], [3, 12]]
  • buffer(size) yields each stream element, but keeps an internal buffer of not-yet-yielded stream elements. This is useful when reading from a slow and bursty data source, such as a paginated HTTP API.

  • with_next yields an array containing the stream element and the next stream element, or nil when the end of the stream is reached:

    S.with_next.call(1..5).to_a
    # => [[1, 2], [2, 3], [3, 4], [4, 5], [5, nil]]
  • scan(init, &block) is similar to reduce, but rather than returning a single aggregated value, returns all intermediate aggregated values:

    S.scan(0, &:+).call(1..5).to_a
    # => [1, 3, 6, 10, 15]
  • flatten2 yields the stream element itself if it is not an array; otherwise it yields each of the array’s elements in turn:

    S.compose(S.with_next, S.flatten2).call(1..5).to_a
    # => [1, 2, 2, 3, 3, 4, 4, 5, 5, nil]
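
As an illustration of what processors such as with_next and scan do, the following sketch models them as lambdas over lazy enumerators in plain Ruby. The helper names `with_next_processor` and `scan_processor` are hypothetical, and this is not D★Stream's actual implementation; it only reproduces the behaviour described in the list above.

```ruby
# Hypothetical stand-ins for S.with_next and S.scan (not D★Stream's source).
# Each helper returns a callable that takes a stream and returns a lazy enumerator.

def with_next_processor
  lambda do |stream|
    Enumerator.new do |y|
      prev = nil
      seen_any = false
      stream.each do |e|
        # Emit the previous element paired with the current one
        y << [prev, e] if seen_any
        prev = e
        seen_any = true
      end
      # The last element has no successor, so pair it with nil
      y << [prev, nil] if seen_any
    end.lazy
  end
end

def scan_processor(init, &block)
  lambda do |stream|
    Enumerator.new do |y|
      acc = init
      stream.each do |e|
        # Fold the element into the accumulator and emit the intermediate value
        acc = block.call(acc, e)
        y << acc
      end
    end.lazy
  end
end

p with_next_processor.call(1..5).to_a
# => [[1, 2], [2, 3], [3, 4], [4, 5], [5, nil]]

p scan_processor(0, &:+).call(1..5).to_a
# => [1, 3, 6, 10, 15]
```

Because both helpers return lazy enumerators, they can also be applied to unbounded streams as long as the consumer only takes a finite number of elements.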

To apply a processor to a stream, use #call:

S = DStream

stream = ['hi', 'hello']

S.map(&:upcase).call(stream).to_a
# => ["HI", "HELLO"]

To combine multiple processors into a single processor, use .compose:

S = DStream

stream = ['hi', 'hello']

processor = S.compose(
  S.map(&:upcase),
  S.map(&:reverse),
)

processor.call(stream).to_a
# => ["IH", "OLLEH"]
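
Conceptually, composing processors amounts to feeding the output of each processor into the next, from left to right. The sketch below shows that idea in plain Ruby; `compose_processors`, `upcase` and `reverse` are hypothetical names used for illustration, and the code makes no claim about how .compose is implemented in D★Stream.

```ruby
# Hypothetical sketch of left-to-right processor composition (not D★Stream's source).
# A "processor" here is any object that responds to #call and maps a stream to a stream.

def compose_processors(*processors)
  lambda do |stream|
    # Thread the stream through each processor in order
    processors.reduce(stream) { |s, processor| processor.call(s) }
  end
end

upcase  = ->(stream) { stream.lazy.map(&:upcase) }
reverse = ->(stream) { stream.lazy.map(&:reverse) }

shout_backwards = compose_processors(upcase, reverse)

p shout_backwards.call(['hi', 'hello']).to_a
# => ["IH", "OLLEH"]
```

Since the composed result is itself a callable with the same shape as its parts, it can be passed to another composition, which is what makes the factoring in Example 2 possible.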
