Naming Stages #48

@isaacsanders

Hey all,

On my team, we have been using Flow and GenStage for the past 9 months or more. We use them to process high-volume streams of data from a number of different sources throughout the day. Ideally, the system never shuts down. Currently, it suffers OOM crashes and slowdowns caused by backed-up message queues.

We consistently hit an issue where a GenStage process slows its processing and begins to accumulate memory. The GenStages that we have written ourselves are all named, but the GenStages generated by our use of Flow are nameless. As a result, we have to guess where the problems are, and I think that is a problem.
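To illustrate the kind of guesswork involved, here is a minimal sketch of how one might list the processes with the longest message queues from an IEx session. The module name `QueueInspector` and the function `top_queues/1` are hypothetical helpers written for this example; only `Process.list/0` and `Process.info/2` come from the standard library. Unregistered processes, such as the stages Flow starts today, show up as `:unnamed`.

```elixir
defmodule QueueInspector do
  # Hypothetical helper for illustration only; not part of Flow or GenStage.
  # Returns up to `limit` processes, ordered by message queue length (longest first).
  def top_queues(limit \\ 5) do
    Process.list()
    |> Enum.map(fn pid ->
      {pid, Process.info(pid, [:registered_name, :message_queue_len, :memory])}
    end)
    # Process.info/2 returns nil if the process exited in the meantime.
    |> Enum.reject(fn {_pid, info} -> is_nil(info) end)
    |> Enum.sort_by(fn {_pid, info} -> info[:message_queue_len] end, :desc)
    |> Enum.take(limit)
    |> Enum.map(fn {pid, info} ->
      # An unregistered process reports [] as its registered name.
      name =
        case info[:registered_name] do
          [] -> :unnamed
          registered -> registered
        end

      %{pid: pid, name: name, queue: info[:message_queue_len], memory: info[:memory]}
    end)
  end
end
```

Calling `QueueInspector.top_queues(10)` on a struggling node gives a list of maps. When the offending entries are all `:unnamed`, the only way to tell them apart is by pid, which is the problem described above.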

I would like to be able to provide a name prefix, have each "stage" of the Flow (map, flat_map, filter, etc.) append its own segment to that prefix, and then have each GenStage within the "stage" append a further suffix, just enough to avoid collisions, like _#{i}, similar to the partitions in Registry.

An example:

input_stage
|> Flow.from_stage(window: initial_window)
|> Flow.flat_map(&processing_function/1, name: __MODULE__.Flow.ProcessingFunction)
|> Flow.partition(window: post_processing_window)
|> Flow.filter(&filtering_function/1, name: __MODULE__.Flow.FilteringFunction)
|> Flow.into_stages([output_stage], name: __MODULE__.Flow)

This will generate GenStage processes like:

__MODULE__.Flow.ProcessingFunction.FlatMap._0
__MODULE__.Flow.ProcessingFunction.FlatMap._1
__MODULE__.Flow.ProcessingFunction.FlatMap._2
__MODULE__.Flow.ProcessingFunction.FlatMap._3
__MODULE__.Flow.FilteringFunction.Filter._0
__MODULE__.Flow.FilteringFunction.Filter._1
__MODULE__.Flow.FilteringFunction.Filter._2
__MODULE__.Flow.FilteringFunction.Filter._3
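The naming scheme above could be sketched roughly as follows. This is not Flow's implementation; `StageNaming` and `stage_names/3` are hypothetical names made up for this example, and `MyApp.Flow.ProcessingFunction` stands in for the `__MODULE__.Flow.ProcessingFunction` prefix. It only shows how a prefix, an operation segment, and an index suffix could be combined into non-colliding atoms using `Module.concat/1`.

```elixir
defmodule StageNaming do
  # Hypothetical helper for illustration only; not part of Flow or GenStage.
  # Builds `count` names of the form Prefix.Operation._0, Prefix.Operation._1, ...
  def stage_names(prefix, operation, count)
      when is_atom(prefix) and is_atom(operation) and is_integer(count) and count > 0 do
    for i <- 0..(count - 1) do
      # Module.concat/1 accepts a mix of atoms and binaries.
      Module.concat([prefix, operation, "_#{i}"])
    end
  end
end

# Four names, one per GenStage in the flat_map "stage" of the example above.
names = StageNaming.stage_names(MyApp.Flow.ProcessingFunction, FlatMap, 4)
Enum.each(names, fn name -> IO.puts(Atom.to_string(name)) end)
```

Because the number of stages is bounded, the number of atoms created this way is bounded as well. Each resulting atom could then be passed as the `:name` option when the corresponding GenStage is started, assuming Flow exposed a way to do so.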

A feature like this, one that lets us reliably determine the line(s) of code where a process originated, would take a lot of the guesswork out of debugging and performance tuning for my team, and would let us use tools like :observer and WombatOAM more effectively.

If this is of interest to the maintainers, I would be happy to implement this feature, with your supervision & advice.
