numa container in a "blackhole" accumulator devours memory #356

@tmenjo

Describe the bug

(numaflow-python: v0.13.0)

I made a blackhole accumulator based on the example streamsorter. It takes two input streams like the original streamsorter but emits no output stream.
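To make the shape of the handler concrete, here is a minimal stdlib-only sketch of what "takes two input streams but emits nothing" means. It deliberately does not use the pynumaflow API: `Datum`, `blackhole_handler`, and `two_streams` are hypothetical stand-ins, and the `asyncio.Queue` only models an output channel that is never written to.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Datum:
    # Hypothetical stand-in for the SDK's datum type, not the real class.
    key: str
    value: bytes


async def blackhole_handler(datums: AsyncIterator[Datum], output: asyncio.Queue) -> int:
    """Read every datum and drop it; nothing is ever put on `output`."""
    consumed = 0
    async for datum in datums:
        consumed += len(datum.value)  # touch the payload, then discard it
    return consumed


async def two_streams() -> AsyncIterator[Datum]:
    # Hypothetical interleaving of two input streams, three data each.
    for _ in range(3):
        yield Datum("stream-a", b"x" * 10)
        yield Datum("stream-b", b"y" * 10)


async def main() -> tuple:
    output: asyncio.Queue = asyncio.Queue()
    consumed = await blackhole_handler(two_streams(), output)
    return consumed, output.qsize()


consumed, emitted = asyncio.run(main())
print(f"bytes consumed: {consumed}, messages emitted: {emitted}")
```

The point of the sketch is only that the handler itself keeps no reference to any datum, so I would not expect the growth to come from the user code.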

I found that my blackhole accumulator devoured memory. Grafana showed that the memory consumption of the numa container in the accumulator vertex never reached a ceiling; it kept increasing for at least 2 hours.

This behavior does not occur with the original streamsorter. I built it myself, ran the example pipeline with it, and found that the memory consumption reached a ceiling.

I want to know whether this is an actual issue or a mistake in how I am using the accumulator. Is there any constraint such as "every input datum to an accumulator must be output" or "the number of output data from an accumulator must equal the number of input data"? I think this matters when we make a "multiplexer" accumulator, which drops the data from one of the two inputs, or a "cross-join" accumulator, which puts two different input data together into one output datum.
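To illustrate the two cases I have in mind, here is a small sketch in plain Python, independent of the SDK. The function names `multiplex` and `pair_up` and the `(stream, payload)` tuples are hypothetical; the only point is that both produce fewer outputs than inputs.

```python
from collections import deque
from typing import Iterable, List, Tuple

Item = Tuple[str, int]  # (stream name, payload); hypothetical shape


def multiplex(items: Iterable[Item], keep: str) -> List[int]:
    """'Multiplexer': forward only one stream, drop the other entirely."""
    return [payload for stream, payload in items if stream == keep]


def pair_up(items: Iterable[Item]) -> List[Tuple[int, int]]:
    """'Cross-join': combine one datum from 'a' and one from 'b' into one output."""
    pending = {"a": deque(), "b": deque()}
    out = []
    for stream, payload in items:
        other = "b" if stream == "a" else "a"
        if pending[other]:
            partner = pending[other].popleft()
            # Always emit in (a, b) order regardless of arrival order.
            out.append((partner, payload) if other == "a" else (payload, partner))
        else:
            pending[stream].append(payload)
    return out


inputs = [("a", 1), ("b", 10), ("a", 2), ("b", 20)]
print(multiplex(inputs, keep="a"))  # 4 inputs, 2 outputs
print(pair_up(inputs))              # 4 inputs, 2 outputs
```

In both cases half of the input data never appear as standalone outputs, which is the situation I am asking about.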

numaproj/numaflow#3262 may be a related issue.

To Reproduce

  1. Build the image (original streamsorter or modified blackhole) and push it to a registry.
  2. Deploy the example pipeline with the built image.
  3. Start sending 2MB data chunks to the two HTTP sources repeatedly.
  4. Watch the memory consumption of the accumulator vertex.
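Step 3 can be sketched roughly as follows. This is not the shell script from my branch but an equivalent Python sketch. The two URLs, the 2 MiB reading of "2MB", the one-second interval, and the disabled certificate check (assuming a self-signed certificate on the HTTP source) are all assumptions to adjust for the actual deployment.

```python
import ssl
import time
import urllib.request
from typing import Iterable, Optional

CHUNK_SIZE = 2 * 1024 * 1024  # assumption: "2MB" taken as 2 MiB

# Hypothetical service URLs for the two HTTP source vertices.
SOURCE_URLS = [
    "https://localhost:8443/vertices/http-one",
    "https://localhost:8444/vertices/http-two",
]


def make_chunk(size: int = CHUNK_SIZE) -> bytes:
    """Build a dummy payload of the requested size."""
    return b"\0" * size


def build_request(url: str, payload: bytes) -> urllib.request.Request:
    """Prepare a POST request carrying the payload as the body."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


def send_repeatedly(urls: Iterable[str], interval_s: float = 1.0,
                    rounds: Optional[int] = None) -> None:
    """POST one chunk to every URL per round; loop forever if rounds is None."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # assumption: self-signed certificate
    ctx.verify_mode = ssl.CERT_NONE
    payload = make_chunk()
    urls = list(urls)
    done = 0
    while rounds is None or done < rounds:
        for url in urls:
            with urllib.request.urlopen(build_request(url, payload), context=ctx) as resp:
                resp.read()
        done += 1
        time.sleep(interval_s)
```

Calling `send_repeatedly(SOURCE_URLS)` runs until interrupted; passing `rounds=10` stops after ten rounds.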

I have a branch, blackhole-v0.13.0, for reproduction. Please build the original streamsorter image from tmenjo@cd06382 and the blackhole image from tmenjo@00cd516, fixing the build configuration if needed. The branch also contains a simple shell script for reproduction.

Expected behavior

The memory consumption of the accumulator vertex reaches a ceiling.
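One rough way to state "reaches a ceiling" precisely is a check like the one below. The helper `has_plateaued`, its window and tolerance, and the sample values are all hypothetical illustrations, not measurements from my cluster.

```python
from typing import Sequence


def has_plateaued(samples: Sequence[float], window: int = 5,
                  tolerance: float = 0.02) -> bool:
    """True if the last `window` samples differ by at most `tolerance` of their max."""
    if len(samples) < window:
        return False
    tail = samples[-window:]
    hi, lo = max(tail), min(tail)
    if hi == 0:
        return True  # all-zero tail is trivially flat
    return (hi - lo) / hi <= tolerance


# Made-up memory readings in MiB, for illustration only.
growing = [100, 200, 300, 400, 500, 600]
levelled = [100, 300, 500, 510, 512, 511, 512, 511]
print(has_plateaued(growing), has_plateaued(levelled))
```

With this definition, the expected behavior is that the check eventually becomes true for the accumulator vertex and stays true.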

Screenshots

blackhole

The memory consumption of the numa container in the blackhole accumulator vertex continued to increase for at least 2 hours:

Image

streamsorter

The memory consumption reached a ceiling within 20 minutes.

Image

Environment

  • Kubernetes: v1.35.2
  • Numaflow: v1.7.1
  • Numaflow-Python: v0.13.0

Labels: bug
