hubble observe fails with "requested data has been overwritten and is no longer available" #17036

Closed
michi-covalent opened this issue Aug 3, 2021 · 0 comments · Fixed by #17046
Labels
kind/bug This is a bug in the Cilium logic. sig/hubble Impacts hubble server or relay

Comments

@michi-covalent
Contributor

cilium version: v1.10.3

i'm hitting this error fairly consistently on nodes with moderate traffic:

% kubectl exec -it -n kube-system ds/cilium -- hubble observe --all
requested data has been overwritten and is no longer available
command terminated with exit code 1

ref: there is a draft PR to re-implement the ring buffer: #14304

@michi-covalent michi-covalent added kind/bug This is a bug in the Cilium logic. sig/hubble Impacts hubble server or relay labels Aug 3, 2021
michi-covalent added a commit to michi-covalent/cilium that referenced this issue Aug 4, 2021
Currently GetFlows() fails with the following error when a position in
the ring buffer being read by Ring.read() has been overwritten:

    requested data has been overwritten and is no longer available

This turned out to be impractical as it makes it difficult to read all
the flows in the ring buffer (e.g., hubble observe --all). GetFlows()
would fail if Hubble observes a single flow between the reader rewinding
to the oldest position and retrieving the entry.

This patch modifies Ring.read() so that GetFlows() returns LostEvent
instead of stopping with an error. The caller of GetFlows() can then
decide how to handle LostEvent.

Note that this makes the behavior of Ring.read() consistent with that
of Ring.readFrom() used in the follow mode. It generates LostEvent and
continues following instead of failing with ErrInvalidRead.

Fixes: cilium#17036

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
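
The race described in the commit message, and the effect of returning a lost-event marker instead of an error, can be illustrated with a toy ring buffer. This is only a sketch under stated assumptions: `Ring`, `Event`, `Drain`, and `ErrNotYetWritten` are hypothetical names made up for this example and are not Cilium's actual types or API. The real `Ring.read()` and `LostEvent` in Hubble differ in detail.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical stand-in for a Hubble event. Lost marks a
// placeholder returned when the requested slot has been overwritten.
type Event struct {
	Seq  uint64
	Lost bool
}

// ErrNotYetWritten is a hypothetical error for positions ahead of the writer.
var ErrNotYetWritten = errors.New("position has not been written yet")

// Ring is a toy fixed-size ring buffer (not Cilium's implementation).
type Ring struct {
	buf  []Event
	next uint64 // sequence number of the next write
}

func NewRing(size int) *Ring { return &Ring{buf: make([]Event, size)} }

// Write stores a new event, overwriting the oldest slot once the ring is full.
func (r *Ring) Write() {
	r.buf[r.next%uint64(len(r.buf))] = Event{Seq: r.next}
	r.next++
}

// Oldest returns the oldest sequence number still held in the ring.
func (r *Ring) Oldest() uint64 {
	n := uint64(len(r.buf))
	if r.next <= n {
		return 0
	}
	return r.next - n
}

// Read returns the event at seq. If that position has already been
// overwritten, it returns a lost-event marker instead of failing.
func (r *Ring) Read(seq uint64) (Event, error) {
	if seq >= r.next {
		return Event{}, ErrNotYetWritten
	}
	if seq < r.Oldest() {
		return Event{Seq: seq, Lost: true}, nil
	}
	return r.buf[seq%uint64(len(r.buf))], nil
}

// Drain reads from start up to the writer position. The caller decides
// what to do with lost events; here they are only counted.
func Drain(r *Ring, start uint64) (events []Event, lost int) {
	for seq := start; ; seq++ {
		ev, err := r.Read(seq)
		if err != nil {
			return events, lost
		}
		if ev.Lost {
			lost++
			continue
		}
		events = append(events, ev)
	}
}

func main() {
	r := NewRing(4)
	for i := 0; i < 4; i++ {
		r.Write()
	}
	start := r.Oldest() // the reader rewinds to the oldest position
	r.Write()           // a single new flow overwrites that position
	events, lost := Drain(r, start)
	fmt.Printf("read %d events, %d lost\n", len(events), lost)
}
```

In this sketch, a reader that treated the overwritten position as a fatal error would return nothing at all, whereas the marker-based read still delivers the remaining entries and lets the caller report how many were lost.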
ti-mo pushed a commit that referenced this issue Aug 18, 2021
ti-mo pushed a commit to ti-mo/cilium that referenced this issue Aug 18, 2021
[ upstream commit 98b5fb3 ]

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
tklauser pushed a commit that referenced this issue Aug 19, 2021
[ upstream commit 98b5fb3 ]

Signed-off-by: Michi Mutsuzaki <michi@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
krishgobinath pushed a commit to krishgobinath/cilium that referenced this issue Oct 20, 2021