Prom2: "no token found" on scrape #2751

Closed
mwitkow opened this Issue May 22, 2017 · 14 comments

Contributor

mwitkow commented May 22, 2017

What did you do?
Set up monitoring side by side with Prom2 and Prom1, scraping the same targets.

What did you expect to see?
The same data.

What did you see instead? Under which circumstances?
Some scrape targets report "no token found":

2017-05-22T08:14:07.683853000Z time="2017-05-22T08:14:07Z" level=error msg="append failed" err="no token found" source="scrape.go:483" 

Contributor Author

mwitkow commented May 22, 2017

The bad output is in here:

controller.badtoken.tar.gz

I emailed @fabxc an output from another private job that we have that has a very similar problem.

Member

fabxc commented May 22, 2017

Interesting. That one doesn't have the trailing commas fixed in #2752

Member

fabxc commented May 22, 2017

Okay, so I cannot even use that input in a .go file for testing. The reason is that your node_collector_evictions_number has a NUL byte in the zone label.

Given it's a valid Unicode character, we probably have to support it, but I think that's still not the intended behavior of your exporter.
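
To illustrate the point about NUL being a valid character, here is a minimal Go sketch using only the standard library. The zone value is reconstructed from the output quoted later in this thread, and the helper name hasNUL is hypothetical. The \x00 escape is used because the Go toolchain rejects a raw NUL byte in source text, which is presumably why the input could not be pasted into a .go file as-is.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// hasNUL reports whether s contains a NUL byte (U+0000).
// Hypothetical helper, not part of Prometheus.
func hasNUL(s string) bool {
	return strings.IndexByte(s, 0) >= 0
}

func main() {
	// The \x00 escape stands in for the raw NUL byte found in the scrape output.
	zone := "europe-west1:\x00:europe-west1-c"

	// NUL is a valid one-byte UTF-8 sequence, so validation alone does not flag it.
	fmt.Println("valid UTF-8:", utf8.ValidString(zone))
	fmt.Println("contains NUL:", hasNUL(zone))

	// %q prints the value with the NUL shown as an escape instead of an invisible byte.
	fmt.Printf("%q\n", zone)
}
```

A UTF-8 validity check therefore cannot be used to reject such a label value; the NUL has to be looked for explicitly.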

Contributor Author

mwitkow commented May 22, 2017

@fabxc that one is actually a Kubernetes controller

Member

fabxc commented May 22, 2017

An upstream one?

Member

fabxc commented May 22, 2017

@beorn7 @brian-brazil opinions?

FWIW, we want to support Unicode so people can use all the emojis they want (that works in pkg/textparse)... but NUL is fundamentally useless IMO, and anything actually exposing it is probably not handling an internal edge case correctly, like here.

Contributor Author

mwitkow commented May 22, 2017

@fabxc yup, unmodified 1.5.x series Kubernetes controller.

Member

fabxc commented May 22, 2017

Argh... how can this stuff keep happening? Is the label value assembled somewhere internally in k8s, though, or passed through via your setup configuration somehow?

Contributor Author

mwitkow commented May 22, 2017

So the problem is visible when you open controller.badtoken in Vim, where the NUL byte shows up as ^@:

# HELP node_collector_evictions_number Number of Node evictions that happened since current instance of NodeController started.
# TYPE node_collector_evictions_number counter
node_collector_evictions_number{zone="europe-west1:^@:europe-west1-c"} 4
# HELP node_collector_unhealty_nodes_in_zone Gauge measuring number of not Ready Nodes per zones.
# TYPE node_collector_unhealty_nodes_in_zone gauge
node_collector_unhealty_nodes_in_zone{zone="europe-west1:^@:europe-west1-c"} 0
# HELP node_collector_zone_health Gauge measuring percentage of healty nodes per zone.
# TYPE node_collector_zone_health gauge
node_collector_zone_health{zone="europe-west1:^@:europe-west1-c"} 100
# HELP node_collector_zone_size Gauge measuring number of registered Nodes per zones.
# TYPE node_collector_zone_size gauge
node_collector_zone_size{zone="europe-west1:^@:europe-west1-c"} 19
# HELP node_controller_rate_limiter_use A metric measuring the saturation of the rate limiter for node_controller

I'll double check how we have that configured in our Kubernetes config and what version it is.
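
For anyone wanting to check a scrape body for the same problem without opening it in an editor, the following is a rough Go sketch that reports which lines contain a NUL byte. The function name nulLines and the shortened sample are assumptions for illustration; the sample lines mirror the output above, with \x00 in place of the byte that Vim renders as ^@.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"strings"
)

// nulLines returns the 1-based numbers of the lines in text that contain a NUL byte.
// Hypothetical helper for diagnosing an exposition payload.
func nulLines(text string) []int {
	var hits []int
	sc := bufio.NewScanner(strings.NewReader(text))
	for n := 1; sc.Scan(); n++ {
		if bytes.IndexByte(sc.Bytes(), 0) >= 0 {
			hits = append(hits, n)
		}
	}
	return hits
}

func main() {
	// Shortened sample modeled on the controller output quoted in this thread.
	sample := "# TYPE node_collector_evictions_number counter\n" +
		"node_collector_evictions_number{zone=\"europe-west1:\x00:europe-west1-c\"} 4\n" +
		"# TYPE node_collector_zone_size gauge\n" +
		"node_collector_zone_size{zone=\"europe-west1:\x00:europe-west1-c\"} 19\n"

	fmt.Println("lines containing NUL:", nulLines(sample))
}
```

In the sample, only the two metric lines carry the NUL; the comment lines are clean.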

joshpmcghee commented May 22, 2017

We're running 1.5.3.

Member

beorn7 commented May 22, 2017

I think we should keep supporting all valid UTF-8. While the NUL character is visually not as appealing as an emoji, it is common enough to slip in now and then. If we restricted ourselves to "all UTF-8 except NUL", we would require every metrics producer to keep this special case in mind and handle it.
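
To make that cost concrete, this is roughly the kind of special case every producer would have to carry under an "all UTF-8 except NUL" rule. It is a sketch only: stripNUL is a hypothetical helper, not something taken from Kubernetes or any Prometheus client library.

```go
package main

import (
	"fmt"
	"strings"
)

// stripNUL removes NUL bytes from a label value before it is exposed.
// Hypothetical producer-side workaround, shown only to illustrate the burden.
func stripNUL(v string) string {
	return strings.ReplaceAll(v, "\x00", "")
}

func main() {
	zone := "europe-west1:\x00:europe-west1-c"
	fmt.Printf("%q\n", stripNUL(zone))
}
```

Accepting NUL on the server side avoids having to replicate a helper like this in every exporter.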

Member

brian-brazil commented May 22, 2017

I've seen NUL in string responses from SNMP devices. I agree with @beorn7.

Member

fabxc commented Jun 12, 2017

Fixed

fabxc closed this Jun 12, 2017

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 23, 2019
