Join GitHub today
GitHub is home to over 31 million developers working together to host and review code, manage projects, and build software together.
Sign upMemory issue(?) with remote_write on ARMv7 with #2666
Comments
This comment has been minimized.
This comment has been minimized.
|
Just FYI, if I comment out the
No crash as far as I can tell. |
This comment has been minimized.
This comment has been minimized.
|
It doesn't look like a I suspected an alignment issue like in elastic/beats#3273, but the field we atomically increment is already the first in its struct, so it should be aligned properly: https://github.com/prometheus/prometheus/blob/master/storage/remote/ewma.go#L24 From https://golang.org/pkg/sync/atomic/: "On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a global variable or in an allocated struct or slice can be relied upon to be 64-bit aligned." So now I don't know if this is a Go bug or what... I don't have an ARM laying around. I wonder if you could try compiling the following Go program for your rpi and see if it crashes? |
This comment has been minimized.
This comment has been minimized.
|
It doesn't seem like it crashes from what I can tell:
|
This comment has been minimized.
This comment has been minimized.
|
@vpetersson Thanks! Strange. Did you also use Go 1.8.1 for this, which was the Go version from which your broken Prometheus was built? Maybe I should get a Raspberry Pi :) @discordianfish Can you reproduce this on yours, perhaps? |
This comment has been minimized.
This comment has been minimized.
|
@juliusv No, this was not 1.8.1. I simply installed the
|
This comment has been minimized.
This comment has been minimized.
|
Ah. Since I'm suspecting a Go bug, it would be super helpful if someone could try it out with Go 1.8.1. |
This comment has been minimized.
This comment has been minimized.
|
@juliusv Pulled 1.8.1 from upstream and ran it again. Same result:
|
This comment has been minimized.
This comment has been minimized.
|
Thanks! Hm ok. Anyone want to donate an rpi to me? :) Or rather, looks like someone who has one has to dig deeper into this... |
This comment has been minimized.
This comment has been minimized.
|
I can send you one. I've followed you on Twitter. Follow me back and DM me your address. |
This comment has been minimized.
This comment has been minimized.
|
You can find the address in
https://github.com/prometheus/prometheus/blob/master/MAINTAINERS.md
…On Wed, May 3, 2017 at 11:50 AM Viktor Petersson ***@***.***> wrote:
I can send you one. I've followed you on Twitter. Follow me back and DM me
your address.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#2666 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAANaJigiDQAqmxasv2wMM5-GuasWMk0ks5r2E3KgaJpZM4NL5_Q>
.
|
This comment has been minimized.
This comment has been minimized.
|
@grobie It's kinda hard to FedEx a parcel to an email address. ;) |
This comment has been minimized.
This comment has been minimized.
|
Wooops, sorry for the noise. The message before was collapsed in my client
;-)
…On Wed, May 3, 2017 at 12:07 PM Viktor Petersson ***@***.***> wrote:
@grobie <https://github.com/grobie> It's kinda hard to FedEx a parcel to
an email address. ;)
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2666 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAANaERXEI46nGZTXVKlAb9UJ3gnjIcEks5r2FHIgaJpZM4NL5_Q>
.
|
This comment has been minimized.
This comment has been minimized.
|
@vpetersson What a scheme to get followers! Hehe. But this is awesome, thanks! Done. |
This comment has been minimized.
This comment has been minimized.
|
@juliusv Probably the most expensive scheme ever invented. :) |
This comment has been minimized.
This comment has been minimized.
|
@juliusv it could be that the embedded Go seems to offer very few guarantees re: alignment - https://golang.org/ref/spec#Size_and_alignment_guarantees @SuperQ was running the remote-write code on a Pi at KubeCon, but that may have been a Pi3 (64bit) or with an earlier version that didn't include the dynamic sharding code. I have a Pi3 here so can give this a quick go. |
This comment has been minimized.
This comment has been minimized.
|
@tomwilkie Yeah I've had it running in the past too. Please note that this was running ResinOS (which is basically running Docker on the Pi). Not sure if that changes anything. |
This comment has been minimized.
This comment has been minimized.
|
I was running on a pine64 . I have a pi3 and pi1 I can test on. |
This comment has been minimized.
This comment has been minimized.
|
Oh man, even the unit test reproduce the issue:
Bring on the ARM CI... |
This comment has been minimized.
This comment has been minimized.
|
Okay, its embedded struct alignment:
|
This comment has been minimized.
This comment has been minimized.
|
And it seems like multiple embedded structs are tightly-packed:
TBH I don't see a want of guaranteeing alignment that isn't super fragile! |
This comment has been minimized.
This comment has been minimized.
|
Wat? Ah finally context :) I just now got the notification.. If you still have something to test, let me know. Otherwise there is also scaleway. |
This comment has been minimized.
This comment has been minimized.
|
thanks @discordianfish; I've got a pi at home and tested the above code. TLDR; atomicly-accessed int64s must be 64-bit aligned. Can't see a way to guarantee this in go that isn't super fragile. Some more reading: |
This comment has been minimized.
This comment has been minimized.
|
@vpetersson mind testing out the 266-alignment-on-arm branch? Its work for me. |
This comment has been minimized.
This comment has been minimized.
|
@tomwilkie I guess you mean this branch in your fork: https://github.com/tomwilkie/prometheus/tree/2666-alignment-on-arm Ok, so it's an alignment problem after all. Interesting that the statement about first words in structs being aligned from the We also use atomic accesses in other places, such as for counter increments in the Prometheus client library. So it's probably pure luck that those places aren't crashing on ARM right now. We did have the same problem with x86 32-bit earlier, but solved that by having the atomically accessed variable as the first member in the struct (which doesn't seem to help this time). |
This comment has been minimized.
This comment has been minimized.
You need to be careful about how to parse the phrase, as what it means is "The first word ... in an allocated struct ... can be relied upon to be 64-bit aligned." Embedded structs are not allocated structs. |
This comment has been minimized.
This comment has been minimized.
|
@tomwilkie Man, that's a gotcha. Thanks. How about just making them pointers to independently allocated structs then (with a big fat comment that it should stay this way)? Saves us all the extra padding fields in tomwilkie@f1806c3. AFAIU there is no guarantee about those padding fields creating alignment either? |
This comment has been minimized.
This comment has been minimized.
|
Sure, happy with that. |
juliusv
closed this
in
#2675
May 3, 2017
This comment has been minimized.
This comment has been minimized.
|
I can confirm that the problem is still present in 1.6.2:
|
juliusv
reopened this
May 14, 2017
This comment has been minimized.
This comment has been minimized.
|
@tomwilkie Could you check the latest release version (1.6.2) on your rpi 3 again? I wonder if there is a difference between your rpi and @vpetersson's, but at least they both seem to be an rpi 3. I doubt that the exact OS should make a difference, as it's an architectural memory access issue that should be independent of the OS? |
This comment has been minimized.
This comment has been minimized.
|
I think the issue is that the fix[1] didn't make it onto the 1.6 branch[2]. Since its been merged already, should I make a cherry pick for 1.6.3? [1] 3141a6b |
This comment has been minimized.
This comment has been minimized.
|
I think there were plans to release 1.7 soon, are we sure we'll have a 1.6.3 release? We need to get better on marking changes to be included in the next patch release in general. |
This comment has been minimized.
This comment has been minimized.
|
This fix should be out now. |
brian-brazil
closed this
Jul 14, 2017
This comment has been minimized.
This comment has been minimized.
|
I can confirm that this is now working. Thanks, guys! |
marcan
referenced this issue
Sep 4, 2017
Closed
SIGSEGV on startup (armv7, prometheus-2.0.0-beta.2) #3136
This comment has been minimized.
This comment has been minimized.
lock
bot
commented
Mar 23, 2019
|
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
vpetersson commentedApr 28, 2017
(Somewhat of a follow-up on #2663)
What did you do?
I'm trying to send logs remotely using
remote_writeon ARMv7.What did you expect to see?
I expected Prometheus to send a payload to the URL specified in
remote_write.What did you see instead? Under which circumstances?
All. It works fine if I remove the
remote_writestanza.Environment
A Raspberry Pi 3 Model B running ResinOS.
In case the memory allocation is a concern, here is a dump of that:
Linux 4.4.48 armv7l