export-distro sometimes encounters 0MQ message body error #1512
Comments
FWIW that error message is coming from the following location in the @rsdmike @lenny-intel for visibility
Based on the logs above we are getting the So wondering if the previous message envelope got dropped.
@TomFlem, can you check your core data logs for this error?
@lenny-intel I got the same issue.
Deployment Environment:
EdgeX Version:
Case 1
core-data log
export-distro log
Case 2
core-data log
export-distro log
No, we don't see this error.
Thanks @weichou1229. This is great information, which confirms this issue is not limited to low-power devices. I will have one of our devs who uses a Mac attempt to recreate it.
@rsdmike was able to recreate the issue on his MacBook. I was not able to recreate it on my Linux box, which ran for 2 hrs.
I have now reproduced this issue on my Linux box running https://github.com/edgexfoundry/developer-scripts/blob/master/compose-files/docker-compose-edinburgh-no-secty-1.0.0.yml after leaving it running overnight.
I built the latest images for core-data and export-distro from master (with Trevor's race condition fix) and still got the issue. This time it occurred fairly quickly, within approx. 15 mins. Running again with logging set to TRACE for both to see if we get anything from core-data when the error occurs.
Fix was included in v1.0.1 images.
🐞 Bug Report
Affected Services
The issue is located in: the communication between export-distro and core-data via 0MQ
Is this a regression?
Yes, it doesn't happen in the Delhi release.
Description and Minimal Reproduction
Run EdgeX v1.0.0 with multiple Device Services on a lower-power kit (Dell3000, UPSquared), create multiple export registrations, and wait 5-15 mins until the error below is logged by export-distro. All export-distro goroutines are terminated when this issue occurs. In the error log, the repeated 'events events' looks wrong.
🔥 Exception or Error
level=ERROR ts=2019-07-01T14:35:12.820403389Z app=edgex-export-distro source=registrations.go:350 msg="exit msg: found more than 2 incoming messages (1 is no topic, 2 is topic and message), but found: 3"
When we print with a bit more info we see:
level=ERROR ts=2019-07-08T18:40:04.596019465Z app=edgex-export-distro source=registrations.go:350 msg="exit msg: found more than 2 incoming messages (1 is no topic, 2 is topic and message), but found: 3, message content: [events events {"Checksum":"","CorrelationID":"e8c3960b-afb3-42b5-910f-4f20d6e36267","Payload":"eyJjb3JyZWxhdGlvbi1pZCI6ImU4YzM5NjBiLWFmYjMtNDJiNS05MTBmLTRmMjBkNmUzNjI2NyIsImlkIjoiMWIwYzlhNzItMjRhNC00YTFiLTgyYTgtOTgxZDUxNDVjMTg5IiwiZGV2aWNlIjoiUmFuZG9tLUludGVnZXItRGV2aWNlIiwib3JpZ2luIjoxNTYyNjExMjA0NTYwNDgwODUxLCJyZWFkaW5ncyI6W3siaWQiOiI5OThlN2MxNy0xMTYyLTRlMmYtYjBmYy0wNjJmMzdkYTJmM2MiLCJvcmlnaW4iOjE1NjI2MTEyMDQ1NDI1OTA4MDcsImRldmljZSI6IlJhbmRvbS1JbnRlZ2VyLURldmljZSIsIm5hbWUiOiJSYW5kb21WYWx1ZV9JbnQxNiIsInZhbHVlIjoiLTQ5NTkifV19","ContentType":"application/json"}]"
level=INFO ts=2019-07-08T18:40:04.596015754Z app=edgex-export-distro source=registrations.go:250 msg="Terminating registration goroutine"
level=INFO ts=2019-07-08T18:40:04.597181997Z app=edgex-export-distro source=registrations.go:250 msg="Terminating registration goroutine"
level=INFO ts=2019-07-08T18:40:04.597900919Z app=edgex-export-distro source=registrations.go:250 msg="Terminating registration goroutine"
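The third frame in the logged content is an envelope whose `Payload` field is base64-encoded JSON. As a quick sanity check on that envelope, the sketch below decodes only the first 76 characters of the payload (a base64 prefix whose length is a multiple of 4 decodes on its own). `payloadPrefix` and `decodePrefix` are names made up for this illustration and are not part of EdgeX.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// payloadPrefix holds the first 76 characters of the "Payload" value from the
// log line above. It is deliberately truncated to keep the example short.
const payloadPrefix = "eyJjb3JyZWxhdGlvbi1pZCI6ImU4YzM5NjBiLWFmYjMtNDJiNS05MTBmLTRmMjBkNmUzNjI2NyIs"

// decodePrefix is a hypothetical helper: it decodes standard base64 text
// and returns the result as a string.
func decodePrefix(s string) (string, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return "", err
	}
	return string(raw), nil
}

func main() {
	text, err := decodePrefix(payloadPrefix)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// Prints the start of the JSON event carried inside the envelope.
	fmt.Println(text)
}
```

This prints `{"correlation-id":"e8c3960b-afb3-42b5-910f-4f20d6e36267",`, which matches the envelope's `CorrelationID`. That suggests the event data itself arrived intact and the anomaly is the extra `events` frame in front of it.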
🌍 Your Environment
Deployment Environment:
lower power kit, such as Dell3000, UPSquared
EdgeX Version:
Edinburgh Release v1.0.0
Anything else relevant?
The log is printed by https://github.com/edgexfoundry/go-mod-messaging/blob/18e14ad62fb232c3fddceea415da29d0b56d5906/internal/pkg/zeromq/client.go#L172; the message content printing was added by us.
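For readers without the linked source to hand, the rule stated in the error text (one frame means no topic, two frames mean topic plus message, anything more is rejected) can be approximated as below. This is a minimal sketch and not the go-mod-messaging code: `parseFrames` is a hypothetical name, and the envelope in `main` is abbreviated from the log for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// parseFrames is a hypothetical helper that applies the frame-count rule from
// the error message: 1 frame is a message without a topic, 2 frames are a
// topic followed by the message, and any other count is an error.
func parseFrames(frames []string) (topic, body string, err error) {
	switch len(frames) {
	case 0:
		return "", "", errors.New("no frames received")
	case 1:
		return "", frames[0], nil
	case 2:
		return frames[0], frames[1], nil
	default:
		return "", "", fmt.Errorf(
			"found more than 2 incoming messages (1 is no topic, 2 is topic and message), but found: %d",
			len(frames))
	}
}

func main() {
	// Same shape as the logged content: "events" twice, then the envelope
	// (abbreviated here to a single field).
	frames := []string{"events", "events", `{"CorrelationID":"e8c3960b-afb3-42b5-910f-4f20d6e36267"}`}
	if _, _, err := parseFrames(frames); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

With the three frames from the log, this check fails in the same way as the reported error. Note that the sketch only reports the problem to the caller; whether the receiving goroutine should exit or skip the message is a separate design decision that the thread does not settle.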