Failed to read from client 192.168.10.153 with error 128! #172

Closed
pugmandan opened this issue Nov 23, 2023 · 18 comments

@pugmandan

12:45:01 | [E] | [stream_server:145] | Failed to write to client 192.168.10.153 with error 128!
12:45:01 | [W] | [stream_server:168] | Failed to read from client 192.168.10.153 with error 128!

Getting the above error in my tubeszb-cc2652p7-poe-2023 after it had been running for a number of hours. I confirmed that the client (zigbee2mqtt) was still operational (and showing no issues on its end or in its logs). I also operated a Zigbee device (turned light on and off) and it worked without delay.

The tubeszb log was producing tens of new lines per second. I suspect this is what has been crashing the stick: it just logs endlessly until it runs out of memory.

Is there any information on what error 128 means, please? I have searched Google with no success. Furthermore, the stick and zigbee2mqtt were both accessible, so there should not have been any errors.

@pugmandan
Author

Has happened again following a restart of the stick. Screenshot attached.

image

@pugmandan
Author

There was an infrequent message which was hard to capture (due to the scrolling nature of the log), but I have captured it:

14:54:14 | [E] | [stream_server:145] | Failed to write to client 192.168.10.153 with error 128!
14:54:14 | [W] | [stream_server:168] | Failed to read from client 192.168.10.153 with error 128!
14:54:14 | [E] | [stream_server:104] | Incoming bytes available, but outgoing buffer is full: stream will be corrupted!

@tube0013
Owner

128 seems to be the buffer size. I just pushed up the P7 ESPHome config. Can you try compiling a version with the buffer size under the stream component set to a higher number? You can uncomment:


Line 109

If you add this config to ESPHome you should be able to push the fw to the device over the network; otherwise you will need to download and manually flash the Legacy binary with esphomeflasher over serial/USB (with no PoE connected).

I've had 2 testers of the P7 using it for several months with no reported issues, and this is the first I'm hearing of this one, so I appreciate your patience in sorting it out.

The P7 is using an ESPHome binary built with the esp-idf framework for lower overhead, and I've seen faster performance: resetting nvram, for example, takes about 50% less time. It currently does not support Web-OTA fw installs. I also moved back to the current Oxan Stream Server, where you can read more about the buffer size config: https://github.com/oxan/esphome-stream-server/tree/master#advanced

Thanks
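
For readers following along, here is a minimal sketch of what raising the buffer size might look like. The `buffer_size` option is the one covered in the Oxan stream server README linked above; the `uart_id` name and the port number are assumptions for illustration and are not taken from the P7 config, so keep whatever the published YAML already uses.

```yaml
external_components:
  - source: github://oxan/esphome-stream-server

stream_server:
  uart_id: uart_bus   # assumption: id of the UART wired to the Zigbee module
  port: 6638          # assumption: TCP port the Zigbee client connects to
  buffer_size: 2048   # the value tried later in this thread
```

Only the `buffer_size` line is the change being suggested; the rest is there to show where it sits.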

@pugmandan
Author

Thanks and no apology needed at all - I want the best and I know this is it. If anything a bit of bug fixing along the way makes the end result all the more rewarding.

I've reflashed as suggested with Line 109 uncommented. I'll leave this issue open for a day; if it hasn't crashed in that period, I'd consider the issue closed.

@pugmandan
Author

@tube0013 - I'm sorry to say the same error has occurred, despite reflashing ESPHome with the buffer_size of 2048.

The error message has not changed:

18:20:38 | [E] | [stream_server:145] | Failed to write to client 192.168.10.153 with error 128!
18:20:38 | [W] | [stream_server:168] | Failed to read from client 192.168.10.153 with error 128!

I've attached the ESPHome code (directly copied from my ESPHome instance) to evidence I am not fat fingering anything

config.yaml.txt

@tube0013
Owner

Is the ip in the error the z2m host or the coordinator ip?

Thanks

@pugmandan
Author

> Is the ip in the error the z2m host or the coordinator ip?
>
> Thanks

The ip is the z2m host - the coordinator ip is 192.168.10.156

Interestingly, no signs of anything going wrong in the z2m host logs and when the errors are logging with the coordinator, it is possible to continue operating the zigbee devices via z2m

@pugmandan
Author

pugmandan commented Nov 24, 2023

Some additional error code lines which I have not been able to see before:

image

10:58:53 | [E] | [stream_server:104] | Incoming bytes available, but outgoing buffer is full: stream will be corrupted!
10:58:53 | [W] | [stream_server:109] | Dropped 19 pending bytes for client 192.168.10.153

This is the relevant part from the Z2M logs - no other errors or warnings are in the Z2M logs:
error 2023-11-24 10:55:49: Adapter disconnected, stopping
info 2023-11-24 10:55:49: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload '{"state":"offline"}'
info 2023-11-24 10:55:49: Disconnecting from MQTT server
info 2023-11-24 10:55:49: Stopping zigbee-herdsman...
error 2023-11-24 10:55:49: Failed to stop Zigbee2MQTT

@pugmandan
Author

pugmandan commented Nov 24, 2023

@tube0013 - something I noticed when reviewing the ESPHome yaml is that you are currently referencing (lines 18-19):

external_components:
  - source: github://oxan/esphome-stream-server

I noticed on an issue thread in that git that you remarked it is very unreliable with ESPHome > 2021.9

Is it possible to adjust your code to use the fork https://github.com/tube0013/esphome-stream-server-v2?

I have attempted but am getting compile errors so far

@tube0013
Owner

Yeah, so what happened was the Oxan component was originally used, then it became a bit unreliable with an ESPHome release after 2021.9. Oxan went quiet with no updates for a year, maybe longer. I hired a developer to help me fork it and get it reliable again. A few months ago Oxan came back with a big update, so I've been tracking that, as I honestly don't want to be maintaining the fork if I don't have to. If you want to try my fork I'll send a yaml in a bit, tomorrow at the latest, or you could look at the cc2652p2 2023 PoE yaml for how to configure it yourself.

@pugmandan
Author

If you could look at it tomorrow I'd appreciate it. I've attempted to swap in the relevant parts of the code but am getting a compile error due to the dashboard_import: element not being happy.

Thanks again for your support with this

@tube0013
Owner

tubeszb-cc2652p7-poe-2023_esp-idf.zip
tubeszb-cc2652p7-poe-2023_arduino.zip

Attached are 2 versions using the esphome-streamer from my repo; one is built on the esp-idf framework and the other on arduino. Please let me know if these work without any errors like you were seeing before and are more stable. Included in each zip is the .yaml and a binary compiled on 2023.11.3.

@pugmandan
Author

Great, thank you @tube0013 - I've loaded the arduino version first.

Something I have noticed from looking at the data being collected in Home Assistant is that the error seems to be happening every 2 hours like clockwork. I've used the 'TubesZB Serial Connected' sensor as the source of data, and I have an automation in place that reboots it when 'TubesZB Serial Connected' becomes unavailable.

TubesZB Serial Connected was connected - 14:15:46 - 2 hours ago
TubesZB Serial Connected was disconnected - 14:15:46 - 2 hours ago
TubesZB Serial Connected became unavailable - 14:15:38 - 2 hours ago
TubesZB Serial Connected was connected - 12:15:43 - 4 hours ago
TubesZB Serial Connected became unavailable - 12:15:33 - 4 hours ago
TubesZB Serial Connected was connected - 10:15:37 - 6 hours ago
TubesZB Serial Connected became unavailable - 10:15:28 - 6 hours ago
TubesZB Serial Connected was connected - 08:15:34 - 8 hours ago
TubesZB Serial Connected became unavailable - 08:15:23 - 8 hours ago
TubesZB Serial Connected was connected - 06:15:28 - 10 hours ago
TubesZB Serial Connected became unavailable - 06:15:18 - 10 hours ago
TubesZB Serial Connected was connected - 04:15:24 - 12 hours ago
TubesZB Serial Connected became unavailable - 04:15:13 - 12 hours ago
TubesZB Serial Connected was connected - 02:15:18 - 14 hours ago
TubesZB Serial Connected was disconnected - 02:15:14 - 14 hours ago
TubesZB Serial Connected became unavailable - 02:15:09 - 14 hours ago
TubesZB Serial Connected was connected - 00:15:14 - 16 hours ago
TubesZB Serial Connected became unavailable - 00:15:04 - 16 hours ago

I'll feedback on the two bits of code you've sent over tomorrow as I'll need to run each one for at least 2 hours to see if the same behaviour is happening.

@tube0013
Owner

What kind of network router are you using? The failure every 2 hours, like clockwork, looks like a DHCP lease type issue.

I had an email support issue similar to this and it was solved by uploading a binary with a static IP:

I figured out my issue: DHCP. Totally odd. z2m was losing the connection every 2 hrs, which was the DHCP lease time on my IoT VLAN. Looking at the logs in pfSense, there was a series of DHCP request / DHCP ack messages for the static address I had set, repeating 20+ times, and in that time z2m lost the connection before the address finally settled down.

I set a fixed address in the config in ESP Home and pushed a new build and it's been rock solid since. No idea why it might have been behaving like that but figure I'd let you know in case it shows up again.
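
As a rough illustration of the fix described above, a fixed address in ESPHome is set with a `manual_ip` block under the network component. The sketch below assumes the wired `ethernet:` component of a PoE board; the `static_ip` is the coordinator address mentioned earlier in this thread, while the gateway and subnet values are assumptions that need to match your own network.

```yaml
ethernet:
  # Keep the existing type and pin settings from the device YAML here;
  # only the manual_ip block is added.
  manual_ip:
    static_ip: 192.168.10.156   # coordinator address reported in this thread
    gateway: 192.168.10.1       # assumption: gateway of this subnet
    subnet: 255.255.255.0       # assumption: /24 network
```

With `manual_ip` set, the device no longer requests a DHCP lease, which is why the reservation on the router side alone was not enough here.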

@pugmandan
Author

@tube0013 I had previously configured a static IP address within pfSense but had not adjusted the ESPHome config to reference it.

I can confirm making this change resolved the issue. My apologies for wasting your time on this.

I've had 16 hours of uninterrupted connection now, so I feel very confident the issue has been resolved.

@tube0013
Owner

No worries! Glad it's sorted. I think I'm going to add a note for pfSense users to set a static IP in the ESPHome fw.

@OBoudreaux

I've got the same issue, but I'm running a Unifi Edgerouter X. Coordinator and host both have static IPs. Seeing the same warning and error. Is setting the static IP and flashing this still the preferred fix?

@tube0013
Owner

tube0013 commented Jan 9, 2024

@OBoudreaux if you have already flashed a firmware with a static IP you should be good. If you are still experiencing issues, please open another issue. Thanks!
