OpenDTU messing up configuration during runtime #83
Comments
Could you please double-check the heap usage on the system overview if this error occurs again? |
The heap usage looks a little bit high, but not too much. I am currently seeing something between 108 and 110 kB. You mentioned in the other issue that this only occurs on one of your two ESPs. Have you tried to reflash this one (after a complete flash erase)? Do you see anything special in the serial console before this issue happens? |
Today, before I flashed a new version, the uptime was > 2 days without any issues. So maybe you have some kind of different config. Are you using MQTT with TLS? |
no, just plain mqtt with HA discovery |
it seems to be appearing only with mqtt enabled so i dug a bit in that code and found #86 might be related |
finally the issue also appeared on a dtu without mqtt enabled, this was after 3d17h uptime. The dtus with mqtt enabled still seem to fail faster. |
Found this in "WebApi_ws_live.cpp":
DynamicJsonDocument root(40960);
JsonVariant var = root;
generateJsonResponse(var);
size_t len = measureJson(root);
AsyncWebSocketMessageBuffer* buffer = _ws.makeBuffer(len); // creates a buffer (len + 1) for you.
if (buffer) {
    serializeJson(root, (char*)buffer->get(), len + 1);
    _ws.textAll(buffer);
}
It seems to me that the buffer is created with the size of the JSON document. But in |
OK, ignore my comment. I had to learn, that |
@HacksBugsAndRockAndRoll are you running two ESPs in parallel, and do you at least use distinct DTU_RADIO_IDs for them? |
Yes to both of these questions. |
Can you confirm this issue also occurs if you are running only one ESP? |
There should be no misinterpretation of the data when using two different DTU IDs, because one DTU wouldn't see the packets of the other one. Unless... How different are the DTU IDs? The RF packet only contains the lowest 4 bytes of the ID. If these bytes are identical there might be an issue. (But the chance is very small, because there are 3 different CRC checksums which all have to match.) |
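To make the "lowest 4 bytes" point concrete, here is a minimal sketch of the comparison, assuming the ID is held as a 64-bit integer. The helper names `radioAddress` and `idsCollide` are made up for illustration and are not taken from the OpenDTU or Hoymiles library code; the CRC checks mentioned above are not modelled.

```cpp
#include <cstdint>

// Hypothetical helper: keep only the lowest 4 bytes (32 bits) of an ID,
// which is the part that ends up in the RF packet according to the comment above.
constexpr std::uint32_t radioAddress(std::uint64_t id) {
    return static_cast<std::uint32_t>(id & 0xFFFFFFFFULL);
}

// Two IDs can only be confused on air if their lowest 4 bytes are identical,
// no matter how much the upper bytes differ.
constexpr bool idsCollide(std::uint64_t a, std::uint64_t b) {
    return radioAddress(a) == radioAddress(b);
}
```

With this, `idsCollide(0x000112345678, 0x000212345678)` is true (the IDs differ only above the low 4 bytes), while `idsCollide(0x000112345678, 0x000112345679)` is false.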
I observe the same issue. Restarting the ESP makes it go away. It appears randomly after 1-3 days of uptime. When it happens, the DTU is accessible via the web interface, but stops polling the inverters. |
@petrm which inverter(s) are you using? Are you using mqtt? If yes, what configuration? Are you using a DTU-Lite or Pro in parallel? What is your current installed Git Hash (Info --> System)? |
@petrm and @HacksBugsAndRockAndRoll do you have a chance to log the output of the serial console for a longer time? It would be interesting to see what happens just before this issue (if there is something special in the serial console). I am still trying to reproduce this issue. Are you doing anything special (e.g. polling the web API using curl, or something else which I may not have in mind currently)? I am rebooting my ESP regularly because of development work, but I also reach uptimes of 5-10 days without problems. |
Since there is no way to get any log remotely, I can only attach the serial console when I am back in about three weeks.
|
This is absolutely correct. There is a config structure which stores the serial number etc., but when showing the type or exporting MQTT data, the internal data structures of the Hoymiles library are used. There is a vector which stores the inverters inside the Hoymiles lib: OpenDTU/lib/Hoymiles/src/Hoymiles.h Line 30 in 9a44324
Somewhere in the code, a part of this structure gets overwritten (and therefore any subsequent behavior is totally random). But this does not happen for all users. I would suspect some issue with the packet parser, but without knowing the exact received packets it's a little bit hard to analyze. (And due to the memory corruption, the web output might also be wrong.) |
Since I do not have a computer in a location where the DTU is in range of the inverter, I'll need to set something up, probably with a Raspberry Pi. This might take some time. To further explain this: I have my fork running on these two DTUs. The only change I made is a regular detection for corrupted DTU serials in the configuration; in case a corruption is found, a restart is triggered ( https://github.com/HacksBugsAndRockAndRoll/OpenDTU/blob/local/configfix-workaroud/src/ConfigFix.cpp - I know this is not the solution to the problem, but it is what allows me to use OpenDTU for my "productive" setup as long as the bug exists ). The blue line shows the device with MQTT enabled, which seems to increase the chance of corruption. This device also shows the corruptions during the night time. |
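For readers who want a picture of what such a workaround could look like, here is a rough host-side sketch. It is not the code from the linked ConfigFix.cpp; the names `SerialWatchdog`, `SerialList` and `kMaxInverters`, and the idea of comparing against a snapshot taken at load time, are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>

// Assumed upper bound on configured inverters (hypothetical value).
constexpr std::size_t kMaxInverters = 10;

using SerialList = std::array<std::uint64_t, kMaxInverters>;

// Hypothetical watchdog: remembers the serials as they were when the config
// was loaded and reports any later deviation through a callback.
class SerialWatchdog {
public:
    SerialWatchdog(const SerialList& knownGood, std::function<void()> onCorruption)
        : _snapshot(knownGood)
        , _onCorruption(std::move(onCorruption))
    {
    }

    // Returns true if the live serials still match the snapshot.
    // Otherwise invokes the callback (if any) and returns false.
    bool check(const SerialList& live) const
    {
        if (live == _snapshot) {
            return true;
        }
        if (_onCorruption) {
            _onCorruption();
        }
        return false;
    }

private:
    SerialList _snapshot;
    std::function<void()> _onCorruption;
};
```

On an ESP32 the callback would presumably log the event, flush the serial output and then restart the device, while `check()` would be called periodically from the main loop. A legitimate config change through the web interface would need a fresh snapshot, which this sketch leaves out.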
If the corruption also occurs during the night time, it's maybe not related to the response of the inverter. But then it should be sufficient if you place one of your ESPs out of range of the inverter but connected to a computer to get the serial output. |
Can you maybe download your config file (Settings --> Config Management), open the .bin file using a hex editor, overwrite your WiFi password with X (do not change the length of the file, just overwrite the characters) and provide this in some way? Then I can import your config with all applied settings and see if this issue occurs. |
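If a hex editor is not at hand, the same length-preserving overwrite can be scripted. The following is only a sketch that works on a byte buffer already read from the .bin file; `maskSecret` is a hypothetical helper, and it assumes the password is stored as plain bytes in the file, as the hex editor instructions above imply.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: overwrite every occurrence of `secret` in `data`
// with 'X' characters, in place. The buffer size never changes.
// Returns the number of occurrences that were masked.
std::size_t maskSecret(std::vector<char>& data, const std::string& secret)
{
    if (secret.empty()) {
        return 0; // nothing to search for
    }
    const auto len = static_cast<std::ptrdiff_t>(secret.size());
    std::size_t count = 0;
    auto it = data.begin();
    while ((it = std::search(it, data.end(), secret.begin(), secret.end())) != data.end()) {
        std::fill(it, it + len, 'X'); // same number of bytes, different content
        it += len;
        ++count;
    }
    return count;
}
```

Reading the file into the vector and writing it back is left out. A return value of 0 would mean the password was not found as plain bytes, in which case the file should not be shared unchecked.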
Sure, I'll have a look. So far I can report that my setup in my room (no inverter connectivity but active MQTT) corrupted only once since last weekend. Unfortunately I did not get any meaningful logs, since the reboot did not wait for the serial output to flush. Since I fixed this (Serial.flush(), then reboot) 4 days ago, no corruption has happened on this device. I can however move the whole setup into inverter range now, since I found a Raspberry Pi to attach to it. |
Here is my config |
Short update. I now have 20 days uptime with d7fe495 and so far stable, no suspicious log messages or errors. |
@tbnobody havent looked into OpenDTU for the |
Currently I do not have a whole lot of time for this project. I can say, that I am running https://github.com/tbnobody/OpenDTU/commits/59b87c5 which is a slightly adjusted (self reset on corrupted config) version of 9a44324 and I still see the self resets triggered. I'll need to rebase my stuff and update some time. |
@HacksBugsAndRockAndRoll as we are unable to reproduce this issue on other devices, |
I would close this issue as it's really old and there were a lot of code iterations. Please open a new one if the problem occurs again. |
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new discussion or issue for related concerns. |
During normal runtime on version https://github.com/tbnobody/OpenDTU/commits/0cc6ce3 the inverter configuration seems to break.
curl gives this info (serial changed, but it is the correct one)
mqtt publishes like this
OpenDTU no longer pulls data from the inverter in this state.