New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modbus patch, to allow communication with "slow" equipment using tcp #32557
Conversation
Hi @janiversen, It seems you haven't yet signed a CLA. Please do so here. Once you do that we will be able to review and accept this pull request. Thanks! |
Hey there @adamchengtkc, mind taking a look at this pull request as its been labeled with a integration ( |
0c397b8
to
fbfc845
Compare
It would be nice if someone could take a look at this small patch, so it can (hopefully) get merged, and used. |
I think the devs have said that time.sleep is never allowed, but I could be wrong. I'm interested in your usecase, since I also had some trouble getting modbus to read out the Huawei sun2000, and I seem to have different results. If I understand your code correctly, the connect (and delay) is only used once during HA startup, after which there are no problems anymore? Waiting after connect seems to work, but I still need to repeat that frequently (every ~10 seconds) or the problem reoccurs. |
I agree that using time.sleep in the running code is something that should be avoided and are normally not really needed since the worker thread will call every second. However in setup once you return (unless I am mistaken, which very well might be the case) setup is done, so because of that a wait is needed in the function. If you poll the sun2000 too fast it will fail. Do you use the scan interval as pr my example? It runs for me andI might get a read error 2-3 times a day but not more. Can you please provide the modbus part of your config then I can try it on my installation and see what can be done |
I don't use the build-in modbus component but I build my own: https://github.com/Emilv2/huawei_solar I quickly tried your solution using pymodbus manually on an interactive python, and your solution seems to work. Close the connection, connect the client, and then wait two seconds to read the registers, but I need to repeat this every time I start to read registers. Polling it very fast doesn't to fail for me, it's only the first read in a sequence that fails. I would say that if I could get async pymodbus to work that would be perfect, since |
Why do you need to connect for every poll cycle? that creates a bit of stress in the sun2000 and seems to be a lot of overhead. How does the first one fail? no response at all, only a tcp ack or a error message? I have a non home assistant python3 script that works without problems. The Huawei developers have confirmed that a delay after connect is needed. I am actually testing a new integration with async, but async.sleep() does not solve the delay after connect since it is in setup(). I read multiple registers in one read and then have a short async.sleep(). So far I have not had problems with the core of home assistant. I made the setup async as well. |
Just had a fast look at your code, I miss a thread lock is that on purpose? I will try your code this weekend, it looks a lot like my code, with 2 big differences. Based on the configuration I group the registers and read then in blocks and I keep a local value to further limit the number of actual reads. |
Because otherwise it fails just like if I never did the sleep. Without the sleep or after a while it always fails in the same way like this:
I'm fairly new to asyncio, where would I need a thread lock? Which local variables do you keep that I don't? The variables that do not change such as name and serial numbers I poll only once and the ones that change I poll. I thought about reading the registers in blocks, but that seems to make it so much more complicated. |
the thread lock is for both sync and async, it should be around each pymodbus call. Sun2000 do not support multiple connections, and if you open/close-open/close without a larger delay you will rum into problems. asyncio is a bit more complicated, just as reading multiple registers with on read call, but it pays to do it. I keep local variables for all registers, because from a automation pow, only current production and the import/value matters. I actually build a third variable “current comsumption” = production - power meter (import/export). |
I tried on three different devices, and all keep the same problem:
My code does not get stuck on any async part, it gets stuck on the creation of the modbus client. I can try it, but I don't see how a lock can help there? |
the lock is not for the connect() but for the read() however since you reconnect every time, you are very likely to run into problems with multiple connects simultaneously, unless you lock the connect(). As I wrote the lock is not to solve your async problem, since a lock is also needed for the sync calls. Your async problems are different, and I will try to compare my async code to see why yours might fail. I am not at my computer, but I remember I did something about adding keepalive to the socket. |
I don't think adding a lock would make sense in my sync code? All the updates are done sequentially in the same function. While the current production is the most useful for automation, the other values can be of use too for monitoring your setup so I poll them too. |
Shouldn't |
If you are using the lock the update process will have to wait for the sleep to be finished and the lock to be released. According to this there is a difference between the two. |
you are right my comment was wrong (and I deleted it), there is a difference between async and asyncio :-( I am testing async.sleep() right now. |
Changed time.sleep to asyncio.sleep(), and at the same time learned something :-) it seems time.sleep() blocks the process in some versions of python and only the thread in other versions, at least that how I understand the documentation. @Emilv2 thanks for the suggestion. @bdraco I hope this solves your request for change, I made it as an extra commit to make the review easier, if you want me to squash it or make other changes please let me know. |
the assert error in sensor.py seems to be in a previous version, at least I cannot find the assert. found and corrected the problem in switch.py (update --> async_update). @vzahradnik Lets see what other errors you can come up with :-) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take a look at my comments. Thanks!
@janiversen I tested and reviewed your latest code. Everything works for me now. I found a few cosmetic issues. Otherwise, I think this code is in a good shape, and from my point of view I will hapilly see this PR merged. For a reference, I'm putting here log warning, which needs to be fixed in the /srv/git/hassio/home-assistant/venv/lib/python3.8/site-packages/pymodbus/client/asynchronous/asyncio/__init__.py:250: DeprecationWarning: "@coroutine" decorator is deprecated since Python 3.8, use "async def" instead
def start(self, host, port=502): |
I will make your suggested changes later. I have seen the warnings, but it is really a warning about it will not work in 3.10 so I do not think the pymodbus people will see it as sn urgent issue. Thanks for your time so far, once all is cleaned up (tomorrow) I will ask to get merged. |
Adjust code based on review comments.
@bdraco finally this PR is robust and ready to merge (pending if you have new comments). @vzahradnik have helped a lot with review (as pr. suggestion from the code owner @adamchengtkc) and testing. Now we have an integration a lot more robust and totally async. Looking forward to get this merged, so both @vzahradnik and myself can continue on follow-up PRs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @janiversen Glad to see this is getting closer!
Would you please ensure all outstanding comments are resolved and mark them as resolved?
I did a quick run though. I'll do a more through pass this weekend when I'm not focused on my day job.
Sorry if I made confusion by doing "git rebase", but it is natural to me, to update the commit and thus not first make something wrong and then correct it in a later commit. This way of working stops for me, when a PR is set for final review (as this one is now). If "git rebase" causes too much confusion, please tell me, and I will refrain from using it. |
rebase should be fine. I can't see where you are logging when the device goes unavailable, and logging where it is recovered. I'm looking at 4b1c64f as the sha when I check this out. Can you point me to a specific file / line? |
We are not doing that loging, because it is handled inside the Pymodbus library. It will automatically try to reconnect with a doubling delay for each failed attempt, so there are no spefic log for that. Pymodbus will return an exception when it gives up, and that is handled in init.py line 255, which also handles spurious read errors. Pymodbus will return None, if there are no connection, which will set unavailable at entity level. |
Pymodbus in asyncio mode do not use twister but still throws a warning if twister is not installed, this warning goes into homeassistant.log and can thus cause confusion among users. However installing twister just to avoid the warning is not the best solution, therefore removing dependency on twister.
@bdraco do you have more review comments ? |
I’m still planning on running though this today now that we hit the weekend and I can give it more attention |
remove commented out code.
@janiversen LGTM. I don't see anything else besides the comments out code. Once that is fixed, I can approve. Someone else will likely do another pass on this at some point before or after merge given the size. |
@bdraco thanks a lot for your review. I am working on the followup PR, so I hope is not going to take ages :-) |
Somebody should remove the flag "small-pr" 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just some small comments for a future PR.
self._client = modbus_client | ||
self._lock = threading.Lock() | ||
self._name = name | ||
_LOGGER.debug("Preparing setup: %s", client_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want side effects, eg logging, in init method. Please move it outside, or remove it.
async with self._lock: | ||
kwargs = {"unit": unit} if unit else {} | ||
result = await func(address, count, **kwargs) | ||
if isinstance(result, (ModbusException, ExceptionResponse)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if the client func raised errors instead of returning them. I guess this is the library responsibility.
@@ -56,7 +54,7 @@ | |||
) | |||
|
|||
|
|||
def setup_platform(hass, config, add_entities, discovery_info=None): | |||
async def async_setup_platform(hass, config, add_entities, discovery_info=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rename add_entities
to async_add_entities
.
Proposed change
Some modbus-tcp servers (like e.g. Huawei sun2000) need time between the connect() and the first send(), in order to prepare themself.
A typical error pattern is that connect() works perfectly, but when requesting data there are no response (apart from the TCP/IP ACK message). Once in this mode, the server cannot recover and later messages are also not responded to.
The modbus configuration (only for modbus-tcp) have been extended with a new optional parameter "delay" with default being 0. When adding e.g. delay: 2, time.sleep(2) will be called directly after the connect() returns. If delay is not present in the configuration it has a value of 0, and time.sleep() are not called.
Another PR have been filed to update the documentation.
There should not be any breaking changes, basically it is "just" a change from
sync to async. However the library version is bumped from 1.5.2 to 2.3.0, and
that might solve/cause problems.
This started as a 2-3 line change, but now something like 30% of the code is touched,
due to the request of making it async.
Type of change
Example entry for
configuration.yaml
:Additional information
The PR is done as 2 commits.
First commit adds a fixed time.sleep(2), which is the solution (apart from the 2)
Second commit adds the additional items needed to make time.sleep(_delay) configurable
https://github.com/home-assistant/home-assistant.io/pull/12305
Checklist
black --fast homeassistant tests
)If user exposed functionality or configuration variables are added/changed:
If the code communicates with devices, web services, or third-party tools:
Updated and included derived files by running:
python3 -m script.hassfest
.requirements_all.txt
.Updated by running
python3 -m script.gen_requirements_all
..coveragerc
.The integration reached or maintains the following Integration Quality Scale: