Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Profalux shutters] Periodic movement #3102

Closed
vanackej opened this issue Jul 31, 2020 · 68 comments · Fixed by #3416
Closed

[Profalux shutters] Periodic movement #3102

vanackej opened this issue Jul 31, 2020 · 68 comments · Fixed by #3416

Comments

@vanackej
Copy link

Describe the bug

Every few minutes (30 to 60, I don't know exactly), all my Profalux shutters are doing a quick up/down movement (less than 1 second).
I have 4 shutters, and they are all doing the same movement at the same time.
If I turn the gateway down, the movement does not occur anymore, so the movement is related to the gateway.

Expected behavior

No shutters movement when no action performed.

Environment

  • Host system: Intel core i9 server
  • Running method: Proxmox VM, Debian OS
  • Firmware version: 26580700
  • deCONZ version: 2.05.79
  • Device: ConBee II
  • Do you use an USB extension cable: no

deCONZ Logs

Provided here : #2739 (comment)

Additional context

To be sure, I also completely shut down my domotic solution (Jeedom) and my Wifi network, and the problem still occurs with the gateway alone.

@vanackej
Copy link
Author

Hello @Smanar,
Back from holidays, I tried to disable the rest api part as you advised in this comment.
The periodic movement does not happen when the rest API is disabled, so I'm pretty confident now that the issue comes from it.
In the attached logs, it occurred again with a single shutter in the zigbee network, around 15:04:28 (maybe during the 2 seconds before/after).
deconz2.log

@Smanar
Copy link
Collaborator

Smanar commented Aug 14, 2020

I realy see nothing in your log about the profalux device, how many are paired during the previous log ?

0x20918A00000AFBA1
0x20918A00000AFBA0
0x20918A00000AFBA2

But It seem you have some upnp traffic, you haven't some device like alexa or google home that can use the API ?

Some other thing you can try.

  • wich one parameter are you using for debuging (in the command line) ?
  • on deconz, on the node title, on the first circle you can send again some zigbee request.
  • try to use the API, but remove the device from it. Or include the device using deconz and not phoscon, like that the device will be in deconz but not in the api (and the api can be active)

@vanackej
Copy link
Author

I deleted the 3 other shutters in the webapp, so for this log there is only 1 shutter left.

By the way, what happens when deleting the lights using the webapp ? Because my shutters have been also disassociated from their physical remotes when I removed them, so I think I have to perform a full reset (which is really a pain to do on those devices)

Here are the nodes in the deconz app. I don't understand what the other nodes are.
nodes

I'm pretty sure the issue is related to some internal timer/scheduled task in the gateway, as it occurs exactly every 30 minutes after the deconz app/gateway startup. If I restart the gateway at 12:00, it will happen at 12:30, 13:00, 13:30, ...

My command line is /usr/bin/deCONZ --http-port=80 --dbg-info=2 --dbg-zdp=1 --dbg-zcl=1 --dbg-aps=1 --dbg-http=1 > deconz.log &. I don't know about all the possible dbg options,I just found it in an other issue comments.

@Smanar
Copy link
Collaborator

Smanar commented Aug 15, 2020

Ha, I haven't thinked at remote ! But as they aren't in the API, I don't think they are guilties.

Your command line is perfect ^^, not possible having more informations, but I hope you have disabled it ATM ? (or you don't use SD card).

I m checking what can be trigger every 30 mn in deconz. But about your unpn devices, you haven't something special ?

@Smanar
Copy link
Collaborator

Smanar commented Aug 16, 2020

15:04:31:686 Idle timer triggered
15:04:01:686 Idle timer triggered

But on the code it happen more than every 30 mn, ca you check if you have it ONLY every 30 mn ?

have you tried to change the time ?

@vanackej
Copy link
Author

No, really nothing specific regarding upnp devices. And the date/time of the problem does not match the upnp logs time.
I will set up an iptables rule in order to block all incoming upnp traffic, in order to be really sure it's not related.
And yes, I can confirm it happens only every 30 minutes, I reproduced several times, with multiple deconz restarts.
What do you mean by changing the time ? the time of what ?

@vanackej
Copy link
Author

I can confirm the problem is still happening after blocking all upnp traffic with iptables

@Smanar
Copy link
Collaborator

Smanar commented Aug 17, 2020

So you have this log "Idle timer triggered" only every 30 mn ? and I think in same time than the VR is moving ?

If no, you can forget what i wrote below ^^

The fonction void DeRestPlugin::idleTimerFired() is trigger every 30s, but the part with this log use a counter than can be set a 60 in de_web_plugin.cpp

        if (d->otauLastBusyTimeDelta() < OTA_LOW_PRIORITY_TIME)
        {
            d->idleLimit = 60;
        }

So if you still have the code ready to be compiled, try to set this value at 30, to see if bug appear every 15 mn.

@vanackej
Copy link
Author

I have the log every 30 seconds, not every 30 minutes, so I don't really think it is related.

BTW, there is a new dev version that have been release 2.05.80, maybe I can try to update (I assume all the Profalux-related changes have been included in this version)

@Smanar
Copy link
Collaborator

Smanar commented Aug 18, 2020

Yep, the change have been validated.
But I still don't find an even that happen only every 30 mn ....

@Smanar
Copy link
Collaborator

Smanar commented Aug 18, 2020

@vanackej do you know how to use the API ?
To check the group JSON ? with an url like http://IP:PORT/api/KEY/groups

@vanackej
Copy link
Author

Yes, I know of course
I didn't create any group, so the response is empty.
Should I ?

@Smanar
Copy link
Collaborator

Smanar commented Aug 19, 2020

Nope, was just to be sure the problem was not from an old scene.

@Smanar
Copy link
Collaborator

Smanar commented Aug 20, 2020

Ok So have asked others devs, they have said me to check the same fonction DeRestPlugin::idleTimerFired()

If you are agree, i can make a special version with 3/4 more debug lines because ATM I realy don't have idea for this bug.

@vanackej
Copy link
Author

I'm ok to test a new custom build, let me know about it.

One new strange thing: After letting the gateway up for a few days, the problem disappeared, no more movement every 30 minutes. And shutters control was still OK using the API. But as soon as I restart the gateway, the problem is back. I don't know how long it takes for type problem to disappear after gateway startup, I will try to be careful.. It's a really strange one :-(

@Smanar
Copy link
Collaborator

Smanar commented Aug 21, 2020

So I have disabled pooling have added some more debug line > Smanar@3596678
If you can try it

git clone --branch profalux_test https://github.com/Smanar/deconz-rest-plugin.git

For debug line, it s to find the only one that appear every 30mn and not before.

@vanackej
Copy link
Author

I'll try on Sunday when I come back home, I'll let you know, thanks

@vanackej
Copy link
Author

vanackej commented Sep 1, 2020

Some updates after a few days of intensive testing, with and without your modified version :

  • The issue still happen with your version
  • I see "debug 1" a lot, but never seen "debug 2" log
  • If I change the server time manually, I'm able to reproduce the issue. If I launch the deconz process at 5, and set the clock to 5:29, the shutter make the movement the minute after.
  • It stops happening when the day change (i.e. after midnight the day the deconz process was started). Also able to reproduce by changing the server clock

@Smanar
Copy link
Collaborator

Smanar commented Sep 1, 2020

Ha, nice, usefull.

So if you start the inclusion at 23h, 1 h after no more issues ?

Debug 1 is every 30s ? 30 min ? or other ?

It mean if it s from deconz, there is a bugged timer that stop after midnight .... Strange, all timer can work during months

I have just find a 30 mn timer, but no rapport with you

QTimer::singleShot(30 * 60 * 1000, this, SLOT(timeManagerTimerFired()));

@vanackej
Copy link
Author

vanackej commented Sep 1, 2020

Yes, if I start the deconz process at 23h, it will occur at 23:30 then no more.

Here are my latest "debug 1" logs. The interval is not always exactly the same, but it looks like it runs every 30 seconds most of the time.

01:43:27:830 Profalux debug 1
01:43:57:830 Profalux debug 1
01:44:27:829 Profalux debug 1
01:44:29:829 Profalux debug 1
01:44:30:829 Profalux debug 1
01:45:00:830 Profalux debug 1
01:45:30:830 Profalux debug 1
01:46:00:829 Profalux debug 1

@vanackej
Copy link
Author

vanackej commented Sep 1, 2020

I think I found a least where the problem is coming from (but still don't get why for now)
Each time the problem occurs, I see the log send permit join, duration: 0
The condition around permit join (permitJoin.cpp:245) is if (!permitJoinLastSendTime.isValid() || (diff > PERMIT_JOIN_SEND_INTERVAL)), the constant here being 1800 * 1000 ms=> 30 minutes!

@vanackej
Copy link
Author

vanackej commented Sep 2, 2020

I can confirm that the problem disappear when I make the permitJoin.cpp:245 contion always false.
I'm not familiar with Zigbee protocol, so what I say here is maybe incorrect. But from my understanding, when the shutter receive the permit join message, it confirms he is already member of the network by doing this quick movement.
It still does not explain why it stops happening after midnight.
Do you know why the REST gateway send a permit join every 30 minutes ? Is it a "standard" Zigbee behavior ? Everything seems to be ok for me with it being disabled

@Smanar
Copy link
Collaborator

Smanar commented Sep 2, 2020

Hu ?? I need explanation for the calcul

1800 * 1000 ms=> 30 minutes!

^^

BTW perhaps that can be usefull for you e10b530#diff-7eef30a0e3da84572a02ee75c64e5264

And yes shutter make this move everytime you set the permit join, but in my mind it was everytime with value not 0 ....

I m still searching explanation.

@vanackej
Copy link
Author

vanackej commented Sep 2, 2020

30 minutes = 30 x 60 seconds = 1800 secondes, then multiply by 1000 to convert into milliseconds ^^
So yes, that was the condition that triggered the permit join after 30 minutes. I even made custom builds with differents values (shorter ones), and it happenned as expected according to the new value.
I tried to build and run master branch (which includes the commit you mentionned), but the problem is the same.
Permit Join with 0 value is still emitted every 30 minutes, and each time the shutter acknowledge by moving up and down.

@Smanar
Copy link
Collaborator

Smanar commented Sep 2, 2020

Lol, ha yes I m making reverse calcul, tired ....., but soon on holiday ^^.

I m asking to manup ATM, because I don't see the utility of this feature. So IDK yet if it s a bug or not.

@Smanar
Copy link
Collaborator

Smanar commented Sep 4, 2020

Ok so some news

The periodic sending of permit join (=0) is on purpose and meant to ensure that the network is closed.
The duration should be the value as given by the API, it's decremented automatically. There were fixes a couple of days back, so hopefully the current alpha should not show problems.

You can try the modification 3 posts above, but I m not sure it change something, your situation is special, and still no information for the reason it stop at midnight.

@vanackej
Copy link
Author

vanackej commented Sep 6, 2020

As I said in my previous message, I already tried the fix above, as I made some tests with master branch, that includes the fix we are talking about. Not better.

About why it stops at midnight, I looked at the code and I think it's pretty obvious: Time comparisons are all made using QTime objects, that only represents time elapsed since midnight, not including the date part. So, when permitJoinLastSendTime is 23:50 and currentTime is 00:00, the diff is a negative value (-10 minutes in my exemple), and will never be more than PERMIT_JOIN_SEND_INTERVAL.
permitJoinLastSendTime should be kept in a QDateTime object.I made the fix locally, and it's now ok. I can make a merge request with the fix.
The "funny" thing is that now, with the "fixed" version, my bug does not disappear after midnight anymore 😭

@Smanar
Copy link
Collaborator

Smanar commented Sep 6, 2020

Lol yes, It for that I prefer wait for an anwser from manup.

I can too put a hack just for your device, but on permit join procedure directly, I don't realy like that.

@vanackej
Copy link
Author

vanackej commented Sep 6, 2020

Is there a channel on the discord server I could be directly involved in the discussion about the issue ?
What is surprising to me, is that according to git history, this bug after midnight is at least present for the last 7 years, and no one noticed. So I think this periodic sending of permit join = 0 is maybe not that usefull...

@Smanar
Copy link
Collaborator

Smanar commented Sep 20, 2020

@vanackej Hello, you have tested the patch ?

@vanackej
Copy link
Author

Hi,
I just tested. The code does not compile, as a semicolon is missing in de_web_plugin_private.h
I fixed that, but the parameter does not seem to be updated. I added a log statement in permitJoin.cpp and gwDisablePJSecurisation value is always 0/false. I'm not familiar with config mecanism (API => database storage and live code value update), but something is wrong here

@vanackej
Copy link
Author

When I fetch back the config using the API, the DisablePJSecurisation is always false. It does not seem to be updated successfully, event if I got a success response during config PUT :
[ { "success": { "/config/DisablePJSecurisation": true } } ]

@Smanar
Copy link
Collaborator

Smanar commented Sep 21, 2020

Ok my bad, it s the problem with making code using only copy/paste.

both error are corrected.

@vanackej
Copy link
Author

vanackej commented Sep 21, 2020 via email

@Smanar
Copy link
Collaborator

Smanar commented Oct 2, 2020

@vanackej you still need the option ? ^^

@vanackej
Copy link
Author

vanackej commented Oct 3, 2020

Yes, I do !
I have issues, the gateway does not seem to start when I replace the original rest plugin lib with your patched version.
Is it compatible with latest stable version (2.05.81) ?

@Smanar
Copy link
Collaborator

Smanar commented Oct 3, 2020

I don't remember, I can update the code with the last one ( > 83), but I don't think using an old version change something for you (no recent hardware ?)

But I m checking the code again, and I don't see why the gateway can be blocked ....

On jeedom, you can change the log level in a configuration part, can you take a look if you see something bad ?

@vanackej
Copy link
Author

vanackej commented Oct 3, 2020 via email

@Smanar
Copy link
Collaborator

Smanar commented Oct 3, 2020

Done, I have updated the code with the last deconz version > 83.

You can miss a file, depend off the last beta version you have used

All available button maps have been outsourced to an external JSON file named button_maps.json, which must reside in the same directory as the deCONZ database zll.db. The purpose is to speed up integration of new switches as users can now create it theirselves (given that some appropriate guidance is provided).

@vanackej
Copy link
Author

vanackej commented Oct 4, 2020

Hi,
I only have v 81 available, even installing the deconz-dev package. Where am I supposed to get v 83 ?
I tried to compile and run the plugin with v81, it does not start

@Smanar
Copy link
Collaborator

Smanar commented Oct 5, 2020

You are on the stable branch, you need to switch to the beta one for other versions if you are using packages.
else you have all versions here http://deconz.dresden-elektronik.de/raspbian/beta/

But I don't think lib V 83 need a deconz V83 too.

On my side I m on deconz version 82, it have started.
tmp2

Edit:

Ha wait a little, it seem there is a problem yes

@vanackej
Copy link
Author

vanackej commented Oct 5, 2020

On my side, the process starts, everything seems ok at first, but then unable to log in using the webapp.
If I restore the original libde_rest_plugin file, it works again.

@Smanar
Copy link
Collaborator

Smanar commented Oct 5, 2020

I don't have same problem, but yes, I have too some problem with the last master, need to search why.

@Smanar
Copy link
Collaborator

Smanar commented Oct 7, 2020

Ok so I have some problem specific to my hardware, other can compile code without problem.
On my side node title are missing, and websocket not working, but can use the webapp, and BTW make a boobkmark after the logging in deconz else you will have a new entry created every time you log in.

So I m using the version 83 beta, with the compiled lib else you need to copy the file button_maps.json in /usr/share/deCONZ/devices

@vanackej
Copy link
Author

Hi,
I've been struggling with my local setup, but it seems ok now. I just disabled periodic permit join in my local setup, I will let you know if it's ok

@Smanar
Copy link
Collaborator

Smanar commented Oct 11, 2020

Hello,
Ok tell me if the setting is working, the date limit for PR is realy close.

@vanackej
Copy link
Author

Hi,

When DisablePJSecurisation is set, the behavior is ok.
But the parameter is not persisted across service restart. As soon as I restart the gateway, DisablePJSecurisation is back to false.

@Smanar
Copy link
Collaborator

Smanar commented Oct 12, 2020

Ok, I m still not understanding ...

I need to recompile on my side to be sure, but nothing special on code.

You can see something in log ? the database is parsed at plugin start.

@vanackej
Copy link
Author

vanackej commented Oct 13, 2020 via email

@Smanar
Copy link
Collaborator

Smanar commented Oct 13, 2020

Arf good catch, the code memorise it with the lowercase and the code work with the upper ...
It s working with this modification ?

@vanackej
Copy link
Author

vanackej commented Oct 13, 2020 via email

@vanackej
Copy link
Author

@Smanar Yeah it's ok. After fixing the typo, the config is properly stored and reloaded after restart.
I think it's ready for a PR

@acacia42
Copy link

acacia42 commented Feb 3, 2021

Bonjour,j’ai aussi le même problème avec mes volets profalux .Je sais que vous avez réglé le problème .Mais je ne sais pas comment intégrer cette solution dans mon système .Merci

@vanackej
Copy link
Author

vanackej commented Feb 3, 2021 via email

@acacia42
Copy link

acacia42 commented Feb 3, 2021

Merci beaucoup,je vais essayer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants