[HaClient] Asyncua ha client #367

Merged
merged 5 commits into from
Jan 1, 2021

Contributor

@ZuZuD ZuZuD commented Dec 19, 2020

See discussion #314.

This adds a Client interface named HaClient which supports two or more servers for high availability purposes.
You'll find some examples and tests to guide you through this new feature.

Under the hood, it starts a keepalive task to monitor the status and service level for each server.

It also includes an HaManager task which promotes a primary and reconnects unhealthy clients. Unhealthy clients are detected based on the socket status and the keepalive data collected.
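As a rough illustration of the promotion step, here is a minimal sketch that picks the primary from the latest keepalive results. The function name `pick_primary`, the `service_levels` mapping, and the convention that `None` marks a server whose keepalive failed are all assumptions made for this example, not the PR's actual code. It also assumes that a higher service level means a healthier server.

```python
from typing import Dict, Optional


def pick_primary(service_levels: Dict[str, Optional[int]]) -> Optional[str]:
    """Return the URL with the highest service level (hypothetical helper).

    `None` as a value means the last keepalive for that server failed,
    so the server is not eligible for promotion.
    """
    healthy = {url: level for url, level in service_levels.items() if level is not None}
    if not healthy:
        # No server answered its keepalive: nothing can be promoted.
        return None
    # On a tie, max() keeps the first URL in insertion order.
    return max(healthy, key=healthy.get)
```

A manager task could call such a helper after each keepalive round and schedule reconnects for every URL whose value is `None`.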

User requests and HaManager decisions together produce a desired state called the "ideal_map". At regular intervals, the reconciliator task kicks in, locks this configuration, and applies the "ideal_map" configuration.
It then stores the actual subscription status in the "real_map" using the lower-level components of the library.
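The ideal/real reconciliation described above can be sketched roughly as follows. This is a simplified model, not the PR's implementation: the maps are reduced to `url -> set of node ids`, and `diff_maps` and `reconcile` are hypothetical names. In the real client, the two loops would issue subscribe/unsubscribe calls and only record what the server actually accepted.

```python
import asyncio
from typing import Dict, Set, Tuple

# Simplified stand-in for both maps: server URL -> monitored node ids.
StateMap = Dict[str, Set[str]]


def diff_maps(ideal: StateMap, real: StateMap) -> Tuple[StateMap, StateMap]:
    """Compute what must be added to / removed from `real` to match `ideal`."""
    to_add: StateMap = {}
    to_del: StateMap = {}
    for url in ideal.keys() | real.keys():
        want = ideal.get(url, set())
        have = real.get(url, set())
        if want - have:
            to_add[url] = want - have
        if have - want:
            to_del[url] = have - want
    return to_add, to_del


async def reconcile(ideal: StateMap, real: StateMap, lock: asyncio.Lock) -> None:
    """Apply the ideal state while holding the lock, then update `real`."""
    async with lock:  # keeps the ideal map from changing mid-reconciliation
        to_add, to_del = diff_maps(ideal, real)
        for url, nodes in to_add.items():
            # A real client would subscribe here before recording the result.
            real.setdefault(url, set()).update(nodes)
        for url, nodes in to_del.items():
            # A real client would unsubscribe here before recording the result.
            real[url] -= nodes
            if not real[url]:
                del real[url]
```

Holding the lock for the whole pass means that user calls which mutate the ideal state wait until the current reconciliation has finished.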

Current limits:

  • We only support HA WARM mode and datachange notifications (no event/status_change).
  • We support multiple subscriptions, but a node should only be subscribed once.

Component details:

  • HaClient:

    • Configuration based on dataclasses: HaConfig (required) and HaConfigSecurity (optional)
    • Responsible for starting the side tasks: keepalive, hamanager, reconciliator
    • Mutates the ideal_map via VirtualSubscription
    • Generic hooks (e.g. to monitor subscription performance)
  • KeepAlive (task):

    • Regularly hits the server to check its service_level / status.
  • HaManager (task):

    • Promotes primary / secondaries
    • Reconnects disconnected/unhealthy clients based on the keepalive
      feedback.
  • Reconciliator (task):

    • Applies the ideal configuration
    • Performs checks on the call responses
    • Mutates the real_map
  • VirtualSubscription:

    • Key component of the ideal and real maps.
    • Exposes an interface similar to the subscription but only
      stores the settings.
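To make the "only stores the settings" idea concrete, here is a small sketch of a settings-only subscription object. The class name `VirtualSubscriptionSketch`, its fields, and its method names are hypothetical and chosen for illustration. They are not the interface merged in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable


@dataclass
class VirtualSubscriptionSketch:
    """Records what a subscription should look like; never talks to a server."""

    period: float  # requested publishing interval (assumed unit: ms)
    handler: Any  # object that would receive data change notifications
    nodes: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def subscribe_data_change(self, node_ids: Iterable[str], queuesize: int = 0) -> None:
        # Store the request only; a reconciler would apply it later.
        for node_id in node_ids:
            self.nodes[node_id] = {"queuesize": queuesize}

    def unsubscribe(self, node_ids: Iterable[str]) -> None:
        # Unknown node ids are ignored so the call stays idempotent.
        for node_id in node_ids:
            self.nodes.pop(node_id, None)
```

Because such an object holds plain data, the same instance can describe the desired state for every server without a connection being open.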

Member

oroulet commented Dec 20, 2020

Thanks. Obviously you have done a lot of work and the code looks very clean.
But this is so much code and it looks so complicated that I am unsure what to do with it. I am afraid it will quickly become a large amount of code that sits in the repository but is rarely used, gets broken, and nobody can fix it...
We can start by waiting a little and see if people use that kind of feature.

@ZuZuD ZuZuD force-pushed the ha_client_asyncua_latest branch 3 times, most recently from 63abfa7 to dacdd1d on December 23, 2020 12:15
CancelledError and TimeoutError have moved from concurrent.futures to
asyncio. Note that CancelledError now inherits from BaseException and
not Exception anymore. See https://bugs.python.org/issue32528 for the
details.
Finally, fix pytest_yield_fixture deprecation
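The practical consequence of the exception change can be shown with a short standalone sketch. It assumes Python 3.8 or later and is not taken from the PR's test suite. The function names are illustrative only.

```python
import asyncio


async def worker() -> None:
    await asyncio.sleep(3600)


async def main() -> str:
    task = asyncio.create_task(worker())
    await asyncio.sleep(0)  # give the worker a chance to start
    task.cancel()
    try:
        await task
    except Exception:
        # Not reached on Python 3.8+: CancelledError is no longer an Exception.
        return "exception"
    except asyncio.CancelledError:
        return "cancelled"
    return "finished"


async def with_timeout() -> str:
    try:
        await asyncio.wait_for(asyncio.sleep(3600), timeout=0.01)
    except asyncio.TimeoutError:  # import from asyncio, not concurrent.futures
        return "timeout"
    return "finished"
```

Code that relied on a broad `except Exception` to swallow cancellations therefore needs an explicit `except asyncio.CancelledError` clause on newer interpreters.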
Contributor Author

ZuZuD commented Dec 24, 2020

> Thanks. Obviously you have done a lot of work and the code looks very clean.
> But this is so much code and it looks so complicated that I am unsure what to do with it. I am afraid it will quickly become a large amount of code that sits in the repository but is rarely used, gets broken, and nobody can fix it...
> We can start by waiting a little and see if people use that kind of feature.

Thanks for your honest feedback @oroulet, and I totally understand your point. To give you more context, I currently use this HaClient in production with more than 20 OPC-UA "clusters". This is just to highlight that this is more than a PoC, and that I'll likely stay an active contributor should you merge this.

I may also know other contributors that are interested or already testing this feature. I will check with them if they can share their experience here to build up more support.

@oroulet oroulet closed this Dec 24, 2020
@oroulet oroulet reopened this Dec 24, 2020
Member

oroulet commented Dec 24, 2020

I closed it by mistake ;-)

class KeepAlive:
    """
    Ping the server status regularly to check its health
    """
Member

@oroulet oroulet Dec 24, 2020

Isn't that KeepAlive class a (better) duplicate of the KeepAlive class in client.py? Could we merge them?

Contributor Author

There was a KeepAlive thread in python-opcua which was replaced by an async task. Since its role is to renew the secure_channel, its frequency depends on the session_timeout and secure_channel_timeout. However, users of the HaClient may want to check the server status more frequently. Keeping those tasks apart feels like the right separation of concerns (SoC).
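A status keepalive that runs on its own period, independent of any channel-renewal task, might look like the sketch below. The names `status_keepalive`, `check`, and `levels` are hypothetical. `check` stands in for whatever coroutine reads the server's service level, and a stored `None` is this example's convention for a failed probe.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional


async def status_keepalive(
    url: str,
    check: Callable[[], Awaitable[int]],
    period: float,
    levels: Dict[str, Optional[int]],
) -> None:
    """Poll one server forever and record its latest service level."""
    while True:
        try:
            # Bound each probe so a hung server cannot stall the loop.
            levels[url] = await asyncio.wait_for(check(), timeout=period)
        except (asyncio.TimeoutError, OSError):
            # Timeouts and connection errors both mark the server unhealthy.
            levels[url] = None
        await asyncio.sleep(period)
```

Each server gets its own task, so `period` can be much shorter than the session or secure channel timeouts without affecting channel renewal. The loop ends only when its task is cancelled.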

await asyncio.gather(*tasks_add, return_exceptions=True)
await asyncio.gather(*tasks_del, return_exceptions=True)

async def update_nodes(self, real_map, ideal_map, targets: Set[str]):
Member

Several methods in that class are too long; they will be hard to maintain.

Contributor Author

Refactored in the following commit.

@oroulet oroulet merged commit f874f7d into FreeOpcUa:master Jan 1, 2021
AndreasHeine pushed a commit that referenced this pull request Feb 5, 2021
* [HaClient] Asyncua client wrapper

It exposes an interface similar to Client. You'll find some examples
and tests to guide you through this new feature.

* [HaClient] fix test for py3.8/py3.9

* [HaClient] address the comments (utils + refacto reconciliator)

Co-authored-by: oroulet <oroulet@users.noreply.github.com>