[Improve][DataProxy] Add DataProxy node load information #6364

Closed
2 tasks done
gosonzhang opened this issue Nov 2, 2022 · 2 comments
Assignees
Labels
stage/stale Issues or PRs that had no activity for a long time type/improve
Milestone

Comments

@gosonzhang
Contributor

Description

While the system is running, the SDK pulls the cluster's list of DataProxy node addresses from the Manager and selects several DataProxy nodes from that list for data reporting. As the system runs, the load on each DataProxy changes with the number of connected SDKs and the volume of messages sent by the business. A DataProxy may become so heavily loaded that it can no longer serve requests, or so lightly loaded, with no SDK connected to it, that its service capacity goes unused.

To solve this problem, I want to add load information to the DataProxy node and expose it to the Manager and the SDK:

  1. DataProxy periodically measures its own load and reports it to the Manager through the heartbeat;
  2. The Manager stores and updates the load information reported by each DataProxy, and provides it to the SDK when the SDK requests the DataProxy information;
  3. Based on the DataProxy IP list and load information provided by the Manager, the SDK excludes the Top N most heavily loaded DataProxy nodes and randomly selects from the remaining nodes for data transmission; the SDK periodically refreshes the DataProxy IP and load information from the Manager;
  4. After the SDK establishes a connection with a DataProxy, it uses the load information returned by that DataProxy in the heartbeat response to decide whether to switch to another DataProxy for data reporting;
  5. When the SDK selects a new DataProxy to connect to for data reporting, it follows step 3 and avoids the Top N most heavily loaded DataProxy nodes.
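The SDK side of steps 3 to 5 could look roughly like the sketch below. It is only an illustration in Python, not InLong code: `ProxyNode`, `eligible_nodes`, `pick_nodes`, `should_switch`, and the integer `load` score are all hypothetical names and assumptions. It also assumes that the exclusion never removes every node, so at least one candidate always remains.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ProxyNode:
    """Hypothetical view of one DataProxy entry returned by the Manager."""
    address: str  # "ip:port"
    load: int     # assumed load score; higher means busier


def eligible_nodes(nodes, top_n):
    """Step 3: drop the top_n most loaded nodes and keep the rest."""
    if top_n <= 0 or len(nodes) <= 1:
        return list(nodes)
    # Assumption: never exclude every node, always leave one candidate.
    drop = min(top_n, len(nodes) - 1)
    by_load = sorted(nodes, key=lambda n: n.load, reverse=True)
    return by_load[drop:]


def pick_nodes(nodes, top_n, count, rng=random):
    """Steps 3 and 5: randomly pick up to `count` non-excluded nodes."""
    candidates = eligible_nodes(nodes, top_n)
    return rng.sample(candidates, min(count, len(candidates)))


def should_switch(current_address, reported_load, nodes, top_n):
    """Step 4: switch if the connected node now ranks in the excluded Top N.

    `reported_load` is the load taken from the heartbeat response; it
    replaces the possibly stale value from the Manager's list. A node that
    is missing from the list is also treated as a reason to switch.
    """
    refreshed = [
        ProxyNode(n.address, reported_load) if n.address == current_address else n
        for n in nodes
    ]
    keep = {n.address for n in eligible_nodes(refreshed, top_n)}
    return current_address not in keep
```

With four nodes loaded at 10, 90, 50, and 70 and `top_n=2`, only the nodes at 10 and 50 stay eligible. If the node at 10 later reports a load of 95 in its heartbeat response, `should_switch` returns `True` for it.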

The overall idea is similar to the following picture:
[image: heartbeat]
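The reporting half of the flow (steps 1 and 2) could be sketched as follows. Everything here is an assumption made for illustration: the `Heartbeat` fields, the `compute_load` formula (the larger of the connection and message utilisation ratios, scaled to 0..100), and the `LoadRegistry` class with its expiry time are hypothetical and do not describe InLong's actual heartbeat protocol.

```python
import time
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """Hypothetical heartbeat payload sent by a DataProxy to the Manager."""
    address: str
    load: int
    reported_at: float  # seconds since the epoch


def compute_load(connections, msgs_per_sec, max_connections, max_msgs_per_sec):
    """Assumed score in 0..100: the busier of the two utilisation ratios."""
    conn_ratio = connections / max_connections
    msg_ratio = msgs_per_sec / max_msgs_per_sec
    return min(100, round(100 * max(conn_ratio, msg_ratio)))


class LoadRegistry:
    """Manager-side store that keeps the latest load per DataProxy."""

    def __init__(self, ttl_seconds=60.0):
        self._ttl = ttl_seconds
        self._latest = {}

    def on_heartbeat(self, hb):
        # A newer heartbeat from the same node overwrites the older one.
        self._latest[hb.address] = hb

    def snapshot(self, now=None):
        """Return address -> load, skipping nodes whose report has expired."""
        now = time.time() if now is None else now
        return {
            addr: hb.load
            for addr, hb in self._latest.items()
            if now - hb.reported_at <= self._ttl
        }
```

Dropping expired reports in `snapshot` is one possible way to keep the SDK from acting on the load of a node that has stopped sending heartbeats; the proposal itself does not specify this.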

Please let me know whether this approach looks OK. If there are no objections, I will try to implement this part.

InLong Component

InLong Manager, InLong DataProxy, InLong SDK

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

@github-actions

github-actions bot commented Jan 2, 2023

This issue is stale because it has been open for 60 days with no activity.

@github-actions github-actions bot added the stage/stale Issues or PRs that had no activity for a long time label Jan 2, 2023
@gosonzhang
Contributor Author

The issue is finished and closed

@gosonzhang gosonzhang added this to the 1.9.0 milestone Sep 11, 2023