Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pre-configuration based assignment #11578

Conversation

jasperjiaguo
Copy link
Contributor

@jasperjiaguo jasperjiaguo commented Sep 12, 2023

https://docs.google.com/document/d/1xxPkGPxyY21gAkFi9gtFDeSzEXjPjp-IQW70kHynsL8/edit#heading=h.nit8xkm24pdw

Assign instances based on a pre-configured instance assignment map. The assignment will always respect the mirrored servers in the pre-configured map. E.g.

  uplift from 15 instances in 3 replicas to 21 instance in 3 replicas
  21 instances in 4 pools
  Pre-configured partitioning:
          RG1    RG2    RG3
    Host  0      1      2
    Host  3      4      5
    Host  6      7      8
    Host  9      10     11
    Host  12     13     14
    Host  15     16     17
    Host  18     19     20
  
  Existing configured partitioning:
          RG1    RG2    RG3
    Host  0      6      2
    Host  12     7      14
    Host  1      4      5
    Host  3      13     8
    Host  9      10     11
  
  Final assignment for this table:
          RG1    RG2    RG3
    Host  0      1      2
    Host  12     13     14
    Host  3      4      5
    Host  6      7      8
    Host  9      10     11
    Host  15     16     17
    Host  18     19     20

In the above assignment, the mirrored servers (row) in the pre-configured partitioning will always be respected when converging the existing assignment to the final assignment, meanwhile, we preserve the existing assignment as much as we can in this process.

Tested using randomized testing scheme with 10000000 iterations. No errors.

@codecov-commenter
Copy link

codecov-commenter commented Sep 12, 2023

Codecov Report

Merging #11578 (8479b70) into master (e826a23) will decrease coverage by 1.75%.
Report is 57 commits behind head on master.
The diff coverage is 67.13%.

@@             Coverage Diff              @@
##             master   #11578      +/-   ##
============================================
- Coverage     63.20%   61.45%   -1.75%     
- Complexity     1141     1147       +6     
============================================
  Files          2343     2376      +33     
  Lines        126343   128744    +2401     
  Branches      19439    19899     +460     
============================================
- Hits          79854    79122     -732     
- Misses        40821    43898    +3077     
- Partials       5668     5724      +56     
Flag Coverage Δ
custom-integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration <0.01% <0.00%> (-0.01%) ⬇️
integration1 <0.01% <0.00%> (-0.01%) ⬇️
integration2 0.00% <0.00%> (ø)
java-11 61.41% <67.13%> (-1.76%) ⬇️
java-17 ?
java-20 ?
java-21 61.30% <67.13%> (?)
skip-bytebuffers-false 61.44% <67.13%> (?)
skip-bytebuffers-true 61.28% <67.13%> (?)
temurin 61.45% <67.13%> (-1.75%) ⬇️
unittests 61.45% <67.13%> (-1.75%) ⬇️
unittests1 46.62% <11.76%> (-20.77%) ⬇️
unittests2 27.68% <66.43%> (+13.28%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
.../assignment/instance/InstanceAssignmentDriver.java 100.00% <100.00%> (ø)
.../main/java/org/apache/pinot/core/auth/Actions.java 0.00% <ø> (ø)
...fig/table/assignment/InstanceAssignmentConfig.java 90.00% <100.00%> (+0.52%) ⬆️
...ent/instance/InstancePartitionSelectorFactory.java 42.85% <50.00%> (+2.85%) ⬆️
...mmon/assignment/InstanceAssignmentConfigUtils.java 0.00% <0.00%> (ø)
...he/pinot/segment/local/utils/TableConfigUtils.java 77.67% <50.00%> (+5.65%) ⬆️
...ntroller/helix/core/PinotHelixResourceManager.java 63.56% <38.46%> (+0.59%) ⬆️
...ance/MirrorServerSetInstancePartitionSelector.java 92.94% <92.94%> (ø)
...ources/PinotInstanceAssignmentRestletResource.java 65.84% <22.72%> (-3.64%) ⬇️
...ntroller/helix/core/rebalance/TableRebalancer.java 66.33% <25.00%> (-2.20%) ⬇️
... and 1 more

... and 289 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@jasperjiaguo jasperjiaguo force-pushed the PreConfiguredInstancePartitionSelector branch 3 times, most recently from b52d864 to 4d822e8 Compare September 14, 2023 21:41
@jasperjiaguo jasperjiaguo marked this pull request as ready for review September 14, 2023 21:42
@jasperjiaguo jasperjiaguo force-pushed the PreConfiguredInstancePartitionSelector branch from 4d822e8 to 745e6b0 Compare September 14, 2023 21:55
@Jackie-Jiang
Copy link
Contributor

What is the difference between this and #8989?

add tenant partition map rest resource

fix logic

refactor and fix some logic

pre-configuration based assignment
@jasperjiaguo jasperjiaguo force-pushed the PreConfiguredInstancePartitionSelector branch from 745e6b0 to 60877d1 Compare October 19, 2023 23:10
@Jackie-Jiang Jackie-Jiang added feature Configuration Config changes (addition/deletion/change in behavior) labels Oct 24, 2023
@Jackie-Jiang
Copy link
Contributor

Do you have a design doc for this? Trying to understand what we want to achieve in this PR

Copy link
Contributor

@vvivekiyer vvivekiyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Flushing out partial comments. I'll review the rest of the files meanwhile.

@siddharthteotia
Copy link
Contributor

Do you have a design doc for this? Trying to understand what we want to achieve in this PR

Let's meet to share our context on where we are coming from and the overall goal here. We will setup a meeting

@vvivekiyer
Copy link
Contributor

@jasperjiaguo Capturing our offline discussion here:

The configuration for enabling this feature works as follows:

  1. PropertyStore will have a tenant-level instance assignment mapping. The name of the ZNode is the tenant name (with instancePartitionType i.e offline,consuming,realtime)
  2. TableConfig.instanceAssignmentConfig will contain "partitionSelector" that denotes whether we have to use MIRROR_SERVER_SET_PARTITION_SELECTOR or not.
  3. We additionally overload instancePartitionsMap in tableConfig with detail about the Znode where the tenant-level instance assignment mapping (1) is stored.

There seems to be a few things we can simplify here:
(1) Remove the additional level of redirection i.e populating instancePartitionsMap can be avoided:

  • Each table belongs to a specific server tenant
  • Each tenant has a dedicated Zk node in PropertyStore that stores the instance assignment mapping.
  • So, given a table, we should automatically be able to figure out the corresponding the tenant-level instance assignment mapping from Zk.

(or)

(2) If we need this extra detail, we should probably avoid overloading the already existing tableConfig.instancePartitionsMap with this mapping detail (even though they are mutually exclusive) and have a new field for it

@jasperjiaguo is going to look at the possibility for this. If the change is non-trivial, it's better left as a TODO.

@jasperjiaguo jasperjiaguo force-pushed the PreConfiguredInstancePartitionSelector branch from 208b079 to 367af71 Compare October 26, 2023 22:41
Copy link
Contributor

@somandal somandal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the detailed tests!

assignmentConfig, instanceConfigs, existingInstancePartitions, null);
}

public InstancePartitions assignInstances(InstancePartitionsType instancePartitionsType,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can the existing assignInstances(InstancePartitionsType instancePartitionsType, List<InstanceConfig> instanceConfigs, @Nullable InstancePartitions existingInstancePartitions) call this function since the remaining setup is the same? just to avoid too much code duplication

@siddharthteotia siddharthteotia merged commit 2e37f2c into apache:master Oct 30, 2023
19 checks passed
@walterddr
Copy link
Contributor

this PR broken ColocatedJoinQuickStart can we take a look CC @jasperjiaguo @siddharthteotia

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI error:

2023/10/31 12:27:38.097 ERROR [TableConfigsRestletResource] [grizzly-http-server-10] null
java.lang.NullPointerException: null
	at org.apache.pinot.common.assignment.InstanceAssignmentConfigUtils.isMirrorServerSetAssignment(InstanceAssignmentConfigUtils.java:132) ~[classes/:?]
	at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.assignInstances(PinotHelixResourceManager.java:1757) ~[classes/:?]
	at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.addTable(PinotHelixResourceManager.java:1595) ~[classes/:?]
	at org.apache.pinot.controller.api.resources.TableConfigsRestletResource.addConfig(TableConfigsRestletResource.java:227) ~[classes/:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
	at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52) ~[jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:134) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:177) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:219) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:81) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:256) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:292) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:274) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.internal.Errors.process(Errors.java:244) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265) [jersey-common-2.39.jar:?]
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:235) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:684) [jersey-server-2.39.jar:?]
	at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:356) [jersey-container-grizzly2-http-2.39.jar:?]
	at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [grizzly-http-server-2.4.4.jar:2.4.4]
	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569) [grizzly-framework-2.4.4.jar:2.4.4]
	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549) [grizzly-framework-2.4.4.jar:2.4.4]
	at java.lang.Thread.run(Thread.java:829) [?:?]
2023/10/31 12:27:38.105 INFO [AddTableCommand] [main] {"code":500,"error":null}
java.lang.RuntimeException: Unable to create offline table - userGroups from schema file [/var/folders/bq/f8s34x8163v0609ry22yqkf40000gn/T/1698780410918/userGroups/userGroups_schema.json] and table conf file [/var/folders/bq/f8s34x8163v0609ry22yqkf40000gn/T/1698780410918/userGroups/userGroups_offline_table_config.json].
	at org.apache.pinot.tools.BootstrapTableTool.bootstrapOfflineTable(BootstrapTableTool.java:191)
	at org.apache.pinot.tools.BootstrapTableTool.execute(BootstrapTableTool.java:104)
	at org.apache.pinot.tools.admin.command.QuickstartRunner.bootstrapTable(QuickstartRunner.java:232)
	at org.apache.pinot.tools.Quickstart.execute(Quickstart.java:85)
	at org.apache.pinot.tools.admin.command.QuickStartCommand.execute(QuickStartCommand.java:178)

@walterddr
Copy link
Contributor

should be fixed by #11915

@jasperjiaguo
Copy link
Contributor Author

should be fixed by #11915

Thanks for the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Configuration Config changes (addition/deletion/change in behavior) feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants