Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds partitioning library #5169

Merged

Conversation

davidporter-id-au
Copy link
Contributor

  • Adds an interface for handling workflow isolation group redirection and failover
  • Adds an interface for handling drain states for both global and domain-based drains
  • Adds relevant internal types
  • Adds a simple interface for handling global drains as stored in database-based dynamic config
  • Adds a default implementation for the open-source implementation and implementation example

- Adds an interface for handling workflow isolation group redirection and failover
- Adds an interface for handling drain states for both global and domain-based drains
- Adds relevant internal types
- Adds a simple interface for handling global drains as stored in database-based dynamic config
- Adds a default implementation for the open-source implementation and implementation example
@davidporter-id-au davidporter-id-au marked this pull request as ready for review March 20, 2023 22:27

// it's drained, fall back to picking a deterministic but random group
availableList := []string{}
for k, v := range available {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

available seems to only contain healthy isolation groups.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is the intention, yes? I'm not quite sure I understand your comment.?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it only contains healthy groups, we don't need the if condition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, sorry, missed that, yep

// it's drained, fall back to picking a deterministic but random group
availableList := []string{}
for k, v := range available {
if v.State == types.IsolationGroupStateDrained {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest changing the condition to be v.State != types.IsolationGroupStateHealthy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a deny-based implementation. we're only skipping zones which have been drained. Let me sync to make sure we're on the same page here.

// drain.
func pickIsolationGroup(wfPartition defaultWorkflowPartitionConfig, available types.IsolationGroupConfiguration) string {
wfIG, isAvailable := available[wfPartition.WorkflowStartIsolationGroup]
if isAvailable && wfIG.State != types.IsolationGroupStateDrained {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest changing the condition to be wfIG.State == types.IsolationGroupStateHealthy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a deny-based implementation. Isolation groups are explicitly assumed to be healthy unless specified as drained, so I can't rely on their being configured here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the state is not a boolean, we might introduce other state that is not healthy.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's possible, but I'm maybe not understanding your point. Actually explicitly healthy groups will be not included in this list at all under this implementation, so picking healthy explicitly is not an option.

return nil, fmt.Errorf("could not resolve domain in isolationGroup handler: %w", err)
}
domainState := domainData.GetInfo().IsolationGroupConfig
globalCfg, err := z.globalIsolationGroupDrains.FetchConfig(ctx, persistence.GlobalIsolationGroupConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need some cache for configStore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but if it's ok let me follow it up. I'd like to do that in a subsequent PR


// Config is the base configuration for the partitioning library
type Config struct {
ZonalPartitioningEnabledGlobally dynamicconfig.BoolPropertyFnWithDomainIDFilter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need both ZonalPartitioningEnabledGlobally and ZonalPartitioningEnabledForDomain?
And we might need a better name, like EnableTasklistIsolation or something.

Copy link
Contributor Author

@davidporter-id-au davidporter-id-au Mar 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that name is out of date, let me fix, sorry. I would normally, for pure flipr, just use the one switch, but we probably can't push arbitrary values with overrides there without a bigger refactor

// Config is the base configuration
type Config struct {
ZonalPartitioningEnabledGlobally dynamicconfig.BoolPropertyFnWithDomainIDFilter
ZonalPartitioningEnabledForDomain dynamicconfig.BoolPropertyFnWithDomainFilter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need both dynamic config?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would we represent rollout on both a global and domain level? I'm totally open to the idea that maybe we don't care about this level of granularity, but I thought for rollout it's sufficiently high-risk that enabling it on a per-domain-basis might make sense.

@coveralls
Copy link

coveralls commented Mar 22, 2023

Pull Request Test Coverage Report for Build 0187126b-39a2-4627-b57a-414e7d443938

  • 45 of 165 (27.27%) changed or added relevant lines in 2 files are covered.
  • 97 unchanged lines in 20 files lost coverage.
  • Overall coverage decreased (-0.02%) to 57.041%

Changes Missing Coverage Covered Lines Changed/Added Lines %
common/partition/default-partitioner.go 17 43 39.53%
common/isolationgroup/default-isolation-group-state.go 28 122 22.95%
Files with Coverage Reduction New Missed Lines %
service/history/queue/transfer_queue_processor_base.go 1 77.62%
common/persistence/nosql/nosqlplugin/cassandra/tasks.go 2 75.57%
common/task/fifoTaskScheduler.go 2 84.54%
common/task/parallelTaskProcessor.go 2 92.75%
common/task/weightedRoundRobinTaskScheduler.go 2 89.64%
service/matching/matchingEngine.go 2 73.33%
common/log/tag/tags.go 3 50.2%
common/persistence/nosql/nosqlplugin/cassandra/workflow.go 3 59.55%
common/persistence/nosql/nosqlTaskStore.go 3 64.17%
common/persistence/statsComputer.go 3 94.64%
Totals Coverage Status
Change from base Build 0186f2fa-8ee8-4c79-b4f4-91fca6856837: -0.02%
Covered Lines: 85510
Relevant Lines: 149910

💛 - Coveralls


// Config is the base configuration for the partitioning library
type Config struct {
IsolationGroupEnabledGlobally dynamicconfig.BoolPropertyFnWithDomainIDFilter
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not removed

@davidporter-id-au davidporter-id-au merged commit 940bdb4 into feature/zonal-partitioning Mar 24, 2023
@davidporter-id-au davidporter-id-au deleted the feature/zonal-partitioning-lib-ii branch March 24, 2023 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants