Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch riak k8s riak config to pass all riak replicas to the DLService #74

Merged
merged 3 commits into from
Jul 22, 2020

Conversation

iakkus
Copy link
Member

@iakkus iakkus commented Jul 22, 2020

… (instead of just the service name)

This patch passes all riak replica addresses to the DataLayerService, so that it can form its own cluster view with those replicas.

Currently, the riak service URL is passed to the DataLayerService, such that its cluster view consists of only a single riak node. As a result, it falls back to eventual consistency instead of going for strong consistency for KV pairs.

@iakkus iakkus requested a review from manuelstein July 22, 2020 07:56
@iakkus iakkus changed the title patch riak k8s riak config to pass all riak replicas to the DLService… patch riak k8s riak config to pass all riak replicas to the DLService Jul 22, 2020
Copy link
Collaborator

@manuelstein manuelstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, to address individual pods of a service, one also needs to make sure it's a headless service, i.e. the Service rk-<release> needs to have clusterIP: None, otherwise a pod name like rk-<release>-0.rk-<release>.<namespace>.svc won't resolve.

The problem with creating this connect string like we do for DataLayerService is that it won't be updated when it is scaled, e.g. when the Riak cluster scales up or down. Same for DataLayerService's ALL_DATALAYER_BIND (which isn't even a BIND address but a connect string).

FYI when you can't test this on a k8s cluster, you can at least generate the helm template using

helm template \<release\> deploy/helm/microfunctions

and see if the generated string is correct.

While the DLService's connection string to find all other datalayers might soon vanish, this is not a solution we should apply here but make sure a DLService

  • knows as many nodes as it needs (e.g. 5 for a replica count of 3, there's an old article that suggests 5 nodes original link has gone but a copy of the article is on the Internet
  • keeps updating its view of the Riak Cluster in case it is scaling up or down to balance the load accordingly; it seems you found out that it never really updates, maybe we need to change the use of the RiakClient here?

And I think it should be a deployment option whether or not to use this "strong consistency" model which requires a bare minimum cluster size of 3. Then we can have the DLService wait for the minimum of 3 nodes before offering its service.

{{- $port := index .Values "riak" "ClientPortProtobuf" | toString -}}
rk-{{$rlsname}}.{{$namespace}}.svc:{{$port}}
{{- range $num, $e := until (.Values.riak.replicas|int) -}}
{{- printf "rk-%s-%d.%s.svc:%d" $.Release.Name $num $.Release.Namespace ($.Values.riak.ClientPortProtobuf|int) -}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is missing the service name, e.g.
rk-<release>-0.rk-<release>.<namespace>.svc

@iakkus
Copy link
Member Author

iakkus commented Jul 22, 2020

Thanks!

I agree that the scaling problems will still persist. This PR is only supposed to ensure that the DLService starts aware of all the riak replicas so that it can choose the consistency model.

I think we should handle the scaling up/down riak/DLService as a separate issue.

Copy link
Collaborator

@manuelstein manuelstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, wait, the service name is rk-<release>, not just <release>

@iakkus
Copy link
Member Author

iakkus commented Jul 22, 2020

Ok, I didn't see that part; I'll add it.

That said, for the datalayer, the service name is always "datalayer". Shouldn't we also make it depend on the <release>?

Replace the following:

{{- printf "dl-%s-%d.datalayer.%s.svc:%d" $.Release.Name $num $.Release.Namespace ($.Values.datalayer.port|int) -}}

with the following:

{{- printf "dl-%s-%d.dl-%s.%s.svc:%d" $.Release.Name $num $.Release.Name $.Release.Namespace ($.Values.datalayer.port|int) -}}

@manuelstein manuelstein self-requested a review July 22, 2020 12:55
@paarijaat
Copy link
Collaborator

The riak changes look fine.

Regarding the "datalayer" service name change to include release name. This will require changes to atleast 3 to 4 files. I would suggest we handle that in a separate issue.

@iakkus
Copy link
Member Author

iakkus commented Jul 22, 2020

Great, thanks!
I'll merge this now.

@iakkus iakkus merged commit 4752e3f into knix-microfunctions:develop Jul 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants