Skip to content
This repository has been archived by the owner on Dec 16, 2021. It is now read-only.

Default k8s services domains are hardcoded in trigger, can`t specify it in config #22

Open
mgolovatiy-atconsulting-ru opened this issue Feb 20, 2020 · 14 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@mgolovatiy-atconsulting-ru

Is this a BUG REPORT or FEATURE REQUEST?:
bug

What happened:
After kafka message put into queue, kafka-trigger-controller pod shows error in logs

time="2020-02-20T16:23:07Z" level=error msg="Failed to send message to function: Post http://nodejs-kafka-consumer-sample.<test_namespace_here>.svc.cluster.local:8080: dial tcp: lookup nodejs-kafka-consumer-sample.<test_namespace_here>.svc.cluster.local on 10.233.0.3:53: no such host"

What you expected to happen:
Trigger send message to function pod nodejs-kafka-consumer-sample, can see logged event in its logs.

May be this domain should be got from configMap (kubeless-config?).

How to reproduce it (as minimally and precisely as possible):

  • K8s server, one namespace with all demanded deployments, statefulSets, etc. Have configured kafka+zoo+kafkaHq.
  • Deploy function, trigger:
kubeless function deploy nodejs-kafka-consumer-sample --runtime nodejs12 \
	--dependencies kafkaConsumer_dependency.json \
	--handler kafkaConsumer.consumeTagTopic 	 \
        --from-file kafkaConsumer.js
kubeless trigger kafka create nodejs-kafka-consumer-sample --function-selector created-by=kubeless,function=nodejs-kafka-consumer-sample --trigger-topic nodejs-kafka-consumer-sample-in
  • using kafkaHq add message (topic created by this new function)
  • check trigger pod logs

Anything else we need to know?:
kubelet run with --cluster-domain=k8s.test option
Sources have hardcoded literal with domain:
event_sender.go: 57

 fmt.Sprintf("http://%s.%s.svc.cluster.local:%s", funcName, namespace, funcPort)

Seems to be bug.

Environment:

  • Kubernetes version: 1.11.3
  • Kubeless version: 1.0.5-dirty
  • Cloud provider or physical cluster: physical
  • Kafka statefull set + Zoo statefull set installed/configured before, publishing messages via kafka UI (kafkaHQ)
  • Kafka, Zoo, Triggers, Functions, Kubeless, kubeless-config at one namespace on the test k8s server.
  • HTTP trigger works well (js script in function).
@andresmgot
Copy link
Contributor

Hi @mgolovatiy-atconsulting-ru,

Thanks for reporting the issue. You mention that the cluster domain has been changed in your case. What would be the correct URL for your function? (Rather than http://%s.%s.svc.cluster.local:%s)

@mgolovatiy-atconsulting-ru
Copy link
Author

mgolovatiy-atconsulting-ru commented Feb 21, 2020

svc.cluster.local -> k8s.test that specified as cluster-domain param in kube.
So correct solution should be
fmt.Sprintf("http://%s.%s.%s:%s", funcName, namespace, kubeDomain, funcPort)
where kubeDomain should be configured in configMap or another place.

Or, may be, should query domain via k8s api, don`t know if it is possible. If i know go, i could make PR as soon as noticed this.

@andresmgot
Copy link
Contributor

In theory the .svc is not part of the domain so http://%s.%s.svc.%s:%s", funcName, namespace, kubeDomain, funcPort) should work for you as well.

I see that parameter available in the kubelet configmap present in the kube-system namespace but it seems a bit difficult to retrieve. Probably it's better to configure it just as a flag for the kafka-trigger-controller deployment.

Marking this up for grabs since I don't have the bandwidth to work on this right now. Happy to help to anyone to want to give this a try.

@mgolovatiy-atconsulting-ru
Copy link
Author

mgolovatiy-atconsulting-ru commented Feb 25, 2020

@andresmgot I suggest a solution without introducing an enviroment variable:

  1. kubeless-> add getClusterDomain function (in utils.k8sutil) - shall it be usefull?
  2. kafka-trigger -> add the same function in utils.k8sutil (is it another package?)
func getClusterDomain() string {
	apiSvc := "kubernetes.default.svc"

	clusterDomain := "cluster.local"

	cname, err := net.LookupCNAME(apiSvc)
	if err != nil {
		return clusterDomain
	}

	clusterDomain = strings.TrimPrefix(cname, apiSvc)
	clusterDomain = strings.TrimSuffix(clusterDomain, ".")

	return clusterDomain
}

Seems to be pretty simple.

  1. kafka-trigger -> fix event_sender.GetHTTPReq :
   req, err := http.NewRequest(method, fmt.Sprintf("http://%s.%s.svc.%s:%s", funcName, namespace, getClusterDomain(), funcPort), strings.NewReader(body))

Don't have unix, so cant test well, even can't run makefile.

Need CR due to my zero go-expirence

@andresmgot
Copy link
Contributor

Comments in the PR. I think it's easier to leave this code in kafka trigger for the moment (no need to move it kubeless core).

@mgolovatiy-atconsulting-ru
Copy link
Author

Couldn`t find in which repo new image was put, wish to use it with my test enviroment. Can i use some snapshot or only must wait for stable version?

@andresmgot
Copy link
Contributor

I have a working environment so I have built the image for you, you can use it as andresmgot/kafka-trigger-controller:mgolovatiy-atconsulting-ru and test it

@mgolovatiy-atconsulting-ru
Copy link
Author

Thank u

@mgolovatiy-atconsulting-ru
Copy link
Author

My solution doesn't work, have the same error. Don't understand why net.LookupCNAME("kubernetes.default.svc") returns wrong domain.

This question seems to be raised long before: vmware-archive/kubeless#832
I think it is a good idea to specify domain on creating function or trigger. May be it should be reopened?

@andresmgot
Copy link
Contributor

Sorry to hear that.

I think it is a good idea to specify domain on creating function or trigger. May be it should be reopened?

Well, it's not related to a specific function or trigger but to a cluster. I think this should be a flag for the trigger controller just as it was started in vmware-archive/kubeless#857.

I think you are close with your implementation, you just need to use a command flag instead of trying to auto-resolve the domain.

@andresmgot
Copy link
Contributor

closed by mistake

@andresmgot andresmgot reopened this Mar 2, 2020
@mgolovatiy-atconsulting-ru
Copy link
Author

mgolovatiy-atconsulting-ru commented Mar 2, 2020

May be PR vmware-archive/kubeless#857 is good enough?
I only don't like empty param check (if clusterDomain == "" ), should not it be checked to null too?

@mgolovatiy-atconsulting-ru
Copy link
Author

I wish to test kubeless image with PR #857 changes in my enviroment

@andresmgot
Copy link
Contributor

#857 is not syntactically correct (it doesn't compile) so it's not usable as it is

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants