Fix issue with long data source names #14620
Conversation
| .toLowerCase(Locale.ENGLISH), 63); | ||
| } | ||
|
|
||
| public static String convertTaskIdToJobName(String taskId) |
What's the limit on the length of a job name?
https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/ (it's 63 characters)
Do we still need this conversion if the taskId is shorter than 63 chars? That is, would it be valuable to keep the JobName the same as the taskId when possible?
We do need to, since task ids can contain characters that are not valid in k8s labels/job names.
| } | ||
| catch (Exception e) { | ||
| throw new KubernetesResourceNotFoundException("K8s pod with label: job-name=" + k8sTaskId + " not found"); | ||
| throw new KubernetesResourceNotFoundException("K8s pod with label: job-name=" + jobName + " not found"); |
If we pass jobName instead of K8sTaskId, we lose the originalTaskId in the log. Not sure whether it's useful for troubleshooting.
Hmm, I think it might be, although technically the k8sTaskId being logged right now is not the real one because of the bug I mentioned above. Maybe I can find a way to get the actual task id in there.
| public static String convertTaskIdToJobName(String taskId) | ||
| { | ||
| return taskId == null ? "" : StringUtils.left(RegExUtils.replaceAll(taskId, K8S_TASK_ID_PATTERN, "") | ||
| .toLowerCase(Locale.ENGLISH), 30) + "-" + Hashing.murmur3_128().hashString(taskId, StandardCharsets.UTF_8); |
Hashing.murmur3_128().hashString(taskId, StandardCharsets.UTF_8)
How many characters will this produce? I couldn't easily figure out if there is a limit on the length of the string returned from this function
128 bits -> 32 hex characters, so the job name is at most 30 + 1 + 32 = 63 characters
| K8sTaskId taskId = new K8sTaskId(job.getMetadata().getName()); | ||
| log.info("Successfully submitted job: %s ... waiting for job to launch", taskId); | ||
| String jobName = job.getMetadata().getName(); | ||
| log.info("Successfully submitted job: %s ... waiting for job to launch", jobName); |
It would be good to report the taskId in this log line, to make it easy to map to the task_id that is being started.
* Fix issue with long data source names
* Use the regular library
* fix overlord utils test
Fixes an issue where data source names (or, more generally, task ids) that are too long cause the K8sTaskRunner to throw errors.
Description
The K8sTaskRunner truncates task ids to get a valid K8s job name (less than 64 characters), and doing so can cause a loss of uniqueness. I fixed this by appending a 128-bit hash to the end of the truncated task id, which I think should be sufficient to ensure uniqueness. (The change is in KubernetesOverlordUtils.)
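To illustrate the naming scheme described above, here is a minimal, JDK-only sketch. It is not the code from this PR: the class name `JobNameSketch`, the `INVALID_CHARS` pattern (a hypothetical stand-in for `K8S_TASK_ID_PATTERN`), and the use of MD5 in place of Guava's `murmur3_128` are all assumptions made so the example runs without extra dependencies. Both hashes are 128 bits wide, so the length arithmetic is the same.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;
import java.util.regex.Pattern;

class JobNameSketch
{
  // Assumption: stand-in for K8S_TASK_ID_PATTERN, strips everything except ASCII letters and digits.
  private static final Pattern INVALID_CHARS = Pattern.compile("[^a-zA-Z0-9]");

  // Renders a 128-bit digest as exactly 32 lowercase hex characters.
  // MD5 is used only because it ships with the JDK; the PR itself uses Guava's murmur3_128.
  static String hash128Hex(String input)
  {
    try {
      byte[] digest = MessageDigest.getInstance("MD5").digest(input.getBytes(StandardCharsets.UTF_8));
      return String.format(Locale.ENGLISH, "%032x", new BigInteger(1, digest));
    }
    catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }

  // Up to 30 sanitized characters + "-" + 32 hex characters = at most 63 characters.
  static String convertTaskIdToJobName(String taskId)
  {
    if (taskId == null) {
      return "";
    }
    String cleaned = INVALID_CHARS.matcher(taskId).replaceAll("").toLowerCase(Locale.ENGLISH);
    String prefix = cleaned.length() > 30 ? cleaned.substring(0, 30) : cleaned;
    // The hash is computed over the full, unmodified task id, so ids that share a prefix still differ.
    return prefix + "-" + hash128Hex(taskId);
  }
}
```

Two long task ids that differ only near the end get the same 30-character prefix but different hash suffixes, so the resulting job names stay distinct while fitting the 63-character limit.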
I also had to fix a related issue that was covered up by the fact that our truncation function can be applied repeatedly without any effect.
The issue is this line in KubernetesPeonClient:
K8sTaskId taskId = new K8sTaskId(job.getMetadata().getName());
This takes the job name from the k8s api (which has already been truncated by our truncation function since it is a job name), and then passes it into the K8sTaskId constructor as the originalTaskId. The K8sTaskId calls the truncation function a second time to generate the k8sTaskId. This is okay when our truncation function just grabs the first 63 characters of the task id since repeatedly applying this won't do anything, but with my latest changes this is no longer the case.
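The difference between the two behaviours can be shown with a small, JDK-only sketch. The names `IdempotencySketch`, `oldName`, and `newName` are hypothetical, the sanitizing pattern is an assumed stand-in for `K8S_TASK_ID_PATTERN`, and MD5 substitutes for Guava's `murmur3_128` purely to avoid an external dependency.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

class IdempotencySketch
{
  // Assumption: sanitizing keeps only ASCII letters and digits, lowercased.
  static String sanitize(String s)
  {
    return s.replaceAll("[^a-zA-Z0-9]", "").toLowerCase(Locale.ENGLISH);
  }

  static String left(String s, int n)
  {
    return s.length() > n ? s.substring(0, n) : s;
  }

  // Old behaviour: sanitize, then keep the first 63 characters.
  static String oldName(String taskId)
  {
    return left(sanitize(taskId), 63);
  }

  // New behaviour: 30-character prefix, a dash, and a 128-bit hash of the input (MD5 as a stand-in).
  static String newName(String taskId)
  {
    try {
      byte[] digest = MessageDigest.getInstance("MD5").digest(taskId.getBytes(StandardCharsets.UTF_8));
      return left(sanitize(taskId), 30) + "-" + String.format(Locale.ENGLISH, "%032x", new BigInteger(1, digest));
    }
    catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Applying `oldName` to its own output returns the same string, which is why the double conversion went unnoticed. Applying `newName` to its own output hashes the job name rather than the original task id, so the suffix changes and the result no longer matches the job that was actually submitted.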
I updated this logic to just use the job name directly from the kubernetes api (that's all it needs anyways) and I had to change a lot of function signatures to do this. I also explicitly changed the k8sTaskId field to k8sJobName since that's the only thing that field is used for.
This does make it slightly more confusing to find out which task a job corresponds to from a k8s client like k9s, but the PodTemplateTaskAdapter also applies the whole unmodified task id as an annotation.
Release note
Key changed/added classes in this PR
* KubernetesOverlordUtils
* KubernetesPeonClient
* K8sTaskId

This PR has: