-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature k8s pods log extraction #445
Conversation
Include the conodition to each step
Otherwise subprocess.run does not have the argument `capture_output`
see https://dev.azure.com/data61/Anonlink/_build/results?buildId=614&view=results for a failed build publishing the pods logs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be so very helpful!
json.dump(info, f, indent=2, sort_keys=True) | ||
f.write('\n\n') | ||
with open(_directory / 'pod-{}_container-{}.log'.format(info['short_pod_name'], info['name']), 'at') as f: | ||
get_logs_container(info, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixing json and logging output is a bit strange
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the information is relevant.
From
{
"full_pod_name": "esdf764b0f-memstore-server-0",
"image": "redis:4.0.11-alpine",
"name": "redis",
"restart_count": 0,
"short_pod_name": "er-0"
}
[logs]
would
"full_pod_name": "esdf764b0f-memstore-server-0",
"image": "redis:4.0.11-alpine",
"name": "redis",
"restart_count": 0,
"short_pod_name": "er-0"
----------------------------------------------------------------
[logs]
be better?
The most interesting from it it the image and the restart count. The other information are more relevant if the deployment was staying up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure the info is great. It was more of an observation, no need to change it.
.azurePipeline/getPodsLogs.py
Outdated
name_container = container.get('name') | ||
restart_count = container.get('restartCount') | ||
image_name = container.get('image') | ||
info = {'full_pod_name': pod_name, 'short_pod_name': pod_name[len('benchmark-es-data61-xyz-'):], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
benchmark-es-data61-xyz-
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That explains why the short pod name was quite small... 👍
inputs: | ||
artifactName: PodLogs | ||
targetPath: ${{ parameters.logsFolderPath }} | ||
condition: failed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: newline
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
- task: UsePythonVersion@0 | ||
inputs: | ||
versionSpec: '3.7' | ||
condition: failed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest we use Azure's variables feature here so we can decide in the ci/cd system to keep the logs without having to inject a test failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see the point of using variables for this. In my mind, there three options:
- we should publish the logs for every build (changing the condition from
failed()
toalways()
- we should never publish the logs (which I would assume is not preferred, otherwise we wouldn't have opened the issue Azure Devops CI isn't saving container logs #418 )
- the logs are useful in some cases.
I went with the third option as I do not think that they are useful when a build succeeds (but I'm not feeling to strongly about it).
However, adding a variable true
or false
is in my opinion a bad idea, as I do not believe a developer should have to modify the CI script on a per branch basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compromise: the logs are always provided.
echo "Copy the pods' logs from the release '${{ parameters.releaseName }}' to '${{ parameters.logsFolderPath }}'." | ||
python .azurePipeline/getPodsLogs.py ${{ parameters.logsFolderPath }} ${{ parameters.releaseName }} | ||
displayName: 'Copy the logs from the service pods.' | ||
condition: failed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to repeat the condition in this template, can we lift it up to the calling location in azure-pipelines.yml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The step template
does not accept a condition
field, which is why I needed to bring them back here.
|
||
|
||
def get_logs_container(container_info, file, previous=False): | ||
cmd = "kubectl --namespace {} logs {} {}".format(NAMESPACE, container_info['full_pod_name'], container_info['name']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note this command gets logs with ascii color codes... Is this what you want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use ansioclors to remove colors from logs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, of course if we are outputting to a console the colors are really good to keep. All good by me either way
It will close #418 |
…a61-xyz-` to create short name.
…ption of container.
json.dump(info, f, indent=2, sort_keys=True) | ||
f.write('\n\n') | ||
with open(_directory / 'pod-{}_container-{}.log'.format(info['short_pod_name'], info['name']), 'at') as f: | ||
get_logs_container(info, f) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure the info is great. It was more of an observation, no need to change it.
|
||
|
||
def get_logs_container(container_info, file, previous=False): | ||
cmd = "kubectl --namespace {} logs {} {}".format(NAMESPACE, container_info['full_pod_name'], container_info['name']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, of course if we are outputting to a console the colors are really good to keep. All good by me either way
Added explanation for the reason of the JobAttempt parameter.
Add an extra step in our azure pipeline to publish the logs of the entity-service pods when tests are failing.
First, use a python script to download the logs (easier with python than bash, and I already had the script locally), then, create an azure template for steps running the python script and publishing the logs in azure artifacts, finally integrate this template in the main pipeline.
Note: I included a commit which will need to be reversed: I introduce a test made to fail (directly raising an exception).