Feature k8s pods log extraction #445

gusmith · 2019-10-01T06:32:47Z

Add an extra step in our azure pipeline to publish the logs of the entity-service pods when tests are failing.
First, a Python script downloads the logs (easier in Python than in bash, and I already had the script locally). Then an Azure template defines the steps that run the script and publish the logs as Azure artifacts. Finally, the main pipeline uses this template.

Note: I included a commit that will need to be reverted: it introduces a test made to fail (it directly raises an exception).


gusmith · 2019-10-01T07:26:16Z

See https://dev.azure.com/data61/Anonlink/_build/results?buildId=614&view=results for a failed build publishing the pods logs.


hardbyte

This will be so very helpful!

hardbyte · 2019-10-01T11:21:27Z

.azurePipeline/getPodsLogs.py

+            json.dump(info, f, indent=2, sort_keys=True)
+            f.write('\n\n')
+        with open(_directory / 'pod-{}_container-{}.log'.format(info['short_pod_name'], info['name']), 'at') as f:
+            get_logs_container(info, f)


Mixing json and logging output is a bit strange

But the information is relevant.
Rather than

{ "full_pod_name": "esdf764b0f-memstore-server-0", "image": "redis:4.0.11-alpine", "name": "redis", "restart_count": 0, "short_pod_name": "er-0" } [logs]

would

"full_pod_name": "esdf764b0f-memstore-server-0", "image": "redis:4.0.11-alpine", "name": "redis", "restart_count": 0, "short_pod_name": "er-0" ---------------------------------------------------------------- [logs]

be better?
The most interesting parts are the image and the restart count. The other fields would be more relevant if the deployment stayed up.

Sure the info is great. It was more of an observation, no need to change it.
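For illustration, a plain-text header like the second variant above could be produced along these lines. This is only a sketch, not the code from this PR: the helper name `format_container_header`, the `key: value` layout, and the 64-character separator are assumptions.

```python
def format_container_header(info, separator_width=64):
    """Render container metadata as 'key: value' lines followed by a dashed rule.

    Hypothetical helper: `info` is assumed to be a flat dict such as the one
    built in getPodsLogs.py (image, name, restart_count, ...).
    """
    lines = ["{}: {}".format(key, info[key]) for key in sorted(info)]
    lines.append("-" * separator_width)
    # A trailing newline keeps the header apart from the log lines that follow.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = {"image": "redis:4.0.11-alpine", "name": "redis", "restart_count": 0}
    print(format_container_header(example), end="")
```

Writing the header this way keeps the file readable as a plain log while still showing the image and restart count at the top.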

hardbyte · 2019-10-01T11:21:49Z

.azurePipeline/getPodsLogs.py

+        name_container = container.get('name')
+        restart_count = container.get('restartCount')
+        image_name = container.get('image')
+        info = {'full_pod_name': pod_name, 'short_pod_name': pod_name[len('benchmark-es-data61-xyz-'):],


benchmark-es-data61-xyz-?

That explains why the short pod name was quite small... 👍
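The hard-coded prefix is 24 characters long, which is why `esdf764b0f-memstore-server-0` was cut down to `er-0`. A minimal sketch of deriving the short name from the release name instead is shown below. The function name `short_pod_name` is hypothetical, and treating `esdf764b0f` as the release name is an assumption made for the example.

```python
def short_pod_name(full_pod_name, release_name):
    """Drop the '<release_name>-' prefix from a pod name when it is present.

    Hypothetical helper: if the pod name does not start with the release
    name, it is returned unchanged rather than sliced blindly.
    """
    prefix = release_name + "-"
    if full_pod_name.startswith(prefix):
        return full_pod_name[len(prefix):]
    return full_pod_name


if __name__ == "__main__":
    # Assumed release name, taken from the pod name quoted in this review.
    print(short_pod_name("esdf764b0f-memstore-server-0", "esdf764b0f"))
```

Checking the prefix before slicing avoids silently truncating names from a release with a different name length.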


hardbyte · 2019-10-01T11:24:44Z

.azurePipeline/templatePublishLogsFromPods.yml

+  inputs:
+    artifactName: PodLogs
+    targetPath: ${{ parameters.logsFolderPath }}
+  condition: failed()


nit: newline

hardbyte · 2019-10-01T11:26:07Z

.azurePipeline/templatePublishLogsFromPods.yml

+- task: UsePythonVersion@0
+  inputs:
+    versionSpec: '3.7'
+  condition: failed()


I suggest we use Azure's variables feature here so we can decide in the CI/CD system to keep the logs without having to inject a test failure.

https://docs.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch

I do not see the point of using variables for this. In my mind, there are three options:

we should publish the logs for every build (changing the condition from failed() to always())

we should never publish the logs (which I assume is not preferred, otherwise we wouldn't have opened the issue Azure Devops CI isn't saving container logs #418)

the logs are useful in some cases.

I went with the third option as I do not think the logs are useful when a build succeeds (but I don't feel too strongly about it).
However, adding a true/false variable is in my opinion a bad idea, as I do not believe a developer should have to modify the CI script on a per-branch basis.

Compromise: the logs are always provided.

hardbyte · 2019-10-01T11:27:04Z

.azurePipeline/templatePublishLogsFromPods.yml

+    echo "Copy the pods' logs from the release '${{ parameters.releaseName }}' to '${{ parameters.logsFolderPath }}'."
+    python .azurePipeline/getPodsLogs.py ${{ parameters.logsFolderPath }} ${{ parameters.releaseName }}
+  displayName: 'Copy the logs from the service pods.'
+  condition: failed()


Do we have to repeat the condition in this template, or can we lift it up to the calling location in azure-pipelines.yml?

The step template does not accept a condition field, which is why I needed to bring them back here.

hardbyte · 2019-10-01T11:28:29Z

.azurePipeline/getPodsLogs.py

+
+
+def get_logs_container(container_info, file, previous=False):
+    cmd = "kubectl --namespace {} logs {} {}".format(NAMESPACE, container_info['full_pod_name'], container_info['name'])


Note this command gets logs with ANSI color codes... Is this what you want?

Use ansicolors to remove colors from logs.

Okay, of course if we are outputting to a console the colors are really good to keep. All good by me either way
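As a rough illustration of the idea discussed here, the sketch below builds the `kubectl logs` call as an argument list and strips colour codes from the captured output. It is not the code from this PR: the PR uses the ansicolors package, whereas this sketch swaps in a standard-library regular expression so it has no extra dependency, and the helper names are hypothetical.

```python
import re
import subprocess

# Matches ANSI CSI escape sequences such as ESC[31m (red) or ESC[0m (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Return text with ANSI CSI escape sequences removed."""
    return ANSI_CSI.sub("", text)


def kubectl_logs_args(namespace, pod_name, container_name, previous=False):
    """Build the argument list for `kubectl logs` (hypothetical helper)."""
    args = ["kubectl", "--namespace", namespace, "logs", pod_name, container_name]
    if previous:
        # --previous asks for the logs of the previous container instance.
        args.append("--previous")
    return args


def fetch_plain_logs(namespace, pod_name, container_name, previous=False):
    """Run kubectl and return its stdout without colour codes.

    capture_output and text require Python 3.7 or later.
    """
    result = subprocess.run(
        kubectl_logs_args(namespace, pod_name, container_name, previous),
        capture_output=True,
        text=True,
        check=True,
    )
    return strip_ansi(result.stdout)
```

Passing a list to `subprocess.run` avoids shell quoting issues with pod or container names. If the logs are only ever read in a console, the stripping step can be skipped and the colours kept.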


gusmith · 2019-10-01T23:09:23Z

It will close #418


hardbyte · 2019-10-04T06:35:27Z

.azurePipeline/getPodsLogs.py


Sure the info is great. It was more of an observation, no need to change it.

hardbyte · 2019-10-04T06:37:55Z

.azurePipeline/getPodsLogs.py


Okay, of course if we are outputting to a console the colors are really good to keep. All good by me either way


Guillaume Smith added 5 commits October 1, 2019 16:12

Add a python script to dowload all the pods' logs from a release.

89f258f

Azure template to get the logs from pods of a release, and publish them in Azure.

a8d4dff

And now use the template in the main azure pipeline.

af8a294

And update some documentation.

8e537a3

Commit to be reversed: ensure that we have a filaing test.

33230b8

gusmith self-assigned this Oct 1, 2019

Guillaume Smith added 4 commits October 1, 2019 16:35

Cannot add a condition to a template.

d70d1b8

Include the conodition to each step

Typo

2f03c7c

Change the name of the pods logs artifact.

c6773a3

Requires Python 3.7.

df764b0

Otherwise subprocess.run does not have the argument `capture_output`

Guillaume Smith and others added 2 commits October 1, 2019 17:26

Revert "Commit to be reversed: ensure that we have a filaing test."

6bd8dc8

This reverts commit 33230b8.

Merge branch 'develop' into feature-k8s-pods-log-extraction

6afdebc

gusmith requested a review from hardbyte October 1, 2019 07:27

gusmith marked this pull request as ready for review October 1, 2019 07:28

hardbyte approved these changes Oct 1, 2019


Guillaume Smith added 5 commits October 2, 2019 13:22

Use ansicolors to remove colored characters from logs.

f0c9ba0

Use the release name variable instead of hard-coded `benchmark-es-data61-xyz-` to create short name.

bf4c22d

Stop using json formats in the logs file to provide high level description of container.

3f2f753

nit: added some new lines.

4c057e3

Always provide the logs artifact.

52d6fc5

gusmith requested a review from hardbyte October 2, 2019 03:40

Merge branch 'develop' into feature-k8s-pods-log-extraction

037613b

hardbyte approved these changes Oct 4, 2019


Guillaume Smith added 4 commits October 8, 2019 16:35

Try renaming published artifacts to be able to re-run step.

0d7bc46

Test with avriables.

90b4caf

Did not work before. new try.

1327c34

Add some documentation in the azure step.

df85447

Added explanation for the reason of the JobAttempt parameter.

gusmith merged commit 3460e0b into develop Oct 8, 2019

gusmith deleted the feature-k8s-pods-log-extraction branch October 8, 2019 23:06

gusmith mentioned this pull request Oct 9, 2019

Azure Devops CI isn't saving container logs #418

Closed
