HDDS-4152. Archive container logs for kubernetes check #1355

Merged
merged 2 commits into from Aug 27, 2020

@adoroszlai adoroszlai commented Aug 26, 2020

What changes were proposed in this pull request?

The kubernetes check has been failing very frequently in the last few days. This change lets it save logs from all containers to make problems easier to find. (The docker-compose-based acceptance check also saves container logs.)

https://issues.apache.org/jira/browse/HDDS-4152

How was this patch tested?

$ unzip -t kubernetes.zip
Archive:  kubernetes.zip
    testing: getting-started.xml      OK
    testing: getting-started/         OK
    testing: log.html                 OK
    testing: minikube.xml             OK
    testing: minikube/                OK
    testing: ozone-dev.xml            OK
    testing: ozone-dev/               OK
    testing: ozone.xml                OK
    testing: ozone/                   OK
    testing: report.html              OK
    testing: summary.html             OK
    testing: getting-started/pod-datanode-0.log   OK
    testing: getting-started/pod-datanode-1.log   OK
    testing: getting-started/pod-datanode-2.log   OK
    testing: getting-started/pod-om-0.log   OK
    testing: getting-started/pod-s3g-0.log   OK
    testing: getting-started/pod-scm-0.log   OK
    testing: minikube/pod-datanode-0.log   OK
    testing: minikube/pod-datanode-1.log   OK
    testing: minikube/pod-datanode-2.log   OK
    testing: minikube/pod-om-0.log    OK
    testing: minikube/pod-s3g-0.log   OK
    testing: minikube/pod-scm-0.log   OK
    testing: ozone-dev/pod-datanode-0.log   OK
    testing: ozone-dev/pod-datanode-1.log   OK
    testing: ozone-dev/pod-datanode-2.log   OK
    testing: ozone-dev/pod-jaeger-0.log   OK
    testing: ozone-dev/pod-om-0.log   OK
    testing: ozone-dev/pod-prometheus-584b7948d9-8fcb6.log   OK
    testing: ozone-dev/pod-s3g-0.log   OK
    testing: ozone-dev/pod-scm-0.log   OK
    testing: ozone/pod-datanode-0.log   OK
    testing: ozone/pod-datanode-1.log   OK
    testing: ozone/pod-datanode-2.log   OK
    testing: ozone/pod-om-0.log       OK
    testing: ozone/pod-s3g-0.log      OK
    testing: ozone/pod-scm-0.log      OK

https://github.com/adoroszlai/hadoop-ozone/runs/1031423764

@adoroszlai adoroszlai self-assigned this Aug 26, 2020
@adoroszlai adoroszlai requested a review from elek August 26, 2020 18:29
@@ -77,6 +77,13 @@ start_k8s_env() {
  wait_for_startup
}

get_logs() {
  mkdir -p logs
  for pod in $(kubectl get pods -o custom-columns=NAME:.metadata.name | tail -n +2); do

It's a shame that we have no better solution. We could use kubectl logs -l app=ozone, but it would break once we add setups with more apps (like ozone + spark).

FTR: locally I use https://github.com/wercker/stern and it works well (but this loop is easier.)
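
For reference, the loop under review could be completed roughly as follows. This is a minimal sketch, not the merged code: only the first three lines appear in the diff excerpt, while the loop body and the logs/pod-<name>.log naming are assumptions inferred from the archive listing in the PR description.

```shell
# Sketch: save the logs of every pod in the current namespace.
get_logs() {
  mkdir -p logs
  # custom-columns prints a NAME header row first; tail -n +2 skips it
  for pod in $(kubectl get pods -o custom-columns=NAME:.metadata.name | tail -n +2); do
    # Assumed body: one file per pod, named like the entries in kubernetes.zip.
    # "|| true" keeps going if a single pod's logs cannot be fetched.
    kubectl logs "${pod}" > "logs/pod-${pod}.log" 2>&1 || true
  done
}
```

Calling get_logs after a test run would leave one file per pod under logs/ in the current directory. Because it iterates over whatever kubectl get pods returns, it does not depend on all pods sharing a label such as app=ozone.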


@elek elek left a comment

+2 (big +1 for the idea and +1 for the patch)

Thanks for this patch. I had the same idea but was too lazy to do it, and today I realized that you had already implemented and tested it. 👍

@elek elek merged commit 5fab834 into apache:master Aug 27, 2020
@adoroszlai adoroszlai deleted the HDDS-4152 branch August 27, 2020 09:09
@adoroszlai
Contributor Author

Thanks @elek for reviewing and committing it.

rakeshadr pushed a commit to rakeshadr/hadoop-ozone that referenced this pull request Sep 3, 2020
errose28 added a commit to errose28/ozone that referenced this pull request Sep 11, 2020
* master: (26 commits)
  HDDS-4167. Acceptance test logs missing if fails during cluster startup (apache#1366)
  HDDS-4121. Implement OmMetadataMangerImpl#getExpiredOpenKeys. (apache#1351)
  HDDS-3867. Extend the chunkinfo tool to display information from all nodes in the pipeline. (apache#1154)
  HDDS-4077. Incomplete OzoneFileSystem statistics (apache#1329)
  HDDS-3903. OzoneRpcClient support batch rename keys. (apache#1150)
  HDDS-4151. Skip the inputstream while offset larger than zero in s3g (apache#1354)
  HDDS-4147. Add OFS to FileSystem META-INF (apache#1352)
  HDDS-4137. Turn on the verbose mode of safe mode check on testlib (apache#1343)
  HDDS-4146. Show the ScmId and ClusterId in the scm web ui. (apache#1350)
  HDDS-4145. Bump version to 1.1.0-SNAPSHOT on master (apache#1349)
  HDDS-4109. Tests in TestOzoneFileSystem should use the existing MiniOzoneCluster (apache#1316)
  HDDS-4149. Implement OzoneFileStatus#toString (apache#1356)
  HDDS-4153. Increase default timeout in kubernetes tests (apache#1357)
  HDDS-2411. add a datanode chunk validator fo datanode chunk generator (apache#1312)
  HDDS-4140. Auto-close /pending pull requests after 21 days of inactivity (apache#1344)
  HDDS-4152. Archive container logs for kubernetes check (apache#1355)
  HDDS-4056. Convert OzoneAdmin to pluggable model (apache#1285)
  HDDS-3972. Add option to limit number of items displaying through ldb tool. (apache#1206)
  HDDS-4068. Client should not retry same OM on network connection failure (apache#1324)
  HDDS-4062. Non rack aware pipelines should not be created if multiple racks are alive. (apache#1291)
  ...