Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jenkins: "aws logs tail ${log_group} --follow" stdout is stuck #7234

Closed
orlovmyk opened this issue Aug 30, 2022 · 10 comments
Closed

Jenkins: "aws logs tail ${log_group} --follow" stdout is stuck #7234

orlovmyk opened this issue Aug 30, 2022 · 10 comments
Assignees
Labels
guidance Question that needs advice or information. third-party This issue is related to third-party libraries or applications.

Comments

@orlovmyk
Copy link

orlovmyk commented Aug 30, 2022

Describe the bug

I have state machine, which is monitored in loop in Jenkins. Meanwhile I want to watch logs in Cloudwatch, using:

aws logs tail ${log_group} --follow 

But it's just doesn't have any output. Without follow it works fine, looks like it's buffering issue, because I haven't found sys.stdout.flush() invoked in code normally, it's used only while facing errors.

But I can see output using

aws logs tail ${log_group} --follow --debug

my guess it's beacuse using --debug stdout is redirected to stderr, and last one is not buffered

Also using other tools I also can tail logs without any problems:

What have I tried:

  • Setting PYTHONUNBUFFERED=1 - no effect
  • stdbuf -o 0
  • stdbuf -o L
  • since=0s and many other cli flags

Expected Behavior

I can see logs in my Jenkins job

Current Behavior

I can see logs in my Jenkins job only with --debug flag

Reproduction Steps

Create Jenkins job, add stage with tail --follow command

sh script:"""#!/bin/bash
python3 -m pip install awscliv2
~/.local/bin/awsv2 --install

# monitor log group
~/.local/bin/awsv2 logs tail ${env.log_group} \
            --follow

Possible Solution

Add sys.stdout.flush() to code as an cli option, or respect PYTHONUNBUFFERED env variable

Additional Information/Context

It's really weird, I killed 3 days to debug already 🤯
I haven't tried aws logs tail --follow with other CI/CD tools, but I guess it might have the same problem

CLI version used

aws-cli/2.7.27

Environment details (OS name and version, etc.)

Python/3.9.11 Linux/5.10.102-99.473.amzn2.x86_64 exe/x86_64.amzn.2 prompt/off

@orlovmyk orlovmyk added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 30, 2022
@aBurmeseDev aBurmeseDev self-assigned this Aug 31, 2022
@aBurmeseDev aBurmeseDev added investigating This issue is being investigated and/or work is in progress to resolve the issue. and removed needs-triage This issue or PR still needs to be triaged. labels Aug 31, 2022
@aBurmeseDev
Copy link
Member

Hi @orlovw, thanks for reaching out and sorry to hear that you're having trouble.

It looks like you're installing CLI v2 via pip which we do not officially support, unfortunately. Here's our official documentation on how to install/updates AWS CLI: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html.

Also I was able to successfully use aws logs tail group_name --follow command and get the expected return output. Here's docs on the command: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/logs/tail.html

Please let me know if you have any other questions!

@aBurmeseDev aBurmeseDev added third-party This issue is related to third-party libraries or applications. closing-soon This issue will automatically close in 4 days unless further comments are made. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Sep 1, 2022
@orlovmyk
Copy link
Author

orlovmyk commented Sep 1, 2022

@aBurmeseDev I will try to install it not using pip tomorrow and test again, also have you tried to run it inside Jenkins? It's the most important part here, because Jenkins works really weird with some commands stdout.

@orlovmyk
Copy link
Author

orlovmyk commented Sep 2, 2022

@aBurmeseDev I did following steps on my laptop to 100% about my testing, I'm using MacOS uname -a:
Darwin Nikitas-MacBook-Pro.local 21.6.0 Darwin Kernel Version 21.6.0: Sat Jun 18 17:07:25 PDT 2022; root:xnu-8020.140.41~1/RELEASE_X86_64 x86_64.

  1. removed brew installed awscli: brew uninstall awscli
  2. installing by official guide:
curl "https://awscli.amazonaws.com/AWSCLIV2.pkg" -o "AWSCLIV2.pkg"
sudo installer -pkg AWSCLIV2.pkg -target /

Checking installation

➜  ~ aws --version                                                                                                                                                                                        
aws-cli/2.7.28 Python/3.9.11 Darwin/21.6.0 exe/x86_64 prompt/off
  1. Tests in console

3.1 Running

aws logs tail /aws/lambda/my-log-group --follow --since 0s

3.2 Running my Lambda
And can see the logs, after my Lambda function invocation

➜  ~ aws logs tail /aws/lambda/my-log-group--follow --since 0s
2022-09-02T12:51:31.251000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 START RequestId: 289579e5-5c69-4b0c-a3af-bf4980ac837d Version: $LATEST
2022-09-02T12:51:31.252000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 {"level":"INFO","location":"init:71","message":"Init deployment","timestamp":"2022-09-02 12:51:31,252+0000","service":"canary-manager","status":"init","execution":"mockms-0_0_0_14-1662123076807","xray_trace_id":"1-6311fc52-507a5d6f37a02de2752060f2"}
2022-09-02T12:51:34.140000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 END RequestId: 289579e5-5c69-4b0c-a3af-bf4980ac837d
2022-09-02T12:51:34.140000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 REPORT RequestId: 289579e5-5c69-4b0c-a3af-bf4980ac837d	Duration: 2889.06 ms	Billed Duration: 2890 ms	Memory Size: 128 MB	Max Memory Used: 78 MB	Init Duration: 705.15 ms
2022-09-02T12:51:34.476000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 START RequestId: 66ac0643-145a-4fb2-b75c-a309fa50577f Version: $LATEST
2022-09-02T12:51:35.394000+00:00 2022/09/02/[$LATEST]419072086af14c69bf935a036f99ffc6 {"level":"INFO","location":"switch_weight:65","message":"Additional Lambda function version '26' switched to 10%","timestamp":"2022-09-02 12:51:35,393+0000","service":"canary-manager","status":"switch_weights","execution":"mockms-0_0_0_14-1662123076807","component":"mockms","region":"us-west-2","stage":{"duration":10,"weight":10},"xray_trace_id":"1-6311fc56-2c50c2f267911b48562e7b45"}
  1. Jenkins part

4.1 I have installed Jenkins locally:

brew install jenkins-lts
brew services start jenkins-lts

Got to 127.0.0.1:8080 and configured it as default as possible
4.2 Created new pipeline job:

pipeline {
  agent any
  environment {
    log_group = "/aws/lambda/my-log-group"
  }
  stages {
      stage("Monitor log group"){
          steps {
            script {
                sh script:"""#!/bin/sh
aws logs tail ${env.log_group} --region us-west-2 \
    --since 0s
    --follow
                """
            }
          }
        }
    }
}

4.3 Started it and after it started lambda, waatching for new logs, but

Started by user [admin](http://127.0.0.1:8080/user/admin)
Replayed [#10](http://127.0.0.1:8080/job/test-awscli2/10/)
[Pipeline] Start of Pipeline
[Pipeline] node
Running on [Jenkins](http://127.0.0.1:8080/computer/(built-in)/) in /Users/orlow/.jenkins/workspace/test-awscli2
[Pipeline] {
[Pipeline] withEnv
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Monitor log group)
[Pipeline] script
[Pipeline] {
[Pipeline] sh
Sending interrupt signal to process
Aborted by [admin](http://127.0.0.1:8080/user/admin)
/Users/orlow/.jenkins/workspace/test-awscli2@tmp/durable-900fca58/script.sh: line 2:  2959 Terminated: 15          /usr/local/bin/aws logs tail /aws/lambda/my-log-group --region us-west-2 --follow --since 0s
script returned exit code 143
[Pipeline] }
[Pipeline] // script
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: ABORTED

Logs are not shown
4.4 Using the same job, but including --debug will show logs, and also show that awscli knows about new lines
4.5 Using the same job, but tweaking --since, will only show logs from period from --since to command start

@aBurmeseDev
Copy link
Member

Thank you for sharing this detailed step-by-step workflow @orlovw. Based on what you're describing this sounds like a problem that may be unique to Jenkins, as it returned the logs until Jenkins get installed. Have you tried reaching out to Jenkins support regarding the stdout behavior? We don't have any documentation on this topic and I couldn't find any related issues. You might also consider contacting AWS Support for further guidance on this.

@orlovmyk
Copy link
Author

orlovmyk commented Sep 2, 2022

@aBurmeseDev I will reach out Jenkins support, but can you confirm from your side that you have the same behaviour, I don't want to be alone with my problem and want to know that you were able to reproduce it/understand it as me

Thanks :')

@aBurmeseDev
Copy link
Member

As far as reproducing goes with AWS CLI commands --follow and --since, I can confirm that I have the same expected behavior. Here's more on --since command for your reference:

--since (string) From what time to begin displaying logs. By default, logs will be displayed starting from ten minutes in the past. The value provided can be an ISO 8601 timestamp or a relative time. For relative times, provide a number and a single unit. Supported units include:
For example, a value of 5m would indicate to display logs starting five minutes in the past. Note that multiple units are not supported (i.e. 5h30m)

This issue will auto-close in couple days but if you have further issue with AWS CLI, please feel free to open a new issue. Thank you!

@aBurmeseDev aBurmeseDev added closing-soon This issue will automatically close in 4 days unless further comments are made. guidance Question that needs advice or information. and removed closing-soon This issue will automatically close in 4 days unless further comments are made. bug This issue is a bug. labels Sep 2, 2022
@orlovmyk
Copy link
Author

orlovmyk commented Sep 3, 2022

@aBurmeseDev Update on issue, I've tried to play around with it a little bit more. The interesting thing is if your logs hit limit of default Python buffer size of 8192 bytes then output is shown as needed, but in worng order (not too big problem). I think it's not Jenkins issue, buffer size should be configurable (at least for tail command) and it will resolve problem.

if you want I can close that issue and propose new improvment to make stdout bufer configurable. 🤔

P.S. Found similar issues for other commands as well:
#68
#3977

P.S.S working on POC to prove that issue in buffer

@github-actions github-actions bot removed the closing-soon This issue will automatically close in 4 days unless further comments are made. label Sep 3, 2022
@orlovmyk
Copy link
Author

orlovmyk commented Sep 3, 2022

Finally!! found workaround, it took ~1w to found, I'm quite tired right now tbh 💀

TL;DR;
MacOS

script -q aws logs tail ${env.log_group} --region us-west-2 \
    --format json \
    --color off \
    --follow \
    --cli-binary-format base64 \
    --since 0s

Linux

script -q /dev/null logs tail -c "${env.log_group} --region us-west-2 \
    --format json \
    --color off \
    --follow \
    --since 0s"

More detailed explanation:

We know that issue appears only on Jenkins, but why?

As you might know Jenkins runs sh directive as simple scipt invoke, but its stdout/stderr is redirected to file, example:

echo 123 &2>1 > jenkins.log

To prove it let's take a look at processes we have during Jenkins sh run:

ps aux | grep jenkins-log.txt

sh -c ({ while [ -d '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3' -a \! -f '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-result.txt' ]; 
do touch '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-log.txt'; 
sleep 3; 
done } 
& jsc=durable-c022c672bb45fe6720a516124f98640ac44193c69e990403288414c2d10052ea; 
JENKINS_SERVER_COOKIE=$jsc  
'/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/script.sh' > '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-log.txt' 2>&1; 
echo $? > '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-result.txt.tmp'; 
mv '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-result.txt.tmp' '/Users/orlow/.jenkins/workspace/test-awscli2@2@tmp/durable-f5d785b3/jenkins-result.txt'; wait) >&- 2>&- &

Looking into code:

def is_a_tty():

def is_a_tty():
    try:
        return os.isatty(sys.stdout.fileno())
    except Exception as e:
        return False

It's used only to ensure should we colorise output or not, but that's not the only thing TTY changes. Also if we are running inside TTY means that should not cache neither stdout neither stdin. Why? Because caching stdin or stdout may result in infinite waiting for input parameters for stdin and delayed output for stdout.

We understood that awscli thinks we are running in non-TTY (non-human readable console), so caching is OK thing to do. It increases performance a lot. But it's not we want to achive in Jenkins - we need to trick awscli, so it will think we are running into TTY. The first issue on StackOverflow: https://stackoverflow.com/questions/1401002/how-to-trick-an-application-into-thinking-its-stdout-is-a-terminal-not-a-pipe brings us to response, now stdout in real-time + as a bonus it's now colorized! cool! 😎 👨‍🎨
image

P.S I don't think it's the best workaround in the world, even more I think AWS cli must have option like --tty, or environment-variable to tweak buffer behaviour.

@orlovmyk orlovmyk closed this as completed Sep 3, 2022
@github-actions
Copy link

github-actions bot commented Sep 3, 2022

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.

@sidpremkumar
Copy link

sidpremkumar commented Mar 24, 2023

Hi I believe this issue needs to be re-opened, we have the same problem when trying to pipe the output of aws logs tail to a formatter. The buffer is causing us massive headaches.

PYTHONUNBUFFERED does not seem to be respected as well :/

@aBurmeseDev any advice here?

Specifically we try:
PYTHONUNBUFFERED=1 aws logs tail {group} --follow --format short --log-stream-name-prefix {prefix} --color off --since 0s >>file.txt 2>&1

And file.txt does not get updated in realtime 😭

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
guidance Question that needs advice or information. third-party This issue is related to third-party libraries or applications.
Projects
None yet
Development

No branches or pull requests

3 participants