
Log lines for an active step are inaccessible #886

Closed
bogosj opened this issue Dec 30, 2020 · 74 comments
Labels
Actions Feature (Feature requires both runner, pipelines service and launch changes), bug (Something isn't working), enhancement (New feature or request)

Comments

@bogosj

bogosj commented Dec 30, 2020

Describe the bug
When opening the "run" page for an active job you can see:

  • Gray checkmarks and expandable sections for completed steps and all of their log lines.
  • Yellow loading animation on any active steps.
  • Circles for the future steps.

The log lines for the active steps will then start streaming in, but the prior lines for that step are not there, and there appears to be no way to see them until the step completes.

To Reproduce
Steps to reproduce the behavior:

  1. Run an action
  2. Visit the run page after a few seconds and try to find the log lines for the currently running step that occurred before you loaded the page.

Expected behavior
All of the log lines that were emitted before I loaded the page should be visible, with new ones streaming in afterwards.

@bogosj added the bug (Something isn't working) label on Dec 30, 2020
@bogosj
Author

bogosj commented Dec 30, 2020

It looks like this is a known issue (KI), but I can't find a canonical bug tracking it:

mxschmitt/action-tmate#1 (comment)

@konradpabjan
Contributor

👋 Actions engineer here

So I wouldn't classify this as a bug; it's just a missing feature.

When a step is running, you can open "inspect element" and see that there is an active websocket streaming in new output from the runner. When a step is completed, though, a GET request is made to our service that downloads all the logs for the step in one go. If a user refreshes the page, or first loads it while a step is in progress, the logs that were already streamed to the websocket are effectively gone until the step completes and we can fetch everything using an internal GET API.

The solution for us is to somehow cache the logs that have already been streamed; if a user loads a step that is in progress, we fetch the cached logs, display them, and append the new logs coming in through the websocket.
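A minimal browser-side sketch of that backfill-then-stream flow. The endpoint path, websocket URL, and payload shape below are illustrative assumptions, not GitHub's actual internal API:

```ts
// Hypothetical sketch of the flow described above: backfill persisted log
// lines first, then tail new ones from the websocket.
async function showStepLogs(
  stepId: string,
  render: (line: string) => void,
): Promise<void> {
  // 1. Backfill: fetch whatever log lines the service has already persisted
  //    for this step (for a completed step, this is the whole log).
  const res = await fetch(`/internal/steps/${stepId}/logs`);
  if (res.ok) {
    const backfill: string[] = await res.json();
    backfill.forEach(render);
  }

  // 2. Live tail: append new lines streamed from the runner. Without the
  //    backfill above, a page loaded mid-step misses everything emitted
  //    before the socket opened, which is exactly this issue.
  const ws = new WebSocket(`wss://live.example.invalid/steps/${stepId}`);
  ws.onmessage = (event: MessageEvent) => render(String(event.data));
}
```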

We'll eventually add this feature (this would be really really nice to have), but at the moment we just haven't been able to pick this up.

@vadi2

vadi2 commented Jan 19, 2021

This would be great to have because some steps can take a while, or can in fact get stuck for a good half hour due to an error, and you're sitting there wondering whether you should cancel it or not. It comes up often as a problem during workflow development!

@TingluoHuang
Member

I believe this is on our backlog.

@graeme-verticalscope

yeah this would be nice to have 😃

@TingluoHuang added the Actions Feature (Feature requires both runner, pipelines service and launch changes) and enhancement (New feature or request) labels on Apr 23, 2021
@ethomson
Contributor

Hi! This seems like a possibly useful addition. However, suggestions for GitHub Actions functionality belong in the GitHub feedback forum, where the product managers and engineers will be able to review them. This repository is really only focused on the runner application, which is a very small component of GitHub Actions.

We do have this opened on our backlog, so I'm going to close this issue here. Thanks again!

@charlesritchea

This is basic CI functionality, so it's a bug. I don't know how you could classify it as a nice-to-have feature. Sounds like you aren't in the trenches having to diagnose builds much.

@FranDepascuali

FranDepascuali commented Aug 22, 2022

I'm suffering from this right now :p.
This is the only thing I'm seeing; it has been a long time and I have no way of diagnosing what's happening:
[screenshot]

Update: 1hr 42min, make your bets!

@schipperkp

I agree with the thought process that this is not a nice-to-have but a must-have. If you're debugging a prod deploy that has gone south, you may not be able to cancel the build; rather, you must let it complete the rollback steps. This means you lose valuable time and information by needing to wait...

@schipperkp

@ethomson can we get a link to the backlog item in this thread?

@umm0n

umm0n commented Aug 31, 2022

As others have commented, I don't see this as a nice-to-have but rather an essential feature. A link to the backlog item would be nice!

@etaormm

etaormm commented Sep 19, 2022

@TingluoHuang @konradpabjan any update for this issue/bug?

@julienbonastre

Not a fan of bumping for the sake of it, but yes, +1. This is not a "nice to have" but a fundamental feature that should have been part of the initial design requirements for a CI solution; if not realtime, then at least near-realtime access to the logging of job outputs is crucial.

@bitdeft

bitdeft commented Oct 6, 2022

I'm also going to heavily +1 this. It was mentioned that a discussion would need to be opened to express the desire for this feature/bugfix to be resolved sooner rather than later if it's on the backlog.

I didn't see any when I searched, so I've created this discussion here:
community/community#35363

@dongho-jung

dongho-jung commented Dec 7, 2022

Why did this get closed?? I think this is not a "nice to have" feature, but rather a "missing" feature. It's been almost two years and I still can't see the logs in real time...

@dongho-jung

This isn't released yet; they seem to be making progress on this issue behind the scenes.

[screenshot]

@paulogodinhoaq

@0xF4D3C0D3 do you believe these are related to this issue? They seem to be about getting the actual Worker logs into our hands, rather than the step it's executing. I may be misreading.

@charlesritchea

@0xF4D3C0D3 do you believe these are related to this issue? They seem to be about getting the actual Worker logs into our hands, rather than the step it's executing. I may be misreading.

I've been following this for a while, and I've always understood it to be about the step it's executing. Currently the logs often don't show up until after the step finishes. That can be a long time, and you have no idea whether your step is hung or not.

@paulogodinhoaq

paulogodinhoaq commented Dec 14, 2022

I've been following this for a while, and I've always understood it to be about the step it's executing. Currently the logs often don't show up until after the step finishes. That can be a long time, and you have no idea whether your step is hung or not.

Same, this is currently defeating my transition from Jenkins to Actions.

By my comment I meant that these commits recently made by @AvaStancu may not be related to this issue, but rather aim to help people who want to watch/debug the runner's own logs.

Unfortunately, the Issues related to those commits are in a private repo and we can't get context.

@etaormm

etaormm commented Jan 11, 2023

I don't think this is being worked on right now...
I opened the same issue a couple of months ago, but no one is assigned to it yet.

#2131

@mcarans

mcarans commented Jan 12, 2023

This is definitely what I would call a bug, and it's rather annoying not to be able to get an idea of where a step has reached, particularly when debugging very long-running steps.

@bryanmacfarlane
Member

bryanmacfarlane commented Jan 27, 2023

Hey, we want to follow up here from GitHub Actions engineering.

First and foremost, we just wanted to let everyone know that we hear you. We're very aware that this is a big ask and one of the big pain points with Actions.

Over recent months, big architectural changes have been happening as part of larger efforts to set us up for many larger goals, this being one of them. The services and the key underpinnings behind those changes are out in production and being phased in. We hope to realize this as part of those changes soon.

Even though the issue is closed (since it's really not primarily a runner issue), we wanted to follow up since this thread has high activity. Thanks for the (very clear) feedback here. :)

@axel-h

axel-h commented Apr 16, 2023

About three months later and I'm still waiting for a practical solution. It's really a pain when you have PR-triggered test jobs that run for over an hour, producing logs from time to time, but there is no way to have a look at what has been printed so far, unless you capture all logs from the beginning.

@julienbonastre

About three months later and I'm still waiting for a practical solution. It's really a pain when you have PR-triggered test jobs that run for over an hour, producing logs from time to time, but there is no way to have a look at what has been printed so far, unless you capture all logs from the beginning.

Anyone have an idea of what the ACTIVE and OPEN issue is for tracking the announced changes that would remediate this? So far we are all watching and waiting on closed issues (this one and #891), which doesn't bode well for trying to stay optimistic about issues that have no priority or have been deemed completed 😞

@rehanvdm

rehanvdm commented Aug 8, 2023

Thanks @bryanmacfarlane and @yacaovsnc for the transparency, that is good news 🙂

@thomasphorton

Hi! Are we getting close to onboarding people to this? Liberty would love to get in on this ASAP

@dnuzz

dnuzz commented Sep 12, 2023

Same, this is making some of our deployment processes effectively a black box

@mjperrone

The current ETA for beta testing this feature is end of August / early September timeframe.

@yacaovsnc, now that it is early September, are you able to provide an updated ETA?

@bryanmacfarlane
Member

If you want to try it out, then email me; my email is my full name (alias) at github dot com. I can hook you up. We finished scale testing and had a bug bash. We're fixing a few bugs 🤞 so hopefully a few days. I'm the messenger here, relaying back to the teams, so 🙏 ❤️

@perryjrandall

^ Amazing news! Us folks at aptos-labs would love to try it out if you can include us as well (sent an email too)

@Mause

Mause commented Sep 22, 2023

Any chance of getting this enabled on the duckdb and duckdblabs orgs? We have several long-running jobs that would really benefit from this.

@andrewakim

👋 Hi all - Actions PM here. I wanted to thank everyone for their interest in testing out the feature. We have a good number of customers in the test group already, so we won't be taking on any more for now. I'll reach back out if we open it again.

@will-lumley

will-lumley commented Sep 26, 2023

Very excited to see this feature.

@raphaccgil

@andrewakim do we have any updates here?
We are using GitHub Actions in our CI for DBT, and in this case it would be important to have this information.
In my scenario, we are using DBT with a Slim CI approach, which means we are using just a small amount of data. However, the models still take a long time, and we do not know if the CI is progressing or if any model has failed.

@andrewakim

👋 Hi all - Appreciate the patience. The team is still working on the feature and addressing a few issues we came across. I'll set a reminder for myself to post an update in 2-3 weeks and I'll hopefully have a bit more detail around timelines by then.

@andrewakim

As promised, just wanted to share a quick update. We had to shift gears to focus on a few other priorities that came up in the last few weeks but we're planning on returning to this work later in the quarter. I'll keep sending out updates as things develop.

@humzam

humzam commented Dec 5, 2023

👋 Hi, @andrewakim @bryanmacfarlane

What is the timeline looking like for this feature to be released?

@andrewakim

👋 Hi everyone - Good news! We're planning on rolling this feature out slowly starting in a few weeks. We'll begin to ramp up the rollout in the new year. Thanks for your patience!

@ayushxx7

ayushxx7 commented Dec 7, 2023

following...

@Dmitry1987

Dmitry1987 commented Dec 12, 2023

Asking for this feature is like asking Microsoft to give away a % of their current revenue, literally... the runner billing is per minute spent, duh 😅

Look: there's a GET request "at the end, when the stage has been completed", which means they record the logs into some storage anyway, in the process, while the amazing websocket is "only streaming to the browser". So letting users access that "full log from current storage - whatever was cached so far" is a piece of cake (we're all developers here, and we know it 💯).

The reason why it would take 2.5 years to "fix" or implement the feature could be bandwidth cost (more GET requests in total while a stage is not yet completed), which is minuscule, since we can stream the naughty websocket all day and watch logs anyway, even if we missed the critical first hundred lines - so bandwidth is not an issue. Or it could be the cost we all pay for the GitHub runners, billed per minute, since we will obviously spend more minutes waiting for stages to complete 🧐. Kinda ridiculous, but since there are so many other pros to GitHub, most of us keep playing along like we don't understand what has really been going on with this bug all these years 😆. Enjoy that runner billing revenue, Microsoft! 🚀💰 Well, we can't blame the engineering team; this is business 🤷‍♂️.

EDIT: * 3 years to fix 🙊 🙉

@brianjmurrell

@Dmitry1987 Did you bother to read even just 2 comments back before you posted your diatribe?

TL;DR: The fix for this is being rolled out.

@Dmitry1987

@Dmitry1987 Did you bother to read even just 2 comments back before you posted your diatribe?

TL;DR: The fix for this is being rolled out.

Do you not understand that I don't care whether the fix is out or not? I'm talking about the past 3 years that it has obviously been dragged out. I honestly don't care, since I'll stream the stdout of our scripts to another storage, from which we'll simply display it in any quick-and-dirty node.js/vue.js dashboard (well, here's your feature right there 🤣 I don't need to wait 3 years for it, and it proves that it is an easy fix and that the real reason was a business decision to milk more runner minutes 💯).
We're talking about Microsoft business practices, not about tech - there's nothing techy to discuss here; it's a minuscule feature even at your scale. As I said, if the final GET request does display the full log, it means it was written to storage, so ignoring the community's requests to expose that endpoint and let us fetch the full log is ridiculous.

As I said, we're not grandmas with flowers; we're all developers and we understand how things work. So far the handling of this "bug" has been subpar, and we draw our own conclusions. For example, Azure already "enjoys" the effect of community sentiment (engineers telling management to avoid using Azure); your business decisions should take this into account, since it affects the long-term customer base. I don't buy that it was a feature too difficult or expensive to roll out over 3 years 🤷‍♂️
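A quick-and-dirty sketch of that tee-the-logs workaround in Node/TypeScript. The sink URL is hypothetical; any external store you can append to (and read back from a dashboard) would do:

```ts
// Workaround sketch: tee the build script's stdout to an external sink so
// the full log survives loading the Actions page mid-step.
import { spawn } from "node:child_process";

const SINK_URL = "https://logs.example.invalid/ingest"; // hypothetical endpoint

const child = spawn("bash", ["./build.sh"], {
  stdio: ["ignore", "pipe", "inherit"], // capture stdout, pass stderr through
});

child.stdout?.on("data", (chunk: Buffer) => {
  process.stdout.write(chunk); // still shows up in the Actions step log
  // Best-effort copy to external storage for the dashboard to read.
  fetch(SINK_URL, { method: "POST", body: chunk }).catch(() => {});
});

child.on("exit", (code) => process.exit(code ?? 1));
```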

@ChristopherHX
Contributor

Part of the GitHub Roadmap (github/roadmap#839) as of a week ago:

Upon visiting the logs of an actively running Actions job, customers will now see the previous 1,000 lines of logs and any new lines emitted thereafter.

I'm wondering how this rollout will hit my custom actions runner (I've been tracking this for about a year), given the implied protocol swap from the Azure Pipelines-like dotnet backend to a new golang-based backend.
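That "previous 1,000 lines" behavior amounts to a bounded tail buffer on the service side. A toy sketch of the idea, purely illustrative and not GitHub's implementation:

```ts
// Bounded "last N lines" buffer matching the quoted behavior: keep at most
// `capacity` recent lines and replay them to a freshly loaded page before
// switching over to the live stream.
class TailBuffer {
  private lines: string[] = [];

  constructor(private readonly capacity = 1000) {}

  push(line: string): void {
    this.lines.push(line);
    if (this.lines.length > this.capacity) {
      this.lines.shift(); // drop the oldest line once over capacity
    }
  }

  // What a page loaded mid-step would be backfilled with.
  snapshot(): string[] {
    return [...this.lines];
  }
}
```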

@brianjmurrell

There has been an update related to this public beta release: https://github.com/github/releases/issues/3686#issuecomment-1866738791

Bzzzzt. That's a 404.

@spaltrowitz

My apologies @brianjmurrell, this post was in error. Please keep watching the roadmap for updates: github/roadmap#839!

@ChristopherHX
Contributor

ChristopherHX commented Jan 23, 2024

Interestingly, I noticed today that I have a beta label on the workflow UI with a link to https://github.com/orgs/community/discussions/89879.

A sample run with the beta label showing for me is https://github.com/ChristopherHX/github-act-runner/actions/runs/7618673328/job/20750292224
[screenshot]

And I could see old log lines in the UI of running steps on GitHub-hosted runners even after pressing F5, which is good.

UPDATE - 23 Jan 2024 23:28 CEST
Feature has been turned off for me

.... - There was a separate incident that was causing this bug. We turned off the feature for now but will turn it back on once we can confirm the service is running normally.

https://github.com/orgs/community/discussions/89879#discussioncomment-8224894
UPDATE - 24 Jan 2024 18:56 CEST
Seems to be reenabled for me

@lure8225

Hi, this still seems to be an issue, correct?
If I have a runner and watch the steps live, I will get the streaming connection and see the step output. If I open the workflow run after the step has started and before the step has finished, I will see nothing, as the output is not yet fetched via GET, but there is also no new output streamed.

@charlesritchea

charlesritchea commented Apr 25, 2024 via email

@lure8225

Thanks for the instant reply. I'm using a self-hosted runner via ARC. Will this issue also be resolved for these runners when this gets out of beta?

@charlesritchea

charlesritchea commented Apr 25, 2024 via email

@wrslatz

wrslatz commented Apr 30, 2024

Looks like this has been released! 🎉
Blog, github/roadmap#839
