devops: "Run full-sync-to-tip-test" job logs are very long and often fail to load #7637

Closed
teor2345 opened this issue Sep 27, 2023 · 5 comments
Labels
A-devops Area: Pipelines, CI/CD and Dockerfiles A-diagnostics Area: Diagnosing issues or monitoring performance C-bug Category: This is a bug S-needs-triage Status: A bug report needs triage

Comments

@teor2345
Contributor

teor2345 commented Sep 27, 2023

Describe the issue or request

When I try to look at the logs for a full sync job, they sometimes don't load in my browser, or they hang while scrolling.

Expected Behavior

Previously, the logs were split across 6-hour jobs, so they weren't as large. They were slow to load, but they would usually load and scroll without problems.

Current Behavior

Sometimes the logs fail to load at all. When they do load, they are very hard to navigate, because only part of the log is loaded, and the page pauses again whenever I scroll.

Possible Solution

We could store the full logs as a compressed GitHub artifact file, and only show the last 10,000 log lines in the Run job. Most failures would be visible in the last 10,000 lines, and if they weren't, we could download the full logs to our local machines.

https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts
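
As a rough illustration of the idea, here is a minimal Python sketch that streams the log once, writes every line to a gzip-compressed archive, and keeps only the last N lines for display. The function name `archive_and_tail` and its parameters are hypothetical and are not part of the existing workflow; uploading the archive as an artifact would be a separate workflow step.

```python
import gzip
from collections import deque
from typing import BinaryIO, Iterable, List


def archive_and_tail(
    lines: Iterable[str],
    archive: BinaryIO,
    tail_size: int = 10_000,
) -> List[str]:
    """Write every line to a gzip archive and return the last `tail_size` lines.

    Hypothetical helper: `archive` is any writable binary file object,
    for example a file opened with open("full-sync.log.gz", "wb").
    """
    # A bounded deque discards the oldest line once it is full,
    # so memory use stays proportional to tail_size, not to the log size.
    tail = deque(maxlen=tail_size)
    with gzip.GzipFile(fileobj=archive, mode="wb") as gz:
        for line in lines:
            gz.write(line.encode("utf-8"))
            tail.append(line)
    return list(tail)
```

The returned lines would be printed in the Run job, while the archive holds the complete log for download.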

It would also help to add a short paragraph to the CI developer doc that tells us how to:

  • download the full logs for a job
  • view them without annoying escape codes (unzip then less -r)
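
For cases where the downloaded log needs to be searched or diffed rather than paged, an alternative to `less -r` is to strip the escape codes entirely. The sketch below assumes the logs only contain standard ANSI CSI sequences (such as color codes); the name `strip_ansi` is hypothetical.

```python
import re

# Matches ANSI CSI sequences: ESC '[', parameter bytes, intermediate bytes,
# and one final byte. This covers color codes such as "\x1b[32m" and "\x1b[0m".
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Return `text` with ANSI CSI escape sequences removed."""
    return ANSI_CSI.sub("", text)
```

Other escape sequence types (for example OSC sequences that set a terminal title) are not handled by this pattern.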

Additional Information/Context

Since billing is based on artifact size, we might want to compress the logs before we upload them. (Even though they are compressed again by GitHub, they are charged based on uncompressed size.)

But the logs aren't that large, and public repositories are free, so we don't need to do this yet.

Is this happening on PRs?

This job does not run on PRs

Is this happening on the main branch?

Yes

@teor2345 teor2345 added C-bug Category: This is a bug A-devops Area: Pipelines, CI/CD and Dockerfiles S-needs-triage Status: A bug report needs triage P-High 🔥 A-diagnostics Area: Diagnosing issues or monitoring performance labels Sep 27, 2023
@gustavovalverde
Member

This might be something to consider as a short-term solution: https://docs.github.com/en/actions/monitoring-and-troubleshooting-workflows/using-workflow-run-logs#viewing-logs-with-github-cli

Long term, I've been investigating how to export all logs and consolidate them in a single tool, which would let us aggregate the logs from all runs and search across them.

@teor2345
Contributor Author

teor2345 commented Oct 3, 2023

Another short-term solution would be to add a "Full Logs" job before the "Run" job, which waits for the test and shows the full logs.

Then in the "Run" job we could limit the output to the last 1000 lines, and get the exit status. (Or the first 1000 and the last 1000.)
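
A minimal Python sketch of the "first 1000 and last 1000" variant, reading the log in a single pass. The function name `head_and_tail` and the wording of the marker line are hypothetical; capturing the exit status is not shown.

```python
from collections import deque
from itertools import islice
from typing import Iterable, List


def head_and_tail(
    lines: Iterable[str],
    head_size: int = 1000,
    tail_size: int = 1000,
) -> List[str]:
    """Keep the first `head_size` and last `tail_size` lines of a log."""
    it = iter(lines)
    # Take the head first; the iterator then continues from where it stopped.
    head = list(islice(it, head_size))
    tail = deque(maxlen=tail_size)
    omitted = 0
    for line in it:
        # Once the tail buffer is full, each append drops one older line.
        if len(tail) == tail_size:
            omitted += 1
        tail.append(line)
    if omitted:
        # Mark the gap so readers know the output is not contiguous.
        head.append(f"... {omitted} lines omitted ...\n")
    return head + list(tail)
```

Logs shorter than `head_size + tail_size` lines are returned unchanged, with no marker line.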

@mpguerra
Contributor

mpguerra commented Oct 9, 2023

@teor2345
Contributor Author

@gustavovalverde just checking if you're working on this one?

It might not be a high priority any more if the full sync is working, because we got the performance bugs fixed.

@gustavovalverde
Member

A few weeks ago, GitHub changed its log viewer, and it no longer crashes or uses as many resources. It loads most of the logs, but we still need to download them if we want to search the full logs.

The actual solution we need to implement is centralized logging.

@gustavovalverde closed this as not planned Apr 22, 2024