
failing CI #10008

Closed
andrewrk opened this issue Oct 23, 2021 · 9 comments


andrewrk commented Oct 23, 2021

The CI has been especially unreliable lately. Apologies to everyone who has been opening pull requests and seen false negatives.

I'm in the process of acquiring our own hardware to run the CI on, to increase reliability and make it easier to troubleshoot when a flaky test failure happens.

It's going to be annoying for another week or so and then hopefully we will have something sorted out.

Thanks for understanding, I know it's been really annoying lately for everyone.

@andrewrk andrewrk added this to the 0.9.0 milestone Oct 23, 2021
@andrewrk andrewrk pinned this issue Oct 23, 2021
@andrewrk (Member Author)

Update on the situation: ZSF now has a beefy Hetzner machine that we are paying 100 euros/month for, and @mikdusan has volunteered to work on migrating our CI infrastructure to it piecemeal. Thanks for being patient with us. Taking more control over our CI pipeline is long, long overdue, and this should bring long-term stability to the project.

This also opens up the opportunity to run performance-tracking benchmarks on every CI run.


andrewrk commented Nov 2, 2021

Update: CI is getting closer to ready. Meanwhile, I rebooted the gotta-go-fast repository and added some zig ast-check benchmarks. Have a look at the README for a description of how it works now. The next steps will be:

  • Backfilling some performance measurements
  • Adding HTML/JavaScript to ziglang.org/perf that downloads the data and displays interactive graphs
  • Integrating with the new CI system so that performance measurements happen with every commit, and the data is synced to ziglang.org/perf with every commit
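In rough outline, a per-commit measurement step like the one sketched above could look like this hypothetical Python snippet. The command, record fields, and usage are illustrative assumptions, not the actual gotta-go-fast implementation:

```python
import json
import subprocess
import sys
import time

def measure(commit: str, cmd: list[str]) -> dict:
    """Run one benchmark command and record its wall-clock time.

    The record shape (commit, command, wall_time_s) is a made-up
    illustration, not the real gotta-go-fast data format.
    """
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    return {"commit": commit, "command": " ".join(cmd), "wall_time_s": elapsed}

# Illustrative usage with a trivial command; a real run would time
# something like `zig ast-check <file>` on the commit being tested.
record = measure("deadbeef", [sys.executable, "-c", "pass"])
print(json.dumps(record))
```

A real pipeline would append such records to a data file keyed by commit hash and sync it to the perf page on every CI run.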


andrewrk commented Nov 2, 2021

We have our first self-hosted CI run.


andrewrk commented Nov 9, 2021

x86_64 Linux has now been moved over to ci.ziglang.org. This should already improve stability of the CI and we are still planning on moving more over.

Meanwhile, https://ziglang.org/perf is live. Adjustments are still being made to integrate the page with the rest of the website as well as to the graphs.

@RossSmyth (Contributor)

Having a Zig Bors/Homu may be a good idea, so that all PRs are tested as the head of master before merging.
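For readers unfamiliar with the idea: Bors-style merge queues test each PR on top of the current master tip, and master only advances when that exact combination passes CI. A toy sketch of the semantics, where the commit labels and the `passes` callback are invented stand-ins for a real repository and a full CI run:

```python
from typing import Callable

def merge_queue(master: list[str], prs: list[str],
                passes: Callable[[list[str]], bool]) -> list[str]:
    """Advance master one PR at a time, testing each candidate tree.

    `master` is a list of commit labels; `passes` stands in for a
    full CI run against the merged tree. Both are illustrative.
    """
    for pr in prs:
        candidate = master + [pr]  # PR applied to the *current* tip
        if passes(candidate):
            master = candidate     # only green combinations land
        # a red candidate is rejected; master is untouched
    return master

# Example: the second PR fails CI and is skipped.
result = merge_queue(["m1"], ["a", "b", "c"],
                     passes=lambda tree: "b" not in tree)
print(result)  # ['m1', 'a', 'c']
```

The key property is that a PR is never merged based on a stale test run: each candidate tree includes everything that landed before it.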


matu3ba commented Nov 19, 2021

@RossSmyth The Azure build jobs on Windows are still too unstable for that unless Windows is excluded. For example, #10172, #10168, #10161, and #10151 all failed due to Windows timing out after 6 hours.

@andrewrk (Member Author)

This particular issue is resolved. We have moved some things over to our self-hosted CI, reduced memory usage, and disabled flaky tests; CI is now in an OK state.

@andrewrk andrewrk unpinned this issue Apr 16, 2022
@andrewrk (Member Author)

Reopening: I'm sometimes seeing the aarch64 Drone CI failing due to timeouts.

@andrewrk andrewrk reopened this Apr 19, 2022
@andrewrk andrewrk pinned this issue Apr 19, 2022
@andrewrk andrewrk unpinned this issue Jun 3, 2022

andrewrk commented Jun 3, 2022

Solved in b095aa6

@andrewrk andrewrk closed this as completed Jun 3, 2022