Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3870: [Rust][Build] Speed up CI for Rust #2517

Merged
merged 1 commit into from
Oct 10, 2023

Conversation

sarutak
Copy link
Member

@sarutak sarutak commented Sep 24, 2023

AVRO-3870

What is the purpose of the change

This PR aims to make the CI for Rust faster.

In the current master, there are something wrong about actions/cache in test-lang-rust-ci.yml.

First, a directory target is tend to be cached but the path is wrong. The correct path is lang/rust/target, not ~/target.
https://github.com/apache/avro/actions/runs/6277247580/job/17048599038#step:22:2

Second, as of Rust 1.70.0, Cargo changes the way to download dependencies.
https://blog.rust-lang.org/2023/06/01/Rust-1.70.0.html#sparse-by-default-for-cratesio
So, it's better not to share the cache for ~/.cargo

Verifying this change

Done by CI.

Documentation

  • Does this pull request introduce a new feature? (no)

@github-actions github-actions bot added the build label Sep 24, 2023
@martin-g
Copy link
Member

First, a directory target is tend to be cached but the path is wrong. The correct path is lang/rust/target, n

See https://github.com/apache/avro/pull/2517/files#diff-7a8b4df8e37b74953d7bc5d6dce62876277b1951065cdb170a48ade910d5e551R37 .


- name: Cache Cargo for 1.65.0 (Restore Only)
if: matrix.rust == '1.65.0' && matrix.target != 'x86_64-unknown-linux-gnu'
uses: actions/cache/restore@v3
Copy link
Member

@martin-g martin-g Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks weird.
actions/cache@v3 does two things - first it restores the cache and later in the post-processing saves the cache.
Now you introduce the new actions/cache/restore@v3 which gives you more control when to restore but it should be used with actions/cache/save@v3 to have a better control when to save.
Using both actions/cache with any of the new more granular actions does not make sense to me.

Copy link
Member Author

@sarutak sarutak Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having only restore for some cache is intended.
For lang/rust/target, if a target is wasm32-unknown-unknown, cargo test doesn't run. So if wasm32-unknown-unknown finishes earlier than x86_64-unknown-linux-gnu, the dependencies built for tests will not be cached right?

For ~/.cargo, it might not be necessary to separate by conditions.
But, at least web-assembly job should be configured restore only.

See https://github.com/apache/avro/actions/runs/6277247580/job/17048597779#step:7:12
All the jobs download dependencies except for web-assembly job.
I guess this means ~/.cargo of web-assembly was cached because web-assembly job finished first but ~/.cargo of web-assembly doesn't contain all the necessary dependencies for other jobs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still very confused about these changes :-/
IMO we should add ${{ matrix.target }} to the cache key. This way x86_64 caches won't interfere with the wasm ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also the handling of pre-1.70 and post-1.70 adds to the confusion.
Let's just disable the new sparse protocol until MSRV is 1.70.0 via https://doc.rust-lang.org/cargo/reference/config.html#registriescrates-ioprotocol (i.e. a global env variable for the whole workflow)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll open a new PR with the proposed simplifications!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still very confused about these changes :-/
IMO we should add ${{ matrix.target }} to the cache key. This way x86_64 caches won't interfere with the wasm ones.

As I explained here, I didn't intentionally use ${{ matrix.target }}.

Also the handling of pre-1.70 and post-1.70 adds to the confusion.
Let's just disable the new sparse protocol until MSRV is 1.70.0 via https://doc.rust-lang.org/cargo/reference/config.html#registriescrates-ioprotocol (i.e. a global env variable for the whole workflow)

This idea seems awesome!

.github/workflows/test-lang-rust-ci.yml Show resolved Hide resolved
@sarutak
Copy link
Member Author

sarutak commented Sep 26, 2023

See https://github.com/apache/avro/pull/2517/files#diff-7a8b4df8e37b74953d7bc5d6dce62876277b1951065cdb170a48ade910d5e551R37 .

As I explained here, defaults.run.working-directory seems not to affect to path in actions/cache.

@sarutak
Copy link
Member Author

sarutak commented Oct 3, 2023

The latest change is a combination of solutions of mine and @martin-g .
In my experiment, CI takes 7m 24s with caches and takes up approximately 1.7 GB for Rust related caches.
https://github.com/sarutak/avro/actions/runs/6396061155
https://github.com/sarutak/avro/actions/caches

The degree of the performance improvement is not so large, but cost effective.

@martin-g
Copy link
Member

martin-g commented Oct 3, 2023

Do you see any issues with #2538 ?
It does not use matrix.rust and hash of Cargo.lock for the key, so the cache is reused even when dependencies are changed, but the drawback is that the cache grows slightly and some of the cached deps won't be used. We can easily delete the cache if it grows too big but I think this will happen after few years.

@sarutak
Copy link
Member Author

sarutak commented Oct 4, 2023

@martin-g
I've left comments in #2538

@martin-g martin-g merged commit 911b7af into apache:main Oct 10, 2023
12 checks passed
martin-g pushed a commit that referenced this pull request Oct 10, 2023
@martin-g
Copy link
Member

Thank you, @sarutak !

RanbirK pushed a commit to RanbirK/avro that referenced this pull request May 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants