Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When using --remote_download_toplevel, some remote inputs are staged (downloaded) for local execution, even when they already exist on disk #13912

Closed
brentleyjones opened this issue Aug 27, 2021 · 13 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug

Comments

@brentleyjones
Copy link
Contributor

brentleyjones commented Aug 27, 2021

Description of the problem / feature request:

When using --remote_download_toplevel, and only some of the outputs of another action are consumed as inputs to an action that is run locally, those inputs are staged (downloaded) each time the local action is executed, instead of reusing the already downloaded outputs, if not all of the outputs of the producing action have been downloaded.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

example_rules.bzl:

ProducerInfo = provider(
    fields = {
        "a": "a",
        "b": "b",
    },
)

def _producer_impl(ctx):
    a = ctx.actions.declare_file("{}/a.txt".format(ctx.attr.name))
    b = ctx.actions.declare_file("{}/b.txt".format(ctx.attr.name))

    ctx.actions.run_shell(
        command = """
head -c 100 </dev/urandom > $1
head -c 100 </dev/urandom > $2
""",
        arguments = [
            a.path,
            b.path,
        ],
        outputs = [a, b],
        mnemonic = "Producer",
    )

    return [
        ProducerInfo(
            a = depset([a]),
            b = depset([b]),
        ),
        OutputGroupInfo(
            a = depset([a]),
            b = depset([b]),
        ),
    ]

producer = rule(
    implementation = _producer_impl,
)

def _consumer_impl(ctx):
    b = ctx.attr.producer[ProducerInfo].b
    output = ctx.actions.declare_file("{}/out.txt".format(ctx.attr.name))

    args = ctx.actions.args()
    args.add_all(b)

    ctx.actions.run_shell(
        command = """
readonly output="$1"
shift
cat $@ > "$output"
""",
        arguments = [
            output.path,
            args,
        ],
        inputs = depset([ctx.info_file], transitive = [b]),
        mnemonic = "Consumer",
        outputs = [output],
        execution_requirements = {
            "no-cache": "1",
            "no-remote": "1",
        },
    )

    return [
        DefaultInfo(
            files = depset([output]),
        )
    ]

consumer = rule(
    implementation = _consumer_impl,
    attrs = {
        "producer": attr.label(providers = [[ProducerInfo]]),
    },
)

BUILD:

load("@//:example_rules.bzl", "producer", "consumer")

producer(
    name = "producer",
)

consumer(
    name = "consumer",
    producer = ":producer",
)
# Run it once to prime the cache
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel -- //:consumer
INFO: Invocation ID: 63bbdc7b-b5f8-4895-96dd-3ff99f1b287e
INFO: Analyzed target //:consumer (4 packages loaded, 8 targets configured).
INFO: Found 1 target...
Target //:consumer up-to-date:
  bazel-bin/consumer/out.txt
INFO: Elapsed time: 0.183s, Critical Path: 0.05s
INFO: 3 processes: 1 internal, 2 darwin-sandbox.
INFO: Build completed successfully, 3 total actions

# Clean to force cache to be used
❯❯❯ bazel clean
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel -- //:consumer
INFO: Invocation ID: 0c1ed250-40f2-4d55-ad76-5c59d8ebb7be
INFO: Analyzed target //:consumer (4 packages loaded, 8 targets configured).
INFO: Found 1 target...
Target //:consumer up-to-date:
  bazel-bin/consumer/out.txt
INFO: Elapsed time: 0.143s, Critical Path: 0.02s
INFO: 3 processes: 1 remote cache hit, 1 internal, 1 darwin-sandbox.
INFO: Build completed successfully, 3 total actions

# Subsequent runs re-download (see the "1 remote cache hit")
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel -- //:consumer
INFO: Invocation ID: 43e4f249-3c96-4dc1-b725-4b161a5054c2
INFO: Analyzed target //:consumer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:consumer up-to-date:
  bazel-bin/consumer/out.txt
INFO: Elapsed time: 0.094s, Critical Path: 0.02s
INFO: 3 processes: 1 remote cache hit, 1 internal, 1 darwin-sandbox.
INFO: Build completed successfully, 3 total actions

# If we download the other part of the output...
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel --output_groups=+a -- //:producer
INFO: Invocation ID: d8ba3512-ff22-44da-b19a-95f04c7ba594
INFO: Analyzed target //:producer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:producer up-to-date:
  bazel-bin/producer/a.txt
INFO: Elapsed time: 0.088s, Critical Path: 0.00s
INFO: 2 processes: 1 remote cache hit, 1 internal.
INFO: Build completed successfully, 2 total actions

# ... then we no longer get extra downloads
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel -- //:consumer
INFO: Invocation ID: 7dee6046-a432-4264-a23b-ee76cccaec35
INFO: Analyzed target //:consumer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:consumer up-to-date:
  bazel-bin/consumer/out.txt
INFO: Elapsed time: 0.105s, Critical Path: 0.03s
INFO: 2 processes: 1 internal, 1 darwin-sandbox.
INFO: Build completed successfully, 2 total actions

# If we delete that output, then we get extra downloads again
❯❯❯ rm -f bazel-bin/producer/a.txt
❯❯❯ bazel build --disk_cache=/tmp/cache --embed_label=$(date) --remote_download_toplevel -- //:consumer
INFO: Invocation ID: 190c6993-564c-48a8-8b83-98f4fa8e9e3e
INFO: Analyzed target //:consumer (0 packages loaded, 0 targets configured).
INFO: Found 1 target...
Target //:consumer up-to-date:
  bazel-bin/consumer/out.txt
INFO: Elapsed time: 0.147s, Critical Path: 0.03s
INFO: 3 processes: 1 remote cache hit, 1 internal, 1 darwin-sandbox.
INFO: Build completed successfully, 3 total actions

What operating system are you running Bazel on?

macOS 11.5.2

What's the output of bazel info release?

release 5.0.0-pre.20210817.2

Any other information, logs, or outputs that you want to share?

#12855 (comment)

@brentleyjones
Copy link
Contributor Author

cc: @coeuvre

@oquenchil oquenchil added team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug untriaged labels Aug 31, 2021
@brentleyjones
Copy link
Contributor Author

@coeuvre This is still an issue, and would be nice to have fixed for 5.0 🙏.

@coeuvre coeuvre added P1 I'll work on this now. (Assignee required) and removed untriaged labels Nov 5, 2021
@coeuvre coeuvre self-assigned this Nov 5, 2021
@coeuvre coeuvre added this to the Bazel 5.0 Release Blockers milestone Nov 5, 2021
@coeuvre
Copy link
Member

coeuvre commented Nov 8, 2021

@brentleyjones Can you try with --experimental_action_cache_store_output_metadata?

@coeuvre coeuvre removed this from the Bazel 5.0 Release Blockers milestone Nov 10, 2021
@coeuvre coeuvre added P2 We'll consider working on this in future. (Assignee optional) and removed P1 I'll work on this now. (Assignee required) labels Nov 10, 2021
@coeuvre
Copy link
Member

coeuvre commented Nov 10, 2021

Since --experimental_action_cache_store_output_metadata can be used as a workaround, I am removing this from 5.0 release blocker and down the priority to P2.

@brentleyjones Let me know if it doesn't work for you.

@lummax
Copy link

lummax commented Feb 3, 2022

We had a similar effect with --remote_download_toplevel and a bazel shutdown / server restart. After that bazel redownloaded a lot from the remote-cache / disk-cache. In the end --experimental_action_cache_store_output_metadata helped a lot and it turns out this is exactly why it was added: #8248

$ bazel build :foo --experimental_remote_download_outputs=minimal
$ bazel shutdown
# This will be clean build, because the metadata from the previous build are
# not yet properly persisted to disk
$ bazel build :foo --experimental_remote_download_outputs=minimal

Excuse me for adding a lot of redundant words, but I hope this makes it more discoverable :)

Maybe a hit in the documentation for --remote_download_toplevel would be a good idea?

@brentleyjones
Copy link
Contributor Author

@coeuvre @meteorcloudy Can we get an issue tracking the path to non-experimental for --experimental_action_cache_store_output_metadata? I just re-discovered this bug, and forgot that one needed to set this flag. It would be nice to have it enabled by default if there aren't known issues with it.

brentleyjones added a commit to MobileNativeFoundation/rules_xcodeproj that referenced this issue Oct 7, 2022
Sets `--experimental_action_cache_store_output_metadata` to work around bazelbuild/bazel#13912.
brentleyjones added a commit to MobileNativeFoundation/rules_xcodeproj that referenced this issue Oct 7, 2022
Sets `--experimental_action_cache_store_output_metadata` to work around
bazelbuild/bazel#13912.
@meteorcloudy
Copy link
Member

@coeuvre Let me know if flipping --experimental_action_cache_store_output_metadata has any blockers, you can file an issue to track the flip of this flag just like other incompatible flags, the pipeline will pick it up with the correct labels set (incompatible-change and migration-ready).

@coeuvre
Copy link
Member

coeuvre commented Oct 7, 2022

Yes, i will create an issue for it. However, I would not like to remove the experimental_ prefix and enable it by default in 6.0 since there are still some improvements to be done which might also able to prevent build failures for most cases in the event of remote cache eviction without using action rewinding.

@coeuvre
Copy link
Member

coeuvre commented Nov 9, 2022

with #16720, this flag is always set when using --remote_download_{toplevel,minimal}. Closing.

@coeuvre coeuvre closed this as completed Nov 9, 2022
@coeuvre coeuvre reopened this Nov 9, 2022
@coeuvre
Copy link
Member

coeuvre commented Nov 9, 2022

Sorry, should have closed this after #16720 is merged.

@brentleyjones
Copy link
Contributor Author

Does your comment still stand about renaming? Could #16720 get into 6.0 at least?

@coeuvre
Copy link
Member

coeuvre commented Nov 9, 2022

Does your comment still stand about renaming?

With #16720, I don't think renaming matters anymore. Also, this flag only makes sense when using along with BwoB, so making it default globally doesn't sound good.

Could #16720 get into 6.0 at least?

I think so.

@brentleyjones
Copy link
Contributor Author

With #16720, I don't think renaming matters anymore.

I think renaming still matters.

Also, this flag only makes sense when using along with BwoB, so making it default globally doesn't sound good.

I agree 👍.

ShreeM01 added a commit that referenced this issue Nov 14, 2022
#16740)

…n of flag --remote_download_{toplevel,minimal}.

So users don't need to add it manually.

It is always desired to use this flag along with BwoB.

Closes #13912.

Closes #16720.

PiperOrigin-RevId: 487248820
Change-Id: I94f6486a0a61522dc9000de7e9a595ace65c97e2

Co-authored-by: Chi Wang <chiwang@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: bug
Projects
None yet
5 participants