Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Message field can now be populated with useful messages #361

Merged
merged 1 commit into from
Nov 1, 2023

Conversation

allada
Copy link
Member

@allada allada commented Oct 31, 2023

We now allow users to configure how the message field is populated for ActionResult and added HistoricalExecuteResponse that is compatible with bb_browswer.

closes #333
closes #331


This change is Reviewable

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@aaronmondal
cc: @chrisstaite-menlo (feel free to review too if you'd like)

Reviewable status: 0 of 20 files reviewed, all discussions resolved (waiting on @aaronmondal)

@allada allada force-pushed the print-useful-data-in-message-field branch from 7616ff9 to 149ca6f Compare October 31, 2023 07:08
Copy link
Member

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 20 of 20 files at r1, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @allada)


cas/cas_main.rs line 501 at r1 (raw file):

                    let maybe_ac_store = if let Some(ac_store_ref) = &local_worker_cfg.upload_action_result.ac_store {
                        Some(store_manager.get_store(ac_store_ref).err_tip(|| {
                            format!("Failed to find store for ac_store in worker config : {}", ac_store_ref)

nit: ac_store_ref can be inlined:format!("Failed to find ... {ac_store_ref}")


cas/cas_main.rs line 512 at r1 (raw file):

                            store_manager.get_store(cas_store_ref).err_tip(|| {
                                format!(
                                    "Failed to find store for historical_results_store in worker config : {}",

nit: cas_store_ref can be inlined:format!("Failed to find ... {cas_store_ref}")


cas/worker/running_actions_manager.rs line 1181 at r1 (raw file):

        if treat_infra_error_as_failure && action_result.error.is_some() {
            did_fail = true;
        }

Nit: You could remove the mut here

Suggestion:

        let did_fail = if treat_infra_error_as_failure && action_result.error.is_some() {
            true
        } else {
            action_result.exit_code != 0
        };

cas/worker/running_actions_manager.rs line 1187 at r1 (raw file):

            // Never cache internal errors or timeouts.
            UploadCacheResultsStrategy::Everything => treat_infra_error_as_failure || action_result.error.is_none(),
            UploadCacheResultsStrategy::FailuresOnly => did_fail,

BTW Interesting. What is the use case for the FailuresOnly strategy?


cas/worker/running_actions_manager.rs line 1199 at r1 (raw file):

    ) -> Result<String, Error> {
        // TODO(allada) Currently only sha256 is supported, but soon will be dynamic.
        template_str.replace("digest_function", digest_function::Value::Sha256.as_str_name());

Nit Do we really want to use SHA256 as the default over Blake3?


cas/worker/running_actions_manager.rs line 1235 at r1 (raw file):

                .update_action_result(Request::new(update_action_request))
                .await
                .map(|_| ())

Is this safe to do? What are we mapping to () here?


cas/worker/running_actions_manager.rs line 1285 at r1 (raw file):

        let mut execute_response = to_execute_response(action_result.clone());

        let message_template = if action_result.exit_code == 0 && action_result.error.is_none() {

Could you add a comment explaining in what situation we could we get an exit code 0 and a not-none error message?


cas/worker/running_actions_manager.rs line 1325 at r1 (raw file):

            )
            .await
            .map(|_| ())

What are we mapping here?

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @aaronmondal)


cas/cas_main.rs line 501 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: ac_store_ref can be inlined:format!("Failed to find ... {ac_store_ref}")

Done.


cas/cas_main.rs line 512 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

nit: cas_store_ref can be inlined:format!("Failed to find ... {cas_store_ref}")

Done.


cas/worker/running_actions_manager.rs line 1181 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

Nit: You could remove the mut here

Yeah I thought about it, but it was easier to read this way.


cas/worker/running_actions_manager.rs line 1187 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

BTW Interesting. What is the use case for the FailuresOnly strategy?

Buildbarn uploads historical cache only on failure and uses action cache only on success when printing results.


cas/worker/running_actions_manager.rs line 1199 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

Nit Do we really want to use SHA256 as the default over Blake3?

For now we don't support blake3 as a CAS hash function. This needs to be compatible with CAS for now so we need to stick to sha256.


cas/worker/running_actions_manager.rs line 1235 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

Is this safe to do? What are we mapping to () here?

This only ignores the ActionResult, which we don't care about here. The errors are not caught under map(), so errors are ignored.


cas/worker/running_actions_manager.rs line 1285 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

Could you add a comment explaining in what situation we could we get an exit code 0 and a not-none error message?

Done.


cas/worker/running_actions_manager.rs line 1325 at r1 (raw file):

Previously, aaronmondal (Aaron Siddhartha Mondal) wrote…

What are we mapping here?

Yeah this map was needless.

@allada allada force-pushed the print-useful-data-in-message-field branch from 149ca6f to d0d91fb Compare October 31, 2023 16:22
Copy link
Member

@aaronmondal aaronmondal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @allada)

Breaking Change: This is a breaking change because the worker
config's action cache store is now inside the "upload_action_result"
property. Only the config file should need to be changed.

We now allow users to configure how the message field is populated
for ActionResult and added HistoricalExecuteResponse that is
compatible with bb_browswer.

closes #333
closes #331
@allada allada force-pushed the print-useful-data-in-message-field branch from d0d91fb to cccd483 Compare November 1, 2023 03:54
Copy link
Collaborator

@chrisstaite-menlo chrisstaite-menlo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 18 of 20 files at r1, 4 of 4 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @allada)


cas/grpc_service/execution_server.rs line 170 at r3 (raw file):

        let receiver_stream = Box::pin(WatchStream::new(receiver).map(|action_update| {
            log::info!("\x1b[0;31mexecute Resp Stream\x1b[0m: {:?}", action_update);
            Ok(Into::<Operation>::into(action_update.as_ref().clone()))

Wouldn't .into::<Operation>() read better?


cas/worker/local_worker.rs line 248 at r3 (raw file):

                                        // Save in the action cache before notifying the scheduler that we've completed.
                                        if let Some(digest_info) = action_digest.clone().and_then(|action_digest| action_digest.try_into().ok()) {
                                            if let Err(err) = running_actions_manager.cache_action_result(digest_info, &mut action_result).await {

Nice


cas/worker/tests/utils/mock_running_actions_manager.rs line 133 at r3 (raw file):

            .send(RunningActionManagerCalls::CacheActionResult(Box::new((
                action_digest,
                action_result.clone(),

Oh, not as nice as I first thought...

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @allada)


cas/worker/tests/utils/mock_running_actions_manager.rs line 133 at r3 (raw file):

Previously, chrisstaite-menlo (Chris Staite) wrote…

Oh, not as nice as I first thought...

This one is only for tests.

Copy link
Member Author

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @allada)


cas/grpc_service/execution_server.rs line 170 at r3 (raw file):

Previously, chrisstaite-menlo (Chris Staite) wrote…

Wouldn't .into::<Operation>() read better?

Yeah, I go back and forth with this.

@allada allada merged commit cf2f3e4 into main Nov 1, 2023
14 checks passed
@allada allada deleted the print-useful-data-in-message-field branch November 1, 2023 15:36
blakehatch pushed a commit to blakehatch/nativelink that referenced this pull request Nov 21, 2023
Breaking Change: This is a breaking change because the worker
config's action cache store is now inside the "upload_action_result"
property. Only the config file should need to be changed.

We now allow users to configure how the message field is populated
for ActionResult and added HistoricalExecuteResponse that is
compatible with bb_browswer.

closes TraceMachina#333
closes TraceMachina#331
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants