Conversation

@seriousben
Contributor

Context

What

Testing

Contribution Checklist

  • If a Python package was changed, please run make fmt in the package directory.
  • If the server was changed, please run make fmt in server/.
  • Make sure all PR Checks are passing.

@seriousben seriousben force-pushed the seriousben/get_desired_state_impl branch 11 times, most recently from 3f7bbef to ab85ace Compare April 24, 2025 14:51
@diptanu diptanu marked this pull request as ready for review April 25, 2025 03:28
@diptanu diptanu force-pushed the seriousben/get_desired_state_impl branch from 6393273 to e79527b Compare April 25, 2025 03:41
monitoring_server_host=monitoring_server_host,
monitoring_server_port=monitoring_server_port,
enable_grpc_state_reconciler=enable_grpc_state_reconciler,
enable_grpc_state_reconciler=True,
Contributor

Looks good, I'll be removing the old code paths in Executor once we transition to grpc.

ignored_clock=new_state.clock,
)
continue # Duplicate or outdated message state sent by Server.
# TODO: The clock is only incremented when function executors have actionable changes and not on new allocations.
Contributor

Is it possible to fix this in the future without much effort?
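For readers following the snippet above, here is a minimal sketch of clock-based filtering of desired states. It is an illustration under stated assumptions, not the PR's actual code: `DesiredState`, `StateReconciler`, and `handle` are hypothetical names, and only the `clock` field and the "duplicate or outdated" rule come from the diff.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DesiredState:
    # Hypothetical stand-in for the desired state message sent by Server.
    clock: int
    payload: str


@dataclass
class StateReconciler:
    # Hypothetical reconciler that remembers the newest clock it has applied.
    last_clock: Optional[int] = None
    applied: List[str] = field(default_factory=list)

    def handle(self, new_state: DesiredState) -> bool:
        """Apply new_state unless its clock is not newer than the last applied one."""
        if self.last_clock is not None and new_state.clock <= self.last_clock:
            # Duplicate or outdated state sent by Server: ignore it.
            return False
        self.last_clock = new_state.clock
        self.applied.append(new_state.payload)
        return True
```

If the clock really is not incremented on new allocations, as the TODO says, a strict check like this one would presumably skip a state that differs only in its allocations, which may be why the TODO was left in place.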

reducer=output.reducer,
)
output_files: List[Any] = []
if output is None:
Contributor

Yeah this code doesn't make any sense :)

This improves time to detect FE failures.
Because Executor now pushes its state to Server periodically, we can no longer
rely on sequences of events completing within a certain strict-ish duration,
so we can't make assertions on these durations anymore.
Removing the tests that make such assertions for now, until we have a better
approach to implementing them.
eabatalov and others added 4 commits April 25, 2025 14:34
I changed the test to run invocations in parallel instead of sequentially.
When running sequentially there's effectively 1 task at a time available to
distribute between 2 idle executors, and it's not important to route that task
randomly to the executors. After making it parallel we have 1200 enqueued
tasks and we get a really good uniform distribution in the test,
and it's passing now.
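A rough sketch of the parallel variant described above, assuming a hypothetical `invoke` callable that returns the ID of the executor that ran the task. The executor names, the `fake_invoke` stand-in, the worker count, and the 20% tolerance are all assumptions for illustration; only the 1200 tasks and the 2 executors come from the commit message.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

EXECUTOR_IDS: List[str] = ["executor-a", "executor-b"]  # hypothetical names


def fake_invoke(rng: random.Random) -> str:
    # Stand-in for a real invocation: pretends Server routed the task
    # to a randomly chosen executor and returns that executor's ID.
    return rng.choice(EXECUTOR_IDS)


def count_routing(invoke: Callable[[], str], total: int, max_workers: int = 32) -> Counter:
    # Submit every invocation up front so many tasks are enqueued at once,
    # then count how many tasks each executor ended up running.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(invoke) for _ in range(total)]
        return Counter(f.result() for f in futures)


def is_roughly_uniform(counts: Counter, total: int, executors: List[str],
                       tolerance: float = 0.2) -> bool:
    # Each executor should receive its even share, give or take the tolerance.
    expected = total / len(executors)
    return all(abs(counts.get(e, 0) - expected) <= tolerance * expected
               for e in executors)
```

A test along these lines would call `count_routing(lambda: fake_invoke(rng), 1200)` and assert `is_roughly_uniform(...)` on the result, with the real invocation call substituted for `fake_invoke`.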
The main idea is to not measure the extra 5 sec latency of Server learning
from Executor that Function Executor was created successfully.
Collaborator

@diptanu diptanu left a comment

@eabatalov @seriousben Merging this since tests are green. Unblocks me. Prayers Up for our production 😂

@diptanu diptanu merged commit 2021839 into main Apr 26, 2025
9 checks passed
@diptanu diptanu deleted the seriousben/get_desired_state_impl branch April 26, 2025 05:38
4 participants