feat: get desired state implementation #1362

seriousben · 2025-04-16T20:51:39Z

Context

What

Testing

Contribution Checklist

If a Python package was changed, please run make fmt in the package directory.
If the server was changed, please run make fmt in server/.
Make sure all PR Checks are passing.

eabatalov · 2025-04-25T10:31:27Z

indexify/src/indexify/cli/cli.py

        monitoring_server_host=monitoring_server_host,
        monitoring_server_port=monitoring_server_port,
-        enable_grpc_state_reconciler=enable_grpc_state_reconciler,
+        enable_grpc_state_reconciler=True,


Looks good, I'll be removing the old code paths in Executor once we transition to grpc.

eabatalov · 2025-04-25T10:31:57Z

indexify/src/indexify/executor/grpc/state_reconciler.py

-                        ignored_clock=new_state.clock,
-                    )
-                    continue  # Duplicate or outdated message state sent by Server.
+            # TODO: The clock is only incremented when function executors have actionable changes and not on new allocations.


Is it possible to fix this in the future without much effort?
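
For illustration only, here is a minimal sketch of why a clock-based duplicate check can drop an update when the clock is not incremented for new allocations, as the TODO above describes. All names here (`DesiredState`, `ClockDedupReconciler`, `ContentDedupReconciler`) are hypothetical and are not taken from `state_reconciler.py`. Comparing state content is one possible alternative and is not necessarily what this PR does.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class DesiredState:
    """Hypothetical stand-in for the desired state message sent by Server."""
    clock: int
    function_executors: Tuple[str, ...]
    task_allocations: Tuple[str, ...]


class ClockDedupReconciler:
    """Skips any state whose clock is not newer than the last one seen."""

    def __init__(self) -> None:
        self.last_clock: Optional[int] = None
        self.applied: List[DesiredState] = []

    def on_state(self, state: DesiredState) -> bool:
        if self.last_clock is not None and state.clock <= self.last_clock:
            return False  # Treated as duplicate or outdated.
        self.last_clock = state.clock
        self.applied.append(state)
        return True


class ContentDedupReconciler:
    """Skips a state only when its content equals the last applied content."""

    def __init__(self) -> None:
        self.last_content: Optional[tuple] = None
        self.applied: List[DesiredState] = []

    def on_state(self, state: DesiredState) -> bool:
        content = (state.function_executors, state.task_allocations)
        if content == self.last_content:
            return False  # Nothing changed, so there is nothing to reconcile.
        self.last_content = content
        self.applied.append(state)
        return True


# A new allocation arrives, but the clock stays at 1 (assumed behavior per the TODO).
first = DesiredState(clock=1, function_executors=("fe-1",), task_allocations=())
second = DesiredState(clock=1, function_executors=("fe-1",), task_allocations=("task-1",))
```

With these two messages, the clock-based reconciler would ignore `second` even though it carries a new allocation, while the content-based one would apply it.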

eabatalov · 2025-04-25T10:33:52Z

indexify/src/indexify/executor/task_reporter.py

+            reducer=output.reducer,
        )
        output_files: List[Any] = []
-        if output is None:


Yeah this code doesn't make any sense :)

Because the Executor now pushes its state to the Server periodically, we can no longer rely on sequences of events completing within a strict-ish duration, so we can't make assertions on those durations anymore. Removing the tests that make such assertions for now, until we have a better approach for implementing them.

I changed the test to run invocations in parallel instead of sequentially. When running sequentially, there is effectively only 1 task at a time available to distribute between 2 idle executors, and it's not important to route that single task randomly. After making the test parallel, we have 1200 enqueued tasks, we get a really good uniform distribution in the test, and it passes now.
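
As a rough sketch of the idea above (not the actual test from this PR), the snippet below submits 1200 invocations in parallel and checks that they are spread roughly evenly across 2 executors. The executor ids, the `invoke_and_get_executor` stand-in, and the 10% tolerance are all assumptions made for illustration. The stand-in simply picks an executor at random, whereas the real test would invoke a graph through the Server.

```python
import random
import threading
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

EXECUTOR_IDS = ("executor-a", "executor-b")  # Hypothetical executor ids.

_rng = random.Random(0)
_rng_lock = threading.Lock()


def invoke_and_get_executor(_task_index: int) -> str:
    """Stand-in for a real invocation; returns the id of the executor that ran it.

    ASSUMPTION: routing is modeled as a uniform random choice.
    """
    with _rng_lock:
        return _rng.choice(EXECUTOR_IDS)


def run_parallel_invocations(total_tasks: int, workers: int = 32) -> Counter:
    """Submits all invocations at once so many tasks are enqueued together."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return Counter(pool.map(invoke_and_get_executor, range(total_tasks)))


def is_roughly_uniform(counts: Counter, total_tasks: int, tolerance: float = 0.1) -> bool:
    """True if each executor's share is within `tolerance` of an even split."""
    expected_share = 1 / len(EXECUTOR_IDS)
    return all(
        abs(counts[executor_id] / total_tasks - expected_share) <= tolerance
        for executor_id in EXECUTOR_IDS
    )


counts = run_parallel_invocations(1200)
```

A large batch makes the per-executor shares statistically stable, which is why a tolerance check like `is_roughly_uniform(counts, 1200)` is reasonable here but would not be for a handful of sequential tasks.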

diptanu

@eabatalov @seriousben Merging this since tests are green. Unblocks me. Prayers Up for our production 😂

seriousben force-pushed the seriousben/get_desired_state_impl branch 11 times, most recently from 3f7bbef to ab85ace Compare April 24, 2025 14:51

diptanu marked this pull request as ready for review April 25, 2025 03:28

seriousben and others added 11 commits April 24, 2025 20:34

feat: get desired state implementation

9c3567b

support all fields and change how in-mem is updated

c12b331

test: add more reconciliation tests

2c64ea1

fix: executor stop using clock as idempotency token

e98d5db

fix: fix tests

e23989c

fix: fix rebase and make it possible to run server against minio/loca…

2b53def

…lstack

feat: default to always do grpc reconciliation on executor

fd5451b

fix behavior test

a6bc9ca

lint

833b672

fix tests params

928817f

update test runner

e79527b

diptanu force-pushed the seriousben/get_desired_state_impl branch from 6393273 to e79527b Compare April 25, 2025 03:41

eabatalov reviewed Apr 25, 2025

eabatalov added 3 commits April 25, 2025 11:58

Update tensorlake

d41afbf

Reduce periodic FE health checks period 10 sec -> 5 sec

964d323

This improves time to detect FE failures.

eabatalov and others added 4 commits April 25, 2025 14:34

Fix test_invoke_duration tests

9c4e704

The main idea is to not measure the extra 5 sec latency of Server learning from Executor that Function Executor was created successfully.

feat: support FE startup failures

3f06928

updated tensorlake commit

0515f76

diptanu approved these changes Apr 26, 2025

diptanu merged commit 2021839 into main Apr 26, 2025
9 checks passed

diptanu deleted the seriousben/get_desired_state_impl branch April 26, 2025 05:38
