
Scale in unregistered htex blocks #3232

Closed

wants to merge 2 commits into from

Conversation
Conversation

benclifford (Collaborator) commented Mar 11, 2024

This PR makes the htex scale-in code consider blocks which have been launched through the provider mechanism but have not registered with the interchange. This is especially relevant at shutdown time, when previously launched blocks still waiting in the queue (and so, not registered) would otherwise be abandoned by parsl shutdown.

[first commit on this PR is a test that should break; when that happens, I'll push the rest; this is to get a demonstrated test failure in CI]

Changed Behaviour

More blocks scaling in. Tidier shutdown.

Fixes

Fixes #2627

Type of change

  • Bug fix

…caled in

do this by submitting an infinite sleep worker
…at least one manager registered

How to test this? Make an htex with init_blocks=min_blocks=1 and a worker command that does something like sleep forever (it needs to not fail, because failed blocks won't get scaled in), and then shut down without any tasks. In the old behaviour that unregistered block will not be scaled in, I think, but in the new behaviour it should be.
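A minimal sketch of that scenario (the launch_cmd override and the provider parameters here are assumptions for illustration, not necessarily the exact test added in this PR):

import parsl
from parsl.config import Config
from parsl.executors import HighThroughputExecutor
from parsl.providers import LocalProvider

# Sketch only: the worker launch command never starts a real worker pool,
# so the block stays alive in the provider but never registers with the
# interchange.
config = Config(
    executors=[
        HighThroughputExecutor(
            label="htex_unregistered_block",
            launch_cmd="sleep infinity",   # alive, never registers, never fails
            provider=LocalProvider(
                init_blocks=1,
                min_blocks=1,
                max_blocks=1,
            ),
        )
    ]
)

parsl.load(config)
# no tasks submitted; at shutdown the unregistered block should now be
# scaled in (cancelled via the provider) rather than abandoned
parsl.dfk().cleanup()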
benclifford (Collaborator, Author)

Since I wrote this code a few weeks ago, there's been a change to the high throughput executor status method to introduce a "MISSING" state, and I've also gone over the general status code with @yadudoc a bit.

Most importantly, I'd like to check that this code isn't actually calling the provider status calls (e.g. invoking sbatch) unnecessarily: the current code is quite unclear about when asking for status returns some in-memory cache of blocks (and block status) versus calling out to the LRM.

There might also be opportunities for aligning this code better with the code that handles MISSING status.

yadudoc (Member) commented Mar 12, 2024

This draft PR fixes the issue described here: #3235.

benclifford (Collaborator, Author)

@yadudoc - I don't think it does? This PR doesn't fix any cache-based status-update delays. I thought we tested a different (no-PR-number) patch?

Comment on lines +751 to +752

managers = self.connected_managers()
Collaborator

Makes the earlier assignment redundant, yeah?

Comment on lines +743 to +750
for (b_id, job_status) in provider_blocks.items():

    # skip blocks that have already finished...
    if job_status.state in TERMINAL_STATES:
        continue

    if b_id not in block_info:
        block_info[b_id] = BlockInfo(tasks=0, idle=float('inf'))
Collaborator

provider_blocks is a dictionary, yeah? If so, then b_id is guaranteed to be unique through the loop. Coupled with the fact that block_info is empty when the loop starts, the if-conditional is a waste. Could merge it all into the instantiation:

block_info = {
    block_id: BlockInfo(tasks=0, idle=float('inf'))
    for block_id, job_status in provider_blocks.items()
    if job_status.state not in TERMINAL_STATES
}

Later on, block_info keys are set to zero on first use as well, so could make that structure a defaultdict:

block_info: Dict[str, BlockInfo] = defaultdict(lambda: BlockInfo(0, float('inf')))
for b_id, job_status in provider_blocks.items():
    # skip blocks that have already finished...
    if job_status.state in TERMINAL_STATES:
        continue

    # the defaultdict logic is now the authority on "empty BlockInfo" objects
    block_info[b_id]

And could remove the if-branch in the following for-loop as well.

benclifford added a commit that referenced this pull request Mar 13, 2024
parsl.provider.* status() calls can be expensive, in the sense that they
do things like execute batch system commands which system administrators
dislike being run automatically and periodically, and even in some
situations ban that behavior.

So, Parsl tries to not do those calls too often - providers can define a
status_polling_interval (which returns an interval in seconds) and the
strategy code will call (through a chain of calls) the provider status
method at most every status_polling_interval.

Prior to this PR, this was implemented in the job status poller code,
which makes the following chain of calls at most every status polling
interval:

poller/strategy iterator =>
  poll_item.poll (runs "often" - e.g. every 5 seconds, driven by the job status poller?)
    => determine if the rest of this stack should run, or otherwise use data cached in the poll_item._status attribute
      => executor.status()
        => provider.status()
          => (in the case of cluster providers, provider._status and an LRM command line callout)

There are a couple of issues related to this call stack:

* #3235 - slow update of jobs that failed to submit - when a job fails
to submit, at present it is recorded in a separate table, the
_simulated_status dictionary, in BlockProviderExecutor. Entries added to
that are not exposed to the scaling code until executor.status() is
called, which may be many strategy iterations later. In that time, the
scaling strategy may make the wrong decisions, based on incomplete
information. That is the core of issue #3235.

* #2627 - htex scale in needs to be aware of job status. Scale in code
is not always driven by the poller/strategy code; for example, it
executes at shutdown, driven by the DataFlowKernel. That scale in code
is then not in a position to make use of the cached data inside
poll_item, and it is more natural to call executor.status() to determine
status - see PR #3232. That results in calls to provider.status()
without any status_polling_interval rate limiting/caching.

This current PR moves that rate limit/cache deeper into the stack to the
BlockProviderExecutor.

The job status poller code now polls the executor on every iteration,
relying on the executor to perform rate limiting and caching.

That has the effect of exposing simulated statuses to the scaling
strategy code on each iteration, so that the delays in issue #3235 will
not be experienced.

This change also means that PR #3232 can safely call executor.status()
without exceeding the rate limit.

This PR changes when provider status gets called, because it is now a
cache that refreshes whenever status is called past the expiry time,
rather than more explicitly driven in a loop. That might change scaling
behaviour, but I think not in any significant way.

This PR changes the decision of when an executor is to be polled for
status: previously non-positive values of status_polling_interval were
used as magic values to inhibit provider status polling. Now, any
executor with a provider will be polled.

This PR makes the call stack above provider.status a little more data
driven and less effectful (the effect being load on the LRM), which
might make refactoring this stack a bit easier. However, it will also
make that stack a bit more computationally heavy - I think not by much,
because usually there are not many blocks in existence.

This PR adds a test that repeatedly calling BlockProviderExecutor.status
does not result in an excessive number of calls to the provider status
method but still updates eventually.
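For illustration only, a rough sketch of that rate-limit/cache pattern; the class and attribute names are assumptions made up for this example, not the actual BlockProviderExecutor code:

import time

# Illustrative only: cache provider status and refresh it at most once per
# status_polling_interval, however often callers ask for status.
class CachedProviderStatus:
    def __init__(self, provider, block_ids):
        self._provider = provider
        self._block_ids = block_ids
        self._cached_status = {}
        self._last_poll = float('-inf')

    def status(self):
        now = time.time()
        if now - self._last_poll >= self._provider.status_polling_interval:
            # a single callout to the LRM (e.g. squeue/qstat), no matter how
            # often status() itself is called
            statuses = self._provider.status(self._block_ids)
            self._cached_status = dict(zip(self._block_ids, statuses))
            self._last_poll = now
        return self._cached_status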
benclifford added a commit that referenced this pull request Mar 25, 2024
this is to accommodate future population of blockinfo from multiple sources

how is this change tested? it *shouldn't* change anything. but scale_in is not well tested...
benclifford added a commit that referenced this pull request Mar 25, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232
benclifford added a commit that referenced this pull request Mar 26, 2024
this is to accommodate future population of blockinfo from multiple sources

how is this change tested? it *shouldn't* change anything. but scale_in is not well tested...
benclifford added a commit that referenced this pull request Mar 26, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232
benclifford added a commit that referenced this pull request Mar 26, 2024
this is to accommodate future population of blockinfo from multiple sources

how is this change tested? it *shouldn't* change anything. but scale_in is not well tested...
benclifford added a commit that referenced this pull request Mar 26, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232
benclifford added a commit that referenced this pull request Mar 26, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232
benclifford added a commit that referenced this pull request Mar 27, 2024
this is to accommodate future population of blockinfo from multiple sources

how is this change tested? it *shouldn't* change anything. but scale_in is not well tested...
benclifford added a commit that referenced this pull request Mar 27, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Mar 27, 2024
This is to accommodate upcoming expansion of the block_info dictionary to
include data from an additional source (the list of blocks known to the
scaling code)

This change arises from a review of PR #3232, an earlier prototype of that
expansion work.

Behaviourally, this shouldn't change anything: it changes invalid
block accesses from key errors into blocks that are infinitely idle and
completely unloaded. However there are no indexes into block_info except
in code that is adding block information, so this situation should
not arise.
benclifford added a commit that referenced this pull request Mar 28, 2024
This is to accommodate upcoming expansion of the block_info dictionary to
include data from an additional source (the list of blocks known to the
scaling code)

This change arises from a review of PR #3232, an earlier prototype of that
expansion work.

Behaviourally, this shouldn't change anything: it changes invalid
block accesses from key errors into blocks that are infinitely idle and
completely unloaded. However there are no indexes into block_info except
in code that is adding block information, so this situation should
not arise.
benclifford added a commit that referenced this pull request Mar 28, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 7, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 8, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 9, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 9, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 11, 2024
…htex can now make use of that information source...

so I can try to make a better implementation of PR #3232

i think this fixes 3232
benclifford added a commit that referenced this pull request Apr 12, 2024
Prior to this PR, the scale_in code for the HighThroughputExecutor would not scale in any block that had not had at least one manager register with the interchange, because it retrieves the list of blocks from the interchange. This is documented in issue #3232.

This PR makes the htex scale in code also pay attention to blocks in the status_facade list - which includes blocks that have been submitted, and blocks which have been reported by the provider mechanism.
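For illustration, a sketch of combining the two information sources when choosing scale-in candidates; BlockInfo and the manager dictionary keys follow the review comments earlier in this conversation and are assumptions, not the merged implementation:

from collections import defaultdict
from dataclasses import dataclass

@dataclass
class BlockInfo:
    tasks: int
    idle: float

def scale_in_candidates(connected_managers, provider_blocks, terminal_states):
    # every known block starts out with no tasks and infinite idle time
    block_info = defaultdict(lambda: BlockInfo(tasks=0, idle=float('inf')))

    # blocks known on the provider / status_facade side, even if no manager
    # has registered with the interchange yet
    for block_id, job_status in provider_blocks.items():
        if job_status.state in terminal_states:
            continue
        block_info[block_id]  # materialise an empty entry

    # blocks with registered managers: accumulate real load information
    for m in connected_managers:
        info = block_info[m['block_id']]
        info.tasks += m['tasks']
        info.idle = min(info.idle, m['idle_duration'])

    return block_info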
Development

Successfully merging this pull request may close these issues.

htex termination (and other scale-in-by-quantity) does not scale in blocks that haven't registered yet
3 participants