-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Cleanup {}
named function block
#207
Conversation
Would this also be implemented on the Cmdlet/PSCmdlet API? Or only on functions? I would also like to see |
This is already possible with Cmdlet / PSCmdlet -- you just need to implement
Ah yes, forgot to mention that; that's also done in the linked draft PR. I'll add it to this RFC. 😄 |
@SteveL-MSFT any chance I could get you guys to look this one over? 💖 |
@vexx32 the |
First, I want to compliment you on a well written RFC. It was a pleasure to read! I think that the use case is narrow-enough that it's acceptable to have a more verbose syntax for achieving the goal. Namely, this could be achieved with function Get-Foo {
param(
[string]$myparam
)
function Get-FooInternal {
process { # "$myparam" could be used here without additional marshaling }
}
start {
try {
Get-FooInternal
} finally {
# this is an equivalent of dispose block
}
}
} |
@vors could you expand on what you mean with your Yes, it can be achieved with try/finally, but (crucially!) not across named blocks. You can't initialize something in Try/finally cannot be used outside of named blocks, so this construct is necessary for any command that needs to work with external resources (database connections, for example) as well as utilize PowerShell's pipeline framework. The capability is simply not there at the moment, and this fills that gap. |
Nah, that's documented behavior. See Implement the IDisposable Interface (AC04). |
Added to the Alternate Proposals section below.
Clarify wording around invoking dispose{} and mention IDisposable with PS Classes
@SeeminglyScience oh nice! |
@SteveL-MSFT do you think we have sufficient comments for a committee review, or would you like me to give it another Twitter shout? 🙂 |
@SteveL-MSFT @joeyaiello friendly ping for review! I'd like to get this one in the works and hopefully get the code into ps7 if possible! It's already written, bar some minor rebases needed. |
@SteveL-MSFT @joeyaiello friendly ping for review when you guys are able, please & thank you! 😊 💖 |
@PowerShell/powershell-committee reviewed this today with quorum via @daxian-dbw, @JamesWTruher, @SteveL-MSFT, and @joeyaiello Overall, we really like the RFC, especially how it reflects the organic consensus that was reached in #6673. We have a few questions / open concerns, though: First and foremost, we're worried about the risk of taking significant engine changes in an area where we don't feel very confident about our test coverage, specifically where we're not currently calling The magnitude of those changes are exacerbated by the fact that myself and @daxian-dbw agree that we should simultaneously update C# cmdletst to call their We want to apologize for dragging our feet on such a high-interest RFC for so long, as it would've been really awesome to have in 7.0, but we feel it's best to wait until after GA before we merge it. In the meantime, we can also discuss some smaller design questions / concerns.
Does anyone have opinions on the above? |
Sad to hear it won't make it into 7.0, as I agree -- it would have been a pretty awesome feature to get in for a major version bump, but I'm glad you guys like it nonetheless! 😊 (Let me know if you guys do change your mind and want to shoot for 7.0; I have the code and pretty thorough tests written to the best of my ability already. Unless we need to make changes, but I can make that happen.) FWIW:
Not sure on the Ctrl+C/Ctrl+Break points. Timeout makes sense, but also is the kind of implementation detail that can potentially cause problems. That said, dispose with a timeout is still markedly better than no dispose at all, and if it's documented would probably be OK. |
Considering you've been vocal that you aren't planning a v7 release until dotnet 3.1 comes out as that's an LTS version then perhaps this could get this RFC & associated PR mentioned in via an experimental feature branch in the main PSRepo (ie not just an experimental feature) until you are happy that all tests are added, as from what I read this sounds like it would need to be a bit longer lived than most PR's are and perhaps needs a handful of PR's instead of just the one. This often makes sense for longer lived (read harder to build) features than a single PR would allow flexibility for, and whilst I know this isn't typically how development of PowerShell is done, it may make sense in this case, especially if it would help in accelerating this making it into the GA of v7.0
The above suggestion would in my mind also help with what you've called out here @joeyaiello |
@kilasuit I'd be totally with you if it weren't for the fact that we don't think this is going to be feasible to hide behind an experimental feature. |
@vexx32 we are actively discussing this today |
Include considerations & alternate proposals for committee and individual feedback.
@SteveL-MSFT friendly ping for follow up. 🙂 |
To do so, such commands need also implement `IDisposable` and the `Dispose()` method. | ||
This RFC would simply change when the dispose method is called. | ||
Currently, the method is called during pipeline disposal. | ||
This RFC would ensure it is called immediately after `EndProcessing()` completes during normal pipeline operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Strictly speaking, it doesn't need to be changed.
However, I think it makes more sense to call it after EndProcessing()
when possible, simply because that means that we:
- have a more well-defined point where we can expect Dispose() to be called,
- free up resources earlier, and
- spread out the overall impact of calling
Dispose()
so that some of it can happen during pipeline execution / towards the end of a pipeline, rather than all waiting till the very end of a pipeline and having to wait for every command in the pipeline to be disposed, one at a time.
On paper I don't think the differences will be massively different in the majority of cases. Selfishly, this makes it slightly easier to implement in CommandProcessorBase
, if I recall that section of code correctly. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking further at the code behaviour... I don't think this is strictly necessary. Although some work is currently done before disposing commands, it doesn't look like there's a significant wait between calling the endprocessing/DoComplete methods and the DisposeCommands(), so it's likely okay to leave it more or less where it is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having been tinkering a bit more with this, it really comes down to a fairly small-ish difference... in most cases.
- If we have
dispose
execute after allEndProcessing()
/end{}
steps are completed, that is a bit more consistent with how other steps work (all cmdlets executebegin
before moving toprocess
for each, etc.) - If we have
dispose
execute after eachEndProcessing()
/end{}
step is completed, that lets us free up resources a bit quicker, and makes some types of pipelines easier to work with.- For example,
Get-Content $file | Some-Function | Set-Content $file
would be possible ifGet-Content
opens its file read handle inEndProcessing()
and then disposes of it inDispose()
;Set-Content
would freely be able to open its own file handle by the time itsEndProcessing()
step comes around. Currently, this requires parentheses to be used to separate the pipelines:(Get-Content $file) | Some-Function | Set-Content $file
.
- For example,
That's just an example, but I'm sure there are other potential applications that are a bit harder to reason about. I tend to think it's a better idea to dispose a bit sooner if we can. This would be a fairly small change to cut out, though, so whatever the committee decides, I'll be able to live with. 😁
- The slightly different parameter name is necessary as C# does not permit a property name to collide with a method name, and `IDisposable` mandates a method named `Dispose()`. | ||
- To improve consistency, the parameter can be explicitly aliased to `-Dispose`. | ||
- The parameter _must_ be assigned explicitly by name, and never be assumed based on number of provided scriptblock parameters as the command currently does with `begin`/`process`/`end`. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the .foreach{} magic method also supports begin/process/end. Are you planning to extend it as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. If I don't, dispose {}
is basically a dead block there, I don't see a good way to prevent folks using it since it's literally just accepting a scriptblock.
In truth, wasn't really aware they supported named blocks till the call today. Open to suggestions on this one, really could go either way. Kind of like ForEach-Object
there. I've actually moved this section to Considerations as well. 🙂
- This includes new token types and appropriate entries in the `Tokenizer`'s keyword tables. | ||
- Additional members and constructors for `ScriptBlockAst` to enable it to be recognised by the parser. | ||
- Additional methods in `DlrScriptCommandProcessor` and `CompiledScriptBlock` to enable the block to be executed at the correct time. | ||
- Both types implement or inherit from a type that implements `IDisposable`, and as such the existing `Dispose()` methods can generally be utilised to run this block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the cmdlet actually going to be IDisposible? If so, what happens if one of these things is disposed on the finalizer thread. It will very likely error out either because there's no PowerShell engine available or the engine is in use. Maybe post it back to the event list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think script cmdlets already are IDisposable, so I'm just adding some code into that disposal. Script commands (non-CmdletBinding functions) are processed by DlrScriptCommandProcessor which inherits CommandProcessorBase (which is IDisposable).
Can you elaborate a bit on that last point? Not quite sure what you mean or how to do that, but definitely agree we should be very careful if it ever gets to that segment of code on the finalizer thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After digging through the code somewhat for some of the InvokeWithPipe
overloads, I think I have a surface-level grasp of what you mean. I'll add this into the RFC, I think I can implement this.
- Reworded a few things, moved a couple bits around to make it more sensible. - Moved the `ForEach-Object` bits to Considerations as it may not be desirable/necessary to include in the implementation. - Added finalizer thread behaviour to address exceptional circumstances.
@SteveL-MSFT I've gone through and made the changes we talked about, I think... that should cover just about everything. Added a few sections and moved the stuff about My memory's honestly failing me here, I don't quite recall if everyone was on board with FYI @bpayette as well, made some edits that are related to your comments. 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@PowerShell/powershell-committee reviewed this asking for some changes:
- Committee still agrees to use the name
Cleanup
instead ofDispose
asDispose()
has specific meaning for C# developers that doesn't map exactly to this scriptblock - Currently this RFC calls out ForEach-Object (and related .ForEach/.Where) supporting
Cleanup{}
, but expectation is that any scriptblock execution needs to be consistent and respect theCleanup{}
block. Whether a-Cleanup
parameter is needed forForEach-Object
can be added later based on user feedback
There were some questions/concerns about the implementation details, but that should be handled in the code PR and once resolved reflected back into this RFC.
Was this Cleanup{} name-change made during a public call that I missed? Was it recorded? I won't argue, but given that recently ternary and null-operators have been accepted that have "similar but not identical" meanings and syntax for C# programmers, that reasoning seems specious. |
@essentialexch no, it wasn't during a public call nor was it recorded. To be clear, it's wasn't an "obviously this vs that" type of conclusion. We debated |
@SteveL-MSFT I'll go through and change the naming here then, no worries. If it works as it needs to work and the name is reasonably descriptive I'm all for it. 🙂 I'll incorporate the rest into the RFC; to summarise quickly, though, I entirely agree -- it should work everywhere that named blocks are already supported. This includes:
@essentialexch if my memory is reliable over this timespan, the name |
dispose{}
named function blockCleanup {}
named function block
:-) As I wrote, I won't argue. It just seemed like a surprising unanticipated change here, at least to me. :-) I just want the functionality. It's the first "must have" functionality driving PS v7+ functionality to me. It provides finally for long-running robust scripts that don't/won't require regular restarts. |
- Name changed to `cleanup` - Clarification on more confusing parts.
|
||
#### Error States | ||
|
||
Terminating errors can be thrown from a `cleanup {}` block, which will skip the invocation of the rest of that `cleanup {}` block, without preventing other commands' `cleanup {}` blocks from being executed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick question related to this: what happens with the following statements in a cleanup block:
throw "bad"
$PSCmdlet.ThrowTerminatingError(...)
$PSCmdlet.WriteError(...)
Which automatic variables will be available, and will any of them have fields that don't make sense or otherwise don't work in the cleanup block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non-terminating errors work normally. Terminating errors work OK, but only if the command hasn't already thrown an error out.
In terms of the current implementation that I have in the code PR itself, I've noticed that there's an issue with reporting terminating errors in the Cleanup{} block due to the design of the pipeline processor if (and only if) the command has already thrown a terminating error. Basically, the error is recorded in $error
and then the pipeline processor suppresses it, because the command is already stopping due to an earlier error. Same should happen with errors thrown from Dispose()
in compiled cmdlets, I think, even in the original code there.
I'd rather avoid refactoring that out of the pipeline processor itself, it seems like that would be super messy. If there's a way I can catch that coming out of the cleanup {}
block (or indeed the Dispose()
block from compiled commands) and make it a non-terminating error at that point (so the cleanup{}
is still cancelled by a terminating error, but the pipeline reframes the error as non-terminating if it's already received a terminating error from that command, and the command has already finished what it can do anyway).
Something like that, anyway. Hopefully that makes sense. 😁
All the usual automatic variables should be available, I think. I'll verify $PSCmdlet
is... the only ones I know are not are $_
/$PSItem
and $input
as I explicitly avoid passing those into the execution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that all makes sense. At the very least the pipeline processor is so central that any issues that come about when making the necessary modifications should present themselves pretty quickly.
With the variables I'm particularly thinking of things like $PSCmdlet
and $ExecutionContext
that have deep properties hooking back into the engine -- I expect that at least some of them will behave quite differently in a cleanup block and we should do our best to establish how, or if they will have parts that shouldn't be relied upon (or should behave differently -- e.g. throw because it's not a valid time to use them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order for the cleanup block to run at all, we essentially have to tell the pipeline stopper to temporarily pretend all's fine and dandy until we're done with the cleanup block.
As a result, all those variables seem to behave themselves just fine, at least at the moment. But yeah, depending on decisions made during code review, that might change a bit, we'll see how it goes, definitely something to keep a sharp eye on. 🙂
To do so, such commands need also implement `IDisposable` and the `Dispose()` method. | ||
This RFC would simply change when the dispose method is called. | ||
Currently, the method is called during pipeline disposal, after **all** commands have completed their `EndProcessing()` steps. | ||
As part of the changes proposed in this RFC, we would ensure each command's `cleanup {}` or `Dispose()` routines are called immediately after _each_ cmdlet's `EndProcessing()` execution completes during normal pipeline operation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm nervous about changing the ordering here with regard to when Dispose
is called. What behavioral change do you think will happen because of this change, and if none, is there a way that you can validate that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There shouldn't be a lot of visible changes here, but there would be some. Essentially the change here would simple reorder the EndProcessing/Dispose steps in a given pipeline from EndA -> EndB -> EndC -> DisposeA -> DisposeB -> DisposeC
to instead being EndA -> DisposeA -> EndB -> DisposeB -> EndC -> DisposeC
To an individual command, there should not be a significant difference. It's still getting disposed after it completes EndProcessing, just with a bit less delay than before.
For a practical example, let's take the example of a cmdlet at the start of a pipeline that takes a lock on a file to read data in during BeginProcessing()
and doesn't release it until Dispose()
. Suppose also that it writes all of its output in EndProcessing()
. Then you want to pipe that to another command that wants to also take a write lock on that same file, but doesn't need it until EndProcessing()
.
Contrived example? Maybe!
Currently, this circumstance will fail completely, because the first command's lock is still alive by the time the second command wants to take a lock. If we move disposal/cleanup to just after each command's EndProcessing()
instead, then everything just works.
Hugely useful scenario? Maybe, maybe not. It's kind of an edge case where it will be super useful, but at least on paper I think it makes more sense to dispose resources and do cleanup when the command has done its job, rather than when the pipeline as a whole is completed.
If we suppose that all the commands in a given pipeline have some cleanup{}
or Dispose()
processing going on, doing it after each command will "feel" better overall to users, I think; it'll spread out any impact of running cleanup/disposal across the run of the pipeline, rather than effectively queueing it up until the end of the pipe.
Anyway, that's my reasoning.
As for do I think there'll be a behavioural change... no, in most cases there won't be. Can I validate that.... maybe? Not sure. I can't really think of a way apart from simply demonstrating that commands in general still work just fine, even when they have implement IDisposable
or a cleanup{}
block.
I'd venture to guess that if it were a problem, we'd see some odd behaviour fairly quickly from cmdlets that utilise the IDisposable pattern currently? Outside of those, no commands presently written could realistically be affected, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(sorry for the long text, didn't realize I'd gone on so long 😬 )
@JamesWTruher's latest comment aside (we think it's probably an implementation detail to be handled in PR), we're good to accept this RFC. Cleaning it up for merge now! |
I'm not looking forward to having this conversation about a thousand times:
|
For a working implementation of this RFC, please see PowerShell/PowerShell#9900.