-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed bug where extract method throws on unreachable code or local function parameter #25147
Conversation
@dotnet/roslyn-ide @jinujoseph can you take a look? |
retest windows_debug_unit64_prtest please |
retest windows_release_vs-integration_prtest please |
This seems bad. Why are you throwing in this case instead of NFWing? If we don't understand something, we should either fail gracefully, or we should do the best we can even if its' not correct. |
it logs NFW already. this unexpected condition is same as other code action/refactoring throwing invalid cast exception since it expected type but actual node was namespace. or throwing null exception because it expects type to always exist but not. not sure what you mean by handling it gracefully. none of roslyn code is written not to throw on unexpected condition. |
if we believe this, we should go this route like we are now moving away from fatal watson to non fatal watson, ... extract method existed for long time, these new unexpected condition happens since new language features are added such as local function, and that breaks existing invariant and it throws. nothing new for how roslyn has been written. but sure, we can change pattern on how roslyn is written from throw to handle unexpected conditions. but changing only here won't change anything. |
I guess my point is that existing code should be written with teh expectation that future change to the language may break its assumptions. So, for example, don't blindly cast to something. Check if that thing is actually the right kind. And fallback, or bail out (possibly emitting an NFW) if that happens. Pretty much all the rest of our code-actions have not had this issue as the language has moved forward. I'm asking that ExtractMethod be written in a similar fashion. |
What features do this? That's a bug in the feature. If you get a SyntaxNode you should check it for the forms you can work with. And if you don't have one of those forms you should bail out gracefully. That's been the way we've tried to write most of our features from the beginning.
I disagree. It means that as hte language continues to move forward this feature won't have problems with that. And that's a good thing. That's strictly better than being in a state where the language can move forward and this feature can break. |
"What features do this? That's a bug in the feature. If you get a SyntaxNode you should check it for the forms you can work with. And if you don't have one of those forms you should bail out gracefully. That's been the way we've tried to write most of our features from the beginning." no, we prefer hard cast rather than using "is" when we believe it should only get certain type of node/symbol. that sure is correct until we add new node type, symbol kind and etc. we do not write our code future proof. "I disagree. It means that as hte language continues to move forward this feature won't have problems with that. And that's a good thing. That's strictly better than being in a state where the language can move forward and this feature can break." well, that's whole pattern change. I can open a bug that states we should move to that pattern same as we opened a bug saying we should move to ValueTask, or move to NFW and etc. but that's not this PR is for. |
I disagree. While we certainly have other code which is makes similar mistakes, we write tons of LS code to be future proof and to not blindly cast. Can you point out hte line of code that is actually crashing in this case? I'd like to see what it looks like and how we might make it resilient both to the language changing, and to us not considering certain cases. |
|
||
return Contract.FailWithReturn<int>("Shouldn't reach here"); | ||
return leftLocation.SourceSpan.Start - rightLocation.SourceSpan.Start; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here, we crash because it doesn't expect local function which added after extract method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if I write it future proof, I can't assume that common tree must exist here as well, since who knows what kind of new method symbol we would add in future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. i would expect this code to not have a Contract. I would expect this to either be resilient to a new sort of symbol being passed in. Or the code should be able to indicate that it doesn't know what is going on, and the refactoring should be suppressed.
Assuming any of our symbol kinds is "complete" is not something i think is acceptable for the LS code to do. It can NFW if we want to send us data to help fill in these gaps without user reports. But it should not crash just because it expected a certain set of methods, and then new ones were added.
The only way this sort of code is acceptable is if the author has put a system in place to crash immediately the moment a new enum member is added. i.e. that way when someone on the compilre team adds MethodKind.LocalFunction, then the LS refuses to even load because you have a static constructor that fails. That way the person adding the enum member knows to go fix up this code.
Right now you're basically saying: "i've only written code to handle a fixed set of cases, and i will crash in the future if any new cases happen. But there's no way for you to know this will happen and customers will be hit by this in the wild." That's not responsible. Either we need to be resilient to our enums/symbols expanding over time. or we must fail immediately so that hte person adding the new language feature knows they have to fix/test this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since who knows what kind of new method symbol we would add in future?
Correct. We need to be resilient here. For example: look at AbstractSymbolDescriptionBuilder.AddDescriptionPartAsync. It is needed to create a description for an arbitrary symbol. It hardcodes is knowledge about which symbols are supported, but it has a standard 'fall through' case for any symbols it doesn't expect. It doesn't crash saying "i didn't know how to create quick info for this new language feature type."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
like I said, fine opening issue for what you are saying, not going to make that change in this PR. and I don't believe what you are saying is completely true since if that is true, when new language feature is added from compiler, IDE should never crash which is not true at all.
"The only way this sort of code is acceptable is if the author has put a system in place to crash immediately" not sure what you meant immediately. you mean when there is no one invoking the feature, it need to somehow figure that out immediately? where is such code in Roslyn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really didn't want to. but you put the symbol display as example, so I dug in to see whether it is really future proof and there is no Contract.
it does have some such as this
http://source.roslyn.io/#Microsoft.CodeAnalysis.Features/LanguageServices/SymbolDisplayService/AbstractSymbolDisplayService.AbstractSymbolDescriptionBuilder.cs,303
if new group kind is added, it will throw. so, now will it immediately throw when group kind is added and vs run first time? no it won't until quick info is shown. will it throw immediately when quick info is shown? no it won't until quickinfo that has new group kind is needed. if new group kind is needed, will it immediately throw? no it won't until multiple line is needed in description.
..
this kind of code in Roslyn is common. and this actually helped us a lot when something changed. sure we can move to new pattern where we try to not use this kind of pattern as much like we are now moving away from fatal watson to NFW.
but this PR I dont believe is to address that issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by the way, this doesn't crash VS. it shows info bar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not a good case. SymbolDescriptionGroups is tightly coupled with AbstractSymbolDescriptionBuilder. You can of course have tightly coupled code have knowledge of itself. The issue is when you cross layers, or use subsystems that are by design expected to change over time. The compiler layer is one of those systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, don't want to argue more. I see this differently so. like I said I am fine to open an issue. I just am not going to change how extract method handle unexpected condition in this PR.
if this crash VS, I will do something not to crash VS. but it doesn't.
@@ -114,6 +114,8 @@ private static void BuildMatrix() | |||
s_matrix.Add(new Key(dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: false, writtenInside: true, readOutside: false, writtenOutside: true), VariableStyle.SplitIn); | |||
s_matrix.Add(new Key(dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: false, writtenInside: true, readOutside: true, writtenOutside: false), VariableStyle.SplitIn); | |||
s_matrix.Add(new Key(dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: false, writtenInside: true, readOutside: true, writtenOutside: true), VariableStyle.SplitIn); | |||
s_matrix.Add(new Key(dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: true, writtenInside: false, readOutside: false, writtenOutside: true), VariableStyle.InputOnly); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataflowin, dataflowout being false, and readinside true and variabledeclared false is not something expected for a symbol. if it is not declared inside and read, data must flow in. but not for unreachable code case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here we have 8 variables, which give us 256 possibilities. among those, these are expected combination. we don't expect combination that not belong to ones listed here. if we get one, rather than silently ignore those case or create random code or show some dialog box to users. this one throw and show info bar to users.
I agree we should not crash VS. and we don't. if your point is rather than info bar we should show something else. not sure what that something else is.
again, I am fine I open an issue saying we need to write future proof code but I am not going to make that change in this PR. "we write tons of LS code to be future proof and to not blindly cast. " this is certainly not true. since I have many code review you saying, crash or throw if it can't happen now even though it can happen in future since tree shape changes or other thing. we have many code where at the end of switch statement, we throw saying this kind is not expected. where we can always add new kind. |
@@ -126,7 +126,7 @@ public int CompareTo(ParameterVariableSymbol other) | |||
return 0; | |||
} | |||
|
|||
var compare = CompareTo((IMethodSymbol)_parameterSymbol.ContainingSymbol, (IMethodSymbol)other._parameterSymbol.ContainingSymbol); | |||
var compare = CompareMethodParameters((IMethodSymbol)_parameterSymbol.ContainingSymbol, (IMethodSymbol)other._parameterSymbol.ContainingSymbol); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here, I can't assume parameter symbol always belong to method symbol if I want future proof. we have this kind assumption all over the places.
|
||
// dataflow in and out can be false for symbols in unreachable code | ||
// boolean indicates | ||
// dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: true, writtenInside: false, readOutside: true, writtenOutside: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agocke Can you confirm these are the expected values from data flow analysis here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -3760,6 +3760,99 @@ private static void NewMethod() | |||
await TestExtractMethodAsync(code, expected); | |||
} | |||
|
|||
// dataflow in and out can be false for symbols in unreachable code | |||
// boolean indicates | |||
// dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: true, writtenInside: false, readOutside: false, writtenOutside: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agocke Can you confirm these are the expected values from data flow analysis here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agocke these 2 cases are not for local function but for unreachable code.
..
flow analysis not working correctly on certain language feature can result in 2 different outcome. one being infobar like this. or extract method generating broken code. not sure what known issue mentioned below on local function will result in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@agocke basically, it is reversed representation. data flow analysis API gives set of symbols on each cases, data flow in, out, read in/out, write in/out and etc.
I have variable for each case for a symbol that indicates the symbol exist in each set or not. if that variable is confusing you, think it as dataFlowAnalysisResult.DataFlowIn.Contains(symbol) and etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Note that there are still known bugs around region analysis and local functions, e.g. #14214 I do have a fix for these issues, but have not had time to implement it. |
@sharwell I'm having trouble understanding the notation used in these tests. DataFlowsOut and DataFlowsIn don't return booleans, they return sets of symbols. Is the boolean to indicate that there is a single symbol in each tests and it is either present or absent in the set for true and false, respectively? |
@agocke that decision is per symbol, so it gets data flow analysis result which contains state -> set of symbols, and revert it to symbol -> set of state. and use that to find out what to do with each symbol from a predefined matrix. ... for tests, there are 2 kinds of tests. 1 is just sanity test I have for all entries in the matrix. test could have multiple symbols in it, but one of symbol cover one of entry in the matrix. it basically cover that the symbol behave as described in the matrix. the other tests are feature test. it cover language features, extract method behavior around comments, whitespace. formatting, selection validation and etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 conditioned on @agocke agreeing that the data flow implementation isn't just broken for these cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where data flow results were listed, the values look like what I expect.
Specifically for data flows in, this is specified as:
A variable assigned outside is used inside if an analysis that leaves the
variable unassigned on entry to the region would cause the generation of "unassigned" errors
within the region.
All variables are definitely assigned in unreachable code, so if a variable is unassigned before entering a region, but the usages of that variable in the region are unreachable, I would expect no definite assignment errors to be produced, thus the variable does not flow into the region.
DataFlowsOut is the same, but for assignments. I would also expect no variables to flow out of the region in these examples.
|
||
// dataflow in and out can be false for symbols in unreachable code | ||
// boolean indicates | ||
// dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: true, writtenInside: false, readOutside: true, writtenOutside: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -3760,6 +3760,99 @@ private static void NewMethod() | |||
await TestExtractMethodAsync(code, expected); | |||
} | |||
|
|||
// dataflow in and out can be false for symbols in unreachable code | |||
// boolean indicates | |||
// dataFlowIn: false, dataFlowOut: false, alwaysAssigned: true, variableDeclared: false, readInside: true, writtenInside: false, readOutside: false, writtenOutside: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@Pilchie for ask mode approval for 15.7.p2 |
thank you! @jinujoseph @Pilchie can I get approval for 15.7? |
Approved. |
Customer scenario
User invokes extract method on code snippet that includes unreachable code or local function parameter, and VS throws an exception that ends up either show up in info bar or message box.
Bugs this fixes
#23352
#23175
Workarounds, if any
No workaround
Risk
this address very specific condition so no risk or change on extract method behavior
Performance impact
N/A
Is this a regression from a previous update?
No
Root cause analysis
Extract method has a matrix that lists all possible combination it supports and throw exception if it encounters code that it doesn't know how to handle. we found a case we need to add support (unreachable code/local function parameter)
How was the bug found?
Customer, Dogfooding