-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SONARPY-1798 Try to resolve built-in types for names which have no symbol #1774
SONARPY-1798 Try to resolve built-in types for names which have no symbol #1774
Conversation
000ebe2
to
3c31b8b
Compare
3a97a6f
to
b74df45
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is a good example of what I'm trying to promote and why.
I believe that if this PR had been focused on fixing the following problem:
"If a name does not have any binding usage, let's search in the builtins module to allocate a type for it"
It could have been much smaller and focused, easier to review and to make progress (carving it out of the ArgumentNumberCheck
was a great change, IMO - it would have been even better to do it from the start).
I also have a question regarding names whose type is a builtin: what does it mean for them in terms of symbol? I believe for now, they would have a type but no symbol (because no assignment). Is this intentional? I don't have a strong opinion on the behavior at this point, but I think we should probably test/document this. It may have impacts on rules that aim to detect undefined symbols.
public boolean isOrExtends(String s) { | ||
// TODO: implement and maybe it should accept PythonType instead of string | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this as part of this PR? It's probably better to introduce it when we need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excluded from PR
@@ -157,8 +157,7 @@ void builtin_parent() { | |||
); | |||
ClassType classB = classTypes.get(1); | |||
assertThat(classB.superClasses()).hasSize(2); | |||
// FIXME: ensure builtin parent is resolved | |||
assertThat(classB.hasUnresolvedHierarchy()).isTrue(); | |||
assertThat(classB.hasUnresolvedHierarchy()).isFalse(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some assertions?
Now that the hierarchy is resolved, it would make sense to assert on parents/members to ensure information about BaseException
is correctly retrieved (e.g members, etc...).
It may also make sense to have an additional test with a builtin parent that actually has an unresolved type hierarchy to ensure the information is not lost then.
@@ -77,4 +79,13 @@ public List<Tree> computeChildren() { | |||
public Kind getKind() { | |||
return Kind.SUBSCRIPTION; | |||
} | |||
|
|||
@Override | |||
public PythonType typeV2() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make sense to have a test for this (e.g is ObjectTypeTest
).
I also think the current implementation is not correct. If I have:
my_list = ["hello"]
foo(my_list[0])
The type of the argument (= the subscription expression) is str
and not list
. I think it would therefore make sense for the type of a subscription expression to be evaluated as the type of the attribute
, if its own type is a known container type. I think this could deserve a dedicated ticket, with some examples of fixed FNs though.
Ultimately if I remove this method, everything is still green. So maybe it's best to leave this out of this PR for now as it's not really needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excluded from PR
.map(TypeInferenceV2::getUsagesType) | ||
.or(() -> projectLevelTypeTable.getModule().resolveMember(name.name())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a subtle yet impactful change from the previous implementation.
In the previous implementation, any variable that either has multiple assignments or a global declaration usage would not be assigned a type.
Now we fall back to builtins
instead, which means that in the following (admittedly convoluted) code:
global int
int = 42
int = "hello"
int
The final int
would be considered to be builtins.int
despite the multiple reassignments.
I think the fallback to builtins should only happen if a symbol has no binding usages.
I'm a bit unhappy about the focus of our discussion regarding the imperative loop vs the optional chain. I think the main focus should be on functional behavior rather than technical implementation and I believe this case was an example where we can improve.
Taking the previous implementation as a reference, I believe that a simple
if (types.size().isEmpty() {
// the underlying assumption is that each binding usage will have a type and add to the list
// if it's not the case, a simple boolean "hasNoBindingUsage" computed during the iteration would do the trick
setTypeToName(name, projectLevelTypeTable.getModule().resolveMember(name.name()));
}
at the end of the method would have achieved what you're trying to do, without requiring an additional iteration over usages
and without risking an undesired functional change.
I also think we are missing tests for this (like variations of the convoluted case I showed), which is hidden due to the optional chain.
Again, I have nothing against optional chains, but they do have limitations that need to be acknowledged and I don't think we should aim to refactor everything to these just for the sake of it.
@@ -148,8 +148,16 @@ private void addParameter(org.sonar.plugins.python.api.tree.Parameter parameter, | |||
Token starToken = parameter.starToken(); | |||
if (parameterName != null) { | |||
ParameterType parameterType = getParameterType(parameter); | |||
this.parameters.add(new ParameterV2(parameterName.name(), parameterType.pythonType(), parameter.defaultValue() != null, | |||
parameterState.keywordOnly, parameterState.positionalOnly, parameterType.isKeywordVariadic(), parameterType.isPositionalVariadic(), locationInFile(parameter, fileId))); | |||
var parameterV2 = new ParameterV2(parameterName.name(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need the changes regarding FunctionType
building for this PR?
I feel we can focus on the builtins fallback which would then be clearly separated from this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
excluded
public void createBuiltinModule(ModuleType parent) { | ||
var name = "builtins"; | ||
createModuleFromSymbols(name, parent, TypeShed.builtinSymbols().values()); | ||
public ModuleType createBuiltinModule() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure yet that I see the value of removing the builtins
prefix.
I believe we could still have a builtins
dedicated module with a helper method that would help retrieve it so that we don't have to duplicate the "builtins"
literal everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is mainly needed to convert symbols (old ones) to types, the builtin types don't have builtins
in their FQN
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! I think it was a mistake on our end in the original implementation, though. Since other tools (including the Python interpreter itself) will map builtin types to the builtins
module.
Ideally, I think I'd like to follow that convention as well, however I think for now the least disruptive approach is okay (i.e, if this PR still achieves its goals without removing builtins
, I'd keep the prefix - and if it breaks a lot of other stuff, we can consider adding it back later instead).
b74df45
to
b2a3106
Compare
af8dceb
to
efb7f8b
Compare
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have some questions before we proceed with merging this PR.
@Override | ||
public void visitName(Name name) { | ||
SymbolV2 symbolV2 = name.symbolV2(); | ||
if (symbolV2 == null) { | ||
projectLevelTypeTable.getModule().resolveMember(name.name()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this change.
Before, we were falling back to built-in symbols that did not have binding usages.
Now, we're doing it if the symbol is null
.
Does that mean that the previous implementation, as well as my suggestion, was wrong? I'd expect this to be explicitly addressed in the comment then. Also, why did the original implementation solve the FNs in nonCallableCalled.py
if it was incorrect?
I'm also not sure that we really want the symbol of used built-in names to be null
in the first place, which could deserve at least comment / discussion / ticket as well, depending one how we want to proceed with this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in my implementation, there was no difference between no symbol and symbol doesn't have a binding usage, that's why it has been working.
About symbol for builtin types - agree but it is out of this ticket scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: add comment with ticket
@@ -75,8 +75,6 @@ public TypeInferenceV2(ProjectLevelTypeTable projectLevelTypeTable) { | |||
this.projectLevelTypeTable = projectLevelTypeTable; | |||
} | |||
|
|||
private static final String BUILTINS = "builtins"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You didn't answer my question about builtins
removal.
Given we don't map FQNs to PythonType
yet, do we really need to remove it? Does it break something?
I know it looks like I'm pushing to keep it, but I would really like to understand why this PR needed to remove it, given that we haven't made any decisions on how we represent FQNs yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cause when you resolve types e.g. for serialized symbols or project level symbols the FQN of type str is just str, it doesn't operate with builtins, and I want to have common behavior for it
efb7f8b
to
71dac54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
cb39e36
into
rnd/type-inference-engine-specification
No description provided.