Skip to content

SONARPY-944 Use precomputed Typeshed symbols for third-party libraries in the Python analyzer#1061

Merged
andrea-guarino-sonarsource merged 5 commits into
typeshed-reworkfrom
SONARPY-944
Jan 28, 2022
Merged

SONARPY-944 Use precomputed Typeshed symbols for third-party libraries in the Python analyzer#1061
andrea-guarino-sonarsource merged 5 commits into
typeshed-reworkfrom
SONARPY-944

Conversation

@andrea-guarino-sonarsource
Copy link
Copy Markdown
Contributor

No description provided.

@andrea-guarino-sonarsource andrea-guarino-sonarsource changed the title Sonarpy 944 SONARPY-944 Use precomputed Typeshed symbols for third-party libraries in the Python analyzer Jan 25, 2022
@andrea-guarino-sonarsource andrea-guarino-sonarsource force-pushed the SONARPY-944 branch 6 times, most recently from d421c7e to 89990eb Compare January 27, 2022 13:53
@andrea-guarino-sonarsource andrea-guarino-sonarsource marked this pull request as ready for review January 27, 2022 14:32
Copy link
Copy Markdown
Contributor

@guillaume-dequenne guillaume-dequenne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I left some minor comments. I think it would be nice merging this in a separate branch so that I can incorporate my changes from SONARPY-945 there as well, so that we can validate every regression in sonar-security is under control before merging everything.

path = os.path.join(CURRENT_PATH, relative_path)
for root, dirs, files in os.walk(path):
package_name = root.replace(path, "").replace("\\", ".").replace("/", ".").lstrip(".")
if not generate_python2_stdlib and "python2" in package_name:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it needed to generalize this bit? I think only the stdlib has a @python2 part in its path.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes Indeed I changed it to generalize. However, also third_parties has @python2 (for example six)

self.pretty_printed_name = f"Tuple[{','.join(item_names)}]"
if any(item.is_unknown for item in items):
self.kind = None
self.is_unknown = True
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not have a Tuple with Any items in that case? Or just another field that mark the entirety of items as unknown?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I hesitated a bit if creating a Tuple with Any. I took the simplest approach for the time being, if that's fine also for you, I would prefer to handle this in a separate PR. I can create a ticket for it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!



def save_module(ms: Union[ModuleSymbol, MergedModuleSymbol], is_debug=False, debug_dir="output"):
def save_module(ms: Union[ModuleSymbol, MergedModuleSymbol], is_debug=False, debug_dir="output", is_stdlib=True):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to do the same thing for custom stubs and used a similar boolean for that. I don't think it's a good approach once we have to distinguish between the 3 cases.
Another idea I just had was to use a field on the module level to indicate its type (custom / stdlib / third party) and define the save_dir through that field. Wdyt?
In any case this doesn't need to be changed here, I can also do it in my PR when generalizing the method.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, makes sense to me. I will let you do it in your PR

output_dir_name = output_dir_name if python_version >= (3, 0) else f"{output_dir_name}@python2"
opt = get_options(python_version)
build_result = walk_typeshed_stdlib(opt)
_, build_result = walk_typeshed_stdlib(opt)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we always use build_result but not always source_paths, maybe we could change the order in which they are returned?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can do that

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 75f1353


private static Map<String, Symbol> getSymbolsFromProtobufModule(String moduleName) {
InputStream resource = TypeShed.class.getResourceAsStream(PROTOBUF + moduleName + ".protobuf");
private static Map<String, Symbol> getSymbolsFromProtobufModule(String moduleName, boolean isThirdParty) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of a boolean, what about passing the directory name directly (so that this generalizes more easily with custom stubs) ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, I will let you do that in your PR :)

Comment on lines 74 to +75
private static final String THIRD_PARTY_3 = "typeshed/third_party/3/";
private static final String CUSTOM_THIRD_PARTY = "custom/";
private static final String PROTOBUF = "protobuf/";
private static final String PROTOBUF_THIRD_PARTY = "protobuf/stubs/";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we have stdlib under protobuf and third parties under protobuf/stubs. All of this is under types which currently holds Typeshed as well.
What about going for a protobuf_stdlib, protobuf_thid_parties and protobuf_custom all at the same level, for better clarity?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works for me! if that's fine I will let you also do that in your PR or I can do it in another separate PR

resp = make_response("hello")
resp.headers['Access-Control-Allow-Origin'] = '*' # Noncompliant
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
resp.headers['Access-Control-Allow-Origin'] = '*' # FN
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why this becomes a FN? Is this the one related to the other ticket you created?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because attributes were missing from typeshed classes. I think this change is not needed after I rebase on master :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm, after rebase the issue is raised again :)


def main():
save_merged_symbols()
save_merged_symbols(is_third_parties=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this change is not needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 90ef511

@andrea-guarino-sonarsource andrea-guarino-sonarsource changed the base branch from master to typeshed-rework January 28, 2022 10:19
@sonarsource-next
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@guillaume-dequenne guillaume-dequenne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@andrea-guarino-sonarsource andrea-guarino-sonarsource merged commit 5588096 into typeshed-rework Jan 28, 2022
@andrea-guarino-sonarsource andrea-guarino-sonarsource deleted the SONARPY-944 branch January 28, 2022 12:49
hashicorp-vault-sonar-prod Bot pushed a commit that referenced this pull request Apr 23, 2026
GitOrigin-RevId: c7c8ff06ab0e3f046cf936c4656b4185c1012161
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants