Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SYNPY-1401] Prevent Repeated syn.get calls in _help_walk #1059

Merged
merged 1 commit into from
Jan 31, 2024

Conversation

BWMac
Copy link
Contributor

@BWMac BWMac commented Jan 31, 2024

Problem:

It was reported recently that running synapseutils.walk on large projects in Synapse with many folders and subfolders was taking a very long time due to syn.get being executed on every recursive _helpWalk call. Further, as pointed out by @linglp, this is unnecessary since we are also using syn.getChildren which provides the same information about entity types.

Solution:

Update _helpWalk to use the entities returned by syn.getChildren in the recursive function call. Specifically, repeated calls to syn.get are prevented by adding a new parameter that carries the starting entity into each recursive iteration. syn.get is called once per walk call to initiate the process from the parent folder or project.

Notes:

  • A brief exploration was done to determine if we could further optimize this function by using functools.cache, but this was not implemented due to the way that cache interacts with generators.
  • _helpWalk and its parameters were converted to using snake case
  • Unit tests for the walk functions were updated as needed.

@BWMac BWMac requested a review from a team as a code owner January 31, 2024 18:22
@pep8speaks
Copy link

Hello @BWMac! Thanks for opening this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 88:89: E501 line too long (102 > 88 characters)
Line 90:89: E501 line too long (91 > 88 characters)
Line 92:89: E501 line too long (111 > 88 characters)
Line 93:89: E501 line too long (117 > 88 characters)

@BWMac BWMac marked this pull request as draft January 31, 2024 18:24
@BWMac BWMac marked this pull request as ready for review January 31, 2024 18:26
Copy link

sonarcloud bot commented Jan 31, 2024

Quality Gate Passed Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

11 New issues
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

Copy link
Contributor

@BryanFauble BryanFauble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes lgtm!

Ran this sample script to verify everything in the project printed:

for directory_path, directory_names, file_name in synapseutils.walk(
    syn=syn, synId="syn49637038"
):
    for directory_name in directory_names:
        print(
            f"Directory ({directory_name[1]}): {directory_path[0]}/{directory_name[0]}"
        )

    for file in file_name:
        print(f"({file[1]}): {directory_path[0]}/{file[0]}")

And no errors were received.

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 Excellent - this is another great example of iterative improvement to the codebase.



# Helper function to hide the newpath parameter
def _helpWalk(
def _help_walk(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Not sure what I was thinking when I was writing this, but the _ is almost redundant with help. I can imagine this could be _walk, but... this is good. no action required. Just a thought in the future for naming functions.

@BWMac
Copy link
Contributor Author

BWMac commented Jan 31, 2024

@BryanFauble Thanks for testing, next time I will include a script/snippet to save reviewers the trouble.

@BWMac BWMac merged commit bfc9253 into develop Jan 31, 2024
37 of 38 checks passed
@BWMac BWMac deleted the bwmac/SYNPY-1411/recursive_help_walk_bug branch January 31, 2024 20:19
@thomasyu888 thomasyu888 changed the title [SYNPY-1411] Prevent Repeated syn.get calls in _help_walk [SYNPY-1401] Prevent Repeated syn.get calls in _help_walk Feb 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants