Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[INT-1234] Refactor to use client MountedRepo state #9954

Merged
merged 24 commits into from Apr 24, 2024
Merged

Conversation

emmajsadams
Copy link
Contributor

This PR refactors the backend to remove server MountedRepo state in favor of client state. It maintains existing behavior, updates all relevant tests, and adds new tests for PollMount.

@emmajsadams emmajsadams requested review from a team and smalyala and removed request for a team April 16, 2024 17:55
@@ -699,7 +640,7 @@ def _get_file_from_path(self, path: str) -> pfs.File:
else:
pach_path = str(Path(*parts[1:]))

file_uri = f"{commit_uri}:{pach_path}"
file_uri = f"{branch.as_uri()}:{pach_path}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove commit_uri a few lines above since it's no longer used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah actually I should not have changed this method to add branch. DatumManager should remain the same. I will just revert adding the param for this method.


return {
pfs: {
name: `${mountedRepo.repo.project}_${mountedRepo.repo.name}_${mountedRepo.mountedBranch.name}`,
Copy link
Contributor

@smalyala smalyala Apr 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the last entry be mountedRepo.mountedBranch?.name

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check the mountedRepo param earlier in this method for null and return {} so I think this is safe as is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible for mountedRepo to be non-null and mountedBranch to be null?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@smalyala nope. mountedRepo is non null when the user has selected at least a project/repo. Selecting that will automatically select a branch (usually master). It can only be changed to other projects, repos, branches with no option to manually unselect just a branch.

Copy link
Contributor

@smalyala smalyala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we pass the branch as part of the request data rather than as a query parameter, since the endpoint path isn't as secure (could be cached, logged, etc.) as the request payload? This is a (rare) concern if a user uses HTTPS requests and considers repo/branch name to be confidential.

Also, can we add some tests to cover cases where the branch argument in the FE request has bad data + missing data (no branch specified)?

@emmajsadams
Copy link
Contributor Author

Should we pass the branch as part of the request data rather than as a query parameter, since the endpoint path isn't as secure (could be cached, logged, etc.) as the request payload? This is a (rare) concern if a user uses HTTPS requests and considers repo/branch name to be confidential.

Probably a good thing to look into! Should be able to to modify the endpoint to do this.

Also, can we add some tests to cover cases where the branch argument in the FE request has bad data + missing data (no branch specified)?

Will do.

@emmajsadams
Copy link
Contributor Author

emmajsadams commented Apr 22, 2024

@smalyala Should we pass the branch as part of the request data rather than as a query parameter, since the endpoint path isn't as secure (could be cached, logged, etc.) as the request payload? This is a (rare) concern if a user uses HTTPS requests and considers repo/branch name to be confidential.

Hmm so on second thought, I wonder if we should reach out to customer eng on if this is a use case and tackle it another story? We probably want to reevaluate all the paramaters if them possibly being logged as part of a GET request is a concern specifically file name. If the connection between the client and server is HTTPS then the GET parameters are protected in transit so are we worried about the GET url being logged somewhere?

Also added the tests!

@smalyala
Copy link
Contributor

smalyala commented Apr 22, 2024

In general, it's good practice to avoid passing sensitive information with query parameters (i.e. could be logged on server side, stored in browser history, etc.) even if encrypting data with HTTPS. A few searches can provide more comprehensive info and examples than I can :)

Yep, I'm good with asking customer eng if this is a potential concern (it's probably not) and tackling it in a separate story if needed. I don't think we need to worry about file name because it's just the repo and branch name that are being passed as query parameters right?

@emmajsadams
Copy link
Contributor Author

@smalyala sounds good created https://pachyderm.atlassian.net/browse/INT-1257! Is the rest of the PR ready to merge?

@smalyala
Copy link
Contributor

smalyala commented Apr 23, 2024

Cool, thanks! Last comment here- #9954 (comment)

@emmajsadams emmajsadams merged commit a91173f into master Apr 24, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants