Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix search on private worksheets #4432

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

AndrewJGaut
Copy link
Contributor

@AndrewJGaut AndrewJGaut commented Mar 31, 2023

Reasons for making this change

We discovered an issue where bundles on private worksheets were not visible to cl search and would not appear in its output. Turns out there was a minor error in the search_bundles function when searching for private bundles with a non-root user (we used a JOIN instead of a LEFT OUTER JOIN in one place). This fixes that issue.

Changes

  • Fix issue for non-root user searching for private bundles.
  • Beef up tests. Add in test for searching for bundles on private worksheets. Beef up non-root user tests significantly. (Before, we had a separate section for non-root user tests for search which weren't as rigorous as the tests used for the root user. Now, we apply the exact same tests for the root user and a non-root user.)

Related issues

#4405

@AndrewJGaut AndrewJGaut marked this pull request as ready for review March 31, 2023 06:26
create_user(ctx, user_name, disk_quota='2000')
switch_user(user_name)
check_equals(uuid1, _run_command([cl, 'search', 'uuid=' + uuid1, '-u']))
def test_search_helper(ctx):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define a new helper function (so we can run the same tests multiple times)

Comment on lines +637 to +647
aliased_group_bundle_permission = aliased(cl_group_bundle_permission)
add_join(
cl_group_bundle_permission,
cl_bundle.c.uuid == cl_group_bundle_permission.c.object_uuid,
aliased_group_bundle_permission,
cl_bundle.c.uuid == aliased_group_bundle_permission.c.object_uuid,
left_outer_join=True,
)
aliased_cl_user_group = aliased(cl_user_group)
add_join(
aliased_cl_user_group,
aliased_cl_user_group.c.user_id == user_id,
left_outer_join=True,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use aliased here to account for the case when we've already done a join on the group_bundle_permission table or the user_group table. (If you do that twice without aliasing one of them, the query is invalid according to MySQL syntax rules.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does aliased automatically pick a unique name, or do you need to explicitly pick a name?

In a different PR, we should probably add validation when we do the joins to make sure that we're not joining onto the same table twice.

Comment on lines 1878 to 1885
"""Search bundles on private worksheets"""
wuuid = _run_command([cl, 'work', '-u'])
new_wuuid = _run_command([cl, 'new', random_name()])
_run_command([cl, 'wperm', new_wuuid, 'public', 'n']) # make worksheet private
_run_command([cl, 'work', new_wuuid]) # switch to worksheet
uuid = _run_command([cl, 'upload', test_path('a.txt')])
check_equals(uuid, _run_command([cl, 'search', uuid, '-u']))
_run_command([cl, 'work', wuuid])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the new test added for search on private worksheets

Comment on lines 1905 to 1916
test_search_helper(ctx)

# Test with non-root user.
if not os.getenv('CODALAB_PROTECTED_MODE'):
# This test does not work when protected_mode is True.
_, current_user_name = current_user()
user_name = 'non_root_user_' + random_name()
create_user(ctx, user_name, disk_quota='1000000')
switch_user(user_name)
new_wuuid = _run_command([cl, 'new', random_name()])
_run_command([cl, 'work', new_wuuid])
test_search_helper(ctx)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call helper function here to actually run the tests. Note we call the same helper for the root and non-root user, so the exact same tests are performed in both cases

Copy link
Contributor

@yifanmai yifanmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great detective work!

Comment on lines +637 to +647
aliased_group_bundle_permission = aliased(cl_group_bundle_permission)
add_join(
cl_group_bundle_permission,
cl_bundle.c.uuid == cl_group_bundle_permission.c.object_uuid,
aliased_group_bundle_permission,
cl_bundle.c.uuid == aliased_group_bundle_permission.c.object_uuid,
left_outer_join=True,
)
aliased_cl_user_group = aliased(cl_user_group)
add_join(
aliased_cl_user_group,
aliased_cl_user_group.c.user_id == user_id,
left_outer_join=True,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does aliased automatically pick a unique name, or do you need to explicitly pick a name?

In a different PR, we should probably add validation when we do the joins to make sure that we're not joining onto the same table twice.

size1 + size2, float(_run_command([cl, 'search', 'name=' + name, 'data_size=.sum']))
)

"""Search for floating bundles"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we use comment style # rather than docstrings for mid-method comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aliased does automatically pick a unique name, so there's no issue with table name collisions or joining on the same table twice That's actually what I meant when I said the query was invalid otherwise; before I added aliased, sometimes it would join on the same table twice and MySQL actually throws an error when that happens. So we should be OK (if I'm understanding your comment correctly).

For the # comments style, I updated that.

Copy link
Contributor

@yifanmai yifanmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I understand, the failing case is when the bundle has no permissions, so the join is empty?

access_via_owner = cl_bundle.c.owner_id == user_id
access_via_group = and_(
or_( # Join constraint (group)
cl_group_bundle_permission.c.group_uuid
aliased_group_bundle_permission.c.group_uuid
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, is there a potential false positive case here (access_via_group is incorrectly true) if aliased_group_bundle_permission.c.group_uuid and self.public_group_uuid are both None?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about this a bit more, this logic seems incorrect?

  • aliased_group_bundle_permission is all the group bundle permissions for the object
  • aliased_cl_user_group is all the group memberships for the user
  • aliased_cl_user_group.c.user_id == user_id doesn't do anything because this is already enforced by the join.
  • So it seems like if the bundle grants access to some group A, and the user X belongs to some group B, then user X incorrectly gets access to the bundle?
  • Seems like what we're missing here is we need to check that aliased_group_bundle_permission.c.group_id == aliased_cl_user_group.c.group_id?
  • Alternatively you can push this constraint into the join condition i.e. when you in onto aliased_group_bundle_permission, join using the condition that the bundle permission group is the user's groups or the public group only?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're totally correct. Thanks for catching this! I updated it. I believe it should be correct now.

@AndrewJGaut AndrewJGaut requested a review from yifanmai May 17, 2023 06:40
Comment on lines +653 to +654
aliased_group_bundle_permission.c.group_uuid
== aliased_cl_user_group.c.group_uuid, # Private group
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you have to also check that aliased_group_bundle_permission.c.group_uuid is not NULL.

Otherwise this could be triggered if the user is not in any groups, and the bundles has no group permissions. Could you check if this is the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I added that check and some more tests to make sure that's not a problem anymore.

I think it looks good now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants