Skip to content

Conversation

@EstelleDa
Copy link
Member

@EstelleDa EstelleDa commented Dec 5, 2024

Only show unpublished superseding score set to the users who have permissions to view it. The current version to other users will only be the published one.

Fix #370

…missions to view it. The current version to other users will only be the published one.
Copy link
Collaborator

@bencap bencap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this Estelle. Is there a reason why all of the function definitions got an extra indent? It also might be nice if we add a test that ensures this new behavior to superseding score sets.

Comment on lines 107 to 116
if(
item
and item.superseding_score_set
and not owner_or_contributor
and (
urn_re.MAVEDB_OLD_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
or urn_re.MAVEDB_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
)
):
item.superseding_score_set = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if instead of checking like this we check the permissions of the superseding score set, something like:

Suggested change
if(
item
and item.superseding_score_set
and not owner_or_contributor
and (
urn_re.MAVEDB_OLD_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
or urn_re.MAVEDB_TMP_URN_RE.fullmatch(item.superseding_score_set.urn)
)
):
item.superseding_score_set = None
from mavedb.lib.permissions import Action, assert_permission, has_permission
if item.superseding_score_set:
superseding_score_set = db.scalars(select(ScoreSet).where(ScoreSet.urn = item.superseding_score_set).one()
if not has_permission(user, item, Action.READ):
item.superseding_score_set = None

This way, we don't have to maintain any extra permission logic and can guarantee the item within the superseding score set property is only returned if the user has access to it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified it. I check item.superseding_score_set directly cause it's an object.

@bencap bencap changed the base branch from release-2024.4.3 to release-2024.4.4 December 10, 2024 22:36
Copy link
Collaborator

@bencap bencap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you also be able to fix the extra indentations before merging? The logic looks good!

@EstelleDa
Copy link
Member Author

Would you also be able to fix the extra indentations before merging? The logic looks good!

@bencap Do you mean the one we discussed in Slack, #371?

…score sets are unpublished yet. Haven't fixed the search score set codes.
@bencap
Copy link
Collaborator

bencap commented Jan 7, 2025

Would you also be able to fix the extra indentations before merging? The logic looks good!

@bencap Do you mean the one we discussed in Slack, #371?

No it seems like there is just an extra level of indentation that got added to each of the function definitions in the src/mavedb/routers/score_sets.py file for whatever reason.

Comment on lines 172 to 201
superseding_score_sets = (
db.query(ScoreSet)
.filter(ScoreSet.experiment_id == experiment.id)
.filter(ScoreSet.superseding_score_set.has())
.all()
)

updated_score_set_result = []
for s in score_set_result:
current_version = s
while current_version:
if current_version.superseded_score_set:
if not has_permission(user_data, current_version, Action.READ).permitted:
next_version: Optional[ScoreSet] = next(
(sup for sup in superseding_score_sets if sup.urn == current_version.superseded_score_set.urn),
None
)
# handle poetry run mypy src/ error so that add next_version
if next_version:
current_version = next_version
else:
break
else:
break
else:
break
if current_version:
updated_score_set_result.append(current_version)
else:
updated_score_set_result.append(s)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thoughts on this are that it is something we may do often enough that we will want a separate utility function. What I would suggest is:

  • Create a new function find_superseded_score_set_tail. We'll be passing the function a ScoreSet database object, and then I think it also makes sense to pass it an optional user_data object and an action object. Basically, if we call the function with action=<some_action>, we will return the permitted tail of the superseded score set chain rather than the 'actual' tail. So if we have a chain looking like urn1 -> urn2 -> urn3 -> tmp1 -> tmp2 and the user has permission for all the urn objects but not the tmp objects, we would return the urn3 object rather than tmp2. If we call the function with action=None, we ignore the user permissions and just return the end of the chain (tmp2) in this case. Notice, we presume the permissions on the passed score set. This permission should be checked by the caller.
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
    ...
  • I like the iterative approach you took above. We're basically just traversing a singly linked list, but with the added caveat that we have to look out for permissions and keep track of the previous score set so we can retain and return it. I think it'd look something like:
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
    while score_set.superseding_score_set is not None:
        next_score_set_in_chain = score_set.superseding_score_set

        # If we were given a permission to check and the next score set in the chain does not have that permission,
        # pretend like we have reached the end of the chain. Otherwise, continue to the next score set.
        if action is not None and not has_permission(next_score_set_in_chain, action, user_data):
            return score_set

        score_set = next_score_set_in_chain

    return score_set
  • Now, we can call our new helper function on each of the score sets in the score set result and then guarantee the permissions of the base results:
...

score_set_result = (
    db.query(ScoreSet)
    .filter(ScoreSet.experiment_id == experiment.id)
    .filter(~ScoreSet.superseding_score_set.has())
    .all()
)

# First, guarantee the permissions of the base score set results. We should verify permissions can only get narrower for superseding score sets. If they can widen, that would lead to bugs with this method.
score_set_result = [
    score_set for score_set in score_set_result if has_permission(
         user_data, 
         score_set, 
         Action.READ
    ).permitted
]

# If a score set has no superseding score set, we have already guaranteed its permissions. This function guarantees the permission of the superseding score sets.
superseded_score_set_tails = [
    find_superseded_score_set_tail(
        score_set, 
        Action.READ, 
        user_data
    ) for score_set in score_set_result
]

...
  • superseded_score_set_tails now contains all the score sets which do not have a superseding score set and to which the user has permissions in addition to the deepest superseding score set to which the user has permissions for each of the score sets with a superseding score set.

Honestly, this is a super tricky little problem so I'm happy to talk more about it if any of that doesn't make sense. Also possible that I made a mistake somewhere in that, so lmk if you see one. Given the complexity, this might even deserve its own change request. But I like the way this solution will allow us to reuse the find_superseded_score_set_tail function and write some test cases for it, since its complexity means it isn't something we're going to want to maintain in multiple spots and this is probably going to be something we want to reuse in other locations where we need to serve the end of a superseding score set chain to a user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It cannot handle a case. If the superseding score set is temporary, it'll be filtered out on the first score_set_result. An example case is experiment urn:mavedb:00000049-a. Its score set will be empty. I agree to create a separating function to check the current version of score set as you suggested. It'll be easy to handle in the future.

@EstelleDa
Copy link
Member Author

#371

I see. I think the original reason is I clicked the tidy file button, but I converted them back before pushing the codes. Maybe I didn't notice this file.

@bencap bencap changed the base branch from release-2024.4.4 to release-2025.0.1 January 25, 2025 02:33
Comment on lines 267 to 304
print(score_sets)
# Remove superseded score set
if not score_sets:
score_sets = []

save_to_logging_context({"matching_resources": len(score_sets)})
logger.debug(msg=f"Score set search yielded {len(score_sets)} matching resources.", extra=logging_context())
print("if no score set")
final_score_sets: list[ScoreSet] = []
else:
if search.published:
filtered_score_sets_tail = [
find_publish_or_private_superseded_score_set_tail(
score_set,
Action.READ,
owner_or_contributor,
search.published
) for score_set in score_sets
]
else:
print("filtered_tail")
filtered_score_sets_tail = [
find_superseded_score_set_tail(
score_set,
Action.READ,
owner_or_contributor
) for score_set in score_sets
]
print(len(filtered_score_sets_tail))
# Remove None item.
filtered_score_sets = [score_set for score_set in filtered_score_sets_tail if score_set is not None]
print(len(filtered_score_sets))
if filtered_score_sets:
final_score_sets = sorted(set(filtered_score_sets), key=attrgetter("urn"))
for f in filtered_score_sets:
print(f.urn)
else:
final_score_sets = []

return score_sets # filter_visible_score_sets(score_sets)
save_to_logging_context({"matching_resources": len(final_score_sets)})
logger.debug(msg=f"Score set search yielded {len(final_score_sets)} matching resources.", extra=logging_context())
print(len(final_score_sets))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the print statements in here now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally forgot these debugging print functions. =.=



def search_score_sets(db: Session, owner_or_contributor: Optional[User], search: ScoreSetsSearch) -> list[ScoreSet]:
def search_score_sets(db: Session, owner_or_contributor: Optional["UserData"], search: ScoreSetsSearch) -> list[ScoreSet]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when we talked about this before what we had discussed was making the signature something like:

def search_score_sets(db: Session, owner_or_contributor: Optional["User"], search: ScoreSetsSearch, requesting_user: "UserData") -> list[ScoreSet]:
      ...

If we pass in the owner_or_contributor variable, we only fetch score sets which have that user as an owner or contributor. And then we use the requesting_user variable to check permissions. That way, we will be able to still search for other users' score sets but ensure that the user who requested them has permissions.

If we do it like that, I don't think we will need to make any changes to the existing search logic and don't need to add the search.me field.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose the alternative is to not check permissions or fetch superseding score set tails in the search function itself, but instead call the function that fetches tails and checks permissions after we return the score sets that match the search.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your suggestions, I did some modifications. The search_score_set function is nearly the same as before except removing filtering superseded score set out, and the permissions are checked outside of the search function. Let me know what you think about the new one. I prefer this way, but maybe miss some points.

Comment on lines +369 to +376
if action is not None and not has_permission(user_data, score_set, action).permitted:
while score_set.superseded_score_set is not None:
next_score_set_in_chain = score_set.superseded_score_set
if has_permission(user_data, next_score_set_in_chain, action).permitted:
return next_score_set_in_chain
else:
score_set = next_score_set_in_chain
return None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this checking the case where a user doesn't have permissions on the original score set, and going back up the chain of score sets it supersedes?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used to handle unpublished superseding score sets. For example, tmp1 supersedes urn1 so the chain is tmp1 -> urn1. The user doesn't have permission to view tmp1 so the while loop tries to find the next newest available version to the user. If the user has permission to view the next superseded score set, the function will return the score set. Otherwise, it'll return a none value.
The previous while loop doesn't handle this special case. It only looks at whether the score set has a superseding score set.

    while score_set.superseding_score_set is not None:
        next_score_set_in_chain = score_set.superseding_score_set
        if action is not None and not has_permission(user_data, next_score_set_in_chain, action).permitted:
            return score_set
        score_set = next_score_set_in_chain

Copy link
Collaborator

@bencap bencap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all the back and forth on this Estelle, it'll be a good fix to get out!

@EstelleDa
Copy link
Member Author

(^o^)/ Thanks Ben!!

@EstelleDa EstelleDa merged commit c95060d into release-2025.0.1 Feb 19, 2025
5 checks passed
@bencap bencap mentioned this pull request Feb 25, 2025
@EstelleDa EstelleDa deleted the estelle/debugShowTmpSupersedingScoreSet branch February 27, 2025 10:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tweak behavior of superseding score sets for users without permissions Temporary URN displayed in the "Current version" field for some scoresets

3 participants