-
Notifications
You must be signed in to change notification settings - Fork 2
Fix current version bug #366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix current version bug #366
Conversation
…missions to view it. The current version to other users will only be the published one.
bencap
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this Estelle. Is there a reason why all of the function definitions got an extra indent? It also might be nice if we add a test that ensures this new behavior to superseding score sets.
src/mavedb/routers/score_sets.py
Outdated
| if( | ||
| item | ||
| and item.superseding_score_set | ||
| and not owner_or_contributor | ||
| and ( | ||
| urn_re.MAVEDB_OLD_TMP_URN_RE.fullmatch(item.superseding_score_set.urn) | ||
| or urn_re.MAVEDB_TMP_URN_RE.fullmatch(item.superseding_score_set.urn) | ||
| ) | ||
| ): | ||
| item.superseding_score_set = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if instead of checking like this we check the permissions of the superseding score set, something like:
| if( | |
| item | |
| and item.superseding_score_set | |
| and not owner_or_contributor | |
| and ( | |
| urn_re.MAVEDB_OLD_TMP_URN_RE.fullmatch(item.superseding_score_set.urn) | |
| or urn_re.MAVEDB_TMP_URN_RE.fullmatch(item.superseding_score_set.urn) | |
| ) | |
| ): | |
| item.superseding_score_set = None | |
| from mavedb.lib.permissions import Action, assert_permission, has_permission | |
| if item.superseding_score_set: | |
| superseding_score_set = db.scalars(select(ScoreSet).where(ScoreSet.urn = item.superseding_score_set).one() | |
| if not has_permission(user, item, Action.READ): | |
| item.superseding_score_set = None |
This way, we don't have to maintain any extra permission logic and can guarantee the item within the superseding score set property is only returned if the user has access to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified it. I check item.superseding_score_set directly cause it's an object.
bencap
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you also be able to fix the extra indentations before merging? The logic looks good!
…score sets are unpublished yet. Haven't fixed the search score set codes.
No it seems like there is just an extra level of indentation that got added to each of the function definitions in the src/mavedb/routers/score_sets.py file for whatever reason. |
src/mavedb/routers/experiments.py
Outdated
| superseding_score_sets = ( | ||
| db.query(ScoreSet) | ||
| .filter(ScoreSet.experiment_id == experiment.id) | ||
| .filter(ScoreSet.superseding_score_set.has()) | ||
| .all() | ||
| ) | ||
|
|
||
| updated_score_set_result = [] | ||
| for s in score_set_result: | ||
| current_version = s | ||
| while current_version: | ||
| if current_version.superseded_score_set: | ||
| if not has_permission(user_data, current_version, Action.READ).permitted: | ||
| next_version: Optional[ScoreSet] = next( | ||
| (sup for sup in superseding_score_sets if sup.urn == current_version.superseded_score_set.urn), | ||
| None | ||
| ) | ||
| # handle poetry run mypy src/ error so that add next_version | ||
| if next_version: | ||
| current_version = next_version | ||
| else: | ||
| break | ||
| else: | ||
| break | ||
| else: | ||
| break | ||
| if current_version: | ||
| updated_score_set_result.append(current_version) | ||
| else: | ||
| updated_score_set_result.append(s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts on this are that it is something we may do often enough that we will want a separate utility function. What I would suggest is:
- Create a new function
find_superseded_score_set_tail. We'll be passing the function a ScoreSet database object, and then I think it also makes sense to pass it an optionaluser_dataobject and anactionobject. Basically, if we call the function withaction=<some_action>, we will return thepermittedtail of the superseded score set chain rather than the 'actual' tail. So if we have a chain looking likeurn1 -> urn2 -> urn3 -> tmp1 -> tmp2and the user has permission for all the urn objects but not the tmp objects, we would return theurn3object rather thantmp2. If we call the function withaction=None, we ignore the user permissions and just return the end of the chain (tmp2) in this case. Notice, we presume the permissions on the passed score set. This permission should be checked by the caller.
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
...
- I like the iterative approach you took above. We're basically just traversing a singly linked list, but with the added caveat that we have to look out for permissions and keep track of the previous score set so we can retain and return it. I think it'd look something like:
def find_superseded_score_set_tail(score_set: ScoreSet, action: Optional[Action] = None, user_data: Optional[UserData] = None):
while score_set.superseding_score_set is not None:
next_score_set_in_chain = score_set.superseding_score_set
# If we were given a permission to check and the next score set in the chain does not have that permission,
# pretend like we have reached the end of the chain. Otherwise, continue to the next score set.
if action is not None and not has_permission(next_score_set_in_chain, action, user_data):
return score_set
score_set = next_score_set_in_chain
return score_set
- Now, we can call our new helper function on each of the score sets in the score set result and then guarantee the permissions of the base results:
...
score_set_result = (
db.query(ScoreSet)
.filter(ScoreSet.experiment_id == experiment.id)
.filter(~ScoreSet.superseding_score_set.has())
.all()
)
# First, guarantee the permissions of the base score set results. We should verify permissions can only get narrower for superseding score sets. If they can widen, that would lead to bugs with this method.
score_set_result = [
score_set for score_set in score_set_result if has_permission(
user_data,
score_set,
Action.READ
).permitted
]
# If a score set has no superseding score set, we have already guaranteed its permissions. This function guarantees the permission of the superseding score sets.
superseded_score_set_tails = [
find_superseded_score_set_tail(
score_set,
Action.READ,
user_data
) for score_set in score_set_result
]
...
superseded_score_set_tailsnow contains all the score sets which do not have a superseding score set and to which the user has permissions in addition to the deepest superseding score set to which the user has permissions for each of the score sets with a superseding score set.
Honestly, this is a super tricky little problem so I'm happy to talk more about it if any of that doesn't make sense. Also possible that I made a mistake somewhere in that, so lmk if you see one. Given the complexity, this might even deserve its own change request. But I like the way this solution will allow us to reuse the find_superseded_score_set_tail function and write some test cases for it, since its complexity means it isn't something we're going to want to maintain in multiple spots and this is probably going to be something we want to reuse in other locations where we need to serve the end of a superseding score set chain to a user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It cannot handle a case. If the superseding score set is temporary, it'll be filtered out on the first score_set_result. An example case is experiment urn:mavedb:00000049-a. Its score set will be empty. I agree to create a separating function to check the current version of score set as you suggested. It'll be easy to handle in the future.
|
I see. I think the original reason is I clicked the tidy file button, but I converted them back before pushing the codes. Maybe I didn't notice this file. |
…dd some related tests.
src/mavedb/lib/score_sets.py
Outdated
| print(score_sets) | ||
| # Remove superseded score set | ||
| if not score_sets: | ||
| score_sets = [] | ||
|
|
||
| save_to_logging_context({"matching_resources": len(score_sets)}) | ||
| logger.debug(msg=f"Score set search yielded {len(score_sets)} matching resources.", extra=logging_context()) | ||
| print("if no score set") | ||
| final_score_sets: list[ScoreSet] = [] | ||
| else: | ||
| if search.published: | ||
| filtered_score_sets_tail = [ | ||
| find_publish_or_private_superseded_score_set_tail( | ||
| score_set, | ||
| Action.READ, | ||
| owner_or_contributor, | ||
| search.published | ||
| ) for score_set in score_sets | ||
| ] | ||
| else: | ||
| print("filtered_tail") | ||
| filtered_score_sets_tail = [ | ||
| find_superseded_score_set_tail( | ||
| score_set, | ||
| Action.READ, | ||
| owner_or_contributor | ||
| ) for score_set in score_sets | ||
| ] | ||
| print(len(filtered_score_sets_tail)) | ||
| # Remove None item. | ||
| filtered_score_sets = [score_set for score_set in filtered_score_sets_tail if score_set is not None] | ||
| print(len(filtered_score_sets)) | ||
| if filtered_score_sets: | ||
| final_score_sets = sorted(set(filtered_score_sets), key=attrgetter("urn")) | ||
| for f in filtered_score_sets: | ||
| print(f.urn) | ||
| else: | ||
| final_score_sets = [] | ||
|
|
||
| return score_sets # filter_visible_score_sets(score_sets) | ||
| save_to_logging_context({"matching_resources": len(final_score_sets)}) | ||
| logger.debug(msg=f"Score set search yielded {len(final_score_sets)} matching resources.", extra=logging_context()) | ||
| print(len(final_score_sets)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove the print statements in here now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally forgot these debugging print functions. =.=
src/mavedb/lib/score_sets.py
Outdated
|
|
||
|
|
||
| def search_score_sets(db: Session, owner_or_contributor: Optional[User], search: ScoreSetsSearch) -> list[ScoreSet]: | ||
| def search_score_sets(db: Session, owner_or_contributor: Optional["UserData"], search: ScoreSetsSearch) -> list[ScoreSet]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think when we talked about this before what we had discussed was making the signature something like:
def search_score_sets(db: Session, owner_or_contributor: Optional["User"], search: ScoreSetsSearch, requesting_user: "UserData") -> list[ScoreSet]:
...
If we pass in the owner_or_contributor variable, we only fetch score sets which have that user as an owner or contributor. And then we use the requesting_user variable to check permissions. That way, we will be able to still search for other users' score sets but ensure that the user who requested them has permissions.
If we do it like that, I don't think we will need to make any changes to the existing search logic and don't need to add the search.me field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the alternative is to not check permissions or fetch superseding score set tails in the search function itself, but instead call the function that fetches tails and checks permissions after we return the score sets that match the search.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on your suggestions, I did some modifications. The search_score_set function is nearly the same as before except removing filtering superseded score set out, and the permissions are checked outside of the search function. Let me know what you think about the new one. I prefer this way, but maybe miss some points.
| if action is not None and not has_permission(user_data, score_set, action).permitted: | ||
| while score_set.superseded_score_set is not None: | ||
| next_score_set_in_chain = score_set.superseded_score_set | ||
| if has_permission(user_data, next_score_set_in_chain, action).permitted: | ||
| return next_score_set_in_chain | ||
| else: | ||
| score_set = next_score_set_in_chain | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this checking the case where a user doesn't have permissions on the original score set, and going back up the chain of score sets it supersedes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used to handle unpublished superseding score sets. For example, tmp1 supersedes urn1 so the chain is tmp1 -> urn1. The user doesn't have permission to view tmp1 so the while loop tries to find the next newest available version to the user. If the user has permission to view the next superseded score set, the function will return the score set. Otherwise, it'll return a none value.
The previous while loop doesn't handle this special case. It only looks at whether the score set has a superseding score set.
while score_set.superseding_score_set is not None:
next_score_set_in_chain = score_set.superseding_score_set
if action is not None and not has_permission(user_data, next_score_set_in_chain, action).permitted:
return score_set
score_set = next_score_set_in_chain
…coreSet' into estelle/debugShowTmpSupersedingScoreSet
bencap
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the back and forth on this Estelle, it'll be a good fix to get out!
|
(^o^)/ Thanks Ben!! |
Only show unpublished superseding score set to the users who have permissions to view it. The current version to other users will only be the published one.
Fix #370