Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assignment service uses new ext_lti_assignment_id column #3119

Conversation

marcospri
Copy link
Member

@marcospri marcospri commented Sep 17, 2021

This code should behave exactly as without any changes until further PRs are merged.

More context in #3127

Testing

  • No behavior changes, launch and configure URL and canvas file assignments.

@marcospri marcospri marked this pull request as draft September 17, 2021 09:10
@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from 6029368 to 7b231ff Compare September 17, 2021 09:35
Comment on lines 66 to 79
if len(assignments) == 2:
# We stored a canvas file assignment before (storing its resource_link_id)
# we later configured it (storing its ext_lti_assignment_id) in a new row
# and now we are launching it, we want to merge those two assignments
old_assignment = (
assignments[0] if assignments[0].resource_link_id else assignments[1]
)
assignment = (
assignments[0]
if assignments[0].ext_lti_assignment_id
else assignments[1]
)
self._db.delete(old_assignment)
self._db.flush()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this because there are already canvas file assignments in the DB which have a resource_link_id.

After this, re-configured assignments will be created with ext_lti_assignment_id creating a duplicate.

)
)
else:
log.exception(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bddtests expect this behavior, ie not raising here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I think we may have conflicting uses of AssignmentService.get(). In the LTI launch predicates the code seems to deliberately call AssignmentService.get() with just a resource_link_id that might be None and handles the fact that get() might return None by calling bool on it. The code does that here:

assignment_svc = request.find_service(name="assignment")
resource_link_id = request.params.get("resource_link_id")
tool_consumer_instance_guid = request.params.get("tool_consumer_instance_guid")
has_document_url = bool(
assignment_svc.get_document_url(
tool_consumer_instance_guid, resource_link_id
)
)

and again here:

previous_document_url = assignment_service.get_document_url(
tool_consumer_instance_guid, original_resource_link_id
)

In these cases I don't think you want to log an exception since this appears to be normal, expected behaviour. So an exception in the logs would just be confusing. And logging an exception will report the exception to Sentry as well.

In other places where get() is called the code seems to assume that it will not return None. This happens in the Canvas Files API views:

assignment = self.request.find_service(name="assignment").get(
application_instance.tool_consumer_instance_guid,
self.request.matchdict["resource_link_id"],
)
public_url = self.canvas.public_url_for_file(
assignment,
self.request.matchdict["file_id"],
self.request.matchdict["course_id"],
check_in_course=self.request.lti_user.is_instructor,
)

And it also happens in two places in the basic LTI launch views:

document_url = self.assignment_service.get_document_url(
tool_consumer_instance_guid, resource_link_id
)
return self.basic_lti_launch(document_url)

document_url = self.assignment_service.get_document_url(
tool_consumer_instance_guid, original_resource_link_id
)
self.assignment_service.set_document_url(
tool_consumer_instance_guid, document_url, resource_link_id=resource_link_id
)
return self.basic_lti_launch(document_url)

I think these three are assuming that the return value is never None?

So there are two conflicting use cases. So I think you might want to split this into two different methods:

class AssignmentService:
    def get(self, ...):
        # This will raise an exception if both resource_link_id and ext_lti_assignment_id are None.
        # This will also raise an exception if there's no matching assignment.
        # I don't *think* we expect these exceptions to ever actually get raised so we can just
        # allow the app to crash with a 500.

    def exists(self, ...):
        # This one always returns either True or False, never raises.
        # Alternatively this could return an assignment or None. But I think True or False might suffice.
        try:
            self.get(*args, **kwargs)
        except ValueError:
            return False
        else:
            return True

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two issues here:

  • Not passing any of the ids to .get(). This is not expected behavior I think, we should detect that early and issue a 400 or similar. The bddtest fail if I raise here but I think only because they are trying weird cases around the lti spec.

    Based on this I think we don't 4xx on those (missing resource_link_id) because we do some sort of lti redirect. https://github.com/hypothesis/lms/blob/onedrive-auth-page/lms/validation/_lti_launch_params.py#L52

  • Not checking for none on get. The views on basic_lti_launch assume there's a document and don't check for None because they come from the db_configured predicate that checks exactly that (

    # The ``db_configured=True`` view predicate ensures that this view
    )

    Then use on the predicate itself, here get_document_url is checking two things, that there's a row in the db and that it has a document. Refactoring it with exists() will mean doing a query to check if there's an assignment and another to get it and check the document. We could return the assignment on exists then but then we really wound't have an use for the get, will be just renaming it and between get -> Assignment and exist -> Assignment I prefer the first one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's doing a redirect instead of 404ing in order to be compliant with the LTI spec. It was necessary in order to get LTI certification.

I think the code that's calling get(resource_link_id=None) is doing so deliberately. It looks to me like it's doing resource_link_id = request.params.get("resource_link_id") instead of just request.params["resource_link_id"] deliberately so that resource_link_id will be None if it's not in the request.params.

Hmm. Ok, so I think it's actually get_document_url() rather than get() that has two different use-cases?

  1. The predicates are calling get_document_url(), possibly passing it resource_link_id=None, or just a resource_link_id that doesn't exist in the DB, and they expect it to return a document URL or None. In this case you don't want to raise if resource_link_id is None or if there's no matching assignment, you just want to return None.

  2. The views are calling get_document_url() and assuming (because the predicates must have already run) that it's going to return a document URL and not None. In this case you could raise if resource_link_id is None or if there's no matching assignment, because this should never happen.

So you might want separate get_document_url() and has_document_url() methods?

class AssignmentService:
    @lru_cache
    def get(self, ...):
        # Raise if both resource_link_id and ext_lti_assignment_id are None.
        # Raise if there's no matching assignment.

    def get_document_url(self, ...):
        # Calls get() so it will raise if both resource_link_id and ext_lti_assignment_id are None
        # or if there's no matching assignment.
        # Also raise if there is a matching assignment but it has no document_url.

    def has_document_url(self, *args, **kwargs):
        try:
            self.get_document_url(*args, **kwargs)
        except WhateverError:
            return False
        else:
            return True

Note that these won't result in multiple DB queries because both get_document_url() and has_document_url() work by calling get() which has @lru_cache on it.

The purpose of all this is to allow get() and get_document_url() to raise if both resource_link_id and ext_lti_assignment_id are None or if there's no matching assignment, because we never expect that to happen. But in the predicates, where we need it to not raise, those can call has_document_url() instead.

I don't think get() can do log.exception() as you have it currently because I think that will report exceptions to Sentry during expected behaviour, the edge cases that those BDD tests are covering.

Copy link
Collaborator

@seanh seanh Sep 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of all this (in case it's not clear) is to allow the get() method to look something like this:

    @lru_cache
    def get(self, tool_consumer_instance_guid, resource_link_id=None, ext_lti_assignment_id=None):
        assert resource_link_id or ext_lti_assignment_id, "Can't get an assignment without a resource_link_id or ext_lti_assignment_id"

        if resource_link_id and ext_lti_assignment_id:
            return self._get_for_canvas_launch(resource_link_id, ext_lti_assignment_id, ...)

        if resource_link_id:
            return self._get_by_resource_link_id(resource_link_id, ...)

        return self._get_by_ext_lti_assignment(ext_lti_assignment_id, ...)

You might raise something else rather than an assert. But we can make it crash so that we'll get notified if anything is calling get() or get_document_url() without a resource_link_id or an ext_lti_assignment_id. At the same time the predicates can use the has_document_url() method to avoid any crash

)
if resource_link_id and not ext_lti_assignment_id:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional and the query are a bit verbose but trying to be clever here makes it a bit opaque.

@marcospri marcospri marked this pull request as ready for review September 20, 2021 12:47
@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from 0c687f0 to d56d5af Compare September 21, 2021 06:03
Comment on lines 87 to 46
def get(
self,
tool_consumer_instance_guid,
resource_link_id=None,
ext_lti_assignment_id=None,
):
"""Get an assignment using using resource_link_id, ext_lti_assignment_id or both."""

if resource_link_id and not ext_lti_assignment_id:
# Non canvas assignments
return (
self._db.query(Assignment)
.filter(
Assignment.tool_consumer_instance_guid
== tool_consumer_instance_guid,
Assignment.resource_link_id == resource_link_id,
)
.one_or_none()
)
.one_or_none()

return self._get_canvas(
tool_consumer_instance_guid, resource_link_id, ext_lti_assignment_id
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.get look less daunting now although all the code is still here (._get_canvas and _merge_canvas_assignments). I think is a good solution without a more deep refactoring.

Copy link
Collaborator

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind trying to split up the get() method some more?

I think your _get_canvas() is currently dealing with three separate cases (not resource_link_id and ext_lti_assignment_id , resource_link_id and ext_lti_assignment_id, and when both are None) and then the top-level get() deals directly with the remaining case (resource_link_id and not ext_lti_assignment_id). I think it might split things up more if get() began by directing each one of the three different cases to a different method.

I think we might want to consider making _merge_canvas_assignments() a public method as well. Partially just to further split up the code into smaller pieces. And partially because I'm not sure that a get() method should have a side-effects like merging things in the DB. I'm wondering under exactly what circumstances _merge_canvas_assignments() actually happens? But I think it might be better if the service method actually returns two assignments and the view detects that and calls merge_canvas_assignments().

@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from d56d5af to 054ad97 Compare September 21, 2021 12:08
@marcospri
Copy link
Member Author

Would you mind trying to split up the get() method some more?

I went a bit on that direction and also moved the private methods to the bottom of the file.

I think your _get_canvas() is currently dealing with three separate cases (not resource_link_id and ext_lti_assignment_id , resource_link_id and ext_lti_assignment_id, and when both are None) and then the top-level get() deals directly with the remaining case (resource_link_id and not ext_lti_assignment_id). I think it might split things up more if get() began by directing each one of the three different cases to a different method.

Yes, I split the public get based on what you are trying to do as a caller and expanded a bit more the comments on what you might find on the db on the canvas method.

I think we might want to consider making _merge_canvas_assignments() a public method as well. Partially just to further split up the code into smaller pieces. And partially because I'm not sure that a get() method should have a side-effects like merging things in the DB. I'm wondering under exactly what circumstances _merge_canvas_assignments() actually happens? But I think it might be better if the service method actually returns two assignments and the view detects that and calls merge_canvas_assignments().

I don't think having side effects in the get is good but I don't having it return more than one row is better. Get doesn't get called directly, only from set_document/get_document so it happens directly on the predicate (even if it appears to happen on the view).

I tried a new is_canvas predicate and duplicated the db_configured view to have a canvas specific one at that point we could use either a separate service or different methods on the same service but would also have to do the same thing on the db_configure predicate and I didn't like the end result.

For me the fact that get hides theses complexity trumps both the ugliness of the side effect query method and mixing canvas an non canvas logic in the service. I could see some clever service dispatch (eg the factory returning one instance of the service or another depending on from which LMS you are calling) could work without having to do a lot if if is_canvas all over the place.

@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch 2 times, most recently from c86dfae to cf56608 Compare September 22, 2021 14:37
lms/services/assignment.py Show resolved Hide resolved
lms/services/assignment.py Show resolved Hide resolved
Comment on lines 181 to 184
query = self._db.query(Assignment).filter(
Assignment.tool_consumer_instance_guid == tool_consumer_instance_guid,
(
# Regular canvas launch
(
(Assignment.resource_link_id == resource_link_id)
& (Assignment.ext_lti_assignment_id == ext_lti_assignment_id)
)
# Configured file assignment that was stored in the DB
# before all canvas assignments were stored in the DB
| (
(Assignment.resource_link_id == resource_link_id)
& (Assignment.ext_lti_assignment_id.is_(None))
)
# First launch of a newly configured assignment
| (
(Assignment.resource_link_id.is_(None))
& (Assignment.ext_lti_assignment_id == ext_lti_assignment_id)
)
),
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the query can be simplified using in_()? I think it might be something like:

assignments = self._db.query(Assignment).filter(
    Assignment.tool_consumer_instance_guid == tool_consumer_instance_guid,
    Assignment.resource_link_id.in_([resource_link_id, None]),
    Assignment.ext_lti_assignment_id.in_([ext_lti_assignment_id, None])
).all()

I think we can omit the explanatory comments from the query and put them in the Python code that follows it instead, as the Python code is where you can see how each case is handled.

Copy link
Member Author

@marcospri marcospri Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The in behaves weirdly with null as anything that compares against it (select 1 where null = null;) so I think we need to use is null (sqlalcehmy converts x == null to x is null so that will work too) explicitly.

I did remove the comments and removed the intermediate query variable.

lms/services/assignment.py Outdated Show resolved Hide resolved
lms/services/assignment.py Outdated Show resolved Hide resolved
if not assignments:
raise NoResultFound()

if len(assignments) == 2:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I think assignments can only be empty, length 1, or length 2. It's not possible to have three matching assignments. For example you couldn't have this:

  1. A row with both the resource_link_id and the ext_lti_assignment_id
  2. A row with the matching resource_link_id and no ext_lti_assignment_id
  3. A row with the matching ext_lti_assignment_id but no resource_link_id

Because there are unique constraints on both resource_link_id and ext_lti_assignment_id, so 1 and 2 can't exist at the same time and 1 and 3 also can't exist at the same time. You either get none of them, 1 alone, 2 alone, 3 alone, or 2 and 3 together.

Comment on lines 119 to 126
old_assignment = (
assignments[0] if assignments[0].resource_link_id else assignments[1]
)
assignment = (
assignments[0] if assignments[0].ext_lti_assignment_id else assignments[1]
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you're implicitly assuming that one of the assignments has a resource_link_id and no ext_lti_assignment_id and the other has an ext_lti_assignment_id and no resource_link_id. I wonder if it might be nice to make that assumption explicit? Both because it might make the code more readable to explicitly state the assumption, and because it'll mean that we get notified if something's wrong.

This code also looks like it could assign the same object to old_assignment and assignment in cases where the above assumption has failed. It might be good to just make it so that it can't possibly assign the two to the same object. Again both for readability and safety.

Maybe something like:

if assignments[0].resource_link_id:
    old_assignment, new_assignment = assignments
else:
    new_assignment, old_assignment = assignments

assert not old_assignment.ext_lti_assignment_id
assert not new_assignment.resource_link_id

I think assertions might actually be fine and appropriate for this but could raise some other exception instead if you prefer.

Copy link
Member Author

@marcospri marcospri Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes good point, much more readable. Assertions are 👍

edit: I added an order by clause to the query to avoid this (and having to add a no cover or a test which forces the rows to come in a different order). Kept the assertions as a sanity check.

assignments[0] if assignments[0].ext_lti_assignment_id else assignments[1]
)
self._db.delete(old_assignment)
self._db.flush()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the flush really necessary here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, otherwise the old and new rows conflict on having the same resource_link_id before the delete takes effect

This could be solved in other ways I think (deferrable constraint? CTEs?) but it all looks more complicated than a flush.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird. I would think that if you're both deleting an object and updating another object in the same unit of work SQLAlchemy or Postgres would be smart enough to process the deletes first, but I guess not

assignment = (
assignments[0] if assignments[0].ext_lti_assignment_id else assignments[1]
)
self._db.delete(old_assignment)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so I guess "merging" the two assignments pretty much means deleting the old one. That might be fine. For example I guess it's fine to delete the old assignment's document_url as the new assignment has its own document_url.

But what about assignment.extra? I'm afraid we may need to copy that over to the new assignment. And I'm afraid that could be quite awkward if the new assignment could also potentially have a non-empty extra.

Copy link
Member Author

@marcospri marcospri Sep 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have to worry about extra. Existing canvas DB assignments don't have anything stored there and after all we only care about those in all this merge business.

That's was wrong, it might contain data for course copy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we could do for canvas assignments is store a list of all the document_urls a assignment had, might be useful? But I rather do that after all of this is done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the delete() and flush() to the bottom of the method? I don't know if it actually makes a difference, but it seems more intuitive to copy old_assignment's attributes to new_assignment and then delete old_assignment, rather than the other way round

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that would create a duplicate, added a comment on the flush.

)
self._db.delete(old_assignment)
self._db.flush()
return assignment
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to do assignment.resource_link_id = old_assignment.resource_link_id?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That already happens on the calling function but added it here to have a more independent function

@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from cf56608 to e69ef4b Compare September 23, 2021 08:16
@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from 6aaaa1a to 3d2d9ee Compare September 23, 2021 09:43
Comment on lines 29 to 43
if resource_link_id and not ext_lti_assignment_id:
return self._get_by_resource_link_id(
tool_consumer_instance_guid, resource_link_id
)

Return the document URL for the assignment with the given
tool_consumer_instance_guid and resource_link_id, or None.
"""
assignment = self.get(tool_consumer_instance_guid, resource_link_id)
# Creating/editing a canvas assignment, only `ext_lti_assignment_id` is available
if not resource_link_id and ext_lti_assignment_id:
return self._get_by_ext_lti_assignment(
tool_consumer_instance_guid, ext_lti_assignment_id
)

return assignment.document_url if assignment else None
# We have both ext_lti_assignment_id and resource_link_id, canvas launch.
return self._get_for_canvas_launch(
tool_consumer_instance_guid, resource_link_id, ext_lti_assignment_id
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this method be clearer if we changed the order of the cases? if ... and not ... and if not ... and ... are relatively complex and hard to read, and I think we might be able to avoid them and use only simpler if's if we do them in a different order:

if not any([resource_link_id, ext_lti_assignment_id]):
    raise ValueError(...)

if resource_link_id and ext_lti_assignment_id:  # Or: if all([resource_link_id, ext_lti_assignment_id])
    return self._get_for_canvas_launch(...)

if not resource_link_id:
    return self._get_by_ext_lti_assignment_id(...)

return self._get_by_resource_link_id(...)

If execution gets past both the if not any([resource_link_id, ext_lti_assignment_id]) and the if resource_link_id and ext_lti_assignment_id then either one of resource_link_id or ext_lti_assignment_id must be None but not both, so only an if not resource_link_id is needed.

Comment on lines 149 to 159
def _get_by_ext_lti_assignment(
self, tool_consumer_instance_guid, ext_lti_assignment_id
):
return (
self._db.query(Assignment)
.filter_by(
tool_consumer_instance_guid=tool_consumer_instance_guid,
ext_lti_assignment_id=ext_lti_assignment_id,
)
.one()
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure about the naming here. As far as I know ext_lti_assignment_id's are Canvas-only so _get_by_ext_lti_assignment() is Canvas-specific but we have no "canvas" in the name. But then we do have "canvas" in the name of _get_for_canvas_launch() below.

Should we pick a side of the fence, in terms of whether or not we're using Canvas-specific naming whenever ext_lti_assignment_id is involved?

Is the solution to rename _get_by_ext_lti_assignment() to something with "canvas" in it? Or to rename _get_for_canvas_launch() to something without Canvas in it, like _get_by_resource_link_id_and_ext_lti_assignment_id() (which is very long but very explicit) or _get_by_both() or something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went for _get_for_canvas_assignment_config

else:
assignment = assignments[0]

if resource_link_id and not assignment.resource_link_id:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if resource_link_id and not assignment.resource_link_id:
if not assignment.resource_link_id:

_get_for_canvas_launch() is never called without a resource_link_id so we can simplify the if.

You could put assert resource_link_id and ext_lti_assignment_id at the top of _get_for_canvas_launch() if you wanted to make this more explicit.

)

# Clear the cache (@lru_cache) on self.get because we've changed the
# contents of the DB. (Python's @lru_cache doesn't have a way to remove
# just one key from the cache, you have to clear the entire cache.)
self.get.cache_clear()

return assignment

def _merge_canvas_assignments(self, assignments):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should move _merge_canvas_assignments() further down in the file, below the _get_for_canvas_launch() that calls it. It's a Canvas-specific edge case so we can hide it down below all the more general _get_*()'s

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we better add a docstring. Maybe something like this:

def _merge_canvas_assignments(self, old_assignment, new_assignment):
    """
    Merge two Canvas assignments into one and return the merged assignment.

    Merge old_assignment into new_assignment, delete old_assignment, and return the updated
    new_assignment.
    """

lms/services/assignment.py Outdated Show resolved Hide resolved
tests/unit/lms/services/assignment_test.py Outdated Show resolved Hide resolved
tests/unit/lms/services/assignment_test.py Outdated Show resolved Hide resolved
tests/unit/lms/services/assignment_test.py Outdated Show resolved Hide resolved
tests/unit/lms/services/assignment_test.py Outdated Show resolved Hide resolved
lms/services/assignment.py Outdated Show resolved Hide resolved
@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from 23b964b to 6722b8c Compare September 27, 2021 06:25
Co-authored-by: Sean Hammond <seanh@users.noreply.github.com>
@marcospri marcospri force-pushed the ext_lti_assignment_id-column-service branch from 6722b8c to d355a06 Compare September 27, 2021 07:44
Copy link
Collaborator

@seanh seanh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're going to have to change this so that the method that the predicates call (currently AssignmentService.exists()) can never have any side effects. The reason is this:

  1. Create a new assignment in Canvas. Canvas launches us with an ext_lti_assignment_id but no resource_key. We call _get_for_canvas_assignment_config() and it finds no matching assignment in the DB so we save a new assignment with the ext_lti_assignment_id and no resource_key

  2. Now launch the newly-created Canvas assignment for the first time. This time Canvas launches us with both a resource_link_id and an ext_lti_assignment_id. We call _get_for_canvas_launch(). It finds the assignment that we created in step 1 above and sets the existing assignment's resource_link_id.

    But look at the call stack when the assignment.resource_link_id = resource_link_id line gets executed: AssignmentService is being called by one of the view predicates. At this point in the request processing the request has not been authorized. The _permits() function in security.py hasn't been called yet. Could an unauthorized request cause the resource_link_id's of assignments to be set? Or cause two existing assignments to get merged?

In fact I think it turns out that unauthorized requests cannot modify the DB: when authorization fails we fail the request (with a 404 or something) and that causes the DB transaction to not get committed. But I don't think we want to rely on that as the one thing preventing us from making potentially unauthorized changes to the DB.

Nonetheless, I don't think we want to do this: view predicates get called by Pyramid before the view gets selected, before authorization happens, I think we're asking for trouble by letting them make changes to the DB.

new_assignment.
"""
old_extra = dict(old_assignment.extra)
# Legacy Canvas assignments in the DB were only ever saved with
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Legacy Canvas assignments in the DB were only ever saved with
# Legacy Canvas assignments in the DB were only ever saved with

@marcospri
Copy link
Member Author

Refactored again in another PR #3158

@marcospri marcospri closed this Sep 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants