Compile Desirable Use Cases Not Covered in Current Spec #497

Open
fugu13 opened this issue Sep 24, 2014 · 14 comments

@fugu13
Contributor

fugu13 commented Sep 24, 2014

Ideally we'd get, for each use case, a General Description, a Reason for Current Impedance that is as detailed as possible (at a technical level, since whether or not something is possible is ultimately about technical capabilities), and, optionally, Possible First Ideas Toward a Solution.

This is one aspect of brainstorming toward a possible future partially breaking spec version, whatever that gets called.

Then drop those into comments here, and as we triage and evolve we can turn those into standalone stories.

@canweriotnow
Contributor

I can't find it now, but somewhere there was an example use case based on Twitter activity. I'm currently stress-testing an LRS with a Twitter firehose archive, and it brought up a use case:

Here's an example (based on that example) encoding my latest tweet as a Statement:

{
  "id": "some-random-uuid",
  "actor": {
    "account": {
      "name": "canweriotnow",
      "homePage": "https://twitter.com"
    },
    "name": "djeisyn.luys."
  },
  "verb": {
    "id": "http://id.tincanapi.com/verb/tweeted",
    "display": {"en-US": "tweeted"}
  },
  "object": {
    "id": "https://twitter.com/canweriotnow/status/543888863899713536"
  }
}

The problem we saw was that the tweet could be reported in multiple statements (since the UUIDs can be generated by the server and will not collide), even though it's actually the same specific action being reported multiple times.

I'm not sure of the best fix for this. One possibility is using a generic tweet activity and not using the exact tweet, and embedding the exact tweet as an extension; this seems unsatisfactory as an LRS MUST NOT reject a Statement based on its extensions.

I think a better solution would be to add an optional boolean attribute to Activity, such as "repeatable"; if it is set to false, the LRS SHOULD or MUST reject any further statement asserting that Activity.

Thoughts?

@garemoko
Contributor

Thanks for raising this one, Jason. I came across the issue of multiple statements about the same event this summer; in my case it was a JavaScript-based e-learning course.

We never got to the root cause, but the issue we were seeing was that in a very small number of cases (0.1% or something tiny) the bookmarking data representing a complete course was getting saved to the LRS, but the "I Completed This" statement was going missing. This was a critical statement to the learner progressing on the LMS, but as the bookmarking already had the course complete, the learner was stuck in limbo - unable to ever generate the "completed" statement. Our solution was to re-issue the completed statement when the learner launched an attempt that was already complete, which meant that any learner revisiting a course that had successfully been completed would then have an additional "completed" statement for that particular course and registration.

I imagine other scenarios involving multiple statements for the same event arise when multiple authorities assert the same thing, or when tools deliberately report on both sides of an event (Bob mentored Sue and Sue was mentored by Bob; see http://tincanapi.com/2014/11/19/who-did-it/).

There are two approaches we considered for handling this, and a third which I thought of this morning in considering your use case. It's my opinion that the best approach will vary depending on the situation, and therefore the solution to this is best handled by communities of practice in their recipes.

  1. The reporting tool expects the possibility of multiple statements and is designed to handle it. Perhaps it reports on only the first or most recent statement with a matching actor, verb, activity id and registration. This approach wouldn't be suitable, for example, in the case of the Tetris prototype (http://tincanapi.com/prototypes/), where multiple attempts occur within a single launch.
  2. The activity provider queries the LRS for a matching statement before issuing a new statement. This adds the overhead of an extra GET call and some time delay before issuing the statement, but depending on the exact scenario that might not be a problem.
  3. My last option is more in line with what you suggested. Whilst I think we should avoid adding additional requirements to the LRS if we can, I can certainly see the value of allowing an LRS to reject statements in this scenario, and actually I think this is already allowed. The specification leaves LRS permissions out of scope, so an LRS vendor could have a configurable option that allows the administrator to issue credentials that can send statements about an actor, verb, object, registration combination only when that combination does not already exist. I prefer to give the LRS administrator control in this scenario, rather than the activity definer, because the activity provider already has the power to avoid issuing duplicate statements by checking for duplicates before sending new ones, as described in item 2 above.
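Option 1 in the list above could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, and it assumes every actor is identified by an `account` object and every statement's object is an Activity with an `id`, which will not hold for all statements.

```python
from typing import Iterable


def dedup_key(statement: dict) -> tuple:
    """Build the (actor, verb, activity id, registration) key from option 1.

    Assumption: the actor is identified by an account (homePage + name).
    """
    account = statement["actor"].get("account", {})
    actor = (account.get("homePage"), account.get("name"))
    verb = statement["verb"]["id"]
    activity = statement["object"]["id"]
    # registration is optional and lives in the statement's context
    registration = statement.get("context", {}).get("registration")
    return (actor, verb, activity, registration)


def first_per_key(statements: Iterable[dict]) -> list:
    """Keep only the first statement seen for each key, preserving order."""
    seen = set()
    kept = []
    for statement in statements:
        key = dedup_key(statement)
        if key not in seen:
            seen.add(key)
            kept.append(statement)
    return kept
```

A report that wants the most recent statement instead could iterate over the statements in reverse order. As noted in item 1, neither variant suits cases where genuine repeats occur within one registration.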

The best approach will depend on the exact use case, which is why I say this should be handled by communities of practice. Even in option 3, if this functionality is not supported by the activity provider and the activity provider doesn't handle errors well, it could cause problems; the Activity Provider would need to expect and even rely on the fact that the LRS will reject the duplicates.
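For option 2, the activity provider's check before sending might start from a query URL like the one built below. This is a sketch under assumptions: the endpoint is a placeholder, the helper name is hypothetical, the actor is assumed to be identified by an `account`, and the `agent`, `verb`, `activity`, `registration` and `limit` filters are used as I understand the Statement API's GET parameters. No request is actually sent here.

```python
import json
from urllib.parse import urlencode


def duplicate_check_url(endpoint: str, statement: dict) -> str:
    """Build a GET /statements URL that looks for a matching statement."""
    params = {
        # Send only the identifying part of the actor (assumed: an account).
        "agent": json.dumps({"account": statement["actor"]["account"]},
                            separators=(",", ":")),
        "verb": statement["verb"]["id"],
        "activity": statement["object"]["id"],
        "limit": 1,  # one hit is enough to know a match exists
    }
    registration = statement.get("context", {}).get("registration")
    if registration:
        params["registration"] = registration
    return endpoint.rstrip("/") + "/statements?" + urlencode(params)
```

If the response to that query contains any statement, the activity provider would skip sending the new one. There is still a window between the check and the write, so two providers racing each other could both send.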

@andyjohnson
Contributor

As a general rule, I think this largely falls back on the tool or AP in the
event of duplicate Statements. As Andrew says, making queries before
issuing Statements is a good practice and we do have the option of voiding
Statements as well. I think that adding querying to LRS rejections starts
a bad trend of dumping extra processes on the LRS.
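As a rough illustration of the voiding option mentioned above, an AP that discovers it has sent a duplicate could issue a voiding Statement along these lines. The helper name is hypothetical; the verb IRI and the StatementRef object follow the spec's voiding mechanism as I understand it.

```python
import uuid

VOIDED_VERB = "http://adlnet.gov/expapi/verbs/voided"


def voiding_statement(actor: dict, statement_id: str) -> dict:
    """Build a Statement that voids the Statement with the given id."""
    return {
        "id": str(uuid.uuid4()),  # the voiding Statement gets its own id
        "actor": actor,
        "verb": {"id": VOIDED_VERB, "display": {"en-US": "voided"}},
        # The object points at the duplicate rather than at an Activity.
        "object": {"objectType": "StatementRef", "id": statement_id},
    }
```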

Jason, to understand your use case better - is the system not allowing
retweets? I'd think you'd either coin the retweeted verb or use a
StatementRef from a new actor to the original tweet. If the same user
tweeted the same thing, wouldn't that be a different event? If it is an
ownership issue, I'd think that the AP could determine the owner of a
tweet. Again, I could just be missing the boat on what you are getting
at. Could you elaborate?

Thanks!

Andy

@bscSCORM
Contributor

I also generally think this falls on whatever is making the statements to avoid "duplicates" -- which I put in quotes because a big reason I'd be concerned about putting this in the LRS is the trouble with defining the term duplicate. (For the moment we have defined that as "two statements with the same ID".) That being the case, there is a way to align your concept of duplicates with that definition: note that each statement has an ID which must be a UUID, and that the UUID RFC (RFC 4122) defines two name-based versions; see "4.3. Algorithm for Creating a Name-Based UUID".

Thus, provided you can define a deterministic process for distilling down what you consider to be "unique" about a tweet (and nothing else) into a single set of bytes, you could generate a name-based UUID for the statement about that tweet, and the LRS would now recognize your definition of "duplicate", because you would in fact have duplicate IDs for any statements you considered duplicate.

That's a hoop to jump through for sure, but I think it's a much smaller one than specifying the various useful options for what is considered duplicate (and agreeing on that list), and more portable than customizing your LRS.
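A minimal sketch of the name-based approach, using Python's standard `uuid` module. The namespace URL, the choice of fields, and the newline separator are all assumptions made for illustration; what counts as "unique" about a tweet is exactly the decision the text above leaves to whoever is generating the statements.

```python
import uuid

# Hypothetical namespace: any fixed UUID the statement producers agree on.
TWEET_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.com/xapi/tweets")


def statement_id(actor_home_page: str, actor_name: str,
                 verb_id: str, activity_id: str) -> str:
    """Derive a deterministic (version 5, SHA-1) UUID for a statement.

    Assumption: none of the fields contains a newline, so joining on
    newlines gives an unambiguous byte string.
    """
    name = "\n".join([actor_home_page, actor_name, verb_id, activity_id])
    return str(uuid.uuid5(TWEET_NAMESPACE, name))
```

Two providers that report the same tweet with the same inputs would then produce the same statement id, so the second write collides with the first instead of creating a new statement.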

@aaronesilvers
Contributor

Are multiple objects allowed in a single statement? I ask because there are some use cases brought to my attention that come with working with sensors where one action may involve several objects, each with different possible results, even though the actual activity may be a singular activity.

@garemoko
Contributor

@bscSCORM - I hadn't thought about name-based UUIDs. Again, the process you describe would be something for the CoP to document in a recipe. Presumably there'd need to be a distinct name string for each action relating to a tweet, so a slightly different string for tweeting a tweet vs. retweeting, deleting or favouriting that tweet.

@aaronesilvers You can do stuff with sub-statements and contextActivities, but that may not be what you're looking for. It may be that multiple statements around different aspects of the experience/event are the best approach. Have you got more details?

@bscSCORM
Contributor

@garemoko It depends on the scope whether a CoP needs to be involved -- name-based UUIDs include the concept of a 'namespace' alongside the name, so as long as that is applied properly anyone can take that approach on their own (of course the LRS won't detect duplicates for them outside of their namespace, but that's how it's supposed to work). So, if one wishes to avoid making duplicate statements about tweets where multiple organizations are making such statements, a CoP would be necessary.
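The namespace point can be illustrated with a short sketch. The per-organization namespace derivation and the domain names are hypothetical; the only claim is the standard behaviour of name-based UUIDs, where the same name under different namespaces yields different ids.

```python
import uuid


def org_namespace(domain: str) -> uuid.UUID:
    """Derive a namespace UUID from an organization's domain (an assumption)."""
    return uuid.uuid5(uuid.NAMESPACE_DNS, domain)


def namespaced_statement_id(domain: str, name: str) -> str:
    """Name-based UUID for `name` inside the organization's own namespace."""
    return str(uuid.uuid5(org_namespace(domain), name))
```

Statements from one organization collide with each other as intended, while a second organization using its own namespace gets different ids for the same tweet. Sharing one namespace across organizations is the part that would need a CoP.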

@canweriotnow
Contributor

I think this comes down to a basic question of trust, one that xAPI (I think) deliberately attempts to sidestep: what is the system of record?

We don't have a concept of SoR, as statements are meant to be portable and immutable. But perhaps we should, or else we put ourselves in the following situation:

First, we leave it up to APs:

  • We need to identify untrustworthy APs
  • We need a way of rejecting statements from untrustworthy APs
  • We break the model

Or, we leave it up to LRSs:

  • LRSs must identify untrustworthy APs (or untrustworthy statements from them)
  • LRSs must identify imported statements from untrustworthy APs coming from other LRSs
  • LRSs must identify other LRSs as being untrustworthy and decide when to reject all statements from them
  • We break the model

I'm not sure of the best solution to this, but I do believe it's a chain of trust issue we need to resolve.

@canweriotnow
Contributor

@garemoko In regard to your use case, the test fix I found was creating a hash of the non-repeatable activity id and making that a unique ID, but it breaks the current spec. So definitely something to consider for 2.0.

@canweriotnow
Contributor

@aaronesilvers This is also an excellent point; I think some hash of actor, verb, and activity is needed in the long run to make this possible. For the use case, MD5 is fast and the collision probability is low.
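A hash of actor, verb, and activity along those lines might look like the sketch below. The function name and the canonical JSON serialization are assumptions, as is identifying the actor by its `account`; MD5 is used here only as a fast fingerprint for matching, not for any security purpose.

```python
import hashlib
import json


def statement_fingerprint(statement: dict) -> str:
    """MD5 hex digest over the actor account, verb id and activity id."""
    parts = [
        statement["actor"]["account"],
        statement["verb"]["id"],
        statement["object"]["id"],
    ]
    # sort_keys makes the digest independent of key order in the account
    canonical = json.dumps(parts, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```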

@garemoko
Contributor

@canweriotnow thanks for the response. I don't see how an LRS rejecting statements from untrustworthy sources breaks the spec. Can you explain?

RE the UUID question, what you're describing seems to be similar to what @bscSCORM mentioned above about name-based UUIDs. Have I understood that right? Again, it doesn't seem to break the spec?

@canweriotnow
Contributor

@garemoko Maybe not fully "breaks", but there are a lot of "MUST accept" clauses and nothing I can see about an LRS having the discretion to reject statements other than by revoking the auth credentials of the AP in question. I think some more thought needs to go into how APs identify themselves, and the chain of trust for authentication and authorization.

@bscSCORM
Contributor

bscSCORM commented Jan 2, 2015

It's a bit buried, but note the error code section: https://github.com/adlnet/xAPI-Spec/blob/master/xAPI.md#errorcodes

"Unauthorized for the given credentials" is deliberately not defined further, and combined with "An LRS MUST return the error code most appropriate to the error condition from the list above.", I read the spec as giving the LRS a lot of freedom to reject requests for security reasons (including "we're not going to accept this particular statement based on these credentials").

Keep in mind the overall 'journaling' model of xAPI -- note how statements are immutable but 'void' is included in the spec. Also note the authority on each statement. Even though I just mentioned it being possible, in general I think it's more flexible to note that statements are all claims and treat them that way. So, there is nothing wrong with the statement "student claims that they passed the test", but it should be taken with a grain of salt when reporting or otherwise trying to infer meaning from the statements. Of course this approach runs into trouble when folks try to use an LRS directly as a reporting system, but it's a much more flexible approach to wrap a better reporting system around an LRS than to try to make an LRS a good reporting system.

Also have a look at signatures: https://github.com/adlnet/xAPI-Spec/blob/master/xAPI.md#signature
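If an LRS does use that freedom to reject statements, the activity provider has to tell a policy rejection apart from other failures. A sketch of that classification, with hypothetical outcome labels and status meanings taken from the spec's error code list as I read it (403 for "unauthorized for the given credentials", 401 for refused credentials, 409 for a conflicting statement id):

```python
def classify_response(status: int) -> str:
    """Map an LRS response status to a coarse outcome for the AP."""
    if status in (200, 204):
        return "stored"              # POST returns 200, PUT returns 204
    if status == 401:
        return "bad-credentials"     # credentials missing or refused
    if status == 403:
        return "rejected-by-policy"  # credentials valid, action not allowed
    if status == 409:
        return "conflict"            # same id already holds a different statement
    return "error"
```

An AP that relies on the LRS to reject duplicates would treat "rejected-by-policy" and "conflict" as expected outcomes rather than as failures to retry.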

@garemoko
Contributor

garemoko commented Jan 5, 2015

@canweriotnow I've raised making this clearer as a separate issue. Thanks for flagging this!
