GPII-3193: The proper implementation of noUser keyin #653

Merged
merged 28 commits into GPII:master from cindyli:GPII-3193 on Sep 5, 2018

@cindyli (Contributor) commented Aug 18, 2018

No description provided.

cindyli added some commits Jul 26, 2018

GPII-3193: Lifecycle manager now manages a set of dynamic components that handle user logon requests. This implementation separates the logon handling processes out of kettle endpoint request handlers.
GPII-3193: Added setTimeout() for keying in with noUser when GPII has no active key. This is to work around the issue that the /device kettle request sourced from noUser key-in somehow binds with the previous logout kettle request that has been destroyed and causes failure.
GPII-3193: Removed the unnecessary dynamic user logon request indexer as well as moving UserLogonStateChange.js from flowManager to lifecycleManager.

@cindyli requested a review from amb26 Aug 18, 2018

@gtirloni (Contributor) commented Aug 20, 2018

ok to test

@@ -20,19 +20,18 @@ var fluid = require("infusion");
var gpii = fluid.registerNamespace("gpii");

fluid.defaults("gpii.flowManager.getGpiiKey.handler", {
gradeNames: ["kettle.request.http", "gpii.flowManager.sessionAware"],
gradeNames: ["kettle.request.http"],

@amb26 (Member) commented Aug 20, 2018

I don't believe this handler is required any more and as part of our general spring cleaning of unnecessary HTTP endpoints this would be a good opportunity to remove it.

@amb26 (Member) commented Aug 20, 2018

I know that it is used in several of our HTTP-based tests, but we will need to switch to doing some white-box testing anyway (in order to test the proximity user A -> user B transition), so switching away from using this endpoint fits well with these other testing requirements.

},
model: {
lastNonNoUserLogon: { // Holds information about the last non-noUser login/logout action

@amb26 (Member) commented Aug 20, 2018

"currentUserLogon" is probably a clearer name. The fact that it only records logons of actual users is ok to leave just in documentation.

isProcessingQueue: false // if queue is currently being processed
actionQueue: [],
userLogonRequestQueue: [],
isProcessingActionQueue: false, // if action queue is currently being processed

@amb26 (Member) commented Aug 20, 2018

See if we can do without these flags - every bit of shared mutable state carries an extra risk of races. Is it not enough to check whether the length of the queue is 0?

@cindyli (Author, Contributor) commented Aug 23, 2018

Tests show these flags are still required. Since the processing of a user logon request or an action is asynchronous, the flag is needed to mark the period from when a queue item has been removed from the queue for processing until that processing completes.

Here's an example: deciding whether a "noUser" keyin should be added to the queue checks both that the length of the queue is 0 and the isProcessingUserLogonRequestQueue flag. This covers the case where the last request on the queue, for example an actual user keyin, has already been removed from the queue (so the queue length is 0) but is still being processed (isProcessingUserLogonRequestQueue === true); in that case the "noUser" keyin should not be added to the queue.
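
A minimal sketch of that guard (the function and member names here are illustrative, not the PR's actual code):

```javascript
// Only queue a "noUser" keyin when the queue is empty AND nothing is mid-flight;
// checking the queue length alone would miss an item that has already been
// shifted off but whose asynchronous processing has not yet finished.
function maybeQueueNoUserLogon(lifecycleManager) {
    var model = lifecycleManager.model;
    if (model.userLogonRequestQueue.length === 0 && !model.isProcessingUserLogonRequestQueue) {
        lifecycleManager.addToUserLogonRequestQueue({
            type: "login",
            gpiiKey: "noUser"
        });
    }
}
```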

@amb26 (Member) commented Aug 23, 2018

Couldn't we resolve this by only removing an item from the queue when it has finished processing, rather than when it starts? This would have the additional benefit of having an obvious public place to look in order to determine which item is currently being processed
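
A sketch of that alternative, simplified to use a plain array member rather than the component model (processLogonRequest is an assumed helper returning a promise):

```javascript
// The head of the queue stays in place while it is being processed and is only
// shifted off once its promise settles, so queue[0] always identifies the item
// currently in flight and no separate isProcessing flag is needed.
function processNext(that) {
    if (that.userLogonRequestQueue.length === 0) {
        return;
    }
    var current = that.userLogonRequestQueue[0];      // leave it on the queue while it runs
    that.processLogonRequest(current).then(function () {
        that.userLogonRequestQueue.shift();           // remove only once processing has finished
        processNext(that);                            // then move on to the next queued item
    });
}
```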

// The grade to handle user login requests. Distributed from the GPII config files.
// In all-in-local config, the handler is "gpii.lifecycleManager.userLogonHandling.matchMakingStateChangeHandler".
// In untrusted config, the handler should be "gpii.lifecycleManager.untrusted.stateChangeHandler".
logonStateChangeHandler: "gpii.lifecycleManager.userLogonHandling.matchMakingStateChangeHandler",

@amb26 (Member) commented Aug 20, 2018

Best to just define this in the relevant components themselves rather than make an extra place people need to look in order to find out what grades something has. And turn all the definitions of this into options distributions.
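
A sketch of what such an options distribution could look like (the grade names other than gpii.lifecycleManager.untrusted.stateChangeHandler, and the IoCSS target, are assumptions for illustration):

```javascript
// Instead of a logonStateChangeHandler option that readers must chase down, the
// untrusted configuration distributes its handler grade directly onto the
// component that needs it.
fluid.defaults("gpii.lifecycleManager.untrustedConfig", {
    gradeNames: ["fluid.component"],
    distributeOptions: {
        distributeStateChangeHandler: {
            record: "gpii.lifecycleManager.untrusted.stateChangeHandler",
            target: "{that gpii.lifecycleManager.userLogonHandling}.options.gradeNames"
        }
    }
});
```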

@amb26 (Member) commented Aug 27, 2018

This string still remains

var fluid = require("infusion");
var gpii = fluid.registerNamespace("gpii");

fluid.defaults("gpii.flowManager.baseHandler", {

@amb26 (Member) commented Aug 20, 2018

Explain what this is for and what handlers it is the base for

gradeNames: ["kettle.request.http", "gpii.flowManager.baseHandler"],
debounceTimeMs: 1500,
invokers: {
handleRequest: {

@amb26 (Member) commented Aug 20, 2018

Whilst this has been partially refactored, we still have the situation that it's impossible to trigger, e.g., a proximityLogon except via an HTTP request to localhost, which we need to get rid of. The entry point to this method should be refactored so that it is simply an invoker or event, and then this HTTP endpoint will forward to it.

};

// Define the grade to be used for key-in and key-out
gpii.lifecycleManager.userLogonRequest.handlerGrades = fluid.freezeRecursive({

@amb26 (Member) commented Aug 20, 2018

No great reason to freeze this - one of the principal values of having it in a public structure is that someone might conceivably want to patch these

@@ -63,7 +63,7 @@ https://github.com/GPII/universal/blob/master/LICENSE.txt
var lifecycleManager = gpii.lifecycleManager();
jqUnit.expect(1);

var queuePromise = lifecycleManager.addToQueue(gpii.tests.lifecycleManager.queue.createCustomTask(1, fluid.identity, 1));
var queuePromise = lifecycleManager.addToActionQueue(gpii.tests.lifecycleManager.queue.createCustomTask(1, fluid.identity, 1));

@amb26 (Member) commented Aug 20, 2018

We need to add corresponding tests now for the other queue, the queue of user logon actions. In particular, somewhere (either in these new tests, or in the existing UserLogonTests) we need to test the headline feature of this branch - that a proximity triggered request for user B when user A is logged on does not cause a temporary intermediate state of "noUser" logged on.

}

// handle getGpiiKey requests
if (logonState === "getGpiiKey") {

@amb26 (Member) commented Aug 20, 2018

I don't think the extra complexity of this queue "action" type is needed - we always know synchronously who is logged on. This can just become a simple invoker on the component. Note that withSession is i) always synchronous, ii) will be eliminated in the next refactoring once we remove the possibility for multiple active sessions.
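
A sketch of the simpler shape being suggested (the invoker name and the activeSessions member are hypothetical):

```javascript
// Because the set of active keys is always known synchronously, the component
// can expose it through a plain invoker rather than a queued "getGpiiKey" action.
fluid.defaults("gpii.lifecycleManager", {
    invokers: {
        getActiveGpiiKeys: {
            funcName: "gpii.lifecycleManager.getActiveGpiiKeys",
            args: ["{that}"]
        }
    }
});

gpii.lifecycleManager.getActiveGpiiKeys = function (that) {
    // whatever synchronous record of logged-on sessions the component already holds
    return fluid.keys(that.activeSessions);
};
```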

}
```

The active GPII key request item in the queue should have the following format:

@amb26 (Member) commented Aug 20, 2018

I don't think this needs to be a queued action - the user might well be interested in knowing what the logon state is now rather than at some time in the future when the system has stopped doing things.

* * If GPII key is "reset", log out any already logged in user.
*/
gpii.flowManager.userLogin.handleRequest = function (gpiiKey, request, lifecycleManager) {
var loginPromise = lifecycleManager.addToUserLogonRequestQueue({

@amb26 (Member) commented Aug 20, 2018

These handlers are very nearly refactored compared to the proximity one, but they still need to be pushed the final step to where the login/logout action just consists of an invoker or event firing, to prepare for the day when we stop having HTTP endpoints at all. Also, note that the logout handler is identical except for the value of one string -> combine them.
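
A sketch of the suggested consolidation (the handler grade name and the logonAction option are hypothetical; addToUserLogonRequestQueue comes from the diff above):

```javascript
// One handler grade serves both endpoints; the logout endpoint only needs to
// supply a different value for logonAction (e.g. via an options distribution).
fluid.defaults("gpii.flowManager.userLogonChange.handler", {
    gradeNames: ["kettle.request.http"],
    logonAction: "login",   // the logout variant distributes "logout" here
    invokers: {
        handleRequest: {
            funcName: "gpii.flowManager.userLogonChange.handleRequest",
            args: ["{request}.req.params.gpiiKey", "{that}.options.logonAction", "{request}", "{lifecycleManager}"]
        }
    }
});

gpii.flowManager.userLogonChange.handleRequest = function (gpiiKey, logonAction, request, lifecycleManager) {
    var promise = lifecycleManager.addToUserLogonRequestQueue({
        type: logonAction,
        gpiiKey: gpiiKey
    });
    promise.then(request.events.onSuccess.fire, request.events.onError.fire);
};
```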


// Checking lifecycleManager.model.logonChange.inProgress is to make sure during the period to finish
// this request handling, there isn't another user logon request being added to the queue and in process.
if (activeGpiiKeys && activeGpiiKeys.length === 0 && !lifecycleManager.model.logonChange.inProgress) {

@amb26 (Member) commented Aug 24, 2018

Is this last check necessary? It seems that checking the queue length would be a more reliable test (for the time being)

@cindyli (Author, Contributor) commented Aug 24, 2018

The last check is necessary because of the use of setTimeout() or fluid.invoke(), which pulls the function inside it out into an asynchronous process.

This check covers the case where the function within setTimeout() runs concurrently with the next logon request in the queue: the queue has become empty, but the key in the next logon request is not yet considered active.

@amb26 (Member) commented Aug 24, 2018

But that isn't the case any more after the last refactoring, right? Since the queue always has at least one element in it so long as it is active at all?

@cindyli (Author, Contributor) commented Aug 24, 2018

This check can still capture a case that remains after the last refactoring.

For example: a keyin request is added to the queue right after "noUser" keys out. Between the "noUser" keyout and the keyin, the queue was empty and the function within setTimeout() was triggered. When the keyin is added, its processing runs concurrently with the setTimeout() function, which would add the "noUser" keyin back onto the queue if the last check on lifecycleManager.model.logonChange.inProgress were removed.

@amb26 (Member) commented Aug 24, 2018

It all seems quite risky. We should try to make sure that there is only one critical data structure being used at a time, and hopefully that structure is the queue. I don't feel completely confident that I understand the issue (at this point in the night), but I think that there is a fault in the implementation of performProximityTriggered, in that it explicitly waits for the conclusion of logoutPreviousUserPromise before it queues the login, whereas the whole point of the queued implementation was to ensure that we could register our login intention immediately so that the rest of the system can introspect on it. What I believe is that these two items should be added to the queue synchronously, and then hopefully races like this will go away. Please check over whether this seems sane to you ...
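
A sketch of the synchronous enqueueing described here (the activeGpiiKey model member is assumed; performProximityTriggered and addToUserLogonRequestQueue appear elsewhere in this PR):

```javascript
// Both the logout of the current user and the login of the new one are placed on
// the queue in the same synchronous step, so the pending login is visible to the
// rest of the system immediately and no other request can slip in between them.
gpii.lifecycleManager.performProximityTriggered = function (that, gpiiKey) {
    var promises = [];
    if (that.model.activeGpiiKey && that.model.activeGpiiKey !== gpiiKey) {
        promises.push(that.addToUserLogonRequestQueue({
            type: "logout",
            gpiiKey: that.model.activeGpiiKey
        }));
    }
    promises.push(that.addToUserLogonRequestQueue({
        type: "login",
        gpiiKey: gpiiKey
    }));
    return fluid.promise.sequence(promises);
};
```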

@cindyli (Author, Contributor) commented Aug 24, 2018

How can you be such a genius, @amb26??!! Great capture! Adding keyout then keyin synchronously does solve the race condition.

This pull request is ready for more reviews.

Thanks and cheers!

@amb26 (Member) commented Aug 24, 2018

Well, a real genius would have solved this problem by actually understanding it, whereas I merely reacted to what looked like a collection of correlated smells : P But I'm glad that this smell led us to the problem since I'm sure I wouldn't have spotted the problem in performProximityTriggered by simply reading it during review.

@cindyli (Author, Contributor) commented Aug 25, 2018

Sigh... This pull request is not ready yet. It was silly of me to forget about removing the last check. The race condition still persists without this check. I will re-think it through when my brain is clearer.

@cindyli (Author, Contributor) commented Aug 25, 2018

@amb26, finally fixed the race condition. In the new commit, there are some adjustments to clean up the logic here and there, but the main trick is to fix the scope of the function in fluid.invokeLater. It's important to keep the queue change (the queue.shift() call at line 171) and the actions resulting from the queue change (lines 173-189) in the same process.

This pull request is ready for another review.
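
A simplified illustration of the point about the fluid.invokeLater scope (queue and helper names are illustrative):

```javascript
// The shift and the work it triggers live inside one deferred function, so no
// other queue activity can interleave between "item removed from the queue" and
// "item acted upon".
fluid.invokeLater(function () {
    var next = that.userLogonRequestQueue.shift();   // the queue change ...
    that.processLogonRequest(next);                  // ... and its consequences, in the same process
});
```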

func: "gpii.tests.invokePromiseProducer",
args: ["{lifecycleManager}.performProximityTriggered", [gpii.tests.userLogonRequest.gpiiKey], "{that}"]
}, {
changeEvent: "{lifecycleManager}.applier.modelChanged",

@amb26 (Member) commented Aug 24, 2018

Sheer luck that this sequence works in the presence of https://issues.fluidproject.org/browse/FLUID-5502 ! It will be safer to instead register a single listener which assembles all successive changes into an array and then check the array just once with jqUnit.assertDeepEq
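
A sketch of that test pattern (the fixture variable names and the "logonChange" model path are assumptions):

```javascript
// Accumulate every model change into one array and make a single deep-equality
// assertion at the end, instead of relying on the relative ordering of many
// separate modelChanged fixtures.
var recordedChanges = [];

lifecycleManager.applier.modelChanged.addListener("logonChange", function (newValue) {
    recordedChanges.push(fluid.copy(newValue));
});

// ... later, once the test sequence has finished:
jqUnit.assertDeepEq("Logon state changes arrived in the expected order",
    expectedChanges, recordedChanges);
```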

@amb26 (Member) commented Aug 27, 2018

Please add a note in this fixture file that there is an implementation risk due to FLUID-5502 but we believe that this is currently reliable because each of the model changes is triggered asynchronously

func: "gpii.tests.invokePromiseProducer",
args: ["{lifecycleManager}.performProximityTriggered", [gpii.tests.userLogonRequest.gpiiKey], "{that}"]
}, {
changeEvent: "{lifecycleManager}.applier.modelChanged",

@amb26 (Member) commented Aug 24, 2018

FIVE passive fixtures in a row! I really wonder how this works. Perhaps it is helped by each of the actions being asynchronous as a result of your timeout.

}
});

gpii.flowManager.getGpiiKey.handleRequest = function (that, events) {

@amb26 (Member) commented Aug 24, 2018

Phew! What a relief it is gone :)


sequence: [{
func: "{loginRequest}.send"
// TODO: Wait till the noUser keyin at the GPII start completes. This is to work around an issue with

@amb26 (Member) commented Aug 27, 2018

We need to solve this by boiling the noUser logon event into a new onReady event for the whole server. This 100ms timeout will be unstable and unreliable.

@amb26 (Member) commented Aug 27, 2018

Why does this issue not affect all acceptance tests?

@cindyli (Author, Contributor) commented Aug 29, 2018

Acceptance tests need to start and load data into a pouchdb express server, which takes some time and completes much later than the noUser keyin process. The test sequence only runs once all servers are ready, by which point noUser has already keyed in.

The production config test only needs to start the local flow manager to talk to a running GPII cloud. This process doesn't include data loading etc., so the LFM server becomes ready before the noUser keyin completes.

I've improved the production config test to start the test sequence when a new aggregate event "onAllReady" is fired. The "onAllReady" event listens to both "onServerReady" and "{flowManager}.events.noUserLoggedIn".
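
A sketch of such an aggregate event using Infusion's compound event syntax (the test environment grade name is assumed; onServerReady and noUserLoggedIn are the events named above):

```javascript
// onAllReady fires only once both constituent events have fired, so the test
// sequence cannot start before the kettle server is listening and the initial
// noUser keyin has completed.
fluid.defaults("gpii.tests.productionConfigTesting.testEnvironment", {
    gradeNames: ["fluid.test.testEnvironment"],
    events: {
        onServerReady: null,    // fired when the kettle server is listening
        noUserLoggedIn: null,   // relayed from {flowManager}.events.noUserLoggedIn
        onAllReady: {
            events: {           // compound event: waits for both constituents
                onServerReady: "onServerReady",
                noUserLoggedIn: "noUserLoggedIn"
            }
        }
    }
});
```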

cindyli added some commits Aug 29, 2018

GPII-3193: Addressed review comments: improved the production config test to start the test when the kettle server is ready and noUser keyin completes. Also included a couple of other minor improvements.

@cindyli (Author, Contributor) commented Aug 29, 2018

@amb26, this pull request is ready for more review. Thanks.

@amb26 (Member) commented Aug 29, 2018

I am getting the following test failure:

17:08:17.385:  Opening Kettle Server on port 8084
17:08:17.426:  Kettle Server aip4undt-190409 is listening on port 8081
17:08:17.427:  Kettle Server aip4undt-190887 is listening on port 8084
17:08:17.435:  express pouchdb instance 'aip4undt-191278' initalizing...
17:08:17.436:  Creating directory 'C:\Users\Bosmon\AppData\Local\Temp\aip4undt-191278' for express pouchdb instance 'aip4undt-191278'...
17:08:17.443:  express baseDir: 'C:\Users\Bosmon\AppData\Local\Temp\aip4undt-191278'...
17:08:17.446:  Express server listening on port 8058
17:08:17.446:  Express started...
17:08:17.774:  Invoking handler gpii.deviceReporter.handlers.get for route /device with expectedGrade kettle.request.http
17:08:17.779:  Kettle server allocated request object with type gpii.deviceReporter.handlers.get
17:08:17.780:  Invoking handler gpii.deviceReporter.handlers.get for route /device with expectedGrade kettle.request.http
17:08:17.784:  Kettle server allocated request object with type gpii.deviceReporter.handlers.get
17:08:17.785:  Invoking handler gpii.deviceReporter.handlers.get for route /device with expectedGrade kettle.request.http
17:08:17.789:  Kettle server allocated request object with type gpii.deviceReporter.handlers.get
17:08:17.790:  Invoking handler gpii.deviceReporter.handlers.get for route /device with expectedGrade kettle.request.http
17:08:17.793:  Kettle server allocated request object with type gpii.deviceReporter.handlers.get
17:08:17.802:  Ignoring call to invoker parse of component {
    "typeName": "kettle.dataSource.encoding.JSON",
    "id": "aip4undt-189785",
    "lifecycleStatus": "destroyed",
    "destroy": { Function

    },
    "options": {
        "gradeNames": [
            "fluid.component",
            "kettle.dataSource.encoding.JSON"
        ],
        "invokers": {
            "parse": {
                "func": undefined,
                "funcName": "kettle.dataSource.parseJSON",
                "listener": undefined,
                "this": undefined,
                "method": undefined,
                "changePath": undefined,
                "value": undefined,
                "args": {
                    "value": "NO_VALUE"
                },
                "componentSource": "kettle.dataSource.encoding.JSON"
            },
            "render": {
                "func": undefined,
                "funcName": "kettle.dataSource.stringifyJSON",
                "listener": undefined,
                "this": undefined,
                "method": undefined,
       .... [output suppressed at 1024 chars - for more output, increase fluid.logObjectRenderChars] which has been destroyed
17:08:17.803:  getDeviceContext got deviceData undefined
17:08:17.803:  FATAL ERROR: Uncaught exception: Cannot read property 'isError' of undefined
TypeError: Cannot read property 'isError' of undefined
    at Array.promise.then.fluid.extend.message (e:\Source\gits\gpii\node_modules\universal\gpii\node_modules\lifecycleManager\src\UserLogonS
tateChange.js:47:1260)
    at Object.that.complete (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPromises.js:70:25)
    at Object.that.resolve (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPromises.js:52:22)
    at Function.fluid.promise.resumeSequence (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPr
omises.js:163:26)
    at Function.fluid.promise.progressSequence (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\Fluid
Promises.js:149:23)
    at Function.fluid.promise.resumeSequence (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPr
omises.js:174:31)
    at Function.fluid.promise.progressSequence (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\Fluid
Promises.js:149:23)
    at Array.<anonymous> (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPromises.js:169:35)
    at Object.that.complete (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPromises.js:70:25)
    at Object.that.resolve (e:\Source\gits\gpii\node_modules\universal\node_modules\infusion\src\framework\core\js\FluidPromises.js:52:22)
    at e:\Source\gits\gpii\node_modules\universal\node_modules\kettle\lib\dataSource-url.js:186:29
    at IncomingMessage.wrappedCallback (e:\Source\gits\gpii\node_modules\universal\node_modules\kettle\lib\KettleRequest.js:79:33)
    at emitNone (events.js:111:20)
    at IncomingMessage.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1055:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
17:08:17.804:  Current activity:
17:08:17.812:  {
    "module": "GPII",
    "event": "Error.Exception",
    "level": "FAIL",
    "data": {
        "error": {
            "stack": "TypeError: Cannot read property 'isError' of undefined\n    at Array.promise.then.fluid.extend.message (e:\\Source\\gi
ts\\gpii\\node_modules\\universal\\gpii\\node_modules\\lifecycleManager\\src\\UserLogonStateChange.js:47:1260)\n    at Object.that.complete
(e:\\Source\\gits\\gpii\\node_modules\\universal\\node_modules\\infusion\\src\\framework\\core\\js\\FluidPromises.js:70:25)\n    at Object.t
hat.resolve (e:\\Source\\gits\\gpii\\node_modules\\universal\\node_modules\\infusion\\src\\framework\\core\\js\\FluidPromises.js:52:22)\n
 at Function.fluid.promise.resumeSequence (e:\\Source\\gits\\gpii\\node_modules\\universal\\node_modules\\infusion\\src\\framework\\core\\js
\\FluidPromises.js:163:26)\n    at Function.fluid.promise.progressSequence (e:\\Source\\gits\\gpii\\node_modules\\universal\\node_modules\\i
nfusion\\src\\framework\\core\\js\\FluidPromises.js:149:23)\n    at Functi .... [output suppressed at 1024 chars - for more output, increase
 fluid.logObjectRenderChars]
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! gpii-universal@0.3.0 test:node: `node node_modules/nyc/bin/nyc.js node tests/all-tests.js`

The immediately previous test was

17:08:16.879:  jq: Test concluded - Module "StartupAPITests" Test name "gpii.startupAPI.tests.ConfigPath tests": 4/4 passed - PASS

@cindyli (Author, Contributor) commented Aug 30, 2018

@amb26, the test failure can be reproduced with a Windows VM. The cause is that the test run before the failing one, StartupAPITests.js, starts a GPII server and stops it immediately to test the GPII start/stop APIs. With the previous logic, noUser keys in at {flowManager}.events.onCreate, when the kettle server is not yet fully ready. Every gpii.start() call from StartupAPITests.js performed a noUser key-in that was then in process and sent an HTTP request to the http://127.0.0.1:8081/device endpoint. When this server was immediately torn down, on Mac/Linux the request to the /device endpoint is aborted. On Windows, however, this request somehow persists and gets processed by the next server started on port 8081, by UntrustedBrowserChannelTests.js. This is why the test log shows 4 /device requests being processed by UntrustedBrowserChannelTests.js, which results in the test failure; 3 of them were actually from StartupAPITests.js.

The fix is to defer the noUser keyin until the kettle server is up and running. This prevents the noUser keyin from being performed too early and sending an HTTP request to a port that isn't ready yet.
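
A sketch of the shape of that fix (the grade name, the onServerReady event, and the listener wiring are assumptions, not the PR's actual code):

```javascript
// The noUser keyin is triggered by a "server is listening" event rather than by
// onCreate, so it can never issue an HTTP request against a port that is not yet
// accepting connections.
fluid.defaults("gpii.flowManager.local", {
    listeners: {
        // previously something like: "onCreate.performNoUserLogon"
        "onServerReady.performNoUserLogon": "{that}.performNoUserLogon"
    }
});
```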

}]
}],
events: {
timeoutComplete: null

@amb26 (Member) commented Sep 2, 2018

This event is unclearly named and the reader will not be able to determine what the timeout relates to

type: "kettle.test.request.http",
options: {
path: "/user/noUser/logout"
}
}
},
events: {
timeoutComplete: null

@amb26 (Member) commented Sep 2, 2018

I see that the name of this event was inherited from the original implementation. It's a good opportunity to rename them all


@cindyli (Author, Contributor) commented Sep 4, 2018

@amb26, this pull request is ready for another look.

@amb26 merged commit 360f155 into GPII:master Sep 5, 2018

1 check passed: default (Build finished)
@cindyli (Author, Contributor) commented Oct 1, 2018

Merged at bddf0f6

@cindyli deleted the cindyli:GPII-3193 branch Oct 1, 2018
