Simultaneous calls to the same action can cause early returns/continues with the wrong results #493
Comments
It might also be worth noting that, while this is generally an edge-case, deep extensions of the sort that #425 aims to enable might be more prone to triggering these sorts of issues if they are slow and highly asynchronous. |
Some notes about thoughts I’ve had since I first posted this issue (or, an expansion of stuff from #561):
I think (1) or some flavor of it is essential. No matter what, it’s likely that some actions will have to call multiple operations in the Electron process and those operations could collide. Even if everything in Nightmare’s core is extra careful, plugins could easily trip over things.
Given how often people run into problems with loops (#533, #526, #522 to name just a few recent ones) and how complex explaining the problem and various solutions is, I think (3) above or a variation on it is also a good idea. If all actions go onto a single, long-lived queue, even after the queue starts executing, that will completely eliminate looping issues.
As much as I like the idea of the power-user capability of being able to do multiple operations at once in the same window, I’m not sure there are a lot of concrete, real-world use cases for it. If there are good use-cases, I think they are sufficiently uncommon as to justify requiring a bit more work to enable them. Maybe an API like |
I agree. Having instances (whatever the final instance implementation is) be independent including the messaging bus is important.
I tentatively agree. The question that immediately jumps to my mind is handling something like:
nightmare
.goto(url)
.someOtherAction()
.evenMoreAction()
.then(function(results){
return nightmare.goto(anotherUrl)
.yetAnotherAction()
// etc
})
I'm guessing you'd expect the first three actions to execute prior to the actions in This could also lead to unintended consequences with memory for sufficiently large sets, but I'm okay (at least for the moment) with assuming that if you're using Nightmare to do an arbitrarily large number of tasks, you already Know What You're Doing™.
I would be inclined to support something closer to async's
nightmare
.goto(someUrl)
.parallel([
nightmare.action().anotherAction(),
nightmare.action().yetAnotherAction(),
nightmare.anotherActionStill().somethingElse()
])
.then(function(results){
//results is an array of results.
//could also support named hashes.
});
Thoughts? |
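To make the "results is an array of results" idea concrete, here is a rough sketch of the semantics only. It is not Nightmare code: `parallel` and `chain` are hypothetical stand-ins, and each "chain" is just a promise that resolves after a delay. It assumes the built-in `Promise.all` is an acceptable model for collecting results in declaration order.

```javascript
// Hypothetical stand-in for an independent chain of actions:
// resolves with its own name after `ms` milliseconds.
function chain(name, ms) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(name); }, ms);
  });
}

// Hypothetical `parallel`: starts every chain at once and resolves with an
// array of results in the order the chains were listed, not the order in
// which they finished.
function parallel(chains) {
  return Promise.all(chains);
}

const parallelDone = parallel([chain('a', 30), chain('b', 10), chain('c', 20)])
  .then(function (results) {
    console.log(results); // listed order: a, b, c
    return results;
  });
```

Note that this says nothing about whether the chains are safe to run against the same window at once, which is the actual subject of this issue.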
Since we are in agreement that (1) should happen and since it resolves the critical safety issue, some notes on that first: High-level goals:
Proposed API:
// user process
this.child.send('command-name', arg1, arg2, ..., function(error, result) {
// handle results
});
// electron process
parent.on('command-name', function(arg1, arg2, ..., done) {
// do some stuff
done(error, result);
});
This uses callbacks in order to be consistent and composable with Nightmare’s action API. That way, you can just pass the callbacks along:
actions.useragent = function(useragent, done) {
debug('.useragent() to ' + useragent);
this.child.send('useragent', useragent, done);
};
(If we wanted to change Nightmare’s API for actions to be promise-based, then I’d say this new API should be promises, too, but I don’t think anybody is pushing to do that). On handling progress data,
// user process
this.child.send('command-name', arg1, arg2, ..., function(error, result) {
// handle results
})
.on('data', function() {
// do something on progress
});
// electron process
parent.on('command-name', function(arg1, arg2, ..., done) {
done.emit('data', 'step 1');
done.emit('data', 'step 2');
...
done(error, result);
});
Internally, every call to On the electron side, a handler for Will post on queueing stuff in a bit. |
Heeeeeeere we go...
Yes.
I’m struggling to follow what you’re getting at here and I may be about to go way off the rails (sorry if so). Slightly reconfigured:
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2();
});
nightmare
.action3()
.then(function then2 (results) {...});
Are you asking whether
Time
↓ action1
↓ then1
↓ └─ action2
↓ action3
↓ then2
// or
↓ action1
↓ ├──── (schedules) ────┐
↓ action3 then1
↓ └──── (schedules) ────┼──────┐
↓ action2 ─────────────────┘ then2
At the very least, if That is to say, I do not think this would be acceptable because it either violates the expectation of promises/then or puts you in a deadlock (can't continue to action3 until action2 is done but action2 can't be performed until after action3):
(Except the above is what would theoretically happen if
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2();
})
.then(function then2 (results) {
return nightmare.action4();
});
nightmare
.action3()
.then(function then3 (results) {...});
So. I think I would go with the second flow of events above (with the very complicated chart) because the implementation seems very straightforward. Except. Complicated flow. :(
But then I want to imagine a magical land of fairies and unicorns where this works:
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2();
})
.action3()
.then(function then2 (results) {...});
Which begs the question: is that different than the above code snippet? Reading it naively, it would definitely imply behavior like the first timing diagram above. You would really expect
↓ action1
↓ └── (schedules) ──┐
↓ then1
↓ action2 ─────────────┘
↓ └── (schedules) ────────┐
↓ (implicit then)
↓ action3 ───────────────────┘
↓ └── (schedules) ──────────────┐
↓ then2
I think (if I’m thinking right, but I’m getting a headache now) that emulates the behavior of the first timing chart but under the implementation that matches the second (very complicated timing chart). Or, under the first technical approach, the two code snippets function the same, while they function differently under the second. How is that really different? Well... (deep breath)
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2(); // implicit then caused by return
})
.action3()
.then(function then2 (results) {...});
nightmare
.action4()
.then(function then3 () {...});
Would be:
And this is starting to get crazy enough that I’m almost beginning to regret coming down this path. Sigh. Or were you getting at what happens here?
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2();
});
nightmare
.action3()
.action4()
.then(function then2 (results) {...});
Being one of:
And I’m pretty sure I’ve gotten myself lost. Fresh eyes in the morning. Not going to get to explicit parallelism yet. |
process/execution safety
I agree with just about everything you said, including that this is a safety problem first and foremost. There are a few (many?) devils in the details, but I don't think they're worth touching until development is underway. I really like the idea of progress events, especially for debugging. Implementing something like that would have made sussing out several bugs much easier. It also adds a new layer of power for actions/plugins. I do have a question: how do you propose to have callbacks callable child->parent? I don't think you can call back like that, at least not without writing something reasonably fancy to manage the events for you. I feel like this is what you were getting at but never explicitly said it. I'm also feeling like maybe I've missed something important.
queues
And this is where the brain-bending really gets into full swing. It might be useful to explain what I was driving at: Skipping ahead a bit, that buys having the whole of the Nightmare API hung off of the There are a couple of things with this approach that bother me. For one, I think the Nightmare instance doing the queueing would arguably need to be a full-on capital-P Promise implementation (whether that's done explicitly or with inheritance). That would let it remain compatible with existing tools like the Second, and maybe more importantly, when does execution start if not with Moving on (well, backwards really), and borrowing your example here, and trying to clear up how this works in my brain:
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2();
});
nightmare
.action3()
.then(function then2 (results) {...});
The resulting queue would look like:
Which if I'm reading your diagrams properly, should match up with:
With respect to actions that return actions, I'd think that it wouldn't add an implicit then and instead alter the queue. Based on my above fun magical land commentary, and again borrowing your example:
nightmare
.action1()
.then(function then1 (results){
return nightmare.action2(); // implicit then caused by return
})
.action3()
.then(function then2 (results) {...});
nightmare
.action4()
.then(function then3 () {...});
(Aside: you're right, this gets nightmarish [har] in a hurry.) I would expect this to look like:
... I think. I'm getting turned around trying to puzzle through this. Dearth of coffee. And ultimately, I'm still head-tilting at arranging Nightmare calls in this way. It would run deterministically (unlike the wild of now), but I still don't think it would Do What You Mean™. A naive reading of the above makes it look like you're doing two completely independent chains of actions on a single nightmare instance, and based on what we're describing here, that's just not the case. (Nor should it be.)
With that said, now I'm starting to question what the ultimate goal is with the changes to queueing. I would advocate for 1) deterministic results and 2) easily usable/intuitive calls from
Re parallelism, yeah, that gets into a bit of a different bailiwick, but my two previous rules should still apply. I can think of ways to accomplish this, but I would like to see the queueing problem solved first. If I'm thinking about that properly, parallelism might very well fall out for cheap. Like you said, the implementation doesn't need to have it day zero, but I wouldn't want to completely close the door on it.
Oooookay. I've re-read your comments several times now, and my thinking parts are turning into mush. I know I've only scratched the surface here. I'll continue to let this marinate and comment again if I have more thoughts. |
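For anyone trying to picture the "single, long-lived queue" idea, here is a minimal sketch. It is not Nightmare's implementation: `createQueue`, `enqueue`, and `step` are hypothetical names, and actions are modelled as plain functions returning promises. The only point it illustrates is that actions added after execution has started (including from inside a `then`) go to the back of the same queue, so the order is deterministic.

```javascript
// Hypothetical serial queue: every action is chained onto one long-lived
// promise, so at most one action runs at a time.
function createQueue() {
  let tail = Promise.resolve();
  return {
    enqueue: function (action) {
      const result = tail.then(function () { return action(); });
      // Swallow failures on the internal chain so one failed action does
      // not block later ones; the caller still sees the rejection on `result`.
      tail = result.catch(function () {});
      return result;
    }
  };
}

const order = [];

// Stand-in for an action: records its name after `ms` milliseconds.
function step(name, ms) {
  return function () {
    return new Promise(function (resolve) {
      setTimeout(function () { order.push(name); resolve(name); }, ms);
    });
  };
}

const queue = createQueue();
queue.enqueue(step('action1', 30)).then(function then1() {
  // Queued only once action1 has finished, so it lands behind action3.
  return queue.enqueue(step('action2', 5));
});
queue.enqueue(step('action3', 10));

setTimeout(function () { console.log(order); }, 100);
```

Under these assumptions the actions run as action1, action3, action2, which corresponds to the second (more complicated) flow discussed above. Whether that is what a user would expect is exactly the open question.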
This addresses one aspect of segment-boneyard#493. The basic idea here is that Nightmare’s IPC objects gain two methods, each to be used in opposite processes.
- `ipc.respondTo(name, responder)` Registers a function that can respond to calls from another process. The `name` is the handle by which other processes call it. The responder is a function that takes any number of arguments, where the last argument is a callback. This callback should be called when the responder’s work is complete. Any arguments will be passed back to the caller in the other process. In addition, the callback has a method named `progress` that can be called to emit `data` events in the other process.
- `ipc.call(name, [arg, [arg, [...]]], [callback])` Calls a responder with the given `name` in another process. Any arguments between the `name` and `callback` are passed as arguments to the responder. The callback will be called when the responder’s work is finished. In addition, `ipc.call` returns an event emitter that can be used to handle `data` events indicating progress from the responder or a single `end` event, which will be called with the same arguments as and immediately prior to `callback`.
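The description above can be sketched in a single Node process. This is not the code from the referenced change: `createIpc` is a hypothetical helper, the two `EventEmitter` objects stand in for the real channel between the processes, and the message names (`call`, `progress`, `result`) and the per-call `id` are assumptions made for illustration. The idea shown is that each call gets its own identifier, so a response can only reach the caller that asked for it.

```javascript
const { EventEmitter } = require('events');

// `incoming` carries messages from the other side, `outgoing` carries
// messages to it. Both are stand-ins for a real inter-process channel.
function createIpc(incoming, outgoing) {
  const responders = {};
  const pending = {};
  let nextId = 0;

  incoming.on('call', function (id, name, args) {
    const responder = responders[name];
    if (!responder) {
      outgoing.emit('result', id, ['No responder named "' + name + '"']);
      return;
    }
    // The callback handed to the responder reports back under this call's id.
    const done = function (...results) { outgoing.emit('result', id, results); };
    done.progress = function (...data) { outgoing.emit('progress', id, data); };
    responder(...args, done);
  });

  incoming.on('progress', function (id, data) {
    if (pending[id]) pending[id].emitter.emit('data', ...data);
  });

  incoming.on('result', function (id, results) {
    const call = pending[id];
    if (!call) return;
    delete pending[id];
    call.emitter.emit('end', ...results); // `end` fires just before the callback
    if (call.callback) call.callback(...results);
  });

  return {
    respondTo: function (name, responder) { responders[name] = responder; },
    call: function (name, ...args) {
      const last = args[args.length - 1];
      const callback = typeof last === 'function' ? args.pop() : null;
      const id = ++nextId;
      const emitter = new EventEmitter();
      pending[id] = { emitter: emitter, callback: callback };
      // Send on a later tick so the caller can attach listeners first.
      setImmediate(function () { outgoing.emit('call', id, name, args); });
      return emitter;
    }
  };
}

// Usage: two overlapping calls to the same responder no longer collide.
const toChild = new EventEmitter();
const toParent = new EventEmitter();
const parentIpc = createIpc(toParent, toChild);
const childIpc = createIpc(toChild, toParent);

childIpc.respondTo('echo', function (value, delayMs, done) {
  done.progress('working on ' + value);
  setTimeout(function () { done(null, value); }, delayMs);
});

const log = [];
parentIpc.call('echo', 'slow', 50, function (err, r) { log.push('slow got ' + r); })
  .on('data', function (msg) { log.push(msg); });
parentIpc.call('echo', 'fast', 10, function (err, r) { log.push('fast got ' + r); });

setTimeout(function () { console.log(log); }, 100);
```

In this sketch the slow caller receives its own result even though the fast call finishes first, which is the safety property this change is after.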
BOOM: #579. Now we’ve got a separate place to discuss that one. |
After reading everything you said, I still don't understand why I can't run multiple instances of new Nightmare() on the same machine. |
@GautierT I think you might be a bit mixed up here. Nothing on this issue is about problems running multiple instances of Nightmare at once (in fact doing so solves the issues here; this is about doing multiple sets of actions with one Nightmare instance). Your problem sounds a lot like #612. One thing you might want to try is using random partition names or memory partitions, like so:
var nightmare = Nightmare({
// Make sure that, not only are we using a memory partition, but that
// it is probably a different one than any we've used before.
webPreferences: {partition: 'custom-partition' + Math.random()}
});
If that doesn’t help, you might want to make a separate issue, since your problem is definitely off-topic for this thread. |
@Mr0grog: Ok. Sorry for the misunderstanding! |
@GautierT No worries! I hope you get it figured out. |
Hey guys, thanks for all the feedback here, but I'm going to close this one. Having 2 queues working on the same page in electron is race-y in a number of ways. I'd like us to move towards what puppeteer does, where it creates a new page for each browser instance. Something like this:
const nightmare = await Nightmare()
const browser1 = await nightmare.page()
const browser2 = await nightmare.page()
Honestly when I first re-wrote nightmare to use electron – I messed this up badly. Sorry for the pain and time 😅 |
Originally found as a byproduct of #465, also noted over in the discussion on #491.
If two queues on the same Nightmare instance are operating simultaneously and both call the same action (e.g. `evaluate()`), it is possible for one call to pick up the results from the first call and finish early, continuing its queue with the wrong data. Here's an example from #465:
Or with typing:
Obviously there are more complicated questions with that one, but the salient bit here is that the queue typing "github nightmare" returned early. You can watch the window and see that it is still typing after the results are printed to the console.
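The mechanism can be reproduced without Nightmare at all. The following is not the example from #465, just a minimal simulation: `channel` is a plain Node `EventEmitter` standing in for the link to the Electron process, and the event names and the `evaluate` helper are made up for illustration. It assumes each action waits for "the next result event with this name", which is what lets one caller pick up the other's response.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the channel between the user process and Electron.
const channel = new EventEmitter();

// "Electron side": answers every request on one shared event name.
channel.on('evaluate', function (value, delayMs) {
  setTimeout(function () { channel.emit('evaluate-result', value); }, delayMs);
});

// "User side": a naive action that takes whichever result arrives next.
function evaluate(value, delayMs, done) {
  channel.once('evaluate-result', done);
  channel.emit('evaluate', value, delayMs);
}

const results = {};
evaluate('slow', 50, function (r) { results.slow = r; });
evaluate('fast', 10, function (r) { results.fast = r; });

setTimeout(function () {
  // Both callers were completed by the first response to arrive,
  // so the slow caller finished early with the fast caller's data.
  console.log(results);
}, 100);
```

Here the slow call's callback fires after roughly 10 ms with the value meant for the fast call, and the real slow response later arrives with nobody listening.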
You can do this with `goto()`, too, but that one's also somewhat complicated by the fact that `goto()` actually doesn't know how to fail (which is a different issue altogether). But it would fail to fail (ha!) if navigation were pre-empted by a second, simultaneous `goto()` call.
Obviously this is a little edge-case-y, but I suspect users will hit things like this more now that the "official" way to use Nightmare is via the `then()` API instead of with generators (where you have to go out of your way to trigger simultaneous actions).
There are a few potential ways I can think of to fix this.
`type()`. On the other hand, you might want most actions to be able to be simultaneous and `type()` might just be a little special.
`run()` call would queue after `run`'s callback (or possibly immediately before the callback, but after the actions it was running).