New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request's Architecture #1094
Comments
Closing this as it's not a bug, feel free to continue discussing here. |
In a couple of words, I would describe request's architecture as "cobbled together". It's gone through significant revisions over time by people with lots of different coding styles and goals, including me. |
basically, request is a giant duplex stream. most features are implemented on some part of that duplex and manipulate state. so it's sort of a duplex-state-machine-as-a-stream. |
So as some of you know I have been doing some refactoring work for Request. |
@seanstrom we should do this in text here on GitHub. I know it takes longer than a Skype call but this stuff needs to be documented publicly and discussed a bit more openly. |
At the top of my wish list would be some way to modularize some of the crazier features out of request itself. For (a strawman) example: Each feature could be a plugin that could listen to certain request events ( Just my 2 cents :) |
(reopening because I'd like this to get a little more exposure) |
@FredKSchott 👍 this is a great idea. Having more of the flow exposed through events would make it easier to write libraries like However, it would also provide users a more powerful gun to shoot themselves in the foot. Maybe some of these event names should be prefixed with |
Yeah we can namespace the events however we see fit. To do that I feel like I need read through Request enough to make educated choices when refactoring the code to that desired state. That's what I am starting on today. |
@seanstrom Awesome! I'd suggest finding the feature that's already the most modular (auth? form/formData? aws?) and starting with that as a prototype, to get a feel for how the core request will need to be updated. Whatever work you do, we should merge it to a separate branch than master until we're confident this is the direction we want to go. There's a good chance this work will require a breaking, 3.0 type change and we don't want to bother our users with that unnecessarily. @nylen agreed, although in my mind hooking into these would be a pretty advanced feature. It would be a huge win to allow advanced users access to these "hooks", so the tradeoff might be work it. Either way, we should aim to build some degree of safety/protection into these "feature plugins" regardless, for our own collaborators / sanity. |
PS: That suggestion was just one potential way to refactor this. It might be worth the time to think of others and compare :) |
Im curious to see if a good amount of this refactoring doesn't need to break current API. Then when we do move to 3.0 we can use most of the existing structure and flog any unwanted things. The plugin idea fits in with the way we're communicating already with events so I think we definitely experiment with that. :) |
Agreed, no need to break things unless we need to |
Okay so a few things I noticed while reading through Request today.
What do you guys think? |
+1
I'm against moving towards a more object oriented plugin model. Not that what we have now is great, it's certainly a huge problem, but I also don't want to go down the "plugin system" path which is what I think of when I hear "service objects." Here's a really good example: signing. We have support for a bunch of auth signing, some of which is implemented as giant methods. The actual signing usually lives in a separate module but there is a bunch of setup we have to do in order to prepare information to be sent to the signer. The main reason signing lives in One way to modularize this would be to create an "auth object" with a unique request API and just accept any of these objects in order to do auth. Another, and IMO far more robust, approach is to have a "sign" property which signing code sets to a function and the pre-header-send logic calls that function if available, ensuring it's a single call rather than an event that anyone can listen to. For the most part, request is a giant state machine. If we create additional objects with ever increasing state we'll start to get in to nasty areas where state in different objects effect each other and we're passing around a ton of extra references. Instead, I think the best approach is to flatten state in to a single namespace/object, which is a The other nice thing about this flat namespace/object is that you can easily manipulate a single prototype or instance to add functionality without worrying about inheritance. There's actually no reason that the prototype method for |
Okay I have feedback to what you just said, but I'll be breaking down my response in chunks. |
Okay, lets first cover that this state machine as a process. The very high level view of this process:
Note that at almost at any point it can be aborted. I believe that's the life of a Request instance. To me the Request instance is just a structure of state. They wouldn't have to have internal state, just return a value. Perhaps Service Objects was the wrong term to use. Why go this route? Note |
My first response - the one above this one - is just fully define what I meant by "Service Objects". The discussion about moving to a plugin base architecture is separate from "Service Objects". |
@mikeal that's a really good point. There are definitely some things (signing) that could never be moved out to rely solely on events / as plugins because they're just too ingrained in the request logic. But there are some things that act more as features on top of request, such as The reason I liked removing these as a first step was because it would remove those concerns from the request instance, and make the "core request logic" (like signing) smaller and less complex. Then, tackling projects like the auth object or signing functions to refactor The State Machine would be more manageable tasks. |
@mikeal re-reading this, and I think we both want the same end goal :) Although I'd argue that while monkey-patching the request object like that itself isn't strictly inheritance, it still comes with a similar set of problems/concerns (you need an understanding of the |
I think that when someone does As it stands, the hard work of stuff like I'm pretty sure that once we break off those functions and are forced to create events and other entry points for them to be implemented against we'll end up with a much clearer architecture and we'll be a lot more transparent about state which helps a lot with debugging. |
@seanstrom your high level state overview is a little lacking. I'll try my best to expose what you've missed.
In each of these steps there are all kinds of if statements for different features. Many of those features can actually impact each other, so ordering is actually very important. Many features have different behavior for the buffered callback case than they do for streaming. Many features need to get their state blown out on redirect, while others need to explicitly hold on to their state. Finally, you add proxying in to the mix, which changes the way you transport some of the data and requires you to hide/mask/remove some headers and add others. |
One more thing. If we do break these features out in to modules which are more or less the glue between request and other existing dependencies there's a few side effects:
|
@mikeal thank you for going more in depth with that break down. You explained the different phases of the request state machine. I want to know if we both understand each others opinions, and if you're opposed to it. In my mind, the information you've provided about request doesn't conflict with the architecture I mentioned above. So more feedback is appreciated. |
I think I am opposed to it, but it's hard to tell just because it's hard to understand all of these ideas without having a few examples implemented. Could you throw out some psuedo code where one of these service objects is used? As an example, here's what I was talking about previously with the signing: // request.js
var oauth = require('oauth')
function Request () {}
oauth.setup(Request)
Request.prototype.init = function () {
// lots of code
if (self.sign) self.sign()
// send headers
}
// oauth.js
exports.setup = function (Request) {
Request.prototype.oauth = function () {
// the actual signing api and prep
this.sign = function () { /* do the actual signing work */ }
}
} Note that in |
Yeah thats a cool way to decorate the request in order to have a signing method. As for my pseudo code. // File RequestInitialization.js
function RequestInitialization(request) {
// handles the complexities of initializing a request instance.
// it also delegates out to other functions for the work
// then it returns the request after initialization or just mutates the request implicitly
}
// File Request.js
var RequestInitialization = require('./RequestInitialization')
function Request(opts) {
// the actual constructor
// this handles forming all the initial state and containing of the state.
}
function RequestStateMachine(request) {
// handle the transitions of phases depending on a request instances state.
// This will coordinate the phases depending on the state, so at the beginning it will initialize
RequestInitialization(request)
}
// To wrap things up we just have a function that creates a new request
// and state machine each time.
function kickoff(options) {
request = new Request(options)
machine = new RequestStateMachine(request)
}
// We can also for API reasons point to RequestInitialization
Request.prototype.init = RequestInitialization The reason I'd like this is because we're would then treat the StateMachine, the Request state, and processes that take place on that state as separate things. Each with their own responsibility. Note that this pseudo is a high level picture, there's more that needs to be thought out than just this. |
I see what you're saying. The problem with this approach is that is doesn't actually allow us to move/keep the functionality specific to a feature separate from the rest of request. All you've really done is moved where we put a ton of monolithic logic about all the features to a different function, but that function is still just as inaccessible from another module as it was before. For instance, the current init() method is huge, none of what you are suggesting actually makes it smaller it just moves it. But, if we broke down what the init function does in to logical parts regarding their state and exposed hooks we could move all the feature specific logic in to other modules which accessed those hooks. At that point it doesn't matter if those hooks are implemented in the prototype method or in another function like you're suggesting, it should get much smaller and easier to manage. The hardest part of all of this will be figuring out and preserving the ordering some feature combinations rely on. I also don't see the point in having one object which is the state and one which is the API, it seems much simpler to flatten those in to one object. We actually want to provide better insight in to the state and so attaching it to the object which carries the API is a much better approach. |
Another quick note: it might seem cleaner in theory to put the logic in one place and the state in another but it's actually guaranteed to produce a lot more confusing code, no less. |
and then I did it again :( |
@nylen you're example from node core is good, it definitely gives a face to what I've been describing. |
There's a lot to respond to here, but I definitely see the draw of decorating Request itself. If we do go the "decoration" route, I'd really like to use a pattern closer to node core:
ie:
That forces some level of modularization, and would prevent different decorators from interfering with each other. The @mikeal was your point that this might not be possible / that some Request methods require modifications to the rest of the Request object? |
-1 To the idea of splitting a RequestStateMachine out of Request (for more-or-less all the reasons already stated above) |
@FredKSchott +1 except the oauth method needs to be a prototype method :) |
Ah duh! ninja-editted :) |
Heres an example of a module that wants to support another kind of auth. |
Suggestion If we have a module like var request = require('request-base')
var auth = require('request-auth')
request.use({auth: auth}) In this example we would tell Update: Thoughts? |
This is fine for the abstract case, as in someone doesn't want all of request so they explicitly pull in Also, I don't think it should be integrated as "auth." A more extensible pattern might be that you're actually registering a function per option. For instance, what if you want oauth2 and hawk support? You wouldn't be able to do just |
I agree we should have a way to integrate many types of auth. |
Just stumbled across this thread and wanted to share something I worked on a few months ago. I've since abandoned the project, maybe I'll pick it up again some day, maybe not. It's basically a different approach on how to achieve a Glancing over my readmes I notice they are not ready and fully consistent so let me add an example here as well, this is how it worked: var kwest = require('kwest');
var kwestRedirect = require('kwest-redirect');
var kwestBodyparser = require('kwest-bodyparser');
var request = kwest()
.use(kwestRedirect({ maxRedirects: 10 }))
.use(kwestBodyparser());
request.use(function rejectErrors(req, next) {
// You can also alter req here, like setting headers, etc
return next(req).then(function (res) {
// This lib was promise based, but that's ofcourse an implementation detail
if (res.statusCode !== 200) {
throw new Error('bad status');
}
return res;
});
});
request({
url: 'http://www.google.com'
}).then(function (res) {
// do something with the response here
}); Again, not claiming this is better or saying you should do it this way, just want to spark some inspiration by showing a different approach. I found that this approach had its advantages. I use |
Thanks for sharing this @Janpot I'll take a look at it. Actually I had similar idea (too lazy to search for the exact thread and comment) still a code in that direction is more valuable than just the idea 👍 |
+1 for @Janpot 's idea all the way! Breaking apart logic into modules would be tremendous. It's definitely a reason I might point out though, that most of what's described by @Janpot can be achieved with a function like _.compose, R.compose, or R.composeP. I submit my vote for the decomposition though, as I have definitely struggled with redirects and cookies. From my limited debugging experience I've gleaned it's very difficult to trace the route from my request to how redirect requests are formed. Just to add another potential perspective for how one could decompose let j = request.jar();
let r = request.defaults({
jar: j
});
let inputs = {
method: 'GET',
url: 'https://www.google.com'
};
redirect(iteratee, r, inputs, cb); Where |
I'm in the party of less is more. I think not implementing a middleware stack in Ideally I would be able to do something like below: //requestAsync is a customized promise I wrote myself.
R.composeP(parse, toJSON, requestAsync(R.__));
//or
R.composeP(parse, toJson, redirect, requestAsync(R.__)); But also be able to use the raw Another point to make, is that var get = R.composeP(toJson, redirect, requestAsync);
var getDo = R.composeP(do, get);
var getDoSomethingElse = R.compose(doSomethingElse, get); Three totally different ways of handling your request. Pretty nice huh? |
I see what you're saying about not having middleware. Most of the suggestions around breaking Request end up showing some kind of middleware system. That said I would prefer having a system where all the features that could be built on top of request would be modeled like function that can compose. Which is what I believe you're describing right? |
Pseudo code is fine, but reality is different, that's why we have the code base as it is ATM. While I'm not saying that the current code base is structured good enough, there are certain design patterns that simply won't work for building HTTP client. And you can't know that for sure before you actually try and build one with a good amount of features in it. Also there is a huge difference between refactoring and rewrite. IMO refactoring isn't possible for the current code base. There are too much constraints to look for, the biggest one being the huge user base relying on almost every hack/fix/decision that you can find in the code base. Mikeal started such rewrite two years ago, and got nowhere. Also it's interesting to see the http-duplex-client which is Streams2 compatible. I meditated a bit on the above commit, and at the end I decided that I have a different vision about how the HTTP client should be built. So about a month ago I started to put some code aside trying to assemble a working HTTP client. I'm using the HTTP duplex client mentioned above, slightly modified, also it turned out that some parts of that implementation doesn't work as expected, namely the Anyway, what I'm trying to say is that we should keep the current version 2.x working as it is, that includes adding of new features and fixes. Probably migrate to Streams2 at best, which will require rewrite of a few external modules as well. Anyone is free to start their own rewrite from scratch. Of course it will be a huge effort to implement all of the request's features, but at least you can see them. |
Yeah doing a total rewrite would be a pain, and refactoring also a pain. As it stands now, the ideas I'm throwing out are probably better off for user land (Which you also see with It makes me curious though about what the original intent of this project was? What problem is it solving? If I like breaking off different pieces of logic like I am, should I just call it quits and start using the plain old |
Yep, the API you are proposing should be implemented as a request wrapper. Good example is Purest which only configures request through its API, it doesn't handle anything HTTP specific directly. As for request, its purpose is to implement some of the common features that you expect an HTTP client to have. The core HTTP module mainly parses the raw HTTP messages and handles the TCP sockets for you, through its Agent class. Well that's probably overly simplified but it's definitely not a full featured HTTP client, like a browser for example. Some of the features currently in core were first implemented in request, like the forever-agent, also at that time there were no Streams2 and common Duplex stream interface. Here is what I'm currently outlining as requirements for my own experiment:
I think that's it for now. Oh and btw the redirect logic with all of the request features is not the easiest thing to implement. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Hey can some one give me a high level breakdown of request's architecture?
I basically want to get some point of views of how this software is perceived and how it actually works.
@mikeal
The text was updated successfully, but these errors were encountered: