Skip to content

Improvement proposals #51

Open
vseloved opened this Issue Oct 27, 2012 · 4 comments

3 participants

@vseloved

Hello,

First of all, I'd like to say, that it is very nice to see the development of this project, especially since I appreciate the simplicity of Ring, as well as similar approaches in other languages (Rack, WSGI). I also contemplated created a similar library.

But there's one major concert I wanted to raise regarding the current implementation. As the saying goes, "LISP programmers know the value of anything, and the cost of nothing". Yet I don't think, this should stand for Common Lisp programmers :) What I mean is that such software as Clack should be very robust, stable and fast, so that people could reliably build on top of it. As it stands currently, I think there are some choices in the main request handling pipeline, which are not optimal in regards of performance.

  1. Using of plists for request and response data. It is a common knowledge, that working with lists, that contain more than 5-6 elements, in random-access mode is much less efficient, than using other data-structures. The obvious candidates are hash-tables, structs and objects. As the requests and responses may have fields, that cannot be anticipated, using structs or objects would require an additional level of indirection for holding additional fields, which is, obviously, overkill. So what remains is only hash-tables. I don't know, why you didn't choose them for the task - is it related to lacking syntactic support for their handling in the standard library? If so, this can be solved. In my library RUTILS I provide a readtable, that allows to use a literal syntax for creating them, like this: #{:status-code 200 :text "ok"} (source code), as well as several convenience functions for working with them, like get# (an alias of gethash), set# (an alias of (setf (gethash ...)), take#, and a utility for printing them in the same manner. These small syntactic additions make working with hash-tables just a breeze - as convenient as in other languages with native support for them, like Clojure or Python.

  2. Using keywords, interned at runtime, for handling additional fields of requests. Surely, keywords are the most convenient and appropriate representation for request fields, known beforehand, but handling other fields as keywords, created with intern will make all the applications incur a significant and unnecessary performance hit. Here I'd like to refer you to the Google Common Lisp Styleguide. I think, that leaving this arguments as strings will not pose any inconveniences for the users. The only problem I see is that this will necessitate that the request hash-table has equal test instead of eql, but I think, this will not degrade its performance significantly (although this assumption needs additional testing). Yet, if it will be the case, the other approach may be to use a plain struct for the request object with the additional-headers field, designated to unanticipated headers, that can be an equal hash-table or even a plist/alist (since there shouldn't be many such headers in the majority of cases).

  3. Not using multiple-values for returning results from the app. This issue is more of a convenience proposal, but still, I believe, it should be raised.

    AFAIU, currently Clack applications should return a list, consisting of a status-code, headers and body, which will be transformed by Clack stack into response plist and eventually into appropriate response objects. I think, this is both not convenient and not very efficient.

    I think it will be much clearer, as well as more efficient, if the application returned multiple-values. The first value may have a double meaning: if it's a string — it's a response body, otherwise if it's a number — it's a status code. The second value is a hash-table of response headers. The optional third value may hold a status-code.

    So the normal control flow will return just a string. The simple page not found will return 404, while a more involved example may be:

      (values "[\"/variant1\",\"/variant2\"]"
              #{:content-type "application/json"}
              300)
    
@fukamachi
Owner

Thank you for these proposals. It seems most of them are reasonable.

1. Using of `plists` for request and response data.

Your understanding is completely right. I thought hash-table isn't needed because requests will have only 10 or 15 key-value pairs. And it's syntactic support is related, you said.

Clack's API must be simple and should be easy to understand.

On the other hand, the introduction of the reader macro increases a dependence, and Clack newbie must learn it (even it is simple). Moreover syntaxes of hash-table are invented many times. And there are no popular one which can be the default syntax. How can we choose from them? #{:status-code 200 :text "ok"} or { (:status-code 200) (:text "ok") } or something like them.

Therefore this suggestion is interesting and deserves consideration, but we should be careful about the implementation.

Actually, how much performance will be necessary? Should it be implemented even if loses simplicity? That's the point.

(Or don't we have to worry so if we think that the implementation of the request is absorbed at a framework level?)

2. Using keywords, interned at runtime, for handling additional fields of requests.

I've never realized this. Let's be settled with 1.

3. Not using `multiple-values` for returning results from the app.

I've never realized this either. You're completely right. That looks good and more Lispish way.

Then, is it time to think about Clack API ver2 :) ?

@vseloved

As for the reader-macro, it's basically a readtable with an additional function. IMHO, named-readtables is a de facto standard for handling readtables, so a dependence on it won't hurt the project - on the contrary, empower it with ways to properly manage reader-macros, and so to actually use them - they are a great tool, if applied properly. As for the function, you can just copy it as is and not depend on rutils or any other utility library.

So, regarding performance vs. loss of simplicity: hopefully, in this case it's a false dichotomy, as hash-tables are actually simpler to work with, if you have proper tools. At the same time, I would argue, that for such things as CLACK performance has paramount importance, because you want people to use it with slow and fast servers alike and not worry about losing speed because of the middleware.

Considering ver2 - it's up to you, how do you plan to go on. I'd be glad to help, if there are ways, where I can contribute. I've looked at the code, and it seems pretty understandable and well-structured. If you agree, I can try to implement my proposals in a separate branch

@fukamachi
Owner

I was still thinking about these proposals, and finally, I've noticed there are no reason to reject them. Though I don't agree whether the "rutil" syntax should be the default yet, if you don't intend to introduce a default/bundled syntax for hash-tables into Clack, it's enough reasonable. Let's put off the decision which syntax to use for hash-tables.

One thing I still worry about is how to handle multiple keys in the same hash-table.
Currently, Clack allows multiple keys in envby using multival-plist. We have to think about how to change that in Clack v2. Fortunately, it seems such code which depends on the behavior are only in Clack.Request.

Anyway, it makes easy if you implement your proposals. Lisp is easier to understand than English for me.

@alpha123
alpha123 commented Dec 6, 2012

+1 to all these 👍

Clack should focus on high performance rather than a friendly API. Most people won't be writing apps directly on top of Clack anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.