Revisiting, speeding up, and simplifying API Umbrella's architecture #86

Closed
GUI opened this issue Jan 13, 2015 · 12 comments

Comments

@GUI (Member) commented Jan 13, 2015

Short(ish) Overview

I've prototyped a somewhat significant overhaul to API Umbrella's architecture that simplifies its operation and speeds things up. Because this would change quite a bit of API Umbrella's current code base, I wanted to open this issue to start gathering feedback.

The potential speed improvements appear quite substantial, and come by reducing the proxying overhead of having API Umbrella sitting in front of your underlying API backends. It's still very early, but benchmarks point to around a 10-50x reduction in our proxying overhead. On a test server, it takes the overhead of having API Umbrella sitting in front of an API from an average of 13ms down to 0.3ms (averages don't tell the whole story, though, so I'll get more into these numbers later).

Aside from the speed increases, this also simplifies API Umbrella quite a bit by reducing the number of components and dependencies. I think this could make API Umbrella easier to run, operate, and maintain. It should also perform better with fewer server resources. And finally, I think all of this simplifies the codebase and cleans up some of the more complicated pieces of functionality in our current implementation.

So this all sounds great, right? What's not to like? In my view, the main downside is that it would be a somewhat significant shift in architecture, and that obviously has implications and repercussions.

But before diving into the technical details, here are some random, high-level questions that come to mind:

  • How important are these kinds of potential speed gains to you?
  • What do you think about changing our Gatekeeper component from being written in Node.js to being written in Lua embedded inside nginx (aka OpenResty)?
  • If the Gatekeeper component were written in Lua instead of Node.js, do you feel like that would impact your ability to contribute to this project?
  • How attached to the current Node.js Gatekeeper and router code bases are you? (Note, this would not affect the Ruby on Rails "Web" component)
  • Any other comments/questions/concerns? Or general thoughts on this potential change in architecture?

We'd love to get your feedback on any of this, so please share if you have any thoughts (and don't feel obligated to read the rest of this short novella of technical mumbo-jumbo).

To reiterate, this is still an early prototype. The current benchmarks aren't very rigorous right now, so the speed numbers are subject to change. However, I do believe the benchmarks are in the right ballpark of how much faster this could make API Umbrella (somewhere between 10x-50x).

Longer Version

From here, I'll dive into the nitty-gritty of what our current architecture looks like and what I'm proposing. Along the way, I'll probably write far too much text and bore you to death. Sorry!

For quite a while, I've had some ideas on how to speed up and simplify API Umbrella rolling around in my head. A couple weekends ago, I decided to try to finally start playing around and put some of those ideas to use. The basic premises behind the optimizations are:

  1. Switch our custom proxy layer (the Gatekeeper) from Node.js to Lua embedded inside Nginx (via OpenResty).
  2. Cache more lookup data in local process memory (rather than making local Redis requests or remote database queries).
  3. Remove some of our internal network hops.

To be fair, we could apply optimizations #2 and #3 to our current implementation without rewriting it in Lua, but there are some reasons, which I'll get to, why using Lua inside nginx makes these optimizations easier.

Aside from the speed gains, I think the other important thing to consider in this rearchitecture is operational simplicity of API Umbrella. Even if it weren't for the speed gains, I think all of these changes actually lend themselves to making API Umbrella easier to run, manage, and debug.

Current Architecture

Let's start with how things currently look:

Current architecture diagram

Proposed Architecture

With these changes, we basically squash all of that down into:

Proposed architecture diagram

How & Why

So you might be wondering why in the world we have all those pieces in the current architecture, and why we can just squash them all into one component now. I think the best way to explain the changes is to step through each component in the previous architecture, explain why it existed, and then detail how it's being handled in the new architecture.

nginx: Initial Routing / Load Balancing

Instead of incoming requests being directly handled by our Node.js Gatekeeper, this initial nginx server was in the stack for a couple primary reasons:

  • For routing requests to the Web component: Yes, this could have also been done with our Node.js Gatekeeper component, but using nginx for this routing gave us more capabilities, like better load balancing algorithms.
  • For load balancing against multiple Gatekeeper processes: Since we need to run multiple Node.js Gatekeeper processes (1 per CPU core), nginx load balanced against those separate processes (a rough config sketch follows this list). This could have been achieved through Node.js's clustering feature, but we didn't want to use that directly for 2 reasons:
    1. If we used Node.js's clustering, we couldn't perform Node.js upgrades (for security reasons, etc) or upgrades to the master process with zero downtime. With the Node.js cluster, the workers can be restarted without incurring downtime, but if you ever need to restart the master process, then the server will experience downtime. Putting nginx in front solves this. nginx also has these types of restarts and upgrades better figured out, in my opinion, since it supports zero-downtime upgrades between nginx binaries.
    2. Prior to 0.12, Node.js's clustering had some issues with fairness in how it distributed requests amongst workers.
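
Roughly, that front nginx configuration amounts to something like the following sketch (the ports and the number of Gatekeeper workers are made up for illustration; this reflects the current stack, not the proposal):

# Load balance across several local Node.js Gatekeeper worker processes.
# The ports and worker count here are illustrative only.
upstream gatekeeper_workers {
    server 127.0.0.1:50100;
    server 127.0.0.1:50101;
    server 127.0.0.1:50102;
    server 127.0.0.1:50103;
}

server {
    listen 80;

    location / {
        # Requests are distributed across the separate Node.js processes, and
        # nginx itself can be reloaded or upgraded without dropping traffic.
        proxy_pass http://gatekeeper_workers;
    }
}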

In the rearchitecture, we're still hitting nginx first, so this first step doesn't really change. We're just able to remove more of the pieces behind it by embedding more functionality directly inside nginx.

Node.js: Gatekeeper

This is our reverse proxy layer where we've implemented most of our custom logic. This includes things like API key validation, rate limiting, etc.

In the rearchitecture, nothing really changes about what this layer does; it just shifts the implementation from Node.js to Lua, which in turn allows it to be embedded inside nginx.

So why nginx and Lua instead of Node.js?

  • Embedded: By embedding it directly inside nginx, we're able to still use nginx, but cut out a network hop (even though it was always a localhost hop). This also reduces complexity, since we don't have to worry about running the multiple Node.js worker processes (nginx handles running its own worker processes itself).
  • Event-based: Similar to Node.js, Lua inside nginx is all event based. Event-based systems have their own complexities and pitfalls, but I'm still a big fan of their speed and efficiency for proxying layers. Plus, OpenResty's take on evented programming is quite nice, and, in my opinion, simpler than Node.js's (it removes most of the callback/promise requirements).
  • Shared memory across processes: One of the primary reasons I began looking more heavily into nginx and Lua recently was their shared memory implementation (ngx.shared.DICT). This allows us to cache things in memory, and that memory can be shared across all the nginx worker processes (see the sketch just after this list). Node.js seems to lack an equivalent feature, so the standard solution in Node.js is to share the memory externally via something like Redis. That's precisely what we've done in our current stack with a local Redis instance. Since it's always a local connection, it's fast, but nginx and Lua's shared dict implementation is a lot faster, since Redis still requires a local network connection, which the in-memory option avoids. The shared dicts have features we need, like key expiration, built in, so they nicely replace our need for Redis in a simpler, faster way.
  • Lua is simple: I've been delighted with Lua as a programming language. It's an extremely simple language that I was able to pick up very quickly. Compared to other alternatives, I feel like it's one of the simplest options.
  • LuaJIT is fast: If you google around, you'll find plenty of impressive benchmarks for how fast LuaJIT is.
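
As a quick illustration of the shared memory point, here's a minimal sketch (not API Umbrella's actual code; the "api_users" zone and the database lookup helper are hypothetical) of caching API key lookups in a shared dict with built-in expiration:

# In the http block: a 10 MB shared memory zone visible to every nginx worker.
lua_shared_dict api_users 10m;

server {
    location / {
        access_by_lua_block {
            local cjson = require "cjson"
            local cache = ngx.shared.api_users

            local api_key = ngx.var.arg_api_key
            if not api_key then
                return ngx.exit(ngx.HTTP_FORBIDDEN)
            end

            -- Shared dicts store strings, so cache the user record as JSON.
            local user
            local cached = cache:get(api_key)
            if cached then
                user = cjson.decode(cached)
            else
                -- lookup_user_in_database() is a hypothetical stand-in for the
                -- real (and slower) database query on a cache miss.
                user = lookup_user_in_database(api_key)
                if user then
                    -- Expire the cached entry after 60 seconds.
                    cache:set(api_key, cjson.encode(user), 60)
                end
            end

            if not user then
                return ngx.exit(ngx.HTTP_FORBIDDEN)
            end
        }

        proxy_pass http://api_backends;
    }
}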

Lua vs Node.js Implementation Comparisons

If you're curious what the Lua code looks like, here are a couple of quick comparisons of equivalent features:

You'll notice that our implementation logic remains pretty similar, so the move to Lua doesn't really mean we're throwing everything out. It's largely a translation of our current code base, just in a different language. It's also given us an opportunity to clean some things up with our old codebase.
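
To give a flavor of what that translation looks like, here's an illustrative sketch only (not our actual Gatekeeper code; the shared dict name, limit, and window are made up) of a simplified fixed-window rate limit check, run from an nginx access_by_lua handler:

-- Illustrative sketch of middleware-style Gatekeeper logic in Lua.
-- Assumes a "lua_shared_dict rate_limits 10m;" zone declared in nginx.conf.
local limits = ngx.shared.rate_limits
local api_key = ngx.var.arg_api_key or "anonymous"

-- One counter per API key per minute window (a simplified fixed window).
local window = math.floor(ngx.time() / 60)
local counter_key = api_key .. ":" .. window

local count = limits:incr(counter_key, 1)
if not count then
    -- First request in this window: create the counter with a short TTL.
    limits:set(counter_key, 1, 120)
    count = 1
end

-- The 1,000 requests/minute limit is just an example value.
if count > 1000 then
    return ngx.exit(429)
end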

Varnish: HTTP Cache

This serves as the HTTP caching layer in our stack. It needs to be placed behind the Gatekeeper, so we can enforce API key validation, rate limits, etc before allowing cache hits. I've liked using Varnish in the past, so that's largely how it landed in our stack here.

In the rearchitecture, we're going to use nginx's built-in proxy_cache (or possibly ledge). In either case, it will be a cache embedded inside nginx. This is one feature I haven't tackled in the current prototype, and this area still needs more exploration. However, I think one of those two options will give us the caching capabilities we need directly inside nginx. Functionally, the cache should do the same thing (since HTTP caching is a pretty standard thing); this again just simplifies things and removes the need to also run Varnish.

One of my main hesitations previously about using nginx's built-in cache was the lack of purge capabilities. However, through plugins, I think this can be addressed. So we'd probably use something like ngx_cache_purge, nginx-selective-cache-purge-module, or Ledge, to provide a purge API endpoint for administrators.
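
As a rough idea of what that could look like with ngx_cache_purge (the paths, zone name, and access rules are illustrative, not a final design):

# In the http block: nginx's built-in proxy cache storage and key zone.
proxy_cache_path /var/cache/api-umbrella keys_zone=api_cache:50m max_size=1g;

server {
    location / {
        proxy_cache api_cache;
        proxy_cache_key $uri$is_args$args;
        proxy_pass http://api_backends;
    }

    # Purge endpoint provided by the third-party ngx_cache_purge module:
    # administrators purge a cached URL by requesting /purge/<original path>.
    location ~ /purge(/.*) {
        allow 127.0.0.1;
        deny all;
        proxy_cache_purge api_cache $1$is_args$args;
    }
}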

And while Varnish's banlists may technically be superior to straight purges, and Varnish's caching capabilities may also be more robust, I like the simplicity gained by making nginx our default cache implementation. And if someone is super keen on Varnish or other caching servers, there's no reason they couldn't still run those behind the scenes instead.

nginx: API Routing / Load Balancing

Finally, we get to the last piece of our current "nginx sandwich". After hitting nginx, then our Gatekeeper, and then Varnish, we route back to nginx again to perform the actual routing of API requests to the API backend servers. Instead of using the Gatekeeper or Varnish for this routing, we go back to nginx for a few reasons:

  • Connection caching/pooling for upstream keepalive servers: This has considerable benefits for us when proxying to backend APIs that are remote or have high latency (a rough config sketch follows this list).
  • More advanced load balancing algorithms (for example, least connections)
  • Simplified logging: Having the front-most component and back-most component both be the same nginx instance simplifies some of our logging requirements (particularly for timers around how long requests took).
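
Here's a minimal sketch of the kind of upstream configuration those first two points refer to (the backend hosts and keepalive pool size are made up):

# Illustrative upstream definition for one API backend.
upstream example_api_backend {
    # Prefer the backend server with the fewest active connections.
    least_conn;
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;

    # Keep up to 20 idle connections open per worker process, which helps most
    # when the API backends are remote or high latency.
    keepalive 20;
}

server {
    location / {
        # Keepalive upstream connections require HTTP/1.1 and a cleared
        # Connection header.
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_pass http://example_api_backend;
    }
}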

Why not implement it in X (Java, Go, Erlang, Node.js, etc)?

So if we can squash all of our functionality down into a single component, why not use something else to implement that component? Why Lua and nginx? We already have a lot of our custom proxying stack implemented in Node.js, so why not do everything there? I've touched on some of those details already for Node.js, but in all these cases it basically boils down to this: this layer of API Umbrella is just a proxy with some additional logic in it. By embedding our logic inside nginx, we get to take advantage of nginx's proven proxying capabilities and features. While there may be libraries for nice reverse proxies in these other languages, in my experience, they don't provide a lot of the features we need that we get from nginx (and these features would be non-trivial to reimplement). Here are a few examples of features we get with nginx that usually aren't included in the customizable, high-level proxy libraries I typically see (for example, node-http-proxy):

  • Connection caching/pooling for upstream keepalive servers: This has considerable benefits for us proxying to backend APIs that exist remotely or have high latency.
  • More advanced load balancing algorithms (for example, least connections)
  • Backend server health monitoring
  • Built-in HTTP cache
  • SPDY, WebSockets, etc: nginx has been very quick to adopt and implement some of the newer protocols. Libraries and other servers often don't support these for quite a while (we're not necessarily using these yet, but I'd like for them to be supported).
  • High performance core with low resource requirements

Benchmarks

On some simplistic benchmarks, the average overhead of having API Umbrella proxying to an API drops from around 13ms in the old stack to 0.2-0.3ms in the nginx+Lua stack.

(Note: When I'm talking about benchmarking API Umbrella's "overhead," I'm referring to how much time API Umbrella adds to a raw API call locally--this does not account for network latency between users and API Umbrella or between API Umbrella and API backends--but there's not much we can do about those, so this is about optimizing what we can at API Umbrella's layer.)

Getting back to the averages above, there's a bit more to the story than what the averages tell. The current stack can be quite a bit faster than the 13ms average indicates (it can be as fast as 2-3ms), but a quarter of the requests it serves up are consistently much slower to respond (in the 40ms range), which drives the average up. We could certainly get to the bottom of where those periodic slowdowns are coming from in the current stack (since I'm not sure we've always had those slowdowns), but a few notes on that:

  • Part of the appeal of the new stack is that with its simplicity, it's a lot easier to debug these kinds of performance issues. With the current stack, it would be considerably more work to pinpoint where these slowdowns are coming from (for example, which piece of software in the stack, which network hop, etc).
  • Even then, the new stack's average is still considerably faster than the old stack at peak performance. It's admittedly less impressive, but it still drops the overhead by 10x from 2-3ms to 0.2-0.3ms.
  • On less impressive hardware, the overhead of nginx+Lua remains at around 0.2-0.3ms, while the overhead of our current stack increases. For example, on a local test VM, the nginx+Lua overhead remained at 0.2-0.3ms, while the current stack's average crept up to 15ms and, more notably, the typical peak performance went from 2-3ms to 7ms. So it would appear the nginx+Lua combination is much more efficient with server resources and able to operate with lower overhead.

Here's what I did for the tests:

  • Created a simple "hello world" API endpoint and benchmarked hitting that API directly as a baseline.
  • Placed that API behind our current proxy and the new prototyped proxy implementations. Ran the benchmarks again, and measured the difference in average response time.
  • These tests were only performed with 1 concurrent connection, since I was just focusing on the proxying overhead. But as a quick note on concurrency, on some informal tests, I was seeing similar benefits to the new platform as we scale up concurrency (we're able to serve more concurrent requests at lower latency speeds with the new stack versus the old stack).

Onwards to the benchmark details:

Baseline (Direct API)

Requests per second:    5454.08 [#/sec] (mean)
Time per request:       0.183 [ms] (mean)

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       2
Processing:     0    0   0.0      0       1
Waiting:        0    0   0.0      0       1
Total:          0    0   0.0      0       2

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      0
 100%      2 (longest request)

Current Stack (Node.js + nginx + Varnish)

Requests per second:    75.79 [#/sec] (mean)
Time per request:       13.195 [ms] (mean)

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     2   13  17.2      3     195
Waiting:        1    3   1.1      3      29
Total:          2   13  17.2      3     195

Percentage of the requests served within a certain time (ms)
  50%      3
  66%      4
  75%     15
  80%     42
  90%     43
  95%     43
  98%     44
  99%     45
 100%    195 (longest request)

Overhead: 13.012 milliseconds on average (13.195 ms average response times for this test - 0.183 ms average response times for the baseline)

However, what's of particular note is the standard deviation and the percentile breakdowns. When I first started seeing these average overhead numbers, they seemed much higher than what I remembered from the last time I was profiling API Umbrella's internals. And to some degree the average is skewed--50% of requests are served in 2-3ms--but the problem becomes evident once you look at the percentile breakdowns: around 20% of our requests are served in the 42-45ms range. That's a non-trivial number of requests that are much slower.

Prototyped Stack (Lua + nginx)

Requests per second:    2394.72 [#/sec] (mean)
Time per request:       0.418 [ms] (mean)

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     0    0   0.2      0      41
Waiting:        0    0   0.2      0      40
Total:          0    0   0.2      0      41

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%      1
 100%     41 (longest request)

Overhead: 0.235 milliseconds on average (0.418 ms average response times for this test - 0.183 ms average response times for the baseline)

Also note that 98% of requests were served in less than 1ms. 99% served in ~1ms or less. The max is 41ms, but I think that represents much more of a true outlier, since on other test runs I've had the max be around 6ms.

History / Cautionary Tale

I think it's worth touching briefly on the history of API Umbrella's architecture, since there are perhaps some lessons to be learned there.

From 2010-2013, API Umbrella's core proxying was composed of Ruby EventMachine and HAProxy. In 2013, we decided to switch over to Node.js and nginx.

The switch from HAProxy to nginx is perhaps the simplest to explain: nginx has connection pooling/caching for backend keepalive servers while HAProxy does not. HAProxy is still a fantastic piece of software, and it has other features I wish nginx had, but that keepalive feature has particular performance benefits for us, since we're proxying to remote API backends in different datacenters. This ability to hold open backend server connections helps speed things up for our specific use case.

The reason for switching from Ruby EventMachine to Node.js was also pretty straightforward. But if you're unfamiliar with Ruby EventMachine, it offers an event-driven programming environment using the Ruby language. Node.js is also event-driven, so there are a lot of parallels in how the two function (non-blocking I/O, async callbacks, etc); it's just a different syntax. But the primary thing Node.js has going for it is that it was built from the ground up with this non-blocking mentality in mind, and so the entire ecosystem is non-blocking (database adapters, etc). With Ruby EventMachine, you're sort of split between two worlds--there are a plethora of standard Ruby libraries out there, but they may or may not block your application and harm your performance. There are certainly eventmachine-specific libraries available, but at the time, that ecosystem was much more nascent. It got frustrating not being able to use highly popular "normal" Ruby libraries for something like Redis connections, since those libraries performed blocking connections (or you could use them, but the performance of your app would suffer). And while there might have been an alternative eventmachine-based Redis library available, it was usually newer, not as well supported, lacking features, etc. The eventmachine scene may have changed over the past couple years, but at the time, it seemed like switching to Node.js was the best bet, since the entire ecosystem is oriented around non-blocking I/O.

I mention this because in some ways a Lua+nginx/OpenResty architecture would put us back in a situation similar to where we were with Ruby EventMachine. nginx+Lua is also event-based and must use non-blocking libraries. However, there are plenty of "normal" Lua libraries floating around out there that do block. For situations where blocking is a concern (which isn't all the time), you then have to really look for the OpenResty-compliant libraries, since those have been built with nginx in mind. So if Lua's ecosystem wasn't already small enough, the OpenResty ecosystem is even smaller and less mature. The libraries supported by the core OpenResty team are of great quality, but the others seem to be of varying quality and popularity. As a quick example, I've already encountered bugs with the Resty MongoDB library that I've had to patch and submit fixes for.
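
For a sense of what an OpenResty-compliant library looks like in practice, here's a tiny example using lua-resty-redis, one of the core-supported libraries (the host, key, and pool sizes are made up). The code reads like ordinary synchronous Lua, but the socket operations yield to nginx's event loop instead of blocking the worker:

-- Example of a non-blocking, OpenResty-aware library (lua-resty-redis),
-- run from a request-phase handler such as access_by_lua or content_by_lua.
local redis = require "resty.redis"

local red = redis:new()
red:set_timeout(1000)  -- 1 second

local ok, err = red:connect("127.0.0.1", 6379)
if not ok then
    ngx.log(ngx.ERR, "failed to connect to redis: ", err)
    return ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
end

-- "some_key" is just an example key.
local value, err = red:get("some_key")
if value == ngx.null then
    value = nil
end

-- Return the connection to the built-in pool for reuse by later requests.
red:set_keepalive(10000, 100)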

So the idea of joining an ecosystem where some of the libraries aren't mature or missing doesn't make me super excited. But on the other hand, the OpenResty core is very solid, and we don't actually need a lot of other functionality or libraries in our proxy layer. Plus, some of the functionality OpenResty does provide is somewhat unique and makes our lives easier.

I think it's also worth noting that while the nginx+Lua/OpenResty ecosystem might be small, one of the main drivers of the entire platform is CloudFlare. They seem to run a lot of their services with this stack, and they do a lot of truly impressive stuff with it (and open source quite a bit of it). So it at least makes me feel a little bit better that they're invested in this stack and do a lot to contribute to the open source OpenResty world. Anecdotally, it also seems like I've seen more interest and growth in the OpenResty platform as a whole over the past couple years. This might just be due to me looking more into it recently, but I had actually looked at OpenResty a couple years ago when I was contemplating the switch to Node.js. I skipped over it at the time, since it still seemed too new and niche, but these days it seems like I see more being done with it, and I'm a little more comfortable with where the ecosystem is (albeit, it's still admittedly small).

Conclusion

So… Have you made it through my rambling (or at least skipped to the bottom)? We have an early prototype of running API Umbrella with Lua embedded inside nginx, but where do we go from here? What exactly are the pros and cons of such an architecture? Here are the ones I can think of:

  • Pros
    • Speed: Reduces API Umbrella's proxying overhead by as much as 50x to speed up our proxying to the underlying APIs.
    • Operational simplicity: It reduces the number of components in API Umbrella, which simplifies the operation of running and managing API Umbrella. It also simplifies aspects of the codebase.
    • Lua's simplicity: Yes, Lua is a different language, but one thing it has going for it, in my opinion, is how simple the language is. I found it extremely approachable, readable, and easy to pick up.
  • Cons
    • Lua's marketshare/mindshare: Lua definitely has fewer developers and less marketshare than Node.js (LangPop.com has lots of statistics and graphs). Until recently, I don't think it was used much in the web world, either. So while I think Lua is a simple language for developers to pick up, the switch might make API Umbrella less appealing to potential contributors who know Node.js but don't currently know Lua.
    • OpenResty's (nginx + Lua) smallish ecosystem: Even more specific than Lua's marketshare is the fact that the nginx+Lua (aka OpenResty) ecosystem is smaller still.
    • Another language switch: In the past few years, we've already transitioned from Ruby EventMachine to Node.js for the Gatekeeper codebase. While I think we had good reasons for switching to Node.js, and I'd like to think we have good reasons to even be pondering this switch to Lua, I never like making big changes like this just for the sake of change.

And I guess to summarize my own personal opinion, I'm slightly favoring this nginx+Lua implementation, primarily because of how much it can simplify our server stack, which I think will make it much easier to run and maintain API Umbrella. The speed increases are also quite nice. The parts that make me nervous are the more nascent OpenResty ecosystem and whether Lua would be off-putting to potential contributors. I think those issues are surmountable, but I'm still not sold one way or another. Which is why we'd love to get your feedback!

So if you have any feedback, additional questions, comments, etc, feel free to leave them here. Or if all this has just been far too many words for you, you're welcome to take a nap.

Thanks!

@jtmkrueger commented:

This is a bold and visionary move! Using Lapis and api-umbrella could be a really powerful combination to create fast, responsive api services. I wonder if both could share the same OpenResty instance?

@GUI (Member Author) commented Jan 21, 2015

For reference, some additional feedback was gathered via e-mail on the US Government APIs mailing list: https://groups.google.com/forum/?nomobile=true#!topic/us-government-apis/5QcbBKKD4dk

@GUI (Member Author) commented Jan 21, 2015

@jtmkrueger While technically a Lapis API could share the same OpenResty instance as API Umbrella, that's probably not something we would support. But you could certainly run another OpenResty instance on the same server and still proxy to that with API Umbrella with pretty minimal overhead (that's essentially what these benchmarks were doing). The reason we don't particularly want to support embedding APIs in the same server instances as API Umbrella is that I think it would add complexity to how we currently package and deploy API Umbrella. We've seen considerable benefits in making API Umbrella more of a standalone package that you just install and run as a whole, rather than the server administrator having to concern themselves with the API Umbrella internals and trying to mesh that with other pieces that may already exist on their servers. If we tried to allow sharing a single OpenResty server, I think it just gets trickier to package up or manage in a way where we wouldn't step on each other's toes. But again, you could certainly run a separate OpenResty process on the same server or other servers to provide your APIs, and API Umbrella could proxy to that just like any other API backend.

But I think this does bring up an important point, which is that if we did pursue this prototyped rearchitecture, it would be backwards compatible with the current version of API Umbrella from a usage and administration perspective. We would just provide new binary packages that could be installed, and existing installations should be upgradeable just like any other package upgrade. Again, by treating API Umbrella as more of a standalone package, I think that has some advantages over having server admins set up all the individual components. In this case, it gives us a little more latitude to pick which dependencies we need, or even do something wild, like swap an entire component's implementation (as long as the functionality is the same), with minimal impact on users and administrators of API Umbrella.

@brylie (Contributor) commented Jan 21, 2015

Great writeup :-)

How might the OpenResty approach affect server memory overhead, generally speaking? Is there any way to efficiently leverage Node.js libraries using the OpenResty approach?

@ghchinoy commented:

+1 This seems like a great move, and well explained. What do you think the effect will be on the other contributors to the project (are there many that aren't NREL based) - i.e. would the switch from node to lua cut down on existing external contributors? I don't think implementers of api-umbrella would have too much of a problem with an internal change that improves performance.
I'd be fine with Lua :)

@darylrobbins (Contributor) commented:

I think your new architecture is very sound and Nginx is definitely the right horse to be hitching your wagon to. As an added benefit with the significantly increased throughput, API Umbrella would be much better positioned to try and protect against DoS attacks.

I do have similar apprehensions about Lua/OpenResty though:

  • On the contribution side, Lua is pretty easy as a language to pick up, but I think the bigger barrier to entry is getting up to speed on OpenResty/Nginx
  • The ecosystem and finding appropriate libraries definitely sounds like it could be a headache. For example, if you wanted to support multiple data stores besides MongoDB in the next go-round, your likelihood of finding the required libraries could be quite limited.

All that said, what's the alternative?

If we look at some of the languages designed for concurrency, there may be some suitable web server implemented in Go or Erlang that could be extended. But you're going to run into similar problems of a small ecosystem around a more esoteric web server. Not to mention that choosing Erlang would shrink your contributor base significantly. I have nothing against Erlang, but it's more of a niche language.

So, let's take it as a given that Nginx is the right "container" for this proxy. To write an Nginx module, one would typically use C, but then you're losing a lot of the developer productivity benefits of a higher-level language. You could create some sort of C wrapper and embed some other concurrent scripting container, but why bother when that's what you're already getting with Lua via OpenResty?

So, although I have some apprehensions about OpenResty, I don't see a better option either.

@brylie (Contributor) commented Jan 21, 2015

Nginx developers are planning core JavaScript support.

We're planning JavaScript configurations, using JavaScript in [an] Nginx configuration. We plan to be more efficient on these [configurations], and we plan to develop a flexible application platform. You can use JavaScript snippets inside configurations to allow more flexible handling of requests, to filter responses, to modify responses. Also, eventually, JavaScript can be used as [an] application language for Nginx.

In the meantime, assuming JS is a candidate language, would it be possible to port some of the Umbrella components using ngx_http_js?

@brylie (Contributor) commented Jan 27, 2015

I opened a ticket on the Nginx Trac requesting clarification of plans for and progress towards native JavaScript support.

@GUI (Member Author) commented Jan 30, 2015

Thanks all! Some belated responses:

@brylie:

How might the OpenResty approach affect server memory overhead, generally speaking?

This is something I plan to benchmark so I have more concrete numbers, but very loosely speaking, I think memory usage should be quite a bit lower with this OpenResty approach. The processes weren't super memory hungry before, but the new architecture does cut down on the number of processes quite a bit, so I think it should consume less memory (since we'd no longer be running Varnish, another nginx process, or multiple Node.js processes). The main nginx worker processes would consume more memory in return, but I think we should see a somewhat substantial reduction in memory overall. But again, this is something I need to properly document and benchmark.

Is there any way to efficiently leverage Node.js libraries using the OpenResty approach?

No, not really. Since this OpenResty approach would shift everything towards Lua, we would no longer be depending on Node.js.

I opened a ticket on the Nginx Trac requesting clarification of plans for and progress towards native JavaScript support.

Thanks for bringing this to my attention. I had not heard about these JS in nginx plans before. When I Googled around, I couldn't find much more information than that quote in InfoWorld, so thanks for filing the issue asking for more details.

This would certainly give us more to think about if this is happening in nginx core. However, without knowing more details, I'm somewhat apprehensive to put all our eggs in that theoretical basket. One of my primary questions would be whether or not nginx's implementation would actually work with Node.js, or if they would be doing something different (for example, I don't think that existing ngx_http_js module would support Node.js libraries). Doing something different wouldn't necessarily be a bad thing, but then in that case, it seems like the library ecosystem would be back to square one, in which case I don't see a huge advantage over OpenResty.

@GUI (Member Author) commented Jan 30, 2015

@ghchinoy: Thanks for the feedback! Regarding contributors, that is something I was hoping to get a sense of from this issue. Currently, we don't have a great number of external contributors to the primary code base. Of course, we'd love to see that change if there's community interest in this project, which is why I wanted to open this issue and try to gauge the current community's take on Lua. Because even if people aren't contributing now, we'd like to avoid making a change in direction that would be seen as a big detractor from people possibly contributing in the future.

So you saying you'd be fine with Lua is precisely the type of thing we're interested in knowing. Thanks for the input!

@GUI (Member Author) commented Jan 30, 2015

@darylrobbins: Sorry for the delay, but I appreciate the feedback! And I think your thoughts pretty much mirror my own in terms of apprehensions, alternatives, and fit. Overhauling our platform to use Lua and OpenResty certainly gives me pause, but the more I dwell on it, I think it does feel better than some possible alternatives, and I think it is a good fit for what we're trying to accomplish with our proxy layer.

@GUI (Member Author) commented Nov 29, 2015

v0.9 has been released with these architecture updates. Further details: #183 https://github.com/NREL/api-umbrella/releases/tag/v0.9.0

@GUI closed this as completed Nov 29, 2015