thattommyhall/ipfs-nginx-accesscontrol

Overview

There are 3 things running:

  • app is a simple Python Flask app that serves /acl (the allow/deny lists) and a check_cid endpoint. Note: it's running in dev mode and will be sloooooooow; the point of only hitting it for novel CIDs is that this should not matter too much. THERE IS ZERO EXPECTATION ANYONE WOULD PUT THIS PYTHON APP IN PRODUCTION.
  • go-ipfs for the upstream connection (commented out to benchmark the other stuff)
  • nginx

Nginx / Lua

The best diagram of where the Lua hooks run is here

We use the following phases:

init_by_lua

Usually just used to require all the modules before forking, which saves RAM and speeds up later requires (similar to puma in Ruby etc.).

init_worker_by_lua

This runs once per nginx worker and can be used to schedule things periodically, such as refreshing the allow/deny dicts from the API.
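As a rough illustration of what a periodic refresh has to do, here is a minimal Python model of the dict update. It is not the repo's Lua code: the function name `refresh`, the `fetch_acl` callable, and the assumed /acl response shape (`{"allow": [...], "deny": [...]}`) are all hypothetical. In nginx the equivalent logic would run from a timer started in this phase.

```python
from typing import Callable, Dict, Iterable, Mapping


def refresh(
    allow: Dict[str, bool],
    deny: Dict[str, bool],
    fetch_acl: Callable[[], Mapping[str, Iterable[str]]],
) -> None:
    """Bring the two dicts in line with the latest ACL.

    fetch_acl stands in for an HTTP call to /acl; its return shape
    ({"allow": [...], "deny": [...]}) is an assumption, not the real API.
    """
    acl = fetch_acl()
    for target, key in ((allow, "allow"), (deny, "deny")):
        fresh = set(acl.get(key, ()))
        # Drop entries that are no longer listed upstream.
        for stale in set(target) - fresh:
            del target[stale]
        # Add (or keep) everything that is currently listed.
        for entry in fresh:
            target[entry] = True
```

Updating the dicts in place, rather than swapping in new ones, mirrors how a shared dict would be used: readers keep seeing one dictionary while entries are added and removed.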

content_by_lua

I have /refresh and /status pages for testing. With lua_code_cache off, you get a reasonable flow for testing changes without creating an elaborate test harness.

rewrite_by_lua

Runs on every request. We have 3 strategies, all using two shared dictionaries, allow and deny, declared with lua_shared_dict (shared between worker processes and safe for concurrent access):

rewrite-allow.lua

For when you only want to serve a known list of CIDs. Assumes a periodic refresh of the allow/deny dicts from the API. It checks that the CID is allowed, then checks whether CID/PATH is denied.

Note: if you don't call ngx.exit() in this script, the request goes on to the next phase (i.e. proxying to the backend).
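The allow-then-deny decision can be sketched in Python as below. This is a model of the logic described above, not the Lua script itself: the function names, the `/ipfs/<cid>/<path>` URL layout, and the use of 403 as the rejection status are assumptions. Returning `None` stands for "do not call ngx.exit(), fall through to the proxy phase".

```python
from typing import Container, Optional, Tuple

FORBIDDEN = 403  # assumed rejection status


def split_ipfs_uri(uri: str) -> Tuple[str, str]:
    """Split '/ipfs/<cid>/<path>' into (cid, 'cid/path').

    Returns ('', '') when the URI does not match the assumed layout.
    """
    parts = uri.strip("/").split("/")
    if len(parts) < 2 or parts[0] != "ipfs":
        return "", ""
    return parts[1], "/".join(parts[1:])


def rewrite_allow(
    uri: str, allow: Container[str], deny: Container[str]
) -> Optional[int]:
    """Return a status code to exit with, or None to continue to the proxy."""
    cid, cid_path = split_ipfs_uri(uri)
    if cid not in allow:
        return FORBIDDEN  # CID is not on the allowlist
    if cid_path in deny:
        return FORBIDDEN  # CID is allowed, but this path under it is denied
    return None  # fall through to the next phase
```

For example, with `allow = {"bafyA"}` and `deny = {"bafyA/secret.txt"}`, a request for `/ipfs/bafyA/index.html` falls through, while `/ipfs/bafyA/secret.txt` and anything under an unlisted CID are rejected.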

rewrite-deny.lua

Like allow, but only checks whether CID/PATH is denied. This is closer to the use case where you block "bad" or legally taken-down content.

rewrite-check.lua

This only allows/denies at the CID level (the assumption is that you have lots of CIDs and don't want to block paths underneath them).

On a miss it hits the API and adds the result to the allow/deny dict. If you can tolerate some requests slipping past, you could do the API call and dict update in another coroutine with ngx.timer.at(0, fn).

Note the early return if the CID is allowed; that ends this script's processing of the request.
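The check-and-cache flow might look like the following Python model. It is only a sketch: `rewrite_check` and the `check_cid` callable (which stands in for the HTTP call to the check_cid endpoint and is assumed to return a plain boolean) are hypothetical names, and 403 is an assumed rejection status.

```python
from typing import Callable, Dict, Optional

FORBIDDEN = 403  # assumed rejection status


def rewrite_check(
    cid: str,
    allow: Dict[str, bool],
    deny: Dict[str, bool],
    check_cid: Callable[[str], bool],
) -> Optional[int]:
    """Return a status code to exit with, or None to continue to the proxy."""
    if cid in allow:
        return None  # early return: known-good CID, no API call
    if cid in deny:
        return FORBIDDEN  # known-bad CID, no API call
    # Novel CID: ask the (slow) API once and remember the answer.
    if check_cid(cid):
        allow[cid] = True
        return None
    deny[cid] = True
    return FORBIDDEN
```

Only the first request for a given CID pays for the API call; every later request is answered from the dicts, which is why the slowness of the API should matter less as the dicts fill up.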

Notes

The shape of the API has been tweaked a little but should not be considered canonical or even good, especially check_cid: the shape of what it returns just matches /acl, because I wrote that first and didn't want to rework it for a PoC. You probably want to return less (maybe even just a status code?) on that call.

Performance

It performs OK. I took one day's worth of URIs from a PL gateway and ran 20k requests (I made sure nginx started only one worker).

Hitting the (slooooow) API

() $ baton -u http://localhost:3000/ -c 30 -r 20000 -z urls.txt
Configuring to send requests from file. (Read 594884 requests)
Generating the requests...
Finished generating the requests
Sending the requests to the server...
Finished sending the requests
Processing the results...
=========================== Results ========================================
Total requests:                                 20000
Time taken to complete requests:        21.422449757s
Requests per second:                              934
Max response time (ms):                           199
Min response time (ms):                             0
Avg response time (ms):                         31.55
========= Percentage of responses by status code ==========================
Number of connection errors:                        0
Number of 1xx responses:                            0
Number of 2xx responses:                            0
Number of 3xx responses:                            0
Number of 4xx responses:                        20000
Number of 5xx responses:                            0
========= Percentage of responses received within a certain time (ms)======
        64% : 19 ms
        64% : 38 ms
        64% : 57 ms
        64% : 76 ms
        94% : 95 ms
        99% : 114 ms
        99% : 133 ms
        99% : 152 ms
        99% : 171 ms
       100% : 199 ms
===========================================================================

Rerun (presumably now all cached in the dicts)

() $ baton -u http://localhost:3000/ -c 30 -r 20000 -z urls.txt
Configuring to send requests from file. (Read 594884 requests)
Generating the requests...
Finished generating the requests
Sending the requests to the server...
Finished sending the requests
Processing the results...
=========================== Results ========================================
Total requests:                                 20000
Time taken to complete requests:         1.853002984s
Requests per second:                            10793
Max response time (ms):                            21
Min response time (ms):                             0
Avg response time (ms):                          2.26
========= Percentage of responses by status code ==========================
Number of connection errors:                        0
Number of 1xx responses:                            0
Number of 2xx responses:                            0
Number of 3xx responses:                            0
Number of 4xx responses:                        20000
Number of 5xx responses:                            0
========= Percentage of responses received within a certain time (ms)======
        80% : 2 ms
        96% : 4 ms
        98% : 6 ms
        99% : 8 ms
        99% : 10 ms
        99% : 12 ms
        99% : 14 ms
        99% : 16 ms
        99% : 18 ms
       100% : 21 ms
===========================================================================
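To put the two runs side by side, the throughput figures can be recomputed from the totals baton reports. The snippet below uses only the request count and the two wall-clock times from the output above; the variable names are just for illustration.

```python
TOTAL_REQUESTS = 20000

# Wall-clock times reported by baton for each run, in seconds.
cold_seconds = 21.422449757  # first run, hitting the API for novel CIDs
warm_seconds = 1.853002984   # rerun, answered from the shared dicts

cold_rps = TOTAL_REQUESTS / cold_seconds
warm_rps = TOTAL_REQUESTS / warm_seconds
speedup = cold_seconds / warm_seconds

print(f"cold: {cold_rps:.0f} req/s")
print(f"warm: {warm_rps:.0f} req/s")
print(f"speedup: {speedup:.1f}x")
```

The rounded throughputs match the "Requests per second" lines above (934 and 10793), and the rerun comes out roughly 11.6 times faster than the first run.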

Might want to experiment more with dict sizes and different proportions of novel CIDs.

About

Spike of letting nginx grab access control lists for ipfs gateways
