
Sync-ing latest waiter code - 20170704 (#48)

* -- sync-ing latest waiter code

* -- update jet dependency

* -- remove dependency on waiter.auth.spnego in waiter.websocket

* -- avoid default content-type fix
shamsimam authored and dposada committed Jul 6, 2017
1 parent 2f8d2fc commit 64279418bcfb8f3f319bf89220b4a783f81e89e7
Showing with 2,650 additions and 828 deletions.
  1. +2 −0 .gitignore
  2. +21 −8 waiter/README.md
  3. +7 −1 waiter/config-full.edn
  4. +2 −0 waiter/docs/parameters.md
  5. +2 −2 waiter/integration/waiter/async_request_integration_test.clj
  6. +2 −2 waiter/integration/waiter/autoscaling_test.clj
  7. +59 −7 waiter/integration/waiter/basic_test.clj
  8. +2 −2 waiter/integration/waiter/busy_instance_test.clj
  9. +2 −2 waiter/integration/waiter/cookie_support_test.clj
  10. +2 −2 waiter/integration/waiter/instance_reservation_test.clj
  11. +2 −2 waiter/integration/waiter/killed_instance_test.clj
  12. +3 −3 waiter/integration/waiter/latency_test.clj
  13. +2 −2 waiter/integration/waiter/metrics_output_test.clj
  14. +2 −2 waiter/integration/waiter/new_app_test.clj
  15. +2 −2 waiter/integration/waiter/request_method_test.clj
  16. +3 −2 waiter/integration/waiter/request_timeout_test.clj
  17. +2 −2 waiter/integration/waiter/response_headers_test.clj
  18. +2 −2 waiter/integration/waiter/statsd_support_test.clj
  19. +2 −2 waiter/integration/waiter/streaming_test.clj
  20. +46 −3 waiter/integration/waiter/token_request_test.clj
  21. +264 −0 waiter/integration/waiter/websocket_integration_test.clj
  22. +2 −2 waiter/integration/waiter/work_stealing_integration_test.clj
  23. +14 −12 waiter/project.clj
  24. +6 −1 waiter/resources/web/consent.html
  25. +2 −2 waiter/src/waiter/async_request.clj
  26. +2 −2 waiter/src/waiter/async_utils.clj
  27. +46 −5 waiter/src/waiter/auth/authentication.clj
  28. +6 −6 waiter/src/waiter/auth/kerberos.clj
  29. +34 −55 waiter/src/waiter/auth/spnego.clj
  30. +5 −6 waiter/src/waiter/client_tools.clj
  31. +22 −9 waiter/src/waiter/cookie_support.clj
  32. +112 −44 waiter/src/waiter/core.clj
  33. +2 −2 waiter/src/waiter/correlation_id.clj
  34. +2 −2 waiter/src/waiter/cors.clj
  35. +14 −5 waiter/src/waiter/curator.clj
  36. +2 −2 waiter/src/waiter/discovery.clj
  37. +17 −12 waiter/src/waiter/handler.clj
  38. +20 −5 waiter/src/waiter/headers.clj
  39. +2 −2 waiter/src/waiter/kv.clj
  40. +4 −3 waiter/src/waiter/main.clj
  41. +10 −5 waiter/src/waiter/marathon.clj
  42. +18 −2 waiter/src/waiter/metrics.clj
  43. +9 −8 waiter/src/waiter/metrics_sync.clj
  44. +2 −2 waiter/src/waiter/monitoring.clj
  45. +2 −2 waiter/src/waiter/password_store.clj
  46. +184 −157 waiter/src/waiter/process_request.clj
  47. +2 −2 waiter/src/waiter/scaling.clj
  48. +11 −16 waiter/src/waiter/scheduler.clj
  49. +12 −2 waiter/src/waiter/schema.clj
  50. +2 −2 waiter/src/waiter/security.clj
  51. +4 −4 waiter/src/waiter/service.clj
  52. +53 −16 waiter/src/waiter/service_description.clj
  53. +12 −6 waiter/src/waiter/settings.clj
  54. +11 −7 waiter/src/waiter/shell_scheduler.clj
  55. +2 −2 waiter/src/waiter/simulator.clj
  56. +3 −3 waiter/src/waiter/state.clj
  57. +2 −2 waiter/src/waiter/statsd.clj
  58. +34 −26 waiter/src/waiter/token.clj
  59. +22 −6 waiter/src/waiter/utils.clj
  60. +359 −0 waiter/src/waiter/websocket.clj
  61. +2 −2 waiter/src/waiter/work_stealing.clj
  62. +2 −2 waiter/test-files/test-bar.edn
  63. +2 −2 waiter/test-files/test-foo.edn
  64. +4 −4 waiter/test/waiter/async_request_test.clj
  65. +2 −2 waiter/test/waiter/async_utils_test.clj
  66. +42 −11 waiter/test/waiter/auth/authenticator_test.clj
  67. +2 −2 waiter/test/waiter/auth/kerberos_test.clj
  68. +13 −7 waiter/test/waiter/auth/spnego_test.clj
  69. +23 −5 waiter/test/waiter/cookie_support_test.clj
  70. +200 −68 waiter/test/waiter/core_test.clj
  71. +2 −2 waiter/test/waiter/correlation_id_test.clj
  72. +2 −2 waiter/test/waiter/cors_test.clj
  73. +2 −2 waiter/test/waiter/curator_test.clj
  74. +2 −2 waiter/test/waiter/discovery_test.clj
  75. +2 −2 waiter/test/waiter/dummy_test.clj
  76. +68 −30 waiter/test/waiter/handler_test.clj
  77. +2 −2 waiter/test/waiter/headers_test.clj
  78. +2 −2 waiter/test/waiter/kv_test.clj
  79. +27 −17 waiter/test/waiter/marathon_test.clj
  80. +18 −13 waiter/test/waiter/metrics_sync_test.clj
  81. +2 −2 waiter/test/waiter/metrics_test.clj
  82. +2 −2 waiter/test/waiter/mocks.clj
  83. +2 −2 waiter/test/waiter/monitoring_test.clj
  84. +2 −2 waiter/test/waiter/password_store_test.clj
  85. +64 −30 waiter/test/waiter/process_request_test.clj
  86. +2 −2 waiter/test/waiter/scaling_test.clj
  87. +9 −7 waiter/test/waiter/scheduler_test.clj
  88. +2 −2 waiter/test/waiter/schema_test.clj
  89. +2 −2 waiter/test/waiter/security_test.clj
  90. +199 −57 waiter/test/waiter/service_description_test.clj
  91. +2 −2 waiter/test/waiter/service_test.clj
  92. +2 −2 waiter/test/waiter/settings_test.clj
  93. +31 −13 waiter/test/waiter/shell_scheduler_test.clj
  94. +2 −2 waiter/test/waiter/simulator_test.clj
  95. +2 −2 waiter/test/waiter/state_service_responder_test.clj
  96. +2 −2 waiter/test/waiter/state_test.clj
  97. +2 −2 waiter/test/waiter/statsd_test.clj
  98. +2 −2 waiter/test/waiter/test_helpers.clj
  99. +67 −11 waiter/test/waiter/token_test.clj
  100. +2 −2 waiter/test/waiter/utils_test.clj
  101. +2 −2 waiter/test/waiter/utils_test_ns.clj
  102. +326 −0 waiter/test/waiter/websocket_test.clj
  103. +2 −2 waiter/test/waiter/work_stealing_test.clj
  104. +2 −2 waiter/test/waiter/zk_test.clj
@@ -1 +1,3 @@
.DS_Store
.idea/
.lein*
@@ -1,4 +1,6 @@
Waiter is a distributed autoscaler and load balancer for managing web services at scale. Waiter particularly excels at running services with unpredictable loads or multiple co-existing versions. Waiter uses [Marathon](https://mesosphere.github.io/marathon/) to schedule services on a [Mesos](http://mesos.apache.org/) cluster.
Waiter is a distributed autoscaler and load balancer for managing web services at scale.
Waiter particularly excels at running services with unpredictable loads or multiple co-existing versions.
Waiter uses [Marathon](https://mesosphere.github.io/marathon/) to schedule services on a [Mesos](http://mesos.apache.org/) cluster.

## Running Waiter

@@ -8,15 +10,19 @@ Prerequisites:
* [Leiningen](http://leiningen.org/)
* A running [Marathon](https://mesosphere.github.io/marathon/)

The quickest way to get Mesos, Marathon, and Waiter running locally is with [docker](https://www.docker.com/) and [minimesos](https://minimesos.org/). Check out the [Quickstart](../README.md#quickstart) for details.
The quickest way to get Mesos, Marathon, and Waiter running locally is with [docker](https://www.docker.com/) and [minimesos](https://minimesos.org/).
Check out the [Quickstart](../README.md#quickstart) for details.

Read the [config-minimal.edn](config-minimal.edn) or [config-full.edn](config-full.edn) files for descriptions of the Waiter config structure. Waiter logs are in `/log`, and `waiter.log` should contain info on what went wrong if Waiter doesn't start.
Read the [config-minimal.edn](config-minimal.edn) or [config-full.edn](config-full.edn) files for descriptions of the Waiter config structure.
Waiter logs are in `/log`, and `waiter.log` should contain info on what went wrong if Waiter doesn't start.

## Running Waiter tests

To run all unit tests, simply run `lein test`. The unit tests run very fast, and they do not require Waiter to be up and running.

The Waiter integration tests require Waiter to be up and running. The integration tests rely heavily on the [kitchen test app](../kitchen). They therefore need to know where kitchen is installed on your Mesos agent(s), so that they can send the appropriate command to Waiter. You can customize this path by setting the `WAITER_TEST_KITCHEN_CMD` environment variable.
The Waiter integration tests require Waiter to be up and running. The integration tests rely heavily on the [kitchen test app](../kitchen).
They therefore need to know where kitchen is installed on your Mesos agent(s), so that they can send the appropriate command to Waiter.
You can customize this path by setting the `WAITER_TEST_KITCHEN_CMD` environment variable.

Once Waiter has started:

@@ -41,9 +47,11 @@ $ WAITER_TEST_REQUEST_LATENCY_MAX_INSTANCES=30 APACHE_BENCH_DIR=/usr/bin lein te

## What is Waiter

Waiter is a web service platform that runs, manages and automatically scales services without human intervention. Waiter particularly excels at running services with unpredictable loads or requiring intensive computing resources.
Waiter is a web service platform that runs, manages and automatically scales services without human intervention.
Waiter particularly excels at running services with unpredictable loads or requiring intensive computing resources.

Developers register services on Waiter by simply supplying their service startup command via a token or directly in a HTTP request header. Waiter takes care of managing the entire lifecycle of services from that point on, including running the service when and only when there is traffic, scaling the service up when there is more traffic, and tearing down the service if there is no traffic.
Developers register services on Waiter by simply supplying their service startup command via a token or directly in a HTTP request header.
Waiter takes care of managing the entire lifecycle of services from that point on, including running the service when and only when there is traffic, scaling the service up when there is more traffic, and tearing down the service if there is no traffic.

Managing service lifecycle is the key differentiator of Waiter that empowers developers to build more and faster.

@@ -56,13 +64,18 @@ Waiter was designed with simplicity in mind - your existing web services can run

### Handling Unpredictable Traffic with Optimal Resource Utilization

In many (arguably most) situations, it is hard to anticipate how much traffic your service will receive and when it will come. At Two Sigma, we run massive batch workloads to simulate real trading environments. The demand fluctuates and is highly unpredictable. The services that serve those workloads may see zero requests for several days in a row and then suddenly see thousands of requests per second. Capacity planning becomes infeasible. If we underestimate the traffic, the services can be easily overwhelmed, become unresponsive, or even crash, resulting in constant human intervention and poor developer productivity. If we provision sufficient capacity, then the allocated resources are completely wasted when there is no traffic.
In many (arguably most) situations, it is hard to anticipate how much traffic your service will receive and when it will come.
At Two Sigma, we run massive batch workloads to simulate real trading environments.
The demand fluctuates and is highly unpredictable. The services that serve those workloads may see zero requests for several days in a row and then suddenly see thousands of requests per second. Capacity planning becomes infeasible. If we underestimate the traffic, the services can be easily overwhelmed, become unresponsive, or even crash, resulting in constant human intervention and poor developer productivity. If we provision sufficient capacity, then the allocated resources are completely wasted when there is no traffic.

Waiter solves this problem. When demand increases, Waiter launches more instances to meet demand and scales down when those instances are no longer needed.

### Machine Learning at Scale

Many machine learning libraries and algorithms are memory hungry or CPU hungry. By running machine learning services on Waiter, you can easily scale these services on demand. At Two Sigma, Waiter serves and scales machine learning services critical to our business. We have services running on Waiter handling thousands of requests per second.
Many machine learning libraries and algorithms are memory hungry or CPU hungry.
By running machine learning services on Waiter, you can easily scale these services on demand.
At Two Sigma, Waiter serves and scales machine learning services critical to our business.
We have services running on Waiter handling thousands of requests per second.

## Documentation

@@ -318,7 +318,9 @@
;; For the other parameters, if the user does not provide a
;; value for the parameter when constructing her service
;; description, these defaults will be used:
:service-description-defaults {"blacklist-on-503" true
:service-description-defaults {"authentication" "standard"
"backend-proto" "http"
"blacklist-on-503" true
"concurrency-level" 1
"distribution-scheme" "balanced"
"env" {"FOO" "bar"
@@ -377,6 +379,10 @@
;; The HTTP idle timeout (milliseconds) for instance requests:
:initial-socket-timeout-ms 900000

;; Size in bytes of the output buffer used to aggregate HTTP responses from a backend.
;; The value must be a positive integer.
:output-buffer-size 2048

;; The default amount of time (milliseconds) each request will wait in
;; the Waiter queue before an instance is available to process it. This
;; can be overridden in the service description:
@@ -16,6 +16,8 @@ Additional (optional) parameters that can be set:

|Parameter|Default Value|Valid Values|Description|Guidance|
|---------|-------------|------------|-----------|--------|
|`X-Waiter-Authentication`|standard|disabled or standard|The authentication mechanism to use for incoming requests.|By default, Waiter authenticates incoming requests using the standard protocol (e.g. Kerberos). If you would prefer that Waiter not authenticate incoming requests, set this flag to disabled.|
|`X-Waiter-Backend-Proto`|http|http or https|The backend connection protocol to use.|By default, Waiter connects to backend instances using the HTTP protocol. If you would prefer that Waiter use HTTPS, feel free to set this parameter to https.|
|`X-Waiter-Blacklist-On-503`|true|true or false|If an instance returns 503, whether or not the instance will be blacklisted.|By default, Waiter avoids instances that are returning a 503 error code in responses, which typically indicates the server is too busy. If you would prefer that Waiter not do this, feel free to disable this feature.|
|`X-Waiter-Cmd-Type`|"shell"|"shell"|Provides an extension point for supporting different types of commands in the future.|Feel free to add new command types to suit your needs.|
|`X-Waiter-Concurrency-Level`|1|1-10000|The number of simultaneous requests to an individual backend instance.|Increasing this value will likely lead to better performance and better resource utilization. Avoid queuing your requests on your backend instances by not increasing concurrency level above what an individual instance can handle. The shorter your requests, the more benefit you will get out of a higher concurrency level.|
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -17,15 +17,19 @@
[clojure.test :refer :all]
[clojure.tools.logging :as log]
[waiter.client-tools :refer :all]
[waiter.service-description :as sd]))
[waiter.service-description :as sd])
(:import java.io.ByteArrayInputStream))

(deftest ^:parallel ^:integration-fast test-basic-secrun
(testing-using-waiter-url
(log/info (str "Basic test using endpoint: /secrun"))
(let [response (make-request-with-debug-info
{:x-waiter-name (rand-name "testbasic")}
#(make-kitchen-request waiter-url % :path "/secrun"))]
(delete-service waiter-url (:service-id response)))))
(let [{:keys [body service-id] :as response}
(make-request-with-debug-info
{:x-waiter-name (rand-name "testbasic")}
#(make-kitchen-request waiter-url % :path "/secrun"))]
(assert-response-status response 200)
(is (= "Hello World" body))
(delete-service waiter-url service-id))))

(deftest ^:parallel ^:integration-fast test-basic-no-content
(testing-using-waiter-url
@@ -43,6 +47,54 @@
(is (= "text/plain" (get-in body-json ["headers" "accept"])) (str body))
(delete-service waiter-url service-id))))

(deftest ^:parallel ^:integration-fast test-basic-http-methods-support
(testing-using-waiter-url
(log/info "Basic test for empty body in request")
(let [{:keys [request-headers service-id]} (make-request-with-debug-info
{:x-waiter-name (rand-name "test-basic-http-methods-support")
:accept "text/plain"}
#(make-kitchen-request waiter-url % :path "/hello"))
http-method-helper (fn http-method-helper [http-method]
(fn inner-http-method-helper [url & [req]]
(http/request (merge req {:method http-method :url url}))))]
(testing "verifying whether request method HEAD works"
(let [response (make-kitchen-request waiter-url request-headers
:http-method-fn (http-method-helper :head)
:path "/request-info")]
(assert-response-status response 200)
(is (nil? (:body response)))))
(doseq [request-method [:delete :copy :get :move :patch :post :put]]
(testing (str "verifying whether request method " (-> request-method name str/upper-case) " works")
(let [{:keys [body] :as response} (make-kitchen-request waiter-url request-headers
:http-method-fn (http-method-helper request-method)
:path "/request-info")
body-json (json/read-str (str body))]
(assert-response-status response 200)
(is (= (name request-method) (get body-json "request-method"))))))
(delete-service waiter-url service-id))))

(deftest ^:parallel ^:integration-fast test-request-content-headers
(testing-using-waiter-url
(let [request-length 100000
long-request (apply str (repeat request-length "a"))
headers {:x-waiter-name (rand-name "testcontentheaders")}
{:keys [service-id cookies] :as plain-resp} (make-request-with-debug-info
headers
#(make-kitchen-request waiter-url % :path "/request-info" :body long-request))
chunked-resp (make-request-with-debug-info
headers
#(make-kitchen-request waiter-url % :path "/request-info"
; force clj-http to make a chunked request
:body (ByteArrayInputStream. (.getBytes long-request))
:cookies cookies))
plain-body-json (json/read-str (str (:body plain-resp)))
chunked-body-json (json/read-str (str (:body chunked-resp)))]
(is (= (str request-length) (get-in plain-body-json ["headers" "content-length"])))
(is (nil? (get-in plain-body-json ["headers" "transfer-encoding"])))
(is (= "chunked" (get-in chunked-body-json ["headers" "transfer-encoding"])))
(is (nil? (get-in chunked-body-json ["headers" "content-length"])))
(delete-service waiter-url service-id))))

(deftest ^:parallel ^:integration-fast test-large-header
(testing-using-waiter-url
(log/info (str "Basic test using endpoint: /secrun"))
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -69,7 +69,7 @@

(deftest ^:perf test-request-latency-apache-bench
(testing-using-waiter-url
(let [max-instances (Integer/parseInt (or (System/getenv "WAITER_TEST_REQUEST_LATENCY_MAX_INSTANCES") 50))
(let [max-instances (Integer/parseInt (or (System/getenv "WAITER_TEST_REQUEST_LATENCY_MAX_INSTANCES") "50"))
_ (log/info "using max-instances =" max-instances)
client-concurrency-level 800
waiter-concurrency-level 4
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.
@@ -1,9 +1,9 @@
;;
;; Copyright (c) 2017 Two Sigma Investments, LLC.
;; Copyright (c) 2017 Two Sigma Investments, LP.
;; All Rights Reserved
;;
;; THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF
;; Two Sigma Investments, LLC.
;; Two Sigma Investments, LP.
;;
;; The copyright notice above does not evidence any
;; actual or intended publication of such source code.