exclusive worker mode as default (#513)
* exclusive worker mode as default
* context settings - rename max-parallel-jobs to max-jobs in mist's config to keep consistent naming. Its JSON representation uses maxJobs for this field
dos65 committed Aug 23, 2018
1 parent 7e598b0 commit 186be19
Showing 9 changed files with 121 additions and 29 deletions.
2 changes: 1 addition & 1 deletion docs/src/main/tut/01_about.md
@@ -10,7 +10,7 @@ Hydrosphere Mist is a serverless proxy for Spark.
It provides a better way to write, deploy, run, and manage spark applications.

Features:
-- Spark Function as a Service. Deploy Spark functions rather than nodetebooks or scripts.
+- Spark Function as a Service. Deploy Spark functions rather than notebooks or scripts.
- Typeful library api for scala and java + python dsl.
- Spark context management. Decoupling of the user API from Spark settings, cluster provisioning, resource isolation, sharing and auto-scaling.
- Support for several communication interfaces:
12 changes: 8 additions & 4 deletions docs/src/main/tut/02_quick_start.md
@@ -69,7 +69,6 @@ Demo:
<source src="/mist-docs/img/quick-start-ui.webm" type='video/webm; codecs="vp8, vorbis"'>
</video>


### Build your own function

Mist provides a typeful library for writing functions in scala/java and a special dsl for python.
@@ -316,7 +315,6 @@ def hello_mist(sc, n):
return {'result': pi}
```


### Connect to your existing Apache Spark cluster

**Note** For this section it's recommended to use Mist from the binary distribution. Using Mist from Docker
@@ -345,7 +343,7 @@ Configuring a particular function to be executed on a different Spark Cluster is
```

Yarn, Mesos and Kubernetes cluster settings are managed the same way.
-Please, follow to offical guides:
+Please follow the official guides:
- [Yarn](https://spark.apache.org/docs/latest/running-on-yarn.html)
- [Mesos](https://spark.apache.org/docs/latest/running-on-mesos.html)
- [Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html)
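
For instance, pointing a context at a standalone Spark cluster is typically just a `spark.master` entry in the context's `spark-conf` (a minimal sketch; the master URL is a placeholder):

```
spark-conf {
  spark.master = "spark://spark-master-host:7077"
}
```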
@@ -354,5 +352,11 @@ Please, follow to offical guides:

**Note** It may be necessary to configure `host` values correctly to make Mist visible from outside - see the [configuration page](/mist-docs/configuration.html)

-Mist uses `spark-submit` under the hood, if you need to provide environment variables for it, just set it up before launching Mist Worker.
+Mist uses `spark-submit` under the hood; if you need to provide environment variables for it, set them before launching Mist Master.

### Next

For more information about how Mist works and how to configure contexts, check the following pages:
- [Invocation details](/mist-docs/invocation.html)
- [Context configuration](/mist-docs/contexts.html)

28 changes: 26 additions & 2 deletions docs/src/main/tut/07_invocation.md
@@ -5,15 +5,38 @@ permalink: invocation.html
position: 7
---

-### Invocation
+### Invocation model

-Job could be in one of possible statuses:
Mist treats every function invocation as a job.
To invoke a function, Mist needs a Spark driver application with a Spark context.
This application is called `mist-worker`, and Mist starts and manages these workers automatically.

Before starting job execution, the Mist master queues jobs
and waits until there is a free worker for the context in which the job should be executed.
There are two worker modes for contexts: `exclusive` and `shared`.
In `exclusive` mode, Mist starts a new worker for every job and shuts it down after the job completes.
In `shared` mode, Mist doesn't shut workers down right after job completion; it reuses them for subsequent jobs in the same context (a configuration sketch follows the list below).

This invocation model also provides the following benefits:
- **Parallelization**:
For a particular context it's possible to start more than one `mist-worker` at the same time and balance job requests between them,
so that jobs are executed in parallel. The parallelization level is controlled by `max-jobs` in the context settings - [contexts doc](/mist-docs/contexts.html)
- **Fault tolerance**:
If a worker crashes, Mist tries to restart it to invoke the next job.
If workers crash more than `maxConnFailures` ([contexts doc](/mist-docs/contexts.html)) times in a row, the context is marked as broken
and fails all jobs until its settings are updated.
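
A minimal sketch of these settings in Mist's HOCON context configuration (the key names follow the `context-defaults` block of `master.conf` changed later in this commit; the values are illustrative):

```
worker-mode = "shared"   # reuse a long-lived worker; "exclusive" (the new default) starts one worker per job
max-jobs = 2             # let two jobs of this context run in parallel
downtime = 30 minutes    # stop an idle shared worker after half an hour
```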

### Job statuses

A job may be in one of the following statuses:
- `queued`
- `started`
- `canceled`
- `finished`
- `failed`

### Steps

Mist performs the following steps after receiving a new job request:
- assign an `id` to the job.
For example, if you use http:
@@ -32,3 +55,4 @@ Steps on Mist after receiving a new job request:
- transmit the function artifact (jar/py) to the worker
- invoke the function. The status turns into `started` and then, after completion, `finished` or `failed`
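
As an illustration of the flow above, a hypothetical HTTP exchange (the endpoint path and field name are assumptions for illustration, not taken from this commit):

```
POST /v2/api/functions/hello-mist/jobs   =>   { "id": "6f0a3c2e-..." }
```

The returned `id` can then be used to track the job as it moves through `queued`, `started` and, finally, `finished` or `failed`.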
6 changes: 6 additions & 0 deletions docs/src/main/tut/10_configuration.md
@@ -100,3 +100,9 @@ mist {
}
}
```

### Default context

To override settings for the `default` context, use the `mist.context-defaults` prefix - see [master.conf](https://github.com/Hydrospheredata/mist/blob/master/mist/master/src/main/resources/master.conf#L73).
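
For example, to switch the defaults back to the pre-#513 behavior, override just the relevant keys (a sketch; the key names come from the `context-defaults` block changed below in this commit):

```
mist.context-defaults {
  worker-mode = "shared"
  max-jobs = 4
}
```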


91 changes: 74 additions & 17 deletions docs/src/main/tut/11_contexts.md
@@ -4,23 +4,80 @@ title: "Contexts"
permalink: contexts.html
position: 11
---
-## Contexts
+### Contexts

Mist creates and orchestrates Apache Spark contexts automatically. Every job is run in a context.
In fact, a context describes a named Spark context together with the Mist settings for it.
Mist context settings:
- `spark-conf` - settings for a [spark](https://spark.apache.org/docs/latest/configuration.html)
- `max-parallel-jobs` - number of jobs that can be executed in parallel on the same context
- `run-options` - additional command line arguments that will be used during worker creation via spark-submit.
By default it's empty. [spark docs](https://spark.apache.org/docs/latest/submitting-applications.html)
- `streaming-duration` - spark streaming duration
- `worker-mode`:
There are two modes:
- `shared`:
By default, when you request a job to run for the first time, Mist creates a worker with a new Spark context.
Subsequent requests reuse the created context, which stays alive while Mist is running.
settings:
- `downtime` (duration) - idle timeout for a `shared` worker
- `precreated` (boolean) - start the context at Mist startup time

- `exclusive` - spawns a new driver application for every job invocation.

Contexts may be created using [mist-cli](/mist-docs/mist-cli.html) or the [http-api](/mist-docs/http_api.html).
Also, there is a special `default` context. It may be configured only via the Mist configuration file.
Its goal is to set up default values for all contexts, so creating a new context doesn't require defining values for all of its fields.
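
As an example, a context definition for mist-cli might look like the following sketch (the `model`/`name`/`data` layout and the context name `my-context` are assumptions for illustration):

```
model = Context
name = my-context
data {
  worker-mode = "shared"
  max-jobs = 2
  spark-conf {
    spark.executor.memory = "1g"
  }
}
```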

Settings:
<table>
<thead>
<tr>
<td>Key</td>
<td>Default</td>
<td>Meaning</td>
</tr>
</thead>
<tbody>
<tr>
<td>sparkConf</td>
<td>empty</td>
<td>settings for [Spark](https://spark.apache.org/docs/latest/configuration.html)</td>
</tr>
<tr>
<td>maxJobs</td>
<td>1</td>
<td>number of jobs executed in parallel</td>
</tr>
<tr>
<td>workerMode</td>
<td>exclusive</td>
<td>
<ul>
<li>exclusive - starts new worker for every job invocation</li>
<li>shared - reuses worker between several jobs</li>
</ul>
</td>
</tr>
<tr>
<td>maxConnFailures</td>
<td>5</td>
<td>
allowed number of worker crashes before the context is switched into `broken` state
(it fails all incoming requests until the context settings are updated).
</td>
</tr>
<tr>
<td>runOptions</td>
<td>""</td>
<td>
additional command line arguments for the spark-submit command that starts a worker, e.g. pass `--jars`
</td>
</tr>
<tr>
<td>streamingDuration</td>
<td>1s</td>
<td>
spark streaming duration
</td>
</tr>
<tr>
<td>precreated</td>
<td>false</td>
<td>
if true, the worker starts immediately;
if false, the worker starts on the first job request.
*NOTE*: works only with `shared` workerMode
</td>
</tr>
<tr>
<td>downtime</td>
<td>1 hour</td>
<td>idle timeout for a `shared` worker</td>
</tr>
</tbody>
</table>
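
The camelCase keys in this table form the JSON representation of the same settings (as the commit message notes, `max-jobs` in the file config appears as `maxJobs` in JSON). A hypothetical http-api payload using only keys from the table:

```
{
  "name": "my-context",
  "workerMode": "shared",
  "maxJobs": 2
}
```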
2 changes: 1 addition & 1 deletion docs/src/main/tut/13_mist_cli.md
@@ -13,7 +13,7 @@ position: 13
Mist-cli provides a command line interface to the mist server for managing contexts/functions.
Under the hood it uses the [http api](/mist_docs/http_api.html).

-Instanciation of a new endpoint on mist requres following steps:
+Instantiation of a new endpoint on mist requires the following steps:
- upload an artifact with the function
- create a context for its invocation
- create the function
5 changes: 3 additions & 2 deletions mist/master/src/main/resources/master.conf
@@ -73,9 +73,10 @@ mist {
context-defaults {
downtime = 1 hour
streaming-duration = 1 seconds
-max-parallel-jobs = 4
+max-parallel-jobs = 1
+max-jobs = 1
precreated = false
-worker-mode = "shared" # shared | exclusive
+worker-mode = "exclusive" # shared | exclusive
spark-conf {
#spark.default.parallelism = 128
#spark.driver.memory = "512m"
@@ -74,7 +74,7 @@ object ConfigRepr {
.map(entry => cleanKey(entry.getKey) -> entry.getValue.unwrapped().toString)
.toMap,
downtime = Duration(config.getString("downtime")),
-maxJobs = config.getInt("max-parallel-jobs"),
+maxJobs = config.getOptInt("max-jobs").getOrElse(config.getInt("max-parallel-jobs")),
precreated = config.getBoolean("precreated"),
workerMode = runMode(config.getString("worker-mode")) ,
runOptions = config.getString("run-options"),
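
The `getOptInt` call above lets the new `max-jobs` key take precedence while falling back to the legacy `max-parallel-jobs` key. Typesafe Config has no built-in optional getters, so a helper like this is presumably defined elsewhere in the project; a minimal sketch of such an extension (the actual implementation may differ):

```scala
import com.typesafe.config.Config

object ConfigSyntax {
  implicit class ConfigOps(val config: Config) extends AnyVal {
    // Some(value) when the path is present, None otherwise.
    def getOptInt(path: String): Option[Int] =
      if (config.hasPath(path)) Some(config.getInt(path)) else None
  }
}
```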
@@ -54,7 +54,7 @@ class ContextsStorageSpec extends FunSpec with Matchers with BeforeAndAfter {

it("should fallback to default") {
val contexts = testStorage()
-val expected = TestUtils.contextSettings.default.copy(name = "new")
+val expected = contexts.defaultConfig.copy(name = "new")
contexts.getOrDefault("new").await shouldBe expected
}

