<table class="table table-bordered">
<thead>
<tr>
<th class="text-left" style="width: 20%">Key</th>
<th class="text-left" style="width: 15%">Default</th>
<th class="text-left" style="width: 65%">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><h5>jobmanager.archive.fs.dir</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>Directory in which the JobManager stores the archives of completed jobs.</td>
</tr>
<tr>
<td><h5>jobmanager.execution.attempts-history-size</h5></td>
<td style="word-wrap: break-word;">16</td>
<td>The maximum number of prior execution attempts kept in history.</td>
</tr>
<tr>
<td><h5>jobmanager.execution.failover-strategy</h5></td>
<td style="word-wrap: break-word;">"full"</td>
<td>This option specifies how the job computation recovers from task failures. Accepted values are:<ul><li>'full': Restarts all tasks to recover the job.</li><li>'region': Restarts all tasks that could be affected by the task failure. More details can be found <a href="../dev/task_failure_recovery.html#restart-pipelined-region-failover-strategy">here</a>.</li><li>'region-fast': Behaves the same as 'region' but determines the tasks to restart faster, which helps when the job is large. The trade-off is a longer region building time and more memory used for caching.</li></ul></td>
</tr>
<tr>
<td><h5>jobmanager.heap.size</h5></td>
<td style="word-wrap: break-word;">"1024m"</td>
<td>JVM heap size for the JobManager.</td>
</tr>
<tr>
<td><h5>jobmanager.rpc.address</h5></td>
<td style="word-wrap: break-word;">(none)</td>
<td>The config parameter defining the network address to connect to for communication with the JobManager. This value is only interpreted in setups where a single JobManager with a static name or address exists (simple standalone setups, or container setups with dynamic service name resolution). It is not used in many high-availability setups, in which a leader-election service (like ZooKeeper) is used to elect and discover the JobManager leader from potentially multiple standby JobManagers.</td>
</tr>
<tr>
<td><h5>jobmanager.rpc.port</h5></td>
<td style="word-wrap: break-word;">6123</td>
<td>The config parameter defining the network port to connect to for communication with the JobManager. Like jobmanager.rpc.address, this value is only interpreted in setups where a single JobManager with a static name/address and port exists (simple standalone setups, or container setups with dynamic service name resolution). It is not used in many high-availability setups, in which a leader-election service (like ZooKeeper) is used to elect and discover the JobManager leader from potentially multiple standby JobManagers.</td>
</tr>
<tr>
<td><h5>jobstore.cache-size</h5></td>
<td style="word-wrap: break-word;">52428800</td>
<td>The size, in bytes, of the job store cache used to keep completed jobs in memory.</td>
</tr>
<tr>
<td><h5>jobstore.expiration-time</h5></td>
<td style="word-wrap: break-word;">3600</td>
<td>The time in seconds after which a completed job expires and is purged from the job store.</td>
</tr>
<tr>
<td><h5>jobstore.max-capacity</h5></td>
<td style="word-wrap: break-word;">2147483647</td>
<td>The maximum number of completed jobs that can be kept in the job store.</td>
</tr>
<tr>
<td><h5>slot.idle.timeout</h5></td>
<td style="word-wrap: break-word;">50000</td>
<td>The timeout in milliseconds for an idle slot in the Slot Pool.</td>
</tr>
<tr>
<td><h5>slot.request.timeout</h5></td>
<td style="word-wrap: break-word;">300000</td>
<td>The timeout in milliseconds for requesting a slot from the Slot Pool.</td>
</tr>
</tbody>
</table>
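<p>As a rough illustration of how some of the keys above might be set, the following is a minimal sketch of a configuration fragment, assuming the options are placed in Flink's <code>flink-conf.yaml</code>. The host name and the archive directory are hypothetical placeholders, not recommended values. The failover strategy is switched to 'region' purely as an example; the remaining values repeat the defaults listed in the table.</p>
<pre><code># Hypothetical host name; only interpreted in setups with a single, statically addressed JobManager.
jobmanager.rpc.address: jobmanager.example.internal
jobmanager.rpc.port: 6123

# JVM heap size for the JobManager (table default).
jobmanager.heap.size: 1024m

# Example choice: restart only the tasks that could be affected by a failure, instead of the default 'full'.
jobmanager.execution.failover-strategy: region

# Hypothetical directory for the archives of completed jobs.
jobmanager.archive.fs.dir: hdfs:///completed-jobs/

# Job store limits (table defaults): cache size in bytes, expiration time in seconds.
jobstore.cache-size: 52428800
jobstore.expiration-time: 3600

# Slot Pool timeouts in milliseconds (table defaults).
slot.idle.timeout: 50000
slot.request.timeout: 300000
</code></pre>
<p>Note the mixed units: the job store options use bytes and seconds, while the slot timeouts use milliseconds.</p>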