
Commit 7e9be3d

Update 338 docs
1 parent aca4837 commit 7e9be3d

536 files changed: +539063 -49880 lines changed
@@ -0,0 +1,89 @@

===================
Exchange Properties
===================

Exchanges transfer data between Presto nodes for different stages of
a query. Adjusting these properties may help to resolve inter-node
communication issues or improve network utilization.

``exchange.client-threads``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Minimum value:** ``1``
* **Default value:** ``25``

Number of threads used by exchange clients to fetch data from other Presto
nodes. A higher value can improve performance for large clusters or clusters
with very high concurrency, but excessively high values may cause a drop
in performance due to context switches and additional memory usage.

``exchange.concurrent-request-multiplier``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Minimum value:** ``1``
* **Default value:** ``3``

Multiplier determining the number of concurrent requests relative to
available buffer memory. The maximum number of requests is determined
using a heuristic of the number of clients that can fit into available
buffer space, based on the average buffer usage per request times this
multiplier. For example, with an ``exchange.max-buffer-size`` of ``32MB``,
``20MB`` already used, and an average size per request of ``2MB``,
the maximum number of clients is
``multiplier * ((32MB - 20MB) / 2MB) = multiplier * 6``. Tuning this
value adjusts the heuristic, which may increase concurrency and improve
network utilization.
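
Concretely, with the default multiplier of ``3`` and the numbers from the
example above, the heuristic works out to 18 concurrent requests:

.. code-block:: none

    3 * ((32MB - 20MB) / 2MB) = 3 * 6 = 18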

``exchange.data-integrity-verification``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Allowed values:** ``NONE``, ``ABORT``, ``RETRY``
* **Default value:** ``ABORT``

Configures the behavior when a data integrity issue is detected by the
built-in verification. The default, ``ABORT``, causes queries to be
aborted when a data integrity issue is detected. Setting the property to
``NONE`` disables the verification. ``RETRY`` causes the data exchange to be
repeated when integrity issues are detected.
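
For example, to retry the exchange instead of aborting the query when
verification detects a problem, the property can be set in
``etc/config.properties`` (assuming the standard Presto configuration
layout):

.. code-block:: none

    exchange.data-integrity-verification=RETRY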

``exchange.max-buffer-size``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``32MB``

Size of the buffer in the exchange client that holds data fetched from other
nodes before it is processed. A larger buffer can increase network
throughput for larger clusters, and thus decrease query processing time,
but reduces the amount of memory available for other usages.

``exchange.max-response-size``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Minimum value:** ``1MB``
* **Default value:** ``16MB``

Maximum size of a response returned from an exchange request. The response
is placed in the exchange client buffer, which is shared across all
concurrent requests for the exchange.

Increasing the value may improve network throughput if there is high
latency. Decreasing the value may improve query performance for large
clusters, as it reduces skew: the exchange client buffer holds
responses for more tasks, rather than holding more data from fewer tasks.

``sink.max-buffer-size``
^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``32MB``

Output buffer size for task data that is waiting to be pulled by upstream
tasks. If the task output is hash partitioned, then the buffer is
shared across all of the partitioned consumers. Increasing this value may
improve network throughput for data transferred between stages, if the
network has high latency, or if there are many nodes in the cluster.
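
As an illustrative sketch only (the right values depend on cluster size,
network latency, and workload), a deployment on a high-latency network
might raise both buffer sizes in ``etc/config.properties``:

.. code-block:: none

    exchange.max-buffer-size=64MB
    sink.max-buffer-size=64MB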
@@ -0,0 +1,38 @@

==================
General Properties
==================

``join-distribution-type``
^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Allowed values:** ``AUTOMATIC``, ``PARTITIONED``, ``BROADCAST``
* **Default value:** ``AUTOMATIC``

The type of distributed join to use. When set to ``PARTITIONED``, Presto
uses hash distributed joins. When set to ``BROADCAST``, it broadcasts the
right table to all nodes in the cluster that have data from the left table.
Partitioned joins require redistributing both tables using a hash of the
join key. This can be slower, sometimes substantially, than broadcast
joins, but allows much larger joins. In particular, broadcast joins are
faster if the right table is much smaller than the left. However, broadcast
joins require that the right side of the join, after filtering, fits in
memory on each node, whereas partitioned joins only need to fit in the
distributed memory across all nodes. When set to ``AUTOMATIC``, Presto
makes a cost-based decision as to which distribution type is optimal, and
also considers switching the left and right inputs to the join. In
``AUTOMATIC`` mode, Presto defaults to hash distributed joins if no cost
can be computed, for example when the tables do not have statistics. This
can be overridden on a per-query basis using the ``join_distribution_type``
session property.
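
For example, to force broadcast joins for a single session (a sketch using
the standard ``SET SESSION`` syntax; the value must be one of the allowed
values listed above):

.. code-block:: sql

    SET SESSION join_distribution_type = 'BROADCAST';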

``redistribute-writes``
^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

This property enables redistribution of data before writing. This can
eliminate the performance impact of data skew when writing, by hashing the
data across nodes in the cluster. It can be disabled when it is known that
the output data set is not skewed, in order to avoid the overhead of
hashing and redistributing all the data across the network. This can be
overridden on a per-query basis using the ``redistribute_writes`` session
property.
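
For example, to disable the redistribution for a single session (again a
sketch, using the session property named above):

.. code-block:: sql

    SET SESSION redistribute_writes = false;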
@@ -0,0 +1,72 @@

==================
Logging Properties
==================

``log.path``
^^^^^^^^^^^^

* **Type:** ``string``
* **Default value:** ``var/log/server.log``

The path to the log file used by Presto. The path is relative to the data
directory, configured by the launcher script as detailed in
:ref:`running_presto`.

``log.max-history``
^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``30``

The maximum number of general application log files to use, before log
rotation replaces old content.

``log.max-size``
^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``100MB``

The maximum file size for the general application log file.

``http-server.log.enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

Flag to enable or disable logging for the HTTP server.

``http-server.log.compression.enabled``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``boolean``
* **Default value:** ``true``

Flag to enable or disable compression of the log files of the HTTP server.

``http-server.log.path``
^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Default value:** ``var/log/http-request.log``

The path to the log file used by the HTTP server. The path is relative to
the data directory, configured by the launcher script as detailed in
:ref:`running_presto`.

``http-server.log.max-history``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``15``

The maximum number of log files for the HTTP server to use, before
log rotation replaces old content.

``http-server.log.max-size``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``unlimited``

The maximum file size for the log file of the HTTP server.
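
As a sketch with illustrative values, a deployment that wants to keep more
HTTP request log history while capping the size of each file could set the
following in ``etc/config.properties``:

.. code-block:: none

    http-server.log.enabled=true
    http-server.log.max-history=30
    http-server.log.max-size=1GB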
@@ -0,0 +1,66 @@

============================
Memory Management Properties
============================

``query.max-memory-per-node``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``JVM max memory * 0.1``

This is the max amount of user memory a query can use on a worker.
User memory is allocated during execution for things that are directly
attributable to, or controllable by, a user query. For example, memory used
by the hash tables built during execution, memory used during sorting, etc.
When the user memory allocation of a query on any worker hits this limit,
it is killed.

``query.max-total-memory-per-node``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``JVM max memory * 0.3``

This is the max amount of user and system memory a query can use on a worker.
System memory is allocated during execution for things that are not directly
attributable to, or controllable by, a user query. For example, memory allocated
by the readers, writers, network buffers, etc. When the sum of the user and
system memory allocated by a query on any worker hits this limit, it is killed.
The value of ``query.max-total-memory-per-node`` must be greater than
``query.max-memory-per-node``.

``query.max-memory``
^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``20GB``

This is the max amount of user memory a query can use across the entire cluster.
User memory is allocated during execution for things that are directly
attributable to, or controllable by, a user query. For example, memory used
by the hash tables built during execution, memory used during sorting, etc.
When the user memory allocation of a query across all workers hits this limit,
it is killed.

``query.max-total-memory``
^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``query.max-memory * 2``

This is the max amount of user and system memory a query can use across the
entire cluster. System memory is allocated during execution for things that
are not directly attributable to, or controllable by, a user query. For
example, memory allocated by the readers, writers, network buffers, etc.
When the sum of the user and system memory allocated by a query across all
workers hits this limit, it is killed. The value of
``query.max-total-memory`` must be greater than ``query.max-memory``.

``memory.heap-headroom-per-node``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``data size``
* **Default value:** ``JVM max memory * 0.3``

This is the amount of memory set aside as headroom/buffer in the JVM heap
for allocations that are not tracked by Presto.
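
As a sketch with illustrative values (real values must respect the
constraints above, i.e. each ``max-total`` limit greater than the
corresponding ``max`` limit), these properties are set in
``etc/config.properties``:

.. code-block:: none

    query.max-memory=50GB
    query.max-total-memory=100GB
    query.max-memory-per-node=2GB
    query.max-total-memory-per-node=4GB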
@@ -0,0 +1,110 @@

=========================
Node Scheduler Properties
=========================

``node-scheduler.max-splits-per-node``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``100``

The target value for the total number of splits that can be running for
each worker node.

Using a higher value is recommended if queries are submitted in large batches
(e.g., running a large group of reports periodically), or for connectors that
produce many splits that complete quickly. Increasing this value may improve
query latency, by ensuring that the workers have enough splits to keep them
fully utilized.

Setting this too high wastes memory and may result in lower performance
due to splits not being balanced across workers. Ideally, it should be set
such that there is always at least one split waiting to be processed, but
not higher.

``node-scheduler.max-pending-splits-per-task``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Default value:** ``10``

The number of outstanding splits that can be queued for each worker node
for a single stage of a query, even when the node is already at the limit
for the total number of splits. Allowing a minimum number of splits per
stage is required to prevent starvation and deadlocks.

This value must be smaller than ``node-scheduler.max-splits-per-node``,
is usually increased for the same reasons, and has similar drawbacks
if set too high.

``node-scheduler.min-candidates``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``integer``
* **Minimum value:** ``1``
* **Default value:** ``10``

The minimum number of candidate nodes that are evaluated by the
node scheduler when choosing the target node for a split. Setting
this value too low may prevent splits from being properly balanced
across all worker nodes. Setting it too high may increase query
latency and increase CPU usage on the coordinator.

``node-scheduler.policy``
^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Allowed values:** ``uniform``, ``topology``
* **Default value:** ``uniform``

Sets the node scheduler policy to use when scheduling splits. ``uniform``
attempts to schedule splits on the host where the data is located, while
maintaining a uniform distribution across all hosts. ``topology`` tries to
schedule splits according to the topology distance between nodes and splits.
It is recommended to use ``uniform`` for clusters where distributed storage
runs on the same nodes as Presto workers.

``node-scheduler.network-topology.segments``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Default value:** ``machine``

A comma-separated string describing the meaning of each segment of a network
location. For example, setting ``region,rack,machine`` means a network
location contains three segments.

``node-scheduler.network-topology.type``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``
* **Allowed values:** ``flat``, ``file``
* **Default value:** ``flat``

Sets the network topology type. To use this option, ``node-scheduler.policy``
must be set to ``topology``. ``flat`` has only one segment, with one value
for each machine. ``file`` loads the topology from a file, as described
below.

``node-scheduler.network-topology.file``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``string``

Load the network topology from a file. To use this option,
``node-scheduler.network-topology.type`` must be set to ``file``. Each line
contains a mapping between a host name and a network location, separated by
whitespace. A network location must begin with a leading ``/``, and its
segments are separated by ``/``:

.. code-block:: none

    192.168.0.1 /region1/rack1/machine1
    192.168.0.2 /region1/rack1/machine2
    hdfs01.example.com /region2/rack2/machine3

``node-scheduler.network-topology.refresh-period``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* **Type:** ``duration``
* **Minimum value:** ``1ms``
* **Default value:** ``5m``

Controls how often the network topology file is reloaded. To use this option,
``node-scheduler.network-topology.type`` must be set to ``file``.
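
Putting the topology options together, a file-based setup might look like
the following sketch in ``etc/config.properties`` (the topology file path
is illustrative):

.. code-block:: none

    node-scheduler.policy=topology
    node-scheduler.network-topology.type=file
    node-scheduler.network-topology.segments=region,rack,machine
    node-scheduler.network-topology.file=etc/network-topology.txt
    node-scheduler.network-topology.refresh-period=5m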
