layout | title | nickname | group | priority |
---|---|---|---|---|
global |
Deploy Alluxio on a Cluster |
Cluster |
Deploying Alluxio |
2 |
- Table of Contents {:toc}
The simplest way to deploy Alluxio on a cluster is to use one master. However, this single master becomes the single point of failure (SPOF) in an Alluxio cluster: if that machine or process became unavailable, the cluster as a whole would be unavailable. We highly recommend running Alluxio masters in High Availability mode in production.
- A single master node, and 1 or more worker nodes
- SSH login without password to all nodes. You can add a public SSH key for the host into
~/.ssh/authorized_keys
. See this tutorial for more details. - A shared storage system to mount to Alluxio (accessible by all Alluxio nodes). For example, HDFS or Amazon S3.
The following sections describe how to install and configure Alluxio with a single master in a cluster.
To deploy Alluxio in a cluster, first download the Alluxio tar file, and copy it to every node (master node, worker nodes). Extract the tarball to the same path on every node.
On the master node of the installation, create the conf/alluxio-site.properties
configuration file from the template.
cp conf/alluxio-site.properties.template conf/alluxio-site.properties
This configuration file (conf/alluxio-site.properties
) is read by all Alluxio masters and Alluxio
workers. The configuration parameters which must be set are:
alluxio.master.hostname=<MASTER_HOSTNAME>
- This is set to the hostname of the master node.
- Examples:
alluxio.master.hostname=1.2.3.4
,alluxio.master.hostname=node1.a.com
alluxio.master.mount.table.root.ufs=<STORAGE_URI>
- This is set to the URI of the shared storage system to mount to the Alluxio root. This shared shared storage system must be accessible by the master node and all worker nodes.
- Examples:
alluxio.master.mount.table.root.ufs=hdfs://1.2.3.4:9000/alluxio/root/
,alluxio.master.mount.table.root.ufs=s3a://bucket/dir/
This is the minimal configuration to start Alluxio, but additional configuration may be added.
Since this same configuration file will be the same for the master and all the workers, this configuration
file must be copied to all the other Alluxio nodes. The simplest way to do this is
to use the copyDir
shell command on the master node. In order to do so, add the IP addresses or
hostnames of all the worker nodes to the conf/workers
file. Then run:
./bin/alluxio copyDir conf/
This will copy the conf/
directory to all the workers specified in the conf/workers
file. Once
this command suceeds, all the Alluxio nodes will be correctly configured.
Before Alluxio can be started for the first time, the journal must be formatted.
Formatting the journal will delete all metadata from Alluxio. However, the data in under storage will be untouched.
On the master node, format Alluxio with the following command:
./bin/alluxio format
To start the Alluxio cluster, on the master node, make sure the conf/workers
file is correct
with all the hostnames of the workers.
On the master node, start the Alluxio cluster with the following command:
./bin/alluxio-start.sh all SudoMount
This will start the master on the node you are running it on, and start all the workers on all the
nodes specified in the conf/workers
file.
To verify that Alluxio is running, visit http://<alluxio_master_hostname>:19999
to see the status
page of the Alluxio master.
Alluxio comes with a simple program that writes and reads sample files in Alluxio. Run the sample program with:
./bin/alluxio runTests
High availability in Alluxio is based upon a multi-master approach, where multiple master processes are running in the system. One of these processes is elected the leader and is used by all workers and clients as the primary point of contact. The other masters act as standby, and use a shared journal to ensure that they maintain the same file system metadata as the leader. The standby masters do not serve any client or worker requests.
If the leader master fails, a standby master will automatically be chosen to take over and become the new leader. Once the new leader starts serving, Alluxio clients and workers proceed as usual. Note that during the failover to a standby master, clients may experience brief delays or transient errors.
Alluxio can either use an embedded journal or UFS-based journal for maintaining state across restarts. The embedded journal comes with its own leader election, while UFS journaling relies on Zookeeper for leader election. See [journal management documentation]({{ '/en/operation/Journal.html' | relativize_url }}) for more information about choosing and configuring Alluxio journal system.
The common prerequisites of setting up HA cluster are:
- Multiple master nodes, and 1 or more worker nodes
- SSH login without password to all nodes. You can add a public SSH key for the host into
~/.ssh/authorized_keys
. See this tutorial for more details. - A shared storage system to mount to Alluxio (accessible by all Alluxio nodes). For example, HDFS or Amazon S3.
To deploy Alluxio in a cluster, first download the Alluxio tar file, and copy it to every node (master nodes, worker nodes). Extract the tarball to the same path on every node.
Each Alluxio node (masters and workers) must be configured for HA. On each Alluxio node,
create the conf/alluxio-site.properties
configuration file from the template.
cp conf/alluxio-site.properties.template conf/alluxio-site.properties
Add the following properties to the conf/alluxio-site.properties
file:
alluxio.master.hostname=<MASTER_HOSTNAME>
- This must be set to the correct externally visible hostname for each master node.
(Workers ignore this parameter when
alluxio.zookeeper.enabled
is enabled) - Examples:
alluxio.master.hostname=1.2.3.4
,alluxio.master.hostname=node1.a.com
- This must be set to the correct externally visible hostname for each master node.
(Workers ignore this parameter when
alluxio.master.mount.table.root.ufs=<STORAGE_URI>
- This is set to the URI of the shared storage system to mount to the Alluxio root. This shared shared storage system must be accessible by all master nodes and all worker nodes.
- Examples:
alluxio.master.mount.table.root.ufs=hdfs://1.2.3.4:9000/alluxio/root/
,alluxio.master.mount.table.root.ufs=s3a://bucket/dir/
The minimal configuration to set up a HA cluster is to give the embedded journal addresses to all nodes inside the cluster.
Alluxio's internal leader election will determine the leader master. The default embedded journal port is 19200
.
alluxio.master.embedded.journal.addresses=master_hostname_1:19200,master_hostname_2:19200,master_hostname_3:19200
Note that embedded journal feature relies on Copycat which has built-in leader election. The built-in leader election cannot work with Zookeeper since we cannot have two leaders which might not match. Enabling embedded journal enables Alluxio's internal leader election. See [embedded journal configuration documentation]({{ '/en/operation/Journal.html' | relativize_url }}#Embedded-Journal-Configuration) for alternative ways to set up HA cluster with internal leader election.
The additional prerequisites of setting up Zookeeper HA cluster are:
- A ZooKeeper cluster. Alluxio masters use ZooKeeper for leader election, and Alluxio clients and workers use ZooKeeper to inquire about the identity of the current leader master.
- A shared storage system on which to place the journal (accessible by all Alluxio nodes). The leader
master writes to the journal on this shared storage system, while the standby masters continually
replay the journal entries to stay up-to-date.
The journal storage system is recommended to be:
- Highly available. All metadata modifications on the master requires writing to the journal, so any downtime of the journal storage system will directly impact the Alluxio master availability.
- Filesystem, not an object store. The Alluxio master writes to journal files to this storage system, and utilizes filesystem operations such as rename and flush. Object stores do not support these operations, and/or perform them slowly, so when the journal is stored on an object store, the Alluxio master operation throughput is significantly reduced.
The configuration parameters which must be set are:
alluxio.zookeeper.enabled=true
- This enables the HA mode for the masters, and informs workers that HA mode is enabled.
alluxio.zookeeper.address=<ZOOKEEPER_ADDRESS>
- The ZooKeeper address must be specified when
alluxio.zookeeper.enabled
is enabled. - The HA masters use ZooKeeper for leader election, and the workers use ZooKeeper to discover the leader master.
- Multiple ZooKeeper addresses can be specified by delimiting with commas
- Examples:
alluxio.zookeeper.address=1.2.3.4:2181
,alluxio.zookeeper.address=zk1:2181,zk2:2181,zk3:2181
- The ZooKeeper address must be specified when
alluxio.master.journal.type=UFS
- This uses UFS as the journal place. Note that Zookeeper cannot work
with journal type
EMBEDDED
(use a journal embedded in the masters).
- This uses UFS as the journal place. Note that Zookeeper cannot work
with journal type
alluxio.master.journal.folder=<JOURNAL_URI>
- This is set to the URI of the shared journal location for the Alluxio leader master to write the journal to, and for standby masters to replay journal entries from. This shared shared storage system must be accessible by all master nodes.
- Examples:
alluxio.master.journal.folder=hdfs://1.2.3.4:9000/alluxio/journal/
For clusters with large namespaces, increased CPU overhead on leader could cause delays on Zookeeper client heartbeats. For this reason, we recommend setting Zookeeper client session timeout to at least 2 minutes on large clusters with namespace size more than several hundred millions of files.
alluxio.zookeeper.session.timeout=120s
- Zookeeper server's tick time must also be configured as such to allow this timeout. The current implementation requires that the timeout be a minimum of 2 times the tickTime (as set in the server configuration) and a maximum of 20 times the tickTime.
Make sure all master nodes and all worker nodes have configured their respective
conf/alluxio-site.properties
configuration file appropriately.
Once all the Alluxio masters and workers are configured in this way, Alluxio is ready to be formatted started.
Before Alluxio can be started for the first time, the journal must be formatted.
Formatting the journal will delete all metadata from Alluxio. However, the data in under storage will be untouched.
On any master node, format Alluxio with the following command:
./bin/alluxio format
To start the Alluxio cluster with the provided scripts, on any master node, list all the
worker hostnames in the conf/workers
file, and list all the masters in the conf/masters
file. This
will allow the start scripts to start the appropriate processes on the appropriate nodes.
On the master node, start the Alluxio cluster with the following command:
./bin/alluxio-start.sh all SudoMount
This will start Alluxio masters on all the nodes specified in conf/masters
, and start the workers on all the
nodes specified in conf/workers
.
To verify that Alluxio is running, you can visit the web UI of the leader master. To determine the leader master, run:
./bin/alluxio fs leader
Then, visit http://<LEADER_HOSTNAME>:19999
to see the status page of the Alluxio leader master.
Alluxio comes with a simple program that writes and reads sample files in Alluxio. Run the sample program with:
./bin/alluxio runTests
When an Alluxio client interacts with Alluxio in HA mode, the client must know about the Alluxio HA cluster, so that the client knows how to discover the Alluxio leader master. There are 2 ways to configure the client for Alluxio HA.
When connecting to Alluxio HA cluster using internal leader election,
Alluxio clients need the alluxio.master.rpc.addresses
information to decide the node addresses
to query to find out the Alluxio leader master which is serving rpc requests.
alluxio.master.rpc.addresses=master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998
When connecting to Alluxio HA cluster using Zookeeper-based leader election, the following properties are needed to connect to Zookeeper to get the leader master information.
alluxio.zookeeper.enabled=true
alluxio.zookeeper.address=<ZOOKEEPER_ADDRESS>
- The ZooKeeper address must be specified when
alluxio.zookeeper.enabled
is enabled. - Multiple ZooKeeper addresses can be specified by delimiting with commas
- The ZooKeeper address must be specified when
When the application is configured with these parameters, the Alluxio URI can be simplified to
alluxio:///path
, since the HA cluster information is already configured.
For example, if using Hadoop, you can configure the properties in core-site.xml
, and then use the
Hadoop CLI with an Alluxio URI.
hadoop fs -ls alluxio:///directory
Users can fully specify the HA cluster information in the URI to connect to an Alluxio HA cluster. Configuration derived from the HA authority takes precedence over all other forms of configuration, e.g. site properties or environment variables.
When using internal leader election, use alluxio://master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998
When using Zookeeper leader election, use alluxio://zk@<ZOOKEEPER_ADDRESS>
.
For many applications (e.g., Hadoop, HBase, Hive and Flink), you can use a comma as the
delimiter for multiple addresses in the URI, like alluxio://master_hostname_1:19998,master_hostname_2:19998,master_hostname_3:19998
and alluxio://zk@zkHost1:2181,zkHost2:2181,zkHost3:2181/path
.
For some other applications (e.g., Spark), you need to use semicolons as the delimiter for multiple
addresses, like alluxio://master_hostname_1:19998;master_hostname_2:19998;master_hostname_3:19998
and alluxio://zk@zkHost1:2181;zkHost2:2181;zkHost3:2181/path
.
Below are common operations to perform on an Alluxio cluster.
In order to update the server configuration for a node, you have to update the conf/alluxio-site.properties
file on that node, stop the server, and then start the server. The configuration file is read on startup.
To stop an Alluxio cluster, the simplest way is to use the provided tools. To use the tools, list all the
worker hostnames in the conf/workers
file, and list all the masters in the conf/masters
file.
Then, you can stop Alluxio with:
./bin/alluxio-stop.sh all
This will stop all the processes on all nodes listed in conf/workers
and conf/masters
.
You can stop just the masters and just the workers with the following commands:
./bin/alluxio-stop.sh masters # stops all masters in conf/masters
./bin/alluxio-stop.sh workers # stops all workers in conf/workers
If you do not want to use ssh
to login to all the nodes and stop all the processes, you can run
commands on each node individually to stop each component. For any node, you can stop a master or worker with:
./bin/alluxio-stop.sh master # stops the local master
./bin/alluxio-stop.sh worker # stops the local worker
Starting Alluxio is similar. If conf/workers
and conf/masters
are both populated, you can start
the cluster with:
./bin/alluxio-start.sh all SudoMount
You can start just the masters and just the workers with the following commands:
./bin/alluxio-start.sh masters # starts all masters in conf/masters
./bin/alluxio-start.sh workers SudoMount # starts all workers in conf/workers
If you do not want to use ssh
to login to all the nodes and start all the processes, you can run
commands on each node individually to start each component. For any node, you can start a master or worker with:
./bin/alluxio-start.sh master # starts the local master
./bin/alluxio-start.sh worker SudoMount # starts the local worker
On any master node, format the Alluxio journal with the following command:
./bin/alluxio format
Formatting the journal will delete all metadata from Alluxio. However, the data in under storage will be untouched.
Adding a worker to an Alluxio cluster is as simple as starting a new Alluxio worker process, with the appropriate configuration. In most cases, the new worker's configuration should be the same as all the other workers' configuration. Once the worker is started, it will register itself with the Alluxio master, and become part of the Alluxio cluster.
Removing a worker is as simple as stopping the worker process. Once the worker is stopped, and after
a timeout on the master (configured by master parameter alluxio.master.worker.timeout
), the master
will consider the worker as "lost", and no longer consider it as part of the cluster.
In order to add a master, the Alluxio cluster must be in HA mode. If you are running the cluster as a single master cluster, you must configure it to be an HA cluster to be able to have more than 1 master.
When internal leader election is used, Alluxio masters are determined. Adding or removing master nodes requires:
- Shut down the whole cluster
- Add/remove one or more Alluxio master
- Format the whole cluster
- Update the cluster-wide embedded journal configuration
- Start the whole cluster
Note that all previously stored data and metadata in Alluxio filesystem will be erased.
To add a master to an HA Alluxio cluster, you can simply start a new Alluxio master process, with
the appropriate configuration. The configuration for the new master should be the same as other masters,
except that the parameter alluxio.master.hostname=<MASTER_HOSTNAME>
should reflect the new hostname.
Once the new master is started, it will start interacting with ZooKeeper to participate in leader election.
Removing a master is as simple as stopping the master process. If the cluster is a single master cluster, stopping the master will essentially shutdown the cluster, since the single master is down. If the Alluxio cluster is an HA cluster, stopping the leader master will force ZooKeeper to elect a new leader master and failover to that new leader. If a standby master is stopped, then the operation of the cluster is unaffected. Keep in mind, Alluxio masters high availability depends on the availability on standby masters. If there are not enough standby masters, the availability of the leader master will be affected. It is recommended to have at least 3 masters for an HA Alluxio cluster.