This repository has been archived by the owner on Mar 30, 2021. It is now read-only.

Setting up multiple Sparkline ThriftServers (Load Balancing & HA)

jpullokkaran edited this page Sep 15, 2016 · 4 revisions

To handle many concurrent client connections, it is often necessary to set up more than one Sparkline Thrift Server. Client applications can be made agnostic to the number of servers by using the Dynamic Service Discovery feature of the Sparkline Thrift Server. This feature is similar to HiveServer2 Dynamic Service Discovery.

Follow the steps below to enable Dynamic Service Discovery, load balancing (LB), and high availability (HA):

  • Install Spark and the Sparkline Accelerator jar on all the machines.
  • Update hive-site.xml
hive.server2.support.dynamic.service.discovery=true
hive.server2.zookeeper.namespace=hiveserver2 (change this to reflect the namespace that you want to use)
hive.zookeeper.quorum=localhost:2181 (ZooKeeper host:port; use a comma-separated list if using a ZooKeeper ensemble)
  • Bring up the Sparkline Thrift Server
  • Change the client connection URL to point to ZooKeeper
jdbc:hive2://<zookeeper_ensemble>/<db>;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hiveserver2_namespace>
Example: jdbc:hive2://localhost:2181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
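
The steps above can be sketched as a small helper that keeps the server-side settings and the client URL in sync. This is only an illustration: the function names (`hive_site_properties`, `to_hive_site_xml`, `discovery_jdbc_url`) are hypothetical and not part of Sparkline or Hive, and the XML layout assumes the usual Hadoop-style `<configuration>`/`<property>` format for hive-site.xml.

```python
from xml.etree import ElementTree as ET


def hive_site_properties(namespace, quorum):
    """Return the three discovery settings listed in the steps as a dict."""
    return {
        "hive.server2.support.dynamic.service.discovery": "true",
        "hive.server2.zookeeper.namespace": namespace,
        # quorum is a list of "host:port" strings, joined with commas
        "hive.zookeeper.quorum": ",".join(quorum),
    }


def to_hive_site_xml(props):
    """Render the properties as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def discovery_jdbc_url(quorum, database, namespace):
    """Build the client URL that points at ZooKeeper instead of one server."""
    return (
        f"jdbc:hive2://{','.join(quorum)}/{database}"
        f";serviceDiscoveryMode=zooKeeper;zooKeeperNamespace={namespace}"
    )


if __name__ == "__main__":
    quorum = ["localhost:2181"]
    print(to_hive_site_xml(hive_site_properties("hiveserver2", quorum)))
    print(discovery_jdbc_url(quorum, "default", "hiveserver2"))
```

Passing the same `quorum` and `namespace` values to both the server-side and client-side helpers avoids the most common mismatch, where the client looks in a different ZooKeeper namespace than the one the servers registered under.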

Each Thrift Server registers itself with ZooKeeper when it comes up and removes itself when it goes down. The client specifies the ZooKeeper address and the namespace (the value of hive.server2.zookeeper.namespace) and is then directed to one of the available Thrift Servers. ZooKeeper load balances client connections in round-robin fashion. Note that there is no transparent failover for active sessions: if a Thrift Server goes down, clients connected to it must open a new connection.
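
The behaviour described above can be modelled with a toy, in-memory registry. This is a sketch for intuition only: `ToyRegistry` is a hypothetical stand-in and does not use the ZooKeeper API or the Hive JDBC driver, and the host names and port 10000 are made up for the example.

```python
class ToyRegistry:
    """In-memory stand-in for the ZooKeeper namespace (not the ZooKeeper API)."""

    def __init__(self):
        self._servers = []  # registered "host:port" strings, in join order
        self._next = 0      # counter used to rotate through the servers

    def register(self, address):
        """Called when a Thrift Server comes up."""
        if address not in self._servers:
            self._servers.append(address)

    def deregister(self, address):
        """Called when a Thrift Server goes down."""
        if address in self._servers:
            self._servers.remove(address)

    def is_registered(self, address):
        return address in self._servers

    def pick(self):
        """Return the next registered server, rotating through the list."""
        if not self._servers:
            raise RuntimeError("no Thrift Server is registered")
        address = self._servers[self._next % len(self._servers)]
        self._next += 1
        return address


if __name__ == "__main__":
    registry = ToyRegistry()
    registry.register("host1:10000")  # hypothetical server addresses
    registry.register("host2:10000")

    session = registry.pick()         # new connections rotate across servers
    registry.deregister(session)      # that server goes down

    # The existing session is not moved anywhere; the client has to notice
    # the failure and ask for a server again.
    if not registry.is_registered(session):
        session = registry.pick()
    print(session)
```

The last few lines show the consequence of having no transparent failover: the session that was bound to the stopped server stays broken until the client itself reconnects through the discovery URL.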
