Communicating with the data warehouse and creating custom Hive views
Below is a brief overview of how HSLynk creates custom Hive/Impala views based on the HMIS, CES, and general human services data for each customer.
The primary technology behind the data warehouse is Hadoop. We currently use a Cloudera Hadoop cluster with LDAP authentication and Sentry authorization. The data is stored in HBase (on HDFS), and we run real-time analytics on the loaded data by creating external tables over it in Hive/Impala.
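As a rough illustration of what an external table over HBase data can look like, the sketch below builds a HiveQL statement that uses Hive's standard HBase storage handler. The function name, the table names (`example_db.client`, `client_hbase`), the column family `cf`, and the columns are all hypothetical and are not taken from the HSLynk schema.

```python
def hbase_external_table_ddl(hive_table, hbase_table, key_column, columns,
                             column_family="cf"):
    """Build HiveQL for an external Hive table backed by an HBase table.

    columns is a list of (name, hive_type) pairs. The HBase row key is
    exposed as key_column. All names passed in here are illustrative.
    """
    # Hive column definitions: the row key first, then the mapped columns.
    col_defs = [f"{key_column} STRING"]
    col_defs += [f"{name} {hive_type}" for name, hive_type in columns]

    # ":key" maps the HBase row key; the rest map to family:qualifier,
    # in the same order as the Hive columns.
    mapping = ",".join(
        [":key"] + [f"{column_family}:{name}" for name, _ in columns]
    )

    lines = [
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {hive_table} (",
        "  " + ",\n  ".join(col_defs),
        ")",
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'",
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{mapping}')",
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}')",
    ]
    return "\n".join(lines)


# Hypothetical example: a client table with two mapped columns.
ddl = hbase_external_table_ddl(
    "example_db.client",
    "client_hbase",
    "id",
    [("first_name", "STRING"), ("dob", "STRING")],
)
print(ddl)
```

The generated statement would be run in Hive (for example through Beeline); the actual table definitions used by HSLynk may differ.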
The code specific to populating our custom HSLynk and CES views lives in the following project. Two of the most frequently used views, VI-SPDAT and CES Active List, are there: https://github.com/servinglynk/hslynk-open-source/tree/master/sync-general
Although we use Impala to populate the data in HBase, we usually create the views in Hive. This works because Impala and Hive share the same metadata.
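A minimal sketch of that workflow, assuming a view is created in Hive and Impala is then asked to reload its metadata with `INVALIDATE METADATA` so that it sees the new object. The helper name, the view name, and the query are hypothetical and do not reproduce the real VI-SPDAT or CES Active List definitions.

```python
def view_statements(view_name, select_sql):
    """Return (hive_statement, impala_statement) for a new view.

    hive_statement creates the view in Hive. impala_statement asks
    Impala to reload metadata for that object from the shared metastore.
    """
    hive_statement = f"CREATE VIEW IF NOT EXISTS {view_name} AS\n{select_sql}"
    impala_statement = f"INVALIDATE METADATA {view_name}"
    return hive_statement, impala_statement


# Hypothetical view over a hypothetical external table.
hive_sql, impala_sql = view_statements(
    "example_db.active_clients_view",
    "SELECT id, first_name FROM example_db.client WHERE dob IS NOT NULL",
)
print(hive_sql)
print(impala_sql)
```

The first statement would be run in Hive and the second in Impala (for example through impala-shell). Whether the metadata reload is needed depends on how the cluster is configured.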