Hybrid Hadoop

elopezsa edited this page Sep 21, 2016 · 2 revisions

On existing Hadoop installations, an alternative approach is to use additional virtual machines that interact with Hadoop components (Spark, HDFS) through a gateway node. This approach is recommended for customers whose Hadoop environment hosts heterogeneous use cases and who want minimal deviation from existing node roles. The disadvantage is that the virtual machines must be sized properly for their workloads.


In addition to the services deployed on the existing cluster, additional virtual machines (VMs) are required to host the non-Hadoop functions of the solution. Some of these VMs require the gateway service so that they can interact with Spark, Hive, and HDFS.

Note: While the layout above is recommended for production, pilot deployments may combine these roles into fewer VMs. Each component of the Open Network Insight solution interacts closely with Hadoop, but with this approach its non-Hadoop processing and memory requirements can be separated out.
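
As a rough planning aid, the trade-off between one VM per role (production) and combined roles (pilot) can be sketched in Python. This is only an illustration: the role names `role_a` and `role_b`, the core and memory figures, and the `combine` helper are all hypothetical placeholders, not values or tools from Open Network Insight. Substitute figures measured from your own workloads.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Role:
    """A non-Hadoop function hosted on a VM (all values are placeholders)."""
    name: str
    cpu_cores: int
    memory_gb: int
    needs_gateway: bool  # True if the role must reach Spark, Hive, or HDFS


def combine(roles: List[Role]) -> Dict[str, object]:
    """Aggregate the requirements of every role placed on a single VM."""
    return {
        "roles": [r.name for r in roles],
        "cpu_cores": sum(r.cpu_cores for r in roles),
        "memory_gb": sum(r.memory_gb for r in roles),
        # The VM needs the gateway service if any hosted role needs it.
        "needs_gateway": any(r.needs_gateway for r in roles),
    }


# Hypothetical roles and sizes, for illustration only.
ROLES = [
    Role("role_a", cpu_cores=4, memory_gb=16, needs_gateway=True),
    Role("role_b", cpu_cores=2, memory_gb=8, needs_gateway=False),
]

# Production-style layout: one VM per role.
production = [combine([role]) for role in ROLES]

# Pilot-style layout: all roles on one VM.
pilot = [combine(ROLES)]

if __name__ == "__main__":
    for label, layout in (("production", production), ("pilot", pilot)):
        for vm in layout:
            print(label, vm)
```

Summing requirements is a deliberately simple model. It ignores contention between roles on a shared VM, so treat the pilot figure as a lower bound rather than a sizing recommendation.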
