HPCC Systems (High Performance Computing Cluster) is an open source, massively parallel-processing computing platform for big data processing and analytics.
Latest commit 58f57e6 Dec 2, 2016 @richardkchapman richardkchapman committed on GitHub Merge pull request #9383 from jakesmith/hpcc-16703
HPCC-16703 Ensure default keepStores match configmgr default

Reviewed-By: Mark Kelly <mark.kelly@lexisnexis.com>
Reviewed-By: Richard Chapman <rchapman@hpccsystems.com>

README.md

Description / Rationale

HPCC Systems offers an enterprise-ready, open source supercomputing platform to solve big data problems. Compared to Hadoop, the platform analyzes big data using less code and fewer nodes, and offers a single programming language, a single platform and a single architecture for efficient processing. HPCC Systems is a technology division of LexisNexis Risk Solutions.

Getting Started

Architecture

The HPCC Systems architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces which provide both end-user services and system management tools, and auxiliary components to support monitoring and to facilitate loading and storing of filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters. Each of these cluster types is described in more detail in the sections that follow.

Thor

Thor (the Data Refinery Cluster) is responsible for consuming vast amounts of data, transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. A cluster can scale from a single node to thousands of nodes.

  • Single-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for Extraction, Transformation, Loading, Sorting, Indexing and Linking
  • Scales from one node to thousands of nodes
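
As a rough illustration of the distributed parallel processing idea behind Thor, the sketch below splits a dataset across simulated nodes, sorts each slice independently, and merges the results. It is plain Python, not ECL, and it does not reflect how Thor is actually implemented; the `partition` and `distributed_sort` helpers and the sample data are hypothetical.

```python
from heapq import merge


def partition(records, num_nodes):
    """Deal records round-robin into one slice per simulated node."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def distributed_sort(records, num_nodes, key):
    """Sort each slice locally, then merge the sorted slices."""
    parts = partition(records, num_nodes)
    # On a real cluster each node would sort its own slice in parallel;
    # here the slices are simply sorted one after another.
    sorted_parts = [sorted(part, key=key) for part in parts]
    # Merge the per-node results into one globally ordered list.
    return list(merge(*sorted_parts, key=key))


# Hypothetical sample data: (surname, age) pairs.
people = [("Smith", 34), ("Jones", 51), ("Adams", 27), ("Brown", 45), ("Clark", 39)]
result = distributed_sort(people, num_nodes=3, key=lambda r: r[0])
print(result)
```

Adding nodes shrinks the slice each one has to handle, which is the intuition behind scaling a cluster from a single node to many.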

Roxie

Roxie (the Query Cluster) provides separate high-performance online query processing and data warehouse capabilities. Roxie (Rapid Online XML Inquiry Engine) is the data delivery engine used in HPCC to serve data quickly and can support many thousands of requests per node per second.

  • Multi-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for concurrent query processing
  • Scales from one node to thousands of nodes
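
The multi-threaded, query-oriented role of Roxie can be pictured with a small sketch: many requests are answered concurrently against an index that was built ahead of time. This is plain Python under stated assumptions, not Roxie's implementation; the `INDEX` contents and the `lookup` and `serve` functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pre-built index mapping a key to its matching rows.
# In this picture, the index would be produced by an earlier batch job.
INDEX = {
    "Smith": [{"name": "Smith", "age": 34}],
    "Jones": [{"name": "Jones", "age": 51}],
}


def lookup(name):
    """Answer a single query with a keyed read; unknown keys yield no rows."""
    return INDEX.get(name, [])


def serve(requests, workers=4):
    """Answer many queries concurrently, keeping answers in request order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lookup, requests))


answers = serve(["Smith", "Nobody", "Jones"])
print(answers)
```

Because each query is a read against data prepared in advance, requests do not block one another, which is what makes this style of workload suited to many threads.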

ECL

ECL (Enterprise Control Language) is a powerful programming language ideally suited to the manipulation of Big Data.

  • Transparent and implicitly parallel programming language
  • Non-procedural and dataflow oriented
  • Modular, reusable, extensible syntax
  • Combines data representation and algorithm implementation
  • Easily extended using C++ libraries
  • ECL is compiled into optimized C++
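
To give a feel for what "non-procedural and dataflow oriented" means, the sketch below emulates the style in plain Python: each step is a definition that describes a result, and nothing is computed until an output is requested. This is not ECL syntax and not how the ECL compiler works; the `dataset`, `filter_rows`, `project` and `output` helpers are hypothetical.

```python
def dataset(rows):
    """Define a source; calling the definition yields its rows."""
    return lambda: iter(rows)


def filter_rows(source, predicate):
    """Define a filtered view of another definition."""
    return lambda: (row for row in source() if predicate(row))


def project(source, transform):
    """Define a reshaped view of another definition."""
    return lambda: (transform(row) for row in source())


def output(source):
    """Requesting output is the only step that pulls data through the graph."""
    return list(source())


# Definitions only: no rows are read while these lines run.
people = dataset([
    {"name": "Smith", "age": 34},
    {"name": "Jones", "age": 51},
    {"name": "Adams", "age": 27},
])
over_30 = filter_rows(people, lambda r: r["age"] > 30)
names = project(over_30, lambda r: r["name"].upper())

result = output(names)
print(result)
```

Since the program only states how results relate to one another, the system executing it is free to decide the order and degree of parallelism.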

ECL IDE

ECL IDE is a modern IDE used to code, debug and monitor ECL programs.

  • Access to shared source code repositories
  • Complete development, debugging and testing environment for developing ECL dataflow programs
  • Access to the ECLWatch tool is built-in, allowing developers to watch job graphs as they are executing
  • Access to current and historical job workunits

ESP

ESP (Enterprise Services Platform) provides an easy-to-use interface for accessing ECL queries over XML, HTTP, SOAP and REST.

  • Standards-based interface to access ECL functions