IMPORTANT - Issue tracking has been migrated to JIRA - use your github ID and password reset mechanism to log in

Merge branch 'candidate-5.4.0'

Signed-off-by: Richard Chapman <rchapman@hpccsystems.com>
build_utils HPCC-9713 Bad installation message when installing on Ubuntu 13.04
charm HPCC-11289 Add README file for HPCC Juju Charm Development
clienttools HPCC-10526 add testing/regress to platform and clienttools install
cmake_modules Merge branch 'candidate-5.4.0'
common Merge branch 'candidate-5.4.0'
dali Merge branch 'candidate-5.4.0'
deploy gh-2562 Change license to Apache
deployment HPCC-13979 Configgen cores if env contains elements without @name or …
docs Merge branch 'candidate-5.4.0'
ecl Merge branch 'candidate-5.4.0'
ecllibrary HPCC-13254 Fix duplicate id errors in unicode test module
esp Merge branch 'candidate-5.4.0'
githooks Merge remote-tracking branch 'origin/candidate-3.10.x'
initfiles HPCC-13942 ConfigMgr wizards fails with "Could not locate filename"
lib2 HPCC-13273 Add and re-order header files to support Visual Studio 12 …
misc HPCC-9508 Add eclipse code layout settings file to project
plugins Merge branch 'candidate-5.4.0'
roxie Merge branch 'candidate-5.4.0'
rtl Merge branch 'candidate-5.4.0'
services HPCC-12622 Deprecate toCharArray()
system Merge branch 'candidate-5.4.0'
testing Merge branch 'candidate-5.4.0'
thorlcr Merge branch 'candidate-5.4.0'
tools Merge branch 'candidate-5.4.0'
.gitattributes Issue #254 Switches template reading to use jlib
.gitignore Minor code cleaup to avoid false positives from Eclipse
.gitmodules HPCC-13635 Update Viz Framework to v1.0.2
.travis.yml HPCC-13601 Travis-CI
BUILD_ME.md HPCC-13515 A proper README.md, moving old README.md -> BUILD_ME.md
CMakeLists.txt Merge branch 'candidate-5.4.0'
CNAME Add CNAME entry for GitHub pages redirection
CONTRIBUTORS HPCC-9508 Add eclipse code layout settings file to project
FUTURE Initial version of FUTURE document
LICENSE.txt HPCC-11269 Add Word Cloud Visualisation
README.md HPCC-13515 A proper README.md, moving old README.md -> BUILD_ME.md
VERSIONS VERSION rules updated
baseaddr.txt HPCC-9494 Clean up unused roxiemanager code
build-config.h.cmake HPCC-9902 Use the build version as the ecl version reported by eclcc
sourcedoc.xml Merge remote-tracking branch 'origin/closedown-4.2.x'
version.cmake Split off candidate-5.2.0 branch

README.md

Description / Rationale

HPCC Systems offers an enterprise-ready, open source supercomputing platform for solving big data problems. Compared to Hadoop, the platform analyzes big data using less code and fewer nodes, and it offers a single programming language, a single platform and a single architecture for efficient processing. HPCC Systems is a technology division of LexisNexis Risk Solutions.

Getting Started

Architecture

The HPCC Systems architecture incorporates the Thor and Roxie clusters as well as common middleware components, an external communications layer, client interfaces that provide both end-user services and system management tools, and auxiliary components that support monitoring and facilitate loading and storing filesystem data from external sources. An HPCC environment can include only Thor clusters, or both Thor and Roxie clusters. Each of these cluster types is described in more detail in the following sections.

Thor

Thor (the Data Refinery Cluster) is responsible for consuming vast amounts of data and for transforming, linking and indexing that data. It functions as a distributed file system with parallel processing power spread across the nodes. A cluster can scale from a single node to thousands of nodes.

  • Single-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for Extraction, Transformation, Loading, Sorting, Indexing and Linking
  • Scales from one node to thousands of nodes
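
As a rough illustration of the "distributed parallel processing" idea above, the following Python sketch partitions records across a few workers, transforms and sorts each partition independently, and then merges the sorted partitions. It is a conceptual analogy only: it is not ECL, it does not use any HPCC API, and the names `partition`, `local_transform` and `run`, the CRC-based placement and the thread pool standing in for cluster nodes are all assumptions made for this example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def partition(records, key, num_nodes):
    """Assign each record to a 'node' by hashing its key (hypothetical placement scheme)."""
    parts = [[] for _ in range(num_nodes)]
    for rec in records:
        parts[crc32(str(rec[key]).encode()) % num_nodes].append(rec)
    return parts


def local_transform(part):
    """Work done independently on one partition: clean the name field, then sort by id."""
    cleaned = [{**rec, "name": rec["name"].strip().title()} for rec in part]
    return sorted(cleaned, key=lambda rec: rec["id"])


def run(records, num_nodes=4):
    """Partition, process every partition in parallel, then merge the sorted results."""
    parts = partition(records, "id", num_nodes)
    # Threads stand in for cluster nodes here; a real cluster spreads this over machines.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        sorted_parts = list(pool.map(local_transform, parts))
    return list(heapq.merge(*sorted_parts, key=lambda rec: rec["id"]))


if __name__ == "__main__":
    data = [
        {"id": 3, "name": " carol "},
        {"id": 1, "name": "alice"},
        {"id": 2, "name": "BOB"},
    ]
    for row in run(data, num_nodes=3):
        print(row)
```

Each partition is processed without reference to the others, which is why adding workers can shorten the transform step; only the final merge needs to see every partition.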

Roxie

Roxie (the Query Cluster) provides separate high-performance online query processing and data warehouse capabilities. Roxie (Rapid Online XML Inquiry Engine) is the data delivery engine used in HPCC to serve data quickly and can support many thousands of requests per node per second.

  • Multi-threaded
  • Distributed parallel processing
  • Distributed file system
  • Powerful parallel processing programming language (ECL)
  • Optimized for concurrent query processing
  • Scales from one node to thousands of nodes

ECL

ECL (Enterprise Control Language) is the powerful programming language that is ideally suited for the manipulation of Big Data.

  • Transparent and implicitly parallel programming language
  • Non-procedural and dataflow oriented
  • Modular, reusable, extensible syntax
  • Combines data representation and algorithm implementation
  • Easily extended using C++ libraries
  • ECL is compiled into optimized C++
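
To give a feel for what "non-procedural and dataflow oriented" can mean in practice, here is a small Python sketch in which values are declared by how they are derived and are only computed when an output is requested. This is an analogy written for this README, not ECL syntax and not a description of how the ECL compiler works; the `Definition` class, the `output` helper and the sample data are all hypothetical.

```python
class Definition:
    """A named value described by how it is derived, not by when it is computed."""

    def __init__(self, name, fn, *inputs):
        self.name = name
        self.fn = fn
        self.inputs = inputs
        self._value = None
        self._done = False

    def evaluate(self, log):
        """Compute the inputs first, then this value, at most once; record the order."""
        if not self._done:
            args = [dep.evaluate(log) for dep in self.inputs]
            self._value = self.fn(*args)
            self._done = True
            log.append(self.name)
        return self._value


def output(definition):
    """Request a result; only definitions it depends on are evaluated."""
    log = []
    return definition.evaluate(log), log


# Declarations: nothing below runs until output() is called.
people = Definition("people", lambda: [("Alice", 34), ("Bob", 27), ("Carol", 45)])
unused = Definition("unused", lambda p: [name.upper() for name, _ in p], people)
over30 = Definition("over30", lambda p: [r for r in p if r[1] >= 30], people)
by_name = Definition("by_name", lambda p: sorted(p), over30)

result, evaluated = output(by_name)
print(result)     # [('Alice', 34), ('Carol', 45)]
print(evaluated)  # ['people', 'over30', 'by_name']
```

The `unused` definition is never computed because no requested output depends on it, which is the property the sketch is meant to highlight: the program states relationships between data, and the work performed follows from what is actually asked for.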

ECL IDE

ECL IDE is a modern IDE used to code, debug and monitor ECL programs.

  • Access to shared source code repositories
  • Complete development, debugging and testing environment for developing ECL dataflow programs
  • Access to the ECLWatch tool is built-in, allowing developers to watch job graphs as they are executing
  • Access to current and historical job workunits

ESP

ESP (Enterprise Services Platform) provides an easy-to-use interface for accessing ECL queries using XML, HTTP, SOAP and REST.

  • Standards-based interface to access ECL functions