Real-time Query for Hadoop
C++ Java Python Thrift Shell CMake Other

IMPALA-1916: Replace Status::OK by Status::OK()

This avoids an unnecessary copy-constructor call for OK Status objects
and a load of the value from memory, both of which were needed because
the old Status::OK was a global. The impact of this patch was validated
by inspecting both the optimized assembly code and the generated IR code.
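
As a rough illustration of the idea, the sketch below uses a simplified
stand-in for a status type. It is not Impala's actual Status class: the
member layout, the `Error` factory, and the `CheckPositive` caller are
all assumptions made for this example. It only shows the general shape
of returning a freshly constructed success value from an inline
function rather than copying a global object.

```cpp
#include <cassert>

// Simplified stand-in for a status type (assumption: NOT Impala's real class).
class Status {
 public:
  // Success value built on the spot and returned by value, so callers do
  // not have to read and copy a global object.
  static Status OK() { return Status(nullptr); }

  // Hypothetical error factory; the message must outlive the Status.
  static Status Error(const char* msg) {
    assert(msg != nullptr);
    return Status(msg);
  }

  bool ok() const { return msg_ == nullptr; }
  const char* msg() const { return msg_; }

 private:
  explicit Status(const char* msg) : msg_(msg) {}
  const char* msg_;  // nullptr means success
};

// Hypothetical caller showing the new call syntax, Status::OK().
Status CheckPositive(int value) {
  if (value <= 0) return Status::Error("value must be positive");
  return Status::OK();
}
```

With a type this small, a compiler will typically inline `OK()` and
construct the result directly in the caller, though the exact code
generated depends on the compiler and optimization level.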

This patch also changes the amount of generated code. The new tool
`get_code_size` lists the text, data, and bss sizes for all archives
that we produce in a release build. This patch reduces the text size
by ~20 kB.

      Text      Data    BSS
Old   10578622  576864  40825
New   10559367  576864  40809

The majority of the changes in this patch have been mechanically applied
using:

   find be/src -name "*.cc" -or -name "*.h" | xargs sed -i \
     's/Status::OK;/Status::OK\(\);/'

A new micro-benchmark was added to determine the overhead of using
Status in hot code sections.

Machine Info: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
status:               Function     Rate (iters/ms)          Comparison
----------------------------------------------------------------------
             Call Status::OK()           9.555e+08                  1X
     Call static Status::Error           4.515e+07            0.04725X
   Call Status(Code, 'string')           9.873e+06            0.01033X
            Call w/ Assignment           5.422e+08             0.5674X
           Call Cond Branch OK           5.941e+06           0.006218X
        Call Cond Branch ERROR           7.047e+06           0.007375X
 Call Cond Branch Bool (false)           1.914e+10              20.03X
  Call Cond Branch Bool (true)           1.491e+11                156X
Call Cond Boost Optional (true)          3.935e+09              4.118X
Call Cond Boost Optional (false)         2.147e+10              22.47X
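
The Comparison column above is consistent with each row's rate divided
by the rate of the first row (Call Status::OK()). The sketch below
reproduces that arithmetic. The `Row` struct and both function names
are hypothetical and are not taken from Impala's benchmark utilities.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// One benchmark result line (hypothetical layout for this example).
struct Row {
  std::string name;
  double rate;  // iterations per millisecond, as reported in the table
};

// Relative speed of a row versus the baseline row.
double Comparison(double rate, double baseline) {
  assert(baseline > 0.0);
  return rate / baseline;
}

// Prints each row with its ratio to the first row, which acts as the baseline.
void PrintComparisons(const std::vector<Row>& rows) {
  if (rows.empty()) return;
  const double baseline = rows.front().rate;
  for (const Row& row : rows) {
    std::printf("%32s %12.3e %10.4gX\n", row.name.c_str(), row.rate,
                Comparison(row.rate, baseline));
  }
}
```

For instance, `Comparison(4.515e+07, 9.555e+08)` gives roughly 0.04725,
which matches the second row of the table.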

Change-Id: I1be6f4c52e2db8cba35b3938a236913faa321e9e
Reviewed-on: http://gerrit.cloudera.org:8080/351
Reviewed-by: Martin Grund <mgrund@cloudera.com>
Tested-by: Internal Jenkins
latest commit 84c1f835f3
grundprinzip authored, Internal Jenkins committed

README.md

Welcome to Impala

Lightning-fast, distributed SQL queries for petabytes of data stored in Apache Hadoop clusters.

Impala is a modern, massively-distributed, massively-parallel, C++ query engine that lets you analyze, transform and combine data from a variety of data sources:

  • Best-of-breed performance and scalability.
  • Support for data stored in HDFS, Apache HBase and Amazon S3.
  • Wide analytic SQL support, including window functions and subqueries.
  • On-the-fly code generation using LLVM to generate CPU-efficient code tailored specifically to each individual query.
  • Support for the most commonly used Hadoop file formats, including the Apache Parquet (incubating) project.
  • Apache-licensed, 100% open source.

More about Impala

To learn more about Impala as a business user, or to try Impala live or in a VM, please visit the Impala homepage.

If you are interested in contributing to Impala as a developer, or learning more about Impala's internals and architecture, visit the Impala wiki.
