Remove indexHint function #9542

Merged
merged 4 commits into from
Mar 7, 2020
63 changes: 0 additions & 63 deletions dbms/src/Functions/indexHint.cpp

This file was deleted.

2 changes: 0 additions & 2 deletions dbms/src/Functions/registerFunctionsMiscellaneous.cpp
@@ -28,7 +28,6 @@ void registerFunctionSleepEachRow(FunctionFactory &);
void registerFunctionMaterialize(FunctionFactory &);
void registerFunctionIgnore(FunctionFactory &);
void registerFunctionIgnoreExceptNull(FunctionFactory &);
void registerFunctionIndexHint(FunctionFactory &);
void registerFunctionIdentity(FunctionFactory &);
void registerFunctionArrayJoin(FunctionFactory &);
void registerFunctionReplicate(FunctionFactory &);
@@ -87,7 +86,6 @@ void registerFunctionsMiscellaneous(FunctionFactory & factory)
registerFunctionMaterialize(factory);
registerFunctionIgnore(factory);
registerFunctionIgnoreExceptNull(factory);
registerFunctionIndexHint(factory);
registerFunctionIdentity(factory);
registerFunctionArrayJoin(factory);
registerFunctionReplicate(factory);
10 changes: 0 additions & 10 deletions dbms/src/Interpreters/ActionsVisitor.cpp
@@ -400,16 +400,6 @@ void ActionsMatcher::visit(const ASTFunction & node, const ASTPtr & ast, Data &
}
}

/// A special function `indexHint`. Everything that is inside it is not calculated
/// (and is used only for index analysis, see KeyCondition).
if (node.name == "indexHint")
{
data.addAction(ExpressionAction::addColumn(ColumnWithTypeAndName(
ColumnConst::create(ColumnUInt8::create(1, 1), 1), std::make_shared<DataTypeUInt8>(),
column_name.get(ast))));
return;
}

if (AggregateFunctionFactory::instance().isAggregateFunctionName(node.name))
return;

3 changes: 1 addition & 2 deletions dbms/src/Interpreters/RequiredSourceColumnsVisitor.cpp
@@ -51,9 +51,8 @@ bool RequiredSourceColumnsMatcher::needChildVisit(const ASTPtr & node, const AST

if (const auto * f = node->as<ASTFunction>())
{
/// "indexHint" is a special function for index analysis. Everything that is inside it is not calculated. @sa KeyCondition
/// "lambda" visit children itself.
if (f->name == "indexHint" || f->name == "lambda")
if (f->name == "lambda")
return false;
}

12 changes: 4 additions & 8 deletions dbms/src/Storages/MergeTree/KeyCondition.cpp
@@ -281,11 +281,11 @@ static const std::map<std::string, std::string> inverse_relations = {

bool isLogicalOperator(const String & func_name)
{
return (func_name == "and" || func_name == "or" || func_name == "not" || func_name == "indexHint");
return (func_name == "and" || func_name == "or" || func_name == "not");
}

/// The node can be one of:
/// - Logical operator (AND, OR, NOT and indexHint() - logical NOOP)
/// - Logical operator (AND, OR, NOT)
/// - An "atom" (relational operator, constant, expression)
/// - A logical constant expression
/// - Any other function
@@ -302,8 +302,7 @@ ASTPtr cloneASTWithInversionPushDown(const ASTPtr node, const bool need_inversio

const auto result_node = makeASTFunction(func->name);

/// indexHint() is a special case - logical NOOP function
if (result_node->name != "indexHint" && need_inversion)
if (need_inversion)
{
result_node->name = (result_node->name == "and") ? "or" : "and";
}
@@ -887,9 +886,6 @@ bool KeyCondition::tryParseAtomFromAST(const ASTPtr & node, const Context & cont
bool KeyCondition::tryParseLogicalOperatorFromAST(const ASTFunction * func, RPNElement & out)
{
/// Functions AND, OR, NOT.
/** Also a special function `indexHint` - works as if instead of calling a function there are just parentheses
* (or, the same thing - calling the function `and` from one argument).
*/
const ASTs & args = func->arguments->children;

if (func->name == "not")
@@ -901,7 +897,7 @@ bool KeyCondition::tryParseLogicalOperatorFromAST(const ASTFunction * func, RPNE
}
else
{
if (func->name == "and" || func->name == "indexHint")
if (func->name == "and")
out.function = RPNElement::FUNCTION_AND;
else if (func->name == "or")
out.function = RPNElement::FUNCTION_OR;
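
For readers skimming this hunk, here is a minimal sketch of what the change does to name classification. It is not the ClickHouse code: `LogicalFunction`, `parseLogicalOperatorBefore`, `parseLogicalOperatorAfter` and `demonstrateChange` are hypothetical names, and the `return false` for unknown names is an assumption about the part of the function the diff does not show.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the logical subset of RPNElement::Function.
enum class LogicalFunction { And, Or, Not };

// Before this PR: `indexHint` is classified like `and`, i.e. it behaves
// as if the call were just a pair of parentheses around its argument.
bool parseLogicalOperatorBefore(const std::string & name, LogicalFunction & out)
{
    if (name == "not")
        out = LogicalFunction::Not;
    else if (name == "and" || name == "indexHint")
        out = LogicalFunction::And;
    else if (name == "or")
        out = LogicalFunction::Or;
    else
        return false; // assumption: other names are not logical operators
    return true;
}

// After this PR: `indexHint` is no longer recognised as a logical operator.
bool parseLogicalOperatorAfter(const std::string & name, LogicalFunction & out)
{
    if (name == "not")
        out = LogicalFunction::Not;
    else if (name == "and")
        out = LogicalFunction::And;
    else if (name == "or")
        out = LogicalFunction::Or;
    else
        return false; // assumption: other names are not logical operators
    return true;
}

// Usage: the only name whose classification changes is "indexHint".
void demonstrateChange()
{
    LogicalFunction f = LogicalFunction::Or;
    assert(parseLogicalOperatorBefore("indexHint", f) && f == LogicalFunction::And);
    assert(!parseLogicalOperatorAfter("indexHint", f));
}
```

The same pattern repeats in `RPNBuilder.h` and `MergeTreeIndexSet.cpp` in this diff: each place that accepted `indexHint` alongside `and` now accepts `and` only.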
4 changes: 2 additions & 2 deletions dbms/src/Storages/MergeTree/MergeTreeIndexSet.cpp
@@ -363,7 +363,7 @@ bool MergeTreeIndexConditionSet::operatorFromAST(ASTPtr & node) const

func->name = "__bitSwapLastTwo";
}
else if (func->name == "and" || func->name == "indexHint")
else if (func->name == "and")
{
auto last_arg = args.back();
args.pop_back();
@@ -419,7 +419,7 @@ bool MergeTreeIndexConditionSet::checkASTUseless(const ASTPtr & node, bool atomi

const ASTs & args = func->arguments->children;

if (func->name == "and" || func->name == "indexHint")
if (func->name == "and")
return checkASTUseless(args[0], atomic) && checkASTUseless(args[1], atomic);
else if (func->name == "or")
return checkASTUseless(args[0], atomic) || checkASTUseless(args[1], atomic);
4 changes: 0 additions & 4 deletions dbms/src/Storages/MergeTree/MergeTreeWhereOptimizer.cpp
@@ -332,10 +332,6 @@ bool MergeTreeWhereOptimizer::cannotBeMoved(const ASTPtr & ptr) const
if ("globalIn" == function_ptr->name
|| "globalNotIn" == function_ptr->name)
return true;

/// indexHint is a special function that it does not make sense to transfer to PREWHERE
if ("indexHint" == function_ptr->name)
return true;
}
else if (auto opt_name = IdentifierSemantic::getColumnName(ptr))
{
5 changes: 1 addition & 4 deletions dbms/src/Storages/MergeTree/RPNBuilder.h
@@ -91,9 +91,6 @@ class RPNBuilder
bool operatorFromAST(const ASTFunction * func, RPNElement & out)
{
/// Functions AND, OR, NOT.
/** Also a special function `indexHint` - works as if instead of calling a function there are just parentheses
* (or, the same thing - calling the function `and` from one argument).
*/
const ASTs & args = typeid_cast<const ASTExpressionList &>(*func->arguments).children;

if (func->name == "not")
@@ -105,7 +102,7 @@ class RPNBuilder
}
else
{
if (func->name == "and" || func->name == "indexHint")
if (func->name == "and")
out.function = RPNElement::FUNCTION_AND;
else if (func->name == "or")
out.function = RPNElement::FUNCTION_OR;
105 changes: 1 addition & 104 deletions docs/en/query_language/functions/other_functions.md
@@ -734,109 +734,6 @@ SELECT defaultValueOfArgumentType( CAST(1 AS Nullable(Int8) ) )
└───────────────────────────────────────────────────────┘
```

## indexHint {#indexhint}

The function is intended for debugging and introspection purposes. The function ignores it's argument and always returns 1. Arguments are not even evaluated.

But for the purpose of index analysis, the argument of this function is analyzed as if it was present directly without being wrapped inside `indexHint` function. This allows to select data in index ranges by the corresponding condition but without further filtering by this condition. The index in ClickHouse is sparse and using `indexHint` will yield more data than specifying the same condition directly.

**Syntax**

```sql
SELECT * FROM table WHERE indexHint(<expression>)
```

**Returned value**

1. Type: [Uint8](https://clickhouse.yandex/docs/en/data_types/int_uint/#diapazony-uint).

**Example**

Here is the example of test data from the table [ontime](../../getting_started/example_datasets/ontime.md).

Input table:

```sql
SELECT count() FROM ontime
```

```text
┌─count()─┐
│ 4276457 │
└─────────┘
```

The table has indexes on the fields `(FlightDate, (Year, FlightDate))`.

Create a query, where the index is not used.

Query:

```sql
SELECT FlightDate AS k, count() FROM ontime GROUP BY k ORDER BY k
```

ClickHouse processed the entire table (`Processed 4.28 million rows`).

Result:

```text
┌──────────k─┬─count()─┐
│ 2017-01-01 │ 13970 │
│ 2017-01-02 │ 15882 │
........................
│ 2017-09-28 │ 16411 │
│ 2017-09-29 │ 16384 │
│ 2017-09-30 │ 12520 │
└────────────┴─────────┘
```

To apply the index, select a specific date.

Query:

```sql
SELECT FlightDate AS k, count() FROM ontime WHERE k = '2017-09-15' GROUP BY k ORDER BY k
```

By using the index, ClickHouse processed a significantly smaller number of rows (`Processed 32.74 thousand rows`).

Result:

```text
┌──────────k─┬─count()─┐
│ 2017-09-15 │ 16428 │
└────────────┴─────────┘
```

Now wrap the expression `k = '2017-09-15'` into `indexHint` function.

Query:

```sql
SELECT
FlightDate AS k,
count()
FROM ontime
WHERE indexHint(k = '2017-09-15')
GROUP BY k
ORDER BY k ASC
```

ClickHouse used the index in the same way as the previous time (`Processed 32.74 thousand rows`).
The expression `k = '2017-09-15'` was not used when generating the result.
In examle the `indexHint` function allows to see adjacent dates.

Result:

```text
┌──────────k─┬─count()─┐
│ 2017-09-14 │ 7071 │
│ 2017-09-15 │ 16428 │
│ 2017-09-16 │ 1077 │
│ 2017-09-30 │ 8167 │
└────────────┴─────────┘
```
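
The behaviour this removed section describes (index ranges are selected by the condition, but rows are not filtered by it afterwards) can be illustrated with a simplified model. This is a sketch under stated assumptions, not how ClickHouse stores or reads its index: `Granule`, `granuleMayContain`, `selectWithWhere` and `selectWithIndexHint` are hypothetical names, and each granule is modelled as a sorted run of date strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one index granule: a non-empty run of rows sorted by date.
// ISO dates (YYYY-MM-DD) compare correctly as plain strings.
using Granule = std::vector<std::string>;

// Index analysis only: could this granule contain the target date at all?
bool granuleMayContain(const Granule & granule, const std::string & target)
{
    assert(!granule.empty());
    return granule.front() <= target && target <= granule.back();
}

// Models `WHERE k = target`: pick granules via the index, then filter every row.
std::vector<std::string> selectWithWhere(const std::vector<Granule> & granules, const std::string & target)
{
    std::vector<std::string> result;
    for (const Granule & granule : granules)
        if (granuleMayContain(granule, target))
            for (const std::string & date : granule)
                if (date == target)
                    result.push_back(date);
    return result;
}

// Models `WHERE indexHint(k = target)`: pick the same granules, but keep all their rows.
std::vector<std::string> selectWithIndexHint(const std::vector<Granule> & granules, const std::string & target)
{
    std::vector<std::string> result;
    for (const Granule & granule : granules)
        if (granuleMayContain(granule, target))
            result.insert(result.end(), granule.begin(), granule.end());
    return result;
}
```

In this model both functions read the same granules, which corresponds to the identical `Processed` row counts in the example above. Only the second one lets rows with adjacent dates through.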

## replicate {#other_functions-replicate}

@@ -1005,7 +902,7 @@ joinGet(join_storage_table_name, `value_column`, join_keys)

Returns list of values corresponded to list of keys.

If certain doesn't exist in source table then `0` or `null` will be returned based on [join_use_nulls](../../operations/settings/settings.md#join_use_nulls) setting.
If certain doesn't exist in source table then `0` or `null` will be returned based on [join_use_nulls](../../operations/settings/settings.md#join_use_nulls) setting.

More info about `join_use_nulls` in [Join operation](../../operations/table_engines/join.md).
