Skip to content

Commit

Permalink
Add better handling of redundant IS [NOT] NULL quals
Browse files Browse the repository at this point in the history
Until now PostgreSQL has not been very smart about optimizing away IS
NOT NULL base quals on columns defined as NOT NULL.  The evaluation of
these needless quals adds overhead.  Ordinarily, anyone who came
complaining about that would likely just have been told to not include
the qual in their query if it's not required.  However, a recent bug
report indicates this might not always be possible.

Bug 17540 highlighted that when we optimize Min/Max aggregates the IS NOT
NULL qual that the planner adds to make the rewritten plan ignore NULLs
can cause issues with poor index choice.  That particular case
demonstrated that other quals, especially ones where no statistics are
available to allow the planner a chance at estimating an approximate
selectivity for can result in poor index choice due to cheap startup paths
being prefered with LIMIT 1.

Here we take generic approach to fixing this by having the planner check
for NOT NULL columns and just have the planner remove these quals (when
they're not needed) for all queries, not just when optimizing Min/Max
aggregates.

Additionally, here we also detect IS NULL quals on a NOT NULL column and
transform that into a gating qual so that we don't have to perform the
scan at all.  This also works for join relations when the Var is not
nullable by any outer join.

This also helps with the self-join removal work as it must replace
strict join quals with IS NOT NULL quals to ensure equivalence with the
original query.

Author: David Rowley, Richard Guo, Andy Fan
Reviewed-by: Richard Guo, David Rowley
Discussion: https://postgr.es/m/CAApHDvqg6XZDhYRPz0zgOcevSMo0d3vxA9DvHrZtKfqO30WTnw@mail.gmail.com
Discussion: https://postgr.es/m/17540-7aa1855ad5ec18b4%40postgresql.org
  • Loading branch information
david-rowley committed Jan 23, 2024
1 parent 183b6f7 commit b262ad4
Show file tree
Hide file tree
Showing 13 changed files with 664 additions and 67 deletions.
16 changes: 8 additions & 8 deletions contrib/postgres_fdw/expected/postgres_fdw.out
Original file line number Diff line number Diff line change
Expand Up @@ -656,20 +656,20 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 =
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0))
(3 rows)

EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL; -- NullTest
QUERY PLAN
-------------------------------------------------------------------------------------------------
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NULL; -- NullTest
QUERY PLAN
----------------------------------------------------------------------------------------------
Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NULL))
(3 rows)

EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL; -- NullTest
QUERY PLAN
-----------------------------------------------------------------------------------------------------
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NOT NULL; -- NullTest
QUERY PLAN
--------------------------------------------------------------------------------------------------
Foreign Scan on public.ft1 t1
Output: c1, c2, c3, c4, c5, c6, c7, c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL))
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL))
(3 rows)

EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
Expand Down
4 changes: 2 additions & 2 deletions contrib/postgres_fdw/sql/postgres_fdw.sql
Original file line number Diff line number Diff line change
Expand Up @@ -332,8 +332,8 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft_empty ORDER BY c1;
-- ===================================================================
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL; -- NullTest
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL; -- NullTest
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NULL; -- NullTest
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NOT NULL; -- NullTest
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l)
EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr
Expand Down
197 changes: 188 additions & 9 deletions src/backend/optimizer/plan/initsplan.c
Original file line number Diff line number Diff line change
Expand Up @@ -2618,6 +2618,193 @@ check_redundant_nullability_qual(PlannerInfo *root, Node *clause)
return false;
}

/*
* add_base_clause_to_rel
* Add 'restrictinfo' as a baserestrictinfo to the base relation denoted
* by 'relid'. We offer some simple prechecks to try to determine if the
* qual is always true, in which case we ignore it rather than add it.
* If we detect the qual is always false, we replace it with
* constant-FALSE.
*/
static void
add_base_clause_to_rel(PlannerInfo *root, Index relid,
RestrictInfo *restrictinfo)
{
RelOptInfo *rel = find_base_rel(root, relid);

Assert(bms_membership(restrictinfo->required_relids) == BMS_SINGLETON);

/* Don't add the clause if it is always true */
if (restriction_is_always_true(root, restrictinfo))
return;

/*
* Substitute the origin qual with constant-FALSE if it is provably always
* false. Note that we keep the same rinfo_serial.
*/
if (restriction_is_always_false(root, restrictinfo))
{
int save_rinfo_serial = restrictinfo->rinfo_serial;

restrictinfo = make_restrictinfo(root,
(Expr *) makeBoolConst(false, false),
restrictinfo->is_pushed_down,
restrictinfo->has_clone,
restrictinfo->is_clone,
restrictinfo->pseudoconstant,
0, /* security_level */
restrictinfo->required_relids,
restrictinfo->incompatible_relids,
restrictinfo->outer_relids);
restrictinfo->rinfo_serial = save_rinfo_serial;
}

/* Add clause to rel's restriction list */
rel->baserestrictinfo = lappend(rel->baserestrictinfo, restrictinfo);

/* Update security level info */
rel->baserestrict_min_security = Min(rel->baserestrict_min_security,
restrictinfo->security_level);
}

/*
* expr_is_nonnullable
* Check to see if the Expr cannot be NULL
*
* If the Expr is a simple Var that is defined NOT NULL and meanwhile is not
* nulled by any outer joins, then we can know that it cannot be NULL.
*/
static bool
expr_is_nonnullable(PlannerInfo *root, Expr *expr)
{
RelOptInfo *rel;
Var *var;

/* For now only check simple Vars */
if (!IsA(expr, Var))
return false;

var = (Var *) expr;

/* could the Var be nulled by any outer joins? */
if (!bms_is_empty(var->varnullingrels))
return false;

/* system columns cannot be NULL */
if (var->varattno < 0)
return true;

/* is the column defined NOT NULL? */
rel = find_base_rel(root, var->varno);
if (var->varattno > 0 &&
bms_is_member(var->varattno, rel->notnullattnums))
return true;

return false;
}

/*
* restriction_is_always_true
* Check to see if the RestrictInfo is always true.
*
* Currently we only check for NullTest quals and OR clauses that include
* NullTest quals. We may extend it in the future.
*/
bool
restriction_is_always_true(PlannerInfo *root,
RestrictInfo *restrictinfo)
{
/* Check for NullTest qual */
if (IsA(restrictinfo->clause, NullTest))
{
NullTest *nulltest = (NullTest *) restrictinfo->clause;

/* is this NullTest an IS_NOT_NULL qual? */
if (nulltest->nulltesttype != IS_NOT_NULL)
return false;

return expr_is_nonnullable(root, nulltest->arg);
}

/* If it's an OR, check its sub-clauses */
if (restriction_is_or_clause(restrictinfo))
{
ListCell *lc;

Assert(is_orclause(restrictinfo->orclause));

/*
* if any of the given OR branches is provably always true then the
* entire condition is true.
*/
foreach(lc, ((BoolExpr *) restrictinfo->orclause)->args)
{
Node *orarg = (Node *) lfirst(lc);

if (!IsA(orarg, RestrictInfo))
continue;

if (restriction_is_always_true(root, (RestrictInfo *) orarg))
return true;
}
}

return false;
}

/*
* restriction_is_always_false
* Check to see if the RestrictInfo is always false.
*
* Currently we only check for NullTest quals and OR clauses that include
* NullTest quals. We may extend it in the future.
*/
bool
restriction_is_always_false(PlannerInfo *root,
RestrictInfo *restrictinfo)
{
/* Check for NullTest qual */
if (IsA(restrictinfo->clause, NullTest))
{
NullTest *nulltest = (NullTest *) restrictinfo->clause;

/* is this NullTest an IS_NULL qual? */
if (nulltest->nulltesttype != IS_NULL)
return false;

return expr_is_nonnullable(root, nulltest->arg);
}

/* If it's an OR, check its sub-clauses */
if (restriction_is_or_clause(restrictinfo))
{
ListCell *lc;

Assert(is_orclause(restrictinfo->orclause));

/*
* Currently, when processing OR expressions, we only return true when
* all of the OR branches are always false. This could perhaps be
* expanded to remove OR branches that are provably false. This may
* be a useful thing to do as it could result in the OR being left
* with a single arg. That's useful as it would allow the OR
* condition to be replaced with its single argument which may allow
* use of an index for faster filtering on the remaining condition.
*/
foreach(lc, ((BoolExpr *) restrictinfo->orclause)->args)
{
Node *orarg = (Node *) lfirst(lc);

if (!IsA(orarg, RestrictInfo) ||
!restriction_is_always_false(root, (RestrictInfo *) orarg))
return false;
}
return true;
}

return false;
}

/*
* distribute_restrictinfo_to_rels
* Push a completed RestrictInfo into the proper restriction or join
Expand All @@ -2632,7 +2819,6 @@ distribute_restrictinfo_to_rels(PlannerInfo *root,
RestrictInfo *restrictinfo)
{
Relids relids = restrictinfo->required_relids;
RelOptInfo *rel;

if (!bms_is_empty(relids))
{
Expand All @@ -2644,14 +2830,7 @@ distribute_restrictinfo_to_rels(PlannerInfo *root,
* There is only one relation participating in the clause, so it
* is a restriction clause for that relation.
*/
rel = find_base_rel(root, relid);

/* Add clause to rel's restriction list */
rel->baserestrictinfo = lappend(rel->baserestrictinfo,
restrictinfo);
/* Update security level info */
rel->baserestrict_min_security = Min(rel->baserestrict_min_security,
restrictinfo->security_level);
add_base_clause_to_rel(root, relid, restrictinfo);
}
else
{
Expand Down
28 changes: 28 additions & 0 deletions src/backend/optimizer/util/joininfo.c
Original file line number Diff line number Diff line change
Expand Up @@ -14,9 +14,12 @@
*/
#include "postgres.h"

#include "nodes/makefuncs.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/planmain.h"
#include "optimizer/restrictinfo.h"


/*
Expand Down Expand Up @@ -98,6 +101,31 @@ add_join_clause_to_rels(PlannerInfo *root,
{
int cur_relid;

/* Don't add the clause if it is always true */
if (restriction_is_always_true(root, restrictinfo))
return;

/*
* Substitute constant-FALSE for the origin qual if it is always false.
* Note that we keep the same rinfo_serial.
*/
if (restriction_is_always_false(root, restrictinfo))
{
int save_rinfo_serial = restrictinfo->rinfo_serial;

restrictinfo = make_restrictinfo(root,
(Expr *) makeBoolConst(false, false),
restrictinfo->is_pushed_down,
restrictinfo->has_clone,
restrictinfo->is_clone,
restrictinfo->pseudoconstant,
0, /* security_level */
restrictinfo->required_relids,
restrictinfo->incompatible_relids,
restrictinfo->outer_relids);
restrictinfo->rinfo_serial = save_rinfo_serial;
}

cur_relid = -1;
while ((cur_relid = bms_next_member(join_relids, cur_relid)) >= 0)
{
Expand Down
19 changes: 19 additions & 0 deletions src/backend/optimizer/util/plancat.c
Original file line number Diff line number Diff line change
Expand Up @@ -163,6 +163,25 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
rel->attr_widths = (int32 *)
palloc0((rel->max_attr - rel->min_attr + 1) * sizeof(int32));

/* record which columns are defined as NOT NULL */
for (int i = 0; i < relation->rd_att->natts; i++)
{
FormData_pg_attribute *attr = &relation->rd_att->attrs[i];

if (attr->attnotnull)
{
rel->notnullattnums = bms_add_member(rel->notnullattnums,
attr->attnum);

/*
* Per RemoveAttributeById(), dropped columns will have their
* attnotnull unset, so we needn't check for dropped columns in
* the above condition.
*/
Assert(!attr->attisdropped);
}
}

/*
* Estimate relation size --- unless it's an inheritance parent, in which
* case the size we want is not the rel's own size but the size of its
Expand Down
3 changes: 3 additions & 0 deletions src/backend/optimizer/util/relnode.c
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
rel->relid = relid;
rel->rtekind = rte->rtekind;
/* min_attr, max_attr, attr_needed, attr_widths are set below */
rel->notnullattnums = NULL;
rel->lateral_vars = NIL;
rel->indexlist = NIL;
rel->statlist = NIL;
Expand Down Expand Up @@ -719,6 +720,7 @@ build_join_rel(PlannerInfo *root,
joinrel->max_attr = 0;
joinrel->attr_needed = NULL;
joinrel->attr_widths = NULL;
joinrel->notnullattnums = NULL;
joinrel->nulling_relids = NULL;
joinrel->lateral_vars = NIL;
joinrel->lateral_referencers = NULL;
Expand Down Expand Up @@ -917,6 +919,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
joinrel->max_attr = 0;
joinrel->attr_needed = NULL;
joinrel->attr_widths = NULL;
joinrel->notnullattnums = NULL;
joinrel->nulling_relids = NULL;
joinrel->lateral_vars = NIL;
joinrel->lateral_referencers = NULL;
Expand Down
7 changes: 6 additions & 1 deletion src/include/nodes/pathnodes.h
Original file line number Diff line number Diff line change
Expand Up @@ -913,6 +913,8 @@ typedef struct RelOptInfo
Relids *attr_needed pg_node_attr(read_write_ignore);
/* array indexed [min_attr .. max_attr] */
int32 *attr_widths pg_node_attr(read_write_ignore);
/* zero-based set containing attnums of NOT NULL columns */
Bitmapset *notnullattnums;
/* relids of outer joins that can null this baserel */
Relids nulling_relids;
/* LATERAL Vars and PHVs referenced by rel */
Expand Down Expand Up @@ -2598,7 +2600,10 @@ typedef struct RestrictInfo
* 2. If we manufacture a commuted version of a qual to use as an index
* condition, it copies the original's rinfo_serial, since it is in
* practice the same condition.
* 3. RestrictInfos made for a child relation copy their parent's
* 3. If we reduce a qual to constant-FALSE, the new constant-FALSE qual
* copies the original's rinfo_serial, since it is in practice the same
* condition.
* 4. RestrictInfos made for a child relation copy their parent's
* rinfo_serial. Likewise, when an EquivalenceClass makes a derived
* equality clause for a child relation, it copies the rinfo_serial of
* the matching equality clause for the parent. This allows detection
Expand Down
4 changes: 4 additions & 0 deletions src/include/optimizer/planmain.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,6 +76,10 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
extern bool restriction_is_always_true(PlannerInfo *root,
RestrictInfo *restrictinfo);
extern bool restriction_is_always_false(PlannerInfo *root,
RestrictInfo *restrictinfo);
extern void distribute_restrictinfo_to_rels(PlannerInfo *root,
RestrictInfo *restrictinfo);
extern RestrictInfo *process_implied_equality(PlannerInfo *root,
Expand Down
Loading

0 comments on commit b262ad4

Please sign in to comment.