[Enhancement] explode_bitmap with large bitmap is very slow #12034

@cambyzju

Search before asking

  • I had searched in the issues and found no similar issues.

Description

  1. Two tables, A and B; B is a bitmap table:
> desc A;
+------------------+-------------+------+-------+---------+---------+
| Field            | Type        | Null | Key   | Default | Extra   |
+------------------+-------------+------+-------+---------+---------+
| id               | BIGINT      | Yes  | true  | NULL    |         |
| name             | VARCHAR(50) | Yes  | false | NULL    | REPLACE |
+------------------+-------------+------+-------+---------+---------+
> desc B;
+------------+--------+------+-------+---------+--------------+
| Field      | Type   | Null | Key   | Default | Extra        |
+------------+--------+------+-------+---------+--------------+
| key        | BIGINT | Yes  | true  | NULL    |              |
| ids        | BITMAP | Yes  | false |         | BITMAP_UNION |
+------------+--------+------+-------+---------+--------------+
  2. Create some test data, for example so that bitmap_count(B.ids) = 100,000.

  3. Run a query with explode_bitmap; it is very slow (1m12s) for a bitmap with 100,000 elements:
    select name, count(*) as num from A where id in (select bc from B lateral view explode_bitmap(ids) tmp_v as bc where key=100000) group by name order by num desc limit 10;

  4. After profiling, I found that explode_bitmap returns a large amount of data, which makes the query very slow.

Solution

Avoid copying unused columns in the table function node.
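
To illustrate the idea, here is a minimal Python sketch. It is not the engine's actual implementation: the function name `explode_rows`, the `needed_cols` parameter, and modelling a bitmap as a Python set are all assumptions made for illustration. It emits one output row per bitmap element and copies only the input columns that the rest of the query actually reads. In the query above only `bc` is used, so nothing else would need to be copied.

```python
from typing import Any, Dict, Iterable, Iterator, List, Set

Row = Dict[str, Any]


def explode_rows(
    rows: Iterable[Row],
    explode_col: str,
    out_col: str,
    needed_cols: Set[str],
) -> Iterator[Row]:
    """Yield one output row per element of row[explode_col].

    Only the columns listed in needed_cols are carried over from the
    input row (hypothetical pruning step; the bitmap is modelled as a set).
    """
    for row in rows:
        # Project once per input row instead of once per output row.
        kept = {col: row[col] for col in needed_cols if col in row}
        for value in sorted(row[explode_col]):
            out = dict(kept)       # shallow copy of the kept columns only
            out[out_col] = value   # the exploded element, e.g. "bc"
            yield out


def copied_cells(rows: List[Row]) -> int:
    """Count the column values materialised across all output rows."""
    return sum(len(row) for row in rows)


# Hypothetical input: one row of table B whose bitmap holds 1000 ids.
b_rows = [{"key": 100000, "ids": set(range(1000))}]

# Without pruning, every output row also carries "key" and "ids".
unpruned = list(explode_rows(b_rows, "ids", "bc", needed_cols={"key", "ids"}))
# With pruning, every output row carries only "bc".
pruned = list(explode_rows(b_rows, "ids", "bc", needed_cols=set()))

print(copied_cells(unpruned), copied_cells(pruned))
```

In this toy model the unpruned variant materialises three values per output row and the pruned variant one. In a real engine the saving would likely be larger, because copying the bitmap column for every output row means copying the whole bitmap each time.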

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
