
cubestore preAggregations Unsupported CAST from Utf8 to Timestamp(Nanosecond, None) #2646

Closed
rongfengliang opened this issue May 2, 2021 · 6 comments
Labels
backend:cube-store Issues relating to Cube Store question The issue is a question. Please use Stack Overflow for questions.

Comments

@rongfengliang
Contributor

rongfengliang commented May 2, 2021

Describe the bug

Unsupported CAST from Utf8 to Timestamp(Nanosecond, None)

Logs

Query: 'SELECT
      `hr_sea_approve_statistics`.approverid `hr_sea_approve_statistics__approverid`, `hr_sea_approve_statistics`.avg_order_id `hr_sea_approve_statistics__avg_order_id`, `hr_sea_approve_statistics`.max_instance_time `hr_sea_approve_statistics__max_instance_time`, count(*) `hr_sea_approve_statistics__count`, sum(`hr_sea_approve_statistics`.instance_count) `hr_sea_approve_statistics__instance_count`
    FROM
      dev_pre_aggregations.hr_sea_approve_statistics_main_zkuw4aso_xaclbdwo_1g8rvge AS `hr_sea_approve_statistics`  WHERE (`hr_sea_approve_statistics`.start_date >= to_timestamp('2021-05-01T00:00:00.000') AND `hr_sea_approve_statistics`.start_date <= to_timestamp('2021-05-01T23:59:59.999')) GROUP BY 1, 2, 3 ORDER BY 4 DESC LIMIT 10000'
2021-05-02 01:19:27,290 TRACE [cubestore::queryplanner] Logical Plan: Limit: 10000
  Sort: #hr_sea_approve_statistics__count DESC NULLS FIRST
    Projection: #hr_sea_approve_statistics__approverid, #hr_sea_approve_statistics__avg_order_id, #hr_sea_approve_statistics__max_instance_time, #COUNT(UInt8(1)) AS hr_sea_approve_statistics__count, #SUM(instance_count) AS hr_sea_approve_statistics__instance_count
      Aggregate: groupBy=[[#hr_sea_approve_statistics.approverid AS hr_sea_approve_statistics__approverid, #hr_sea_approve_statistics.avg_order_id AS hr_sea_approve_statistics__avg_order_id, #hr_sea_approve_statistics.max_instance_time AS hr_sea_approve_statistics__max_instance_time]], aggr=[[COUNT(UInt8(1)), SUM(#hr_sea_approve_statistics.instance_count)]]
        Filter: #hr_sea_approve_statistics.start_date GtEq totimestamp(Utf8("2021-05-01T00:00:00.000")) And #hr_sea_approve_statistics.start_date LtEq totimestamp(Utf8("2021-05-01T23:59:59.999"))
          TableScan: dev_pre_aggregations.hr_sea_approve_statistics_main_zkuw4aso_xaclbdwo_1g8rvge projection=Some([0, 2, 3, 5, 8]), filters=[#start_date GtEq totimestamp(Utf8("2021-05-01T00:00:00.000")), #start_date LtEq totimestamp(Utf8("2021-05-01T23:59:59.999"))]
2021-05-02 01:19:27,291 TRACE [cubestore::queryplanner::planning] Extracted partition filter is PartitionFilter { min_max: [] }
2021-05-02 01:19:27,291 TRACE [cubestore::queryplanner::planning] Pruned 0 of 1 partitions
2021-05-02 01:19:27,292 TRACE [cubestore::http] Sending web socket response
2021-05-02 01:19:27,603 TRACE [cubestore::http] Received web socket message
2021-05-02 01:19:27,604 TRACE [cubestore::sql] Query: 'SELECT
      `hr_sea_approve_statistics`.approverid `hr_sea_approve_statistics__approverid`, `hr_sea_approve_statistics`.avg_order_id `hr_sea_approve_statistics__avg_order_id`, `hr_sea_approve_statistics`.max_instance_time `hr_sea_approve_statistics__max_instance_time`, count(*) `hr_sea_approve_statistics__count`, sum(`hr_sea_approve_statistics`.instance_count) `hr_sea_approve_statistics__instance_count`
    FROM
      dev_pre_aggregations.hr_sea_approve_statistics_main_zkuw4aso_xaclbdwo_1g8rvge AS `hr_sea_approve_statistics`  WHERE (`hr_sea_approve_statistics`.start_date >= to_timestamp('2021-05-01T00:00:00.000') AND `hr_sea_approve_statistics`.start_date <= to_timestamp('2021-05-01T23:59:59.999')) GROUP BY 1, 2, 3 ORDER BY 4 DESC LIMIT 10000'
2021-05-02 01:19:27,605 TRACE [cubestore::queryplanner] Logical Plan: Limit: 10000
  Sort: #hr_sea_approve_statistics__count DESC NULLS FIRST
    Projection: #hr_sea_approve_statistics__approverid, #hr_sea_approve_statistics__avg_order_id, #hr_sea_approve_statistics__max_instance_time, #COUNT(UInt8(1)) AS hr_sea_approve_statistics__count, #SUM(instance_count) AS hr_sea_approve_statistics__instance_count
      Aggregate: groupBy=[[#hr_sea_approve_statistics.approverid AS hr_sea_approve_statistics__approverid, #hr_sea_approve_statistics.avg_order_id AS hr_sea_approve_statistics__avg_order_id, #hr_sea_approve_statistics.max_instance_time AS hr_sea_approve_statistics__max_instance_time]], aggr=[[COUNT(UInt8(1)), SUM(#hr_sea_approve_statistics.instance_count)]]
        Filter: #hr_sea_approve_statistics.start_date GtEq totimestamp(Utf8("2021-05-01T00:00:00.000")) And #hr_sea_approve_statistics.start_date LtEq totimestamp(Utf8("2021-05-01T23:59:59.999"))
          TableScan: dev_pre_aggregations.hr_sea_approve_statistics_main_zkuw4aso_xaclbdwo_1g8rvge projection=Some([0, 2, 3, 5, 8]), filters=[#start_date GtEq totimestamp(Utf8("2021-05-01T00:00:00.000")), #start_date LtEq totimestamp(Utf8("2021-05-01T23:59:59.999"))]
2021-05-02 01:19:27,605 TRACE [cubestore::queryplanner::planning] Extracted partition filter is PartitionFilter { min_max: [] }

Version: 0.27.4

Related: apache/arrow#9449

@paveltiunov
Member

Hey @rongfengliang! We indeed need a better way of handling this. However, the issue originates from the lack of a timestamp cast during rollup preparation. We want to avoid any implicit UTF8-to-timestamp casts at query time, as those are really slow. So you just need to add one in your time dimension `sql` in the cube.js schema.
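For example, a sketch of such an explicit cast (column and cube names are taken from the schema shared later in this thread; `::timestamp` is Postgres-specific syntax and an assumption here, not something from the issue):

```javascript
// Hypothetical sketch: cast the raw column to a timestamp inside the
// dimension's `sql`, so the pre-aggregation stores a real timestamp
// column instead of a string.
startdate: {
  sql: `${CUBE}.startdate::timestamp`,
  type: `time`
}
```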

@paveltiunov paveltiunov added the question The issue is a question. Please use Stack Overflow for questions. label May 3, 2021
@rongfengliang
Contributor Author

@paveltiunov thanks, I have added the time dimension `sql` in the cube.js schema. Adding a time filter still causes this problem.

@paveltiunov
Member

@rongfengliang Could you please share your cube.js schema?

@hassankhan hassankhan added the backend:cube-store Issues relating to Cube Store label May 4, 2021
@rongfengliang
Contributor Author

rongfengliang commented May 5, 2021

I found this may be caused by the data type mapping (the CREATE TABLE in Cube Store).

I use Postgres.

This is my schema:

cube(`Demoapp`, {
  sql: `SELECT * FROM public.demoapp`,
  
  joins: {
    
  },
  preAggregations: {
    main: {
      type: `originalSql`,
      external: true
    },
  },
  measures: {
    count: {
      type: `count`,
      drillMembers: [id, name, startdate]
    }
  },
  
  dimensions: {
    id: {
      sql: `id`,
      type: `number`,
      primaryKey: true
    },
    
    name: {
      sql: `name`,
      type: `string`
    },
    
    startdate: {
      sql: `startdate`,
      type: `time`
    },
    
    insertdata: {
      sql: `insertdata`,
      type: `time`
    },
    
    myinfo: {
      sql: `myinfo`,
      type: `time`
    }
  },
  
  dataSource: `default`
});

My table definition:

CREATE TABLE demoapp (
    id integer,
    name character varying,
    startdate date,
    insertdata time without time zone,
    myinfo timestamp without time zone
);

Filtering on myinfo works, but filtering on startdate or insertdata causes the cast exception.

In the Cube Store table info, startdate is stored as a string, and so is insertdata. (screenshot omitted)

@paveltiunov
Member

@rongfengliang Gotcha. Generally speaking, Cube Store is designed to store rollups, not originalSql pre-aggregations. We're still going to support originalSql as an intermediate step during rollup preparation, but storing originalSql in Cube Store is outside our roadmap.
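A rollup pre-aggregation for the Demoapp cube might look like this (a sketch under the Cube.js 0.27-era schema API, not something from the issue; the member choices and granularity are assumptions):

```javascript
// Hypothetical rollup sketch: Cube Store receives pre-typed,
// pre-aggregated data, so no UTF8-to-timestamp cast is needed
// at query time.
preAggregations: {
  main: {
    type: `rollup`,
    measureReferences: [count],
    dimensionReferences: [name],
    timeDimensionReference: startdate,
    granularity: `day`,
    external: true
  }
}
```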

@rongfengliang
Contributor Author

@paveltiunov thanks !!!
