Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Identical measures behave differently: one causes "Unrecognized name" error in generated SQL for BigQuery #8206

Closed
grzaks opened this issue Apr 30, 2024 · 1 comment
Labels
bug Something isn't working question The issue is a question. Please use Stack Overflow for questions.

Comments

@grzaks
Copy link

grzaks commented Apr 30, 2024

Problem

The following schema is simplified example of an issue I have with my production schema (~15 cubes, BigQuery data source). It consists of just two joined cubes poc_orders and poc_daily_metrics and some simple measures defined.

There is poc_orders.something measure that is using related cube measure. If you query this measure everything works as you might expect. Then there are poc_orders.count and poc_orders.count2 measures which are identical.

There is measure_working_fine which divides poc_orders.something by poc_orders.count and this one works fine.

Then there is measure_causing_error which divides poc_orders.something by poc_orders.count2 and here comes the issue: querying measure_causing_error makes the generated SQL that causes Unrecognized name: poc_orders at [1:79] error at BigQuery. Unless you query it together with poc_orders.count2, then it works fine.

How is that possible if count and count2 measures are 100% identical? Why it works only when poc_orders.count2 is also part of the query?

I pasted related schema file, generated SQL and log below and in this gist

Related Cube.js schema

cube(`poc_orders`, {
    public: false,
    sql: `
    SELECT co.*
    FROM cubes.orders co
    WHERE ${FILTER_PARAMS.orders.work_date.filter("addTimestamp")}
    `,
    joins: {
        poc_daily_metrics: {
            relationship: `many_to_one`,
            sql: `
            ${branch_uid} = ${poc_daily_metrics.branch_uid} 
            AND DATE(${work_date}) = ${poc_daily_metrics.work_date}
            `,
        },

    },
    measures: {
        count: {
            type: 'count',
        },

        count2: {
            type: 'count',
        },

        something: {
            sql: `${count} * ${poc_daily_metrics.finished_orders_count}`,
            type: 'number'
        },

        measure_working_fine: {
            sql: `${something}/${count}`,
            type: 'number',
        },

        measure_causing_error: {
            sql: `${something}/${count2}`,
            type: 'number',
        },
    },
    dimensions: {
        uid: {
            sql: `${CUBE}.uid`,
            type: `string`,
            primaryKey: true,
        },
        branch_uid: {
            sql: `${CUBE}.branchUid`,
            type: `string`,
        },
        work_date: {
            sql: `${CUBE}.addTimestamp`,
            type: `time`,
        },
    },
})

cube(`poc_daily_metrics`,
    {
        public: false,
        sql: `
                SELECT 
                    DATE(startTimestamp) as workDate,
                    branchUid,
                    COUNT(*) as finishedCount
                FROM cubes.activities
                WHERE ( 
                    ${FILTER_PARAMS.orders.work_date.filter("startTimestamp")}
                )
                GROUP BY 1, 2            
        `,
        measures: {
            finished_orders_count: {
                sql: `${CUBE}.finishedCount`,
                type: 'sum',
            },
        },
        dimensions: {
            id: {
                sql: `${work_date} || ${branch_uid}`,
                type: `string`,
                primaryKey: true,
            },
            branch_uid: {
                sql: `${CUBE}.branchUid`,
                type: `string`,
            },
            work_date: {
                sql: `${CUBE}.workDate`,
                type: `time`,
            },
        }
    }
)

Related Cube.js generated SQL for measure_working_fine

SELECT
  `poc_orders__count` * `poc_daily_metrics__finished_orders_count` / `poc_orders__count` `poc_orders__measure_working_fine`
FROM
  (
    SELECT
      count(`main__poc_orders`.uid) `poc_orders__count`
    FROM
      (
        SELECT
          co.*
        FROM
          cubes.orders co
        WHERE
          (1 = 1)
      ) AS `main__poc_orders`
      LEFT JOIN (
        SELECT
          DATE(startTimestamp) as workDate,
          branchUid,
          COUNT(*) as finishedCount
        FROM
          cubes.activities
        WHERE
          ((1 = 1))
        GROUP BY
          1,
          2
      ) AS `main__poc_daily_metrics` ON `main__poc_orders`.branchUid = `main__poc_daily_metrics`.branchUid
      AND DATE(`main__poc_orders`.addTimestamp) = `main__poc_daily_metrics`.workDate
    WHERE
      (
        `main__poc_orders`.addTimestamp >= TIMESTAMP(?)
        AND `main__poc_orders`.addTimestamp <= TIMESTAMP(?)
      )
  ) as q_0,
  (
    SELECT
      sum(
        `poc_daily_metrics_key__poc_daily_metrics`.finishedCount
      ) `poc_daily_metrics__finished_orders_count`
    FROM
      (
        SELECT
          DISTINCT `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid `poc_daily_metrics__id`
        FROM
          (
            SELECT
              co.*
            FROM
              cubes.orders co
            WHERE
              (1 = 1)
          ) AS `poc_daily_metrics_key__poc_orders`
          LEFT JOIN (
            SELECT
              DATE(startTimestamp) as workDate,
              branchUid,
              COUNT(*) as finishedCount
            FROM
              cubes.activities
            WHERE
              ((1 = 1))
            GROUP BY
              1,
              2
          ) AS `poc_daily_metrics_key__poc_daily_metrics` ON `poc_daily_metrics_key__poc_orders`.branchUid = `poc_daily_metrics_key__poc_daily_metrics`.branchUid
          AND DATE(`poc_daily_metrics_key__poc_orders`.addTimestamp) = `poc_daily_metrics_key__poc_daily_metrics`.workDate
        WHERE
          (
            `poc_daily_metrics_key__poc_orders`.addTimestamp >= TIMESTAMP(?)
            AND `poc_daily_metrics_key__poc_orders`.addTimestamp <= TIMESTAMP(?)
          )
      ) AS `keys`
      LEFT JOIN (
        SELECT
          DATE(startTimestamp) as workDate,
          branchUid,
          COUNT(*) as finishedCount
        FROM
          cubes.activities
        WHERE
          ((1 = 1))
        GROUP BY
          1,
          2
      ) AS `poc_daily_metrics_key__poc_daily_metrics` ON `keys`.`poc_daily_metrics__id` = `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid
  ) as q_1
LIMIT
  10000

Related Cube.js generated SQL for measure_causing_error

SELECT
  `poc_orders__count` * `poc_daily_metrics__finished_orders_count` / count(`poc_orders`.uid) `poc_orders__measure_causing_error`
FROM
  (
    SELECT
      count(`main__poc_orders`.uid) `poc_orders__count`
    FROM
      (
        SELECT
          co.*
        FROM
          cubes.orders co
        WHERE
          (1 = 1)
      ) AS `main__poc_orders`
      LEFT JOIN (
        SELECT
          DATE(startTimestamp) as workDate,
          branchUid,
          COUNT(*) as finishedCount
        FROM
          cubes.activities
        WHERE
          ((1 = 1))
        GROUP BY
          1,
          2
      ) AS `main__poc_daily_metrics` ON `main__poc_orders`.branchUid = `main__poc_daily_metrics`.branchUid
      AND DATE(`main__poc_orders`.addTimestamp) = `main__poc_daily_metrics`.workDate
    WHERE
      (
        `main__poc_orders`.addTimestamp >= TIMESTAMP(?)
        AND `main__poc_orders`.addTimestamp <= TIMESTAMP(?)
      )
  ) as q_0,
  (
    SELECT
      sum(
        `poc_daily_metrics_key__poc_daily_metrics`.finishedCount
      ) `poc_daily_metrics__finished_orders_count`
    FROM
      (
        SELECT
          DISTINCT `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid `poc_daily_metrics__id`
        FROM
          (
            SELECT
              co.*
            FROM
              cubes.orders co
            WHERE
              (1 = 1)
          ) AS `poc_daily_metrics_key__poc_orders`
          LEFT JOIN (
            SELECT
              DATE(startTimestamp) as workDate,
              branchUid,
              COUNT(*) as finishedCount
            FROM
              cubes.activities
            WHERE
              ((1 = 1))
            GROUP BY
              1,
              2
          ) AS `poc_daily_metrics_key__poc_daily_metrics` ON `poc_daily_metrics_key__poc_orders`.branchUid = `poc_daily_metrics_key__poc_daily_metrics`.branchUid
          AND DATE(`poc_daily_metrics_key__poc_orders`.addTimestamp) = `poc_daily_metrics_key__poc_daily_metrics`.workDate
        WHERE
          (
            `poc_daily_metrics_key__poc_orders`.addTimestamp >= TIMESTAMP(?)
            AND `poc_daily_metrics_key__poc_orders`.addTimestamp <= TIMESTAMP(?)
          )
      ) AS `keys`
      LEFT JOIN (
        SELECT
          DATE(startTimestamp) as workDate,
          branchUid,
          COUNT(*) as finishedCount
        FROM
          cubes.activities
        WHERE
          ((1 = 1))
        GROUP BY
          1,
          2
      ) AS `poc_daily_metrics_key__poc_daily_metrics` ON `keys`.`poc_daily_metrics__id` = `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid
  ) as q_1
LIMIT
  10000

Diff between the two above generated SQLs

--- generated_valid.sql	2024-04-29 22:00:39
+++ generated_invalid.sql	2024-04-29 22:01:19
@@ -1,5 +1,5 @@
 SELECT
-  `poc_orders__count` * `poc_daily_metrics__finished_orders_count` / `poc_orders__count` `poc_orders__measure_working_fine`
+  `poc_orders__count` * `poc_daily_metrics__finished_orders_count` / count(`poc_orders`.uid) `poc_orders__measure_causing_error`
 FROM
   (
     SELECT
\ No newline at end of file

Part of the cube console output

2024-04-29T22:07:51+02:00 --
2024-04-29T22:07:51+02:00 {
2024-04-29T22:07:51+02:00   "params": [
2024-04-29T22:07:51+02:00     "2024-04-28T00:00:00.000Z",
2024-04-29T22:07:51+02:00     "2024-04-28T23:59:59.999Z",
2024-04-29T22:07:51+02:00     "2024-04-28T00:00:00.000Z",
2024-04-29T22:07:51+02:00     "2024-04-28T23:59:59.999Z"
2024-04-29T22:07:51+02:00   ]
2024-04-29T22:07:51+02:00 } 
2024-04-29T22:07:51+02:00 Error: Unrecognized name: poc_orders at [1:79]
2024-04-29T22:07:51+02:00     at QueryQueue.parseResult (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:397:13)
2024-04-29T22:07:51+02:00     at QueryQueue.executeInQueue (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:369:19)
2024-04-29T22:07:51+02:00     at processTicksAndRejections (node:internal/process/task_queues:95:5)
2024-04-29T22:07:51+02:00     at /cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryCache.ts:756:17
2024-04-29T22:07:51+02:00     at QueryOrchestrator.fetchQuery (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryOrchestrator.ts:302:20)
2024-04-29T22:07:51+02:00     at OrchestratorApi.executeQuery (/cube/node_modules/@cubejs-backend/server-core/src/core/OrchestratorApi.ts:98:20)
2024-04-29T22:07:51+02:00     at /cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1467:21
2024-04-29T22:07:51+02:00     at async Promise.all (index 0)
2024-04-29T22:07:51+02:00     at ApiGateway.getSqlResponseInternal (/cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1465:31)
2024-04-29T22:07:51+02:00     at /cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1643:28
2024-04-29T22:07:51+02:00     at async Promise.all (index 0)
2024-04-29T22:07:51+02:00     at ApiGateway.load (/cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:1634:23)
2024-04-29T22:07:51+02:00     at /cube/node_modules/@cubejs-backend/api-gateway/src/gateway.ts:272:7
2024-04-29T22:07:51+02:00 Orchestrator error: f44bd509-2f0f-4363-bfb4-51cc089113be-span-1 (294ms)
2024-04-29T22:07:51+02:00 --
2024-04-29T22:07:51+02:00 {
2024-04-29T22:07:51+02:00   "measures": [
2024-04-29T22:07:51+02:00     "poc_orders.measure_causing_error"
2024-04-29T22:07:51+02:00   ],
2024-04-29T22:07:51+02:00   "timeDimensions": [
2024-04-29T22:07:51+02:00     {
2024-04-29T22:07:51+02:00       "dimension": "poc_orders.work_date",
2024-04-29T22:07:51+02:00       "dateRange": "Yesterday"
2024-04-29T22:07:51+02:00     }
2024-04-29T22:07:51+02:00   ]
2024-04-29T22:07:51+02:00 }
2024-04-29T22:07:51+02:00 --
2024-04-29T22:07:51+02:00 {
2024-04-29T22:07:51+02:00   "securityContext": {
2024-04-29T22:07:51+02:00     "iat": 1714418741,
2024-04-29T22:07:51+02:00     "exp": 1714505141
2024-04-29T22:07:51+02:00   }
2024-04-29T22:07:51+02:00 } 
2024-04-29T22:07:51+02:00 Error: Unrecognized name: poc_orders at [1:79]

The Cube version is v0.35.24

@grzaks grzaks added the question The issue is a question. Please use Stack Overflow for questions. label Apr 30, 2024
@paveltiunov paveltiunov added the bug Something isn't working label May 1, 2024
@grzaks
Copy link
Author

grzaks commented May 13, 2024

I'm sorry to open this issue again @paveltiunov but I just tested it on v0.35.30 with the same exact schema as above (except FILTER_PARAMS.orders. fixed to correct path FILTER_PARAMS.poc_orders) and the issue is still there. Or I'm missing something.

Hash of the image:

% kubectl get -o json pod cube-api-f9f858f89-w75vw | grep "imageID"                  
"imageID": "docker.io/cubejs/cube@sha256:a3d4763f4bc1871c1d187f7a2b6ab7c4783b89c4a563c45673f2240656422b61",

From inside the container:

root@cube-api-f9f858f89-w75vw:/cube# grep version package.json
  "version": "0.35.30",

Is there any way I can help here?

Schema.js again:

cube(`poc_orders`, {
    public: false,
    sql: `
    SELECT co.*
    FROM cubes.orders co
    WHERE ${FILTER_PARAMS.poc_orders.work_date.filter("addTimestamp")}
    `,
    joins: {
        poc_daily_metrics: {
            relationship: `many_to_one`,
            sql: `
            ${branch_uid} = ${poc_daily_metrics.branch_uid} 
            AND DATE(${work_date}) = ${poc_daily_metrics.work_date}
            `,
        },

    },
    measures: {
        count: {
            type: 'count',
        },

        count2: {
            type: 'count',
        },

        something: {
            sql: `${count} * ${poc_daily_metrics.finished_orders_count}`,
            type: 'number'
        },

        measure_working_fine: {
            sql: `${something}/${count}`,
            type: 'number',
        },

        measure_causing_error: {
            sql: `${something}/${count2}`,
            type: 'number',
        },
    },
    dimensions: {
        uid: {
            sql: `${CUBE}.uid`,
            type: `string`,
            primaryKey: true,
        },
        branch_uid: {
            sql: `${CUBE}.branchUid`,
            type: `string`,
        },
        work_date: {
            sql: `${CUBE}.addTimestamp`,
            type: `time`,
        },
    },
})

cube(`poc_daily_metrics`,
    {
        public: false,
        sql: `
                SELECT 
                    DATE(startTimestamp) as workDate,
                    branchUid,
                    COUNT(*) as finishedCount
                FROM cubes.activities
                WHERE ( 
                    ${FILTER_PARAMS.poc_orders.work_date.filter("startTimestamp")}
                )
                GROUP BY 1, 2            
        `,
        measures: {
            finished_orders_count: {
                sql: `${CUBE}.finishedCount`,
                type: 'sum',
            },
        },
        dimensions: {
            id: {
                sql: `${work_date} || ${branch_uid}`,
                type: `string`,
                primaryKey: true,
            },
            branch_uid: {
                sql: `${CUBE}.branchUid`,
                type: `string`,
            },
            work_date: {
                sql: `${CUBE}.workDate`,
                type: `time`,
            },
        }
    }
)

Part of the console log:

2024-05-13T12:19:30+02:00 Error while querying: a60d1d55-c7f8-430a-8c17-d167fcb83d3b-span-1 (267ms)
2024-05-13T12:19:30+02:00 {
2024-05-13T12:19:30+02:00   "queueId": 9,
2024-05-13T12:19:30+02:00   "processingId": 9,
2024-05-13T12:19:30+02:00   "queueSize": 0,
2024-05-13T12:19:30+02:00   "queryKey": [
2024-05-13T12:19:30+02:00     "SELECT q_0.`poc_orders__work_date_day`, `poc_orders__count` * `poc_daily_metrics__finished_orders_count`/count(`poc_orders`.uid) `poc_orders__measure_causing_error` FROM (SELECT DATETIME_TRUNC(DATETIME(`main__poc_orders`.addTimestamp, 'UTC'), DAY) `poc_orders__work_date_day`, count(`main__poc_orders`.uid) `poc_orders__count` FROM (\n    SELECT co.*\n    FROM cubes.orders co\n    WHERE ((1 = 1))\n    ) AS `main__poc_orders`\nLEFT JOIN (\n                SELECT \n                    DATE(startTimestamp) as workDate,\n                    branchUid,\n                    COUNT(*) as finishedCount\n                FROM cubes.activities\n                WHERE ( \n                    ((1 = 1))\n                )\n                GROUP BY 1, 2            \n        ) AS `main__poc_daily_metrics` ON \n            `main__poc_orders`.branchUid = `main__poc_daily_metrics`.branchUid \n            AND DATE(`main__poc_orders`.addTimestamp) = `main__poc_daily_metrics`.workDate\n              GROUP BY 1) as q_0 INNER JOIN (SELECT `keys`.`poc_orders__work_date_day`, sum(`poc_daily_metrics_key__poc_daily_metrics`.finishedCount) `poc_daily_metrics__finished_orders_count` FROM (SELECT DISTINCT DATETIME_TRUNC(DATETIME(`poc_daily_metrics_key__poc_orders`.addTimestamp, 'UTC'), DAY) `poc_orders__work_date_day`, `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid `poc_daily_metrics__id` FROM (\n    SELECT co.*\n    FROM cubes.orders co\n    WHERE ((1 = 1))\n    ) AS `poc_daily_metrics_key__poc_orders`\nLEFT JOIN (\n                SELECT \n                    DATE(startTimestamp) as workDate,\n                    branchUid,\n                    COUNT(*) as finishedCount\n                FROM cubes.activities\n                WHERE ( \n                    ((1 = 1))\n                )\n                GROUP BY 1, 2            \n        ) AS `poc_daily_metrics_key__poc_daily_metrics` ON \n            `poc_daily_metrics_key__poc_orders`.branchUid = `poc_daily_metrics_key__poc_daily_metrics`.branchUid \n            AND DATE(`poc_daily_metrics_key__poc_orders`.addTimestamp) = `poc_daily_metrics_key__poc_daily_metrics`.workDate\n             ) AS `keys`\nLEFT JOIN (\n                SELECT \n                    DATE(startTimestamp) as workDate,\n                    branchUid,\n                    COUNT(*) as finishedCount\n                FROM cubes.activities\n                WHERE ( \n                    ((1 = 1))\n                )\n                GROUP BY 1, 2            \n        ) AS `poc_daily_metrics_key__poc_daily_metrics` ON `keys`.`poc_daily_metrics__id` = `poc_daily_metrics_key__poc_daily_metrics`.workDate || `poc_daily_metrics_key__poc_daily_metrics`.branchUid\n              GROUP BY 1) as q_1 ON (q_0.`poc_orders__work_date_day` = q_1.`poc_orders__work_date_day` OR (q_0.`poc_orders__work_date_day` IS NULL AND q_1.`poc_orders__work_date_day` IS NULL)) ORDER BY 1 ASC LIMIT 10000",
2024-05-13T12:19:30+02:00     [],
2024-05-13T12:19:30+02:00     []
2024-05-13T12:19:30+02:00   ],
2024-05-13T12:19:30+02:00   "queuePrefix": "SQL_QUERY_STANDALONE_default",
2024-05-13T12:19:30+02:00   "timeInQueue": 7,
2024-05-13T12:19:30+02:00   "addedToQueueTime": 1715595570426
2024-05-13T12:19:30+02:00 } 
2024-05-13T12:19:30+02:00 Error: Unrecognized name: poc_orders at [1:112]
2024-05-13T12:19:30+02:00     at new ApiError (/cube/node_modules/@google-cloud/common/build/src/util.js:73:15)
2024-05-13T12:19:30+02:00     at /cube/node_modules/@google-cloud/bigquery/build/src/bigquery.js:1071:23
2024-05-13T12:19:30+02:00     at /cube/node_modules/@google-cloud/common/build/src/util.js:380:25
2024-05-13T12:19:30+02:00     at Util.handleResp (/cube/node_modules/@google-cloud/common/build/src/util.js:159:9)
2024-05-13T12:19:30+02:00     at /cube/node_modules/@google-cloud/common/build/src/util.js:479:22
2024-05-13T12:19:30+02:00     at onResponse (/cube/node_modules/retry-request/index.js:228:7)
2024-05-13T12:19:30+02:00     at /cube/node_modules/teeny-request/src/index.ts:333:11
2024-05-13T12:19:30+02:00     at processTicksAndRejections (node:internal/process/task_queues:95:5)
...
2024-05-13T12:19:39+02:00 --
2024-05-13T12:19:39+02:00 {
2024-05-13T12:19:39+02:00   "measures": [
2024-05-13T12:19:39+02:00     "poc_orders.measure_causing_error"
2024-05-13T12:19:39+02:00   ],
2024-05-13T12:19:39+02:00   "timeDimensions": [
2024-05-13T12:19:39+02:00     {
2024-05-13T12:19:39+02:00       "dimension": "poc_orders.work_date",
2024-05-13T12:19:39+02:00       "granularity": "day"
2024-05-13T12:19:39+02:00     }
2024-05-13T12:19:39+02:00   ],
2024-05-13T12:19:39+02:00   "order": {
2024-05-13T12:19:39+02:00     "poc_orders.work_date": "asc"
2024-05-13T12:19:39+02:00   }
2024-05-13T12:19:39+02:00 }
2024-05-13T12:19:39+02:00 --
2024-05-13T12:19:39+02:00 {
2024-05-13T12:19:39+02:00   "securityContext": {
2024-05-13T12:19:39+02:00     "iat": 1715591181,
2024-05-13T12:19:39+02:00     "exp": 1715677581
2024-05-13T12:19:39+02:00   }
2024-05-13T12:19:39+02:00 } 
2024-05-13T12:19:39+02:00 Error: Unrecognized name: poc_orders at [1:112]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working question The issue is a question. Please use Stack Overflow for questions.
Projects
None yet
Development

No branches or pull requests

2 participants