Skip to content

bug: Table aliasing broken with BQ CTEs #11440

Open
@DiegoAlbertoTorres

Description

@DiegoAlbertoTorres

What happened?

The combination of aliasing with CTEs in the .sql method in the BigQuery backend is broken. Aliases seem to be out of scope, and therefore an attempt is made to use the alias as a fully qualified table name. This looks similar to a bug related to using CTEs from .sql, fixed in other SQL backends in #8933. This bug seems to be specific to BQ, as I could not reproduce this with DuckDB. I created a PR with a test to reproduce the bug: #11439. Note that the test (included below) works for DuckDB, but not Bigquery:

def test_embedded_cte_with_alias(con):
    expr = con.sql('SELECT * FROM (SELECT \'abc\' "ts") "x"', dialect="duckdb").limit(1)
    alias = "alias"
    expr = expr.alias(alias).sql(
        f'WITH "x" AS (SELECT * FROM "{alias}") SELECT * FROM "x"', dialect="duckdb"
    )
    result = expr.head(1).execute()
    assert len(result) == 1

What version of ibis are you using?

master

What backend(s) are you using, if any?

BigQuery

Relevant log output

Full error message produced by the test is below. However, the punchline is:

Table "alias" must be qualified with a dataset (e.g. dataset.table).

Full log:

ibis/backends/tests/test_dot_sql.py F                                     [100%]

=================================== FAILURES ====================================
____________________ test_embedded_cte_with_alias[bigquery] _____________________

con = <ibis.backends.bigquery.Backend object at 0x7fff8cbdb9d0>

    def test_embedded_cte_with_alias(con):
        expr = con.sql('SELECT * FROM (SELECT \'abc\' "ts") "x"', dialect="duckdb").limit(1)
        alias = 'alias'
>       expr = expr.alias(alias).sql(
            f'WITH "x" AS (SELECT * FROM "{alias}") SELECT * FROM "x"',
            dialect="duckdb"
        )

ibis/backends/tests/test_dot_sql.py:328: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
ibis/expr/types/relations.py:3611: in sql
    schema = backend._get_sql_string_view_schema(name=name, table=expr, query=query)
ibis/backends/sql/__init__.py:186: in _get_sql_string_view_schema
    return self._get_schema_using_query(sql)
ibis/backends/bigquery/__init__.py:772: in _get_schema_using_query
    job = self.client.query(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/client.py:3519: in query
    return _job_helpers.query_jobs_insert(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/_job_helpers.py:182: in query_jobs_insert
    future = do_query()
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:294: in retry_wrapped_func
    return retry_target(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:156: in retry_target
    next_sleep = _retry_error_helper(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_base.py:214: in _retry_error_helper
    raise final_exc from source_exc
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:147: in retry_target
    result = target()
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/_job_helpers.py:138: in do_query
    query_job._begin(retry=retry, timeout=timeout)
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/job/query.py:1398: in _begin
    super(QueryJob, self)._begin(client=client, retry=retry, timeout=timeout)
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/job/base.py:780: in _begin
    api_response = client._call_api(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/bigquery/client.py:858: in _call_api
    return call()
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:294: in retry_wrapped_func
    return retry_target(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:156: in retry_target
    next_sleep = _retry_error_helper(
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_base.py:214: in _retry_error_helper
    raise final_exc from source_exc
/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/api_core/retry/retry_unary.py:147: in retry_target
    result = target()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <google.cloud.bigquery._http.Connection object at 0x7fff8174bb60>
method = 'POST', path = '/projects/strange-radius-465019-q9/jobs'
query_params = None
data = '{"jobReference": {"jobId": "529b4dbe-30b5-48ad-a8d0-09d139a5bd11", "projectId": "strange-radius-465019-q9"}, "configu...ROM `alias`), `alias` AS (SELECT * FROM (SELECT \'abc\' AS `ts`) AS `x` LIMIT 1) SELECT * FROM `x`"}, "dryRun": true}}'
content_type = 'application/json', headers = None, api_base_url = None
api_version = None, expect_json = True, _target_object = None, timeout = None
extra_api_info = None

    def api_request(
        self,
        method,
        path,
        query_params=None,
        data=None,
        content_type=None,
        headers=None,
        api_base_url=None,
        api_version=None,
        expect_json=True,
        _target_object=None,
        timeout=_DEFAULT_TIMEOUT,
        extra_api_info=None,
    ):
<TRUNCATED>
        response = self._make_request(
            method=method,
            url=url,
            data=data,
            content_type=content_type,
            headers=headers,
            target_object=_target_object,
            timeout=timeout,
            extra_api_info=extra_api_info,
        )
    
        if not 200 <= response.status_code < 300:
>           raise exceptions.from_http_response(response)
E           google.api_core.exceptions.BadRequest: 400 POST https://bigquery.googleapis.com/bigquery/v2/projects/strange-radius-465019-q9/jobs?prettyPrint=false: Table "alias" must be qualified with a dataset (e.g. dataset.table).
E           
E           Location: None
E           Job ID: 529b4dbe-30b5-48ad-a8d0-09d139a5bd11

/nix/store/06vf4i6gj5vm18mj85jz03c4s2dm1184-ibis-3.13/lib/python3.13/site-packages/google/cloud/_http/__init__.py:494: BadRequest
------------------------------ Captured log setup -------------------------------
WARNING  google.auth._default:_default.py:677 No project ID could be determined. Consider running `gcloud config set project` or setting the GOOGLE_CLOUD_PROJECT environment variable
============================ short test summary info ============================
FAILED ibis/backends/tests/test_dot_sql.py::test_embedded_cte_with_alias[bigquery] - google.api_core.exceptions.BadRequest: 400 POST https://bigquery.googleapis....
====================== 1 failed, 45250 deselected in 9.33s ======================

Code of Conduct

  • I agree to follow this project's Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugIncorrect behavior inside of ibis

    Type

    No type

    Projects

    Status

    backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions