
Session.list_tables(None) fails to return all tables in attached catalog #4400

@everettVT

Describe the bug

Session.list_tables() returns an empty list, while catalog.list_tables() returns the expected table identifier.

Given a catalog that is attached to a session, session.list_tables() lists only attached and temporary tables, not the tables inside attached catalogs. See https://github.com/Eventual-Inc/Daft/blob/main/src/daft-session/src/session.rs#L237-L240

Expected behavior: session.list_tables() should list all tables, including those in every attached catalog.
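
To make the gap concrete, here is a toy model of the two behaviors. It is not Daft's implementation: `ToyCatalog`, `ToySession`, and both listing methods are hypothetical names invented for illustration, and the way table names are qualified with the catalog name is an assumption. The real logic lives in the Rust file linked above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for a catalog: a name plus its table names."""
    name: str
    tables: list[str] = field(default_factory=list)


@dataclass
class ToySession:
    """Hypothetical stand-in for a session with attached catalogs and tables."""
    catalogs: dict[str, ToyCatalog] = field(default_factory=dict)
    attached_tables: list[str] = field(default_factory=list)

    def list_tables_reported(self) -> list[str]:
        # Behavior described in this issue: only attached/temp tables are listed.
        return list(self.attached_tables)

    def list_tables_expected(self) -> list[str]:
        # Requested behavior: also include tables from every attached catalog.
        found = list(self.attached_tables)
        for catalog in self.catalogs.values():
            found.extend(f"{catalog.name}.{t}" for t in catalog.tables)
        return found


session = ToySession()
session.catalogs["default"] = ToyCatalog("default", ["archetypes.my_archetype_table"])

print(session.list_tables_reported())  # []
print(session.list_tables_expected())  # ['default.archetypes.my_archetype_table']
```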

To Reproduce

import tempfile

import pyarrow as pa
from daft import Schema
from daft.catalog import Catalog
from daft.session import Session
from pyiceberg.catalog.sql import SqlCatalog


namespace = "archetypes"

uri = tempfile.mkdtemp()


my_archetype_schema = pa.schema([
    pa.field("simulation", pa.string(), nullable=False),
    pa.field("run", pa.string(), nullable=False),
    pa.field("entity_id", pa.uint64(), nullable=False),
    pa.field("step", pa.uint64(), nullable=False),
    pa.field("is_active", pa.bool_(), nullable=False),
])


catalog = Catalog.from_iceberg(
    SqlCatalog(
        "default",
        **{
            "uri": f"sqlite:///{uri}/catalog.db",
            "warehouse": f"file://{uri}",
        },
    )
)

# Initialize the session and attach the Iceberg-backed catalog
session = Session()
session.attach(object=catalog)
session.create_namespace_if_not_exists(namespace)
session.set_namespace(namespace)

table = session.create_table_if_not_exists(
    "my_archetype_table",
    Schema.from_pyarrow_schema(my_archetype_schema)
)

# Bug: session.list_tables() returns an empty list here, so the assertion below fails
session_tables = session.list_tables(namespace)
catalog_tables = catalog.list_tables(namespace)

assert session_tables == catalog_tables, "Session and catalog should have the same tables"

session_table = session.get_table(session_tables[0])
assert session_table == table, "Session and catalog should have the same table"

catalog_table = catalog.get_table(catalog_tables[0])
assert catalog_table == table, "Catalog and session should have the same table"

Expected behavior

I expect session.list_tables() to return all tables in the attached catalog.
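
Until that is the case, a possible workaround is to gather the tables by hand. The helper below is only a sketch: `list_tables_across_catalogs` is a hypothetical name, and it assumes the session object exposes `list_tables(pattern)`, `list_catalogs()`, and `get_catalog(name)`, and that each catalog exposes `list_tables(pattern)`. The stub classes exist only so the example runs without Daft installed.

```python
from typing import Any, Optional


def list_tables_across_catalogs(session: Any, pattern: Optional[str] = None) -> list:
    """Collect session-level tables plus tables from each attached catalog.

    ASSUMPTION: `session` exposes list_tables(pattern), list_catalogs() and
    get_catalog(name), and each catalog exposes list_tables(pattern).
    """
    candidates = list(session.list_tables(pattern))
    for name in session.list_catalogs():
        candidates.extend(session.get_catalog(name).list_tables(pattern))

    # De-duplicate while preserving order; identifiers may not be hashable.
    unique = []
    for table in candidates:
        if table not in unique:
            unique.append(table)
    return unique


# Stand-in objects (hypothetical) so the sketch runs without Daft installed.
class StubCatalog:
    def __init__(self, tables):
        self._tables = tables

    def list_tables(self, pattern=None):
        return [t for t in self._tables if pattern is None or t.startswith(pattern)]


class StubSession:
    def __init__(self, catalogs):
        self._catalogs = catalogs

    def list_tables(self, pattern=None):
        return []  # mirrors the empty result reported in this issue

    def list_catalogs(self):
        return list(self._catalogs)

    def get_catalog(self, name):
        return self._catalogs[name]


stub = StubSession({"default": StubCatalog(["archetypes.my_archetype_table"])})
print(list_tables_across_catalogs(stub, "archetypes"))
# ['archetypes.my_archetype_table']
```

With a real session, the call would be `list_tables_across_catalogs(session, namespace)`, provided the assumed methods behave as described.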

Component(s)

Native Runner, Other

Additional context

slack thread
#4035 (comment)
#4048

Labels: bug, p2 (backlog)
