DataFrame.__str__ truncates metadata keys that contain a : character #1312

@abey79

Describe the bug

Schema metadata keys that contain a : character are truncated in the output of str(df). For example, the key sorbet:version is displayed as version.

To Reproduce

This minimal test reproduces the issue:

def test_metadata_display() -> None:
    import datafusion
    import pyarrow as pa

    schema = pa.schema([("value", pa.int64())], metadata={"sorbet:version": "1.0"})
    table = pa.table({"value": [1, 2, 3]}, schema=schema)

    ctx = datafusion.SessionContext()
    ctx.register_record_batches("test_table", [table.to_batches()])

    df = ctx.table("test_table")

    # This passes, but it should not since the metadata key is "sorbet:version", not "version"
    assert (
        str(df)
        == """\
┌────────────────────────┐
│ METADATA:              │
│ * version: 1.0         │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ ┌────────────────────┐ │
│ │ value              │ │
│ │ ---                │ │
│ │ type: nullable i64 │ │
│ ╞════════════════════╡ │
│ │ 1                  │ │
│ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │
│ │ 2                  │ │
│ ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤ │
│ │ 3                  │ │
│ └────────────────────┘ │
└────────────────────────┘\
"""
    )

Expected behavior
The metadata key should be displayed in full.
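
A regression test for the expected behavior could check that each metadata key appears in full in the rendered string, instead of comparing against the whole box drawing. The following is only a sketch: missing_metadata_keys is a hypothetical helper, not part of datafusion, and it assumes the display keeps the "* key: value" layout shown in the reproduction. The "expected" line is made up for illustration and is not actual library output.

```python
def missing_metadata_keys(rendered: str, metadata: dict[str, str]) -> list[str]:
    """Return metadata keys whose full "* key: value" entry is absent from `rendered`.

    Hypothetical helper; assumes the "* key: value" layout from the issue.
    """
    return [
        key
        for key, value in metadata.items()
        if f"* {key}: {value}" not in rendered
    ]


metadata = {"sorbet:version": "1.0"}

# Metadata line as currently displayed (copied from the reproduction above).
current = "│ * version: 1.0         │"
# Illustrative line with the key shown in full (assumed layout, not real output).
expected = "│ * sorbet:version: 1.0  │"

print(missing_metadata_keys(current, metadata))   # ['sorbet:version']
print(missing_metadata_keys(expected, metadata))  # []
```

In a real test, rendered would be str(df) and metadata would be the dictionary passed to pa.schema, so the check does not depend on the exact width of the surrounding box.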

Additional context

datafusion==50.1.0

Labels: bug (Something isn't working)
