Support reading Avro files in zstd codec #349

@siumingdev

Description

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
I would like to read Avro files compressed with the zstd codec, like this:

from datafusion import SessionContext

ctx = SessionContext()
ctx.read_avro("/path/to/my/avro/in/zstd/codec")

But currently it gives the following error:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
Cell In[4], line 4
      1 from datafusion import SessionContext
      3 ctx = SessionContext()
----> 4 ctx.read_avro("/path/to/my/avro/in/zstd/codec")

Exception: DataFusion error: AvroError(CodecNotSupported("zstandard"))

I am running the code in the official Python 3.9 Docker image (python:3.9-slim), with DataFusion installed via pip install datafusion.

Describe the solution you'd like
No idea. Is the zstd codec also unsupported in the original Rust implementation?

Describe alternatives you've considered
Read the file using another Avro library and convert the result into a DataFusion DataFrame.

Additional context

Labels: enhancement
