[Bug]: Row is not adaptable from Scala/JVM -> Python via PythonExternalTransform #35493

Open
@mydimension

What happened?

Beam Version: 2.62
Scala Version: 2.12
Java Version: 11
Python Version: 3.12

In a Scala pipeline, I produce a PCollection[Row] and pass it to PythonExternalTransform via something like:

pcollTuple
  .apply(SqlTransform.query("..."))
  .apply(PythonExternalTransform.from("my_package.MyTransform", "localhost:65001"): PythonExternalTransform[PCollection[Row], PCollection[Row]])

and the corresponding Python code:

import apache_beam as beam
from apache_beam.pvalue import Row

class MyTransform(beam.PTransform):
  def expand(self, pcoll: beam.PCollection[Row]) -> beam.PCollection[Row]:
    return pcoll | ...

I end up with the following error:

apache_beam.typehints.decorators.TypeCheckError: Input type hint violation at MyTransform: expected <class 'apache_beam.pvalue.Row'>, got <class 'apache_beam.typehints.schemas.BeamSchema_6a7b1e90_8aac_43f6_8caa_aadca7856423'>
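
One way to read this error, sketched with the standard library only: the hint names one class, while the elements arriving across the language boundary are described by a separately generated class, and a plain subclass check between two unrelated classes fails. This is a simplified model, not Beam's actual type-checking code. `Row` is a local stand-in for `apache_beam.pvalue.Row`, and the generated class name, its fields (`id`, `name`), and the `satisfies_hint` helper are all hypothetical.

```python
from typing import NamedTuple


class Row:
    """Local stand-in for apache_beam.pvalue.Row (assumption, not the real class)."""


# Stand-in for a schema class generated at runtime; the name and the
# fields are hypothetical.
GeneratedSchema = NamedTuple("BeamSchema_example", [("id", int), ("name", str)])


def satisfies_hint(actual: type, expected: type) -> bool:
    """Naive check: the actual class must be the expected class or a subclass."""
    return issubclass(actual, expected)


if __name__ == "__main__":
    # The generated class is tuple-based and unrelated to the Row stand-in.
    print(satisfies_hint(GeneratedSchema, Row))
    print(satisfies_hint(GeneratedSchema, tuple))
```

In this model the generated class never satisfies a hint that names `Row`, no matter which fields it carries, which matches the shape of the message above.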

Workarounds I've tried:

  1. Creating a typing.NamedTuple class to match the expected inbound Schema, and changing the pcoll input type hint to match
  2. Using beam.row_type.RowTypeConstraint.from_fields from #25749

Both methods gave similar errors about not being able to validate the type hint.
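
For reference, the first workaround has roughly the following shape. The class name `InboundRecord` and its fields are hypothetical, since the schema produced by the SQL query is not shown here. The Beam usage appears only in a comment so that the snippet runs without Beam installed.

```python
from typing import NamedTuple


class InboundRecord(NamedTuple):
    """Hypothetical mirror of the inbound schema; field names and types are assumed."""
    id: int
    name: str


# In the transform, the input hint would then name this class instead of Row,
# for example (not executed here):
#   def expand(self, pcoll: beam.PCollection[InboundRecord]) -> ...

if __name__ == "__main__":
    record = InboundRecord(id=1, name="example")
    print(InboundRecord._fields)
    print(record._asdict())
```

As noted above, this approach still produced a type-hint validation error, so the sketch illustrates the attempt rather than a fix.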

Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

