[Feature Request]: Data Stream metadata and plugin type #6963

@mattcasters

Description

What would you like to happen?

Proposed Overall Structure

  • New Metadata Plugin Type: DataStreamPluginType
    A new top-level plugin type, similar to other metadata plugin types in Hop. It could live in core/engine, as its own module under plugins/metadata, or in a dedicated plugins/datastreams category.

  • Metadata Element: Data Stream (user-facing name)

    • Annotated with @HopMetadata(key = "data-stream", name = "Data Stream", ...)
    • Stored as JSON in the project’s metadata/data-stream/ folder.
    • Fully manageable and editable in the Metadata perspective.
  • Two New Transforms (to be placed under plugins/transforms):

    • Data Stream Output – writes rows to a selected Data Stream (producer side)
    • Data Stream Input – reads rows from a selected Data Stream (consumer side)
  • Pluggable Implementations (via the new DataStreamPluginType):

    1. Arrow Socket – for same-machine or simple multi-process streaming using Arrow IPC over TCP or Unix domain sockets.
    2. Arrow Flight – ideal for distributed scenarios (Spark / Flink) and network use cases; supports parallel writes from multiple executors via DoPut to a central Flight server.
    3. Arrow File – simplest decoupled option; writes Arrow IPC streams or files (including partitioned writes to local disk, HDFS, or S3). Downstream processes can read and optionally merge the data.

Additional implementations (ZeroMQ + Arrow, Kafka with Arrow serialization, shared memory, etc.) can be added later without changing the core metadata or the two transforms.
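To make the pluggability concrete, here is a minimal, self-contained sketch of the kind of contract a DataStreamPluginType implementation (Arrow Socket, Arrow Flight, Arrow File, and later additions) might fulfil, with a trivial in-memory stand-in for a real backend. All names here (IDataStream, writeBatch, readBatch) are hypothetical, not existing Hop API.

```java
import java.util.ArrayDeque;
import java.util.List;
import java.util.Queue;

// Hypothetical contract each pluggable implementation would fulfil.
// None of these names exist in Hop today; they only illustrate the idea.
interface IDataStream {
  void writeBatch(List<Object[]> rows) throws Exception; // producer side
  List<Object[]> readBatch() throws Exception;           // consumer side; null when drained
  void close() throws Exception;
}

// Trivial in-memory implementation, standing in for a real backend
// such as an Arrow IPC socket or a Flight endpoint.
class InMemoryDataStream implements IDataStream {
  private final Queue<List<Object[]>> batches = new ArrayDeque<>();

  @Override public void writeBatch(List<Object[]> rows) { batches.add(rows); }
  @Override public List<Object[]> readBatch() { return batches.poll(); }
  @Override public void close() { batches.clear(); }
}

public class DataStreamSketch {
  public static void main(String[] args) throws Exception {
    IDataStream stream = new InMemoryDataStream();
    // Producer side (Data Stream Output transform) writes a batch of rows.
    stream.writeBatch(List.of(new Object[] {"row1", 1}, new Object[] {"row2", 2}));
    // Consumer side (Data Stream Input transform) reads it back.
    List<Object[]> batch = stream.readBatch();
    System.out.println(batch.size());        // prints 2
    System.out.println(stream.readBatch());  // prints null (stream drained)
    stream.close();
  }
}
```

Because the two transforms only talk to this interface, new backends can be registered with the plugin type without touching the transforms or the metadata element.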

How It Would Work in Practice

  1. Define a Data Stream in the Metadata perspective:

    • Name: e.g. PythonArrowStream, SparkCollector, ExternalPythonConsumer
    • Implementation: choose Arrow Socket, Arrow Flight, or Arrow File
    • Implementation-specific settings (port range, Flight endpoint, base path, etc.)
    • Common settings: batch size, direction, optional schema reference, buffer configuration
  2. Use in Pipelines:

    • Drop a Data Stream Output transform and select the desired Data Stream by name (dropdown, just like a database connection).
    • Drop a Data Stream Input transform and select the same (or another) Data Stream by name.
    • Connections are initialized lazily when the transform first needs them.
  3. Distributed Support (Hop on Spark / Hop on Flink):

    • Metadata is automatically serialized and distributed to all executors (standard Hop/Beam behavior).
    • Arrow Flight handles parallel writes naturally.
    • Arrow Socket can use port ranges or a thin merge layer.
    • Arrow File lets each executor write partitioned files that can be merged downstream.
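Since the metadata element would be stored as JSON under metadata/data-stream/, a saved definition from step 1 might look roughly like the fragment below. The file name reuses the PythonArrowStream example above; all field names are assumptions for illustration, not a defined schema.

```json
{
  "name": "PythonArrowStream",
  "implementation": "arrow-socket",
  "batchSize": 10000,
  "direction": "output",
  "settings": {
    "hostname": "localhost",
    "portRange": "50051-50060"
  }
}
```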

This design keeps the user experience simple and metadata-driven while providing powerful, high-performance back-ends under the hood. It builds directly on existing Hop patterns such as Pipeline Probe metadata and the metadata injection / selection mechanisms.

Issue Priority

Priority: 3

Issue Component

Component: Transforms
