PgBulkInsert 9.0.0: missing Stream-based saveAll overload breaks streaming ingestion use-cases #144

@TheNullablePrototype

Description

After upgrading to pgbulkinsert:9.0.0, saveAll can no longer be called with a Stream.

This makes it impossible to use the library in true streaming scenarios (e.g. reading large files row-by-row from Excel/CSV without collecting everything into memory).

In previous versions, it was possible to pipe a lazy stream directly into saveAll. In 9.0.0 only Iterable is accepted, which forces full materialization of the dataset.

This effectively breaks streaming ingestion pipelines and increases memory usage for large datasets.

Problem/Example

Dependencies:

    implementation("org.dhatim:fastexcel-reader:0.20.0")
    implementation("de.bytefish:pgbulkinsert:9.0.0")

Code:

    fun example() {
        // Field-to-column mappings are omitted for brevity.
        val mapper = PgMapper.forClass(Entity::class.java)
        val writer = PgBulkInsert.PgBulkWriter(mapper)

        ReadableWorkbook(Files.newInputStream(Path.of("entities.xlsx"))).use { wb ->
            wb.firstSheet.openStream().use { rows -> // <- Stream<Row>
                dataSource.connection.use { conn ->
                    // Argument type mismatch: actual type is 'Stream<T?>!', but '(Mutable)Iterable<Entity!>!' was expected.
                    writer.saveAll(conn, "schema.table", rows.map { row -> mapRowToEntity(row) })
                }
            }
        }
    }

Impact

  • Breaks streaming ingestion workflows
  • Forces collect() / materialization into memory
  • Makes it unsafe for large file imports (Excel/CSV, ETL pipelines)
  • Regression compared to previous API flexibility

Expected behavior

saveAll should support streaming input, either by accepting a Stream directly or at least through an overload such as:

    saveAll(Connection conn, String table, Stream<T> stream)
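To illustrate the shape of the requested overload, here is a minimal sketch that runs without a database. IterableSink is a hypothetical stand-in for a writer that only accepts Iterable; it is not part of pgbulkinsert, and the Connection and table parameters are left out on purpose. The sketch assumes the writer iterates its input exactly once.

```kotlin
import java.util.stream.Stream

// Hypothetical stand-in for a writer whose saveAll only accepts Iterable.
// The real saveAll also takes a Connection and a table name; both are
// omitted here so the sketch runs without a database.
fun interface IterableSink<T> {
    fun saveAll(entities: Iterable<T>)
}

// Sketch of a Stream overload: adapt the stream lazily and delegate to the
// Iterable variant. Nothing is collected into a list along the way.
// Closing the stream is left to the caller, as in the example above.
fun <T> IterableSink<T>.saveAll(stream: Stream<T>) {
    saveAll(Iterable { stream.iterator() })
}

fun main() {
    val received = mutableListOf<String>()
    val sink = IterableSink<String> { entities ->
        for (e in entities) received += e
    }

    // Elements are mapped one by one while the sink iterates.
    sink.saveAll(Stream.of("a", "b", "c").map { it.uppercase() })
    println(received) // prints [A, B, C]
}
```

An overload like this would only add a thin adapter on top of the existing Iterable path, so it would not need a second write implementation.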

Workaround
Currently the only option that avoids materialization is to wrap the stream's iterator in an Iterable:

writer.saveAll(conn, "schema.table", Iterable { stream.iterator() })
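The workaround can be wrapped in a small helper. The sketch below uses only the JDK and the Kotlin standard library; asSingleUseIterable is a hypothetical name, not something pgbulkinsert provides. It shows two properties of the adapter: elements are pulled lazily, and the resulting Iterable can be iterated only once because a Stream hands out its iterator a single time.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import java.util.stream.Stream

// Hypothetical helper: exposes a Stream as an Iterable without collecting it.
// The result is single-use, since Stream.iterator() is a terminal operation.
fun <T> Stream<T>.asSingleUseIterable(): Iterable<T> = Iterable { this.iterator() }

fun main() {
    val produced = AtomicInteger()

    // Infinite source: collecting it would never finish, so the fact that
    // this program terminates shows the adapter is lazy.
    val rows = Stream.iterate(1) { it + 1 }.peek { produced.incrementAndGet() }
    val iterable = rows.asSingleUseIterable()

    println(iterable.take(3))     // prints [1, 2, 3]
    println(produced.get() < 10)  // prints true: only a few elements were generated

    // Asking for a second iterator fails, because the stream is already consumed.
    val secondPass = runCatching { iterable.iterator() }
    println(secondPass.exceptionOrNull() is IllegalStateException) // prints true
}
```

With such a helper the call would read writer.saveAll(conn, "schema.table", stream.asSingleUseIterable()). This is only safe if saveAll iterates its argument exactly once, which is an assumption about the library and not something confirmed here.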
