refactor Arrow.write to support incremental writes (#277)
* refactor Arrow.write to support incremental writes

* bump julia compat due to dependency on interpolation in Base.Threads.@spawn

* PR feedback

* add Arrow.Writer-specific tests and in-code/manual documentation

Co-authored-by: Ben Baumgold <ben.baumgold@mavensecurities.com>
baumgold and Ben Baumgold committed Apr 9, 2022
1 parent 9b4de2b commit a3f6da7c1d59f8321315f4955a3b45c48d38aab4
Showing 6 changed files with 229 additions and 104 deletions.
@@ -73,7 +73,7 @@ jobs:
dir: './src/ArrowTypes'
version:
- '1.0'
-  - '1.3'
+  - '1.4'
- '1' # automatically expands to the latest stable 1.x release of Julia
- 'nightly'
os:
@@ -87,7 +87,7 @@ jobs:
version: '1.0'
- pkg:
name: ArrowTypes.jl
-  version: '1.3'
+  version: '1.4'
steps:
- uses: actions/checkout@v2
- uses: julia-actions/setup-julia@v1
@@ -45,7 +45,7 @@ PooledArrays = "0.5, 1.0"
SentinelArrays = "1"
Tables = "1.1"
TimeZones = "1"
-julia = "1.3"
+julia = "1.4"

[extras]
CategoricalArrays = "324d7699-5711-5eae-9e2f-1d82baa6b597"
@@ -229,6 +229,10 @@ csv_parts = Tables.partitioner(CSV.File, csv_files)
Arrow.write(io, csv_parts)
```

### `Arrow.Writer`

With `Arrow.Writer`, you instantiate an `Arrow.Writer` object, write sources using it, and then close it. This allows for incremental writes to the same sink. It is similar to `Arrow.append`, but it does not need to close and re-open the sink between writes, and it is not limited to the IPC stream format.
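
The instantiate, write, close sequence described above might look like the following minimal sketch. It assumes the `open(Arrow.Writer, path)` and `Arrow.write(writer, source)` forms; the temporary path and the two small named-tuple tables are made up for illustration and are not part of this commit.

```julia
using Arrow

# Hypothetical sink: a temporary file path used only for this sketch.
path = tempname()

# Instantiate the writer on the sink.
writer = open(Arrow.Writer, path)

# Write two sources incrementally to the same sink.
# The named tuples of vectors are illustrative tables with a shared schema.
Arrow.write(writer, (col1 = [1, 2], col2 = ["A", "B"]))
Arrow.write(writer, (col1 = [3, 4], col2 = ["C", "D"]))

# Close the writer to finish the output.
close(writer)

# Read the result back to check that both writes landed in one table.
tbl = Arrow.Table(path)
```

Reading the file back with `Arrow.Table` is only a sanity check here; in practice the sources would more likely be produced over time, which is the case incremental writing is meant for.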

### Multithreaded writing

By default, `Arrow.write` will use multiple threads to write multiple
@@ -70,6 +70,9 @@ end
_normalizemeta(::Nothing) = nothing
_normalizemeta(meta) = toidict(String(k) => String(v) for (k, v) in meta)

_normalizecolmeta(::Nothing) = nothing
_normalizecolmeta(colmeta) = toidict(Symbol(k) => toidict(String(v1) => String(v2) for (v1, v2) in v) for (k, v) in colmeta)

function _arrowtypemeta(::Nothing, n, m)
return toidict(("ARROW:extension:name" => n, "ARROW:extension:metadata" => m))
end
