
v1.0.0 - Support for arbitrary partitioning schemas

@sleepdeprecation sleepdeprecation released this 29 Oct 18:32
· 4 commits to master since this release
565ea32

The initial releases of glutil only supported tables with a [year, month, day, hour] schema. As of this release, arbitrary partition schemas are supported, so long as they conform to one of the following layouts:

  • Partition keys are path segments

    partition keys: [year, month, day]
    s3://bucket/table/prefix/2019/08/12/ => [2019, 08, 12]
    
  • Partition keys are path segments, with hive-format paths

    partition keys: [dt]
    s3://bucket/table/prefix/dt=2019-08-12/ => [2019-08-12]
    
  • Single-key partitions, where the partition value is the path with slashes replaced with hyphens

    partition keys: [dt]
    s3://bucket/table/prefix/2019/08/12/ => [2019-08-12]
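
The three layouts above can be sketched as a single path-to-values mapping. This is only an illustration of the rules as described in these notes, not glutil's actual implementation: the function names `path_segments` and `partition_values`, their parameters, and the order in which the layouts are tried are all assumptions made for the example.

```python
def path_segments(location, table_prefix):
    """Return the non-empty path segments of `location` below `table_prefix`."""
    prefix = table_prefix.rstrip("/") + "/"
    if not location.startswith(prefix):
        raise ValueError(f"{location!r} is not under {prefix!r}")
    relative = location[len(prefix):]
    return [segment for segment in relative.split("/") if segment]


def partition_values(location, table_prefix, partition_keys):
    """Map a partition location to its values (hypothetical helper).

    Tries the three layouts from the release notes in turn:
    hive-format segments, one plain segment per key, then a single
    key whose value is the path with slashes replaced by hyphens.
    """
    segments = path_segments(location, table_prefix)
    if not segments:
        raise ValueError(f"{location!r} has no partition path")

    # Hive-format paths: every segment looks like "key=value".
    if all("=" in segment for segment in segments):
        pairs = dict(segment.split("=", 1) for segment in segments)
        if list(pairs) != list(partition_keys):
            raise ValueError(f"hive keys {list(pairs)} do not match {partition_keys}")
        return [pairs[key] for key in partition_keys]

    # Plain path segments: one segment per partition key.
    if len(segments) == len(partition_keys):
        return segments

    # Single-key partition: join the segments with hyphens.
    if len(partition_keys) == 1:
        return ["-".join(segments)]

    raise ValueError(f"{location!r} does not match keys {partition_keys}")


prefix = "s3://bucket/table/prefix/"
print(partition_values(prefix + "2019/08/12/", prefix, ["year", "month", "day"]))
print(partition_values(prefix + "dt=2019-08-12/", prefix, ["dt"]))
print(partition_values(prefix + "2019/08/12/", prefix, ["dt"]))
```

Each call corresponds to one of the three examples listed above, using the same bucket path and partition keys.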
    

Everything appears to be working as expected. The only behavioral change is that Partitioner.partitions_on_disk now accepts a limit_days key only if the table's first three partition keys are [year, month, day].