Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add range partitioning support #174

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

kitagry
Copy link

@kitagry kitagry commented Feb 15, 2025

This pull request introduces support for range partitioning in BigQuery.

I checked this feature with example/config_replace_field_range_partitioned_table.yml

image

@hiroyuki-sato hiroyuki-sato self-requested a review February 15, 2025 15:47
Copy link
Member

@hiroyuki-sato hiroyuki-sato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for creating this PR. I'll review this later. Before review this PR, I have a question. Could you check my comment?

README.md Outdated
range_partitioning:
field: customer_id
range:
start: '1'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this part use string instead of integer? As far as I know, range partition uses a number. And we can check the range is valid (start < end) if we use integer.

What do you think of this configuration layout?
(Do we need a range block?).

range_partitioning:
  field:  customer_id # document uses `column` but this plugin uses `field`. [1] 
  start: 1
  end: 1000
  interval: 10
  # [1] https://cloud.google.com/bigquery/docs/creating-partitioned-tables#create_an_integer-range_partitioned_table

(This is just my opinion, I want to ask co-maintainers this comment.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the comment! I followed the api documentation. If you prefer integer, I'll change this!

https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#RangePartitioning

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a range block?

I fixed it with 49d2c36.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello, @kitagry. Thank you for waiting.

Could you use this layout? (Sorry, we decided to use the original design (except using integer instead of string in range values.)

range_partitioning:
  field: customer_id
  range:
    start: 1    # integer not string.
    end: 99999  # integer not string.
    interval: 1 # integer not string.

and Could you check the range start + interval < end?
If you have any concern, please let me know.

I referenced the time_partition configurations.

BigQuery API use

{
  "type": string,
  "expirationMs": string,
  "field": string,
  "requirePartitionFilter": boolean
}

embulk configuration

  type: bigquery
  table: table_name$20160929
  time_partitioning:
    type: DAY
    expiration_ms: 259200000 # integer not strong, use sake case

We discussed this using this design document. (Written in Japanese)

After modification, I'll check the partition feature.

@hiroyuki-sato
Copy link
Member

@kitagry Thank you for more work on this PR. I'll review this PR please wait. I'm not the original plugin developer. So, I need to investigate the configuration rule. If this plugin is based on the API settings, It would be better to use the original field.range.start. I'll talk co-maintainer about this.

@kitagry kitagry force-pushed the add-range-paritioned branch from 49d2c36 to f8d039f Compare March 6, 2025 12:54
@kitagry
Copy link
Author

kitagry commented Mar 6, 2025

Hi @hiroyuki-sato , I changed range-partitioning fileds to be integer in f8d039f

@kitagry kitagry force-pushed the add-range-paritioned branch from f8d039f to 5398f69 Compare March 9, 2025 06:26
@hiroyuki-sato
Copy link
Member

@kitagry Thanks. I will test this PR later. Please wait for a while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants