feat: add config param for batch router for name customize #3461

Merged
merged 3 commits into from Jun 12, 2023

Conversation

peakle
Contributor

@peakle peakle commented Jun 7, 2023

Description

Add a batch router config parameter that lets users set a custom prefix for batch names.

Notion Ticket

__

Security

  • The code changed/added as part of this pull request won't create any security issues with how the software is being used.

@maratbakiev2

Please review and merge this. It is a very important feature. For example, adding date= as a prefix will let us declare all folders of RudderStack outputs as a Hive partitioned table.

@gitcommitshow
Collaborator

Thank you @peakle for the contribution.

Can you sign the CLA?
This is required before we can merge code from a new contributor. Learn more.

@peakle
Contributor Author

peakle commented Jun 8, 2023

> Thank you @peakle for the contribution.
>
> Can you sign the CLA? This is required before we can merge code from a new contributor. Learn more.

@gitcommitshow, done

@maratbakiev2

@gitcommitshow how do we move forward with this? Should we wait for a review from RS, or is something else needed? If we are waiting for a review, could you give an approximate timeline?

@gitcommitshow
Collaborator

@maratbakiev2 an internal discussion has already started. I can't give a deadline for the merge, but I'll try to close the discussion before next weekend.

One thing the team pointed out is that partitioning is already supported for datalake destinations. I don't have complete knowledge of that, so let me gather some more info to move the discussion ahead.

@lvrach changed the title from "add config param for batch router for name customize" to "feat: add config param for batch router for name customize" on Jun 9, 2023

@maratbakiev2

> @maratbakiev2 internal discussion has been started already. Can't say about the deadline to merge but I'll try to close the discussion before next weekend.
>
> One thing the team pointed out was that the partitioning is already supported for datalake destinations. I don't have complete knowledge about that, let me gather some more info to move the discussion ahead.

@gitcommitshow

Partitioning, in the sense of writing to separate folders for each day, is indeed supported. But one of the most popular technologies for processing that data is Hive, and Hive (Hive Metastore) needs partitions in the format column=value, for example date=2023-06-08. Currently RudderStack uses just the values (2023-06-08) as folder names.
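
To illustrate the difference between the two folder-naming schemes, here is a minimal Python sketch. It is not the rudder-server implementation (which this thread does not show); `partition_folder`, `parse_hive_partition`, and the `prefix` argument are hypothetical names standing in for the new config parameter.

```python
from datetime import date
from typing import Optional, Tuple


def partition_folder(day: date, prefix: str = "") -> str:
    """Build the folder name for one day's batch output.

    `prefix` is a hypothetical stand-in for the config parameter
    added by this pull request; its real name is not shown here.
    """
    return f"{prefix}{day.isoformat()}"


def parse_hive_partition(folder: str) -> Optional[Tuple[str, str]]:
    """Split a Hive-style `column=value` folder name.

    Returns None when the folder is a bare value with no column name.
    """
    column, sep, value = folder.partition("=")
    if not sep or not column or not value:
        return None
    return column, value


day = date(2023, 6, 8)
plain = partition_folder(day)                  # folder name without a prefix
hive = partition_folder(day, prefix="date=")   # folder name with a date= prefix

print(plain, parse_hive_partition(plain))
print(hive, parse_hive_partition(hive))
```

With an empty prefix the folder is a bare date and carries no column name; with `date=` it has the `column=value` shape described above.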

@lvrach lvrach merged commit c16e692 into rudderlabs:master Jun 12, 2023
12 of 31 checks passed
5 participants