[HUDI-6947] Refactored HoodieSchemaUtils.deduceWriterSchema with many flags #10810
Hmm, good code quality is what we are always in pursuit of.
Please look at the diffs in split view. The main refactoring is in the |
@geserdugarov thanks for the contribution! I will review this today.
Looks good to me. @yihua is going to review as well
Skimmed the changes and LGTM
Change Logs
The current implementation of `HoodieSchemaUtils.deduceWriterSchema` is really complicated, with many flags used in one place. To figure out how this method works, we need to draw a directed acyclic graph. Here is the current implementation:
![before_refactoring](https://private-user-images.githubusercontent.com/67073364/309740949-15507db6-92f4-4dd3-86e9-c4fa4b3c05ab.jpg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjExNDI3MDQsIm5iZiI6MTcyMTE0MjQwNCwicGF0aCI6Ii82NzA3MzM2NC8zMDk3NDA5NDktMTU1MDdkYjYtOTJmNC00ZGQzLTg2ZTktYzRmYTRiM2MwNWFiLmpwZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzE2VDE1MDY0NFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTYyNDNhMDE1NTQ5MDE5ZDJhYTUxMTc0M2U4YmU4MWM4M2IzYjY2YjllNWQzMmMyZjBmYzZkNDIxYjNmMzhhMmQmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.AXIZT1wEYr1eZm6eTmq7bkgeVuVahh0GvowU85DHbbs)
Yellow marks flags, and purple marks returned schemas or exceptions.
We couldn't remove any of the flags without changing the processing logic. The only option is to refactor `deduceWriterSchema()` to simplify it and to prepare for the future removal of the deprecated `hoodie.datasource.write.reconcile.schema`.
This MR proposes the following changed processing scheme:
![after_refactoring](https://private-user-images.githubusercontent.com/67073364/309742000-97683b54-edfb-4941-a9e9-e92a0fe34066.jpg?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjExNDI3MDQsIm5iZiI6MTcyMTE0MjQwNCwicGF0aCI6Ii82NzA3MzM2NC8zMDk3NDIwMDAtOTc2ODNiNTQtZWRmYi00OTQxLWE5ZTktZTkyYTBmZTM0MDY2LmpwZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzE2VDE1MDY0NFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWEwOThkZDhlYzRlOGQyZTFhNjc0NzBiNzAzOWM1MjRiYTljNjVmMTJmZTI0ZTQ4ZDIzY2I1YzI4MDgxMTk3MjEmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.0PTJt0E4BJC0yh5jEjkgOYJN9AuTJ8OTY-DkKl3AEDA)
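For readers who cannot open the diagrams, the general idea of untangling flag-driven logic can be sketched roughly as below. This is only an illustration, not the code in this MR: the class `WriterSchemaDecisionSketch`, the `Outcome` enum, the `Flags` holder, and the three boolean fields are all hypothetical, and the branch order shown here is an assumption rather than the actual behavior of `deduceWriterSchema`. The only point is that each flag is checked once, in a fixed order, with an early return per terminal node of the graph.

```java
public class WriterSchemaDecisionSketch {

    /** Hypothetical terminal nodes; the real method returns schemas or throws exceptions. */
    public enum Outcome {
        USE_SOURCE_SCHEMA,
        USE_RECONCILED_SCHEMA,
        FAIL_INCOMPATIBLE
    }

    /** Hypothetical bundle of inputs; none of these names come from the Hudi code base. */
    public static final class Flags {
        final boolean tableSchemaPresent;   // assumption: a schema of the existing table is known
        final boolean reconcileSchema;      // loosely stands for hoodie.datasource.write.reconcile.schema
        final boolean schemasCompatible;    // assumption: result of some compatibility check

        public Flags(boolean tableSchemaPresent, boolean reconcileSchema, boolean schemasCompatible) {
            this.tableSchemaPresent = tableSchemaPresent;
            this.reconcileSchema = reconcileSchema;
            this.schemasCompatible = schemasCompatible;
        }
    }

    /** Each flag is consulted once, top to bottom, so the method reads like a path through the graph. */
    public static Outcome decide(Flags flags) {
        if (!flags.tableSchemaPresent) {
            // Nothing to compare against, so the incoming schema is taken as is.
            return Outcome.USE_SOURCE_SCHEMA;
        }
        if (flags.reconcileSchema) {
            // Reconciliation requested: hand over to a dedicated branch.
            return Outcome.USE_RECONCILED_SCHEMA;
        }
        if (!flags.schemasCompatible) {
            // Incompatible schemas without reconciliation end in an error node.
            return Outcome.FAIL_INCOMPATIBLE;
        }
        return Outcome.USE_SOURCE_SCHEMA;
    }
}
```

With this shape, removing a deprecated flag later means deleting one guard and its branch, instead of revisiting every nested condition.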
Impact
Refactored `HoodieSchemaUtils.deduceWriterSchema`.
Risk level (write none, low medium or high below)
Low.
Documentation Update
No need.
Contributor's checklist