Prod 325 databricks rollovers #8
f409cdf to 2710733
gorskysd left a comment
Nice. The compression handling here is really clear, and I like the design pattern you put in. The robust use of Path was also nice; there are a lot of pieces here that I wasn't familiar with.
I made some minor suggestions, but nothing critical IMO, so I'm submitting my approval.
gorskysd left a comment
One last-minute change, sorry!
Corner case to consider: there's only one rollover log and it's not the 0th. In this case I believe the rollover ID check is bypassed. This is a common case when clients don't know to expect more than one eventlog. See the example log in https://synccomputing.atlassian.net/browse/PROD-399?focusedCommentId=11208
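To make the corner case concrete, here is a minimal sketch of a check that would also catch a lone non-zero rollover. It is not the code in this PR: the function name `check_rollover_ids` and the assumption that rollover IDs have already been extracted as integers are hypothetical, chosen only for illustration.

```python
from typing import Sequence


def check_rollover_ids(ids: Sequence[int]) -> None:
    """Raise ValueError unless ids are exactly 0, 1, ..., n-1 in some order.

    Hypothetical helper: assumes the rollover IDs were already parsed
    from the event log files as integers.
    """
    if not ids:
        raise ValueError("no event log files found")

    # Comparing against range(len(ids)) covers the single-file case too:
    # one file with ID 3 gives [3] != [0], so it is rejected instead of
    # slipping through a check that only runs when there are several files.
    expected = list(range(len(ids)))
    if sorted(ids) != expected:
        raise ValueError(
            f"rollover IDs {sorted(ids)} are not the contiguous "
            f"sequence 0..{len(ids) - 1}"
        )
```

With this shape, `check_rollover_ids([0])` and `check_rollover_ids([2, 0, 1])` pass, while `check_rollover_ids([3])` raises, which is the single-log case described above.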
…and add support for s3 directories
Phew!
Kudos, SonarCloud Quality Gate passed!
gorskysd left a comment
Nice, this looks great, Scott.
I'm learning a lot about Path and URL handling here that's super useful.

https://synccomputing.atlassian.net/browse/PROD-325
Start with spark_log_parser/__main__.py to see how the new event log module is used.