Skip to content

Conversation

@eakmanrq
Copy link
Contributor

@eakmanrq eakmanrq commented Feb 6, 2023

  • The skeleton project doesn't include an __init__.py when generating the macro directory and I got an error when running. Now we generate this by default.
  • There was a bug in part of our code where we assumed all tables existed which isn't the case for external models
  • I was having errors when updating the snapshots table. Databricks Delta does not allow concurrent updates to a single underlying file. Therefore I have to partition the table by the condition to force a file per update which also means this table now has a file per partition which isn't good from a performance perspective. I believe a reason for why I didn't hit this before was because I ran in single node mode and that limited the concurrency.
    • Note: I decided to implement this at the engine level. Eventually we will be adding support for setting and understanding the actual format of the tables we are creating and it would make sense to couple this to that once it is created. It would have significantly increased the scope of this PR if I tried to add this now.

Screenshot 2023-02-06 at 9 25 50 AM

@eakmanrq eakmanrq force-pushed the eakmanrq/improve_skeleton branch from 4d6d88c to ef7f08c Compare February 6, 2023 17:30
@eakmanrq eakmanrq requested a review from a team February 6, 2023 17:30
self.engine_adapter.create_schema(self.snapshots_table)

self.engine_adapter.create_table(
self.engine_adapter.create_snapshots_table(
Copy link
Member

@izeigerman izeigerman Feb 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the same problem apply to the environments table? Shouldn't we partition it as well?

table_name, query_or_columns_to_types, exists, **kwargs
)

def create_snapshots_table(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really like the fact that the state sync logic leaks into something as low lever as engine adapter

@eakmanrq eakmanrq merged commit d19dbb7 into main Feb 6, 2023
@eakmanrq eakmanrq deleted the eakmanrq/improve_skeleton branch February 6, 2023 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants