Pull the owner from the DESCRIBE EXTENDED #39
Conversation
hey @Fokko - can you merge master into this branch? After you do that, I'll kick off the unit + integration tests. I'll give this a deeper look in the coming week :)
Sure, I've pulled master into this. Since we have integration testing now as well, I'll check if I can come up with a better test :-)
I'm having difficulty testing this branch:
describe extended `my_db`.`my_table`
It seems like the statement is being parsed wrong, somehow. I wonder if there's some session parameter or cluster configuration I'm missing. I'm running:
@Fokko I'm curious if you've encountered a similar issue before?
Thanks @jtcohen6 for giving this a try. Currently I'm on holiday, so I don't have access to any production environment. But I think you've got an edge case here. With
If you didn't do this, you'll probably get the error that you're seeing. Allow me to push a patch.
@jtcohen6 Can you share the full stack trace?
@Fokko Ah! Sorry to bother you on holiday. I figured out what's going wrong on my end. Here's the trace:
It looks like this is the offending line. I'm not sure if it's because:
@drewbanin Would you be able to take a look at the error in
I think the empty line is being transformed into a None; I can do additional checks today or tomorrow.
I think the issue here is indeed that … In
For reference, the dbt Core 0.14.3 implementation of … Ultimately, this logic is going to look pretty similar to the current contents of
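As a minimal sketch of the parsing under discussion (the helper name and row shape here are assumptions based on Spark's `DESCRIBE EXTENDED` output format, not the PR's exact code): Spark lists the table's columns first, then a blank separator row, then key/value metadata rows such as `Owner`, with the key in `col_name` and the value in `data_type`.

```python
from typing import List, Optional


def get_table_owner(rows: List[dict]) -> Optional[str]:
    """Pull the Owner value out of DESCRIBE EXTENDED output.

    Sketch only, not dbt-spark's actual code: a blank 'col_name'
    separates the column listing from the metadata section, where
    the key lives in 'col_name' and the value in 'data_type'.
    """
    in_metadata = False
    for row in rows:
        if not row.get('col_name'):
            # Blank separator row: the metadata section follows.
            in_metadata = True
            continue
        if in_metadata and row['col_name'] == 'Owner':
            return row['data_type']
    return None
```

Note the `in_metadata` flag guards against a real column that happens to be named `Owner` being mistaken for the metadata row.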
Force-pushed from 48a8bbf to 9b6a134.
@Fokko Thank you for making these changes! This PR worked for me when I checked it out last night. I'm kicking off tests now.
My pleasure @jtcohen6. Let me know what comes out of it.
@jtcohen6 The error looks unrelated. After this one, I'll continue on two other PRs:
@Fokko I figured out why the integration test is failing!
The implementation of … Other adapters' implementations of … In the long run, I think the better answer here is to reimplement the
@Fokko Thank you for making these changes! This is looking really close. I think you'll just need to revise the new unit test test_parse_relation, since you've changed the method names.
@drewbanin I'm unable to replicate the latest integration test failure locally. Could you take a look when you have a second?
@jtcohen6 I was already looking into it. Not sure what caused it; could it be a race condition? This could also be wishful thinking.
@jtcohen6 Can we trigger the tests once more? :)
@Fokko :( I'll try to get to the bottom of this
I'll dig into it as well
@Fokko @jtcohen6 I gave this a spin locally and I found that the two tables (one created via a seed, one created via an incremental model) were created with the columns in a weird order. This manifests as an error when the incremental model is run twice: it writes columns out of order, which looks something like this: Have you two seen anything like this before?
I got the integration tests working locally as well. I've noticed that the inserted data is shifted by one; the … It looks like the seed is not properly being created: the
I think I've hit the bottom of this. It turns out that this integration test has been a false negative all along. The change in this PR to correctly define Spark's quote character (`` ` ``) caused the test to finally work, and made me realize that we need an update to our incremental materialization (#59). Together with the proposed changes in #60, we should finally have a passing integration test.
See my one comment about updating the get_columns_in_relation method to avoid including partition columns twice.
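To illustrate the quoting change mentioned above (a sketch with illustrative names, not the adapter's actual API): Spark quotes identifiers with backticks rather than the ANSI double quotes most adapters default to, so a two-part relation renders as `` `my_db`.`my_table` `` rather than `"my_db"."my_table"`.

```python
# Sketch only: Spark uses the backtick as its identifier quote character.
QUOTE_CHARACTER = '`'  # the value this PR's discussion revolves around


def quote(identifier: str) -> str:
    """Wrap a single identifier in Spark's quote character."""
    return f'{QUOTE_CHARACTER}{identifier}{QUOTE_CHARACTER}'


def render(database: str, table: str) -> str:
    """Render a two-part relation name the way Spark SQL expects it."""
    return f'{quote(database)}.{quote(table)}'
```

With the wrong quote character configured, generated SQL would wrap names in double quotes, which Spark treats as string literals rather than identifiers.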
dbt/adapters/spark/impl.py
```python
def find_table_information_separator(rows: List[dict]) -> int:
    pos = 0
    for row in rows:
        if not row['col_name']:
```
To avoid including partition columns twice, could this be something like:

```diff
-        if not row['col_name']:
+        if not row['col_name'] or row['col_name'] == '# Partition Information':
```

I'm not sure if that's the most elegant way of writing it, and it would need at least a line break to satisfy PEP 8.
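Putting the suggestion together, a self-contained sketch might look like the following. The loop body past the truncated snippet is my assumption; the merged dbt-spark code may differ.

```python
from typing import List


def find_table_information_separator(rows: List[dict]) -> int:
    """Return the index of the row that ends the plain column listing
    in DESCRIBE EXTENDED output. Sketch based on the review thread."""
    pos = 0
    for row in rows:
        # Stop at the blank separator row or at the start of the
        # '# Partition Information' section, so partition columns
        # (repeated there by Spark) are not counted twice.
        if (not row['col_name']
                or row['col_name'] == '# Partition Information'):
            break
        pos += 1
    return pos
```

The parenthesized condition with a line break is one way to keep the suggested check within PEP 8's line-length limit.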
@Fokko Could you take a look at this when you get a chance? There's one small change I think we should make here to avoid duplicate columns in the
Thanks for pinging me @jtcohen6. I was away last week, and the notification got lost. I've updated the PR.
We can test against Spark when #60 has been merged :)
Thanks @Fokko! As it is, #39 depends on #60 and #60 depends on #39 for passing integration tests. I've confirmed locally that the two in combination can pass this integration test. I think I'm going to take what I see as the simplest approach here:
Does that sound OK to you?
Sounds good. I'm also happy to cherry-pick #60 onto this branch.
I would like to add the owner to the docs, so I've written a PR that does a DESCRIBE EXTENDED to get the relevant fields. I added a unit test, and tested it against Databricks/Azure.