Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feat] SQFluff Interface Modification #19

Merged
merged 14 commits into from Oct 13, 2022
Merged

[feat] SQFluff Interface Modification #19

merged 14 commits into from Oct 13, 2022

Conversation

z3z1ma
Copy link
Owner

@z3z1ma z3z1ma commented Oct 11, 2022

TODO: Extend description

I think our scope does not require us to limit ourselves to "models" and we should attach ourselves to the SQL itself. This simplifies the whole process as can be seen in this PR.

source_dbt_sql = str(in_str)
fpath = Path(fname)
if fpath.exists():
source_dbt_sql = fpath.read_text()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this code (which was basically already there, i.e. my code, not yours), I realize now we may want to use in_str rather than reading from disk, because SQLFluff knows and uses the encoding specified in .sqlfluff.

  1. Reads the file here
  2. Passes it to render_string() here
  3. Passes the string to the templater here

Some of the existing code above (that you removed) also has issues -- there are a few open issues against sqlfluff-templater-dbt at this moment related to incorrect handling of file encodings. 😅

These are the open issues. I just now added a similar comment to those -- looks like an easy fix now that I'm staring at the code!

@pytest.mark.parametrize(
"path", ["models/my_new_project/disabled_model.sql", "macros/echo.sql"]
@pytest.mark.parametrize("path", ["models/my_new_project/disabled_model.sql", "macros/echo.sql"])
@pytest.mark.skip(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to just remove this test. I initially copied all the tests from sqlfluff-templater-dbt, then removed several that weren't relevant.

BTW, it's nice that the new templater handles some of this better than sqlfluff-templater-dbt. It would be cool to apply some of this back to that templater.

@@ -172,6 +172,9 @@ def test__templater_dbt_templating_test_lex(project_dir, dbt_templater, fname):
),
],
)
@pytest.mark.skip(
reason="We don't need to exclude disabled models since models aren't relevant in our implementation, we lint SQL strings"
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to remove this as well.

@barrywhart
Copy link
Contributor

Nice cleanup! I had a few suggestions, one of which may address existing issues copied over from the original sqlfluff-templater-dbt.

retry = 3
while retry > 0:
# In massive parallel requests, referring to higher loads than we will ever see in all likelihood,
# this cheap retry increases reliability to ~100% in test cases of up to 50 clients across 1K reqs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When it fails and needs to retry, what errors were you seeing?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a duplicate node name failure. Even using a uuidv4 I found it can happen. This is accounting for a real edge case but I feel the code is safer by virtue of having it than it is by not.

On a deeper level for clarities sake, the get server node method actually adds a node to the manifest, the same manifest which we can access from multiple threads. So we make this operation atomic. I could probably hide it inside DbProject as a compile_node_safe as I think there is one other place I do this same thing.

These failures only occur when you reach a certain level of concurrency tested on a pretty fast M1 chip.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants