[CT-343] partial parse conflicts with generate_schema_name changes #4850

Closed
1 task done
danielefrigo opened this issue Mar 10, 2022 · 11 comments · Fixed by #6494

@danielefrigo
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

We have a project with a custom generate_schema_name macro.
If I change the generate_schema_name macro, project compilation fails with the error "a string or stream input is required".
If I run a dbt clean before compiling, compilation completes without errors.

Expected Behavior

Compilation should not depend on running dbt clean first.

Steps To Reproduce

dbt clean
dbt deps
dbt compile --> no errors

modify generate_schema_name macro (even adding a jinja comment):

dbt compile --> error

10:01:21 Running with dbt=1.0.3
10:01:22 Change detected to override macro used during parsing. Starting full parse.
10:01:34 Encountered an error:
a string or stream input is required

Relevant log output

No response

Environment

- OS: Windows 10
- Python: 3.7.9
- dbt-core: 1.0.3
- dbt-bigquery: 1.0.0

What database are you using dbt with?

bigquery

Additional Context

No response

@danielefrigo danielefrigo added the bug and triage labels Mar 10, 2022
@github-actions github-actions bot changed the title from "partial parse conflicts with generate_schema_name changes" to "[CT-343] partial parse conflicts with generate_schema_name changes" Mar 10, 2022
@gshank
Contributor

gshank commented Mar 16, 2022

I have not been able to recreate this problem (though I don't have access to a Windows machine to test). Is there anything in the logs? Can you share your generate_schema_name macro? Could you try removing all files except the partial_parse.msgpack file from the target directory? Since the msgpack file has been loaded (or the error would be earlier), I'm wondering if it's one of the other files in the target directory that's the issue.
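
The suggestion to strip the target directory down to the partial parse file can be scripted. This is a minimal Python sketch, assuming the default target directory layout; the helper name keep_only_partial_parse is hypothetical and not part of dbt:

```python
import shutil
from pathlib import Path


def keep_only_partial_parse(target_dir, keep="partial_parse.msgpack"):
    """Delete every entry in target_dir except the file named `keep`.

    Hypothetical helper for the diagnostic step described above.
    Returns the names of the removed entries, sorted.
    """
    removed = []
    target = Path(target_dir)
    if not target.is_dir():
        return removed  # nothing to do if the directory does not exist
    for entry in sorted(target.iterdir()):
        if entry.name == keep:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # e.g. compiled/ or run/
        else:
            entry.unlink()  # regular files and symlinks
        removed.append(entry.name)
    return removed
```

Calling keep_only_partial_parse("target") from the project root and then re-running dbt compile would show whether the error survives with only the msgpack file present.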

@gshank gshank removed the triage label Mar 16, 2022
@danielefrigo
Contributor Author

I tried reproducing the issue on a fresh project and the error didn't arise.
It must be due to some interference with the models or macros in the project.
I'll do some testing to isolate the root cause.

@jpcassil

jpcassil commented Jun 7, 2022

Thank you for opening this issue.

This also happened to me. I was running dbt inside a Docker container:

FROM python:3.7-slim-buster

running poetry with this pyproject.toml configuration:

[tool.poetry.dependencies]
python = "^3.7"
dbt-core = "^1.0.0"
dbt-bigquery = "^1.0.0"
dbt-coverage = "^0.1.8"
mkdocs = "^1.3.0"
mkdocs-material = "^8.2.11"

[tool.poetry.dev-dependencies]
ipython = "^7.31.1"
ipdb = "^0.13.7"
sqlfluff = "0.5.2"
black = "^21.4b2"
pylint = "^2.12.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

The CI/CD pipeline broke because of this issue, and when I added the dbt clean and dbt deps commands to my workflow, it worked.
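
A narrower variant of this workaround is to delete only the saved partial parse state before compiling, so that dbt starts from a full parse. This is a minimal Python sketch, assuming the default target path of target/ and that removing partial_parse.msgpack alone is enough to avoid the error (not confirmed in this thread); the helper name drop_partial_parse is hypothetical:

```python
from pathlib import Path


def drop_partial_parse(project_dir=".", target_path="target"):
    """Remove the saved partial parse file, if present.

    Hypothetical helper: assumes the state file lives at
    <project_dir>/<target_path>/partial_parse.msgpack.
    Returns True if a file was removed, False otherwise.
    """
    state_file = Path(project_dir) / target_path / "partial_parse.msgpack"
    if state_file.exists():
        state_file.unlink()
        return True
    return False
```

In a CI job this would run just before dbt compile. Unlike dbt clean, it leaves installed packages in place, so dbt deps does not need to be repeated.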

@thijsnijhuis

thijsnijhuis commented Jul 22, 2022

Hi, I got the same error when creating my own custom "source" macro. The macro itself just calls builtins.source and does nothing else. It works fine when I first run "dbt clean" and "dbt deps" as described above. But when I change something in this file (even just a comment), I get the output below and have to run clean and deps again to get it working.

PS C:\GIT\project\dbt> dbt compile
14:01:01  Running with dbt=1.1.0
14:01:01  Change detected to override macro used during parsing. Starting full parse.
14:01:02  No external sources selected
14:01:03  Encountered an error:
a string or stream input is required

I have the dbt_utils, spark_utils and dbt_external_tables packages installed. I use the dbt-databricks adapter, and I get the error while running in Visual Studio Code on a Windows 11 machine.

Hope this helps with finding the issue.

@kcd83

kcd83 commented Aug 2, 2022

I hit this running DBT in a docker container whenever I changed the \macros\get_custom_schema.sql macro. Resolved the issue by running dbt clean after each change.

@moseleyi

moseleyi commented Aug 2, 2022

Getting the same error in a fresh project that uses macros for generating schema and alias names. dbt clean solves the problem.

@Aenger

Aenger commented Aug 16, 2022

I have this issue now: I updated the generate_schema_name macro with a minor change, and it now fails with the error message mentioned above. However, I am not able to run dbt deps (for Cloud) as suggested here, because the UI fails instantly and gives me this message:
[image: screenshot of the error message]

@jeremyyeo
Contributor

jeremyyeo commented Sep 20, 2022

I've got a repro of this one (ensure partial parsing hasn't been accidentally disabled).

# dbt_project.yml

name: "my_dbt_project"
version: "1.0.0"
config-version: 2
profile: "snowflake"

models:
  my_dbt_project:
    +materialized: table
    +database: development

-- models/foo.sql
select 1 as user_id

-- macros/generate_database_name.sql
{% macro generate_database_name(custom_database_name, node) -%}
    {%- set default_database = target.database -%}
    {%- if target.name == 'dev' -%}
        {{ default_database }}
    {%- elif target.name == 'prod' -%}
        {{ custom_database_name | trim }}
    {%- else -%}
        {{ default_database }}
    {%- endif -%}
{%- endmacro %}

The key to this is adding a file models/schema.yml that is completely empty (no content at all, not even YAML comments). Then:

  1. Clean out old partial_parse.msgpack files and then compile:
$ dbt clean && dbt compile

01:33:53  Running with dbt=1.2.1
01:33:53  Checking target/*
01:33:53  Cleaned target/*
01:33:53  Finished cleaning all paths.
01:33:58  Running with dbt=1.2.1
01:33:58  Partial parse save file not found. Starting full parse.
01:33:59  Found 1 model, 0 tests, 0 snapshots, 0 analyses, 268 macros, 0 operations, 0 seed files, 0 sources, 0 exposures, 0 metrics
01:33:59  
01:34:02  Concurrency: 1 threads (target='dev')
01:34:02  
01:34:02  Done.
  2. Modify the generate_database_name macro slightly:
-- macros/generate_database_name.sql
{% macro generate_database_name(custom_database_name, node) -%}
    {%- set default_database = target.database -%}
    {%- if target.name == 'dev' -%}
        {{ default_database }}
    {%- elif target.name == 'ci' -%}
        {{ custom_database_name | trim }}
    {%- else -%}
        {{ default_database }}
    {%- endif -%}
{%- endmacro %}
  3. Recompile, but do not clean out the partial_parse.msgpack file as you did in step (1):
$ dbt compile

01:38:10  Running with dbt=1.2.1
01:38:10  Change detected to override macro used during parsing. Starting full parse.
01:38:11  Encountered an error:
a string or stream input is required
01:38:11  Traceback (most recent call last):
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 129, in main
    results, succeeded = handle_and_check(args)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 191, in handle_and_check
    task, res = run_from_args(parsed)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/main.py", line 238, in run_from_args
    results = task.run()
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 451, in run
    self._runtime_initialize()
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 159, in _runtime_initialize
    super()._runtime_initialize()
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 92, in _runtime_initialize
    self.load_manifest()
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/task/runnable.py", line 81, in load_manifest
    self.manifest = ManifestLoader.get_full_manifest(self.config)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 219, in get_full_manifest
    manifest = loader.load()
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 366, in load
    self.parse_project(
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/manifest.py", line 469, in parse_project
    parser.parse_file(block, dct=dct)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/schemas.py", line 488, in parse_file
    dct = yaml_from_file(block.file)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/parser/schemas.py", line 118, in yaml_from_file
    return load_yaml_text(source_file.contents, source_file.path)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/clients/yaml_helper.py", line 56, in load_yaml_text
    return safe_load(contents)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/dbt/clients/yaml_helper.py", line 51, in safe_load
    return yaml.load(contents, Loader=SafeLoader)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/yaml/__init__.py", line 79, in load
    loader = Loader(stream)
  File "/Users/jeremy/src/dbt-sandcastles/venv_dbt_latest/lib/python3.8/site-packages/yaml/cyaml.py", line 26, in __init__
    CParser.__init__(self, stream)
  File "yaml/_yaml.pyx", line 288, in yaml._yaml.CParser.__init__
TypeError: a string or stream input is required

Good ol' YAML trying to parse an empty file on the second go-around. The first time (a full parse from scratch) is okay, because we do not try to read partial_parse.msgpack on that run.
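
Since the empty models/schema.yml is what triggers the traceback above, one way to check a project for the same condition is to look for zero-byte YAML files. This is a minimal Python sketch, assuming the YAML files of interest sit under a single directory such as models/; the function name find_empty_yaml is made up for illustration and is not part of dbt:

```python
from pathlib import Path


def find_empty_yaml(models_dir):
    """Return zero-byte .yml/.yaml files under models_dir, sorted by path.

    Hypothetical diagnostic: "empty" here means a file size of 0 bytes,
    matching the "no content, not even comments" case in the repro.
    """
    empty = []
    for path in sorted(Path(models_dir).rglob("*")):
        if (
            path.is_file()
            and path.suffix in {".yml", ".yaml"}
            and path.stat().st_size == 0
        ):
            empty.append(path)
    return empty
```

Running find_empty_yaml("models") in an affected project and deleting or filling in any file it reports would be one way to test whether this repro matches the original report.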

@github-actions
Contributor

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the stale label Dec 20, 2022
@github-actions
Contributor

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Dec 28, 2022
@gshank gshank reopened this Dec 31, 2022
@github-actions github-actions bot removed the stale label Jan 1, 2023
@gshank gshank self-assigned this Jan 2, 2023