[CT-1343] [Regression] 1.3.0 breaks projects that colocate non-dbt-related Python code with sql model ("dbt only allow one model defined per python file") #6061

Closed
2 tasks done
tin-homa opened this issue Oct 13, 2022 · 8 comments
Labels
bug Something isn't working python_models regression

Comments

@tin-homa

tin-homa commented Oct 13, 2022

Is this a regression in a recent version of dbt-core?

  • I believe this is a regression in dbt-core functionality
  • I have searched the existing issues, and I could not find an existing issue for this regression

Current Behavior

We have been structuring our projects so that model-paths points to folders that contain both dbt code (in .sql files) and other Python code used to extract-load the data or run some light modelling. This simplifies our repo structure a lot and has been very convenient to work with (compared to the default approach of having a separate dbt folder).

However, since v1.3.0 we have been unable to run any dbt (sql) model because of this error:

Parsing Error in model the_unrelated_python_file
dbt only allow one model defined per python file

We realize this is because the new version has started looking into .py files to find models, and assumes all .py files are dbt-related.
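To illustrate why an unrelated file can trip a check like this, here is a minimal sketch. It assumes, purely for illustration, that the parser requires exactly one function named `model` per .py file; the helper names `count_model_functions` and `looks_like_single_model` are hypothetical and this is not dbt's actual implementation.

```python
import ast


def count_model_functions(source: str) -> int:
    """Count function definitions named `model` anywhere in the source."""
    tree = ast.parse(source)
    return sum(
        isinstance(node, ast.FunctionDef) and node.name == "model"
        for node in ast.walk(tree)
    )


def looks_like_single_model(source: str) -> bool:
    # Assumed rule (hypothetical): exactly one `model` function is required.
    return count_model_functions(source) == 1


# A file like the one in this report: plain classes, no `model` function.
UNRELATED = '''
class Extractor:
    pass


class Loader:
    pass
'''

# A file shaped like a Python model: one `model(dbt, session)` function.
PYTHON_MODEL = '''
def model(dbt, session):
    return dbt.ref("the_sql_model")
'''

print(looks_like_single_model(UNRELATED))
print(looks_like_single_model(PYTHON_MODEL))
```

Under that assumption, any ordinary Python file in the models folder fails the check simply because it was never written as a model.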

Expected/Previous Behavior

While I look forward to using dbt Python models, I don't think this feature should prevent us from having other types of Python files in the same folder.
I'd suggest either of the following:

  • ignore .py files that don't have a specific dbt marker
  • allow a project config to turn off the use of dbt Python altogether (we use Redshift so it's not even supported yet I believe)

Steps To Reproduce

  1. Have a dbt project. Inside a model-paths folder, put:
    • a sql dbt model
    • a non-dbt Python file which contains more than 1 class definition.
  2. Run dbt run -s the_sql_model
  3. See the error
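The layout described in the steps above can be sketched with a short script. The project name `repro_project`, the profile name `repro_profile`, and the class names are placeholders chosen for illustration; only `the_sql_model` and `the_unrelated_python_file` come from the report.

```python
import tempfile
from pathlib import Path


def build_repro_project(root: Path) -> Path:
    """Create a tiny dbt project that mixes a SQL model with unrelated Python."""
    project = root / "repro_project"
    models = project / "models"
    models.mkdir(parents=True)

    # Minimal project file; the names here are placeholders.
    (project / "dbt_project.yml").write_text(
        "name: repro_project\n"
        "version: '1.0.0'\n"
        "config-version: 2\n"
        "profile: repro_profile\n"
        "model-paths: ['models']\n"
    )

    # The SQL model that used to run fine on dbt <= 1.2.2.
    (models / "the_sql_model.sql").write_text("select 1 as id\n")

    # A non-dbt Python file with more than one class definition.
    (models / "the_unrelated_python_file.py").write_text(
        "class Extractor:\n    pass\n\n\nclass Loader:\n    pass\n"
    )
    return project


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = build_repro_project(Path(tmp))
        for path in sorted(project.rglob("*")):
            print(path.relative_to(project))
```

Running `dbt run -s the_sql_model` inside the generated folder would additionally need a working profile for your adapter, which this sketch does not create.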

Relevant log output

No response

Environment

- OS: Ubuntu 20.04
- Python: 3.8
- dbt (working version): <=1.2.2
- dbt (regression version): 1.3.0

Which database adapter are you using with dbt?

redshift

Additional Context

No response

@tin-homa tin-homa added bug Something isn't working regression triage labels Oct 13, 2022
@github-actions github-actions bot changed the title [Regression] 1.3.0 breaks projects that colocate non-dbt-related Python code with sql model ("dbt only allow one model defined per python file") [CT-1343] [Regression] 1.3.0 breaks projects that colocate non-dbt-related Python code with sql model ("dbt only allow one model defined per python file") Oct 13, 2022
@ChenyuLInx
Contributor

@tin-homa Sorry to hear that you ran into this issue! We added the .dbtignore feature to help with this situation.

Can you try adding a .dbtignore file at the project root dir that excludes those files? Let us know if there's still an issue afterwards!

@tin-homa
Author

tin-homa commented Oct 13, 2022

Hey @ChenyuLInx. Thanks for the fast response. I couldn't get it to work though. Am I doing something wrong?

  • I tried putting .dbtignore both in the project root (at the same level as dbt_project.yml) and inside project/dbt/.
  • I tried both *.py and **.py as its contents.

None of these combinations worked (same error).

@lostmygithubaccount
Contributor

hi @tin-homa, I just tested this and placing **.py works for me to ignore all .py files in the models directory. can you confirm the version of dbt you're on?

the .dbtignore should be in the project root, on the same level as dbt_project.yml. one example here: https://github.com/dbt-labs/dbt-demo-data/blob/snowflake-sql-v-py/.dbtignore

I also tested another project with **.py in its .dbtignore and a few Python files
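As a rough sketch of the setup described above, the following script writes a .dbtignore next to dbt_project.yml. The helper name `write_dbtignore` is hypothetical; the `**.py` pattern is the one reported to work in this thread. Note that this pattern ignores every .py file, so it would also hide any real dbt Python models.

```python
from pathlib import Path


def write_dbtignore(project_root: Path, patterns=("**.py",)) -> Path:
    """Write a .dbtignore at the project root, one pattern per line."""
    # The file belongs at the same level as dbt_project.yml.
    if not (project_root / "dbt_project.yml").is_file():
        raise FileNotFoundError(
            f"No dbt_project.yml in {project_root}; is this the project root?"
        )
    target = project_root / ".dbtignore"
    target.write_text("\n".join(patterns) + "\n")
    return target
```

For example, `write_dbtignore(Path("."))` run from the project root creates a file whose only line is `**.py`.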

@tin-homa
Author

Here are some screenshots:

  • 1.3.0 error
    image

  • 1.2.2 success (same repo, same code, I just reinstalled dbt through pip)
    image

  • Here's where my .dbtignore is placed, at the same level as dbt_project.yml
    image

  • Here's the content of the file
    image

  • FYI my model-paths have multiple folders inside. Maybe it's an edge case that wasn't considered during the .dbtignore implementation?
    image

@ChenyuLInx
Contributor

@tin-homa I still couldn't reproduce the issue, even with multiple layers of folders and everything. I wonder if it could be a library mismatch? Can you try pip freeze | grep pathspec and see which version of pathspec you have? I had 0.9.0, and I also just tested 0.10.1, which works fine.
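If grep is not available (for example on Windows), the same version check can be sketched in Python with the standard library. The helper name `installed_version` is hypothetical, and `dbt-core` is included only as a convenience for confirming the dbt version at the same time.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for name in ("pathspec", "dbt-core"):
        print(f"{name}: {installed_version(name)}")
```

Run it with the same interpreter that runs dbt, otherwise it may report the packages of a different environment.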

Also happy to hop on a huddle on dbt slack to take a look together, you can find me at @Chenyu Li(dbt Labs)

@tin-homa
Author

I have pathspec==0.9.0 as well.
Let me try creating a minimal example that can produce the error.

@tin-homa
Author

I tested again and the .dbtignore works for me now. Maybe I was doing something silly yesterday, or I just needed to restart my environment somehow. Sorry for all the fuss and thanks a lot for the quick response.

@jtcohen6 jtcohen6 removed the triage label Oct 16, 2022
@jtcohen6
Contributor

Very glad you were able to get this working!

@lostmygithubaccount We still need to actually document .dbtignore: dbt-labs/docs.getdbt.com#2043
