Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

0.15.0+ versions not showing errors with duplicate doc names #2054

Closed
1 of 5 tasks
braydenconnole opened this issue Jan 16, 2020 · 6 comments
Closed
1 of 5 tasks

0.15.0+ versions not showing errors with duplicate doc names #2054

braydenconnole opened this issue Jan 16, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@braydenconnole
Copy link

Describe the bug

It looks like version 0.15.0 removed a feature that throws an error when there are duplicate doc names.

Steps To Reproduce

In a doc block define two docs variables with the same name like so:

{% docs person_id %}
A person ID
{% enddocs %}

{% docs person_id %}
A second person ID
{% enddocs %}

Make a .yml file that references '{{doc("person_id")}}'

version: 2
sources:
  - name: schema
    tables:
      - name: person
        description: "a person table"
        columns:
          - name: person_id
            description: '{{doc("person_id")}}'
          

run dbt docs generate or dbt run using version 0.14.4 of dbt. (also works on some lower versions I've only checked 0.14.3 and 0.14.4)
and you'll get this error:

Running with dbt=0.14.4
Encountered an error:
Compilation Error
  dbt found two resources with the name "person_id". Since these resources have the same name,
  dbt will be unable to find the correct resource when ref("person_id") is used. To fix this,
  change the name of one of these resources:
  - etl.person_id(models/persons/persons.md)
  - etl.person_id (models/persons/persons.md)

If you upgrade dbt to 0.15.0 or 0.15.1, This error never occurs and dbt compiles the docs.

Expected behavior

I would expect the newer version to also throw the error when there are duplicates in the doc block names

System information

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

0.14.4 and 0.15.0

The operating system you're using: Mac OS Catalina
The output of python --version: Python 3.7.3

I did a little digging to see if I could find where the change occurred between the two versions and I did find that the raise_duplicate_resource_name function in the core/dbt/exceptions.py file was changed in this commit: e54661b5df3b3126eb94c16f9784b6ad438b0232

To include this if block:

    if node_1.resource_type in NodeType.refable():
        getter = 'ref("{}")'
    elif node_1.resource_type == NodeType.Source:
        getter = 'source("{}")'
    elif node_1.resource_type == NodeType.Test and 'schema' in node_1.tags:
        return
    get_func = getter.format(duped_name)

Maybe the code is hitting the return in this case so the error is never thrown? I haven't been able to dig much further than this so far.

I didn't see a ton of changes to other functions that called raise_duplicate_resource_name between the 0.14.4 and 0.15.0 versions, so I'm guessing that it probably starts here.

@braydenconnole braydenconnole added bug Something isn't working triage labels Jan 16, 2020
@braydenconnole braydenconnole changed the title 0.15.0 version not showing errors with duplicate doc names 0.15.0+ versions not showing errors with duplicate doc names Jan 16, 2020
@drewbanin drewbanin removed the triage label Jan 21, 2020
@drewbanin drewbanin added this to the Barbara Gittings milestone Jan 21, 2020
@drewbanin
Copy link
Contributor

Thanks for the report @braydenconnole! This is definitely a regression.

I took a quick look at the commit you sent over, but I'm not certain that that's the relevant part of the codebase here. Instead, check out how dbt parsed docs in 0.14.x:
https://github.com/fishtown-analytics/dbt/blob/b5aff36253f5c6563e942265bbaa4cb366722b14/core/dbt/parser/docs.py#L95-L99

And how dbt parses docs in 0.15.x:
https://github.com/fishtown-analytics/dbt/blob/77dceab7beaeefd833bb96928fe1a546339cfb0d/core/dbt/parser/results.py#L76-L97

Note that the add_doc method does not check for duplicates, whereas the add_node method does!

I think the fix is pretty straightforward here:

  1. Add a call to _check_duplicates which validates that the doc is not in self.docs
  2. Update the raise_duplicate_resource_name method to account for Documentation nodes
  3. Add a test so this doesn't happen again :)

Let me know if this is something you'd be interested in contributing a fix for -- we're super happy to help out however we can!

Last, I'm cc'ing @beckjake here in case I'm missing something important - I think I recall discussing this change with him recently, but I'm hazy on the details.

@braydenconnole
Copy link
Author

@drewbanin good find! I could contribute a fix for it.

I'll take a look at your contributor documentation and let you know when I have any questions on it.

@bubbomb
Copy link
Contributor

bubbomb commented Jan 27, 2020

@drewbanin After reading your CLA. I'm switching to my personal account instead. You should see more things coming from this account instead.

@drewbanin
Copy link
Contributor

Thanks for the update @bubbomb - that works for me!

@bubbomb
Copy link
Contributor

bubbomb commented Jan 30, 2020

@drewbanin I put up a PR that fixes the issue, but I'm having trouble figuring out where to put the test, as well as how to set up the docstrings appropriately so the test can work.

Any help would be awesome on this front

Thanks!

beckjake added a commit that referenced this issue Feb 11, 2020
@drewbanin
Copy link
Contributor

closed by #2080

Thanks for opening the issue and coming through with the fix @bubbomb!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants