Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OdbcHook returns None. Related to #15016 issue. #15510

Merged
merged 7 commits into from Jun 13, 2021
Merged

OdbcHook returns None. Related to #15016 issue. #15510

merged 7 commits into from Jun 13, 2021

Conversation

Goodkat
Copy link
Contributor

@Goodkat Goodkat commented Apr 23, 2021

This PR is related to #15016 issue.
OdbcHook returns None for non-boolean-like string values in connect_kwargs dict arg, however connect_kwarg values should remain as is in this case.


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

This PR is related to #15016 issue.
OdbcHook returns None for non-boolean-like string values in connect_kwargs dict arg, however connect_kwarg values should remain as is in this case.
@marcosmarxm
Copy link
Contributor

Nice @Goodkat! Can you add a simple test for this? https://github.com/apache/airflow/blob/master/tests/providers/odbc/hooks/test_odbc.py already has tests for other connections functionalities and add this will helpfully

@marcosmarxm
Copy link
Contributor

@Goodkat can you check discussion on #15016? Maybe is possible to remove this function and change documentation to use JSON object with true/false values. If is possible removing the function will turn the code cleaner =)

According to the discussion on #15016, we agreed to remove the clean_bool function, as it is not needed actually for processing boolean values in JSON .
Removed quotas for the boolean values accroding to the discussion on #15016 issue.
@eladkal
Copy link
Contributor

eladkal commented May 10, 2021

@Goodkat can you fix the failing test?

Fixed the failing pydocstyle test: No blank lines allowed after function docstring (found 1)
Fixed failing pylint check. R1721: Unnecessary use of a comprehension
Integration Test has been changed as it should all work correctly with unquoted raw JSON booleans, which should make clean_bool unnecessary.
Fixed input parameter {'ansi': true} -> {'ansi': True}
@Goodkat
Copy link
Contributor Author

Goodkat commented May 10, 2021

@eladkal should be OK now.

@eladkal eladkal requested a review from dstandish May 10, 2021 17:24
Copy link
Contributor

@dstandish dstandish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's best remove this conversion.

I've commented with reasoning in the issue and notes on how I think the logic came to be this way.

But I'm a little unsure about how we should release, given that it is a breaking change, howevery narrowly. I think this requeires that we create a new major version of the odbc provider.

And it think we could optionally first fix the if statement by adding an else return val in the current provider major version, before removing clean_bool in next. @turbaszek or @eladkal do you have advice here? I'm not sure of the intricacies of bumping provider versions but would be happy to learn so that I can help out with this kind of thing.


Some detail on the way in which this is "breaking":

From pyodbc.connect docstring:

      autocommit
        If False or zero, the default, transactions are created automatically as
        defined in the DB API 2.  If True or non-zero, the connection is put into
        ODBC autocommit mode and statements are committed automatically.

If someone is using connect_kwargs to set autocommit using the string value 'false' then, before this change, autocommit would be False. But after this change, it would be left as string 'False', and therefore it would be non-zero and non-False, and therefore the connection would be created with autocommit on.

On the other hand, users would not be affected if instead they managed autocommit by either (1) using DBApi methods, or (2) explicitly modifying the connection object, or (3) they used actual boolean values instead of strings. And FWIW this hook has not been released very long.

In any case, I think this PR needs a line in updating.md, stating that this hook will no longer convert strings to boolean in connect_kwargs.

After that it's good to merge from my perspective, with the caveat that we need a little guidance from others on whether this needs to wait for next version.

@eladkal
Copy link
Contributor

eladkal commented May 11, 2021

And it think we could optionally first fix the if statement by adding an else return val in the current provider major >version, before removing clean_bool in next

I agree. If we can preserve backward compatibility with deprecation notice we should. we can remove clean_bool in the in a followup major release (whenever it may be)

In any case the steps to bump the major version of provider are:

  1. bump provider yaml
  2. add detailed explanation about the change in the provider changelog
    @potiuk did I miss something?

@eladkal eladkal linked an issue May 11, 2021 that may be closed by this pull request
@eladkal
Copy link
Contributor

eladkal commented Jun 13, 2021

@Goodkat are you still working on it?

@Goodkat
Copy link
Contributor Author

Goodkat commented Jun 13, 2021

@eladkal the implementetion itself is ready to be merged. But the open question is how we should release it.
According to @dstandish recommendation this PR needs a line in updating.md and you also mentioned provider.yaml/provider changelog.
But I am not sure I should change them myself. Maybe we shall ask someone from the release-preparation team?

@potiuk
Copy link
Member

potiuk commented Jun 13, 2021

But I am not sure I should change them myself. Maybe we shall ask someone from the release-preparation team?

That would be me :)

I am about to clarify the rules there (I am just finalizing automation around that in #16405 and #16419 ). They were not fully hashed-out before because we were learning about all the different scenarios there, and I think we have observed pretty much all of them so we are ready to codify that for providers.

For a provider, as an author you only need to add description in provider' CHANGELOG.rst when you have breaking change and want to provide a longer description on how people should migrate. There is no need to bump provider.yaml (this should be done by release manager based on what kind of changes are being released). It's actually harmful to do it before - we will (if we switch to semantic/conventional commits) be able to bump those automatically during release preparation for all providers. For now it's release manager job (and we have tools to help with that).

Coming back to the original question - what do we need now. If I understand correctly, we want to add breaking change where those params have to be 'real' bools not strings. We are releasing breaking changes anyway now for all providers (due to removal of apply_defaults) so adding this now is the best time. I will merge it now, and if you can confirm @Goodkat that this is what the users should do, I will add appropriate description in CHANGELOG.

@github-actions
Copy link

The PR is likely OK to be merged with just subset of tests for default Python and Database versions without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full tests matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.

@github-actions github-actions bot added the okay to merge It's ok to merge this PR as it does not require more tests label Jun 13, 2021
@potiuk potiuk merged commit 8a4cfd7 into apache:main Jun 13, 2021
@potiuk
Copy link
Member

potiuk commented Jun 13, 2021

BTW. The main reason why there were strings before, I thing was because this is generally how most other extras are passed anyway. And in case we had it defined in the UI (separate UI field for autocommit and ansi), we would have no way to specifying type of those fields (they are always strings). But it would not be possible for ODBC Hook anyway because it uses dict structure for "connect_kwargs" and this is not supported at all. There is quite a bit of inconsistency between different ways extras are handled by different connections and I think in 3.0 we will have to redefine it a bit.

@Goodkat
Copy link
Contributor Author

Goodkat commented Jun 13, 2021

@potiuk thank you for the detailed clarification. I confirm that the basic idea was to amend the parameters' types by means of using 'real' bools not strings.

@potiuk
Copy link
Member

potiuk commented Jun 13, 2021

My changelog proposal:

When you pass kwargs to the connection (for example autocommit and ansi) in connect_kwargs
extra you should bass those as booleans. Previously strings were also supported.

   "connect_kwargs": {
      "autocommit": "false",
      "ansi": "true"
   }

should become

   "connect_kwargs": {
      "autocommit": False,
      "ansi": True
   }

Does it look good @Goodkat ?

@Goodkat
Copy link
Contributor Author

Goodkat commented Jun 13, 2021

@potiuk I would better use true/false without capital letters:

   "connect_kwargs": {
      "autocommit": false,
      "ansi": true
   }

@potiuk
Copy link
Member

potiuk commented Jun 13, 2021

aaaaaaaa Python, python everywhere :).

@potiuk
Copy link
Member

potiuk commented Jun 13, 2021

Corrected. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers okay to merge It's ok to merge this PR as it does not require more tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OdbcHook string values in connect_kwargs dict converts to None
6 participants