Skip to content
This repository has been archived by the owner on Mar 29, 2023. It is now read-only.

Check for negative values before doing substr #32

Merged
merged 3 commits into from
Apr 27, 2021

Conversation

renato2099
Copy link
Contributor

@renato2099 renato2099 commented Apr 20, 2021

hey @tswast ,

I took a swing at https://github.com/ibis-project/ibis/issues/1629

What I'd have wanted was to use the ibis.backends.base.sql.string.registry.substring method but that requires changes to the ibis.backends.base.sql.registry.__init__.py and so on to make it visible here.
For the time being I copied the substring implementation here, but I could open a PR on the ibis repo to make the substring method visible, and then we could use it here directly.
Anyway, let me know what you think

@tswast tswast added the run-ci Code has been reviewed and is deemed safe to run label Apr 21, 2021
@tswast
Copy link
Collaborator

tswast commented Apr 21, 2021

Looking at the GitHub Actions, there is a little clean-up required still. The test you added is failing for some reason.

________________________________ test_substring ________________________________

    def test_substring():
        t = ibis.table([('value', 'string')], name='t')
        expr = t["value"].substr(3, -1)
        with pytest.raises(Exception) as exception_info:
>           ibis.bigquery.compile(expr)
E           Failed: DID NOT RAISE <class 'Exception'>

tests/unit/test_compiler.py:265: Failed

CC @datapythonista to comment on whether ibis.backends.base.sql.string.registry.substring can be made visible.

@tswast
Copy link
Collaborator

tswast commented Apr 21, 2021

Thanks for the contribution, by the way! And for adding tests :-)

@datapythonista
Copy link
Collaborator

To me, for the long term, it'd make sense that the operation registry is a class, and that all these translation functions are their methods. And that you can subclass the registry that makes sense (one that translates to SQL strings, or SQLAlchemy operations mainly), and overwrite what changes from a default.

So, making that public doesn't seem great, since I'd like to move it in the future.

Didn't check for bigquery, but do you create your operation registry (now a dict, not a class) from scratch, or do you use a copy of the base registry, and update it? I've seen both things depending on the backend.

If you create it from scratch, but want to reuse that function, you can start by the base, or also import the base, and use operation_registry[ops.Substring] which is public, and should be the function you want.

Does it make sense?

@tswast
Copy link
Collaborator

tswast commented Apr 21, 2021

Didn't check for bigquery, but do you create your operation registry (now a dict, not a class) from scratch, or do you use a copy of the base registry, and update it? I've seen both things depending on the backend.

Looks like ibis-bigquery starts with a copy of the base registry and updates it.

_operation_registry = {
**operation_registry,
}

@datapythonista
Copy link
Collaborator

Looks like ibis-bigquery starts with a copy of the base registry and updates it.

Shouldn't we then simply remove that operation from the update, and leave the orginal one? Not sure if I fully understood what was the problem here.

@renato2099
Copy link
Contributor Author

thanks for the discussion @datapythonista @tswast !
What I did was to update the registry to point to a specific bigquery function I introduced in the ibis_bigquery/compiler.py, but this substring function I introduced was basically a copy of ibis.backends.base.sql.string.registry.substring and I wanted to use that method instead of copying it over.

I wanted to do something like:

from ibis.backends.base.sql import (fixed_arity, literal, operation_registry,
                                    reduction, substring, unary).          <---- using substring here
....
def _string_substring(translator, expr):
    op = expr.op()
    arg, start, length = op.args
    if length.op().value < 0:
        raise ValueError('Length parameter should not be a negative value.')
        
    substring(translator, expr)

is there a better way to access the ibis.backends.base.sql.string.registry.substring?

@renato2099
Copy link
Contributor Author

also about the unit tests, I see a bunch of

>       result = ibis.bigquery.compile(expr)
E       AttributeError: module 'ibis' has no attribute 'bigquery'

I guess this is due to the move to the new repo, right? is there something else that I need to change?

@tswast
Copy link
Collaborator

tswast commented Apr 21, 2021

I guess this is due to the move to the new repo, right? is there something else that I need to change?

Hmm... The other tests have the same thing and were running okay. Maybe that's because we hadn't deleted the backend from the root repo yet when I last ran these? Filed #33 to clean these up.

@datapythonista
Copy link
Collaborator

is there a better way to access the ibis.backends.base.sql.string.registry.substring?

Does operation_registry[ops.Substring] work?

@datapythonista
Copy link
Collaborator

also about the unit tests, I see a bunch of

>       result = ibis.bigquery.compile(expr)
E       AttributeError: module 'ibis' has no attribute 'bigquery'

I guess this is due to the move to the new repo, right? is there something else that I need to change?

This should still work. Are you install this same package with pip? Like pip install -e <path>. This is required now, since the entrypoint needs to be installed to work.

@tswast
Copy link
Collaborator

tswast commented Apr 22, 2021

Now that I'm merged #37 if you pull the latest, the unit tests should actually execute the logic from this repo.

@renato2099
Copy link
Contributor Author

renato2099 commented Apr 22, 2021

done, let's wait for the ci workflow to run then 👍
thanks @tswast !

@renato2099
Copy link
Contributor Author

hey @tswast , do you mind approving the workflow to see if all tests/checks pass on this PR please?

@tswast tswast added run-ci Code has been reviewed and is deemed safe to run and removed run-ci Code has been reviewed and is deemed safe to run labels Apr 27, 2021
@tswast tswast merged commit d515184 into ibis-project:master Apr 27, 2021
@tswast
Copy link
Collaborator

tswast commented Apr 27, 2021

Looks like all the tests passed. Thanks for the contribution!

@renato2099 renato2099 deleted the 1629_bigquery_substr_length branch April 27, 2021 19:50
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
run-ci Code has been reviewed and is deemed safe to run
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants