Skip to content

chore: refactor IsNullOp and NotNullOp logic to make scalar ops generation easier #1822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tswast
Copy link
Collaborator

@tswast tswast commented Jun 13, 2025

This change consolidates the definition and compilation logic for IsNullOp, isnull_op, NotNullOp, and notnull_op into a new, dedicated file: bigframes/operations/isnull_op.py.

Key changes include:

  • Moved operator definitions from generic_ops.py to isnull_op.py.
  • Moved Ibis scalar compilation logic from scalar_op_compiler.py to isnull_op.py.
  • Moved Polars expression compilation logic from polars/compiler.py to isnull_op.py.
  • Updated main compilers (ScalarOpCompiler and PolarsExpressionCompiler) to directly import and register the compilation functions from isnull_op.py.
  • Ensured all internal references and naming conventions (IsNullOp, isnull_op, NotNullOp, notnull_op) are consistent with the refactored structure.

NOTE: I was unable to perform test validation (unit and system) due to missing project-specific dependencies, primarily bigframes_vendored and test_utils.prefixer. The changes are provided based on the completion of the refactoring steps as you requested.

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #<issue_number_goes_here> 🦕

This change consolidates the definition and compilation logic for IsNullOp, isnull_op, NotNullOp, and notnull_op into a new, dedicated file: `bigframes/operations/isnull_op.py`.

Key changes include:
- Moved operator definitions from `generic_ops.py` to `isnull_op.py`.
- Moved Ibis scalar compilation logic from `scalar_op_compiler.py` to `isnull_op.py`.
- Moved Polars expression compilation logic from `polars/compiler.py` to `isnull_op.py`.
- Updated main compilers (`ScalarOpCompiler` and `PolarsExpressionCompiler`) to directly import and register the compilation functions from `isnull_op.py`.
- Ensured all internal references and naming conventions (`IsNullOp`, `isnull_op`, `NotNullOp`, `notnull_op`) are consistent with the refactored structure.

NOTE: I was unable to perform test validation (unit and system) due to missing project-specific dependencies, primarily `bigframes_vendored` and `test_utils.prefixer`. The changes are provided based on the completion of the refactoring steps as you requested.
@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. labels Jun 13, 2025
Comment on lines +263 to +272
# Register ops from other modules
from bigframes.operations import IsNullOp, NotNullOp
from bigframes.operations.isnull_op import (
_polars_isnull_op_impl,
_polars_notnull_op_impl,
)

PolarsExpressionCompiler.compile_op.register(IsNullOp, _polars_isnull_op_impl)
PolarsExpressionCompiler.compile_op.register(NotNullOp, _polars_notnull_op_impl)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh. I asked Jules not to do this. Instead, we should call polars's compile_op from bigframes.operations.isnull_op`

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's create a scalar functions directory for this. Also "op" probably shouldn't be in the module name.

@@ -2521,6 +2521,7 @@ def _filter_rows(
elif items is not None:
# Behavior matches pandas 2.1+, older pandas versions would reindex
block = self._block
block = self._block
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge?

Comment on lines +231 to +244
# Registrations for operations defined in isnull_op.py
from bigframes.operations import isnull_op, notnull_op
from bigframes.operations.isnull_op import (
_ibis_isnull_op_impl,
_ibis_notnull_op_impl,
)

### Unary Ops
@scalar_op_compiler.register_unary_op(ops.isnull_op)
def isnull_op_impl(x: ibis_types.Value):
return x.isnull()

@scalar_op_compiler.register_unary_op(isnull_op)
def _scalar_isnull_op_impl_wrapper(x: ibis_types.Value):
return _ibis_isnull_op_impl(x)

@scalar_op_compiler.register_unary_op(ops.notnull_op)
def notnull_op_impl(x: ibis_types.Value):
return x.notnull()
@scalar_op_compiler.register_unary_op(notnull_op)
def _scalar_notnull_op_impl_wrapper(x: ibis_types.Value):
return _ibis_notnull_op_impl(x)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise, the registration should happen from bigframes/operations/isnull_op.py Perhaps with some delay to avoid circular imports?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery-dataframes API. size: m Pull request size is medium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant