jaimeferj

Closes #2523

Rationale for this change

Are these changes tested?

yes

Are there any user-facing changes?

@jaimeferj
Author

Issue #2523 suggests deriving the class from IcebergBaseModel, which I have not done, but I could try that if my solution is not accepted.

@jaimeferj jaimeferj marked this pull request as draft October 3, 2025 16:56
@jaimeferj
Author

I have marked this as a draft because I am no longer sure this is the kind of implementation you want. Tests still pass with LiteralPredicate as a subclass of IcebergBaseModel, but I had to declare term as Term[L] rather than what I expected it to be (UnboundTerm[L]). Otherwise test_not_equal_to_invert and similar tests fail, since they pass a BoundTerm.

@jaimeferj jaimeferj marked this pull request as ready for review October 3, 2025 20:19
@jaimeferj
Author

jaimeferj commented Oct 5, 2025

One questionable thing I had to do to get the tests passing was to declare the term attribute as Term[L] instead of UnboundTerm[Any], which is how it is declared in UnboundPredicate, the parent class of LiteralPredicate. That also triggers a mypy error, since the attribute type now differs between parent and child.

The problem is that the earlier implementation called _to_unbound_term when initializing the instance. That function does not always return an UnboundTerm, as you can see from its implementation:

def _to_unbound_term(term: Union[str, UnboundTerm[Any]]) -> UnboundTerm[Any]:
    return Reference(term) if isinstance(term, str) else term

If term is neither an UnboundTerm nor a str, the output is whatever the input was. That is what happens in test_not_equal_to_invert, which initializes the predicate with a BoundReference. If term is declared as UnboundTerm[L] in the Pydantic model, that test fails:

_______________________________________________ test_not_equal_to_invert _______________________________________________

    def test_not_equal_to_invert() -> None:
>       bound = NotEqualTo(
            term=BoundReference(  # type: ignore
                field=NestedField(field_id=1, name="foo", field_type=StringType(), required=False),
                accessor=Accessor(position=0, inner=None),
            ),
            literal="hello",
        )

Should we address this in this PR or delegate it to another issue?

@Fokko
Contributor

Fokko commented Oct 8, 2025

Should we address this in this PR or delegate it to another issue?

That's a bit of an edge case, since you deliberately ignore the type annotation. We could add a check in the function itself:

def _to_unbound_term(term: Union[str, UnboundTerm[Any]]) -> UnboundTerm[Any]:
    if isinstance(term, str):
        return Reference(term)
    elif isinstance(term, UnboundTerm):
        return term
    else:
        # Reject anything that is neither a column name nor an unbound term.
        raise ValueError(f"Expected UnboundTerm or str, but got: {term}")
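To see what the stricter check would do with a bound term, here is a small self-contained sketch. The classes are hypothetical stand-ins rather than the pyiceberg ones, and the function is named to_unbound_term_strict only to keep it apart from the real helper.

```python
from typing import Any


class UnboundTerm:
    """Hypothetical stand-in for pyiceberg's UnboundTerm."""


class Reference(UnboundTerm):
    """Hypothetical stand-in for pyiceberg's Reference."""

    def __init__(self, name: str) -> None:
        self.name = name


class BoundReference:
    """Hypothetical stand-in for a bound term; not an UnboundTerm."""


def to_unbound_term_strict(term: Any) -> UnboundTerm:
    # Strings become references, unbound terms pass through, everything else is rejected.
    if isinstance(term, str):
        return Reference(term)
    elif isinstance(term, UnboundTerm):
        return term
    else:
        raise ValueError(f"Expected UnboundTerm or str, but got: {term}")


try:
    to_unbound_term_strict(BoundReference())
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```

Under this check, a test that passes a bound term, as test_not_equal_to_invert does, would presumably raise at construction time instead of silently accepting the value.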

@jaimeferj
Author

jaimeferj commented Oct 8, 2025

def _to_unbound_term(term: Union[str, UnboundTerm[Any]]) -> UnboundTerm[Any]:
    if isinstance(term, str):
        return Reference(term)
    elif isinstance(term, UnboundTerm):
        return term
    else:
        raise ValueError(f"Expected UnboundTerm or str, but got: {term}")

I am not the one ignoring the type annotation: it is the current implementation and an existing test that ignore it. I am only trying to implement what the issue asks for, and since Pydantic now checks the types at runtime, the (old) test no longer passes.

@jaimeferj jaimeferj force-pushed the feat/json-literal-predicate branch from 00bc5db to 5118748 Compare October 10, 2025 22:35
Comment on lines +759 to +764
def __init__(self, *args: Any, **kwargs: Any) -> None:
    if args:
        if len(args) != 2:
            raise TypeError("Expected (term, literal)")
        kwargs = {"term": args[0], "literal": args[1], **kwargs}
    super().__init__(**kwargs)
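The positional-to-keyword mapping in this __init__ can be tried without Pydantic. The sketch below assumes a hypothetical KeywordOnlyBase that only accepts keyword arguments, standing in for the model base class. Predicate is likewise an illustrative name.

```python
from typing import Any


class KeywordOnlyBase:
    """Hypothetical stand-in for a keyword-only base such as a Pydantic model."""

    def __init__(self, **kwargs: Any) -> None:
        for name, value in kwargs.items():
            setattr(self, name, value)


class Predicate(KeywordOnlyBase):
    def __init__(self, *args: Any, **kwargs: Any) -> None:
        if args:
            if len(args) != 2:
                raise TypeError("Expected (term, literal)")
            # Explicit keywords win over positionals because **kwargs is unpacked last.
            kwargs = {"term": args[0], "literal": args[1], **kwargs}
        super().__init__(**kwargs)


positional = Predicate("foo", "hello")
keyword = Predicate(term="foo", literal="hello")
print(positional.term, positional.literal)  # prints "foo hello"
```

Both call styles end up with the same attributes, but the signature itself carries no type information about term or literal.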
Author

I ran into many issues with an __init__ such as:

def __init__(self, term: Union[str, UnboundTerm[Any]], literals: Union[Iterable[L], Iterable[Literal[L]]]):
    super().__init__(term=_to_unbound_term(term), items=_to_literal_set(literals))

It led to typing errors around _transform_literal in pyiceberg/transforms.py, for example:

  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[str | None], str | None]"; expected "Callable[[str], str]"  [arg-type]
  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[bool | None], bool | None]"; expected "Callable[[str], str]"  [arg-type]
  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[int | None], int | None]"; expected "Callable[[str], str]"  [arg-type]
  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[float | None], float | None]"; expected "Callable[[str], str]"  [arg-type]
  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[bytes | None], bytes | None]"; expected "Callable[[str], str]"  [arg-type]
  pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[UUID | None], UUID | None]"; expected "Callable[[str], str]"  [arg-type]

So I decided to go with the *args/**kwargs __init__ shown above. The problem now is that the following type assertion fails:

assert_type(EqualTo("a", "b"), EqualTo[str])  # <-- Fails
------
  tests/expressions/test_expressions.py:1238: error: Expression is of type "LiteralPredicate[L]", not "EqualTo[str]"  [assert-type]

So I am really stuck. Would you mind lending a hand here, @Fokko?
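For comparison, here is a simplified sketch of how a typed signature lets a checker bind L from the arguments. It is not a diagnosis of the error above and not the pyiceberg classes: LiteralPredicate and EqualTo below are hypothetical, Pydantic-free stand-ins.

```python
from typing import Generic, TypeVar

L = TypeVar("L")


class LiteralPredicate(Generic[L]):
    """Hypothetical, simplified stand-in with a fully typed __init__."""

    def __init__(self, term: str, literal: L) -> None:
        self.term = term
        self.literal = literal


class EqualTo(LiteralPredicate[L]):
    """Hypothetical subclass that stays generic in L."""


predicate = EqualTo("a", "b")
# Because `literal: L` appears in the signature, a checker such as mypy
# should be able to infer EqualTo[str] for this expression.
print(predicate.term, predicate.literal)  # prints "a b"
```

With `*args: Any, **kwargs: Any` there is no parameter annotated with L, so the checker has nothing to infer the type variable from.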

Development

Successfully merging this pull request may close these issues.

Make LiteralPredicate predicate JSON serializable

2 participants