Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-35112][python] Fix membership for Row class PyFlink #24756

Merged
merged 1 commit into from
May 11, 2024

Conversation

wzorgdrager
Copy link
Contributor

What is the purpose of the change

This pull request adds support for a membership check for field names in a Row in PyFlink. If field names are not defined, it will check membership in the values. For example:

user_row = Row(name="Alice", age=22)
print("name" in user_row)
# Evaluates to True

user_row = Row("Alice", 22)
print("Alice" in user_row)
# Evaluates to True

This change is more in line with how dictionaries behave in Python.

Brief change log

  • Change in behaviour for Row class.
    • If field names are defined, KEY in row will check if KEY exists in field names.
    • If field names do not exist, KEY in row will check if KEY exists in field values (current behaviour).

Verifying this change

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@wzorgdrager wzorgdrager changed the title [FLINK-35112][Python] Fix membership for Row class PyFlink [FLINK-35112][python] Fix membership for Row class PyFlink May 5, 2024
@flinkbot
Copy link
Collaborator

flinkbot commented May 5, 2024

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@@ -177,7 +177,10 @@ def of_kind(row_kind: RowKind, *args, **kwargs):
return row

def __contains__(self, item):
return item in self._values
if hasattr(self, "_fields"):
return item in self._fields
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enhancement is great. I have one comment, the _fields could be a None by invoking set_field_names. So it should return False in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so then the question is, if _fields is None do we want to fallback and do a membership check for the values or always just return False?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wzorgdrager Thanks a lot for the PR. Personally I prefer to fallback to check for the values if _fields is None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dianfu I agree. I added the change now.

@dianfu dianfu merged commit b4d7114 into apache:master May 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants