Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate eventTime field in python client #1355

Merged
merged 1 commit into from Dec 1, 2022

Conversation

pawel-big-lebowski
Copy link
Contributor

Signed-off-by: Pawel Leszczynski leszczynski.pawel@gmail.com

Problem

eventTime field of a RunEvent is a non-validated string field in Python client, which is error-prone.

Closes: #2270

Solution

Validate date time on a python side.

Note: All schema changes require discussion. Please link the issue for context.

  • Your change modifies the core OpenLineage model
  • Your change modifies one or more OpenLineage facets

If you're contributing a new integration, please specify the scope of the integration and how/where it has been tested (e.g., Apache Spark integration supports S3 and GCS filesystem operations, tested with AWS EMR).

Checklist

  • You've signed-off your work
  • Your pull request title follows our guidelines
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • You've updated the CHANGELOG.md with details about your change under the "Unreleased" section (if relevant, depending on the change, this may not be necessary)
  • You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
  • You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2022 contributors to the OpenLineage project

@pawel-big-lebowski pawel-big-lebowski added client/python python Pull requests that update Python code labels Nov 30, 2022
@pawel-big-lebowski pawel-big-lebowski force-pushed the client/fix-python-datetime-validation branch from c79c1a0 to 68d6891 Compare November 30, 2022 08:24
@boring-cyborg boring-cyborg bot added the documentation Improvements or additions to documentation label Nov 30, 2022
@pawel-big-lebowski pawel-big-lebowski marked this pull request as ready for review November 30, 2022 08:25
@pawel-big-lebowski pawel-big-lebowski force-pushed the client/fix-python-datetime-validation branch from 68d6891 to e12519d Compare November 30, 2022 08:34
@JDarDagran
Copy link
Contributor

Let's use dateutil isoparse method instead of this complex regex. It adds one dependency to Python client.

@pawel-big-lebowski
Copy link
Contributor Author

pawel-big-lebowski commented Nov 30, 2022

Let's use dateutil isoparse method instead of this complex regex. It adds one dependency to Python client.

The Java definition on the server side of ZonedDateTime says:

A date-time with a time-zone in the ISO-8601 calendar system, such as 2007-12-03T10:15:30+01:00 Europe/Paris.

And this is failing when only a date in ISO-8601 provided. Dates like 2007-12-03 were the initial cause of the issue. The Python isoparse documentation claims:

An ISO-8601 datetime string consists of a date portion, followed optionally by a time portion

The term optionally is a problem I believe.

@JDarDagran
Copy link
Contributor

The Java definition on the server side of ZonedDateTime

I think that's the thing. It's server side, not included in spec. So either we change spec ("format": "date-time" in jsonschema applies to RFC 3339 does not have zones as in ZonedDateTime) or stick to what we have and use dateutil which is then enough for sure.

@pawel-big-lebowski pawel-big-lebowski force-pushed the client/fix-python-datetime-validation branch 3 times, most recently from 15cc614 to dd380c1 Compare December 1, 2022 14:22
@boring-cyborg boring-cyborg bot added the ci label Dec 1, 2022
Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
@pawel-big-lebowski pawel-big-lebowski force-pushed the client/fix-python-datetime-validation branch from dd380c1 to 1d978b5 Compare December 1, 2022 14:38
@JDarDagran JDarDagran merged commit a75165e into main Dec 1, 2022
@JDarDagran JDarDagran deleted the client/fix-python-datetime-validation branch December 1, 2022 18:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci client/python documentation Improvements or additions to documentation python Pull requests that update Python code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants