[SPARK] add debug facet to help resolving Spark integration issues #2147

Merged
merged 1 commit into from Oct 6, 2023

Conversation

@pawel-big-lebowski pawel-big-lebowski commented Oct 2, 2023

Problem

Debugging openlineage-spark problems is a tedious job. We would like a debug facet that, when enabled, automatically collects meaningful information.

Closes: #2135

Solution

Note: All schema changes require discussion. Please link the issue for context.

  • Your change modifies the core OpenLineage model
  • Your change modifies one or more OpenLineage facets

If you're contributing a new integration, please specify the scope of the integration and how/where it has been tested (e.g., Apache Spark integration supports S3 and GCS filesystem operations, tested with AWS EMR).

One-line summary:

Checklist

  • You've signed-off your work
  • Your pull request title follows our guidelines
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • Your comment includes a one-liner for the changelog about the specific purpose of the change (if necessary)
  • You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
  • You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2023 contributors to the OpenLineage project

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
private SparkConfigDebugFacet buildSparkConfigDebugFacet() {
  return SparkConfigDebugFacet.builder()
      .extraListeners(getSparkConfOrNull("spark.extraListeners"))
      .openLineageConfig(getOpenLineageConfig())
Member

Should we remove the API key or any other secret?
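
One way to address this concern, sketched under assumptions: mask the values of keys that look sensitive before the config map is attached to the debug facet. The class name `ConfigRedactor`, the marker list, and the mask string are hypothetical and are not taken from the OpenLineage codebase; the config keys in `main` are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: masks values whose keys look like secrets.
final class ConfigRedactor {
    // Assumed markers; a real list would need to be agreed on by the maintainers.
    private static final String[] SENSITIVE_MARKERS = {
        "apikey", "api_key", "secret", "password", "token"
    };
    static final String MASK = "***";

    // Case-insensitive substring match on the key name.
    static boolean isSensitive(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        for (String marker : SENSITIVE_MARKERS) {
            if (lower.contains(marker)) {
                return true;
            }
        }
        return false;
    }

    // Returns a copy of the config with sensitive values replaced by MASK.
    static Map<String, String> redact(Map<String, String> conf) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            out.put(e.getKey(), isSensitive(e.getKey()) ? MASK : e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.openlineage.transport.type", "http");
        conf.put("spark.openlineage.transport.auth.apiKey", "abc123");
        System.out.println(redact(conf));
    }
}
```

Matching on key names is a blunt heuristic: it can miss secrets stored under unexpected keys, so an allow-list of known-safe keys may be the more conservative choice.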

Contributor Author


private boolean isOnClassPath(String aClass) {
  try {
    this.getClass().getClassLoader().loadClass(aClass);
Member

Isn't Class.forName() more appropriate here? Does it need to be in the same ClassLoader?

Contributor Author

We're not consistent across the codebase and use both approaches. (https://stackoverflow.com/questions/8100376/class-forname-vs-classloader-loadclass-which-to-use-for-dynamic-loading gives more details on the differences.)

"getClassLoader().loadClass" goes through the class loader that loaded the current class (often the system classloader, which is overridable) and does not initialise the class, while Class.forName initialises the class (runs its static initialisers).
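
The difference described above can be seen in a small sketch. `ClassLoadingDemo`, `Probe`, and `loadAndInitialize` are hypothetical names introduced for illustration; `isOnClassPath` here mirrors the shape of the method under review but is not the PR's code.

```java
// Hypothetical demo: loadClass finds a class without running its static
// initialiser, while Class.forName(String) also initialises it.
final class ClassLoadingDemo {
    // Flipped to true only when Probe's static initialiser runs.
    static boolean initialized = false;

    static class Probe {
        static {
            ClassLoadingDemo.initialized = true;
        }
    }

    // Presence check via the defining class loader; does not initialise the class.
    static boolean isOnClassPath(String className) {
        try {
            ClassLoadingDemo.class.getClassLoader().loadClass(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    // Presence check via Class.forName; initialises the class as a side effect.
    static boolean loadAndInitialize(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Build the binary name as a string so Probe is not touched directly.
        String probe = ClassLoadingDemo.class.getName() + "$Probe";
        System.out.println("found by loadClass: " + isOnClassPath(probe));
        System.out.println("initialised after loadClass: " + initialized);
        System.out.println("found by forName: " + loadAndInitialize(probe));
        System.out.println("initialised after forName: " + initialized);
    }
}
```

For a pure presence check, avoiding initialisation is usually the safer default, since static initialisers of third-party classes may have side effects. The three-argument `Class.forName(name, false, loader)` offers the same non-initialising behaviour with an explicit loader.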

Contributor Author

I've merged this PR, but I would be happy to continue the discussion, which could lead to another PR or a new issue.

@pawel-big-lebowski pawel-big-lebowski merged commit a93c45b into main Oct 6, 2023
20 checks passed
@pawel-big-lebowski pawel-big-lebowski deleted the spark/debug-facet branch October 6, 2023 11:33
Sheeri pushed a commit to Sheeri/OpenLineage that referenced this pull request Nov 22, 2023
…penLineage#2147)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>
harels added a commit that referenced this pull request Nov 22, 2023
* add proposal for OpenLineage registry

Signed-off-by: Julien Le Dem <julien@apache.org>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* fix headers

Signed-off-by: Julien Le Dem <julien@apache.org>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* improve clarity

Signed-off-by: Julien Le Dem <julien@apache.org>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Added a requirement
Trying out a commit in a new branch...Hopefully I did it correctly :D

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Added an entry for "core" to propose changes to the spec with a "known" starting point.

This entry:
---
root_doc_URL: "https://openlineage.io/spec/facets/"
produced_facets: [
  "ol:core:1-0-0/ColumnLineageDatasetFacet.json",
  "ol:core:1-0-1/ColumnLineageDatasetFacet.json",
  "ol:core:1-0-0/DataQualityAssertionsDatasetFacet.json"
]
---

indicates that the documentation for the produced facets are at:
https://openlineage.io/spec/facets/1-0-0/ColumnLineageDatasetFacet.json
https://openlineage.io/spec/facets/1-0-1/ColumnLineageDatasetFacet.json
https://openlineage.io/spec/facets/1-0-0/DataQualityAssertionsDatasetFacet.json

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Proposing a new formatting of producer/consumer objects instead of an array of strings.

I have modified the "core" example, to show the translation between old and new format.
Added an "egeria" example which is both a producer and consumer.

- Producer and consumer doc root URLs may differ, so they are set inside the producer/consumer object
- It is assumed that the documentation link only applies to the facets owned by the same entity
  - e.g. egeria owns NewCustomFacet so the docs are at the Egeria doc URL
  - egeria does not own ColumnLineageDatasetFacet so there are no docs at the Egeria doc URL (unless it's extended by egeria?)
- "sample_URL" is where the examples/tests can be found (recommended but not required)
- "owner" is added for clarification. egeria produces their own custom facet plus one facet from core in this example
- spec_versions array for compatibility
- use_cases to better create the documentation page
- since there are "consumer" and "producer" objects, "facets" replaces "produced_facets" and "consumed_facets"

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* adding another proposed requirement

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* added note about accuracy of registry entries

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* added Acceptance guidelines

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* added a note about reserving names for the future.

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Update registry.md

Removing open questions that we don't have an answer to

Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* bump 0.30.0 release date (#2002)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 0.30.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.0.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* ci: remove java macos arm parser build (#2003)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 0.30.1

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.0.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* fix changelog (#2005)

* fix changelog

Signed-off-by: Michael Robinson <merobi@gmail.com>

* add missing change

Signed-off-by: Michael Robinson <merobi@gmail.com>

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Remove $ref facets from core spec. (#1997)

Move from boon to jv.

Add test facets.

Add pre-commit usage guide.

Change facets versions bump to REVISION level.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* airflow: convert lineage from legacy file definition (#2006)

* airflow: convert lineage from legacy file definition

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>

* Update integration/airflow/openlineage/airflow/extractors/converters.py

Co-authored-by: JDarDagran <kuba0221@gmail.com>

---------

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Co-authored-by: JDarDagran <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Fix error message to avoid confusion (#2001)

Signed-off-by: Mars Lan <mars@metaphor.io>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* docs: add file transport documentation (#2008)

Signed-off-by: Alexandre Bergere <alexandre.bergere@datagalaxy.com>
Co-authored-by: Alexandre Bergere <alexandre.bergere@datagalaxy.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* airflow: make sure we cannot fail in thread despite direct execution (#2010)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Change log level to DEBUG when extractor isn't found (#2012)

There isn't much a user can do when an extractor is not even available for them to use. So changing this to DEBUG makes more sense IM

Signed-off-by: Kaxil Naik <kaxilnaik@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.gradle.test-retry from 1.5.3 to 1.5.4 in /integration/spark (#2021)

Bumps org.gradle.test-retry from 1.5.3 to 1.5.4.

---
updated-dependencies:
- dependency-name: org.gradle.test-retry
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.junit.jupiter:junit-jupiter in /client/java (#2022)

Bumps [org.junit.jupiter:junit-jupiter](https://github.com/junit-team/junit5) from 5.9.3 to 5.10.0.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump junit5Version from 5.9.3 to 5.10.0 in /integration/flink (#2015)

Bumps `junit5Version` from 5.9.3 to 5.10.0.

Updates `org.junit.jupiter:junit-jupiter` from 5.9.3 to 5.10.0
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

Updates `org.junit.jupiter:junit-jupiter-params` from 5.9.3 to 5.10.0
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.junit.jupiter:junit-jupiter-params
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.junit:junit-bom from 5.9.3 to 5.10.0 in /integration/flink (#2016)

Bumps [org.junit:junit-bom](https://github.com/junit-team/junit5) from 5.9.3 to 5.10.0.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

---
updated-dependencies:
- dependency-name: org.junit:junit-bom
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* http, snowflake: stop using reusable session by default, do not send full event on snowflake complete (#2025)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* add missing changes to changelog for 1.0.0 release (#2027)

* add missing changes to changelog for 1.0.0 release

Signed-off-by: Michael Robinson <merobi@gmail.com>

* add missing changes to changelog for 1.0.0 release continued

Signed-off-by: Michael Robinson <merobi@gmail.com>

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.0.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.1.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] filter unwanted events (#1987)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.junit:junit-bom from 5.9.3 to 5.10.0 in /integration/spark (#2032)

Bumps [org.junit:junit-bom](https://github.com/junit-team/junit5) from 5.9.3 to 5.10.0.
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

---
updated-dependencies:
- dependency-name: org.junit:junit-bom
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] merge into delta integration test (#2026)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [FLINK] read configuration from flink conf (#2033)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump psycopg2-binary from 2.9.6 to 2.9.7 in /integration/airflow (#2031)

Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.6 to 2.9.7.
- [Changelog](https://github.com/psycopg/psycopg2/blob/master/NEWS)
- [Commits](https://github.com/psycopg/psycopg2/compare/2.9.6...2.9.7)

---
updated-dependencies:
- dependency-name: psycopg2-binary
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Don't use database as fallback when no schema parsed. (#2023)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* add javadoc to the java client (#2004)

* add javadoc to the java client

Signed-off-by: Julien Le Dem <julien@apache.org>

* change for compiler version compat

Signed-off-by: Julien Le Dem <julien@apache.org>

* fix javadoc

Signed-off-by: Julien Le Dem <julien@apache.org>

---------

Signed-off-by: Julien Le Dem <julien@apache.org>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* spark: fix wrong naming of JDBC datasets (#2035)

* spark: jdbc namespaces should not have database in them

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>

* tests tests

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>

---------

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* ci: bump Go image version used in root CI job (#2047)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] append output dataset name to job name (#2036)

* [SPARK] append output dataset name to job name

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>

* [SPARK] use dot separator within job name parts

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>

---------

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Add codespell pre-commit hook. (#2011)

* Add codespell pre-commit hook.

Fix mispells.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

* Shorten allow list.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

---------

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] upgrade latest supported version to 3.4.1 (#2057)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump junit5Version from 5.9.3 to 5.10.0 in /integration/spark (#2018)

Bumps `junit5Version` from 5.9.3 to 5.10.0.

Updates `org.junit.jupiter:junit-jupiter` from 5.9.3 to 5.10.0
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

Updates `org.junit.jupiter:junit-jupiter-params` from 5.9.3 to 5.10.0
- [Release notes](https://github.com/junit-team/junit5/releases)
- [Commits](https://github.com/junit-team/junit5/compare/r5.9.3...r5.10.0)

---
updated-dependencies:
- dependency-name: org.junit.jupiter:junit-jupiter
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.junit.jupiter:junit-jupiter-params
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] replace dbfs init scripts (#2055)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [FLINK] fix a bug when getting schema for KafkaSink (#2042)

* fix a bug when getting schema for KafkaSink

Signed-off-by: pentium3 <celerond@msn.com>

* fix a bug when getting schema for KafkaSink

Signed-off-by: pentium3 <celerond@msn.com>

---------

Signed-off-by: pentium3 <celerond@msn.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bug/fix ignored event adaptive spark plan databricks (#2061)

* removed/adaptive_spark_plan from excludedNodes of DatabricksEventFilter. see :https://github.com/OpenLineage/OpenLineage/issues/2058

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>

* added Databricks integration tests testNarrowTransformation and testWideTransformation related to
https://github.com/OpenLineage/OpenLineage/issues/2058
Signed-off-by: Abdallah Terrab <abdallah.terrab.partner@decathlon.com>

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>

* gradlew :app:spotlessApply

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>

* gradlew spotlessApply

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>

* gradlew spotlessApply
w/ java version "1.8.0_381"

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>

---------

Signed-off-by: Abdallah Terrab <abdallah@terrab.me>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update changelog for 1.1.0 (#2062)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.1.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.2.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump io.confluent:kafka-schema-registry-client in /integration/flink (#2068)

Bumps [io.confluent:kafka-schema-registry-client](https://github.com/confluentinc/schema-registry) from 7.4.1 to 7.5.0.
- [Commits](https://github.com/confluentinc/schema-registry/compare/v7.4.1...v7.5.0)

---
updated-dependencies:
- dependency-name: io.confluent:kafka-schema-registry-client
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump testcontainersVersion from 1.18.1 to 1.19.0 in /integration/spark (#2064)

Bumps `testcontainersVersion` from 1.18.1 to 1.19.0.

Updates `org.testcontainers:junit-jupiter` from 1.18.1 to 1.19.0
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.18.1...1.19.0)

Updates `org.testcontainers:postgresql` from 1.18.1 to 1.19.0
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.18.1...1.19.0)

Updates `org.testcontainers:mockserver` from 1.18.1 to 1.19.0
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.18.1...1.19.0)

Updates `org.testcontainers:kafka` from 1.18.1 to 1.19.0
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.18.1...1.19.0)

---
updated-dependencies:
- dependency-name: org.testcontainers:junit-jupiter
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.testcontainers:postgresql
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.testcontainers:mockserver
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: org.testcontainers:kafka
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.xerial:sqlite-jdbc in /integration/spark (#2065)

Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.42.0.0 to 3.42.0.1.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.42.0.0...3.42.0.1)

---
updated-dependencies:
- dependency-name: org.xerial:sqlite-jdbc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.codehaus.groovy:groovy-all in /integration/flink (#2070)

Bumps [org.codehaus.groovy:groovy-all](https://github.com/apache/groovy) from 3.0.18 to 3.0.19.
- [Commits](https://github.com/apache/groovy/commits)

---
updated-dependencies:
- dependency-name: org.codehaus.groovy:groovy-all
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump io.confluent:kafka-avro-serializer in /integration/flink (#2069)

Bumps [io.confluent:kafka-avro-serializer](https://github.com/confluentinc/schema-registry) from 7.4.1 to 7.5.0.
- [Commits](https://github.com/confluentinc/schema-registry/compare/v7.4.1...v7.5.0)

---
updated-dependencies:
- dependency-name: io.confluent:kafka-avro-serializer
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update jackson (#2071)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.openapi.generator from 6.6.0 to 7.0.0 in /client/java (#2066)

Bumps org.openapi.generator from 6.6.0 to 7.0.0.

---
updated-dependencies:
- dependency-name: org.openapi.generator
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Ci/fix pre commit step (#2072)

* Change python to python3.8.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

* Bump cache version due to change in cimg image.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

---------

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] fix RDD missing inputs (#2039)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.diffplug.spotless from 6.20.0 to 6.21.0 in /integration/flink (#2080)

Bumps com.diffplug.spotless from 6.20.0 to 6.21.0.

---
updated-dependencies:
- dependency-name: com.diffplug.spotless
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.xerial:sqlite-jdbc in /integration/spark (#2077)

Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.42.0.1 to 3.43.0.0.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.42.0.1...3.43.0.0)

---
updated-dependencies:
- dependency-name: org.xerial:sqlite-jdbc
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.github.tomakehurst:wiremock in /integration/flink (#2079)

Bumps [com.github.tomakehurst:wiremock](https://github.com/wiremock/wiremock) from 2.27.2 to 3.0.1.
- [Release notes](https://github.com/wiremock/wiremock/releases)
- [Commits](https://github.com/wiremock/wiremock/compare/2.27.2...3.0.1)

---
updated-dependencies:
- dependency-name: com.github.tomakehurst:wiremock
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [FLINK] don't send RUNNING events after COMPLETE (#2075)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* http: use non-deprecated apiKey if loading it from env variables (#2029)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump pytest from 7.4.0 to 7.4.1 in /integration/airflow (#2081)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.0 to 7.4.1.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.4.0...7.4.1)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* fix: serde filtering (#2044)

* fix: serde filtering

Signed-off-by: Xiang Li <stevenlix1026@gmail.com>

* Rewrite lambda function.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>

---------

Signed-off-by: Xiang Li <stevenlix1026@gmail.com>
Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Co-authored-by: Xiang Li <stevenlix1026@gmail.com>
Co-authored-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Remove `sqlparser` main dependency in ifaces. (#2090)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* spark: publish ProcessingEngineRunFacet (#2089)

* spark: publish ProcessingEngineRunFacet

Previously, the Spark integration published a custom facet named
'SparkVersion'. As it is custom, it isn't defined in the OpenLineage
spec. However, the OpenLineage spec defines a ProcessingEngineRunFacet
that is meant to capture details about the things that runs a job.

This change introduces this in the form of the
'SparkProcessingEngineRunFacetBuilder' and
'SparkProcessingEngineRunFacetBuilderDelegate'.

As the names suggest, these classes are meant to create and populate
ProcessingEngineRunFacet.

The reason for the existence of the delegate is because there are two
code paths that interact with the run facets, namely the code path
within the RddExecutionContext and the other in the
SparkSqlExecutionContext, albeit via a very roundabout way.

The delegate is the object that actually constructs the facet, whilst
the builder provides an adapter that uses the CustomFacetBuilder
interface.

Yes, it's a hacky way of doing it and may need to be changed in the
future. For now though, its good enough.

Closes: https://github.com/OpenLineage/OpenLineage/issues/2086
Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>

* spark: Deprecated the SparkVersionFacet, alongside the version-facet.json

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>

* spark: Used a mocked SparkContext instead inside SparkProcessingEngineFacetBuilderTest

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>

* changelog: Updated the changelog indicating that the SparkVersionFacet will be removed in 1.4.0

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>

---------

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK][FLINK] Unify dataset naming from URI objects. (#2083)

* [SPARK][FLINK] Unify dataset naming from URI objects.

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>

* [SPARK] move DatasetIdentifier to openlineage-java

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>

---------

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* fix ol proxy chart (#2091)

Signed-off-by: Harel Shein <harel.shein@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.slf4j:slf4j-simple from 2.0.7 to 2.0.9 in /client/java (#2096)

Bumps org.slf4j:slf4j-simple from 2.0.7 to 2.0.9.

---
updated-dependencies:
- dependency-name: org.slf4j:slf4j-simple
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] verify dataset naming on databricks and limit amount of events sent (#2076)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [CI] fix circle CI caches (#2101)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Capture clusterAllTags variable from databricks (#2099)

* capture clusterAllTags var

Signed-off-by: anirudh.shrinivason <anirudh.shrinivason@grabtaxi.com>

* Update changelog

Signed-off-by: anirudh.shrinivason <anirudh.shrinivason@grabtaxi.com>

* Changelog nit

Signed-off-by: anirudh.shrinivason <anirudh.shrinivason@grabtaxi.com>

---------

Signed-off-by: anirudh.shrinivason <anirudh.shrinivason@grabtaxi.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* python: fix custom http transport TokenProvider (#2100)

* (init)

Signed-off-by: John Lukenoff <johnlukenoff@asana.com>

* unlint

Signed-off-by: John Lukenoff <johnlukenoff@asana.com>

---------

Signed-off-by: John Lukenoff <johnlukenoff@asana.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] fix failing S3 test on main (#2102)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* fix: Support parsing dbt dbt_project.yml without target-path (#2106)

As of dbt v1.5, using target-path in the dbt_project.yml file is deprecated in favor of a CLI flag or environment variable. It will be removed in a future version. See dbt-labs/dbt-core#6882

Docs: https://docs.getdbt.com/reference/project-configs/target-path

This change allows users to run DbtLocalArtifactProcessor in dbt projects that don't declare target-path

Fix: #2093

Signed-off-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* docs: Add openlineage-integration-common PyPI links (#2108)

Signed-off-by: Tatiana Al-Chueyr <tatiana.alchueyr@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update changelog for 1.2.0 (#2111)

* update changelog for 1.2.0

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update changelog (#2112)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.2.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.3.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [CI] fix checksum for release sql java (#2114)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update changelog for 1.2.1 (#2119)

* update changelog for 1.2.1

Signed-off-by: Michael Robinson <merobi@gmail.com>

* revert changelog change

Signed-off-by: Michael Robinson <merobi@gmail.com>

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.2.1

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.3.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* revert bump: org.openapi.generator required JDK 11 (#2113)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* update changelog for 1.2.2 (#2120)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.2.2

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.3.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Update the openlineage-java client's documentation (#2123)

- Added missing transports documentation to the README.md
- Streamlined descriptions for clarity and consistency across all transport types.
- Organized configuration details and examples for better readability.
- Highlighted key notes and behaviors for each transport.

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.projectlombok:lombok in /integration/spark (#2125)

Bumps [org.projectlombok:lombok](https://github.com/projectlombok/lombok) from 1.18.28 to 1.18.30.
- [Changelog](https://github.com/projectlombok/lombok/blob/master/doc/changelog.markdown)
- [Commits](https://github.com/projectlombok/lombok/compare/v1.18.28...v1.18.30)

---
updated-dependencies:
- dependency-name: org.projectlombok:lombok
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.projectlombok:lombok in /integration/flink (#2128)

Bumps [org.projectlombok:lombok](https://github.com/projectlombok/lombok) from 1.18.28 to 1.18.30.
- [Changelog](https://github.com/projectlombok/lombok/blob/master/doc/changelog.markdown)
- [Commits](https://github.com/projectlombok/lombok/compare/v1.18.28...v1.18.30)

---
updated-dependencies:
- dependency-name: org.projectlombok:lombok
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.openapi.generator from 6.6.0 to 7.0.1 in /client/java (#2126)

Bumps org.openapi.generator from 6.6.0 to 7.0.1.

---
updated-dependencies:
- dependency-name: org.openapi.generator
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.projectlombok:lombok from 1.18.28 to 1.18.30 in /client/java (#2127)

Bumps [org.projectlombok:lombok](https://github.com/projectlombok/lombok) from 1.18.28 to 1.18.30.
- [Changelog](https://github.com/projectlombok/lombok/blob/master/doc/changelog.markdown)
- [Commits](https://github.com/projectlombok/lombok/compare/v1.18.28...v1.18.30)

---
updated-dependencies:
- dependency-name: org.projectlombok:lombok
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [CI] modify CircleCi resource class for macos (#2133)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.gradle.test-retry from 1.5.4 to 1.5.5 in /integration/spark (#2116)

Bumps org.gradle.test-retry from 1.5.4 to 1.5.5.

---
updated-dependencies:
- dependency-name: org.gradle.test-retry
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* add timers to ol emit calls (#1845)

Add tests for stats.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump pytest from 7.4.1 to 7.4.2 in /integration/airflow (#2094)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 7.4.1 to 7.4.2.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest/compare/7.4.1...7.4.2)

---
updated-dependencies:
- dependency-name: pytest
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* dbt: Add SQLSERVER to supported dbt profile types (#2136)

* Add SQLSERVER to supported dbt profile types

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

* Signed commit

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

* Update integration/common/openlineage/common/provider/dbt/processor.py

Co-authored-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

---------

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>
Co-authored-by: Erik Alfthan <erik.alfthan@kommuninvest.se>
Co-authored-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* #2130 Add columns as schema facet for airflow.lineage.Table (if defined) (#2138)

* #2130 Add columns as schema facet for airflow.lineage.Table (if defined)

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

* #2130 Format import - revert black formatting on untouched test

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

* #2130 Format import - revert black formatting on untouched function

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

* Apply ruff sort

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>

---------

Signed-off-by: Erik Alfthan <erik.alfthan@kommuninvest.se>
Co-authored-by: Erik Alfthan <erik.alfthan@kommuninvest.se>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Updated the README.md of the sql parser (#2140)

This just surfaces the list of supported dialects to make it more accessible to readers.

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Add more graceful logging when no OL provider installed. (#2141)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Fix find-links path in tox. (#2139)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump psycopg2-binary from 2.9.7 to 2.9.8 in /integration/airflow (#2146)

Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.7 to 2.9.8.
- [Changelog](https://github.com/psycopg/psycopg2/blob/master/NEWS)
- [Commits](https://github.com/psycopg/psycopg2/compare/2.9.7...2.9.8)

---
updated-dependencies:
- dependency-name: psycopg2-binary
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Spark: Fixed scheme preservation bug in PathUtils (#2142)

Summary:

Fixed a bug in the PathUtils' prepareDatasetIdentifierFromDefaultTablePath(CatalogTable) method, ensuring the correct scheme preservation from the CatalogTable's location.

Details:

Previously, when generating a DatasetIdentifier from a CatalogTable's default path, the scheme (like "hdfs") could be incorrectly set to "file". This fix addresses the issue, ensuring that the proper scheme from the CatalogTable's location is always preserved.
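
The intended behavior can be illustrated with a small sketch. It does not reproduce the PathUtils code: `SchemePreservationSketch` and `namespaceOf` are hypothetical names, and the rule that a location with no scheme falls back to "file" is an assumption made for the example.

```java
import java.net.URI;

class SchemePreservationSketch {
  // Hypothetical helper: derive a dataset namespace from a table location,
  // keeping the location's own scheme instead of defaulting to "file".
  static String namespaceOf(URI location) {
    String scheme = location.getScheme();
    if (scheme == null) {
      // Assumption: only a location without any scheme falls back to "file".
      return "file";
    }
    String authority = location.getAuthority();
    return authority == null ? scheme : scheme + "://" + authority;
  }

  public static void main(String[] args) {
    URI hdfsLocation = URI.create("hdfs://namenode:8020/warehouse/db.db/tbl");
    URI localLocation = URI.create("/tmp/warehouse/tbl");
    System.out.println(namespaceOf(hdfsLocation));
    System.out.println(namespaceOf(localLocation));
  }
}
```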

Impact:

This fix ensures the accuracy and correctness of the DatasetIdentifier's namespace.

Testing:

A unit test was added in PathUtilsTest#testFromCatalogTableShouldReturnADatasetIdentifierWithTheActualScheme

Issue: https://github.com/OpenLineage/OpenLineage/issues/2132

Signed-off-by: Damien Hawes <d-m-h@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.diffplug.spotless from 6.21.0 to 6.22.0 in /integration/flink (#2145)

Bumps com.diffplug.spotless from 6.21.0 to 6.22.0.

---
updated-dependencies:
- dependency-name: com.diffplug.spotless
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.apache.avro:avro from 1.11.2 to 1.11.3 in /integration/flink (#2144)

Bumps org.apache.avro:avro from 1.11.2 to 1.11.3.

---
updated-dependencies:
- dependency-name: org.apache.avro:avro
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] support for Spark 3.5 (#2118)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Update the changelog (#2148)

* Updates the changelog.

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.3.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.4.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Reverts org.openapi.generator version to enable Java client release. (#2152)

* Reverts org.openapi.generator version to enable java client release.

Signed-off-by: Michael Robinson <merobi@gmail.com>

* Configures dependabot to ignore org.openapi.generator.

Signed-off-by: Michael Robinson <merobi@gmail.com>

* Updates changelog for 1.3.1.

Signed-off-by: Michael Robinson <merobi@gmail.com>

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.3.1

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.4.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [Flink] expand iceberg source types (#2149)

Signed-off-by: Zhenqiu Huang <huangzhenqiu0825@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump testcontainersVersion from 1.19.0 to 1.19.1 in /integration/spark (#2154)

Bumps `testcontainersVersion` from 1.19.0 to 1.19.1.

Updates `org.testcontainers:junit-jupiter` from 1.19.0 to 1.19.1
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.19.0...1.19.1)

Updates `org.testcontainers:postgresql` from 1.19.0 to 1.19.1
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.19.0...1.19.1)

Updates `org.testcontainers:mockserver` from 1.19.0 to 1.19.1
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.19.0...1.19.1)

Updates `org.testcontainers:kafka` from 1.19.0 to 1.19.1
- [Release notes](https://github.com/testcontainers/testcontainers-java/releases)
- [Changelog](https://github.com/testcontainers/testcontainers-java/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testcontainers/testcontainers-java/compare/1.19.0...1.19.1)

---
updated-dependencies:
- dependency-name: org.testcontainers:junit-jupiter
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.testcontainers:postgresql
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.testcontainers:mockserver
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: org.testcontainers:kafka
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.scala-lang.modules:scala-collection-compat_2.12 (#2157)

Bumps [org.scala-lang.modules:scala-collection-compat_2.12](https://github.com/scala/scala-collection-compat) from 2.1.2 to 2.11.0.
- [Release notes](https://github.com/scala/scala-collection-compat/releases)
- [Commits](https://github.com/scala/scala-collection-compat/compare/v2.1.2...v2.11.0)

---
updated-dependencies:
- dependency-name: org.scala-lang.modules:scala-collection-compat_2.12
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Allow setting client's endpoint via environment variable (#2151)

* Allow setting client's endpoint via environment variable

Signed-off-by: Mars Lan <mars.th.lan@gmail.com>

* Fix lint errors

Signed-off-by: Mars Lan <mars.th.lan@gmail.com>

---------

Signed-off-by: Mars Lan <mars.th.lan@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.gradle.test-retry from 1.5.5 to 1.5.6 in /integration/spark (#2156)

Bumps org.gradle.test-retry from 1.5.5 to 1.5.6.

---
updated-dependencies:
- dependency-name: org.gradle.test-retry
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump psycopg2-binary from 2.9.8 to 2.9.9 in /integration/airflow (#2155)

Bumps [psycopg2-binary](https://github.com/psycopg/psycopg2) from 2.9.8 to 2.9.9.
- [Changelog](https://github.com/psycopg/psycopg2/blob/master/NEWS)
- [Commits](https://github.com/psycopg/psycopg2/compare/2.9.8...2.9.9)

---
updated-dependencies:
- dependency-name: psycopg2-binary
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] add debug facet to help resolving Spark integration issues (#2147)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.apache.kafka:kafka-clients from 3.5.1 to 3.6.0 in /client/java (#2172)

Bumps org.apache.kafka:kafka-clients from 3.5.1 to 3.6.0.

---
updated-dependencies:
- dependency-name: org.apache.kafka:kafka-clients
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump io.confluent:kafka-avro-serializer in /integration/flink (#2174)

Bumps [io.confluent:kafka-avro-serializer](https://github.com/confluentinc/schema-registry) from 7.5.0 to 7.5.1.
- [Commits](https://github.com/confluentinc/schema-registry/compare/v7.5.0...v7.5.1)

---
updated-dependencies:
- dependency-name: io.confluent:kafka-avro-serializer
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.apache.kafka:kafka-clients in /integration/spark (#2171)

Bumps org.apache.kafka:kafka-clients from 3.5.1 to 3.6.0.

---
updated-dependencies:
- dependency-name: org.apache.kafka:kafka-clients
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Enable nessie rest catalog (#2165)

Add a new case to the if-else statement in the getDatasetIdentifier() method to handle Nessie catalogs.
#2084

Signed-off-by: WINKJUL <julius.winkelmann@mercedes-benz.com>
Co-authored-by: WINKJUL <julius.winkelmann@mercedes-benz.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump io.confluent:kafka-schema-registry-client in /integration/flink (#2173)

Bumps [io.confluent:kafka-schema-registry-client](https://github.com/confluentinc/schema-registry) from 7.5.0 to 7.5.1.
- [Commits](https://github.com/confluentinc/schema-registry/compare/v7.5.0...v7.5.1)

---
updated-dependencies:
- dependency-name: io.confluent:kafka-schema-registry-client
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Updates changelog for 1.4.0. (#2178)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.4.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.5.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Remove `deref()` from openlineage-sql impl. (#2179)

Fix misspelling in changelog.

Fix `unwrap_or_else`.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Updates changelog for 1.4.1. (#2180)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.4.1

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.5.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] migrate RddExecutionContext to PlanUtils (#2181)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Skip redaction on `ColumnLineageDatasetFacetFieldsAdditionalInputFields`. (#2177)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.amazonaws:amazon-kinesis-producer in /client/java (#2196)

Bumps [com.amazonaws:amazon-kinesis-producer](https://github.com/awslabs/amazon-kinesis-producer) from 0.15.7 to 0.15.8.
- [Release notes](https://github.com/awslabs/amazon-kinesis-producer/releases)
- [Changelog](https://github.com/awslabs/amazon-kinesis-producer/blob/master/CHANGELOG.md)
- [Commits](https://github.com/awslabs/amazon-kinesis-producer/compare/v0.15.7...v0.15.8)

---
updated-dependencies:
- dependency-name: com.amazonaws:amazon-kinesis-producer
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.github.davidmc24.gradle.plugin:gradle-avro-plugin (#2193)

Bumps [com.github.davidmc24.gradle.plugin:gradle-avro-plugin](https://github.com/davidmc24/gradle-avro-plugin) from 1.8.0 to 1.9.1.
- [Release notes](https://github.com/davidmc24/gradle-avro-plugin/releases)
- [Changelog](https://github.com/davidmc24/gradle-avro-plugin/blob/master/CHANGES.md)
- [Commits](https://github.com/davidmc24/gradle-avro-plugin/compare/1.8.0...1.9.1)

---
updated-dependencies:
- dependency-name: com.github.davidmc24.gradle.plugin:gradle-avro-plugin
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.github.davidmc24.gradle.plugin.avro in /integration/flink (#2195)

Bumps [com.github.davidmc24.gradle.plugin.avro](https://github.com/davidmc24/gradle-avro-plugin) from 1.8.0 to 1.9.1.
- [Release notes](https://github.com/davidmc24/gradle-avro-plugin/releases)
- [Changelog](https://github.com/davidmc24/gradle-avro-plugin/blob/master/CHANGES.md)
- [Commits](https://github.com/davidmc24/gradle-avro-plugin/compare/1.8.0...1.9.1)

---
updated-dependencies:
- dependency-name: com.github.davidmc24.gradle.plugin.avro
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] fix duplicate COMPLETE events (#2103)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [Flink] support Flink cassandra lineage (#2175)

Signed-off-by: Zhenqiu Huang <huangzhenqiu0825@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* athena: change dataset name to its location (#2167)

* fix ci

athena: change dataset name to its location

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* add s3 location in symlink facets

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* add test for athena extractor

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* fix ci

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* constrain airflow version

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* small fix

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

* fix typo in version

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>

---------

Signed-off-by: dkt-sophie-ly <sophie.ly@decathlon.com>
Co-authored-by: dkt-sophie-ly <sophie.ly@decathlon.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* minor changes to spark/README.md file to avoid (#2202)

hiccups in first-time setup of spark integration

Signed-off-by: savan navalgi <savan.navalgi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.apache.logging.log4j:log4j-slf4j-impl in /integration/flink (#2205)

Bumps org.apache.logging.log4j:log4j-slf4j-impl from 2.20.0 to 2.21.0.

---
updated-dependencies:
- dependency-name: org.apache.logging.log4j:log4j-slf4j-impl
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump com.typesafe:config from 1.4.2 to 1.4.3 in /integration/flink (#2206)

Bumps [com.typesafe:config](https://github.com/lightbend/config) from 1.4.2 to 1.4.3.
- [Release notes](https://github.com/lightbend/config/releases)
- [Changelog](https://github.com/lightbend/config/blob/main/NEWS.md)
- [Commits](https://github.com/lightbend/config/compare/v1.4.2...v1.4.3)

---
updated-dependencies:
- dependency-name: com.typesafe:config
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.xerial:sqlite-jdbc in /integration/spark (#2207)

Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.43.0.0 to 3.43.2.1.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.43.0.0...3.43.2.1)

---
updated-dependencies:
- dependency-name: org.xerial:sqlite-jdbc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] write scala integration test (#2188)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] support databricks 13.3. (#2185)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Update fluentd proxy to validate against 2.0 spec. (#2213)

Add unit tests to CI.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Add always step. (#2182)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Loosen attrs and requests versions. (#2107)

Remove unnecessary dependency in openlineage-airflow.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [SPARK] fix bitnami image hash (#2216)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.apache.logging.log4j:log4j-slf4j-impl in /integration/flink (#2217)

Bumps org.apache.logging.log4j:log4j-slf4j-impl from 2.21.0 to 2.21.1.

---
updated-dependencies:
- dependency-name: org.apache.logging.log4j:log4j-slf4j-impl
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Add script to dev for generating release docs for the website (#2219)

* Adds script for generating release doc.

Signed-off-by: Michael Robinson <merobi@gmail.com>

* Adds docstring about purpose of script.

Signed-off-by: Michael Robinson <merobi@gmail.com>

---------

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Render yaml configs lazily. (#2221)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare for release 1.5.0

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Prepare next development version 1.6.0-SNAPSHOT

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Updates the changelog for 1.5.0. (#2224)

Signed-off-by: Michael Robinson <merobi@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Bump org.xerial:sqlite-jdbc in /integration/spark (#2231)

Bumps [org.xerial:sqlite-jdbc](https://github.com/xerial/sqlite-jdbc) from 3.43.2.1 to 3.43.2.2.
- [Release notes](https://github.com/xerial/sqlite-jdbc/releases)
- [Changelog](https://github.com/xerial/sqlite-jdbc/blob/master/CHANGELOG)
- [Commits](https://github.com/xerial/sqlite-jdbc/compare/3.43.2.1...3.43.2.2)

---
updated-dependencies:
- dependency-name: org.xerial:sqlite-jdbc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* upgrade gradle and jackson (#2233)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Pin yq version. (#2235)

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* revert #2216 as it is no longer required (#2234)

Signed-off-by: Pawel Leszczynski <leszczynski.pawel@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* [FLINK] add option for flink job listener to read from flink conf (#2229)

* allow flink job listener to read config values from stream execution environment

Signed-off-by: ensctom <tom_ou_yang@hotmail.com>

* add changelog and update flink readme

Signed-off-by: ensctom <tom_ou_yang@hotmail.com>

---------

Signed-off-by: ensctom <tom_ou_yang@hotmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* spec: add clarity to snowflake naming docs (#2223)

Signed-off-by: David Goss <david.goss@matillion.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* add Kafka to naming schema (#2226)

Add missing Kafka to Naming.md

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Update README.md (#2236)

Add latest supported Spark version

---------

Signed-off-by: Julien Le Dem <julien@apache.org>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* Run always workflow really always. (#2238)

Fix spelling.

Signed-off-by: Jakub Dardzinski <kuba0221@gmail.com>
Signed-off-by: Sheeri K. Cabral <me@sheeri.com>

* dagster: support dagster 1.5.x (#2220)

* fix(dagster): fix missing imports in conftest.py

fix missing imports in conftest.py

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043
Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* fix(dagster): fix missing argument and AttributeError

EventRecordsFilter takes event_type as a mandatory argument, so we
have to get event records separately for each event_type.

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* test(dagster): correct tests for utils.py

correct tests for utils.py

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* test(dagster): correct tests for sensor evaluation

correct tests for sensor evaluation

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* fix(dagster): fix missing arguments for sensor

The sensor factory method now accepts an additional event_type argument,
defaulting to PIPELINE_EVENTS and STEP_EVENTS, for filtering event
records.

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* docs(dagster): correct requirements for dagster version

correct requirements for Dagster version to 0.15.0+

---

Signed-off-by: George T. C., Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* fix(dagster): support for Dagster version >=1.0.0

support for Dagster version >=1.0.0

---

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* test(dagster): create DagsterRun for Dagster version>=1.0.0

handle DagsterRun instance creation for Dagster
version>=1.0.0

---

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

Closes #2043

Signed-off-by: George T. C. Lai <tsungchih.hd@gmail.com>

* test(dagster): no need to handle change to EventLogEntry

We no longer need to handle the message field removed from EventLogEntry
as of 0.14.3, because the minimum supported version is now 1.0.0.

---

Signed-off-by: George T. C. Lai <tsungchih.hd@gma…
Labels
documentation Improvements or additions to documentation integration/spark
Development

Successfully merging this pull request may close these issues.

DebugFacet for Spark integration
2 participants