[Data Synchronization/Matching] Delegate to Spark for checking existence of columns in the given dataframes #515
Merged: rdsharma26 merged 1 commit into awslabs:master from rdsharma26:dataset-match-column-name-case-issue on Oct 27, 2023
…nce of columns in the given dataframes

- Prior to this change, we were doing case-sensitive equality checks of non-key columns.
- This made the utility more restrictive than necessary, as Spark by default does not care about the casing of column names.
- With this change, we rely on Spark to check if a column exists in the given dataframe. If Spark can find the column, we can proceed with the rest of the check.
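The difference between the two approaches can be sketched roughly as follows. This is an illustration only, not the code from this pull request: the object name `ColumnExistenceSketch` and the helpers `columnExistsStrict` and `columnExists` are hypothetical, and the sketch assumes Spark's default setting of `spark.sql.caseSensitive=false`. It relies on `Dataset.apply(colName)`, which throws an `AnalysisException` when Spark cannot resolve the name.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import scala.util.Try

object ColumnExistenceSketch {

  // Hypothetical "before" behaviour: exact, case-sensitive string comparison
  // against the schema's column names.
  def columnExistsStrict(df: DataFrame, name: String): Boolean =
    df.columns.contains(name)

  // Hypothetical "after" behaviour: let Spark resolve the column.
  // df(name) throws an AnalysisException if the column cannot be resolved,
  // so a successful Try means Spark can find it under its own rules.
  def columnExists(df: DataFrame, name: String): Boolean =
    Try(df(name)).isSuccess

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("column-existence-sketch")
      .getOrCreate()
    import spark.implicits._

    // One column is deliberately declared with mixed casing.
    val df = Seq((1, "a")).toDF("id", "Name")

    println(columnExistsStrict(df, "name")) // strict comparison: "name" != "Name"
    println(columnExists(df, "name"))       // Spark resolves "name" to "Name" by default
    println(columnExists(df, "missing"))    // Spark cannot resolve this column

    spark.stop()
  }
}
```

If a session sets `spark.sql.caseSensitive=true`, the delegated check becomes case-sensitive as well, so the utility simply follows whatever resolution rules the session is configured with.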
eycho-am
approved these changes
Oct 26, 2023
LGTM
rdsharma26 added a commit that referenced this pull request on Oct 27, 2023
javierdlrm pushed a commit to javierdlrm/deequ that referenced this pull request on Oct 31, 2023
rdsharma26 added a commit that referenced this pull request on Nov 1, 2023
rdsharma26 added a commit that referenced this pull request on Apr 16, 2024
rdsharma26 added a commit that referenced this pull request on Apr 16, 2024
rdsharma26 added a commit that referenced this pull request on Apr 16, 2024
rdsharma26 added a commit that referenced this pull request on Apr 17, 2024
Description of changes:
Issue #, if available: N/A
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.