Make Iceberg support case insensitivity #83

Closed
xabriel opened this issue Jan 17, 2019 · 4 comments
Comments

@xabriel
Contributor

xabriel commented Jan 17, 2019

Iceberg currently resolves column names case-sensitively, which hinders usability because most SQL users expect case insensitivity by default. A query like the following succeeds with other Spark readers but fails on Iceberg:

SELECT COUNT(*)
FROM iceTable
WHERE year = 2017
  AND MONTH = 11 -- Notice how MONTH has different casing than other predicates
  AND day = 01

This will fail with a stack trace similar to:

com.google.common.util.concurrent.UncheckedExecutionException: com.netflix.iceberg.exceptions.ValidationException: Cannot find field 'MONTH' in struct: struct<...>
...

PR to solve this issue at iceberg-api level: #82

More PRs to use this new flag to follow.
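
For readers who want to see the idea in isolation, here is a minimal sketch of resolving a field name with a case sensitivity flag. It is not the code from #82: the class FieldResolver, its resolve method, and the use of IllegalArgumentException are hypothetical stand-ins chosen for illustration, and the error message only imitates the one in the stack trace above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for a struct schema; not Iceberg's actual classes.
class FieldResolver {
    // Field name exactly as declared -> position in the struct.
    private final Map<String, Integer> exactIndex = new HashMap<>();
    // Lower-cased field name -> position in the struct.
    private final Map<String, Integer> lowerCaseIndex = new HashMap<>();

    FieldResolver(List<String> fieldNames) {
        for (int i = 0; i < fieldNames.size(); i++) {
            String name = fieldNames.get(i);
            exactIndex.put(name, i);
            // Locale.ROOT avoids locale-specific lower-casing surprises.
            // If two names differ only by case, the later one wins in this
            // sketch; a real implementation would have to decide what to do.
            lowerCaseIndex.put(name.toLowerCase(Locale.ROOT), i);
        }
    }

    // Returns the position of the field, honoring the caseSensitive flag.
    int resolve(String name, boolean caseSensitive) {
        Integer position = caseSensitive
                ? exactIndex.get(name)
                : lowerCaseIndex.get(name.toLowerCase(Locale.ROOT));
        if (position == null) {
            throw new IllegalArgumentException("Cannot find field '" + name + "'");
        }
        return position;
    }
}
```

With fields year, month, and day, resolve("MONTH", false) finds the month field, while resolve("MONTH", true) throws, which mirrors the failure described above.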

xabriel added a commit to xabriel/incubator-iceberg that referenced this issue Jan 19, 2019
@xabriel xabriel changed the title Make expression binding case insensitive Make expression binding support a case sensitivity flag Jan 19, 2019
xabriel added a commit to xabriel/incubator-iceberg that referenced this issue Jan 23, 2019
@xabriel
Contributor Author

xabriel commented Jan 23, 2019

PR #82 solves this issue at iceberg-api level.

We still need follow-up PRs to:

  1. Expose the new caseSensitive flag as a configuration option, perhaps by introducing the use of org.apache.hadoop.conf.Configuration.
  2. Address a comment from @rdblue on #82 (Make expression binding support a case sensitivity flag):

I think some of these Evaluators will also need case sensitivity options. It doesn't do much good to support it in expression binding if it isn't also exposed when working with expressions in other ways. Can you also open a follow-up issue?
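
As a rough illustration of item 1, the sketch below reads such a flag from a key/value configuration. To stay self-contained it uses java.util.Properties as a stand-in for org.apache.hadoop.conf.Configuration. The class CaseSensitivityConfig, the key name example.read.case-sensitive, and the default of true are all assumptions made for this example, not names taken from Iceberg.

```java
import java.util.Properties;

// Hypothetical helper; the key name and default are assumptions for illustration.
class CaseSensitivityConfig {
    static final String CASE_SENSITIVE_KEY = "example.read.case-sensitive";
    // Default to case-sensitive so existing behavior is unchanged when the key is unset.
    static final boolean CASE_SENSITIVE_DEFAULT = true;

    // Returns the configured flag, or the default when the key is absent.
    static boolean isCaseSensitive(Properties conf) {
        String raw = conf.getProperty(CASE_SENSITIVE_KEY);
        if (raw == null) {
            return CASE_SENSITIVE_DEFAULT;
        }
        // Boolean.parseBoolean returns true only for "true" (ignoring case).
        return Boolean.parseBoolean(raw.trim());
    }
}
```

Defaulting to case-sensitive keeps current behavior for anyone who does not set the key, so the flag stays opt-in.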

@xabriel xabriel changed the title Make expression binding support a case sensitivity flag Make Iceberg support case insensitivity Mar 19, 2019
@xabriel
Contributor Author

xabriel commented Mar 19, 2019

PR #89, just merged, solves this problem all the way to the Spark Reader.

Jotting down some minor follow-up items here so that we don't forget. We still need to address these comments:
#89 (comment)
and
#89 (comment)

@xabriel
Contributor Author

xabriel commented Mar 25, 2019

There's another issue with Filterables, described in #145.

@rdblue
Contributor

rdblue commented Jul 6, 2019

I'm going to close this because #89 and #141 fixed the original problem. #145 would be nice to fix, but the code works as it is right now.

@rdblue rdblue closed this as completed Jul 6, 2019