This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Support describe index alias #775

dai-chen
Member

@dai-chen dai-chen commented Oct 9, 2020

Issue #, if available: #725

Description of changes: To understand the root cause and the fix, first consider how SHOW/DESCRIBE is implemented:

  1. Take the SQL LIKE pattern, which may contain the wildcards % or _.
  2. Convert it to an ES search pattern by replacing % with the ES wildcard *.
  3. Fetch index metadata with that ES pattern. If the LIKE pattern includes _, fetch all indices, because ES only supports the * wildcard.
  4. Convert the LIKE pattern to a regex pattern by replacing % with .* and _ with . (a dot).
  5. Filter the ES response by that regex pattern. Because all indices may have been fetched when the pattern includes _, the regex is needed to keep only the index names that actually match.
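The five steps above can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the class `LikePatternSketch` and its method names are hypothetical, and returning `*` to mean "fetch all" is an assumption made for the sketch.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not a class from the plugin.
class LikePatternSketch {

    /** Steps 2-3: derive the ES search pattern from a SQL LIKE pattern. */
    static String toEsPattern(String like) {
        if (like.contains("_")) {
            // ES has no single-character wildcard, so fall back to fetching everything.
            // Using "*" for "all" is an assumption of this sketch.
            return "*";
        }
        return like.replace('%', '*');
    }

    /** Step 4: derive the regex pattern from a SQL LIKE pattern. */
    static String toRegex(String like) {
        return like.replace("%", ".*").replace("_", ".");
    }

    /** Step 5: keep only the names returned by ES that match the regex. */
    static List<String> filter(List<String> namesFromEs, String like) {
        Pattern regex = Pattern.compile(toRegex(like));
        return namesFromEs.stream()
                .filter(name -> regex.matcher(name).matches())
                .collect(Collectors.toList());
    }
}
```

Note that the sketch does not escape other regex metacharacters in the LIKE pattern, so a literal dot also behaves as a wildcard during the regex step.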

So the root cause is that an alias doesn't necessarily match the name of its index, so the index is ruled out during regex matching. Ideally, we should fetch alias info and determine whether the LIKE pattern is an alias. However, this incurs an extra ES metadata read, which may need additional permissions and testing.

As a quick fix, this PR skips the regex matching if the LIKE pattern doesn't include any wildcard. That way, the index name is never matched against its alias (the regex converted from the LIKE pattern), and the index name is present in the final result.
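A minimal sketch of that idea, under stated assumptions: `AliasAwareFilterSketch`, `hasWildcard`, and `filter` are hypothetical names, not the plugin's API. Treating a dot as a wildcard is taken from Case #4 in the test description of this PR.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not a class from the plugin.
class AliasAwareFilterSketch {

    /** True if the LIKE pattern contains %, _ or a dot (the dot is assumed per Case #4). */
    static boolean hasWildcard(String like) {
        return like.contains("%") || like.contains("_") || like.contains(".");
    }

    /** Filters names returned by ES, skipping the regex match for wildcard-free patterns. */
    static List<String> filter(List<String> namesFromEs, String like) {
        if (!hasWildcard(like)) {
            // No wildcard: assume ES returned only the index itself or the
            // indices behind the alias, so trust the response as-is.
            return namesFromEs;
        }
        Pattern regex = Pattern.compile(like.replace("%", ".*").replace("_", "."));
        return namesFromEs.stream()
                .filter(name -> regex.matcher(name).matches())
                .collect(Collectors.toList());
    }
}
```

With this sketch, an alias such as `acc` that resolves to an index named `accounts` keeps `accounts` in the result, whereas an unconditional regex match on `acc` would have dropped it.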

Documentation: Update existing doc with support for alias: https://github.com/dai-chen/sql/blob/support-describe-index-alias/docs/user/dql/metadata.rst#example-2-show-specific-index-information

Testing: Added UT and IT. Here are the 4 typical cases described in the UT.

  /**
   * Case #1:
   * LIKE 'test%' is converted to:
   *  1. Regex pattern: test.*
   *  2. ES search pattern: test*
   * In this case, what ES returns is the final result.
   */
  ...

  /**
   * Case #2:
   * LIKE 'test_123' is converted to:
   *  1. Regex pattern: test.123
   *  2. ES search pattern: (all)
   * Because ES doesn't support a single-character wildcard, no ES search pattern is
   * passed in this case. So all index names are returned and need to be filtered by
   * the regex pattern again.
   */
  ...

  /**
   * Case #3:
   * LIKE 'acc' has the same regex and ES pattern.
   * In this case, only the index name(s) aliased by 'acc' are returned,
   * so the regex match is skipped to avoid a wrongly empty result.
   * The assumption here is that ES won't return unrelated index names if
   * the LIKE pattern doesn't include any wildcard.
   */
  ...

  /**
   * Case #4:
   * LIKE 'test.2020.10' has the same regex pattern. Because it includes a dot (wildcard),
   * the ES search pattern is all.
   * In this case, all index names are returned. Because the pattern includes a dot,
   * it's treated as a regex and the regex match won't be skipped.
   */

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@dai-chen dai-chen added enhancement New feature or request SQL labels Oct 9, 2020
@dai-chen dai-chen self-assigned this Oct 9, 2020
@dai-chen dai-chen marked this pull request as ready for review October 14, 2020 20:00
Contributor

@penghuo penghuo left a comment

Thanks for the fix!

@penghuo penghuo merged commit 6522f58 into opendistro-for-elasticsearch:develop Oct 16, 2020