Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support for handling complex attribute expression. #107

Merged
merged 79 commits into from
Dec 14, 2021

Conversation

sarthak77
Copy link
Member

@sarthak77 sarthak77 commented Nov 25, 2021

Ref issue : #105

@codecov
Copy link

codecov bot commented Nov 25, 2021

Codecov Report

Merging #107 (e973e48) into main (42b216f) will increase coverage by 1.03%.
The diff coverage is 83.70%.

Impacted file tree graph

@@             Coverage Diff              @@
##               main     #107      +/-   ##
============================================
+ Coverage     80.17%   81.21%   +1.03%     
- Complexity      549      573      +24     
============================================
  Files            53       53              
  Lines          2134     2204      +70     
  Branches        225      234       +9     
============================================
+ Hits           1711     1790      +79     
+ Misses          321      319       -2     
+ Partials        102       95       -7     
Flag Coverage Δ
integration 81.21% <83.70%> (+1.03%) ⬆️
unit 79.23% <80.74%> (+0.94%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...e/prometheus/QueryRequestEligibilityValidator.java 13.43% <0.00%> (ø)
...ry/service/prometheus/FilterToPromqlConverter.java 50.00% <50.00%> (-0.99%) ⬇️
.../query/service/pinot/PinotBasedRequestHandler.java 70.72% <66.66%> (-0.46%) ⬇️
...service/pinot/QueryRequestToPinotSQLConverter.java 91.09% <81.25%> (+4.29%) ⬆️
...ypertrace/core/query/service/QueryRequestUtil.java 71.87% <92.59%> (+19.79%) ⬆️
...y/service/projection/ProjectionTransformation.java 90.11% <100.00%> (+2.80%) ⬆️
...vice/prometheus/PrometheusBasedRequestHandler.java 85.45% <100.00%> (-0.26%) ⬇️
...core/query/service/prometheus/PrometheusUtils.java 80.00% <100.00%> (-3.34%) ⬇️
...vice/prometheus/QueryRequestToPromqlConverter.java 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 42b216f...e973e48. Read the comment docs.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@sarthak77 sarthak77 marked this pull request as ready for review November 26, 2021 06:55
@sarthak77 sarthak77 requested review from a team and findingrish November 26, 2021 06:55
@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@sarthak77 sarthak77 marked this pull request as draft November 29, 2021 03:12
@sarthak77 sarthak77 removed the request for review from aaron-steinfeld November 29, 2021 03:38
@github-actions

This comment has been minimized.


@Test
public void testQueryWithGroupByWithMapAttribute() {
Builder builder = QueryRequest.newBuilder(buildGroupByMapAttributeQuery());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am assuming that AttributeExpression without subpath for supporting EQ and NEQ will be added as followup, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this case the normal one where such an attribute expression without a subpath represents a normal column identifier?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you meant CONTAINS_KEY and CONTAINS_KEYVALUE support, then yes.

@@ -37,6 +37,9 @@
private static final int MAP_KEY_INDEX = 0;
private static final int MAP_VALUE_INDEX = 1;

private static final List<Operator> SUPPORTED_OPERATORS_FOR_MAP_ATTRIBUTES =
List.of(Operator.EQ, Operator.NEQ, Operator.GT, Operator.GE, Operator.LT, Operator.LE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Map attribute with sub_path, we will also support LIKE operator.
Example query for Regex query in Pinot

SELECT mapValue(tags__KEYS, 'span.kind', tags__VALUES) FROM spanEventView WHERE mapValue(tags__KEYS, 'span.kind', tags__VALUES) != '' and REGEXP_LIKE(mapValue(tags__KEYS, 'span.kind', tags__VALUES), '^*.cli*')  and start_time_millis > 1636453201000 and start_time_millis < 1636456801000 limit 10

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do in a follow up PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LIKE is actually the primary ask that kicked off the whole effort, but I'm curious why we need this - are any operators not supported? the whole idea here is that there should be no difference. That was intended for the caller to not care, but I'd imagine the same idea would apply in the code. will read on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of the boolean operators like and/or/not are not appropriate for map attributes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the case for any attribute though, it's not unique to maps. The new commit looks right, but almost all of the other new changes that are are custom to maps should be able to be reverted

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we will not need this set for attribute expression having subpath (so for the key of the map). There is support for IN, NOT IN as well. So, it will support all the operator ColumnIdentifier was supporting. Can we remove this set? If, any exception that encounter, we can just put that as an exception
e.g

select mapValue(tags__KEYS, 'response_flags', tags__VALUES) from spanEventView 
where start_time_millis > 1638497360000 and start_time_millis < 1638508160000
and tags__KEYS = 'response_flags' and mapValue(tags__KEYS, 'response_flags', tags__VALUES) IN ('NR', 'UR')

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

String keyCol = convertExpressionToMapKeyColumn(filter.getLhs());
String valCol = convertExpressionToMapValueColumn(filter.getLhs());

if (!SUPPORTED_OPERATORS_FOR_MAP_ATTRIBUTES.contains(filter.getOperator())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, we also have to check if the mapped attributedId (e.g Span.tags) maps to a complex column in Pinot. Only supported transformation should pass else we need to throw an exception. In our current case, the mapped column should be of type mapFields only - https://github.com/hypertrace/query-service/blob/main/query-service/src/main/resources/configs/common/application.conf#L58

Span.tags -> is of map type, if not we need to throw an exception.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

handleFilterForComplexAttribute

  • check if the complex attribute is of map type (mapFields)
  • if, yes, handleFilterForMapColumn

handleFilterForMapColumn

  • check here that it has supported operators
  • if not, throw an exception
  • if, yes, do a transformation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, since the only complex attribute is the map attribute, keeping only handleFilterForMapColumn function

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@sarthak77
Copy link
Member Author

Should the test in migrationTest.java be moved to QueryRequestToPinotSQLConverterTest.java?

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@@ -269,7 +270,7 @@ private String convertExpressionToString(
viewDefinition.getPhysicalColumnNames(getLogicalColumnName(expression));
return joiner.join(columnNames);
case ATTRIBUTE_EXPRESSION:
if (isMapAttributeExpression(expression)) {
if (isAttributeExpressionMapAttribute(expression)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this condition is correct, just incorrectly named (It's a map selection, not any map attribute).

if ((expression.getValueCase() == COLUMNIDENTIFIER)
|| (expression.getValueCase() == ATTRIBUTE_EXPRESSION && !isComplexAttribute(expression))) {
if (expression.getValueCase() == COLUMNIDENTIFIER
|| isAttributeExpressionMapAttribute(expression)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're making our own utils here, so ideally should be encapsulating this in a single predicate. More importantly however, this would fail on an attribute expression without a subpath. If we want to remove the type check here, there's no reason to even include an if condition - the thrown exception is saying it's a type mismatch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words we can proceed one of two ways:

  • Fix isAttributeExpressionMapAttribute to take in a view def and ensure it's a map, independent of subpath
  • Remove this check and the thrown exception and run this code unconditionally

Copy link
Member Author

@sarthak77 sarthak77 Dec 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier didn't remove as it was already present in the main before all these changes. Removed now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

public static boolean isComplexAttribute(Expression expression) {
return expression.getValueCase().equals(ATTRIBUTE_EXPRESSION)
&& expression.getAttributeExpression().hasSubpath();
public static Filter createContainsKeyFilter(String column, String value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can be inline above function or can be private as it (createContainsKeyFilter(String column, String value) is only used in Tests now. But, can be taken as follow up.

kotharironak
kotharironak previously approved these changes Dec 14, 2021
Copy link
Contributor

@kotharironak kotharironak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, now lgtm. But, see @aaron-steinfeld any open comments before merging this.

}
throw new IllegalArgumentException(
"operator CONTAINS_KEY/KEYVALUE supports multi value column only");
throw new IllegalArgumentException("operator supports multi value column only");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up is fine but we should fixup this error message since the type check is gone. Now, this error is reached if no key column exists for the provided attribute - that could be an attribute that's mistyped, but it also may just not be a column in that view.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what should be the error here now ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unable to lookup key column for attribute: {}, or something like that

Builder builder = QueryRequest.newBuilder();
Expression spanTag = createSimpleAttributeExpression("Span.tags").build();
Expression spanTag = createComplexAttributeExpression("Span.tags", "FLAGS").build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does our code lowercase this? I would imagine we should respect the key's casing, unless we're lowercasing the data as we persist it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is interesting, need to check. But, it seems that as prior existing tests (for CONTAINS_KEYVALUE) were also changed to lower case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should follow up on this. May be working correctly, but it caught my eye. If we're pushing everything to lowercase on ingestion side, we have to do this; if we're not, I'd consider it a bug.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we are not doing anything specific for lowercase. It is the assertion utility in the test were written as lowercase for comparing two string for expected and actual string.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, we should remove that. Case matters in (certain positions of) a query, and this is testing the query itself.

}

private QueryRequest buildGroupByMapAttributeQuery() {
Builder builder = QueryRequest.newBuilder();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: For test inputs that are this complex, it's usually more readable and a better test (since it better mimics an upstream client) to provide them as text.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't get you

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than building the query in code, this would mean deserializing it from text - either an inline string or a file.

@github-actions

This comment has been minimized.

@github-actions

This comment has been minimized.

@sarthak77 sarthak77 merged commit c4c1d55 into main Dec 14, 2021
@sarthak77 sarthak77 deleted the feat_support_for_complex_attr branch December 14, 2021 17:45
@github-actions
Copy link

Unit Test Results

  30 files  ±  0    30 suites  ±0   8s ⏱️ -1s
171 tests +13  171 ✔️ +13  0 💤 ±0  0 ❌ ±0 

Results for commit c4c1d55. ± Comparison against base commit 42b216f.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants