TraceQL Search against Serverless Returns 400 #2114

Closed
disfluxly opened this issue Feb 17, 2023 · 5 comments · Fixed by #2120

@disfluxly

Describe the bug
When deploying Tempo Serverless for Search on AWS Lambda, and using the TraceQL section of Grafana Explore, I receive the following error:

Query error
upstream: (500) external endpoint returned 400, serverless [parsing search request]: invalid TraceQL query: parse error at line 1, col 1: syntax error: unexpected %

To Reproduce
Steps to reproduce the behavior:

  1. Start Tempo 2.0
  2. Deploy Tempo Serverless 2.0 to AWS Lambda
  3. Configure Tempo 2.0 to use Serverless External Endpoint, set prefer_self: 0 so all Search Hits go to the External Endpoint
  4. Use Grafana Explore on TraceQL to execute a query

Expected behavior
I expect TraceQL to return the same results through the External Endpoint as it does when the Queriers handle the search themselves (prefer_self > 0).

Environment:

  • Infrastructure: AWS Lambda. go1.x runtime.
  • Deployment tool: Terraform

Additional Context
This is coming from Grafana Explore. I tried both 9.4.0-beta & 9.3.6.
Here's an example URL taken from Grafana's Inspector:

"api/datasources/proxy/219/api/search?q=%7Bspan.http.status_code%20%3E%3D%20200%20%26%26%20span.http.status_code%20%3C%20300%7D%20%7C%20count()%20%3E%202&limit=20&start=1676658079&end=1676661679"

This is what the querier reports as the error:

level=warn ts=2023-02-17T20:21:19.697288228Z caller=logging.go:86 traceID=1061e2766aa23b2d msg="GET /querier/api/search?blockID=fad9c7f4-f497-49b2-af21-2473f6ebf991&dataEncoding=&encoding=none&end=1676661679&footerSize=6240&indexPageSize=0&limit=20&pagesToSearch=19&q=%7Bspan.http.status_code+%3E%3D+200+%26%26+span.http.status_code+%3C+300%7D+%7C+count%28%29+%3E+2&size=2608809&start=1676658079&startPage=0&totalRecords=1&version=vParquet (500) 23.398073ms Response: \"external endpoint returned 400, serverless [parsing search request]: invalid TraceQL query: parse error at line 1, col 1: syntax error: unexpected %\\n\"

Doing a curl against the Querier directly does the same:

/ $ curl "http://localhost:3100/querier/api/search?blockID=fad9c7f4-f497-49b2-af21-2473f6ebf991&dataEncoding=&encoding=none&end=1676661679&footerSize=6240&indexPageSize=0&limit=20&pagesToSearch=19&q=%7Bspan.http.status_code+%3E%3D+200+%26%26+span.http.status_code+%3C+300%7D+%7C+count%28%29+%3E+2&size=2608809&start=1676658079&startPage=0&totalRecords=1&version=vParquet"
external endpoint returned 400, serverless [parsing search request]: invalid TraceQL query: parse error at line 1, col 1: syntax error: unexpected %

In addition, normal Search works completely fine. The error seems to occur only when using TraceQL. Could the Serverless Function be failing to URL-decode the query parameters?
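
One way to check that hypothesis outside of Lambda is to decode the logged `q` value once with Go's `net/url` and compare the first character before and after. This is only a sketch: `decodeOnce` is a made-up helper name, not part of Tempo. If the function receives the raw string, the first character the TraceQL parser sees is `%`, which would be consistent with "unexpected %" at line 1, col 1.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeOnce is a hypothetical helper: it percent-decodes a single
// query-string value, turning "+" into a space and "%7B" into "{".
func decodeOnce(raw string) (string, error) {
	return url.QueryUnescape(raw)
}

func main() {
	// The q value as it appears in the querier log in this issue.
	raw := "%7Bspan.http.status_code+%3E%3D+200+%26%26+span.http.status_code+%3C+300%7D+%7C+count%28%29+%3E+2"

	decoded, err := decodeOnce(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println("raw starts with:    ", raw[:1])
	fmt.Println("decoded starts with:", decoded[:1])
	fmt.Println(decoded)
}
```

After one decode the value starts with `{`, which is what a TraceQL query is expected to start with.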

@joe-elliott
Member

joe-elliott commented Feb 21, 2023

In addition, normal Search works completely fine.

I was going to suggest reviewing your Lambda settings to make sure things are set up correctly, but if you're confident you've run normal search queries through Lambdas then everything is likely correct.

I'm curious as to what the Lambda code is receiving. Can we debug by dumping the values in this loop and see if anything looks off: https://github.com/grafana/tempo/blob/main/cmd/tempo-serverless/lambda/main.go#L59? Or maybe the Lambda itself does request/response logging? Perhaps it has a clue.
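
A minimal sketch of such a dump, assuming the query parameters arrive as a `map[string]string`. `dumpParams` is a hypothetical helper name, not something in the Tempo code base, and the sample map in `main` is a stand-in for `event.QueryStringParameters`. Keys are sorted because Go map iteration order is random, and values are printed with `%q` so that leftover encoding or stray whitespace is easy to spot.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dumpParams is a hypothetical debugging helper. It formats query
// parameters one per line, sorted by key so that two dumps can be
// compared side by side.
func dumpParams(params map[string]string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for _, k := range keys {
		// %q quotes the value, making stray encoding or whitespace visible.
		fmt.Fprintf(&b, "%s: %q\n", k, params[k])
	}
	return b.String()
}

func main() {
	// Stand-in for event.QueryStringParameters; values are sample data.
	sample := map[string]string{
		"q":     "%7Bspan.http.status_code+%3D+200+%7D",
		"limit": "20",
	}
	fmt.Print(dumpParams(sample))
}
```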

We have not seen this issue internally, but I can't say we've run TraceQL through AWS Lambda functions except through our integration tests.

@disfluxly
Author

disfluxly commented Feb 21, 2023

@joe-elliott - Yep, standard search is working. I added in the print statements, here's the result:

Standard Search

Params: map[]
pagesToSearch: 16
start: 1677016145
tags: http.status_code%3D200
dataEncoding:
limit: 20
maxBytes: 5000000
startPage: 0
blockID: e459698a-d75d-402c-b4c0-a1771bb0ff9b
encoding: none
size: 2946347
version: vParquet
footerSize: 6242
indexPageSize: 0
end: 1677019745
totalRecords: 1 

TraceQL Search

Params: map[]
encoding: none
indexPageSize: 0
q: %7Bspan.http.status_code+%3D+200+%7D
end: 1677019762
limit: 20
start: 1677016162
pagesToSearch: 16
totalRecords: 1
maxBytes: 5000000
size: 2946347
startPage: 0
version: vParquet
blockID: e459698a-d75d-402c-b4c0-a1771bb0ff9b
dataEncoding:
footerSize: 6242 

Edit: Made sure the "query" was the same between both.

@disfluxly
Author

Adding a decode on the value seems to fix the issue:
https://github.com/grafana/tempo/blob/main/cmd/tempo-serverless/lambda/main.go#L59-L61

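for k, v := range event.QueryStringParameters {
	// Values arrive still percent-encoded, so decode them before use.
	decodedValue, err := url.QueryUnescape(v)
	if err != nil {
		decodedValue = v // malformed escape: fall back to the raw value
	}
	params.Set(k, decodedValue)
}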

Logged output:

Params: map[]
startPage: 0
version: vParquet
encoding: none
end: 1677021964
footerSize: 6242
indexPageSize: 0
pagesToSearch: 16
q: {span.http.status_code = 200 }
start: 1677018364
totalRecords: 1
blockID: e459698a-d75d-402c-b4c0-a1771bb0ff9b
dataEncoding:
maxBytes: 5000000
size: 2946347
limit: 20 

Standard search also continues to work with this fix. I would submit a MR...buuuuuut I did this at work soooo I can only "suggest" this as a fix :)
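
For anyone who wants to try the suggested change outside of Lambda, here is a standalone sketch of the same loop. `decodeParams` is a hypothetical name, and the `raw` map stands in for `event.QueryStringParameters`. It is an illustration of the idea, not the code that was merged. This version surfaces a decode error to the caller, which is one possible design choice.

```go
package main

import (
	"fmt"
	"net/url"
)

// decodeParams is a hypothetical standalone version of the loop above.
// It percent-decodes every value once and collects the results in
// url.Values, returning an error if any value has a malformed escape.
func decodeParams(raw map[string]string) (url.Values, error) {
	params := url.Values{}
	for k, v := range raw {
		decoded, err := url.QueryUnescape(v)
		if err != nil {
			return nil, fmt.Errorf("decoding %q: %w", k, err)
		}
		params.Set(k, decoded)
	}
	return params, nil
}

func main() {
	// Sample values copied from the logged output in this issue.
	params, err := decodeParams(map[string]string{
		"q":    "%7Bspan.http.status_code+%3D+200+%7D",
		"tags": "http.status_code%3D200",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(params.Get("q"))
	fmt.Println(params.Get("tags"))
}
```

One caveat: if the values were already decoded before reaching the function, a second decode would turn a literal `+` into a space and could fail on a literal `%`, so this only makes sense when the values are known to arrive encoded.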

@joe-elliott
Member

haha :)

well, thank you for the suggestion!

@joe-elliott
Member

i appreciate your help here. great find! the PR is up and will be backported to 2.0.1!
