Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build variables in SQL commands with Parameters (Don't interpolate string) #21856

Open
fabionaspolini opened this issue Jan 30, 2020 · 5 comments
Labels

Comments

@fabionaspolini
Copy link

What would you like to be added:

I would like to change the query build engine. Currently is interpolated command text string query and submit to DB Server.

Why is this needed:

This is very bad because on refresh data, always is changed the command text query. That invalidates previously processed execute plan at the DB Server cache and cause Memory and CPU overhead.

In a low concurrence environment, this detail does not affect much, but on medium/high concurrency environment makes it impossible to use the tool.
The database server has memory area do store processed plans, when exists an application building a lot of command text different, this cache area purge others plans already pre-processed and increases CPU consumption.

These settings greatly improve performance, especially in SQL Server.

Simulate
I created git repo with this case, execute instructions from markdown file to up an environment to test.
https://github.com/fabionaspolini/Grafana-Issues-Samples

Proposed implementation

This part from current query builder engine is very bad.
current-builded-sql

Here need use database params feature:

select *
from city
where stateid in @states
  and foundation_date BETWEEN @date_from AND @date_to
order by name;
@phemmer
Copy link
Contributor

phemmer commented May 25, 2020

I also have interest in this, but for another reason. I'm hoping to be able to expose Grafana to the public internet. The massive problem with this is that the client sends raw SQL to Grafana for it to execute, so the client can abuse it and write whatever SQL they want. And because the SQL is generated client side, with variables injected into it, there's no practical way to validate it's not abuse.

However if the SQL queries were static, then validation becomes possible. And if all the variables were pulled out and sent along side the query string as parameters, the query would become static.

I don't think any sort of validation should necessary be part of a V1 implementation of this feature request (the one the ticket was opened for, not mine), but it might be a good idea to keep in mind so a future iteration could add the validation functionality without a complete rewrite.

As for how to validate, I can think of a few ways.
I think the best would be to not have the client send SQL at all. If the statements are stored on the server, the client can just invoke them with identifiers. E.G. 'execute statement 123 with parameters foo="bar"', and the client has no idea what statement 123 is. As a further bonus, by using IDs, grafana could use prepared statements for performance benefits.
Option 2 would be to take the SQL the client sent and see if it matches any known SQL statement configured on a panel. Little crude, but probably simple implementation.
Option 3 would be to have the server sign the query string before sending to the client, and then have the client send the query+signature back. With this the server wouldn't have to do any sort of lookups on the query, it can just validate and pass through.

@gabor
Copy link
Contributor

gabor commented Oct 23, 2023

(removing assignment to postgres and mysql, because the original post is about mssql, and query planning details depend on the specific database)

@septatrix
Copy link

(removing assignment to postgres and mysql, because the original post is about mssql, and query planning details depend on the specific database)

This should be implemented for all database sources, not only for a single one.

@gabor
Copy link
Contributor

gabor commented Oct 24, 2023

@septatrix thanks for the comment. if you think the same applies to postgres and mysql too, i recommend creating separate issues for each datasource plugin (mysql, postgres), describing how their query plan caching works, and why they would benefit from sending variables separately. that way we can discuss the merits for each db separately.

if we do it in a single issue, discussions get mixed up.

we can always merge them later.

i do understand that it's generally preferred to send variables "separately", unfortunately that is not the way this architecturally works in grafana, and it's not something that is easy to change. (consider template-variables etc.).

@septatrix
Copy link

i do understand that it's generally preferred to send variables "separately", unfortunately that is not the way this architecturally works in grafana, and it's not something that is easy to change. (consider template-variables etc.).

While it is certainly not as simple, such a more extensive change would be the way to go forward. Everything else would just duplicate the effort in several places. A general concept has to be developed for parameterized queries for all data sources as that would solve several issues. While this issue is about performance, the more general argument is about security which can be trivially circumvented on public dashboards. Please also see my comment about this in #30552 (comment) where I also mention a third issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Feature Requests
Development

No branches or pull requests

7 participants