feat: implement extra info (tag, field) #46
Conversation
Force-pushed from 9f5f609 to b5b3035
In general, LGTM.
Let Phan do the final review. Also, I am curious why `kubernetes_version` can be collected now. I did not see the code for it here before this implementation. Or do we use a different branch for Longhorn?
@PhanLe1010 Please do the review before merging.
Since we are creating continuous queries automatically for app developers, I suggest modifying the documentation:
Lines 128 to 153 in b5b3035
### Add kubernetesVersion extra field
For example, if you want to keep track of the number of your application instances by Kubernetes version, you may want to include the Kubernetes version in `extraInfo` in the request body sent to the Upgrade Responder server.
The request body may look like this:
```json
{
  "appVersion": "v0.8.1",
  "extraInfo": {
    "kubernetesVersion": "v1.19.1"
  }
}
```
To display statistical information about Kubernetes versions on the Grafana dashboard, you need to:
1. Create a continuous query in InfluxDB to periodically group data by Kubernetes version:
```
CREATE CONTINUOUS QUERY "cq_by_kubernetes_version_down_sampling" ON "<application-name>_upgrade_responder" BEGIN SELECT count("value") as total INTO "by_kubernetes_version_down_sampling" FROM "upgrade_request" GROUP BY time(1h),"kubernetes_version" END
```
where:
* `cq_by_kubernetes_version_down_sampling` is the name of the new continuous query (chosen by you)
* `<application-name>_upgrade_responder` is the name of the database, prefixed by your application's name
* `by_kubernetes_version_down_sampling` is the name of the new measurement (chosen by you)
* `upgrade_request` is the original measurement where the Upgrade Responder server records data
* `1h` is the query period. Make sure that this value is the same as `--query-period`
* `kubernetes_version` is the tag used to group the data. It is derived by converting the camelCase `kubernetesVersion` to its snake_case form.
1. Create a Grafana panel that pulls data from the new measurement `by_kubernetes_version_down_sampling`, similar to this:
![Alt text](./assets/images/grafana_query_by_kubernetes_version.png?raw=true)
Also, we need to update the Grafana dashboard afterward.
After this PR, the upgrade-responder server will create a continuous query for each user-submitted tag value. This can potentially be abused by a malicious user sending a request with many tags and crashing our database. I am thinking of verifying the user-submitted data against a config file similar to https://github.com/longhorn/upgrade-responder/blob/b5b303538e1e2920a97f0afb7e6fb1ce5e2ea143/config/response.json#L1-L26
This will be tracked in https://github.com/longhorn/upgrade-responder/issues/47
That's possible, especially since right now this service is quite public and anonymous. It's better to use a whitelist to achieve that.
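For illustration, a minimal sketch of what such an allow-list check could look like, assuming a hypothetical `RequestSchema` structure loaded from a config file (all names here are illustrative, not the actual implementation):

```go
package main

import "fmt"

// RequestSchema is a hypothetical config structure, loaded from a JSON file
// similar to config/response.json, that lists the extraInfo keys the server
// is willing to accept.
type RequestSchema struct {
	AllowedTags map[string]bool
}

// filterExtraInfo drops any user-submitted keys that are not on the allow-list,
// so arbitrary tags cannot trigger the creation of unbounded continuous queries.
func filterExtraInfo(schema RequestSchema, extraInfo map[string]string) map[string]string {
	filtered := make(map[string]string, len(extraInfo))
	for key, value := range extraInfo {
		if schema.AllowedTags[key] {
			filtered[key] = value
		}
	}
	return filtered
}

func main() {
	schema := RequestSchema{AllowedTags: map[string]bool{"kubernetesVersion": true}}
	extra := map[string]string{"kubernetesVersion": "v1.19.1", "unexpectedTag": "x"}
	fmt.Println(filterExtraInfo(schema, extra)) // map[kubernetesVersion:v1.19.1]
}
```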
ref: 5235 Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
Force-pushed from 092cc6b to f327113
Force-pushed from 28ca1e9 to 3b0c787
Force-pushed from b80e29a to 64c48fa
Force-pushed from d79c0cf to ed6684e
Force-pushed from ed6684e to e325414
ref: 5235 Signed-off-by: Chin-Ya Huang <chin-ya.huang@suse.com>
Force-pushed from 96a5396 to 6145b60
I have created PR #48, which contains the logic of this PR as well as the logic for user-data verification. Is it OK to use that PR in favor of this one, @c3y1huang @innobead? Thank you very much.
return
}

s.dbCache.AddPoint(pt)
With this AddPoint() and the one below it, we are adding 2 points for 1 request, which will double the node count because the number of nodes == the number of requests.
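A rough sketch of the alternative, assuming the `github.com/influxdata/influxdb1-client/v2` package: build one point per request that carries all tags and fields, so `AddPoint()` is called exactly once (names and values are illustrative):

```go
package main

import (
	"log"
	"time"

	influx "github.com/influxdata/influxdb1-client/v2"
)

func main() {
	// Attach every tag and field of one upgrade request to a single point,
	// so each request is recorded (and therefore counted) exactly once.
	tags := map[string]string{
		"app_version":        "v0.8.1",
		"kubernetes_version": "v1.19.1",
	}
	fields := map[string]interface{}{
		"value": 1,
	}
	pt, err := influx.NewPoint("upgrade_request", tags, fields, time.Now())
	if err != nil {
		log.Fatal(err)
	}
	log.Println(pt) // in the server, this point would be passed once to s.dbCache.AddPoint(pt)
}
```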
return nil
}

func (s *Server) addContinuousQuerieStringsFromTags(databaseName string, queryStrings map[string]string) error {
I am thinking that the main focus of the extra tag info and extra field info is to understand the current distribution of different values/settings. Therefore, I think we don't need to create continuous queries, because we can use the data from the last hour to visualize all kinds of distributions. What do you think?
I am OK with that. @c3y1huang WDYT?
I am fine with this.
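As a rough sketch of that approach, assuming the `github.com/influxdata/influxdb1-client/v2` package and the measurement/tag names from the documentation excerpt above, a dashboard (or this small program) could group the raw data from the last hour directly, without any continuous query:

```go
package main

import (
	"fmt"
	"log"

	influx "github.com/influxdata/influxdb1-client/v2"
)

func main() {
	c, err := influx.NewHTTPClient(influx.HTTPConfig{Addr: "http://localhost:8086"})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Group the last hour of raw upgrade_request points by the
	// kubernetes_version tag instead of maintaining a continuous query per tag.
	q := influx.NewQuery(
		`SELECT count("value") FROM "upgrade_request" WHERE time > now() - 1h GROUP BY "kubernetes_version"`,
		"<application-name>_upgrade_responder",
		"",
	)
	resp, err := c.Query(q)
	if err != nil {
		log.Fatal(err)
	}
	if resp.Error() != nil {
		log.Fatal(resp.Error())
	}
	fmt.Println(resp.Results)
}
```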
longhorn/longhorn#5235