DevStats to track velocity for Kubeflow #34
Comments
Here's the code behind that |
@jlewi That's great! |
That is awesome! |
@lukaszgryglicki I created a Postgres database and ran gha2db to load the data from my repo into it. Is it possible to run a simple query, e.g., dump a time series of the number of PRs per day? I tried using runq to run some of the SQL files in util_sql/top_unknowns, but I'm not sure what arguments to provide and I keep getting segfaults. |
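For reference, the kind of "PRs per day" query being asked about can be sketched as below. This is only an illustration: the `pull_requests` table and its `created_at` column are hypothetical stand-ins and not the schema that gha2db creates, and an in-memory SQLite database is used in place of Postgres so the snippet runs on its own.

```python
import sqlite3


def prs_per_day(conn):
    """Return (day, count) rows, one per calendar day that has at least one PR."""
    query = """
        SELECT date(created_at) AS day, COUNT(*) AS prs
        FROM pull_requests          -- hypothetical table name
        GROUP BY day
        ORDER BY day
    """
    return conn.execute(query).fetchall()


# Stand-in data so the sketch is runnable without a real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pull_requests (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO pull_requests (created_at) VALUES (?)",
    [("2018-03-01 09:00:00",), ("2018-03-01 17:30:00",), ("2018-03-02 11:15:00",)],
)
rows = prs_per_day(conn)
for day, count in rows:
    print(day, count)
```

Against Postgres the same idea would probably group on `created_at::date` instead of SQLite's `date()` function.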
First you need to run query in
util_sql/top_unknowns.sql contains "macros":
So to run it, you must provide replacements.
|
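The idea of SQL files with "macros" that need replacements can be sketched roughly as follows. This is not how runq is implemented; the `{{name}}` placeholder syntax, the macro names `ago` and `lim`, and the query text are all assumptions made for illustration.

```python
import re


def expand_macros(sql, replacements):
    """Replace every {{name}} placeholder in sql; fail loudly on a missing one."""

    def substitute(match):
        name = match.group(1)
        if name not in replacements:
            raise KeyError("no replacement provided for macro {{" + name + "}}")
        return replacements[name]

    return re.sub(r"\{\{(\w+)\}\}", substitute, sql)


# Hypothetical template; the real util_sql files will differ.
template = "SELECT * FROM events WHERE created_at >= now() - '{{ago}}'::interval LIMIT {{lim}}"
expanded = expand_macros(template, {"ago": "1 month", "lim": "10"})
print(expanded)
```

Raising on an unknown macro, rather than leaving the placeholder in the query, makes a missing argument show up as a clear error instead of a confusing database failure.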
BTW: computed time series are in the If you have a server then you can implement something very similar for Kubeflow. I can help with any problems on the way. And maybe (checking this) I just can set up everything for you on your own server - not sure yet. |
So I have a green light to set up a DevStats instance for Kubeflow (and to create good documentation while doing so). |
That's fantastic. If you send me the email associated with your Google account, I can give you access to the GKE cluster and GCP project where we'd like to deploy this. |
lgryglicki@cncf.io |
We used to run postgres on K8s for Airflow. This is how we did
|
Which slack channel? (We use kubeflow.slack.com) |
Kubernetes (kubernetes.slack.com) then #devstats - you can pm me. DevStats wasn't ported to K8s yet, I don't even know how to use I can also create account on kubeflow.slack.com but seems like I need some invitation? |
So here's my guess of what we need to do
|
I probably can't help with that :-( We already discussed this on the DevStats repo, and it was decided that I shouldn't work on porting DevStats to K8s yet. If you are interested, there are tiny Packet servers that will be able to handle this. Even the smallest one ($0.40/hour) should be enough for such a small deployment. Unless @dankohn wants me to dive into K8s porting. I think that porting to K8s will be a bit more complex than you described. |
No problem. Thanks for all the great work on devstats; looks fantastic. |
@lukaszgryglicki does the outline provided above make sense though? Was there anything obvious I missed? Thanks for the offer. I think I'd prefer to deploy it on K8s even if that means setting it up ourselves. I think longer term that will be easier to manage, and I don't think it will be that hard to convert to K8s. |
My knowledge of K8s deployments is quite limited, but I can see the following components:
Obviously, for a single project (Kubeflow), N=1. BTW: I like the idea of deploying to K8s. It would be great if you upstreamed your code. I can help with any questions. |
Do we need to run some binary regularly to execute the queries and load the data into InfluxDB? Is this handled by the devstats binary?
|
Binaries that are called by
|
Will devstats create the databases if they don't already exist? If not, is there an option to automatically run the setup scripts that create the DBs, but only if they don't already exist? Can we just start devstats and leave it running in order to periodically get the latest data? Or do we need to invoke it from a cron job? |
|
Actually after my last changes the |
@lukaszgryglicki @jberkus so I have devstats running and I'm definitely getting data into the Postgres database. I'm not sure whether data is making it into InfluxDB, though. Any suggestions on how to test? I'm also a little confused about how to set up Grafana. I looked at the instructions here, but it's not clear to me how we configure Grafana to access the Postgres and InfluxDB databases. |
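One way to check whether anything has reached InfluxDB is to ask it for its list of measurements. The sketch below assumes an InfluxDB 1.x server with its HTTP `/query` endpoint reachable and no authentication; the base URL and the database name `kubeflow` are placeholders, not values taken from this deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def measurements_url(base_url, database):
    """Build the InfluxDB 1.x query URL for SHOW MEASUREMENTS."""
    params = urlencode({"db": database, "q": "SHOW MEASUREMENTS"})
    return f"{base_url}/query?{params}"


def parse_measurements(payload):
    """Extract measurement names from a decoded /query JSON response."""
    names = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def list_measurements(base_url, database):
    """Query a live server; requires network access to InfluxDB."""
    with urlopen(measurements_url(base_url, database)) as response:
        return parse_measurements(json.load(response))


# Offline demonstration using a response shaped like InfluxDB 1.x output.
sample = {"results": [{"statement_id": 0, "series": [
    {"name": "measurements", "columns": ["name"], "values": [["prs_merged"], ["issues_opened"]]}
]}]}
url = measurements_url("http://localhost:8086", "kubeflow")
names = parse_measurements(sample)
print(url, names)
```

An empty list from `list_measurements` would suggest that the Postgres-to-InfluxDB step has not written anything yet.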
So it looks like one problem is that gha_repos is empty. I ran the following query to update that table
I'll file a bug to run it periodically. |
I reran devstats and now influx db shows the following time series
|
Looks like a bunch of data is still missing
Let's try rerunning shared/reinit.sh. Note: running reinit.sh complains about a missing file: 'open ./metrics/shared/idb_vars.yaml: no such file or directory'. I just created an empty file and that seemed to work. |
So the timeseries now looks to have all the data.
|
Grafana is complaining
|
I manually edited the influxdb query in Grafana to be
That worked. So the problem appears to be that Grafana isn't properly updating |
* The dashboards aren't properly set up yet. We're having some problems getting Grafana to handle repository groups correctly. Related to kubeflow#34
So it looks like the problem was that the dashboards use Grafana variables whose values are supposed to be populated from InfluxDB. When I imported the dashboards I must have set the data source incorrectly, because Grafana was trying to read the variables from the Postgres data source. That didn't work, so I was getting the error "template variables could not be initialized". I edited the dashboard via the UI and that fixed things. Now I just need to figure out how to load all the JSON files defining the dashboards into Grafana. |
@jberkus @lukaszgryglicki Is there a script I can use to create dashboards from all the JSON files in a directory? If the dashboards already exist, I'd like them to be overwritten. |
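The same fix can in principle be applied to the dashboard JSON before importing it, instead of editing each dashboard in the UI. The sketch below relies on the `templating.list` section of Grafana's dashboard JSON, where each query-type variable carries a `datasource` field; the data source names used here are hypothetical.

```python
import json


def set_variable_datasource(dashboard, datasource):
    """Return a copy of the dashboard whose query variables use `datasource`."""
    updated = json.loads(json.dumps(dashboard))  # cheap deep copy of JSON data
    for variable in updated.get("templating", {}).get("list", []):
        # Only query variables read their values from a data source.
        if variable.get("type") == "query":
            variable["datasource"] = datasource
    return updated


# Minimal stand-in for an exported dashboard.
dashboard = {
    "title": "Reviewers",
    "templating": {"list": [
        {"name": "repogroup", "type": "query", "datasource": "postgres"},
        {"name": "period", "type": "custom"},
    ]},
}
fixed = set_variable_datasource(dashboard, "influxdb")
print(json.dumps(fixed["templating"], indent=2))
```

Working on a copy keeps the original export intact, which makes it easy to compare the two versions if a variable still fails to initialize.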
Looks like we can configure Grafana to load dashboards from a directory. |
Found some helpful instructions about how to configure Grafana to load dashboards from files in conjunction with a config map. |
In my case, I was copying grafana.db file from the test server to another server. Postgres variables are created by |
* Dashboards are provided via a configmap.
* Grafana is configured via config maps to load dashboards from the directory.
* Some of the dashboards don't appear to be working correctly and/or showing data.
* The dashboards are based on the K8s dashboards and were copied from here: https://github.com/cncf/devstats/tree/master/grafana/dashboards/kubernetes
* We excluded the sig tables since we don't have any sigs.
* We added some shell scripts to convert them for use with Kubeflow.
Related to kubeflow#34
Thanks. I ended up configuring Grafana to load dashboards from files. I store the dashboards in a config map so it's super easy to update them (just ks apply). /cc @jberkus |
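Packing a directory of dashboard JSON files into a ConfigMap can be sketched like this. It is a rough illustration and not the ksonnet setup actually used here; the ConfigMap name `grafana-dashboards` and the file names are made up.

```python
import json
import tempfile
from pathlib import Path


def dashboards_configmap(directory, name="grafana-dashboards"):
    """Build a ConfigMap manifest with one data key per *.json file."""
    data = {}
    for path in sorted(Path(directory).glob("*.json")):
        text = path.read_text()
        json.loads(text)  # fail early if a dashboard file is not valid JSON
        data[path.name] = text
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": name},
        "data": data,
    }


# Demonstration with a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "prs.json").write_text(json.dumps({"title": "PRs"}))
    Path(tmp, "notes.txt").write_text("not a dashboard, ignored")
    manifest = dashboards_configmap(tmp)

print(json.dumps(manifest, indent=2))
```

The printed manifest is plain JSON, which kubectl accepts directly. Mounting the ConfigMap at the directory Grafana is configured to read then makes each key appear as a dashboard file.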
OK great. |
* Deploy Grafana to show the dashboards
* Dashboards are provided via a configmap.
* Grafana is configured via config maps to load dashboards from the directory.
* Some of the dashboards don't appear to be working correctly and/or showing data.
* The dashboards are based on the K8s dashboards and were copied from here: https://github.com/cncf/devstats/tree/master/grafana/dashboards/kubernetes
* We excluded the sig tables since we don't have any sigs.
* We added some shell scripts to convert them for use with Kubeflow.
Related to #34
* Expose devstats dashboards publicly
* Enable anonymous access
* Make the admin password secure
* Set up an ingress to allow HTTP access
* Update command to print out password.
@jlewi have you deployed DevStats on K8s? Are there any resources or source code from your deployment that could be reused? |
@ant31 I think you can find the artifacts here: https://github.com/kubeflow/community/tree/master/devstats |
@gaocegege great, thanks ! |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
Kubernetes has a great set of dashboards for velocity related metrics
https://k8s.devstats.cncf.io/d/44/time-metrics?orgId=1&var-period=w&var-repogroup_name=Kubernetes&var-repogroup=kubernetes&var-apichange=All&var-size_name=All&var-size=all&var-full_name=Kubernetes
There are dashboards for things like
PR time to LGTM
approvers/# reviewers
https://k8s.devstats.cncf.io/d/38/reviewers?orgId=1
It would be great to get the same metrics for Kubeflow.