-
Notifications
You must be signed in to change notification settings - Fork 220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create a K8s app to deploy cncf/devstats for Kubeflow. #91
Conversation
@jlewi: GitHub didn't allow me to request PR reviews from the following users: lukaszgryglicki. Note that only kubeflow members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, added some suggestions/comments.
Please note that I have no ability to run/test this.
So I've just reviewed the code carefully - nothing more.
devstats/Dockerfile.devstats
Outdated
git clone https://github.com/cncf/devstats.git devstats | ||
|
||
RUN cd ${GOPATH}/src/devstats && \ | ||
git checkout v0.4.0 && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This release is old. I would go with master or let me create another v0.5.0 release tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
devstats/Dockerfile.devstats
Outdated
|
||
# Add /go/src/devstats to the path so that all the binaries will be on the path. | ||
# This is needed by the devstats binary. | ||
ENV PATH $PATH:/go/src/devstats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All other lgtm for me, but I cannot actually test this and confirm that it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem; wasn't expecting you to. Thanks for the comments.
devstats/Makefile
Outdated
# this a ksonnet app and run locally on minikube. | ||
run_postgre: | ||
-docker rm -f postgres | ||
docker run -d --name=postgres -p 5432:5432 -e POSTGRES_USER=gha_admin -e POSTGRES_PASSWORD=password -e POSTGRES_DB=gha postgres:10.1-alpine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't test this, so cannot actually say if this os OK or not. Basically it looks OK for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
devstats/README.md
Outdated
### Create the config map with the projects file | ||
|
||
kubectl create -n devstats configmap projects --from-file=./projects.yaml | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
projects.yaml
taken from cncf/devstats repo (it contains 22 or 23 CNCF projects) or just a new projects.yaml containing only kubeflow
config?
Basically this file must be customized per your needs.
EDIT: I see that you have created custom projects.yaml
: good, lgtm.
devstats/README.md
Outdated
|
||
``` | ||
Error(time=2018-04-18 18:39:01.274140856 +0000 UTC): | ||
Error: 'pq: database "devstats" does not exist' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
devstats
database is a separate postgres database that only holds logs (gha_logs
table).
Its setup is described in INSTALL_*.md files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done; thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I'm going to update auto deploy script - to check if that DB exists. If not (it is shared for all projects) assume that this is a "from zero" setup - create database, admin, ro_user, team role etc.
devstats/devstats_cli.yaml
Outdated
value: gha | ||
- name: PG_PASS | ||
value: password | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably need to refer logs database here somehow.
You need two Postgres databases minmally:
- Your project's database - can have any name
- Logs database named
devstats
containing single tablegha_logs
, see INSTALL_*.md files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have two databases both stored in a single instance of postgre.
devstats/ghadb_postgres.yaml
Outdated
- name: POSTGRES_PASSWORD | ||
value: password | ||
- name: POSTGRES_DB | ||
value: gha |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably logs database also needed here (or maybe logs database will be in a different pod/yaml/deployment)?
devstats/projects.yaml
Outdated
# TODO(jlewi): Move the start date back in time once we are done doing some initial testing | ||
start_date: 2018-04-16T00:00:00Z | ||
# Can we get rid of join date since we aren't part of CNCF yet? | ||
join_date: 2018-04-16T00:00:00Z |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can set it to null
just like opencontainers project in CNCF repo:
See: https://github.com/cncf/devstats/blob/master/projects.yaml#L241
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM although I am not sure if it works.
Thanks :-)
go get gopkg.in/yaml.v2 && \ | ||
go get github.com/google/go-github/github && \ | ||
go get golang.org/x/oauth2 && \ | ||
go get github.com/mattn/go-sqlite3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, @lukaszgryglicki Could we add vendor in devstats side in the future thus we could avoid go get
in the docker image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure we can.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlewi https://github.com/cncf/devstats/pull/99 devstas has vendor now and maybe we could not install so many deps in the docker image.
* This deploys postgres in a pod to store the data from github archive and influx db. * There is a cron job to run devstats to regularly sync the data. * There is also a K8s job that can be used to load data from a specific time range (useful for backfilling) * I've verified that data can be loaded into postgres and queried. * I don't know if any data is being dumped to influxdb. * We create a Docker image with the cncf/devstats binaries following the cncf/devstats instructions for installing the binaries in ubuntu 17. * Add stateful sets for influxdb and postgre * Add a stateful set to deploy devstats container so that we can do kubectl exec to run commands.
Thanks PTAL I made significant changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM although I am not sure if it works.
# /etc/projects-volume will be volume mounted into the pod. | ||
ln -sf /etc/projects-volume/projects.yaml /etc/gha2db/projects.yaml | ||
|
||
# TODO(jlewi): Per the instructions for devstats we should increase the number of default connections for postgres |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should set the postgre-docker-entrypoint.sh as entrypoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docker image is used to run other things; like the CLI stateful set. Which is why I didn't want to set the entrypoint here and instead do it in the pod spec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
|
||
RUN mkdir -p /home/ | ||
COPY postgre-docker-entrypoint.sh /usr/local/bin/ | ||
RUN chmod a+x /usr/local/bin/postgre-docker-entrypoint.sh |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we set the permission in docker image? Could we set it outside and copy it into the image or just set bash /usr/local/bin/postgre-docker-entrypoint.sh
as entrypoint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think permissions are preserved when you copy the file. Usually I have to do chmod a+x to make scripts executable inside the container.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This depends on the number of CPU/cores reported by the go runtime. Our servers report 96 for example. And this is the number of threads that devstats spawn. Sometimes it is too much for postgres default setup. But when the system reports 8 or even 24 cores - you can skip this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
@gaocegege LGTM? |
/lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: gaocegege, jlewi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
See kubeflow/kubeflow#1207 Update according to new sections
kubectl exec to run commands.
/assign @gaocegege
/cc @lukaszgryglicki
Related to #34
This change is