This repository has been archived by the owner on Apr 17, 2019. It is now read-only.

Submit Queue / munger: keep state over restarts #1042

Closed
lavalamp opened this issue May 23, 2016 · 10 comments

Comments

@lavalamp
Contributor

Could be as simple as storing a static file in GCS or as complex as adding a database to the things we run in the utility cluster.

P0 Requirements:

  • remember merge stats across restarts.
@ghost ghost self-assigned this May 23, 2016
@lavalamp
Contributor Author

@eparis @mikedanese @fejta maybe we can come up with a list of things we want to save.

@mhrgoog It'd be great if we could collect very explicitly in one place the set of things that are getting persisted.

@ghost

ghost commented May 24, 2016

Right now I am brainstorming and throwing ideas against the wall. I am not sure if this justifies a design doc or not. Here are three possibilities:

  1. A protobuf stored in a file.
    Pros: Provides backwards compatibility and parsing.
    Cons: It's just a protobuf. One must read in the whole structure before it can be used, and concurrent operations may not be easy.

  2. Key-value store: we can use some key-value store.
    Pros: Concurrent use should be easier.
    Cons: Not sure which one to pick or how robust they are. Maybe this is obvious to veterans of the team.

  3. Full-on relational DB.
    Pros: Fun with queries and the ability to mine information flexibly.
    Cons: Schema management can be a hassle.

I think even if we knew what we wanted to save now the list would change. But knowing the size of data we want would help a lot.

@fejta you have any ideas of what you are looking for?

@apelisse
Contributor

What problem are we trying to fix here?

@fejta
Contributor

fejta commented May 24, 2016

@mhrgoog The data in http://submit-queue.k8s.io/#/e2e is great... except it disappears whenever we restart the merge queue. I want it to be serialized to a GCS object so we can avoid that.

Specifically I need to be able to measure the following each week:

  • What percentage of the time was the merge bot healthy last week?
  • Which job was the most unhealthy last week?
  • How many things did it merge in the past week?

Right now I cannot because whenever we restart the mergebot everything is lost. So I wind up tracking these values since 4 hours ago instead.

At this point I am not concerned about fun with queries or concurrency. I want to be able to answer those three questions.
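
As a rough illustration only, the three questions above could be answered from a persisted log of health samples and merge timestamps along these lines. This is a sketch under stated assumptions, not what the submit queue actually does: `HealthSample`, `weekly_report`, and the record layout are hypothetical names, and samples are assumed to be taken at roughly even intervals so that the fraction of healthy samples approximates the fraction of healthy time.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class HealthSample:
    """One periodic observation of the merge bot (hypothetical record layout)."""
    time: datetime
    healthy: bool        # could the bot merge at this instant?
    failing_jobs: tuple  # names of the e2e jobs failing at this instant


def weekly_report(samples, merge_times, now):
    """Answer the three weekly questions for the 7 days ending at `now`."""
    week_ago = now - timedelta(days=7)
    recent = [s for s in samples if week_ago <= s.time <= now]

    # 1. Percentage of samples in which the bot was healthy.
    healthy_pct = None
    if recent:
        healthy_pct = 100.0 * sum(s.healthy for s in recent) / len(recent)

    # 2. The job that was failing in the most samples.
    failures = defaultdict(int)
    for s in recent:
        for job in s.failing_jobs:
            failures[job] += 1
    worst_job = max(failures, key=failures.get) if failures else None

    # 3. Number of merges in the window.
    merges = sum(1 for t in merge_times if week_ago <= t <= now)

    return {"healthy_pct": healthy_pct, "worst_job": worst_job, "merges": merges}
```

None of this works unless the sample and merge lists survive a restart, which is the point of the request.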

@lavalamp
Contributor Author

I recommend a json object over a protobuf.
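
A minimal sketch of what the JSON approach could look like, assuming the state is a small dictionary. The field names (`version`, `merges`, `health_samples`) and the functions `save_state` and `load_state` are hypothetical, and a local file stands in for the GCS object; the upload step is not shown.

```python
import json
import os
import tempfile


def save_state(path, state):
    """Write `state` as JSON, replacing `path` atomically.

    The data goes to a temporary file in the same directory first, so a
    crash mid-write leaves the previous snapshot intact.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2, sort_keys=True)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_state(path):
    """Load the saved state, or return an empty one on first start."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        # Hypothetical default layout; "version" leaves room for later changes.
        return {"version": 1, "merges": [], "health_samples": []}
```

One advantage of JSON here is that the snapshot can be inspected and patched by hand, at the cost of having no enforced schema.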

@eparis
Contributor

eparis commented May 24, 2016

I'm hearing about saving and analyzing lists of time series data. Neither is something I want the mungegithub tools to get much better at. I'd rather mungegithub remained focused on github and automating stuff around github, and that we build 'something else' to handle the data/visualization aspects we have been adding.

Should we be dumping (or having something poll) the e2e test data and the history data into something like InfluxDB, which is actually designed to hold this kind of data? I think it also has easy tooling for visualizing and understanding that data.

I'm the one who started the process of showing stats in the submit-queue, but as we want more we're probably best to ask what the right solution is, not what continues to be 'easy' to bolt onto the side...

Especially since in my mind the thing that would be GREAT to save across reboot is the proxied cache of github object state. So we don't have such a slow re-start and we don't run out of API tokens on restart...

@lavalamp
Contributor Author

I think @eparis is probably right: the best thing would be to publish metrics (I think @apelisse or @rmmh already started publishing via Prometheus?) and scrape them regularly so we can get time-series data for this stuff.

...however I'm super interested in getting something working yesterday. I'm OK with different short- and long-term solutions.

@eparis
Contributor

eparis commented May 24, 2016

@lavalamp no fights from me.

@lavalamp
Contributor Author

Remembering the queue order across restarts would be super handy, too. Right now it's chugging away on an unimportant PR instead of 1.3 PRs.

@ghost ghost removed their assignment Jun 2, 2016
@lavalamp
Contributor Author

I don't think we have an immediate need here now. We keep the github cache and the stat-scraping script handles queue restarts. Closing.
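
The stat-scraping script itself is not shown in this thread. As a hedged sketch, one common way for a scraper to tolerate restarts is to treat any drop in a monotonically increasing counter as a reset to zero, similar in spirit to how Prometheus handles counter resets. The function name `total_increase` is hypothetical.

```python
def total_increase(samples):
    """Total growth of a counter across scrapes, tolerating restarts.

    `samples` is a list of counter values ordered by scrape time. When a
    value is lower than the previous one, the process is assumed to have
    restarted from zero, so the whole new value counts as growth.
    """
    total = 0
    previous = None
    for value in samples:
        if previous is not None:
            if value >= previous:
                total += value - previous
            else:
                # Counter went backwards: assume a restart from zero.
                total += value
        previous = value
    return total
```

For example, scraped merge counts of 5, 8, 12, 2, 4 (with a restart between 12 and 2) give a total increase of 11. Increments that happen between the last scrape and the restart are lost, so this is a lower bound.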


4 participants