Event Manager #3329
Comments
How do we decide what information to put into PodStatus/NodeStatus and what information to put into discrete events? I believe there is some kind of homomorphism here because in the limit we could
Also, on the archiving question -- could/should we just archive the whole etcd transaction log? (I assume etcd is structured as some kind of log that contains every mutating operation, perhaps with some kind of periodic compaction) This could be useful for post-hoc debugging and would give us archiving of events for free (if you were only interested in events when reading back, you'd skip all the non-event mutations). |
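As a rough illustration of reading events back out of an archived mutation log, here is a minimal sketch. The `Mutation` record shape and the `EVENT_PREFIX` key prefix are assumptions made for this example; they are not etcd's actual log format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical key prefix under which event objects are assumed to live.
EVENT_PREFIX = "/registry/events/"


@dataclass(frozen=True)
class Mutation:
    """One archived mutating operation (assumed shape, for illustration)."""
    index: int  # position in the archived log
    key: str    # key that was written
    value: str  # serialized object that was written


def events_only(log: Iterable[Mutation]) -> Iterator[Mutation]:
    """Yield only the mutations that touch event keys, in log order."""
    for mutation in log:
        if mutation.key.startswith(EVENT_PREFIX):
            yield mutation


archived_log = [
    Mutation(1, "/registry/pods/default/web", "phase=Pending"),
    Mutation(2, "/registry/events/default/web.1", "reason=Scheduled"),
    Mutation(3, "/registry/events/default/web.2", "reason=Pulled"),
    Mutation(4, "/registry/pods/default/web", "phase=Running"),
]

event_indexes = [m.index for m in events_only(archived_log)]
```

A reader interested only in events skips the pod mutations at indexes 1 and 4; a reader doing post-hoc debugging can replay the whole log instead.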
XStatus should have the entire current state of X; an event about X tells you only one detail. I think there is room for both. I added b.6. above to capture the idea of updating status based on incoming events. It's not clear if that's actually the best thing to do, but it's worth talking about.
This is a fair point-- perhaps we should solve archival globally. But it would be good to store it in a searchable/queryable format. |
I prefer that we solve archival globally, and if we use events as the first use case, that is fine for me. |
See #2298 too. We plan to move XStatus (at least PodStatus) computation to the Kubelet level, which means keeping XStatus as an API type is necessary. |
One possible model for Status vs. Events is the following (based on a similar system I worked on recently). I think this is similar to what you were saying. A component that starts with no information can construct the entire state of the cluster by observing the current values of the Status objects in etcd. If it later becomes backlogged or goes down for a long time, it can come back up and reconstruct the state this way. On the other hand, Event lifetimes are bounded (either by an explicit TTL, or the compaction interval of the transaction log, or the retention policy of the system that archives the transaction log, or whatever). So Events should only be used to convey information that is not mission-critical (for example, you specifically would not want to delegate conversion of Events into Status to an event manager that runs on top of Kubernetes, since losing some Events would corrupt the representation of the cluster's state; having Kubelet and API server compute Status is safest). This also provides a design guideline for components, which in the previous project we described as "edge-based" vs. "level-based": the former rely on seeing every object transition, while the latter can figure out what to do based only on their observation of the current Statuses and whatever private state they have stashed away. Unfortunately, by the time we understood this distinction we were already building edge-based components, so we just pretended the store transaction log would be archived long enough that even if an edge-based component was down (or fell behind) for a long time, it would always be able to catch up. This isn't a good/safe assumption to make. |
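The edge-based vs. level-based distinction above can be sketched in a few lines. This is a toy model under stated assumptions: the `EdgeBasedTracker` class, the `level_based_running` function, and the pod names and phases are all hypothetical and are not taken from any Kubernetes component.

```python
from typing import Dict, List, Tuple

Transition = Tuple[str, str]  # (pod name, new phase); assumed shape


class EdgeBasedTracker:
    """Builds its view by applying every transition it manages to see."""

    def __init__(self) -> None:
        self.phase: Dict[str, str] = {}

    def observe(self, transition: Transition) -> None:
        name, phase = transition
        self.phase[name] = phase

    def running(self) -> List[str]:
        return sorted(n for n, p in self.phase.items() if p == "Running")


def level_based_running(statuses: Dict[str, str]) -> List[str]:
    """Derives the same answer from the current Status values alone."""
    return sorted(n for n, p in statuses.items() if p == "Running")


# Full history of transitions, and the Status objects that result from it.
transitions = [("a", "Running"), ("b", "Running"), ("a", "Failed")]
current_status = {"a": "Failed", "b": "Running"}

# The edge-based component fell behind: the last transition expired
# (TTL or compaction) before it could be read.
edge = EdgeBasedTracker()
for transition in transitions[:2]:
    edge.observe(transition)

stale_view = edge.running()
fresh_view = level_based_running(current_status)
```

After the missed transition, the edge-based tracker still believes pod `a` is running, while the level-based function recovers the correct answer from the current Statuses without needing any history.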
We extracted the work items required for the v1 release and filed them separately. Lowering the priority of this to P3 to unblock v1. |
This is highly theoretical and in practice isn't an urgent issue. Please re-open if you disagree. |
Now that we have various components producing events, we need to start processing them. Our current policy of a 2-day TTL isn't going to scale.
So, I want us to build an event manager.
The basic sketch is that it reads events from the cluster, processes them, and archives them.
a. Read events
b. Process events
c. Archive events
d. Misc open questions
We can also explore the idea of Kubernetes writing events to a non-etcd (or separate etcd) DB in the first place. Events are the primary source of write load on etcd right now. That's good for the moment in that it's exposing some bugs in our use of etcd, but in the long term it's probably more efficient for us to use a different storage mechanism for events.
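A very rough sketch of the read / process / archive shape described above follows. The details of each step are not spelled out here, so everything in this example is an assumption: the `Event` tuple shape, the three function names, the choice of "collapse duplicates into counts" as the processing step, and the in-memory list standing in for a separate archive store.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Event = Tuple[str, str]  # (involved object, reason); assumed shape


def read_events(source: Iterable[Event]) -> List[Event]:
    """Step a: pull the pending events from the cluster (here, any iterable)."""
    return list(source)


def process_events(events: Iterable[Event]) -> Dict[Event, int]:
    """Step b (assumed): collapse identical events into a single count."""
    return dict(Counter(events))


def archive_events(summary: Dict[Event, int],
                   archive: List[Tuple[Event, int]]) -> None:
    """Step c: append the processed records to a store that is not etcd."""
    archive.extend(sorted(summary.items()))


cluster_events = [
    ("pod/a", "Pulled"),
    ("pod/b", "Failed"),
    ("pod/a", "Pulled"),
]

archive: List[Tuple[Event, int]] = []
archive_events(process_events(read_events(cluster_events)), archive)
```

Keeping the three steps as separate functions means the archive target can change (for example, to a non-etcd database) without touching how events are read or processed.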