
Fix for #2842, instead of returning the first Pod, return the one which is Running #2846

Merged
merged 5 commits into coredns:master May 29, 2019

@zendai (Contributor) commented May 27, 2019

This fix is about the Kubernetes autopath module, which occasionally identifies the source Pod incorrectly. It returns the first Pod entry in the list based on an IP search, which is occasionally wrong because the list can also contain historical Pods with that IP. If the first Pod in the list is a historical one, the autopath module will work with incorrect information.

This fix returns a Pod object only if it is Running (and there should be only one Running Pod per IP). To do that I had to add a new field, which I called Phase (after the original field in the Kubernetes spec, to stay consistent) and which represents the status of the Pod.

As a result, we now look up the source Pod not just by IP but also by status, to make sure we get the correct one.
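
In sketch form (illustrative names only, not the exact plugin internals), the lookup changes from "first Pod matching the IP" to "the Running Pod matching the IP":

    // Sketch: find the source Pod for an IP, skipping historical
    // (non-Running) entries that may still carry the same IP.
    type Pod struct {
        Name      string
        Namespace string
        IP        string
        Phase     string // mirrors pod.Status.Phase from the Kubernetes API
    }

    func podByIP(pods []*Pod, ip string) *Pod {
        for _, p := range pods {
            // At most one Pod per IP should be Running at any given time.
            if p.IP == ip && p.Phase == "Running" {
                return p
            }
        }
        return nil
    }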

1. Why is this pull request needed and what does it do?

This is a bugfix for #2842.

2. Which issues (if any) are related?

Issue #2842.

3. Which documentation changes (if any) need to be made?

None.

4. Does this introduce a backward incompatible change or deprecation?

No.

corbot bot requested a review from bradbeam May 27, 2019

@corbot (bot) commented May 27, 2019

Thank you for your contribution. I've just checked the OWNERS files to find a suitable reviewer. This search was successful and I've asked bradbeam (via plugin/kubernetes/OWNERS) for a review.

If you have questions or suggestions for this bot, please file an issue against the miekg/dreck repository.

The bot understands the commands that are listed here.

@yongtang (Member) commented May 27, 2019

@zendai Looks like the email address you used in git history (A...@im...ve.com) is not registered with GitHub. For that reason, the commit you contributed is not showing your account picture.

It is certainly up to you to decide one way or another. Though if you want the commit you contributed to be associated with you, you may consider linking the email in GitHub (or using an already linked email when you commit the code).

@codecov-io commented May 27, 2019

Codecov Report

Merging #2846 into master will decrease coverage by 0.05%.
The diff coverage is 0%.


@@            Coverage Diff            @@
##           master   #2846      +/-   ##
=========================================
- Coverage   55.35%   55.3%   -0.06%     
=========================================
  Files         204     204              
  Lines       10241   10249       +8     
=========================================
- Hits         5669    5668       -1     
- Misses       4155    4165      +10     
+ Partials      417     416       -1
Impacted Files Coverage Δ
plugin/kubernetes/watch.go 0% <0%> (ø) ⬆️
plugin/kubernetes/controller.go 10.48% <0%> (-0.12%) ⬇️
plugin/forward/connect.go 81.69% <0%> (-4.23%) ⬇️
core/dnsserver/server_grpc.go 8% <0%> (-0.11%) ⬇️
core/dnsserver/server_https.go 0% <0%> (ø) ⬆️
plugin/file/reload.go 76.31% <0%> (+5.26%) ⬆️

Continue to review full report at Codecov.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 14b0792...7fb9ab5. Read the comment docs.

@zendai (Contributor, Author) commented May 27, 2019

@yongtang thank you for the note, I wasn't even thinking about this. I added the email address to my account; I think that's the simpler solution for now. Next time I'll be smarter! (maybe)

@@ -12,6 +12,7 @@ type Pod struct {
 	Name      string
 	Namespace string
 	Deleting  bool
+	Phase     string

@miekg (Member) May 27, 2019

ah, this is very expensive to add because it increases memory. Can't we use Deleting or figure out something there?

@zendai (Author, Contributor) May 27, 2019

@miekg Based on my limited understanding of the codebase (I just reviewed it today for the first time), this is the easiest way to fix this. If memory is a concern, instead of storing the status itself, we could store a bool to indicate whether it's running or not. What do you think?

The field will be less useful but will use less memory.
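
To make the trade-off concrete, a sketch (illustrative type names, not the actual diff):

    // Option A: keep the full phase string; on 64-bit a string header
    // alone is 16 bytes per Pod, plus the backing bytes.
    type PodWithPhase struct {
        Name      string
        Namespace string
        Deleting  bool
        Phase     string
    }

    // Option B: collapse the phase to a single bool at conversion time
    // (pod.Status.Phase == "Running"); one byte instead of a string.
    type PodWithFlag struct {
        Name      string
        Namespace string
        Deleting  bool
        Running   bool
    }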

@zendai (Author, Contributor) May 27, 2019

Btw, the status is about 5-6 characters long (not sure how much underlying memory this actually means though), is that really too much? 🙄

@zendai (Author, Contributor) May 27, 2019

Let me check if we can use Deleting.

@zendai (Author, Contributor) May 27, 2019

Can confirm: Deleting will be false for all the Pods, including the historical ones, so we can't use Deleting here.

@miekg (Member) left a comment

adding Phase is too expensive, esp considering autopath is sort of a hack

@zendai (Contributor, Author) commented May 27, 2019

We can't use the Deleting field, so I'll submit a new version that uses a bool instead of a string to represent whether the Pod is running. That costs more CPU cycles but less memory.

@zendai (Contributor, Author) commented May 27, 2019

@miekg I rewrote it using a bool instead of a string. That's my best bet; if you want to optimise this further, we'd have to make structural changes in many places.

@zendai (Contributor, Author) commented May 27, 2019

@miekg about "esp considering autopath is sort of a hack": could you please elaborate on what that means and what the consequences are?

Please don't take this the wrong way, but I couldn't find any sign in the documentation that autopath is a hack; according to the docs it's as much of a plugin as any other. Add to this that CoreDNS is the default DNS for Kubernetes and was recommended on the official Kubernetes blog (https://kubernetes.io/blog/2018/07/10/coredns-ga-for-kubernetes-cluster-dns/), and my confusion about what "hack" means only increases.

Should we not use it in production? Is it not supported as much as the other plugins?


@zendai (Contributor, Author) commented May 27, 2019

As I see it, there are two ways to avoid keeping Pods unless they are running:

  • Use the current Kubernetes API query mechanism and post-process the results, which would still require a field in the struct (pointless).
  • Filter the Pods at the Kubernetes API query level, in which case we don't need the field in the struct, because Kubernetes will handle the filtering.

This second version is somewhere here. Potentially we could use a FieldSelector as part of the Kubernetes client options.
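
For illustration, with the client-go of the time it could look roughly like this (an untested sketch, not the actual CoreDNS code; newer client-go versions also take a context argument):

    import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // listRunningPods asks the API server itself to filter, so the
    // client never even receives non-Running Pods.
    func listRunningPods(client kubernetes.Interface, namespace string) error {
        _, err := client.CoreV1().Pods(namespace).List(metav1.ListOptions{
            FieldSelector: "status.phase=Running",
        })
        return err
    }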

Also, this is theoretical, as I have no in-depth knowledge of the codebase and I'm not sure about the consequences. Can we confirm that adding such a filter at the Kubernetes query layer won't have negative side effects on any other component?

The current version only has one extra bool per Pod. Memory-wise that must be barely visible, even if we're talking about tens of thousands of Pods. Does it make sense to go with a higher-risk, deeper structural change to save a few kilobytes, maybe tens of kilobytes, of memory?

Just asking; I honestly don't have internal knowledge of the codebase, you know it better than I do.


@bradbeam (Collaborator) commented May 28, 2019

How about just not keeping pods unless they are running?

Would that enable us to drop p.Deleting? It looks like we're already withholding a response if p.Deleting == true.


@zendai (Contributor, Author) commented May 28, 2019

@miekg @johnbelamaric Please review this proposal. It avoids the extra field in the struct; instead, the conversion function ToPod returns nil if the Pod is not Running. We already had a condition that returns nil from that function if there is an error fetching the Pod, so I assume it's safe to return nil.
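
Roughly this shape (a sketch reusing the illustrative Pod type from the description above; the real ToPod fills in more fields):

    import api "k8s.io/api/core/v1"

    // ToPod converts an API Pod into the plugin's internal object.
    // Returning nil drops non-Running Pods at conversion time, the same
    // way the existing error path already returns nil.
    func ToPod(obj interface{}) interface{} {
        pod, ok := obj.(*api.Pod)
        if !ok {
            return nil
        }
        if pod.Status.Phase != api.PodRunning {
            return nil // historical Pod: don't keep it
        }
        return &Pod{
            Name:      pod.Name,
            Namespace: pod.Namespace,
            IP:        pod.Status.PodIP,
        }
    }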

Tested a few times, it works.

A more elegant solution would be what I mentioned before: ask Kubernetes to filter Pods using a FieldSelector, but that's a bigger change. It would touch two places: the Pod listing function and the Pod watcher function.

Please let me know if you're fine with this version, or you want the more elegant version with the FieldSelector.

@zendai (Contributor, Author) commented May 28, 2019

@miekg @johnbelamaric @bradbeam My latest commit is a proposal where we configure the FieldSelector on the Kubernetes API query, so the filtering is done completely at the Kubernetes level. Also, if there is already a FieldSelector, we just append our filter. This is the cleanest version so far.
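
The append logic is essentially this (sketch; function name illustrative):

    // addPhaseFilter combines a pre-existing field selector with our
    // phase filter; comma-joined requirements are ANDed by the API server.
    func addPhaseFilter(existing, filter string) string {
        if existing == "" {
            return filter
        }
        return existing + "," + filter
    }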

If you're sure that shortlisting Pods at that level is safe, I think this should be fine.

Please review and let me know what you think.

@chrisohaver (Member) commented May 28, 2019

The current proposed solution LGTM. (filtering at API list/watch)

@johnbelamaric (Member) commented May 28, 2019

Can we keep Running and Pending, as suggested by @miekg? That way we get the Pod info earlier, which reduces the likelihood of race conditions.

@zendai (Contributor, Author) commented May 28, 2019

@miekg @johnbelamaric @bradbeam Changed the FieldSelector to include Running and Pending.

Please note that the FieldSelector allows only logical AND, not OR. This means we can't filter for "Running OR Pending" explicitly, but we can achieve the same result implicitly by excluding all the other phases: Succeeded, Failed and Unknown.

The result is the same, we're filtering for Pods that are Running or Pending. Tested, it works.

The available Pod phases today are listed here, for reference. The potential risk is that Kubernetes introduces new Pod phases that our exclusion list won't cover; still, this is the best we can do for now if we want to include multiple phases.
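
Concretely, the resulting selector is an AND of exclusions, along the lines of:

    // "Running OR Pending", expressed as the exclusion of every other
    // phase, since field selectors cannot express OR directly.
    const podFieldSelector = "status.phase!=Succeeded,status.phase!=Failed,status.phase!=Unknown"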

Please let me know what you think.

@chrisohaver (Member) commented May 28, 2019

Agreed, it makes sense to include every status that eventually leads to Running, which IIUC is Pending and Running. @zendai, thanks for linking to the pod status definitions. LGTM

@johnbelamaric (Member) commented May 28, 2019

/lgtm

corbot bot approved these changes May 28, 2019

LGTM by johnbelamaric

miekg approved these changes May 29, 2019

miekg merged commit 7dde3f3 into coredns:master May 29, 2019

3 checks passed

codecov/project: 55.3% (target 50%)
continuous-integration/travis-ci/pr: The Travis CI build passed
stickler-ci: No lint errors found.

@agdolla referenced this pull request Jun 21, 2019: Add 1.5.1 release notes #2912 (Merged)
