nginx Ingress controller started seeing internal IPs; bad azure-ip-masq-agent update? #2076

Closed
roy-work opened this issue Jan 25, 2021 · 15 comments

Comments

@roy-work

What happened: At 1:48 a.m. on Saturday, our cluster appears to have received some sort of update to azure-ip-masq-agent: both the ConfigMap (CM) for the agent and the DaemonSet (DS) were listed as having been (re-)created at that time.

Also at that time, nginx-ingress-controller started reporting the client IP of incoming HTTP requests as being within the 10.0.0.0/8 subnet / within the pod subnet. (It reported node IPs on requests that clearly did not originate on nodes. All of the node IPs we noted in the logs also belonged to nodes on which nginx-ingress-controller was not running, which I think might be relevant?)

We looked at some other clusters (unfortunately, on different versions of k8s), and they all had this for an azure-ip-masq-agent-config:

nonMasqueradeCIDRs:
  - <stuff>

However, on the problematic cluster, the config was:

data:
  ip-masq-agent: |-
    nonMasqueradeCIDRs:
    masqLinkLocal: true
    resyncInterval: 60s

Lacking any other ideas, we added the pod subnet CIDR to nonMasqueradeCIDRs and restarted the pods in the azure-ip-masq-agent DS. This appears to have corrected the issue.
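For anyone trying to reason about that change, here is a rough sketch of how I picture the masquerade decision. It is only a simplified model of "destinations listed in nonMasqueradeCIDRs are exempt from source NAT", not the agent's actual code. The function name `should_masquerade` and the pod subnet `10.244.0.0/16` are made up for illustration, and treating the empty `nonMasqueradeCIDRs:` key as an empty list is an assumption.

```python
import ipaddress

# Link-local range; only exempt from masquerading when masqLinkLocal is false.
LINK_LOCAL = ipaddress.ip_network("169.254.0.0/16")


def should_masquerade(dest_ip, non_masq_cidrs, masq_link_local=False):
    """Return True if traffic to dest_ip would be source-NATed to the node IP.

    Simplified model: a destination inside any exempt CIDR keeps its
    original source address; every other destination is masqueraded.
    """
    dest = ipaddress.ip_address(dest_ip)
    exempt = [ipaddress.ip_network(cidr) for cidr in non_masq_cidrs]
    if not masq_link_local:
        exempt.append(LINK_LOCAL)
    return not any(dest in net for net in exempt)


# Hypothetical pod subnet, not taken from the affected cluster.
POD_SUBNET = "10.244.0.0/16"

# Empty list (assumed equivalent to the problematic config): pod-bound traffic is NATed.
print(should_masquerade("10.244.3.7", [], masq_link_local=True))            # True
# After adding the pod subnet: pod-bound traffic keeps its source address.
print(should_masquerade("10.244.3.7", [POD_SUBNET], masq_link_local=True))  # False
```

If that model is roughly right, an empty list would mean traffic forwarded to a pod on another node gets rewritten too, which would at least be consistent with what we saw in the nginx logs. I have not verified this against the agent itself.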

Why? AIUI, ip-masq-agent is meant to rewrite the source IP addresses of Internet-bound traffic to the node IPs. But traffic from the LB to the ingress controller isn't outbound / Internet-bound, so I'm at a loss as to why editing that CM had any effect at all. However, my understanding of both kube-proxy & ip-masq-agent is pretty rudimentary.

What you expected to happen: nginx to get the right (external) IP for incoming requests

How to reproduce it (as minimally and precisely as possible): We're not sure.

Anything else we need to know?: We use nginx-ingress in what we think is a bog standard config: there is an Azure LoadBalancer, behind that, nginx-ingress, and behind that, our services.

Environment:

  • Kubernetes version (use kubectl version): 1.13.x
  • Size of cluster (how many worker nodes are in the cluster?) 7
  • General description of workloads in the cluster (e.g. HTTP microservices, Java app, Ruby on Rails, machine learning, etc.)
  • Others:
@ghost ghost added the triage label Jan 25, 2021
@ghost

ghost commented Jan 25, 2021

Hi roy-work, AKS bot here 👋
Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

@ghost ghost added the action-required label Jan 28, 2021
@ghost

ghost commented Jan 28, 2021

Triage required from @Azure/aks-pm

@ghost

ghost commented Feb 2, 2021

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label Feb 2, 2021
@ghost

ghost commented Feb 17, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Mar 4, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Mar 19, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Apr 4, 2021

Issue needing attention of @Azure/aks-leads

@ghost ghost removed triage action-required Needs Attention 👋 Issues needs attention/assignee/owner labels Apr 5, 2021
@ghost ghost added the action-required label Apr 30, 2021
@ghost

ghost commented May 5, 2021

Action required from @Azure/aks-pm

@ghost ghost added the Needs Attention 👋 Issues needs attention/assignee/owner label May 5, 2021
@ghost

ghost commented May 20, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jun 5, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jun 20, 2021

Issue needing attention of @Azure/aks-leads

@ghost

ghost commented Jul 5, 2021

Issue needing attention of @Azure/aks-leads

@miwithro
Contributor

@roy-work what AKS Version are you currently running, and is the issue still there?

@ghost ghost removed action-required Needs Attention 👋 Issues needs attention/assignee/owner labels Jul 20, 2021
@ghost ghost added the stale Stale issue label Sep 18, 2021
@ghost

ghost commented Sep 18, 2021

This issue has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs within 15 days of this comment.

@ghost ghost closed this as completed Oct 4, 2021
@ghost

ghost commented Oct 4, 2021

This issue will now be closed because it hasn't had any activity for 15 days after being marked stale. roy-work, feel free to comment again within the next 7 days to reopen it, or open a new issue after that time if you still have a question, issue, or suggestion.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 3, 2021
This issue was closed.