[sflow] testDelAgent failed #10291

Open
nhe-NV opened this issue Oct 11, 2023 · 7 comments

Comments

@nhe-NV
Contributor

nhe-NV commented Oct 11, 2023

Description
The test still fails even after #9766 was merged.

Steps to reproduce the issue:

  1. Run the sflow test testDelAgent

Describe the results you received:
{"changed": true, "cmd": "//env-python3/bin/ptf --test-dir ptftests/py3 sflow_test --platform-dir ptftests --platform remote -t 'testbed_type='"'"'t0'"'"';router_mac='"'"'9c:05:91:9b:56:00'"'"';dst_port=3;agent_id='"'"'10.245.20.41'"'"';sflow_ports_file='"'"'/tmp/sflow_ports.json'"'"';polling_int=20;active_collectors="['"'"'collector0'"'"','"'"'collector1'"'"']"' --relax --debug info --log-file /tmp/TestAgentId.testDelAgent.log --socket-recv-size 16384", "delta": "0:00:30.819610", "end": "2023-10-10 23:56:54.898380", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2023-10-10 23:56:24.078770", "stderr": "//env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses\n import imp\nsflow_test.SflowTest ... FAIL\n\n======================================================================\nFAIL: sflow_test.SflowTest\n----------------------------------------------------------------------\nTraceback (most recent call last):\n File "ptftests/py3/sflow_test.py", line 310, in runTest\n 'collector0', self.poll_tests)\n File "ptftests/py3/sflow_test.py", line 189, in packet_analyzer\n data, collector, self.polling_int, port_sample)\n File "ptftests/py3/sflow_test.py", line 211, in analyze_counter_sample\n % (self.agent_id, rcvd_agent_id))\nAssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1\n\n----------------------------------------------------------------------\nRan 1 test in 29.349s\n\nFAILED (failures=1)", "stderr_lines": ["//env-python3/bin/ptf:19: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses", " import imp", "sflow_test.SflowTest ... 
FAIL", "", "======================================================================", "FAIL: sflow_test.SflowTest", "----------------------------------------------------------------------", "Traceback (most recent call last):", " File "ptftests/py3/sflow_test.py", line 310, in runTest", " 'collector0', self.poll_tests)", " File "ptftests/py3/sflow_test.py", line 189, in packet_analyzer", " data, collector, self.polling_int, port_sample)", " File "ptftests/py3/sflow_test.py", line 211, in analyze_counter_sample", " % (self.agent_id, rcvd_agent_id))", "AssertionError: False is not true : Agent id in Sampled packet is not expected . Expected : 10.245.20.41 , received : 20.1.1.1", "", "----------------------------------------------------------------------", "Ran 1 test in 29.349s", "", "FAILED (failures=1)"], "stdout": "Using packet manipulation module: ptf.packet_scapy\n\n*************************************\nATTENTION: SOME TESTS DID NOT PASS!!!\n\nThe following tests failed:\nSflowTest\n\n******************************************", "stdout_lines": ["Using packet manipulation module: ptf.packet_scapy", "", "", "ATTENTION: SOME TESTS DID NOT PASS!!!", "", "The following tests failed:", "SflowTest", "", ""]}

Describe the results you expected:

Additional information you deem important:

**Output of `show version`:**

```

202305_RC.7-c8447efe1_Internal
```

**Attach debug file `sudo generate_dump`:**

```
(paste your output here)
```
@Gokulnath-Raja
Contributor

@nhe-NV Could you please share the hsflowd version used for this test?

@Gokulnath-Raja
Contributor

After upgrading hsflowd to 2.0.51, the agent id that hsflowd selects once the configured agent id is deleted is not deterministic. In our setup, mgmt_ip (100.104.62.4) was preferred over loopback (10.1.0.32). It looks like a different agent id is being selected in your setup as well. Could you confirm whether 20.1.1.1 is configured on the loopback or on some other interface in your setup?
@dgsudharsan kindly help here. It looks like the agent id is selected randomly based on the IP addresses configured in the test topology.

@dgsudharsan
Contributor

@Gokulnath-Raja Please talk with the hsflowd team to understand the algorithm, and update the test logic based on it.

nhe-NV added a commit to nhe-NV/sonic-mgmt that referenced this issue Oct 13, 2023
Skip the test by github issue: sonic-net#10291

Change-Id: I06ee5185fc98c2ec824ab1c51aa7fe84837df258
@sflow

sflow commented Oct 16, 2023

Hello all, the logic for auto-selecting the sflow-agent-address was tweaked in this commit back in November 2022:
sflow/host-sflow@458295b

I'm not sure if this is the exact commit that explains what you are seeing, but the address priority defined here:
https://github.com/sflow/host-sflow/blob/v2.0.51-26/src/Linux/hsflowd.h#L259-L273
is certainly intending to choose 100.104.62.4 (IPSP_IP4) in preference to 10.1.0.32 (IPSP_IP4_RFC1918).

The reasoning behind this is that the global IP 100.104.62.4 is more likely to be unique and reachable from anywhere. It would be easy to imagine a multi-site network where two switches (perhaps on different LANs) both had 10.1.0.32.
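To illustrate the distinction between the two address classes, here is a minimal Python sketch that checks whether an IPv4 address falls in one of the three RFC 1918 ranges. The helper name `is_rfc1918` is hypothetical and is not part of hsflowd or sonic-mgmt; it only mirrors the IPSP_IP4 versus IPSP_IP4_RFC1918 split described above and does not reproduce hsflowd's actual classification code.

```python
import ipaddress

# The three private IPv4 ranges defined by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(addr: str) -> bool:
    """Return True if addr is an IPv4 address inside an RFC 1918 range.

    Hypothetical helper for illustration only.
    """
    ip = ipaddress.ip_address(addr)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)
```

Under this check, 10.1.0.32 is an RFC 1918 address while 100.104.62.4 is not, which is consistent with the management address being preferred.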

Does this answer the question?

@dgsudharsan
Contributor

@Gokulnath-Raja the test fix made in #9766 needs to be redone with a more deterministic approach.
@sflow Thanks for your explanation. We need the test to predict which IP is chosen next when we delete the agent. For that, the choice of IP needs to be deterministic. Can you please share the entire approach so that the sonic-mgmt test suite can integrate and verify it?

@mohanapriya-meganathan

Hi all,
I have analyzed the logic hsflowd uses to select the agent-id after the user-configured agent-id is deleted.
Here is the default agent-id selection logic in hsflowd:

  1. If agentip or agentname is configured in the settings or hardcoded in the configuration file, the agent-ip is chosen from those settings or that configuration file.

  2. Otherwise, hsflowd collects all configured IP interfaces (IPv4 and IPv6) and assigns each one the ipPriority (EnumIPSelectionPriority) appropriate for the interface.

https://github.com/sflow/host-sflow/blob/6296a172c2c3879126298dc66994d38e68956185/src/Linux/hsflowconfig.c#L1019

typedef enum { IPSP_NONE=0,
	 IPSP_CLASS_E,
	 IPSP_MULTICAST,
	 IPSP_LOOPBACK6,
	 IPSP_LOOPBACK4,
	 IPSP_SELFASSIGNED4,
	 IPSP_IP6_SCOPE_LINK,
	 IPSP_VLAN6,
	 IPSP_VLAN4,
	 IPSP_IP6_SCOPE_UNIQUE,
	 IPSP_IP6_SCOPE_GLOBAL,
	 IPSP_IP4_RFC1918,
	 IPSP_IP4,
	 IPSP_NUM_PRIORITIES,

} EnumIPSelectionPriority;

  1. Based on the selection priority, the IP interface with the highest priority is selected as the agent-id.
    https://github.com/sflow/host-sflow/blob/6296a172c2c3879126298dc66994d38e68956185/src/Linux/hsflowconfig.c#L1112
  2. If two interfaces have the same selection priority, the interface index and adaptor selection priority of each interface are compared.
  3. If the interface index is the same, the first discovered IP is chosen as the agent-id.
  4. If the interface index is different, the adaptor with the lower selection priority is chosen as the agent-id.
  5. If the adaptor priority is the same, the interface with the lower interface index is chosen as the agent-id.
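The steps above could be sketched in Python roughly as follows. This is only an illustration of the summary in this comment, not a port of the hsflowd source: the `Candidate` fields, the `select_agent` function, and the adaptor priority values are assumptions made up for the example, and the real tie-break behavior should be confirmed against hsflowconfig.c.

```python
from dataclasses import dataclass
from enum import IntEnum


class IPSP(IntEnum):
    """Mirror of EnumIPSelectionPriority as quoted above (higher wins)."""
    NONE = 0
    CLASS_E = 1
    MULTICAST = 2
    LOOPBACK6 = 3
    LOOPBACK4 = 4
    SELFASSIGNED4 = 5
    IP6_SCOPE_LINK = 6
    VLAN6 = 7
    VLAN4 = 8
    IP6_SCOPE_UNIQUE = 9
    IP6_SCOPE_GLOBAL = 10
    IP4_RFC1918 = 11
    IP4 = 12


@dataclass
class Candidate:
    """Hypothetical record for one discovered IP address."""
    address: str
    ip_priority: IPSP
    if_index: int
    adaptor_priority: int   # assumed: lower value is preferred
    discovery_order: int    # assumed: 0 for the first address discovered


def select_agent(candidates):
    """Pick the agent address using the tie-breaks summarized above.

    Order: highest IP priority, then lower adaptor priority,
    then lower interface index, then first discovered.
    """
    return min(
        candidates,
        key=lambda c: (-c.ip_priority, c.adaptor_priority,
                       c.if_index, c.discovery_order),
    )


if __name__ == "__main__":
    mgmt = Candidate("100.104.62.4", IPSP.IP4, 2, 0, 0)
    loopback = Candidate("10.1.0.32", IPSP.IP4_RFC1918, 1, 0, 1)
    print(select_agent([loopback, mgmt]).address)
```

With the two addresses from the earlier comment, the sketch picks 100.104.62.4, matching what was observed on the testbed.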

This is complex to implement in the test script. Instead, we could use a command or some other mechanism to display the agent-id that hsflowd selects when it falls back to the default.

@sflow

sflow commented Nov 9, 2023

Good summary. The chosen agent-address is written as one of the lines in the file /etc/hsflowd.auto (intended for other programs to read if they are going to export application-sFlow samples to the same collector). So I think that might be the easiest place for the test script to find it. Here is an example:

# WARNING: Do not edit this file. It is generated automatically by hsflowd.
rev_start=1
hostname=inmon
sampling=400
header=128
datagram=1400
polling=30
sampling.http=1
collector=127.0.0.1
agentIP=2001:468:1f07:ff1a::106
agent=ens3
ds_index=1
rev_end=1
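
A test script could read the selected address along these lines. This is a minimal sketch that assumes the file keeps the simple `key=value` layout shown in the example; the function names `parse_hsflowd_auto` and `read_agent_ip` are hypothetical and not part of sonic-mgmt. If a key appears more than once (for example several `collector=` lines), only the last value is kept, which is fine for `agentIP` but would need changing for other uses.

```python
def parse_hsflowd_auto(text):
    """Parse key=value lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '='
            settings[key.strip()] = value.strip()
    return settings


def read_agent_ip(path="/etc/hsflowd.auto"):
    """Return the agentIP value from the file, or None if it is absent."""
    with open(path) as f:
        return parse_hsflowd_auto(f.read()).get("agentIP")
```

On the DUT the test would call `read_agent_ip()` after deleting the configured agent id and compare the result with the agent address seen in the received sFlow datagrams.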
