Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sidecar detection mode #282

Merged
merged 4 commits into from
Mar 17, 2023
Merged

sidecar detection mode #282

merged 4 commits into from
Mar 17, 2023

Conversation

cybwan
Copy link
Contributor

@cybwan cybwan commented Mar 16, 2023

问题:
业务容器和边车容器启动的先后顺序是不定的;
如果业务容器先启动,但边车容器还未启动,这时探测边车容器的监听端口,探测不到,所以会标识_default.is_in_mesh = 0;
同时探测的结果会被 cache,不会再被重复探测,即is_in_mesh永远是 0,导致mb_connect.c中的几处分支逻辑永远不会执行.

解决方法:
因为 cni plugin 中已经校验当前的 POD 是否被 mess 纳管,进而决定是否启动SOCK_IP_MARK_PORT的监听;
即 POD 中SOCK_IP_MARK_PORT总是最先启动的, 所以只需探测SOCK_IP_MARK_PORT即可.

问题:
业务容器和边车启动容器的先后顺序是不定的;
如果业务容器先启动,但边车容器还未启动,这是探测边车容器的监听端口,探测不到,所以会标识_default.is_in_mesh = 0;
同时探测的结果会被 cache,不会再被重复探测,即is_in_mesh永远是 0,导致mb_connect.c中的几处分支逻辑永远不会执行.

解决方法:
因为 cni plugin 中已经校验当前的 POD 是否被 mess 纳管,进而决定是否启动SOCK_IP_MARK_PORT的监听;
所以这时只需探测SOCK_IP_MARK_PORT即可.

signed-off-by: cybwan <baili@flomesh.io>
@cybwan cybwan requested a review from a team as a code owner March 16, 2023 05:53
@mergify
Copy link

mergify bot commented Mar 16, 2023

Welcome to the Merbridge OpenSource Community!👏

We're delighted to have you onboard 💘

问题:
业务容器和边车容器启动的先后顺序是不定的;
如果业务容器先启动,但边车容器还未启动,这时探测边车容器的监听端口,探测不到,所以会标识_default.is_in_mesh = 0;
同时探测的结果会被 cache,不会再被重复探测,即is_in_mesh永远是 0,导致mb_connect.c中的几处分支逻辑永远不会执行.

解决方法:
因为 cni plugin 中已经校验当前的 POD 是否被 mess 纳管,进而决定是否启动SOCK_IP_MARK_PORT的监听;
即 POD 中SOCK_IP_MARK_PORT总是最先启动的, 所以只需探测SOCK_IP_MARK_PORT即可.

signed-off-by: cybwan <baili@flomesh.io>
@codecov-commenter
Copy link

codecov-commenter commented Mar 16, 2023

Codecov Report

Merging #282 (7234372) into main (73165f3) will not change coverage.
The diff coverage is n/a.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@           Coverage Diff           @@
##             main     #282   +/-   ##
=======================================
  Coverage   32.87%   32.87%           
=======================================
  Files           7        7           
  Lines         514      514           
=======================================
  Hits          169      169           
  Misses        331      331           
  Partials       14       14           
Flag Coverage Δ
unittests 32.87% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Member

@dddddai dddddai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! lgtm
I think this addresses #244

BTW I'm thinking about a new workaround to get rid of all the hacks(dummy socket, cni server), that would require vmlinux.h though. But I've been a bit busy...will file a design later

@kebe7jun
Copy link
Member

Last time my fix kept giving me errors at the e2e stage ......

@kebe7jun
Copy link
Member

e2e still failed.

@cybwan
Copy link
Contributor Author

cybwan commented Mar 17, 2023

e2e still failed.

test passed in multi nodes k8s cluster.
later i would test in kind cluster.

@dddddai
Copy link
Member

dddddai commented Mar 17, 2023

This pr will break non-cni mode(that's why the e2e test failed I think) since there's no dummy socket

@cybwan
Copy link
Contributor Author

cybwan commented Mar 17, 2023

right

signed-off-by: cybwan <baili@flomesh.io>
@cybwan cybwan requested a review from a team as a code owner March 17, 2023 07:13
signed-off-by: cybwan <baili@flomesh.io>
@kebe7jun
Copy link
Member

I can do without cni mode in ambient mode, but that's a different topic.

@kebe7jun kebe7jun merged commit f88730f into merbridge:main Mar 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants