Failed to create a new network on Neutron #77

Closed
oomichi opened this issue Mar 6, 2019 · 31 comments

Comments

@oomichi
Owner

oomichi commented Mar 6, 2019

Summary

Problem: the tenant network lb-mgmt-net cannot be created

$ openstack network create lb-mgmt-net
Error while executing command: HttpException: Unknown error, {"NeutronError": {"message": "Unable to create the network. No tenant network is available for allocation.", "type": "NoNetworkAvailable", "detail": ""}}

Resolved by configuring VXLAN as the tenant network type, as shown below.

$ diff -u ml2_conf.ini.orig ml2_conf.ini
--- ml2_conf.ini.orig   2019-03-06 10:54:51.062392771 -0800
+++ ml2_conf.ini        2019-03-06 12:08:06.510577001 -0800
@@ -1,9 +1,11 @@
 [ml2]
-type_drivers = flat,vlan
-tenant_network_types =
+type_drivers = flat,vxlan
+tenant_network_types = vxlan
 mechanism_drivers = linuxbridge
 extension_drivers = port_security

 [ml2_type_flat]
-flat_networks = provider,company
+flat_networks = provider

+[ml2_type_vxlan]
+vni_ranges = 1:1000
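
The fix above amounts to giving `tenant_network_types` a value that is also listed in `type_drivers`. A minimal sketch of a pre-flight check for that, assuming the file is plain INI that Python's `configparser` can read; the function name `check_tenant_network_types` is hypothetical, and the check only looks at values written in the file, not at Neutron's built-in defaults:

```python
import configparser


def check_tenant_network_types(ini_text):
    """Return a list of problems found in the [ml2] section of ml2_conf.ini text.

    Hypothetical helper: it inspects only what is written in the file and
    does not model Neutron's own default values.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    def split(value):
        # "flat, vxlan" -> ["flat", "vxlan"]; an empty value -> []
        return [item.strip() for item in value.split(",") if item.strip()]

    type_drivers = split(parser.get("ml2", "type_drivers", fallback=""))
    tenant_types = split(parser.get("ml2", "tenant_network_types", fallback=""))

    problems = []
    if not tenant_types:
        problems.append("tenant_network_types is empty")
    for tenant_type in tenant_types:
        # Only compare when type_drivers is explicitly restricted in the file.
        if type_drivers and tenant_type not in type_drivers:
            problems.append(
                "tenant network type %r is not in type_drivers" % tenant_type
            )
    return problems


if __name__ == "__main__":
    before = "[ml2]\ntype_drivers = flat,vlan\ntenant_network_types =\n"
    after = "[ml2]\ntype_drivers = flat,vxlan\ntenant_network_types = vxlan\n"
    print(check_tenant_network_types(before))
    print(check_tenant_network_types(after))
```

Run against the original file contents, the check reports the empty `tenant_network_types`; against the modified contents it reports nothing.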

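Change the linuxbridge agent settings on all nodes, including the controller node.
local_ip (192.168.1.59 here) must be the address of the interface that performs VXLAN encapsulation on that node.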

$ diff -u /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig /etc/neutron/plugins/ml2/linuxbridge_agent.ini
--- /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig 2019-03-06 12:29:49.326323707 -0800
+++ /etc/neutron/plugins/ml2/linuxbridge_agent.ini      2019-03-06 12:31:43.249367381 -0800
@@ -2,7 +2,8 @@
 physical_interface_mappings = provider:eno1

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.59

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

Problem: a floating IP cannot be attached

$ openstack floating ip set --port ac400c96-c53e-4ef2-ba3b-c5ba1381c34e 192.168.1.110
NotFoundException: Unknown erro

The tenant network had no route for sending traffic from inside to outside.
Resolved by connecting it to the provider network through a router.

$ openstack router create lb-mgmt-router
$ openstack router add subnet lb-mgmt-router lb-mgmt-subnet
$ openstack router set lb-mgmt-router --external-gateway provider

Problem: DHCP leases cannot be obtained on the VXLAN tenant network

The cause was the VXLAN configuration.
DHCP started working after the following changes.

--- /etc/neutron/plugins/ml2/ml2_conf.ini.orig  2019-03-06 10:54:51.062392771 -0800
+++ /etc/neutron/plugins/ml2/ml2_conf.ini       2019-03-08 10:31:28.388334795 -0800
@@ -1,9 +1,11 @@
 [ml2]
-type_drivers = flat,vlan
-tenant_network_types =
-mechanism_drivers = linuxbridge
+type_drivers = flat,vxlan
+tenant_network_types = vxlan
+mechanism_drivers = linuxbridge,l2population
 extension_drivers = port_security

 [ml2_type_flat]
-flat_networks = provider,company
+flat_networks = provider

+[ml2_type_vxlan]
+vni_ranges = 1:1000
--- /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig 2019-03-06 16:54:07.934162103 -0800
+++ /etc/neutron/plugins/ml2/linuxbridge_agent.ini      2019-03-08 10:35:50.800145841 -0800
@@ -2,7 +2,13 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1
+l2_population = true
+vxlan_group =
+
+[agent]
+prevent_arp_spoofing = true

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
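
The two diffs above have to agree with each other: the agent enables `l2_population` while the server side adds `l2population` to `mechanism_drivers`, and `enable_vxlan` comes with a `local_ip`. A minimal sketch of a consistency check over both files, assuming both are plain INI readable by `configparser`; the function name `check_vxlan_l2pop` and the problem messages are hypothetical:

```python
import configparser


def _load(ini_text):
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser


def check_vxlan_l2pop(ml2_ini_text, agent_ini_text):
    """Cross-check ml2_conf.ini against linuxbridge_agent.ini (hypothetical helper)."""
    ml2 = _load(ml2_ini_text)
    agent = _load(agent_ini_text)

    mechanism_drivers = [
        name.strip()
        for name in ml2.get("ml2", "mechanism_drivers", fallback="").split(",")
        if name.strip()
    ]
    enable_vxlan = agent.getboolean("vxlan", "enable_vxlan", fallback=False)
    l2_population = agent.getboolean("vxlan", "l2_population", fallback=False)
    local_ip = agent.get("vxlan", "local_ip", fallback="").strip()

    problems = []
    if enable_vxlan and not local_ip:
        problems.append("enable_vxlan is true but local_ip is not set")
    if l2_population and "l2population" not in mechanism_drivers:
        problems.append(
            "agent sets l2_population = true but mechanism_drivers lacks l2population"
        )
    return problems


if __name__ == "__main__":
    ml2_text = "[ml2]\nmechanism_drivers = linuxbridge,l2population\n"
    agent_text = (
        "[vxlan]\nenable_vxlan = true\nlocal_ip = 192.168.1.1\n"
        "l2_population = true\nvxlan_group =\n"
    )
    print(check_vxlan_l2pop(ml2_text, agent_text))
```

The sample values in the `__main__` block are taken from the diffs above; the check itself covers only these two relationships and is not a full validation of either file.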
@oomichi
Owner Author

oomichi commented Mar 6, 2019

/etc/neutron/plugins/ml2/ml2_conf.ini

[ml2]
type_drivers = flat,vlan
tenant_network_types =
mechanism_drivers = linuxbridge
extension_drivers = port_security

[ml2_type_flat]
flat_networks = provider,company

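This is probably because flat_networks allows only provider and company.
The default value is *, so I will change it to that and see whether it works.
Also, vlan has no type-specific settings and is not in use, so I will remove it.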

@oomichi
Owner Author

oomichi commented Mar 6, 2019

$ diff -u ml2_conf.ini.orig  ml2_conf.ini
--- ml2_conf.ini.orig   2019-03-06 10:54:51.062392771 -0800
+++ ml2_conf.ini        2019-03-06 10:55:06.574561000 -0800
@@ -1,9 +1,9 @@
 [ml2]
-type_drivers = flat,vlan
+type_drivers = flat
 tenant_network_types =
 mechanism_drivers = linuxbridge
 extension_drivers = port_security

 [ml2_type_flat]
-flat_networks = provider,company
+flat_networks = *

No good; the problem still occurs.

$ openstack network create lb-mgmt-net
Error while executing command: HttpException: Unknown error, {"NeutronError": {"message": "Unable to create the network. No tenant network is available for allocation.", "type": "NoNetworkAvailable", "detail": ""}}

@oomichi
Owner Author

oomichi commented Mar 6, 2019

Neutron error log

2019-03-06 11:04:20.890 2478 INFO neutron.wsgi [-] 127.0.0.1 "GET / HTTP/1.1" status: 200  len: 251 time: 0.0014389
2019-03-06 11:04:21.553 2478 INFO neutron.quota [req-74f39fe3-291f-4365-97ca-1f7310636376 e5e99065fd524f328c2f81e28a6fbc42 682e74f275fe427abd9eb6759f3b68c5 - default default] Loaded quota_driver: <neutron.db.quota.driver.DbQuotaDriver object at 0x7f91a0a78e50>.
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation [req-74f39fe3-291f-4365-97ca-1f7310636376 e5e99065fd524f328c2f81e28a6fbc42 682e74f275fe427abd9eb6759f3b68c5 - default default] POST failed.: NoNetworkAvailable: Unable to create the network. No tenant network is available for allocation.
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation Traceback (most recent call last):
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/pecan/core.py", line 683, in __call__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.invoke_controller(controller, args, kwargs, state)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/pecan/core.py", line 574, in invoke_controller
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     result = controller(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 91, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     setattr(e, '_RETRY_EXCEEDED', True)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 87, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 147, in wrapper
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     ectxt.value = e.inner_exc
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 135, in wrapper
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 126, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     LOG.debug("Retry wrapper got retriable exception: %s", e)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 122, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*dup_args, **dup_kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/pecan_wsgi/controllers/utils.py", line 76, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/pecan_wsgi/controllers/resource.py", line 159, in post
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return self.create(resources)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/pecan_wsgi/controllers/resource.py", line 177, in create
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return {key: creator(*creator_args, **creator_kwargs)}
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/common/utils.py", line 627, in inner
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(self, context, *args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 161, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return method(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 91, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     setattr(e, '_RETRY_EXCEEDED', True)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 87, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 147, in wrapper
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     ectxt.value = e.inner_exc
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_db/api.py", line 135, in wrapper
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*args, **kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 126, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     LOG.debug("Retry wrapper got retriable exception: %s", e)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     self.force_reraise()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     six.reraise(self.type_, self.value, self.tb)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/db/api.py", line 122, in wrapped
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     return f(*dup_args, **dup_kwargs)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/plugin.py", line 837, in create_network
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     result, mech_context = self._create_network_db(context, network)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/plugin.py", line 796, in _create_network_db
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     tenant_id)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/managers.py", line 209, in create_network_segments
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     segment = self._allocate_tenant_net_segment(context)
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation   File "/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/managers.py", line 272, in _allocate_tenant_net_segment
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation     raise exc.NoNetworkAvailable()
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation NoNetworkAvailable: Unable to create the network. No tenant network is available for allocation.
2019-03-06 11:04:21.832 2478 ERROR neutron.pecan_wsgi.hooks.translation
2019-03-06 11:04:22.132 2478 INFO neutron.wsgi [req-74f39fe3-291f-4365-97ca-1f7310636376 e5e99065fd524f328c2f81e28a6fbc42 682e74f275fe427abd9eb6759f3b68c5 - default default] 127.0.0.1 "POST /v2.0/networks HTTP/1.1" status: 503  len: 369 time: 1.2399452

@oomichi
Owner Author

oomichi commented Mar 6, 2019

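According to https://docs.openstack.org/newton/install-guide-ubuntu/launch-instance-networks-provider.html, is it because /etc/neutron/plugins/ml2/linuxbridge_agent.ini does not specify an Ethernet interface for lb-mgmt-net?
Each network needs a corresponding Ethernet interface, and a flat network carries inter-node traffic without any separation such as VLAN, so I think only one network can be created per Ethernet interface. Otherwise DHCP traffic could not be isolated. Choosing VLAN for this IaaS may have been the right call.
First, identify the cause of the error above.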

Location of the error
/usr/lib/python2.7/dist-packages/neutron/plugins/ml2/managers.py

    def _allocate_tenant_net_segment(self, context):
        for network_type in self.tenant_network_types:
            segment = self._allocate_segment(context, network_type)
            if segment:
                return segment
        raise exc.NoNetworkAvailable()

The comment in the flat network type driver states explicitly that tenant networks are not supported.
neutron/plugins/ml2/drivers/type_flat.py#n103

    def allocate_tenant_segment(self, context):
        # Tenant flat networks are not supported.
        return

The VLAN type driver, on the other hand, does implement tenant networks.
http://git.openstack.org/cgit/openstack/neutron/tree/neutron/plugins/ml2/drivers/type_vlan.py#n203

    def allocate_tenant_segment(self, context):
        for physnet in self.network_vlan_ranges:
            alloc = self.allocate_partially_specified_segment(
                context, physical_network=physnet)
            if alloc:
                break
        else:
            return
        return {api.NETWORK_TYPE: p_const.TYPE_VLAN,
                api.PHYSICAL_NETWORK: alloc.physical_network,
                api.SEGMENTATION_ID: alloc.vlan_id,
                api.MTU: self.get_mtu(alloc.physical_network)}

Are flat networks meant only for external (Internet) connectivity?
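
To see why the loop in `_allocate_tenant_net_segment` ends in `NoNetworkAvailable` here, the following is a self-contained toy model of the excerpts above. It is not Neutron code: the classes `FlatTypeDriver` and `VxlanTypeDriver`, the `drivers` dictionary, and the sequential VNI allocation are all simplifying assumptions made for illustration.

```python
class NoNetworkAvailable(Exception):
    """Stand-in for the exception raised in the traceback above."""


class FlatTypeDriver:
    def allocate_tenant_segment(self):
        # Mirrors the excerpt: tenant flat networks are not supported.
        return None


class VxlanTypeDriver:
    def __init__(self, vni_min, vni_max):
        # Toy allocator: hands out VNIs in order (real allocation differs).
        self._free_vnis = iter(range(vni_min, vni_max + 1))

    def allocate_tenant_segment(self):
        vni = next(self._free_vnis, None)
        if vni is None:
            return None
        return {"network_type": "vxlan", "segmentation_id": vni}


def allocate_tenant_net_segment(tenant_network_types, drivers):
    """Same control flow as the excerpt: first driver that allocates wins."""
    for network_type in tenant_network_types:
        segment = drivers[network_type].allocate_tenant_segment()
        if segment:
            return segment
    # Reached when the list is empty or every driver returned nothing.
    raise NoNetworkAvailable("No tenant network is available for allocation.")


if __name__ == "__main__":
    drivers = {"flat": FlatTypeDriver(), "vxlan": VxlanTypeDriver(1, 1000)}
    for types in ([], ["flat"], ["vxlan"]):
        try:
            print(types, "->", allocate_tenant_net_segment(types, drivers))
        except NoNetworkAvailable as exc:
            print(types, "->", "NoNetworkAvailable:", exc)
```

In this model an empty type list and a list containing only `flat` both fall through to the `raise`, while a list containing `vxlan` returns a segment.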

@oomichi
Owner Author

oomichi commented Mar 6, 2019

Devstack configuration used in the gate

[securitygroup]
firewall_driver = openvswitch

[ml2]
tenant_network_types = vxlan
extension_drivers = port_security,qos
mechanism_drivers = openvswitch,linuxbridge

[ml2_type_gre]
tunnel_id_ranges = 1:1000

[ml2_type_vxlan]
vni_ranges = 1:1000

[ml2_type_flat]
flat_networks = public,

[ml2_type_vlan]
network_vlan_ranges = public

[ml2_type_geneve]
vni_ranges = 1:1000

[agent]
extensions = qos
tunnel_types = vxlan
root_helper_daemon = sudo /usr/local/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
root_helper = sudo /usr/local/bin/neutron-rootwrap /etc/neutron/rootwrap.conf

[ovs]
datapath_type = system
bridge_mappings = public:br-ex
tunnel_bridge = br-tun
local_ip = 198.72.124.7

@oomichi
Owner Author

oomichi commented Mar 6, 2019

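The real problem was that tenant_network_types was not set in the first place.
Conversely, the gate configuration does not set type_drivers; is that OK?
→ No problem: the default is type_drivers = local,flat,vlan,gre,vxlan,geneve, which enables all of them.
The gate uses VXLAN for tenant networks, so I will do the same.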

Try the following change.

$ diff -u ml2_conf.ini.orig ml2_conf.ini
--- ml2_conf.ini.orig   2019-03-06 10:54:51.062392771 -0800
+++ ml2_conf.ini        2019-03-06 12:08:06.510577001 -0800
@@ -1,9 +1,11 @@
 [ml2]
-type_drivers = flat,vlan
-tenant_network_types =
+type_drivers = flat,vxlan
+tenant_network_types = vxlan
 mechanism_drivers = linuxbridge
 extension_drivers = port_security

 [ml2_type_flat]
-flat_networks = provider,company
+flat_networks = provider

+[ml2_type_vxlan]
+vni_ranges = 1:1000

@oomichi
Owner Author

oomichi commented Mar 6, 2019

It worked.

$ openstack network create lb-mgmt-net
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | UP                                   |
| availability_zone_hints   |                                      |
| availability_zones        |                                      |
| created_at                | 2019-03-06T20:27:31Z                 |
| description               |                                      |
| dns_domain                | None                                 |
| id                        | e2971ef3-e5ac-4642-b8a0-9c9007069716 |
| ipv4_address_scope        | None                                 |
| ipv6_address_scope        | None                                 |
| is_default                | False                                |
| is_vlan_transparent       | None                                 |
| mtu                       | 1450                                 |
| name                      | lb-mgmt-net                          |
| port_security_enabled     | True                                 |
| project_id                | 682e74f275fe427abd9eb6759f3b68c5     |
| provider:network_type     | vxlan                                |
| provider:physical_network | None                                 |
| provider:segmentation_id  | 83                                   |
| qos_policy_id             | None                                 |
| revision_number           | 2                                    |
| router:external           | Internal                             |
| segments                  | None                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tags                      |                                      |
| updated_at                | 2019-03-06T20:27:31Z                 |
+---------------------------+--------------------------------------+
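
The `mtu` value of 1450 in the output above is consistent with VXLAN encapsulation over IPv4 on a 1500-byte underlay. A small worked calculation, assuming an IPv4 underlay with a 1500-byte MTU and no extra options or VLAN tags; the constant and function names are illustrative only:

```python
# Per-packet overhead added by VXLAN encapsulation over IPv4 (in bytes).
OUTER_IPV4_HEADER = 20
OUTER_UDP_HEADER = 8
VXLAN_HEADER = 8
INNER_ETHERNET_HEADER = 14


def vxlan_tenant_mtu(underlay_mtu=1500):
    """MTU left for the tenant network after VXLAN/IPv4 encapsulation."""
    overhead = (
        OUTER_IPV4_HEADER + OUTER_UDP_HEADER + VXLAN_HEADER + INNER_ETHERNET_HEADER
    )
    return underlay_mtu - overhead


if __name__ == "__main__":
    print(vxlan_tenant_mtu())  # 1500 - 50 = 1450
```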

@oomichi
Owner Author

oomichi commented Mar 6, 2019

The compute nodes also need a configuration change.
192.168.1.59 below is the IP address of the compute node.
The overlay network is built on top of this address.

$ diff -u /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig /etc/neutron/plugins/ml2/linuxbridge_agent.ini
--- /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig 2019-03-06 12:29:49.326323707 -0800
+++ /etc/neutron/plugins/ml2/linuxbridge_agent.ini      2019-03-06 12:31:43.249367381 -0800
@@ -2,7 +2,8 @@
 physical_interface_mappings = provider:eno1

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.59

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

@oomichi
Owner Author

oomichi commented Mar 6, 2019

  1. Only the provider network exists
    Pass
  2. Test nova boot -> expected to succeed with the provider network
    Pass
  3. Create a network and subnet with VXLAN
    Pass
$ openstack network create lb-mgmt-net
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
...
| provider:network_type     | vxlan                                |
+---------------------------+--------------------------------------+
$ openstack subnet create --network lb-mgmt-net --allocation-pool start=192.168.10.100,end=192.168.10.200 --dns-nameserver 8.8.4.4 --gateway 192.168.10.1 --subnet-range 192.168.10.0/24 lb-mgmt-subnet
  4. nova boot without specifying a network -> expected to fail
    Pass
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 test
ERROR (Conflict): Multiple possible networks found, use a Network ID to be more specific. (HTTP 409) (Request-ID: req-2401b84f-3e12-417c-8acb-9d4e8c1861e1)
  5. nova boot with a network specified -> expected to succeed
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net test
$ nova list
+--------------------------------------+------+--------+------------+-------------+----------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                   |
+--------------------------------------+------+--------+------------+-------------+----------------------------+
| 9103f6e7-73d2-4e95-a89f-342d57fc1307 | test | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.106 |
+--------------------------------------+------+--------+------------+-------------+----------------------------+

The VM booted, but its NIC does not appear to have come up.

$ nova console-log test
...
[  OK  ] Reached target Network (Pre).
         Starting Raise network interfaces...
[*     ] A start job is running for Raise network interfaces (9s / 5min 3s)
[**    ] A start job is running for Raise network interfaces (9s / 5min 3s)
[***   ] A start job is running for Raise network interfaces (10s / 5min 3s)
[ ***  ] A start job is running for Raise network interfaces (10s / 5min 3s)
...
3s / 5min 3s)
[FAILED] Failed to start Raise network interfaces.
See 'systemctl status networking.service' for details.
         Starting Initial cloud-init job (metadata service crawler)...
[  OK  ] Reached target Network.
[  309.176412] cloud-init[840]: Cloud-init v. 18.2 running 'init' at Wed, 06 Mar 2019 21:00:01 +0000. Up 309.02 seconds.
[  309.178438] cloud-init[840]: ci-info: ++++++++++++++++++++++++++++++++++++Net device info+++++++++++++++++++++++++++++++++++++
[  309.180268] cloud-init[840]: ci-info: +--------+------+------------------------------+-----------+-------+-------------------+
[  309.182063] cloud-init[840]: ci-info: | Device |  Up  |           Address            |    Mask   | Scope |     Hw-Address    |
[  309.183878] cloud-init[840]: ci-info: +--------+------+------------------------------+-----------+-------+-------------------+
[  309.185706] cloud-init[840]: ci-info: |  ens3  | True |              .               |     .     |   .   | fa:16:3e:60:96:30 |
[  309.187473] cloud-init[840]: ci-info: |  ens3  | True | fe80::f816:3eff:fe60:9630/64 |     .     |  link | fa:16:3e:60:96:30 |
[  309.189276] cloud-init[840]: ci-info: |   lo   | True |          127.0.0.1           | 255.0.0.0 |   .   |         .         |
[  309.191042] cloud-init[840]: ci-info: |   lo   | True |           ::1/128            |     .     |  host |         .         |
[  309.192862] cloud-init[840]: ci-info: +--------+------+------------------------------+-----------+-------+-------------------+
[  309.194626] cloud-init[840]: 2019-03-06 21:00:01,823 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [0/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7fd6df673470>: Failed to establish a new connection: [Errno 101] Network is unreachable',))]
...
Ubuntu 16.04.4 LTS ubuntu ttyS0

ubuntu login:

Outbound traffic from cloud-init also fails.

@oomichi
Owner Author

oomichi commented Mar 6, 2019

Boot a VM with two NICs and check whether it can communicate over lb-mgmt-net.

$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net --nic net-name=provider test2
$ nova list
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                                           |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+
| 9103f6e7-73d2-4e95-a89f-342d57fc1307 | test  | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.106                         |
| f967637f-bbd5-45a1-95eb-fd0b9b9d1e1c | test2 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.108; provider=192.168.1.107 |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+

The console log shows that outbound traffic fails.
Setting a default gateway on lb-mgmt-net may be the problem.
Try removing it.

$ openstack subnet delete 1b88db03-e254-4415-ac53-0a548a1f16f0
$ openstack subnet create --network lb-mgmt-net --allocation-pool start=192.168.10.100,end=192.168.10.200 --subnet-range 192.168.10.0/24 lb-mgmt-subnet
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net test1
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net --nic net-name=provider test2
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=provider test3
$ nova list
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                                           |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+
| 84ed1896-dfc7-4134-95eb-a3effeeeb22e | test1 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.105                         |
| b4f9d49e-908f-4b68-a5da-eb2403e77f29 | test2 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.100; provider=192.168.1.110 |
| 81e649c1-caa6-487a-a1b1-a1e6b65073cd | test3 | ACTIVE | -          | Running     | provider=192.168.1.106                             |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------------------------+

Ping reaches test3, which has only the provider network, but not test2.
The console log shows that bringing up the NIC fails.

[  OK  ] Reached target Network (Pre).
         Starting Raise network interfaces...
[*     ] A start job is running for Raise network interfaces (8s / 5min 3s)
[**    ] A start job is running for Raise network interfaces (9s / 5min 3s)
[***   ] A start job is running for Raise network interfaces (10s / 5min 3s)
[ ***  ] A start job is running for Raise network interfaces (10s / 5min 3s)
[  *** ] A start job is running for Raise network interfaces (11s / 5min 3s)
[   ***] A start job is running for Raise network interfaces (11s / 5min 3s)
[    **] A start job is running for Raise network interfaces (12s / 5min 3s)
[     *] A start job is running for Raise network interfaces (13s / 5min 3s)
[    **] A start job is running for Raise network interfaces (13s / 5min 3s)
[   ***] A start job is running for Raise network interfaces (14s / 5min 3s)
[  *** ] A start job is running for Raise network interfaces (14s / 5min 3s)
[ ***  ] A start job is running for Raise network interfaces (15s / 5min 3s)
[***   ] A start job is running for Ra

Log for the successful case (test3)

[  OK  ] Reached target Network (Pre).
         Starting Raise network interfaces...
[  OK  ] Started Raise network interfaces.
         Starting Initial cloud-init job (metadata service crawler)...
[  OK  ] Reached target Network.

@oomichi
Owner Author

oomichi commented Mar 7, 2019

I forgot to add the VXLAN settings to linuxbridge_agent.ini on the controller.

$ diff -u linuxbridge_agent.ini.orig linuxbridge_agent.ini
--- linuxbridge_agent.ini.orig  2019-03-06 16:54:07.934162103 -0800
+++ linuxbridge_agent.ini       2019-03-06 16:54:55.670651200 -0800
@@ -2,7 +2,8 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

@oomichi
Owner Author

oomichi commented Mar 7, 2019

Delete the VMs, subnet, and network, then try again.

$ openstack network create lb-mgmt-net
$ openstack subnet create --network lb-mgmt-net --allocation-pool start=192.168.10.100,end=192.168.10.200 --subnet-range 192.168.10.0/24 lb-mgmt-subnet
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net test1

The behavior did not change.

@oomichi
Owner Author

oomichi commented Mar 7, 2019

Troubleshooting

$ openstack network agent list
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type           | Host       | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+
| 1a7faecf-bd6b-44a7-b456-9d56506dcbf8 | Metadata agent       | iaas-ctrl  | None              | :-)   | UP    | neutron-metadata-agent    |
| 2cb40e67-c41c-4172-b742-699dc85451fb | Linux bridge agent   | iaas-cpu02 | None              | XXX   | UP    | neutron-linuxbridge-agent |
| 2ff0a087-636f-413d-9394-d015a5a4f032 | Linux bridge agent   | iaas-cpu03 | None              | XXX   | UP    | neutron-linuxbridge-agent |
| 3c658599-86f3-4fc1-bc2e-0f06cc14d29e | DHCP agent           | iaas-ctrl  | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 3c66d18c-5670-42ab-9fa7-4c4582469b0b | Linux bridge agent   | iaas-ctrl  | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 4c3e58ff-5d9a-4a63-bf1d-30694882b11c | Loadbalancerv2 agent | iaas-ctrl  | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| 73af79f5-9358-4564-9d48-a54e790c83dc | Linux bridge agent   | iaas-cpu01 | None              | :-)   | UP    | neutron-linuxbridge-agent |
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+

The linuxbridge agents on cpu02 and cpu03 are dead...
Log from cpu02:

2019-03-06 12:36:21.457 1002 INFO neutron.common.config [-] /usr/bin/neutron-linuxbridge-agent version 12.0.2
2019-03-06 12:36:21.457 1002 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Interface mappings: {'provider': 'eno1'}
2019-03-06 12:36:21.458 1002 INFO neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Bridge mappings: {}
2019-03-06 12:36:21.507 1002 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Tunneling cannot be enabled without the local_ip bound to an interface on the host. Please configure local_ip 192.168.1.61 on the host interface to be used for tunneling and restart the agent.

Restarted the agent on cpu02 and cpu03:

# systemctl restart neutron-linuxbridge-agent.service

That fixed it.

$ openstack network agent list
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type           | Host       | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+
| 1a7faecf-bd6b-44a7-b456-9d56506dcbf8 | Metadata agent       | iaas-ctrl  | None              | :-)   | UP    | neutron-metadata-agent    |
| 2cb40e67-c41c-4172-b742-699dc85451fb | Linux bridge agent   | iaas-cpu02 | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 2ff0a087-636f-413d-9394-d015a5a4f032 | Linux bridge agent   | iaas-cpu03 | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 3c658599-86f3-4fc1-bc2e-0f06cc14d29e | DHCP agent           | iaas-ctrl  | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 3c66d18c-5670-42ab-9fa7-4c4582469b0b | Linux bridge agent   | iaas-ctrl  | None              | :-)   | UP    | neutron-linuxbridge-agent |
| 4c3e58ff-5d9a-4a63-bf1d-30694882b11c | Loadbalancerv2 agent | iaas-ctrl  | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| 73af79f5-9358-4564-9d48-a54e790c83dc | Linux bridge agent   | iaas-cpu01 | None              | :-)   | UP    | neutron-linuxbridge-agent |
+--------------------------------------+----------------------+------------+-------------------+-------+-------+---------------------------+

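As a result, the console log shows that the NIC now comes up properly.
→ It is unclear why this symptom occurs merely because neutron-linuxbridge-agent was not running on some of the nodes.
However, on a VM with two NICs, the NICs still fail to come up.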
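
Checking the Alive column by eye is easy to get wrong on a larger cluster. A minimal sketch, assuming the table text from `openstack network agent list` has been captured into a string (the helper name `dead_agents` and the shortened IDs in the sample are illustrative, not part of any OpenStack tool):

```python
def dead_agents(table_text):
    """Return (host, binary) pairs whose Alive column is not ':-)'."""
    header = None
    dead = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first table row holds the column names
            continue
        row = dict(zip(header, cells))
        if row["Alive"] != ":-)":
            dead.append((row["Host"], row["Binary"]))
    return dead


# Shortened sample rows in the same layout as the table above.
SAMPLE = """\
| ID | Agent Type | Host | Availability Zone | Alive | State | Binary |
| 2cb40e67 | Linux bridge agent | iaas-cpu02 | None | XXX | UP | neutron-linuxbridge-agent |
| 3c66d18c | Linux bridge agent | iaas-ctrl | None | :-) | UP | neutron-linuxbridge-agent |
"""
print(dead_agents(SAMPLE))
```

Any host it reports is a candidate for checking the agent log and restarting the service, as was done for cpu02 and cpu03 here.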

oomichi commented Mar 7, 2019

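Plan: assign a floating IP to one of two VMs on lb-mgmt-net, log in over SSH, and confirm that ping works across lb-mgmt-net.
The NIC also failed to come up on the second VM. Both VMs are attached only to lb-mgmt-net, so why does the NIC come up on the first VM but fail on the later one?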

The same thing happened on the second attempt.
VM 1: test1 booted on iaas-cpu02. Success
VM 2: test2 booted on iaas-cpu03. Failure
VM 3: test3 booted on iaas-cpu01. Failure
VM 4: test4 booted on iaas-cpu02. Success
VM 5: test5 booted on iaas-cpu03. Failure
VM 6: test6 booted on iaas-cpu01. Failure
VM 7: test7 booted on iaas-cpu02. Success

In other words, only VMs booted on iaas-cpu02 get a working NIC.
For now, confirm that test1 and test4 can communicate with each other.

$ openstack floating ip create provider
$ openstack floating ip list
+--------------------------------------+---------------------+------------------+------+--------------------------------------+----------------------------------+
| ID                                   | Floating IP Address | Fixed IP Address | Port | Floating Network                     | Project                          |
+--------------------------------------+---------------------+------------------+------+--------------------------------------+----------------------------------+
| f2bf1831-efa8-4b26-a14b-59457b8ff180 | 192.168.1.110       | None             | None | bfd9fd43-c9b4-43ad-bb67-930c674f2605 | 682e74f275fe427abd9eb6759f3b68c5 |
+--------------------------------------+---------------------+------------------+------+--------------------------------------+----------------------------------+
$ openstack floating --debug ip set --port ac400c96-c53e-4ef2-ba3b-c5ba1381c34e 192.168.1.110
...
REQ: curl -g -i -X PUT http://iaas-ctrl:9696/v2.0/floatingips/f2bf1831-efa8-4b26-a14b-59457b8ff180 -H "User-Agent: osc-lib/1.9.0 keystoneauth1/3.4.0 python-requests/2.18.4 CPython/2.7.12" -H "Content-Type: application/json" -H "X-Auth-Token: {SHA1}9f05f838fbe06d8a0b150aa231b8c8eaa4d289a1" -d '{"floatingip": {"port_id": "ac400c96-c53e-4ef2-ba3b-c5ba1381c34e"}}'
http://iaas-ctrl:9696 "PUT /v2.0/floatingips/f2bf1831-efa8-4b26-a14b-59457b8ff180 HTTP/1.1" 404 306
RESP: [404] Content-Type: application/json Content-Length: 306 X-Openstack-Request-Id: req-37bb0232-ff1c-4180-b7d6-92c522936cc1 Date: Thu, 07 Mar 2019 01:46:03 GMT Connection: keep-alive
RESP BODY: {"NeutronError": {"message": "External network bfd9fd43-c9b4-43ad-bb67-930c674f2605 is not reachable from subnet 8e6e3ead-b4b6-44d5-bd72-31662fb16183.  Therefore, cannot associate Port ac400c96-c53e-4ef2-ba3b-c5ba1381c34e with a Floating IP.", "type": "ExternalGatewayForFloatingIPNotFound", "detail": ""}}

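Failed, because the external network provider is not reachable from the subnet lb-mgmt-subnet.
Wasn't a floating IP supposed to be the thing that exposes a VM on a local network to the outside in the first place?
→ Probably outbound traffic from the inside has to work first. It looks like the two networks need to be connected with a router.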

oomichi commented Mar 7, 2019

I filed the unhelpful help message as https://storyboard.openstack.org/#!/story/2005163.

oomichi commented Mar 7, 2019

Connected the network lb-mgmt-net and the external network provider through the router lb-mgmt-router as follows:

$ openstack router create lb-mgmt-router
$ openstack router add subnet lb-mgmt-router lb-mgmt-subnet
$ openstack router set lb-mgmt-router --external-gateway provider

Attaching the floating IP again then succeeded.

$ nova list
+--------------------------------------+-------+--------+------------+-------------+----------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                   |
+--------------------------------------+-------+--------+------------+-------------+----------------------------+
| 95f05cd5-2c55-4957-bc9e-18f63074c0f7 | test1 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.105 |
+--------------------------------------+-------+--------+------------+-------------+----------------------------+
$ openstack floating ip set --port a8357da0-7f64-4c25-ae52-124a2dbbfb03 192.168.1.110
$ nova list
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                                  |
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+
| 95f05cd5-2c55-4957-bc9e-18f63074c0f7 | test1 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.105, 192.168.1.110 |
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+

However, the NIC still fails to come up.
Ping to the floating IP does not get through either. It does not work on any node, including cpu02.

oomichi commented Mar 7, 2019

The NIC problem needs to be investigated step by step.

$ nova list
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                                  |
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+
| 95f05cd5-2c55-4957-bc9e-18f63074c0f7 | test1 | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.105, 192.168.1.110 |
+--------------------------------------+-------+--------+------------+-------------+-------------------------------------------+
$ openstack port list | grep 192.168.10.105
| a8357da0-7f64-4c25-ae52-124a2dbbfb03 |      | fa:16:3e:f6:41:e2 | ip_address='192.168.10.105', subnet_id='8e6e3ead-b4b6-44d5-bd72-31662fb16183' | ACTIVE |
$ nova show test1
+--------------------------------------+------------------------------------------------------------+
| Property                             | Value                                                      |
+--------------------------------------+------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                     |
| OS-EXT-AZ:availability_zone          | nova                                                       |
| OS-EXT-SRV-ATTR:host                 | iaas-cpu02                                                 |
...

Confirmed that the port ID is a8357da0-7f64-4c25-ae52-124a2dbbfb03.
The VM in question is on iaas-cpu02, so log in to iaas-cpu02.
Check the bridge state.
-> tapa8357da0-7f is the VM's device ("tap<first part of the port ID>")

$ brctl show
bridge name     bridge id               STP enabled     interfaces
brqebe0b402-ae          8000.0601094dc904       no              tapa8357da0-7f
                                                        vxlan-13
virbr0          8000.52540006e678       yes             virbr0-nic
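
The device names in the `brctl show` output can be derived from Neutron IDs, which helps when matching a port to a host interface. A small sketch, assuming the 11-character truncation seen in this thread (tapa8357da0-7f for port a8357da0-7f64-..., brqebe0b402-ae for network ebe0b402-aeaa-..., vxlan-13 for segmentation ID 13). The helper names are hypothetical, and the truncation length is inferred from these examples rather than taken from Neutron's source:

```python
PREFIX_LEN = 11  # assumption: length of the ID prefix observed in this thread


def tap_name(port_id):
    """Host-side tap device for a Neutron port."""
    return "tap" + port_id[:PREFIX_LEN]


def bridge_name(network_id):
    """Linux bridge created by the linuxbridge agent for a network."""
    return "brq" + network_id[:PREFIX_LEN]


def vxlan_name(segmentation_id):
    """VXLAN interface for a tenant network's segmentation ID (VNI)."""
    return "vxlan-%d" % segmentation_id


print(tap_name("a8357da0-7f64-4c25-ae52-124a2dbbfb03"))
print(bridge_name("ebe0b402-aeaa-44dd-9eff-993a08b57bee"))
print(vxlan_name(13))
```

With these three names, one `brctl show` line is enough to confirm that a given VM port and its network's VXLAN interface sit on the same bridge.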

oomichi commented Mar 8, 2019

I then noticed that VMs created on the provider network can no longer reach the outside either.
For now, delete the self-service network in question.

$ openstack network delete lb-mgmt-net
Failed to delete network with name or ID 'lb-mgmt-net': Unable to delete Network for openstack.network.v2.network.Network(provider:physical_network=None, ipv6_address_scope=None, revision_number=3, port_security_enabled=True, provider:network_type=vxlan, id=ebe0b402-aeaa-44dd-9eff-993a08b57bee, router:external=False, availability_zone_hints=[], availability_zones=[u'nova'], ipv4_address_scope=None, shared=False, project_id=682e74f275fe427abd9eb6759f3b68c5, status=ACTIVE, subnets=[u'8e6e3ead-b4b6-44d5-bd72-31662fb16183'], description=, tags=[], updated_at=2019-03-07T01:21:34Z, provider:segmentation_id=13, name=lb-mgmt-net, admin_state_up=True, created_at=2019-03-07T01:21:25Z, mtu=1450)
1 of 1 networks failed to delete.

Failed.
Again, the error message above is not useful.
Neutron itself does return useful information, as shown below.

RESP BODY: {"NeutronError": {"message": "Unable to complete operation on network ebe0b402-aeaa-44dd-9eff-993a08b57bee. There are one or more ports still in use on the network.", "type": "NetworkInUse", "detail": ""}}

The neutron command does print that message properly.

$ neutron net-delete lb-mgmt-net
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
Unable to complete operation on network ebe0b402-aeaa-44dd-9eff-993a08b57bee. There are one or more ports still in use on the network.
Neutron server returns request_ids: ['req-12ca03a6-ba3b-400c-8bb4-ecfb14a570fe']

Resolved with the following.

$ openstack router remove port lb-mgmt-router ec10e55e-0229-43a0-8f14-c14f54dd5829
$ neutron router-delete lb-mgmt-router
$ openstack network delete lb-mgmt-net

However, the VM still cannot reach the outside.

$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 received, 100% packet loss, time 3023ms

oomichi commented Mar 8, 2019

Routing information
-> 192.168.1.1 is set as the default gateway
-> ping to it works

$ route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 ens3
169.254.169.254 192.168.1.100   255.255.255.255 UGH   0      0        0 ens3
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 ens3
$ ping 192.168.1.1
PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.505 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.962 ms

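IP masquerade? I think I fixed this once before.
Change the NIC specified in /etc/ufw/before.rules to enp0s31f6, which holds the external IP address.
→ That was it; fixed. brq5bff0834-bd no longer exists in the first place. Does it change as a result of Neutron operations?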

#-A POSTROUTING -s 192.168.1.0/24 -o brq5bff0834-bd -j MASQUERADE
-A POSTROUTING -s 192.168.1.0/24 -o enp0s31f6 -j MASQUERADE

oomichi commented Mar 8, 2019

Retry
→ Reproduced

$ openstack network create lb-mgmt-net
$ openstack subnet create --network lb-mgmt-net --allocation-pool start=192.168.10.100,end=192.168.10.200 --subnet-range 192.168.10.0/24 lb-mgmt-subnet
$ nova boot --key-name mykey --flavor m1.medium --image 73f70800-1d0c-4569-a3c5-29c70775c334 --nic net-name=lb-mgmt-net test
$ nova list
+--------------------------------------+------+--------+------------+-------------+----------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                   |
+--------------------------------------+------+--------+------------+-------------+----------------------------+
| 2e193bb9-2ca0-4ae7-9e05-1400fa007558 | test | ACTIVE | -          | Running     | lb-mgmt-net=192.168.10.109 |
+--------------------------------------+------+--------+------------+-------------+----------------------------+
$ openstack port list
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                            | Status |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| 1428aa20-fde9-4e31-9fa5-b16313c74e92 |      | fa:16:3e:b8:91:03 | ip_address='192.168.1.109', subnet_id='43ed897b-3c10-4d5c-8f6d-263edcd817c7'  | ACTIVE |
| 93fdfb70-63d2-4804-a28e-9f0f70890b8c |      | fa:16:3e:52:57:f1 | ip_address='192.168.10.109', subnet_id='dcb0ca7e-edea-4ba1-bb07-e50e51fde57e' | ACTIVE |
...

Log in to iaas-cpu03.

$ brctl show
bridge name     bridge id               STP enabled     interfaces
brq6a303139-3b          8000.e6f569ccdcb3       no              tap93fdfb70-63
                                                        vxlan-10
brqbfd9fd43-c9          8000.f44d306e9cc0       no              eno1
virbr0          8000.525400253304       yes             virbr0-nic

tap93fdfb70-63 is the VM's NIC device.
brq6a303139-3b bridges tap93fdfb70-63 and vxlan-10.
Checking vxlan-10 further shows that it is built on the physical NIC eno1.

$ ip -d link show vxlan-10
18: vxlan-10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master brq6a303139-3b state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether e6:f5:69:cc:dc:b3 brd ff:ff:ff:ff:ff:ff promiscuity 1
    vxlan id 10 group 224.0.0.1 dev eno1 srcport 0 0 dstport 8472 ageing 300
    bridge_slave state forwarding priority 32 cost 100 hairpin off guard off root_block off fastleave off learning on flood on addrgenmode eui64
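
The `vxlan id ... group ... dev ...` line carries the three facts that matter here: the VNI, the multicast group (if any), and the underlying device. A minimal sketch that extracts them with a regular expression; the function name `parse_vxlan` is hypothetical, and the pattern only covers the field order seen in this thread's output:

```python
import re


def parse_vxlan(details):
    """Extract VNI, multicast group (or None), and lower device
    from `ip -d link show <vxlan-if>` output."""
    m = re.search(r"vxlan id (\d+)(?: group (\S+))? dev (\S+)", details)
    if m is None:
        return None
    return {"vni": int(m.group(1)), "group": m.group(2), "dev": m.group(3)}


# Line copied from the output above.
MCAST = "vxlan id 10 group 224.0.0.1 dev eno1 srcport 0 0 dstport 8472 ageing 300"
print(parse_vxlan(MCAST))
```

A non-None `group` means the interface floods broadcast traffic such as DHCP requests via that multicast address, so the physical network between the nodes has to forward it.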

oomichi commented Mar 8, 2019

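Is the controller side the reason the DHCP information was not delivered?
Check the network state on the controller.
Confirmed that the same VXLAN interface exists and sits on the physical NIC enp2s0.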

$ ip -d link show vxlan-10
8: vxlan-10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master brq6a303139-3b state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 26:4c:a1:c1:a9:75 brd ff:ff:ff:ff:ff:ff promiscuity 1
    vxlan id 10 group 224.0.0.1 dev enp2s0 srcport 0 0 dstport 8472 ageing 300
    bridge_slave state forwarding priority 32 cost 100 hairpin off guard off root_block off fastleave off learning on flood on addrgenmode eui64
$ brctl show
bridge name     bridge id               STP enabled     interfaces
brq6a303139-3b          8000.264ca1c1a975       no              tapd7459afe-90
                                                        vxlan-10
brqbfd9fd43-c9          8000.001b2139e5fa       no              enp2s0
                                                        tapf233ccef-a5
docker0         8000.024210971ae8       no

This shows that tapd7459afe-90 and vxlan-10 are connected by the bridge brq6a303139-3b.
tapd7459afe-90 turns out to be the port for 192.168.10.100.
192.168.10.100 is the first address of the allocation pool specified when the subnet was created.

$ openstack port list
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                            | Status |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------+--------+
| d7459afe-9004-478f-884b-8e0a8e9991bc |      | fa:16:3e:4b:f2:e5 | ip_address='192.168.10.100', subnet_id='dcb0ca7e-edea-4ba1-bb07-e50e51fde57e' | ACTIVE |

Beyond this point it appears to connect to the dnsmasq network namespace.
Check the dnsmasq process that serves 192.168.10.0/24.

$ sudo ps -ef | grep dnsmasq
 nobody    4304     1  0 17:38 ?        00:00:00 dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file=/var/lib/neutron/dhcp/6a303139-3bc7-4621-a27b-415f409cb743/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/6a303139-3bc7-4621-a27b-415f409cb743/host --addn-hosts=/var/lib/neutron/dhcp/6a303139-3bc7-4621-a27b-415f409cb743/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/6a303139-3bc7-4621-a27b-415f409cb743/opts --dhcp-leasefile=/var/lib/neutron/dhcp/6a303139-3bc7-4621-a27b-415f409cb743/leases --dhcp-match=set:ipxe,175 --bind-interfaces
 --interface=ns-d7459afe-90
 --dhcp-range=set:tag0,192.168.10.0,static,255.255.255.0,86400s
 --dhcp-option-force=option:mtu,1450 --dhcp-lease-max=256 --conf-file= --domain=openstacklocal

This shows that it is listening on ns-d7459afe-90.

oomichi commented Mar 8, 2019

Look for the ns-d7459afe-90 interface.
tapd7459afe-90 should be connected to ns-d7459afe-90, which lives in a netns.
Check the network namespaces.

$ ip netns
qdhcp-6a303139-3bc7-4621-a27b-415f409cb743 (id: 1)
qdhcp-bfd9fd43-c9b4-43ad-bb67-930c674f2605 (id: 0)

List the interfaces in id: 1 (the newer one).

$ sudo ip netns exec qdhcp-6a303139-3bc7-4621-a27b-415f409cb743 ifconfig
...
ns-d7459afe-90 Link encap:Ethernet  HWaddr fa:16:3e:4b:f2:e5
          inet addr:169.254.169.254  Bcast:169.254.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fe4b:f2e5/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:468 (468.0 B)  TX bytes:438 (438.0 B)
$
$ sudo ip netns exec qdhcp-6a303139-3bc7-4621-a27b-415f409cb743 ip link show
...
2: ns-d7459afe-90@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether fa:16:3e:4b:f2:e5 brd ff:ff:ff:ff:ff:ff link-netnsid 0
$
$ sudo ip netns exec qdhcp-6a303139-3bc7-4621-a27b-415f409cb743 ethtool -S ns-d7459afe-90
NIC statistics:
     peer_ifindex: 7

This shows that ns-d7459afe-90 has index 2 and its peer has index 7.
Check index 7 outside the netns.
-> It is paired with tapd7459afe-90.

$ ip link show
...
7: tapd7459afe-90@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master brq6a303139-3b state UP mode DEFAULT group default qlen 1000
    link/ether 4a:90:fb:ec:74:08 brd ff:ff:ff:ff:ff:ff link-netnsid 1
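
The `name@ifN` notation in `ip link show` already encodes the veth peer index, so the pairing can be checked mechanically. A rough sketch, assuming the two outputs (inside and outside the namespace) have been captured as strings; `veth_links` and `is_pair` are hypothetical helper names:

```python
import re

# Matches lines such as "7: tapd7459afe-90@if2: <BROADCAST,...>".
LINK_RE = re.compile(r"^(\d+): ([^@:\s]+)@if(\d+):", re.MULTILINE)


def veth_links(ip_link_output):
    """Map interface name -> (own ifindex, peer ifindex)."""
    return {name: (int(idx), int(peer))
            for idx, name, peer in LINK_RE.findall(ip_link_output)}


def is_pair(inside_output, inside_name, outside_output, outside_name):
    """True when each interface names the other's ifindex as its peer."""
    own_in, peer_in = veth_links(inside_output)[inside_name]
    own_out, peer_out = veth_links(outside_output)[outside_name]
    return peer_in == own_out and peer_out == own_in


# First lines of the two outputs shown above, shortened.
INSIDE = "2: ns-d7459afe-90@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450\n"
OUTSIDE = "7: tapd7459afe-90@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450\n"
print(is_pair(INSIDE, "ns-d7459afe-90", OUTSIDE, "tapd7459afe-90"))
```

Interface indexes are only unique within one namespace, which is why the check compares both directions instead of the index alone.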

oomichi commented Mar 8, 2019

Now that the overall network layout is clear, check each NIC in turn with tcpdump.

dnsmasq process
-> ns-d7459afe-90 in netns
-> tapd7459afe-90 (confirmed that DHCP packets do not arrive)
-> brq6a303139-3b
-> vxlan-10 (confirmed that DHCP packets do not arrive)
-> enp2s0
--- Above this line is the controller node; below is the CPU node ---
-> eno1
-> vxlan-10 (confirmed that DHCP packets arrive)
-> brq6a303139-3b
-> tap93fdfb70-63
-> VM

In other words, DHCP packets are not reaching the vxlan-10 interface on the controller node.
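
The same narrowing-down can be written as a small routine: walk the path in the direction the DHCP request travels and report the last hop where packets were seen and the first hop where they were not. This is only a sketch; the `cpu:` and `ctrl:` prefixes are labels invented here to distinguish the two nodes, and hops without a tcpdump result are skipped:

```python
def locate_loss(path, seen):
    """Return (last hop where packets were seen,
               first later hop where they were not)."""
    last_seen = None
    for hop in path:
        if hop not in seen:
            continue  # no capture was taken on this hop
        if seen[hop]:
            last_seen = hop
        else:
            return last_seen, hop
    return last_seen, None


# Path from the VM towards dnsmasq, taken from the list above.
PATH = ["cpu:tap93fdfb70-63", "cpu:brq6a303139-3b", "cpu:vxlan-10", "cpu:eno1",
        "ctrl:enp2s0", "ctrl:vxlan-10", "ctrl:brq6a303139-3b",
        "ctrl:tapd7459afe-90"]
# tcpdump results recorded in this thread.
SEEN = {"cpu:vxlan-10": True,
        "ctrl:vxlan-10": False,
        "ctrl:tapd7459afe-90": False}
print(locate_loss(PATH, SEEN))
```

The result brackets the loss between the two vxlan-10 interfaces, which points at the VXLAN transport between the nodes rather than at the bridges or dnsmasq.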

oomichi commented Mar 8, 2019

Review the VXLAN settings on the controller node.
Maybe the switch does not pass multicast?
The linuxbridge default appears to be 224.0.0.1.
http://git.openstack.org/cgit/openstack/neutron/tree/neutron/conf/plugins/ml2/drivers/linuxbridge.py#n38
Try disabling it in the linuxbridge configuration as follows.

# diff -u linuxbridge_agent.ini.orig  linuxbridge_agent.ini
--- linuxbridge_agent.ini.orig  2019-03-06 16:54:07.934162103 -0800
+++ linuxbridge_agent.ini       2019-03-07 19:47:47.658474992 -0800
@@ -2,7 +2,9 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1
+vxlan_group = none

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

The value none above does not seem to work.

2019-03-07 19:52:06.879 893 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [-] Invalid VXLAN Group: none, must be an address or network (in CIDR notation) in a multicast range of the same address family as local_ip: 192.168.1.1: AddrFormatError: invalid IPNetwork none
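
The error text spells out the rule: vxlan_group must be a multicast address or CIDR network of the same address family as local_ip. A minimal sketch of that rule using Python's standard `ipaddress` module. The agent itself reports a netaddr-style `AddrFormatError`, so this is a stand-in for illustration, not Neutron's implementation, and `check_vxlan_group` is a hypothetical name:

```python
import ipaddress


def check_vxlan_group(group, local_ip):
    """Classify a vxlan_group value as 'unset', 'ok', or 'invalid'."""
    if not group:
        return "unset"  # empty value: no multicast group configured
    try:
        net = ipaddress.ip_network(group, strict=False)
    except ValueError:
        return "invalid"  # e.g. the literal string "none"
    if net.version != ipaddress.ip_address(local_ip).version:
        return "invalid"  # address family differs from local_ip
    return "ok" if net.is_multicast else "invalid"


for value in ("224.0.0.1", "none", ""):
    print(repr(value), check_vxlan_group(value, "192.168.1.1"))
```

So turning multicast off means leaving the option empty, not writing a keyword such as none.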

oomichi commented Mar 8, 2019

# diff -u linuxbridge_agent.ini.orig  linuxbridge_agent.ini
--- linuxbridge_agent.ini.orig  2019-03-06 16:54:07.934162103 -0800
+++ linuxbridge_agent.ini       2019-03-07 19:47:47.658474992 -0800
@@ -2,7 +2,9 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1
+vxlan_group =

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

With the setting changed as above, an error is raised at line 735 below.
It seems vxlan_ucast_supported needs to return True.

 727     def check_vxlan_support(self):
 728         self.vxlan_mode = lconst.VXLAN_NONE
 729
 730         if self.vxlan_ucast_supported():
 731             self.vxlan_mode = lconst.VXLAN_UCAST
 732         elif self.vxlan_mcast_supported():
 733             self.vxlan_mode = lconst.VXLAN_MCAST
 734         else:
 735             raise exceptions.VxlanNetworkUnsupported()
 736         LOG.debug('Using %s VXLAN mode', self.vxlan_mode)

Check logic

 677     def vxlan_ucast_supported(self):
 678         if not cfg.CONF.VXLAN.l2_population:
 679             return False
 680         if not ip_lib.iproute_arg_supported(
 681                 ['bridge', 'fdb'], 'append'):
 682             LOG.warning('Option "%(option)s" must be supported by command '
 683                         '"%(command)s" to enable %(mode)s mode',
 684                         {'option': 'append',
 685                          'command': 'bridge fdb',
 686                          'mode': 'VXLAN UCAST'})
 687             return False
 688
 689         test_iface = None
 690         for seg_id in moves.range(1, constants.MAX_VXLAN_VNI + 1):
 691             if (ip_lib.device_exists(self.get_vxlan_device_name(seg_id))
 692                     or ip_lib.vxlan_in_use(seg_id)):
 693                 continue
 694             test_iface = self.ensure_vxlan(seg_id)
 695             break
 696         else:
 697             LOG.error('No valid Segmentation ID to perform UCAST test.')
 698             return False
 699
 700         try:
 701             bridge_lib.FdbInterface.append(constants.FLOODING_ENTRY[0],
 702                                            test_iface, '1.1.1.1',
 703                                            log_fail_as_error=False)
 704             return True
 705         except RuntimeError:
 706             return False
 707         finally:
 708             self.delete_interface(test_iface)
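
Put together, the two excerpts amount to a small decision table. The sketch below condenses it into one function so the failing combination is easy to see. Two things are assumptions: the body of `vxlan_mcast_supported` is not shown in this thread and is assumed here to require a non-empty vxlan_group, and the `bridge fdb append` probe is collapsed into a single boolean:

```python
def select_vxlan_mode(l2_population, fdb_append_ok, vxlan_group):
    """Simplified view of check_vxlan_support().

    l2_population  -- value of [vxlan] l2_population
    fdb_append_ok  -- whether the `bridge fdb append` probe would succeed
    vxlan_group    -- value of [vxlan] vxlan_group ('' when empty)
    """
    if l2_population and fdb_append_ok:
        return "unicast"
    if vxlan_group:  # assumption: multicast mode needs a group
        return "multicast"
    raise RuntimeError("VXLAN network unsupported")


# Default settings: multicast via 224.0.0.1.
print(select_vxlan_mode(False, True, "224.0.0.1"))
# Empty group plus l2_population: unicast.
print(select_vxlan_mode(True, True, ""))
```

An empty vxlan_group with l2_population left off matches neither branch, which corresponds to the exception raised at line 735.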

Set cfg.CONF.VXLAN.l2_population to True.

# diff -u /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig /etc/neutron/plugins/ml2/linuxbridge_agent.ini
--- /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig 2019-03-06 16:54:07.934162103 -0800
+++ /etc/neutron/plugins/ml2/linuxbridge_agent.ini      2019-03-07 20:12:24.786563612 -0800
@@ -2,7 +2,10 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1
+l2_population = true
+vxlan_group =

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver

Confirmed that multicast is no longer used.

$ ip -d link show vxlan-36
9: vxlan-36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue master brq337443f2-b7 state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether ba:bc:6e:0a:5b:09 brd ff:ff:ff:ff:ff:ff promiscuity 1
    vxlan id 36 dev enp2s0 srcport 0 0 dstport 8472 ageing 300
    bridge_slave state forwarding priority 32 cost 100 hairpin off guard off root_block off fastleave off learning on flood on addrgenmode eui64

oomichi commented Mar 8, 2019

TODO: roll out the settings above to all nodes.

oomichi commented Mar 8, 2019

neutron-vxlan

oomichi commented Mar 8, 2019

Neutron-vxlan.pptx

oomichi commented Mar 8, 2019

Still not working.
DHCP packets are not reaching the VXLAN interface on the controller side.

oomichi commented Mar 8, 2019

According to https://docs.openstack.org/liberty/ja/install-guide-ubuntu/neutron-controller-install-option2.html:

mechanism_drivers = linuxbridge,l2population

So l2population apparently has to be listed in mechanism_drivers as well.
I also redid the other related settings.

--- /etc/neutron/plugins/ml2/ml2_conf.ini.orig  2019-03-06 10:54:51.062392771 -0800
+++ /etc/neutron/plugins/ml2/ml2_conf.ini       2019-03-08 10:31:28.388334795 -0800
@@ -1,9 +1,11 @@
 [ml2]
-type_drivers = flat,vlan
-tenant_network_types =
-mechanism_drivers = linuxbridge
+type_drivers = flat,vxlan
+tenant_network_types = vxlan
+mechanism_drivers = linuxbridge,l2population
 extension_drivers = port_security

 [ml2_type_flat]
-flat_networks = provider,company
+flat_networks = provider

+[ml2_type_vxlan]
+vni_ranges = 1:1000
--- /etc/neutron/plugins/ml2/linuxbridge_agent.ini.orig 2019-03-06 16:54:07.934162103 -0800
+++ /etc/neutron/plugins/ml2/linuxbridge_agent.ini      2019-03-08 10:35:50.800145841 -0800
@@ -2,7 +2,13 @@
 physical_interface_mappings = provider:enp2s0,company:enp0s31f6

 [vxlan]
-enable_vxlan = false
+enable_vxlan = true
+local_ip = 192.168.1.1
+l2_population = true
+vxlan_group =
+
+[agent]
+prevent_arp_spoofing = true

 [securitygroup]
 firewall_driver = neutron.agent.linux.iptables_firewall.IptablesFirewallDriver
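
The fix needed l2population in two places: as a mechanism driver in ml2_conf.ini and as an agent option in linuxbridge_agent.ini. A minimal sketch that checks both files agree, using the standard `configparser` module as a stand-in (Neutron itself reads these files through oslo.config, and `l2pop_consistent` is a hypothetical helper):

```python
import configparser


def l2pop_consistent(ml2_text, agent_text):
    """True when l2population is enabled on both the server and agent side."""
    ml2 = configparser.ConfigParser()
    ml2.read_string(ml2_text)
    agent = configparser.ConfigParser()
    agent.read_string(agent_text)

    drivers = ml2.get("ml2", "mechanism_drivers", fallback="")
    server_side = "l2population" in [d.strip() for d in drivers.split(",")]
    agent_side = agent.getboolean("vxlan", "l2_population", fallback=False)
    return server_side and agent_side


# Relevant lines from the two diffs above.
ML2 = "[ml2]\nmechanism_drivers = linuxbridge,l2population\n"
AGENT = "[vxlan]\nenable_vxlan = true\nl2_population = true\nvxlan_group =\n"
print(l2pop_consistent(ML2, AGENT))
```

Running such a check against every node's files would catch the earlier state in this thread, where the agent had l2_population = true but the ML2 plugin still listed only linuxbridge.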

oomichi commented Mar 8, 2019

It works now.
