
Question : mesos networking #126

Closed
angry-tony wants to merge 62 commits into master from 0.28.x

Conversation

@angry-tony

I am wondering whether Mesos can offer resources from multiple network interfaces.
I would like to attach multiple network interfaces (public eth0, private eth1) to slave nodes and bind specific applications running on Mesos slave nodes to specific interfaces. Does Mesos need distinct physical networks, the way OpenStack uses four distinct physical networks?

Is there any reference guide or doc?

jieyu and others added 30 commits March 11, 2016 13:38
As long as the module manifest(s) being loaded don't conflict with the
already loaded manifests, the module manager will allow multiple calls
to `load()`.

Review: https://reviews.apache.org/r/44694
Based on the modified test Anand posted in the description of MESOS-4831.

Review: https://reviews.apache.org/r/44537/
This is because if the response is chunked, curl outputs the
dechunked version. That confuses the HTTP response decoder, because
the header still says the body is chunked.

Review: https://reviews.apache.org/r/44944/
The "overlayfs" filesystem was renamed to "overlay" in kernel 4.2, so
the overlay support check should look for both "overlay" and
"overlayfs" in "/proc/filesystems".

Review: https://reviews.apache.org/r/44421/
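The check described above can be sketched as follows. This is a minimal illustration, not the actual Mesos function; the helper name `containsOverlay` and the stream-based signature are assumptions made so the logic is testable without reading `/proc/filesystems` directly.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans the contents of /proc/filesystems and reports
// whether the kernel supports overlay, accepting both the post-4.2 name
// "overlay" and the older "overlayfs".
bool containsOverlay(std::istream& filesystems)
{
  std::string token;
  // /proc/filesystems lines look like "nodev\toverlay" or "\text4";
  // token-wise scanning handles both shapes.
  while (filesystems >> token) {
    if (token == "overlay" || token == "overlayfs") {
      return true;
    }
  }
  return false;
}
```

In the real fix the caller would open `/proc/filesystems` and pass the stream in, so the same code covers kernels before and after the rename.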
JSON::Object::find() interprets the '.' as a path into nested JSON
objects. Add the at() helper that just indexes the top level keys of a
JSON object with an uninterpreted key. This is helpful for indexing keys
that contain periods.

Review: https://reviews.apache.org/r/45450/
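The difference between the path-interpreting `find()` and the literal `at()` can be sketched with a toy object. This is not the actual stout JSON API; the `Object` struct, its members, and one-level nesting are illustrative assumptions.

```cpp
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a JSON object: 'find' interprets '.' as a path into a
// nested object, while 'at' treats the key as an uninterpreted top-level key.
struct Object {
  std::map<std::string, std::string> values;                       // top-level keys
  std::map<std::string, std::map<std::string, std::string>> nested;  // one nesting level

  // find("a.b") descends into nested object "a" and looks up "b".
  std::optional<std::string> find(const std::string& path) const {
    auto dot = path.find('.');
    if (dot == std::string::npos) {
      auto it = values.find(path);
      if (it == values.end()) {
        return std::nullopt;
      }
      return it->second;
    }
    auto outer = nested.find(path.substr(0, dot));
    if (outer == nested.end()) {
      return std::nullopt;
    }
    auto inner = outer->second.find(path.substr(dot + 1));
    if (inner == outer->second.end()) {
      return std::nullopt;
    }
    return inner->second;
  }

  // at("a.b") looks up the literal key "a.b", periods and all.
  std::optional<std::string> at(const std::string& key) const {
    auto it = values.find(key);
    if (it == values.end()) {
      return std::nullopt;
    }
    return it->second;
  }
};
```

With a top-level key `"a.b"` and a nested `"a" -> "b"`, `find("a.b")` returns the nested value while `at("a.b")` returns the literal one, which is exactly why `at()` is needed for keys that contain periods.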
If you use the unified containerizer and the local docker puller with a
Docker image from a private registry, the local puller fails to find any
layers in the image's repository manifest. This happens because the top
layer repository is qualified by the private registry name.

The fix is to also try the fully-qualified repository name if we have a
registry name.

Review: https://reviews.apache.org/r/45451/
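The fix amounts to trying two repository names when a registry is configured. A minimal sketch, with an assumed helper name `candidateRepositories` (the real puller's structure is not shown here):

```cpp
#include <string>
#include <vector>

// When pulling from a private registry, look the layer up under both the
// short repository name and the registry-qualified name, since the
// manifest's top layer may carry the registry prefix.
std::vector<std::string> candidateRepositories(
    const std::string& registry,
    const std::string& repository)
{
  std::vector<std::string> candidates = {repository};
  if (!registry.empty()) {
    candidates.push_back(registry + "/" + repository);
  }
  return candidates;
}
```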
*** Modified for 0.28.1 release ***

Review: https://reviews.apache.org/r/45773
Previously the 'perf' parsing logic used the kernel version to
determine the token ordering. However, this approach breaks
when distributions backport perf parsing changes onto older
kernel versions. This updates the parsing logic to understand
all existing formats.

Co-authored with haosdent.

Review: https://reviews.apache.org/r/44379/
This reflects the backport of MESOS-4705 to the 0.28.x branch.

Review: https://reviews.apache.org/r/46663
This patch simplifies the way we deal with rootfs for command tasks.
Here are the major simplifications:

1) We no longer do a bind mount (<rootfs> -> <sandbox>/.rootfs) for
command tasks. All isolators can now assume that for both custom
executors and command tasks, if they change root filesystems, the root
filesystem will be specified in 'containerConfig.rootfs'. This
simplifies the isolators' logic.

2) The sandbox bind mount will be done in the container's mount
namespace. Currently, it's done in the host mount namespace. This
creates a lot of complications, especially during the cleanup and
recovery path.

3) All persistent volumes will be consistently mounted under the
container's sandbox directory. Currently, depending on whether the
container changes the root filesystem or not, the persistent volumes
will be mounted at different locations. This simplifies the cleanup and
recovery path as well.

*** Modified for 0.28.2 ***

Review: https://reviews.apache.org/r/46807
janisz and others added 24 commits May 23, 2016 09:59
MesosScheduler.declineOffers was changed ~6 months ago to send a
Decline message instead of calling acceptOffers with an empty list of
task infos. The changed version of declineOffer, however, did not remove
the offerId from the savedOffers map, causing a memory leak.

Review: https://reviews.apache.org/r/47804/
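The shape of the leak and its fix can be sketched in a few lines. The member names here are stand-ins, not the actual MesosScheduler fields:

```cpp
#include <map>
#include <string>

// Illustrative scheduler: declining an offer must also drop it from the
// saved-offers map, otherwise each declined offer leaves an entry behind
// forever (the leak described above).
struct Scheduler {
  std::map<std::string, std::string> savedOffers;  // offerId -> agentId

  void declineOffer(const std::string& offerId) {
    // ... send the Decline message to the master ...
    savedOffers.erase(offerId);  // the step the changed declineOffer omitted
  }
};
```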
***Modified for 0.28.2***

The agent now shuts down the executor during registration if it does not
have any queued tasks (e.g., framework sent a killTask before
registration).

Note that if the executor doesn't register at all, it will be cleaned up
anyway after the registration timeout.

Also, note that this doesn't handle the case where the agent restarts
after processing the killTask() but before cleaning up the executor.

Review: https://reviews.apache.org/r/47381
***Modified for 0.28.2***

If the agent restarts after handling killTask but before sending
shutdown message to the executor, we ensure the executor terminates.

Review: https://reviews.apache.org/r/47402
This is necessary to enable bash subshell redirection within the
container.

Review: https://reviews.apache.org/r/48240/
Currently the agent will immediately link with the master once
the master is discovered. The agent currently uses a registration
backoff in order to alleviate the failover load of the master for
large clusters. However, since the link occurs immediately, the
master still has to process a large influx of connections. This
can lead to connections closing and the agents not re-registering
due to MESOS-1963.

Performing the link after the initial backoff ensures that
the --registration_backoff flag alleviates a thundering herd
of TCP connections against the master.

Review: https://reviews.apache.org/r/47209/
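The key point is the ordering: compute the backoff first, and only link (open the TCP connection) after the delay elapses, so a master failover does not see all agents connect at once. A sketch of a capped exponential backoff of the kind `--registration_backoff` controls; the function name and cap are illustrative, and the real Mesos backoff also adds randomness:

```cpp
#include <algorithm>
#include <chrono>

// Illustrative capped exponential backoff: doubles with each failed
// attempt, never exceeding 'max'. The agent would wait this long before
// linking and (re-)registering with the master.
std::chrono::seconds registrationBackoff(int failures, std::chrono::seconds max)
{
  // Cap the shift so the left shift cannot overflow.
  std::chrono::seconds delay(1LL << std::min(failures, 10));
  return std::min(delay, max);
}
```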
This was a scalability improvement to make agents backoff before
establishing connections to the master.

Review: https://reviews.apache.org/r/49109
When an SSL-enabled Mesos actor attempts to link, it will open an SSL
socket first.  If this fails, and downgrading to non-SSL is enabled,
the Mesos actor will create a "Poll" socket instead.

In this code path, we explicitly avoid calling `SocketManager::close`
as "closing" a link will also trigger `ExitedEvents`.  Instead, the
downgrade codepath tries to "swap" the `SocketManager`'s state from the
old SSL socket to the new Poll socket via `swap_implementing_socket`.

`swap_implementing_socket` leaks a `Socket` object, which is a
reference-counted wrapper for an FD.  Besides the memory leak, leaking
the `Socket` means that the reference count for the old SSL socket
will always be above zero.  And we only `close` the socket when the
reference count reaches zero.

Review: https://reviews.apache.org/r/49127/
Incoming sockets are leaked when the agent forks because incoming
sockets are not modified with the CLOEXEC option.

Review: https://reviews.apache.org/r/49280/
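Marking a descriptor close-on-exec is a small `fcntl` dance; a sketch of the kind of fix involved (the helper name is illustrative, not the Mesos/stout function):

```cpp
#include <fcntl.h>
#include <unistd.h>

// Set FD_CLOEXEC on 'fd' so it is closed automatically in any child the
// process forks and execs, instead of being inherited (and leaked).
// Returns true on success.
bool setCloexec(int fd)
{
  int flags = fcntl(fd, F_GETFD);
  if (flags == -1) {
    return false;
  }
  return fcntl(fd, F_SETFD, flags | FD_CLOEXEC) != -1;
}
```

Accepted (incoming) sockets do not inherit CLOEXEC from the listening socket on all platforms, which is why each accepted fd needs this treatment explicitly.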
This fixes a rare race (segfault) between `link` and
`ignore_recv_data`.  If the peer of the socket exits between
establishing a connection and libprocess queuing a `MessageEncoder`,
`ignore_recv_data` may delete the `Socket` underneath the `link`.

This patch is meant to be easily backported.

Review: https://reviews.apache.org/r/49416/
The DRFSorter previously kept the total resources at each agent, along
with the total quantity of resources in the cluster. The latter figure
is what most of the clients of the sorter care about. In the few places
where the previous code used the per-agent total resources, it is
relatively easy to adjust the code to remove this dependency.

As part of this change, remove `total()` and
`update(const SlaveID& slaveId, const Resources& resources)` from the
Sorter interface. The former was unused; for the latter, code that used
it can instead be rewritten to adjust the total resources in the cluster
by first removing the previous resources at an agent and then adding
the new resources.

Review: https://reviews.apache.org/r/49375/
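The remove-then-add rewrite described above can be sketched with a scalar stand-in for `Resources` (the struct and method names here are illustrative, not the DRFSorter API):

```cpp
// Stand-in sorter that keeps only the cluster-wide total. Instead of the
// removed update(slaveId, resources), callers adjust the total by
// subtracting the agent's previous resources and adding the new ones.
struct Sorter {
  double total = 0;  // total resource quantity in the cluster

  void updateAgent(double oldAgentResources, double newAgentResources) {
    total -= oldAgentResources;
    total += newAgentResources;
  }
};
```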
These tests check that dynamic reservations and persistent volumes can
be created via the offer API and then removed via the HTTP endpoints,
and vice versa.

Review: https://reviews.apache.org/r/49323/
Each `DRFSorter` tracks the total resources in the cluster. This means
that each sorter must be updated when the resources in the cluster have
changed, e.g., due to the creation of a dynamic reservation or a
persistent volume. In the previous implementation, the quota role sorter
was not updated for non-quota roles when a reservation or persistent
volume was created by a framework. This resulted in inconsistency
between the total resources in the allocator and the quota role sorter.

This could cause several problems. First, removing a slave from the
cluster would leak resources in the quota role sorter. Second, certain
interleavings of slave removals and `reserve`/`unreserve` operations by
frameworks and via HTTP endpoints could lead to `CHECK` failures.

Review: https://reviews.apache.org/r/49377/
@jieyu
Member

jieyu commented Jul 7, 2016

Can you ask the question on dev/user mailing list, or slack channel? Thanks!

Joseph Wu and others added 3 commits July 11, 2016 14:02
The `RemoteConnection::RECONNECT` option for `ProcessBase::link` will
force the `SocketManager` to create a new socket if a persistent link
already exists.

Review: https://reviews.apache.org/r/49177/
@jfarrell
Contributor

Closing per request at https://s.apache.org/V8r3

@jfarrell jfarrell closed this Jul 13, 2016