As long as the module manifest(s) being loaded don't conflict with the already loaded manifests, the module manager will allow multiple calls to `load()`. Review: https://reviews.apache.org/r/44694
Based on the modified test Anand posted in the description of MESOS-4831. Review: https://reviews.apache.org/r/44537/
This is because, if the response is chunked, curl will output the dechunked version, which confuses the HTTP response decoder since the header still indicates chunked encoding. Review: https://reviews.apache.org/r/44944/
The "overlayfs" filesystem was renamed to "overlay" in kernel 4.2, so the overlay support check should look for both "overlay" and "overlayfs" in "/proc/filesystems". Review: https://reviews.apache.org/r/44421/
JSON::Object::find() interprets the '.' as a path into nested JSON objects. Add the at() helper that just indexes the top level keys of a JSON object with an uninterpreted key. This is helpful for indexing keys that contain periods. Review: https://reviews.apache.org/r/45450/
If you use the unified containerizer and the local docker puller with a Docker image from a private registry, the local puller fails to find any layers in the image's repository manifest. This happens because the top layer repository is qualified by the private registry name. The fix is to also try the fully-qualified repository name if we have a registry name. Review: https://reviews.apache.org/r/45451/
This is for the port mapping isolator. Review: https://reviews.apache.org/r/45690
*** Modified for 0.28.1 release *** Review: https://reviews.apache.org/r/44985
*** Modified for 0.28.1 release *** Review: https://reviews.apache.org/r/45773
Previously the 'perf' parsing logic used the kernel version to determine the token ordering. However, this approach breaks when distributions backport perf parsing changes onto older kernel versions. This updates the parsing logic to understand all existing formats. Co-authored with haosdent. Review: https://reviews.apache.org/r/44379/
This reflects the backport of MESOS-4705 to the 0.28.x branch. Review: https://reviews.apache.org/r/46663
This patch simplifies the way we deal with rootfs for command tasks. Here are the major simplifications: 1) We no longer do a bind mount (<rootfs> -> <sandbox>/.rootfs) for command tasks. All isolators can now assume that for both custom executors or command tasks, if they change root filesystems, the root filesystems will be specified in 'containerConfig.rootfs'. This will simplify isolators' logic. 2) The sandbox bind mount will be done in the container's mount namespace. Currently, it's done in the host mount namespace. This creates a lot of complications, especially during the cleanup and recovery path. 3) All persistent volumes will be consistently mounted under the container's sandbox directory. Currently, depending on whether the container changes the root filesystem or not, the persistent volumes will be mounted at different locations. This simplifies the cleanup and recovery path as well. *** Modified for 0.28.2 *** Review: https://reviews.apache.org/r/46807
MesosScheduler.declineOffer was changed about six months ago to send a Decline message instead of calling acceptOffers with an empty list of task infos. However, the changed version of declineOffer did not remove the offerId from the savedOffers map, causing a memory leak. Review: https://reviews.apache.org/r/47804/
***Modified for 0.28.2*** The agent now shuts down the executor during registration if it does not have any queued tasks (e.g., framework sent a killTask before registration). Note that if the executor doesn't register at all, it will be cleaned up anyway after the registration timeout value. Also, note that this doesn't handle the case where the agent restarts after processing the killTask() but before cleaning up the executor. Review: https://reviews.apache.org/r/47381
***Modified for 0.28.2*** If the agent restarts after handling killTask but before sending shutdown message to the executor, we ensure the executor terminates. Review: https://reviews.apache.org/r/47402
This is necessary to enable bash subshell redirection within the container. Review: https://reviews.apache.org/r/48240/
Currently the agent will immediately link with the master once the master is discovered. The agent currently uses a registration backoff in order to alleviate the failover load of the master for large clusters. However, since the link occurs immediately, the master still has to process a large influx of connections. This can lead to connections closing and the agents not re-registering due to MESOS-1963. Performing the link after the initial backoff ensures that the --registration_backoff flag alleviates a thundering herd of TCP connections against the master. Review: https://reviews.apache.org/r/47209/
This was a scalability improvement to make agents backoff before establishing connections to the master. Review: https://reviews.apache.org/r/49109
When an SSL-enabled Mesos actor attempts to link, it will open an SSL socket first. If this fails, and downgrading to non-SSL is enabled, the Mesos actor will create a "Poll" socket instead. In this code path, we explicitly avoid calling `SocketManager::close`, as "closing" a link would also trigger `ExitedEvents`. Instead, the downgrade code path tries to "swap" the `SocketManager`'s state from the old SSL socket to the new Poll socket via `swap_implementing_socket`. `swap_implementing_socket` leaks a `Socket` object, which is a reference-counted wrapper for an FD. Besides the memory leak, leaking the `Socket` means that the reference count for the old SSL socket will always stay above zero, and we only `close` the socket when the reference count reaches zero. Review: https://reviews.apache.org/r/49127/
Incoming sockets are leaked when the agent forks because incoming sockets are not modified with the CLOEXEC option. Review: https://reviews.apache.org/r/49280/
This fixes a rare race (segfault) between `link` and `ignore_recv_data`. If the peer of the socket exits between establishing a connection and libprocess queuing a `MessageEncoder`, `ignore_recv_data` may delete the `Socket` underneath the `link`. This patch is meant to be easily backported. Review: https://reviews.apache.org/r/49416/
The DRFSorter previously kept the total resources at each agent, along with the total quantity of resources in the cluster. The latter figure is what most of the clients of the sorter care about. In the few places where the previous code used the per-agent total resources, it is relatively easy to adjust the code to remove this dependency. As part of this change, remove `total()` and `update(const SlaveID& slaveId, const Resources& resources)` from the Sorter interface. The former was unused; for the latter, code that used it can instead be rewritten to adjust the total resources in the cluster by first removing the previous resources at an agent and then adding the new resources. Review: https://reviews.apache.org/r/49375/
These tests check that dynamic reservations and persistent volumes can be created via the offer API and then removed via the HTTP endpoints, and vice versa. Review: https://reviews.apache.org/r/49323/
Each `DRFSorter` tracks the total resources in the cluster. This means that each sorter must be updated when the resources in the cluster have changed, e.g., due to the creation of a dynamic reservation or a persistent volume. In the previous implementation, the quota role sorter was not updated for non-quota roles when a reservation or persistent volume was created by a framework. This resulted in inconsistency between the total resources in the allocator and the quota role sorter. This could cause several problems. First, removing a slave from the cluster would leak resources in the quota role sorter. Second, certain interleavings of slave removals and `reserve`/`unreserve` operations by frameworks and via HTTP endpoints could lead to `CHECK` failures. Review: https://reviews.apache.org/r/49377/
Member: Can you ask the question on the dev/user mailing list, or the Slack channel? Thanks!
The `RemoteConnection::RECONNECT` option for `ProcessBase::link` will force the `SocketManager` to create a new socket if a persistent link already exists. Review: https://reviews.apache.org/r/49177/
Contributor: Closing per request at https://s.apache.org/V8r3
I am wondering whether Mesos can offer resources from multiple network interfaces.
I would like to attach multiple network interfaces (public eth0, private eth1) to slave nodes, and bind specific applications running on those nodes to specific interfaces.
Does Mesos need distinct physical networks, the way OpenStack uses four distinct physical networks?
Is there any reference guide or doc?