PodResources interface enhancements #1884

Merged

Commits on Oct 6, 2020

  1. Add numaid and cpus into PodResources interface

    This change is necessary for the daemon that exports resources with
    topology information, which is used in topology-aware scheduling.
    
    Information about CPUs is kept in cpu_ids, since that is enough to
    represent both the quantity and the NUMA id. The NUMA id can be obtained
    from cadvisor MachineInfo, since each id in cpu_ids is a thread_id.
    This API doesn't provide CPU fractions, since they can be obtained from
    the Pod's requests/limits, and in the cases of non-integer CPU quantity
    and non-guaranteed QoS the assigned CPUs are not exclusive and the NUMA
    id is not of interest.
    
    Device assignment to NUMA nodes is represented by the Topology
    structure. The structure was chosen for several reasons:
     - better extensibility (while keeping backward compatibility)
     - a more compact representation of the case where a device is aligned
       to several NUMA nodes (compared to just a numa_id field in
       ContainerDevices)
    
    Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
    AlexeyPerevalov committed Oct 6, 2020
    1372b3d
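
The CPU and device reporting described in the commit message above can be sketched in Go as follows. This is only an illustration under stated assumptions: the type and function names (`ContainerResources`, `TopologyInfo`, `cpusPerNUMA`, `deviceNUMANodes`) are hypothetical and are not the generated PodResources types, and the `threadToNUMA` map stands in for a lookup that a real agent would build from cadvisor MachineInfo.

```go
package main

import (
	"fmt"
	"sort"
)

// NUMANode and TopologyInfo are hypothetical stand-ins for the Topology
// structure mentioned in the commit message: a device may be aligned to
// several NUMA nodes, so the node list is repeated rather than a single id.
type NUMANode struct{ ID int64 }

type TopologyInfo struct{ Nodes []NUMANode }

// ContainerDevices is a hypothetical per-resource device assignment.
type ContainerDevices struct {
	ResourceName string
	DeviceIDs    []string
	Topology     TopologyInfo
}

// ContainerResources is a hypothetical per-container view. CPUIDs holds
// thread ids of the exclusively assigned CPUs.
type ContainerResources struct {
	Name    string
	CPUIDs  []int64
	Devices []ContainerDevices
}

// cpusPerNUMA counts the assigned CPUs per NUMA node. threadToNUMA is an
// assumed thread-id to NUMA-node lookup (for example derived from cadvisor
// MachineInfo); unknown thread ids are skipped.
func cpusPerNUMA(cpuIDs []int64, threadToNUMA map[int64]int64) map[int64]int {
	counts := make(map[int64]int)
	for _, id := range cpuIDs {
		if node, ok := threadToNUMA[id]; ok {
			counts[node]++
		}
	}
	return counts
}

// deviceNUMANodes returns the sorted, de-duplicated NUMA node ids that the
// given devices are aligned to.
func deviceNUMANodes(devs []ContainerDevices) []int64 {
	seen := make(map[int64]bool)
	out := []int64{}
	for _, d := range devs {
		for _, n := range d.Topology.Nodes {
			if !seen[n.ID] {
				seen[n.ID] = true
				out = append(out, n.ID)
			}
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	// Assumed machine layout: threads 0-1 on NUMA node 0, threads 2-3 on node 1.
	threadToNUMA := map[int64]int64{0: 0, 1: 0, 2: 1, 3: 1}
	c := ContainerResources{
		Name:   "app",
		CPUIDs: []int64{1, 2, 3},
		Devices: []ContainerDevices{{
			ResourceName: "example.com/nic",
			DeviceIDs:    []string{"nic-0"},
			Topology:     TopologyInfo{Nodes: []NUMANode{{ID: 0}, {ID: 1}}},
		}},
	}
	fmt.Println(cpusPerNUMA(c.CPUIDs, threadToNUMA))
	fmt.Println(deviceNUMANodes(c.Devices))
}
```

The sketch shows why the list of thread ids is sufficient: the CPU quantity is the length of the list, and the NUMA placement follows from the lookup. A device spanning two NUMA nodes needs only one entry with two nodes in its topology.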
  2. podresources: add endpoint to query all resources

    To enable topology-aware scheduling, the reporting agent needs
    to learn about the available resources on the worker node, so
    that it can inform the scheduler, which can then make an informed
    decision.
    
    As it is now, the kubelet is the only authoritative source for
    resource accounting, for example because it manages the device
    plugins. Thus we need to extract this information from the kubelet.
    
    A good fit for this task is the podresources API, so we add here
    a new endpoint to let clients enumerate the resources.
    
    Signed-off-by: Francesco Romani <fromani@redhat.com>
    ffromani authored and AlexeyPerevalov committed Oct 6, 2020
    af8e194
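
As a rough illustration of how a reporting agent might use such an enumeration, the Go sketch below subtracts the CPUs already assigned to containers from the full set the node exposes. The function name `freeCPUs` and the plain slices are assumptions made for this example; the commit message does not specify the shape of the new endpoint's response.

```go
package main

import (
	"fmt"
	"sort"
)

// freeCPUs returns the CPU ids from allocatable that are not assigned to any
// container. allocatable is assumed to come from the resource enumeration
// endpoint and assigned from the per-container listing; both shapes are
// hypothetical.
func freeCPUs(allocatable []int64, assigned [][]int64) []int64 {
	used := make(map[int64]bool)
	for _, ids := range assigned {
		for _, id := range ids {
			used[id] = true
		}
	}
	free := []int64{}
	for _, id := range allocatable {
		if !used[id] {
			free = append(free, id)
		}
	}
	sort.Slice(free, func(i, j int) bool { return free[i] < free[j] })
	return free
}

func main() {
	allocatable := []int64{0, 1, 2, 3, 4, 5}
	assigned := [][]int64{{0, 1}, {4}} // two containers with exclusive CPUs
	fmt.Println(freeCPUs(allocatable, assigned))
}
```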
  3. node: podresources: address review comments

    Signed-off-by: Francesco Romani <fromani@redhat.com>
    Signed-off-by: Alexey Perevalov <alexey.perevalov@huawei.com>
    ffromani authored and AlexeyPerevalov committed Oct 6, 2020
    3b9ac4c
  4. podresources: add Watch API

    Extend the protocol with a simple implementation of ListAndWatch
    to enable monitoring agents to be notified of resource allocation
    changes.
    
    A separate Watch endpoint raises the question of how
    client applications can avoid losing updates
    when both APIs are used.
    
    The straightforward option is to follow the generic k8s approach (see
    link below) and let the kubelet keep a historical window of
    the most recent changes, so client applications have the chance to
    issue `List` and shortly after `Watch`, starting from the
    resourceVersion returned by `List`.
    The underlying assumption is that `Watch` happens "shortly" after `List`;
    otherwise the system cannot guarantee the absence of gaps.
    
    However, implementing this support requires keeping the aforementioned
    sliding window of changes, which in turn requires a careful implementation
    to address scalability and safety guarantees.
    
    On the other hand, the `podresources` API is a special-purpose API, so,
    while it is good to follow the generic API concepts as much as possible,
    it can allow some small differences that help keep the implementation
    simple and safe.
    
    This patch proposes the simplest possible approach to reconcile the
    `List` and `Watch` responses, providing the `resource_version` field and
    suggesting a small change in the programming model of client applications.
    
    Inspired by the concepts found on
    https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes
    
    Signed-off-by: Francesco Romani <fromani@redhat.com>
    ffromani authored and AlexeyPerevalov committed Oct 6, 2020
    ad1294d
  5. podresources: fix markdown

    align section with TOC
    
    Signed-off-by: Francesco Romani <fromani@redhat.com>
    ffromani authored and AlexeyPerevalov committed Oct 6, 2020
    4b500dd
  6. address review comments

    Signed-off-by: Francesco Romani <fromani@redhat.com>
    ffromani authored and AlexeyPerevalov committed Oct 6, 2020
    60b1c07