Add metrics interfaces at edge #1573

Merged: 7 commits, Apr 17, 2020

fisherxu
Member

@fisherxu fisherxu commented Mar 31, 2020

What type of PR is this?

/kind feature

What this PR does / why we need it:
Previously, edgecore did not support kubelet's metrics interface, so users could not view monitoring data at the edge.

This PR integrates a cadvisor-based metrics interface into edged, so users can now collect monitoring data.

Note: for now, the metrics interface is only accessible on localhost, through an insecure server.

As a next step, we will support metrics-server in the cloud and collecting monitoring data from the edge through the cloud/edge hub.

Change in edgecore binary size (uncompressed):
92M -> 105M

Change in memory usage:
(0 pod): 64M -> 72M
(100 pods): 72M -> 96M

Memory usage with metrics disabled by setting EnableMetric to false:
(0 pod): 64M
(100 pods): 80M
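
For a rough sense of scale, the figures above can be turned into relative overheads. The snippet below is only arithmetic on the numbers quoted in this description; the variable and function names are illustrative and do not come from the PR.

```python
# Figures quoted in the PR description (all in "M" as written there).
binary_before, binary_after = 92, 105
mem_100_pods_enabled, mem_100_pods_disabled = 96, 80


def pct_increase(before: float, after: float) -> float:
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


binary_growth = pct_increase(binary_before, binary_after)
metrics_cost_100_pods = mem_100_pods_enabled - mem_100_pods_disabled

print(f"binary size growth: {binary_growth:.1f}%")
print(f"extra memory with metrics enabled at 100 pods: {metrics_cost_100_pods}M")
```

The same helper can be applied to the 0-pod case (64M -> 72M) to compare idle overhead against the loaded case.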

Xref: #1561

➜  ~ curl 127.0.0.1:10350/metrics/probes   
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.58564475334e+09
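
The output above is in the Prometheus text exposition format, so it can also be read programmatically. The following is a minimal sketch, assuming edgecore is running locally and serving the endpoint shown in the curl command. The helper names `parse_metrics` and `fetch_metrics` are hypothetical, and the parser deliberately handles only simple `name value` lines; lines with timestamps or with spaces inside label values are skipped.

```python
from typing import Dict
from urllib.request import urlopen


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse simple `name value` samples from Prometheus text output.

    Comment lines (# HELP / # TYPE) and blank lines are skipped. Any line
    that does not split into exactly two fields is ignored in this sketch.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue
        name, value = parts
        samples[name] = float(value)
    return samples


def fetch_metrics(url: str = "http://127.0.0.1:10350/metrics/probes") -> Dict[str, float]:
    """Fetch and parse metrics; assumes the localhost endpoint from this PR."""
    with urlopen(url, timeout=5) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Sample copied from the curl output above, so no running edgecore is needed.
sample = """\
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.58564475334e+09
"""
print(parse_metrics(sample))
```

Calling `fetch_metrics()` on the edge node should return the same kind of dictionary for the live endpoint. For anything beyond a quick check, a full Prometheus client parser would be a better fit than this hand-rolled one.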

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Add metrics interfaces at edge

@kubeedge-bot kubeedge-bot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 31, 2020
@kubeedge-bot kubeedge-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Mar 31, 2020
@kevin-wangzefeng kevin-wangzefeng added this to the v1.3 milestone Mar 31, 2020
@fisherxu
Member Author

fisherxu commented Apr 3, 2020

Metrics-server can use the stats/summary interface below to collect this information.

➜  ~ kubectl get pods
NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-86b86956b8-vb868   1/1     Running   0          15s
➜  ~ 
➜  ~ curl "127.0.0.1:10350/stats/summary"
{
 "node": {
  "nodeName": "edge-node",
  "systemContainers": [
   {
    "name": "kubelet",
    "startTime": "2020-04-03T10:16:05Z",
    "cpu": {
     "time": "2020-04-03T10:19:14Z",
     "usageNanoCores": 97041104,
     "usageCoreNanoSeconds": 1421531396180
    },
    "memory": {
     "time": "2020-04-03T10:19:14Z",
     "usageBytes": 525459456,
     "workingSetBytes": 468848640,
     "rssBytes": 74661888,
     "pageFaults": 45787751,
     "majorPageFaults": 1418
    }
   },
   {
    "name": "runtime",
    "startTime": "2020-04-03T10:16:05Z",
    "cpu": {
     "time": "2020-04-03T10:19:15Z",
     "usageNanoCores": 159960549,
     "usageCoreNanoSeconds": 163277825011
    },
    "memory": {
     "time": "2020-04-03T10:19:15Z",
     "usageBytes": 1162891264,
     "workingSetBytes": 1005391872,
     "rssBytes": 362520576,
     "pageFaults": 3391977,
     "majorPageFaults": 0
    }
   },
   {
    "name": "pods",
    "startTime": "2020-04-02T08:13:38Z",
    "cpu": {
     "time": "2020-04-03T10:19:06Z",
     "usageNanoCores": 356500246,
     "usageCoreNanoSeconds": 853667101402041
    },
    "memory": {
     "time": "2020-04-03T10:19:06Z",
     "availableBytes": 2506276864,
     "usageBytes": 7189700608,
     "workingSetBytes": 5864239104,
     "rssBytes": 1310015488,
     "pageFaults": 733896478,
     "majorPageFaults": 270
    }
   }
  ],
  "startTime": "2019-12-23T11:54:27Z",
  "cpu": {
   "time": "2020-04-03T10:19:06Z",
   "usageNanoCores": 356500246,
   "usageCoreNanoSeconds": 853667101402041
  },
  "memory": {
   "time": "2020-04-03T10:19:06Z",
   "availableBytes": 2506276864,
   "usageBytes": 7189700608,
   "workingSetBytes": 5864239104,
   "rssBytes": 1310015488,
   "pageFaults": 733896478,
   "majorPageFaults": 270
  },
  "network": {
   "time": "2020-04-03T10:19:06Z",
   "name": "eth0",
   "rxBytes": 22677920837,
   "rxErrors": 0,
   "txBytes": 11514467359,
   "txErrors": 0,
   "interfaces": [
    {
     "name": "edge0",
     "rxBytes": 0,
     "rxErrors": 0,
     "txBytes": 0,
     "txErrors": 0
    },
    {
     "name": "dummy0",
     "rxBytes": 0,
     "rxErrors": 0,
     "txBytes": 0,
     "txErrors": 0
    },
    {
     "name": "cni0",
     "rxBytes": 6397556,
     "rxErrors": 0,
     "txBytes": 14659853,
     "txErrors": 0
    },
    {
     "name": "flannel.1",
     "rxBytes": 0,
     "rxErrors": 0,
     "txBytes": 0,
     "txErrors": 0
    },
    {
     "name": "eth0",
     "rxBytes": 22677920837,
     "rxErrors": 0,
     "txBytes": 11514467359,
     "txErrors": 0
    }
   ]
  },
  "fs": {
   "time": "2020-04-03T10:19:06Z",
   "availableBytes": 47825649664,
   "capacityBytes": 126692048896,
   "usedBytes": 73266397184,
   "inodesFree": 6326802,
   "inodes": 7864320,
   "inodesUsed": 1537518
  },
  "runtime": {
   "imageFs": {
    "time": "2020-04-03T10:19:06Z",
    "availableBytes": 47825649664,
    "capacityBytes": 126692048896,
    "usedBytes": 7693978269,
    "inodesFree": 6326802,
    "inodes": 7864320,
    "inodesUsed": 1537518
   }
  },
  "rlimit": {
   "time": "2020-04-03T10:19:22Z",
   "maxpid": 32768,
   "curproc": 3124
  }
 },
 "pods": [
  {
   "podRef": {
    "name": "nginx-deployment-86b86956b8-vb868",
    "namespace": "default",
    "uid": "5911a297-175e-431b-8c22-3b85586aaa12"
   },
   "startTime": "2020-04-03T10:19:04Z",
   "containers": [
    {
     "name": "nginx",
     "startTime": "2020-04-03T10:19:08Z",
     "cpu": {
      "time": "2020-04-03T10:19:19Z",
      "usageNanoCores": 0,
      "usageCoreNanoSeconds": 29694404
     },
     "memory": {
      "time": "2020-04-03T10:19:19Z",
      "availableBytes": 131338240,
      "usageBytes": 2887680,
      "workingSetBytes": 2879488,
      "rssBytes": 1441792,
      "pageFaults": 1114,
      "majorPageFaults": 0
     },
     "rootfs": {
      "time": "2020-04-03T10:19:19Z",
      "availableBytes": 47825649664,
      "capacityBytes": 126692048896,
      "usedBytes": 57344,
      "inodesFree": 6326802,
      "inodes": 7864320,
      "inodesUsed": 14
     },
     "logs": {
      "time": "2020-04-03T10:19:19Z",
      "availableBytes": 47825649664,
      "capacityBytes": 126692048896,
      "usedBytes": 24576,
      "inodesFree": 6326802,
      "inodes": 7864320,
      "inodesUsed": 1537518
     }
    }
   ],
   "network": {
    "time": "2020-04-03T10:19:15Z",
    "name": "eth0",
    "rxBytes": 648,
    "rxErrors": 0,
    "txBytes": 0,
    "txErrors": 0,
    "interfaces": [
     {
      "name": "eth0",
      "rxBytes": 648,
      "rxErrors": 0,
      "txBytes": 0,
      "txErrors": 0
     }
    ]
   },
   "volume": [
    {
     "time": "2020-04-03T10:19:05Z",
     "availableBytes": 4185243648,
     "capacityBytes": 4185255936,
     "usedBytes": 12288,
     "inodesFree": 1021782,
     "inodes": 1021791,
     "inodesUsed": 9,
     "name": "default-token-qnbp2"
    }
   ],
   "ephemeral-storage": {
    "time": "2020-04-03T10:19:19Z",
    "availableBytes": 47825649664,
    "capacityBytes": 126692048896,
    "usedBytes": 81920,
    "inodesFree": 6326802,
    "inodes": 7864320,
    "inodesUsed": 14
   }
  }
 ]
}
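
To illustrate how a consumer might read this response, here is a minimal sketch that sums container CPU and memory per pod. It is not metrics-server's actual logic (which is not part of this PR); the function name `pod_usage` and the output keys are hypothetical, and only field names that appear in the JSON above are read.

```python
import json


def pod_usage(summary: dict) -> list:
    """Return one record per pod, with CPU and memory summed over its containers."""
    rows = []
    for pod in summary.get("pods", []):
        ref = pod["podRef"]
        containers = pod.get("containers", [])
        rows.append({
            "pod": f'{ref["namespace"]}/{ref["name"]}',
            "cpu_nano_cores": sum(
                c.get("cpu", {}).get("usageNanoCores", 0) for c in containers
            ),
            "memory_working_set_bytes": sum(
                c.get("memory", {}).get("workingSetBytes", 0) for c in containers
            ),
        })
    return rows


# Trimmed copy of the response above; only the fields read here are kept.
sample = json.loads("""
{
 "node": {"nodeName": "edge-node"},
 "pods": [
  {
   "podRef": {"name": "nginx-deployment-86b86956b8-vb868", "namespace": "default"},
   "containers": [
    {
     "name": "nginx",
     "cpu": {"usageNanoCores": 0, "usageCoreNanoSeconds": 29694404},
     "memory": {"usageBytes": 2887680, "workingSetBytes": 2879488}
    }
   ]
  }
 ]
}
""")

for row in pod_usage(sample):
    print(row)
```

In practice the input would be the full body returned by `curl "127.0.0.1:10350/stats/summary"`, parsed with `json.loads`.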

@fisherxu
Member Author

fisherxu commented Apr 3, 2020

It's ready to review now, ptal
@kevin-wangzefeng @kadisi @subpathdev @chendave

Member

@chendave chendave left a comment


Great job, fisherxu!

Overall, I am good with the PR. Here are some early comments from me, though they may not all be valid:

  • TBH, this PR scared me as it is too large :). Can we split it into smaller commits (especially the second one, "add cadvisor") for easier review?
  • We might need some docs, for example on how to enable/disable the feature and how to collect data.
  • The size of edgecore is getting bigger and bigger; I am not sure whether we are moving in the right direction as more and more features are added.
  • We might need a release cycle focused on making kubeedge more stable: more refined code, doc improvements, feature enhancements, etc.

I will look into the details later.

Review threads (all resolved) on: edge/pkg/edged/edged.go, edge/pkg/edged/edged_pods.go, edge/pkg/edged/server/server.go
@fisherxu fisherxu force-pushed the metrics-edge branch 2 times, most recently from c44eb52 to 6c1fb52 Compare April 17, 2020 10:49
@fisherxu
Member Author

fisherxu commented Apr 17, 2020

The PR has been rebased, ptal @kevin-wangzefeng @chendave @kadisi

@kadisi
Member

kadisi commented Apr 17, 2020

/lgtm

thanks @fisherxu

@kubeedge-bot kubeedge-bot added the lgtm Indicates that a PR is ready to be merged. label Apr 17, 2020
@kevin-wangzefeng
Member

/approve

@kubeedge-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kevin-wangzefeng


@kubeedge-bot kubeedge-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 17, 2020
@kubeedge-bot kubeedge-bot merged commit 1c7ae3f into kubeedge:master Apr 17, 2020
@fisherxu fisherxu deleted the metrics-edge branch May 19, 2020 12:15
@chendave
Member

/ref #1735

5 participants