Adding a helm chart
Luis Davim authored and rchakode committed Apr 16, 2019
1 parent cb9b6f4 commit 097c09d
Showing 17 changed files with 427 additions and 112 deletions.
7 changes: 4 additions & 3 deletions Dockerfile
@@ -18,17 +18,18 @@ COPY requirements.txt \
NOTICE \
$APP_HOME/

RUN mkdir -p $APP_HOME/static/images
RUN mkdir -p $APP_HOME/static/images /data
COPY static/images/kube-opex-analytics.png $APP_HOME/static/images/
COPY static/images/favicon.ico $APP_HOME/static/images/

WORKDIR $APP_HOME

RUN pip3 install -r requirements.txt

RUN useradd $RUNTIME_USER && \
usermod $RUNTIME_USER -d $APP_HOME && \
chown -R $RUNTIME_USER:$RUNTIME_USER $APP_HOME && \
chown -R $RUNTIME_USER:$RUNTIME_USER /data

# USER $RUNTIME_USER

4 changes: 2 additions & 2 deletions NOTICE
@@ -1,8 +1,8 @@
Kubernetes Opex Analytics
Copyright © 2019 Rodrigue Chakode <rodrigue.chakode at gmail dot com> and contributors.

Kubernetes Opex Analytics is licensed under the Apache License 2.0 (the "License");
you may not use this file except in compliance with the License. You may obtain a
copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0
14 changes: 7 additions & 7 deletions README.md
@@ -3,12 +3,12 @@ Are you impatient and do first want to see Kubernetes Opex Analytics in action b

* [Take a look at the online demo](http://kube-opex-analytics.realopinsight.com:5483)

The demo is live over an actual small Kubernetes cluster running in GKE.

It should display charts as documented later in this document. Each chart provides a tooltip that can be activated by hovering with the mouse.

## What is Kubernetes Opex Analytics
Kubernetes Opex Analytics provides short-, mid- and long-term resource usage dashboards over Kubernetes clusters, allowing organizations to understand how their Kubernetes operating costs are spent by their different projects. The final **goal is to help them make cost allocation and capacity planning decisions** with factual analytics.

To meet this goal, Kubernetes Opex Analytics collects CPU and memory usage metrics from Kubernetes's metrics APIs, processes and consolidates them over time to produce resource usage analytics on the basis of namespaces and with different time aggregation perspectives that cover up to a year. These perspectives also show a special usage item labelled _non-allocatable_ highlighting the **share of non-allocatable capacity** for both CPU and memory.
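As a concrete illustration of the metrics APIs involved, the sketch below queries the standard `metrics.k8s.io` endpoints by hand through `kubectl proxy`, whose default address (`http://127.0.0.1:8001`) matches the backend's default `KOA_K8S_API_ENDPOINT` shown further down. This is not an excerpt from the repository; it assumes metrics-server (or an equivalent metrics API provider) is installed in the cluster.

```sh
# Illustrative only: standard Kubernetes metrics API endpoints, queried via kubectl proxy.
kubectl proxy --port=8001 &
curl -s http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/nodes   # per-node CPU/memory usage
curl -s http://127.0.0.1:8001/apis/metrics.k8s.io/v1beta1/pods    # per-pod CPU/memory usage
```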

@@ -70,15 +70,15 @@ In this command:
* We provide a local path `/var/lib/kube-opex-analytics` as a data volume for the container. That's where Kubernetes Opex Analytics will store its internal analytics data. You can change the local path to another location, but you MUST take care to adapt the `KOA_DB_LOCATION` environment variable accordingly.
* The environment variable `KOA_DB_LOCATION` points to the path used by Kubernetes Opex Analytics to store its internal data. Note that this directory belongs to the data volume attached to the container.
* The environment variable `KOA_K8S_API_ENDPOINT` sets the address of the Kubernetes API endpoint (an illustrative command is sketched below).
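The full `docker run` command sits above the visible part of this hunk; the sketch below is an illustrative reconstruction based only on the options described in the bullets above. The image name and the container-side mount path are assumptions, not taken from this diff.

```sh
# Illustrative reconstruction; image name and container-side paths are assumed.
docker run -d --name kube-opex-analytics \
  -p 5483:5483 \
  -v /var/lib/kube-opex-analytics:/var/lib/kube-opex-analytics \
  -e KOA_DB_LOCATION=/var/lib/kube-opex-analytics/db \
  -e KOA_K8S_API_ENDPOINT=http://127.0.0.1:8001 \
  rchakode/kube-opex-analytics
```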

You can then access the web interface at `http://127.0.0.1:5483/`.

> Due to the time needed to gather sufficient data to consolidate, you may need to wait almost an hour before all charts are filled. This is normal operation of Kubernetes Opex Analytics.
## What's Next
Kubernetes Opex Analytics is currently at an early stage but is already useful and ready to use. We encourage feedback and will do our best to be proactive in handling any troubles you may encounter when using it.

Meanwhile, we already have some ideas of improvements for the next releases: https://github.com/rchakode/kube-opex-analytics/issues.

Other ideas are welcome; please open an issue to submit yours.

@@ -92,4 +92,4 @@ The tool includes and is bound to third-party libraries provided with their owns
## Contributions
Contributions are accepted provided that the code and documentation are released under the terms of the Apache License 2.0.

To contribute bug patches or new features, you can use the GitHub Pull Request model.
109 changes: 55 additions & 54 deletions backend.py
@@ -1,4 +1,4 @@
"""
"""
# File: backend.py #
# Author: Rodrigue Chakode <rodrigue.chakode @ gmail dot com> #
# #
@@ -13,7 +13,7 @@
# Unless required by applicable law or agreed to in writing, software distributed #
# under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR #
# CONDITIONS OF ANY KIND, either express or implied. See the License for the #
# specific language governing permissions and limitations under the License. #
"""

import argparse
@@ -39,6 +39,7 @@

# load dynamic configuration settings
KOA_K8S_API_ENDPOINT = os.getenv('KOA_K8S_API_ENDPOINT', 'http://127.0.0.1:8001')
KOA_K8S_API_VERIFY_SLL = os.getenv('KOA_K8S_API_VERIFY_SLL', True)
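# Note: when KOA_K8S_API_VERIFY_SLL is set in the environment, os.getenv returns it as a
# string; requests' 'verify' argument also accepts the path of a CA bundle file.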
KOA_DEFAULT_DB_LOCATION = ('%s/.kube-opex-analytics/db') % os.getenv('HOME', '/tmp')
KOA_DB_LOCATION = os.getenv('KOA_DB_LOCATION', KOA_DEFAULT_DB_LOCATION)
KOA_POLLING_INTERVAL_SEC = int(os.getenv('KOA_POLLING_INTERVAL_SEC', '300'))
@@ -49,7 +50,7 @@
KOA_BILING_HOURLY_RATE = float(os.getenv('KOA_BILING_HOURLY_RATE'))
except:
KOA_BILING_HOURLY_RATE = 0

# fixed configuration settings
STATIC_CONTENT_LOCATION = '/static'
FRONTEND_DATA_LOCATION = '.%s/data' % (STATIC_CONTENT_LOCATION)
@@ -308,7 +309,7 @@ def extract_pod_metrics(self, data):
# exit if not valid data
if data is None:
return
# process likely valid data
data_json = json.loads(data)
for _, item in enumerate(data_json['items']):
podName = '%s.%s' % (item['metadata']['name'], item['metadata']['namespace'])
@@ -318,36 +319,36 @@ def extract_pod_metrics(self, data):
pod.memUsage = 0.0
for _, container in enumerate(item['containers']):
pod.cpuUsage += self.decode_cpu_capacity(container['usage']['cpu'])
pod.memUsage += self.decode_memory_capacity(container['usage']['memory'])
self.pods[pod.name] = pod


def consolidate_ns_usage(self):
self.cpuUsedByPods = 0.0
self.memUsedByPods = 0.0
for pod in self.pods.values():
if hasattr(pod, 'cpuUsage') and hasattr(pod, 'memUsage'):
self.cpuUsedByPods += pod.cpuUsage
self.nsResUsage[pod.namespace].cpuUsage += pod.cpuUsage
self.nsResUsage[pod.namespace].memUsage += pod.memUsage
self.memUsedByPods += pod.memUsage
self.nodes[pod.nodeName].podsRunning.append(pod)
self.cpuCapacity += 0.0
self.memCapacity += 0.0
for node in self.nodes.values():
if hasattr(node, 'cpuCapacity') and hasattr(node, 'memCapacity'):
self.cpuCapacity += node.cpuCapacity
self.memCapacity += node.memCapacity
self.cpuAllocatable += 0.0
self.memAllocatable += 0.0
for node in self.nodes.values():
if hasattr(node, 'cpuAllocatable') and hasattr(node, 'memAllocatable'):
self.cpuAllocatable += node.cpuAllocatable
self.memAllocatable += node.memAllocatable

def dump_nodes(self):
with open(str('%s/nodes.json' % FRONTEND_DATA_LOCATION), 'w') as fd:
fd.write(json.dumps(self.nodes, cls=JSONMarshaller))

def compute_usage_percent_ratio(value, total):
return round((100.0*value) / total, 1)
@@ -366,10 +367,10 @@ class RrdPeriod(enum.IntEnum):

class ResUsageType(enum.IntEnum):
CPU = 0
MEMORY = 1
CONSOLIDATED = 2
CUMULATED_COST = 3

class Rrd:
def __init__(self, db_files_location=None, dbname=None):
create_directory_if_not_exists(db_files_location)
@@ -395,19 +396,19 @@ def create_rrd_file_if_not_exists(self):
str('DS:estimated_cost:GAUGE:%d:U:U' % xfs),
"RRA:AVERAGE:0.5:1:4032",
"RRA:AVERAGE:0.5:12:8880")

def add_value(self, probe_ts, cpu_usage, mem_usage, consolidated_usage, estimated_cost):
rrdtool.update(self.rrd_location, '%s:%s:%s:%s:%s'%(
probe_ts,
round(cpu_usage, 1),
round(mem_usage, 1),
round(consolidated_usage, 1),
round(estimated_cost, 1)))

def dump_trends_data(self, period, step_in):
end_ts_in = int(int(calendar.timegm(time.gmtime()) * step_in) / step_in)
start_ts_in = int(end_ts_in - int(period))
result = rrdtool.fetch(self.rrd_location, 'AVERAGE', '-r', str(step_in), '-s', str(start_ts_in), '-e', str(end_ts_in))
res_usage = collections.defaultdict(list)
sum_res_usage = collections.defaultdict(lambda:0.0)
cumulated_cost = 0.0
@@ -424,8 +425,8 @@ def dump_trends_data(self, period, step_in):
cumulated_cost += round(100*float(cdp[3]))/100
datetime_utc_json = time.strftime('%Y-%m-%dT%H:%M:%SZ', datetime_utc)
res_usage[ResUsageType.CPU].append('{"name":"%s","dateUTC":"%s","usage":%f}' % (self.dbname, datetime_utc_json, current_cpu_usage))
res_usage[ResUsageType.MEMORY].append('{"name":"%s","dateUTC":"%s","usage":%f}' % (self.dbname, datetime_utc_json, current_mem_usage))
res_usage[ResUsageType.CONSOLIDATED].append('{"name":"%s","dateUTC":"%s","usage":%s}' % (self.dbname, datetime_utc_json, current_consolidated_usage))
res_usage[ResUsageType.CUMULATED_COST].append('{"name":"%s", "dateUTC":"%s","usage":%s}' % (self.dbname, datetime_utc_json, cumulated_cost))
sum_res_usage[ResUsageType.CPU] += current_cpu_usage
sum_res_usage[ResUsageType.MEMORY] += current_mem_usage
@@ -436,20 +437,20 @@ def dump_histogram_data(self, period, step_in):

if sum_res_usage[ResUsageType.CPU] > 0.0 and sum_res_usage[ResUsageType.CPU] > 0.0:
return (','.join(res_usage[ResUsageType.CPU]),
','.join(res_usage[ResUsageType.MEMORY]),
','.join(res_usage[ResUsageType.CONSOLIDATED]),
','.join(res_usage[ResUsageType.CUMULATED_COST]))
return '', '', '', ''


def dump_histogram_data(self, period, step_in):
end_ts_in = int(int(calendar.timegm(time.gmtime()) * step_in) / step_in)
start_ts_in = int(end_ts_in - int(period))
result = rrdtool.fetch(self.rrd_location, 'AVERAGE', '-r', str(step_in), '-s', str(start_ts_in), '-e', str(end_ts_in))
periodic_cpu_usage = collections.defaultdict(lambda:0.0)
periodic_mem_usage = collections.defaultdict(lambda:0.0)
periodic_consolidated_usage = collections.defaultdict(lambda:0.0)
valid_rows = collections.defaultdict(lambda:0.0)
start_ts_out, _, step = result[0]
current_ts = start_ts_out
for _, cdp in enumerate( result[2] ):
@@ -470,48 +471,48 @@ def dump_histogram_data(self, period, step_in):
return periodic_cpu_usage, periodic_mem_usage, periodic_consolidated_usage, valid_rows

@staticmethod
def dump_trend_analytics(dbfiles):
res_usage = collections.defaultdict(list)
for _, dbfile in enumerate(dbfiles):
dbfile_splitted=os.path.splitext(dbfile)
if len(dbfile_splitted) == 2 and dbfile_splitted[1] == '.rrd':
rrd = Rrd(db_files_location=KOA_DB_LOCATION, dbname=dbfile_splitted[0])
analytics = rrd.dump_trends_data(period=RrdPeriod.PERIOD_7_DAYS_SEC, step_in=3600)
for usage_type in range(4):
if analytics[usage_type]:
res_usage[usage_type].append(analytics[usage_type])
with open(str('%s/cpu_usage_trends.json' % FRONTEND_DATA_LOCATION), 'w') as fd:
fd.write('['+','.join(res_usage[0])+']')
with open(str('%s/memory_usage_trends.json' % FRONTEND_DATA_LOCATION), 'w') as fd:
fd.write('['+','.join(res_usage[1])+']')
with open(str('%s/consolidated_usage_trends.json' % FRONTEND_DATA_LOCATION), 'w') as fd:
fd.write('['+','.join(res_usage[2])+']')
with open(str('%s/estimated_usage_trends.json' % FRONTEND_DATA_LOCATION), 'w') as fd:
fd.write('['+','.join(res_usage[3])+']')

@staticmethod
def dump_histogram_analytics(dbfiles, period):
res_usage = collections.defaultdict(list)
for _, dbfile in enumerate(dbfiles):
dbfile_splitted=os.path.splitext(dbfile)
if len(dbfile_splitted) == 2 and dbfile_splitted[1] == '.rrd':
dbname = dbfile_splitted[0]
rrd = Rrd(db_files_location=KOA_DB_LOCATION, dbname=dbname)
analytics = rrd.dump_histogram_data(period=period, step_in=3600)
# valid_rows = analytics[3]
for usage_type in range(3):
for date_key, value in analytics[usage_type].items():
if value > 0.0:
# res_usage[usage_type].append('{"stack":"%s","usage":%f,"date":"%s"}' % (dbname, value/valid_rows[date_key], date_key))
res_usage[usage_type].append('{"stack":"%s","usage":%f,"date":"%s"}' % (dbname, value, date_key))

# write exported data to files
with open(str('%s/cpu_usage_period_%d.json' % (FRONTEND_DATA_LOCATION, period)), 'w') as fd:
fd.write('['+','.join(res_usage[0])+']')
with open(str('%s/memory_usage_period_%d.json' % (FRONTEND_DATA_LOCATION, period)), 'w') as fd:
fd.write('['+','.join(res_usage[1])+']')
with open(str('%s/consolidated_usage_period_%d.json' % (FRONTEND_DATA_LOCATION, period)), 'w') as fd:
fd.write('['+','.join(res_usage[2])+']')



@@ -520,15 +521,15 @@ def pull_k8s(api_context):
api_endpoint = '%s%s' % (KOA_K8S_API_ENDPOINT, api_context)

try:
http_req = requests.get(api_endpoint)
http_req = requests.get(api_endpoint, verify=KOA_K8S_API_VERIFY_SLL)
if http_req.status_code == 200:
data = http_req.text
else:
print_error("%s [ERROR] '%s' returned error (%s)" % (time.strftime("%Y-%M-%d %H:%M:%S"), api_endpoint, http_req.text))
except requests.exceptions.RequestException as ex:
print_error("%s [ERROR] HTTP exception requesting '%s' (%s)" % (time.strftime("%Y-%M-%d %H:%M:%S"), api_endpoint, ex))
except:
print_error("%s [ERROR] exception requesting '%s'" % (time.strftime("%Y-%M-%d %H:%M:%S"), api_endpoint))
print_error("%s [ERROR] HTTP exception requesting '%s' (%s)" % (time.strftime("%Y-%M-%d %H:%M:%S"), api_endpoint, ex))
except:
print_error("%s [ERROR] exception requesting '%s'" % (time.strftime("%Y-%M-%d %H:%M:%S"), api_endpoint))

return data

@@ -560,7 +561,7 @@ def create_metrics_puller():
estimated_cost = consolidated_usage * (KOA_POLLING_INTERVAL_SEC * KOA_BILING_HOURLY_RATE) / 36
rrd.add_value(probe_ts, cpu_usage=cpu_usage, mem_usage=mem_usage, consolidated_usage=consolidated_usage, estimated_cost=estimated_cost)
time.sleep(int(KOA_POLLING_INTERVAL_SEC))


def dump_analytics():
export_interval = round(1.5 * KOA_POLLING_INTERVAL_SEC)
4 changes: 2 additions & 2 deletions entrypoint.sh
@@ -14,7 +14,7 @@
# Unless required by applicable law or agreed to in writing, software distributed #
# under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR #
# CONDITIONS OF ANY KIND, either express or implied. See the License for the #
# specific language governing permissions and limitations under the License. #

LC_ALL='C.UTF-8' LANG='C.UTF-8' \
python3 -u ./backend.py
22 changes: 22 additions & 0 deletions helm/kube-opex-analytics/.helmignore
@@ -0,0 +1,22 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
5 changes: 5 additions & 0 deletions helm/kube-opex-analytics/Chart.yaml
@@ -0,0 +1,5 @@
apiVersion: v1
appVersion: "1.0"
description: A Helm chart for Kubernetes Opex Analytics
name: kube-opex-analytics
version: 0.1.0
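For context, a minimal sketch of how the chart added by this commit could be deployed, assuming Helm 2 (the current major version at the time of this commit); the release name and namespace are illustrative, and any value overrides would depend on the chart's values.yaml, which is not shown in this excerpt.

```sh
# Illustrative only: release name and namespace are arbitrary choices.
helm install --name kube-opex-analytics --namespace kube-opex-analytics helm/kube-opex-analytics

# Upgrade the same release after chart changes:
helm upgrade kube-opex-analytics helm/kube-opex-analytics
```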
