# 5 start monitoring & logging on Azure AKS

Change ${PJ_ROOT} to your own directory.

In [None]:
export PJ_ROOT="${HOME}/core"
cd ${PJ_ROOT};pwd

example)
```
/Users/user/core
```

## load environment variables

In [None]:
source ${PJ_ROOT}/docs/environments/azure_aks/env
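
Later cells use variables such as `DOMAIN` and `DNS_ZONE_RG`, which are expected to come from this env file. The following is a minimal sketch of a check that fails early when one of them is empty. `require_vars` is a hypothetical helper and is not part of this repository.

```shell
# require_vars NAME...: return non-zero if any named variable is unset or empty.
# (hypothetical helper, shown only as an illustration)
require_vars() {
  missing=0
  for name in "$@"; do
    # indirect lookup of the variable called $name
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `require_vars DOMAIN DNS_ZONE_RG` could be run right after the `source` command.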

## setup alias

In [None]:
if [ "$(uname)" == 'Darwin' ]; then
  alias openbrowser='open'
elif [ "$(expr substr $(uname -s) 1 5)" == 'Linux' ]; then
  alias openbrowser='xdg-open'
else
  echo "Your platform ($(uname -a)) is not supported."
  exit 1
fi
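
Bash does not expand aliases in non-interactive shells unless `expand_aliases` is set, so the alias may not work when these cells are copied into a script. As a sketch of one possible alternative, the same dispatch can be written as a function. It returns an error code instead of calling `exit`, so it does not terminate the calling shell.

```shell
# openbrowser as a function: opens its arguments with the platform's opener.
openbrowser() {
  case "$(uname -s)" in
    Darwin) open "$@" ;;
    Linux*) xdg-open "$@" ;;
    *)
      echo "Your platform ($(uname -a)) is not supported." >&2
      return 1
      ;;
  esac
}
```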

## start fiware cygnus for elasticsearch

In [None]:
kubectl apply -f cygnus/cygnus-elasticsearch-service.yaml

In [None]:
kubectl apply -f cygnus/cygnus-elasticsearch-deployment.yaml

In [None]:
kubectl get pods -l app=cygnus-elasticsearch

example)
```
NAME                                    READY   STATUS    RESTARTS   AGE
cygnus-elasticsearch-689b7f5fd8-dtptx   1/1     Running   0          36s
cygnus-elasticsearch-689b7f5fd8-wj5vm   1/1     Running   0          36s
cygnus-elasticsearch-689b7f5fd8-xnhhj   1/1     Running   0          36s
```
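
Instead of repeating `kubectl get pods` until every pod is `Running`, `kubectl wait` can block until the pods are ready. The sketch below wraps it in a hypothetical helper `wait_for_ready`. The 120s default timeout is an assumption, not a value taken from this guide.

```shell
# wait_for_ready SELECTOR [NAMESPACE] [TIMEOUT]
# Blocks until all pods matching SELECTOR report the Ready condition.
# (hypothetical helper; the default timeout is an assumption)
wait_for_ready() {
  selector="$1"
  namespace="${2:-}"
  timeout="${3:-120s}"
  if [ -n "$namespace" ]; then
    kubectl wait --for=condition=Ready pods -l "$selector" --namespace "$namespace" --timeout="$timeout"
  else
    # no namespace given: use the namespace of the current context
    kubectl wait --for=condition=Ready pods -l "$selector" --timeout="$timeout"
  fi
}
```

For the pods above, this would be `wait_for_ready app=cygnus-elasticsearch`.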

In [None]:
kubectl get services -l app=cygnus-elasticsearch

example)
```
NAME                   TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
cygnus-elasticsearch   ClusterIP   10.0.93.83   <none>        5050/TCP,8081/TCP   1m
```

## start prometheus & grafana

### install prometheus-operator

In [None]:
helm install stable/prometheus-operator --name po --namespace monitoring -f monitoring/prometheus-operator-azure.yaml
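
The command above uses Helm 2 syntax (`--name`). Helm 3 removed `--name` and takes the release name as the first positional argument. The sketch below picks the argument order from a version string. `helm_install_args` is a hypothetical helper, and it assumes the `stable` chart repository is still configured on your client.

```shell
# helm_install_args VERSION_STRING
# Prints the arguments for "helm" matching the given "helm version --short" output.
# (hypothetical helper; assumes the stable repository is available)
helm_install_args() {
  chart="stable/prometheus-operator"
  common="--namespace monitoring -f monitoring/prometheus-operator-azure.yaml"
  case "$1" in
    *v2.*) echo "install $chart --name po $common" ;;  # Helm 2: release name via --name
    *)     echo "install po $chart $common" ;;         # Helm 3: release name is positional
  esac
}
```

A possible use is `helm $(helm_install_args "$(helm version --short)")`, left unquoted on purpose so that the printed arguments are split into words.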

In [None]:
kubectl --namespace monitoring get pods -l "app=prometheus-operator-operator,release=po"

example)
```
NAME                                               READY   STATUS    RESTARTS   AGE
po-prometheus-operator-operator-7cf7c5cc97-78h9g   1/1     Running   0          2m28s
```

In [None]:
kubectl get daemonsets --namespace monitoring

example)
```
NAME                          DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
po-prometheus-node-exporter   4         4         4       4            4           <none>          2m48s
```

In [None]:
kubectl get deployments --namespace monitoring

example)
```
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
po-grafana                        1/1     1            1           3m51s
po-kube-state-metrics             1/1     1            1           3m51s
po-prometheus-operator-operator   1/1     1            1           3m51s
```

In [None]:
kubectl get statefulsets --namespace monitoring

example)
```
NAME                                               READY   AGE
alertmanager-po-prometheus-operator-alertmanager   1/1     3m44s
prometheus-po-prometheus-operator-prometheus       1/1     3m34s
```

In [None]:
kubectl get pods --namespace monitoring

example)
```
NAME                                                 READY   STATUS    RESTARTS   AGE
alertmanager-po-prometheus-operator-alertmanager-0   2/2     Running   0          3m56s
po-grafana-fbc85bc4b-5k2s8                           2/2     Running   0          4m32s
po-kube-state-metrics-64fdf7d84d-v9d8h               1/1     Running   0          4m32s
po-prometheus-node-exporter-4fptr                    1/1     Running   0          4m32s
po-prometheus-node-exporter-92lzp                    1/1     Running   0          4m32s
po-prometheus-node-exporter-h8hff                    1/1     Running   0          4m32s
po-prometheus-node-exporter-76pc4                    1/1     Running   0          4m32s
po-prometheus-operator-operator-7cf7c5cc97-78h9g     1/1     Running   0          4m32s
prometheus-po-prometheus-operator-prometheus-0       3/3     Running   1          3m46s
```

In [None]:
kubectl get services --namespace monitoring

example)
```
NAME                                  TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
alertmanager-operated                 ClusterIP   None           <none>        9093/TCP,6783/TCP   6m4s
po-grafana                            ClusterIP   10.0.67.251    <none>        80/TCP              6m40s
po-kube-state-metrics                 ClusterIP   10.0.104.75    <none>        8080/TCP            6m40s
po-prometheus-node-exporter           ClusterIP   10.0.103.93    <none>        9100/TCP            6m40s
po-prometheus-operator-alertmanager   ClusterIP   10.0.57.200    <none>        9093/TCP            6m40s
po-prometheus-operator-operator       ClusterIP   10.0.37.49     <none>        8080/TCP            6m40s
po-prometheus-operator-prometheus     ClusterIP   10.0.229.252   <none>        9090/TCP            6m40s
prometheus-operated                   ClusterIP   None           <none>        9090/TCP            5m54s
```

In [None]:
kubectl get persistentvolumeclaims --namespace monitoring

example)
```
NAME                                                                                       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
prometheus-alertmanager-storage-claim-alertmanager-po-prometheus-operator-alertmanager-0   Bound    pvc-82a98fe2-9f94-11e9-8e06-7200ae7cb77a   30Gi       RWO            managed-premium   119s
prometheus-prometheus-storage-claim-prometheus-po-prometheus-operator-prometheus-0         Bound    pvc-88b5edbb-9f94-11e9-8e06-7200ae7cb77a   30Gi       RWO            managed-premium   109s
```

### edit some prometheus rules

In [None]:
echo 'kubectl edit --namespace=monitoring prometheusrules po-prometheus-operator-general.rules'

```diff
       for: 10m
       labels:
         severity: warning
-    - alert: Watchdog
-      annotations:
-        message: |
-          This is an alert meant to ensure that the entire alerting pipeline is functional.
-          This alert is always firing, therefore it should always be firing in Alertmanager
-          and always fire against a receiver. There are integrations with various notification
-          mechanisms that send a notification when this alert is not firing. For example the
-          "DeadMansSnitch" integration in PagerDuty.
-      expr: vector(1)
-      labels:
-        severity: none
```

In [None]:
echo 'kubectl edit --namespace=monitoring prometheusrules po-prometheus-operator-kubernetes-absent'

```diff
       for: 15m
       labels:
         severity: critical
-    - alert: KubeControllerManagerDown
-      annotations:
-        message: KubeControllerManager has disappeared from Prometheus target discovery.
-        runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecontrollermanagerdown
-      expr: absent(up{job="kube-controller-manager"} == 1)
-      for: 15m
-      labels:
-        severity: critical
-    - alert: KubeSchedulerDown
-      annotations:
-        message: KubeScheduler has disappeared from Prometheus target discovery.
-        runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeschedulerdown
-      expr: absent(up{job="kube-scheduler"} == 1)
-      for: 15m
-      labels:
-        severity: critical
     - alert: KubeStateMetricsDown
       annotations:
         message: KubeStateMetrics has disappeared from Prometheus target discovery.
```
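
Since `kubectl edit` changes the rules in place, it may be useful to save a copy of each `PrometheusRule` before editing it. A minimal sketch follows. `backup_rule` and the output file naming are assumptions for illustration.

```shell
# backup_rule NAME [DIR]
# Saves the PrometheusRule NAME from the monitoring namespace to DIR/NAME.yaml.
# (hypothetical helper)
backup_rule() {
  name="$1"
  dir="${2:-.}"
  kubectl get prometheusrules "$name" --namespace monitoring -o yaml > "$dir/$name.yaml"
}
```

For example, `backup_rule po-prometheus-operator-general.rules` and `backup_rule po-prometheus-operator-kubernetes-absent` cover the two rules edited above.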

### confirm prometheus

In [None]:
echo 'kubectl --namespace monitoring port-forward $(kubectl get pod --namespace monitoring -l app=prometheus -o template --template "{{(index .items 0).metadata.name}}") 9090:9090'

In [None]:
openbrowser http://localhost:9090

1. confirm Prometheus
    * no `Target` is down.
    * no `Alert` is firing, except for alerts about CPU or memory resources.
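
The target check can also be done from the command line through the Prometheus HTTP API (`/api/v1/targets`). The sketch below assumes that the port-forward to `localhost:9090` is running and that `jq` is installed. `list_unhealthy_targets` is a hypothetical helper.

```shell
# list_unhealthy_targets: reads the JSON of /api/v1/targets from stdin and
# prints "<job><TAB><health>" for every active target whose health is not "up".
# (hypothetical helper)
list_unhealthy_targets() {
  jq -r '.data.activeTargets[] | select(.health != "up") | "\(.labels.job)\t\(.health)"'
}

# usage (with the port-forward running):
#   curl -s http://localhost:9090/api/v1/targets | list_unhealthy_targets
```

An empty result means that every active target is up.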

### patch grafana service
* add the ambassador annotation to the grafana service so that grafana can be accessed from the Internet.

In [None]:
kubectl patch service --namespace monitoring po-grafana -p '{"metadata": {"annotations": {"getambassador.io/config": "---\napiVersion: ambassador/v0\nkind:  Mapping\nname:  grafana-mapping\nprefix: /\nhost: \"^grafana\\\\..+$\"\nhost_regex: true\nservice: http://po-grafana.monitoring:80\n"}}}'
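
To confirm that the patch was applied, the annotation can be read back from the service. This is a sketch that assumes `jq` is installed. `show_ambassador_config` is a hypothetical helper.

```shell
# show_ambassador_config: prints the getambassador.io/config annotation
# of the po-grafana service in the monitoring namespace.
# (hypothetical helper)
show_ambassador_config() {
  kubectl get service po-grafana --namespace monitoring -o json \
    | jq -r '.metadata.annotations["getambassador.io/config"]'
}
```

The output should be the Mapping definition passed to `kubectl patch`, with the `\n` sequences shown as real line breaks.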

### register DNS A Record for grafana

In [None]:
HTTPS_IPADDR=$(kubectl get services -l app=ambassador -o json | jq '.items[0].status.loadBalancer.ingress[0].ip' -r)
az network dns record-set a add-record --resource-group ${DNS_ZONE_RG} --zone-name "${DOMAIN}" --record-set-name "grafana" --ipv4-address "${HTTPS_IPADDR}"
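
If the load balancer has no external IP yet, `jq -r` prints `null`, and that value would then be passed to `az`. A minimal sketch of a guard follows. `is_ipv4` is a hypothetical helper that checks only the shape of the value, not the range of each octet.

```shell
# is_ipv4 VALUE: succeeds if VALUE looks like a dotted-quad IPv4 address.
# (hypothetical helper; a shape check only)
is_ipv4() {
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# usage:
#   if is_ipv4 "${HTTPS_IPADDR}"; then
#     az network dns record-set a add-record ...   # same command as above
#   else
#     echo "ambassador has no external IP yet: ${HTTPS_IPADDR}" >&2
#   fi
```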

### setup grafana

In [None]:
openbrowser https://grafana.${DOMAIN}/login

1. login grafana
    * At first, an admin user (`admin`/`prom-operator`) is available.
2. change the admin's password

## start Elasticsearch, fluentd and Kibana

### start Elasticsearch

In [None]:
kubectl apply -f logging/elasticsearch-azure-service.yaml

In [None]:
kubectl apply -f logging/elasticsearch-azure-deployment.yaml

In [None]:
kubectl get statefulsets --namespace monitoring -l k8s-app=elasticsearch-logging

example)
```
NAME                    READY   AGE
elasticsearch-logging   2/2     5m18s
```

In [None]:
kubectl get pods --namespace monitoring -l k8s-app=elasticsearch-logging

example)
```
NAME                      READY     STATUS    RESTARTS   AGE
elasticsearch-logging-0   1/1       Running   0          4m
elasticsearch-logging-1   1/1       Running   0          2m
```

In [None]:
kubectl get services --namespace monitoring -l k8s-app=elasticsearch-logging

example)
```
NAME                    TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
elasticsearch-logging   ClusterIP   10.0.80.88   <none>        9200/TCP   4m
```

In [None]:
kubectl get persistentvolumeclaims -n monitoring -l k8s-app=elasticsearch-logging

example)
```
NAME                                            STATUS    VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
elasticsearch-logging-elasticsearch-logging-0   Bound     pvc-238139db-b014-11e8-b618-066567bdfa8c   64Gi       RWO            managed-premium   4m
elasticsearch-logging-elasticsearch-logging-1   Bound     pvc-70ca5ec3-b014-11e8-b618-066567bdfa8c   64Gi       RWO            managed-premium   2m
```

In [None]:
kubectl exec -it elasticsearch-logging-0 --namespace monitoring -- curl -H "Content-Type: application/json" -X PUT http://elasticsearch-logging:9200/_cluster/settings -d '{"transient": {"cluster.routing.allocation.enable":"all"}}'
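
The command above sets `cluster.routing.allocation.enable` to `all`, which allows shard allocation for all kinds of shards. To see the result, the cluster health API can be queried in the same way. `es_cluster_health` below is a hypothetical helper, shown as a sketch.

```shell
# es_cluster_health: prints the Elasticsearch cluster health as JSON,
# using curl inside the elasticsearch-logging-0 pod.
# (hypothetical helper)
es_cluster_health() {
  kubectl exec elasticsearch-logging-0 --namespace monitoring -- \
    curl -s "http://elasticsearch-logging:9200/_cluster/health?pretty"
}
```

The `status` field of the response is expected to be `green` or `yellow` once allocation has settled.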

### start fluentd

In [None]:
kubectl apply -f logging/fluentd-es-configmap.yaml

In [None]:
kubectl apply -f logging/fluentd-es-ds.yaml

In [None]:
kubectl get daemonsets --namespace monitoring -l k8s-app=fluentd-es

example)
```
NAME                DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd-es-v2.2.0   4         4         4       4            4           <none>          53s
```

In [None]:
kubectl get pods --namespace monitoring -l k8s-app=fluentd-es

example)
```
NAME                      READY   STATUS    RESTARTS   AGE
fluentd-es-v2.2.0-8sv45   1/1     Running   0          1m
fluentd-es-v2.2.0-96ghs   1/1     Running   0          1m
fluentd-es-v2.2.0-cjhtc   1/1     Running   0          1m
fluentd-es-v2.2.0-djzff   1/1     Running   0          1m
```

### start Kibana

In [None]:
kubectl apply -f logging/kibana-service.yaml

In [None]:
kubectl apply -f logging/kibana-deployment.yaml

In [None]:
kubectl get pods --namespace monitoring -l k8s-app=kibana-logging

example)
```
NAME                              READY     STATUS    RESTARTS   AGE
kibana-logging-7444956bf8-stnfm   1/1       Running   0          1m
```

### register DNS A Record for Kibana

In [None]:
HTTPS_IPADDR=$(kubectl get services -l app=ambassador -o json | jq '.items[0].status.loadBalancer.ingress[0].ip' -r)
az network dns record-set a add-record --resource-group ${DNS_ZONE_RG} --zone-name "${DOMAIN}" --record-set-name "kibana" --ipv4-address "${HTTPS_IPADDR}"

### start curator

In [None]:
kubectl apply -f logging/curator-configmap.yaml

In [None]:
kubectl apply -f logging/curator-cronjob.yaml

In [None]:
kubectl get cronjobs --namespace monitoring -l k8s-app=elasticsearch-curator

example)
```
NAME                    SCHEDULE     SUSPEND   ACTIVE    LAST SCHEDULE   AGE
elasticsearch-curator   0 18 * * *   False     0         <none>          7s
```
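
With the schedule `0 18 * * *`, the first run may be many hours away. To test the curator configuration earlier, a one-off Job can be created from the CronJob. This is a sketch. `run_curator_once` and the job name `curator-manual` are assumptions.

```shell
# run_curator_once [JOB_NAME]
# Creates a one-off Job from the elasticsearch-curator CronJob.
# (hypothetical helper; the default job name is an assumption)
run_curator_once() {
  job_name="${1:-curator-manual}"
  kubectl create job "$job_name" --from=cronjob/elasticsearch-curator --namespace monitoring
}
```

The result can then be inspected with `kubectl logs --namespace monitoring job/curator-manual`.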

### confirm basic auth username & password for Kibana

In [None]:
cat ${PJ_ROOT}/secrets/auth-tokens.json | jq '.[]|select(.host == "kibana\\..+$")|.settings.basic_auths[0].username' -r

In [None]:
cat ${PJ_ROOT}/secrets/auth-tokens.json | jq '.[]|select(.host == "kibana\\..+$")|.settings.basic_auths[0].password' -r
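
Both values can also be read with a single `jq` call and combined in the `user:password` form that `curl -u` accepts. The sketch below reuses the same `select` filter as the cells above. `kibana_basic_auth` is a hypothetical helper.

```shell
# kibana_basic_auth FILE: prints "username:password" for the kibana host entry.
# (hypothetical helper)
kibana_basic_auth() {
  jq -r '.[] | select(.host == "kibana\\..+$") | .settings.basic_auths[0] | "\(.username):\(.password)"' "$1"
}

# usage: print only the HTTP status code of an authenticated request
#   curl -s -o /dev/null -w '%{http_code}\n' \
#     -u "$(kibana_basic_auth "${PJ_ROOT}/secrets/auth-tokens.json")" "https://kibana.${DOMAIN}/"
```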

### setup Kibana

In [None]:
openbrowser https://kibana.${DOMAIN}/

1. log in to Kibana with basic authentication, using the username and password above
2. show `Management -> Index Patterns`
3. set `logstash-*` as Index Pattern, and push `Next step`
4. set `@timestamp` as Time Filter field name, and push `Create index pattern`
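
Kibana can create the `logstash-*` index pattern only if a matching index exists, so it may help to check that fluentd has already written some logs. The sketch below lists the matching indices through the Elasticsearch `_cat/indices` API. `list_logstash_indices` is a hypothetical helper.

```shell
# list_logstash_indices: lists the logstash-* indices with a header row,
# using curl inside the elasticsearch-logging-0 pod.
# (hypothetical helper)
list_logstash_indices() {
  kubectl exec elasticsearch-logging-0 --namespace monitoring -- \
    curl -s "http://elasticsearch-logging:9200/_cat/indices/logstash-*?v"
}
```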

### setup grafana

In [None]:
openbrowser https://grafana.${DOMAIN}/login

### add `elasticsearch` dashboard to grafana

1. add a new Data Source (ElasticSearch)
    * Name: `elasticsearch`
    * URL: `http://elasticsearch-logging:9200`
    * Access: `Server(Default)`
    * Index name: `logstash-*`
    * Time field name: `@timestamp`
    * Version: `6.0+`
2. import `monitoring/dashboard_elasticsearch.json`