This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

fix(SOP) : Improve and fix sop in order to add steps to scale the pod
camilamacedo86 committed Jul 12, 2019
1 parent 36fa0a4 commit 201cdd0
Showing 1 changed file with 64 additions and 50 deletions.
114 changes: 64 additions & 50 deletions SOP/SOP-mss.adoc
@@ -16,7 +16,6 @@ endif::[]
:toc:
toc::[]


== Critical

=== MobileSecurityServiceConsoleDown or MobileSecurityServiceDown
@@ -31,23 +30,11 @@ Troubleshoot using the following steps via either the console or the cli:
+
NOTE: Check that the statuses are as expected, as described in the https://github.com/aerogear/mobile-security-service-operator#status-definition-per-types[Status Definitions per Types in the README].
+
. Check the status of the Mobile Security Service DB CR. If the Database Pod is not available, the Service pod will fail.
.. Go to `Resources -> Other Resources -> Choose a resource to list -> Mobile Security Service DB -> Actions -> Edit yaml`
+
NOTE: Check that the statuses are as expected, as described in the https://github.com/aerogear/mobile-security-service-operator#status-definition-per-types[Status Definitions per Types in the README].
+
. Check the environment variables of the Database
.. Go to: `Applications -> Pods -> mobile-security-service-db-<xyz123> -> Environment Variables`
.. By default, it should use the values defined in the ConfigMap created by the operator, `mobile-security-service-config`. For further information, see https://github.com/aerogear/mobile-security-service-operator#changing-the-environment-variables-values[Changing the Environment Variables values in the README].
. Check the logs of the OAuth Proxy Container
.. Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> oauth-proxy`
. Check the logs of the Application Container
.. Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> application`
. Check its database. See <<MobileSecurityServiceDatabaseDown>>
. Check the `pod/mobile-security-service-<xyz123>` events to see whether the MSS image could be pulled.
. You can capture further information by following the <<General procedure>>.
. Check that the operator pod is present, because it is responsible for managing the service pod, as described in https://github.com/aerogear/mobile-security-service-operator/blob/0.2.0/SOP/SOP-operator.adoc[MobileSecurityServiceOperatorDown].

NOTE: If resolving MobileSecurityServiceOperatorDown does not resolve the issue, continue with the steps below.

==== CLI

. Check that the Mobile Security Service CR is deployed in the same namespace as the operator
@@ -56,23 +43,9 @@ NOTE: If resolving the MobileSecurityServiceOperatorDown doesn't resolve the iss
+
NOTE: Check that the statuses are as expected, as described in the https://github.com/aerogear/mobile-security-service-operator#status-definition-per-types[Status Definitions per Types in the README].
+
. Check the status of the Mobile Security Service DB Custom Resource. If the Database Pod is not available, the Service pod will fail.
.. Run `oc describe customresourcedefinition.apiextensions.k8s.io/mobilesecurityservicedbs.mobile-security-service.aerogear.com`
+
NOTE: Check that the statuses are as expected, as described in the https://github.com/aerogear/mobile-security-service-operator#status-definition-per-types[Status Definitions per Types in the README].
+
. Check the environment variables of the Database
.. Run `oc describe pod/mobile-security-service-db-<xyz123>`
.. By default, it should use the values defined in the ConfigMap created by the operator, `mobile-security-service-config`. For further information, see https://github.com/aerogear/mobile-security-service-operator#changing-the-environment-variables-values[Changing the Environment Variables values in the README].

. Check the logs of the OAuth Proxy Container
.. Get the service pod name -> `oc get pods | grep mobile-security-service`
.. Run `oc logs <service-podname> -c oauth-proxy`
.. Save the logs by running `oc logs <service-podname> -c oauth-proxy > <filename>.log`
. Check the logs of the Application Container
.. Run `oc logs <service-podname> -c application`
.. Save the logs by running `oc logs <service-podname> -c application > <filename>.log`
. Check the `pod/mobile-security-service-<xyz123>` logs to see whether the MSS image could be pulled.
. Check its database. See <<MobileSecurityServiceDatabaseDown>>.
. Check the `pod/mobile-security-service-<xyz123>` events to see whether the MSS image could be pulled.
. You can capture further information by following the <<General procedure>>.
. Check that the operator pod is present, because it is responsible for managing the service pod, as described in https://github.com/aerogear/mobile-security-service-operator/blob/0.2.0/SOP/SOP-operator.adoc[MobileSecurityServiceOperatorDown].
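
The event check in the steps above can be sketched as a small shell helper. This is a minimal sketch, not part of the operator: the function name `pod_events` is hypothetical, and it assumes you are already logged in with `oc` and working in the namespace where the service pod runs. Image pull problems typically show up in the events with reasons such as `ErrImagePull` or `ImagePullBackOff`.

```shell
# Sketch only: list the events recorded for a single pod.
# Assumption: `oc` is logged in and the current project is the MSS namespace.
pod_events() {
  pod_name="$1"
  # Events reference the object they belong to, so filter on the pod name.
  oc get events --field-selector "involvedObject.name=${pod_name}"
}

# Usage (replace the placeholder with the real pod name):
#   pod_events mobile-security-service-<xyz123>
```

Filtering by `involvedObject.name` keeps the output limited to the one pod under investigation instead of every event in the namespace.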

=== MobileSecurityServiceDatabaseDown
@@ -114,45 +87,86 @@ NOTE: Check that the status are as expected as described in the https://github.c

== Warning

=== MobileSecurityServicePodCPUHigh or MobileSecurityServicePodMemoryHigh
=== MobileSecurityServicePodCPUHigh

. Capture a snapshot of the 'Mobile Security Service Application' Grafana dashboard and track it over time. The metrics can be useful for identifying performance issues over time.
. Increase the log level of the Service pod (`pod/mobile-security-service-<xyz123>`)
.. Go to `Applications -> Deployment`
.. Click on `mobile-security-service`
.. Go to the `Environment` tab
.. Remove the `Env Var` `LOG_LEVEL`
.. Add a new `Env Var` `LOG_LEVEL` with the value `debug`
.. After you save, the service is redeployed automatically, and you can then obtain further information.
To try to resolve this alert, you can scale the pod as described in <<Scaling the pods>>. Before scaling, follow the <<General procedure>> to capture the logs and send them to the maintainers.

=== MobileSecurityServiceApiHighRequestDuration, MobileSecurityServiceApiHighRequestFailure , MobileSecurityServiceApiHighConcurrentRequests
=== MobileSecurityServicePodMemoryHigh

Troubleshoot using the following steps via the console, the CLI, or both:
To try to resolve this alert, you can scale the pod as described in <<Scaling the pods>>. Before scaling, follow the <<General procedure>> to capture the logs and send them to the maintainers.

=== MobileSecurityServiceApiHighRequestDuration

To try to resolve this alert, you can scale the pod as described in <<Scaling the pods>>. Before scaling, follow the <<General procedure>> to capture the logs and send them to the maintainers.

=== MobileSecurityServiceApiHighRequestFailure

- Check <<General procedure>>

=== MobileSecurityServiceApiHighConcurrentRequests

- Check <<General procedure>>

== General procedure

. Capture a snapshot of the 'Mobile Security Service Application' Grafana dashboard and track it over time. The metrics can be useful for identifying performance issues over time.

==== Console:
=== Console:

. Capture application logs for analysis.
.. Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> application`
.. Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> oauth-proxy`
. Increase the log level of the Service pod (`pod/mobile-security-service-<xyz123>`)
.. Go to `Applications -> Deployment`
.. Click on `mobile-security-service`
.. Go to the `Environment` tab
.. Remove the `Env Var` `LOG_LEVEL`
.. Add new `Env Var` `LOG_LEVEL` with the value `debug`
.. Add a new `Env Var` `LOG_LEVEL` with the value `debug`
.. After you save, the service is redeployed automatically, and you can then obtain further information.
. Capture logs for analysis.
.. Capture application logs for analysis.
... Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> application`
.. Capture the logs of the OAuth Proxy Container
... Go to : `Applications -> Pods -> mobile-security-service-<xyz123> -> Logs -> Container -> oauth-proxy`
.. Capture the logs of the Database Container
... Navigate to `Applications -> Pods -> mobile-security-service-db-<xyz123> -> Logs`
. If necessary, recreate the service pod to restore service.
.. Navigate to `Applications -> Pods -> mobile-security-service-<xyz123> -> Actions -> Delete -> Delete`
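
The console steps for raising the log level could also be approximated from the command line. The following is a minimal sketch under stated assumptions: the helper name `set_log_level` is hypothetical, and it assumes the service runs as a Deployment named `mobile-security-service`, as the console steps suggest. Verify both in your cluster before relying on it.

```shell
# Sketch only: set LOG_LEVEL on the service deployment.
# Assumption: the workload is a Deployment named `mobile-security-service`.
set_log_level() {
  # Default to `debug`, the value used in the console steps.
  level="${1:-debug}"
  oc set env deployment/mobile-security-service "LOG_LEVEL=${level}"
}

# Usage:
#   set_log_level          # sets LOG_LEVEL=debug
#   set_log_level info     # sets another level once the analysis is done
```

As with the console procedure, changing the environment of the deployment is expected to trigger a new rollout of the service pod.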

==== CLI:
=== CLI:

. Increase the log level of the Service pod (`pod/mobile-security-service-<xyz123>`). See the Console steps above.
. Capture application logs for analysis.
.. Get the service pod name -> `oc get pods | grep mobile-security-service`
.. Run `oc logs <service-podname> -c application`
.. Save the logs by running `oc logs <service-podname> -c application > <filename>.log`
. Increase the log level of the Service pod (`pod/mobile-security-service-<xyz123>`)
. Increase the log level of the Service pod (`pod/mobile-security-service-<xyz123>`)
. Capture logs for analysis.
.. Capture application logs for analysis.
... Run `oc logs <service-podname> -c application`
... Save the logs by running `oc logs <service-podname> -c application > <filename>.log`
.. Capture the logs of the OAuth Proxy Container
... Get the service pod name -> `oc get pods | grep mobile-security-service`
... Run `oc logs <service-podname> -c oauth-proxy`
... Save the logs by running `oc logs <service-podname> -c oauth-proxy > <filename>.log`
.. Capture the logs for the database Container
... Get the database pod name -> `oc get pods | grep mobile-security-service-db`
... Run `oc logs <database-podname>`
... Save the logs by running `oc logs <database-podname> > <filename>.log`
. Check the `pod/mobile-security-service-<xyz123>` logs to see whether the MSS image could be pulled.
. If necessary, recreate the service pod to restore service.
.. Get the service pod name -> `oc get pods | grep mobile-security-service`
.. Run `oc delete pod <service-podname>`
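
The log capture steps above can be combined into one helper. This is a minimal sketch, not an official script: the function name `capture_mss_logs` and the output file naming are hypothetical, while the container names `application` and `oauth-proxy` are the ones used in this SOP. It assumes `oc` is logged in to the right project.

```shell
# Sketch only: save the logs of both service containers and of the database pod.
# Arguments: <service-podname> <database-podname> [output directory, default "."]
capture_mss_logs() {
  service_pod="$1"
  database_pod="$2"
  out_dir="${3:-.}"
  mkdir -p "$out_dir"

  # The service pod runs two containers; capture each one separately.
  for container in application oauth-proxy; do
    oc logs "$service_pod" -c "$container" > "${out_dir}/${service_pod}-${container}.log"
  done

  # The database pod is read without a container selector, as in the steps above.
  oc logs "$database_pod" > "${out_dir}/${database_pod}.log"
}

# Usage (pod names come from `oc get pods | grep mobile-security-service`):
#   capture_mss_logs <service-podname> <database-podname> ./mss-logs
```

Capturing the logs before deleting the service pod matters, because the logs of the old pod are no longer available once it has been recreated.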

== Scaling the pods

You can scale the MSS pod by changing the spec size in the Mobile Security Service CR. See link:./deploy/crds/mobile-security-service_v1alpha1_mobilesecurityservice_cr.yaml[MobileSecurityService CR].

NOTE: The architecture of the Mobile Security Service does not allow the database to be scaled.

=== Console:

. Go to: `Resources -> Other Resources -> Choose a resource to list -> Mobile Security Service DB-> Actions -> Edit yaml`

=== CLI:

. Run `oc edit customresourcedefinition.apiextensions.k8s.io/mobilesecurityservicedbs.mobile-security-service.aerogear.com` and edit it.
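
As an alternative to editing the resource interactively, the size change could be applied with a patch. This is a minimal sketch under stated assumptions: the helper name `scale_mss` is hypothetical, the size is assumed to live at `spec.size` in the Mobile Security Service CR, and the resource reference in the usage comment is a placeholder, so take the real type and name from `oc get` in your cluster.

```shell
# Sketch only: set the size in the spec of a Mobile Security Service CR.
# Arguments: <resource type/name of the CR> <desired size>
# Assumption: the CR stores the number of service pods in `spec.size`.
scale_mss() {
  resource="$1"
  size="$2"

  # Reject anything that is not a non-negative integer before calling `oc`.
  case "$size" in
    ''|*[!0-9]*)
      echo "size must be a non-negative integer" >&2
      return 1
      ;;
  esac

  oc patch "$resource" --type merge -p "{\"spec\":{\"size\":${size}}}"
}

# Usage (type and name are placeholders, not confirmed by this SOP):
#   scale_mss <cr-type>/<cr-name> 2
```

A merge patch changes only the given field and leaves the rest of the CR untouched, which makes the change easy to review and repeat.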

