In this workshop you'll cover using a Process and various Platform components to create a SQL Server Big Data Clusters (BDC) solution you can deploy on premises, in the cloud, or in a hybrid architecture. In each module you'll get more references, which you should follow up on to learn more. Also watch for links within the text - click on each one to explore that topic.
(Make sure you check out the prerequisites page before you start. You'll need all of the items loaded there before you can proceed with the workshop.)
You'll cover the following topics in this Module:
- 5.0 Managing and Monitoring Your Solution
- 5.1 Using kubectl commands
- 5.2 Using azdata commands
- 5.3 Using Grafana and Kibana
There are two primary areas for monitoring your BDC deployment. The first deals with SQL Server 2019, and the second deals with the set of elements in the Cluster.
For SQL Server, management is much as you would normally perform for any SQL Server system. You have the same type of services, surface points, security areas and other control vectors as in a stand-alone installation of SQL Server. The tools you have available for managing the Master Instance in the BDC are the same as managing a stand-alone installation, including SQL Server Management Studio, command-line interfaces, Azure Data Studio, and third party tools.
For the cluster components, you have three primary interfaces to use, which you will review next.
Since the BDC lives within a Kubernetes cluster, you'll work with the kubectl command to deal with those specific components. The following list is a short version of some of the commands you can use to manage and monitor the BDC implementation of a Kubernetes cluster:
Command | Description |
---|---|
`az aks get-credentials --name <cluster_name> --resource-group <resource_group_name>` | Download the Kubernetes cluster configuration file and set the cluster context |
`kubectl get pods --all-namespaces` | Get the status of pods in the cluster for either all namespaces or the big data cluster namespace |
`kubectl describe pod <pod_name> -n <namespace_name>` | Get a detailed description of a specific pod. It includes details such as the current Kubernetes node that the pod is placed on, the containers running within the pod, and the image used to bootstrap the containers. It also shows other details, such as labels, status, and persistent volume claims that are associated with the pod |
`kubectl get svc -n <namespace_name>` | Get details for the big data cluster services. These details include their type and the IPs associated with respective services and ports. Note that BDC services are created in a new namespace created at cluster bootstrap time based on the cluster name specified in the azdata create cluster command |
`kubectl describe svc <service_name> -n <namespace_name>` | Get a detailed description of a service. It will include details like labels, selector, IP, external-IP (if the service is of LoadBalancer type), port, etc. |
`kubectl exec -it <pod_name> -c <container_name> -n <namespace_name> -- /bin/bash` | If existing tools or the infrastructure do not enable you to perform a certain task without actually being in the context of the container, you can log in to the container using the kubectl exec command. For example, you might need to check if a specific file exists, or you might need to restart services in the container |
`kubectl cp <pod_name>:<source_file_path> -c <container_name> -n <namespace_name> <target_local_file_path>` | Copy files from the container to your local machine. Reverse the source and destination to copy into the container |
`kubectl delete pods <pod_name> -n <namespace_name> --grace-period=0 --force` | For testing availability, resiliency, or data persistence, you can delete a pod to simulate a pod failure with the kubectl delete pods command. Not recommended for production, only to simulate failure |
`kubectl get pods <pod_name> -o yaml -n <namespace_name> \| grep hostIP` | Get the IP of the node a pod is currently running on |
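As a concrete illustration of the last command in the table, the `grep hostIP` filter simply pulls the node IP line out of the pod's YAML. In the sketch below, a sample YAML fragment (with made-up IPs) stands in for live `kubectl` output, so only the filtering step is shown:

```shell
# Sample of what `kubectl get pods <pod_name> -o yaml -n <namespace_name>`
# returns; the namespace and IPs here are invented for illustration.
pod_yaml='status:
  hostIP: 10.240.0.4
  podIP: 10.244.1.7
  phase: Running'

# The same filter as in the table: keep only the node (host) IP line.
echo "$pod_yaml" | grep hostIP
```

This prints `  hostIP: 10.240.0.4`, the address of the Kubernetes node hosting the pod.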
Use this resource to learn more about these commands for troubleshooting the BDC.
A full list of the kubectl commands is here.
Activity: Discover the IP Address of the BDC Master Installation, and Connect to it with Azure Data Studio
In this activity, you will get the IP address of the Master Instance in your cluster, and connect to it with Azure Data Studio.
Steps
Open this resource, and follow the steps there in the AKS deployments section.
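If you prefer to script the lookup, the sketch below shows its general shape. The namespace `mssql-cluster` and the service name `master-svc-external` are assumptions that vary by deployment; the linked steps show how to confirm the names for yours:

```shell
# Sketch only: substitute the namespace and service name from your own
# deployment. Skipped gracefully when kubectl is not installed.
NAMESPACE=mssql-cluster
SERVICE=master-svc-external

if command -v kubectl >/dev/null 2>&1; then
  # The EXTERNAL-IP column of this output, together with the listed port,
  # is the address you enter in Azure Data Studio's connection dialog.
  kubectl get svc "$SERVICE" -n "$NAMESPACE"
else
  echo "kubectl not found - install it and connect to your cluster first"
fi
```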
The azdata utility enables cluster administrators to bootstrap and manage big data clusters via the REST APIs exposed by the Controller service. The controller is deployed and hosted in the same Kubernetes namespace where the customer wants to build out a big data cluster. The Controller is responsible for core logic for deploying and managing a big data cluster.
The Controller service is installed by a Kubernetes administrator during cluster bootstrap, using the azdata command-line utility.
You can find a list of the switches and commands by typing:
azdata --help
You used the azdata commands to deploy your cluster, and you can use them to get information about your BDC deployment as well. You should review the documentation for this command here.
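As a sketch of how that looks in practice: the namespace name below is an assumption, and azdata must be installed and logged in to the controller endpoint for the commands to return anything, so the sketch skips them when azdata is absent.

```shell
# Hedged sketch: these commands need a live BDC and an azdata login.
if command -v azdata >/dev/null 2>&1; then
  azdata login --namespace mssql-cluster   # prompts for controller credentials
  azdata bdc status show --all             # health of every service in the cluster
else
  echo "azdata not installed - see the documentation link above to install it"
fi
```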
You learned about the Grafana and Kibana systems in Module 01. Microsoft has created various views within each that you can use to interact with both the SQL Server-specific and Kubernetes portions of the BDC. The Azure Data Studio big data clusters management panel shows the TCP/IP addresses for each of these systems.
Activity: Start dashboard when cluster is running in AKS
To launch the Kubernetes dashboard run the following commands:
az aks browse --resource-group <resource_group_name> --name <cluster_name>
Note: If you get the following error:

`Unable to listen on port 8001: All listeners failed to create with the following errors: Unable to create listener: Error listen tcp4 127.0.0.1:8001: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted. Unable to create listener: Error listen tcp6: address [::1]:8001: missing port in address error: Unable to listen on any of the requested ports: [{8001 9090}]`

make sure you did not start the dashboard already from another window.
When you launch the dashboard in your browser, you might get permission warnings. RBAC is enabled by default in AKS clusters, and the service account used by the dashboard does not have enough permissions to access all resources (for example, pods is forbidden: User "system:serviceaccount:kube-system:kubernetes-dashboard" cannot list pods in the namespace "default"). Run the following command to give the necessary permissions to kubernetes-dashboard, and then restart the dashboard:
kubectl create clusterrolebinding kubernetes-dashboard -n kube-system --clusterrole=cluster-admin --serviceaccount=kube-system:kubernetes-dashboard
- Official Documentation for this section
- Kubectl commands for monitoring and troubleshooting SQL Server big data clusters
- Debug and Diagnose Spark Applications on SQL Server big data clusters in Spark History Server
Next, Continue to Security.