---
title: Monitor the cluster with Prometheus and Grafana
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Monitor CPU and RAM usage with Prometheus and Grafana

Prometheus is a monitoring and alerting tool. It is used for collecting and querying real-time metrics in cloud-native environments like Kubernetes. Prometheus collects essential metrics about CPU usage, memory usage, pod counts, and request latency. This helps you monitor the health and performance of your Kubernetes clusters.

Grafana is a visualization and analytics tool that integrates with data sources from Prometheus to create interactive dashboards to monitor and analyze Kubernetes metrics over time.
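Prometheus exposes the metrics it collects through an HTTP query API, which is also what Grafana calls behind the scenes. As a minimal sketch (the in-cluster service address and the node-exporter metric names here are typical assumptions, not values taken from this learning path), you can build a PromQL instant-query URL like this:

```python
from urllib.parse import urlencode

# Assumed in-cluster address of the Prometheus server service;
# substitute the address of your own deployment.
PROMETHEUS_URL = "http://prometheus-server.prometheus.svc.cluster.local"

def instant_query_url(promql: str) -> str:
    """Build a URL for Prometheus's /api/v1/query endpoint."""
    return f"{PROMETHEUS_URL}/api/v1/query?{urlencode({'query': promql})}"

# Fraction of memory in use on each node, from node-exporter metrics.
url = instant_query_url(
    "1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes"
)
print(url)
```

Fetching this URL with any HTTP client returns a JSON document whose `data.result` field holds one sample per node.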

## Install Prometheus on your EKS cluster

You can use Helm to install Prometheus on the Kubernetes cluster.

Follow the [Helm documentation](https://helm.sh/docs/intro/install/) to install it on your computer.

Confirm Helm is installed by running the version command:

```console
helm version
```

The output is similar to:

```output
version.BuildInfo{Version:"v3.16.3", GitCommit:"cfd07493f46efc9debd9cc1b02a0961186df7fdf", GitTreeState:"clean", GoVersion:"go1.22.7"}
```

Create a namespace in your EKS cluster to host `prometheus` pods:

```console
kubectl create namespace prometheus
```

Add the following Helm repo for Prometheus:

```console
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
```

Install Prometheus on the cluster with the following command:

```console
helm install prometheus prometheus-community/prometheus \
--namespace prometheus \
--set alertmanager.persistentVolume.storageClass="gp2" \
--set server.persistentVolume.storageClass="gp2"
```

Check all pods are up and running:

```console
kubectl get pods -n prometheus
```
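The output is similar to the following if you installed the chart's default components; the pod name suffixes vary:

```output
NAME                                                 READY   STATUS    RESTARTS   AGE
prometheus-alertmanager-0                            1/1     Running   0          2m
prometheus-kube-state-metrics-xxxxxxxxxx-xxxxx       1/1     Running   0          2m
prometheus-prometheus-node-exporter-xxxxx            1/1     Running   0          2m
prometheus-prometheus-pushgateway-xxxxxxxxxx-xxxxx   1/1     Running   0          2m
prometheus-server-xxxxxxxxxx-xxxxx                   1/1     Running   0          2m
```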

## Install Grafana on your EKS cluster


Add the following Helm repo for Grafana:

```console
helm repo add grafana https://grafana.github.io/helm-charts
```

Use a text editor to create a `grafana.yaml` file with the following contents:

```yml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        url: http://prometheus-server.prometheus.svc.cluster.local
        access: proxy
        isDefault: true
```

Create another namespace for Grafana pods:

```console
kubectl create namespace grafana
```

Install Grafana on the cluster with the following command:

```console
helm install grafana grafana/grafana \
--namespace grafana \
--set persistence.storageClassName="gp2" \
--set persistence.enabled=true \
--set adminPassword='<GRAFANA_ADMIN_PASSWORD>' \
--values grafana.yaml \
--set service.type=LoadBalancer
```

Check all pods are up and running:

```console
kubectl get pods -n grafana
```
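To log in, you need the external address of the Grafana service and the admin password. Assuming the release name `grafana` used above, commands like the following retrieve them (the Grafana Helm chart stores the admin password in a Kubernetes secret named after the release):

```console
kubectl get svc -n grafana grafana -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
kubectl get secret -n grafana grafana -o jsonpath='{.data.admin-password}' | base64 --decode
```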

Log in to the Grafana dashboard using the LoadBalancer IP and click `Dashboards` in the left navigation pane. Locate the `Kubernetes / Compute Resources / Node` dashboard and click it.

You see a dashboard like the one below for your Kubernetes cluster:

![grafana #center](_images/grafana.png)
---
title: Monitoring sentiment with Elasticsearch and Kibana
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Deploy Elasticsearch and Kibana on an Arm-based EC2 instance

Elasticsearch is a NoSQL database, search, and analytics engine. It's designed to store, search, and analyze large amounts of data. It has real-time indexing capability, which is crucial for handling high-velocity data streams like Tweets.

Kibana is a dashboard and visualization tool that integrates seamlessly with Elasticsearch. It provides an interface to interact with Twitter data, apply filters, and receive alerts. There are multiple ways to install Elasticsearch and Kibana; one method is shown below.

Before you begin, ensure that Docker and Docker Compose have been installed on your computer.

Use a text editor to create a `docker-compose.yml` file with the contents below:

```yml
version: '2.18.1'
# ... services section (elasticsearch, kibana) elided ...

networks:
  elk:
    driver: bridge
```

Use the following command to deploy Elasticsearch and the Kibana dashboard:

```console
docker-compose up
```

After the dashboard is up, use the public IP of your server on port 5601 to access the Kibana dashboard.

![kibana #center](_images/kibana.png)

Switch to Stack Management using the menu on the left side, as shown in the image below.

![kibana-data #center](_images/Kibana-data.png)

One of the sample dashboard structures looks like the example below:

![kibana-dashboard2 #center](_images/Kibana-dashboard2.png)

Similarly, you can design and create dashboards to analyze a particular set of data. The screenshot below shows the dashboard designed for this learning path:

![kibana-dashboard3 #center](_images/Kibana-dashboard3.png)


## Before you begin

You will need an [AWS account](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-creating.html). Create an account if needed.

Four tools are required on your local machine. Follow the links to install each tool.

* [Kubectl](/install-guides/kubectl/)
* [AWS CLI](/install-guides/aws-cli/)
* [Docker](/install-guides/docker/)
* [Terraform](/install-guides/terraform/)

To use the AWS CLI, you will need to generate AWS access keys and configure the CLI. Follow the [AWS Credentials](/install-guides/aws_access_keys/) install guide for instructions.

## Set up sentiment analysis

Take a look at the [GitHub repository](https://github.com/koleini/spark-sentiment-analysis) then clone it on your local computer:

```console
git clone https://github.com/koleini/spark-sentiment-analysis.git
cd spark-sentiment-analysis
```

Edit the file `eks/variables.tf` if you want to change the default AWS region.

The default value is at the top of the file and is set to `us-east-1`.

```output
variable "AWS_region" {
  default     = "us-east-1"
  description = "AWS region"
}
```

Execute the following commands to create the Amazon EKS cluster:

```console
terraform init
terraform apply --auto-approve
```

Update the `kubeconfig` file to access the deployed EKS cluster with the following command:

If you want to use an AWS CLI profile not named `default`, change the profile name before running the command.

```console
aws eks --region $(terraform output -raw region) update-kubeconfig --name $(terraform output -raw cluster_name) --profile default
```

Create a service account for Apache Spark:
```console
kubectl create serviceaccount spark
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
```

## Build the sentiment analysis JAR file

Navigate to the `sentiment_analysis` folder and create a JAR file for the sentiment analyzer:

```console
cd sentiment_analysis
sbt assembly
```

A JAR file is created at the following location:

```console
sentiment_analysis/target/scala-2.13/bigdata-assembly-0.1.jar
```

## Create a Spark container image

Create a repository in Amazon ECR to store the container images. You can also use Docker Hub.

The Spark repository contains a script to build the container image you need to run inside the Kubernetes cluster.

Execute this script on your Arm-based computer to build the arm64 image.

In the current working directory, clone the Apache Spark GitHub repository before building the image:

```console
git clone https://github.com/apache/spark.git
cd spark
git checkout v3.4.3
```

Build the container image using the following commands. Substitute the name of your container repository before running them:

```console
cp ../sentiment_analysis/target/scala-2.13/bigdata-assembly-0.1.jar jars/
bin/docker-image-tool.sh -r <your-docker-repository> -t sentiment-analysis build
bin/docker-image-tool.sh -r <your-docker-repository> -t sentiment-analysis push
```

## Run Spark computation on the cluster

Execute the `spark-submit` command within the Spark folder to deploy the application. The following commands will run the application with two executors, each with 12 cores, and allocate 24GB of memory for both the executors and driver pods.
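Assuming Spark's standard property names, the resource settings described above typically map to `spark-submit` flags like these (an illustrative fragment, not the full command, which also sets the image, main class, and Kubernetes master URL):

```console
--conf spark.executor.instances=2 \
--conf spark.executor.cores=12 \
--conf spark.executor.memory=24g \
--conf spark.driver.memory=24g \
```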

Set the following variables before executing the `spark-submit` command:

```console
export MASTER_ADDRESS=<K8S_MASTER_ADDRESS>
export ES_ADDRESS=<IP_ADDRESS_OF_ELASTICSEARCH>
export CHECKPOINT_BUCKET=<BUCKET_NAME>
export EKS_ADDRESS=<EKS_REGISTRY_ADDRESS>
```

Execute the `spark-submit` command:

```console
bin/spark-submit \
  ...
```

After submission, check the pods again; the output of `kubectl get pods` includes a line similar to:

```output
spark-twitter   1/1   Running   0   12m
```

## Twitter sentiment analysis

Create a Twitter (X) [developer account](https://developer.x.com/en/docs/x-api/getting-started/getting-access-to-the-x-api) and generate a `bearer token`.

Use the following commands to set the token and fetch the Tweets:

```console
export BEARER_TOKEN=<BEARER_TOKEN_FROM_X>
python3 scripts/xapi_tweets.py
```
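The script wraps the X API v2 recent-search endpoint. Here is a hedged sketch of the kind of request it makes; the endpoint path follows the public X API documentation, and `build_request` is an illustrative helper, not a function from the repository:

```python
import os
import urllib.parse
import urllib.request

# Recent-search endpoint from the public X API v2 documentation.
SEARCH_URL = "https://api.x.com/2/tweets/search/recent"

def build_request(bearer_token: str, query: str) -> urllib.request.Request:
    """Build an authenticated recent-search request."""
    params = urllib.parse.urlencode({"query": query, "tweet.fields": "lang"})
    return urllib.request.Request(
        f"{SEARCH_URL}?{params}",
        headers={"Authorization": f"Bearer {bearer_token}"},
    )

req = build_request(
    os.environ.get("BEARER_TOKEN", "<BEARER_TOKEN_FROM_X>"),
    "(#onArm OR @Arm OR #Arm OR #GenAI) -is:retweet lang:en",
)
print(req.full_url)
# Fetch with urllib.request.urlopen(req) -- requires a valid token.
```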

You can modify the script `xapi_tweets.py` with your own keywords.

Here is the section of the script that includes the keywords:

```output
query_params = {'query': "(#onArm OR @Arm OR #Arm OR #GenAI) -is:retweet lang:en",
                'tweet.fields': 'lang'}
```
---
title: Understand sentiment analysis
weight: 2

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## What is sentiment analysis?

Sentiment analysis is a natural language processing technique used to identify and categorize opinions expressed in a piece of text, such as a tweet or a product review. It can help gauge public opinion, identify trends and patterns, and improve decision-making. Social media platforms, such as Twitter (X), provide a wealth of information about public opinion, trends, and events. Sentiment analysis is important because it provides insights into how people feel about a particular topic or issue, and can help to identify emerging trends and patterns.
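To make the idea concrete, here is a toy lexicon-based scorer. It is only an illustration of the technique; the learning path itself uses a trained classification model, not this word list:

```python
# Tiny illustrative sentiment lexicons (assumptions, not from a real model).
POSITIVE = {"great", "love", "fast", "excellent", "happy"}
NEGATIVE = {"slow", "bad", "hate", "broken", "sad"}

def sentiment(text: str) -> str:
    """Classify text by counting positive and negative words."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("I love how fast this cluster is"))  # positive
```

Real models replace the word lists with learned weights, but the input and output shapes are the same: text in, a sentiment label out.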

## Can I perform real-time sentiment analysis using an Arm-based Amazon EKS cluster?

Yes, you can use EKS for sentiment analysis.

Real-time sentiment analysis is a compute-intensive task and can quickly drive up resources and increase costs if not managed effectively. Tracking real-time changes enables you to understand sentiment patterns and make informed decisions promptly, allowing for timely and appropriate actions.

The architecture used for the solution is shown below:

![sentiment analysis #center](_images/Sentiment-Analysis.png)

The technology stack for the solution includes the following steps:

- Use the Twitter (X) developer API to fetch Tweets based on certain keywords
- Process the captured data using Amazon Kinesis
- Run a sentiment analysis model to classify the text and tone of the text
- Process the sentiment of Tweets using Apache Spark streaming API
- Use Elasticsearch and Kibana to store the processed Tweets and showcase the activity on a dashboard
- Monitor the CPU and RAM resources of the Amazon EKS cluster with Prometheus and Grafana
---
title: Learn how to perform Twitter (X) sentiment analysis on Arm-based EKS clusters

draft: true
cascade:
draft: true

minutes_to_complete: 60

who_is_this_for: This is an advanced topic for software developers who want to build an end-to-end ML sentiment analysis solution to analyze live Tweets on an Arm-based Amazon EKS cluster.

learning_objectives:
- Deploy a text classification model on Amazon EKS with Apache Spark.
- Use Elasticsearch and a Kibana dashboard to analyze the Tweets.
- Deploy Prometheus and Grafana dashboards to keep track of CPU and RAM usage of Kubernetes nodes.

prerequisites:
- An AWS account.
- A computer with Docker, Terraform, the Amazon eksctl CLI, and kubectl installed.

author_primary: Pranay Bakre, Masoud Koleini, Nobel Chowdary Mandepudi, Na Li
