<a href="https://colab.research.google.com/github/ainfanzon/Cockroach_IAM_Workshop/blob/main/GCP_Colab_notebooks/Exercise_02.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


<img src="https://drive.google.com/uc?id=1XYr9Tyrz31a5kZdo601xD1QWz_YM8-H3">

### CockroachDB is a distributed SQL database that is __*highly scalable*__, __*resilient*__, and __*easy to use*__.

# Identity and Access Management Workshop.
---
## Scalability and Resiliency.

In this section we are going to scale the cluster. You will:

1. Add three more nodes to the cluster.
1. Create and populate the Northwind database.
1. Simulate the failure of two nodes.
1. Restart the nodes.
<br>

<html>
<head>
<style>
table, th, td {
  border: 1px solid black;
  border-collapse: collapse;
}
</style>
</head>
<body>

<table style="width:100%">
  <tr>
       <td align="right">
          <img src="https://drive.google.com/uc?id=1VS1jCK6UAUeqdNrKot3BbOZrbMVo2m5l" width="850"
          height="350">
      </td>
  </tr>
</table>

</body>
</html>

---

## 1. Add three more nodes to the cluster.

CockroachDB is designed to be low touch and highly automated for operators, while remaining easy to reason about for developers.

**Benefits**

CockroachDB's horizontal scalability allows users to start small and scale out as needed. It also maintains ACID guarantees, so users don't have to risk their data to improve performance.

Run the next two cells to review the current status of the cluster (alternatively, use the DB Console). In the first cell, replace the host value with your public IP address.

In [1]:
import psycopg2
import pandas as pd

from IPython.display import IFrame, display, HTML, Markdown

pd.set_option('display.max_colwidth', None)

conn = psycopg2.connect(
        database = 'defaultdb'
      , user = 'root'
      , host = 'your IP address'                        # Use the GCP Compute Engine external IP address
      , port = '26257'
      , sslmode = 'disable'
)
cursor = conn.cursor()

In [3]:
cursor.execute("""
SELECT gn.node_id AS "Node ID"
     , gn.advertise_sql_address AS "Advertised Address"
     , gn.build_tag AS "Version"
     , current_timestamp() AT TIME ZONE 'UTC' - gn.started_at AS "Up Time"
     , "ranges" AS "Ranges"
     , leases AS "Leaders"
     , CASE WHEN is_live THEN 'LIVE' ELSE 'DEAD' END AS "status"
     , gl.membership
FROM crdb_internal.gossip_nodes AS gn join crdb_internal.gossip_liveness AS gl USING(node_id)
""")
result_set = cursor.fetchall()
df_result_set = pd.DataFrame(result_set, columns=[desc[0] for desc in cursor.description])
df_result_set.set_index('Node ID', inplace=True)
df_result_set

| Node ID | Advertised Address | Version | Up Time | Ranges | Leaders | status | membership |
|---|---|---|---|---|---|---|---|
| 1 | 10.14.0.235:26257 | v24.2.3 | 0 days 01:51:29.747260 | 61 | 22 | LIVE | active |
| 2 | 10.14.0.235:26259 | v24.2.3 | 0 days 01:51:29.183486 | 61 | 19 | LIVE | active |
| 3 | 10.14.0.235:26258 | v24.2.3 | 0 days 01:51:29.013236 | 61 | 20 | LIVE | active |
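
The status query returns one row per node. As a minimal sketch, the rows can also be summarized in plain Python before inspecting the DataFrame. `summarize_nodes` is a hypothetical helper, and the sample rows are a reduced form (node ID, address, status) of the output above.

```python
from collections import Counter

def summarize_nodes(rows):
    """Count nodes by status.

    Each row is assumed to be (node_id, address, status), a reduced
    form of the tuples returned by cursor.fetchall() for the status query.
    """
    status_counts = Counter(status for _node_id, _address, status in rows)
    return {
        "total": len(rows),
        "live": status_counts.get("LIVE", 0),
        "dead": status_counts.get("DEAD", 0),
    }

# Sample rows abbreviated from the output above.
sample_rows = [
    (1, "10.14.0.235:26257", "LIVE"),
    (2, "10.14.0.235:26259", "LIVE"),
    (3, "10.14.0.235:26258", "LIVE"),
]
print(summarize_nodes(sample_rows))
```

The same helper can be reused later in the exercise to spot nodes reported as `DEAD`.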


To scale the cluster, follow the steps below:

- Use the __**cockroach start**__ command to add nodes to the cluster.

> <p>cockroach start<br>
&emsp;&emsp;--insecure<br>
&emsp;&emsp;--listen-addr=&lt;ip address&gt;:&lt;sql listening port&gt;<br>
&emsp;&emsp;--join=&lt;ip address&gt;:&lt;sql listening port&gt;, ... ,&lt;ip address&gt;:&lt;sql listening port&gt;<br>
&emsp;&emsp;--http-addr=&lt;ip address&gt;:&lt;http listening port&gt;<br>
&emsp;&emsp;--locality=region=us-west,zone=us-west-1a<br>
&emsp;&emsp;--store=/home/cockroach/data/cr_data_1<br>
&emsp;&emsp;--background<br>
</p>
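
The template above can be sketched as a small Python function that assembles the argument list for one node. This is an illustration only: `build_start_command` is a hypothetical helper, the values passed to it are example values, and only the flags shown in the template are used.

```python
def build_start_command(ip, sql_port, http_port, join_ports, locality, store):
    """Return the argument list for one `cockroach start` invocation.

    Only the flags shown in the template above are used.
    """
    join = ",".join(f"{ip}:{port}" for port in join_ports)
    return [
        "cockroach", "start",
        "--insecure",
        f"--listen-addr={ip}:{sql_port}",
        f"--join={join}",
        f"--http-addr={ip}:{http_port}",
        f"--locality={locality}",
        f"--store={store}",
        "--background",
    ]

# Example values for an additional node on a single host (assumptions).
cmd = build_start_command(
    ip="10.0.1.2",
    sql_port=26260,
    http_port=8083,
    join_ports=[26257, 26258, 26259],
    locality="region=us-west,zone=us-west-1a",
    store="/home/cockroach/data/cr_data_4",
)
print(" ".join(cmd))
```

On a host with the `cockroach` binary installed, the list could be passed to `subprocess.run(cmd, check=True)`. Each node needs its own SQL port, HTTP port, and store directory.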

Execute the __**add_nodes.sh**__ shell script in the scripts directory:

> <code>
$ /home/cockroach/scripts/add_nodes.sh &lt;region&gt;
</code>

Verify there are six instances of the `cockroach` process running on different ports.

- List all active `cockroach` processes.

> `pgrep -a cockroach | awk '{ print $5}'`

&emsp;&emsp;Each process listens on the same IP address but on a different port.

> <code>
--listen-addr=10.0.1.2:26257<br>
--listen-addr=10.0.1.2:26258<br>
--listen-addr=10.0.1.2:26259<br>
--listen-addr=10.0.1.2:26260<br>
--listen-addr=10.0.1.2:26261<br>
--listen-addr=10.0.1.2:26262<br>
</code>
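
The check above can also be sketched in Python by parsing the process list and confirming that six distinct ports appear. `listen_ports` is a hypothetical helper, and the sample text stands in for real `pgrep -a cockroach` output (the PIDs are invented).

```python
import re

LISTEN_ADDR = re.compile(r"--listen-addr=(\S+):(\d+)")

def listen_ports(pgrep_output):
    """Extract the SQL listening port of each cockroach process."""
    ports = []
    for line in pgrep_output.splitlines():
        match = LISTEN_ADDR.search(line)
        if match:
            ports.append(int(match.group(2)))
    return sorted(ports)

# Hypothetical `pgrep -a cockroach` output, shortened to the relevant flags.
sample = "\n".join(
    f"{1000 + i} cockroach start --insecure --listen-addr=10.0.1.2:{26257 + i}"
    for i in range(6)
)
ports = listen_ports(sample)
print(ports)
print("six distinct ports:", len(set(ports)) == 6)
```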

---

## 2. Create and populate the Northwind database.
<html>
<head>
<style>
table, th, td {
  border: 1px solid black;
  border-collapse: collapse;
}
</style>
</head>
<body>

<table style="width:100%">
  <tr>
      <td align="right">
          <img src="https://drive.google.com/uc?id=1eM0otn7ieCvBMXVQ0WmXgRKUf41GaS9h" width="850"
          height="650">
      </td>
  </tr>
</table>

</body>
</html>

Follow these steps to create the database and load the data:

- Run the sed command below to set the host IP address in the script:

> ```sed -E -i s/HOST_IP/$(hostname -I | awk '{print $1}')/ /home/cockroach/sql/northwind.sql```

- On the second terminal (not running the loading server) execute the **northwind.sql** script.

> ```cockroach sql --host $(hostname -I | awk '{print $1}') -u root -d defaultdb -f /home/cockroach/sql/northwind.sql --insecure```

- Open the cockroach **DB Console**:

> <code>
http://&lt;IP address&gt;:8080/#/overview/list
</code>
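
The placeholder substitution done by the sed step above can be sketched in Python. Like `sed s/HOST_IP/.../` without the `g` flag, this replaces only the first occurrence on each line. `substitute_host` is a hypothetical helper, and the sample line is invented because the content of northwind.sql is not shown here.

```python
def substitute_host(sql_text, ip, placeholder="HOST_IP"):
    """Replace the first placeholder on each line, like `sed s/HOST_IP/<ip>/`."""
    return "\n".join(
        line.replace(placeholder, ip, 1) for line in sql_text.split("\n")
    )

# Invented sample line; the real script content is an unknown here.
template = "-- example line that refers to HOST_IP\n"
print(substitute_host(template, "10.0.1.2"), end="")
```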


In [4]:
conn = psycopg2.connect(
        database = 'northwind'
      , user = 'root'
      , host = 'Your IP Address'                        # Use the GCP Compute Engine external IP address
      , port = '26257'
      , sslmode = 'disable'
)

cursor = conn.cursor()
cursor.execute("""
SELECT gn.node_id AS "Node ID"
     , gn.advertise_sql_address AS "Advertised Address"
     , gn.build_tag AS "Version"
     , current_timestamp() AT TIME ZONE 'UTC' - gn.started_at AS "Up Time"
     , "ranges" AS "Ranges"
     , leases AS "Leaders"
     , CASE WHEN is_live THEN 'LIVE' ELSE 'DEAD' END AS "status"
     , gl.membership
FROM crdb_internal.gossip_nodes AS gn join crdb_internal.gossip_liveness AS gl USING(node_id)
""")
result_set = cursor.fetchall()
df_result_set = pd.DataFrame(result_set, columns=[desc[0] for desc in cursor.description])
df_result_set.set_index('Node ID', inplace=True)
df_result_set

| Node ID | Advertised Address | Version | Up Time | Ranges | Leaders | status | membership |
|---|---|---|---|---|---|---|---|
| 1 | 10.14.0.235:26257 | v24.2.3 | 0 days 02:22:01.099725 | 84 | 11 | LIVE | active |
| 2 | 10.14.0.235:26259 | v24.2.3 | 0 days 02:22:00.535951 | 82 | 39 | LIVE | active |
| 3 | 10.14.0.235:26258 | v24.2.3 | 0 days 02:22:00.365701 | 82 | 13 | LIVE | active |
| 4 | 10.14.0.235:26260 | v24.2.3 | 0 days 00:04:03.705612 | 75 | 12 | LIVE | active |
| 5 | 10.14.0.235:26261 | v24.2.3 | 0 days 00:04:02.788537 | 79 | 11 | LIVE | active |
| 6 | 10.14.0.235:26262 | v24.2.3 | 0 days 00:04:01.883042 | 78 | 40 | LIVE | active |


#### Compare the output with the earlier result.

- How many nodes are there in the cluster?
- Is the number of ranges higher or lower than before?
- **In the DB Console, is the number of replicas different from the SQL output? Why?**
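
One way to start on these questions is to put the two snapshots side by side. The sketch below uses the "Ranges" values copied from the two status outputs in this notebook. `compare_snapshots` is a hypothetical helper and does not query the cluster.

```python
# "Ranges" values per node ID, copied from the two status outputs.
before = {1: 61, 2: 61, 3: 61}
after = {1: 84, 2: 82, 3: 82, 4: 75, 5: 79, 6: 78}

def compare_snapshots(before, after):
    """Summarize how the node count and per-node totals changed."""
    return {
        "nodes_before": len(before),
        "nodes_after": len(after),
        "new_nodes": sorted(set(after) - set(before)),
        "total_before": sum(before.values()),
        "total_after": sum(after.values()),
    }

summary = compare_snapshots(before, after)
for key, value in summary.items():
    print(f"{key}: {value}")
```

The totals add up the per-node values, so compare them with the figures in the DB Console when answering the last question.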

---

## 3. Simulate failure of two nodes in different regions.

<html>
<head>
<style>
table, th, td {
  border: 1px solid black;
  border-collapse: collapse;
}
</style>
</head>
<body>

<table style="width:100%">
  <tr>
      <td align="right">
          <img src="https://drive.google.com/uc?id=1MCqaWMMCNDr2dYaP_JMDZykiQxpu_8Gd" width="850" height="375">
      </td>
  </tr>
</table>

</body>
</html>

- Set the __**server.time_until_store_dead**__ cluster setting to reduce the time the cluster waits before it considers a node dead. The default is 5 minutes and the minimum allowed is 1 minute and 15 seconds. Run the command below in the second terminal.

> <code>
cockroach sql --insecure --host=&lt;Private IP&gt;:26257 -d defaultdb --execute="SET CLUSTER SETTING server.time_until_store_dead = '1m15s';"
</code>
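
Before applying a different value, it can help to check it against the 1 minute 15 second minimum. The sketch below is an assumption-laden illustration: `parse_duration` and `is_allowed` are hypothetical helpers that handle only whole-number `h`, `m`, and `s` components, not every duration format the database may accept.

```python
import re

MIN_STORE_DEAD_SECONDS = 75  # 1m15s, the minimum stated above

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}

def parse_duration(text):
    """Convert a duration such as '1m15s' or '5m' to seconds.

    Only whole-number h, m, and s components are handled in this sketch.
    """
    if not re.fullmatch(r"(\d+[hms])+", text):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = re.findall(r"(\d+)([hms])", text)
    return sum(int(value) * _UNIT_SECONDS[unit] for value, unit in parts)

def is_allowed(text):
    """Check a candidate value against the 1m15s minimum."""
    return parse_duration(text) >= MIN_STORE_DEAD_SECONDS

for candidate in ("5m", "1m15s", "30s"):
    print(candidate, parse_duration(candidate), is_allowed(candidate))
```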

- Simulate an availability zone failure that brings down two nodes in the cluster.

> <code>kill -9 $(pgrep -a cockroach | grep -E "(west|east)-1c" | awk '{print $1}')
</code>
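
To preview which processes the command will stop, the selection can be sketched in Python. This assumes the intent is to pick processes whose locality zone ends in `west-1c` or `east-1c`. `pids_in_failed_zone` is a hypothetical helper, and the sample process list (PIDs and zone names) is invented for illustration.

```python
import re

ZONE_PATTERN = re.compile(r"zone=\S*(west|east)-1c\b")

def pids_in_failed_zone(pgrep_output):
    """Return the PIDs whose command line names a west-1c or east-1c zone."""
    pids = []
    for line in pgrep_output.splitlines():
        if ZONE_PATTERN.search(line):
            pids.append(int(line.split()[0]))
    return pids

# Hypothetical `pgrep -a cockroach` output, shortened to the locality flag.
sample = """\
2101 cockroach start --locality=region=us-west,zone=us-west-1a
2102 cockroach start --locality=region=us-west,zone=us-west-1b
2103 cockroach start --locality=region=us-west,zone=us-west-1c
2104 cockroach start --locality=region=us-east,zone=us-east-1c
"""
print(pids_in_failed_zone(sample))
```

The pattern uses an explicit alternation, `(west|east)`, so only those two zone suffixes are selected.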

Check the DB Console after a couple of minutes:

- How many nodes are reported dead?
- In the replication status, how many under-replicated ranges are there?

- Verify that the Northwind database is still operational after an entire availability zone has gone down.

In [12]:
cursor.execute("""
SELECT p.product_name AS "Product Name"
     , SUM(od.unit_price * CAST(od.quantity AS FLOAT) * (1.0 - od.discount)) AS Sales
FROM products AS p INNER JOIN order_details AS od ON od.product_id = p.product_id
GROUP BY p.product_name
ORDER BY Sales DESC LIMIT 5
""")
result_set = cursor.fetchall()
df_result_set = pd.DataFrame(result_set, columns=[desc[0] for desc in cursor.description])
df_result_set.set_index('Product Name', inplace=True)
df_result_set

| Product Name | sales |
|---|---|
| Côte de Blaye | 141396.735 |
| Thüringer Rostbratwurst | 80368.672 |
| Raclette Courdavault | 71155.7 |
| Tarte au sucre | 47234.97 |
| Camembert Pierrot | 46825.48 |
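
The query computes each order line as `unit_price * quantity * (1 - discount)`, sums the lines per product, and keeps the five largest totals. A minimal sketch of the same aggregation in plain Python follows. `top_products` is a hypothetical helper and the order lines are invented, not taken from the Northwind data.

```python
from collections import defaultdict

def top_products(order_lines, limit=5):
    """Sum unit_price * quantity * (1 - discount) per product, highest first."""
    sales = defaultdict(float)
    for product, unit_price, quantity, discount in order_lines:
        sales[product] += unit_price * float(quantity) * (1.0 - discount)
    ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]

# Invented order lines: (product_name, unit_price, quantity, discount).
order_lines = [
    ("Product A", 10.0, 3, 0.0),
    ("Product B", 20.0, 1, 0.5),
    ("Product A", 10.0, 2, 0.5),
]
for product, total in top_products(order_lines):
    print(product, total)
```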


---

## 4. Restart nodes.

Execute the commands below to restart the nodes:

<code>
cockroach start --insecure --listen-addr=10.0.1.2:26259 --join=10.0.1.2:26257,10.0.1.2:26258,10.0.1.2:26259 --http-addr=10.0.1.2:8082 --locality=region=us-west,zone=us-west-1c --store=/home/cockroach/data/cr_data_3 --background<br>
cockroach start --insecure --listen-addr=10.0.1.2:26260 --join=10.0.1.2:26257,10.0.1.2:26258,10.0.1.2:26259 --http-addr=10.0.1.2:8083 --locality=region=us-east,zone=us-east-1c --store=/home/cockroach/data/cr_data_4 --background
</code>

- What is happening with the under-replicated ranges?
- Are all the nodes operational again?

---

# Appendix

Workshop CRDB user ID and password:

> <p>uid = roachie<br>
pwd = roachfan
</p>

List the CRDB process IDs and process names.

> <code>pgrep -l cockroach</code>

List the listening address of each `cockroach` process.

> <code>pgrep -a cockroach | awk '{ print $5}'</code>

Kill ALL CRDB processes.

> <code>kill -9  $(pgrep cockroach)</code>

Remove all CRDB files.

> <code>sudo rm -fR /home/cockroach/data/*</code>

Update the IP address in the script.

> <code>sed -E -i s/HOST_IP/$(hostname -I | awk '{print $1}')/ northwind.sql</code>