Skip to content

IBM/store-and-query-unstructured-json-cockroachdb-operator-rhm

Repository files navigation

abstract excerpt authors completed_date last_updated components draft keywords primary_tag related_content title subtitle tags translators type
This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and a Jupyter notebook.
This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and a Jupyter notebook.
name email
Manoj Jahgirdar
manoj.jahgirdar@in.ibm.com
name email
Rahul Reddy Ravipally
raravi86@in.ibm.com
name email
Srikanth Manne
srikanth.manne@in.ibm.com
name email
Manjula G. Hosurmath
mhosurma@in.ibm.com
2020-09-03
2020-09-03
slug name url type
cockroachdb
CockroachDB
component
slug name url type
redhat-marketplace
Red Hat Marketplace
component
true
cockroachdb, red hat marketplace
databases
type slug
tutorials
get-started-using-a-cockroachdb-operator-hosted-on-red-hat-marketplace
Store and query unstructured JSON data from CockroachDB hosted on Red Hat Marketplace
Access JSON data with CockroachDB and OpenShift
databases
containers
name email
tutorial

This tutorial shows you how to get unstructured JSON data from an API, store it in CockroachDB hosted on Red Hat Marketplace, and query the unstructured JSON data from the table using Python and Jupyter notebooks.

About CockroachDB Operator

CockroachDB is an elastic SQL database that easily scales transactions for your apps and services. Its cloud-native architecture simplifies scale and also guarantees consistent transactions across multiple regions and multiple clouds. Learn more.

Learning objectives

When you have completed this tutorial, you will understand how to:

  • Install CockroachDB Operator from Red Hat Marketplace on an OpenShift Cluster
  • Create a CockroachDB cluster instance
  • Create a user and database in CockroachDB
  • Store and query unstructured JSON data from a third-party API in CockroachDB
  • Manage the CockroachDB cluster from the Cluster Overview page

Estimated time

Completing this tutorial should take about 30 minutes.

Pre-requisites

  1. Red Hat Marketplace account.
  2. Red Hat OpenShift Cluster.
  3. OpenShift container & kubectl CLI.
  4. Access to a Jupyter Notebook. You can install a Jupyter Notebook from python-pip or use a tool such as Anaconda to open the Jupyter Notebook.

Steps

Step 1: Install the CockroachDB Operator from Red Hat Marketplace on OpenShift Cluster

Follow the steps in this tutorial to deploy a CockroachDB Operator from Red Hat Marketplace on an OpenShift Cluster:

Once you have successfully set up a CockroachDB Operator on an OpenShift Cluster you can create a database.

Step 2: Create a user and database in CockroachDB

Now, let's create a user and a database.

  1. Run the following command to spin up a CockroachDB client:

    $ kubectl run -it --rm cockroach-client \
    --image=cockroachdb/cockroach \
    --restart=Never \
    --command -- \
    ./cockroach sql --insecure --host=example-cockroachdb-public.cockroachdb-test

    This should run the CockroachDB client and take you to a SQL Command Prompt as shown. If you don't see a command prompt, try pressing Enter.

    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb>
  2. From the CockroachDB client, run the following commands:

    • Create a user maxroach as follows:
    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> CREATE USER IF NOT EXISTS maxroach;
     CREATE USER 1
    
     Time: 9.580878ms
     
    • Create a database employees as follows:
    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> CREATE DATABASE employees;
     CREATE DATABASE
    
     Time: 14.449525ms
     
    • Give the user, maxroach, permission to update our database, employees, as follows:
    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> GRANT ALL ON DATABASE employees TO maxroach;
     GRANT
    
     Time: 9.308095ms
     

    At this point, you should have a user and a database.

  3. Type \q to quit the client console as shown:

    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> \q
     pod "cockroach-client" deleted
     

Step 3: Access the admin console on your localhost

The following steps show you how to view the results of the commands you ran in the earlier steps via the admin console. You can access the console at localhost with port forwarding.

Note: You need to be logged in to your OpenShift Cluster with the OC login that you accessed in Step 2: Connect to the OpenShift cluster in your CLI.

  1. Run the following command to port forward 8080:

    $ kubectl port-forward example-cockroachdb-0 8080
    Forwarding from 127.0.0.1:8080 -> 8080
    Forwarding from [::1]:8080 -> 8080
    
  2. Visit http://localhost:8080 on your browser as shown. The page should load the cluster overview.

    cockroach-clusteroverview

  3. Click on databases to view the employees database that you created earlier.

    cockroach-databases.png

Step 4: Store and query unstructured JSON data using Python and Jupyter Notebooks

Once the CockroachDB is running and you created the user and database, you can now explore the JSON support for CockroachDB in a python runtime using Jupyter Notebook.

  1. In your terminal, run the following command to port forward the 26257 port from the CockroachDB database instance. This port is used in your Jupyter Notebook to establish a connection with the CockroachDB database instance.

    $ kubectl port-forward example-cockroachdb-0 26257
    Forwarding from 127.0.0.1:26257 -> 26257
    Forwarding from [::1]:26257 -> 26257
    
  2. Download and open the notebook Store-and-query-unstructured-Json-CockroachDB.ipynb in your local machine.

    We are using the http://dummy.restapiexample.com/api/v1/employees API for demonstration purpose.

    nb-api

  3. Click on the Cell tab and select Run All.

    nb-run

    You can now follow the notebook instructions for more details on what is happening in each cell.

    After you have executed the notebook, verify the table in the CockroachDB instance through the CockroachDB client from terminal.

  4. In your terminal, run the following command to spin up a CockroachDB client:

    $ kubectl run -it --rm cockroach-client \
    --image=cockroachdb/cockroach \
    --restart=Never \
    --command -- \
    ./cockroach sql --insecure --host=example-cockroachdb-public.cockroachdb-test

    This should run the CockroachDB client and take you to a SQL Command Prompt as shown. If you don't see a command prompt, try pressing Enter.

    root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb>
  5. From the CockroachDB client, run the following commands to view the user, database, and table which the Jupyter Notebook created:

    • View users by running the SHOW users; command:

      root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> SHOW users;
       user_name
      `-------------`
        cpuser
        maxroach
        root
        (3 rows)
      
      Time: 3.037641ms
      
    • View databases by running the SHOW databases; command:

      root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> SHOW databases;
      database_name
      `-----------------`
      bank
      defaultdb
      employees
      postgres
      system
      (5 rows)
      
      Time: 2.890031ms
    • To view the tables present in the Employees database, run the USE bank; command to switch to bank database, and run \d command to view the tables:

      root@example-cockroachdb-public.cockroachdb-test:26257/defaultdb> USE employees;
      SET
      
      Time: 11.83841ms
      
      root@example-cockroachdb-public.cockroachdb-test:26257/employees> \d
      table_name
      `----------------------`
      jsontbl
      (1 row)
      
      Time: 3.684617ms
  6. Finally, to view the unstructured JSON from the table, run the SELECT command as follows:

<pre><code>root@example-cockroachdb-public.cockroachdb-test:26257/employees> <b>SELECT * from jsontbl;</b>
</code></pre>

![cmd-output](doc/source/images/cmdoutput.png)

Summary

In this tutorial, we showed you how to get unstructured JSON data from a third-party REST API, created a table in CockroachDB and store the unstructured JSON data into the table. You also learned how to query the unstructured data from CockroachDB table into a pandas dataframe which can be further used for data engineering and data science tasks.

Reference

Refer to the following documentation from Cockroach labs to learn more about the operator and its features.

About

Store and query unstructured JSON data from CockroachDB hosted on Red Hat Marketplace

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published