# pydgraph example notebook

## Self-managed cluster version

This version of the example notebook uses an existing Dgraph cluster that you control. If you have Docker, the TL;DR version is:

```sh
docker run --rm -it -p 8080:8080 -p 9080:9080 -p 5080:5080 dgraph/standalone:latest
```

For more information on starting Dgraph with Docker or Docker Compose, see this [document](https://dgraph.io/docs/learn/data-engineer/get-started-with-dgraph/tutorial-1/).

This example notebook uses a schema and data from the [Dgraph ICIJ offshore leaks repository](https://github.com/dgraph-io/vlg). Please refer to that repo for a discussion of the schema and data.

**Please note that this notebook updates the schema in the configured cluster and loads data into it.**

In [None]:
# Set the hostname of the Dgraph alpha service
dgraph_hostname = "localhost"

In [None]:
# This cell checks that the required ports for the Dgraph cluster are accessible from this notebook. It also sets
# important port variables used in later cells

import socket

def check_port(host, port):
    """
    Return True if the given port on host is accepting TCP connections.
    """
    try:
        # The context manager closes the socket even if the connection attempt raises
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(3)  # Give up on the connection attempt after 3 seconds
            return sock.connect_ex((host, port)) == 0
    except OSError:
        return False

# Check ports to ensure access. These are the defaults; change them to match your cluster configuration
dgraph_http_port = 8080
dgraph_grpc_port = 9080
dgraph_zero_port = 5080
if not check_port(dgraph_hostname, dgraph_http_port):
    print(f"Port {dgraph_http_port} at {dgraph_hostname} not responding, is the server running?")
elif not check_port(dgraph_hostname, dgraph_grpc_port):
    print(f"Port {dgraph_grpc_port} at {dgraph_hostname} not responding, is the server running?")
elif not check_port(dgraph_hostname, dgraph_zero_port):
    print(f"Port {dgraph_zero_port} at {dgraph_hostname} not responding, is the server running?")
else:
    print("Required ports accepting connections")
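
The `if`/`elif` chain above stops at the first port that fails, so a cluster with several closed ports is reported one port at a time. A minimal sketch of an alternative that collects every unreachable port in one pass follows. `unreachable_ports` is a hypothetical helper name (not part of this notebook or pydgraph), and it relies on the standard library's `socket.create_connection`:

```python
import socket

def unreachable_ports(host, ports, timeout=3.0):
    """Return the ports on host that refuse or time out on a TCP connect."""
    failed = []
    for port in ports:
        try:
            # create_connection resolves the host; the with-block closes the socket
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failed.append(port)
    return failed
```

Called as `unreachable_ports(dgraph_hostname, [dgraph_http_port, dgraph_grpc_port, dgraph_zero_port])`, it should return an empty list when all three ports accept connections.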

In [None]:
# Apply a GraphQL Schema to the cluster

!curl -Ss https://raw.githubusercontent.com/dgraph-io/vlg/main/schema/schema.graphql --output schema.graphql
admin_endpoint = f"http://{dgraph_hostname}:{dgraph_http_port}/admin/schema"
!curl --data-binary '@./schema.graphql' {admin_endpoint}
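
If `curl` is not available in the notebook environment, the same upload can be approximated with the standard library. This is a sketch under assumptions: `build_schema_request` is a hypothetical helper, and it only constructs the request for the `/admin/schema` endpoint used above; sending it requires a running cluster.

```python
import urllib.request

def build_schema_request(hostname, http_port, schema_text):
    """Build (but do not send) a POST of the schema to the /admin/schema endpoint."""
    url = f"http://{hostname}:{http_port}/admin/schema"
    # Supplying a body makes this a POST, similar to curl --data-binary
    return urllib.request.Request(url, data=schema_text.encode("utf-8"), method="POST")

# To send it (requires a running cluster):
# with open("schema.graphql", encoding="utf-8") as f:
#     req = build_schema_request("localhost", 8080, f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```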

In [None]:
# Load data into the cluster

!curl -Ss https://raw.githubusercontent.com/dgraph-io/vlg/main/rdf-subset/data.rdf.gz --output data.rdf.gz

# Pick the first available loader: Docker, a local dgraph binary, or (on Linux) a fresh dgraph install
import shutil, os, platform

pwd = os.getcwd()
if shutil.which('docker') is not None:
    docker_host = dgraph_hostname
    if dgraph_hostname == 'localhost':
        docker_host = 'host.docker.internal'
    !docker run -it -v {pwd}:/data dgraph/standalone:latest dgraph live -f /data/data.rdf.gz --alpha {docker_host}:{dgraph_grpc_port} --zero {docker_host}:{dgraph_zero_port}
elif shutil.which('dgraph') is not None:
    !dgraph live -f ./data.rdf.gz --alpha {dgraph_hostname}:{dgraph_grpc_port} --zero {dgraph_hostname}:{dgraph_zero_port}
elif platform.system() == "Linux":
    !curl https://get.dgraph.io -sSf | bash -s -- -y
    !dgraph live -f ./data.rdf.gz --alpha {dgraph_hostname}:{dgraph_grpc_port} --zero {dgraph_hostname}:{dgraph_zero_port}
else:
    raise Exception("Unable to find a way to load data into your cluster.")
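
Before handing `data.rdf.gz` to the live loader, it can help to confirm that the download is a readable gzip file and to look at its first few lines. A minimal sketch, assuming the file is UTF-8 text; `peek_gzip_lines` is a hypothetical helper, not part of Dgraph or pydgraph:

```python
import gzip
from itertools import islice

def peek_gzip_lines(path, count=5):
    """Return up to count decoded lines from the start of a gzip-compressed text file."""
    # Reading raises OSError if the file is not valid gzip (for example, a saved HTML error page)
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, count)]
```

For example, `peek_gzip_lines("data.rdf.gz")` returns the first five lines of the downloaded file as strings.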

    

In [None]:
# Install pydgraph

!pip install pydgraph

In [None]:
# Initialize a pydgraph client

import pydgraph

# Raise the gRPC receive limit to 1 GiB so large query responses are not rejected
client_stub = pydgraph.DgraphClientStub(
    addr=f"{dgraph_hostname}:{dgraph_grpc_port}",
    options=[('grpc.max_receive_message_length', 1024*1024*1024)],
)
pyd_client = pydgraph.DgraphClient(client_stub)
print("Dgraph Version:", pyd_client.check_version())

In [None]:
# Perform a DQL query

import json

query = """
{
  q(func: anyoftext(Record.name, "living"), first: 10) {
    id: Record.nodeID
    name: Record.name
  }
}
"""
res = pyd_client.txn(read_only=True).query(query)
print(json.dumps(json.loads(res.json), indent=2))
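
The response JSON is keyed by the query block name (`q` above), so the matching nodes can be pulled out as a plain list of dictionaries. A minimal sketch: `rows_from_response` is a hypothetical helper, and the `sample` payload is made up to mirror the shape of the query above; it is not real query output.

```python
import json

def rows_from_response(raw_json, block="q"):
    """Decode a DQL JSON response and return the list stored under the named query block."""
    payload = json.loads(raw_json)
    # Treat a missing block the same as an empty one: no rows
    return payload.get(block, [])

# Made-up payload with the same shape as the query above (not real query output)
sample = b'{"q": [{"id": "n1", "name": "Example Living Trust"}]}'
for row in rows_from_response(sample):
    print(row["id"], row["name"])
```

With a live cluster, the same helper would be called as `rows_from_response(res.json)`.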
