# Setup Dgraph

This tutorial uses [dgraph via `docker-compose`](https://docs.dgraph.io/get-started/#docker-compose). It assumes you have already installed [docker](https://docs.docker.com/v17.09/engine/installation/) and [docker-compose](https://docs.docker.com/compose/install/) on the same machine as jupyter. What follows is a whirlwind demonstration that covers a tiny subset of dgraph's [Tour of Dgraph: A Bigger Dataset](https://tour.dgraph.io/moredata/1/). I've set `--lru_mb=4096` instead of `2048` on the `alpha` server since the tour suggests it's often needed for a dataset of this size.

## Create the `docker-compose.yml`

The downloaded data and artifacts will go to `tutorial_artifacts` to keep your working directory legible.

In [1]:
!mkdir -p tutorial_artifacts

I've named the containers `dgraph_zero`, `dgraph_server`, and `dgraph_ratel` in the `docker-compose.yml` file. This makes data ingestion more straightforward, since later commands can refer to the containers by name.

In [2]:
%%writefile tutorial_artifacts/docker-compose.yml
version: "3.2"
services:
  zero:
    image: dgraph/dgraph:latest
    container_name: "dgraph_zero"
    volumes:
      - type: volume
        source: dgraph
        target: /dgraph
        volume:
          nocopy: true
    ports:
      - 5080:5080
      - 6080:6080
    restart: on-failure
    command: dgraph zero --my=zero:5080
  server:
    container_name: "dgraph_server"
    image: dgraph/dgraph:latest
    volumes:
      - type: volume
        source: dgraph
        target: /dgraph
        volume:
          nocopy: true
    ports:
      - 8080:8080
      - 9080:9080
    restart: on-failure
    command: dgraph alpha --my=server:7080 --lru_mb=4096 --zero=zero:5080
  ratel:
    container_name: "dgraph_ratel"
    image: dgraph/dgraph:latest
    volumes:
      - type: volume
        source: dgraph
        target: /dgraph
        volume:
          nocopy: true
    ports:
      - 8000:8000
    command: dgraph-ratel

volumes:
  dgraph:

Writing tutorial_artifacts/docker-compose.yml


# Start Dgraph

With the `docker-compose.yml` file written, it's now possible to start `dgraph`.

(You'll probably have to verify successful start in your terminal with `docker ps`. Terminal output rewriting breaks in jupyter output cells.)
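
As an alternative to eyeballing `docker ps`, you can poll the alpha server's HTTP `/health` endpoint once the containers have been started. This is a minimal sketch, assuming the `8080:8080` port mapping from the compose file above. `http_probe`, `wait_for_alpha`, and the `probe` parameter are hypothetical names, not part of idgraph.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_for_alpha(url="http://localhost:8080/health", attempts=30,
                   delay=1.0, probe=http_probe):
    """Poll `url` until `probe` succeeds or `attempts` run out."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

Calling `wait_for_alpha()` in a cell after the `up -d` command returns `True` as soon as the server answers, and `False` if it never does within the allotted attempts.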

In [3]:
!docker-compose -f tutorial_artifacts/docker-compose.yml up -d 

Creating network "tutorial_artifacts_default" with the default driver
Creating dgraph_ratel ... 
Creating dgraph_server ... 
Creating dgraph_zero   ... 

# Get the Data

In [4]:
!wget "https://github.com/dgraph-io/tutorial/blob/master/resources/1million.rdf.gz?raw=true" --no-clobber -O tutorial_artifacts/1million.rdf.gz -q
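
If `wget` isn't available, the same "download unless it already exists" behaviour can be approximated in Python. This is a sketch using only the standard library. `download_if_missing` and its `fetch` parameter are hypothetical names introduced for illustration.

```python
import os
import urllib.request

DATA_URL = ("https://github.com/dgraph-io/tutorial/blob/master/"
            "resources/1million.rdf.gz?raw=true")


def download_if_missing(url, dest, fetch=urllib.request.urlretrieve):
    """Download `url` to `dest` unless `dest` already exists.

    Returns True if a download happened, False if it was skipped.
    """
    if os.path.exists(dest):
        return False
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    fetch(url, dest)
    return True
```

In the notebook this would be called as `download_if_missing(DATA_URL, "tutorial_artifacts/1million.rdf.gz")`.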

# Load Data

In [5]:
%load_ext idgraph

The following command will delete existing data from dgraph. If this is your first run, you won't have any. And, if this isn't your first run, you may not want to delete your data. The `--skip` argument to the cell magic skips execution.

In [6]:
%%dgraph --alter --json --skip
{
    "drop_all": true
}

Execution skipped. Remove `--skip` flag to execute.

The following alters the database schema, declaring the predicates and their indexes.

In [7]:
%%dgraph --alter

# Define Directives and index
director.film: [uid] @reverse .
genre: [uid] @reverse .
initial_release_date: dateTime @index(year) .
name: string @index(exact, term) @lang .

```json
{
    "code": "Success",
    "message": "Done"
}
```


For a variety of reasons, people have had trouble with loading the tour data. It's easy to end up with error messages like,

> While trying to setup connection to Dgraph server.  
> error: context deadline exceeded


Following [Pawan Rawal's suggestion](https://discuss.dgraph.io/t/cant-execute-tour-part-a-bigger-dataset/2267/9), using `docker cp` makes it easier. The following,

1. Copies the data to the server container;
2. Runs the ingestor (`dgraph live`);
3. Deletes the data on the server container.
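
The three steps can also be scripted from Python with `subprocess`, which makes it easy to stop at the first failure. This is a sketch under a few assumptions: the container names and ports match the compose file above, and the data file is addressed by its full path under `/dgraph`. It also omits the `-it` flags, since no interactive terminal is involved. `load_commands` and `run_load` are hypothetical helper names.

```python
import subprocess


def load_commands(local_file, container="dgraph_server",
                  zero="dgraph_zero:5080", workdir="/dgraph"):
    """Return the copy, load, and cleanup commands as argument lists."""
    name = local_file.rsplit("/", 1)[-1]
    remote = f"{workdir}/{name}"
    return [
        # 1. Copy the data into the server container.
        ["docker", "cp", local_file, f"{container}:{workdir}"],
        # 2. Run the live loader against the alpha and zero servers.
        ["docker", "exec", container, "dgraph", "live", "-f", remote,
         "--alpha", f"{container}:9080", "--zero", zero, "-c", "1"],
        # 3. Delete the copied data from the container.
        ["docker", "exec", container, "rm", remote],
    ]


def run_load(local_file, runner=subprocess.run):
    """Run each step in order; check=True stops at the first failure."""
    for cmd in load_commands(local_file):
        runner(cmd, check=True)
```

Because `check=True` raises on a non-zero exit code, a failed load leaves the file in the container, where it can be inspected or the load retried.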

In [8]:
!docker cp ./tutorial_artifacts/1million.rdf.gz dgraph_server:/dgraph
!docker exec -it dgraph_server dgraph live -f 1million.rdf.gz --alpha dgraph_server:9080 --zero dgraph_zero:5080 -c 1
!docker exec -it dgraph_server rm /dgraph/1million.rdf.gz

[Decoder]: Using assembly version of decoder
I1020 20:42:40.707381      27 init.go:98] 

Dgraph version   : v1.1.0
Dgraph SHA-256   : 7d4294a80f74692695467e2cf17f74648c18087ed7057d798f40e1d3a31d2095
Commit SHA-1     : ef7cdb28
Commit timestamp : 2019-09-04 00:12:51 -0700
Branch           : HEAD
Go version       : go1.12.7

For Dgraph official documentation, visit https://docs.dgraph.io.
For discussions about Dgraph     , visit https://discuss.dgraph.io.
To say hi to the community       , visit https://dgraph.slack.com.

Licensed variously under the Apache Public License 2.0 and Dgraph Community License.
Copyright 2015-2018 Dgraph Labs, Inc.



Running transaction with dgraph endpoint: dgraph_server:9080
Found 1 data file(s) to process
Processing data file "1million.rdf.gz"
[20:42:45Z] Elapsed: 05s Txns: 34 N-Quads: 34000 N-Quads/s [last 5s]:  6800 Aborts: 0
[20:42:50Z] Elapsed: 10s Txns: 70 N-Quads: 70000 N-Quads/s [last 5s]:  7200 Aborts: 0
[20:42:55Z] Elapsed: 15s Txns: 173 N-Quads: 

The following adds types,

In [9]:
%%dgraph --alter
# Define Types

type Person {
    name: string
    director.film: [Movie]
}

type Movie {
    name: string
    initial_release_date: string
    genre: [Genre]
    starring: [Performance]
}

type Genre {
    name: string
}

type Performance {
    performance.film: [Movie]
    performance.character_note: string
    performance.character: [Person]
    performance.actor: [Person]
    performance.special_performance_type: [Special_performance_type]
    type: [Generic]
}


```json
{
    "code": "Success",
    "message": "Done"
}
```


Now, it's possible to execute queries!

In [10]:
%%dgraph 
{
    hanks(func: eq(name@., "Tom Hanks")) {
        uid
        name@.
        director.film {
            expand(_all_)
        }
    }
}

```json
{
    "hanks": [
        {
            "uid": "0xcbcefccffb9eb400",
            "name@.": "Tom Hanks",
            "director.film": [
                {
                    "initial_release_date": "1996-09-14T00:00:00Z",
                    "name@en": "That Thing You Do!",
                    "name@it": "Music Graffiti",
                    "name@de": "That Thing You Do!"
                },
                {
                    "initial_release_date": "2011-06-27T00:00:00Z",
                    "name@en": "Larry Crowne",
                    "name@it": "L'amore all'improvviso - Larry Crowne",
                    "name@de": "Larry Crowne"
                }
            ]
        }
    ]
}
```
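
For comparison, here is roughly what a query looks like without the cell magic, sent straight to the alpha's HTTP `/query` endpoint. This is a sketch, assuming the server is reachable on `localhost:8080` and accepts the `application/graphql+-` content type (as the v1.1 HTTP API does). `query_request` and `run_query` are hypothetical names, and this is not necessarily how idgraph talks to the server.

```python
import json
import urllib.request

QUERY = """
{
    hanks(func: eq(name@., "Tom Hanks")) {
        uid
        name@.
    }
}
"""


def query_request(query, base_url="http://localhost:8080"):
    """Build a POST to /query carrying a GraphQL+- query body."""
    return urllib.request.Request(
        base_url + "/query",
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/graphql+-"},
        method="POST",
    )


def run_query(query, opener=urllib.request.urlopen):
    """Send the query and return the decoded JSON envelope."""
    with opener(query_request(query)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`run_query(QUERY)` returns the whole envelope, so the results shown above would sit under its `"data"` key.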


By default, the output shows only the value of the `data` key. But, if you use the `--full-resp` flag, you'll get the full envelope.

In [11]:
%%dgraph --full-resp
{
    hanks(func: eq(name@., "Tom Hanks")) {
        uid
        name@.
        director.film {
            expand(_all_)
        }
    }
}

```json
{
    "data": {
        "hanks": [
            {
                "uid": "0xcbcefccffb9eb400",
                "name@.": "Tom Hanks",
                "director.film": [
                    {
                        "name@en": "That Thing You Do!",
                        "name@it": "Music Graffiti",
                        "name@de": "That Thing You Do!",
                        "initial_release_date": "1996-09-14T00:00:00Z"
                    },
                    {
                        "name@en": "Larry Crowne",
                        "name@it": "L'amore all'improvviso - Larry Crowne",
                        "name@de": "Larry Crowne",
                        "initial_release_date": "2011-06-27T00:00:00Z"
                    }
                ]
            }
        ]
    },
    "extensions": {
        "server_latency": {
            "parsing_ns": 37163,
            "processing_ns": 4289874,
            "encoding_ns": 157920,
            "assign_timestamp_ns": 847628
        },
        "txn": {
            "start_ts": 82088
        }
    }
}
```
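
The `server_latency` entries in the envelope carry an `_ns` suffix, i.e. nanoseconds. Here is a small sketch that converts them to milliseconds, using the numbers from the response above. `latency_ms` is a hypothetical helper.

```python
def latency_ms(envelope):
    """Convert each server_latency entry from nanoseconds to milliseconds."""
    latency = envelope.get("extensions", {}).get("server_latency", {})
    return {key[:-3] + "_ms": value / 1e6
            for key, value in latency.items() if key.endswith("_ns")}


# Values copied from the full response shown above.
envelope = {
    "extensions": {
        "server_latency": {
            "parsing_ns": 37163,
            "processing_ns": 4289874,
            "encoding_ns": 157920,
            "assign_timestamp_ns": 847628,
        }
    }
}

for name, ms in latency_ms(envelope).items():
    print(f"{name}: {ms:.3f}")
```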


I really enjoy [`jmespath`](http://jmespath.org/). You can use the `--jmespath` flag to extract a specific part of the response.

In [12]:
%%dgraph --jmespath="data.hanks[0].uid"
{
    hanks(func: eq(name@., "Tom Hanks")) {
        uid
        name@.
        director.film {
            expand(_all_)
        }
    }
}

```json
"0xcbcefccffb9eb400"
```
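
If you're new to JMESPath, the expression `data.hanks[0].uid` walks the envelope step by step and yields null (`None` in Python) when a key or index is missing, rather than raising. The stdlib-only sketch below imitates just that behaviour for plain key and index steps. `dig` is a hypothetical helper, not the `jmespath` library and not how idgraph evaluates the flag.

```python
def dig(obj, *steps):
    """Follow dict keys and list indexes; return None if any step is missing."""
    for step in steps:
        if isinstance(step, int) and isinstance(obj, list):
            if not -len(obj) <= step < len(obj):
                return None
            obj = obj[step]
        elif isinstance(obj, dict) and step in obj:
            obj = obj[step]
        else:
            return None
    return obj


# A trimmed copy of the response shown earlier.
response = {"data": {"hanks": [{"uid": "0xcbcefccffb9eb400",
                                "name@.": "Tom Hanks"}]}}

print(dig(response, "data", "hanks", 0, "uid"))  # the uid string
print(dig(response, "data", "hanks", 1, "uid"))  # None: no second match
```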


For all queries, the extracted part is automatically bound to `_dgraph`, sorta following what you'd expect with `_` in IPython,

In [13]:
_dgraph

'0xcbcefccffb9eb400'

And the full response is bound to `_dgraph_full`,

In [14]:
_dgraph_full

{'data': {'hanks': [{'uid': '0xcbcefccffb9eb400',
    'name@.': 'Tom Hanks',
    'director.film': [{'name@en': 'That Thing You Do!',
      'name@it': 'Music Graffiti',
      'name@de': 'That Thing You Do!',
      'initial_release_date': '1996-09-14T00:00:00Z'},
     {'name@en': 'Larry Crowne',
      'name@it': "L'amore all'improvviso - Larry Crowne",
      'name@de': 'Larry Crowne',
      'initial_release_date': '2011-06-27T00:00:00Z'}]}]},
 'extensions': {'server_latency': {'parsing_ns': 21679,
   'processing_ns': 3189200,
   'encoding_ns': 119247,
   'assign_timestamp_ns': 626255},
  'txn': {'start_ts': 82089}}}

You can change the binding variable by specifying the `--into` flag.

In [15]:
%%dgraph --into='jeunet' --jmespath="data.jeunet[0]"
{
    jeunet(func: allofterms(name@en, "Jean-Pierre Jeunet")) {
        uid
        name@en
        director.film {
            names: name@en
        }
    }
}

```json
{
    "uid": "0xc96d2ccebd5f961a",
    "name@en": "Jean-Pierre Jeunet",
    "director.film": [
        {
            "names": "Delicatessen"
        },
        {
            "names": "A Very Long Engagement"
        },
        {
            "names": "Micmacs"
        },
        {
            "names": "The Young and Prodigious Spivet"
        },
        {
            "names": "Am\u00e9lie"
        },
        {
            "names": "The City of Lost Children"
        },
        {
            "names": "Things I Like, Things I Don't Like"
        },
        {
            "names": "Alien: Resurrection"
        }
    ]
}
```


Now, it's in `jeunet`,

In [16]:
jeunet

{'uid': '0xc96d2ccebd5f961a',
 'name@en': 'Jean-Pierre Jeunet',
 'director.film': [{'names': 'Delicatessen'},
  {'names': 'A Very Long Engagement'},
  {'names': 'Micmacs'},
  {'names': 'The Young and Prodigious Spivet'},
  {'names': 'Amélie'},
  {'names': 'The City of Lost Children'},
  {'names': "Things I Like, Things I Don't Like"},
  {'names': 'Alien: Resurrection'}]}
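
Since `jeunet` is an ordinary Python dict, the usual tools apply. For example, this sketch flattens the aliased `names` values into a sorted list. The literal below copies part of the output above so the snippet runs on its own.

```python
jeunet = {
    "uid": "0xc96d2ccebd5f961a",
    "name@en": "Jean-Pierre Jeunet",
    "director.film": [
        {"names": "Delicatessen"},
        {"names": "A Very Long Engagement"},
        {"names": "Micmacs"},
        {"names": "Amélie"},
    ],
}

# .get() guards against results that have no `director.film` edges.
titles = sorted(film["names"] for film in jeunet.get("director.film", []))
print(titles)
```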

With the full response in `jeunet_full`,

In [17]:
jeunet_full

{'data': {'jeunet': [{'uid': '0xc96d2ccebd5f961a',
    'name@en': 'Jean-Pierre Jeunet',
    'director.film': [{'names': 'Delicatessen'},
     {'names': 'A Very Long Engagement'},
     {'names': 'Micmacs'},
     {'names': 'The Young and Prodigious Spivet'},
     {'names': 'Amélie'},
     {'names': 'The City of Lost Children'},
     {'names': "Things I Like, Things I Don't Like"},
     {'names': 'Alien: Resurrection'}]}]},
 'extensions': {'server_latency': {'parsing_ns': 25980,
   'processing_ns': 27678541,
   'encoding_ns': 28621,
   'assign_timestamp_ns': 568283},
  'txn': {'start_ts': 82090}}}

You can also interpret the cell body as a [`jinja` template](https://palletsprojects.com/p/jinja/) with the `--as-jinja` flag. Generally, *you don't want to do this too much*, since you should probably be crafting better GraphQL+- queries, but it's occasionally useful. Before sending, it's often helpful to use the `--print-jinja` flag to see what's going to get sent. Variable interpolation uses ``<<variable>>`` (instead of ``{{variable}}``), with names resolved in the user namespace. Blocks use `<% block %>` (instead of `{% block %}`). (The jinja environment uses the cwd, so it's technically possible to load parent templates, but I *really* don't think you should.)
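
To make the `<<variable>>` interpolation concrete, here is a deliberately simplified, stdlib-only stand-in. It is *not* jinja and not how idgraph renders templates. It only substitutes dotted names from a namespace dict and ignores `<% ... %>` blocks entirely. `resolve` and `interpolate` are hypothetical names.

```python
import re

# Matches <<name>> or <<name.attr>>, with optional inner whitespace.
PLACEHOLDER = re.compile(r"<<\s*([A-Za-z_][\w.]*)\s*>>")


def resolve(path, namespace):
    """Look up a dotted name such as `jeunet.uid` in `namespace`."""
    head, *rest = path.split(".")
    value = namespace[head]
    for part in rest:
        # Support both dict keys and object attributes.
        value = value[part] if isinstance(value, dict) else getattr(value, part)
    return value


def interpolate(template, namespace):
    """Replace every <<name>> placeholder with its value from `namespace`."""
    return PLACEHOLDER.sub(
        lambda match: str(resolve(match.group(1), namespace)), template)


template = 'templated_query(func: uid("<<jeunet.uid>>"))'
print(interpolate(template, {"jeunet": {"uid": "0xc96d2ccebd5f961a"}}))
```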

In [18]:
%%dgraph --as-jinja --print-jinja
{
    templated_query(func: uid("<<jeunet.uid>>")) {
    <%- if True %>
        name@en
    <%- endif %>
    }
}

```text
{
    templated_query(func: uid("0xc96d2ccebd5f961a")) {
        name@en
    }
}
```


Having inspected the interpolated query, I feel secure sending it.

In [19]:
%%dgraph --as-jinja
{
    templated_query(func: uid("<<jeunet.uid>>")) {
        name@en
    }
}

```json
{
    "templated_query": [
        {
            "name@en": "Jean-Pierre Jeunet"
        }
    ]
}
```


Now that I'm done showing off, I'll take the server down.

In [20]:
!docker-compose -f tutorial_artifacts/docker-compose.yml down

Stopping dgraph_zero   ... 
Stopping dgraph_server ... 
Stopping dgraph_ratel  ... 
Removing dgraph_zero   ... 
Removing dgraph_server ... 
Removing dgraph_ratel  ... 
Removing network tutorial_artifacts_default


That's it! Check out `dgraph`'s [online documentation](https://docs.dgraph.io/) for a lot more information. And, if this was useful, [please give me a star on github](https://github.com/jbn/idgraph) or [a shout out on twitter](https://twitter.com/generativist), or both :)