# Complex Gremlin Queries
In this section, we show you some more advanced Gremlin queries.

## Setup

Before you start, ensure you have run notebook _01-Setup_ to create the dataset with which we'll be working.

In [None]:
%load_ext ipython_unittest
%run '../util/neptune.py'

In [None]:
g = neptune.graphTraversal()

## Graph Model

Here's the application graph data model:

<img src="https://s3.amazonaws.com/aws-neptune-customer-samples/neptune-sagemaker/images/imdb-data-model.jpg"/>


## Gremlin-Python

Throughout these exercises you'll be using [Gremlin-Python](http://tinkerpop.apache.org/docs/current/reference/#gremlin-python), which requires a few modifications to Gremlin:

 - In Python, `as`, `in`, `and`, `or`, `is`, `not`, `from`, and `global` are reserved words. In Gremlin-Python, append a `_` suffix to these words. For example, the `as()` Gremlin step is written `as_()`.
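
The renaming rule can be checked against Python's own keyword list. The sketch below uses the standard-library `keyword` module. `to_gremlin_python_name` is a hypothetical helper written for illustration, not part of Gremlin-Python, and it models only the trailing-underscore rule for the reserved words listed above.

```python
import keyword

def to_gremlin_python_name(step_name):
    """Append '_' when the Gremlin step name is a Python reserved word."""
    return step_name + "_" if keyword.iskeyword(step_name) else step_name

# The reserved words from the list above, plus two ordinary step names.
for name in ["as", "in", "and", "or", "is", "not", "from", "global", "out", "has"]:
    print(f"{name:>6} -> {to_gremlin_python_name(name)}")
```

Ordinary step names such as `out` and `has` pass through unchanged, which is why most Gremlin examples can be copied into Python with few edits.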

### 03.01: Recommendation query

Starting from the person with ID `person378`, recommend 10 movies.
The movies should be highly rated (here, a rating above 8.0 with more than 50,000 votes), belong to this person's favourite genre, and not already be rated by this person.

Consult the following documentation:
 - [`groupCount()`](http://tinkerpop.apache.org/docs/current/reference/#groupcount-step)
 - [`order().by()`](http://tinkerpop.apache.org/docs/3.3.2/reference/#order-step)


In [None]:
%%time
results_03_01 = (g.
    #begin
    V('person378').out('rated').aggregate('watched').
    out('genre').groupCount().order(local).by(values, decr).unfold().limit(1).
    select(keys).in_('genre').has('rating', gt(8.0)).has('numvotes',gt(50000)).
    where(without('watched')).order().by('rating', decr).
    limit(10).values('title').
    #end
    toList())

for result in results_03_01:
    print(result)
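
To make the shape of the traversal easier to follow, here is a minimal plain-Python sketch of the same logic. It does not talk to Neptune. The `movies` dictionary, the `watched` list, and the `recommend` function are invented for illustration, and the titles and numbers are placeholders rather than values from the dataset.

```python
from collections import Counter

# Invented toy data; titles, genres, and numbers are placeholders.
movies = {
    "m1": {"title": "Movie One", "genres": ["Drama"], "rating": 8.5, "numvotes": 90000},
    "m2": {"title": "Movie Two", "genres": ["Drama", "Crime"], "rating": 8.9, "numvotes": 120000},
    "m3": {"title": "Movie Three", "genres": ["Comedy"], "rating": 8.7, "numvotes": 70000},
    "m4": {"title": "Movie Four", "genres": ["Drama"], "rating": 7.2, "numvotes": 200000},
    "m5": {"title": "Movie Five", "genres": ["Drama"], "rating": 8.2, "numvotes": 60000},
}
watched = ["m1", "m4"]  # stands in for out('rated').aggregate('watched')

def recommend(watched, movies, limit=10):
    # out('genre').groupCount(): count genres across the watched movies
    genre_counts = Counter(genre for m_id in watched for genre in movies[m_id]["genres"])
    # order(local).by(values, decr).unfold().limit(1).select(keys): top genre
    favourite = genre_counts.most_common(1)[0][0]
    # in_('genre'), the two has() filters, and where(without('watched'))
    candidates = [
        movie for m_id, movie in movies.items()
        if favourite in movie["genres"]
        and movie["rating"] > 8.0
        and movie["numvotes"] > 50000
        and m_id not in watched
    ]
    # order().by('rating', decr).limit(10).values('title')
    candidates.sort(key=lambda movie: movie["rating"], reverse=True)
    return [movie["title"] for movie in candidates[:limit]]

recommendations = recommend(watched, movies)
print(recommendations)
```

Each comment names the Gremlin step that the Python line approximates, so the two versions can be read side by side.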



### 03.02: Bacon number
Find out how Jack Nicholson is related to Kevin Bacon.

This query introduces the `repeat().until()` step, a useful construct for
writing multi-hop queries with a terminating condition.

Consult the following documentation:
 - [`repeat().until()`](http://tinkerpop.apache.org/docs/3.3.2/reference/#repeat-step)
 - [`path()`](http://tinkerpop.apache.org/docs/3.3.2/reference/#path-step)
 - [Additional reference](http://kelvinlawrence.net/book/Gremlin-Graph-Guide.html#sp)


In [None]:
%%time
results_03_02 = (g.
    #begin
    V().has('name','Jack Nicholson').
    repeat(__.in_('actor').out('actor').simplePath()).until(has('name','Kevin Bacon')).
    path().by('name').by('title').limit(10).
    #end
    toList())

for result in results_03_02:
    print(result)
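
One way to think about `repeat(...).until(...)` combined with `simplePath()` is as a level-by-level expansion that stops a path once it reaches the target and discards any path that revisits an element. The plain-Python sketch below models that idea on an invented toy graph. The `cast` dictionary and `co_star_paths` function are hypothetical, the names are placeholders, and the order in which a real Gremlin engine explores paths may differ.

```python
from collections import deque

# Invented toy graph: movie title -> cast list. All names are placeholders.
cast = {
    "Film X": ["Actor A", "Actor B"],
    "Film Y": ["Actor B", "Actor C"],
    "Film Z": ["Actor C", "Actor D"],
}

def co_star_paths(start, target, cast, limit=10):
    """Expand actor -> movie -> actor hops until the target actor is reached."""
    results = []
    queue = deque([[start]])  # each queue entry is the path walked so far
    while queue and len(results) < limit:
        path = queue.popleft()
        # until(has('name', target)): checked only after at least one hop
        if len(path) > 1 and path[-1] == target:
            results.append(path)
            continue
        for title, actors in cast.items():
            # in_('actor'): movies featuring the current actor, not yet on the path
            if path[-1] not in actors or title in path:
                continue
            for actor in actors:
                # out('actor') plus simplePath(): skip actors already on the path
                if actor not in path:
                    queue.append(path + [title, actor])
    return results

paths = co_star_paths("Actor A", "Actor D", cast)
for path in paths:
    print(path)
```

The printed path alternates actor, movie, actor, which mirrors how `path().by('name').by('title')` applies its two `by()` modulators in turn.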



### 03.03: Average rating

What is the average rating of movies directed by Steven Spielberg?

Consult the following documentation:
 - [`mean()`](http://tinkerpop.apache.org/docs/3.3.2/reference/#mean-step)


In [None]:
%%time
results_03_03 = (g.
    #begin
    V().hasLabel('Artist').has('name', 'Steven Spielberg').
    in_('director').values('rating').mean().
    #end
    toList())

for result in results_03_03:
    print(result)
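
As a rough plain-Python analogue, `mean()` reduces the incoming rating values to a single number, and `toList()` then returns that number inside a one-element list. The ratings below are invented placeholders, not values from the dataset.

```python
from statistics import mean

# Invented ratings standing in for values('rating') on the director's movies.
ratings = [7.5, 8.0, 8.5]

# mean() emits a single value; toList() wraps it in a list.
average_rating = [mean(ratings)]

for value in average_rating:
    print(value)
```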



### 03.04

What is the most popular genre among the movies that Martin Scorsese directed and in which Leonardo DiCaprio acted?


In [None]:
%%time
results_03_04 = (g.
    #begin
    V().hasLabel('Artist').has('name', 'Leonardo DiCaprio').as_('actor').
    in_('actor').as_('movies').out('director').has('name', 'Martin Scorsese').
    select('movies').out('genre').groupCount().order(local).by(values, decr).
    #end
    toList())

for result in results_03_04:
    print(result)
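
The traversal labels the movies with `as_('movies')`, filters on the director, then jumps back to the labelled movies with `select('movies')` before counting genres. A minimal plain-Python sketch of that filter-then-count pattern follows. The `movies` records and the `genre_counts` function are invented for illustration, and all names are placeholders.

```python
from collections import Counter

# Invented toy records; all values are placeholders.
movies = [
    {"title": "Film 1", "actors": ["Actor A"], "directors": ["Director D"], "genres": ["Crime", "Drama"]},
    {"title": "Film 2", "actors": ["Actor A"], "directors": ["Director D"], "genres": ["Drama"]},
    {"title": "Film 3", "actors": ["Actor A"], "directors": ["Director E"], "genres": ["Comedy"]},
    {"title": "Film 4", "actors": ["Actor B"], "directors": ["Director D"], "genres": ["Drama"]},
]

def genre_counts(movies, actor, director):
    # Keep movies with the given actor AND director (the as_/select round trip).
    matching = [m for m in movies if actor in m["actors"] and director in m["directors"]]
    # out('genre').groupCount()
    counts = Counter(genre for m in matching for genre in m["genres"])
    # order(local).by(values, decr): a single map sorted by count, highest first
    return dict(counts.most_common())

ranked = genre_counts(movies, "Actor A", "Director D")
print(ranked)
```

The result is one map containing every genre, ordered by count. As in exercise 03.01, appending `unfold().limit(1)` to the Gremlin traversal would keep only the top entry.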



### 03.05

List the 10 movies that have received the highest number of ratings.

Consult the following documentation:
 - [`project()`](http://tinkerpop.apache.org/docs/3.3.2/reference/#project-step)


In [None]:
%%time
results_03_05 = (g.
    #begin
    V().hasLabel('Movie').
    project('movie', 'numratings').by('title').by('numvotes').
    order().by(select('numratings'), decr).limit(10).
    #end
    toList())

for result in results_03_05:
    print(result)



### 03.06

List the 10 most social people, that is, the people who know the largest number of other people.


In [None]:
%%time
results_03_06 = (g.
    #begin
    V().hasLabel('Person').project('firstName', 'lastName', 'numknows').
    by('firstName').by('lastName').by(__.out('knows').count()).
    order().by(select('numknows'), decr).limit(10).
    #end
    toList())

for result in results_03_06:
    print(result)
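
The `project()` step builds one small map per vertex, with each `by()` modulator filling the key in the same position, and `order().by(select('numknows'), decr)` then sorts those maps by one of their entries. The plain-Python sketch below mirrors that pattern. The `people` records and the `most_social` function are invented for illustration, with placeholder names.

```python
# Invented toy data; names and 'knows' lists are placeholders.
people = [
    {"firstName": "Ann", "lastName": "Lee", "knows": ["p2", "p3", "p4"]},
    {"firstName": "Bob", "lastName": "Ray", "knows": ["p1"]},
    {"firstName": "Cy", "lastName": "Moss", "knows": ["p1", "p2"]},
    {"firstName": "Di", "lastName": "Voss", "knows": []},
]

def most_social(people, limit=10):
    # project('firstName', 'lastName', 'numknows') with one by() per key
    rows = [
        {
            "firstName": p["firstName"],
            "lastName": p["lastName"],
            "numknows": len(p["knows"]),  # by(__.out('knows').count())
        }
        for p in people
    ]
    # order().by(select('numknows'), decr).limit(10)
    rows.sort(key=lambda row: row["numknows"], reverse=True)
    return rows[:limit]

top = most_social(people, limit=2)
for row in top:
    print(row)
```

Exercise 03.05 follows the same project-then-order pattern, reading the stored `numvotes` property instead of counting edges.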

