# DevAx: Introduction to Graph Modeling


## Use case 1

**Which companies has X worked for, and in what roles?**
![Use case 1](usecase1.png)

Let us create some sample data first.

### Dataset
![Use case 1 dataset](usecase1-dataset.png)


In [None]:
%%gremlin
g.
   addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').
   addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').
   addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').
   addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').
   addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').
   addV('Company').property(id,'c-1').property('name','Example Corp').
   addV('Company').property(id,'c-2').property('name','AnyCompany').
   V('p-1').addE('WORKED_FOR').to(V('c-1')).property('role','Principal Analyst').                         
   V('p-2').addE('WORKED_FOR').to(V('c-1')).property('role','Senior Analyst').                           
   V('p-3').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').
   V('p-4').addE('WORKED_FOR').to(V('c-1')).property('role','Analyst').                           
   V('p-5').addE('WORKED_FOR').to(V('c-2')).property('role','Manager').
   V('p-3').addE('WORKED_FOR').to(V('c-2')).property('role','Associate Analyst')

### Querying the data

#### Query 1 – Which companies has Li worked for, and in what roles?

To answer this question, we'll have to perform the following steps:

    1. Start at the Person vertex representing Li
    2. Follow WORKED_FOR edges to find each Company for whom Li has worked
    3. Select the Company details, and the role property of the relationship

In [None]:
%%gremlin -p v,oute,inv
g.V('p-3').
    outE('WORKED_FOR').
    inV().
    path().
    by('firstName').by('role').by('name')
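
To make the three steps concrete, here is a minimal plain-Python sketch of the same walk. It is not Gremlin or the Neptune API: the `people`, `companies`, and `worked_for` structures and the `employment_paths` helper are hypothetical stand-ins that simply copy the sample data above.

```python
# Plain-Python stand-in for the use case 1 graph (hypothetical, not the Neptune API).
people = {
    "p-1": {"firstName": "Martha", "lastName": "Rivera"},
    "p-2": {"firstName": "Richard", "lastName": "Roe"},
    "p-3": {"firstName": "Li", "lastName": "Juan"},
    "p-4": {"firstName": "John", "lastName": "Stiles"},
    "p-5": {"firstName": "Saanvi", "lastName": "Sarkar"},
}
companies = {"c-1": {"name": "Example Corp"}, "c-2": {"name": "AnyCompany"}}

# Each WORKED_FOR edge: (person id, company id, role property on the edge).
worked_for = [
    ("p-1", "c-1", "Principal Analyst"),
    ("p-2", "c-1", "Senior Analyst"),
    ("p-3", "c-1", "Analyst"),
    ("p-4", "c-1", "Analyst"),
    ("p-5", "c-2", "Manager"),
    ("p-3", "c-2", "Associate Analyst"),
]


def employment_paths(person_id):
    """Return (firstName, role, company name) for each WORKED_FOR edge leaving the person."""
    first_name = people[person_id]["firstName"]
    return [
        (first_name, role, companies[company_id]["name"])
        for source_id, company_id, role in worked_for
        if source_id == person_id
    ]


li_paths = employment_paths("p-3")
print(li_paths)
```

Each tuple corresponds to one path in the Gremlin result: the vertex Li, the edge's `role`, and the company's `name`.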

## Use case 2

**Who worked for company X, and at which locations, between Y1 and Y2?**
![Use case 2](usecase2.png)

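Let us first drop the graph we created for use case 1 and create some revised sample data. Dropping a vertex also removes its incident edges, so dropping every Person and Company vertex clears the graph.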

In [None]:
%%gremlin
g.V().hasLabel('Person').drop()

In [None]:
%%gremlin
g.V().hasLabel('Company').drop()

### Dataset
![Use case 2 dataset](usecase2-dataset.png)

In [None]:
%%gremlin
g.
   addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').
   addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').
   addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').
   addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').
   addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').
   addV('Company').property(id,'c-1').property('name','Example Corp').
   addV('Company').property(id,'c-2').property('name','AnyCompany').
   addV('Location').property(id,'l-1').property('name','HQ').property('address','100 Main St, Anytown').
   addV('Location').property(id,'l-2').property('name','Offices').property('address','Downtown, Anytown').
   addV('Location').property(id,'l-3').property('name','Exchange').property('address','50 High St, Anytown').
   addV('Job').property(id,'j-1').property('from',datetime('2010-10-20T00:00:00')).property('to',datetime('2017-11-01T00:00:00')).
    property('role','Principal Analyst').
   addV('Job').property(id,'j-2').property('from',datetime('2011-02-16T00:00:00')).property('to',datetime('2013-09-17T00:00:00')).
    property('role','Senior Analyst').
   addV('Job').property(id,'j-3').property('from',datetime('2013-11-21T00:00:00')).property('to',datetime('2016-03-23T00:00:00')).
    property('role','Analyst').
   addV('Job').property(id,'j-4').property('from',datetime('2015-02-02T00:00:00')).property('to',datetime('2018-02-08T00:00:00')).
    property('role','Analyst').
   addV('Job').property(id,'j-5').property('from',datetime('2011-07-15T00:00:00')).property('to',datetime('2017-10-14T00:00:00')).
    property('role','Manager').
   addV('Job').property(id,'j-6').property('from',datetime('2012-03-23T00:00:00')).property('to',datetime('2013-11-01T00:00:00')).
    property('role','Associate Analyst').
   V('c-1').addE('LOCATION').to(V('l-1')).
   V('c-1').addE('LOCATION').to(V('l-2')).
   V('c-2').addE('LOCATION').to(V('l-3')). 
   V('p-1').addE('JOB').to(V('j-1')).
   V('j-1').addE('COMPANY').to(V('c-1')).
   V('j-1').addE('LOCATION').to(V('l-1')).                            
   V('p-2').addE('JOB').to(V('j-2')).
   V('j-2').addE('COMPANY').to(V('c-1')).
   V('j-2').addE('LOCATION').to(V('l-2')).                            
   V('p-3').addE('JOB').to(V('j-3')).
   V('j-3').addE('COMPANY').to(V('c-1')).
   V('j-3').addE('LOCATION').to(V('l-1')).
   V('p-4').addE('JOB').to(V('j-4')).
   V('j-4').addE('COMPANY').to(V('c-1')).
   V('j-4').addE('LOCATION').to(V('l-2')).                              
   V('p-5').addE('JOB').to(V('j-5')).
   V('j-5').addE('COMPANY').to(V('c-2')).
   V('j-5').addE('LOCATION').to(V('l-3')).
   V('p-3').addE('JOB').to(V('j-6')).
   V('j-6').addE('COMPANY').to(V('c-2')).
   V('j-6').addE('LOCATION').to(V('l-3'))

### Querying the data

#### Query 2 – Who worked for Example Corp, and at which locations, between 2015 and 2017?

To answer this question, we'll have to perform the following steps:

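    1. Start at the Company vertex representing Example Corp
    2. Follow incoming COMPANY edges to the Job vertices
    3. Keep the Jobs whose from or to date falls within the period
    4. Traverse to the Person and Location vertices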

In [None]:
%%gremlin -p v,inv,e
g.V('c-1').
    in('COMPANY').
    or(
        has('from', between(datetime('2015-01-01'),datetime('2018-01-01'))),
        has('to', between(datetime('2015-01-01'),datetime('2018-01-01')))
    ).
    bothE().
    otherV().
    not(cyclicPath()).
    path()
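
The date filter in the query above can be sketched in plain Python. This is only an illustration under stated assumptions: the `between` helper and the `jobs` dictionary are hypothetical stand-ins, with dates copied from the Example Corp jobs in the dataset. Gremlin's `between(a, b)` includes `a` and excludes `b`, which the helper mirrors.

```python
from datetime import datetime


def between(value, start, end):
    """Mirror Gremlin's between(): start is inclusive, end is exclusive."""
    return start <= value < end


# (from, to) dates of the Job vertices linked to Example Corp (c-1) in the dataset.
jobs = {
    "j-1": (datetime(2010, 10, 20), datetime(2017, 11, 1)),
    "j-2": (datetime(2011, 2, 16), datetime(2013, 9, 17)),
    "j-3": (datetime(2013, 11, 21), datetime(2016, 3, 23)),
    "j-4": (datetime(2015, 2, 2), datetime(2018, 2, 8)),
}

window_start, window_end = datetime(2015, 1, 1), datetime(2018, 1, 1)

# Keep a job when either its start date or its end date falls inside the window.
matching = [
    job_id
    for job_id, (start, end) in jobs.items()
    if between(start, window_start, window_end) or between(end, window_start, window_end)
]
print(matching)
```

With this either-or test, a job that started before 2015 and ended in 2018 or later would satisfy neither branch. The sample data contains no such job for Example Corp, so the result here is unaffected.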

In [None]:
%%gremlin -p v,inv,e
g.V('c-1').
    in('COMPANY').
    or(
        has('from', between(datetime('2015-01-01'),datetime('2018-01-01'))),
        has('to', between(datetime('2015-01-01'),datetime('2018-01-01')))
    ).
    project('person','location','job').
    by(in('JOB').values('firstName','lastName').fold()).
    by(out('LOCATION').values('name','address').fold()).
    by('role')
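
The `project()` step above returns one map per Job, with the person and location values folded into lists. A rough plain-Python picture of that shape follows; the `job_role`, `job_person`, and `job_location` lookups and the `project_job` helper are hypothetical stand-ins for the `role` property and the JOB and LOCATION edges, filled in from the dataset.

```python
# Hypothetical in-memory stand-ins for the data around three Job vertices.
job_role = {"j-1": "Principal Analyst", "j-3": "Analyst", "j-4": "Analyst"}
job_person = {"j-1": ("Martha", "Rivera"), "j-3": ("Li", "Juan"), "j-4": ("John", "Stiles")}
job_location = {
    "j-1": ("HQ", "100 Main St, Anytown"),
    "j-3": ("HQ", "100 Main St, Anytown"),
    "j-4": ("Offices", "Downtown, Anytown"),
}


def project_job(job_id):
    """Build one result map per Job, with list values for person and location."""
    return {
        "person": list(job_person[job_id]),      # in('JOB').values('firstName','lastName').fold()
        "location": list(job_location[job_id]),  # out('LOCATION').values('name','address').fold()
        "job": job_role[job_id],                 # by('role')
    }


rows = [project_job(job_id) for job_id in ("j-1", "j-3", "j-4")]
for row in rows:
    print(row)
```

The order of rows returned by the actual traversal is not something this sketch tries to reproduce.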

## Use case 3

**Who were in senior roles at the companies where X worked?**
![Use case 3](usecase3.png)

Again, let us first drop the graph we created for use case 2 and create some revised sample data.

In [None]:
%%gremlin
g.V().hasLabel('Person').drop()

In [None]:
%%gremlin
g.V().hasLabel('Company').drop()

In [None]:
%%gremlin
g.V().hasLabel('Location').drop()

In [None]:
%%gremlin
g.V().hasLabel('Job').drop()

### Dataset
![Use case 3 dataset](usecase3-dataset.png)

In [None]:
%%gremlin
g.
   addV('Person').property(id,'p-1').property('firstName','Martha').property('lastName','Rivera').
   addV('Person').property(id,'p-2').property('firstName','Richard').property('lastName','Roe').
   addV('Person').property(id,'p-3').property('firstName','Li').property('lastName','Juan').
   addV('Person').property(id,'p-4').property('firstName','John').property('lastName','Stiles').
   addV('Person').property(id,'p-5').property('firstName','Saanvi').property('lastName','Sarkar').
   addV('Role').property(id,'r-1').property('name','Analyst').
   addV('Role').property(id,'r-2').property('name','Senior Analyst').
   addV('Role').property(id,'r-3').property('name','Principal Analyst').
   addV('Role').property(id,'r-4').property('name','Associate Analyst').
   addV('Role').property(id,'r-5').property('name','Manager').
   addV('Company').property(id,'c-1').property('name','Example Corp').
   addV('Company').property(id,'c-2').property('name','AnyCompany').
   addV('Location').property(id,'l-1').property('name','HQ').property('address','100 Main St, Anytown').
   addV('Location').property(id,'l-2').property('name','Offices').property('address','Downtown, Anytown').
   addV('Location').property(id,'l-3').property('name','Exchange').property('address','50 High St, Anytown').
   addV('Job').property(id,'j-1').property('from',datetime('2010-10-20')).property('to',datetime('2017-11-01')).
   addV('Job').property(id,'j-2').property('from',datetime('2011-02-16')).property('to',datetime('2013-09-17')).
   addV('Job').property(id,'j-3').property('from',datetime('2013-11-21')).property('to',datetime('2016-03-23')).
   addV('Job').property(id,'j-4').property('from',datetime('2015-02-02')).property('to',datetime('2018-02-08')).
   addV('Job').property(id,'j-5').property('from',datetime('2011-07-15')).property('to',datetime('2017-10-14')).
   addV('Job').property(id,'j-6').property('from',datetime('2012-03-23')).property('to',datetime('2013-11-01')).
   V('r-1').addE('PARENT_ROLE').to(V('r-2')).
   V('r-2').addE('PARENT_ROLE').to(V('r-3')).
   V('r-4').addE('PARENT_ROLE').to(V('r-5')).
   V('c-1').addE('LOCATION').to(V('l-1')).
   V('c-1').addE('LOCATION').to(V('l-2')).
   V('c-2').addE('LOCATION').to(V('l-3')). 
   V('p-1').addE('JOB').to(V('j-1')).
   V('j-1').addE('ROLE').to(V('r-3')).
   V('j-1').addE('COMPANY').to(V('c-1')).
   V('j-1').addE('LOCATION').to(V('l-1')).                            
   V('p-2').addE('JOB').to(V('j-2')).
   V('j-2').addE('ROLE').to(V('r-2')).
   V('j-2').addE('COMPANY').to(V('c-1')).
   V('j-2').addE('LOCATION').to(V('l-2')).                            
   V('p-3').addE('JOB').to(V('j-3')).
   V('j-3').addE('ROLE').to(V('r-1')).
   V('j-3').addE('COMPANY').to(V('c-1')).
   V('j-3').addE('LOCATION').to(V('l-1')).
   V('p-4').addE('JOB').to(V('j-4')).
   V('j-4').addE('ROLE').to(V('r-1')).
   V('j-4').addE('COMPANY').to(V('c-1')).
   V('j-4').addE('LOCATION').to(V('l-2')).                              
   V('p-5').addE('JOB').to(V('j-5')).
   V('j-5').addE('ROLE').to(V('r-5')).
   V('j-5').addE('COMPANY').to(V('c-2')).
   V('j-5').addE('LOCATION').to(V('l-3')).
   V('p-3').addE('JOB').to(V('j-6')).
   V('j-6').addE('ROLE').to(V('r-4')).
   V('j-6').addE('COMPANY').to(V('c-2')).
   V('j-6').addE('LOCATION').to(V('l-3'))

### Querying the data

#### Query 3 – Who were in senior roles at the companies where Li worked?

To answer this question, we'll have to perform the following steps:


    1. Start at the Person vertex representing Li
    2. Follow JOB and ROLE edges to Li's Roles
    3. Traverse up the Role hierarchy along PARENT_ROLE edges
    4. For each parent Role:
        - Get the associated Jobs
        - Keep the Jobs whose dates overlap Li's Job
        - Get Role and Person details for each Job

In [None]:
%%gremlin -p v,oute,inv
g.V('p-3').
    out('JOB').as('li').
    out('ROLE').
    repeat(out('PARENT_ROLE')).until(outE().count().is(0)).
    in('ROLE').as('supervisor').
    or(
        where('li', between('supervisor','supervisor')).by('from').by('from').by('to'),
        where('li', between('supervisor','supervisor')).by('to').by('from').by('to'),
        where('li', lte('supervisor').and(gt('supervisor'))).by('from').by('from').by('to').by('from')
    ).
    project('person','senior_role').
    by(in('JOB').values('firstName','lastName').fold()).
    by(out('ROLE').values('name').fold())
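
Two ideas in the query above are easier to see in isolation: climbing the Role hierarchy and testing whether two jobs overlap in time. The plain-Python sketch below illustrates both under stated assumptions. The `parent_role` mapping and the `top_role` and `overlaps` helpers are hypothetical, and `overlaps` is a generic interval-overlap test, not a line-by-line translation of the three `where()` clauses.

```python
from datetime import datetime

# PARENT_ROLE edges from the use case 3 dataset: child role -> parent role.
parent_role = {"r-1": "r-2", "r-2": "r-3", "r-4": "r-5"}


def top_role(role_id):
    """Follow PARENT_ROLE edges until a role with no parent is reached.

    Like repeat(...).until(...), this takes at least one step, so a role
    that has no parent yields None.
    """
    if role_id not in parent_role:
        return None
    current = parent_role[role_id]
    while current in parent_role:
        current = parent_role[current]
    return current


def overlaps(a_from, a_to, b_from, b_to):
    """True when two date ranges share some time (end dates treated as exclusive)."""
    return a_from < b_to and b_from < a_to


li_job = (datetime(2013, 11, 21), datetime(2016, 3, 23))      # j-3
martha_job = (datetime(2010, 10, 20), datetime(2017, 11, 1))  # j-1
richard_job = (datetime(2011, 2, 16), datetime(2013, 9, 17))  # j-2

print(top_role("r-1"))
print(overlaps(*li_job, *martha_job))
print(overlaps(*li_job, *richard_job))
```

Starting from Li's Analyst role (`r-1`), the climb ends at Principal Analyst (`r-3`), the role attached to Martha's job, whose dates overlap Li's job at Example Corp.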

Clean up.

In [None]:
%%gremlin
g.V().hasLabel('Person').drop()

In [None]:
%%gremlin
g.V().hasLabel('Company').drop()

In [None]:
%%gremlin
g.V().hasLabel('Job').drop()

In [None]:
%%gremlin
g.V().hasLabel('Location').drop()

In [None]:
%%gremlin
g.V().hasLabel('Role').drop()

## References

1. https://github.com/aws-samples/amazon-neptune-samples/tree/master/gremlin/property-graph-data-modelling