# Neo4j
______________________

## NoSQL

Relational databases, which date back to the 1970s and guarantee ACID properties for transactions, struggle to cope with the 3 Vs (Volume, Velocity, Variety). A new way of structuring, managing, storing and querying data emerged in response: NoSQL, an approach that relaxes some of the heavier constraints of the relational model.

#### The BASE properties characterize NoSQL databases:

1. Basically Available: whatever the load on the database (data/queries), the system guarantees a given level of data availability.

2. Soft-state: the state of the database may change during updates or when servers are added or removed. A NoSQL database does not have to be consistent at every instant.

3. Eventually consistent: over time, the database will reach a consistent state.

There are several families of NoSQL databases: key/value, column, document, and graph. Each family addresses very specific needs.


## Graph family

**Neo4j** : eBay, Cisco, UBS, HP, TomTom, The National Geographic Society

**OrientDB (Apache)** : Comcast, Warner Music Group, Cisco, Sky, United Nations, Verisign

**FlockDB (Twitter)** : Twitter

Examples of applications built on graph databases: social networks (recommendation, shortest path, clustering...), GIS networks (roads, power grid, freight...), social web (Linked Data).

# Neo4j 
Neo4j is an open-source graph database management system, developed in Java by the Swedish-American company **Neo Technology**. The product has existed since 2000.

By design, the database performs very well on connected data because Neo4j precomputes joins when the data is written (relationships are stored as first-class records), whereas relational databases compute joins at read time using indexes and key logic. This makes Neo4j well suited to large sets of connected data.

**Cypher** is the graph-oriented query language used by Neo4j (simpler than SQL for processing connected data).

***Login to Neo4j sandbox***: sandbox.neo4j.com

## Neo4j Editions

1. Community
2. Aura
3. Enterprise
    + Neo4j Bloom: an application that may be useful for product managers, support staff and other less technical users. It offers query suggestions and interprets and executes natural language queries.


## Model
- Neo4j is a graph-oriented database.
- Neo4j defines 2 types of objects in the graph: **nodes** and **relationships**.
- **Labels** (types) describe what a node represents and create groups/categories of nodes. They are used to filter searches in queries. Any number of labels can be added to a node with `SET` and removed with `REMOVE`. By convention, labels are written in CamelCase.
- **Properties** are the attributes, for example: name, title, uri or url.
- **Relationships** are directed associations between 2 nodes.
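
The model described above can be sketched in Cypher as follows. This is only an illustration: the `Person` and `Movie` labels and the `ACTED_IN` relationship type follow the examples used later in these notes, while the `Actor` label, the `roles` property and the property values are assumptions made for the example.

```Cypher
// Two nodes (each with a label and properties) joined by a directed relationship
create (p:Person {name: 'Tom Hanks'})-[:ACTED_IN {roles: ['Forrest']}]->(m:Movie {title: 'Forrest Gump'});

// Add a second label to an existing node with SET
match (p:Person {name: 'Tom Hanks'})
set p:Actor
return labels(p);

// Remove that label again with REMOVE
match (p:Person {name: 'Tom Hanks'})
remove p:Actor
return labels(p);
```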

### Useful commands

```Cypher
:clear

:help match

call db.schema.visualization()
```

```Cypher
// Return Movie nodes (limited to one result here)
match (m: Movie)
return distinct m
limit 1
```

```Cypher
// Return the property title of a movie
match (m: Movie)
return m.title
limit 1
```

```Cypher
// Return the titles of the movies in the Comedy genre
match (m:Movie) -[:IN_GENRE]->(g:Genre)
where g.name = 'Comedy'
return m.title
```

```Cypher
// We do not specify the relation or the direction
match (n1)--(n2) 
return n1, n2 
limit 3
```

```Cypher
// We do not specify the direction
match (n1)-[r]-(n2) 
return n1, r, n2 
limit 3
```

```Cypher
// We specify the direction
match (n1)-[r]->(n2) 
return n1, r, n2 
limit 3
```

```Cypher
// Multiple relationships
match (n1)-[r: DIRECTED | ACTED_IN]->(n2) 
return n1, r, n2 
limit 3
```

### Filter on properties

```Cypher
// Inline property map
match (p: Person {name: 'Tom Hanks', age: 26}) 
return p 
limit 1;

// or, with a WHERE clause

match (p: Person) 
where p.name = 'Tom Hanks' and p.age = 26 and p.address is not null
return p 
limit 1;
```

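```Cypher
// IN tests list membership: born in 1995 or in 1997 (not a range)
match (p: Person) 
where p.born in [1995, 1997]
return p 
limit 1;

// Born outside a range of years (the 1995 lower bound is an assumed value)
match (p: Person) 
where not (p.born >= 1995 and p.born <= 1997)
return p 
limit 1;
```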

## Regular expressions

Cypher supports filtering with regular expressions; the syntax is inherited from Java regular expressions.

You can match against a regular expression with `=~ 'regexp'`.

```Cypher
// Case sensitive
match (movie: Movie)
where movie.title =~ '.*The.*'
return toUpper(movie.title), toLower(movie.title)

// Case insensitive
match (movie: Movie)
where movie.title =~ '(?i).*The.*'
return trim(movie.title)
```
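
When a full regular expression is not needed, Cypher also provides the string predicates `STARTS WITH`, `ENDS WITH` and `CONTAINS`. A small sketch, reusing the `Movie` label and `title` property from the queries above (the search strings are arbitrary examples):

```Cypher
// Case-sensitive prefix test
match (movie: Movie)
where movie.title starts with 'The'
return movie.title;

// Case-insensitive substring test, by normalizing the case first
match (movie: Movie)
where toLower(movie.title) contains 'the'
return movie.title;
```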

### STRING, LIST, NULL, MATH

```Cypher
// replace(current_string, search, replace_by)
return replace('oh my god', 'o', 'er')
```

```Cypher
// List
with [2, 454, 34] as l
return l[0]
```

```Cypher
// Returns NULL
WITH ( NULL + [1,2,3] ) AS result
RETURN result
```

```Cypher
// Returns NULL
WITH ( NULL OR false ) AS result
RETURN result
```

```Cypher
match (person: Person)
with (['address1', 'address 2'] + [person.address]) as str
return str

match (person: Person)
with (['address1', 'address 2'] + [person.address]) as str
return [address in str where address is not null | address] as str
```

```Cypher
// Returns [1, NULL, 2, NULL] 
WITH ( [1, NULL] + [2, NULL] ) AS result
RETURN result
```

```Cypher
// Returns NULL
WITH ( NULL IN [1,NULL,2,3] ) AS result
RETURN result
```
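
Because `NULL` propagates through most expressions, it is often convenient to substitute a default value. One option is the built-in `coalesce()` function, which returns its first non-null argument. In this sketch the `address` property comes from the example above and the `'unknown'` default is an arbitrary choice:

```Cypher
// Returns 'unknown' because the first argument is NULL
return coalesce(NULL, 'unknown') as result;

// Fall back to a default when a person has no address
match (person: Person)
return person.name, coalesce(person.address, 'unknown') as address
limit 3;
```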

```Cypher
// Math
return ceil(2.3), floor(0.9), round(.59)
```


### LEFT OUTER JOIN

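```Cypher
// Kind of LEFT OUTER JOIN: every movie is returned, and director.name is NULL
// when nobody both directed and acted in the movie
match (movie: Movie)
optional match (director: Person)-[:DIRECTED]->(movie)<-[:ACTED_IN]-(director)
return movie.title, director.name
```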

```Cypher
// (A:Person)-[:HAS_CONTACT]->(B:Person)-[:HAS_CONTACT]->(C:Person)
match (p1:Person)-[:HAS_CONTACT]->(p2:Person)-[:HAS_CONTACT]->(p3:Person)
where p1 <> p3
return p1, p2, p3
limit 5
```

### ORDER BY

```Cypher
match (actor: Person)-[role: ACTED_IN]->(movie: Movie)
where movie.title = 'Top Gun'
return actor.name as name, role.earnings as earnt
order by role.earnings DESC
skip 3 // skip the first 3
limit 3
```

### Aggregation

```Cypher
match (p: Person {name: 'Tom Cruise'})-[role: ACTED_IN]->(m:Movie)
return count(m) as moviecount
```

```Cypher
// Actor with the highest average earnings per movie
match (actor: Person)-[role:ACTED_IN]->(movie: Movie)
return actor.name as name, avg(role.earnings) as earnings
order by earnings DESC
limit 1
```

```Cypher
// Number of jobs in group G2, stored on the group node
match (j: Job)-[:Belongs_to]->(g: Group)
with g, count(j) as count_job
where g.name = 'G2'
set g.nb_jobs = count_job
return g;

// Number of distinct refs per group, stored on each group node
match (j: Job)-[:Belongs_to]->(g: Group) 
with g, count(distinct j.ref) as nb_dq
set g.nb_dq = nb_dq
return g;
```

## Create database

```Cypher
// Create a database
create database name
```

```Cypher
// Show database
show databases
```

```Cypher
// Drop database
drop database name
```

```Cypher
// Show the roles defined on the server
show all roles
```

## Create Nodes


```Cypher
// Delete all nodes
match (node)
detach delete node
```

```Cypher
// Create nodes
create (:Group{name: 'G1', measure: 'minute', nb_jobs: 0});
create (:Group{name: 'G2', measure:'minute', nb_jobs: 0});

create (:Job{ref: '1000', type: 'dataset', timing: 5, measure: 'minute', dependancy: True});
create (:Job{ref: '1000', type: 'dq', timing: 4, measure: 'minute', dependancy: True});

create (:Job{ref: '3000', type: 'dq', timing: 3, measure: 'minute', dependancy: False});

create (:Job{ref: '500', type: 'dq', timing: 5.8, measure: 'minute', dependancy: False});
create (:Job{ref: '500', type: 'dataset', timing: 3.8, measure: 'minute', dependancy: True});
```

```Cypher 
// Create a node that has a relation with itself
create (job:Job{ref: '10', type: 'dq', timing: 3, measure: 'minute', dependancy: False}) 
                -[:shared{periode: 1}]
                ->(job)
```

### Create relationships

```Cypher
match (job: Job), (group: Group)
where job.ref in ['3000', '500'] and group.name ='G2'
create (job)-[:Belongs_to]->(group);

// If the relation already exists, use MERGE, otherwise it will be added twice
match (job: Job), (group: Group)
where job.ref in ['3000', '500'] and group.name ='G2'
merge (job)-[r:Belongs_to]->(group)
set r.name = 'dependancy'
return job, group, r
```
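
`MERGE` can also apply different updates depending on whether the pattern was created or already existed, through its `ON CREATE SET` and `ON MATCH SET` sub-clauses. A minimal sketch on the same jobs and group; the `created_at` and `last_seen` property names are hypothetical:

```Cypher
match (job: Job), (group: Group)
where job.ref in ['3000', '500'] and group.name = 'G2'
merge (job)-[r:Belongs_to]->(group)
on create set r.created_at = timestamp()  // the relationship did not exist yet
on match set r.last_seen = timestamp()    // the relationship was already there
return job, group, r
```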

```Cypher
match (job: Job), (group: Group)
where job.ref = '500' and job.type = 'dataset' and group.name ='G1'
create (job)-[:Belongs_to]->(group);

match (job: Job), (group: Group)
where job.ref = '1000' and group.name ='G1'
merge (job)-[:Belongs_to]->(group);
```

```Cypher
match (job1: Job), (job2: Job)
where job1.ref = job2.dq_ref and job1.type <> job2.type and job1.type IN ['dataset', 'lookup'] and job2.type = 'dq'
merge (job1)-[:Linked]->(job2)
```

## Update, add, remove a property

```Cypher
// Update a property
match (job: Job)
where job.dq_ref = '5014'
set job.dependancy = True
return job
```

```Cypher
// Add a new property (the initial value 0 is an arbitrary example)
match (group: Group)
set group.nb_dq = 0
return group
```

```Cypher
// Remove a property
match (job: Job)
remove job.drop
return job
```


### Delete Graph

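```Cypher
// Method 1: delete connected nodes with their relationships, then the isolated nodes
match (node)-[relation]-()
delete node, relation;

match (node_without_relation)
delete node_without_relation;

// Method 2: a single query, the relationships are optional
match (node)
optional match (node)-[relation]-()
delete node, relation;

// Method 3: DETACH DELETE removes each node together with its relationships
match (node)
detach delete node;
```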

### Degree, path, shortest path

```Cypher
// Degree: paths of exactly 3 hops starting from a person
match path=((p:Person)-[:HAS_CONTACT*3]->(other))
return path
limit 1
```

```Cypher
// Path length
match (p1: Person{name: 'Dehia'})
match (p2: Person{name: 'Celia'})
match path=((p1)-[:HAS_CONTACT*]->(p2))
return length(path) as path_length
limit 1
```

```Cypher
// Shortest path (at most 20 hops)
match (p1: Person{name: 'Dehia'})
match (p2: Person{name: 'Celia'})
match path = shortestPath((p1)-[:HAS_CONTACT*..20]->(p2))
return path, length(path) as path_length
```

```Cypher
// All shortest paths (at most 20 hops)
match (p1: Person{name: 'Dehia'})
match (p2: Person{name: 'Celia'})
match path = allShortestPaths((p1)-[:HAS_CONTACT*..20]->(p2))
return path, length(path) as path_length
```

