
Commit

Make sure results are not empty in the Cypher tutorial
* Consistently use one-line headings for the GraphGist format.
* Add testing around headings.
* Add data so that no queries are performed on empty graphs.
* Add new example for WITH.
nawroth committed Oct 2, 2015
1 parent 195baee commit 86ad848
Showing 11 changed files with 110 additions and 64 deletions.
@@ -1,5 +1,4 @@
Importing CSV files with Cypher
===============================
= Importing CSV files with Cypher

//file:movies.csv
//file:roles.csv
@@ -1,5 +1,20 @@
= How to Compose Large Statements

Let's first add some data that we can retrieve results from:

[source,cypher]
----
CREATE (matrix:Movie {title:"The Matrix",released:1997})
CREATE (cloudAtlas:Movie {title:"Cloud Atlas",released:2012})
CREATE (forrestGump:Movie {title:"Forrest Gump",released:1994})
CREATE (keanu:Person {name:"Keanu Reeves", born:1964})
CREATE (robert:Person {name:"Robert Zemeckis", born:1951})
CREATE (tom:Person {name:"Tom Hanks", born:1956})
CREATE (tom)-[:ACTED_IN {roles:["Forrest"]}]->(forrestGump)
CREATE (tom)-[:ACTED_IN {roles:['Zachry']} ]->(cloudAtlas)
CREATE (robert)-[:DIRECTED]->(forrestGump)
----

== Combine statements with UNION

A Cypher statement is usually quite compact.
@@ -11,11 +26,11 @@ For instance if you want to list both actors and directors without using the alt

[source,cypher]
----
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
RETURN p,type(r) as rel,m
MATCH (actor:Person)-[r:ACTED_IN]->(movie:Movie)
RETURN actor.name AS name, type(r) AS acted_in, movie.title AS title
UNION
MATCH (p:Person)-[r:DIRECTED]->(m:Movie)
RETURN p,type(r) as rel,m
MATCH (director:Person)-[r:DIRECTED]->(movie:Movie)
RETURN director.name AS name, type(r) AS acted_in, movie.title AS title
----

//table
@@ -31,14 +46,23 @@ You use the `WITH` clause to combine the individual parts and declare which data
`WITH` is very much like `RETURN` with the difference that it doesn't finish a query but prepares the input for the next part.
You can use the same expressions, aggregations, ordering and pagination as in the `RETURN` clause.

The only difference is that you _have to_ alias all columns as they would otherwise not be accessible with an identifier.
Every column that you don't declare in your `WITH` clause is not available in subsequent query parts.
The only difference is that you _must_ alias all columns as they would otherwise not be accessible.
Only columns that you declare in your `WITH` clause are available in subsequent query parts.

See below for an example where we collect the movies someone appeared in, and then filter out those who appeared in only one movie.

[source,cypher]
----
MATCH (person:Person)-[:ACTED_IN]->(m:Movie)
WITH person, count(*) as appearances, collect(m.title) as movies
WHERE appearances > 1
RETURN person.name, appearances, movies
----

//table

[TIP]
If you want to filter by an aggregated value in SQL or simlilar languages you would have to use `HAVING`.
If you want to filter by an aggregated value in SQL or similar languages you would have to use `HAVING`.
That's a single purpose clause for filtering aggregated information.
In Cypher, `WHERE` can be used in both cases.
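The aggregate-then-filter step that `WITH ... WHERE` (or SQL's `HAVING`) performs can be sketched outside Cypher. A minimal Python analogy, using made-up in-memory rows rather than the movie graph:

```python
from collections import defaultdict

# Aggregate first, filter second -- the step Cypher handles with WITH ... WHERE.
# Hypothetical rows standing in for ACTED_IN relationships.
acted_in = [
    ("Tom Hanks", "Forrest Gump"),
    ("Tom Hanks", "Cloud Atlas"),
    ("Keanu Reeves", "The Matrix"),
]

movies_by_person = defaultdict(list)
for name, title in acted_in:
    movies_by_person[name].append(title)  # like collect(m.title)

# WHERE appearances > 1, applied after the aggregation -- HAVING in SQL terms
result = {name: titles for name, titles in movies_by_person.items()
          if len(titles) > 1}
```

The point of the analogy is the ordering: the predicate runs on the aggregated value, not on the raw rows.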

// example to go here


@@ -1,16 +1,43 @@
= Utilizing Data Structures

//file:movies.csv
//file:roles.csv
//file:persons.csv
//file:movie_actor_roles.csv

Cypher can create and consume more complex data structures out of the box.
As already mentioned you can create literal lists (`[1,2,3]`) and maps (`{name: value}`) within a statement.

There is a number of functions that work with lists, from simple ones like `length(list)` that returns the size of a list to
There are a number of functions that work with lists.
They range from simple ones like `size(list)` that returns the size of a list to `reduce`, which runs an expression against the elements and accumulates the results.
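As a rough analogy in Python (illustrative only, not Cypher syntax), `size(list)` corresponds to `len()` and Cypher's `reduce` to `functools.reduce` with an explicit initial accumulator:

```python
from functools import reduce

# Python counterparts of the list functions above (analogy, not the Cypher API):
# size(list) ~ len(list); reduce(acc = 0, x IN list | acc + x) ~ functools.reduce.
born = [1964, 1951, 1956]

count = len(born)                                # size(list)
total = reduce(lambda acc, x: acc + x, born, 0)  # reduce(...)
```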

// missing content here
Let's first load a bit of data into the graph.
If you want more details on how the data is loaded, see <<cypher-intro-importing-csv>>.

[source,cypher]
----
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title as movie, collect(a.name)[0..5] as five_of_cast
LOAD CSV WITH HEADERS FROM "movies.csv" AS line
CREATE (m:Movie {id:line.id,title:line.title, released:toInt(line.year)});
LOAD CSV WITH HEADERS FROM "persons.csv" AS line
MERGE (a:Person {id:line.id}) ON CREATE SET a.name=line.name;
LOAD CSV WITH HEADERS FROM "roles.csv" AS line
MATCH (m:Movie {id:line.movieId})
MATCH (a:Person {id:line.personId})
CREATE (a)-[:ACTED_IN {roles:[line.role]}]->(m);
LOAD CSV WITH HEADERS FROM "movie_actor_roles.csv" AS line FIELDTERMINATOR ";"
MERGE (m:Movie {title:line.title}) ON CREATE SET m.released = toInt(line.released)
MERGE (a:Person {name:line.actor}) ON CREATE SET a.born = toInt(line.born)
MERGE (a)-[:ACTED_IN {roles:split(line.characters,",") }]->(m)
----

Now, let's try out data structures.

To begin with, collect the names of the actors per movie, and return two of them:

[source,cypher]
----
MATCH (movie:Movie)<-[:ACTED_IN]-(actor:Person)
RETURN movie.title as movie, collect(actor.name)[0..2] as two_of_cast
----

//table
@@ -26,9 +53,8 @@ There are list predicates to satisfy conditions for `all`, `any`, `none` and `si
[source,cypher]
----
MATCH path = (:Person)-->(:Movie)<--(:Person)
WHERE all(r in rels(path) WHERE type(r) = 'ACTED_IN')
AND any(n in nodes(path) WHERE n.name = 'Clint Eastwood')
RETURN path
WHERE any(n in nodes(path) WHERE n.name = 'Michael Douglas')
RETURN extract(n IN nodes(path)| coalesce(n.name, n.title))
----

//table
@@ -58,45 +84,30 @@ In a graph-query you can filter or aggregate collected values instead or work on
----
MATCH (m:Movie)<-[r:ACTED_IN]-(a:Person)
WITH m.title as movie, collect({name: a.name, roles: r.roles}) as cast
RETURN movie, extract(c2 IN filter(c1 IN cast WHERE c1.name =~ "T.*") | c2.roles )
----

//table

Cypher offers to create and consume more complex data structures out of the box.
As already mentioned you can create literal lists (`[1,2,3]`) and maps (`{name: value}`) within your statement.

There is a number of functions to work with lists, from simple ones like `length(list)` that returns the size of a list to

[source,cypher]
----
MATCH (m:Movie)<-[:ACTED_IN]-(a:Person)
RETURN m.title as movie, collect(a.name)[0..5] as five_of_cast
RETURN movie, filter(actor IN cast WHERE actor.name STARTS WITH "M")
----

//table

You can also access individual elements or slices of a list quickly with `list[1]` or `list[5..-5]`.
Other functions to access parts of a list are `head(list)`, `tail(list)` and `last(list)`.
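These accessors map closely onto Python indexing and slicing; a sketch of the correspondence (an analogy, not the Cypher syntax itself):

```python
# Python analogy for Cypher list access: list[1] indexes, list[1..-1] slices
# from both ends, and head/tail/last map onto [0], [1:], and [-1].
cast = ["Tom Hanks", "Halle Berry", "Hugo Weaving", "Jim Broadbent"]

second = cast[1]         # list[1] -- zero-based in both languages
trimmed = cast[1:-1]     # roughly list[1..-1]
head, tail, last = cast[0], cast[1:], cast[-1]   # head(list), tail(list), last(list)
```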

== Unwind Lists

Sometimes you have collected information into a list, but want to use each element individually as a row.
For instance, you might want to further match patterns in the graph.
Or you passed in a collection of values but now want to create or match a node or relationship for each element.
Then you can use the `UNWIND` clause to unroll a list into a sequence of rows again.
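The unrolling itself is easy to picture in Python (hypothetical rows, just to show the shape of the operation):

```python
# UNWIND sketched in Python: turn a collected list back into one row per element.
rows = [("Tom Hanks", ["Forrest Gump", "Cloud Atlas"]),
        ("Keanu Reeves", ["The Matrix"])]

# one (actor, movie) row per list element, like UNWIND movies AS m
unwound = [(name, movie) for name, movies in rows for movie in movies]
```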

For instance, a query to find the top 5-co-actors and then follow their movies and again list the cast for each of those movies:
For instance, a query to find the top 3 co-actors and then follow their movies and again list the cast for each of those movies:

[source,cypher]
----
MATCH (a:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(colleague:Person)
WITH colleague, count(*) as frequency, collect(distinct m) as movies
MATCH (actor:Person)-[:ACTED_IN]->(movie:Movie)<-[:ACTED_IN]-(colleague:Person)
WHERE actor.name < colleague.name
WITH actor, colleague, count(*) AS frequency, collect(movie) AS movies
ORDER BY frequency DESC
LIMIT 5
UNWIND movies as m
LIMIT 3
UNWIND movies AS m
MATCH (m)<-[:ACTED_IN]-(a)
RETURN m.title as movie, collect(a.name) as cast
RETURN m.title AS movie, collect(a.name) AS cast
----

//table
@@ -11,12 +11,13 @@ Naturally in most cases you wouldn't want to write or generate huge statements t

That process not only includes creating completely new data but also integrating with existing structures and updating your graph.

[[cypher-intro-load-parameters]]
== Parameters

In general we recommend passing in varying literal values from the outside as named parameters.
This allows Cypher to reuse existing execution plans for the statements.
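Why parameters enable reuse can be shown with a toy cache keyed by query text (a sketch of the general idea, not Neo4j's actual plan cache):

```python
# Toy plan cache: plans are looked up by query string, so embedding literals
# creates a new entry per value, while a parameterized query reuses one entry.
plan_cache = {}

def plan_for(query):
    if query not in plan_cache:
        plan_cache[query] = f"plan#{len(plan_cache)}"
    return plan_cache[query]

# Literals in the text: one plan per distinct name
for name in ["Tom Hanks", "Keanu Reeves"]:
    plan_for(f"MATCH (p:Person {{name:'{name}'}}) RETURN p")

# Parameterized: a single plan, reused for every value
for name in ["Tom Hanks", "Keanu Reeves"]:
    plan_for("MATCH (p:Person {name: $name}) RETURN p")
```

After both loops the cache holds three plans: two literal variants plus one parameterized entry.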

Of course you can also pass in parameters for data to be imported.
Of course you can also pass in parameters for data to be imported.
Those can be scalar values, maps, lists or even lists of maps.

In your Cypher statement you can then iterate over those values (e.g. with `UNWIND`) to create your graph structures.
@@ -42,6 +43,7 @@ FOREACH (role IN movie.cast |
)
----

[[cypher-intro-importing-csv]]
== Importing CSV

Cypher provides an elegant built-in way to import tabular CSV data into graph structures.
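Conceptually, `LOAD CSV` reads header-keyed rows and lets you build one graph entity per row; the same shape in plain Python, with hypothetical inline data instead of the tutorial's movies.csv:

```python
import csv
import io

# What LOAD CSV does conceptually: iterate header-keyed rows, build one
# node-like record per row (inline sample data, not the real movies.csv).
data = io.StringIO("id,title,year\n1,The Matrix,1997\n2,Cloud Atlas,2012\n")

movies = [{"id": row["id"], "title": row["title"], "released": int(row["year"])}
          for row in csv.DictReader(data)]
```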
@@ -59,7 +61,7 @@ include::../../graphgists/intro/movies.csv[]

[source,cypher]
----
LOAD CSV WITH HEADERS FROM "movies.csv" AS line
LOAD CSV WITH HEADERS FROM "movies.csv" AS line
CREATE (m:Movie {id:line.id,title:line.title, released:toInt(line.year)});
----

@@ -71,7 +73,7 @@ include::../../graphgists/intro/persons.csv[]

[source,cypher]
----
LOAD CSV WITH HEADERS FROM "persons.csv" AS line
LOAD CSV WITH HEADERS FROM "persons.csv" AS line
MERGE (a:Person {id:line.id}) ON CREATE SET a.name=line.name;
----

@@ -83,7 +85,7 @@ include::../../graphgists/intro/roles.csv[]

[source,cypher]
----
LOAD CSV WITH HEADERS FROM "roles.csv" AS line
LOAD CSV WITH HEADERS FROM "roles.csv" AS line
MATCH (m:Movie {id:line.movieId})
MATCH (a:Person {id:line.personId})
CREATE (a)-[:ACTED_IN {roles:[line.role]}]->(m);
@@ -1,5 +1,4 @@
Uniqueness
==========
= Uniqueness

While pattern matching, Neo4j makes sure to not include matches where the same graph relationship is found multiple times in a single pattern.
In most use cases, this is a sensible thing to do.
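The rule can be sketched as a tiny traversal in Python (toy undirected graph, not Neo4j internals): when expanding a pattern like `(a)--(b)--(c)`, the edge you arrived on is excluded, otherwise every relationship would yield a trivial there-and-back match.

```python
# Relationship uniqueness, sketched: never reuse an edge within one match.
edges = [("A", "B"), ("B", "C")]

def neighbors(node):
    # yield (edge_index, other_endpoint) for every edge touching node
    for i, (x, y) in enumerate(edges):
        if x == node:
            yield i, y
        if y == node:
            yield i, x

def two_hop(start):
    matches = []
    for r1, mid in neighbors(start):
        for r2, end in neighbors(mid):
            if r2 == r1:   # the uniqueness rule: skip the edge we came in on
                continue
            matches.append((start, mid, end))
    return matches
```

Without the `r2 == r1` check, starting from `A` would also produce the spurious match `('A', 'B', 'A')` over the single A-B relationship.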
8 changes: 4 additions & 4 deletions manual/cypher/cypher-docs/src/docs/intro/index.adoc
@@ -42,10 +42,6 @@ include::../parsed-graphgists/intro/compose-statements.adoc[]

:leveloffset: 2

include::../parsed-graphgists/intro/data-structures.adoc[]

:leveloffset: 2

include::../parsed-graphgists/intro/labels.adoc[]

//include::indexes-and-constraints.adoc[]
@@ -56,6 +52,10 @@ include::../parsed-graphgists/intro/loading-data.adoc[]

:leveloffset: 2

include::../parsed-graphgists/intro/data-structures.adoc[]

:leveloffset: 2

include::../parsed-graphgists/sql/cypher-vs-sql.asciidoc[]


@@ -51,10 +51,9 @@ enum BlockType
boolean isA( List<String> block )
{
int size = block.size();
return size > 0 && ( ( block.get( 0 )
.startsWith( "=" ) && !block.get( 0 )
.startsWith( "==" ) ) || size > 1 && block.get( 1 )
.startsWith( "=" ) );
return size > 0 &&
( ( block.get( 0 ).startsWith( "=" )
&& !block.get( 0 ).startsWith( "==" )));
}

@Override
Expand Up @@ -118,7 +118,7 @@ static List<Block> parseBlocks( String input )
String[] lines = input.split( EOL );
if ( lines.length < 3 )
{
throw new IllegalArgumentException( "To little content, only "
throw new IllegalArgumentException( "Not enough content, only "
+ lines.length + " lines." );
}
List<Block> blocks = new ArrayList<>();
@@ -117,13 +117,22 @@ public void titleWithCharsToIgnore()
}

@Test
public void twoLineTitle()
public void ignore_second_level_heading()
{
Block block = Block.getBlock( Arrays.asList( "Title here", "==========" ) );
assertThat( block.type, sameInstance( BlockType.TITLE ) );
Block block = Block.getBlock( Arrays.asList( "== Title here" ) );
assertThat( block.type, sameInstance( BlockType.TEXT ) );
String output = block.process( state );
assertThat( output, containsString( "[[cypherdoc-title-here]]" ) );
assertThat( output, containsString( "= Title here =" ) );
assertThat( output, containsString( "== Title here" ) );
}

@Test
public void ignore_second_level_heading_with_id()
{
Block block = Block.getBlock( Arrays.asList( "[[my-id]]", "== Title here" ) );
assertThat( block.type, sameInstance( BlockType.TEXT ) );
String output = block.process( state );
assertThat( output, containsString( "[[my-id]]" ) );
assertThat( output, containsString( "== Title here" ) );
}

@Test
Expand Up @@ -54,11 +54,11 @@ public void fullDocumentBlockParsing() throws IOException
assertThat( types, equalTo( Arrays.asList( BlockType.TITLE, BlockType.TEXT, BlockType.HIDE,
BlockType.SETUP, BlockType.CYPHER, BlockType.QUERYTEST, BlockType.TABLE, BlockType.GRAPH, BlockType.TEXT,
BlockType.OUTPUT, BlockType.PARAMETERS, BlockType.CYPHER, BlockType.QUERYTEST, BlockType.PROFILE,
BlockType.GRAPH_RESULT, BlockType.SQL, BlockType.SQL_TABLE ) ) );
BlockType.GRAPH_RESULT, BlockType.SQL, BlockType.SQL_TABLE, BlockType.TEXT ) ) );
}

@Test
public void toLittleContentBlockParsing()
public void notEnoughContentBlockParsing()
{
expectedException.expect( IllegalArgumentException.class );
CypherDoc.parseBlocks( "x\ny\n" );
Expand Up @@ -57,3 +57,6 @@ VALUES(0)

// sqltable

[[my-id]]
== Second level heading

