Permalink
Show file tree
Hide file tree
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Browse files
Age laod documentation (#31)
* added ageload documentation
- Loading branch information
1 parent
cbb3b55
commit 5bed73c5cdab7af74005e05af708ddce385e82a4
Showing
2 changed files
with
159 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -14,6 +14,7 @@ Apache AGE's documentation | ||
intro/comparability | ||
intro/operators | ||
intro/aggregation | ||
intro/agload | ||
|
||
.. toctree:: | ||
:caption: Clauses | ||
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
@@ -0,0 +1,158 @@ | ||
# Importing graph from files | ||
You can use the following instructions to create a graph from the files. This document explains | ||
- information about the current branch that includes the functions to load graphs from files | ||
- explanation of the functions that enable the creation of graphs from files | ||
- the structure of CSV files that load functions as input, do and do not. | ||
- A simple source code example to load countries and cities from the files. | ||
|
||
## Getting from the current branch | ||
The current implementation is available on the fork I have created and currently have not been merged with the main master branch. You can download the fork from the following. Please make sure that you download `create-graph-from-files-b1` branch and not some other branch as this has been specifically prepared for you. | ||
|
||
https://github.com/muhammadshoaib/incubator-age/tree/create-graph-from-files-b1 | ||
|
||
## Load Graph functions | ||
Following are the details about the functions to create vertices and edges from the file. | ||
|
||
function `load_labels_from_file` is used to load vertices from the CSV files. | ||
|
||
```sql | ||
load_labels_from_file('<graph name>', | ||
'<label name>', | ||
'<file path>') | ||
``` | ||
|
||
By adding the fourth parameter user can exclude the id field. *** Use this when there is no id field in the file*** | ||
|
||
```sql | ||
load_labels_from_file('<graph name>', | ||
'<label name>', | ||
'<file path>', | ||
false) | ||
``` | ||
|
||
Function `load_edges_from_file` can be used to load properties from the CSV file. Please see the file structure in the following. | ||
|
||
Note: make sure that ids in the edge file are identical to ones that are in vertices files. | ||
|
||
```sql | ||
oad_edges_from_file('<graph name>', | ||
'<label name>', | ||
'<file path>'); | ||
``` | ||
|
||
## Explanation about the CSV format | ||
Following is the explanation about the structure for CSV files for vertices and edges. | ||
|
||
- A CSV file for nodes shall be formatted aw following; | ||
|
||
| field name | Field description | | ||
| ---------- | ------------------------------------------------------------ | | ||
| id | it shall be the first column of the file and all values shall be a positive integer. This is an optional field when `id_field_exists` is ***false***. However, it should be present when `id_field_exists` is ***not*** set to false. | | ||
| Properties | all other columns contains the properties for the nodes. Header row shall contain the name of property | | ||
|
||
- Similarly, a CSV file for edges shall be formatted as follows | ||
|
||
| field name | Field description | | ||
| ----------------- | ------------------------------------------------------------ | | ||
| start_id | node id of the node from where the edge is stated. This id shall be present in nodes.csv file. | | ||
| start_vertex_type | class of the node | | ||
| end_id | end id of the node at which the edge shall be terminated | | ||
| end_vertex_type | Class of the node | | ||
| properties | properties of the edge. the header shall contain the property name | | ||
|
||
example files can be viewed at `regress/age_load/data` | ||
|
||
## Example SQL script | ||
|
||
- Load and create graph | ||
```sql | ||
LOAD 'age'; | ||
SET search_path TO ag_catalog; | ||
SELECT create_graph('agload_test_graph'); | ||
``` | ||
|
||
- Create label `country` and load vertices from csv file. *** Note this CSV file has id field *** | ||
|
||
```sql | ||
SELECT create_vlabel('agload_test_graph','Country'); | ||
SELECT load_labels_from_file('agload_test_graph', | ||
'Country', | ||
'age_load/countries.csv'); | ||
``` | ||
|
||
- Create label `City` and load vertices from csv file. *** Note this CSV file has id field *** | ||
|
||
```sql | ||
SELECT create_vlabel('agload_test_graph','City'); | ||
SELECT load_labels_from_file('agload_test_graph', | ||
'City', | ||
'age_load/cities.csv'); | ||
``` | ||
|
||
- Create label `has_city` and load edges from csv file. | ||
|
||
```sql | ||
SELECT create_elabel('agload_test_graph','has_city'); | ||
SELECT load_edges_from_file('agload_test_graph', 'has_city', | ||
'age_load/edges.csv'); | ||
``` | ||
|
||
- check if the graph has been loaded properly | ||
|
||
```sql | ||
SELECT table_catalog, table_schema, table_name, table_type | ||
FROM information_schema.tables | ||
WHERE table_schema = 'agload_test_graph'; | ||
SELECT COUNT(*) FROM agload_test_graph."Country"; | ||
SELECT COUNT(*) FROM agload_test_graph."City"; | ||
SELECT COUNT(*) FROM agload_test_graph."has_city"; | ||
SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH(n) RETURN n$$) as (n agtype); | ||
SELECT COUNT(*) FROM cypher('agload_test_graph', $$MATCH (a)-[e]->(b) RETURN e$$) as (n agtype); | ||
``` | ||
|
||
### Creating vertices without id field in the file. | ||
|
||
- Create label `country` and load vertices from csv file. *** Note this CSV file has no id field *** | ||
|
||
```sql | ||
SELECT create_vlabel('agload_test_graph','Country2'); | ||
SELECT load_labels_from_file('agload_test_graph', | ||
'Country2', | ||
'age_load/countries.csv', | ||
false); | ||
``` | ||
|
||
- Create label `City` and load vertices from csv file. *** Note this CSV file has id field *** | ||
```sql | ||
SELECT create_vlabel('agload_test_graph','City2'); | ||
SELECT load_labels_from_file('agload_test_graph', | ||
'City2', | ||
'age_load/cities.csv', | ||
false); | ||
``` | ||
- check if the graph has been loaded properly and perform difference analysis between ids created automatically and picked from the files. | ||
|
||
- labels `country` and `city` were created with id field in the file | ||
- labels `country2` and `city2` were created with no id field in the file. | ||
```sql | ||
SELECT COUNT(*) FROM agload_test_graph."Country2"; | ||
SELECT COUNT(*) FROM agload_test_graph."City2"; | ||
SELECT id FROM agload_test_graph."Country" LIMIT 10; | ||
SELECT id FROM agload_test_graph."Country2" LIMIT 10; | ||
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'BE'}) | ||
RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype); | ||
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'BE'}) | ||
RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype); | ||
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country {iso2 : 'AT'}) | ||
RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype); | ||
SELECT * FROM cypher('agload_test_graph', $$MATCH(n:Country2 {iso2 : 'AT'}) | ||
RETURN id(n), n.name, n.iso2 $$) as ("id(n)" agtype, "n.name" agtype, "n.iso2" agtype); | ||
SELECT drop_graph('agload_test_graph', true); | ||
``` |