67 changes: 67 additions & 0 deletions site/content/3.10/arangograph/data-loader/_index.md
@@ -0,0 +1,67 @@
---
title: Load your data into ArangoGraph
menuTitle: Data Loader
weight: 22
description: >-
  Load your data into ArangoGraph and transform it into richly-connected graph
  structures, without needing to write any code or deploy any infrastructure
archetype: chapter
---

The ArangoGraph Data Loader lets you transform existing data in CSV format
into data that can be analyzed by the ArangoGraph platform.

You provide your data in CSV format, a common format used for exports of data
from various systems. Then, using a no-code editor, you model the schema of
this data and the relationships within it. This allows you to ingest your
existing datasets into your ArangoGraph database, without the need for any
development effort.

You can get started in a few easy steps.

{{< tabs groupid="data-loader-steps" >}}

{{< tab name="1. Create database" >}}
Choose an existing database or create a new one and enter a name for your new graph.
{{< /tab >}}

{{< tab name="2. Add files" >}}
Drag and drop your data files in CSV format.
{{< /tab >}}

{{< tab name="3. Design your graph" >}}
Model your graph schema by adding nodes and connecting them via edges.
{{< /tab >}}

{{< tab name="4. Import data" >}}
Once you are ready, save and start the import. The resulting graph is an
[EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) with its
corresponding collections, available in your ArangoDB web interface.
{{< /tab >}}

{{< /tabs >}}

Follow this [working example](../data-loader/example.md) to see how easy it is
to transform existing data into a graph.

## How to access the Data Loader

1. If you do not have a deployment yet, [create a deployment](../deployments/_index.md#how-to-create-a-new-deployment) first.
2. Open the deployment you want to load data into.
3. In the **Load Data** section, click the **Load your data** button.

![ArangoGraph Data Loader Overview](../../../images/arangograph-data-loader-overview.png)

## Other options to import data into ArangoGraph

To import data from various files into collections **without creating a graph**,
get the ArangoDB client tools for your operating system from the
[download page](https://arangodb.com/download-major/).

- To import data to ArangoGraph from an existing ArangoDB instance, see
[arangodump](../../components/tools/arangodump/) and
[arangorestore](../../components/tools/arangorestore/).
- To import pre-existing data in JSON, CSV, or TSV format, see
[arangoimport](../../components/tools/arangoimport/).
- To transfer data from an existing on-premises ArangoDB instance to your
ArangoGraph cluster, see the [cloud migration tool](../migrate-to-the-cloud.md).
60 changes: 60 additions & 0 deletions site/content/3.10/arangograph/data-loader/add-files.md
@@ -0,0 +1,60 @@
---
title: Add files into Data Loader
menuTitle: Add files
weight: 5
description: >-
  Provide your set of files in CSV format containing the data to be imported
archetype: default
---

The Data Loader allows you to upload your data files in CSV format into
ArangoGraph and then use these data sources to design a graph using the
built-in graph designer.

## Upload your files

You can upload your CSV files in the following ways:

- Drag and drop your files in the designated area.
- Click the **Browse files** button and select the files you want to add.

![ArangoGraph Data Loader Upload Files](../../../images/arangograph-data-loader-upload-files.png)

You can upload several files at once as a batch or add them one by one. You can
also add more files later on. After a file has been uploaded, you can expand it
to preview both the header and the first row of data within the file.

If you upload CSV files without fields, they are not available for
manipulation.

Once the files are uploaded, you can start [designing your graph](../data-loader/design-graph.md).

### File formatting limitations

Ensure that the files you upload are correctly formatted. Otherwise, errors may
occur, the upload may fail, or the data may not be correctly mapped.

The following restrictions and limitations apply:

- The only supported file format is CSV. If you submit a file in an invalid
  format, the upload of that specific file is prevented.
- All CSV files must have a header row. If you upload a file without a header,
  the first row of data is treated as the header. To avoid losing the first row
  of the data, make sure to include headers in your files.
- Header names must be unique within a CSV file. It is not possible to have two
  columns with the same name within the same file.

For more details, see the [File validation](../data-loader/import.md#file-validation) section.
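
The header restrictions above can be checked locally before uploading. The following is a minimal sketch that uses Python's standard `csv` module; the function name `check_header` and the sample rows are made up for illustration, and the Data Loader's own validation may behave differently. Note that a script cannot tell whether a header row is missing, because the first row is always read as the header.

```python
import csv
import io


def check_header(csv_text: str) -> list:
    """Return a list of problems found in the header row of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return ["file is empty, so there is no header row"]

    problems = []
    seen = set()
    for name in header:
        # Two columns with the same name are not allowed within one file.
        if name in seen:
            problems.append(f"duplicate header name: {name!r}")
        seen.add(name)
    return problems


# Hypothetical sample: the column "Name" appears twice.
sample = "AirportID,Name,City,Name\nGKA,Goroka,Goroka,Goroka Airport\n"
print(check_header(sample))
```

An empty result means that the header row has no duplicate names. It does not guarantee that the upload succeeds.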

### Upload limits

There is a cumulative file upload limit of 1 GB. This means that the combined
size of all files you upload should not exceed 1 GB. If the total size of the
uploaded files surpasses this limit, the upload may not be successful.
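
To see whether a set of files stays within the cumulative limit, you can add up the file sizes before uploading. This sketch assumes that 1 GB means 10^9 bytes, which is not confirmed by this page; the Data Loader may count differently. The function names and the sample file contents are hypothetical.

```python
import os
import tempfile

# Assumption: "1 GB" is read as 10**9 bytes. Adjust if the platform
# turns out to use a different definition.
UPLOAD_LIMIT_BYTES = 10**9


def total_size(paths):
    """Return the combined size in bytes of all given files."""
    return sum(os.path.getsize(path) for path in paths)


def fits_upload_limit(paths, limit=UPLOAD_LIMIT_BYTES):
    """Return True if the combined size does not exceed the limit."""
    return total_size(paths) <= limit


# Demonstration with two tiny temporary files standing in for real exports.
with tempfile.TemporaryDirectory() as tmp:
    contents = {
        "airports.csv": "AirportID\nGKA\n",
        "flights.csv": "source airport,destination airport\nGKA,MAG\n",
    }
    paths = []
    for name, text in contents.items():
        path = os.path.join(tmp, name)
        with open(path, "w", newline="") as handle:
            handle.write(text)
        paths.append(path)
    print(total_size(paths), fits_upload_limit(paths))
```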

## Delete files

You can remove uploaded files by clicking the **Delete file** button in the
**Your files** panel. To delete a file, you must first remove all of its
associations with the graph.
63 changes: 63 additions & 0 deletions site/content/3.10/arangograph/data-loader/design-graph.md
@@ -0,0 +1,63 @@
---
title: Design your graph
menuTitle: Design graph
weight: 10
description: >-
  Design your graph database schema using the integrated graph modeler in the Data Loader
archetype: default
---

Based on the data you have uploaded, you can start designing your graph.
The graph designer allows you to create a schema using nodes and edges.
Once this is done, you can save and start the import. The resulting
[EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) and the
corresponding collections are created in your ArangoDB database instance.

## How to add a node

Nodes are the main objects in your data model and include the attributes of the
objects.

1. To create a new node, click the **Add node** button.
2. In the graph designer, click the newly created node to view the **Node details**.
3. In the **Node details** panel, fill in the following fields:
   - For **Node label**, enter a name you want to use for the node.
   - For **File**, select a file from the list to associate it with the node.
   - For **Primary Identifier**, select a field from the list. This is used to
     reference the nodes when you define relations with edges.
   - For **File Headers**, select one or more attributes from the list.

![ArangoGraph Data Loader Add Node](../../../images/arangograph-data-loader-add-node.png)

## How to connect nodes

Nodes can be connected by edges to express and categorize the relations between
them. A relation always has a direction, going from one node to another. You can
define this direction in the graph designer by dragging your cursor from a
particular node to another.

To connect two nodes, you can use the **Connect node(s)** button. Click on any
node to self-reference it or drag to connect it to another node. Alternatively,
when you select a node, a plus sign will appear, allowing you to directly add a
new node with an edge.

The edge needs to be associated with a file and must have a label. Note that a
node and an edge cannot have the same label.

To add details to an edge, follow these steps:

1. Click on an edge in the graph designer.
2. In the **Edit Edge** panel, fill in the following fields:
   - For **Edge label**, enter a name you want to use for the edge.
   - For **Relation file**, select a file from the list to associate it with the edge.
   - To define how the relation points from one node to another, select the
     corresponding relation file header for both the origin file (`_from`) and the
     destination file (`_to`).
   - For **File Headers**, select one or more attributes from the list.

![ArangoGraph Data Loader Edit Edge](../../../images/arangograph-data-loader-edit-edge.png)

## How to delete elements

To remove a node or an edge, simply select it in the graph designer and click the
**Delete** icon.
102 changes: 102 additions & 0 deletions site/content/3.10/arangograph/data-loader/example.md
@@ -0,0 +1,102 @@
---
title: Data Loader Example
menuTitle: Example
weight: 20
description: >-
  Follow this complete working example to see how easy it is to transform existing
  data into a graph and get insights from the connected entities
archetype: default
---

To transform your data into a graph, you need to have CSV files with entities
representing the nodes and a corresponding CSV file representing the edges.

This example uses a sample data set of two files, `airports.csv` and `flights.csv`,
to create a graph showing flights arriving at and departing from various cities.

The `airports.csv` contains rows of airport entries, which are the future nodes
in your graph. The `flights.csv` contains rows of flight entries, which are the
future edges connecting the nodes.

The whole process can be broken down into these steps:

1. **Database and graph setup**: Choose an existing database or create a new
   one, and enter a name for your new graph.
2. **Add files**: Upload the CSV files to the Data Loader web interface. You can
   simply drag and drop them or upload them through the file browser window.
3. **Design graph**: Design your graph schema by adding nodes and edges, and map
   data from the uploaded files to them. This allows creating the corresponding
   documents and collections for your graph.
4. **Import data**: Import the data and start using your newly created
   [EnterpriseGraph](../../graphs/enterprisegraphs/_index.md) and its
   corresponding collections.

## Step 1: Create database and choose graph name

Start by creating a new database and adding a name for your graph.

![Data Loader Example Step 1](../../../images/arangograph-data-loader-example-choose-names.png)

## Step 2: Add files

Upload your CSV files to the Data Loader web interface. You can drag and drop
them or upload them via a file browser window.

![Data Loader Example Step 2](../../../images/arangograph-data-loader-example-add-files.png)

See also [Add files into Data Loader](../data-loader/add-files.md).

## Step 3: Design graph schema

Once the files are added, you can start designing the graph schema. This example
uses a simple graph consisting of:
- Two nodes (`origin_airport` and `destination_airport`)
- One directed edge going from the origin airport to the destination one
representing a flight

Click **Add node** to create the nodes and connect them with edges.

Next, for each of the nodes and edges, you need to create a mapping to the
corresponding file and headers.

For nodes, the **Node label** is going to be a node collection name and the
**Primary identifier** will be used to populate the `_key` attribute of documents.
You can also select any additional headers to be included as document attributes.

In this example, two node collections have been created (`origin_airport` and
`destination_airport`) and the `AirportID` header is used to create the `_key`
attribute for documents in both node collections. The header preview makes it
easy to select the headers you want to use.

![Data Loader Example Step 3 Nodes](../../../images/arangograph-data-loader-example-map-nodes.png)

For edges, the **Edge label** is going to be an edge collection name. Then, you
need to specify how edges will connect nodes. You can do this by selecting the
*from* and *to* nodes to give a direction to the edge.
In this example, the `source airport` header has been selected as a source and
the `destination airport` header as a target for the edge.

![Data Loader Example Step 3 Edges](../../../images/arangograph-data-loader-example-map-edges.png)

Note that the values of source and target for the edge correspond to the
**Primary identifier** (`_key` attribute) of the nodes. In this case, it is the
airport code (e.g., GKA) used as the `_key` in the node documents and in the source
and destination headers to configure the edges.
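
The mapping described above can be sketched in a few lines of Python. This is a simplified illustration, not the Data Loader's implementation: the helper functions and all sample rows are hypothetical, and only the header names `AirportID`, `source airport`, and `destination airport`, the collection names, and the code GKA come from this example. The sketch relies on the general ArangoDB convention that `_from` and `_to` hold document identifiers of the form `collection/key`; the documents actually stored in an EnterpriseGraph may look different.

```python
import csv
import io

# Hypothetical sample data standing in for airports.csv and flights.csv.
AIRPORTS_CSV = "AirportID,Name\nGKA,Goroka Airport\nMAG,Madang Airport\n"
FLIGHTS_CSV = (
    "source airport,destination airport,Airline\n"
    "GKA,MAG,XX\n"
    "MAG,LAE,XX\n"
)


def node_documents(csv_text, primary_identifier):
    """Turn each CSV row into a node document keyed by the chosen header."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        document = dict(row)
        document["_key"] = row[primary_identifier]
        documents.append(document)
    return documents


def edge_documents(csv_text, from_header, to_header, from_coll, to_coll):
    """Turn each CSV row into an edge document pointing from one node to another."""
    documents = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        document = {k: v for k, v in row.items() if k not in (from_header, to_header)}
        document["_from"] = f"{from_coll}/{row[from_header]}"
        document["_to"] = f"{to_coll}/{row[to_header]}"
        documents.append(document)
    return documents


nodes = node_documents(AIRPORTS_CSV, "AirportID")
edges = edge_documents(
    FLIGHTS_CSV, "source airport", "destination airport",
    "origin_airport", "destination_airport",
)

# Both node collections are filled from the same file, so one key set suffices.
known_keys = {document["_key"] for document in nodes}
unmatched = [
    edge for edge in edges
    if edge["_from"].split("/", 1)[1] not in known_keys
    or edge["_to"].split("/", 1)[1] not in known_keys
]
print(edges[0])
print(unmatched)
```

The `unmatched` list contains edges whose source or destination value has no corresponding node key, such as the made-up flight to `LAE` in the sample. Checking for such rows before the import helps confirm that the flight file and the airport file use the same identifiers.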

See also [Design your graph in the Data Loader](../data-loader/design-graph.md).

## Step 4: Import and see resulting graph

After all the mapping is done, all you need to do is click
**Save and start import**. The report provides an overview of the files
processed and the documents created, as well as a link to your new graph.
See also [Start import](../data-loader/import.md).

![Data Loader Example Step 4 See your new graph](../../../images/arangograph-data-loader-example-data-import.png)

Finally, click **See your new graph** to open the ArangoDB web interface and
explore your new collections and graph.

![Data Loader Example Step 4 Resulting graph](../../../images/arangograph-data-loader-example-resulting-graph.png)

Happy graphing!
73 changes: 73 additions & 0 deletions site/content/3.10/arangograph/data-loader/import.md
@@ -0,0 +1,73 @@
---
title: Start the import
menuTitle: Start import
weight: 15
description: >-
  Once the data files are provided and the graph is designed, you can start the import
archetype: default
---

Before starting the actual import, make sure that:
- You have selected a database for import or created a new one;
- You have provided a valid name for your graph;
- You have created at least one node;
- You have created at least one edge;
- You have uploaded at least one file;
- Every file is related to at least one node or edge;
- Every node and edge is linked to a file;
- Every node and edge has a unique label;
- Every node has a primary identifier selected;
- Every edge has an origin and destination file header selected.
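
The checklist above can also be read as a set of rules over a graph design. The following sketch models a design with hypothetical Python data classes (`Node`, `Edge`, `Design`) and reports which rules are not yet satisfied. It does not reflect how the Data Loader stores or validates a design, the database and graph names are made up, and because the rules for a valid graph name are not stated here, the sketch only checks that a name is present.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    label: str
    file: Optional[str] = None
    primary_identifier: Optional[str] = None


@dataclass
class Edge:
    label: str
    file: Optional[str] = None
    from_header: Optional[str] = None
    to_header: Optional[str] = None


@dataclass
class Design:
    database: str
    graph_name: str
    files: list = field(default_factory=list)
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)


def preflight(design):
    """Return the checklist items that the design does not satisfy yet."""
    problems = []
    if not design.database:
        problems.append("no database selected")
    if not design.graph_name:
        problems.append("no graph name provided")
    if not design.nodes:
        problems.append("at least one node is required")
    if not design.edges:
        problems.append("at least one edge is required")
    if not design.files:
        problems.append("at least one file is required")

    elements = design.nodes + design.edges
    used_files = {element.file for element in elements if element.file}
    for name in design.files:
        if name not in used_files:
            problems.append(f"file {name!r} is not used by any node or edge")

    seen_labels = set()
    for element in elements:
        if element.label in seen_labels:
            problems.append(f"label {element.label!r} is used more than once")
        seen_labels.add(element.label)
        if not element.file:
            problems.append(f"{element.label!r} is not linked to a file")

    for node in design.nodes:
        if not node.primary_identifier:
            problems.append(f"node {node.label!r} has no primary identifier")
    for edge in design.edges:
        if not (edge.from_header and edge.to_header):
            problems.append(f"edge {edge.label!r} lacks an origin or destination header")
    return problems


design = Design(
    database="demo_db",          # hypothetical name
    graph_name="flights_graph",  # hypothetical name
    files=["airports.csv", "flights.csv"],
    nodes=[
        Node("origin_airport", "airports.csv", "AirportID"),
        Node("destination_airport", "airports.csv", "AirportID"),
    ],
    edges=[Edge("flight", "flights.csv", "source airport", "destination airport")],
)
print(preflight(design))
```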

To continue with the import, click the **Save and start import** button. The data
importer then shows an overview of the results, including the collections that
have been created from the data provided in the files.

To access your newly created graph in the ArangoDB web interface, click the
**See your new graph** button.

## File validation

Once the import has started, the files that you have provided are validated.
If the validation process detects parsing errors in any of the files, the import
is temporarily paused and the validation errors are shown. You can get a full
report by clicking the **See full report** button.

At this point, you can:
- Continue with the import without addressing the errors. The CSV files are still
  included in the import. However, the invalid rows are skipped and
  excluded from the import.
- Revisit the problematic file(s), resolve the issues, and then re-upload the
  file(s).

{{< tip >}}
To ensure the integrity of your data, it is recommended to address all the errors
detected during the validation process.
{{< /tip >}}

### Validation errors and their meanings

#### Invalid Quotation Mark

This error indicates issues with quotation marks in the CSV data.
It can occur due to improper use of quotes.

#### Missing Quotation Marks

This error occurs when quotation marks are missing or improperly placed in the
CSV data, potentially affecting data enclosure.

#### Insufficient Data Fields

This error occurs when a CSV row has fewer fields than expected. It may indicate
missing or improperly formatted data.

#### Excessive Data Fields

This error occurs when a CSV row has more fields than expected, possibly due to
extra data or formatting issues.
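
Rows with too few or too many fields can be found locally by comparing each row against the header. This is a minimal sketch using Python's standard `csv` module; the function name and sample rows are hypothetical, and the Data Loader's parser may count fields differently, for example for blank lines or unusual quoting.

```python
import csv
import io


def field_count_problems(csv_text):
    """Return (line number, kind, field count) for rows that do not match the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []

    expected = len(header)
    problems = []
    for row in reader:
        if len(row) < expected:
            problems.append((reader.line_num, "insufficient", len(row)))
        elif len(row) > expected:
            problems.append((reader.line_num, "excessive", len(row)))
    return problems


# Hypothetical sample: the third line lacks a field, the fourth has one too many.
sample = (
    "AirportID,Name,City\n"
    "GKA,Goroka Airport,Goroka\n"
    "MAG,Madang Airport\n"
    "HGU,Mount Hagen Airport,Mount Hagen,extra\n"
)
for problem in field_count_problems(sample):
    print(problem)
```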

#### Unidentifiable Field Separator

This error suggests that the parser could not identify the field separator
character in the CSV data.
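
To get a rough idea of whether a file has a recognizable separator, you can try the heuristic in Python's standard `csv.Sniffer` class. This is only an approximation under the assumption that the candidate separators are comma, semicolon, tab, and pipe; the Data Loader uses its own parser, so a file that passes this check may still be rejected, and vice versa. The function name and sample text are made up for illustration.

```python
import csv


def detect_separator(sample_text, candidates=",;\t|"):
    """Return the separator guessed by csv.Sniffer, or None if none is found."""
    try:
        dialect = csv.Sniffer().sniff(sample_text, delimiters=candidates)
    except csv.Error:
        # Raised when the heuristic cannot settle on a delimiter.
        return None
    return dialect.delimiter


# Hypothetical sample that uses semicolons instead of commas.
semicolon_sample = (
    "AirportID;Name;City\n"
    "GKA;Goroka Airport;Goroka\n"
    "MAG;Madang Airport;Madang\n"
)
print(repr(detect_separator(semicolon_sample)))
print(repr(detect_separator("abc\ndef\nghi\n")))
```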