# Structured Data Loading

### We will walk through this live. The following is for reference if you get behind or want to recreate at a later date.


## Ingest Structured Data

### Getting Started With Data Importer

Neo4j Data Importer provides a streamlined process for bringing structured data into your graph database. The Neo4j Aura console includes a dedicated Data Importer feature that lets you transform tabular data into graph structures without writing code, which makes it a quick way to populate your knowledge graph from existing datasets.

### Import Structured Data

To import your data:

1. Navigate to **Import \> Data Importer** in the Neo4j Aura console.  
     
<div align="center">
  <img src="Images-Diagrams/11.png" width="400" />
</div>

2. Create a new graph model.  <p>

<div align="center">
  <img src="Images-Diagrams/12.png" width="400" />
</div>


3. A graph data model has been provided for your convenience. Note: Because file paths differ between operating systems, choose the data model that matches your system: either [Mac](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/neo4j_importer_model_Mac_ix.json) or [Windows](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/neo4j_importer_model_Windows.json).<p>

<div align="center">
  <img src="Images-Diagrams/13.png" width="400" />
</div>

   

4. Once you’ve loaded the provided data model, click Browse, navigate to the data folder in your repository, and select both the [Asset\_Manager\_Holdings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Asset_Manager_Holdings.csv) and [Company\_Filings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Company_Filings.csv) files.
<div align="center">
  <img src="Images-Diagrams/14.png" width="400" />
</div>

5. Verify mappings.

**AssetManager**

Label: `AssetManager`  
Table: `Asset_Manager_Holdings.csv`  
Properties: `managerName`  
ID(key): `managerName`

**Company**

Label: `Company`  
Table: `Company_Filings.csv`  
Properties: `name`, `ticker`  
ID(key): `name`

**Document**

Label: `Document`  
Table: `Company_Filings.csv`  
Properties: `path`  *(this must match exactly; read below)*  
ID(key): `path`

**OWNS (relationship)**

Relationship Type: `OWNS`  
Table: `Asset_Manager_Holdings.csv`

Node ID Mapping:

From: Node - `AssetManager` 
  * ID: `managerName`
  * ID column: `managerName`

To: Node - `Company`
  * ID: `name`
  * ID column: `companyName`  

Properties: `shares`

**FILED (relationship)** - connects the structured and the unstructured data in this GraphRAG application.

Relationship Type: `FILED`  
Table: `Company_Filings.csv`

Node ID Mapping:

From: Node - `Company`
  * ID: `name`
  * ID column: `name`

To: Node - `Document`
  * ID: `path`
  * ID column: `path_Windows` or<br>
      &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`path_Mac_ix`
     
6. Once the files are connected, you’ll see that the data model has check marks for each entity and relationship. Click **Run Import** in the upper right-hand corner.
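
The mapping check in step 5 can also be approximated offline before you run the import. The following is a minimal sketch, not part of Data Importer itself: it only confirms that each CSV header contains the columns the mappings above refer to. The helper name `missing_columns` and the one-row sample strings are illustrative assumptions; substitute the text of your own files.

```python
import csv
import io

# Columns each mapping in this walkthrough reads from its CSV file.
# The path column depends on your OS, so either name satisfies that group.
REQUIRED_COLUMNS = {
    "Asset_Manager_Holdings.csv": [{"managerName"}, {"companyName"}, {"shares"}],
    "Company_Filings.csv": [{"name"}, {"ticker"}, {"path_Windows", "path_Mac_ix"}],
}


def missing_columns(file_name, csv_text):
    """Return the column groups that have no match in the CSV header row."""
    header = set(next(csv.reader(io.StringIO(csv_text))))
    return [sorted(group) for group in REQUIRED_COLUMNS[file_name] if not group & header]


# Hypothetical one-row samples (the path value is made up for illustration).
holdings = "managerName,companyName,ticker,shares\nALLIANCEBERNSTEIN L.P.,AMAZON,AMZN,50065439\n"
filings = "name,ticker,path_Mac_ix\nAMAZON,AMZN,example/path.pdf\n"

print(missing_columns("Asset_Manager_Holdings.csv", holdings))
print(missing_columns("Company_Filings.csv", filings))
```

An empty list means every column the mapping needs is present in that file's header.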

### Mapping Your Data to Graph Structures

To get you started, we’ve given you a full, completed data model for this exercise. When working with your own data, you’ll create these data model maps yourself. 

If you’d like to work with your own dataset, here’s how to get started. The Aura console provides a unified experience where you can manage your database instances, connect to diverse data sources, [import](https://neo4j.com/docs/aura/preview/import/introduction/) structured data, model graphs visually, [query](https://neo4j.com/docs/aura/preview/query/introduction/) your data with Cypher, [explore](https://neo4j.com/docs/aura/preview/explore/introduction/) your graph, and more.

When navigating to **Import \> New Data Sources**, you’re presented with many possible connectors. For our case, there are two CSVs in this dataset: [Asset\_Manager\_Holdings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Asset_Manager_Holdings.csv) and [Company\_Filings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Company_Filings.csv). 
<div align="center">
  <img src="Images-Diagrams/16.png" width="400" />
</div>

Once you’ve uploaded these CSV files, you’ll be given a choice as to how to proceed. Click **Define Manually** to begin building your data model.

First, you’ll see a blank node, and on the right-hand side, you’ll see the parameters for that node, including Label, Table, Properties.  
<div align="center">
  <img src="Images-Diagrams/17.png" width="400" />
</div>

**Label** refers to the type of node. **Table** points to the data source where the information is sourced (the tables you uploaded will appear on the left). **Properties** refer to the values you want associated with that node. Let’s start with the [Company\_Filings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Company_Filings.csv).

**COMPANY NODE**

Label: `Company`  
Table: `Company_Filings.csv`  
Properties: `name`, `ticker`  
ID(key): `name`

You’ll also need to identify the unique ID property for that node, akin to the primary key, which in this case is the name of the company. This is done by clicking the key icon next to the property name.

<div align="center">
  <img src="Images-Diagrams/18.png" width="400" />
</div>

**DOCUMENT NODE**

Label: `Document`  
Table: `Company_Filings.csv`  
Properties: `path`  *(this must match exactly; read below)*  
ID(key): `path`

***CRITICAL STEP: Rename Your Path Column to*** `path`

The `kg_builder` has already created `Document` nodes using a `path` property. To correctly link companies to their documents, your imported data must use the exact same property name: `path`.

<div align="center">
  <img src="Images-Diagrams/19.png" width="400" />
</div>

⚠️ If you skip this renaming step, the relationship will NOT connect and your graph will be incomplete.

The CSV includes two columns with OS-specific paths:

* `path_Windows` for **Windows users**
* `path_Mac_ix` for **macOS/Linux users**

Choose the column for your operating system and **rename it to** exactly `path` (lowercase, no quotes) during import.

Even though `Document` nodes already exist, we're now creating relationships between each `Company` and its corresponding `Document`. This connection bridges **structured** (`Company`) and **unstructured** (`Document`) data, enabling advanced retrieval and reasoning across your graph.
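
In Data Importer the rename happens in the mapping panel, not in the file. As an offline illustration of what the rename accomplishes, here is a minimal sketch that picks the OS-specific column and exposes it as `path`. The function names and the sample path values are assumptions made for this example only.

```python
import csv
import io
import platform


def path_column_for(system=None):
    """Pick the OS-specific path column named in Company_Filings.csv."""
    system = system or platform.system()
    return "path_Windows" if system == "Windows" else "path_Mac_ix"


def rename_to_path(csv_text, source_column):
    """Return rows where source_column is exposed as `path`; other path_* columns are dropped."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cleaned = {k: v for k, v in row.items() if not k.startswith("path_")}
        cleaned["path"] = row[source_column]
        rows.append(cleaned)
    return rows


# Hypothetical sample: the path values are placeholders, not real repository paths.
sample = "name,ticker,path_Windows,path_Mac_ix\nAMAZON,AMZN,docs\\amazon.pdf,docs/amazon.pdf\n"

column = path_column_for()
print(column, rename_to_path(sample, column))
```

Whichever column you choose, the resulting property is named `path`, which is the name the existing `Document` nodes use.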

**ASSET MANAGER NODE**

Label: `AssetManager`  
Table: `Asset_Manager_Holdings.csv`  
Properties: `managerName`  
ID(key): `managerName`

<div align="center">
  <img src="Images-Diagrams/20.png" width="400" />
</div>

### Mapping Relationships

Relationships are created with the following criteria: 

* **Relationship Type:** Describes the type of connection between the entities. It is common practice in knowledge graphs for relationship types to be in **ALL\_CAPS** with no spaces.   
* **Table:** The file that contains identifiers for both node types. Each of its rows is what connects the two nodes.   
* **Node ID Mapping**: Maps the columns in the relevant table to the IDs of the pertinent nodes.   
* **Properties:** Adds information to a relationship or entity.


Next, let’s create connections between and among these entities. In our domain, the Asset Managers own stock in various companies. Here’s a sample from the [Asset\_Manager\_Holdings.csv](https://github.com/neo4j-product-examples/graphrag-ebook/blob/main/data/Asset_Manager_Holdings.csv):

| managerName | companyName | ticker | shares |
| :---- | :---- | :---- | :---- |
| ALLIANCEBERNSTEIN L.P. | AMAZON | AMZN | 50065439 |
| ALLIANCEBERNSTEIN L.P. | APPLE INC | AAPL | 28143032 |
| ALLIANCEBERNSTEIN L.P. | INTEL CORP | INTC | 5735993 |
| ALLIANCEBERNSTEIN L.P. | MCDONALDS CORP | MCD | 1201960 |
| ALLIANCEBERNSTEIN L.P. | MICROSOFT CORP | MSFT | 46541943 |

In a knowledge graph, we want to map the domain knowledge of structured data, which in this case is the Asset Managers’ ownership of stock in a given company. If entities are nouns, then relationships are verbs. So let’s create the relationship **OWNS** that goes from Asset Manager to Company.

1. Click on the AssetManager node. You’ll see a blue outline of the node:  
<div align="center">
  <img src="Images-Diagrams/21.png" width="200" />
</div>

2. Hover over the outline until it turns gray:

<div align="center">
  <img src="Images-Diagrams/22.png" width="200" />
</div>

3. Drag the outline of the AssetManager node to cover the Company node. When you release, you’ll see a new relationship arrow between them:  
<div align="center">
  <img src="Images-Diagrams/23a.png" height="200" /><img src="Images-Diagrams/23b.png" height="200" /><img src="Images-Diagrams/23c.png" height="200" />
</div>

Clicking on this arrow allows you to edit the parameters of the relationship.

**OWNS Relationship**

Relationship Type: `OWNS`  
Table: `Asset_Manager_Holdings.csv`

**Node ID Mapping**
- **From:**
    - Node: `AssetManager`
    - ID: `managerName`
    - ID column: `managerName`
- **To:**
    - Node: `Company`
    - ID: `name`
    - ID column: `companyName`

Properties: `shares`


<div align="center">
  <img src="Images-Diagrams/24.png" height="400" />
</div>
   
The property `shares` represents the number of shares of the Company owned by the Asset Manager and for this book is an optional inclusion. Additional columns such as `value` or `sharevalue` are optional, as well. When working with your own data, it’s best to consider if that property will have value to your use case. Will you be asking to rank based on shares owned? Does the total value of the holding have relevance to your application? Additional information on data modeling can be found at [GraphAcademy](https://graphacademy.neo4j.com/courses/modeling-fundamentals/).
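
To make the `OWNS` mapping concrete, the sketch below reads the sample rows shown earlier and turns each one into a relationship record: `managerName` identifies the start node, `companyName` the end node, and `shares` becomes a property. It then ranks the holdings by shares, one of the questions raised above. The record layout and function names are illustrative assumptions, not Data Importer output.

```python
import csv
import io

# The five sample rows from Asset_Manager_Holdings.csv shown above.
SAMPLE = """managerName,companyName,ticker,shares
ALLIANCEBERNSTEIN L.P.,AMAZON,AMZN,50065439
ALLIANCEBERNSTEIN L.P.,APPLE INC,AAPL,28143032
ALLIANCEBERNSTEIN L.P.,INTEL CORP,INTC,5735993
ALLIANCEBERNSTEIN L.P.,MCDONALDS CORP,MCD,1201960
ALLIANCEBERNSTEIN L.P.,MICROSOFT CORP,MSFT,46541943
"""


def owns_records(csv_text):
    """Build one OWNS record per row, following the Node ID Mapping above."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append({
            "from": row["managerName"],    # AssetManager ID
            "type": "OWNS",
            "to": row["companyName"],      # Company ID (its `name`)
            "shares": int(row["shares"]),  # relationship property
        })
    return records


def top_holdings(records, n):
    """Return the n records with the most shares, largest first."""
    return sorted(records, key=lambda r: r["shares"], reverse=True)[:n]


for record in top_holdings(owns_records(SAMPLE), 3):
    print(record["from"], "-[OWNS]->", record["to"], record["shares"])
```

If a ranking like this matters to your application, keep `shares` on the relationship; if not, the mapping works without it.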

**FILED Relationship**  
Note that the relationship between **Company** and **Document** is the linchpin that connects the structured and the unstructured data in this GraphRAG application.

Relationship Type: `FILED`  
Table: `Company_Filings.csv`

**Node ID Mapping**
- **From:**
    - Node: `Company`
    - ID: `name`
    - ID column: `name`
- **To:**
    - Node: `Document`
    - ID: `path`
    - ID column: `path_Windows` or<br>
      &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;`path_Mac_ix`
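
Assuming `FILED` is matched on the `path` value, a company connects to a document only when the path in the CSV equals the `path` stored on an existing `Document` node, character for character. The following minimal sketch shows the kind of comparison you could run beforehand. All path values here are hypothetical, and `unmatched_paths` is a name invented for this example.

```python
def unmatched_paths(csv_paths, document_paths):
    """Return CSV paths with no exactly equal Document path (case- and separator-sensitive)."""
    known = set(document_paths)
    return [p for p in csv_paths if p not in known]


# Hypothetical values: a Windows-style path will not match a macOS/Linux-style one.
document_paths = ["docs/amazon.pdf", "docs/apple.pdf"]
csv_paths = ["docs/amazon.pdf", "docs\\apple.pdf"]

print(unmatched_paths(csv_paths, document_paths))
```

Any path this returns would leave its company without a `FILED` relationship, which is why the column you select must correspond to the operating system that produced the `Document` nodes.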

<div align="center">
  <img src="Images-Diagrams/25.png" width="400" />
</div>

As you see in the diagram above, each entity and relationship will have a green check mark when it has been properly mapped. Now you’re ready to run the import. Click the blue **Run import** button in the upper right corner of the screen.

<div align="center">
  <img src="Images-Diagrams/26.png" width="400" />
</div>

Now that your unstructured and structured data is loaded, you can use the [**Explore**](https://neo4j.com/docs/aura/preview/explore/introduction/) and [**Query**](https://neo4j.com/docs/aura/query/introduction/) functions to refine your graph structure and data to accurately represent your business domain. Use **Explore** to visualize and navigate your graph with Neo4j Bloom and **Query** to investigate the graph.

For a detailed walkthrough of graph data modeling, see [The Developer’s Guide: How to Build a Knowledge Graph](https://neo4j.com/whitepapers/developers-guide-how-to-build-knowledge-graph/).

## About [neo4j_importer_model.json](neo4j_importer_model.json)

The [neo4j_importer_model.json](neo4j_importer_model.json) file defines the data model and schema mapping used for importing structured data into Neo4j via the Neo4j Data Importer tool.



**Key features:**
- **Versioned Schema:** Specifies the schema and visualization versions for compatibility and reproducibility.
- **Node Definitions:** Describes node labels such as `Company`, including their properties (e.g., `name`, `ticker`), types, and whether they are nullable.
- **Visualization Layout:** Contains layout information to help visualize the graph structure in the Neo4j Data Importer UI.
- **Relationship Mapping:** Maps relationships between node types, ensuring correct linkage during import.

**Usage:**  
When uploading your CSV files to [Neo4j Data Importer](https://console.neo4j.io/), you can load this model to automatically configure the mapping of your data columns to the correct node labels, properties, and relationships in the Neo4j database. This streamlines the import process and ensures consistency with your knowledge graph design.
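
If you want to see how the model file is organized before loading it, a generic outline of its keys is a safe starting point. The sketch below makes no assumption about the file's actual schema: it lists key paths of any parsed JSON object down to a chosen depth. The tiny `example` document uses placeholder keys and is not the real model format.

```python
import json


def outline(node, depth=2, prefix=""):
    """List key paths of a parsed JSON document down to `depth` levels."""
    if depth == 0 or not isinstance(node, dict):
        return []
    paths = []
    for key, value in node.items():
        path = f"{prefix}{key}"
        paths.append(path)
        paths.extend(outline(value, depth - 1, prefix=path + "."))
    return paths


# Placeholder document with made-up keys, used only to demonstrate the function.
example = json.loads('{"a": {"b": 1, "c": {"d": 2}}, "e": []}')
print(outline(example))
```

To inspect the real file, pass the result of `json.load` on the opened model file to `outline` in place of `example`.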

> **Tip:**  
> Always use the latest version of this file when updating your data model or importing new data to keep your graph schema in sync.