Merged
8 changes: 7 additions & 1 deletion guides/structured-processing/load-doc.md
@@ -44,7 +44,7 @@ For this example, we'll use a sample document containing city demographic data t

1. **Access the Sample Document**
Open this Google Docs document containing city demographic data:
[https://docs.google.com/document/d/1_5KfhWL9fN3VuhANIVKuJX6MVBH3vFVW3RiUNYE9jHQ/edit](https://docs.google.com/document/d/1_5KfhWL9fN3VuhANIVKuJX6MVBH3vFVW3RiUNYE9jHQ)
[https://raw.githubusercontent.com/trustgraph-ai/example-data/main/cities/most-populous-cities.pdf](https://raw.githubusercontent.com/trustgraph-ai/example-data/main/cities/most-populous-cities.pdf)

2. **Save as PDF**
- In Google Docs, click **File** → **Download** → **PDF Document (.pdf)**
@@ -121,6 +121,12 @@ e.g. `Object extraction`.

## Step 3: Launch Document Processing

When loading the document in the workbench, it helps to store the data
in a specific collection so that you can query it later. Open the dialog
at the top right and set the collection to `cities`.

<img src="set-collection.png" alt="Set collection option"/>

On the Library page, select your document containing city information,
then click 'Submit' at the bottom of the screen.

10 changes: 7 additions & 3 deletions guides/structured-processing/load-file.md
@@ -49,9 +49,9 @@ Before starting this guide, ensure you have:

## Data files you will need:

- [UK pies](https://drive.google.com/file/d/1u0DzP5bu15sSwnHldpZTVXUNoVo5DzFQ/view?usp=sharing)
- [French pies](https://drive.google.com/file/d/1xHBYLkrbB1NmJeeXNRlQUuQCQyPThuN-/view?usp=drive_link)
- [Pies Structured Descriptor Language](https://drive.google.com/file/d/1ALuMuwRy8m_hUk2Y_ftFLHK44TwhNUv3/view?usp=drive_link)
- [UK pies](https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/uk-pies-simplified.xml)
- [French pies](https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/fr-pies-simplified.xml)
- [Pies Structured Descriptor Language](https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/pies-sdl.json)
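
If you prefer to fetch these files from a script rather than the browser, the following is a minimal sketch using only the Python standard library. The URLs are the ones listed above; the helper names `local_name` and `download_all` and the default target directory are illustrative assumptions, not part of TrustGraph.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URLs copied from the list above.
URLS = [
    "https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/uk-pies-simplified.xml",
    "https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/fr-pies-simplified.xml",
    "https://raw.githubusercontent.com/trustgraph-ai/example-data/main/pies/pies-sdl.json",
]


def local_name(url: str) -> str:
    """Return the file name component of a URL (hypothetical helper)."""
    return Path(urlparse(url).path).name


def download_all(dest: str = "pies-data") -> list[Path]:
    """Download every file in URLS into dest and return the saved paths."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in URLS:
        target = dest_dir / local_name(url)
        urlretrieve(url, str(target))  # performs a network request
        saved.append(target)
    return saved
```

Calling `download_all()` needs network access and writes the three files into a `pies-data` directory.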

## Step 1: Define a Schema

@@ -379,6 +379,10 @@ and collection load.

## Notes

- At the time of writing, the prompts work well for XML processing.
  We plan to optimise them for smaller models and to improve coverage
  of other data types, so we recommend sticking with XML data for
  TrustGraph 1.3.
- The prompts can be sensitive to the choice of LLM: you may see
  hallucinations, or find that some data features are ignored.
- XPath expressions have some incompatibilities and edge cases with
Binary file added guides/structured-processing/nlp-query.png
87 changes: 86 additions & 1 deletion guides/structured-processing/query.md
@@ -32,11 +32,46 @@ Before starting this guide, ensure you have:
- Python 3.10 or later with the TrustGraph CLI tools installed (`pip install trustgraph-cli`)
- Sample documents or structured data files to process

## Workbench

The Structured Query page in the workbench UI lets you run the same
queries that this guide runs from the command line. Make sure that:

- The collection parameter is set correctly in the session state
  popover at the top right.
- The selected flow has object processing enabled, e.g. the
  `obj-ex` flow which you created if you are following this guide.

<img src="nlp-query.png" alt="NLP query"/>

<img src="structured-query.png" alt="Structured query"/>

## NLP query operation

This operation takes a natural language query and uses an LLM prompt
to convert it to a GraphQL query. The conversion relies on the defined
schemas, so you need to have loaded the schemas in the previous guide steps.

This is a building block for more complete functionality, but
inspecting the converted queries can also be useful for checking that
your application is performing well.

```bash
tg-invoke-nlp-query -f obj-ex -q 'Cities with more than 22.8m people'
```

If successful, the output looks something like this:

```
Generated GraphQL Query:
----------------------------------------
query { cities(where: {population: {gt: 22800000}}) { city country population } }
----------------------------------------
Detected Schemas: cities
Confidence: 95.00%
```
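
If you want to inspect converted queries programmatically, the printed report can be split into its parts. The following is a minimal sketch that assumes the output keeps exactly the layout shown above, with the query between two dashed lines. The function name `parse_nlp_query_output` is hypothetical, and treating `Detected Schemas` as a comma-separated list is an assumption.

```python
import re

# Sample report copied from the output shown above.
SAMPLE_OUTPUT = """\
Generated GraphQL Query:
----------------------------------------
query { cities(where: {population: {gt: 22800000}}) { city country population } }
----------------------------------------
Detected Schemas: cities
Confidence: 95.00%
"""


def parse_nlp_query_output(text: str) -> dict:
    """Split the printed report into query, schemas and confidence."""
    lines = [line.strip() for line in text.splitlines()]
    # The query sits between the two dashed separator lines.
    rules = [i for i, line in enumerate(lines) if line and set(line) == {"-"}]
    if len(rules) < 2:
        raise ValueError("expected two dashed separator lines")
    query = "\n".join(lines[rules[0] + 1 : rules[1]]).strip()

    schemas = []
    confidence = None
    for line in lines[rules[1] + 1 :]:
        if line.startswith("Detected Schemas:"):
            # Assumption: several schemas would be comma-separated.
            names = line.split(":", 1)[1].split(",")
            schemas = [name.strip() for name in names if name.strip()]
        elif line.startswith("Confidence:"):
            match = re.search(r"([\d.]+)%", line)
            if match:
                confidence = float(match.group(1)) / 100
    return {"query": query, "schemas": schemas, "confidence": confidence}


result = parse_nlp_query_output(SAMPLE_OUTPUT)
print(result["schemas"], result["confidence"])
```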

Querying the pies data:

```
tg-invoke-nlp-query -f obj-ex \
@@ -59,6 +94,31 @@ Confidence: 95.00%
This operation takes a GraphQL query, and executes it on the object
store.

Cities example:

```
tg-invoke-objects-query -f obj-ex --collection cities -q '
{
cities(where: {population: {gt: 22800000}}) { city country population }
}
'
```

```
+-----------+------------+------------+
| city      | country    | population |
+-----------+------------+------------+
| Shanghai  | China      | 30482140   |
| São Paulo | Brazil     | 22990007   |
| Delhi     | India      | 34665569   |
| Tokyo     | Japan      | 37036204   |
| Dhaka     | Bangladesh | 24652864   |
| Cairo     | Egypt      | 23074225   |
+-----------+------------+------------+
```
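
To make the `where` clause concrete, here is a small in-memory sketch of what a `gt` filter does, using only the rows from the table above. It illustrates the filter semantics and is not how the object store evaluates queries. The helper names `matches` and `select` are hypothetical, and only the `gt` operator that appears in this guide is modelled.

```python
import operator

# Rows copied from the query result shown above.
ROWS = [
    {"city": "Shanghai", "country": "China", "population": 30482140},
    {"city": "São Paulo", "country": "Brazil", "population": 22990007},
    {"city": "Delhi", "country": "India", "population": 34665569},
    {"city": "Tokyo", "country": "Japan", "population": 37036204},
    {"city": "Dhaka", "country": "Bangladesh", "population": 24652864},
    {"city": "Cairo", "country": "Egypt", "population": 23074225},
]

# Only `gt` is used in this guide, so only `gt` is modelled here.
OPERATORS = {"gt": operator.gt}


def matches(row: dict, where: dict) -> bool:
    """Return True if the row satisfies every field condition in where."""
    return all(
        OPERATORS[op](row[field], value)
        for field, condition in where.items()
        for op, value in condition.items()
    )


def select(rows: list, where: dict, fields: list) -> list:
    """Keep the matching rows and project them onto the requested fields."""
    return [{f: row[f] for f in fields} for row in rows if matches(row, where)]


# A stricter threshold than the guide's query narrows the result set.
over_30m = select(ROWS, {"population": {"gt": 30000000}}, ["city"])
print([row["city"] for row in over_30m])
```

With the guide's threshold of 22800000 every row above passes, which is why the table lists all six cities.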

Pies example:

```
tg-invoke-objects-query -f obj-ex \
--collection uk-pies \
@@ -86,6 +146,28 @@ You can use `--format` to request CSV or JSON output.

This is an API which uses the above two operations in sequence.
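
As a client-side illustration of that sequencing, the sketch below chains the two command-line tools shown earlier: it converts the question with `tg-invoke-nlp-query`, extracts the GraphQL text from the printed report, and passes it to `tg-invoke-objects-query`. This is only an approximation built on assumptions: the structured query operation itself does this for you, the helper names are hypothetical, and the extraction assumes the report layout shown in the NLP query section.

```python
import subprocess


def nlp_query_command(flow: str, question: str) -> list:
    """Argument list for the natural-language-to-GraphQL step."""
    return ["tg-invoke-nlp-query", "-f", flow, "-q", question]


def objects_query_command(flow: str, collection: str, graphql: str) -> list:
    """Argument list for the GraphQL execution step."""
    return ["tg-invoke-objects-query", "-f", flow,
            "--collection", collection, "-q", graphql]


def extract_graphql(report: str) -> str:
    """Return the text between the first two dashed separator lines."""
    lines = report.splitlines()
    rules = [i for i, l in enumerate(lines) if l.strip() and set(l.strip()) == {"-"}]
    if len(rules) < 2:
        raise ValueError("could not find the generated query in the report")
    return "\n".join(lines[rules[0] + 1 : rules[1]]).strip()


def structured_query(flow, collection, question, run=subprocess.run) -> str:
    """Run both steps in sequence and return the final command's output."""
    first = run(nlp_query_command(flow, question),
                capture_output=True, text=True, check=True)
    graphql = extract_graphql(first.stdout)
    second = run(objects_query_command(flow, collection, graphql),
                 capture_output=True, text=True, check=True)
    return second.stdout
```

The `run` parameter exists so that the command runner can be swapped out, for example in a test without a TrustGraph deployment.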

Cities example:

```
tg-invoke-structured-query -f obj-ex --collection cities \
-q 'Cities with more than 22.8m people'
```

```
+-----------+------------+------------+
| city      | country    | population |
+-----------+------------+------------+
| Shanghai  | China      | 30482140   |
| São Paulo | Brazil     | 22990007   |
| Delhi     | India      | 34665569   |
| Tokyo     | Japan      | 37036204   |
| Dhaka     | Bangladesh | 24652864   |
| Cairo     | Egypt      | 23074225   |
+-----------+------------+------------+
```

Pies example:

```
tg-invoke-structured-query -f obj-ex \
--collection uk-pies \
@@ -107,6 +189,9 @@ You can use `--format` to request CSV or JSON output.

## With collections

Using the same schema with different collections allows you to group
data:

```
tg-invoke-structured-query -f obj-ex \
--collection fr-pies \
Binary file added guides/structured-processing/structured-query.png