# **Writer Knowledge Graph**

Knowledge Graph is Writer’s implementation of retrieval-augmented generation (RAG), which allows a model’s responses to be improved with additional information in the form of files. This cookbook shows you how to manage knowledge graphs and the files associated with them.

## **Contents**

- [Introduction](#introduction)
- [Setup](#setup)
- [The `graphs` and `files` objects](#the-graphs-and-files-objects)
- [Creating a knowledge graph](#creating-a-knowledge-graph)
- [Getting the list of knowledge graphs](#getting-the-list-of-knowledge-graphs)
- [Retrieving a knowledge graph](#retrieving-a-knowledge-graph)
- [Updating a knowledge graph](#updating-a-knowledge-graph)
- [Deleting a knowledge graph](#deleting-a-knowledge-graph)
- [Uploading files to Writer](#uploading-files-to-writer)
- [Getting the list of files](#getting-the-list-of-files)
- [Retrieving a file](#retrieving-a-file)
- [Downloading a file](#downloading-a-file)
- [Deleting a file](#deleting-a-file)
- [Adding a file to a knowledge graph](#adding-a-file-to-a-knowledge-graph)
- [Deleting a file from a knowledge graph](#deleting-a-file-from-a-knowledge-graph)
- [For more information](#for-more-information)

## **Introduction**

### What is retrieval-augmented generation?

Retrieval-augmented generation (RAG) is a technique for improving the responses generated by large language models (LLMs) by augmenting them with additional data sources. When given a query or prompt, the LLM consults these additional data sources for relevant information. The information it retrieves from these sources is used to augment that input, which is then given to the LLM. The LLM then generates a response based on both the original query and the retrieved information.

The results are often more accurate and up to date, since they’re based on information beyond the model’s own training data, and they often contain fewer “hallucinations” (false information). Because the source of the additional information is known, RAG sometimes has the added benefit of producing results that are more transparent and explainable.
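
The retrieve-then-augment flow described above can be sketched in a few lines of plain Python. This is a toy illustration only: the word-overlap scoring and the `retrieve()` and `augment()` functions are invented for this example and are not how Writer retrieves information.

```python
def retrieve(query, documents, top_k=1):
    """Rank documents by how many words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def augment(query, retrieved):
    """Build the augmented input: retrieved context, then the original query."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return f"Context:\n{context}\n\nQuestion: {query}"


documents = [
    "The office closes at 6 pm on weekdays.",
    "Parking is free on weekends.",
]
query = "When does the office close?"

# The augmented prompt is what would be handed to the LLM
prompt = augment(query, retrieve(query, documents))
print(prompt)
```

In a real RAG system, the retrieval step is far more sophisticated, but the shape is the same: find relevant information, add it to the input, then generate.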

### What is Writer Knowledge Graph?

Writer Knowledge Graph is a form of RAG based on a graph database, which uses a graph structure to represent and store data. Unlike a traditional relational database where data is stored in tables, a graph database is built on nodes, which represent entities, and edges, which connect the nodes and represent the relationships between them.
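
To make the idea of nodes and edges concrete, here is a toy graph in plain Python. `TinyGraph` is a made-up class for illustration; it says nothing about how Writer stores data internally.

```python
from collections import defaultdict


class TinyGraph:
    """A toy graph: nodes are strings, edges are (relationship, target) pairs."""

    def __init__(self):
        # Maps each source node to the edges that leave it
        self._edges = defaultdict(list)

    def add_edge(self, source, relationship, target):
        """Connect two nodes with a named relationship."""
        self._edges[source].append((relationship, target))

    def related(self, node):
        """Return the (relationship, target) pairs leaving a node."""
        return list(self._edges.get(node, []))


graph = TinyGraph()
graph.add_edge("Ada Lovelace", "wrote_about", "Analytical Engine")
graph.add_edge("Charles Babbage", "designed", "Analytical Engine")

print(graph.related("Ada Lovelace"))  # [('wrote_about', 'Analytical Engine')]
```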

A knowledge graph acts as a repository of information that you can fill by adding files to it. When you add a file to a knowledge graph, the graph processes the file, adding it to its graph database and incorporating its information. You can create multiple knowledge graphs, and each knowledge graph can have multiple files, allowing you to create knowledge graphs for specific topics, domains, or areas of expertise.

Knowledge graphs and files are managed using two different APIs:

1. The Knowledge Graph API allows you to manage knowledge graphs. You can get a list of the knowledge graphs currently in your account, as well as create, retrieve, update, and delete them. You can also add any file that you have uploaded to your Writer account to any of your knowledge graphs, and you can add the same file to more than one knowledge graph.
2. The File API allows you to manage files, which live in their own repository. You can get a list of the files currently in your account, as well as upload, retrieve, and delete them. You can also download the contents of any file.

You can upload these file types to Writer:

- `csv`
- `doc` and `docx`
- `eml`
- `html`
- `pdf`
- `ppt` and `pptx`
- `srt`
- `txt`
- `xls` and `xlsx`
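
A quick check against the list above can catch unsupported files before you try to upload them. This is a minimal sketch: `SUPPORTED_EXTENSIONS` and `is_supported_file()` are names made up for this example, and the check looks only at the filename extension, not at the file’s contents.

```python
from pathlib import Path

# File extensions taken from the list of supported file types above
SUPPORTED_EXTENSIONS = {
    "csv", "doc", "docx", "eml", "html", "pdf",
    "ppt", "pptx", "srt", "txt", "xls", "xlsx",
}


def is_supported_file(file_path):
    """Return True if the file's extension is one of the supported types."""
    extension = Path(file_path).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


print(is_supported_file("report.PDF"))  # True
print(is_supported_file("photo.png"))   # False
```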

## **Setup**

### Dependencies

This notebook uses the following packages:

* `python-dotenv`: To load environment variables.
* `writer-sdk`: To access the Writer API.

Run the cell below to ensure you have these packages.

In [None]:
%pip install -r requirements.txt -q

### Initialization

The cell below performs the initialization required for this notebook including the creation of an instance of the `Writer` object to interact with the LLM.

To create a Palmyra client object, you need an API key. [You can sign up for one for free](https://app.writer.com/aistudio/signup). 

Once you have an API key, we recommend that you store it as an environment variable in a `.env` file like so:

```
WRITER_API_KEY="{Your Writer API key goes here}"
```

When you instantiate the client with `client = Writer()`, the newly-created object will automatically look for an environment variable named `WRITER_API_KEY` and will complete the instantiation if and only if `WRITER_API_KEY` has been defined. This notebook uses the [python-dotenv](https://pypi.org/project/python-dotenv/) library to automatically define environment variables based on the contents of an `.env` file in the same directory.

The `Writer()` initializer method also has an `api_key` parameter that you can use like this...

```
client = Writer(api_key="{Your Writer API key goes here}")
```

...but we strongly encourage you not to leave API keys in your source code.

In [None]:
# Run this cell before running any other cells in this cookbook!

from writerai import Writer

# Load environment variables from .env file
%reload_ext dotenv
%dotenv

client = Writer()
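
If `WRITER_API_KEY` is not defined, the instantiation above will not complete. One way to get a clearer error message is to check for the variable yourself after the `.env` file has been loaded. The sketch below uses only the standard library; `get_writer_api_key()` is a hypothetical helper, not part of the Writer SDK.

```python
import os


def get_writer_api_key(environ=None):
    """Return the value of WRITER_API_KEY, or raise a descriptive error."""
    if environ is None:
        environ = os.environ

    key = environ.get("WRITER_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "WRITER_API_KEY is not set. Add it to your .env file "
            "or define it as an environment variable."
        )
    return key

# Example use, after the .env file has been loaded:
# get_writer_api_key()
```

The optional `environ` parameter exists only so that the function can be tried out with an ordinary dictionary.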

<a id="the-graphs-and-files-objects"></a>
## **The `graphs` and `files` objects**

Now that you have a Writer client instance, it’s time to start working with knowledge graphs!

When working with knowledge graphs, you’ll be working with two properties of a Writer client instance:

1. The `graphs` property, which has methods for managing your knowledge graphs, and
2. The `files` property, which has methods for managing the files that you add to knowledge graphs.

<a id="creating-a-knowledge-graph"></a>
## **Creating a knowledge graph**

To create a new knowledge graph, use the `graphs.create()` method. It has some named parameters — here are the two most useful ones:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Named parameter</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>name</code> (required)</td>
        <td style="border: 1px solid #bfcbff;">
            The name of the graph.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>description</code> (optional)</td>
        <td style="border: 1px solid #bfcbff;">
            A description of the graph.
        </td>
    </tr>
</table>

The code below creates a new knowledge graph named `My Knowledge Graph`.

In [None]:
my_graph = client.graphs.create(name="My Knowledge Graph")

`graphs.create()` returns a `GraphCreateResponse` object with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the newly-created graph. You’ll need this to refer to a graph when calling other
            <code>client.graphs</code> methods.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>created_at</code></td>
        <td style="border: 1px solid #bfcbff;">
            The date and time the graph was created.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>name</code></td>
        <td style="border: 1px solid #bfcbff;">
            The name of the graph.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>description</code></td>
        <td style="border: 1px solid #bfcbff;">
            The description of the graph. This is an optional property; if the graph has no description, this value is <code>None</code>.
        </td>
    </tr>
</table>
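
Here is a sketch of a small wrapper that creates a graph with or without a description and returns the new graph’s `id`. `create_graph()` is a hypothetical helper; it uses only the `name` and `description` parameters and the `id` property described above.

```python
def create_graph(client, name, description=None):
    """
    Creates a knowledge graph and returns its id.
    The description is passed along only if one is provided.
    """
    params = {"name": name}
    if description is not None:
        params["description"] = description

    graph = client.graphs.create(**params)
    return graph.id

# Example use with the client created in the Setup section:
# graph_id = create_graph(client, "Support docs", "Product support articles")
```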

<a id="getting-the-list-of-knowledge-graphs"></a>
## **Getting the list of knowledge graphs**

To get a list of your knowledge graphs, use the `graphs.list()` method:

In [None]:
print("My knowledge graphs")
print("===================")
for index, graph in enumerate(client.graphs.list(), start=1):
    print(f"{index}. {graph.name} (id: {graph.id})")

`graphs.list()` returns a list of `Graph` objects with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the graph. You’ll need this to refer to a graph when calling other
            methods of the client’s <code>graphs</code> property.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>created_at</code></td>
        <td style="border: 1px solid #bfcbff;">
            The date and time the graph was created.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>file_status</code></td>
        <td style="border: 1px solid #bfcbff;">
            A <code>FileStatus</code> object describing files associated with the graph. It has the following properties:
            <ul>
                <li><code>completed</code>: Number of files that have successfully been associated with the graph.</li>
                <li><code>failed</code>: Number of files for which the attempt to associate them with the graph failed.</li>
                <li><code>in_progress</code>: Number of files that are in the process of being associated with the graph.</li>
                <li><code>total</code>: The total of <code>completed</code>, <code>failed</code>, and <code>in_progress</code>.</li>
            </ul>
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>name</code></td>
        <td style="border: 1px solid #bfcbff;">
            The name of the graph.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>description</code></td>
        <td style="border: 1px solid #bfcbff;">
            The description of the graph. This is an optional property; if the graph has no description, this value is <code>None</code>.
        </td>
    </tr>
</table>
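
The `file_status` property makes it easy to see how file processing is going for each graph. The sketch below builds a one-line summary per graph using only the properties listed in the table above; `summarize_graph()` and `summarize_graphs()` are hypothetical helpers.

```python
def summarize_graph(graph):
    """Returns a one-line summary of a graph's file processing status."""
    status = graph.file_status
    return (
        f"{graph.name}: {status.completed} completed, "
        f"{status.failed} failed, {status.in_progress} in progress "
        f"({status.total} total)"
    )


def summarize_graphs(client):
    """Returns a list of summaries, one for each knowledge graph."""
    return [summarize_graph(graph) for graph in client.graphs.list()]

# Example use with the client created in the Setup section:
# for line in summarize_graphs(client):
#     print(line)
```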

<a id="retrieving-a-knowledge-graph"></a>
## **Retrieving a knowledge graph**

Retrieving a knowledge graph is done with the `graphs.retrieve()` method, which gets the `Graph` object representing that graph. The `Graph` object contains the properties listed in the table above.

The code below retrieves the knowledge graph with the `id` value of `{Graph ID goes here}`:

In [None]:
graph = client.graphs.retrieve("{Graph ID goes here}")
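
Since a retrieved `Graph` object includes `file_status`, retrieving a graph is one way to check whether its files have finished processing. In this sketch, `graph_is_ready()` is a hypothetical helper, and treating a graph as “ready” when `in_progress` is zero is an interpretation of the property descriptions above, not documented behavior.

```python
def graph_is_ready(client, graph_id):
    """
    Returns True if none of the graph's files are still
    in the process of being added to it.
    """
    graph = client.graphs.retrieve(graph_id)
    return graph.file_status.in_progress == 0

# Example use with the client created in the Setup section:
# if graph_is_ready(client, "{Graph ID goes here}"):
#     print("All files have been processed.")
```

Note that a graph with no files in progress may still have files that failed, so you may also want to look at `file_status.failed`.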

<a id="updating-a-knowledge-graph"></a>
## **Updating a knowledge graph**

The `graphs.update()` method can update a couple of properties of any graph:

1. Its `name` (required)
2. Its `description`

Even if you only want to update a knowledge graph’s description, you still need to provide its `name`.

The code below updates the knowledge graph with the `id` value of `{Graph ID goes here}`, changing its name to `New Graph Name` and its description to `This is an updated description.`:

In [None]:
updated_graph = client.graphs.update(
    id="{Graph ID goes here}",
    name="New Graph Name",
    description="This is an updated description."
)
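
Because `name` is required, changing only a graph’s description means sending its current name along with the new description. The sketch below does this by retrieving the graph first; `update_graph_description()` is a hypothetical helper that uses the same `graphs.retrieve()` and `graphs.update()` calls shown in this cookbook.

```python
def update_graph_description(client, graph_id, description):
    """
    Updates only the description of a knowledge graph,
    keeping its current name.
    """
    # Look up the current name, since update() requires one
    current = client.graphs.retrieve(graph_id)

    return client.graphs.update(
        id=graph_id,
        name=current.name,
        description=description,
    )

# Example use with the client created in the Setup section:
# update_graph_description(client, "{Graph ID goes here}", "A better description.")
```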

<a id="deleting-a-knowledge-graph"></a>
## **Deleting a knowledge graph**

To delete a knowledge graph, use the `graphs.delete()` method, providing it with the `id` of the graph to be deleted.

The code below deletes the knowledge graph with the `id` value of `{Graph ID goes here}`:

In [None]:
deleted_graph = client.graphs.delete("{Graph ID goes here}")

`graphs.delete()` returns a `GraphDeleteResponse` object with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the graph that was specified for deletion.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>deleted</code></td>
        <td style="border: 1px solid #bfcbff;">
            <code>True</code> if the graph was successfully deleted.
        </td>
    </tr>
</table>
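
If you would rather have a failed deletion stop your code than go unnoticed, you can check the `deleted` property of the response. A minimal sketch, in which `delete_graph()` is a hypothetical helper and raising `RuntimeError` is just one possible way to report the problem:

```python
def delete_graph(client, graph_id):
    """
    Deletes a knowledge graph and returns its id.
    Raises RuntimeError if the response does not confirm the deletion.
    """
    response = client.graphs.delete(graph_id)
    if not response.deleted:
        raise RuntimeError(f"Graph {graph_id} was not deleted.")
    return response.id

# Example use with the client created in the Setup section:
# delete_graph(client, "{Graph ID goes here}")
```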

<a id="uploading-files-to-writer"></a>
## **Uploading files to Writer**

Before you can add a file to a knowledge graph, you first need to upload it to Writer.

The code below defines a couple of convenience functions:

- `upload_file_to_writer()`, which uses the `files.upload()` method to upload an individual file to Writer, and
- `upload_files_to_writer()`, which uses `upload_file_to_writer()` to upload a list of files to Writer.

In [None]:
import os

def upload_file_to_writer(file_path, client):
    """
    Uploads a single file to Writer (specified by pathname)
    and returns its id.
    """
    # Open and read the file's contents
    with open(file_path, 'rb') as file_obj:
        file_contents = file_obj.read()

    # Upload the file
    file = client.files.upload(
        content=file_contents,
        content_disposition=f"attachment; filename={os.path.basename(file_path)}",
        content_type="application/octet-stream",
    )

    return file.id

def upload_files_to_writer(file_paths, client):
    """
    Uploads a list of files to Writer (specified by pathnames)
    and returns a corresponding list of ids.
    """
    file_ids = []
    
    for file_path in file_paths:
        file_ids.append(upload_file_to_writer(file_path, client))

    return file_ids

Here’s an example that shows `upload_files_to_writer()` (which uses `upload_file_to_writer()`) in action. It uploads two files to Writer:

In [None]:
files = [
    "./neal-stephenson--in-the-beginning-was-the-command-line.pdf",
    "./neal-stephenson--the-interface-culture.pdf"
]        
file_ids = upload_files_to_writer(files, client)

Note that uploading a file to Writer and adding it to a graph are _not_ the same thing. Uploading a file simply makes it available to be added to a graph, which is a separate process (see _Adding a file to a knowledge graph_, below).
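
The upload helper defined earlier in this section sends every file with the generic content type `application/octet-stream`. If you would prefer to send a more specific type, Python’s standard `mimetypes` module can guess one from the filename. This is a sketch under an assumption: this cookbook does not say whether Writer treats specific content types differently, so check the File API reference before relying on it. `guess_content_type()` is a hypothetical helper.

```python
import mimetypes


def guess_content_type(file_path, default="application/octet-stream"):
    """
    Guesses a MIME type from the filename's extension,
    falling back to a generic default if the type is unknown.
    """
    content_type, _encoding = mimetypes.guess_type(file_path)
    return content_type or default


print(guess_content_type("report.pdf"))  # application/pdf
```

You could then pass the result as the `content_type` argument of `files.upload()` in place of the hard-coded value.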

<a id="getting-the-list-of-files"></a>
## **Getting the list of files**

The `files.list()` method lists the files you have uploaded to Writer:

In [None]:
print("My files")
print("========")
for index, file in enumerate(client.files.list(), start=1):
    print(f"{index}. {file.name} (id: {file.id})")

`files.list()` returns a list of `File` objects with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the file. You’ll need this to refer to a file when calling other
            methods of the client’s <code>files</code> property.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>created_at</code></td>
        <td style="border: 1px solid #bfcbff;">
            The date and time the file was created (on Writer’s system — technically, it’s the date
            and time the file was <em>uploaded</em>).
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>graph_ids</code></td>
        <td style="border: 1px solid #bfcbff;">
            A list of strings representing the IDs of the graphs to which the file has been added.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>name</code></td>
        <td style="border: 1px solid #bfcbff;">
            The name of the file.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>status</code></td>
        <td style="border: 1px solid #bfcbff;">
            The status of the file. Possible values are:
            <ul>
                <li><code>completed</code></li>
                <li><code>failed</code></li>
                <li><code>in_progress</code></li>
            </ul>
        </td>
    </tr>
</table>
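
The `status` and `graph_ids` properties are handy for housekeeping. The sketch below counts files by status and finds files that haven’t been added to any graph. Both functions are hypothetical helpers that rely only on the properties in the table above.

```python
from collections import Counter


def count_files_by_status(client):
    """Returns a Counter mapping each status to the number of files with it."""
    return Counter(file.status for file in client.files.list())


def files_not_in_any_graph(client):
    """Returns the files that have not been added to any knowledge graph."""
    return [file for file in client.files.list() if not file.graph_ids]

# Example use with the client created in the Setup section:
# print(count_files_by_status(client))
# for file in files_not_in_any_graph(client):
#     print(f"{file.name} (id: {file.id})")
```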

<a id="retrieving-a-file"></a>
## **Retrieving a file**

Retrieving a file is done with the `files.retrieve()` method, which gets the `File` object representing that file. The `File` object contains the properties listed in the table above.

The code below retrieves the file with the `id` value of `{File ID goes here}`:

In [None]:
file = client.files.retrieve("{File ID goes here}")
print(file)
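
Since a file’s `status` can be `in_progress`, you may want to wait until it changes before moving on. The sketch below polls `files.retrieve()` until the status is no longer `in_progress`. `wait_for_file()`, its parameters, and the default polling interval are all choices made for this example, not part of the Writer SDK.

```python
import time


def wait_for_file(client, file_id, poll_seconds=2.0, max_attempts=30):
    """
    Checks a file's status repeatedly until it is no longer "in_progress".
    Returns the final status, or raises TimeoutError after max_attempts checks.
    """
    for _ in range(max_attempts):
        file = client.files.retrieve(file_id)
        if file.status != "in_progress":
            return file.status
        time.sleep(poll_seconds)

    raise TimeoutError(
        f"File {file_id} was still in progress after {max_attempts} checks."
    )

# Example use with the client created in the Setup section:
# final_status = wait_for_file(client, "{File ID goes here}")
```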

<a id="downloading-a-file"></a>
## **Downloading a file**

To download a file from your Writer account’s file storage into memory, use the `files.download()` method.

`files.download()` returns an object with a `read()` method that returns the downloaded file’s contents as bytes. Writing those bytes to disk in binary mode preserves the file exactly, whether it is plain text or a binary format such as PDF.

The code below defines `download_file_from_writer()`, a convenience function that takes a file’s `id` value, a filename, and a Writer client instance. It downloads the file with the given `id` and writes it to the local filesystem under the given filename:

In [None]:
def download_file_from_writer(file_id, filename, client):
    """
    Downloads a file from Writer and saves it to the specified filename.
    """
    # read() returns the raw bytes of the file
    file_bytes = client.files.download(file_id).read()

    # Write in binary mode so that non-text files (e.g. PDFs) are saved intact
    with open(filename, "wb") as file:
        file.write(file_bytes)
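
A possible extension of the download helper is to look up the file’s stored name with `files.retrieve()` and save the download under that name. This sketch assumes that `read()` returns bytes, as described above, and writes them unchanged in binary mode. `download_file_with_original_name()` and its `directory` parameter are hypothetical names.

```python
from pathlib import Path


def download_file_with_original_name(client, file_id, directory="."):
    """
    Downloads a file from Writer and saves it in the given directory
    under the name stored with the file. Returns the path of the saved file.
    """
    file_info = client.files.retrieve(file_id)

    # Keep only the final path component of the stored name
    target = Path(directory) / Path(file_info.name).name

    target.write_bytes(client.files.download(file_id).read())
    return target

# Example use with the client created in the Setup section:
# saved_path = download_file_with_original_name(client, "{File ID goes here}")
```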

<a id="deleting-a-file"></a>
## **Deleting a file**

To delete a file, use the `files.delete()` method, providing it with the `id` of the file to be deleted.

The code below deletes the file with the `id` value of `{File ID goes here}`:

In [None]:
deleted_file = client.files.delete("{File ID goes here}")

`files.delete()` returns a `FileDeleteResponse` object with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the file that was specified for deletion.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>deleted</code></td>
        <td style="border: 1px solid #bfcbff;">
            <code>True</code> if the file was successfully deleted.
        </td>
    </tr>
</table>

<a id="adding-a-file-to-a-knowledge-graph"></a>
## **Adding a file to a knowledge graph**

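Given a knowledge graph `id` and a file `id`, `graphs.add_file_to_graph()` adds the given file to the given graph, making the file’s content available to the graph.

The following code adds the file with the `id` value of `{File ID goes here}` to the knowledge graph with the `id` value of `{Graph ID goes here}`: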

In [None]:
client.graphs.add_file_to_graph(
    graph_id="{Graph ID goes here}", 
    file_id="{File ID goes here}"
)

Note that attempting to add a file to a knowledge graph that already has that file will result in an error.

`graphs.add_file_to_graph()` returns a list of objects with the following properties:

<table width="66%">
    <tr>
        <th width="25%" style="background-color: #5551ff; color: #ffffff;">Property</th>
        <th style="background-color: #5551ff; color: #ffffff;">Description</th>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>id</code></td>
        <td style="border: 1px solid #bfcbff;">
            The ID of the file.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>created_at</code></td>
        <td style="border: 1px solid #bfcbff;">
            The date and time the file was added to the knowledge graph.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>name</code></td>
        <td style="border: 1px solid #bfcbff;">
            The name of the file.
        </td>
    </tr>
    <tr>
        <td style="border: 1px solid #bfcbff;"><code>status</code></td>
        <td style="border: 1px solid #bfcbff;">
            <p>The status of the file. Possible values are:</p>
            <ul>
                <li><code>completed</code></li>
                <li><code>failed</code></li>
                <li><code>in_progress</code></li>
            </ul>
        </td>
    </tr>
</table>
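
Since adding a file to a graph that already has it results in an error, it can be useful to check a file’s `graph_ids` first. The sketch below adds several files to a graph and skips any that are already in it. `add_files_to_graph()` is a hypothetical helper built from the `files.retrieve()` and `graphs.add_file_to_graph()` calls shown in this cookbook.

```python
def add_files_to_graph(client, graph_id, file_ids):
    """
    Adds each file to the graph unless the file is already in it.
    Returns two lists: the ids that were added and the ids that were skipped.
    """
    added = []
    skipped = []

    for file_id in file_ids:
        file = client.files.retrieve(file_id)

        # graph_ids lists the graphs the file has already been added to
        if graph_id in file.graph_ids:
            skipped.append(file_id)
            continue

        client.graphs.add_file_to_graph(graph_id=graph_id, file_id=file_id)
        added.append(file_id)

    return added, skipped

# Example use with the client created in the Setup section:
# added, skipped = add_files_to_graph(client, "{Graph ID goes here}", file_ids)
```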

## **For more information**

For more information about knowledge graphs, the `graphs` object, and its methods, see:

- [The _Knowledge Graph_ guide](https://dev.writer.com/api-guides/knowledge-graph)
- The Knowledge Graph API pages:
  - [Create graph](https://dev.writer.com/api-guides/api-reference/kg-api/create-graph)
  - [List graphs](https://dev.writer.com/api-guides/api-reference/kg-api/list-graphs)
  - [Retrieve graph](https://dev.writer.com/api-guides/api-reference/kg-api/retrieve-graph)
  - [Update graph](https://dev.writer.com/api-guides/api-reference/kg-api/update-graph)
  - [Delete graph](https://dev.writer.com/api-guides/api-reference/kg-api/delete-graph)
  - [Add file to graph](https://dev.writer.com/api-guides/api-reference/kg-api/add-file-to-graph)
  - [Remove file from graph](https://dev.writer.com/api-guides/api-reference/kg-api/remove-file-from-graph)
- The File API pages:
  - Upload file
  - Retrieve file
  - List files
  - Download file
  - Delete file
  - Retry failed files
- [_Graph File Upload Tool_ tutorial](https://dev.writer.com/api-guides/api-tutorials/file-upload-tool)