Goals


*   Convert a document to a knowledge graph
*   Merge knowledge graphs



Architecture


*   Input of Prompts or Documents



To Do

* Handle prompts and documents in STM
* Convert LTM from flat to a hierarchy
* How do I keep track of the source of an entity? What if it has multiple sources?
* Process all the nodes with LTMType of BaseType and look to form type groups
* Process all the nodes under the same LTMType of BaseType and see if any need to be merged.
* Turn off the merging in JSON and let it all occur in neo4j
* Find anomalies - statistics / ML based
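
One possible way to approach the source-tracking question above, given only as a sketch: since each entity already carries `Source` relationships in the JSON format used in this notebook, the sources can be gathered into a set per entity, which handles multiple sources naturally. The helper name `collect_sources` is hypothetical and not part of the notebook.

```python
from collections import defaultdict


def collect_sources(entities):
    """Group the targets of 'Source' relationships by (entityName, entityType).

    Hypothetical helper: `entities` is a list of dicts in the notebook's
    knowledge-graph JSON format.
    """
    sources = defaultdict(set)
    for entity in entities:
        key = (entity["entityName"], entity["entityType"])
        for rel in entity.get("relationships", []):
            if rel["relationshipType"] == "Source":
                target = rel["relationshipTarget"]
                # A set keeps each (type, name) source only once per entity.
                sources[key].add((target["entityType"], target["entityName"]))
    # Sort for a stable, readable result.
    return {key: sorted(values) for key, values in sources.items()}


example = [
    {
        "entityName": "Conewago Township Sewer Authority",
        "entityType": "Organization",
        "properties": [],
        "relationships": [
            {"relationshipType": "Source",
             "relationshipTarget": {"entityName": "Conewago Township Sewer Authority, PA.docx",
                                    "entityType": "File Name"}},
            {"relationshipType": "Source",
             "relationshipTarget": {"entityName": "1", "entityType": "Chunk"}},
            {"relationshipType": "Source",
             "relationshipTarget": {"entityName": "3", "entityType": "Chunk"}},
            # Duplicate on purpose: the set removes it.
            {"relationshipType": "Source",
             "relationshipTarget": {"entityName": "1", "entityType": "Chunk"}},
        ],
    }
]

print(collect_sources(example))
```

Keeping sources as a set per entity means a second document that mentions the same entity only adds new entries rather than overwriting the earlier ones.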

In [None]:
# Install the necessary libraries
!pip install python-docx
!pip install neo4j
!pip install python-dotenv

# Import the libraries
import docx  # Used for working with .docx files.
import os  # Provides a way to interact with the operating system.
import google.generativeai as genai  # Google’s Generative AI library.
from google.colab import userdata  # Manages user data in Google Colab.
import json  # Parses and manipulates JSON data.
from neo4j import GraphDatabase  # Interacts with the Neo4j graph database.
import re  # Supports working with regular expressions.
import textwrap  # Formats text by wrapping it to a specified width.
import dotenv  # Loads environment variables from a file (python-dotenv).

# Documentation for the libraries
"""
    python-docx: Library for creating, modifying, and reading .docx files.
    os: Standard library for interacting with the operating system, including file operations.
    google.generativeai: Library for interacting with Google’s Generative AI services.
    google.colab: Library for handling user data in Google Colab environment.
    json: Standard library for parsing and manipulating JSON data.
    neo4j: Library for interacting with Neo4j graph database.
    re: Standard library for working with regular expressions.
    textwrap: Standard library for formatting text by wrapping it to a specified width.
"""





In [None]:
# Mount Google Drive to access the file
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
class GeminiInitializationError(Exception):
    pass

class GeminiQueryError(Exception):
    pass

class GeminiResponseBlockedError(Exception):
    pass

class ParagraphsTooLong(Exception):
    pass

In [None]:
class Mind:
  INDENT = 4                    # The number of characters to indent the json submitted to Gemini.
  MAX_VARIABLE_LENGTH = 2 ** 16 # Adjust as needed.  Consider the context window of your chosen model.
  PARAGRAPH_SET = 50            # The maximum number of paragraphs to process in a set.
  ITERATIONS = 3                # The number of times to process the data
  STM_MAX = 10000               # Once STM reaches this size it will be consolidated to LTM

  Q_KG = '''Create a separate knowledge graph for each named entity you find in the python variable.
            If you find an entity and know its name but not its type, use "Unknown" for the type.
            If you find an entity and know its type but not its name, use "Unknown" for the name.
            If you find a relationship and know the entity names but not the relationship type, use "Unknown" for the type.
            Do not include explanations of what you did. Use the schema to format your response.
            Do not include the schema in your response. Just include the json in your response.
            The knowledge graph must be in the following json schema format.
            {
              "$schema": "https://json-schema.org/draft/2020-12/schema",
              "$id": "https://example.com/product.schema.json",
              "title": "Frames in JSON.",
              "description": "An approach to use JSON for data in a Frame structure. See A Framework for Representing Knowledge, Marvin Minsky, 1974.",
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "entityName": { "type": "string" },
                  "entityType": { "type": "string" },
                  "properties": {
                    "type": "array",
                    "items": {
                      "type": "object",
                      "properties": {
                        "propertyName": { "type": "string" },
                        "propertyValue": { "type": "string" }
                      },
                      "required": ["propertyName", "propertyValue"]
                    }
                  },
                  "relationships": {
                    "type": "array",
                    "items": {
                      "type": "object",
                      "properties": {
                        "relationshipType": { "type": "string" },
                        "relationshipTarget": {
                          "type": "object",
                          "properties": {
                            "entityName": { "type": "string" },
                            "entityType": { "type": "string" }
                          },
                          "required": ["entityName", "entityType"]
                        }
                      },
                      "required": ["relationshipType", "relationshipTarget"]
                    }
                  }
                },
                "required": ["entityName", "entityType", "properties", "relationships"]
              }
            }'''
  Q_PC = "How many paragraphs are there in this content?"
  Q_MPC = "Review the content and assume that the estimates are correct. Then add them all together for a total count of paragraphs."
  Q_WC = "How many words are there in this content?"
  Q_MWC = "Review the content and assume that the estimates are correct. Then add them all together for a total count of words."
  Q_TW = "What are the ten most used words in this content and how often does each occur? Ignore function words for this count."
  Q_MTW = "Review the content and assume that the estimates are correct. Then merge them all together for a summarized list of words sorted by frequency."


  def __init__(self,neo4j_file_path):
    load_status = dotenv.load_dotenv(neo4j_file_path)
    if load_status is False:
      raise RuntimeError('Environment variables not loaded.')

    URI = os.getenv("NEO4J_URI")
    AUTH = (os.getenv("NEO4J_USERNAME"), os.getenv("NEO4J_PASSWORD"))

    self.driver = GraphDatabase.driver(URI, auth=AUTH)
    self.ShortTermMemory = []
    self.LongTermMemory = []

  def text_to_stm(self, variable, question, api_key=None, model_name="gemini-1.5-flash", max_output_tokens=None, temperature=None,  additional_context=None, doc_path=None, file_name=None, chunk=None):
    """
    Asks Gemini a question about the contents of a Python variable.

    Args:
        variable: The Python variable to analyze.
        question: The question to ask Gemini about the variable.
        api_key: (Optional) Your Gemini API key. If not provided, it must already
                  be configured via genai.configure().
        model_name: (Optional) The Gemini model to use (e.g., "gemini-1.5-pro", "gemini-1.0-pro"). Defaults to "gemini-1.5-flash".
        max_output_tokens: (Optional) Maximum number of tokens in the response.
        temperature: (Optional) Controls the randomness of the response (0.0 - 1.0). Lower values are more deterministic.
        additional_context: (Optional) A string containing additional information relevant to the variable or question.
        doc_path: (Optional) Document path to record as a Source relationship on each entity.
        file_name: (Optional) File name to record as a Source relationship on each entity.
        chunk: (Optional) Chunk identifier to record as a Source relationship on each entity.

    Returns:
        The Gemini API's response text as a string.

    Raises:
        GeminiInitializationError: If there's an issue initializing the Gemini model.
        GeminiQueryError: If there's an error querying the Gemini API.
        GeminiResponseBlockedError: If the prompt or the Gemini response is blocked due to content policy.
    """

    if api_key:
        genai.configure(api_key=api_key)


    try:
        model = genai.GenerativeModel(model_name)
    except Exception as e:
      raise GeminiInitializationError(f"Error initializing Gemini model: {str(e)}. Ensure your API key is configured and you have access to the specified model.")

    # --- Improved Prompt Construction ---
    prompt_parts = []
    prompt_parts.append("You are a helpful assistant that can analyze Python variables that contain paragraphs of text and answer questions about them.")

    if additional_context:
        prompt_parts.append(f"Additional Context:\n{additional_context}\n")

    if doc_path:
        prompt_parts.append(f"For each named entity you find include a relationship with relationship type of Source, entity type of Document Path, and entity name of {doc_path}\n")

    if file_name:
        prompt_parts.append(f"For each named entity you find include a relationship with relationship type of Source, entity type of File Name, and entity name of {file_name}\n")

    if chunk:
        prompt_parts.append(f"For each named entity you find include a relationship with relationship type of Source, entity type of Chunk, and entity name of {chunk}\n")

    prompt_parts.append(f"Analyze the following Python variable and answer the question below:")

    # --- Improved Variable Handling ---

    # Use a safer representation:  json.dumps is generally better for complex objects.
    try:
        variable_str = json.dumps(variable, indent=self.INDENT, default=str)  # Use default=str to handle non-serializable objects
    except TypeError:
        variable_str = repr(variable) # Fallback to repr if json fails

    # Limit variable size to prevent long prompts
    variable_str = textwrap.shorten(variable_str, width=self.MAX_VARIABLE_LENGTH, placeholder="... (truncated)")

    prompt_parts.append(f"Variable (Python):\n```python\n{variable_str}\n```")

    prompt_parts.append(f"Question:\n{question}")

    # ---  Construct the full prompt  ---
    prompt = "\n\n".join(prompt_parts)

    # --- Generation Configuration ---
    generation_config = {}
    if max_output_tokens is not None:
        generation_config["max_output_tokens"] = max_output_tokens
    if temperature is not None:
        generation_config["temperature"] = temperature

    # Add more safety settings, these are very important
    safety_settings = [
      {
        "category": "HARM_CATEGORY_HARASSMENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_HATE_SPEECH",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
      {
        "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
        "threshold": "BLOCK_MEDIUM_AND_ABOVE"
      },
    ]
    # for testing
    if len(prompt) > self.MAX_VARIABLE_LENGTH:
      print("Prompt is too long, fix it.")

    # ---  Generate Response  ---
    try:
        response = model.generate_content(prompt, generation_config=generation_config, safety_settings=safety_settings)
        # A set block_reason in prompt_feedback means the prompt itself was blocked.
        if response.prompt_feedback.block_reason:
          raise GeminiResponseBlockedError(f"Prompt was blocked: {response.prompt_feedback}")
        return response.text
    except GeminiResponseBlockedError:
        # Re-raise unchanged so the generic handler below does not relabel it as a query error.
        raise
    except Exception as e:
        # More specific error handling:  Catch common errors.
        if "Response was blocked" in str(e):  # Check for content filtering.
            raise GeminiResponseBlockedError(f"Error: Gemini response was blocked due to content policy. Error: {e}")
        raise GeminiQueryError(f"Error querying Gemini API: {str(e)}")

  def close(self):
    self.driver.close()

  def clear_database(self):
    with self.driver.session() as session:
        session.run("MATCH (n) DETACH DELETE n")

  def kg_to_neo4j(self, json_data):
    data = json.loads(json_data)  # Parse the JSON string
    with self.driver.session() as session:
        for entity in data:
            entity_name = entity["entityName"]
            entity_type = entity["entityType"]
            properties = entity["properties"]
            relationships = entity["relationships"]

            # Create or match the entity node and its entity type node in one query,
            # and add the LTMType property to the entity type node
            session.run("""
                MERGE (e:Entity {name: $entity_name, type: $entity_type})
                ON CREATE SET e.created = timestamp()
                MERGE (et:EntityType {name: $entity_type})
                ON CREATE SET et.created = timestamp(), et.LTMType = 'BaseType'
                WITH e, et
                MERGE (e)-[:TypeOf]->(et)
            """, entity_name=entity_name, entity_type=entity_type)

            # Set the properties
            for prop in properties:
                prop_name = prop["propertyName"]
                prop_value = prop["propertyValue"]
                session.run("""
                    MATCH (e:Entity {name: $entity_name, type: $entity_type})
                    SET e[$prop_name] = $prop_value
                """, entity_name=entity_name, entity_type=entity_type, prop_name=prop_name, prop_value=prop_value)

            # Create the relationships
            for rel in relationships:
                rel_type = rel["relationshipType"]
                target_name = rel["relationshipTarget"]["entityName"]
                target_type = rel["relationshipTarget"]["entityType"]
                # Sanitize the relationship type: Replace invalid characters with underscores
                rel_type = re.sub(r"[^a-zA-Z0-9_]", "_", rel_type)
                # If rel_type is empty after sanitization, use 'RELATED_TO' as default
                if not rel_type:
                    rel_type = "RELATED_TO"
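                # Use the sanitized relationship type in the query.
                # The target is merged rather than matched so that targets never emitted as
                # entities themselves (e.g. File Name or Chunk sources) still get a node;
                # with MATCH, those relationships would be silently skipped.
                session.run(f"""
                    MATCH (e:Entity {{name: $entity_name, type: $entity_type}})
                    MERGE (t:Entity {{name: $target_name, type: $target_type}})
                    ON CREATE SET t.created = timestamp()
                    MERGE (e)-[r:`{rel_type}`]->(t)
                    ON CREATE SET r.created = timestamp()
                """, entity_name=entity_name, entity_type=entity_type, target_name=target_name, target_type=target_type)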


  def consolidate_stm_to_ltm(self):
    """
    Consolidate the Short Term Memory into the Long Term Memory.
    """
    new_JSON = ""
    for jsonString in self.ShortTermMemory:
      self.kg_to_neo4j(jsonString)
      if new_JSON == "":
        new_JSON = jsonString
      else:
        new_JSON = self.merge_entities_lists(new_JSON, jsonString)
    if new_JSON != "":
      self.LongTermMemory.append(new_JSON)
    self.ShortTermMemory.clear()

  def merge_entities_lists(self, entities1_str, entities2_str):
      if not entities1_str:
          return entities2_str
      if not entities2_str:
          return entities1_str

      entities1 = json.loads(entities1_str)
      entities2 = json.loads(entities2_str)

      merged_entities = {}

      for entity in entities1 + entities2:
          # Check if 'entityName' and 'entityType' keys exist before creating the key
          if "entityName" in entity and "entityType" in entity:
              key = (entity["entityName"], entity["entityType"])
              if key not in merged_entities:
                  merged_entities[key] = {
                      "entityName": entity["entityName"],
                      "entityType": entity["entityType"],
                      "properties": [],
                      "relationships": []
                  }
              # Use .get() so an entity without one of these lists does not raise KeyError
              merged_entities[key]["properties"].extend(entity.get("properties", []))
              merged_entities[key]["relationships"].extend(entity.get("relationships", []))
          else:
              # Handle cases where 'entityName' or 'entityType' is missing
              print(f"Warning: Entity missing 'entityName' or 'entityType' key: {entity}")
              # You can choose to skip these entities or handle them differently

      return json.dumps(list(merged_entities.values()), indent=2)

  def clean_ltm(self):
      for i in range(len(self.LongTermMemory)):
          # Parse the JSON string into a list of dictionaries
          entities = json.loads(self.LongTermMemory[i])
          clean_entity_set = self.remove_duplicates(entities)
          # Convert the cleaned entities back to a JSON string
          self.LongTermMemory[i] = json.dumps(clean_entity_set, indent=2)

  def remove_duplicates(self, entities):
      for entity in entities:
          # Remove duplicate properties
          properties = entity["properties"]
          unique_properties = []
          seen_properties = set()
          for prop in properties:
              prop_tuple = (prop["propertyName"], prop["propertyValue"])
              if prop_tuple not in seen_properties:
                  seen_properties.add(prop_tuple)
                  unique_properties.append(prop)
          entity["properties"] = unique_properties

          # Remove duplicate relationships
          relationships = entity["relationships"]
          unique_relationships = []
          seen_relationships = set()
          for rel in relationships:
              rel_tuple = (rel["relationshipType"], rel["relationshipTarget"]["entityName"], rel["relationshipTarget"]["entityType"])
              if rel_tuple not in seen_relationships:
                  seen_relationships.add(rel_tuple)
                  unique_relationships.append(rel)
          entity["relationships"] = unique_relationships

      return entities

  def proccessDocument(self, document):
    for paragraphSet in document:
      print(len(paragraphSet))
      self.processInput(paragraphSet, "document", file_name=document.get_file_name(), chunk=document.get_iteration())
    # One last time to make sure STM is clean
    self.consolidate_stm_to_ltm()
    # Get rid of duplicate properties and relationships
    print("Starting clean up")
    self.clean_ltm()
    print(f"STM Len: {len(self.ShortTermMemory)}.")
    print(f"LTM Len: {len(self.LongTermMemory)}.")
    for entitySet in self.LongTermMemory:
      print(len(entitySet))
      print(entitySet)

  def processPrompt(self, prompt):
    pass

  def processInput(self, input, input_type="document", doc_path=None, file_name=None, chunk=None):
    stm_size = 0
    read_pass = 0
    if input_type == "document":
        response = self.text_to_stm(variable=input, question=self.Q_KG, api_key=userdata.get('GeminiAPIKey'), model_name="gemini-1.5-flash", doc_path=doc_path, file_name=file_name, chunk=chunk)
        # Attempt to extract the JSON from the fenced block in the response
        try:
            # Search for JSON enclosed in a triple-backtick json fence
            json_str = re.search(r'\`\`\`json\s*([\s\S]*?)\s*\`\`\`', response, re.DOTALL).group(1)
            # group(1) captures the content between the opening and closing fences
            json.loads(json_str)  # This line will raise an exception if json_str is not valid JSON
            self.ShortTermMemory.append(json_str)
        except (AttributeError, json.JSONDecodeError):
            # AttributeError: the regex found no fenced block (re.search returned None).
            # JSONDecodeError: a block was found but it is not valid JSON.
            print(f"Warning: Could not extract JSON from response: {response}")

        for textString in self.ShortTermMemory:
            stm_size += len(textString)
        if stm_size > self.STM_MAX:
            self.consolidate_stm_to_ltm()
    elif input_type == "prompt":
        self.ShortTermMemory.append("TBD")
    else:
        raise ValueError("Invalid input type. Must be 'document' or 'prompt'.")
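
The JSON extraction step in `processInput` can be tried out on its own, without calling Gemini. The sketch below is not the notebook's code: `extract_json_block` is a hypothetical helper, and the fallback that parses the whole reply when no fence is present is an added assumption about how models sometimes answer.

```python
import json
import re

# Three backticks, built programmatically so this example is easy to embed.
FENCE = "`" * 3


def extract_json_block(response_text):
    """Return parsed JSON from the first fenced json block, or None.

    Hypothetical helper. If no fenced block is found, the whole reply is
    tried as JSON (an assumption, not the notebook's behavior).
    """
    pattern = re.escape(FENCE) + r"json\s*(.*?)\s*" + re.escape(FENCE)
    match = re.search(pattern, response_text, re.DOTALL)
    candidate = match.group(1) if match else response_text.strip()
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None


reply = "Here is the graph:\n" + FENCE + 'json\n[{"entityName": "A", "entityType": "Unknown"}]\n' + FENCE
print(extract_json_block(reply))
print(extract_json_block("no json here"))
```

Returning the parsed object (or `None`) rather than the raw string lets the caller decide whether to store, log, or retry, instead of parsing the same text twice.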



In [None]:
class Document:

    def __init__(self, path, firstParagraphSet=50, remainingParagraphSet=50):
        """
        Initialize the Document object with the given file path.

        Args:
            path (str): The path to a document that will be analyzed. Must be a
                        Word document in docx format.
            firstParagraphSet (int): The number of paragraphs to extract in the first set.
            remainingParagraphSet (int): The number of paragraphs to extract in the subsequent sets.

        Returns:
            None
        """
        self.doc_path = path
        try:
            self.doc = docx.Document(path)
        except Exception as e:
            raise ValueError("Error opening document: " + str(e))
        self.num_paragraphs = len(self.doc.paragraphs)

        self.restart(firstParagraphSet, remainingParagraphSet)

    def __iter__(self):
        """
        Make the Document object an iterator.

        Returns:
            self: The iterator object itself.
        """
        return self

    def __next__(self):
        """
        Retrieve the next set of paragraphs from the document.

        Returns:
            str: The requested number of paragraphs as a single string. If there are fewer
                 paragraphs remaining, all available paragraphs are returned. If none are
                 available, StopIteration is raised.
        """
        if self.current_paragraph >= self.num_paragraphs:
            raise StopIteration
        else:
            end_paragraph = min(self.current_paragraph + self.currentParagraphSet, self.num_paragraphs)
            extracted_text = [self.doc.paragraphs[i].text for i in range(self.current_paragraph, end_paragraph)]
            retval = '\n'.join(extracted_text)
            self.current_paragraph = end_paragraph
            if self.iteration == 0:
                self.currentParagraphSet = self.remainingParagraphSet
            self.iteration += 1
            return retval

    def __str__(self):
        """
        Provide a string representation of the Document object.

        Returns:
            str: A string describing the document, including its path and the number of paragraphs.
        """
        return f"Document Path: {self.doc_path}\nTotal Paragraphs: {self.num_paragraphs}\nFirst Paragraph Set Size: {self.firstParagraphSet}\nRemaining Paragraph Set Size: {self.remainingParagraphSet}"

    def restart(self, firstParagraphSet=50, remainingParagraphSet=50):
        """
        Restart the iteration process with new paragraph set sizes.

        Args:
            firstParagraphSet (int): The number of paragraphs to extract in the first set.
            remainingParagraphSet (int): The number of paragraphs to extract in the subsequent sets.

        Returns:
            None
        """
        self.firstParagraphSet = firstParagraphSet
        self.remainingParagraphSet = remainingParagraphSet
        self.iteration = 0
        self.current_paragraph = 0
        self.currentParagraphSet = self.firstParagraphSet

    def get_iteration(self):
        """
        Get the current iteration number.

        Returns:
            int: The current iteration number.
        """
        return self.iteration

    def get_doc_path(self):
        """
        Get the path to the document.

        Returns:
            str: The path to the document.
        """
        return self.doc_path

    def get_file_name(self):
        """
        Get the name of the document file.

        Returns:
            str: The name of the document file.
        """
        return os.path.basename(self.doc_path)

    def get_word_count(self):
        """
        Calculate the total number of words in the document.

        Returns:
            int: The total word count.
        """
        total_words = sum(len(paragraph.text.split()) for paragraph in self.doc.paragraphs)
        return total_words

    def search_keyword(self, keyword):
        """
        Search for a keyword in the document and return the paragraphs containing it.

        Args:
            keyword (str): The keyword to search for.

        Returns:
            list: A list of paragraphs containing the keyword.
        """
        paragraphs_with_keyword = [paragraph.text for paragraph in self.doc.paragraphs if keyword in paragraph.text]
        return paragraphs_with_keyword

    def replace_keyword(self, old_keyword, new_keyword):
        """
        Replace a keyword in the document with a new keyword.

        Args:
            old_keyword (str): The keyword to be replaced.
            new_keyword (str): The new keyword to replace the old one.

        Returns:
            None
        """
        for paragraph in self.doc.paragraphs:
            if old_keyword in paragraph.text:
                paragraph.text = paragraph.text.replace(old_keyword, new_keyword)

    def get_paragraph(self, index):
        """
        Retrieve a specific paragraph by its index.

        Args:
            index (int): The index of the paragraph to retrieve.

        Returns:
            str: The text of the specified paragraph.
        """
        if index < 0 or index >= self.num_paragraphs:
            raise IndexError("Paragraph index out of range.")
        return self.doc.paragraphs[index].text

    def save(self, path=None):
        """
        Save the document to a file.

        Args:
            path (str): The file path to save the document. If None, saves to the original path.

        Returns:
            None
        """
        save_path = path if path else self.doc_path
        self.doc.save(save_path)

    def get_statistics(self):
        """
        Retrieve statistics about the document.

        Returns:
            dict: A dictionary containing various statistics about the document.
        """
        statistics = {
            "Total Paragraphs": self.num_paragraphs,
            "Total Words": self.get_word_count(),
            "First Paragraph Set Size": self.firstParagraphSet,
            "Remaining Paragraph Set Size": self.remainingParagraphSet,
        }
        return statistics
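
The chunking logic in `__next__` (one size for the first set, another for all later sets) can also be written as a generator over a plain list of paragraph strings. This is only a sketch under that assumption; `paragraph_sets` is a hypothetical name and it does not read .docx files.

```python
def paragraph_sets(paragraphs, first_size=50, remaining_size=50):
    """Yield newline-joined sets of paragraphs.

    The first set holds up to `first_size` paragraphs, every later set up to
    `remaining_size`. Hypothetical stand-in for Document.__next__.
    """
    start = 0
    size = first_size
    while start < len(paragraphs):
        end = min(start + size, len(paragraphs))
        yield "\n".join(paragraphs[start:end])
        start = end
        # After the first set, switch to the size used for the remaining sets.
        size = remaining_size


sample = [f"p{i}" for i in range(7)]
sets = list(paragraph_sets(sample, first_size=2, remaining_size=3))
print(sets)
```

Because each call to the generator starts from the first paragraph, iterating a second time does not need an explicit `restart()`, which the class-based iterator above does require.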


In [None]:
file_path = "/content/drive/MyDrive/Praxis/Conewago Township Sewer Authority, PA.docx"
aDocument1 = Document(file_path, 50, 50)
print(aDocument1.get_word_count())
#aList = aDocument1.search_keyword("sewer")
print(aDocument1)
for paragraph in aDocument1:
  print(len(paragraph))
aDocument1.restart(25, 50)
print(aDocument1)
for paragraph in aDocument1:
  print(len(paragraph))

21964
Document Path: /content/drive/MyDrive/Praxis/Conewago Township Sewer Authority, PA.docx
Total Paragraphs: 1259
First Paragraph Set Size: 50
Remaining Paragraph Set Size: 50
4008
8111
4424
3851
7507
2822
3625
361
5929
6175
6842
9301
8966
9026
7484
7425
4333
4706
7284
5138
9153
6192
2122
2122
1702
8
Document Path: /content/drive/MyDrive/Praxis/Conewago Township Sewer Authority, PA.docx
Total Paragraphs: 1259
First Paragraph Set Size: 25
Remaining Paragraph Set Size: 50
1141
5914
7969
2842
6821
3861
4254
1570
3421
5270
6566
8815
9339
8827
8558
8347
4661
4472
6725
5624
7010
7446
4237
1119
3263
545
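
Both runs above print 26 lengths. As a quick check of that count, the number of sets can be computed from the paragraph total and the two set sizes. The function name `expected_sets` is hypothetical; the figures 1259, 50, and 25 come from the output above.

```python
import math


def expected_sets(num_paragraphs, first_size, remaining_size):
    """Number of sets produced for the given paragraph count and set sizes."""
    if num_paragraphs <= 0:
        return 0
    if num_paragraphs <= first_size:
        return 1
    # One first set, then enough remaining sets to cover what is left.
    return 1 + math.ceil((num_paragraphs - first_size) / remaining_size)


print(expected_sets(1259, 50, 50))  # 1 + ceil(1209 / 50) = 26
print(expected_sets(1259, 25, 50))  # 1 + ceil(1234 / 50) = 26
```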


In [None]:
doc_file_path = "/content/drive/MyDrive/Praxis/Conewago Township Sewer Authority, PA.docx"
neo4j_file_path = "/content/drive/MyDrive/Praxis/Neo4j-839ec4f3-Created-2025-03-07.txt"

aDocument = Document(doc_file_path, 50, 50)
aMind = Mind(neo4j_file_path)
try:
  aMind.clear_database()
  aMind.proccessDocument(aDocument)
finally:
  aMind.close()
  print("Done")

4008
8111
4424
3851
7507
2822
3625
361
5929
6175
6842
9301
8966
9026
7484
7425
4333
4706
7284
5138
9153
6192
2122
2122
1702
8
Starting clean up
STM Len: 0.
LTM Len: 7.
8807
[
  {
    "entityName": "Conewago Township Sewer Authority",
    "entityType": "Organization",
    "properties": [],
    "relationships": [
      {
        "relationshipType": "Source",
        "relationshipTarget": {
          "entityName": "Conewago Township Sewer Authority, PA.docx",
          "entityType": "File Name"
        }
      },
      {
        "relationshipType": "Source",
        "relationshipTarget": {
          "entityName": "1",
          "entityType": "Chunk"
        }
      },
      {
        "relationshipType": "Source",
        "relationshipTarget": {
          "entityName": "3",
          "entityType": "Chunk"
        }
      },
      {
        "relationshipType": "Source",
        "relationshipTarget": {
          "entityName": "4",
          "entityType": "Chunk"
        }
      },
      {
  