<a href="https://colab.research.google.com/github/ebamberg/research-projects-ml/blob/main/agents_and_routing/examples_agents_for_knowledgeGraph.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [7]:
!pip install ollama langchain_community --quiet

host="localhost:11434"
modelid="gemma3:12b"

get_ipython().system_raw("curl -fsSL https://ollama.com/install.sh | sh")
get_ipython().system_raw("ollama serve &")
get_ipython().system_raw(f"ollama pull {modelid}")
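
The `ollama serve &` call returns immediately, so the `ollama pull` that follows can run before the server is listening. A minimal sketch of a readiness check, assuming the server answers a plain HTTP GET on its root URL with status 200. The helper names `http_probe` and `wait_until_ready` are hypothetical and not part of the notebook or of Ollama.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if a GET on url answers with HTTP 200 (assumed health signal)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: treat as "not ready yet".
        return False


def wait_until_ready(probe: Callable[[], bool], timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Poll probe() until it returns True or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In this notebook the check could be called as `wait_until_ready(lambda: http_probe(f"http://{host}/"))` between starting the server and pulling the model.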


In [8]:
!pip install openai --quiet

In [13]:
get_ipython().system_raw(f"ollama pull {modelid}")

In [1]:
!wget https://raw.githubusercontent.com/ebamberg/research-projects-ml/refs/heads/main/data/text/synthetic_history_of_rock.txt -O history_of_rock.txt

--2025-09-16 15:19:31--  https://raw.githubusercontent.com/ebamberg/research-projects-ml/refs/heads/main/data/text/synthetic_history_of_rock.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3245 (3.2K) [text/plain]
Saving to: ‘history_of_rock.txt’


2025-09-16 15:19:31 (47.9 MB/s) - ‘history_of_rock.txt’ saved [3245/3245]



In [9]:
from openai import OpenAI


llm = OpenAI(
        base_url=f"http://{host}/v1",
        api_key="ollama",  # required, but unused
    )




In [10]:
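def call(system_prompt: str, message: str, model: str = modelid) -> str:
  """Send one system + user message pair to the chat endpoint and return the reply text."""
  completion = llm.chat.completions.create(
      model=model,  # use the argument instead of always falling back to the global modelid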
      messages=[
          {"role": "system", "content": system_prompt},
          {
              "role": "user",
              "content": message,
          },
      ],
  )

  return completion.choices[0].message.content

In [11]:
def create_kg_agent(content: str) -> str:
  # The docstring below doubles as the system prompt; it is read back via __doc__.
  """

You are a helpful assistant specializing in English language tasked with extracting knowledge‑graph‑ready triplets from input sentences.

Your job is to identify triplets of entity–relation–entity suitable for high‑quality Knowledge Graph construction.

Output format:

Return only a JSON array of objects, with no extra characters, explanations, or surrounding text.

Each object must follow this exact schema (attributes are empty strings if absent):

[{"head_entity":{"entity":<string>, "attribute":<string>},"relation":{"relation":<string>, "attribute":<string>},"tail_entity":{"entity":<string>, "attribute":<string>}}]

Core extraction rules:

Relations: use lowercase lemma (root) form for the predicate; normalize inflected forms by lemmatization (e.g., “celebrated” → “celebrate”).

Coreference: resolve pronouns and nominal references across sentences and replace them with the canonical entity mention (e.g., “He” → “Bilbo Baggins”).

De‑duplication: avoid duplicate triples after normalization; keep a single instance of identical head–relation–tail.

Entity cleaning: strip determiners and punctuation, preserve multi‑word names, and use a cleaned, canonical surface form where resolvable.

Adjectives and modifiers: attach descriptive adjectives, ordinals, numerals, dates, and similar qualifiers as the attribute of the nearest relevant entity (e.g., “111th” on “birthday”).

Prepositions and normalized relations: map common prepositional or nominal patterns to canonical snake_case relation names when appropriate (e.g., give_to, located_in, part_of, born_in, work_at).

Voice normalization: for passive constructions, recover the logical subject as head and object as tail (e.g., “The ring was given to Frodo by Bilbo” → head=Bilbo, relation=give_to, tail=Frodo).

Coordination: split conjuncts into multiple triples when they denote separate facts (e.g., “Bilbo and Frodo traveled to Rivendell” → two triples, one per subject).

Negation: if a predicate is explicitly negated (e.g., “not”, “never”), keep the relation in lemma form and set relation.attribute to “negated”.

Uncertainty and conditionals: for explicit modality/conditionality (e.g., “may”, “might”, “if”), keep the lemmatized relation and set relation.attribute to a short qualifier such as “modal:may” or “conditional”.

Document level relations: allow cross‑sentence relations when clearly expressed via coreference or discourse, but do not infer unstated facts.

Do not invent attributes: include only attributes explicitly present or safely normalized from the text; otherwise use an empty string.

Best‑practice reminders:

Prefer verb‑centric predicates; convert nominalizations to their verbal lemmas when this better captures the relation (e.g., “celebration” → “celebrate”).

Keep entities and relations concise and unambiguous; avoid overlapping or synonymous duplicates (e.g., do not emit both give and give_to for the same fact).

Use English throughout; process text in cleaned form before extraction.

Examples:

Input: “Bilbo Baggins was celebrating his 111th birthday.”

Output:

[{"head_entity":{"entity":"Bilbo Baggins","attribute":""},"relation":{"relation":"celebrate","attribute":""},"tail_entity":{"entity":"birthday","attribute":"111th"}}]

Input: “Bilbo was celebrating his birthday. He gave the ring to Frodo.”

Output:

[{"head_entity":{"entity":"Bilbo Baggins","attribute":""},"relation":{"relation":"celebrate","attribute":""},"tail_entity":{"entity":"birthday","attribute":""}}, {"head_entity":{"entity":"Bilbo Baggins","attribute":""},"relation":{"relation":"give_to","attribute":""},"tail_entity":{"entity":"Frodo","attribute":""}}]

Input: “Bilbo was celebrating his birthday. Frodo celebrated the party.”

Output:

[{"head_entity":{"entity":"Bilbo","attribute":""},"relation":{"relation":"celebrate","attribute":""},"tail_entity":{"entity":"birthday","attribute":""}}, {"head_entity":{"entity":"Frodo","attribute":""},"relation":{"relation":"celebrate","attribute":""},"tail_entity":{"entity":"party","attribute":""}}]

Input: “The ring was given to Frodo by Bilbo.”

Output:

[{"head_entity":{"entity":"Bilbo","attribute":""},"relation":{"relation":"give_to","attribute":""},"tail_entity":{"entity":"Frodo","attribute":""}}]

Input: “Bilbo did not attend the party.”

Output:

[{"head_entity":{"entity":"Bilbo","attribute":""},"relation":{"relation":"attend","attribute":"negated"},"tail_entity":{"entity":"party","attribute":""}}]

Return only the JSON array as specified, exactly matching the schema, with no extra characters.

  """
  return call(create_kg_agent.__doc__, content)
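
The prompt asks for a strict schema, but nothing guarantees that the model follows it. A minimal sketch of a check on the decoded reply, assuming it has already been parsed with `json.loads`. `SCHEMA`, `is_valid_triple`, and `filter_valid_triples` are hypothetical names introduced for illustration, not part of the notebook.

```python
from typing import Any

# Keys required by the schema stated in the system prompt.
SCHEMA = {
    "head_entity": ("entity", "attribute"),
    "relation": ("relation", "attribute"),
    "tail_entity": ("entity", "attribute"),
}


def is_valid_triple(obj: Any) -> bool:
    """Check that obj has exactly the three parts, each holding only string fields."""
    if not isinstance(obj, dict) or set(obj) != set(SCHEMA):
        return False
    for part, fields in SCHEMA.items():
        value = obj[part]
        if not isinstance(value, dict) or set(value) != set(fields):
            return False
        if not all(isinstance(value[name], str) for name in fields):
            return False
    return True


def filter_valid_triples(items: list) -> list:
    """Keep only the objects that match the schema, preserving their order."""
    return [item for item in items if is_valid_triple(item)]


good = {
    "head_entity": {"entity": "Bilbo", "attribute": ""},
    "relation": {"relation": "attend", "attribute": "negated"},
    "tail_entity": {"entity": "party", "attribute": ""},
}
bad = {"head_entity": {"entity": "Bilbo"}, "relation": "attend"}
print(filter_valid_triples([good, bad]) == [good])  # True
```

Dropping malformed objects silently is one possible policy; raising an error or re-prompting the model would be equally reasonable, depending on how the triples are used.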

In [4]:
with open('history_of_rock.txt') as f:
    document = f.read()


In [5]:
print(document)

# The History of Rock Music and the Rock Guitar

Rock music emerged in the mid-1950s from the fusion of blues, country, and rhythm & blues, with the electric guitar serving as its defining instrument. The solid-body electric guitar, pioneered by Leo Fender and Les Paul in the 1940s and early 1950s, provided the volume and sustain that acoustic guitars couldn't match. Chuck Berry became one of rock's first guitar heroes, crafting iconic riffs and solos that would influence generations of musicians.

The 1960s brought an explosion of guitar innovation, with players like Jimi Hendrix revolutionizing the instrument through feedback, distortion, and effects pedals. The British Invasion introduced new sounds through bands like The Beatles, The Rolling Stones, and The Who, each contributing unique approaches to guitar playing. Eric Clapton, Jimmy Page, and Jeff Beck emerged as guitar gods, pushing the boundaries of blues-based rock with technical prowess and creative expression.

The late 196

In [14]:
kg_json=create_kg_agent(document)

In [35]:
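import re
import json

# The model usually wraps its reply in a Markdown json fence; strip it when present,
# and fall back to the raw reply when it is absent (re.match would return None there).
match_object = re.search(r'```(?:json)?\s*(.*?)\s*```', kg_json, re.DOTALL)
kg_tidy = (match_object.group(1) if match_object else kg_json).replace("\n", "")
triples = json.loads(kg_tidy)  # raises ValueError early if the reply is not valid JSON
print(kg_tidy)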


[{"head_entity":{"entity":"rock music","attribute":""},"relation":{"relation":"emerge","attribute":""},"tail_entity":{"entity":"fusion","attribute":""}}, {"head_entity":{"entity":"electric guitar","attribute":""},"relation":{"relation":"serve","attribute":""},"tail_entity":{"entity":"instrument","attribute":"defining"}}, {"head_entity":{"entity":"solid-body electric guitar","attribute":""},"relation":{"relation":"pioneer","attribute":""},"tail_entity":{"entity":"Leo Fender","attribute":""}}, {"head_entity":{"entity":"solid-body electric guitar","attribute":""},"relation":{"relation":"pioneer","attribute":""},"tail_entity":{"entity":"Les Paul","attribute":""}}, {"head_entity":{"entity":"Chuck Berry","attribute":""},"relation":{"relation":"become","attribute":""},"tail_entity":{"entity":"guitar heroes","attribute":""}}, {"head_entity":{"entity":"Jimi Hendrix","attribute":""},"relation":{"relation":"revolutionize","attribute":""},"tail_entity":{"entity":"instrument","attribute":""}}, {"he
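
The decoded triples can be grouped by head entity as a first step toward a graph structure. A minimal sketch using only the standard library: the `raw` sample repeats the first two triples from the output above, and `build_adjacency` is a hypothetical helper, not part of the notebook. It also drops exact duplicate edges, mirroring the de-duplication rule in the prompt.

```python
import json
from collections import defaultdict

# Sample input: the first two triples printed by the notebook.
raw = (
    '[{"head_entity":{"entity":"rock music","attribute":""},'
    '"relation":{"relation":"emerge","attribute":""},'
    '"tail_entity":{"entity":"fusion","attribute":""}}, '
    '{"head_entity":{"entity":"electric guitar","attribute":""},'
    '"relation":{"relation":"serve","attribute":""},'
    '"tail_entity":{"entity":"instrument","attribute":"defining"}}]'
)


def build_adjacency(triples: list) -> dict:
    """Map each head entity to a list of (relation, tail) pairs without duplicates."""
    graph = defaultdict(list)
    for triple in triples:
        head = triple["head_entity"]["entity"]
        edge = (triple["relation"]["relation"], triple["tail_entity"]["entity"])
        if edge not in graph[head]:
            graph[head].append(edge)
    return dict(graph)


graph = build_adjacency(json.loads(raw))
print(graph["rock music"])  # [('emerge', 'fusion')]
```

Attributes are ignored here for brevity; a fuller version might store them as edge or node properties.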

In [23]:
print (kg_json)
print (kg_tidy)

```json
[{"head_entity":{"entity":"rock music","attribute":""},"relation":{"relation":"emerge","attribute":""},"tail_entity":{"entity":"fusion","attribute":""}}, {"head_entity":{"entity":"electric guitar","attribute":""},"relation":{"relation":"serve","attribute":""},"tail_entity":{"entity":"instrument","attribute":"defining"}}, {"head_entity":{"entity":"solid-body electric guitar","attribute":""},"relation":{"relation":"pioneer","attribute":""},"tail_entity":{"entity":"Leo Fender","attribute":""}}, {"head_entity":{"entity":"solid-body electric guitar","attribute":""},"relation":{"relation":"pioneer","attribute":""},"tail_entity":{"entity":"Les Paul","attribute":""}}, {"head_entity":{"entity":"Chuck Berry","attribute":""},"relation":{"relation":"become","attribute":""},"tail_entity":{"entity":"guitar heroes","attribute":""}}, {"head_entity":{"entity":"Jimi Hendrix","attribute":""},"relation":{"relation":"revolutionize","attribute":""},"tail_entity":{"entity":"instrument","attribute":""