In [20]:
require "bundler/inline"

gemfile do
  source "https://rubygems.org"

  gem "dotenv", require: "dotenv/load"
  gem "ruby_llm"
  gem "rgl"
end

[<Bundler::Dependency type=:runtime name="dotenv" requirements=">= 0">, <Bundler::Dependency type=:runtime name="ruby_llm" requirements=">= 0">, <Bundler::Dependency type=:runtime name="rgl" requirements=">= 0">]

In [None]:
RubyLLM.configure do |config|
  # Add keys ONLY for the providers you intend to use.
  # Using environment variables is highly recommended.
  config.openai_api_key = ENV.fetch('OPENAI_API_KEY', nil)
  # config.anthropic_api_key = ENV.fetch('ANTHROPIC_API_KEY', nil)
end

In [22]:
# --- Define LLM Model ---
# Choose the model available at your configured endpoint.
# Examples: 'gpt-4o', 'gpt-3.5-turbo', 'llama3', 'mistral', 'deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct', 'gemma'
# llm_model_name = "deepseek-ai/DeepSeek-V3" # <-- *** CHANGE THIS TO YOUR MODEL ***
llm_model_name = 'chatgpt-4o-latest'
puts "Intended LLM model: #{llm_model_name}"

Intended LLM model: chatgpt-4o-latest


In [23]:
# --- Define LLM Call Parameters ---
llm_temperature = 0.0 # Lower temperature for more deterministic, factual output. 0.0 is best for extraction.
llm_max_tokens = 4096 # Max tokens for the LLM response (adjust based on model limits)

puts "LLM Temperature set to: #{llm_temperature}"
puts "LLM Max Tokens set to: #{llm_max_tokens}"

LLM Temperature set to: 0.0
LLM Max Tokens set to: 4096


In [24]:
# Create a chat instance (uses the configured default model)
chat = RubyLLM.chat(model: llm_model_name)

# Ask a question
response = chat.ask "What is Ruby on Rails?"
puts response.content

Ruby on Rails, often just called "Rails," is an open-source web application framework written in the Ruby programming language. It was created by David Heinemeier Hansson and released in 2004. Rails is designed to make programming web applications easier by making assumptions about what developers need to get started. It emphasizes convention over configuration (CoC) and don't repeat yourself (DRY) principles.

Key features of Ruby on Rails include:

1. **MVC Architecture**: Rails follows the Model-View-Controller (MVC) architectural pattern, which helps in organizing application logic and separating concerns.

2. **Convention over Configuration**: Instead of requiring extensive configuration files, Rails assumes sensible defaults, which speeds up development.

3. **Built-in Tools**: Rails provides a wide range of built-in tools to handle tasks such as database migrations, testing, form generation, and session management.

4. **Active Record**: This is Rails’ object-relational mapping 

In [25]:
unstructured_text = """
Marie Curie, born Maria Skłodowska in Warsaw, Poland, was a pioneering physicist and chemist.
She conducted groundbreaking research on radioactivity. Together with her husband, Pierre Curie,
she discovered the elements polonium and radium. Marie Curie was the first woman to win a Nobel Prize,
the first person and only woman to win the Nobel Prize twice, and the only person to win the Nobel Prize
in two different scientific fields. She won the Nobel Prize in Physics in 1903 with Pierre Curie
and Henri Becquerel. Later, she won the Nobel Prize in Chemistry in 1911 for her work on radium and
polonium. During World War I, she developed mobile radiography units, known as 'petites Curies',
to provide X-ray services to field hospitals. Marie Curie died in 1934 from aplastic anemia, likely
caused by her long-term exposure to radiation.

Marie was born on November 7, 1867, to a family of teachers who valued education. She received her
early schooling in Warsaw but moved to Paris in 1891 to continue her studies at the Sorbonne, where
she earned degrees in physics and mathematics. She met Pierre Curie, a professor of physics, in 1894,
and they married in 1895, beginning a productive scientific partnership. Following Pierre's tragic
death in a street accident in 1906, Marie took over his teaching position, becoming the first female
professor at the Sorbonne.

The Curies' work on radioactivity was conducted in challenging conditions, in a poorly equipped shed
with no proper ventilation, as they processed tons of pitchblende ore to isolate radium. Marie Curie
established the Curie Institute in Paris, which became a major center for medical research. She had
two daughters: Irène, who later won a Nobel Prize in Chemistry with her husband, and Eve, who became
a writer. Marie's notebooks are still radioactive today and are kept in lead-lined boxes. Her legacy
includes not only her scientific discoveries but also her role in breaking gender barriers in academia
and science.
"""

"\nMarie Curie, born Maria Skłodowska in Warsaw, Poland, was a pioneering physicist and chemist.\nShe conducted groundbreaking research on radioactivity. Together with her husband, Pierre Curie,\nshe discovered the elements polonium and radium. Marie Curie was the first woman to win a Nobel Prize,\nthe first person and only woman to win the Nobel Prize twice, and the only person to win the Nobel Prize\nin two different scientific fields. She won the Nobel Prize in Physics in 1903 with Pierre Curie\nand Henri Becquerel. Later, she won the Nobel Prize in Chemistry in 1911 for her work on radium and\npolonium. During World War I, she developed mobile radiography units, known as 'petites Curies',\nto provide X-ray services to field hospitals. Marie Curie died in 1934 from aplastic anemia, likely\ncaused by her long-term exposure to radiation.\n\nMarie was born on November 7, 1867, to a family of teachers who valued education. She received her\nearly schooling in Warsaw but moved to Paris i

In [26]:
# --- Chunking Configuration ---
chunk_size = 150  # Number of words per chunk (adjust as needed)
overlap = 30     # Number of words to overlap (must be < chunk_size)

words = unstructured_text.split(' ')
total_words = words.count()

puts "Total words: #{total_words}"

Total words: 324


In [27]:
puts "--- Input Text Loaded ---"
puts unstructured_text
puts "-" * 25
# Basic stats visualization
char_count = unstructured_text.length
word_count = unstructured_text.split(' ').length
puts "Total characters: #{char_count}"
puts "Approximate word count: #{word_count}"
puts "-" * 25


--- Input Text Loaded ---

Marie Curie, born Maria Skłodowska in Warsaw, Poland, was a pioneering physicist and chemist.
She conducted groundbreaking research on radioactivity. Together with her husband, Pierre Curie,
she discovered the elements polonium and radium. Marie Curie was the first woman to win a Nobel Prize,
the first person and only woman to win the Nobel Prize twice, and the only person to win the Nobel Prize
in two different scientific fields. She won the Nobel Prize in Physics in 1903 with Pierre Curie
and Henri Becquerel. Later, she won the Nobel Prize in Chemistry in 1911 for her work on radium and
polonium. During World War I, she developed mobile radiography units, known as 'petites Curies',
to provide X-ray services to field hospitals. Marie Curie died in 1934 from aplastic anemia, likely
caused by her long-term exposure to radiation.

Marie was born on November 7, 1867, to a family of teachers who valued education. She received her
early schooling in Warsaw but mov

In [28]:
chunks = []
start_index = 0
chunk_number = 1

puts "Starting chunking process..."

while start_index < total_words
  end_index = [start_index + chunk_size, total_words].min
  chunk_text = words[start_index...end_index].join(" ")
  chunks << { text: chunk_text, chunk_number: chunk_number }

  puts "  Created chunk #{chunk_number}: words #{start_index} to #{end_index-1}" # Uncomment for detailed log

  # Calculate the start of the next chunk
  next_start_index = start_index + chunk_size - overlap

  # Ensure progress is made
  if next_start_index <= start_index
    if end_index == total_words
      break # Already processed the last part
    end
    next_start_index = start_index + 1
  end

  start_index = next_start_index
  chunk_number += 1

  # Safety break (optional)
  if chunk_number > total_words # Simple safety
    puts "Warning: Chunking loop exceeded total word count, breaking."
    break
  end
end

puts "\nText successfully split into #{chunks.length} chunks."

Starting chunking process...
  Created chunk 1: words 0 to 149
  Created chunk 2: words 120 to 269
  Created chunk 3: words 240 to 323

Text successfully split into 3 chunks.


In [29]:
# --- System Prompt: Sets the context/role for the LLM ---
extraction_system_prompt = <<~PROMPT
  You are an AI expert specialized in knowledge graph extraction.
  Your task is to identify and extract factual Subject-Predicate-Object (SPO) triples from the given text.
  Focus on accuracy and adhere strictly to the JSON output format requested in the user prompt.
  Extract core entities and the most direct relationship.
PROMPT

# --- User Prompt Template: Contains specific instructions and the text ---
extraction_user_prompt_template = <<~TEMPLATE
  Please extract Subject-Predicate-Object (S-P-O) triples from the text below.

  **VERY IMPORTANT RULES:**
  1.  **Output Format:** Respond ONLY with a single, valid JSON array. Each element MUST be an object with keys "subject", "predicate", "object".
  2.  **JSON Only:** Do NOT include any text before or after the JSON array (e.g., no 'Here is the JSON:' or explanations). Do NOT use markdown ```json ... ``` tags.
  3.  **Concise Predicates:** Keep the 'predicate' value concise (1-3 words, ideally 1-2). Use verbs or short verb phrases (e.g., 'discovered', 'was born in', 'won').
  4.  **Lowercase:** ALL values for 'subject', 'predicate', and 'object' MUST be lowercase.
  5.  **Pronoun Resolution:** Replace pronouns (she, he, it, her, etc.) with the specific lowercase entity name they refer to based on the text context (e.g., 'marie curie').
  6.  **Specificity:** Capture specific details (e.g., 'nobel prize in physics' instead of just 'nobel prize' if specified).
  7.  **Completeness:** Extract all distinct factual relationships mentioned.

  **Text to Process:**
  {text_chunk}
TEMPLATE

"Please extract Subject-Predicate-Object (S-P-O) triples from the text below.\n\n**VERY IMPORTANT RULES:**\n1.  **Output Format:** Respond ONLY with a single, valid JSON array. Each element MUST be an object with keys \"subject\", \"predicate\", \"object\".\n2.  **JSON Only:** Do NOT include any text before or after the JSON array (e.g., no 'Here is the JSON:' or explanations). Do NOT use markdown ```json ... ``` tags.\n3.  **Concise Predicates:** Keep the 'predicate' value concise (1-3 words, ideally 1-2). Use verbs or short verb phrases (e.g., 'discovered', 'was born in', 'won').\n4.  **Lowercase:** ALL values for 'subject', 'predicate', and 'object' MUST be lowercase.\n5.  **Pronoun Resolution:** Replace pronouns (she, he, it, her, etc.) with the specific lowercase entity name they refer to based on the text context (e.g., 'marie curie').\n6.  **Specificity:** Capture specific details (e.g., 'nobel prize in physics' instead of just 'nobel prize' if specified).\n7.  **Completeness:**

In [30]:
puts "--- System Prompt ---"
puts extraction_system_prompt
puts "\n" + "-" * 25 + "\n"

puts "--- User Prompt Template (Structure) ---"
# Show structure, replacing the placeholder for clarity
puts extraction_user_prompt_template.gsub("{text_chunk}", "[... text chunk goes here ...]")
puts "\n" + "-" * 25 + "\n"

# Show an example of the *actual* prompt that will be sent for the first chunk
puts "--- Example Filled User Prompt (for Chunk 1) ---"
if chunks.any?
  example_filled_prompt = extraction_user_prompt_template.gsub("{text_chunk}", chunks[0][:text])
  # Displaying a limited portion for brevity
  preview = example_filled_prompt[0, 600]
  ending = example_filled_prompt[-200, 200]
  puts "#{preview}\n[... rest of chunk text ...]\n#{ending}"
else
  puts "No chunks available to create an example filled prompt."
end
puts "\n" + "-" * 25

--- System Prompt ---
You are an AI expert specialized in knowledge graph extraction.
Your task is to identify and extract factual Subject-Predicate-Object (SPO) triples from the given text.
Focus on accuracy and adhere strictly to the JSON output format requested in the user prompt.
Extract core entities and the most direct relationship.

-------------------------
--- User Prompt Template (Structure) ---
Please extract Subject-Predicate-Object (S-P-O) triples from the text below.

**VERY IMPORTANT RULES:**
1.  **Output Format:** Respond ONLY with a single, valid JSON array. Each element MUST be an object with keys "subject", "predicate", "object".
2.  **JSON Only:** Do NOT include any text before or after the JSON array (e.g., no 'Here is the JSON:' or explanations). Do NOT use markdown ```json ... ``` tags.
3.  **Concise Predicates:** Keep the 'predicate' value concise (1-3 words, ideally 1-2). Use verbs or short verb phrases (e.g., 'discovered', 'was born in', 'won').
4.  **Lowercas

In [31]:
# Initialize arrays to store results and failures
all_extracted_triples = []
failed_chunks = []

puts "Starting triple extraction from #{chunks.length} chunks using model '#{llm_model_name}'..."
# We will process chunks one by one in the following cells.

Starting triple extraction from 3 chunks using model 'chatgpt-4o-latest'...


In [32]:
# --- Knowledge Graph Triple Extraction Over All Chunks (RubyLLM) ---
# Uses RubyLLM's chat API (chat.ask) for each chunk. Each chunk has isolated system instructions
# to avoid cross-contamination of context between chunks.
# If you need explicit output token limits, uncomment the with_params line (provider-dependent).

require 'json'

# --- State Reset Safeguards -------------------------------------------------
# Clear legacy results produced by earlier (now removed) cells that used an OpenAI-style 'client'
if defined?(failed_chunks) && failed_chunks.any? { |fc| fc['error'].to_s.include?('client') }
  puts "Detected legacy failures referencing 'client' => clearing stale failed_chunks..."
  failed_chunks = []
end

# Ensure arrays exist (in case user skipped the initialization cell)
unless defined?(all_extracted_triples) && all_extracted_triples.is_a?(Array)
  puts "Initializing all_extracted_triples array (was undefined)."
  all_extracted_triples = []
end
unless defined?(failed_chunks) && failed_chunks.is_a?(Array)
  puts "Initializing failed_chunks array (was undefined)."
  failed_chunks = []
end

if chunks.nil? || !chunks.is_a?(Array)
  puts "ERROR: 'chunks' not defined or not an Array. Rerun the earlier cells that create 'chunks'."
  return
end

if chunks.empty?
  puts 'No chunks to process.'
  return
end

puts "Starting structured triple extraction across #{chunks.length} chunks..."

# Helper: build a fresh chat per chunk with instructions + temperature
build_chat = lambda do
  chat = RubyLLM.chat(model: llm_model_name)
  chat = chat.with_instructions(extraction_system_prompt, replace: true)
  chat = chat.with_temperature(llm_temperature)
  # chat = chat.with_params(max_output_tokens: llm_max_tokens) # Optional: only if supported by provider
  chat
end

chunks.each_with_index do |chunk_info, idx|
  # Support both symbol and string keys defensively
  chunk_text = chunk_info[:text] || chunk_info['text']
  chunk_num  = chunk_info[:chunk_number] || chunk_info['chunk_number'] || (idx + 1)

  puts "\n=== Chunk #{chunk_num}/#{chunks.length} ==="

  unless chunk_text && !chunk_text.empty?
    puts "Skipping chunk #{chunk_num}: empty text"
    failed_chunks << { 'chunk_number' => chunk_num, 'error' => 'Empty chunk text', 'response' => '' }
    next
  end

  user_prompt = extraction_user_prompt_template.gsub('{text_chunk}', chunk_text)

  chat = build_chat.call

  response_content = nil
  begin
    response = chat.ask(user_prompt)
    response_content = response.content.to_s.strip
    puts "Received response (#{response_content.length} chars)."
  rescue => e
    puts "ERROR: API call failed for chunk #{chunk_num}: #{e.message}"
    failed_chunks << { 'chunk_number' => chunk_num, 'error' => "API Error: #{e.message}", 'response' => '' }
    next
  end

  if response_content.empty?
    puts "Empty response for chunk #{chunk_num}."
    failed_chunks << { 'chunk_number' => chunk_num, 'error' => 'Empty response', 'response' => '' }
    next
  end

  parsed_json = nil
  begin
    candidate = JSON.parse(response_content)
    if candidate.is_a?(Array)
      parsed_json = candidate
    elsif candidate.is_a?(Hash)
      arrays = candidate.values.select { |v| v.is_a?(Array) }
      if arrays.length == 1
        parsed_json = arrays.first
      else
        raise 'Hash did not contain a single array of triples'
      end
    else
      raise "Unexpected JSON top-level type: #{candidate.class}"
    end
  rescue => primary_err
    if (m = response_content.match(/\[(?:.|\n)*?\]/m))
      begin
        parsed_json = JSON.parse(m[0])
        puts "Recovered JSON array via regex fallback."
      rescue => fallback_err
        puts "Fallback parse failed: #{fallback_err} (primary: #{primary_err})"
      end
    else
      puts "Primary parse failed (#{primary_err}); no JSON array pattern found."
    end
  end

  unless parsed_json
    failed_chunks << { 'chunk_number' => chunk_num, 'error' => 'Parse failure', 'response' => response_content }
    next
  end

  valid = []
  invalid = []
  parsed_json.each do |item|
    if item.is_a?(Hash) && %w[subject predicate object].all? { |k| item.key?(k) && item[k].is_a?(String) }
      item['chunk'] = chunk_num
      valid << item
    else
      invalid << item
    end
  end

  if valid.any?
    all_extracted_triples.concat(valid)
    puts "Valid triples extracted: #{valid.length}";
  else
    puts "No valid triples extracted in chunk #{chunk_num}."
  end
  puts "Invalid entries skipped: #{invalid.length}" if invalid.any?
end

puts "\n=== Extraction Complete ==="
puts "Total valid triples: #{all_extracted_triples.length}"
puts "Failed chunks: #{failed_chunks.length}"

if failed_chunks.any?
  puts "Failed chunk summary:"
  failed_chunks.each do |fc|
    puts " - Chunk #{fc['chunk_number']}: #{fc['error']}"
  end
end

if all_extracted_triples.any?
  puts "\nSample of extracted triples:";
  sample = all_extracted_triples.first([5, all_extracted_triples.length].min)
  puts JSON.pretty_generate(sample)
end

# Result: all_extracted_triples holds the unified list of triples across chunks.


Starting structured triple extraction across 3 chunks...

=== Chunk 1/3 ===
Received response (2797 chars).
Valid triples extracted: 26

=== Chunk 2/3 ===
Received response (2440 chars).
Valid triples extracted: 24

=== Chunk 3/3 ===
Received response (1407 chars).
Valid triples extracted: 14

=== Extraction Complete ===
Total valid triples: 64
Failed chunks: 0

Sample of extracted triples:
[
  {
    "subject": "marie curie",
    "predicate": "was born as",
    "object": "maria skłodowska",
    "chunk": 1
  },
  {
    "subject": "marie curie",
    "predicate": "was born in",
    "object": "warsaw, poland",
    "chunk": 1
  },
  {
    "subject": "marie curie",
    "predicate": "was a",
    "object": "physicist",
    "chunk": 1
  },
  {
    "subject": "marie curie",
    "predicate": "was a",
    "object": "chemist",
    "chunk": 1
  },
  {
    "subject": "marie curie",
    "predicate": "conducted research on",
    "object": "radioactivity",
    "chunk": 1
  }
]


In [33]:
# --- Extraction Summary (Ruby Version) ---
# Summarizes results after running the multi-chunk extraction cell.
# Provides counts, failed chunk diagnostics, and prints triples in a simple table.

# Collect distinct chunk numbers we actually touched
successful_chunk_numbers = all_extracted_triples.map { |t| t['chunk'] }.uniq
failed_chunk_numbers     = failed_chunks.map { |f| f['chunk_number'] }.uniq
processed_chunk_numbers  = (successful_chunk_numbers + failed_chunk_numbers).uniq

attempted_chunks   = processed_chunk_numbers.size
successful_chunks  = successful_chunk_numbers.size
failed_chunk_count = failed_chunks.size

deduped_triples = all_extracted_triples.uniq { |t| [t['subject'], t['predicate'], t['object']] }

puts "\n=== Overall Extraction Summary ==="
puts "Total chunks defined:                #{chunks.length}"
puts "Chunks attempted (processed/failed): #{attempted_chunks}" if attempted_chunks != chunks.length
puts "Successful chunks (>=1 triple):      #{successful_chunks}"
puts "Failed chunks (API/parse/no triples): #{failed_chunk_count}"
puts "Total triples (raw):                 #{all_extracted_triples.length}"
puts "Total triples (deduplicated):        #{deduped_triples.length}"

if failed_chunks.any?
  puts "\n-- Failed Chunks Detail --"
  failed_chunks.each do |failure|
    snippet = failure['response'].to_s[0, 80]
    puts "  Chunk #{failure['chunk_number']}: #{failure['error']}#{snippet.empty? ? '' : ' | resp: ' + snippet + ('...' if failure['response'].to_s.length > 80).to_s}"
  end
end

if deduped_triples.empty?
  puts "\nNo triples were successfully extracted."
else
  puts "\n=== Extracted Triples (Deduplicated) ==="
  # Compute column widths
  headers = ['Chunk', 'Subject', 'Predicate', 'Object']
  rows = deduped_triples.map { |t| [t['chunk'].to_s, t['subject'], t['predicate'], t['object']] }
  col_widths = headers.map.with_index do |h, i|
    ([h.length] + rows.map { |r| r[i].length }).max.clamp(0, 120)
  end

  # Helper to format a row
  format_row = lambda do |cols|
    cols.each_with_index.map { |c, i| c.ljust(col_widths[i]) }.join(' | ')
  end

  puts format_row.call(headers)
  puts col_widths.map { |w| '-' * w }.join('-+-')
  rows.each { |r| puts format_row.call(r) }
end

# Optional: write triples to JSON file (uncomment if desired)
# File.write('extracted_triples.json', JSON.pretty_generate(deduped_triples))
# puts "\nSaved deduplicated triples to extracted_triples.json"

puts "\nSummary complete."


=== Overall Extraction Summary ===
Total chunks defined:                3
Successful chunks (>=1 triple):      3
Failed chunks (API/parse/no triples): 0
Total triples (raw):                 64
Total triples (deduplicated):        57

=== Extracted Triples (Deduplicated) ===
Chunk | Subject                      | Predicate                          | Object                                
------+------------------------------+------------------------------------+---------------------------------------
1     | marie curie                  | was born as                        | maria skłodowska                      
1     | marie curie                  | was born in                        | warsaw, poland                        
1     | marie curie                  | was a                              | physicist                             
1     | marie curie                  | was a                              | chemist                               
1     | marie curie               

In [34]:
# --- Triple Normalization & De-duplication (Ruby) ---
# This cell normalizes subjects/predicates/objects, removes empties & duplicates,
# and provides detailed logging for the first few examples.

require 'set'

unless defined?(all_extracted_triples) && all_extracted_triples.is_a?(Array)
  puts "all_extracted_triples is not defined or not an Array. Run extraction first."
  return
end

original_count = all_extracted_triples.length
if original_count.zero?
  puts "No triples available to normalize. Run the extraction cell first."
  return
end

normalized_triples = []
seen_triples = Set.new # Tracks [subject, predicate, object]
empty_removed_count = 0
duplicates_removed_count = 0
processed_count = 0
example_limit = 5

puts "Starting normalization and de-duplication of #{original_count} triples..."
puts "Processing triples for normalization (showing first #{example_limit} examples):"

all_extracted_triples.each_with_index do |triple, i|
  show_example = i < example_limit
  if show_example
    puts "\n--- Example #{i + 1} ---"
    puts "Original Triple (Chunk #{triple['chunk'] || '?' }): #{triple.inspect}"
  end

  subject_raw   = triple['subject']
  predicate_raw = triple['predicate']
  object_raw    = triple['object']
  chunk_num     = triple['chunk'] || 'unknown'

  triple_valid = false
  normalized_sub = normalized_pred = normalized_obj = nil

  if subject_raw.is_a?(String) && predicate_raw.is_a?(String) && object_raw.is_a?(String)
    # 1. Normalize (lowercase & trim, collapse whitespace in predicate)
    normalized_sub  = subject_raw.strip.downcase
    normalized_pred = predicate_raw.strip.downcase.gsub(/\s+/, ' ').strip
    normalized_obj  = object_raw.strip.downcase

    puts "Normalized: SUB='#{normalized_sub}', PRED='#{normalized_pred}', OBJ='#{normalized_obj}'" if show_example

    # 2. Filter Empty
    if normalized_sub.empty? || normalized_pred.empty? || normalized_obj.empty?
      empty_removed_count += 1
      puts "Status: Discarded (Empty component after normalization)" if show_example
    else
      triple_identifier = [normalized_sub, normalized_pred, normalized_obj]

      # 3. De-duplicate
      if seen_triples.include?(triple_identifier)
        duplicates_removed_count += 1
        puts "Status: Discarded (Duplicate)" if show_example
      else
        normalized_triples << {
          'subject'      => normalized_sub,
          'predicate'    => normalized_pred,
            'object'       => normalized_obj,
          'source_chunk' => chunk_num
        }
        seen_triples.add(triple_identifier)
        triple_valid = true
        puts "Status: Kept (New Unique Triple)" if show_example
      end
    end
  else
    empty_removed_count += 1
    puts "Status: Discarded (Non-string or missing component)" if show_example
  end

  processed_count += 1
end

puts "\n... Finished processing #{processed_count} triples."
puts "Original triple count:          #{original_count}"
puts "Normalized unique triple count: #{normalized_triples.length}"
puts "Duplicates removed:             #{duplicates_removed_count}"
puts "Empty/invalid removed:          #{empty_removed_count}"

# Provide a small sample of the normalized triples
if normalized_triples.any?
  sample = normalized_triples.first([5, normalized_triples.length].min)
  puts "\nSample of normalized triples:" if sample.length > 0
  require 'json'
  puts JSON.pretty_generate(sample)
end

# normalized_triples now holds cleaned, deduped triples. You can export if desired:
# File.write('normalized_triples.json', JSON.pretty_generate(normalized_triples))
# puts "Saved normalized triples to normalized_triples.json"

Starting normalization and de-duplication of 64 triples...
Processing triples for normalization (showing first 5 examples):

--- Example 1 ---
Original Triple (Chunk 1): {"subject" => "marie curie", "predicate" => "was born as", "object" => "maria skłodowska", "chunk" => 1}
Normalized: SUB='marie curie', PRED='was born as', OBJ='maria skłodowska'
Status: Kept (New Unique Triple)

--- Example 2 ---
Original Triple (Chunk 1): {"subject" => "marie curie", "predicate" => "was born in", "object" => "warsaw, poland", "chunk" => 1}
Normalized: SUB='marie curie', PRED='was born in', OBJ='warsaw, poland'
Status: Kept (New Unique Triple)

--- Example 3 ---
Original Triple (Chunk 1): {"subject" => "marie curie", "predicate" => "was a", "object" => "physicist", "chunk" => 1}
Normalized: SUB='marie curie', PRED='was a', OBJ='physicist'
Status: Kept (New Unique Triple)

--- Example 4 ---
Original Triple (Chunk 1): {"subject" => "marie curie", "predicate" => "was a", "object" => "chemist", "chunk" =>

In [35]:
# --- Build / Inspect Knowledge Graph (RGL) ---
# Converts the previous NetworkX-style placeholder into Ruby using RGL.
# Creates a directed graph and (optionally) populates it from normalized_triples or all_extracted_triples.

require 'rgl/adjacency'
require 'rgl/connected_components'
# require 'rgl/dot' # Uncomment if you want DOT/GraphViz export later

knowledge_graph = RGL::DirectedAdjacencyGraph.new
puts "Initialized empty RGL::DirectedAdjacencyGraph"
puts
# Determine source triple list preference: normalized_triples > all_extracted_triples
triple_source = nil
if defined?(normalized_triples) && normalized_triples.is_a?(Array) && !normalized_triples.empty?
  triple_source = normalized_triples
  puts "Using normalized_triples (#{normalized_triples.length}) to populate graph."
elsif defined?(all_extracted_triples) && all_extracted_triples.is_a?(Array) && !all_extracted_triples.empty?
  triple_source = all_extracted_triples
  puts "Using all_extracted_triples (#{all_extracted_triples.length}) to populate graph."
else
  puts "No triples available to add to the graph yet. Run extraction & normalization first if desired."
end

edge_count = 0
predicate_catalog = Hash.new(0)

if triple_source
  triple_source.each do |t|
    s = t['subject'] || t[:subject]
    o = t['object']  || t[:object]
    p = t['predicate'] || t[:predicate]
    next unless s.is_a?(String) && o.is_a?(String) && !s.empty? && !o.empty?
    knowledge_graph.add_edge(s, o)
    edge_count += 1
    predicate_catalog[p] += 1 if p
  end
end

puts "--- Graph Info ---"
puts "Nodes: #{knowledge_graph.vertices.to_a.size}"
puts "Edges: #{knowledge_graph.edges.to_a.size} (added #{edge_count} from triples)"
if predicate_catalog.any?
  top_preds = predicate_catalog.sort_by { |_, c| -c }.first(5)
  puts "Top predicates (frequency): #{top_preds.map { |pr, c| "#{pr}=#{c}" }.join(', ')}"
end
puts '-' * 25

# Example: list first few edges
if knowledge_graph.edges.any?
  sample_edges = knowledge_graph.edges.to_a.first(5)
  puts "Sample edges (subject -> object):"
  sample_edges.each { |e| puts "  #{e.source} -> #{e.target}" }
end

# Strongly connected components (available via connected_components require)
if knowledge_graph.vertices.any? && knowledge_graph.respond_to?(:strongly_connected_components)
  scc = knowledge_graph.strongly_connected_components
  puts "Strongly connected component count: #{scc.length}" if scc.respond_to?(:length)
end

# Optional GraphViz export (uncomment if graphviz installed):
# require 'rgl/dot'
# knowledge_graph.write_to_graphic_file('png', 'knowledge_graph')
# puts "Wrote knowledge_graph.png"

# knowledge_graph now represents the directed entity graph.


Initialized empty RGL::DirectedAdjacencyGraph

Using normalized_triples (57) to populate graph.
--- Graph Info ---
Nodes: 53
Edges: 55 (added 57 from triples)
Top predicates (frequency): became=3, won=3, died from=2, died in=2, processed=2
-------------------------
Sample edges (subject -> object):
  marie curie -> maria skłodowska
  marie curie -> warsaw, poland
  marie curie -> physicist
  marie curie -> chemist
  marie curie -> radioactivity


In [36]:
# --- Incrementally Add Triples to Graph (Ruby / RGL) ---
# Adds edges from normalized_triples (preferred) with incremental progress logging.
# RGL does not support edge attributes directly, so we maintain a separate edge_metadata hash
# mapping [subject, object] => Set of predicates encountered.

require 'set'

puts "Adding triples to the RGL graph..."
update_interval = 5  # How often to print graph info updates

# Ensure we have a graph
unless defined?(knowledge_graph) && knowledge_graph
  require 'rgl/adjacency'
  knowledge_graph = RGL::DirectedAdjacencyGraph.new
  puts "Initialized new RGL::DirectedAdjacencyGraph (previous graph not found)."
end

# Choose source triples (normalized preferred)
triples_source = if defined?(normalized_triples) && normalized_triples.is_a?(Array) && !normalized_triples.empty?
  normalized_triples
elsif defined?(all_extracted_triples) && all_extracted_triples.is_a?(Array) && !all_extracted_triples.empty?
  all_extracted_triples
else
  nil
end

unless triples_source
  puts "Warning: No triples available to add to the graph. Run extraction & normalization first."
  return
end

# Edge metadata store (predicate labels). Preserve if already defined.
edge_metadata = if defined?(edge_metadata) && edge_metadata.is_a?(Hash)
  edge_metadata
else
  Hash.new { |h, k| h[k] = Set.new }
end

added_edges_count = 0
processed_count = 0

triples_source.each_with_index do |triple, i|
  subject_node  = triple['subject']  || triple[:subject]
  object_node   = triple['object']   || triple[:object]
  predicate_lbl = triple['predicate']|| triple[:predicate]
  next unless subject_node.is_a?(String) && object_node.is_a?(String) && predicate_lbl.is_a?(String)

  knowledge_graph.add_edge(subject_node, object_node)
  edge_metadata[[subject_node, object_node]] << predicate_lbl
  added_edges_count += 1
  processed_count = i + 1

  if (processed_count % update_interval).zero? || processed_count == triples_source.length
    puts "\n--- Graph Info after adding Triple #{processed_count} --- (#{subject_node} -> #{object_node} / #{predicate_lbl})"
    puts "Nodes: #{knowledge_graph.vertices.to_a.size}"
    puts "Edges (unique subject->object): #{knowledge_graph.edges.to_a.size}"
  end
end

puts "\nFinished adding triples. Processed #{processed_count} records; added #{added_edges_count} edges (subject->object)."

# Show sample of predicate labels per edge
if edge_metadata.any?
  puts "\nSample edge predicate labels:";
  edge_metadata.first(5).each do |(s,o), preds|
    puts "  #{s} -> #{o} : [#{preds.to_a.sort.join(', ')}]"
  end
end

# Optionally derive a simple predicate frequency report
predicate_freq = Hash.new(0)
edge_metadata.each_value { |set| set.each { |p| predicate_freq[p] += 1 } }
if predicate_freq.any?
  top = predicate_freq.sort_by { |_,c| -c }.first(5)
  puts "\nTop predicates (unique edge occurrences): #{top.map { |p,c| "#{p}=#{c}" }.join(', ')}"
end

# edge_metadata now holds predicate labels; knowledge_graph holds directed connectivity.


Adding triples to the RGL graph...

--- Graph Info after adding Triple 5 --- (marie curie -> radioactivity / conducted research on)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 10 --- (marie curie -> nobel prize in two scientific fields / was only person to win)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 15 --- (marie curie -> nobel prize in chemistry / won)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 20 --- (mobile radiography units -> x-ray services / provided)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 25 --- (marie curie -> november 7, 1867 / was born on)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 30 --- (marie curie -> sorbonne / studied at)
Nodes: 53
Edges (unique subject->object): 55

--- Graph Info after adding Triple 35 --- (marie curie -> 1895 / married in)
Nodes: 53
Edges (unique subject->ob

In [37]:
# --- Final Graph Statistics (Ruby / RGL) ---
# Summarizes the current knowledge_graph (RGL::DirectedAdjacencyGraph) and edge_metadata predicate labels.

unless defined?(knowledge_graph) && knowledge_graph
  puts "No graph (knowledge_graph) found. Run the graph construction cells first."
  return
end

# Collect node / edge counts
vertices_array = knowledge_graph.vertices.to_a
edges_array    = knowledge_graph.edges.to_a
num_nodes = vertices_array.size
num_edges = edges_array.size

puts "\n--- Final Graph Summary ---"
puts "Total unique nodes (entities): #{num_nodes}"
puts "Total unique edges (subject->object): #{num_edges}"

if defined?(added_edges_count) && added_edges_count && added_edges_count != num_edges
  puts "Note: Added #{added_edges_count} edges incrementally, but graph holds #{num_edges}. Duplicate subject->object pairs collapse to one edge in DirectedAdjacencyGraph."
end

# Density (directed, no self-loops assumed): E / (N * (N - 1)) when N > 1
if num_nodes > 1
  max_edges = num_nodes * (num_nodes - 1)
  density = (max_edges > 0) ? (num_edges.to_f / max_edges) : 0.0
  puts "Graph density (directed, excl. self-loops): #{format('%.4f', density)}"
else
  puts "Graph density: N/A (fewer than 2 nodes)"
end

# Weakly connected components count (treat edges as undirected)
if num_nodes > 0
  begin
    require 'rgl/connected_components'
    # Build an undirected view (RGL doesn't auto-provide weak component helper for directed graphs)
    undirected = RGL::AdjacencyGraph.new
    edges_array.each { |e| undirected.add_edge(e.source, e.target) }
    comps = undirected.connected_components
    if comps.respond_to?(:size)
      if comps.size == 1
        puts "The graph is weakly connected (all nodes reachable ignoring direction)."
      else
        puts "The graph has #{comps.size} weakly connected components."
      end
    end
  rescue => e
    puts "Could not compute weakly connected components: #{e.message}"
  end
else
  puts "Graph is empty; connectivity metrics unavailable."
end

puts '-' * 25

# --- Sample Nodes ---
puts "\n--- Sample Nodes (First 10) ---"
if num_nodes > 0
  vertices_array.first(10).each_with_index do |v, i|
    puts "  #{i + 1}. #{v}"
  end
else
  puts "Graph has no nodes."
end

# --- Sample Edges ---
puts "\n--- Sample Edges (First 10 with Predicate Labels) ---"
if num_edges > 0
  # edge_metadata maps [subject, object] => Set(predicates) if maintained earlier
  has_labels = defined?(edge_metadata) && edge_metadata.is_a?(Hash) && edge_metadata.any?
  edges_array.first(10).each_with_index do |e, i|
    label_part = nil
    if has_labels
      preds = edge_metadata[[e.source, e.target]]
      label_part = preds && preds.any? ? preds.to_a.sort.join(', ') : 'N/A'
    end
    if has_labels
      puts "  #{i + 1}. #{e.source} -> #{e.target} | predicates: #{label_part}"
    else
      puts "  #{i + 1}. #{e.source} -> #{e.target}"
    end
  end
else
  puts "Graph has no edges."
end

puts '-' * 25

# Optional: export edge list & nodes (uncomment to use)
# require 'json'
# File.write('graph_nodes.json', JSON.pretty_generate(vertices_array))
# File.write('graph_edges.json', JSON.pretty_generate(edges_array.map { |e| { 'source' => e.source, 'target' => e.target, 'predicates' => (edge_metadata[[e.source, e.target]]&.to_a || []) } }))
# puts "Exported graph_nodes.json and graph_edges.json"



--- Final Graph Summary ---
Total unique nodes (entities): 53
Total unique edges (subject->object): 55
Note: Added 57 edges incrementally, but graph holds 55. Duplicate subject->object pairs collapse to one edge in DirectedAdjacencyGraph.
Graph density (directed, excl. self-loops): 0.0200
Could not compute weakly connected components: undefined method 'connected_components' for an instance of RGL::AdjacencyGraph
-------------------------

--- Sample Nodes (First 10) ---
  1. marie curie
  2. maria skłodowska
  3. warsaw, poland
  4. physicist
  5. chemist
  6. radioactivity
  7. marie curie and pierre curie
  8. polonium
  9. radium
  10. nobel prize

--- Sample Edges (First 10 with Predicate Labels) ---
  1. marie curie -> maria skłodowska | predicates: was born as
  2. marie curie -> warsaw, poland | predicates: was born in
  3. marie curie -> physicist | predicates: was a
  4. marie curie -> chemist | predicates: was a
  5. marie curie -> radioactivity | predicates: conducted resea

In [46]:
IRuby.html '<b style="color:blue">I&apos;m Blue</b>'

In [49]:
# --- JS Execution Sanity Test ---
# If scripting works, the text below should change from 'Waiting for JS...' to 'JS executed OK.'
# If it does NOT change, your notebook frontend is blocking inline <script> execution, which prevents Cytoscape from working.
IRuby.html <<~HTML
  <div id="js_exec_test" style="padding:8px; background:#eef; color:#333; border:1px solid #99c; font:13px system-ui;">
    Waiting for JS...
  </div>
  <script>
    try {
      var el = document.getElementById('js_exec_test');
      if(el) { el.textContent = 'JS executed OK.'; el.style.background = '#e5ffe5'; el.style.borderColor = '#5c5'; }
    } catch(e) {
      console.log('JS test error', e);
    }
  </script>
HTML

In [60]:
# --- Hybrid Graph Viewer: Immediate Static SVG + Progressive JS Force Layout (Fixed fit view) ---
# Fix: Removed unintended Ruby interpolation referencing 'scale' inside JS template strings.
# Provides static circular layout plus JS force simulation if scripts run.

require 'json'

unless defined?(knowledge_graph) && knowledge_graph
  puts "knowledge_graph not defined. Run graph construction cells first."; return
end

nodes = knowledge_graph.vertices.to_a
edges_raw = knowledge_graph.edges.to_a
if nodes.empty?
  puts "No nodes to visualize."; return
end

index_map = {}
nodes.each_with_index { |n,i| index_map[n] = i }

edges = []
edges_raw.each do |e|
  s = e.source; t = e.target
  next unless index_map.key?(s) && index_map.key?(t)
  preds = if defined?(edge_metadata) && edge_metadata.is_a?(Hash)
    (edge_metadata[[s,t]]&.to_a || []).sort
  else
    []
  end
  edges << { source: index_map[s], target: index_map[t], predicates: preds }
end

viewer_id   = "kg_hybrid_" + (0...6).map { ('a'..'z').to_a.sample }.join
svg_id      = viewer_id + '_svg'
log_id      = viewer_id + '_log'
info_id     = viewer_id + '_info'
fit_btn_id  = viewer_id + '_fit'

nodes_json = JSON.generate(nodes.map { |n| { label: n } })
edges_json = JSON.generate(edges)

width = 1200.0
height = 640.0
cx = width/2.0
cy = height/2.0
radius = [ ( [width,height].min / 2.6 ), 120 + nodes.size * 2 ].max

require 'digest'
color_for = lambda do |str|
  str = (str && !str.empty?) ? str : 'default'
  h = Digest::MD5.hexdigest(str)
  r = h[0,2].to_i(16); g = h[2,2].to_i(16); b = h[4,2].to_i(16)
  format('#%02x%02x%02x', r, g, b)
end

circ_positions = nodes.each_index.map do |i|
  angle = 2 * Math::PI * i / nodes.size
  [ (cx + radius * Math.cos(angle)), (cy + radius * Math.sin(angle)) ]
end

static_svg_lines = []
static_svg_lines << %Q{<svg id="#{svg_id}" xmlns="http://www.w3.org/2000/svg" width="100%" height="100%" viewBox="0 0 #{width.to_i} #{height.to_i}" style="background:#fafafa;font-family:Helvetica,Arial,sans-serif;">}
static_svg_lines << %Q{  <defs><marker id="#{viewer_id}_arrow" viewBox="0 0 10 10" refX="10" refY="5" markerWidth="7" markerHeight="7" orient="auto"><path d="M0 0 L10 5 L0 10 z" fill="#555"/></marker></defs>}
static_svg_lines << %Q{  <g id="#{viewer_id}_links">}
edges.each do |el|
  s_pos = circ_positions[el[:source]]; t_pos = circ_positions[el[:target]]
  stroke = color_for.call(el[:predicates].first)
  static_svg_lines << %Q{    <line data-edge="1" x1="#{s_pos[0].round(2)}" y1="#{s_pos[1].round(2)}" x2="#{t_pos[0].round(2)}" y2="#{t_pos[1].round(2)}" stroke="#{stroke}" stroke-width="1.6" marker-end="url(##{viewer_id}_arrow)">
      <title>#{el[:predicates].any? ? el[:predicates].join(', ') : ''}</title></line>}
end
static_svg_lines << %Q{  </g>}
static_svg_lines << %Q{  <g id="#{viewer_id}_nodes">}
nodes.each_with_index do |label,i|
  x,y = circ_positions[i]
  static_svg_lines << %Q{    <g data-node-index="#{i}" transform="translate(#{x.round(2)},#{y.round(2)})">}
  static_svg_lines << %Q{      <circle r="18" fill="#1976d2" stroke="#0d47a1" stroke-width="2"><title>#{label}</title></circle>}
  static_svg_lines << %Q{      <text y="5" text-anchor="middle" font-size="10" fill="#fff">#{label[0,14]}</text>}
  static_svg_lines << %Q{    </g>}
end
static_svg_lines << %Q{  </g>}
static_svg_lines << '</svg>'
static_svg = static_svg_lines.join("\n")

html = <<~HTML
  <div style="font:13px system-ui, sans-serif; margin-top:6px;">
    <div style="display:flex;align-items:center;gap:12px;flex-wrap:wrap;">
      <div style="font-weight:600;">Knowledge Graph (#{nodes.size} nodes / #{edges.size} edges)</div>
      <button id="#{fit_btn_id}" style="padding:4px 8px; font-size:12px;">Fit</button>
      <span id="#{info_id}" style="font-size:12px;color:#555;">Static layout loaded; starting force sim...</span>
    </div>
    <div id="#{viewer_id}" style="width:100%;height:660px;border:1px solid #ccc;position:relative;margin-top:4px;background:#fff;overflow:hidden;">
      #{static_svg}
      <pre id="#{log_id}" style="position:absolute;bottom:0;left:0;right:0;max-height:110px;margin:0;background:#111;color:#0f0;font-size:11px;line-height:1.25;padding:4px;overflow:auto;">[LOG] Static SVG rendered.\n</pre>
      <div style="position:absolute;top:4px;right:4px;background:rgba(0,0,0,0.05);padding:2px 6px;font-size:10px;border-radius:4px;">Hybrid Viewer</div>
    </div>
  </div>
  <script>
  (function(){
    const nodes = #{nodes_json};
    const links = #{edges_json};
    const svgId = #{JSON.generate(svg_id)};
    const logEl = document.getElementById(#{JSON.generate(log_id)});
    const infoEl = document.getElementById(#{JSON.generate(info_id)});
    const fitBtn = document.getElementById(#{JSON.generate(fit_btn_id)});
    function log(){ logEl.textContent += Array.from(arguments).join(' ') + '\n'; logEl.scrollTop = logEl.scrollHeight; }
    log('[LOG] JS start');

    const svg = document.getElementById(svgId);
    if(!svg){ log('[ERR] SVG root missing'); infoEl.textContent='Static only'; return; }

    const nodeGs = Array.from(svg.querySelectorAll('[data-node-index]'));
    nodeGs.forEach(g=>{ const idx=+g.getAttribute('data-node-index'); const m=g.getAttribute('transform').match(/translate\(([-0-9.]+),([-0-9.]+)\)/); const x=m?parseFloat(m[1]):0; const y=m?parseFloat(m[2]):0; nodes[idx].x=x; nodes[idx].y=y; nodes[idx].vx=0; nodes[idx].vy=0; nodes[idx].el=g; });
    const lineEls = Array.from(svg.querySelectorAll('line[data-edge]'));
    links.forEach((l,i)=>{ l.sourceNode=nodes[l.source]; l.targetNode=nodes[l.target]; l.el=lineEls[i]; });

    const repulsion=1400, springLen=130, springK=0.02, damping=0.85, centerForce=0.015;
    let running=true, ticks=0;
    function step(){ if(!running) return; ticks++; // repulsion
      for(let i=0;i<nodes.length;i++){ for(let j=i+1;j<nodes.length;j++){ const a=nodes[i], b=nodes[j]; let dx=a.x-b.x, dy=a.y-b.y; let d2=dx*dx+dy*dy+0.01; const f=repulsion/d2; const d=Math.sqrt(d2); dx/=d; dy/=d; a.vx+=dx*f; a.vy+=dy*f; b.vx-=dx*f; b.vy-=dy*f; } }
      // springs
      links.forEach(l=>{ const a=l.sourceNode, b=l.targetNode; let dx=b.x-a.x, dy=b.y-a.y; let d=Math.sqrt(dx*dx+dy*dy)||0.01; const diff=d-springLen; const f=springK*diff; dx/=d; dy/=d; a.vx+=dx*f; a.vy+=dy*f; b.vx-=dx*f; b.vy-=dy*f; });
      // center
      const vb=svg.viewBox.baseVal; const cx=(vb&&vb.width?vb.x+vb.width/2:#{width.to_i}/2); const cy=(vb&&vb.height?vb.y+vb.height/2:#{height.to_i}/2); nodes.forEach(n=>{ n.vx+=(cx-n.x)*centerForce; n.vy+=(cy-n.y)*centerForce; });
      // integrate
      nodes.forEach(n=>{ n.vx*=damping; n.vy*=damping; n.x+=n.vx; n.y+=n.vy; });
      // draw
      links.forEach(l=>{ l.el.setAttribute('x1', l.sourceNode.x); l.el.setAttribute('y1', l.sourceNode.y); l.el.setAttribute('x2', l.targetNode.x); l.el.setAttribute('y2', l.targetNode.y); });
      nodes.forEach(n=>{ n.el.setAttribute('transform', `translate(${n.x.toFixed(2)},${n.y.toFixed(2)})`); });
      if(ticks===10) log('[LOG] 10 ticks');
      if(ticks===60) infoEl.textContent='Force running...';
      if(ticks===200){ infoEl.textContent='Settled'; running=false; }
      if(running) requestAnimationFrame(step); }
    requestAnimationFrame(step);

    function fit(){ const xs=nodes.map(n=>n.x), ys=nodes.map(n=>n.y); const minX=Math.min(...xs), maxX=Math.max(...xs), minY=Math.min(...ys), maxY=Math.max(...ys); const bw=maxX-minX, bh=maxY-minY; const pad=40; const scaleFactor=Math.min((#{width.to_i}-pad)/bw, (#{height.to_i}-pad)/bh, 4); const viewW=#{width.to_i}/scaleFactor; const viewH=#{height.to_i}/scaleFactor; const centerX=(minX+maxX)/2; const centerY=(minY+maxY)/2; svg.setAttribute('viewBox', `${centerX - viewW/2} ${centerY - viewH/2} ${viewW} ${viewH}`); log('[LOG] fit scale='+scaleFactor.toFixed(2)); }
    document.getElementById(#{JSON.generate(fit_btn_id)}).addEventListener('click', fit);
    setTimeout(()=>{ fit(); }, 900);

  })();
  </script>
HTML

puts "Hybrid viewer (fixed) prepared (nodes=#{nodes.size} edges=#{edges.size})."

if defined?(IRuby) && IRuby.respond_to?(:html)
  IRuby.html(html)
elsif defined?(IRuby) && IRuby.respond_to?(:display)
  IRuby.display({ data: { 'text/html' => html }, metadata: {} })
else
  puts html
end

Hybrid viewer (fixed) prepared (nodes=51 edges=53).
