# Semantic Knowledge Graph Exploration

This notebook explores and analyzes the trained semantic knowledge graph (`nx_semantic_final.graphml`). It provides tools for:

- Loading and analyzing the graph statistics
- Exploring concept relationships and semantic paths
- Discovering patterns and extracting taxonomies
- Visualizing concept neighborhoods
- Performing advanced semantic queries
- Exporting results for downstream applications

The graph was created using hyper-training on ConceptNet data with semantic enrichment including high-confidence flags, transitivity scores, and centrality measures.

## 1. Import Libraries and Load the Semantic Graph

In [None]:
# Import required libraries
import pandas as pd
import networkx as nx
import os
import numpy as np
import json
from collections import Counter, defaultdict
import matplotlib.pyplot as plt
import seaborn as sns

# Set up plotting style
plt.style.use('default')
sns.set_palette("husl")

print("Libraries imported successfully!")

In [None]:
# Load the trained semantic graph
graph_path = os.path.join('..', 'Data', 'Output', 'nx_semantic_final.graphml')

print(f"Loading semantic graph from: {graph_path}")
G = nx.read_graphml(graph_path)

print(f"✅ Graph loaded successfully!")
print(f"📊 Nodes: {G.number_of_nodes():,}")
print(f"📊 Edges: {G.number_of_edges():,}")
print(f"📊 Graph type: {type(G).__name__}")
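
The explorer class in the next section calls `out_edges`, `in_edges`, and `edges(keys=True)`, which only a directed multigraph supports, while `nx.read_graphml` may return a simpler graph class depending on the file. A minimal sketch of a guard, assuming a hypothetical helper named `as_multidigraph` (not part of the original notebook) and a toy graph standing in for the real file:

```python
import networkx as nx

def as_multidigraph(graph):
    """Return `graph` unchanged if it is already a directed multigraph,
    otherwise return a MultiDiGraph copy of it."""
    if graph.is_directed() and graph.is_multigraph():
        return graph
    # For an undirected input each edge becomes two opposite directed edges.
    return nx.MultiDiGraph(graph)

# Toy stand-in for the loaded graph: a plain DiGraph with one edge.
toy = nx.DiGraph()
toy.add_edge("dog", "animal", relation="IsA", weight=2.0)

multi = as_multidigraph(toy)
print(type(multi).__name__)  # MultiDiGraph
print(list(multi.edges(keys=True, data="relation")))
```

In the notebook this would be applied as `G = as_multidigraph(G)` right after loading, before the explorer is created.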

## 2. Knowledge Graph Explorer Class

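The `SemanticGraphExplorer` class wraps the loaded graph and groups the analysis methods used in the rest of the notebook: overview statistics, concept exploration, path finding, centrality ranking, relation-type analysis, taxonomy extraction, and concept search. Its methods call `out_edges`, `in_edges`, and `edges(keys=True)`, so it expects a directed multigraph (`nx.MultiDiGraph`).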

In [None]:
class SemanticGraphExplorer:
    """
    Comprehensive tools for exploring and analyzing a semantic knowledge graph.
    Optimized for ConceptNet-style graphs with relation types and weights.
    """
    
    def __init__(self, graph):
        """Initialize the explorer with a NetworkX graph"""
        self.graph = graph
        self.relation_types = self._get_all_relation_types()
        self._centrality_cache = {}
        
        print(f"🔍 Explorer initialized for graph with {self.graph.number_of_nodes():,} nodes")
        print(f"📋 Found {len(self.relation_types)} unique relation types")
        
    def _get_all_relation_types(self):
        """Get all unique relation types in the graph"""
        relation_types = set()
        for _, _, data in self.graph.edges(data=True):
            if 'relation' in data:
                relation_types.add(data['relation'])
        return sorted(list(relation_types))
    
    def get_graph_overview(self):
        """Get comprehensive statistics about the graph"""
        print("📊 GRAPH OVERVIEW")
        print("=" * 50)
        
        # Basic stats
        stats = {
            'nodes': self.graph.number_of_nodes(),
            'edges': self.graph.number_of_edges(),
            'density': nx.density(self.graph),
            'is_directed': nx.is_directed(self.graph),
            'is_multigraph': nx.is_multigraph(self.graph)
        }
        
        for key, value in stats.items():
            if isinstance(value, float):
                print(f"  {key}: {value:.6f}")
            else:
                print(f"  {key}: {value:,}" if isinstance(value, int) else f"  {key}: {value}")
        
        # Degree statistics
        degrees = [d for n, d in self.graph.degree()]
        print(f"\n📈 DEGREE STATISTICS")
        print(f"  Average degree: {np.mean(degrees):.2f}")
        print(f"  Median degree: {np.median(degrees):.2f}")
        print(f"  Max degree: {max(degrees):,}")
        print(f"  Min degree: {min(degrees):,}")
        
        # Weight statistics (only edges that actually carry a weight, so
        # missing values do not drag the statistics toward zero)
        weights = [data['weight'] for _, _, data in self.graph.edges(data=True) if 'weight' in data]
        if weights:
            print(f"\n⚖️  WEIGHT STATISTICS")
            print(f"  Average weight: {np.mean(weights):.3f}")
            print(f"  Median weight: {np.median(weights):.3f}")
            print(f"  Max weight: {max(weights):.3f}")
            print(f"  Min weight: {min(weights):.3f}")
        
        # Relation type distribution
        relation_counts = Counter()
        for _, _, data in self.graph.edges(data=True):
            relation = data.get('relation', 'unknown')
            relation_counts[relation] += 1
        
        print(f"\n🔗 TOP 10 RELATION TYPES")
        for relation, count in relation_counts.most_common(10):
            print(f"  {relation}: {count:,} ({count/self.graph.number_of_edges()*100:.1f}%)")
        
        return stats
    
    def explore_concept(self, concept, relation_filter=None, min_weight=0.0, limit=20, show_enrichment=True):
        """Explore all relationships for a given concept"""
        if concept not in self.graph:
            # Return an empty DataFrame so callers can always rely on DataFrame methods
            print(f"❌ Concept '{concept}' not found in the graph.")
            return pd.DataFrame()
        
        print(f"🔍 EXPLORING CONCEPT: '{concept}'")
        print("=" * 60)
        
        # Collect all edges
        edges = []
        
        # Outgoing edges
        for _, target, data in self.graph.out_edges(concept, data=True):
            if self._edge_matches_filter(data, relation_filter, min_weight):
                edge_info = {
                    'source': concept,
                    'target': target,
                    'relation': data.get('relation', 'unknown'),
                    'weight': data.get('weight', 0.0),
                    'direction': 'outgoing'
                }
                if show_enrichment:
                    edge_info.update({
                        'high_confidence': data.get('high_confidence', False),
                        'iteration_added': data.get('iteration_added', 'unknown')
                    })
                edges.append(edge_info)
        
        # Incoming edges
        for source, _, data in self.graph.in_edges(concept, data=True):
            if self._edge_matches_filter(data, relation_filter, min_weight):
                edge_info = {
                    'source': source,
                    'target': concept,
                    'relation': data.get('relation', 'unknown'),
                    'weight': data.get('weight', 0.0),
                    'direction': 'incoming'
                }
                if show_enrichment:
                    edge_info.update({
                        'high_confidence': data.get('high_confidence', False),
                        'iteration_added': data.get('iteration_added', 'unknown')
                    })
                edges.append(edge_info)
        
        # Convert to DataFrame and sort
        if not edges:
            print(f"No edges found matching the criteria.")
            return pd.DataFrame()
        
        df = pd.DataFrame(edges)
        df = df.sort_values('weight', ascending=False).head(limit)
        
        print(f"📊 Found {len(edges)} total relationships, showing top {len(df)}")
        print(f"📊 Degree: {self.graph.degree(concept)} (in: {self.graph.in_degree(concept)}, out: {self.graph.out_degree(concept)})")
        
        return df
    
    def _edge_matches_filter(self, data, relation_filter, min_weight):
        """Check if an edge matches the given filters"""
        if relation_filter and data.get('relation') not in relation_filter:
            return False
        if data.get('weight', 0) < min_weight:
            return False
        return True
    
    def find_semantic_path(self, source, target, relation_filter=None, max_length=4):
        """Find the shortest semantic path between two concepts"""
        if source not in self.graph:
            return f"❌ Source concept '{source}' not found"
        if target not in self.graph:
            return f"❌ Target concept '{target}' not found"
        
        print(f"🔍 FINDING PATH: '{source}' → '{target}'")
        print("=" * 60)
        
        # Create filtered graph if needed
        search_graph = self.graph
        if relation_filter:
            search_graph = nx.MultiDiGraph()
            for u, v, key, data in self.graph.edges(keys=True, data=True):
                if data.get('relation') in relation_filter:
                    search_graph.add_edge(u, v, key=key, **data)
            print(f"🔍 Using filtered graph with relations: {relation_filter}")
        
        try:
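            # Find shortest path weighted by inverse of edge weight.
            # On a multigraph NetworkX passes {edge_key: attrs} to the callable,
            # so use the strongest of the parallel edges between u and v.
            multi = search_graph.is_multigraph()

            def edge_cost(u, v, data):
                attr_dicts = data.values() if multi else [data]
                strongest = max(attrs.get('weight', 0.1) for attrs in attr_dicts)
                return 1.0 / max(strongest, 0.1)

            path = nx.shortest_path(search_graph, source, target, weight=edge_cost)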
            
            # Construct detailed path with relations
            detailed_path = []
            total_weight = 0
            
            for i in range(len(path) - 1):
                u, v = path[i], path[i + 1]
                edges = search_graph.get_edge_data(u, v)
                
                if edges:
                    # Get the best edge (highest weight)
                    best_key = max(edges.keys(), key=lambda k: edges[k].get('weight', 0))
                    best_edge = edges[best_key]
                    
                    step_info = {
                        'step': i + 1,
                        'from': u,
                        'relation': best_edge.get('relation', 'unknown'),
                        'to': v,
                        'weight': best_edge.get('weight', 0.0),
                        'high_confidence': best_edge.get('high_confidence', False)
                    }
                    detailed_path.append(step_info)
                    total_weight += step_info['weight']
            
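            # Enforce the hop limit; the weighted search itself does not apply it
            if len(path) - 1 > max_length:
                print(f"❌ Weighted shortest path has {len(path) - 1} steps, exceeding max_length={max_length}")
                return None
            
            print(f"✅ Path found! Length: {len(path) - 1} steps")
            print(f"📊 Total path weight: {total_weight:.3f}")
            if detailed_path:  # empty when source and target are the same node
                print(f"📊 Average step weight: {total_weight / len(detailed_path):.3f}")
            
            return pd.DataFrame(detailed_path)
            
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # NodeNotFound is raised when a relation filter drops every edge of an endpoint
            print(f"❌ No path found between '{source}' and '{target}'")
            return None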
    
    def get_top_concepts(self, n=20, measure='degree', force_recalculate=False):
        """Get top concepts by various centrality measures"""
        print(f"🏆 TOP {n} CONCEPTS BY {measure.upper()}")
        print("=" * 50)
        
        if measure in self._centrality_cache and not force_recalculate:
            centrality = self._centrality_cache[measure]
        else:
            if measure == 'degree':
                centrality = dict(self.graph.degree())
            elif measure == 'in_degree':
                centrality = dict(self.graph.in_degree())
            elif measure == 'out_degree':
                centrality = dict(self.graph.out_degree())
            elif measure == 'pagerank':
                print("🔄 Computing PageRank (this may take a while for large graphs)...")
                centrality = nx.pagerank(self.graph, alpha=0.85, weight='weight')
            elif measure == 'betweenness':
                print("🔄 Computing Betweenness Centrality (this may take a while)...")
                centrality = nx.betweenness_centrality(self.graph, weight='weight')
            else:
                raise ValueError(f"Unknown centrality measure: {measure}")
            
            self._centrality_cache[measure] = centrality
        
        # Convert to DataFrame and get top N
        df = pd.DataFrame(centrality.items(), columns=['concept', 'score'])
        top_df = df.nlargest(n, 'score')
        
        print(f"📊 Showing top {len(top_df)} concepts")
        return top_df
    
    def analyze_relation_types(self, top_n=15):
        """Analyze the distribution and characteristics of relation types"""
        print(f"🔗 RELATION TYPE ANALYSIS")
        print("=" * 50)
        
        relation_stats = defaultdict(lambda: {
            'count': 0,
            'weights': [],
            'high_confidence_count': 0
        })
        
        for _, _, data in self.graph.edges(data=True):
            relation = data.get('relation', 'unknown')
            weight = data.get('weight', 0.0)
            high_conf = data.get('high_confidence', False)
            
            relation_stats[relation]['count'] += 1
            relation_stats[relation]['weights'].append(weight)
            if high_conf:
                relation_stats[relation]['high_confidence_count'] += 1
        
        # Create summary DataFrame
        summary_data = []
        for relation, stats in relation_stats.items():
            weights = stats['weights']
            summary_data.append({
                'relation': relation,
                'count': stats['count'],
                'percentage': stats['count'] / self.graph.number_of_edges() * 100,
                'avg_weight': np.mean(weights) if weights else 0,
                'high_confidence_rate': stats['high_confidence_count'] / stats['count'] * 100 if stats['count'] > 0 else 0
            })
        
        df = pd.DataFrame(summary_data)
        df = df.sort_values('count', ascending=False).head(top_n)
        
        print(f"📊 Found {len(relation_stats)} unique relation types")
        print(f"📊 Showing top {len(df)} by frequency")
        
        return df
    
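    def extract_concept_taxonomy(self, root_concept, relation_types=('IsA',), max_depth=4, max_children=10):
        """Extract a hierarchical taxonomy starting from a root concept.

        The tree follows outgoing edges. If the graph keeps ConceptNet's edge
        direction (`dog IsA animal`), these lead toward broader concepts.
        """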
        if root_concept not in self.graph:
            return f"❌ Root concept '{root_concept}' not found"
        
        print(f"🌳 EXTRACTING TAXONOMY FROM: '{root_concept}'")
        print(f"🔗 Using relations: {relation_types}")
        print(f"📏 Max depth: {max_depth}, Max children per node: {max_children}")
        print("=" * 60)
        
        def build_tree(concept, current_depth=0, visited=None):
            if visited is None:
                visited = set()
            
            if current_depth >= max_depth or concept in visited:
                return {}
            
            visited.add(concept)
            tree = {}
            
            # Get children for this concept
            children = []
            for _, target, data in self.graph.out_edges(concept, data=True):
                if data.get('relation') in relation_types:
                    children.append((target, data.get('weight', 0.0)))
            
            # Sort by weight and limit
            children.sort(key=lambda x: x[1], reverse=True)
            children = children[:max_children]
            
            for child, weight in children:
                tree[child] = {
                    'weight': weight,
                    'children': build_tree(child, current_depth + 1, visited.copy())
                }
            
            return tree
        
        taxonomy = build_tree(root_concept)
        
        # Count total nodes in taxonomy
        def count_nodes(tree):
            count = len(tree)
            for child_data in tree.values():
                if isinstance(child_data, dict) and 'children' in child_data:
                    count += count_nodes(child_data['children'])
            return count
        
        total_nodes = count_nodes(taxonomy)
        print(f"✅ Extracted taxonomy with {total_nodes} nodes")
        
        return {root_concept: taxonomy}
    
    def search_concepts(self, pattern, limit=20, use_regex=False):
        """Search for concepts matching a pattern"""
        import re
        
        print(f"🔍 SEARCHING CONCEPTS: '{pattern}'")
        print(f"📊 Pattern matching: {'Regex' if use_regex else 'Substring'}")
        print("=" * 50)
        
        if use_regex:
            try:
                matcher = re.compile(pattern, re.IGNORECASE).search
            except re.error as e:
                print(f"❌ Invalid regex pattern: {e}")
                return pd.DataFrame()
        else:
            pattern_lower = pattern.lower()
            matcher = lambda name: pattern_lower in name.lower()
        
        # Collect every match first so that the limit keeps the best-connected
        # concepts rather than the first ones in node order
        matches = [
            {'concept': node, 'degree': self.graph.degree(node)}
            for node in self.graph.nodes()
            if matcher(str(node))
        ]
        
        if not matches:
            print(f"❌ No concepts found matching '{pattern}'")
            return pd.DataFrame()
        
        df = pd.DataFrame(matches)
        df = df.sort_values('degree', ascending=False).head(limit)
        
        print(f"✅ Found {len(matches)} matching concepts, showing top {len(df)} by degree")
        return df

print("✅ SemanticGraphExplorer class defined!")

## 3. Initialize Explorer and Get Graph Overview

In [None]:
# Initialize the semantic graph explorer
explorer = SemanticGraphExplorer(G)

# Get comprehensive graph overview
overview = explorer.get_graph_overview()

## 4. Explore Individual Concepts

Let's explore some interesting concepts in detail to understand their semantic relationships.

In [None]:
# Explore a technology concept
computer_relations = explorer.explore_concept('computer', min_weight=1.0, limit=15)
display(computer_relations)

In [None]:
# Explore an animal concept
dog_relations = explorer.explore_concept('dog', min_weight=1.5, limit=15)
display(dog_relations)

In [None]:
# Explore only specific relation types for a concept
human_isa = explorer.explore_concept('human', relation_filter=['IsA', 'CapableOf'], min_weight=1.0, limit=10)
display(human_isa)

## 5. Semantic Path Finding

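Discover how concepts are semantically connected through chains of relationships. `find_semantic_path` runs a weighted shortest-path search in which each edge costs the inverse of its weight, so strongly weighted relations are preferred.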
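
Inverse-weight path search can be sketched on a toy graph. One detail matters on multigraphs: when the `weight` argument of `nx.shortest_path` is a callable, NetworkX hands it the dictionary of all parallel edges between the two nodes (edge key to attributes), not a single attribute dictionary. The helper name `make_inverse_weight_cost`, the `floor` parameter, and the toy edges are assumptions for illustration:

```python
import networkx as nx

def make_inverse_weight_cost(graph, floor=0.1):
    """Build a cost callable for nx.shortest_path: 1 / (strongest edge weight)."""
    multi = graph.is_multigraph()

    def cost(u, v, edge_data):
        # Multigraph: edge_data maps edge key -> attribute dict (parallel edges).
        # Simple graph: edge_data is the attribute dict itself.
        attr_dicts = edge_data.values() if multi else [edge_data]
        strongest = max(attrs.get("weight", floor) for attrs in attr_dicts)
        return 1.0 / max(strongest, floor)

    return cost

G = nx.MultiDiGraph()
G.add_edge("dog", "pet", relation="IsA", weight=4.0)
G.add_edge("pet", "home", relation="AtLocation", weight=4.0)
G.add_edge("dog", "home", relation="RelatedTo", weight=0.5)

path = nx.shortest_path(G, "dog", "home", weight=make_inverse_weight_cost(G))
print(path)  # ['dog', 'pet', 'home']: cost 0.25 + 0.25 beats the direct 2.0
```

The two strong edges through `pet` win over the weak direct edge, which is the behavior the inverse weighting is meant to produce.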

In [None]:
# Find path between seemingly unrelated concepts
path1 = explorer.find_semantic_path('dog', 'computer', max_length=4)
if path1 is not None:
    display(path1)

In [None]:
# Find path using only specific relation types
path2 = explorer.find_semantic_path('animal', 'science', relation_filter=['RelatedTo', 'UsedFor'], max_length=3)
if path2 is not None:
    display(path2)

In [None]:
# Find path between abstract concepts
path3 = explorer.find_semantic_path('happiness', 'success', max_length=3)
if path3 is not None:
    display(path3)

## 6. Discover Most Important Concepts

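Rank concepts by centrality to find the most connected and influential ones. `get_top_concepts` supports `degree`, `in_degree`, `out_degree`, `pagerank`, and `betweenness`, and caches each result for reuse.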
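
PageRank and betweenness can be slow on a large graph. For a quick first look, the summed weight of incoming edges is a cheap proxy; it is a different measure, not a substitute for PageRank. A minimal sketch, assuming a hypothetical helper `top_by_weighted_in_degree` and a toy graph:

```python
import networkx as nx

G = nx.MultiDiGraph()
G.add_edge("dog", "animal", relation="IsA", weight=3.0)
G.add_edge("cat", "animal", relation="IsA", weight=2.0)
G.add_edge("dog", "pet", relation="IsA", weight=1.0)
G.add_edge("dog", "pet", relation="RelatedTo", weight=0.5)  # parallel edge

def top_by_weighted_in_degree(graph, n=3):
    """Rank nodes by the summed weight of their incoming edges."""
    strength = dict(graph.in_degree(weight="weight"))
    return sorted(strength.items(), key=lambda item: item[1], reverse=True)[:n]

ranking = top_by_weighted_in_degree(G, n=2)
print(ranking)  # [('animal', 5.0), ('pet', 1.5)]
```

Parallel edges are summed, so a concept reached through several relation types from the same source counts each of them.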

In [None]:
# Top concepts by total degree (most connected)
top_by_degree = explorer.get_top_concepts(20, 'degree')
display(top_by_degree)

In [None]:
# Top concepts by out-degree (most outgoing relationships)
top_by_out_degree = explorer.get_top_concepts(15, 'out_degree')
display(top_by_out_degree)

In [None]:
# Top concepts by PageRank (most influential)
# Note: This may take a while for large graphs
top_by_pagerank = explorer.get_top_concepts(15, 'pagerank')
display(top_by_pagerank)

## 7. Analyze Relation Types

Understand the distribution and characteristics of different semantic relation types.

In [None]:
# Comprehensive relation type analysis
relation_analysis = explorer.analyze_relation_types(top_n=20)
display(relation_analysis)

In [None]:
# Visualize relation type distribution
plt.figure(figsize=(12, 8))

# Top 10 relations by count
top_relations = relation_analysis.head(10)

plt.subplot(2, 1, 1)
bars = plt.bar(range(len(top_relations)), top_relations['count'])
plt.xticks(range(len(top_relations)), top_relations['relation'], rotation=45, ha='right')
plt.ylabel('Count')
plt.title('Top 10 Relation Types by Frequency')
plt.yscale('log')

# Color bars by percentage
colors = plt.cm.viridis(top_relations['percentage'] / top_relations['percentage'].max())
for bar, color in zip(bars, colors):
    bar.set_color(color)

plt.subplot(2, 1, 2)
plt.scatter(top_relations['avg_weight'], top_relations['high_confidence_rate'], 
           s=top_relations['count']/100, alpha=0.6)
plt.xlabel('Average Weight')
plt.ylabel('High Confidence Rate (%)')
plt.title('Relation Quality: Weight vs High Confidence Rate (bubble size = frequency)')

# Add labels for interesting points
for _, row in top_relations.iterrows():
    if row['high_confidence_rate'] > 50 or row['avg_weight'] > 2:
        plt.annotate(row['relation'], 
                    (row['avg_weight'], row['high_confidence_rate']),
                    xytext=(5, 5), textcoords='offset points',
                    fontsize=8, alpha=0.7)

plt.tight_layout()
plt.show()

## 8. Extract Semantic Taxonomies

Build hierarchical taxonomies from specific root concepts using 'IsA' relationships.
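
Edge direction decides whether a taxonomy walk moves toward broader or narrower concepts. In ConceptNet an `IsA` edge runs from the specific concept to the general one (`dog IsA animal`). Assuming this graph keeps that convention, outgoing `IsA` edges lead to broader concepts and incoming ones to narrower concepts. The helpers `broader` and `narrower` and the toy edges below are hypothetical:

```python
import networkx as nx

G = nx.MultiDiGraph()
G.add_edge("dog", "animal", relation="IsA", weight=3.0)
G.add_edge("cat", "animal", relation="IsA", weight=2.5)
G.add_edge("poodle", "dog", relation="IsA", weight=2.0)
G.add_edge("animal", "organism", relation="IsA", weight=2.0)

def _ranked(pairs):
    """Keep the best weight per concept and sort by it, strongest first."""
    best = {}
    for concept, weight in pairs:
        best[concept] = max(best.get(concept, 0.0), weight)
    return sorted(best, key=best.get, reverse=True)

def broader(graph, concept, relations=("IsA",)):
    """Concepts that `concept` points to (generalizations)."""
    return _ranked(
        (target, data.get("weight", 0.0))
        for _, target, data in graph.out_edges(concept, data=True)
        if data.get("relation") in relations
    )

def narrower(graph, concept, relations=("IsA",)):
    """Concepts that point to `concept` (specializations)."""
    return _ranked(
        (source, data.get("weight", 0.0))
        for source, _, data in graph.in_edges(concept, data=True)
        if data.get("relation") in relations
    )

print(broader(G, "animal"))   # ['organism']
print(narrower(G, "animal"))  # ['dog', 'cat']
```

A tree that should list kinds of animal would therefore be built from `narrower`, while `broader` climbs toward more general categories.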

In [None]:
# Extract animal taxonomy
animal_taxonomy = explorer.extract_concept_taxonomy('animal', relation_types=['IsA'], max_depth=3, max_children=8)

# Pretty print the taxonomy
def print_taxonomy(taxonomy, indent=0):
    for concept, data in taxonomy.items():
        print("  " * indent + f"📌 {concept}")
        if isinstance(data, dict) and 'children' in data:
            print_taxonomy(data['children'], indent + 1)
        elif isinstance(data, dict):
            print_taxonomy(data, indent + 1)

print("🌳 ANIMAL TAXONOMY:")
print_taxonomy(animal_taxonomy)

In [None]:
# Extract technology taxonomy
tech_taxonomy = explorer.extract_concept_taxonomy('technology', relation_types=['IsA', 'InstanceOf'], max_depth=3, max_children=6)

print("🌳 TECHNOLOGY TAXONOMY:")
print_taxonomy(tech_taxonomy)

## 9. Concept Search and Discovery

Search for concepts using patterns and discover related terms.

In [None]:
# Search for concepts containing 'artificial'
ai_concepts = explorer.search_concepts('artificial', limit=15)
display(ai_concepts)

In [None]:
# Search using regex for concepts ending in 'science'
science_concepts = explorer.search_concepts(r'.*science$', limit=15, use_regex=True)
display(science_concepts)

In [None]:
# Search for emotion-related concepts
emotion_concepts = explorer.search_concepts('emotion', limit=10)
display(emotion_concepts)

# Explore relationships of the most connected emotion concept
if not emotion_concepts.empty:
    top_emotion = emotion_concepts.iloc[0]['concept']
    emotion_relations = explorer.explore_concept(top_emotion, min_weight=1.0, limit=10)
    print(f"\n🔍 Exploring '{top_emotion}':")
    display(emotion_relations)

## 10. Export Results for Further Analysis

Save interesting findings to files for use in other applications or analysis.

In [None]:
# Export top concepts to CSV
output_dir = os.path.join('..', 'Data', 'Output')

# Export top concepts by different measures
top_by_degree.to_csv(os.path.join(output_dir, 'top_concepts_by_degree.csv'), index=False)
top_by_pagerank.to_csv(os.path.join(output_dir, 'top_concepts_by_pagerank.csv'), index=False)

# Export relation analysis
relation_analysis.to_csv(os.path.join(output_dir, 'relation_type_analysis.csv'), index=False)

# Export concept explorations
computer_relations.to_csv(os.path.join(output_dir, 'computer_relationships.csv'), index=False)
dog_relations.to_csv(os.path.join(output_dir, 'dog_relationships.csv'), index=False)

# Export taxonomies as JSON
with open(os.path.join(output_dir, 'animal_taxonomy.json'), 'w') as f:
    json.dump(animal_taxonomy, f, indent=2)

with open(os.path.join(output_dir, 'tech_taxonomy.json'), 'w') as f:
    json.dump(tech_taxonomy, f, indent=2)

print("✅ Results exported to Data/Output/")
print("📁 Files created:")
print("  - top_concepts_by_degree.csv")
print("  - top_concepts_by_pagerank.csv")
print("  - relation_type_analysis.csv")
print("  - computer_relationships.csv")
print("  - dog_relationships.csv")
print("  - animal_taxonomy.json")
print("  - tech_taxonomy.json")
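
The taxonomy JSON files are nested as `{root: {child: {"weight": ..., "children": {...}}}}`. Downstream tools often prefer flat parent-child rows. A minimal sketch of that conversion, using a hypothetical `flatten_taxonomy` helper and a toy taxonomy in place of the exported file:

```python
import json

def flatten_taxonomy(taxonomy):
    """Turn the nested taxonomy dict into (parent, child, weight) rows."""
    rows = []
    stack = list(taxonomy.items())  # starts as [(root, subtree)]
    while stack:
        parent, children = stack.pop()
        for child, info in children.items():
            rows.append((parent, child, info["weight"]))
            stack.append((child, info["children"]))
    return rows

toy = {
    "animal": {
        "dog": {"weight": 3.0, "children": {
            "poodle": {"weight": 2.0, "children": {}},
        }},
        "cat": {"weight": 2.5, "children": {}},
    }
}

# Round-trip through JSON to mimic reading the exported file back in.
restored = json.loads(json.dumps(toy))
rows = flatten_taxonomy(restored)
for row in sorted(rows):
    print(row)
```

The rows can be passed to `pd.DataFrame(rows, columns=['parent', 'child', 'weight'])` and saved with `to_csv` alongside the other exports.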

## 11. Advanced Analysis and Insights

Perform deeper analysis to extract insights about the semantic structure.

In [None]:
# Analyze high-confidence edges
high_conf_count = sum(1 for _, _, data in G.edges(data=True) if data.get('high_confidence', False))
total_edges = G.number_of_edges()
high_conf_percentage = high_conf_count / total_edges * 100

print(f"📊 HIGH-CONFIDENCE EDGE ANALYSIS")
print(f"Total edges: {total_edges:,}")
print(f"High-confidence edges: {high_conf_count:,} ({high_conf_percentage:.1f}%)")

# Analyze enrichment attributes by iteration
iteration_counts = Counter()
for _, _, data in G.edges(data=True):
    iteration = data.get('iteration_added', 'unknown')
    iteration_counts[iteration] += 1

print(f"\n📊 EDGES BY TRAINING ITERATION")
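# Sort on the string form: keys can mix numbers with the 'unknown' placeholder,
# which Python 3 cannot compare directly
for iteration, count in sorted(iteration_counts.items(), key=lambda item: str(item[0])):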
    percentage = count / total_edges * 100
    print(f"Iteration {iteration}: {count:,} edges ({percentage:.1f}%)")

In [None]:
# Find concepts that are hubs (high out-degree) vs authorities (high in-degree)
out_degrees = dict(G.out_degree())
in_degrees = dict(G.in_degree())

# Calculate hub vs authority ratio
hub_authority_ratio = {}
for node in G.nodes():
    out_deg = out_degrees[node]
    in_deg = in_degrees[node]
    if in_deg > 0:
        hub_authority_ratio[node] = out_deg / in_deg
    else:
        hub_authority_ratio[node] = float('inf') if out_deg > 0 else 0

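# Find top hubs (highest finite out/in ratio). Nodes without incoming edges
# have an infinite ratio and are dropped before ranking, so they cannot fill
# the top 10 and leave nothing to print.
finite_ratios = {node: ratio for node, ratio in hub_authority_ratio.items()
                 if ratio != float('inf')}
top_hubs = sorted(finite_ratios.items(), key=lambda x: x[1], reverse=True)[:10]
print("🌟 TOP HUBS (concepts that point to many others):")
for concept, ratio in top_hubs:
    print(f"  {concept}: out={out_degrees[concept]}, in={in_degrees[concept]}, ratio={ratio:.2f}")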

# Find top authorities (low out-degree, high in-degree ratio)
authority_scores = {node: 1/ratio if ratio > 0 and ratio != float('inf') else 0 
                   for node, ratio in hub_authority_ratio.items()}
top_authorities = sorted(authority_scores.items(), key=lambda x: x[1], reverse=True)[:10]
print("\n🎯 TOP AUTHORITIES (concepts that many others point to):")
for concept, score in top_authorities:
    if score > 0:
        print(f"  {concept}: out={out_degrees[concept]}, in={in_degrees[concept]}, authority_score={score:.2f}")
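
The raw out/in ratio is undefined for nodes without incoming edges and tends to favor nodes with very few edges. One possible refinement is sketched below with a hypothetical `min_degree` cut-off and add-one smoothing; both are assumptions, not part of the original analysis:

```python
import networkx as nx

def hub_scores(graph, min_degree=3):
    """Smoothed out/in ratio, skipping sparsely connected nodes."""
    scores = {}
    for node in graph.nodes():
        out_deg = graph.out_degree(node)
        in_deg = graph.in_degree(node)
        if out_deg + in_deg < min_degree:
            continue  # too few edges for the ratio to be meaningful
        # Add-one smoothing keeps the ratio finite when in_deg is 0.
        scores[node] = (out_deg + 1) / (in_deg + 1)
    return scores

G = nx.MultiDiGraph()
for target in ("animal", "pet", "mammal", "friend"):
    G.add_edge("dog", target, relation="IsA")
G.add_edge("cat", "animal", relation="IsA")
G.add_edge("cat", "pet", relation="IsA")
G.add_edge("fish", "animal", relation="IsA")

scores = hub_scores(G, min_degree=3)
print(scores)  # {'dog': 5.0, 'animal': 0.25}
```

Scores above 1 indicate hub-like nodes and scores below 1 authority-like nodes, so a single ranking covers both ends.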

## 12. Interactive Exploration

Use this section for ad-hoc exploration and testing new queries.

In [None]:
# Interactive exploration cell - modify as needed
# Try exploring different concepts or finding paths between interesting pairs

concept_to_explore = 'love'  # Change this to any concept you want to explore
relations = explorer.explore_concept(concept_to_explore, min_weight=1.0, limit=12)
display(relations)

In [None]:
# Try finding interesting semantic paths
source_concept = 'music'  # Change these to explore different paths
target_concept = 'mathematics'

path = explorer.find_semantic_path(source_concept, target_concept, max_length=4)
if path is not None:
    display(path)
else:
    print(f"No path found between '{source_concept}' and '{target_concept}'")

## Summary

This notebook provides comprehensive tools for exploring your semantic knowledge graph. The graph contains rich semantic relationships with:

- **Semantic enrichment**: High-confidence flags, iteration tracking, and centrality measures
- **Multiple relation types**: Covering various semantic relationships like IsA, RelatedTo, CapableOf, etc.
- **Quality filtering**: Edges are weighted and filtered based on confidence and training iterations

**Key findings to explore further:**
1. The most central concepts in your domain
2. Semantic paths between seemingly unrelated concepts
3. Hierarchical taxonomies for specific domains
4. Patterns in relation types and their characteristics

**Next steps:**
- Use the exported data for downstream applications
- Integrate with NLP models for semantic search or QA
- Build interactive visualizations
- Expand the analysis with domain-specific queries