# Module 5a: Understanding Knowledge Graphs
## Theory and Foundations

### 1. What is a Knowledge Graph?

A knowledge graph is a structured way to represent information as a network of:
- **Nodes**: Representing entities or concepts
- **Edges**: Representing relationships between nodes
- **Properties**: Additional information about nodes and relationships

Let's see a simple example:

```python
import networkx as nx
import matplotlib.pyplot as plt

# Create a simple knowledge graph about movies
def create_movie_graph():
    G = nx.DiGraph()  # directed: a relation such as "stars" points from movie to actor
    
    # Add nodes
    G.add_node("The Matrix", type="Movie")
    G.add_node("Keanu Reeves", type="Actor")
    G.add_node("Sci-Fi", type="Genre")
    G.add_node("1999", type="Year")
    
    # Add edges
    G.add_edge("The Matrix", "Keanu Reeves", relation="stars")
    G.add_edge("The Matrix", "Sci-Fi", relation="genre")
    G.add_edge("The Matrix", "1999", relation="released")
    
    return G

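# Create and visualize the graph
G = create_movie_graph()
pos = nx.spring_layout(G)
nx.draw(G, pos, with_labels=True, node_color='lightblue', 
        node_size=2000, font_size=10, font_weight='bold')
# Label each edge with its "relation" attribute so the relationships are visible
nx.draw_networkx_edge_labels(G, pos,
                             edge_labels=nx.get_edge_attributes(G, 'relation'))
plt.show()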
```
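
The same facts can also be written as (subject, relation, object) triples, which is a common way to think about what a knowledge graph stores. The sketch below is a minimal illustration in plain Python; the helper name `objects_of` is made up for this example and is not part of NetworkX.

```python
# Each fact is a (subject, relation, object) triple
triples = [
    ("The Matrix", "stars", "Keanu Reeves"),
    ("The Matrix", "genre", "Sci-Fi"),
    ("The Matrix", "released", "1999"),
]

def objects_of(subject, relation):
    """Return every object linked to `subject` by `relation` (hypothetical helper)."""
    return [o for s, r, o in triples if s == subject and r == relation]

print(objects_of("The Matrix", "stars"))  # ['Keanu Reeves']
```

A list of triples is easy to read but slow to search at scale; graph libraries and databases index the same information by node.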

### 2. Why Use Knowledge Graphs?

#### 2.1 Advantages Over Traditional Databases

1. **Flexible Schema**
```python
# Traditional Database (SQL)
"""
Movies:
  id INT
  title VARCHAR
  year INT
  genre VARCHAR

Actors:
  id INT
  name VARCHAR
  
MovieActors:
  movie_id INT
  actor_id INT
"""

# Knowledge Graph
"""
New relationship types can be added without changing a schema
(in SQL, each of these would need a new join table):
- Actor INSPIRED_BY OtherActor
- Movie REMAKE_OF OtherMovie
- Actor DIRECTED Movie
"""
```

2. **Natural Relationships**
```python
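# Finding all movies featuring actors who were influenced by a given actor, in SQL
# (assumes an extra ActorInfluences join table that is not in the schema above)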
"""
SELECT m.*
FROM Movies m
JOIN MovieActors ma ON m.id = ma.movie_id
JOIN Actors a ON a.id = ma.actor_id
JOIN ActorInfluences ai ON a.id = ai.influenced_id
WHERE ai.influencer_id = ?
"""

# The same question in a graph query language (Cypher, as used by Neo4j)
"""
MATCH (influencer:Actor)-[:INFLUENCED]->(actor:Actor)-[:ACTED_IN]->(movie:Movie)
WHERE influencer.name = 'Bruce Lee'
RETURN movie
"""
```
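
Both points can be sketched in plain Python. The example below is only an illustration under stated assumptions: the `graph` structure and the `add_fact` and `movies_of_influenced` helpers are invented for this sketch, and "Actor A", "Movie X", and similar names are placeholders rather than real data.

```python
from collections import defaultdict

# relation type -> {source: [targets]}; a new relation type needs no schema change
graph = defaultdict(lambda: defaultdict(list))

def add_fact(source, relation, target):
    """Store one directed fact: source -[relation]-> target."""
    graph[relation][source].append(target)

# Placeholder actors and movies
add_fact("Actor A", "ACTED_IN", "Movie X")
add_fact("Actor B", "ACTED_IN", "Movie X")
add_fact("Actor B", "ACTED_IN", "Movie Y")

# Introducing INFLUENCED later is just another add_fact call
add_fact("Bruce Lee", "INFLUENCED", "Actor A")
add_fact("Bruce Lee", "INFLUENCED", "Actor B")

def movies_of_influenced(influencer):
    """Follow INFLUENCED, then ACTED_IN, mirroring the two-hop pattern above."""
    movies = []
    for actor in graph["INFLUENCED"].get(influencer, []):
        for movie in graph["ACTED_IN"].get(actor, []):
            if movie not in movies:  # avoid duplicates, keep first-seen order
                movies.append(movie)
    return movies

print(movies_of_influenced("Bruce Lee"))  # ['Movie X', 'Movie Y']
```

The traversal reads the same way as the graph query: start at the influencer, hop along one relationship type, then along the next.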

#### 2.2 Real-World Applications

1. **Google Knowledge Graph**
   - Powers rich search results
   - Connects billions of facts
   - Enhances understanding of queries

2. **Social Networks**
   - Facebook's Social Graph
   - LinkedIn's Economic Graph
   - Understanding relationships and connections

3. **Recommendation Systems**
   - Netflix content suggestions
   - Amazon product recommendations
   - Spotify music discovery

### 3. How Knowledge Graphs Work

#### 3.1 Basic Components

```python
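from typing import Optional

class KnowledgeGraphExample:
    """Minimal in-memory knowledge graph: typed entities plus directed relationships."""

    def __init__(self):
        self.nodes = {}          # entity name -> {'type': ..., 'properties': {...}}
        self.relationships = []  # list of {'from': ..., 'to': ..., 'type': ...}

    def add_entity(self, name: str, entity_type: str, properties: Optional[dict] = None):
        # Store the entity under its name; a repeated name overwrites the old entry
        self.nodes[name] = {
            'type': entity_type,
            'properties': properties or {}
        }

    def add_relationship(self, from_node: str, to_node: str, rel_type: str):
        # Relationships are directed: from_node -[rel_type]-> to_node
        self.relationships.append({
            'from': from_node,
            'to': to_node,
            'type': rel_type
        })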
    
# Example usage
kg = KnowledgeGraphExample()

# Adding movie information
kg.add_entity('The Matrix', 'Movie', {
    'year': 1999,
    'rating': 8.7
})

kg.add_entity('Keanu Reeves', 'Person', {
    'birth_year': 1964,
    'nationality': 'Canadian'
})

kg.add_relationship('Keanu Reeves', 'The Matrix', 'ACTED_IN')
```
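
Storing facts is only half of the picture; the graph also needs to be read back. The sketch below assumes relationship records shaped like the ones stored by the class above (dicts with `'from'`, `'to'`, and `'type'` keys). The `outgoing` and `incoming` helpers are hypothetical names chosen for this example.

```python
# Relationship records in the same shape the class above stores
relationships = [
    {'from': 'Keanu Reeves', 'to': 'The Matrix', 'type': 'ACTED_IN'},
]

def outgoing(node, rel_type=None):
    """Targets of relationships that start at `node`, optionally filtered by type."""
    return [r['to'] for r in relationships
            if r['from'] == node and rel_type in (None, r['type'])]

def incoming(node, rel_type=None):
    """Sources of relationships that end at `node`, optionally filtered by type."""
    return [r['from'] for r in relationships
            if r['to'] == node and rel_type in (None, r['type'])]

print(outgoing('Keanu Reeves', 'ACTED_IN'))  # ['The Matrix']
print(incoming('The Matrix'))                # ['Keanu Reeves']
```

Scanning the whole list on every lookup is fine for a teaching example; a real store would index relationships by node.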

#### 3.2 Types of Relationships

1. **Hierarchical**
   - IS_A (inheritance)
   - PART_OF (composition)
   - BELONGS_TO (membership)

2. **Temporal**
   - HAPPENED_BEFORE
   - OCCURRED_AT
   - DURING

3. **Semantic**
   - SIMILAR_TO
   - RELATED_TO
   - SYNONYM_OF
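
Hierarchical relationships such as IS_A are usually treated as transitive: if an Actor IS_A Person and a Person IS_A Agent, then an Actor is also an Agent. A minimal sketch of that lookup follows; the small hierarchy and the `ancestors` helper are assumptions made for illustration.

```python
# Illustrative IS_A hierarchy: child concept -> list of direct parents
is_a = {
    "Actor": ["Person"],
    "Person": ["Agent"],
}

def ancestors(concept):
    """Return every concept reachable from `concept` by following IS_A edges."""
    seen = []
    stack = list(is_a.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:  # guards against cycles in malformed data
            seen.append(parent)
            stack.extend(is_a.get(parent, []))
    return seen

print(ancestors("Actor"))  # ['Person', 'Agent']
```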

### 4. Knowledge Graph Intelligence

#### 4.1 Inference Capabilities

```python
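# Example of rule-based inference
class SmartKnowledgeGraph:
    def __init__(self):
        self.properties = {}     # node name -> {property: value}
        self.relationships = []  # dicts with 'from', 'to', and 'type' keys

    def has_property(self, node, key, value):
        return self.properties.get(node, {}).get(key) == value

    def add_property(self, node, key, value):
        self.properties.setdefault(node, {})[key] = value

    def infer_new_knowledge(self):
        """
        Rule: if A TEACHES B and B's performance is 'excellent',
        then record A's teaching_skill as 'high'.
        """
        for rel in self.relationships:
            if rel['type'] == 'TEACHES':
                student = rel['to']
                teacher = rel['from']
                
                # Check whether the student performs well
                if self.has_property(student, 'performance', 'excellent'):
                    # Record the inferred fact on the teacher
                    self.add_property(teacher, 'teaching_skill', 'high')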

#### 4.2 Question Answering

```python
class KnowledgeGraphQA:
    def answer_question(self, question: str) -> str:
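        """
        Outline of a question-answering pipeline over a knowledge graph.
        extract_entities, extract_relationships, find_paths, and
        construct_answer are placeholders to be implemented per application.
        """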
        # Parse question to identify entities and relationships
        entities = self.extract_entities(question)
        relationships = self.extract_relationships(question)
        
        # Search graph for relevant paths
        paths = self.find_paths(entities, relationships)
        
        # Construct answer from paths
        return self.construct_answer(paths)
```
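
To make the pipeline concrete, here is a deliberately tiny sketch that answers one kind of question by keyword matching. It is not how production QA systems work: the `facts` list, the `relation_keywords` table, and the matching logic are all simplifying assumptions chosen for this example.

```python
# Known facts as (subject, relation, object) triples
facts = [("Keanu Reeves", "ACTED_IN", "The Matrix")]

# Hypothetical mapping from question phrases to relation types
relation_keywords = {"acted in": "ACTED_IN"}

def answer_question(question):
    q = question.lower()

    # Step 1: identify the relation and the entities mentioned in the question
    relation = next((rel for phrase, rel in relation_keywords.items()
                     if phrase in q), None)
    entities = {e for s, _, o in facts for e in (s, o)}
    mentioned = [e for e in entities if e.lower() in q]

    # Step 2: search the facts for matches and collect the other end of each
    answers = []
    for s, r, o in facts:
        if r != relation:
            continue
        if o in mentioned:
            answers.append(s)
        elif s in mentioned:
            answers.append(o)

    # Step 3: construct the answer
    return ", ".join(answers) if answers else "unknown"

print(answer_question("Who acted in The Matrix?"))  # Keanu Reeves
```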

### 5. Building Knowledge Graphs

#### 5.1 Information Extraction
```python
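import spacy

# Assumes the small English model is installed:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def extract_relationships(doc):
    """
    Naive subject-verb-object extraction from the dependency parse.
    The object label is "dobj" or "obj" depending on the model.
    """
    relationships = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            objects = [c for c in token.children if c.dep_ in ("dobj", "obj")]
            relationships += [(s.text, token.lemma_, o.text)
                              for s in subjects for o in objects]
    return relationships

def extract_knowledge(text: str):
    """
    Example of extracting knowledge from text
    """
    # Use NLP to identify entities
    doc = nlp(text)
    
    # Extract named entities with their labels (e.g. PERSON, ORG)
    entities = [(ent.text, ent.label_) for ent in doc.ents]
    
    # Find relationships between entities
    relationships = extract_relationships(doc)
    
    return entities, relationships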
```

#### 5.2 Quality Control
```python
class KnowledgeValidator:
    def validate_fact(self, fact):
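        """
        Validate a new fact before adding it to the graph.
        contradicts_existing, is_reliable_source, and is_consistent are
        placeholders to be implemented for a concrete graph.
        """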
        # Check for contradictions
        if self.contradicts_existing(fact):
            return False
        
        # Verify source reliability
        if not self.is_reliable_source(fact.source):
            return False
        
        # Check fact consistency
        return self.is_consistent(fact)
```
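
One possible way to fill in those checks is sketched below. Everything here is an assumption made for illustration: the `Fact` record, the idea that some relations are "functional" (at most one value per subject), and the `TRUSTED_SOURCES` set with its placeholder source name.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str
    source: str

# Relations assumed to allow only one value per subject
FUNCTIONAL_RELATIONS = {"released"}
# Placeholder source name, not a real dataset
TRUSTED_SOURCES = {"curated_db"}

def validate_fact(fact, existing):
    """Return True if `fact` may be added alongside the `existing` facts."""
    # Verify source reliability
    if fact.source not in TRUSTED_SOURCES:
        return False

    # Check for contradictions: a functional relation cannot hold two values
    if fact.relation in FUNCTIONAL_RELATIONS:
        for old in existing:
            same_slot = (old.subject, old.relation) == (fact.subject, fact.relation)
            if same_slot and old.obj != fact.obj:
                return False
    return True

known = [Fact("The Matrix", "released", "1999", "curated_db")]
print(validate_fact(Fact("The Matrix", "released", "2001", "curated_db"), known))  # False
print(validate_fact(Fact("The Matrix", "genre", "Sci-Fi", "curated_db"), known))   # True
```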

### 6. Visualizing Knowledge

```python
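from pyvis.network import Network

# Illustrative palette; the type names and colors are arbitrary choices
TYPE_COLORS = {'Movie': 'lightblue', 'Person': 'lightgreen'}

def get_color_for_type(node_type):
    # Fall back to gray for types not in the palette
    return TYPE_COLORS.get(node_type, 'lightgray')

def visualize_knowledge_graph(nodes, edges):
    # nodes: {name: type string}; edges: list of {'from', 'to', 'type'} dicts
    # directed=True makes edges render with arrowheads
    net = Network(height='750px', width='100%', directed=True)
    
    # Add nodes with different colors per type
    for node, node_type in nodes.items():
        net.add_node(node, label=node, color=get_color_for_type(node_type))
    
    # Add edges with labels
    for edge in edges:
        net.add_edge(edge['from'], edge['to'], label=edge['type'])
    
    # Write the interactive graph to an HTML file
    # (show() may assume a notebook environment in some pyvis versions)
    net.write_html('knowledge_graph.html')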
```

### 7. Common Use Cases

1. **Enterprise Knowledge Management**
   - Document organization
   - Expert finding
   - Project connections

2. **Research and Development**
   - Scientific literature analysis
   - Drug discovery
   - Patent analysis

3. **Customer Service**
   - Automated support
   - Product recommendations
   - User behavior analysis

4. **Education**
   - Curriculum mapping
   - Learning path generation
   - Knowledge assessment

### 8. Future of Knowledge Graphs

1. **AI Integration**
   - Automated knowledge extraction
   - Self-healing graphs
   - Intelligent querying

2. **Scale and Performance**
   - Distributed graphs
   - Real-time updates
   - Query optimization

3. **New Applications**
   - Metaverse mapping
   - IoT device relationships
   - Autonomous systems

### Resources
- [Neo4j Graph Database](https://neo4j.com/developer/get-started/)
- [Knowledge Graphs (MIT Press)](https://mitpress.mit.edu/books/knowledge-graphs)
- [Google Knowledge Graph Search API](https://developers.google.com/knowledge-graph)
- [Practical Knowledge Graphs (O'Reilly)](https://www.oreilly.com/library/view/practical-knowledge-graphs/9781098111679/)