# Module 1: Graph Database Fundamentals

Welcome to your journey into the world of graph databases! In this module, you'll discover why graphs are a natural fit for connected data and learn the foundations that will power your AI applications.

## 🎯 What You'll Learn

By the end of this module, you'll be able to:
- Understand what makes graph databases unique and powerful
- Connect to Neo4j and navigate the basics
- Write fundamental Cypher queries to explore data
- Model real-world problems as graphs
- Recognize when to use graphs vs. traditional databases

## 🧠 Why This Matters

Before we dive into code, let's understand the "why" behind graphs:

**Traditional databases** excel at storing isolated records:
```
Customer Table: ID, Name, Email
Order Table: ID, Customer_ID, Amount
```

**Graph databases** excel at storing relationships:
```
(Customer)-[:PLACED]->(Order)-[:CONTAINS]->(Product)
```

The magic happens when you need to ask questions like:
- "Which customers bought similar products?"
- "What's the shortest path between two people in a social network?"
- "Which accounts are connected through suspicious transactions?"

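These questions are **natural** in graphs, where each one is a traversal along relationships, but **complex** in traditional databases, where every extra hop usually means another join.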

---

## 📚 Prerequisites

- Basic understanding of databases
- Python familiarity (we'll explain the graph concepts)
- Curiosity about connected data!

---

# Lesson 1: What Are Graphs? (10 minutes)

## 🤔 Think About This Scenario

Imagine you're building a recommendation system for an e-commerce site. You need to answer:

> *"Show me products that customers similar to John have purchased, but that John hasn't bought yet."*

In a traditional SQL database, this requires complex joins:

```sql
-- Find products that customers similar to John have purchased
-- but that John hasn't bought yet (SQL version)
SELECT p2.product_name, COUNT(*) AS recommendation_score
FROM customers john
JOIN purchases jp ON john.customer_id = jp.customer_id
JOIN purchases similar_purchases ON jp.product_id = similar_purchases.product_id
JOIN customers similar_customers ON similar_purchases.customer_id = similar_customers.customer_id
JOIN purchases recommendations ON similar_customers.customer_id = recommendations.customer_id
JOIN products p2 ON recommendations.product_id = p2.product_id
LEFT JOIN purchases john_already_bought 
    ON john.customer_id = john_already_bought.customer_id 
    AND p2.product_id = john_already_bought.product_id
WHERE john.name = 'John' 
    AND similar_customers.customer_id != john.customer_id
    AND john_already_bought.product_id IS NULL
GROUP BY p2.product_name
ORDER BY recommendation_score DESC
LIMIT 5;
```

In a graph database, the same logic is much clearer:

```cypher
// Find customers who bought the same products as John
MATCH (john:Customer {name: 'John'})-[:PURCHASED]->(product)<-[:PURCHASED]-(similar:Customer)
WHERE john <> similar

// Find products these similar customers bought that John hasn't
MATCH (similar)-[:PURCHASED]->(recommendation)
WHERE NOT (john)-[:PURCHASED]->(recommendation)

// Return recommendations with popularity score
RETURN recommendation.name as product, 
       count(similar) as popularity_score
ORDER BY popularity_score DESC
LIMIT 5
```

The graph version reads like English: *"Find customers similar to John, see what they bought that John hasn't, and recommend the most popular items."*
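
To see what that traversal does step by step, here is a minimal plain-Python sketch of the same logic. The `purchases` data and the `recommend` function are made up for illustration; a real graph database runs this traversal over stored relationships rather than dictionaries.

```python
from collections import Counter

# Hypothetical purchase data: customer -> set of products bought
purchases = {
    "John": {"laptop", "mouse"},
    "Ana": {"laptop", "keyboard", "monitor"},
    "Raj": {"mouse", "keyboard"},
    "Mia": {"desk"},
}

def recommend(customer, purchases, limit=5):
    """Rank products bought by customers who share a purchase with `customer`."""
    owned = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        shared = len(owned & items)
        if other == customer or shared == 0:
            continue  # skip the customer themselves and anyone with nothing in common
        # Every product the similar customer has that `customer` lacks is a candidate
        for product in items - owned:
            scores[product] += shared
    return scores.most_common(limit)

print(recommend("John", purchases))  # [('keyboard', 2), ('monitor', 1)]
```

Each similar customer contributes one point per product shared with John, which mirrors how `count(similar)` adds up matching rows in the Cypher query.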

## 🧩 Core Graph Concepts

The property graph model, which Neo4j uses, has three fundamental building blocks:

### 1. **Nodes** (The "Things")
- Represent entities in your domain
- Like rows in a table, but more flexible
- Examples: Person, Product, Company, Account

### 2. **Relationships** (The "Connections")
- Connect nodes with meaning
- Have direction and type
- Examples: FRIENDS_WITH, PURCHASED, WORKS_FOR, TRANSFERRED_TO

### 3. **Properties** (The "Details")
- Key-value pairs on nodes and relationships
- Store the actual data
- Examples: name, age, amount, timestamp
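
One way to picture these three building blocks is as plain Python objects. This is only a sketch for intuition: the `Node` and `Relationship` classes, the `describe` helper, and the "Acme" company are hypothetical, and Neo4j stores and indexes this data very differently.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                       # the kind of thing, e.g. Person
    properties: dict = field(default_factory=dict)   # key-value details

@dataclass
class Relationship:
    start: Node                                      # direction runs from start...
    type: str                                        # ...through a named type...
    end: Node                                        # ...to end
    properties: dict = field(default_factory=dict)   # relationships carry details too

def describe(rel):
    """Render a relationship in Cypher-like arrow notation."""
    return f"({rel.start.properties['name']})-[:{rel.type}]->({rel.end.properties['name']})"

alice = Node("Person", {"name": "Alice", "age": 28})
acme = Node("Company", {"name": "Acme"})
works_for = Relationship(alice, "WORKS_FOR", acme, {"since": 2021})

print(describe(works_for))  # (Alice)-[:WORKS_FOR]->(Acme)
```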

## 💡 Your First Mental Model

Think of a graph like a city map:
- **Nodes** = Intersections (places)
- **Relationships** = Roads (connections)
- **Properties** = Street names, distances, traffic lights (details)

Just like you can find the shortest route between two intersections, you can find the shortest path between any two nodes in your data!
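
The map analogy can be made concrete with a breadth-first search, a standard way to find the route with the fewest hops. The sketch below uses a hypothetical `roads` dictionary and ignores road lengths; it is meant to show the idea, not how Neo4j implements path finding.

```python
from collections import deque

# Hypothetical map: each intersection lists the intersections one road away
roads = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def shortest_path(graph, start, goal):
    """Return a path with the fewest hops from start to goal, or None if unreachable."""
    queue = deque([[start]])   # each queue entry is a full path so far
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None

print(shortest_path(roads, "A", "E"))  # ['A', 'B', 'D', 'E']
```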

## 🚀 Let's See It In Action

First, let's set up our environment and connect to Neo4j. Don't worry about understanding every line right now - focus on the concepts we're exploring.

### Step 1: Install Required Libraries

We need the Neo4j Python driver to connect to our database:

In [None]:
# Install the Neo4j driver - this lets Python talk to our graph database
# (%pip installs into the environment of the running notebook kernel)
%pip install neo4j pandas

print("✅ Libraries installed! Now we can connect to Neo4j.")

### Step 2: Import Libraries and Set Up Connection

Let's import what we need and establish our connection to Neo4j:

In [None]:
# Import the libraries we'll use
from neo4j import GraphDatabase
import pandas as pd
import os

# These are the connection details for your Neo4j database
# They are read from environment variables when set; the fallback values are
# local-development defaults, so never hard-code real credentials here
NEO4J_URI = os.getenv('NEO4J_URI', 'bolt://localhost:7687')
NEO4J_USERNAME = os.getenv('NEO4J_USERNAME', 'neo4j')
NEO4J_PASSWORD = os.getenv('NEO4J_PASSWORD', 'password')

print("📚 Libraries imported!")
print(f"🔗 Will connect to Neo4j at: {NEO4J_URI}")

### Step 3: Create Our Database Connection

Now let's create a connection to Neo4j and test it:

In [None]:
# Create a driver instance - it manages connections to the database for us
# (the driver connects lazily, so the test query below is the real connection check)
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))

# Let's create a helper function to run queries easily
def run_query(query, parameters=None):
    """
    This function takes a Cypher query and runs it against our Neo4j database.
    It returns the results as a list of dictionaries - easy to work with!
    """
    with driver.session() as session:
        result = session.run(query, parameters or {})
        return [record.data() for record in result]

# Test our connection with a simple query
test_result = run_query("RETURN 'Hello, Graph World!' as greeting")
print(f"🎉 Connection successful! Neo4j says: {test_result[0]['greeting']}")

## 🎯 Knowledge Check #1

Before we continue, let's make sure you understand the basics:

**Question**: What are the three fundamental components of a graph database?

**Your Answer**: (Think about it, then check below)

<details>
<summary>Click to see the answer</summary>

1. **Nodes** - The entities/things in your data
2. **Relationships** - The connections between nodes
3. **Properties** - The attributes/details stored on nodes and relationships

</details>

---

# Lesson 2: Introduction to Neo4j (10 minutes)

## 🏢 What Makes Neo4j Special?

Neo4j is one of the most widely used graph databases. Here's why it's a good fit for AI applications:

1. **Native Graph Storage**: Data is stored as graphs, not tables converted to graphs
2. **Cypher Query Language**: Intuitive, pattern-based queries that read like English
3. **ACID Compliance**: Reliable transactions for enterprise applications
4. **High Performance**: Optimized for traversing relationships quickly
5. **AI Integration**: Vector indexes for embeddings in recent versions, plus graph algorithms through the Graph Data Science library

## 🗃️ Understanding Neo4j's Data Model

Let's explore how Neo4j organizes data by creating a simple example. We'll model a small social network to understand the concepts.

### Our Example: A Tiny Social Network

Imagine we have:
- **People** (nodes with names and ages)
- **Friendships** (relationships between people)
- **Interests** (nodes representing hobbies)
- **Likes** (relationships from people to interests)

### Let's Build Our First Graph!

We'll start by clearing any existing data, then create our social network:

In [None]:
# First, let's clear any existing data to start fresh
# WARNING: this query deletes ALL nodes and relationships in the database,
# so only run it against a throwaway or tutorial instance
run_query("MATCH (n) DETACH DELETE n")
print("🧹 Database cleared - we're starting with a clean slate!")

Now let's create our first nodes. In Cypher, we use the `CREATE` command:

In [None]:
# Create people nodes
# Notice the syntax: (variable:Label {property: value})
create_people_query = """
CREATE (alice:Person {name: 'Alice', age: 28, city: 'New York'}),
       (bob:Person {name: 'Bob', age: 35, city: 'San Francisco'}),
       (carol:Person {name: 'Carol', age: 31, city: 'Chicago'})
"""

run_query(create_people_query)
print("👥 Created 3 people in our social network!")

# Let's verify they were created
people = run_query("MATCH (p:Person) RETURN p.name as name, p.age as age, p.city as city")
print("\nOur people:")
for person in people:
    print(f"  - {person['name']}, age {person['age']}, lives in {person['city']}")

### 🔗 Adding Relationships

Now comes the magic - let's connect our people with friendships! This is where graphs really shine:

In [None]:
# Create friendships between people
# We use MATCH to find existing nodes, then CREATE relationships
create_friendships_query = """
MATCH (alice:Person {name: 'Alice'}),
      (bob:Person {name: 'Bob'}),
      (carol:Person {name: 'Carol'})
CREATE (alice)-[:FRIENDS_WITH {since: '2020-01-15'}]->(bob),
       (bob)-[:FRIENDS_WITH {since: '2021-03-10'}]->(carol),
       (alice)-[:FRIENDS_WITH {since: '2019-08-22'}]->(carol)
"""

run_query(create_friendships_query)
print("🤝 Created friendships! Our people are now connected.")

# Let's see the friendships
friendships = run_query("""
MATCH (person1:Person)-[friendship:FRIENDS_WITH]->(person2:Person)
RETURN person1.name as friend1, person2.name as friend2, friendship.since as since
""")

print("\nFriendships in our network:")
for friendship in friendships:
    print(f"  - {friendship['friend1']} is friends with {friendship['friend2']} (since {friendship['since']})")

## 💡 Understanding What Just Happened

Let's break down what we just did:

1. **Created Nodes**: `(alice:Person {name: 'Alice', age: 28})`
   - `alice` is a variable (a temporary name that exists only within this query and is not stored)
   - `Person` is a label (like a category)
   - `{name: 'Alice', age: 28}` are properties

2. **Created Relationships**: `(alice)-[:FRIENDS_WITH]->(bob)`
   - `[:FRIENDS_WITH]` is the relationship type
   - The arrow `->` shows direction
   - Relationships can have properties too!

3. **Queried Data**: `MATCH (p:Person) RETURN p.name`
   - `MATCH` finds patterns in the graph
   - `RETURN` specifies what to give back

## 🎯 Try It Yourself!

**Challenge**: Add yourself to the social network! Modify the code below:

In [None]:
# TODO: Add yourself to the social network!
# 1. Replace 'YourName' with your actual name
# 2. Update the age and city
# 3. Run this cell

add_yourself_query = """
CREATE (you:Person {name: 'YourName', age: 25, city: 'YourCity'})
"""

# Uncomment the line below after updating the query
# run_query(add_yourself_query)
# print("🎉 Welcome to the social network!")

# Verify everyone is there
all_people = run_query("MATCH (p:Person) RETURN p.name as name ORDER BY p.name")
print(f"\nTotal people in our network: {len(all_people)}")
for person in all_people:
    print(f"  - {person['name']}")
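
Editing values inside the query string works for a quick experiment, but the `run_query` helper also accepts a `parameters` dictionary. In Cypher, `$name` marks a parameter that the driver fills in, which avoids quoting problems and query injection. The sketch below only builds the query and its parameters, so it runs without a database; `new_person_query` is a hypothetical helper name and the values are placeholders.

```python
def new_person_query(name, age, city):
    """Return a parameterized CREATE query and the values to bind to it."""
    query = "CREATE (p:Person {name: $name, age: $age, city: $city})"
    parameters = {"name": name, "age": age, "city": city}
    return query, parameters

query, parameters = new_person_query("Dana", 25, "Austin")  # placeholder values
print(query)
print(parameters)

# With a live connection, pass both to the helper defined earlier:
# run_query(query, parameters)
```

Because the values never become part of the query text, a name containing a quote character cannot break or alter the query.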

---

# Lesson 3: Basic Cypher Queries (15 minutes)

## 🎯 Learning Cypher: The Graph Query Language

Cypher is designed to be intuitive. If you can draw a graph on a whiteboard, you can write Cypher! Let's learn the essential patterns.

## 📝 The Five Essential Cypher Patterns

### 1. **MATCH** - Finding Things

`MATCH` plays the role of `FROM` and `JOIN` in SQL: it describes the graph pattern to look for, while `RETURN` (like `SELECT`) chooses what comes back:

In [None]:
# Basic MATCH: Find all people
print("1️⃣ Finding all people:")
all_people = run_query("MATCH (p:Person) RETURN p.name, p.age, p.city")
for person in all_people:
    print(f"   {person['p.name']}, {person['p.age']} years old, from {person['p.city']}")

print("\n2️⃣ Finding people with specific conditions:")
young_people = run_query("MATCH (p:Person) WHERE p.age < 30 RETURN p.name, p.age")
for person in young_people:
    print(f"   {person['p.name']} is {person['p.age']} years old")

### 2. **Relationship Patterns** - Following Connections

This is where graphs become powerful:

In [None]:
# Find who is friends with whom
print("3️⃣ Direct friendships:")
direct_friends = run_query("""
MATCH (person1:Person)-[:FRIENDS_WITH]->(person2:Person)
RETURN person1.name as from_person, person2.name as to_person
""")

for friendship in direct_friends:
    print(f"   {friendship['from_person']} → {friendship['to_person']}")

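# An undirected pattern (no arrowhead) matches a relationship in either direction;
# the name comparison keeps each pair from being listed twice
print("\n4️⃣ Friendships regardless of direction:")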
mutual_friends = run_query("""
MATCH (person1:Person)-[:FRIENDS_WITH]-(person2:Person)
WHERE person1.name < person2.name
RETURN person1.name as person1, person2.name as person2
""")

for friendship in mutual_friends:
    print(f"   {friendship['person1']} ↔ {friendship['person2']}")

### 3. **Aggregation** - Counting and Grouping

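Just like SQL, we can count, sum, and group. Cypher has no `GROUP BY` clause, though; every non-aggregated expression in `RETURN` acts as a grouping key: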

In [None]:
# Count total people and friendships
print("5️⃣ Network statistics:")
# count(DISTINCT r) counts the friendships themselves; each one is matched from both ends
stats = run_query("""
MATCH (p:Person)
OPTIONAL MATCH (p)-[r:FRIENDS_WITH]-()
RETURN count(DISTINCT p) as total_people,
       count(DISTINCT r) as total_connections
""")

stat = stats[0]
print(f"   Total people: {stat['total_people']}")
print(f"   Total connections: {stat['total_connections']}")

# Find who has the most friends
# (on a tie, LIMIT 1 keeps only the first row, so we also sort by name)
print("\n6️⃣ Most popular person:")
popularity = run_query("""
MATCH (p:Person)-[:FRIENDS_WITH]-(friend:Person)
RETURN p.name as person, count(friend) as friend_count
ORDER BY friend_count DESC, person
LIMIT 1
""")

if popularity:
    popular = popularity[0]
    print(f"   {popular['person']} has {popular['friend_count']} friends!")
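
To check what the friend-count aggregation should produce, the same tally can be done by hand in plain Python. This sketch copies the three friendships created earlier into a list (the list itself is an illustration, not something read from Neo4j) and counts each relationship for both endpoints, as the undirected pattern does.

```python
from collections import Counter

# The three FRIENDS_WITH relationships created earlier, as (from, to) pairs
friendships = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]

friend_count = Counter()
for a, b in friendships:
    # Undirected match: the relationship counts for both people
    friend_count[a] += 1
    friend_count[b] += 1

print(dict(friend_count))  # {'Alice': 2, 'Bob': 2, 'Carol': 2}
```

Everyone has two friends in this small network, so "most popular" is a three-way tie and a `LIMIT 1` query reports only one of the tied people.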

### 4. **Path Finding** - The Graph Superpower

This is where graphs really shine - finding paths between nodes:

In [None]:
# Find paths between people
print("7️⃣ Finding paths in our network:")

# Direct paths (1 hop)
direct_paths = run_query("""
MATCH path = (start:Person)-[:FRIENDS_WITH]-(end:Person)
WHERE start.name = 'Alice' AND end.name = 'Bob'
RETURN length(path) as path_length
""")

if direct_paths:
    print(f"   Alice and Bob are {direct_paths[0]['path_length']} hop(s) apart")

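# Find everyone within 3 hops of Alice, keeping only the shortest distance to each
# (without min() and the alice <> other check, every path is listed separately,
# including round trips that lead back to Alice herself)
all_paths = run_query("""
MATCH path = (alice:Person {name: 'Alice'})-[:FRIENDS_WITH*1..3]-(other:Person)
WHERE other <> alice
RETURN other.name as reachable_person, min(length(path)) as distance
ORDER BY distance, reachable_person
""")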

print("\n   People Alice can reach:")
for path in all_paths:
    print(f"     {path['reachable_person']} (distance: {path['distance']})")
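
A variable-length pattern such as `[:FRIENDS_WITH*1..3]` produces one row per path, and within a path Cypher forbids reusing a relationship but not revisiting a node. The plain-Python sketch below enumerates paths under that same rule for our three friendships; the `trails` function is a hypothetical illustration, not Neo4j's algorithm.

```python
# The three FRIENDS_WITH relationships created earlier, as (from, to) pairs
friendships = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]

def trails(start, max_hops):
    """Yield (end_node, hops) for every path that never reuses a relationship."""
    def walk(node, used, hops):
        if hops > 0:
            yield node, hops
        if hops == max_hops:
            return
        for i, (a, b) in enumerate(friendships):
            if i in used or node not in (a, b):
                continue  # relationship already on this path, or not attached to node
            other = b if node == a else a
            yield from walk(other, used | {i}, hops + 1)
    yield from walk(start, frozenset(), 0)

found = sorted(trails("Alice", 3))
print(found)
# [('Alice', 3), ('Alice', 3), ('Bob', 1), ('Bob', 2), ('Carol', 1), ('Carol', 2)]

# Keep only the shortest distance to each other person
shortest = {}
for person, hops in found:
    if person != "Alice":
        shortest[person] = min(hops, shortest.get(person, hops))
print(shortest)  # {'Bob': 1, 'Carol': 1}
```

Six paths exist even though only two other people are reachable, which is why path queries often aggregate with `min(length(path))` or filter out the start node.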

### 5. **Creating and Updating** - Modifying the Graph

Let's add some interests to make our network more interesting:

In [None]:
# Create interest nodes and connect people to them
print("8️⃣ Adding interests to our network:")

create_interests = """
CREATE (python:Interest {name: 'Python Programming'}),
       (ai:Interest {name: 'Artificial Intelligence'}),
       (music:Interest {name: 'Music'}),
       (travel:Interest {name: 'Travel'})
"""
run_query(create_interests)

# Connect people to interests
connect_interests = """
MATCH (alice:Person {name: 'Alice'}),
      (bob:Person {name: 'Bob'}),
      (carol:Person {name: 'Carol'}),
      (python:Interest {name: 'Python Programming'}),
      (ai:Interest {name: 'Artificial Intelligence'}),
      (music:Interest {name: 'Music'}),
      (travel:Interest {name: 'Travel'})
CREATE (alice)-[:INTERESTED_IN]->(python),
       (alice)-[:INTERESTED_IN]->(ai),
       (bob)-[:INTERESTED_IN]->(python),
       (bob)-[:INTERESTED_IN]->(travel),
       (carol)-[:INTERESTED_IN]->(music),
       (carol)-[:INTERESTED_IN]->(travel)
"""
run_query(connect_interests)

print("   ✅ Added interests and connections!")

# Now let's see who likes what
interests_query = run_query("""
MATCH (person:Person)-[:INTERESTED_IN]->(interest:Interest)
RETURN person.name as person, collect(interest.name) as interests
ORDER BY person.name
""")

print("\n   People and their interests:")
for person_interest in interests_query:
    interests_list = ', '.join(person_interest['interests'])
    print(f"     {person_interest['person']}: {interests_list}")

## 🎯 Try It Yourself - Advanced Queries!

Now you have a richer dataset. Try these challenges:

**Challenge 1**: Find people who share the same interests

In [None]:
# Challenge 1: Find people with shared interests
shared_interests = run_query("""
MATCH (person1:Person)-[:INTERESTED_IN]->(interest:Interest)<-[:INTERESTED_IN]-(person2:Person)
WHERE person1.name < person2.name
RETURN person1.name as person1, person2.name as person2, interest.name as shared_interest
""")

print("🤝 People with shared interests:")
for match in shared_interests:
    print(f"   {match['person1']} and {match['person2']} both like {match['shared_interest']}")
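
The shared-interest pattern is, at heart, a set intersection per pair of people. Here is a minimal plain-Python sketch using the interests assigned earlier; the `interests` dictionary is a hand-copied illustration rather than data read from Neo4j.

```python
from itertools import combinations

# Interests as assigned in the network above
interests = {
    "Alice": {"Python Programming", "Artificial Intelligence"},
    "Bob": {"Python Programming", "Travel"},
    "Carol": {"Music", "Travel"},
}

# combinations() over sorted names yields each pair once, like person1.name < person2.name
shared = {
    (p1, p2): interests[p1] & interests[p2]
    for p1, p2 in combinations(sorted(interests), 2)
    if interests[p1] & interests[p2]
}

for (p1, p2), common in shared.items():
    print(f"{p1} and {p2} both like {', '.join(sorted(common))}")
```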

**Challenge 2**: Recommend new friends based on shared interests

In [None]:
# Challenge 2: Friend recommendations
# Find people Alice might want to be friends with (shared interests, not already friends)
recommendations = run_query("""
MATCH (alice:Person {name: 'Alice'})-[:INTERESTED_IN]->(interest:Interest)<-[:INTERESTED_IN]-(potential_friend:Person)
WHERE NOT (alice)-[:FRIENDS_WITH]-(potential_friend) AND alice <> potential_friend
RETURN potential_friend.name as recommended_friend,
       collect(interest.name) as shared_interests,
       count(interest) as interest_count
ORDER BY interest_count DESC
""")

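print("💡 Friend recommendations for Alice:")
if not recommendations:
    # With the data above, Bob shares an interest with Alice but is already her friend
    print("   (none - Alice is already friends with everyone who shares her interests)")
for rec in recommendations:
    shared = ', '.join(rec['shared_interests'])
    print(f"   {rec['recommended_friend']} (shared interests: {shared})")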

## 🎯 Knowledge Check #2

**Question**: What's the difference between these two Cypher patterns?
1. `(a:Person)-[:FRIENDS_WITH]->(b:Person)`
2. `(a:Person)-[:FRIENDS_WITH]-(b:Person)`

**Your Answer**: (Think about it, then check below)

<details>
<summary>Click to see the answer</summary>

1. `(a)-[:FRIENDS_WITH]->(b)` - **Directed**: Only matches where `a` has an outgoing FRIENDS_WITH relationship to `b`
2. `(a)-[:FRIENDS_WITH]-(b)` - **Undirected**: Matches relationships in either direction between `a` and `b`

In our social network, each friendship was created in one direction only (for example, Alice → Bob). Pattern #1 with `a` bound to Bob would therefore not find Alice, while pattern #2 finds the friendship regardless of direction.

</details>
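
The difference is easy to check by hand. This plain-Python sketch treats the three friendships created earlier as `(from, to)` pairs; the two helper functions are hypothetical stand-ins for the two patterns.

```python
# The three FRIENDS_WITH relationships created earlier, as (from, to) pairs
friendships = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol")]

def directed_friends(person):
    """Pattern 1: (person)-[:FRIENDS_WITH]->(friend), outgoing only."""
    return sorted(b for a, b in friendships if a == person)

def undirected_friends(person):
    """Pattern 2: (person)-[:FRIENDS_WITH]-(friend), either direction."""
    outgoing = {b for a, b in friendships if a == person}
    incoming = {a for a, b in friendships if b == person}
    return sorted(outgoing | incoming)

print(directed_friends("Carol"))    # []
print(undirected_friends("Carol"))  # ['Alice', 'Bob']
```

Carol has no outgoing friendships, so the directed pattern returns nothing for her even though she has two friends.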

---

# Lesson 4: Hands-on Exercise (10 minutes)

## 🏗️ Build Your Own Mini Social Network

Now it's your turn! Let's create a more complex scenario and practice what you've learned.

### 📋 Your Mission

You're building a professional networking app. Create a graph that includes:
- **People** with names, skills, and experience levels
- **Companies** where people work
- **Skills** that people have
- **Connections** between people (professional relationships)

### Step 1: Design Your Data Model

Before coding, think about:
- What **nodes** do you need?
- What **relationships** connect them?
- What **properties** should each have?
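
One way to answer these questions before touching the database is to write the model down as data. The sketch below lists the labels, relationship types, and properties that the following steps use; the `undeclared_labels` helper is a hypothetical sanity check, and person-to-person connections are left out because the steps below do not create them.

```python
# Node labels and the properties each one carries
node_labels = {
    "Person": ["name", "title", "experience_years"],
    "Company": ["name", "industry", "size"],
    "Skill": ["name", "category"],
}

# (start label, relationship type, end label, relationship properties)
relationship_types = [
    ("Person", "WORKS_AT", "Company", ["start_date", "role"]),
    ("Person", "HAS_SKILL", "Skill", ["proficiency", "years"]),
]

def undeclared_labels(node_labels, relationship_types):
    """Return labels that a relationship refers to but node_labels does not define."""
    used = {label for start, _, end, _ in relationship_types for label in (start, end)}
    return sorted(used - set(node_labels))

print(undeclared_labels(node_labels, relationship_types))  # []
```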

Let's start with a clean slate:

In [None]:
# Clear the database for our new exercise
run_query("MATCH (n) DETACH DELETE n")
print("🧹 Database cleared - ready for your professional network!")

### Step 2: Create Your Professional Network

Let's build this step by step. First, create some companies:

In [None]:
# Create companies
create_companies = """
CREATE (tech_corp:Company {name: 'TechCorp', industry: 'Technology', size: 'Large'}),
       (data_inc:Company {name: 'DataInc', industry: 'Analytics', size: 'Medium'}),
       (ai_startup:Company {name: 'AI Startup', industry: 'AI/ML', size: 'Small'})
"""

run_query(create_companies)
print("🏢 Created companies!")

# Verify companies were created
companies = run_query("MATCH (c:Company) RETURN c.name, c.industry, c.size")
print("\nOur companies:")
for company in companies:
    print(f"  - {company['c.name']} ({company['c.industry']}, {company['c.size']})")

Now create professionals with different skills:

In [None]:
# Create professionals
create_professionals = """
CREATE (sarah:Person {name: 'Sarah Johnson', title: 'Data Scientist', experience_years: 5}),
       (mike:Person {name: 'Mike Chen', title: 'Software Engineer', experience_years: 8}),
       (lisa:Person {name: 'Lisa Rodriguez', title: 'ML Engineer', experience_years: 3}),
       (david:Person {name: 'David Kim', title: 'Product Manager', experience_years: 6}),
       (emma:Person {name: 'Emma Wilson', title: 'AI Researcher', experience_years: 4})
"""

run_query(create_professionals)
print("👩‍💼👨‍💼 Created professionals!")

# Create skills
create_skills = """
CREATE (python:Skill {name: 'Python', category: 'Programming'}),
       (sql:Skill {name: 'SQL', category: 'Database'}),
       (ml:Skill {name: 'Machine Learning', category: 'AI/ML'}),
       (leadership:Skill {name: 'Leadership', category: 'Management'}),
       (statistics:Skill {name: 'Statistics', category: 'Analytics'})
"""

run_query(create_skills)
print("🛠️ Created skills!")

### Step 3: Create the Connections

Now let's connect people to companies and skills:

In [None]:
# Connect people to companies (who works where)
create_employment = """
MATCH (sarah:Person {name: 'Sarah Johnson'}),
      (mike:Person {name: 'Mike Chen'}),
      (lisa:Person {name: 'Lisa Rodriguez'}),
      (david:Person {name: 'David Kim'}),
      (emma:Person {name: 'Emma Wilson'}),
      (tech_corp:Company {name: 'TechCorp'}),
      (data_inc:Company {name: 'DataInc'}),
      (ai_startup:Company {name: 'AI Startup'})
CREATE (sarah)-[:WORKS_AT {start_date: '2022-01-15', role: 'Senior Data Scientist'}]->(data_inc),
       (mike)-[:WORKS_AT {start_date: '2020-03-01', role: 'Lead Engineer'}]->(tech_corp),
       (lisa)-[:WORKS_AT {start_date: '2023-06-01', role: 'ML Engineer'}]->(ai_startup),
       (david)-[:WORKS_AT {start_date: '2021-09-15', role: 'Product Manager'}]->(tech_corp),
       (emma)-[:WORKS_AT {start_date: '2023-02-01', role: 'Research Scientist'}]->(ai_startup)
"""

run_query(create_employment)
print("💼 Connected people to companies!")

# Connect people to skills
create_skill_relationships = """
MATCH (sarah:Person {name: 'Sarah Johnson'}),
      (mike:Person {name: 'Mike Chen'}),
      (lisa:Person {name: 'Lisa Rodriguez'}),
      (david:Person {name: 'David Kim'}),
      (emma:Person {name: 'Emma Wilson'}),
      (python:Skill {name: 'Python'}),
      (sql:Skill {name: 'SQL'}),
      (ml:Skill {name: 'Machine Learning'}),
      (leadership:Skill {name: 'Leadership'}),
      (statistics:Skill {name: 'Statistics'})
CREATE (sarah)-[:HAS_SKILL {proficiency: 'Expert', years: 5}]->(python),
       (sarah)-[:HAS_SKILL {proficiency: 'Expert', years: 5}]->(sql),
       (sarah)-[:HAS_SKILL {proficiency: 'Advanced', years: 4}]->(ml),
       (sarah)-[:HAS_SKILL {proficiency: 'Intermediate', years: 3}]->(statistics),
       (mike)-[:HAS_SKILL {proficiency: 'Expert', years: 8}]->(python),
       (mike)-[:HAS_SKILL {proficiency: 'Intermediate', years: 3}]->(leadership),
       (lisa)-[:HAS_SKILL {proficiency: 'Advanced', years: 3}]->(python),
       (lisa)-[:HAS_SKILL {proficiency: 'Expert', years: 3}]->(ml),
       (david)-[:HAS_SKILL {proficiency: 'Advanced', years: 4}]->(leadership),
       (david)-[:HAS_SKILL {proficiency: 'Beginner', years: 1}]->(sql),
       (emma)-[:HAS_SKILL {proficiency: 'Expert', years: 4}]->(ml),
       (emma)-[:HAS_SKILL {proficiency: 'Advanced', years: 3}]->(python),
       (emma)-[:HAS_SKILL {proficiency: 'Expert', years: 4}]->(statistics)
"""

run_query(create_skill_relationships)
print("🎯 Connected people to skills!")

### Step 4: Explore Your Network

Now let's ask some interesting questions about your professional network:

In [None]:
# Question 1: Who works at each company?
print("🏢 Company rosters:")
company_employees = run_query("""
MATCH (person:Person)-[:WORKS_AT]->(company:Company)
RETURN company.name as company, 
       collect(person.name + ' (' + person.title + ')') as employees
ORDER BY company.name
""")

for company_info in company_employees:
    employees_list = ', '.join(company_info['employees'])
    print(f"   {company_info['company']}: {employees_list}")

print("\n🎯 Skill experts:")
# Question 2: Who are the experts in each skill?
skill_experts = run_query("""
MATCH (person:Person)-[skill_rel:HAS_SKILL]->(skill:Skill)
WHERE skill_rel.proficiency = 'Expert'
RETURN skill.name as skill_name,
       collect(person.name) as experts
ORDER BY skill_name
""")

for skill_info in skill_experts:
    experts_list = ', '.join(skill_info['experts'])
    print(f"   {skill_info['skill_name']}: {experts_list}")

### 🎯 Your Challenge - Advanced Analytics!

Now try these advanced queries. Can you understand what each one does?

In [None]:
# Challenge 1: Find potential collaborators
# People who work at different companies but have similar skills
print("🤝 Potential cross-company collaborators:")
collaborators = run_query("""
MATCH (person1:Person)-[:HAS_SKILL]->(skill:Skill)<-[:HAS_SKILL]-(person2:Person)
MATCH (person1)-[:WORKS_AT]->(company1:Company)
MATCH (person2)-[:WORKS_AT]->(company2:Company)
WHERE company1 <> company2 AND person1.name < person2.name
RETURN person1.name as person1, 
       company1.name as company1,
       person2.name as person2, 
       company2.name as company2,
       collect(skill.name) as shared_skills
""")

for collab in collaborators:
    skills = ', '.join(collab['shared_skills'])
    print(f"   {collab['person1']} ({collab['company1']}) ↔ {collab['person2']} ({collab['company2']})")
    print(f"     Shared skills: {skills}")

# Challenge 2: Company skill analysis
print("\n🏢 Company skill analysis:")
company_skills = run_query("""
MATCH (person:Person)-[:WORKS_AT]->(company:Company)
MATCH (person)-[:HAS_SKILL]->(skill:Skill)
RETURN company.name as company,
       skill.name as skill,
       count(*) as skill_count,
       avg(toFloat(coalesce(person.experience_years, 0))) as avg_experience
ORDER BY company, skill_count DESC
""")

current_company = ""
for skill_info in company_skills:
    if skill_info['company'] != current_company:
        current_company = skill_info['company']
        print(f"\n   {current_company}:")
    print(f"     {skill_info['skill']}: {skill_info['skill_count']} people (avg exp: {skill_info['avg_experience']:.1f} years)")

## 🏆 What You Just Accomplished!

In this hands-on exercise, you:

1. **Designed a complex data model** with multiple node types and relationships
2. **Created realistic professional network data** with companies, people, and skills
3. **Wrote advanced Cypher queries** for business insights
4. **Discovered patterns** that would be difficult to find in traditional databases

### 💡 Real-World Applications

Patterns like the ones you just learned show up in:
- **Professional contact recommendations** (the kind LinkedIn offers)
- **Company talent analytics**
- **Skill gap analysis**
- **Team formation for projects**
- **Career path recommendations**

---

# 🎓 Module 1 Summary

## What You've Learned

Congratulations! You've completed Module 1 and learned:

### ✅ **Core Concepts**
- **Nodes, Relationships, and Properties** - the building blocks of graphs
- **When to use graphs** vs. traditional databases
- **Graph thinking** - modeling problems as connected data

### ✅ **Technical Skills**
- **Neo4j connection** and basic setup
- **Essential Cypher patterns**: MATCH, CREATE, relationships, aggregation
- **Path finding** and graph traversal
- **Complex queries** for business insights

### ✅ **Practical Experience**
- Built two different networks (social and professional)
- Wrote queries for real-world scenarios
- Analyzed patterns and relationships

## 🚀 What's Next?

In **Module 2: Structured Data**, you'll learn:
- Advanced data modeling techniques
- Importing large datasets efficiently
- Performance optimization
- Schema design best practices

## 🎯 Final Knowledge Check

Before moving on, make sure you can:

1. **Explain** when you'd choose a graph database over SQL
2. **Write** basic Cypher queries to find patterns
3. **Design** a simple graph model for a business problem
4. **Understand** how relationships make graphs powerful

## 💪 Practice Challenge

**Try this on your own**: Model a movie recommendation system with:
- Movies, Actors, Directors, Genres
- User ratings and preferences
- Find similar movies and recommend new ones

You now have all the tools you need!

---

**🎉 Congratulations on completing Module 1! You're ready for the next level of graph mastery.**

In [None]:
# Optional: Clean up and close connection
# Uncomment these lines if you want to clean up
# run_query("MATCH (n) DETACH DELETE n")
# driver.close()

print("🎊 Module 1 Complete! Well done!")
print("🚀 Ready for Module 2: Structured Data")