<div style="text-align: right;">© 2026 Moses Boudourides. All Rights Reserved.</div>

# LLMs for Qualitative and Mixed-Methods Social Network Analysis (SNA)
## Moses Boudourides

# Session 3: LLM Capabilities and Mixed-Methods Designs

In [1]:
import networkx as nx
import pandas as pd
import matplotlib.pyplot as plt
from pyvis.network import Network
import os
import json
import sys
import hashlib
import random
from sklearn.datasets import fetch_20newsgroups
import IPython
from openai import OpenAI
# import google.generativeai as genai

In [2]:
# --- 1. & 2. KEY LOADING & INITIALIZATION ---

# Force Google to use REST to avoid ALTS/GCP credential errors
os.environ["GOOGLE_API_USE_MTLS"] = "never" 

def get_api_key(file_path):
    if os.path.exists(file_path):
        with open(file_path, 'r') as f:
            return f.read().strip().replace('"', '').replace("'", "")
    return None

oa_key = get_api_key("openai_key.txt")
# gem_key = get_api_key("gemini_key.txt")

# Initialize OpenAI
client_oa = OpenAI(api_key=oa_key)

# # Initialize Gemini using 'rest' transport to bypass gRPC/ALTS errors
# genai.configure(api_key=gem_key, transport='rest')

# # Dynamic Model Selection
# available_models = [m.name for m in genai.list_models() if 'generateContent' in m.supported_generation_methods]
# target_model = 'gemini-1.5-flash' if 'models/gemini-1.5-flash' in available_models else available_models[0].split('/')[-1]
# model_gemini = genai.GenerativeModel(target_model)

## Part 1: Qualitative Text Corpus and LLM-Assisted Analysis

**Goal:**  
Demonstrate how LLMs can assist qualitative coding, memoing, and abductive reasoning without replacing interpretation.

This notebook uses the OpenAI API for live LLM analysis. All LLM outputs are **provisional** and must be interpreted by the researcher.

In [3]:
# --- 3. DATA & PERSISTENT QUERY STEP ---

# Simulated interview excerpts describing relationships
texts = [
    "I rely on Alex when decisions become difficult.",
    "Jordan often undermines my authority in meetings.",
    "Sam provides emotional support but avoids formal responsibility.",
    "Alex and Jordan appear aligned publicly but compete privately."
]

df = pd.DataFrame({"text": texts})
df

Unnamed: 0,text
0,I rely on Alex when decisions become difficult.
1,Jordan often undermines my authority in meetings.
2,Sam provides emotional support but avoids form...
3,Alex and Jordan appear aligned publicly but co...


## LLM-Assisted Provisional Coding

We ask the LLM to suggest relational codes. Codes are not final categories but provisional interpretations that the researcher must review and refine.

In [4]:
# 2. Persistence Logic
CACHE_FILE = "llm_cache_s3.json"

if os.path.exists(CACHE_FILE):
    with open(CACHE_FILE, "r") as f:
        cache = json.load(f)
else:
    cache = {}

def get_label(model_id, text, api_func, prompt_type="coding"):
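    # Key on model, prompt type, and a hash of the full text, so texts that
    # share their first characters do not collide. Editing a prompt template
    # does not change the key; pass a new prompt_type to force a fresh call.
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    cache_key = f"{model_id}_{prompt_type}_{text_hash}"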
    
    if cache_key in cache:
        return cache[cache_key]
    
    # Cache Miss: Call API
    result = api_func(text)
    cache[cache_key] = result
    
    # Save updated cache to disk
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)
    return result

# 3. API Execution Wrappers
def query_openai_coding(text):
    prompt = f"""Propose provisional qualitative codes for social relationships in this excerpt. 
List only the codes, one per line, without explanation.

Text: {text}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

# 4. Run Process
print("Processing Qualitative Coding (Syncing Cache/API)...")
print()

for i, text in enumerate(texts):
    codes = get_label("openai", text, query_openai_coding, "coding")
    print(f"Text {i+1}: {text}")
    print(f"Provisional Codes:\n{codes}")
    print("-" * 60)

print("Status: All texts coded.")

Processing Qualitative Coding (Syncing Cache/API)...

Text 1: I rely on Alex when decisions become difficult.
Provisional Codes:
- Dependence
- Trust
- Support
- Decision-making partnership
------------------------------------------------------------
Text 2: Jordan often undermines my authority in meetings.
Provisional Codes:
- Authority Undermined  
- Interpersonal Conflict  
- Power Dynamics  
- Professional Disrespect  
- Team Disruption  
- Communication Breakdown  
- Lack of Support
------------------------------------------------------------
Text 3: Sam provides emotional support but avoids formal responsibility.
Provisional Codes:
- Emotional Support
- Avoidance of Responsibility
------------------------------------------------------------
Text 4: Alex and Jordan appear aligned publicly but compete privately.
Provisional Codes:
- Public Alignment  
- Private Competition  
- Social Facade  
- Interpersonal Dynamics  
- Hidden Rivalry
----------------------------------------------
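
The raw responses above mix bullet markers, inconsistent capitalization, and trailing spaces, which makes them awkward to review side by side. The following is a minimal sketch of one way to normalize them before the researcher refines the codes. It assumes one code per line with an optional `-` or `*` bullet; `parse_codes` and `raw_by_text` are illustrative names that do not appear in the notebook, and the strings are copied from the output above rather than fetched from the API.

```python
from collections import Counter

def parse_codes(raw_response):
    """Split a one-code-per-line LLM response into normalized code labels."""
    codes = []
    for line in raw_response.splitlines():
        # Drop list markers and the trailing spaces visible in the raw output
        label = line.strip().lstrip("-* ").strip()
        if label:
            codes.append(label.lower())
    return codes

# Two responses copied from the output above (no new LLM calls are made here)
raw_by_text = {
    "Text 1": "- Dependence\n- Trust\n- Support\n- Decision-making partnership",
    "Text 3": "- Emotional Support\n- Avoidance of Responsibility",
}

parsed = {text_id: parse_codes(raw) for text_id, raw in raw_by_text.items()}

# Count words across code labels as a rough cue for overlapping codes
word_counts = Counter(
    word for codes in parsed.values() for code in codes for word in code.split()
)

for text_id, codes in parsed.items():
    print(text_id, codes)
print(word_counts.most_common(3))
```

A word that recurs across labels (here "support") only hints that two codes may overlap; whether to merge them remains the researcher's decision.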

## Relational Interpretation

Beyond simple coding, we ask the LLM to interpret the relational meaning of each excerpt: the dynamics, power structures, and emotional undertones in the relationships described.

In [5]:
def query_openai_interpretation(text):
    prompt = f"""Interpret the relational dynamics in this excerpt. Identify:
1. The type of relationship (e.g., hierarchical, collegial, dependent)
2. The emotional or power dynamic
3. Any ambiguities or contradictions

Text: {text}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

print("Processing Relational Interpretation...")
print()

for i, text in enumerate(texts):
    interpretation = get_label("openai", text, query_openai_interpretation, "interpretation")
    print(f"Text {i+1}: {text}")
    print(f"Relational Interpretation:\n{interpretation}")
    print("-" * 60)

Processing Relational Interpretation...

Text 1: I rely on Alex when decisions become difficult.
Relational Interpretation:
Based on the excerpt provided, here's the interpretation of the relational dynamics:

1. **Type of Relationship**: The relationship appears to be **dependent**. The speaker indicates a reliance on Alex, suggesting that they look to him for support or guidance, particularly in challenging situations. This dynamic indicates that the speaker may view Alex as a source of expertise or authority in decision-making.

2. **Emotional or Power Dynamic**: The emotional dynamic suggests a level of trust and vulnerability from the speaker towards Alex. The speaker acknowledges their difficulty in making decisions, which implies a certain degree of insecurity or uncertainty. In terms of power dynamics, Alex may hold a position of influence or authority over the speaker, as the latter seeks assistance when faced with tough choices. This can create an imbalance, where Alex may 

## LLM-Assisted Memo Drafting

We ask the LLM to draft a reflective memo that surfaces themes, suggests theoretical links, and identifies patterns across the corpus. The researcher must then interpret, contextualize, and theorize these insights.

In [6]:
def query_openai_memo():
    texts_str = "\n".join([f"- {t}" for t in texts])
    prompt = f"""Write a short analytic memo reflecting on power, trust, and ambiguity in these relationship descriptions.
Address themes that emerge across multiple texts.

Texts:
{texts_str}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

print("Generating Analytic Memo...")
print()

memo_key = "openai_memo_corpus"
if memo_key in cache:
    memo = cache[memo_key]
else:
    memo = query_openai_memo()
    cache[memo_key] = memo
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

print("ANALYTIC MEMO:")
print(memo)
print()
print("Note: The memo surfaces themes and suggests links. Researchers must edit, contextualize, and theorize.")

Generating Analytic Memo...

ANALYTIC MEMO:
**Analytic Memo: Power, Trust, and Ambiguity in Relationship Dynamics**

**Introduction:**
The relationships described across the four texts reveal complex dynamics of power, trust, and ambiguity that significantly influence decision-making and organizational coherence. By analyzing these themes, we can gain insights into how interpersonal relationships shape the dynamics of authority and support within a group.

**Power Dynamics:**
Power is a central theme in these relationships. The reliance on Alex during difficult decisions suggests a distribution of power based on expertise or emotional intelligence, indicating trust in Alex's judgment. This reliance implies a hierarchical structure where Alex holds a position of influence, albeit informally. In contrast, Jordan's behavior undermines authority, suggesting a struggle for power that disrupts team cohesion. This power struggle complicates decision-making processes and may lead to an envir
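
The corpus-level memo is cached under a fixed string key, so the stored memo would still be returned if the texts or the prompt changed. One possible alternative, sketched here under stated assumptions, derives the key from a hash of the full prompt. `cached_call`, `fake_api`, and the temporary file are hypothetical stand-ins so that the example runs offline; they are not part of the notebook.

```python
import hashlib
import json
import os
import tempfile

def cached_call(cache, cache_file, model_id, prompt, api_func):
    """Return the cached response for (model_id, prompt); call api_func only on a miss."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:16]
    key = f"{model_id}_{digest}"
    if key not in cache:
        cache[key] = api_func(prompt)
        # Persist the updated cache so later sessions can reuse the response
        with open(cache_file, "w", encoding="utf-8") as f:
            json.dump(cache, f)
    return cache[key]

# Stand-in for a real API wrapper (assumption: it takes the prompt and returns text)
calls = []
def fake_api(prompt):
    calls.append(prompt)
    return f"memo for {len(prompt)} characters of prompt"

demo_cache = {}
demo_file = os.path.join(tempfile.mkdtemp(), "demo_cache.json")

first = cached_call(demo_cache, demo_file, "demo-model", "Texts:\n- A\n- B", fake_api)
second = cached_call(demo_cache, demo_file, "demo-model", "Texts:\n- A\n- B", fake_api)
third = cached_call(demo_cache, demo_file, "demo-model", "Texts:\n- A\n- C", fake_api)
print(len(calls))  # 2: the repeated prompt was served from the cache
```

Because the key follows the prompt, changing either the instructions or the corpus triggers a fresh call, while an unchanged prompt keeps returning the stored response.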

## Abductive Reasoning and Anomaly Detection

We ask the LLM to identify surprising or contradictory patterns. Abduction flags anomalies; researchers formulate hypotheses, return to data, and refine theory.

In [7]:
def query_openai_abduction():
    texts_str = "\n".join([f"- {t}" for t in texts])
    prompt = f"""Identify any surprising or contradictory relational patterns in these texts.
What patterns violate expectations? What tensions exist?

Texts:
{texts_str}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

print("Identifying Surprising Patterns (Abductive Reasoning)...")
print()

abduction_key = "openai_abduction_corpus"
if abduction_key in cache:
    abduction = cache[abduction_key]
else:
    abduction = query_openai_abduction()
    cache[abduction_key] = abduction
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

print("SURPRISING PATTERNS:")
print(abduction)
print()
print("Note: Abduction flags anomalies. Researchers formulate hypotheses, return to data, and refine theory.")

Identifying Surprising Patterns (Abductive Reasoning)...

SURPRISING PATTERNS:
The texts reveal several relational patterns that might be surprising or contradictory, as well as tensions that violate expectations:

1. **Dependence vs. Undermining**: The first text indicates a reliance on Alex during difficult decisions, suggesting trust and support. However, the second text highlights Jordan's behavior of undermining authority. This creates a tension between reliance on someone for support (Alex) and feeling undermined by another (Jordan), which could lead to conflicts or challenges in decision-making processes.

2. **Support vs. Responsibility**: The third text describes Sam as providing emotional support but avoiding formal responsibility. This pattern may violate the expectation that emotional support often comes with a willingness to take on responsibilities. This tension suggests a potential imbalance in the relationship dynamics, where emotional support is offered without a commi

## Structural vs. Qualitative Anomalies

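We look for cases where structural equivalence (the same network position) diverges from qualitative identity (different relational meanings). In this small example, a shared role label stands in for structural equivalence. Such divergences are a key prompt for abduction.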

In [8]:
# Simulated node metadata from the qualitative analysis
nodes_metadata = {
    "Alex": {"role": "Advisor", "description": "Acts as a trusted confidant for difficult decisions."},
    "Jordan": {"role": "Advisor", "description": "Acts as a formal authority figure who undermines others."},
    "Sam": {"role": "Supporter", "description": "Provides emotional support but avoids formal responsibility."}
}

print("ANOMALY DETECTION: Structural Equivalence vs. Qualitative Identity")
print()

# Check for structural equivalents with divergent qualitative descriptions
for node_a in nodes_metadata:
    for node_b in nodes_metadata:
        if node_a < node_b:  # Avoid duplicates
            if nodes_metadata[node_a]["role"] == nodes_metadata[node_b]["role"]:
                if nodes_metadata[node_a]["description"] != nodes_metadata[node_b]["description"]:
                    print(f"ANOMALY DETECTED:")
                    print(f"  {node_a} and {node_b} have the same role ('{nodes_metadata[node_a]['role']}')")
                    print(f"  but divergent relational identities:")
                    print(f"    {node_a}: {nodes_metadata[node_a]['description']}")
                    print(f"    {node_b}: {nodes_metadata[node_b]['description']}")
                    print(f"  Research Question: How does this difference affect network dynamics?")
                    print()

print("Anomaly detection complete.")

ANOMALY DETECTION: Structural Equivalence vs. Qualitative Identity

ANOMALY DETECTED:
  Alex and Jordan have the same role ('Advisor')
  but divergent relational identities:
    Alex: Acts as a trusted confidant for difficult decisions.
    Jordan: Acts as a formal authority figure who undermines others.
  Research Question: How does this difference affect network dynamics?

Anomaly detection complete.
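
The check above compares role labels rather than network positions. As a rough sketch of how the same comparison could be run on an actual graph, the example below treats two nodes as structurally equivalent when they have identical in-neighbors and out-neighbors, ignoring ties between the two nodes themselves (one common convention, adopted here as an assumption). The toy edge list and the nodes "Respondent" and "Manager" are hypothetical.

```python
from itertools import combinations
import networkx as nx

# Hypothetical toy network; "Respondent" and "Manager" are illustrative nodes
G_toy = nx.DiGraph()
G_toy.add_edges_from([
    ("Respondent", "Alex"), ("Respondent", "Jordan"),
    ("Manager", "Alex"), ("Manager", "Jordan"),
    ("Respondent", "Sam"),
])

descriptions = {
    "Alex": "Acts as a trusted confidant for difficult decisions.",
    "Jordan": "Acts as a formal authority figure who undermines others.",
    "Sam": "Provides emotional support but avoids formal responsibility.",
}

def structurally_equivalent(G, u, v):
    """True if u and v share the same in- and out-neighbors, ignoring ties between them."""
    pair = {u, v}
    same_in = set(G.predecessors(u)) - pair == set(G.predecessors(v)) - pair
    same_out = set(G.successors(u)) - pair == set(G.successors(v)) - pair
    return same_in and same_out

# Flag equivalent positions whose qualitative descriptions differ
anomalies = [
    (u, v)
    for u, v in combinations(sorted(descriptions), 2)
    if structurally_equivalent(G_toy, u, v) and descriptions[u] != descriptions[v]
]
print(anomalies)  # [('Alex', 'Jordan')]
```

Exact equivalence is rare in empirical networks, so a similarity threshold on neighbor overlap may be more practical on larger data.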


## Bias and Hallucination Check

Critical limitations of LLMs include hallucination and normalization of certain patterns. Researchers must ask:
- Does the LLM infer motives not in the data?
- Does it normalize certain relational patterns?
- Does it erase ambiguity?

Scaling the analysis to larger corpora magnifies these risks, because a smaller share of the outputs can be checked by hand.

In [9]:
print("BIAS AND HALLUCINATION REFLECTION")
print()
print("Questions for the researcher:")
print("1. Did the LLM infer motives or relationships not explicitly stated in the texts?")
print("2. Did the LLM normalize certain power dynamics as 'natural' or 'expected'?")
print("3. Did the LLM erase ambiguity by forcing clear categories?")
print("4. Are there cultural or contextual biases in the LLM's interpretations?")
print()
print("Recommendation: Always validate LLM outputs against the original texts.")
print("Maintain interpretive control and theoretical sensitivity.")

BIAS AND HALLUCINATION REFLECTION

Questions for the researcher:
1. Did the LLM infer motives or relationships not explicitly stated in the texts?
2. Did the LLM normalize certain power dynamics as 'natural' or 'expected'?
3. Did the LLM erase ambiguity by forcing clear categories?
4. Are there cultural or contextual biases in the LLM's interpretations?

Recommendation: Always validate LLM outputs against the original texts.
Maintain interpretive control and theoretical sensitivity.
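
The first question can be partly supported by a mechanical check. The sketch below flags known actors that an LLM output mentions although the source excerpt does not. It is only a coarse screen: it does not detect inferred motives, and the `output` string is a hypothetical response written for illustration, not one produced by the model in this notebook.

```python
import re

def unsupported_names(source_text, llm_output, known_names):
    """Return known names that the LLM output mentions but the source text does not."""
    def mentioned(text):
        return {
            name for name in known_names
            if re.search(rf"\b{re.escape(name)}\b", text)
        }
    return sorted(mentioned(llm_output) - mentioned(source_text))

known_names = ["Alex", "Jordan", "Sam"]
source = "I rely on Alex when decisions become difficult."

# Hypothetical LLM output, written for illustration only
output = "The speaker trusts Alex but appears wary of Jordan's influence."

print(unsupported_names(source, output, known_names))  # ['Jordan']
```

A flagged name is a cue to reread the excerpt, not proof of hallucination, since the model may be drawing on other texts supplied in the same prompt.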


## Session 3 Takeaway

LLMs can assist coding, memoing, and abductive exploration. They do not replace:
- Theoretical sensitivity
- Interpretive judgment
- Reflexive responsibility

## Part 2: Applying Mixed-Methods Analysis to the 20 Newsgroups Dataset

Now we apply the same LLM-assisted qualitative and mixed-methods approaches to a larger dataset: the 20 Newsgroups corpus.

In [10]:
# --- CONFIGURATION ---
n = 20  # Number of Nodes (Researchers)
m = 100  # Number of Edges (Interactions/posts)

# Dataset Description (Formal Comment for Seminar)
# The 20 Newsgroups dataset is a collection of approximately 18,000 posts from
# 20 Usenet newsgroups, dating from the early days of the internet. The posts
# can be represented as a social network (a directed weighted multigraph) among
# thousands of unique nodes (researchers) who post and reply across the newsgroups.
# Source: sklearn.datasets.fetch_20newsgroups

# Generate a unique filename based on m to avoid mixing samples
config_hash = hashlib.md5(f"{m}_newsgroups_s3".encode()).hexdigest()[:8]
SNAPSHOT_FILE = f"news_snapshot_m{m}_{config_hash}.csv"

# CHECK IF WE ALREADY HAVE THE COMPLETE DATA
if os.path.exists(SNAPSHOT_FILE):
    print(f"✅ LOADING PERMANENT SNAPSHOT: {SNAPSHOT_FILE}")
    interactions = pd.read_csv(SNAPSHOT_FILE)
else:
    print(f"🚀 SNAPSHOT NOT FOUND. GENERATING NEW SAMPLE...")
    
    # 1. Fetch the training split of the dataset (11,000+ posts)
    newsgroups = fetch_20newsgroups(subset='train', remove=('headers', 'footers', 'quotes'))
    full_df = pd.DataFrame({'text': newsgroups.data})
    
    # 2. Filter and Sample M posts
    df = full_df[full_df['text'].str.strip().str.len() > 20].copy()
    subset = df.sample(n=m, random_state=42).reset_index(drop=True)
    
    # 3. Assign a synthetic social structure: sources and targets are drawn
    #    at random, not taken from the original post headers
    user_pool = [f"Researcher_{i:02d}" for i in range(n)]
    sources = [random.choice(user_pool) for _ in range(m)]
    targets = [random.choice([u for u in user_pool if u != s]) for s in sources]

    interactions = pd.DataFrame({
        "source": sources,
        "target": targets,
        "text": subset['text'].str[:300].replace('\n', ' ', regex=True)
    })

    # 4. IMMEDIATELY SAVE (before LLM processing)
    interactions.to_csv(SNAPSHOT_FILE, index=False)
    print(f"💾 PERMANENTLY SAVED: {SNAPSHOT_FILE}")

print(f"\n--- READY: {len(interactions)} interactions between {n} nodes ---")
interactions.head()

🚀 SNAPSHOT NOT FOUND. GENERATING NEW SAMPLE...
💾 PERMANENTLY SAVED: news_snapshot_m100_715b7bf8.csv

--- READY: 100 interactions between 20 nodes ---


Unnamed: 0,source,target,text
0,Researcher_16,Researcher_08,In case you missed it on the news....the first...
1,Researcher_04,Researcher_19,We have no way of knowing because we cann...
2,Researcher_02,Researcher_10,The lengthy article you quote doesn't imply ...
3,Researcher_10,Researcher_17,"The recent rise of nostalgia in this group, co..."
4,Researcher_16,Researcher_10,"# ## Absolutely nothing, seeing as there is no..."
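
The post sample is fixed by `random_state=42`, but the sources and targets are drawn with the global `random` module, so deleting the snapshot and regenerating it would produce different ties. A minimal sketch of a reproducible alternative follows; it assumes a dedicated seeded generator is acceptable, and the function name `assign_random_ties` and the seed value are illustrative choices.

```python
import random

def assign_random_ties(n_nodes, n_edges, seed=42):
    """Draw (source, target) pairs without self-loops from a seeded local generator."""
    rng = random.Random(seed)  # local generator; the global random state is untouched
    user_pool = [f"Researcher_{i:02d}" for i in range(n_nodes)]
    ties = []
    for _ in range(n_edges):
        source = rng.choice(user_pool)
        target = rng.choice([u for u in user_pool if u != source])
        ties.append((source, target))
    return ties

ties_a = assign_random_ties(20, 100)
ties_b = assign_random_ties(20, 100)
print(ties_a == ties_b)  # True: the same seed reproduces the same ties
```

The ties remain synthetic either way, so any structural finding describes the random assignment rather than actual Usenet reply behavior.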


## LLM-Assisted Coding on Newsgroups Data

Apply the same provisional coding approach to a sample of newsgroup posts to identify relational codes.

In [11]:
print("Processing Qualitative Coding on Newsgroups Sample...")
print()

# Sample a subset for detailed coding analysis
sample_size = min(5, len(interactions))
sample_interactions = interactions.sample(n=sample_size, random_state=42)

for idx, row in sample_interactions.iterrows():
    text = row['text']
    codes = get_label("openai", text, query_openai_coding, "coding_newsgroups")
    print(f"Post {idx}: {text[:80]}...")
    print(f"Codes: {codes}")
    print("-" * 60)

print("Coding sample complete.")

Processing Qualitative Coding on Newsgroups Sample...

Post 83:     Ok boys & girls, hang on; here we go!      Christ's Eternal Gospel          ...
Codes: - Group dynamics
- Authority figures
- Collaboration
- Intellectual exchange
- Scholarly relationships
- Shared beliefs
- Community identity
- Educational interaction
------------------------------------------------------------
Post 53:     J.N. Darby was one of the founders of the "Plymouth Brethren" and an early s...
Codes: - Foundational Influence  
- Supportive Relationships  
- Translation Collaboration  
- Approval and Endorsement  
- Multilingual Engagement  
- Historical Context  
- Scholarly Contribution
------------------------------------------------------------
Post 70:  I missed the presentations given in the morning session (when Shea gave his "ra...
Codes: - Absence
- Critique
- Presence
- Small group dynamics
- Communication tools
- Speaker-audience interaction
- Panel discussion
--------------------------------------

## Relational Interpretation on Newsgroups Data

Interpret the relational dynamics and power structures in newsgroup interactions.

In [12]:
print("Processing Relational Interpretation on Newsgroups Sample...")
print()

for idx, row in sample_interactions.iterrows():
    text = row['text']
    interpretation = get_label("openai", text, query_openai_interpretation, "interpretation_newsgroups")
    print(f"Post {idx}: {text[:80]}...")
    print(f"Interpretation: {interpretation}")
    print("-" * 60)

print("Interpretation sample complete.")

Processing Relational Interpretation on Newsgroups Sample...

Post 83:     Ok boys & girls, hang on; here we go!      Christ's Eternal Gospel          ...
Interpretation: Based on the provided excerpt, here's an interpretation of the relational dynamics:

1. **Type of Relationship**: 
   - The relationship appears to be **collegial**, as indicated by the informal greeting ("Ok boys & girls") and the collaborative nature of discussing academic topics (e.g., the Dead Sea Scrolls, biblical texts). This suggests a group dynamic where individuals are likely peers or colleagues engaging in a shared intellectual pursuit.

2. **Emotional or Power Dynamic**:
   - The emotional tone is **enthusiastic and informal**, which may suggest a friendly and engaging atmosphere among the participants. However, the use of multiple names associated with scholarly works hints at varying levels of expertise and authority, which could imply a subtle **power dynamic** where certain individuals (possibly the a

## Analytic Memo on Newsgroups Discourse

Generate a reflective memo on the themes and patterns emerging from the newsgroups sample.

In [13]:
def query_openai_newsgroups_memo():
    sample_texts = "\n".join([f"- {row['text'][:100]}..." for _, row in sample_interactions.iterrows()])
    prompt = f"""Write a short analytic memo reflecting on the themes, tensions, and relational dynamics 
in these newsgroup posts. What patterns of discourse emerge?

Posts:
{sample_texts}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

print("Generating Analytic Memo on Newsgroups Discourse...")
print()

memo_ng_key = "openai_memo_newsgroups_sample"
if memo_ng_key in cache:
    memo_ng = cache[memo_ng_key]
else:
    memo_ng = query_openai_newsgroups_memo()
    cache[memo_ng_key] = memo_ng
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

print("ANALYTIC MEMO ON NEWSGROUPS DISCOURSE:")
print(memo_ng)
print()
print("Note: Researchers must contextualize and theorize these insights.")

Generating Analytic Memo on Newsgroups Discourse...

ANALYTIC MEMO ON NEWSGROUPS DISCOURSE:
**Analytic Memo: Reflections on Themes, Tensions, and Relational Dynamics in Newsgroup Posts**

**Introduction:**
The examined newsgroup posts reveal a complex tapestry of themes, tensions, and relational dynamics among participants. The discourse is characterized by an interplay of religious discussions, personal impressions, and technical exchanges, suggesting a community grappling with both ideological and practical matters.

**Themes:**
1. **Religious Discourse and Authority:**
   The reference to J.N. Darby and the "Plymouth Brethren" indicates a strong undercurrent of religious debate, suggesting that discussions around doctrinal interpretations and historical theological figures are central to this community. The mention of "Christ's Eternal Gospel" implies a commitment to exploring or defending particular beliefs, which can foster both unity and division among participants.

2. **Fru

## Abductive Analysis on Newsgroups Data

Identify surprising patterns and anomalies in the newsgroups interactions.

In [14]:
def query_openai_newsgroups_abduction():
    sample_texts = "\n".join([f"- {row['text'][:100]}..." for _, row in sample_interactions.iterrows()])
    prompt = f"""Identify any surprising, contradictory, or unexpected patterns in these newsgroup posts.
What tensions or anomalies exist in the discourse?

Posts:
{sample_texts}"""
    res = client_oa.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7
    )
    return res.choices[0].message.content.strip()

print("Identifying Surprising Patterns in Newsgroups...")
print()

abduction_ng_key = "openai_abduction_newsgroups_sample"
if abduction_ng_key in cache:
    abduction_ng = cache[abduction_ng_key]
else:
    abduction_ng = query_openai_newsgroups_abduction()
    cache[abduction_ng_key] = abduction_ng
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

print("SURPRISING PATTERNS IN NEWSGROUPS:")
print(abduction_ng)
print()
print("Note: These anomalies prompt further investigation and theory refinement.")

Identifying Surprising Patterns in Newsgroups...

SURPRISING PATTERNS IN NEWSGROUPS:
Analyzing the newsgroup posts, several surprising, contradictory, or unexpected patterns emerge, revealing underlying tensions and anomalies in the discourse.

1. **Diverse Topics and Perspectives**: The posts cover a wide range of subjects, from religious discussions about Christ and J.N. Darby to technical issues related to computer mice. This diversity suggests a lack of focus within the group, leading to confusion about the central theme of the discussion. The sudden transition from theological discourse to technical troubleshooting indicates that participants may have varied interests, which could create tension as contributors struggle to follow or engage with topics outside their expertise.

2. **Tone and Engagement**: The first post has an excited tone, indicated by "Ok boys & girls, hang on; here we go!" This contrasts sharply with the more critical and somewhat dismissive tone in the third po

## Network Construction from Newsgroups Data

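Build a directed graph from the newsgroup interactions to visualize the social network structure. Because `nx.DiGraph` keeps at most one edge per ordered pair of nodes, the 100 interactions collapse into 88 distinct edges.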

In [15]:
# Build the Graph from all interactions
G = nx.from_pandas_edgelist(interactions, 'source', 'target', 
                            create_using=nx.DiGraph())

print(f"Graph Statistics:")
print(f"Nodes: {G.number_of_nodes()}")
print(f"Edges: {G.number_of_edges()}")
print(f"Density: {nx.density(G):.3f}")
print()

# Print first 10 edges
print("Sample edges:")
for e in list(G.edges(data=True))[:10]:
    print(e)

Graph Statistics:
Nodes: 20
Edges: 88
Density: 0.232

Sample edges:
('Researcher_16', 'Researcher_08', {})
('Researcher_16', 'Researcher_10', {})
('Researcher_16', 'Researcher_02', {})
('Researcher_16', 'Researcher_01', {})
('Researcher_16', 'Researcher_09', {})
('Researcher_16', 'Researcher_04', {})
('Researcher_16', 'Researcher_07', {})
('Researcher_16', 'Researcher_06', {})
('Researcher_08', 'Researcher_00', {})
('Researcher_08', 'Researcher_06', {})
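
The sample edges above carry empty attribute dictionaries, so the graph does not record how often one node addressed another. The sketch below shows one way to retain that information as an edge weight by counting repeated pairs before building the graph. The three-row `toy_interactions` frame is hypothetical; on the notebook's data the same steps would presumably apply to the `interactions` frame.

```python
import networkx as nx
import pandas as pd

# Hypothetical mini edge list with one repeated (source, target) pair
toy_interactions = pd.DataFrame({
    "source": ["Researcher_01", "Researcher_01", "Researcher_02"],
    "target": ["Researcher_02", "Researcher_02", "Researcher_03"],
})

# Count repeated pairs and keep the count as an edge weight
weighted = (
    toy_interactions.groupby(["source", "target"])
    .size()
    .reset_index(name="weight")
)

G_weighted = nx.from_pandas_edgelist(
    weighted, "source", "target", edge_attr="weight", create_using=nx.DiGraph()
)

print(G_weighted.number_of_edges())
print(G_weighted["Researcher_01"]["Researcher_02"]["weight"])
```

If each interaction should stay a separate edge with its own text, `nx.MultiDiGraph` is an alternative to aggregating counts.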


## Network Visualization

Visualize the newsgroups social network using pyvis.

In [19]:
# Initialize Network
net = Network(height="500px", width="100%", directed=True, bgcolor="#ffffff")

# Add Nodes (labels only, no visible circles)
for node in G.nodes():
    net.add_node(
        node, 
        label=node, 
        shape='dot',
        size=1,
        color='#ffffff',
        borderWidth=0,
        font={'size': 12, 'color': 'black', 'align': 'center'}
    )
    
# Add Edges
for source, target in G.edges():
    net.add_edge(
        source, 
        target, 
        color='#848484',
        arrows={'to': {'enabled': True, 'scaleFactor': 0.5}},
        smooth={'type': 'curvedCW', 'roundness': 0.2},
        font={'align': 'top', 'size': 12, 'color': 'blue'}
    )

# Physics and Rendering
net.set_options("""
var options = {
  "physics": {
    "barnesHut": { "gravitationalConstant": -3000, "springLength": 150 }
  }
}
""")

html_content = net.generate_html()
with open("newsgroups_graph_s3.html", "w", encoding="utf-8") as f:
    f.write(html_content)

IPython.display.IFrame(src="newsgroups_graph_s3.html", width='100%', height='550px')