# Phase 3: User Engagement Graph + Profile Features

This notebook walks through the graph construction and profile feature extraction pipeline described in the research paper.

## What we'll do:
1. **Load Real Dataset**: Use the provided GossipCop and PolitiFact datasets
2. **Profile Features**: Extract 10-dimensional user profile features
3. **Graph Construction**: Build hierarchical retweet networks
4. **Graph Analysis**: Analyze network properties and structure
5. **Feature Integration**: Combine text embeddings with profile features
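
As a rough illustration of steps 3 and 4, the sketch below builds a hierarchical retweet tree from a handful of made-up engagement records and computes the depth of every node. It uses only the standard library, so it does not reflect how the project's `UserEngagementGraph` class is implemented. The record layout `(user, retweeted_from)`, the helper names `build_tree` and `node_depths`, and the sample IDs are all assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical engagement records: (user, retweeted_from).
# The news item is the root; each retweet points back to the account
# (or news item) it was retweeted from.
retweets = [
    ("user_a", "news_1"),
    ("user_b", "news_1"),
    ("user_c", "user_a"),
    ("user_d", "user_c"),
]


def build_tree(records):
    """Map each parent node to the list of nodes that retweeted it."""
    children = defaultdict(list)
    for user, parent in records:
        children[parent].append(user)
    return children


def node_depths(children, root):
    """Breadth-first traversal returning the depth of every reachable node."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            depths[child] = depths[node] + 1
            queue.append(child)
    return depths


tree = build_tree(retweets)
depths = node_depths(tree, "news_1")
print(depths)
```

Depth and fan-out per level are the kind of structural properties the graph analysis step looks at. The same tree could equally be stored as a `networkx.DiGraph`, which is the library this notebook imports.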

## Paper Reference:
- Graph construction follows the hierarchical tree structure (Section 6.2)
- Profile features: 10-dimensional user profile features
- F3 feature set: spaCy text embeddings + profile features (as mentioned in the paper)
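
A minimal sketch of how an F3-style feature matrix could be assembled: each node's text embedding is concatenated with its 10-dimensional profile vector. The 300-dimensional embedding size is an assumption (it matches the vectors shipped with spaCy's larger English models, but the model used here is not stated above). The helper `combine_features` is hypothetical, and the arrays are random placeholders rather than real data.

```python
import numpy as np

TEXT_DIM = 300     # assumed spaCy vector size; not stated in this notebook
PROFILE_DIM = 10   # 10-dimensional profile features, as listed above

# Random placeholder data standing in for real embeddings and profiles
rng = np.random.default_rng(0)
n_nodes = 4
text_embeddings = rng.normal(size=(n_nodes, TEXT_DIM))
profile_features = rng.normal(size=(n_nodes, PROFILE_DIM))


def combine_features(text, profile):
    """Concatenate text and profile features row by row (one row per node)."""
    if text.shape[0] != profile.shape[0]:
        raise ValueError("text and profile arrays must have the same number of rows")
    return np.hstack([text, profile])


node_features = combine_features(text_embeddings, profile_features)
print(node_features.shape)
```

Plain concatenation keeps the two feature groups in fixed column ranges, so either group can be sliced back out later for ablation. Whether the features should be scaled before concatenation is left open here.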


In [None]:
import os
import sys

# Make the project package under ../src importable from the notebook directory
sys.path.append('../src')

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
from collections import Counter

from fake_news.graphs.graph_construction import UserEngagementGraph, create_synthetic_engagement_data
from fake_news.graphs.profile_features import UserProfileExtractor, create_synthetic_user_data, create_synthetic_tweet_data
from fake_news.features.text_preprocessing import TextPreprocessor
from fake_news.features.embeddings import EmbeddingExtractor
from fake_news.utils.logging import get_logger
from fake_news.utils.paths import PROCESSED_DIR, GRAPHS_DIR

# Project logger, used to mark the start of this phase in the logs
logger = get_logger('notebook')
logger.info('Phase 3: Graph Construction + Profile Features')
