# Graph Deep Learning for NLP

This notebook demonstrates two approaches to text summarization: extractive summarization with the TextRank algorithm, and an AMR-to-text summarizer built with Graph Neural Networks.

## 1. TextRank Implementation

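First, we'll implement the TextRank algorithm for extractive text summarization. Each sentence becomes a node in a graph, edges are weighted by the cosine similarity of the sentences' TF-IDF vectors, and the PageRank scores on that graph determine which sentences to keep.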

In [1]:
# Import required libraries
import networkx as nx
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import TfidfVectorizer

In [4]:
def textrank(sentences, top_n=2):
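    """Rank sentences with TextRank and return the highest-scoring ones.
    
    Args:
        sentences (list): List of sentence strings
        top_n (int): Number of sentences to return
        
    Returns:
        list: The top_n sentences, ordered by descending score
            (not by their position in the original text)
    """
    # Represent each sentence as a TF-IDF vector and compare all pairs
    tfidf = TfidfVectorizer().fit_transform(sentences)
    similarity_matrix = cosine_similarity(tfidf)
    
    # Build a weighted graph (nodes = sentences, edge weights = similarity)
    # and score the nodes with PageRank
    graph = nx.from_numpy_array(similarity_matrix)
    scores = nx.pagerank(graph)
    
    # Sort sentences by score, highest first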
    ranked_sentences = sorted(((scores[i], s) for i, s in enumerate(sentences)), reverse=True)
    return [s for _, s in ranked_sentences[:top_n]]
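
For intuition about what the `nx.pagerank` call computes on the similarity graph, here is a minimal sketch of weighted PageRank by power iteration. It is an illustration under stated assumptions, not networkx's actual implementation: the helper name `pagerank_power_iteration`, the tolerance, and the iteration cap are hypothetical choices, and the damping factor of 0.85 is used because it matches networkx's default `alpha`.

```python
import numpy as np

def pagerank_power_iteration(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted PageRank on a dense, non-negative adjacency matrix (sketch)."""
    weights = np.asarray(weights, dtype=float)
    n = weights.shape[0]
    row_sums = weights.sum(axis=1, keepdims=True)
    # Row-normalise so each node spreads its score in proportion to edge weight;
    # rows with no outgoing weight fall back to a uniform distribution.
    safe_sums = np.where(row_sums > 0, row_sums, 1.0)
    transition = np.where(row_sums > 0, weights / safe_sums, 1.0 / n)
    scores = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # Teleport with probability (1 - damping), otherwise follow an edge.
        updated = (1.0 - damping) / n + damping * (transition.T @ scores)
        if np.abs(updated - scores).sum() < tol:
            return updated
        scores = updated
    return scores

# Toy symmetric similarity matrix (made-up values, for illustration only)
toy_similarity = np.array([
    [0.0, 0.5, 0.1],
    [0.5, 0.0, 0.4],
    [0.1, 0.4, 0.0],
])
toy_scores = pagerank_power_iteration(toy_similarity)
print(toy_scores)
```

In this toy matrix the middle sentence has the strongest links to the other two, so it receives the largest score; the scores always sum to 1.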

### Test TextRank Implementation

In [5]:
# Sample text for testing
text = """
Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language.
The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a valuable way.
NLP is used in many applications, including machine translation, speech recognition, and chatbots.
"""

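# Preprocess and summarize
# Naive sentence split on '.': this drops the trailing periods and would
# mis-split abbreviations or decimal numbers, but is enough for this sample
sentences = [s.strip() for s in text.strip().split('.')]
sentences = [s for s in sentences if s]  # Remove empty sentences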

summary = textrank(sentences)
print("Original Text:\n", text)
print("\nSummary:\n", ' '.join(summary))

Original Text:
 
Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language.
The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a valuable way.
NLP is used in many applications, including machine translation, speech recognition, and chatbots.


Summary:
 The ultimate objective of NLP is to read, decipher, understand, and make sense of human languages in a valuable way Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language
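
The summary above lists sentences by descending score and without their periods, because `textrank` sorts by score and the split on `'.'` discards the punctuation. If reading order matters, one option is to select the top-scoring indices first and then sort those indices. The helper below is a hypothetical sketch (`select_in_document_order` is not part of the notebook); it assumes `scores` can be indexed by sentence position, as the dict returned by `nx.pagerank` can.

```python
def select_in_document_order(sentences, scores, top_n=2):
    """Pick the top_n highest-scoring sentences, then restore their original order."""
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:top_n])  # back to document order
    return [sentences[i] for i in keep]

# Made-up sentences and scores, for illustration only
example_sentences = ["First sentence", "Second sentence", "Third sentence"]
example_scores = {0: 0.30, 1: 0.45, 2: 0.25}
ordered = select_in_document_order(example_sentences, example_scores)
print(". ".join(ordered) + ".")
```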


## 2. AMR-to-Text Summarizer Implementation

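Next, we'll implement an Abstract Meaning Representation (AMR)-to-text summarizer using Graph Neural Networks. The model encodes the nodes of the AMR graph with a graph convolution layer (`GCNConv`), runs the resulting node embeddings through a GRU as a sequence, and projects each step onto the vocabulary.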

In [6]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv
from torch_geometric.data import Data

In [7]:
class AMRToTextSummarizer(nn.Module):
    def __init__(self, input_dim, hidden_dim, output_dim):
        super().__init__()
        self.graph_conv = GCNConv(input_dim, hidden_dim)
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, output_dim)
    
    def forward(self, data):
        x, edge_index = data.x, data.edge_index
        
        # Graph convolution
        h = self.graph_conv(x, edge_index)
        h = F.relu(h)
        
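        # Sequence decoding: treat the N node embeddings as one length-N sequence
        h = h.unsqueeze(0)  # (N, hidden_dim) -> (1, N, hidden_dim)
        output, _ = self.gru(h)
        output = self.fc(output)  # (1, N, output_dim): one logit vector per node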
        
        return output
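
For readers new to graph convolution, the following NumPy sketch shows roughly what the `graph_conv` plus ReLU step does: each node averages its own features with those of its incoming neighbours using the symmetric normalisation D^-1/2 (A + I) D^-1/2, then applies a linear map. This is a simplified illustration, not PyTorch Geometric's implementation: `gcn_layer` is a hypothetical helper, it builds a dense adjacency matrix, and it omits the bias term.

```python
import numpy as np

def gcn_layer(x, edge_index, weight):
    """One graph-convolution step: relu(D^-1/2 (A + I) D^-1/2 X W), no bias."""
    num_nodes = x.shape[0]
    sources, targets = edge_index
    adj = np.zeros((num_nodes, num_nodes))
    adj[targets, sources] = 1.0   # row = receiving node, column = sending node
    adj += np.eye(num_nodes)      # self-loops so each node keeps its own features
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    norm_adj = deg_inv_sqrt[:, None] * adj * deg_inv_sqrt[None, :]
    return np.maximum(norm_adj @ x @ weight, 0.0)

# Toy directed cycle 0 -> 1 -> 2 -> 0 with random features (illustration only)
rng = np.random.default_rng(0)
toy_edges = np.array([[0, 1, 2], [1, 2, 0]])
toy_x = rng.normal(size=(3, 4))
toy_w = rng.normal(size=(4, 2))
toy_h = gcn_layer(toy_x, toy_edges, toy_w)
print(toy_h.shape)
```

In this toy graph every node has one incoming edge plus its self-loop, so each output row is the ReLU of the projected mean of the node's own features and its single neighbour's.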

### Test AMR-to-Text Summarizer

In [8]:
# Set model parameters
input_dim = 100  # Dimension of input node features
hidden_dim = 256
output_dim = 10000  # Vocabulary size

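# Create sample data: a toy graph of 3 nodes in a directed cycle (0 -> 1 -> 2 -> 0)
# with random node features standing in for real AMR node embeddings
edge_index = torch.tensor([[0, 1, 2], [1, 2, 0]], dtype=torch.long)
x = torch.randn(3, input_dim)
data = Data(x=x, edge_index=edge_index)

# Initialize the (untrained) model and run a forward pass as a shape check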
model = AMRToTextSummarizer(input_dim, hidden_dim, output_dim)
summary_logits = model(data)
print("Summary logits shape:", summary_logits.shape)

# Print model architecture
print("\nModel Architecture:")
print(model)

Summary logits shape: torch.Size([1, 3, 10000])

Model Architecture:
AMRToTextSummarizer(
  (graph_conv): GCNConv(100, 256)
  (gru): GRU(256, 256, batch_first=True)
  (fc): Linear(in_features=256, out_features=10000, bias=True)
)
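
The logits have one row per graph node and one column per vocabulary entry, so turning them into text means choosing a token at each position. A minimal greedy-decoding sketch is shown below in NumPy; with the PyTorch tensor above, the equivalent first step would be `summary_logits.argmax(dim=-1)`. The helper `greedy_decode` and the four-word vocabulary are hypothetical, and since the model above is untrained its real output would not be meaningful text.

```python
import numpy as np

def greedy_decode(logits, id_to_token):
    """Pick the highest-scoring vocabulary entry at each position."""
    token_ids = np.asarray(logits).argmax(axis=-1)  # shape: (batch, positions)
    return [[id_to_token[int(i)] for i in row] for row in token_ids]

# Made-up vocabulary and logits with shape (1, 3, 4), for illustration only
toy_vocab = ["<pad>", "nlp", "is", "useful"]
toy_logits = np.array([[
    [0.1, 2.0, 0.3, 0.0],
    [0.0, 0.2, 1.5, 0.1],
    [0.3, 0.1, 0.2, 0.9],
]])
decoded = greedy_decode(toy_logits, toy_vocab)
print(decoded)
```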
