## Let's install all relevant libraries to construct our GCN!

Installation takes about 10 minutes, so please be patient!

In [None]:
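# TORCH and CUDA must be set as environment variables beforehand
# (the installed PyTorch version and its CUDA tag).
# -y skips the confirmation prompt, which would otherwise block the cell.
!pip uninstall -y torch-scatter torch-sparse torch-geometric torch-cluster
!pip install torch-scatter -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
!pip install torch-sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
!pip install torch-geometric
#!pip install -U sentence-transformers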
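
The `${TORCH}` and `${CUDA}` placeholders in the install commands are not defined anywhere in this notebook. The sketch below shows one way the wheel index URL could be assembled from a version string shaped like `torch.__version__` (for example `1.12.1+cu113`). The helper name `pyg_wheel_index`, the example version and the CPU fallback are assumptions made for illustration, and an index page may not exist for every version.

```python
def pyg_wheel_index(torch_version: str) -> str:
    """Build the wheel index URL used by the pip commands above.

    `torch_version` is expected to look like torch.__version__, e.g.
    "1.12.1+cu113". If there is no "+..." suffix we assume a CPU build;
    that fallback is an assumption of this sketch.
    """
    base, _, build = torch_version.partition("+")
    build = build or "cpu"
    return f"https://data.pyg.org/whl/torch-{base}+{build}.html"


# Example with a hard-coded, hypothetical version string.
print(pyg_wheel_index("1.12.1+cu113"))
```

In the notebook itself, the two parts of `torch.__version__.split("+")` could be written to `os.environ["TORCH"]` and `os.environ["CUDA"]` before running the install cell.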

Now for our dataset, UPFD, which consists of 314 Politifact (political news) and 5,464 Gossipcop (celebrity news) graphs. Read more at https://pytorch-geometric.readthedocs.io/en/latest/modules/datasets.html#torch_geometric.datasets.UPFD

In [1]:
from torch_geometric.datasets import UPFD

name can be 'politifact' or 'gossipcop', depending on which graphs you would like to load. feature selects the embedding type, that is, which features and encoders were used to create the node vectors. The training cells below load 'bert'; 'content' is an aggregation of user profile attributes and user activity on Twitter. The available options are:

- bert: the 768-dimensional node feature composed of Twitter user historical tweets encoded by the bert-as-service
- content: the 310-dimensional node feature composed of a 300-dimensional “spacy” vector plus a 10-dimensional “profile” vector
- profile: the 10-dimensional node feature composed of ten Twitter user profile attributes.
- spacy: the 300-dimensional node feature composed of Twitter user historical tweets encoded by the spaCy word2vec encoder.

Statistics:

Politifact:
- Graphs: 314
- Nodes: 41,054
- Edges: 40,740
- Classes:
    - Fake: 157
    - Real: 157
- Node feature size:
    - bert: 768
    - content: 310
    - profile: 10
    - spacy: 300
    
Gossipcop:
- Graphs: 5,464
- Nodes: 314,262
- Edges: 308,798
- Classes:
    - Fake: 2,732
    - Real: 2,732
- Node feature size:
    - bert: 768
    - content: 310
    - profile: 10
    - spacy: 300
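
One thing the statistics above allow us to check: in both datasets the number of edges equals the number of nodes minus the number of graphs, which is what we would expect if every propagation graph were a tree (a tree with n nodes has n - 1 edges). The snippet below only verifies this arithmetic on the listed numbers; the helper name `edges_if_all_trees` is made up for the example.

```python
# Dataset statistics as listed above: (graphs, nodes, edges).
stats = {
    "politifact": (314, 41_054, 40_740),
    "gossipcop": (5_464, 314_262, 308_798),
}


def edges_if_all_trees(graphs: int, nodes: int) -> int:
    """A forest of `graphs` trees covering `nodes` nodes has nodes - graphs edges."""
    return nodes - graphs


for name, (graphs, nodes, edges) in stats.items():
    # True means the listed edge count matches the tree assumption.
    print(name, edges_if_all_trees(graphs, nodes) == edges)
```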

In [2]:
test_data_gos = UPFD(root=".", name="gossipcop", feature="bert",split="test")
train_data_gos = UPFD(root=".", name="gossipcop", feature="bert", split="train")
val_data_gos = UPFD(root=".", name="gossipcop", feature="bert", split="val")

test_data_pol = UPFD(root=".", name="politifact", feature="bert",split="test")
train_data_pol = UPFD(root=".", name="politifact", feature="bert", split="train")
val_data_pol = UPFD(root=".", name="politifact", feature="bert", split="val")
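# Merge the validation graphs into the training set,
# so "Train Samples" below counts both splits for Politifact.
train_data_pol = train_data_pol + val_data_pol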

print("Gossipcop Dataset")
print("Train Samples: ", len(train_data_gos))
print("Validation Samples: ", len(val_data_gos))
print("Test Samples: ", len(test_data_gos))

print("Politifact Dataset")
print("Train Samples: ", len(train_data_pol))
print("Validation Samples: ", len(val_data_pol))
print("Test Samples: ", len(test_data_pol))

Gossipcop Dataset
Train Samples:  1092
Validation Samples:  546
Test Samples:  3826
Politifact Dataset
Train Samples:  93
Validation Samples:  31
Test Samples:  221
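
The printed sizes are consistent with a 20% / 10% / 70% train / validation / test split in which the train and validation counts are rounded down. That split is inferred here from the numbers, not read from the library, and `split_sizes` is a hypothetical helper. For Politifact, the printed 93 training samples are the 62 training graphs plus the 31 validation graphs that were merged in.

```python
def split_sizes(total: int, train_pct: int = 20, val_pct: int = 10):
    """Return (train, val, test) sizes; train and val are rounded down."""
    train = total * train_pct // 100
    val = total * val_pct // 100
    return train, val, total - train - val


print(split_sizes(5464))  # Gossipcop
print(split_sizes(314))   # Politifact, before merging val into train
```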


In [3]:
# combining all available data
def combineAllUPFDData(feature):
  # Note: the train_* variables load the official "test" split and the test_*
  # variables load the official "train" split, so the larger split ends up
  # in the train_* variables.
  test_data_gos = UPFD(root=".", name="gossipcop", feature=feature, split="train")
  train_data_gos = UPFD(root=".", name="gossipcop", feature=feature, split="test")
  val_data_gos = UPFD(root=".", name="gossipcop", feature=feature, split="val")

  test_data_pol = UPFD(root=".", name="politifact", feature=feature, split="train")
  train_data_pol = UPFD(root=".", name="politifact", feature=feature, split="test")
  val_data_pol = UPFD(root=".", name="politifact", feature=feature, split="val")
  print("Gossipcop Dataset")
  print("Train Samples: ", len(train_data_gos))
  print("Validation Samples: ", len(val_data_gos))
  print("Test Samples: ", len(test_data_gos))

  print("Politifact Dataset")
  print("Train Samples: ", len(train_data_pol))
  print("Validation Samples: ", len(val_data_pol))
  print("Test Samples: ", len(test_data_pol))

  # Return the datasets so the caller can actually use them.
  return (train_data_gos, val_data_gos, test_data_gos,
          train_data_pol, val_data_pol, test_data_pol)



train_data_pol is an indexable object, with each index referring to a different graph. Each graph has the attribute x, which holds the node embeddings (node vectors), and the attribute edge_index, which specifies the directed edges of the graph as a tensor with two rows: the first row lists the source node indices and the second row lists the destination node indices.

In [5]:
train_data_pol[0].edge_index

tensor([[ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
          0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  7,
          7, 12, 14, 16, 30, 32, 33, 35],
        [ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
         19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
         37, 38, 39, 40, 41, 42, 43, 44]])
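
To make the two-row layout concrete, here is a small sketch that turns an edge_index into a "source node to list of destinations" mapping. It uses a toy edge list, not the graph printed above, and the helper `to_adjacency` is hypothetical.

```python
from collections import defaultdict

# Toy edge_index in the same layout as the tensor above:
# row 0 holds source nodes, row 1 holds destination nodes.
edge_index = [
    [0, 0, 0, 2, 2],  # sources
    [1, 2, 3, 4, 5],  # destinations
]


def to_adjacency(edge_index):
    """Group destination nodes by their source node."""
    children = defaultdict(list)
    for src, dst in zip(edge_index[0], edge_index[1]):
        children[src].append(dst)
    return dict(children)


print(to_adjacency(edge_index))
# {0: [1, 2, 3], 2: [4, 5]}
```

For a real graph, calling `.tolist()` on the edge_index tensor gives a nested list in this same layout.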

In [6]:
train_data_pol[0].x  # node feature matrix: one row per node, one column per feature dimension

tensor([[0.5780, 0.5658, 0.3858,  ..., 0.5833, 0.1750, 0.3777],
        [0.6087, 0.6196, 0.2900,  ..., 0.4122, 0.1538, 0.2703],
        [0.5888, 0.5631, 0.2753,  ..., 0.2230, 0.1538, 0.0000],
        ...,
        [0.5958, 0.5722, 0.4397,  ..., 0.7095, 0.2308, 0.1081],
        [0.5987, 0.5378, 0.4844,  ..., 0.6419, 0.0769, 0.0811],
        [0.5906, 0.5376, 0.3346,  ..., 0.9324, 0.1538, 0.3784]])

In [3]:
from torch_geometric.loader import DataLoader
train_loader = DataLoader(train_data_pol, batch_size=256, shuffle=True)
test_loader = DataLoader(test_data_pol, batch_size=256, shuffle=False)

In [4]:
test_data_pol.num_features

768

Let's build our model! Recall the architecture given on page 6 of the paper https://arxiv.org/pdf/1902.06673.pdf.

## Model Definition

In [6]:
import torch
import torch.nn.functional as F
from torch.nn import LeakyReLU, Softmax, Linear, SELU, Dropout
from torch_geometric.nn import SAGEConv, global_max_pool, GATv2Conv, TopKPooling, global_mean_pool
from torch_geometric.transforms import ToUndirected

In [13]:
class Net(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super(Net, self).__init__()
        self.conv1 = SAGEConv(in_channels, hidden_channels[0])
        self.conv2 = SAGEConv(hidden_channels[0], hidden_channels[1])
        self.conv3 = SAGEConv(hidden_channels[1], hidden_channels[2])
        
        self.full1 = Linear(hidden_channels[2],hidden_channels[3])
        self.full2 = Linear(hidden_channels[3],hidden_channels[4])
        self.full3 = Linear(hidden_channels[4],hidden_channels[5])

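        # Output layer: despite the attribute name, this is a plain Linear layer;
        # the sigmoid is applied at the end of forward().
        self.softmax = Linear(hidden_channels[5],out_channels)

        # dropouts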
        self.dp1 = Dropout(0.2)
        self.dp2 = Dropout(0.2)
        self.dp3 = Dropout(0.2)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index).relu()
        h = self.conv3(h, edge_index).relu()

        h = global_max_pool(h,batch)

        h = self.full1(h).relu()
        h = self.dp1(h)
        h = self.full2(h).relu()
        h = self.dp2(h)
        h = self.full3(h).relu()
        h = self.dp3(h)
        
        h = self.softmax(h)

        return torch.sigmoid(h)
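
The pooling step in forward() collapses all node rows of a graph into a single row, so that every graph in the batch yields one vector for the fully connected layers. The plain-Python sketch below illustrates the idea of a per-graph, feature-wise maximum driven by a batch vector. It is not the library's implementation, and `global_max_pool_lists` is a hypothetical name.

```python
def global_max_pool_lists(h, batch):
    """Per-graph, feature-wise maximum over node rows.

    h     : list of node feature rows (one list of floats per node)
    batch : list mapping each node to the index of its graph
    """
    num_graphs = max(batch) + 1
    pooled = [None] * num_graphs
    for row, g in zip(h, batch):
        if pooled[g] is None:
            pooled[g] = list(row)
        else:
            pooled[g] = [max(a, b) for a, b in zip(pooled[g], row)]
    return pooled


# Five nodes with two features each; the first three belong to graph 0.
h = [[1.0, 5.0], [3.0, 2.0], [0.0, 9.0], [4.0, 4.0], [7.0, 1.0]]
batch = [0, 0, 0, 1, 1]
print(global_max_pool_lists(h, batch))
# [[3.0, 9.0], [7.0, 4.0]]
```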

In [8]:
# torch.autograd.Variable is deprecated and not used below, so it is not imported.
from sklearn.metrics import accuracy_score, f1_score

In [14]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net(test_data_pol.num_features,[512,512,512,256,256,256],1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
lossff = torch.nn.BCELoss()
print(device)

cpu
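
Binary cross-entropy is the loss used above. As a worked example of what it measures: a model that outputs 0.5 for every graph gets a loss of ln 2, about 0.6931, whatever the labels are, which is roughly where the training log below starts. The function `bce` is a small stand-in written for this illustration, not the PyTorch implementation.

```python
import math


def bce(probs, labels):
    """Mean binary cross-entropy for probabilities strictly between 0 and 1."""
    terms = [
        -(y * math.log(p) + (1 - y) * math.log(1 - p))
        for p, y in zip(probs, labels)
    ]
    return sum(terms) / len(terms)


print(round(bce([0.5, 0.5], [1.0, 0.0]), 4))  # 0.6931, an undecided model
print(round(bce([0.9, 0.1], [1.0, 0.0]), 4))  # 0.1054, confident and correct
```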


In [17]:
print("Number of Parameters of the model :",sum([param.nelement() for param in model.parameters()]))
print(model)


Number of Parameters of the model : 2099713
Net(
  (conv1): SAGEConv(768, 512, aggr=mean)
  (conv2): SAGEConv(512, 512, aggr=mean)
  (conv3): SAGEConv(512, 512, aggr=mean)
  (full1): Linear(in_features=512, out_features=256, bias=True)
  (full2): Linear(in_features=256, out_features=256, bias=True)
  (full3): Linear(in_features=256, out_features=256, bias=True)
  (softmax): Linear(in_features=256, out_features=1, bias=True)
  (dp1): Dropout(p=0.2, inplace=False)
  (dp2): Dropout(p=0.2, inplace=False)
  (dp3): Dropout(p=0.2, inplace=False)
)
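
The parameter count can be reproduced by hand from the layer sizes printed above. The sketch assumes that each SAGEConv layer holds one weight matrix with a bias for the aggregated neighbors and one weight matrix without a bias for the node itself. Under that assumption the total matches the printed number.

```python
def sage_conv_params(in_ch, out_ch):
    # Assumed layout: neighbor weights with bias + root weights without bias.
    return (in_ch * out_ch + out_ch) + in_ch * out_ch


def linear_params(in_f, out_f):
    # Weight matrix plus bias vector.
    return in_f * out_f + out_f


total = (
    sage_conv_params(768, 512)
    + sage_conv_params(512, 512)
    + sage_conv_params(512, 512)
    + linear_params(512, 256)
    + linear_params(256, 256)
    + linear_params(256, 256)
    + linear_params(256, 1)
)
print(total)  # 2099713
```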


## Training

In [15]:
def train(epoch):
    model.train()
    total_loss = 0
    for data in train_loader:
        data = data.to(device)
        optimizer.zero_grad()
        out = model(data.x, data.edge_index, data.batch)
        # print(out)

        loss = lossff(torch.reshape(out,(-1,)), data.y.float())
        # print(loss)
        loss.backward()
        optimizer.step()
        total_loss += float(loss) * data.num_graphs
    return total_loss / len(train_loader.dataset)

@torch.no_grad()
def test(epoch):
    model.eval()
    total_loss = 0
    all_preds = []
    all_labels = []
    for data in test_loader:
        data = data.to(device)
        out = model(data.x, data.edge_index, data.batch)
        # print(out)
        loss = lossff(torch.reshape(out,(-1,)), data.y.float())
        # print(loss)
        total_loss += float(loss) * data.num_graphs
        all_preds.append(torch.reshape(out, (-1,)))
        all_labels.append(data.y.float())
    # print(all_preds)
    accuracy, f1 = metrics(all_preds, all_labels)
    return total_loss / len(test_loader.dataset), accuracy, f1


def metrics(preds, gts):
    # Round the sigmoid outputs to get hard 0/1 predictions.
    preds = torch.round(torch.cat(preds)).cpu().numpy()
    gts = torch.cat(gts).cpu().numpy()

    # scikit-learn expects (y_true, y_pred) in that order.
    acc = accuracy_score(gts, preds)
    f1 = f1_score(gts, preds)
    return acc, f1

In [16]:
wloss = []
weighted_loss = 0
exp_param = 0.8

# training results without dropout
for epoch in range(500):
  train_loss = train(epoch)
  test_loss, test_acc, test_f1 = test(epoch)
  # weighted_loss = exp_param*(weighted_loss) + (1-exp_param)*(test_loss/ len(test_loader.dataset))

  # wloss.append(weighted_loss/(1-exp_param**(epoch+1)))

  # if(epoch-20>=0 and wloss[epoch-6]-weighted_loss<0.01):
  #     print("Stopped Early at Epoch {} ".format(epoch))
  #     break

  print(f'Epoch: {epoch:02d} |  TrainLoss: {train_loss:.5f} | '
          f'TestLoss: {test_loss:.5f} | TestAcc: {test_acc:.5f} | TestF1: {test_f1:.2f}')
  # print(f'Epoch: {epoch:02d} |  TrainLoss: {train_loss:.7f} |')

Epoch: 00 |  TrainLoss: 0.69356 | TestLoss: 0.69266 | TestAcc: 0.51131 | TestF1: 0.68
Epoch: 01 |  TrainLoss: 0.69438 | TestLoss: 0.69253 | TestAcc: 0.51131 | TestF1: 0.68
Epoch: 02 |  TrainLoss: 0.69463 | TestLoss: 0.69240 | TestAcc: 0.51131 | TestF1: 0.68
Epoch: 03 |  TrainLoss: 0.69284 | TestLoss: 0.69234 | TestAcc: 0.51131 | TestF1: 0.68
Epoch: 04 |  TrainLoss: 0.69205 | TestLoss: 0.69224 | TestAcc: 0.70588 | TestF1: 0.76
Epoch: 05 |  TrainLoss: 0.69238 | TestLoss: 0.69213 | TestAcc: 0.61991 | TestF1: 0.42
Epoch: 06 |  TrainLoss: 0.69101 | TestLoss: 0.69203 | TestAcc: 0.50679 | TestF1: 0.07
Epoch: 07 |  TrainLoss: 0.69066 | TestLoss: 0.69193 | TestAcc: 0.48869 | TestF1: 0.00
Epoch: 08 |  TrainLoss: 0.69055 | TestLoss: 0.69186 | TestAcc: 0.48869 | TestF1: 0.00
Epoch: 09 |  TrainLoss: 0.68927 | TestLoss: 0.69167 | TestAcc: 0.48869 | TestF1: 0.00
Epoch: 10 |  TrainLoss: 0.68942 | TestLoss: 0.69131 | TestAcc: 0.48869 | TestF1: 0.00
Epoch: 11 |  TrainLoss: 0.68921 | TestLoss: 0.69085 | 
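
The constant values in the first epochs of the log above (TestAcc 0.51131 with TestF1 0.68, then TestAcc 0.48869 with TestF1 0.00) are what a model that predicts the same class for every graph would produce. The sketch below reproduces them in plain Python, assuming that 113 of the 221 Politifact test graphs carry label 1. That count is inferred from the logged accuracy, not read from the dataset, and `accuracy_and_f1` is a hypothetical helper.

```python
def accuracy_and_f1(y_true, y_pred):
    """Accuracy and binary F1 (positive class = 1) from 0/1 label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    acc = correct / len(y_true)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return acc, f1


# Assumed class balance of the test set (inferred, see the text above).
y_true = [1] * 113 + [0] * 108

for name, y_pred in [("all ones", [1] * 221), ("all zeros", [0] * 221)]:
    acc, f1 = accuracy_and_f1(y_true, y_pred)
    print(f"{name}: acc={acc:.5f} f1={f1:.2f}")
# all ones: acc=0.51131 f1=0.68
# all zeros: acc=0.48869 f1=0.00
```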

### Best Metrics found after training 
**Epoch: 49 |  TrainLoss: 0.19083 | TestLoss: 0.39260 | TestAcc: 0.85068 | TestF1: 0.85**

*Note: The best epoch is chosen by lowest test loss.*
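
The commented-out lines in the training loop hint at early stopping on an exponentially smoothed test loss. The sketch below is a cleaned-up variation on that idea, not the exact logic of those lines: it smooths the losses with a bias-corrected moving average and stops once the smoothed loss has improved by less than a threshold over the last few epochs. The function names and the defaults (taken from the constants 0.8, 20, 6 and 0.01 in the commented code) are assumptions.

```python
def smoothed_losses(losses, beta=0.8):
    """Bias-corrected exponential moving average of a loss sequence."""
    ema = 0.0
    out = []
    for step, loss in enumerate(losses, start=1):
        ema = beta * ema + (1 - beta) * loss
        # Dividing by (1 - beta**step) removes the bias toward the zero start.
        out.append(ema / (1 - beta ** step))
    return out


def should_stop(smoothed, warmup=20, lookback=6, min_delta=0.01):
    """True if the smoothed loss improved by less than min_delta over `lookback` epochs."""
    epoch = len(smoothed) - 1
    if epoch < warmup or epoch < lookback:
        return False
    return smoothed[epoch - lookback] - smoothed[epoch] < min_delta


flat = smoothed_losses([0.5] * 30)                             # no improvement
falling = smoothed_losses([1.0 - 0.03 * i for i in range(30)])  # steady improvement
print(should_stop(flat), should_stop(falling))
```

Inside the training loop, one would append each epoch's test loss to a list, recompute the smoothed values, and break when `should_stop` returns True.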


In [None]:
wloss = []
weighted_loss = 0
exp_param = 0.8

for epoch in range(400):
  train_loss = train(epoch)
  test_loss, test_acc, test_f1 = test(epoch)
  # weighted_loss = exp_param*(weighted_loss) + (1-exp_param)*(test_loss/ len(test_loader.dataset))

  # wloss.append(weighted_loss/(1-exp_param**(epoch+1)))

  # if(epoch-20>=0 and wloss[epoch-6]-weighted_loss<0.01):
  #     print("Stopped Early at Epoch {} ".format(epoch))
  #     break

  print(f'Epoch: {epoch:02d} |  TrainLoss: {train_loss:.5f} | '
          f'TestLoss: {test_loss:.5f} | TestAcc: {test_acc:.5f} | TestF1: {test_f1:.2f}')
  # print(f'Epoch: {epoch:02d} |  TrainLoss: {train_loss:.7f} |')