-
Hi @theo-long, thanks for the question. Unfortunately, I'm familiar with neither the spektral library nor the paper you mentioned, so I can only give a few general pointers regarding BNN training that might help you debug this:
I hope these ideas help you debug the convergence issues. If this doesn't help, it might be worth trying the default Keras training loop.
-
Hi,
I'm currently trying to recreate the Binarized Graph Convolutional Network from the paper 'Bi-GCN: Binary Graph Convolutional Network' (link) by Wang et al. using Larq. The code for that paper is available here: Bi-GCN Github. It is written in PyTorch and PyTorch Geometric, and I am trying to rewrite it using TensorFlow, Larq, and spektral. However, I am struggling to achieve good performance with my current implementation: the PyTorch code produces a model with ~80% accuracy, while mine only reaches ~20%!
One of the main issues seems to be a weak gradient signal: the model gets stuck at a relatively large loss and fails to decrease it any further. I've been having a hard time diagnosing exactly what is causing this. One thing to note is that when I set `input_quantizer=None, kernel_quantizer=None` (i.e. just a normal `Dense` layer), I'm able to achieve 80+% accuracy, which is comparable to the original non-binarized GCN architecture. This suggests the issues I'm seeing are caused directly by the quantization and not by my implementation of the GCN. Please let me know if you have any ideas about what the issue might be and any tips for dealing with it. See below for my code.

I first define the graph convolutional layer and read in the Cora dataset using the spektral library:
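Conceptually, the binarized propagation I'm trying to reproduce can be sketched in plain NumPy (this is a hypothetical illustration of the scaled sign-binarization scheme described in the Bi-GCN paper, not my actual Larq code; `binarize` and `bi_gcn_layer` are made-up names):

```python
import numpy as np

def binarize(x):
    # Sign binarization with a per-row scaling factor (mean absolute value),
    # in the spirit of Bi-GCN. Zeros are mapped to +1 so sign() never emits 0.
    scale = np.mean(np.abs(x), axis=-1, keepdims=True)
    return np.sign(np.where(x == 0, 1.0, x)) * scale

def bi_gcn_layer(a_hat, x, w):
    # One binarized graph convolution: A_hat @ bin(X) @ bin(W),
    # where a_hat is the normalized adjacency matrix.
    return a_hat @ binarize(x) @ binarize(w)

# Toy example: 3-node graph with an identity "adjacency" for simplicity.
a_hat = np.eye(3)
x = np.array([[0.5, -1.0], [2.0, 0.3], [-0.7, 0.1]])
w = np.array([[1.2, -0.4], [0.6, 0.9]])
out = bi_gcn_layer(a_hat, x, w)
print(out.shape)  # (3, 2)
```

In the actual model the forward pass uses the binarized values while gradients flow through a straight-through estimator, which is what Larq's quantizers provide.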