CSI is a dataset containing the characters and detailed transcripts of the CSI TV show. This work is based on the original PyTorch implementation of graph convolutional networks (GCNs) [1], available here.
This work has two aims:
- explore GCNs for visualizing 3-dimensional embeddings of the nodes in a criminal network (do NLP features improve the visualization of the embeddings?)
- explore model performance on classification (predicting the community, i.e. the episode of CSI in which a person appeared, from TF-IDF features of the text that person spoke)
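To make the node features concrete, here is a minimal sketch of how TF-IDF vectors could be computed from the text spoken by each character. It uses only the standard library and a common smoothed IDF with L2-normalized rows; the function name `tfidf_features` and the toy transcripts are illustrative assumptions, not taken from the notebook, which may well rely on a library vectorizer instead.

```python
import math
from collections import Counter


def tfidf_features(docs):
    """Return (vocab, matrix); matrix[i][j] is the TF-IDF weight of vocab[j] in docs[i].

    Hypothetical helper for illustration only. Tokenization is a plain
    lowercase whitespace split, and each row is L2-normalized.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)

    # Document frequency: in how many documents each token occurs.
    df = Counter(tok for toks in tokenized for tok in set(toks))

    # Smoothed inverse document frequency (floats keep this safe on Python 2.7).
    idf = {tok: math.log((1.0 + n_docs) / (1.0 + df[tok])) + 1.0 for tok in vocab}

    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[tok] * idf[tok] for tok in vocab]
        norm = math.sqrt(sum(w * w for w in row)) or 1.0  # guard empty documents
        matrix.append([w / norm for w in row])
    return vocab, matrix


# Toy input: one string per character (made-up text, not from the dataset).
spoken_text = [
    "blood on the knife",
    "the knife",
    "alibi",
]
vocab, features = tfidf_features(spoken_text)
```

Each row of `features` would then serve as the feature vector of one node, alongside the graph's adjacency matrix, when training the GCN.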
python setup.py install
- PyTorch 0.4 or higher
- Python 2.7, or 3.6 or higher
In PyGCN, open the notebook:
CSI_GCN.ipynb
Using Plotly, the intermediate embeddings are displayed at successive training epochs.
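A rough sketch of how such an animation could be assembled is given below. It assumes the 3-dimensional embeddings have been saved at a few epochs as plain lists; the names `embedding_frames` and `build_figure` are hypothetical, and the `plotly.graph_objects` module assumes a recent Plotly release, which may differ from what the notebook uses.

```python
def embedding_frames(history, labels):
    """Turn per-epoch embeddings into plain dicts, one per animation frame.

    history: list of (epoch, embeddings) pairs, where embeddings holds one
             [x, y, z] point per node (assumed layout).
    labels:  one integer community label per node, used for coloring.
    """
    frames = []
    for epoch, emb in history:
        if len(emb) != len(labels):
            raise ValueError("one label is required per node")
        if any(len(point) != 3 for point in emb):
            raise ValueError("expected 3-dimensional embeddings")
        xs, ys, zs = ([point[i] for point in emb] for i in range(3))
        frames.append(
            {"name": str(epoch), "x": xs, "y": ys, "z": zs, "color": list(labels)}
        )
    return frames


def build_figure(frames):
    """Build an animated 3D scatter plot with one frame per saved epoch."""
    # Imported here so the data preparation above also runs without Plotly.
    import plotly.graph_objects as go

    def trace(frame):
        return go.Scatter3d(
            x=frame["x"], y=frame["y"], z=frame["z"],
            mode="markers",
            marker=dict(size=4, color=frame["color"]),
        )

    fig = go.Figure(
        data=[trace(frames[0])],
        frames=[go.Frame(data=[trace(f)], name=f["name"]) for f in frames],
    )
    # A single "Play" button that steps through the frames in order.
    fig.update_layout(
        updatemenus=[dict(
            type="buttons",
            buttons=[dict(label="Play", method="animate", args=[None])],
        )]
    )
    return fig


# Toy values for two nodes at two epochs (made up for illustration).
history = [
    (0, [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]]),
    (10, [[0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]),
]
frames = embedding_frames(history, labels=[0, 1])
# build_figure(frames).show()  # uncomment when Plotly is installed
```

Coloring the markers by community label makes it easier to see whether nodes from the same episode drift together as training proceeds.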
[1] Kipf & Welling, Semi-Supervised Classification with Graph Convolutional Networks, 2016
[2] Sen et al., Collective Classification in Network Data, AI Magazine, 2008