This project implements the method described in *Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs* [Wang, Ye, and Gupta 2018] and applies it to identifying different classes of snakes!
To run this project, first install the following dependencies:
- Python 3
- numpy
- matplotlib
- torch
- torchvision
Traditional Models (closed-world):
- a new class → new data needed (usually a lot of it) and another round of fine-tuning
- fine-tuning takes a long time (assuming the extra data even exists)
Solution: zero-shot learning - infer new classes from knowledge gained during past training
- Implicit knowledge: learn a vector representation of categories from text data → learn a mapping from that vector representation to a visual classifier
- Explicit knowledge: extract relations from knowledge graphs (KGs) and use them as zero-shot classifiers
Our task: combine implicit and explicit knowledge using a KG and a GCN (following the method of [Wang, Ye, and Gupta 2018]). We apply this zero-shot approach to a small, noisy, hard-to-learn dataset of snake images.
- ImageNet 2012 1k
- Snake dataset
- SnakeKG: a custom subset of Wikidata
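At a high level, the zero-shot prediction step works by scoring an image's visual features against a predicted classifier for every class, including classes never seen during training. The sketch below illustrates just that final scoring step; the class names, dimensions, and random vectors are placeholders, not values from this project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical outputs: one predicted classifier-weight vector per class
# (as a GCN over the KG would produce) and one ResNet feature per image.
class_names = ["garter snake", "king cobra", "ball python"]
class_weights = rng.normal(size=(3, 2048))   # predicted by the GCN
image_feature = rng.normal(size=2048)        # from the ResNet backbone

# Zero-shot prediction: pick the class whose predicted classifier scores
# the image feature highest (a plain dot product).
scores = class_weights @ image_feature
predicted = class_names[int(np.argmax(scores))]
```

Because the classifier weights come from the KG side rather than from labeled images, a snake species with no training photos can still be predicted this way.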
ResNet is used for visual feature extraction and also provides a baseline method for comparison.
The KG we use for our project is constructed as a subset of the Wikidata knowledge graph. We use the ImageNet-to-Wikidata entity mapping created by Filipiak, Fensel, and Filipowska to select the nodes corresponding to ImageNet classes. For the 10 classes not in the ImageNet dataset, we look up their Wikidata identifiers so those nodes can be included in the subgraph as well.
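Once the subgraph's nodes and edges are selected, they are typically turned into a normalized adjacency matrix for the GCN. A small sketch of that step, using an invented edge list (these entity names stand in for Wikidata entities and are purely illustrative):

```python
import numpy as np

# Hypothetical "subclass of"-style edges over snake-related entities.
edges = [("king cobra", "elapid"), ("elapid", "snake"),
         ("ball python", "python"), ("python", "snake")]
nodes = sorted({n for e in edges for n in e})
idx = {n: i for i, n in enumerate(nodes)}

# Symmetric adjacency with self-loops, then the standard GCN
# normalization D^{-1/2} (A + I) D^{-1/2}.
n = len(nodes)
adj = np.eye(n)
for a, b in edges:
    adj[idx[a], idx[b]] = adj[idx[b], idx[a]] = 1.0
deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
a_hat = adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]
```

The resulting `a_hat` is symmetric and ready to be multiplied into each GCN layer.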
To integrate the information from our KG into classification, this method employs a 6-layer GCN that maps each class node's word embedding to a visual classifier for that class.
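A compact sketch of such a 6-layer GCN in PyTorch is below. The layer widths are illustrative rather than this project's exact configuration, and a plain ReLU stands in for whatever activation the trained model uses; `a_hat` is a precomputed normalized adjacency.

```python
import torch
import torch.nn as nn

class GCN(nn.Module):
    """Sketch of a 6-layer GCN: propagates word-embedding inputs over the
    normalized KG adjacency, outputting one classifier-weight vector per node."""

    def __init__(self, a_hat, dims=(300, 512, 512, 1024, 1024, 2048, 2048)):
        super().__init__()
        self.a_hat = a_hat  # (N, N) normalized adjacency, fixed
        self.layers = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(dims, dims[1:])
        )

    def forward(self, x):
        for i, layer in enumerate(self.layers):
            x = self.a_hat @ layer(x)        # graph convolution: A_hat · X · W
            if i < len(self.layers) - 1:
                x = torch.relu(x)            # no activation on the output layer
        return x

n_nodes = 5
a_hat = torch.eye(n_nodes)                   # trivial adjacency, demo only
gcn = GCN(a_hat)
word_vecs = torch.randn(n_nodes, 300)        # e.g. one word embedding per node
weights = gcn(word_vecs)                     # one 2048-dim classifier per node
```

The 2048-dimensional outputs match the ResNet feature size, so each node's output vector can act directly as a linear classifier over image features.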
While most of the referenced work is linked directly above, it is also more formally collected below: