There are several distance-based learning algorithms, including our study topic, TransD.
Pre-specified features often limit the performance of learning algorithms. Distance-based features thus provide an alternative way to learn, especially when the similarity relation is easier to obtain or analyze, as in computer vision, bioinformatics, and natural language processing.
- Transform the data into a "neat" distribution by pulling or pushing each pair of points.
- Use a simple distance-based algorithm to get the final prediction (sketched below)!
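A minimal sketch of this idea, assuming a precomputed pairwise distance matrix and numpy arrays; the shrink/stretch factor `alpha` and the function names are illustrative, not taken from the paper:

```python
import numpy as np

def transform_distances(D, y, alpha=0.1):
    # Pull same-class pairs closer, push different-class pairs apart.
    # D: (n, n) symmetric distance matrix, y: length-n label array.
    # alpha is an illustrative shrink/stretch factor, not the paper's rule.
    D_new = D.copy()
    n = len(y)
    for i in range(n):
        for j in range(i + 1, n):
            factor = (1 - alpha) if y[i] == y[j] else (1 + alpha)
            D_new[i, j] = D_new[j, i] = D[i, j] * factor
    return D_new

def predict_1nn(D_test_to_train, y_train):
    # Simple distance-based prediction: label of the nearest training point.
    return y_train[np.argmin(D_test_to_train, axis=1)]
```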
Semi-supervised: train on the unlabeled data together with the labeled data.
for each round:
    determine labels for the unlabeled data
    for each pair of data points:
        if (the pair passes the conditions):
            adjust their distance
    if (the data is neat enough):
        end
(a runnable sketch of this loop appears after the definitions below)
round: maximum of 20 rounds
conditions:
- c_i and c_j are calculated in the Bayesian k-NN.
- Draw a random number r; if r >= ξ_ij, transform to the new distance, otherwise keep the old one.
neat enough:
- Consensus between the 1-NN and 1-MI classifiers.
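A minimal, self-contained sketch of the loop above, assuming numpy arrays; the 1-NN pseudo-labelling, the constant threshold `xi` standing in for ξ_ij, and the leave-one-out consistency check standing in for the 1-NN / 1-MI consensus are simplified placeholders, not the paper's exact formulations:

```python
import numpy as np

def transd_rounds(D, y, labeled, alpha=0.1, xi=0.5, max_rounds=20, seed=0):
    # D: (n, n) distance matrix, y: labels, labeled: boolean mask.
    rng = np.random.default_rng(seed)
    y = y.copy()
    n = D.shape[0]
    lab = np.flatnonzero(labeled)
    for _ in range(max_rounds):                      # at most 20 rounds
        # determine labels for the unlabeled data (stand-in: 1-NN to labeled points)
        for i in np.flatnonzero(~labeled):
            y[i] = y[lab[np.argmin(D[i, lab])]]
        # adjust the distance of each pair that passes the condition
        for i in range(n):
            for j in range(i + 1, n):
                if rng.random() >= xi:               # "if random r >= xi_ij"
                    factor = (1 - alpha) if y[i] == y[j] else (1 + alpha)
                    D[i, j] = D[j, i] = D[i, j] * factor
        # "neat enough" stand-in: leave-one-out 1-NN is consistent on the
        # labeled points (the paper uses consensus of 1-NN and 1-MI instead)
        D_l = D[np.ix_(lab, lab)].copy()
        np.fill_diagonal(D_l, np.inf)
        if np.all(y[lab][np.argmin(D_l, axis=1)] == y[lab]):
            break
    return D, y
```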
We have K hypotheses: 1-NN, 2-NN, …, K-NN.
- Our model becomes a single linear transformation matrix T!
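Once T is learned, prediction reduces to k-NN in the transformed space. A possible sketch, where combining the K hypotheses by majority vote is an illustrative choice rather than the paper's exact selection rule:

```python
import numpy as np
from collections import Counter

def predict_with_T(T, X_train, y_train, X_test, K=5):
    # Apply the single linear transform T, then run the K hypotheses
    # (1-NN, 2-NN, ..., K-NN) and combine them by majority vote.
    Xt_train, Xt_test = X_train @ T.T, X_test @ T.T
    D = np.linalg.norm(Xt_test[:, None, :] - Xt_train[None, :, :], axis=2)
    order = np.argsort(D, axis=1)
    preds = []
    for row in order:
        votes = [Counter(y_train[row[:k]]).most_common(1)[0][0]
                 for k in range(1, K + 1)]
        preds.append(Counter(votes).most_common(1)[0][0])
    return np.array(preds)
```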
Use the feature-space extension method (sketched below). Results for the quadratic transformation:
[Result figures: linear vs. quadratic transformation]
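A minimal sketch of the degree-2 feature-space extension assumed here; the exact extension used in the experiments may differ:

```python
import numpy as np

def quadratic_extension(X):
    # Append all pairwise products x_a * x_b (a <= b) to each sample, so a
    # single linear transform T on the extended features acts as a quadratic
    # transformation of the original space.
    n, d = X.shape
    cross = [X[:, a] * X[:, b] for a in range(d) for b in range(a, d)]
    return np.hstack([X, np.stack(cross, axis=1)])

# e.g. a 2-D point (x1, x2) is extended to (x1, x2, x1*x1, x1*x2, x2*x2)
```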
Some results are significantly better, some significantly worse. We can treat the choice of transformation as a tunable parameter and tune it for each specific dataset.
- Improving accuracy:
Result: we can increase accuracy on some datasets using clustering as preprocessing; however, the time overhead isn't worth it.
- Compressing data:
Result: the unlabeled data can be compressed to 1/5 or even 1/10 of its size with the same accuracy (no significant loss), saving a lot of time when running TransD.
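One way to realize this compression is to replace the unlabeled pool with cluster centroids; the sketch below assumes k-means (scikit-learn), which is an illustrative choice rather than a method fixed by the report:

```python
import numpy as np
from sklearn.cluster import KMeans

def compress_unlabeled(X_unlabeled, ratio=5, seed=0):
    # Replace the unlabeled pool with roughly len(X)/ratio k-means centroids,
    # which then stand in for the unlabeled data when running TransD.
    k = max(1, len(X_unlabeled) // ratio)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X_unlabeled)
    return km.cluster_centers_
```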
Randomly return a class based on its weight.
The randomness decreases after every iteration (a sketch follows below).
Result: not significant; more experiments are needed!
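A minimal sketch of one way to implement weighted random class selection with decaying randomness; the power-sharpening schedule is an assumption, not taken from the experiments:

```python
import numpy as np

def sample_class(weights, iteration, decay=0.8, rng=None):
    # Draw a class index with probability proportional to its weight; raising
    # the weights to a growing power makes the draw increasingly deterministic,
    # so the randomness shrinks as the iteration count grows.
    rng = rng or np.random.default_rng()
    w = np.asarray(weights, dtype=float) ** (1.0 / decay ** iteration)
    p = w / w.sum()
    return rng.choice(len(p), p=p)
```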
- We need more experiments on big data.
- Further improve time and space complexity.
- Implement the algorithm on CUDA (run on GPU).
- Other ways to compress data: fewer data points but higher dimensionality?
Yuh-Jyh Hu, Min-Che Yu, Hsiang-An Wang, and Zih-Yun Ting, "A Similarity-Based Learning Algorithm Using Distance Transformation," IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 6, June 2015.