Problem
The current implementation of the tree algorithms is based on a templated class, CTreeMachineNode, which has a templated field T data that carries the tree node data for the various algorithms. The template argument is then set to a custom data structure, such as CARTreeNodeData for the CARTree algorithm.
While this is a reasonable design, Shogun template classes can unfortunately only be instantiated with basic types, such as float64_t, int32_t, .... If this (unwritten) rule is violated, something fundamental stops working:
Creating an empty object via class_list.cpp (this is described in #3481) becomes impossible. The class list only supports basic template types (those defined by EPrimitiveType), not custom data structures. Even worse, creating empty objects is needed for both cloning and serializing objects, so neither works for these classes. This arises from the current design of Shogun at a very low level.
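To make the limitation concrete, here is a minimal, self-contained sketch of a class list keyed by class name and primitive type. It is not Shogun's actual class_list.cpp: `PrimitiveType`, `Object`, `TemplatedNode`, `class_list` and `new_empty_object` are all hypothetical stand-ins invented for illustration. The point is only that a key of this shape has no slot for a user-defined struct as template argument.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are NOT Shogun's real declarations.
enum class PrimitiveType { Float64, Int32 };

struct Object {
    virtual ~Object() = default;
    virtual std::string name() const = 0;
};

template <typename T>
struct TemplatedNode : Object {
    T data{};
    std::string name() const override { return "TemplatedNode"; }
};

// The registry key is (class name, primitive type). A node templated on a
// custom struct cannot be described by such a key, so it is never listed.
using Key = std::pair<std::string, PrimitiveType>;
using Factory = std::function<std::unique_ptr<Object>()>;

inline const std::map<Key, Factory>& class_list() {
    static const std::map<Key, Factory> list = {
        {{"TemplatedNode", PrimitiveType::Float64},
         [] { return std::make_unique<TemplatedNode<double>>(); }},
        {{"TemplatedNode", PrimitiveType::Int32},
         [] { return std::make_unique<TemplatedNode<int>>(); }},
    };
    return list;
}

// Returns an empty instance, or nullptr if the (name, type) pair is unknown.
inline std::unique_ptr<Object> new_empty_object(const std::string& name,
                                                PrimitiveType type) {
    auto it = class_list().find({name, type});
    if (it == class_list().end()) {
        return nullptr;
    }
    return it->second();
}
```

In this sketch, cloning or deserializing would start from `new_empty_object(...)`; an instantiation such as `TemplatedNode<SomeCustomStruct>` simply has no entry to look up.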
Workaround
A simple way to fix the problem, which requires only a bit of refactoring (unfortunately, all tree algorithms will need to be touched):
Instead of templating the tree nodes, we can use OOP with subclasses and virtual methods to represent nodes that behave the same but carry different data. The template member field T data would be replaced by a field CTreeNodeData* data, where CTreeNodeData is a new class that inherits from CSGObject and registers its parameters as usual. All the tree node data structs would then be converted into subclasses of it (with public member fields). This makes the algorithm refactoring straightforward.
This is a big patch, but conceptually quite simple. It is actually a cool exercise for GSoC students to learn a lot about some of the internals of Shogun.
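A rough sketch of what this could look like, assuming heavily simplified stand-ins rather than Shogun's real classes: `TreeNodeDataSketch` plays the role of the proposed CTreeNodeData (without the CSGObject base or parameter registration), `CartNodeDataSketch` the role of a converted CARTreeNodeData, and `TreeNodeSketch` a non-templated tree node. The member names `attribute_id` and `node_label` are illustrative assumptions only.

```cpp
#include <memory>
#include <string>
#include <vector>

// Stand-in for the proposed CTreeNodeData base class (hypothetical).
struct TreeNodeDataSketch {
    virtual ~TreeNodeDataSketch() = default;
    virtual std::string name() const = 0;
    virtual std::unique_ptr<TreeNodeDataSketch> clone() const = 0;
};

// Stand-in for a converted per-algorithm data struct, with public fields.
struct CartNodeDataSketch : TreeNodeDataSketch {
    int attribute_id = -1;    // illustrative field
    double node_label = 0.0;  // illustrative field

    std::string name() const override { return "CartNodeDataSketch"; }
    std::unique_ptr<TreeNodeDataSketch> clone() const override {
        return std::make_unique<CartNodeDataSketch>(*this);
    }
};

// Non-templated node: the payload is held through the base-class pointer.
struct TreeNodeSketch {
    std::unique_ptr<TreeNodeDataSketch> data;
    std::vector<std::unique_ptr<TreeNodeSketch>> children;

    // Deep copy that needs no knowledge of the concrete data type.
    std::unique_ptr<TreeNodeSketch> clone() const {
        auto copy = std::make_unique<TreeNodeSketch>();
        if (data) {
            copy->data = data->clone();
        }
        for (const auto& child : children) {
            copy->children.push_back(child->clone());
        }
        return copy;
    }
};

// An algorithm recovers its own data type with a checked downcast.
inline CartNodeDataSketch* cart_data(TreeNodeSketch& node) {
    return dynamic_cast<CartNodeDataSketch*>(node.data.get());
}
```

Because the node type itself is no longer a template, it could be listed like any other non-generic class, and each algorithm only changes how it reaches its data (through a downcast such as `cart_data(node)->node_label` instead of a direct `node.data.node_label`).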