How to build transform-only pipelines #642
I found comments suggesting that there should be a way to use transforms without a trainer/learner (i.e., building a "processing/transform" pipeline instead of a LearningPipeline, cf. #259 (comment)). Unfortunately, I could not find out how to achieve this.
In my use case, I want to determine the similarity of documents using n-gram vectorization and cosine distance. The featurization functionality is provided by the TextFeaturizer (https://docs.microsoft.com/de-de/dotnet/api/microsoft.ml.transforms.textfeaturizer). For now I don't want to train anything; I am only interested in the output of the TextFeaturizer itself.
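To make the goal concrete, here is a minimal, library-independent sketch in Python of what I mean by n-gram vectorization plus cosine distance. It does not use ML.NET or the TextFeaturizer; the function names `char_ngrams` and `cosine_similarity`, the choice of character trigrams, and the sample documents are all assumptions made for illustration only.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lower-cased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity of two sparse count vectors stored as Counters."""
    # Counter returns 0 for missing keys, so unshared n-grams add nothing.
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sample documents, for illustration only.
docs = ["The quick brown fox", "The quick brown dog", "Something unrelated"]
vectors = [char_ngrams(d) for d in docs]

for i in range(len(docs)):
    for j in range(i + 1, len(docs)):
        distance = 1.0 - cosine_similarity(vectors[i], vectors[j])
        print(f"{docs[i]!r} vs {docs[j]!r}: cosine distance {distance:.3f}")
```

No training step is involved here: the featurization output is consumed directly, which is exactly what I would like to do with the TextFeaturizer.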
As far as I know, this is currently not possible.
However, once we build the final API (see #583), you will be able to access the output of transforms without having to train a model.
In fact, the 'trained model' will be just one form of 'transformer', and you will be able to have as many of them chained together as you want (including 0), and mix and match them with other transformers, like TextFeaturizer.
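As a rough, language-neutral illustration of that idea (this is not the planned ML.NET API from #583; `chain`, `lowercase`, and `tokenize` are hypothetical names), a chain of zero or more transformers can be modeled as plain function composition:

```python
from functools import reduce


def chain(*transformers):
    """Compose zero or more transformers; an empty chain returns its input unchanged."""
    def run(data):
        # Feed the output of each transformer into the next one, in order.
        return reduce(lambda value, transform: transform(value), transformers, data)
    return run


def lowercase(docs):
    """Hypothetical transformer: lower-case every document."""
    return [d.lower() for d in docs]


def tokenize(docs):
    """Hypothetical transformer: split every document on whitespace."""
    return [d.split() for d in docs]


featurize = chain(lowercase, tokenize)  # two transformers, no trained model
passthrough = chain()                   # zero transformers is also valid

print(featurize(["Hello World"]))
print(passthrough(["Hello World"]))
```

In this picture, a trained model would simply be one more callable passed to `chain`, which is why a pipeline without any trainer needs no special handling.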