Description
I need to use the intfloat/multilingual-e5-small model. However, when loading vocab.txt on the ARM64 architecture, I ran into a problem with missing special tokens such as [UNK] and [SEP]. After some research, I found that intfloat/multilingual-e5-small uses XLMRobertaTokenizer, which depends on SentencePiece. I found SentencePieceTokenizer in Microsoft.ML.Tokenizers, but its usage is different from BertTokenizer's and I don't know how to use it. Could you provide a tutorial on how to use it? I used File.OpenRead to read the file as a Stream and successfully loaded a SentencePieceTokenizer, but I don't know how to use it from there.