Python binding #25
Comments
I'm considering using PyO3, following the approach taken by Hugging Face tokenizers. |
What about a Python-facing FFI API like this? The main idea is to hide the heavyweight API behind traits and use them via objects from the Python side.
pub fn dictionary_from_config(cfg: Config) -> SudachiResult<Arc<dyn PyDictionary>> {
    todo!()
}
pub trait PyDictionary {
    // Returns an owned tokenizer as a boxed trait object.
    fn tokenizer(&self) -> Box<dyn PyTokenize>;
}
pub trait PyTokenize {
    fn tokenize(&mut self, input: &str, mode: Mode, debug: bool) -> SudachiResult<Vec<Morpheme>>;
}
Morpheme should be an owning type for Python. The internal one would not work, so the binding should handle copying from Rust types to Python ones. The Result type may also be incompatible. |
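To make the "owning Morpheme behind traits" idea concrete, here is a minimal std-only sketch. Everything in it is a hypothetical stand-in: `OwnedMorpheme`, `WhitespaceDictionary`, `WhitespaceTokenizer`, and the simplified `Mode` are not sudachi.rs types, the `debug` flag is omitted, and the error type is a plain `String` rather than `SudachiResult`. It only illustrates that results copied into owned values can outlive both the tokenizer and the input.

```rust
use std::sync::Arc;

// Hypothetical, simplified split mode; not the real sudachi.rs enum.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode { A, B, C }

/// Owning morpheme: stores its own String and byte offsets, so it borrows
/// nothing from the tokenizer or the input text.
#[derive(Debug, Clone, PartialEq)]
pub struct OwnedMorpheme {
    pub surface: String,
    pub begin: usize,
    pub end: usize,
}

pub trait PyTokenize {
    fn tokenize(&mut self, input: &str, mode: Mode) -> Result<Vec<OwnedMorpheme>, String>;
}

pub trait PyDictionary {
    fn tokenizer(&self) -> Box<dyn PyTokenize>;
}

// Toy implementation that splits on whitespace, standing in for real analysis.
struct WhitespaceDictionary;
struct WhitespaceTokenizer;

impl PyDictionary for WhitespaceDictionary {
    fn tokenizer(&self) -> Box<dyn PyTokenize> {
        Box::new(WhitespaceTokenizer)
    }
}

impl PyTokenize for WhitespaceTokenizer {
    fn tokenize(&mut self, input: &str, _mode: Mode) -> Result<Vec<OwnedMorpheme>, String> {
        let mut out = Vec::new();
        let mut start: Option<usize> = None;
        for (i, ch) in input.char_indices() {
            if ch.is_whitespace() {
                // Close the current token, copying its text into an owned String.
                if let Some(s) = start.take() {
                    out.push(OwnedMorpheme { surface: input[s..i].to_owned(), begin: s, end: i });
                }
            } else if start.is_none() {
                start = Some(i);
            }
        }
        if let Some(s) = start {
            out.push(OwnedMorpheme { surface: input[s..].to_owned(), begin: s, end: input.len() });
        }
        Ok(out)
    }
}

// Hypothetical constructor; the real one would take a Config.
pub fn dictionary_from_config() -> Result<Arc<dyn PyDictionary>, String> {
    Ok(Arc::new(WhitespaceDictionary))
}
```

The caller only ever sees `Arc<dyn PyDictionary>` and `Box<dyn PyTokenize>`, so the concrete types stay private to the binding crate.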
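PyO3 can only convert structs, so I'm considering exposing wrapper classes like the following to hide the Rust API:
pub struct PyDictionary {
    inner: Arc<JapaneseDictionary>,
}
impl PyDictionary {
    fn from_config(config: Config) -> Self {
        todo!()
    }
    fn tokenizer(&self) -> PyTokenizer {
        // Clone the Arc so the tokenizer shares the dictionary instead of moving it out of &self.
        // Assumes StatelessTokenizer can be constructed from the shared dictionary handle.
        PyTokenizer { inner: StatelessTokenizer::new(self.inner.clone()) }
    }
}
pub struct PyTokenizer {
    inner: StatelessTokenizer<Arc<JapaneseDictionary>>,
}
|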
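As a rough illustration of the wrapper-struct approach, the std-only sketch below shows how a dictionary wrapper can hand out tokenizers that share one `Arc`. The `JapaneseDictionary` and `StatelessTokenizer` here are hypothetical stand-ins with made-up fields, and `from_words`, `handle_count`, and `knows` exist only for the demonstration. In an actual PyO3 binding the wrapper structs would carry `#[pyclass]` and their impl blocks `#[pymethods]`; those attributes are left out so the sketch compiles without dependencies.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the real dictionary; just a word list here.
pub struct JapaneseDictionary {
    words: Vec<String>,
}

// Hypothetical stand-in for a tokenizer that is generic over its dictionary handle.
pub struct StatelessTokenizer<D> {
    dict: D,
}

impl<D> StatelessTokenizer<D> {
    pub fn new(dict: D) -> Self {
        Self { dict }
    }
}

// Wrapper types: concrete structs, which is what a binding layer can export.
pub struct PyDictionary {
    inner: Arc<JapaneseDictionary>,
}

pub struct PyTokenizer {
    inner: StatelessTokenizer<Arc<JapaneseDictionary>>,
}

impl PyDictionary {
    // Demo-only constructor; a real one would load from a config.
    pub fn from_words(words: &[&str]) -> Self {
        let words = words.iter().map(|w| w.to_string()).collect();
        Self { inner: Arc::new(JapaneseDictionary { words }) }
    }

    // Each tokenizer gets its own Arc handle to the same dictionary data.
    pub fn tokenizer(&self) -> PyTokenizer {
        PyTokenizer { inner: StatelessTokenizer::new(Arc::clone(&self.inner)) }
    }

    // Demo-only helper: number of live handles to the shared dictionary.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.inner)
    }
}

impl PyTokenizer {
    // Demo-only lookup through the shared dictionary.
    pub fn knows(&self, word: &str) -> bool {
        self.inner.dict.words.iter().any(|w| w == word)
    }
}
```

Cloning the `Arc` only bumps a reference count, so creating many tokenizers does not copy dictionary data, and the dictionary is freed once the last wrapper is dropped.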
I'd like us to actually write down the list of requirements for the Python binding. |
Create a SudachiPy-compatible Python binding