Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load sudachidict dictionary #93

Merged

Conversation

mh-northlander
Copy link
Collaborator

@mh-northlander mh-northlander commented Oct 13, 2021

#73

WIP: default dictionary path setting
SudachiPy: dict_type > config.systemDict > sudachidict_core.
sudachi.rs: arg > config.systemDict > baked dict.

@mh-northlander
Copy link
Collaborator Author

Assume that "systemDict" key of default sudachi.json is an empty string as SudachiPy.
We will need to consider how to distribute resource files for python binding.

@mh-northlander mh-northlander marked this pull request as ready for review October 13, 2021 08:25
@eiennohito
Copy link
Collaborator

Can we do path resolution completely in Python?
Then just pass the path to the dictionary to the Rust binding?
I am not sure all this logic should be written in Rust.

@mh-northlander
Copy link
Collaborator Author

mh-northlander commented Oct 13, 2021

Can we do path resolution completely in Python? Then just pass the path to the dictionary to the Rust binding? I am not sure all this logic should be written in Rust.

We can load python file using PyModule::from_code.

In that case python side code will be like this:
https://github.com/WorksApplications/SudachiPy/blob/03d9ff8e7094859275984ee718e08c3fcefaeb6b/sudachipy/config.py#L42-L50

@mh-northlander mh-northlander changed the title Load sudachidict dictionary WIP Load sudachidict dictionary Oct 13, 2021
@@ -151,3 +153,21 @@ fn get_absolute_dict_path(py: Python, dict_type: DictionaryType) -> PyResult<Pat

Ok(dict_path)
}

fn find_dict_path_py(py: Python, dict_type: &str) -> PyResult<PathBuf> {
let source_file = PathBuf::from(env!("CARGO_MANIFEST_DIR"))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path won't be like this in the installed module (after doing pip install).

I think it is OK to do this resolution inside __init__.py, or even put the resolution function there (prefixed with underscore so it will be sort of private for the module).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like this?

let module = PyModule::import(py, "sudachi.dictionary_path")?;

In this case I think it's better to separate file, not to load other functions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by load?
Resolving the path of the pip-installed dictionary is a core functionality of the Python binding, as long as it is shown as "do not call" for users, e.g. having _revolve_dic_internal, it should not be a problem. Python does not have the concept of visibility.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyModule.import should work though

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant the loading cost of PyModule::import, which will loads entire module.
If we put functions in __init__.py, maybe PyModule::from_code(py, "from sudachi import function"), is better.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Creating a new dictionary is one off operation, it is OK for it to be more or less heavy.

return get_absolute_dict_path(dict_type)
else:
raise ModuleNotFoundError(
'Package `sudachidict_{}` dose not exist. '
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: does not exist

@mh-northlander mh-northlander changed the title WIP Load sudachidict dictionary Load sudachidict dictionary Oct 14, 2021
@mh-northlander mh-northlander merged commit 4138cb4 into WorksApplications:develop Oct 14, 2021
@mh-northlander mh-northlander deleted the load-sudachidict branch October 14, 2021 05:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants