
No language refset for any locale listed in priority list #27

Closed
sidharthramesh opened this issue Oct 30, 2021 · 4 comments

Comments

@sidharthramesh
Contributor

sidharthramesh commented Oct 30, 2021

Hey @wardle, I just updated to the latest version and I tried to import and index the SNOMED CT International Edition (SnomedCT_InternationalRF2_PRODUCTION_20210131T120000Z). However, I got this error:

java -jar hermes-v0.8.0.jar -d ./snomed2.db index                       

2021-10-30 20:06:17,617 [main] INFO  com.eldrix.hermes.terminology - Building search index {:root "./snomed2.db", :languages "en-IN"}
Exception in thread "main" clojure.lang.ExceptionInfo: No language refset for any locale listed in priority list {:priority-list "en-IN", :store-filename "/Users/sid/Desktop/mlds/snomed2.db/store.db"}
        at com.eldrix.hermes.impl.search$build_search_index.invokeStatic(search.clj:138)
        at com.eldrix.hermes.impl.search$build_search_index.invoke(search.clj:131)
        at com.eldrix.hermes.terminology$build_search_index.invokeStatic(terminology.clj:173)
        at com.eldrix.hermes.terminology$build_search_index.invoke(terminology.clj:168)
        at com.eldrix.hermes.terminology$build_search_index.invokeStatic(terminology.clj:169)
        at com.eldrix.hermes.terminology$build_search_index.invoke(terminology.clj:168)
        at com.eldrix.hermes.core$build_index.invokeStatic(core.clj:53)
        at com.eldrix.hermes.core$build_index.invoke(core.clj:51)
        at com.eldrix.hermes.core$invoke_command.invokeStatic(core.clj:118)
        at com.eldrix.hermes.core$invoke_command.invoke(core.clj:116)
        at com.eldrix.hermes.core$_main.invokeStatic(core.clj:135)
        at com.eldrix.hermes.core$_main.doInvoke(core.clj:121)
        at clojure.lang.RestFn.applyTo(RestFn.java:137)
        at com.eldrix.hermes.core.main(Unknown Source)

I'm curious - how does Hermes know that the priority list is "en-IN"? I'm only importing the International Edition. Any idea why this error happens?

@wardle
Owner

wardle commented Oct 30, 2021

It's using Java's idea of the default locale to choose which language reference set(s) to use, and from those the preferred synonym to store in the search index (effectively a cache used for autocompletion). Arguably, it shouldn't fail like that; it should instead flag an error and point out that you need to provide a locale upfront, or fall back to "en-US". I have already committed some changes to permit explicit locale choice at the command line. I will create a new release, if that helps, but in the meantime you could set the locale at the command line. The other option is that I include en-IN in the list of known locales and you tell me which language reference sets to use for India. Are there any specific to India, or do you use UK or US? I'd value your opinion.
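
To illustrate the behaviour described above, here is a minimal Java sketch of locale lookup against a fixed table of known tags. It is not Hermes's actual implementation: the class name `LocaleLookupSketch`, the `KNOWN` list, and both helper methods are hypothetical, and the sketch only assumes that matching follows the standard `java.util.Locale.lookupTag` rules.

```java
import java.util.List;
import java.util.Locale;

public class LocaleLookupSketch {
    // Hypothetical stand-in for a tool's table of "known" locales.
    static final List<String> KNOWN = List.of("en-gb", "en-us");

    /** Returns the best known tag for a priority list, or null if nothing matches. */
    static String match(String priorityList) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(priorityList);
        return Locale.lookupTag(ranges, KNOWN);
    }

    /** Same as match, but returns a fixed fallback tag instead of null. */
    static String matchOrDefault(String priorityList, String fallback) {
        String tag = match(priorityList);
        return tag != null ? tag : fallback;
    }

    public static void main(String[] args) {
        // The JVM derives this from the operating system settings unless overridden.
        System.out.println("JVM default locale: " + Locale.getDefault().toLanguageTag());
        // "en-IN" is tried as "en-in" and then "en"; neither is in KNOWN.
        System.out.println("en-IN -> " + match("en-IN"));
        System.out.println("en-IN with fallback -> " + matchOrDefault("en-IN", "en-us"));
        System.out.println("en-IN,en-GB -> " + match("en-IN,en-GB"));
    }
}
```

Under these rules a bare "en-IN" matches neither "en-gb" nor "en-us", which is consistent with the error in the report. The JVM default itself can typically be overridden per invocation with the standard `user.language` and `user.country` system properties, without touching the system settings.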

@sidharthramesh
Contributor Author

Yup. I figured that out after looking at the code. It worked after I went into my system settings and changed my locale to en-US! Indian medical terminologies are based more on UK terms than US ones, so an explicit option would be best.

In the current version (v0.8.0), can I pass the locale via a command-line flag or environment variable?

@wardle
Owner

wardle commented Oct 30, 2021

Try v0.8.1 676f3d8 - you can now specify --locale as an option for index creation to override your system default, and it will fall back to en-US, which I think is a reasonable default.

Also see

{"en-gb" [999001261000000100 ;; NHS realm language (clinical part)
for the current list of "known" languages, which is incomplete. As I wasn't permitted to download the different international releases, I haven't been able to test or indeed populate this language mapping. I can easily add a suitable priority list of refset identifiers for "en-IN" which could use UK and US as a fallback.
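
The idea of a per-locale priority list with UK and US as fallbacks could be sketched as follows. This is an illustration only, not the project's mapping: the class `RefsetPrioritySketch`, the `PRIORITIES` table and its contents, and the `installedPreferences` helper are all hypothetical, and the "en-in" entry is an assumption. The NHS refset identifier is the one quoted above; the other two are the GB and US English language reference sets shipped in the International Edition.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

public class RefsetPrioritySketch {
    static final long NHS_REALM_CLINICAL = 999001261000000100L; // quoted in the thread
    static final long GB_ENGLISH = 900000000000508004L;         // International Edition
    static final long US_ENGLISH = 900000000000509007L;         // International Edition

    // Illustrative table only; the "en-in" entry is hypothetical.
    static final Map<String, List<Long>> PRIORITIES = Map.of(
        "en-gb", List.of(NHS_REALM_CLINICAL, GB_ENGLISH),
        "en-us", List.of(US_ENGLISH),
        "en-in", List.of(GB_ENGLISH, US_ENGLISH));

    /** Returns the refsets preferred for a tag, in order, keeping only installed ones. */
    static List<Long> installedPreferences(String tag, Set<Long> installed) {
        List<Long> wanted = PRIORITIES.getOrDefault(tag.toLowerCase(Locale.ROOT), List.of());
        return wanted.stream().filter(installed::contains).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Set<Long> international = Set.of(GB_ENGLISH, US_ENGLISH);
        // prints [900000000000508004, 900000000000509007]
        System.out.println(installedPreferences("en-IN", international));
        // prints [900000000000508004]; the NHS refset is not in the International Edition
        System.out.println(installedPreferences("en-GB", international));
    }
}
```

Filtering against the installed refsets means one table can serve both national and international distributions: entries that are not present are simply skipped, and an empty result signals that no usable language refset exists for that locale.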

@sidharthramesh
Contributor Author

Works now
