implementation("io.github.hindigarv:shabdkosh:1.0.4")
val wordFinder = WordFinder()
val text = "बाबा बैद्यनाथ की धरती पर आज विकास के इतने सारे काम हो रहे हैं, लेकिन मेरा आपसे एक सवाल भी है…"
val words = wordFinder.find(text)
words.forEach { println("${it.shabd} (${it.mool}) -> ${it.paryays.joinToString(", ")}") }
// console:
// बाबा (अरबी) -> पितामह, साधु
// लेकिन (अरबी) -> परन्तु, किन्तु
// सवाल (अरबी) -> प्रश्न
WordFinder uses the शब्दकोश (dictionary) file as its source.
To load new words from that file automatically, enable the autoRefresh option:
val wordFinder = WordFinder(autoRefresh = true)
How to build the Java library with Gradle
Create a file ./lib/gradle.properties with the following content:
# ossrh JIRA creds used at https://issues.sonatype.org/secure/Dashboard.jspa
ossrh_username=xxx
ossrh_password=xxx
signing.keyId=xxx
signing.password=xxx
signing.secretKeyRingFile=/Users/xxx/secring.gpg
Then run ./gradlew clean build
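For orientation, here is a rough sketch of how a Kotlin DSL build script could consume the properties above. It is an assumption about the setup, not a copy of this project's build.gradle.kts: the publication name mavenJava and the repository name ossrh are made up for illustration, and the Kotlin plugin declaration is omitted. The signing plugin reads signing.keyId, signing.password and signing.secretKeyRingFile from gradle.properties on its own, so only the OSSRH credentials are looked up explicitly.

```kotlin
// Illustrative sketch only; the real build script may differ.
plugins {
    `java-library`
    `maven-publish`
    signing
}

publishing {
    publications {
        // "mavenJava" is a hypothetical publication name
        create<MavenPublication>("mavenJava") {
            from(components["java"])
        }
    }
    repositories {
        maven {
            name = "ossrh" // hypothetical repository name
            url = uri("https://s01.oss.sonatype.org/service/local/staging/deploy/maven2/")
            credentials {
                // Values come from ./lib/gradle.properties
                username = project.findProperty("ossrh_username") as String?
                password = project.findProperty("ossrh_password") as String?
            }
        }
    }
}

signing {
    // Key id, password and key ring file are picked up from the signing.* properties
    sign(publishing.publications["mavenJava"])
}
```

With a setup along these lines, ./gradlew publishToMavenLocal needs no credentials, while ./gradlew publish uploads to the staging repository.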
- Change the version in build.gradle.kts
- Publish to Maven Local and verify: ./gradlew publishToMavenLocal
- Publish to Maven Central: ./gradlew publish
- Go to https://s01.oss.sonatype.org/#stagingRepositories
- Close and release the new version
- Verify in maven repos:
- Create a new release on GitHub
- Update the example code in the README with the new version
- TBA
- Minor: remove dependency on emoji parser library
- Minor: update regex to split on any non-Devanagari character
- Minor: Regex is enabled by default
- Minor: Tokenize text using non-word symbols: +, ^, <, >, |, &, =, numbers and Roman letters
- Minor: Regex allows an optional OptionSet, e.g. x(a)?x -> "xx", "xax"
- Major: Enable Regex to generate ShabdaRoop list
- Minor: Handle the three dots character … to split the word.
- Added a feature to auto refresh the dictionary every 5 minutes.
- New dependency added: api("io.github.microutils:kotlin-logging-jvm:2.0.11")
- Removed dependency: implementation("com.google.guava:guava:30.1.1-jre")
- Added WordFinder which loads dictionary on constructor.
- First version without any feature.
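The tokenization entries in this changelog (splitting on any non-Devanagari character, including symbols, numbers, Roman letters and the … character) can be illustrated with a small standalone sketch. The tokenize function below is hypothetical and is not part of the shabdkosh API. It assumes that "Devanagari character" means the Unicode block U+0900 to U+097F, which may differ from what the library actually does.

```kotlin
// Hypothetical helper, not part of the shabdkosh API.
// Every run of characters outside the Devanagari block (U+0900..U+097F)
// is treated as a separator, so symbols, ASCII digits, Roman letters and
// the "…" character all split words.
private val NON_DEVANAGARI = Regex("[^\\u0900-\\u097F]+")

fun tokenize(text: String): List<String> =
    text.split(NON_DEVANAGARI).filter { it.isNotEmpty() }

fun main() {
    val tokens = tokenize("लेकिन मेरा आपसे एक सवाल भी है… abc+123|प्रश्न")
    println(tokens)
}
```

Because the danda (।) and the Devanagari digits fall inside that block, this sketch keeps them attached to neighbouring words. A stricter character class would be needed to split on them as well.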