Extend "Blasto" by the language Latin #1
Comments
I can, if I have a Latin quadgram dataset, but unfortunately I don't have one. |
Unfortunately, I do not have such a dataset either, and I do not know how one could create it from a Latin corpus. I have asked in a forum whether anyone knows a source, and I will get back to you if there is any feedback. |
I put this together from a corpus of ~2.5M words. Could it be useful? https://drive.google.com/file/d/1ZX0Fu3rWREViVayVat_1myvSJq8lk2u-/view |
Nice one, let me check whether it can be implemented. |
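For anyone wondering how a quadgram file could be derived from a plain-text corpus, here is a minimal sketch. It is not the script used for the file above. It assumes a `QUAD count` line format and that quadgrams are counted across word boundaries after stripping everything except A-Z; the format Blasto actually expects may differ. The names `count_quadgrams` and `format_quadgrams` are made up for illustration.

```python
import re
from collections import Counter


def count_quadgrams(text: str) -> Counter:
    """Count overlapping 4-letter sequences, ignoring anything that is not A-Z."""
    # Assumption: spaces, punctuation and non-ASCII letters are simply dropped,
    # so quadgrams may span word boundaries.
    letters = re.sub(r"[^A-Z]", "", text.upper())
    return Counter(letters[i:i + 4] for i in range(len(letters) - 3))


def format_quadgrams(counts: Counter) -> str:
    """Render the counts as 'QUAD count' lines, most frequent first (assumed format)."""
    return "\n".join(f"{quad} {n}" for quad, n in counts.most_common())


if __name__ == "__main__":
    sample = "Gallia est omnis divisa in partes tres"
    print(format_quadgrams(count_quadgrams(sample)))
```

For a real corpus, the text would be read from a file and passed to `count_quadgrams` in one go or in chunks whose counters are added together.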
Something that occurred to me after I created that file: medieval Latin typically represented 'u' and 'v' with the same character. Should this be 'simulated' in the quadgrams by replacing u with v or vice-versa in at least part of the corpus? Maybe duplicating those lines so that they appear both with distinct u/v and a single character? |
I think duplicating those lines with the characters replaced is the easiest way, although the quadgram file will be bigger. |
An updated version of the file, where I added the replacement of AE/OE with E and of V with U. This of course results in additional quadgrams (about 1% more lines). https://drive.google.com/file/d/1F3R1byY_63bS4H6TLssn3PieUthCNxwc/view?usp=sharing |
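The duplication idea discussed here could look roughly like the sketch below. It is an assumption about the approach, not the code that produced the updated file: each corpus line is kept as is, and a second copy with AE/OE replaced by E and V replaced by U is added when it differs. The helper names `spelling_variants` and `expand_corpus` are hypothetical, and the replacement is deliberately naive (it would also turn POETA into PETA, for example).

```python
def spelling_variants(line: str) -> list[str]:
    """Return the upper-cased line plus a medieval-style respelling if it differs."""
    original = line.upper()
    # Assumption: all three replacements are applied together to one extra copy.
    medieval = original.replace("AE", "E").replace("OE", "E").replace("V", "U")
    return [original] if medieval == original else [original, medieval]


def expand_corpus(lines: list[str]) -> list[str]:
    """Duplicate every line that changes under the respelling rules."""
    expanded = []
    for line in lines:
        expanded.extend(spelling_variants(line))
    return expanded


if __name__ == "__main__":
    for variant in expand_corpus(["Caesar venit", "Roma"]):
        print(variant)
```

Counting quadgrams over the expanded lines then yields entries for both spellings, at the cost of a somewhat larger file.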
I am also interested in the implementation of Latin in Blasto. Does it look like this can happen? |
Is there any progress yet on the implementation of Latin? |
If you are still interested, here is the mini version with Latin support: https://www.dropbox.com/scl/fi/y066ahjjsccnpc8z9knpu/subst_solver_latin.zip?rlkey=l5en2nl8lps3rgv1rjiv6ln84&dl=1 |
Hi. Thank you, this is great! |
Can you extend your program "Blasto" to support Latin? The program would then be very useful for deciphering old manuscripts.