onkelos

Programs for the computational analysis of the Aramaic found in the Targum Onkelos.

scraper pulls the text from the cal.huc.edu database and writes it to CSV files with vowels removed, in both Latin and Hebrew characters. The output can be split by book, chapter, or verse. Primarily uses BeautifulSoup.
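
The post-processing that scraper performs (vowel removal, grouping by book, chapter, or verse, and CSV output) might look roughly like the sketch below. It is not the code in scraper.py: the function names, the `(book, chapter, verse, text)` row shape, and the `unit,text` column layout are all assumptions made for illustration. The BeautifulSoup fetching step is left out because the page layout of cal.huc.edu is not described here, and Latin transliteration is omitted as well.

```python
import csv
import io
import unicodedata
from collections import defaultdict


def strip_vowels(text):
    """Remove Hebrew vowel points and other combining marks from text."""
    decomposed = unicodedata.normalize("NFD", text)
    # Vowel points, dagesh and cantillation marks are all in category "Mn".
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def group_verses(rows, level="verse"):
    """Group (book, chapter, verse, text) rows by book, chapter, or verse.

    Returns a list of (key, text) pairs in first-seen order, where key is a
    tuple such as ("Gen",), ("Gen", 1) or ("Gen", 1, 1).
    """
    depth = {"book": 1, "chapter": 2, "verse": 3}[level]
    grouped = defaultdict(list)
    for book, chapter, verse, text in rows:
        key = (book, chapter, verse)[:depth]
        grouped[key].append(strip_vowels(text))
    return [(key, " ".join(parts)) for key, parts in grouped.items()]


def to_csv(grouped):
    """Serialize grouped text to a CSV string (hypothetical column layout)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["unit", "text"])
    for key, text in grouped:
        writer.writerow([".".join(str(part) for part in key), text])
    return buffer.getvalue()
```

Calling `to_csv(group_verses(rows, level="chapter"))` would then produce one CSV row per chapter, with the unpointed text of its verses joined by spaces.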

train_clf is built on a program by @dimidd that trains on the common tractates of the Talmudim and then analyzes the uncommon tractates and those mistakenly labelled. This version still trains on the common tractates (half in Jewish Palestinian Aramaic, half in Jewish Babylonian Aramaic) but analyzes the Targum Onkelos instead, predicting the dialectal origin of different parts of the text.

It primarily uses the Multinomial Naive Bayes algorithm, but can also run Logistic Regression and LightGBM. For Naive Bayes, it can display the words that most influenced each prediction, along with each word's weight. Alongside the predictions, the text can be displayed in Hebrew and in Latin characters. Primarily uses sklearn and pandas.
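
To make the Naive Bayes step and the "most influential words" output concrete, here is a from-scratch sketch in plain Python. It is a stand-in for illustration only: train_clf uses sklearn, not this code. The function names, the `bavli`/`yerushalmi` label strings, the toy tokens, and the choice of a log-likelihood difference as the word weight are all assumptions.

```python
import math
from collections import Counter


def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with additive smoothing on whitespace tokens."""
    classes = sorted(set(labels))
    counts = {c: Counter() for c in classes}
    for doc, label in zip(docs, labels):
        counts[label].update(doc.split())
    vocab = sorted(set().union(*counts.values()))
    log_prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    log_like = {}
    for c in classes:
        total = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((counts[c][w] + alpha) / total) for w in vocab}
    return classes, log_prior, log_like


def predict(model, text):
    """Return (predicted label, probability); unseen words are ignored."""
    classes, log_prior, log_like = model
    scores = {
        c: log_prior[c] + sum(log_like[c].get(w, 0.0) for w in text.split())
        for c in classes
    }
    # Turn log scores into probabilities with a numerically stable softmax.
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    norm = sum(exp.values())
    best = max(exp, key=exp.get)
    return best, exp[best] / norm


def top_words(model, text, label, other, k=3):
    """Rank the words of text by how strongly they favor label over other."""
    _, _, log_like = model
    weights = {
        w: log_like[label][w] - log_like[other][w]
        for w in set(text.split())
        if w in log_like[label]
    }
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    # Toy placeholder tokens, not real Aramaic.
    docs = ["aa bb aa", "aa cc", "dd ee dd", "dd cc"]
    labels = ["bavli", "bavli", "yerushalmi", "yerushalmi"]
    model = train_nb(docs, labels)
    print(predict(model, "dd ee"))
    print(top_words(model, "dd ee cc", "yerushalmi", "bavli"))
```

A positive weight means the word is more likely under the first label than the second; a word that is equally likely under both gets a weight of zero and contributes nothing to the decision.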

analyze_results tallies the verses attributed to yerushalmi, taking the confidence of each prediction into account.
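
How analyze_results weighs confidence is not spelled out here, so the sketch below shows two plausible readings side by side: counting only predictions above a threshold, and summing the confidences as a weighted tally. The function name, the `(verse_id, predicted_label, confidence)` tuple shape, and the 0.8 threshold are hypothetical.

```python
def tally(predictions, label="yerushalmi", min_confidence=0.8):
    """Summarize how many verses were attributed to label.

    predictions is an iterable of (verse_id, predicted_label, confidence).
    Returns the raw count, the count at or above min_confidence, and the
    sum of confidences for verses predicted as label.
    """
    total = 0
    confident = 0
    weighted = 0.0
    for _verse_id, predicted, confidence in predictions:
        if predicted != label:
            continue
        total += 1
        weighted += confidence
        if confidence >= min_confidence:
            confident += 1
    return {"total": total, "confident": confident, "weighted": weighted}
```

With a threshold, low-confidence attributions are dropped entirely; with the weighted sum, they still count, but for less than a full verse.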

DATE: 1/14/24

CONTRIBUTORS: scraper - Eitan Zemel; train_clf - Dimid Duchovny, Eitan Zemel; analyze_results - Eitan Zemel

