Terebellum

Project Terebellum is an open-source experiment in using machine learning to process and analyze Turkish text. It is named after the Terebellum worm, a marine worm that lives in a tube and uses a crown of tentacles to catch food particles in the water. The analogy is a loose one: the project aims to build a model that sifts through Turkish text much as the worm sifts food particles from the water.

The first two consonants of the project name, "T" and "R", form the abbreviation "TR", which is also the ISO 3166-1 alpha-2 code for Türkiye. This is a happy coincidence that fits the project's focus on Turkish text.

The models are written in the Rust programming language using the Burn machine learning library. The project is still in the early stages of development; the goal is a model that processes and analyzes Turkish text with high accuracy and efficiency.

Currently, the project contains one model: a binary classifier that labels Turkish text as either "offensive" or "non-offensive". It is trained on a dataset of Turkish tweets labeled with these two categories. The classifier is built on a small BERT model and achieves an accuracy of around 85% on the test dataset.
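To make the classification step concrete, here is a minimal sketch of how two raw class scores (logits) could be turned into a label, and how accuracy could be measured over a labeled test set. It is plain Rust with no dependencies and does not use the Burn API or the project's actual code. The names `Label`, `softmax2`, `predict`, and `accuracy` are hypothetical, and the class ordering (index 0 for "non-offensive", index 1 for "offensive") is an assumption.

```rust
/// The two categories described above. The enum itself is hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Label {
    NonOffensive,
    Offensive,
}

/// Numerically stable softmax over two class logits.
/// Assumed ordering: index 0 = non-offensive, index 1 = offensive.
pub fn softmax2(logits: [f32; 2]) -> [f32; 2] {
    // Subtracting the maximum keeps exp() from overflowing.
    let max = logits[0].max(logits[1]);
    let e0 = (logits[0] - max).exp();
    let e1 = (logits[1] - max).exp();
    let sum = e0 + e1;
    [e0 / sum, e1 / sum]
}

/// Pick the class with the higher probability; ties go to NonOffensive.
pub fn predict(logits: [f32; 2]) -> Label {
    let probs = softmax2(logits);
    if probs[1] > probs[0] {
        Label::Offensive
    } else {
        Label::NonOffensive
    }
}

/// Fraction of predictions that match the gold labels.
/// Returns None if the slices are empty or differ in length.
pub fn accuracy(predicted: &[Label], gold: &[Label]) -> Option<f32> {
    if predicted.is_empty() || predicted.len() != gold.len() {
        return None;
    }
    let correct = predicted.iter().zip(gold).filter(|(p, g)| p == g).count();
    Some(correct as f32 / predicted.len() as f32)
}
```

In this sketch, calling `predict` on each test example's logits and passing the results to `accuracy` along with the gold labels gives the kind of figure quoted above. A value of `Some(0.85)` would correspond to 85% accuracy.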
