diseptennea/terebellum

Terebellum

Project Terebellum is an open-source experiment in using machine learning to process and analyze Turkish text. It is named after Terebellum, a type of marine worm that lives in a tube and uses a crown of tentacles to catch food particles in the water. The project aims to develop machine learning models that sift through Turkish text much as the worm sifts food particles from the water.

The first two consonants of the project name, "T" and "R", form the abbreviation "TR", which is also the ISO 3166-1 alpha-2 code for Türkiye. This is a happy coincidence that fits well with the project's focus on Turkish text.

The models are developed in the Rust programming language with the Burn machine learning library. The project is still in the early stages of development; the goal is to process and analyze Turkish text with high accuracy and efficiency.

Currently, the project contains one model: a binary classifier that labels Turkish text as either "offensive" or "non-offensive". It is trained on a dataset of Turkish tweets annotated with these two labels, uses a small BERT model for text classification, and achieves an accuracy of around 85% on the test dataset.
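To make the classification step and the accuracy figure concrete, here is a minimal sketch in plain Rust with no external crates. It is not the project's code and does not use Burn. It assumes the classifier head emits two raw scores (logits) per text, with index 0 meaning "non-offensive" and index 1 meaning "offensive". The names `Label`, `softmax2`, `predict`, and `accuracy` are hypothetical and chosen only for illustration.

```rust
/// The two categories the classifier distinguishes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Label {
    NonOffensive,
    Offensive,
}

/// Numerically stable softmax over two logits.
/// Subtracting the maximum before exponentiating avoids overflow.
pub fn softmax2(logits: [f32; 2]) -> [f32; 2] {
    let max = logits[0].max(logits[1]);
    let e0 = (logits[0] - max).exp();
    let e1 = (logits[1] - max).exp();
    let sum = e0 + e1;
    [e0 / sum, e1 / sum]
}

/// Picks the class with the higher probability.
/// Assumption: index 0 = non-offensive, index 1 = offensive.
/// Ties fall back to `NonOffensive`.
pub fn predict(logits: [f32; 2]) -> Label {
    let probs = softmax2(logits);
    if probs[1] > probs[0] {
        Label::Offensive
    } else {
        Label::NonOffensive
    }
}

/// Fraction of predictions that match the gold labels.
/// Returns `None` if the slices are empty or differ in length.
pub fn accuracy(predicted: &[Label], gold: &[Label]) -> Option<f64> {
    if predicted.is_empty() || predicted.len() != gold.len() {
        return None;
    }
    let correct = predicted
        .iter()
        .zip(gold)
        .filter(|(p, g)| p == g)
        .count();
    Some(correct as f64 / predicted.len() as f64)
}
```

Under these assumptions, an accuracy of around 85% means `accuracy` would return roughly `Some(0.85)` when given the model's predictions and the gold labels for the test tweets. Accuracy alone can be misleading if the two classes are imbalanced, so per-class metrics may be worth reporting alongside it.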