syordanov94/language_detection

Python-Language Detection Library Comparer

This is a project, written in Python, that compares the accuracy of two of the most widely used Python packages for language detection: Langdetect and Langid.

A plain txt file is given as input, and the program goes through each line of that file and predicts the language in which the line is written. The predicted languages (and their probabilities) are then written to a new csv file called lang_detection_results.csv.
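The per-line flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name write_predictions, the Detector type, and the csv column names are assumptions made for the example. The two wrapper functions use the public entry points of the libraries (detect_langs from langdetect and classify from langid) and import them lazily, so the csv-writing logic can be tried without either package installed.

```python
import csv
from typing import Callable, Dict, Iterable, Tuple

# A detector maps one line of text to (language code, score).
Detector = Callable[[str], Tuple[str, float]]


def langdetect_detector(text: str) -> Tuple[str, float]:
    """Top guess from langdetect; the score is a probability in [0, 1]."""
    from langdetect import detect_langs  # third-party, imported lazily
    best = detect_langs(text)[0]  # candidates come sorted by probability
    return best.lang, best.prob


def langid_detector(text: str) -> Tuple[str, float]:
    """Top guess from langid; by default the score is an unnormalized log-probability."""
    import langid  # third-party, imported lazily
    lang, score = langid.classify(text)
    return lang, score


def write_predictions(
    lines: Iterable[str],
    detectors: Dict[str, Detector],
    out_path: str,
) -> int:
    """Write one csv row per non-empty line; return the number of rows written.

    The column names (text, <name>_lang, <name>_score) are hypothetical.
    """
    fieldnames = ["text"]
    for name in detectors:
        fieldnames += [f"{name}_lang", f"{name}_score"]

    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnames)
        writer.writeheader()
        for raw in lines:
            text = raw.strip()
            if not text:
                continue  # skip blank lines instead of asking for a prediction
            row = {"text": text}
            for name, detect in detectors.items():
                lang, score = detect(text)
                row[f"{name}_lang"] = lang
                row[f"{name}_score"] = score
            writer.writerow(row)
            written += 1
    return written
```

With both packages installed, passing an open input file together with {"langdetect": langdetect_detector, "langid": langid_detector} and "lang_detection_results.csv" would produce a file of the kind described above. Note that the two scores are not directly comparable unless langid is configured to return normalized probabilities.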

The program also outputs the prediction performance of each algorithm in various charts.

NOTE: This is strictly a comparison of prediction accuracy, NOT of technical performance. This means that there is no comparison of the speed, memory usage, or similar resource usage of each algorithm.
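As a rough illustration of what a prediction accuracy comparison can mean, the sketch below computes the fraction of lines whose predicted language matches an expected label. This is an assumption about the metric: the README does not say how the project scores predictions or where ground-truth labels come from. The names accuracy, library_a, and library_b, and the toy labels, are invented for the example and are not project results.

```python
from typing import Dict, List, Sequence


def accuracy(expected: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of positions where the predicted label equals the expected one."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        return 0.0  # no lines to score
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


# Toy data, made up for illustration only.
expected_labels: List[str] = ["en", "es", "fr", "de"]
toy_predictions: Dict[str, List[str]] = {
    "library_a": ["en", "es", "fr", "nl"],  # one mistake
    "library_b": ["en", "pt", "fr", "nl"],  # two mistakes
}

scores = {
    name: accuracy(expected_labels, preds)
    for name, preds in toy_predictions.items()
}
print(scores)  # {'library_a': 0.75, 'library_b': 0.5}
```

A dictionary like scores is a convenient shape for feeding a bar chart, with one bar per library.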

Prerequisites

  • Python installed. The version used for this project is Python 3.11.2
  • make installed
  • Recommended but not mandatory: VS Code or a similar IDE

How to install and run the project

  • First, clone the project from this GitHub repository:
git clone https://github.com/syordanov94/language_detection.git
  • Once cloned, you will need to download and update all module dependencies. To do this, use the provided Makefile by running the following command:
make
  • Once the dependencies are updated, you can run the language_detection.py file, which performs all the functionality:
python3 language_detection.py

or

python language_detection.py
  • Once run, this will produce an output like the following:

[Image: example output of the program]

How to test the project
