atakankocyigit/n-gram-algorithm

n-gram-algorithm

In this study, a corpus was built from Turkish-language texts, and n-gram analysis was performed on that corpus. The analysis is implemented in two programming languages, Java and Python. The frequencies of 1-grams, 2-grams, and 3-grams are computed, and the running times of the two implementations are compared. The n-gram counts are stored in a HashMap in Java and in a dictionary in Python. Regular expressions are used to handle punctuation. The words with the highest frequency are listed.
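The approach described above can be sketched in Python roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (`tokenize`, `count_ngrams`, `top_ngrams`), the punctuation pattern, and the sample sentence are all assumptions made for the example. It removes punctuation with a regular expression, counts n-grams in a plain dictionary, and times each pass with `time.perf_counter`.

```python
import re
import time

# Assumption: every character that is neither a word character nor
# whitespace is treated as punctuation and replaced by a space.
# The repository's actual pattern may differ.
PUNCTUATION = re.compile(r"[^\w\s]")


def tokenize(text):
    """Remove punctuation and split the text on whitespace."""
    return PUNCTUATION.sub(" ", text).split()


def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens using a plain dict."""
    counts = {}
    for i in range(len(tokens) - n + 1):
        gram = " ".join(tokens[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def top_ngrams(counts, k=10):
    """Return the k most frequent n-grams, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    # Hypothetical sample text standing in for the Turkish corpus.
    sample = "Bugün hava çok güzel. Bugün hava sıcak, yarın hava soğuk!"
    tokens = tokenize(sample)
    for n in (1, 2, 3):
        start = time.perf_counter()
        counts = count_ngrams(tokens, n)
        elapsed = time.perf_counter() - start
        print(f"{n}-gram", top_ngrams(counts, 3), f"{elapsed:.6f} s")
```

The sketch deliberately skips lowercasing, because Python's default `str.lower()` does not follow Turkish casing rules for `I` and `İ`. A Java version could follow the same shape with a `HashMap<String, Integer>` in place of the dictionary.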

Results

1-gram results

image

2-gram results

image

3-gram results

image

LICENSE

MIT LICENSE
