In this study, a corpus was built from Turkish-language texts and n-gram analysis was performed on it. The analysis is implemented in two programming languages, Java and Python. The frequencies of 1-grams, 2-grams, and 3-grams are computed, and the running times of the two implementations are compared. N-gram counts are stored in a HashMap in Java and in a dictionary in Python. Punctuation is handled with a regular expression. The words with the highest frequency are listed.
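The Python side of the approach described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names `tokenize`, `ngram_counts`, and `top_k`, the regular expression, and the sample sentence are all assumptions made for the example. It also assumes that punctuation is stripped rather than kept as tokens, and it skips case folding, because Python's default `str.lower()` does not apply the Turkish dotted/dotless I rules.

```python
import re
import time


def tokenize(text):
    # Assumption: punctuation is replaced by spaces. In Python 3, \w is
    # Unicode-aware for str patterns, so Turkish letters such as ş, ğ, ı survive.
    cleaned = re.sub(r"[^\w\s]", " ", text)
    # Case folding is deliberately omitted (Turkish İ/I need special handling).
    return cleaned.split()


def ngram_counts(tokens, n):
    # Count every run of n consecutive tokens in a plain dictionary.
    counts = {}
    for i in range(len(tokens) - n + 1):
        gram = " ".join(tokens[i:i + n])
        counts[gram] = counts.get(gram, 0) + 1
    return counts


def top_k(counts, k):
    # Sort by descending frequency, breaking ties alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    # Hypothetical sample text; the real corpus is not reproduced here.
    sample = "bir elma, bir armut; bir elma daha."
    tokens = tokenize(sample)
    for n in (1, 2, 3):
        start = time.perf_counter()
        counts = ngram_counts(tokens, n)
        elapsed = time.perf_counter() - start
        print(f"{n}-gram: {len(counts)} distinct, {elapsed:.6f} s")
        print(top_k(counts, 3))
```

A Java version would follow the same shape, with a `HashMap<String, Integer>` taking the place of the dictionary. Timing each n-gram pass with `time.perf_counter()` is one plausible way to obtain the comparison figures; the measurement method actually used in the project may differ.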
MIT LICENSE