0xdolan/kww

Kurdish Wikipedia Wordlist (ckb)

Welcome to Kurdish Wikipedia Wordlist, a word list with word frequencies built from Kurdish Wikipedia. The original wordlist file, the cleaned wordlist file, and the frequencies file are included.

Note: This project uses the latest Wikipedia data available at the time, taken from the ZIM file linked below. To generate the links, I used @layik's wordlist. Here are the links:

  • Wikipedia link: wikipedia_ckb_all_maxi_2021-04.zim
  • Wordlist link: kurdi_words.txt

Python libraries used in this project:

  • re (regex)
  • zimply
  • requests
  • beautifulsoup4
  • matplotlib
  • collections
  • numpy
  • tqdm
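
The project's own scripts are not reproduced in this README, so the following is only a minimal sketch of how word frequencies could be counted using two of the libraries listed above, re and collections. The regular expression, the function name word_frequencies, and the sample sentence are assumptions made for illustration and are not taken from the repository.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of Arabic-script letters, optionally joined
# by zero-width non-joiners (U+200C). The ranges below leave out
# Arabic-script digits and punctuation such as the Arabic comma (U+060C).
LETTERS = r"[\u0621-\u064A\u066E-\u06D3\u06D5]"
WORD_RE = re.compile(LETTERS + r"+(?:\u200C" + LETTERS + r"+)*")


def word_frequencies(text):
    """Return a Counter mapping each word in `text` to its occurrence count."""
    return Counter(WORD_RE.findall(text))


if __name__ == "__main__":
    # Hypothetical sample text; a real run would read article text instead.
    sample = "کوردستان وڵاتێکی جوانە. کوردستان گەورەیە."
    for word, count in word_frequencies(sample).most_common():
        print(word, count)
```

Counter.most_common() returns the words sorted from most to least frequent, which is a convenient order for writing a frequencies file.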

Need help?

If you have questions about the Kurdish Wikipedia Wordlist, feel free to reach out to me via the links below:

License

Kurdish Wikipedia Wordlist is available under the MIT license.
