GitHub - AshkanArabim/persian-word-extractor

persian-word-extractor

This script creates a list of unique words from Persian text. Words can be sorted by how often they appear in the source or in alphabetical order. Words with accent marks (diacritics) are excluded from the results. This is a new project, so there could be major bugs in the code.

Features:

Sort words by frequency or in alphabetical order
Extract words from source.txt or from online links
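
The extraction and sorting described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: the function names, the character ranges used to recognise Persian letters, and the way accented words are filtered out are all assumptions.

```python
import re
from collections import Counter

# Assumed character classes; the real script may use different ranges.
LETTERS = "\u0621-\u063A\u0641-\u064A\u067E\u0686\u0698\u06A9\u06AF\u06CC"
MARKS = "\u064B-\u0652"   # common Arabic-script accent marks (fatha, kasra, ...)
ZWNJ = "\u200C"           # zero-width non-joiner, used inside some Persian words

TOKEN_RE = re.compile(f"[{LETTERS}{MARKS}{ZWNJ}]+")
MARK_RE = re.compile(f"[{MARKS}]")


def extract_words(text):
    """Return every Persian word in text, skipping words with accent marks."""
    words = []
    for token in TOKEN_RE.findall(text):
        token = token.strip(ZWNJ)  # drop stray joiners at the word edges
        if token and not MARK_RE.search(token):
            words.append(token)
    return words


def unique_words(text, order="frequency"):
    """Return unique words, most frequent first or in code-point order."""
    counts = Counter(extract_words(text))
    if order == "alphabetical":
        # Code-point order only approximates the Persian alphabet:
        # letters such as U+067E and U+0686 sort after the basic Arabic ones.
        return sorted(counts)
    # Sort by descending count, breaking ties by code point.
    return [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
```

Calling unique_words(text) gives the frequency ordering, and unique_words(text, order="alphabetical") gives the alphabetical one. A proper Persian collation would need a locale-aware sort key, which this sketch leaves out.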

How to use:

Create a file named 'source.txt' in the root directory and paste the source text into it.
Run 'main.py'.
Follow the CLI instructions.
Results are written to 'output.txt' in the root directory.
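
For anyone tweaking the code, the source.txt to output.txt flow above could look something like this. It is only a sketch: build_word_list is a hypothetical name, the whitespace-only tokenizer is a simplification, and the one-word-per-line output format is an assumption rather than what main.py is known to produce.

```python
from collections import Counter
from pathlib import Path


def build_word_list(source_path="source.txt", output_path="output.txt"):
    """Read source_path, count its words, and write them to output_path."""
    # Read as UTF-8 so Persian characters survive intact on every platform.
    text = Path(source_path).read_text(encoding="utf-8")
    # Simplified tokenizer (assumption): split on whitespace only.
    counts = Counter(text.split())
    # Most frequent first; words with equal counts keep first-appearance order.
    words = [word for word, _ in counts.most_common()]
    # One word per line is an assumed output format.
    Path(output_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    return words
```

With a source.txt in the current directory, build_word_list() writes output.txt next to it and also returns the list, which makes the result easy to inspect or post-process.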

Feel free to tweak the code to suit your needs.

How did I use it?

I ran this script on a large body of Persian text to extract words for contribution to Monkeytype. I added the "Persian 1k" & "Persian 5k" tests. My first open-source contribution!!
