Work explained here: https://duenaslerin.com/diccionario-palabras-espanol-en-texto-script/
Last updated from the RAE server on 2024-05-22.
Download all data from RAE
python3 src/rae_downloader.py
By default, this generates the file data/palabras_todas.txt.
Split into different files
bash src/spliter.sh
Usage
usage: rae_downloader.py [-h] [--conjugaciones] [--skip-conjugaciones]
                         [--outfile outfile]

RAE Downloader.

optional arguments:
  -h, --help            show this help message and exit
  --conjugaciones
  --skip-conjugaciones
  --outfile outfile
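For readers who want to see how such a command line could be declared, here is a minimal sketch of an argparse parser that accepts the same flags. It is a reconstruction from the usage text only, not the actual source of rae_downloader.py: the function name build_parser is hypothetical, treating both conjugation flags as plain on/off switches is an assumption, and the default for --outfile is taken from the default output path mentioned earlier.

```python
import argparse


def build_parser():
    """Build a parser accepting the flags listed in the usage text (sketch)."""
    parser = argparse.ArgumentParser(
        prog="rae_downloader.py",
        description="RAE Downloader.",
    )
    # Assumption: both flags are simple boolean switches, since the usage
    # text shows no value placeholder for them.
    parser.add_argument("--conjugaciones", action="store_true")
    parser.add_argument("--skip-conjugaciones", action="store_true")
    # Assumption: the default matches the documented default output file.
    parser.add_argument(
        "--outfile",
        metavar="outfile",
        default="data/palabras_todas.txt",
    )
    return parser
```

Calling build_parser().parse_args(["--outfile", "out.txt"]) returns a namespace whose outfile attribute is "out.txt". Note that recent Python versions print "options:" instead of "optional arguments:" in the help text.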
Words in the file are unordered and may be duplicated. To sort and deduplicate them, dropping entries that start or end with a hyphen:
cat palabras_todas.txt | grep -v '.*-$' | grep -v ^- | sort | uniq > 0_palabras_todas.txt
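The same cleanup can be sketched in Python, which may help on systems without the Unix tools. This is only an illustrative equivalent of the pipeline above, not part of the repository: the names clean_words and clean_file are hypothetical, and the input is assumed to be UTF-8 with one word per line.

```python
def clean_words(lines):
    """Return sorted, unique words, skipping entries that start or end with '-'."""
    words = set()
    for line in lines:
        word = line.rstrip("\n")
        # Mirrors grep -v '.*-$' and grep -v ^- : drop hyphenated affixes.
        if word.startswith("-") or word.endswith("-"):
            continue
        words.add(word)
    # Sorted by Unicode code point; the shell sort may order accented
    # letters differently depending on the locale.
    return sorted(words)


def clean_file(src, dst):
    """Read src, clean its words, and write them to dst (paths are caller-supplied)."""
    with open(src, encoding="utf-8") as handle:
        words = clean_words(handle)
    with open(dst, "w", encoding="utf-8") as handle:
        handle.writelines(word + "\n" for word in words)
```

For example, clean_file("palabras_todas.txt", "0_palabras_todas.txt") would write the cleaned list next to the original file.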
The 0_palabras_todas.txt file is needed for this step.
From inside the diccionario-espanol-txt folder, running the length.sh script creates the length folder, with the words classified by their length.
bash src/length.sh
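The classification by length can be sketched as follows. This is an illustration of the idea, not a port of length.sh, whose exact behavior and output file names are not described here: the function names are hypothetical, and the file naming scheme (one file per length, named after the length) is an assumption.

```python
from collections import defaultdict
from pathlib import Path


def group_by_length(words):
    """Map each word length to the list of words having that length."""
    groups = defaultdict(list)
    for word in words:
        # len() counts Unicode code points, so "ñu" has length 2.
        groups[len(word)].append(word)
    return dict(groups)


def write_groups(groups, out_dir):
    """Write one file per group into out_dir (hypothetical naming: '<key>.txt')."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for key, words in groups.items():
        target = out_path / f"{key}.txt"
        target.write_text("".join(w + "\n" for w in words), encoding="utf-8")
```

Given the words read from 0_palabras_todas.txt, write_groups(group_by_length(words), "length") would produce a folder similar in spirit to the one the script creates.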
The 0_palabras_todas.txt file is needed for this step.
Because the palabras_todas.txt file is not available (generating it takes many hours), spliter.sh will not work, so this script works from the 0_palabras_todas.txt file instead.
From inside the diccionario-espanol-txt folder, running the starting_letter.sh script creates the starting_letter folder, with the words classified by their first letter.
bash src/starting_letter.sh
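A minimal sketch of the first-letter classification is shown below. It is not taken from starting_letter.sh: the function name is hypothetical, and it assumes that the first character is used exactly as it appears, so accented letters such as "á" form their own group and no case folding is applied. The real script may treat these cases differently.

```python
from collections import defaultdict


def group_by_first_letter(words):
    """Map each starting character to the list of words beginning with it."""
    groups = defaultdict(list)
    for word in words:
        if not word:
            # Skip empty lines, which have no first letter.
            continue
        # Assumption: the first character is used as-is (no accent or case folding).
        groups[word[0]].append(word)
    return dict(groups)
```

Each resulting group could then be written to its own file inside a starting_letter folder.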