A method to scrape all Forvo pronunciations to use the add-on offline #5
Comments
Good day. I have not tried the script myself yet, and I'm curious how it keeps Forvo from blocking an IP that is bulk-downloading all the audio for a language. I'll soon have some fresh throwaway IPs to test it on. For now I will mark this as an enhancement: the proposal could well solve the issues my original bulk scraper had, by having users give the add-on a dictionary file to work with.
The author of the script is from China and is a member of the "FreeMDict" Telegram group. He obtained a list of 5.7 million URLs from Forvo using Python, which took him several weeks. He finished the work in August 2021 and shared the script with me. The original author tried to scrape all the sounds from Forvo too quickly, and after querying 1 or 2 million URLs his IP was blocked in China. He then asked me to scrape from my IP. I did it slowly (at a speed of 400 Kb/s) and successfully queried all 5.7 million URLs 🥇 Forvo never blocked me. 👯 😃 In September I obtained 620,000 German pronunciations from Forvo and made an .mdx dictionary (on FreeMDict, private post). Yesterday I ran the Python script and it is still working perfectly! I tried Russian, French and English, and those languages work OK. Just follow the instructions on FreeMDict where the script was posted.
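To illustrate the "scrape slowly" point above, here is a minimal sketch of a rate-capped downloader. It is not the original author's script, which is not shown in this thread. The names `download_throttled`, `required_pause` and `fetch_url` are hypothetical, and reading "400 Kb/s" as roughly 400 KiB per second is an assumption.

```python
import time
import urllib.request

# Assumption: "400 Kb/s" is read as roughly 400 KiB per second.
MAX_BYTES_PER_SEC = 400 * 1024


def required_pause(total_bytes, elapsed, max_rate):
    """Seconds to sleep so the average rate stays at or below max_rate."""
    return max(0.0, total_bytes / max_rate - elapsed)


def fetch_url(url):
    """Download one URL with the standard library (hypothetical default fetcher)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def download_throttled(urls, fetch=fetch_url, max_rate=MAX_BYTES_PER_SEC,
                       clock=time.monotonic, sleep=time.sleep):
    """Yield (url, data) pairs while capping the average download rate."""
    start = clock()
    total = 0
    for url in urls:
        data = fetch(url)
        total += len(data)
        # Sleep just long enough that total / elapsed does not exceed max_rate.
        pause = required_pause(total, clock() - start, max_rate)
        if pause > 0:
            sleep(pause)
        yield url, data
```

The `fetch`, `clock` and `sleep` parameters are injectable so the pacing logic can be checked without touching the network. A cap on the average rate is only one way to be gentle; whether it is enough to avoid a block is something only a real run against Forvo can show.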
Someone from https://forum.ru-board.com/ (aleven) is downloading all the Russian pronunciations from Forvo.com. He might finish within 3-4 days. Please let me know if you are interested in the sounds.
@Rascalov All Forvo audios are now available to download: https://forum.freemdict.com/t/topic/11947 You can use the Russian audios for your language learning :D
A method to scrape all Forvo pronunciations is now available:
https://ankiweb.net/shared/info/560814150#:~:text=11%2F21%2F2021-,a%20method%20to%20scrape,-all%20Forvo%20pronunciations
The scraping method works perfectly. It can scrape all the audio files for each language.
For example, there are more than 500,000 Russian audio files that can be scraped easily.
Would it be possible to download the audio for one's target language and then bulk-add it to Anki with the add-on?
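As a rough idea of what the offline lookup could look like, here is a minimal sketch that indexes a folder of downloaded audio and produces Anki `[sound:...]` field values. It is not part of the add-on. The names `build_audio_index` and `sound_field` are hypothetical, and it assumes each file is named after its word (for example `hello.mp3`), which may not match how the scraped files are actually named.

```python
from pathlib import Path

# Assumption: the downloaded audio uses one of these extensions.
AUDIO_EXTENSIONS = {".mp3", ".ogg"}


def build_audio_index(audio_dir):
    """Map each lower-cased word (the file stem) to its audio file name."""
    index = {}
    for path in sorted(Path(audio_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            # Keep the first file found for a word; later duplicates are ignored.
            index.setdefault(path.stem.lower(), path.name)
    return index


def sound_field(word, index):
    """Return an Anki [sound:...] tag for the word, or None if no audio exists."""
    name = index.get(word.strip().lower())
    return f"[sound:{name}]" if name else None
```

For the tag to play, the referenced file also has to be present in the Anki collection's media folder. Copying the files there and writing the tag into the right note field is left out of this sketch.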