411A/DuomeScraper


DuomeScraper

This project scrapes Italian words (or words in any other language available on Duolingo) from the Duome website (https://duome.eu/vocabulary/en/it) using Playwright, downloads pronunciation audio with Google Text-to-Speech (gTTS), and creates Anki flashcards.
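The audio and flashcard stages can be sketched roughly as follows. This is a minimal sketch, not the code in main.py: it assumes the genanki library is used to build the .apkg (the README does not say which library is used), and the helper names, model and deck IDs, field layout, and file names are all hypothetical.

```python
import re


def audio_filename(word: str) -> str:
    """Build a filesystem-safe .mp3 name for a word (hypothetical naming scheme)."""
    safe = re.sub(r"[^\w-]+", "_", word.strip().lower()).strip("_")
    return f"{safe}.mp3"


def sound_field(word: str) -> str:
    """Anki plays media that a note references with the [sound:...] tag."""
    return f"[sound:{audio_filename(word)}]"


def build_deck(words: dict[str, str], lang: str = "it", out: str = "duome.apkg") -> None:
    """Create audio and one note per word; `words` maps word -> definition."""
    # Imported lazily so the helpers above work without these packages installed.
    from gtts import gTTS
    import genanki

    model = genanki.Model(
        1607392319,  # arbitrary fixed model ID (assumption)
        "Duome word",
        fields=[{"name": "Word"}, {"name": "Definition"}, {"name": "Audio"}],
        templates=[{
            "name": "Card 1",
            "qfmt": "{{Word}} {{Audio}}",
            "afmt": '{{FrontSide}}<hr id="answer">{{Definition}}',
        }],
    )
    deck = genanki.Deck(2059400110, "Duome vocabulary")  # arbitrary fixed deck ID

    media = []
    for word, definition in words.items():
        path = audio_filename(word)
        gTTS(text=word, lang=lang).save(path)  # requires network access
        media.append(path)
        deck.add_note(
            genanki.Note(model=model, fields=[word, definition, sound_field(word)])
        )

    package = genanki.Package(deck)
    package.media_files = media  # bundle the .mp3 files into the .apkg
    package.write_to_file(out)
```

Calling `build_deck({"ragazzo": "boy"})` would, under these assumptions, write one .mp3 file and a `duome.apkg` containing a single card.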

🔰 How to run
  1. Download main.py & requirements.txt and put them inside a folder
  2. Create a virtual environment:
    python -m venv VEnv
        
  3. Activate virtual environment:
    • 🪟 Windows CMD:
      VEnv\Scripts\activate
              
    • 🐧 Linux:
      source VEnv/bin/activate
              
  4. Install dependencies:
    pip install -r requirements.txt
        
  5. Install the Playwright browsers (⚠️ the code uses the Microsoft Edge browser; you can switch it to Chromium if you don't want to download msedge):
    playwright install && playwright install msedge
        
  6. Read the code, since you may need to personalize some variables, then run main.py and wait for the final .apkg file to be generated.
  7. Open the Anki application and import the .apkg file:
    On Android: tap ⋮ in the top-right corner and select Import ➡️ Deck package (.apkg)
    On Desktop: File ➡️ Import... ➡️ choose the .apkg file
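For step 5, switching between Microsoft Edge and Chromium could look roughly like this. It is only a sketch of one way to do it with Playwright's sync API, not necessarily how main.py is written; `launch_options` and `fetch_html` are hypothetical helper names.

```python
def launch_options(use_edge: bool = True, headless: bool = True) -> dict:
    """Keyword arguments for Playwright's chromium.launch()."""
    options = {"headless": headless}
    if use_edge:
        # channel="msedge" tells Playwright to drive Microsoft Edge;
        # without a channel it uses its bundled Chromium build.
        options["channel"] = "msedge"
    return options


def fetch_html(url: str, use_edge: bool = True) -> str:
    """Open the page in the chosen browser and return its HTML."""
    # Imported lazily so launch_options() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(**launch_options(use_edge))
        page = browser.new_page()
        page.goto(url)
        html = page.content()
        browser.close()
        return html
```

With `use_edge=False`, running `playwright install chromium` should be enough and the msedge download can be skipped.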

⚠️ Known (possible) issues
  • If the word elements do not all load at once, the scraper would need to scroll down to retrieve every word. This has not been implemented yet, because the website currently displays all words at once (all necessary elements are visible after load).
  • Some languages, like German, don't have definitions. Accessing the definition element for such a language may raise an exception.
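Both issues could be handled defensively along these lines. This is a hedged sketch rather than part of main.py: the function names and any selectors passed to them are hypothetical, and the objects are expected to behave like Playwright's sync `Page` and `Locator`.

```python
def scroll_until_stable(page, selector: str, max_rounds: int = 20, pause_ms: int = 500) -> int:
    """Scroll down until the number of matching elements stops growing."""
    previous = -1
    current = page.locator(selector).count()
    rounds = 0
    while current != previous and rounds < max_rounds:
        previous = current
        page.mouse.wheel(0, 10_000)      # scroll down by 10,000 pixels
        page.wait_for_timeout(pause_ms)  # give lazily loaded items time to appear
        current = page.locator(selector).count()
        rounds += 1
    return current


def text_or_default(element, selector: str, default: str = "") -> str:
    """Return the text of the first match, or `default` if nothing matches."""
    matches = element.locator(selector)
    if matches.count() == 0:
        # e.g. a language without definitions: no exception, just a fallback
        return default
    return matches.first.inner_text().strip()
```

Checking `count()` before reading the text avoids waiting for (and failing on) an element that does not exist, so a word without a definition simply gets an empty definition field.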
