This Python script scrapes image URLs from a website hosting episodes of the Hong Kong comic book series "Pendekar Laut" and downloads the images into folders based on the episode URLs.
- Clone this repository:
  git clone https://github.com/mrdodgerx/pendekar-laut.git
- Install the required Python packages:
  pip install -r requirements.txt
- Modify the `MAIN_URL` variable in the script to the URL of the main page of the "Pendekar Laut" comic website.
- Run the script:
  python main.py
The script will fetch all episodes from the main page, find the "Read more" link for each episode, download the images from that link, and save them into folders named after each episode.
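
The parsing step described above can be sketched roughly as follows. This is not the code in main.py: it is a minimal illustration that assumes the episode link text is literally "Read more" and that page images are plain `<img src=...>` tags. The names `LinkAndImageParser` and `parse_page` are hypothetical, and the sketch uses the standard library's `html.parser` rather than Beautiful Soup so that it runs without extra packages.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndImageParser(HTMLParser):
    """Collect 'Read more' link targets and image sources from one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.read_more_links = []
        self.image_urls = []
        self._current_href = None  # href of the <a> tag we are inside, if any
        self._current_text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._current_href = attrs["href"]
            self._current_text = []
        elif tag == "img" and attrs.get("src"):
            # Resolve relative image paths against the page URL.
            self.image_urls.append(urljoin(self.base_url, attrs["src"]))

    def handle_data(self, data):
        if self._current_href is not None:
            self._current_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._current_text).strip().lower()
            # Assumption: the episode link is labelled exactly "Read more".
            if text == "read more":
                self.read_more_links.append(
                    urljoin(self.base_url, self._current_href)
                )
            self._current_href = None


def parse_page(html, base_url):
    """Return (read_more_links, image_urls) found in the given HTML string."""
    parser = LinkAndImageParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.read_more_links, parser.image_urls
```

In a full run, `parse_page` would be called once on the main page to find the episode links, then once per episode page to collect the image URLs to download.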
- requests: For making HTTP requests to fetch webpage content.
- Beautiful Soup: For parsing HTML content.
- fake_headers: For generating fake user-agent headers to avoid bot detection.
- urllib: For parsing URLs.
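
As an illustration of how urllib can turn an episode URL into a folder name, here is a small sketch. It assumes the folder is named after the last path segment of the episode URL, which may differ from what main.py actually does. The helper names `folder_name_from_url` and `image_path`, the `downloads` root, and the example.com URLs are all hypothetical.

```python
import os
from urllib.parse import urlparse, unquote


def folder_name_from_url(episode_url):
    """Use the last non-empty path segment of the URL as the folder name."""
    path = urlparse(episode_url).path
    segments = [s for s in path.split("/") if s]
    # Fall back to a generic name when the URL has no path segments.
    return unquote(segments[-1]) if segments else "episode"


def image_path(episode_url, image_url, root="downloads"):
    """Build the local file path for one image of one episode."""
    folder = os.path.join(root, folder_name_from_url(episode_url))
    # urlparse drops the query string, so "page01.jpg?x=1" becomes "page01.jpg".
    filename = os.path.basename(urlparse(image_url).path) or "image"
    return os.path.join(folder, filename)


if __name__ == "__main__":
    print(image_path("https://example.com/ep-1/",
                     "https://cdn.example.com/a/b/page01.jpg?x=1"))
```

Before writing a file, the script would also need to create the folder, for example with `os.makedirs(folder, exist_ok=True)`.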
"Pendekar Laut" (Tiger Shark) is a popular Hong Kong comic book series written and illustrated by Wan Yat Leung. The story revolves around the adventures of the titular character, Pai Cheung Lang, a skilled martial artist. It is set in a fictional world inspired by Chinese martial arts and nautical themes, where Pai Cheung Lang embarks on various quests and battles formidable adversaries while seeking justice and protecting the innocent.
This project is licensed under the MIT License - see the LICENSE file for details.