Releases · eezd/EhFavDL

27 Oct 08:59

eezd

v1.2.5-beta.1

8f89edf

v1.2.5-beta.1 Latest

Fix: Image URLs could not be retrieved after the EH website updated its pages.
Fix: An error was raised when the image list URL was None.
Fix: Streamed image downloads were left incomplete when the transfer was interrupted.
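
As an illustration of the last fix, here is a minimal sketch of one way to avoid keeping a truncated file after an interrupted stream. It is not the project's code: save_stream, its parameters, and the ".part" suffix are hypothetical names, and it assumes the expected size is known in advance (for example from a Content-Length header).

```python
import os
import tempfile
from typing import Iterable, Optional


def save_stream(chunks: Iterable[bytes], dest: str, expected_size: Optional[int]) -> bool:
    """Write chunks to a temporary file and move it to dest only if the size matches."""
    directory = os.path.dirname(os.path.abspath(dest))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    written = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
                written += len(chunk)
        if expected_size is not None and written != expected_size:
            return False  # truncated transfer: the caller should retry
        os.replace(tmp_path, dest)  # move the complete file into place
        return True
    finally:
        # Remove the temporary file if it was not moved (size mismatch or exception).
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

With this shape, dest only ever appears once the whole body has been written, so a partial image is never mistaken for a finished one.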

Full Changelog: v1.2.4...v1.2.5-beta.1

Assets 2

10 Sep 14:28

eezd

v1.2.4

cfffd46

v1.2.4

🔔ZIPtoCBZ

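import os


def rename_zip_to_cbz(zip_path: str) -> None:
    """Rename a single .zip file to .cbz, keeping the rest of the path unchanged."""
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    if os.path.exists(cbz_path):
        # Do not overwrite an existing .cbz with the same base name.
        print(f"Skipped: {cbz_path} already exists")
        return
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")


def convert_all_zip_to_cbz_in_directory(directory: str) -> None:
    """Walk the directory tree and rename every .zip file found."""
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith('.zip'):
                rename_zip_to_cbz(os.path.join(root, file))


# Replace with the folder that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)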

🔈What's Changed

v1.2.4 Official Release

The update log covers changes from v1.2.3 to v1.2.4.

Feat

Added support for automatic gallery downloads: run python main.py -w. See the documentation for details.
Added the config.yaml fields watch_fav_ids, watch_lan_status, and watch_archive_status, which are used together with watch mode. See the documentation for details.
Added the ability to rename files and folders based on gid.

Fix

Fully resolved the ssl:default and connection timeout errors that occurred when downloading galleries via the web.

Cause: Gallery images are downloaded through H@H servers, and some H@H servers may be unreachable, which triggers these errors. Refreshing the gallery URL resolves them.
Fixed an error where DownloadWebGallery could not handle the string reload_image returned by fetch_data.
Fixed the wait time being calculated incorrectly when the message "This IP address has been temporarily banned due to an excessive request rate" reports the remaining time in hours.
Reduced base64_max_len in Support().rename_cbz_file() from 280 to 196 and the maximum length from 114 to 90 (reductions of 30% and about 21%, respectively). Without this change, LANraragi fails to recognize the file, which causes the program to hang.
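
The wait-time fix above can be pictured with a small parser. This is a sketch only: ban_wait_seconds is a hypothetical helper, and it assumes the ban message spells out the remaining time as pairs such as "1 hours and 2 minutes and 3 seconds"; the exact wording is not given in these notes.

```python
import re

# Seconds per unit; plural forms are handled by the optional trailing "s" in the pattern.
_UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}
_DURATION = re.compile(r"(\d+)\s+(day|hour|minute|second)s?")


def ban_wait_seconds(message: str) -> int:
    """Sum every '<number> <unit>' pair found in a ban message (0 if none is found)."""
    return sum(int(n) * _UNIT_SECONDS[unit] for n, unit in _DURATION.findall(message))
```

Handling every unit through one table means a message that includes hours is covered by the same code path as one that only mentions minutes and seconds.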

Refactor

Changed the gallery file format from zip to cbz. Rename your existing files accordingly, or they will not be recognized.
Refactored and subdivided Watch() and several other parts of the code.

For the remaining changes, the logger lists each operation as it runs, and the code is annotated, so they are not itemized here.

Warn

In previous versions, both 1280x and Original gallery formats could coexist and be downloaded together.

However, from version 1.2.4 onward, once you have downloaded the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will download the 1280x format.

If you have already downloaded both formats, usage is unaffected.

Full Changelog: v1.2.3...v1.2.4

Assets 2

07 Sep 16:29

eezd

v1.2.4-beta.4

ded97ef

v1.2.4-beta.4

fix: real_url returns "reload_image"

Full Changelog: v1.2.4-beta.3...v1.2.4-beta.4

Assets 2

05 Sep 15:48

eezd

v1.2.4-beta.3

e32a10b

v1.2.4-beta.3

Fix: Completely resolve the ssl:default & connection timeout issues when downloading galleries via the web.

Cause: When downloading gallery images, we use various H@H servers. Some of these H@H servers might be inaccessible, leading to the aforementioned issues. Simply refreshing the gallery address resolves the problem.

Solution: Refresh the image address.

Full Changelog: v1.2.4-beta.2...v1.2.4-beta.3

Assets 2

05 Sep 09:18

eezd

v1.2.4-beta.2

db04bc4

v1.2.4-beta.2

fix real_url is False

fix DH_KEY_TOO_SMALL

Full Changelog: v1.2.4-beta.1...v1.2.4-beta.2

Assets 2

02 Sep 16:09

eezd

v1.2.4-beta.1

f9b3fb0

v1.2.4-beta.1

🔔ZIPtoCBZ

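import os


def rename_zip_to_cbz(zip_path: str) -> None:
    """Rename a single .zip file to .cbz, keeping the rest of the path unchanged."""
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    if os.path.exists(cbz_path):
        # Do not overwrite an existing .cbz with the same base name.
        print(f"Skipped: {cbz_path} already exists")
        return
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")


def convert_all_zip_to_cbz_in_directory(directory: str) -> None:
    """Walk the directory tree and rename every .zip file found."""
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if file.lower().endswith('.zip'):
                rename_zip_to_cbz(os.path.join(root, file))


# Replace with the folder that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)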

🔈What's Changed

It has been more than six months since the project was established, and the codebase has grown significantly. Although I've conducted several code reviews, it's inevitable that some bugs may have been missed. If you encounter any, please consider opening an issue. Thank you 🙏

This update completes the -w feature, and there will likely be no major new features in the future, just routine maintenance and minor optimizations.

Feat

python main.py -w now automatically downloads galleries. See documentation for details
New fields added to config.yaml: watch_fav_ids, watch_lan_status, watch_archive_status. See documentation for details

Refactor

Gallery files have changed from zip to cbz. Please make the necessary adjustments, or they won't be recognized.
Refactored and reorganized Watch() and several other parts of the code.

For the remaining changes, you'll see a series of operations listed by the logger during use. I've also provided relevant code comments, so I won't detail all the changes here.

Warn

In older versions, both 1280x and Original formats could coexist and be downloaded together.

However, in version >=1.2.4, if you download the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will allow you to download the 1280x format. You can only choose one format to download.

Of course, if you've already downloaded both formats, it won't affect your usage.

Full Changelog: v1.2.3.1...v1.2.4-beta.1

Assets 2

01 Sep 09:36

eezd

v1.2.3.1

6ee2af3

v1.2.3.1

This update primarily addresses bugs related to network requests.

Fixes

FIX: "This IP address has been temporarily banned due to an excessive request rate"
FIX: get_image_limits

connect_limit: 3. It is best not to set this value higher than 3; otherwise you may encounter the errors listed below (based on the experience of downloading 100 galleries consecutively).

Below are the most important exception handlers added in this update:

Server disconnected
- Preliminary assessment: EH limits the number of requests per minute/hour, which leads to request timeouts and blank pages.
CERTIFICATE_VERIFY_FAILED & ssl:default
- This is a certificate problem: sometimes the response carries no SSL certificate, or an invalid one.
- As a result, aiohttp raises an error. The workaround is to skip certificate verification (ssl=False).

So far, 80 galleries have been downloaded during testing without any issues.
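
To make the connect_limit advice concrete, here is a minimal sketch of capping concurrent requests with asyncio.Semaphore. How the project applies connect_limit internally is not shown in these notes, so fetch_all and its fetch callback are hypothetical stand-ins; only the value 3 comes from the text above.

```python
import asyncio

CONNECT_LIMIT = 3  # value recommended in the release note


async def fetch_all(urls, fetch, limit: int = CONNECT_LIMIT):
    """Run fetch(url) for every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url):
        # Wait for a free slot before starting the request.
        async with semaphore:
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))
```

Any coroutine that takes a URL can be passed as fetch, for example a wrapper around an HTTP client call.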

Full Changelog: v1.2.3...v1.2.3.1

Assets 2

11 Aug 17:13

eezd

v1.2.3

af76c75

v1.2.3

🔈What's Changed

Added

IP quota-based download (only applicable to DownloadWebGallery()) @bf179 #2
Added tags_translation field in config.yaml for Chinese tag translation @bf179 #2
Implemented hash check for duplicate files in DownloadWebGallery() @bf179 #2
Added sk and hath_perks fields to cookies in config.yaml
- If not set, the IP quota will drop from 50,000 to 5,000 (depending on your account)
Check if a new version of the gallery exists (AddFavData().apply()).
Added annotations
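
The duplicate-file hash check listed above could look roughly like the following. This is a sketch under assumptions: the notes do not say which hash DownloadWebGallery() uses or what it compares against, so SHA-256 and the helper names file_digest and is_duplicate are illustrative choices.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 16) -> str:
    """Hash a file in fixed-size blocks so large images are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def is_duplicate(existing: Path, new_data: bytes) -> bool:
    """Return True when a file already exists at `existing` with the same content as new_data."""
    if not existing.exists():
        return False
    return file_digest(existing) == hashlib.sha256(new_data).hexdigest()
```

Comparing digests rather than file names means a same-named file with different content is still replaced, while an identical file is skipped.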

Fixed

The returned image address became 509.gif when the IP quota was exhausted @bf179 #2
Moved the tag field out of eh_data into two separate tables, tag_list and gid_tid @bf179 #2
copyright was not recognized correctly
DownloadWebGallery() no longer downloads duplicate images; images are now named in the form 00000001.jpg
Now catches the "TLS/SSL connection has been closed (EOF) (_ssl.c:1006)" error
Now catches the "Your IP address has been temporarily banned for excessive pageloads" error
current_key could not be retrieved
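
The 00000001.jpg naming mentioned above is an eight-digit, zero-padded page number. A minimal sketch, assuming pages are numbered from 1; image_filename is a hypothetical helper, not a function from the project:

```python
def image_filename(index: int, extension: str = "jpg", width: int = 8) -> str:
    """Build a zero-padded file name such as 00000001.jpg for page `index`."""
    return f"{index:0{width}d}.{extension}"
```

Zero-padding keeps lexicographic order equal to page order, so page 10 sorts after page 2 in any file browser or archive reader.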

Note

This update includes a change to the database fields: the tag field has been removed from the eh_data table.

When using DownloadWebGallery(), the IP quota (image_limits) must now stay below 80% of total_limits. If this threshold is reached, the program pauses until the quota has recovered to below 30% of total_limits, and only then continues (total_limits * 30% = lower_value < image_limits < total_limits * 80%).

If you don't like this setting, please modify the Config().wait_image_limits() code.
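
The two thresholds form a simple hysteresis: pause at 80%, resume only below 30%. The sketch below illustrates that rule; it is not the body of Config().wait_image_limits(). The function name next_paused_state is hypothetical, and treating the boundaries as inclusive (>=) is an assumption, since the notes do not specify it.

```python
def next_paused_state(image_limits: int, total_limits: int, paused: bool) -> bool:
    """Return whether downloading should be paused after observing the current quota."""
    upper = total_limits * 0.8  # pausing starts at 80% of the quota
    lower = total_limits * 0.3  # downloading resumes once usage is below 30%
    if paused:
        # Stay paused until usage has recovered to below the lower bound.
        return image_limits >= lower
    # While running, pause only when the upper bound is reached.
    return image_limits >= upper
```

For a total_limits of 5000, downloading pauses at 4000 and resumes once the value is under 1500. Changing the two factors is the kind of adjustment the note above refers to.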

🔔 Migration from Previous Versions

v1.2.1 <= version <= v1.2.2

Run the following code:

import sqlite3

dbs_name = "./data.db"

with sqlite3.connect(dbs_name) as co:
    print("RENAME eh_data TO old_eh_data...")
    co.execute('ALTER TABLE eh_data RENAME TO old_eh_data')
    co.commit()

    print("CREATE TABLE eh_data...")
    co.execute('''
    CREATE TABLE IF NOT EXISTS "eh_data" (
        "gid" INTEGER PRIMARY KEY NOT NULL,
        "token" TEXT NOT NULL,
        "title" TEXT,
        "title_jpn" TEXT,
        "category" TEXT /*Category*/,
        "thumb" TEXT /*Thumbnail URL*/,
        "uploader" TEXT /*Uploader*/,
        "posted" TEXT /*Posted Time*/,
        "filecount" INTEGER /*Page Count*/,
        "filesize" INTEGER /*Size*/,
        "expunged" INTEGER NOT NULL DEFAULT 0 /*Expunged*/,
        "copyright_flag" INTEGER NOT NULL DEFAULT 0 /*Copyrighted*/,
        "rating" TEXT /*Rating*/,
        "current_gid" INTEGER /*Latest gallery gid*/,
        "current_token" TEXT /*Latest gallery token*/
    )''')
    co.commit()

    print("Move gid and token from old_eh_data to new eh_data...")
    co.execute('''
    INSERT INTO eh_data (gid, token)
    SELECT gid, token FROM old_eh_data;
    ''')
    co.commit()

    print("Data transfer complete.")

Download the new version of the code.
Run 2. Update Gallery Metadata (Update Tags).

Done.

Full Changelog: v1.2.2...v1.2.3

Contributors

bf179

Assets 2

20 Jul 12:44

eezd

v1.2.2

047a906

v1.2.2

Please read the v1.2.0 release notes first.

Please read the v1.2.1 release notes first.

🔈 Update Details

Fixed the error in AddFavData().delete_fav_category_del_flag()
Improved detection of the del_flag

Full Changelog: v1.2.1...v1.2.2

Assets 2

09 Jul 14:16

eezd

v1.2.1

30b884b

v1.2.1

Please read the v1.2.0 release notes first.

🔈 What's New

DownloadWebGallery no longer uses Streaming Response Content for retrieving images. Additionally, it now supports resuming downloads, as long as the content remains in web/tmp.
💥 The aiohttp library has replaced httpx.
💥 All EH requests are now handled by Config().fetch_data() and Config().fetch_data_stream(). This centralizes the handling of all network-related exceptions and checks.
Removed a substantial amount of early, messy code. 👍

async def check_fetch_err(self, response, msg):
    # Only inspect textual bodies; images and other binary responses are skipped.
    content_type = response.headers.get('Content-Type', '').lower()
    if 'text' in content_type or 'json' in content_type or 'html' in content_type:
        content = await response.text()
        if "IP quota exhausted" in content:
            logger.warning("IP quota exhausted.")
            raise Exception("IP quota exhausted")
        elif "You have clocked too many downloaded bytes on this gallery" in content:
            logger.warning("You have clocked too many downloaded bytes on this gallery.")
            logger.warning("Please open Gallery---Archive Download---Cancel")
            logger.warning(msg)
            raise Exception("You have clocked too many downloaded bytes on this gallery")
        elif response.status != 200:
            # aiohttp responses expose the HTTP code as .status (httpx used .status_code).
            logger.warning(f"code: {response.status}")
            logger.warning(f'{content}: {msg}')
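
The resume behaviour described in the first item above (downloads continue as long as the content is still in web/tmp) can be sketched as a check for which pages are still missing. This is an illustration only: pending_pages is a hypothetical helper, and the eight-digit file naming and the "non-empty file means finished" rule are assumptions rather than details from these notes.

```python
from pathlib import Path
from typing import List


def pending_pages(tmp_dir: Path, page_count: int) -> List[int]:
    """Return the 1-based page numbers that have no non-empty file in tmp_dir yet."""
    done = {p.stem for p in tmp_dir.glob("*") if p.is_file() and p.stat().st_size > 0}
    return [n for n in range(1, page_count + 1) if f"{n:08d}" not in done]
```

On restart, only the returned page numbers need to be requested again; everything already present in the temporary folder is reused.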

Full Changelog: v1.2.0...v1.2.1

Assets 2
