Releases: eezd/EhFavDL

v1.2.5-beta.1

27 Oct 08:59

Fix: The EH website changed its pages, which prevented image URLs from being retrieved.
Fix: An error was raised when the image list URL was None.
Fix: Streamed image downloads were left incomplete when the connection was interrupted.
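
The third fix concerns streamed downloads that stop partway. The release notes do not show the actual fix, so the following is only a minimal sketch of one common approach: write to a temporary .part file and rename it into place only when the byte count matches the expected size. The function name `save_stream` and its parameters are hypothetical and are not part of EhFavDL.

```python
import os
from typing import Iterable, Optional


def save_stream(chunks: Iterable[bytes], dest: str, expected_size: Optional[int]) -> bool:
    """Write chunks to dest through a temporary .part file.

    Returns True only if the stream ended without an error and, when
    expected_size is known, the number of bytes written matches it.
    """
    part = dest + ".part"
    written = 0
    try:
        with open(part, "wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
                written += len(chunk)
    except Exception:
        # Any error while reading the stream counts as an interruption.
        if os.path.exists(part):
            os.remove(part)
        return False
    if expected_size is not None and written != expected_size:
        # The stream ended early: discard the partial file.
        os.remove(part)
        return False
    os.replace(part, dest)  # dest never holds a partial file
    return True
```

With this shape, a caller can simply retry whenever the function returns False, because an incomplete image never appears under its final name.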

Full Changelog: v1.2.4...v1.2.5-beta.1

v1.2.4

10 Sep 14:28

🔈What's Changed

v1.2.4 Official Release

This changelog covers the changes from v1.2.3 to v1.2.4.

Feat

  1. Added automatic gallery downloads: python main.py -w. See the documentation for details.

  2. Added the watch_fav_ids, watch_lan_status, and watch_archive_status fields to config.yaml for use with watch mode. See the documentation for details.

  3. Added renaming of files/folders based on gid.

Fix

  1. Fully resolved the ssl:default and connection timeout errors when downloading galleries via the web.

    Cause: Gallery images are downloaded from H@H servers, and some H@H servers may be unreachable, which triggers these errors. Refreshing the gallery address resolves the problem.

  2. Fixed an error where DownloadWebGallery could not handle the string reload_image returned by fetch_data.

  3. Fixed the wait time being calculated incorrectly when the "This IP address has been temporarily banned due to an excessive request rate" message gave the duration in hours.

  4. Reduced base64_max_len in Support().rename_cbz_file() from 280 to 196 (30% less) and the maximum length from 114 to 90 (about 21% less). Longer names were not recognized by LANraragi and caused the program to hang.

Refactor

  1. Gallery files now use the cbz extension instead of zip. Please rename existing files accordingly, or they will not be recognized.

  2. Refactored and subdivided Watch() and several other parts of the code.

For the remaining changes, the logger lists the operations it performs at run time, and the code is commented, so they are not itemized here.

Warn

In previous versions, the 1280x and Original formats of a gallery could coexist and both be downloaded.

From v1.2.4 on, once you have downloaded the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will download the 1280x format.

If you have already downloaded both formats, usage is not affected.

🔔ZIPtoCBZ

import os

def rename_zip_to_cbz(zip_path: str):
    # Swap only the extension; the archive contents are unchanged.
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")

def convert_all_zip_to_cbz_in_directory(directory: str):
    # Walk the directory tree and rename every .zip file found.
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.zip'):
                zip_path = os.path.join(root, file)
                rename_zip_to_cbz(zip_path)

# Replace with the directory that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)
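
Fix item 3 above is about turning the ban message into a wait time when the duration is given in hours. The notes do not quote the full message, so this sketch assumes it ends with wording such as "The ban expires in 2 hours and 56 minutes". The function name `ban_wait_seconds` and that wording are assumptions, not taken from the project.

```python
import re

# Seconds per unit; the unit names are assumed to appear in the ban message.
_UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}
_DURATION = re.compile(r"(\d+)\s+(day|hour|minute|second)s?")


def ban_wait_seconds(message: str) -> int:
    """Sum every '<number> <unit>' pair found in the message, in seconds."""
    return sum(int(n) * _UNIT_SECONDS[unit] for n, unit in _DURATION.findall(message))
```

Summing every unit that appears, rather than reading only the minutes, is what keeps the result correct once an hours component shows up.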

Full Changelog: v1.2.3...v1.2.4

v1.2.4-beta.4

07 Sep 16:29

fix: real_url returns "reload_image"

Full Changelog: v1.2.4-beta.3...v1.2.4-beta.4

v1.2.4-beta.3

05 Sep 15:48

Fix: Fully resolved the ssl:default and connection timeout errors when downloading galleries via the web.

Cause: Gallery images are downloaded from H@H servers, and some of those servers may be unreachable, which triggers these errors.

Solution: Refresh the image address.
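
The idea of refreshing the image address can be sketched as a retry loop that asks for a new address before every attempt, so a retry is not sent to the same unreachable server. This is an illustration only: `download_with_refresh`, `get_image_url`, and `download` are hypothetical names, and the real logic lives in the project's own download code.

```python
from typing import Callable


def download_with_refresh(
    get_image_url: Callable[[], str],
    download: Callable[[str], bytes],
    max_attempts: int = 3,
) -> bytes:
    """Try to download, resolving a fresh image address for every attempt."""
    last_error = None
    for _ in range(max_attempts):
        url = get_image_url()  # re-resolve, so a retry may reach another server
        try:
            return download(url)
        except OSError as exc:
            # ConnectionError, TimeoutError and ssl.SSLError are all OSError subclasses.
            last_error = exc
    raise RuntimeError(f"all {max_attempts} attempts failed") from last_error
```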

Full Changelog: v1.2.4-beta.2...v1.2.4-beta.3

v1.2.4-beta.2

05 Sep 09:18

v1.2.4-beta.1

02 Sep 16:09

🔈What's Changed

More than six months have passed since this project began, and the codebase has grown considerably. I have reviewed the code several times, but some bugs have inevitably slipped through. If you run into one, please open an issue. Thank you 🙏

This update completes the -w feature. After this there is little left to add, so future work will mostly be routine maintenance and module optimization.

Feat

  • python main.py -w downloads galleries automatically. See the documentation for details.

  • Added the watch_fav_ids, watch_lan_status, and watch_archive_status fields to config.yaml. See the documentation for details.

Refactor

  • Gallery files now use the cbz extension instead of zip. Please rename existing files accordingly, or they will not be recognized.

  • Refactored and subdivided Watch() and several other parts of the code.

For the remaining changes, the logger lists the operations it performs at run time, and the code is commented, so they are not itemized here.

Warn

In older versions, the 1280x and Original formats of a gallery could coexist and both be downloaded.

From v1.2.4 on, once you have downloaded the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will download the 1280x format. You can only choose one format to download.

If you have already downloaded both formats, usage is not affected.

🔔ZIPtoCBZ

import os

def rename_zip_to_cbz(zip_path: str):
    # Swap only the extension; the archive contents are unchanged.
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")

def convert_all_zip_to_cbz_in_directory(directory: str):
    # Walk the directory tree and rename every .zip file found.
    for root, _dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.zip'):
                zip_path = os.path.join(root, file)
                rename_zip_to_cbz(zip_path)

# Replace with the directory that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)

Full Changelog: v1.2.3.1...v1.2.4-beta.1

v1.2.3.1

01 Sep 09:36

This update mainly fixes bugs in network requests.

Fixes

  • FIX: "This IP address has been temporarily banned due to an excessive request rate"
  • FIX: get_image_limits

connect_limit: 3. It is best to keep this value at 3 or lower; otherwise you may run into the errors listed below (based on experience downloading 100 galleries in a row).

The most important exceptions now caught in this update:

  • Server disconnected

    • Preliminary diagnosis: EH limits the number of requests per minute/hour, which leads to request timeouts and blank pages.
  • CERTIFICATE_VERIFY_FAILED & ssl:default

    • This is a certificate problem: some responses come with no SSL certificate or with an invalid one.
    • That makes aiohttp raise an error. The fix is to skip certificate verification (ssl=False).

In testing so far, 80 galleries have been downloaded without problems.
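
The notes do not say how connect_limit is enforced internally. As a rough illustration of what a limit of 3 means, the sketch below caps the number of requests in flight with an asyncio.Semaphore. `fetch_all` and the `fetch` callable are hypothetical stand-ins, not the project's API.

```python
import asyncio

CONNECT_LIMIT = 3  # mirrors "connect_limit: 3" from the notes above


async def fetch_all(urls, fetch, limit=CONNECT_LIMIT):
    """Run fetch(url) for every URL with at most `limit` requests in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(url):
        async with semaphore:  # waits here while `limit` requests are active
            return await fetch(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(u) for u in urls))
```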

Full Changelog: v1.2.3...v1.2.3.1

v1.2.3

11 Aug 17:13

🔈What's Changed

Added

  • IP quota-based downloading (DownloadWebGallery() only) @bf179 #2
  • Chinese tag translation, enabled through the new tags_translation field in config.yaml @bf179 #2
  • Hash check in DownloadWebGallery() when a duplicate file already exists @bf179 #2
  • Added sk and hath_perks to cookies in config.yaml
    • If they are not set, the IP quota drops from 50,000 to 5,000 (depending on your account)
  • Detection of newer gallery versions (AddFavData().apply())
  • Added code comments

Fixed

  • The returned image address became 509.gif when the IP quota was insufficient @bf179 #2
  • Moved the tag field out of eh_data into two separate tables, tag_list and gid_tid @bf179 #2
  • copyright was not recognized correctly
  • DownloadWebGallery() no longer downloads duplicate images; images are now saved with names like 00000001.jpg
  • Now catches the TLS/SSL connection has been closed (EOF) (_ssl.c:1006) error
  • Now catches the Your IP address has been temporarily banned for excessive pageloads error
  • current_key could not be retrieved
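
Two of the items above, the 00000001.jpg naming and the hash check for duplicate files, can be illustrated together. The notes do not state which hash the project uses or how it pads names, so SHA-256 and the eight-digit padding below are assumptions inferred from the example file name. `page_filename` and `is_same_file` are hypothetical helpers.

```python
import hashlib
from pathlib import Path


def page_filename(index: int, ext: str = ".jpg") -> str:
    """Zero-pad the page number to eight digits, e.g. 1 -> 00000001.jpg."""
    return f"{index:08d}{ext}"


def is_same_file(path: Path, data: bytes) -> bool:
    """Return True if a file already exists at path with identical content."""
    if not path.is_file():
        return False
    on_disk = hashlib.sha256(path.read_bytes()).hexdigest()
    return on_disk == hashlib.sha256(data).hexdigest()
```

Fixed-width names sort in page order, and comparing digests lets a downloader skip a page that is already on disk without trusting the file name alone.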

Note

This update changes the database schema: the tag field has been removed from the eh_data table.

When using DownloadWebGallery(), IP quota usage (image_limits) must now stay below 80% of total_limits. Once that threshold is reached, the program waits until usage recovers to 30% of total_limits before continuing (total_limits * 30% = lower_value < image_limits < total_limits * 80%).

If you don't want this behavior, modify the code in Config().wait_image_limits().
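
The 80%/30% rule is a two-threshold (hysteresis) check. The sketch below shows the decision on its own, assuming image_limits is the quota already used. It is not the body of Config().wait_image_limits(), and `should_pause` is a hypothetical name.

```python
def should_pause(image_limits: int, total_limits: int, paused: bool) -> bool:
    """Decide whether downloading should be paused.

    Pausing starts once usage reaches 80% of the total and ends only
    after usage has recovered to 30% of the total.
    """
    upper = total_limits * 80 // 100
    lower = total_limits * 30 // 100
    if paused:
        # Already waiting: keep waiting until usage is back at the lower bound.
        return image_limits > lower
    return image_limits >= upper
```

Using two thresholds instead of one avoids pausing and resuming repeatedly while usage hovers around 80%.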

🔔 Migration from Previous Versions

  • v1.2.1 <= version <= v1.2.2
  1. Run the following code:
import sqlite3

dbs_name = "./data.db"

with sqlite3.connect(dbs_name) as co:
    print("RENAME eh_data TO old_eh_data...")
    co.execute('ALTER TABLE eh_data RENAME TO old_eh_data')
    co.commit()

    print("CREATE TABLE eh_data...")
    co.execute('''
    CREATE TABLE IF NOT EXISTS "eh_data" (
        "gid" INTEGER PRIMARY KEY NOT NULL,
        "token" TEXT NOT NULL,
        "title" TEXT,
        "title_jpn" TEXT,
        "category" TEXT /*Category*/,
        "thumb" TEXT /*Thumbnail URL*/,
        "uploader" TEXT /*Uploader*/,
        "posted" TEXT /*Posted Time*/,
        "filecount" INTEGER /*Page Count*/,
        "filesize" INTEGER /*Size*/,
        "expunged" INTEGER NOT NULL DEFAULT 0 /*Expunged*/,
        "copyright_flag" INTEGER NOT NULL DEFAULT 0 /*Copyrighted*/,
        "rating" TEXT /*Rating*/,
        "current_gid" INTEGER /*Latest gallery gid*/,
        "current_token" TEXT /*Latest gallery token*/
    )''')
    co.commit()

    print("Move gid and token from old_eh_data to new eh_data...")
    co.execute('''
    INSERT INTO eh_data (gid, token)
    SELECT gid, token FROM old_eh_data;
    ''')
    co.commit()

    print("Data transfer complete.")
  2. Download the new version of the code.

  3. Run "2. Update Gallery Metadata (Update Tags)".

Done.
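
After running the migration script, it may be worth confirming that every row was carried over before removing old_eh_data. The following check is a sketch and not part of the project; it relies only on the table names used in the script above, and `migration_row_counts` is a hypothetical helper.

```python
import sqlite3
from typing import Tuple


def migration_row_counts(connection: sqlite3.Connection) -> Tuple[int, int, int]:
    """Return (rows in old_eh_data, rows in eh_data, gids missing from eh_data)."""
    old = connection.execute("SELECT COUNT(*) FROM old_eh_data").fetchone()[0]
    new = connection.execute("SELECT COUNT(*) FROM eh_data").fetchone()[0]
    missing = connection.execute(
        "SELECT COUNT(*) FROM old_eh_data "
        "WHERE gid NOT IN (SELECT gid FROM eh_data)"
    ).fetchone()[0]
    return old, new, missing
```

Calling it with sqlite3.connect("./data.db") should report equal counts and zero missing gids if the copy succeeded.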

Full Changelog: v1.2.2...v1.2.3

v1.2.2

20 Jul 12:44

Please read the v1.2.0 release notes first.

Please read the v1.2.1 release notes first.


🔈 Update Details

  • Fixed the error raised by AddFavData().delete_fav_category_del_flag()
  • Improved detection of del_flag

Full Changelog: v1.2.1...v1.2.2

v1.2.1

09 Jul 14:16

Please read the v1.2.0 release notes first.


🔈 What's New

  1. DownloadWebGallery no longer uses Streaming Response Content to fetch images. It also supports resuming downloads, as long as the downloaded content is still in web/tmp.
  2. 💥 aiohttp now replaces httpx.
  3. 💥 All EH requests now go through Config().fetch_data() and Config().fetch_data_stream(), which also handle every network-related exception and check.
  4. Removed a large amount of messy early code. 👍
async def check_fetch_err(self, response, msg):
    # Only inspect textual bodies; other content types are skipped.
    content_type = response.headers.get('Content-Type', '').lower()
    if 'text' in content_type or 'json' in content_type or 'html' in content_type:
        content = await response.text()
        if "IP quota exhausted" in content:
            logger.warning("IP quota exhausted.")
            raise Exception("IP quota exhausted")
        elif "You have clocked too many downloaded bytes on this gallery" in content:
            logger.warning("You have clocked too many downloaded bytes on this gallery.")
            logger.warning("Please open Gallery---Archive Download---Cancel")
            logger.warning(msg)
            raise Exception("You have clocked too many downloaded bytes on this gallery")
        elif response.status != 200:
            # aiohttp responses expose the HTTP code as .status (not .status_code)
            logger.warning(f"code: {response.status}")
            logger.warning(f'{content}: {msg}')

Full Changelog: v1.2.0...v1.2.1