Releases: eezd/EhFavDL
v1.2.5-beta.1
Fix: The EH website changed its page layout, which made image URLs impossible to retrieve.
Fix: Error raised when the image list URL is None.
Fix: Images left incomplete when a streaming download was interrupted.
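One common way to guard against truncated streaming downloads is to write to a temporary file and move it into place only once the byte count matches the expected size. The sketch below illustrates that idea under stated assumptions; it is not the project's actual code, and `save_stream`, its parameters, and the `.part` suffix are hypothetical names chosen for illustration.

```python
import os
from typing import Iterable


def save_stream(chunks: Iterable[bytes], dest: str, expected_size: int) -> bool:
    """Write chunks to dest only if exactly expected_size bytes arrived."""
    part_path = dest + ".part"  # hypothetical suffix for the in-progress file
    written = 0
    try:
        with open(part_path, "wb") as fh:
            for chunk in chunks:
                fh.write(chunk)
                written += len(chunk)
    except OSError:
        written = -1  # treat I/O and connection failures as incomplete
    if written != expected_size:
        # Interrupted or truncated: discard the partial file so it can be retried.
        if os.path.exists(part_path):
            os.remove(part_path)
        return False
    os.replace(part_path, dest)  # atomic when both paths share a filesystem
    return True
```

In an async client the chunks would come from the response stream and `expected_size` from the `Content-Length` header, when the server sends one.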
Full Changelog: v1.2.4...v1.2.5-beta.1
v1.2.4
🔔ZIPtoCBZ
import os

def rename_zip_to_cbz(zip_path: str):
    # Swap the .zip extension for .cbz; the archive contents are unchanged.
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")

def convert_all_zip_to_cbz_in_directory(directory: str):
    # Walk the directory tree and rename every .zip file found.
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.zip'):
                zip_path = os.path.join(root, file)
                rename_zip_to_cbz(zip_path)

# Replace with the folder that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)
🔈What's Changed
v1.2.4 Official Release
This changelog covers changes from v1.2.3 to v1.2.4.
Feat
- Added support for automatic gallery downloads: python main.py -w. See the documentation for details.
- Added the watch_fav_ids, watch_lan_status, and watch_archive_status fields to config.yaml for use with watch mode. See the documentation for details.
- Added the ability to rename files/folders based on gid.
Fix
- Fully resolved the ssl:default and connection timeout errors when downloading galleries via the web. Cause: gallery images are downloaded through H@H servers, and some H@H servers may be unreachable. Refreshing the gallery URL fixes it.
- Fixed DownloadWebGallery failing to handle the string reload_image returned by fetch_data.
- Fixed the wait time being miscalculated when the error "This IP address has been temporarily banned due to an excessive request rate" reports the ban duration in hours.
- Reduced base64_max_len in Support().rename_cbz_file() from 280 to 196 and the maximum length from 114 to 90 (reductions of 30% and about 21%, respectively). Longer names were not recognized by LANraragi, which caused the program to hang.
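To illustrate the wait-time fix, here is a minimal sketch of turning a ban message into a number of seconds. It assumes the message spells out the duration as number/unit pairs such as "2 hours and 39 minutes"; that wording, the function name `ban_wait_seconds`, and the unit table are assumptions for illustration, not the project's actual parser.

```python
import re

# Seconds per unit; plural forms are handled by the optional "s" in the regex.
_UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}


def ban_wait_seconds(message: str) -> int:
    """Sum every '<number> <unit>' pair found in the ban message."""
    total = 0
    for amount, unit in re.findall(r"(\d+)\s+(day|hour|minute|second)s?", message):
        total += int(amount) * _UNIT_SECONDS[unit]
    return total
```

Summing all units in one pass means a message that includes hours is handled the same way as one that only lists minutes and seconds.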
Refactor
- Changed the gallery file format from zip to cbz. Please rename existing files accordingly, or they will not be recognized.
- Refactored and reorganized Watch() and several other parts of the code.
For the remaining changes, the logger lists the operations as they run, and I have added comments in the code, so they are not all listed here.
Warn
In previous versions, the 1280x and Original gallery formats could coexist and be downloaded together.
In versions >= 1.2.4, once you have downloaded the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will download the 1280x format.
If you have already downloaded both formats, usage is not affected.
Full Changelog: v1.2.3...v1.2.4
v1.2.4-beta.4
fix: real_url returns "reload_image"
Full Changelog: v1.2.4-beta.3...v1.2.4-beta.4
v1.2.4-beta.3
Fix: Fully resolved the ssl:default and connection timeout errors when downloading galleries via the web.
Cause: Gallery images are downloaded through H@H servers, and some of those servers may be unreachable, which triggers the errors above.
Solution: Refresh the image address.
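The refresh-and-retry idea can be sketched as follows. This is a synchronous illustration only (the project itself is async), and `download_with_refresh`, `get_image_url`, `download`, and `max_refreshes` are hypothetical names, not functions from the codebase.

```python
from typing import Callable


def download_with_refresh(
    get_image_url: Callable[[], str],
    download: Callable[[str], bytes],
    max_refreshes: int = 3,
) -> bytes:
    """Try a download; on a connection failure, request a fresh URL and retry."""
    last_error = None
    for _ in range(max_refreshes + 1):
        url = get_image_url()  # each call may point at a different H@H server
        try:
            return download(url)
        except (ConnectionError, TimeoutError) as exc:
            last_error = exc  # unreachable server: refresh and try again
    raise RuntimeError("all refreshed URLs failed") from last_error
```

Capping the number of refreshes keeps a gallery with no reachable server from looping forever.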
Full Changelog: v1.2.4-beta.2...v1.2.4-beta.3
v1.2.4-beta.2
v1.2.4-beta.1
🔔ZIPtoCBZ
import os

def rename_zip_to_cbz(zip_path: str):
    # Swap the .zip extension for .cbz; the archive contents are unchanged.
    cbz_path = os.path.splitext(zip_path)[0] + '.cbz'
    os.rename(zip_path, cbz_path)
    print(f"Renamed: {zip_path} to {cbz_path}")

def convert_all_zip_to_cbz_in_directory(directory: str):
    # Walk the directory tree and rename every .zip file found.
    for root, dirs, files in os.walk(directory):
        for file in files:
            if file.endswith('.zip'):
                zip_path = os.path.join(root, file)
                rename_zip_to_cbz(zip_path)

# Replace with the folder that holds your downloaded galleries.
directory = '/path/to/your/directory'
convert_all_zip_to_cbz_in_directory(directory)
🔈What's Changed
It has been more than six months since the project was started, and the codebase has grown considerably. Although I have reviewed the code several times, some bugs have inevitably been missed. If you run into one, please open an issue. Thank you 🙏
This update completes the -w feature. There will likely be no major new features after this, only routine maintenance and minor module optimizations.
Feat
- python main.py -w now automatically downloads galleries. See the documentation for details.
- Added the watch_fav_ids, watch_lan_status, and watch_archive_status fields to config.yaml. See the documentation for details.
Refactor
- Changed the gallery file format from zip to cbz. Please rename existing files accordingly, or they will not be recognized.
- Refactored and reorganized Watch() and several other parts of the code.
For the remaining changes, the logger lists the operations as they run, and I have added comments in the code, so they are not all listed here.
Warn
In older versions, the 1280x and Original formats could coexist and be downloaded together.
In versions >= 1.2.4, once you have downloaded the Original format, neither DownloadArchiveGallery() nor DownloadWebGallery() will download the 1280x format. You can only choose one format to download.
If you have already downloaded both formats, usage is not affected.
Full Changelog: v1.2.3.1...v1.2.4-beta.1
v1.2.3.1
This update primarily fixes bugs in network requests.
Fixes
- FIX: "This IP address has been temporarily banned due to an excessive request rate"
- FIX: get_image_limits
connect_limit: 3
It is best not to set this value higher than 3; otherwise you may hit the errors below (based on the experience of downloading 100 galleries in a row).
The most important exceptions now caught in this update:
- Server disconnected
  - Preliminary assessment: EH caps the number of requests per minute/hour, which leads to request timeouts and blank pages.
- CERTIFICATE_VERIFY_FAILED & ssl:default
  - This is a certificate problem: some responses come with no SSL certificate or with an invalid one.
  - As a result, aiohttp raises an error. The workaround is to skip certificate verification (ssl=False).
So far, 80 galleries have been downloaded in testing without any issues.
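As an illustration of what a limit like connect_limit: 3 typically does, the sketch below caps the number of coroutines in flight with asyncio.Semaphore. It assumes connect_limit bounds concurrent connections; `run_limited` and `guarded` are hypothetical helper names, not the project's actual code.

```python
import asyncio


async def run_limited(jobs, connect_limit: int = 3):
    """Run zero-argument coroutine functions with at most connect_limit in flight."""
    semaphore = asyncio.Semaphore(connect_limit)

    async def guarded(job):
        # Each job waits here until one of the connect_limit slots frees up.
        async with semaphore:
            return await job()

    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With aiohttp, each job would issue one request, and passing ssl=False to the request call is how certificate verification is skipped.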
Full Changelog: v1.2.3...v1.2.3.1
v1.2.3
🔈What's Changed
Added
- IP quota-based download (only applicable to DownloadWebGallery()) @bf179 #2
- Added the tags_translation field to config.yaml for Chinese tag translation @bf179 #2
- DownloadWebGallery() now uses a hash check when duplicate files exist @bf179 #2
- Added the sk and hath_perks fields to cookies in config.yaml
  - If they are not set, the IP quota drops from 50,000 to 5,000 (depending on your account)
- Check whether a new version of the gallery exists (AddFavData().apply())
- Added code comments
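A hash check for duplicate files can be sketched as below. The release notes do not say which hash the project uses, so SHA-1 is an assumption here, and `file_sha1` and `is_same_file` are hypothetical helper names.

```python
import hashlib


def file_sha1(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_same_file(path_a: str, path_b: str) -> bool:
    """Treat two files as duplicates when their digests match."""
    return file_sha1(path_a) == file_sha1(path_b)
```

Reading in chunks keeps memory use flat even for large original-resolution images.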
Fixed
- The returned address became 509.gif when the IP quota was exhausted @bf179 #2
- Moved the tag field out of eh_data into two separate tables: tag_list & gid_tid @bf179 #2
- copyright was not recognized correctly
- Prevented DownloadWebGallery() from downloading duplicate images; image filenames now use the format 00000001.jpg
- Caught the TLS/SSL connection has been closed (EOF) (_ssl.c:1006) error
- Caught the Your IP address has been temporarily banned for excessive pageloads error
- current_key could not be retrieved
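The 00000001.jpg naming above is an eight-digit, zero-padded page number. A minimal sketch of producing such names follows; `page_filename` and its parameters are hypothetical, and the fixed .jpg default is an assumption.

```python
def page_filename(page: int, ext: str = "jpg", width: int = 8) -> str:
    """Build a zero-padded filename such as 00000001.jpg for page 1."""
    return f"{page:0{width}d}.{ext}"
```

Zero-padding makes plain alphabetical sorting match page order, which matters for readers that order archive entries by name.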
Note
This update changes the database schema: the tag field has been removed from the eh_data table.
When using DownloadWebGallery(), IP quota usage must now stay below 80% of total_limits. If that threshold is reached, the process pauses until usage falls back to 30% of total_limits before continuing (total_limits * 30% = lower_value < image_limits < total_limits * 80%).
If you do not like this behavior, modify the Config().wait_image_limits() code.
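The 80%/30% rule above is a hysteresis: pausing starts at the upper threshold and only ends at the lower one. The sketch below is one reading of that rule, not the body of Config().wait_image_limits(); `should_wait` and the `waiting` flag are hypothetical names, while `image_limits` and `total_limits` follow the names used in the note.

```python
def should_wait(image_limits: int, total_limits: int, waiting: bool) -> bool:
    """Start waiting at 80% usage; keep waiting until usage is back to 30%."""
    upper = total_limits * 80 // 100  # integer math avoids float rounding
    lower = total_limits * 30 // 100
    if waiting:
        # Already paused: resume only once usage has fallen to the lower bound.
        return image_limits > lower
    # Running: pause as soon as usage reaches the upper bound.
    return image_limits >= upper
```

For example, with total_limits = 5000 the downloader pauses at 4000 and resumes once usage is back down to 1500. The gap between the two thresholds keeps it from flapping between running and paused near a single cutoff.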
🔔 Migration from Previous Versions
- v1.2.1 <= version <= v1.2.2
- Run the following code:
import sqlite3

dbs_name = "./data.db"
with sqlite3.connect(dbs_name) as co:
    # Keep the old table under a new name so its rows can be copied across.
    print("RENAME eh_data TO old_eh_data...")
    co.execute('ALTER TABLE eh_data RENAME TO old_eh_data')
    co.commit()
    print("CREATE TABLE eh_data...")
    co.execute('''
    CREATE TABLE IF NOT EXISTS "eh_data" (
        "gid" INTEGER PRIMARY KEY NOT NULL,
        "token" TEXT NOT NULL,
        "title" TEXT,
        "title_jpn" TEXT,
        "category" TEXT /*Category*/,
        "thumb" TEXT /*Thumbnail URL*/,
        "uploader" TEXT /*Uploader*/,
        "posted" TEXT /*Posted Time*/,
        "filecount" INTEGER /*Page Count*/,
        "filesize" INTEGER /*Size*/,
        "expunged" INTEGER NOT NULL DEFAULT 0 /*Expunged*/,
        "copyright_flag" INTEGER NOT NULL DEFAULT 0 /*Copyrighted*/,
        "rating" TEXT /*Rating*/,
        "current_gid" INTEGER /*Latest gallery gid*/,
        "current_token" TEXT /*Latest gallery token*/
    )''')
    co.commit()
    # Copy only gid and token; run the metadata update afterwards to repopulate the other columns.
    print("Move gid and token from old_eh_data to new eh_data...")
    co.execute('''
    INSERT INTO eh_data (gid, token)
    SELECT gid, token FROM old_eh_data;
    ''')
    co.commit()
    print("Data transfer complete.")
- Download the new version of the code.
- Run 2. Update Gallery Metadata (Update Tags).
Done.
Full Changelog: v1.2.2...v1.2.3
v1.2.2
Please read the v1.2.0 release notes first.
Please read the v1.2.1 release notes first.
🔈 Update Details
- Fixed the error in AddFavData().delete_fav_category_del_flag()
- Improved detection of del_flag
Full Changelog: v1.2.1...v1.2.2
v1.2.1
Please read the v1.2.0 release notes first.
Please read the v1.2.0 release notes first.
Please read the v1.2.0 release notes first.
🔈 What's New
- DownloadWebGallery no longer uses Streaming Response Content to retrieve images. It also supports resuming downloads, as long as the downloaded content is still in web/tmp.
- 💥 aiohttp now replaces httpx.
- 💥 All EH requests are now handled by Config().fetch_data() and Config().fetch_data_stream(), which also centralize all network-related exception handling and checks.
- Removed a substantial amount of early, messy code. 👍
# Method excerpt: self is the Config instance and logger is the project's logger.
async def check_fetch_err(self, response, msg):
    content_type = response.headers.get('Content-Type', '').lower()
    # Only text-like responses can carry an EH error message.
    if 'text' in content_type or 'json' in content_type or 'html' in content_type:
        content = await response.text()
        if "IP quota exhausted" in content:
            logger.warning("IP quota exhausted.")
            raise Exception("IP quota exhausted")
        elif "You have clocked too many downloaded bytes on this gallery" in content:
            logger.warning("You have clocked too many downloaded bytes on this gallery.")
            logger.warning("Please open Gallery---Archive Download---Cancel")
            logger.warning(msg)
            raise Exception("You have clocked too many downloaded bytes on this gallery")
        elif response.status != 200:
            # aiohttp exposes the HTTP status as response.status.
            logger.warning(f"code: {response.status}")
            logger.warning(f'{content}: {msg}')
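The resume behavior mentioned in the list above (downloads continue as long as the content is still in web/tmp) could work roughly as follows: check which pages already have a non-empty file and fetch only the rest. This is a sketch under assumptions; `pages_to_download` is a hypothetical helper, and the zero-padded .jpg filename scheme is assumed for illustration rather than taken from this release.

```python
import os
from typing import List


def pages_to_download(page_count: int, tmp_dir: str) -> List[int]:
    """Return the 1-based page numbers that have no non-empty file in tmp_dir."""
    missing = []
    for page in range(1, page_count + 1):
        path = os.path.join(tmp_dir, f"{page:08d}.jpg")  # assumed naming scheme
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(page)
    return missing
```

Treating zero-byte files as missing means a download that was interrupted before any data arrived is retried instead of skipped.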
Full Changelog: v1.2.0...v1.2.1