Release version 5.1

JoeanAmier · Dec 1, 2023 · commit 313ee31
1 parent b108be8

Showing 14 changed files with 83 additions and 26 deletions.
diff --git a/README.md b/README.md
@@ -82,7 +82,7 @@
 
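 [![Demo video](docs/程序运行演示.png)](https://www.bilibili.com/video/BV1Nu4y1L7LW/)
 
-<p><b>Click the image to watch the demo video. Managing accounts through the configuration file is recommended. For more details, see the <a href="https://github.com/JoeanAmier/TikTokDownloader/wiki/Documentation">documentation</a></b></p>
+<p><b>🎥 Click the image to watch the demo video. Managing accounts through the configuration file is recommended. For more details, see the <a href="https://github.com/JoeanAmier/TikTokDownloader/wiki/Documentation">documentation</a></b></p>
 
 # 📈 Project Status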
 
@@ -186,6 +186,8 @@ TikTokDownloader
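 |  Collect hot-list data  | ❌ No login required  |
 | Download the account's collected works | ✔️ Login required |
 
+**The Cookie only needs to be rewritten to the configuration file after it expires; it does not have to be written every time the program runs!**
+
 **If the program fails to fetch data, try updating the Cookie or using a Cookie from a logged-in session!**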
 
 <hr>
@@ -194,8 +196,7 @@ TikTokDownloader
 
 <ul>
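 <li>When the program prompts for input, pressing Enter returns to the previous menu; entering <code>Q</code> or <code>q</code> exits the program</li>
-<li>Because the data returned for an account's liked and collected works contains only each work's publish date, not the date it was liked / collected, the program must fetch all liked / collected works before filtering by date; this may take a long time if there are many works; the number of requests can be limited with the <code>pages</code> parameter</li>
-<li>When data is stored in <code>SQLite</code> format, fetching the same works again updates statistics such as like and collect counts</li>
+<li>Because the data returned for an account's liked and collected works contains only each work's publish date, not the date it was liked / collected, the program must fetch all liked / collected works before filtering by date; this may take a long time if there are many works; the number of requests can be limited with the <code>max_pages</code> parameter</li>
 <li>Fetching the published works of a private account requires a logged-in Cookie, and the logged-in account must follow that private account</li>
 <li>When batch-downloading account works or mix works, if the corresponding nickname or mark changes, the program automatically updates the nickname and mark in the file names of already downloaded works</li>
 <li>When downloading, the program first saves files to a temporary folder and moves them to the storage folder once the download completes; the temporary folder is emptied when the program exits</li>
@@ -204,7 +205,7 @@ TikTokDownloader
 <li>To make the program use a proxy, the <code>proxies</code> parameter must be set in <code>settings.json</code>; otherwise the program will not use a proxy</li>
 <li>Some users report that downloading a newly published work too early yields a low-resolution file, and that a high-resolution file only becomes available after some time; the timing pattern is not yet clear</li>
 <li>When exiting, end the program normally or press Ctrl + C; do not close the terminal window directly, otherwise data will be lost</li>
-<li>Request delays are disabled by default, but users are advised to edit <code>src/Customizer.py</code> to enable a random or fixed delay, which lowers the chance of triggering Douyin's risk control</li>
+<li>Request delays are disabled by default, but users are advised to edit <code>src/Customizer.py</code> to enable a random or fixed delay, so that frequent requests do not trigger Douyin's risk control</li>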
 </ul>
 <hr>
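
The request-delay advice in the last list item can be sketched as follows. This is only an illustration of the idea, not the contents of `src/Customizer.py`: the names `random_delay` and `fixed_delay`, their parameters, and the default durations are all hypothetical.

```python
from random import uniform
from time import sleep


def random_delay(low: float = 1.0, high: float = 3.0, pause=sleep) -> float:
    """Pause for a random duration between low and high seconds.

    `pause` defaults to time.sleep; passing another callable makes the
    function easy to exercise without actually waiting.
    """
    seconds = uniform(low, high)
    pause(seconds)
    return seconds


def fixed_delay(seconds: float = 2.0, pause=sleep) -> float:
    """Pause for the same duration before every request."""
    pause(seconds)
    return seconds


if __name__ == "__main__":
    # Hypothetical request loop: wait a little before each request.
    for page in range(3):
        waited = random_delay(0.1, 0.3)
        print(f"page {page}: waited {waited:.2f}s before requesting")
```

A random delay spreads requests unevenly, while a fixed delay gives a predictable total run time; which one suits a given account is a judgment call.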
 

diff --git a/docs/TikTokDownloader文档.md b/docs/TikTokDownloader文档.md
@@ -363,6 +363,38 @@
 
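 <p><strong>Server deployment mode:</strong> only the <code>cookie</code>, <code>proxies</code>, and <code>max_retry</code> parameters take effect; all other parameters are ignored, but the configuration file must still be edited correctly.</p>
 <h2>Parameter details</h2>
+<h3>Downloading liked works</h3>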
+
+```json
+{
+  "accounts_urls": [
+    {
+      "mark": "",
+      "url": "account homepage link-1",
+      "tab": "favorite",
+      "earliest": "",
+      "latest": ""
+    },
+    {
+      "mark": "",
+      "url": "account homepage link-2",
+      "tab": "post",
+      "earliest": "",
+      "latest": ""
+    },
+    {
+      "mark": "",
+      "url": "account homepage link-3",
+      "tab": "favorite",
+      "earliest": "",
+      "latest": ""
+    }
+  ]
+}
+```
+
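+<p>Write the accounts to download into the configuration file, one object/dict per account. Setting the <code>tab</code> parameter to <code>favorite</code> batch-downloads that account's liked works; multiple accounts are supported.</p>
+
 <h3>File storage path</h3>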
 
 ```json
@@ -643,10 +675,10 @@ document.body.removeChild(downloadLink);
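 <p><strong>Input:</strong> <code>猫咪 3 2</code> is equivalent to <code>猫咪 直播搜索 2</code></p>
 <p><strong>Meaning:</strong> keyword: <code>猫咪</code>; search type: <code>直播搜索</code> (live-stream search); pages: <code>2</code></p>
 <h3>Collecting Douyin hot-list data</h3>
-<p>Collects data from the Douyin hot list (<code>抖音热榜</code>), entertainment list (<code>娱乐榜</code>), society list (<code>社会榜</code>), and challenge list (<code>挑战榜</code>) and stores it in a file; the <code>storage_format</code> parameter must be set for this to work.</p>
+<p>No input required. Collects data from the Douyin hot list (<code>抖音热榜</code>), entertainment list (<code>娱乐榜</code>), society list (<code>社会榜</code>), and challenge list (<code>挑战榜</code>) and stores it in a file; the <code>storage_format</code> parameter must be set for this to work.</p>
 <p>Storage name format: <code>实时热榜数据_采集时间_热榜名称</code> (real-time hot-list data, collection time, list name)</p>
 <h3>Batch-downloading collected works</h3>
-<p>A logged-in Cookie must be written to the configuration file, and the <code>owner_url</code> parameter must contain the corresponding account homepage link and, optionally, an account mark; currently only the collected works of the account that owns the current Cookie can be fetched.</p>
+<p>No input required. A logged-in Cookie must be written to the configuration file, and the <code>owner_url</code> parameter must contain the corresponding account homepage link and, optionally, an account mark; currently only the collected works of the account that owns the current Cookie can be fetched.</p>
 <p>If the <code>owner_url</code> parameter is not set, the program uses a temporary string as the account nickname and UID.</p>
 <p>The account folder is named <code>UID123456789_mark_收藏作品</code> or <code>UID123456789_账号昵称_收藏作品</code>, where <code>账号昵称</code> stands for the account nickname and <code>收藏作品</code> means "collected works".</p>
 <h2>Web API mode</h2>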
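
The `accounts_urls` structure added under "参数详解" (parameter details) in this file can be read as in the sketch below. It is a rough illustration, not the project's own loader: the helper `split_by_tab`, the placeholder URLs, and the fallback to `post` when `tab` is missing are assumptions made for the example.

```python
import json

# Same shape as the documented example, with placeholder URLs.
SETTINGS = """
{
  "accounts_urls": [
    {"mark": "", "url": "account-homepage-link-1", "tab": "favorite", "earliest": "", "latest": ""},
    {"mark": "", "url": "account-homepage-link-2", "tab": "post", "earliest": "", "latest": ""},
    {"mark": "", "url": "account-homepage-link-3", "tab": "favorite", "earliest": "", "latest": ""}
  ]
}
"""


def split_by_tab(settings: dict) -> dict[str, list[str]]:
    """Group account URLs by their `tab` value (assumed default: "post")."""
    groups: dict[str, list[str]] = {}
    for item in settings.get("accounts_urls", []):
        groups.setdefault(item.get("tab", "post"), []).append(item["url"])
    return groups


groups = split_by_tab(json.loads(SETTINGS))
print(groups["favorite"])  # accounts whose liked works would be downloaded
print(groups["post"])      # accounts whose published works would be downloaded
```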

diff --git a/docs/WebAPI模式截图.png b/docs/WebAPI模式截图.png
diff --git a/docs/WebUI模式截图1.png b/docs/WebUI模式截图1.png
diff --git a/docs/WebUI模式截图2.png b/docs/WebUI模式截图2.png
diff --git a/docs/WebUI模式截图3.png b/docs/WebUI模式截图3.png
diff --git a/docs/终端模式截图1.png b/docs/终端模式截图1.png
diff --git a/docs/终端模式截图2.png b/docs/终端模式截图2.png
diff --git a/main.py b/main.py
@@ -72,7 +72,7 @@ class TikTokDownloader:
     # print(PROJECT_ROOT)  # for debugging
 
     VERSION = 5.1
-    STABLE = False
+    STABLE = True
 
     REPOSITORY = "https://github.com/JoeanAmier/TikTokDownloader"
     LICENCE = "GNU General Public License v3.0"

diff --git a/src/Configuration.py b/src/Configuration.py
@@ -414,7 +414,7 @@ def get_settings_data(self) -> dict:
             "max_retry": self.max_retry,
             "max_pages": self.max_pages,
             "default_mode": int(self.default_mode),
-            "ffmpeg": self.ffmpeg.path,
+            "ffmpeg": self.ffmpeg.path or "",
         }
 
     def update_settings_data(self, data: dict, ):
@@ -451,6 +451,7 @@ def _check_system_type():
             return ['x-terminal-emulator'], False
 
     def __check_ffmpeg_path(self, path: Path):
+        # return None  # for debugging
         return self.__check_system_ffmpeg() or self.__check_system_ffmpeg(path)
 
     def download(self, data: list[tuple], proxies, timeout, user_agent):
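
One plausible reading of the `self.ffmpeg.path or ""` change is that a missing FFmpeg path should be written to the settings as an empty string rather than as JSON `null`. The sketch below shows the difference with the standard `json` module; `settings_entry` is a hypothetical helper, not a function from the project.

```python
import json


def settings_entry(ffmpeg_path):
    """Return the settings fragment, replacing a falsy path with ""."""
    return {"ffmpeg": ffmpeg_path or ""}


without_fallback = json.dumps({"ffmpeg": None})
with_fallback = json.dumps(settings_entry(None))
with_path = json.dumps(settings_entry("/usr/bin/ffmpeg"))

print(without_fallback)  # {"ffmpeg": null}
print(with_fallback)     # {"ffmpeg": ""}
print(with_path)         # {"ffmpeg": "/usr/bin/ffmpeg"}
```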

diff --git a/src/DataAcquirer.py b/src/DataAcquirer.py
@@ -168,9 +168,9 @@ def __set_temp_cookie(self, cookie: str):
 
 class Share:
     share_link = compile(
-        r".*?(https://v\.douyin\.com/[A-Za-z0-9]+?/).*?")
+        r"\S*?(https://v\.douyin\.com/[^/\s]+)\S*?")
     share_link_tiktok = compile(
-        r".*?(https://vm\.tiktok\.com/[a-zA-Z0-9]+/).*?")
+        r"\S*?(https://vm\.tiktok\.com/[^/\s]+)\S*?")
     headers = {
         "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome"
                       "/116.0.0.0 Safari/537.36", }
@@ -211,28 +211,30 @@ def get_url(self, url: str) -> str:
 class Link:
     # Douyin links
     account_link = compile(
-        r"https://www\.douyin\.com/user/([A-Za-z0-9_-]+)(?:.*?\bmodal_id=(\d{19}))?")  # account homepage link
+        r"\S*?https://www\.douyin\.com/user/([A-Za-z0-9_-]+)(?:\S*?\bmodal_id=(\d{19}))?")  # account homepage link
     account_share = compile(
-        r".*?https://www\.iesdouyin\.com/share/user/(.*?)\?.*?"  # account homepage share short link
+        r"\S*?https://www\.iesdouyin\.com/share/user/(\S*?)\?\S*?"  # account homepage share link
     )
     works_id = compile(r"\b(\d{19})\b")  # work ID
     works_link = compile(
-        r".*?https://www\.douyin\.com/(?:video|note)/([0-9]{19}).*?")  # work link
+        r"\S*?https://www\.douyin\.com/(?:video|note)/([0-9]{19})\S*?")  # work link
     works_share = compile(
-        r".*?https://www\.iesdouyin\.com/share/(?:video|note)/([0-9]{19})/.*?"
-    )  # work share short link
+        r"\S*?https://www\.iesdouyin\.com/share/(?:video|note)/([0-9]{19})/\S*?"
+    )  # work share link
     mix_link = compile(
-        r".*?https://www\.douyin\.com/collection/(\d{19}).*?")  # mix link
-    live_link = compile(r".*?https://live\.douyin\.com/([0-9]+).*?")  # live-stream link
+        r"\S*?https://www\.douyin\.com/collection/(\d{19})\S*?")  # mix link
+    mix_share = compile(
+        r"\S*?https://www\.iesdouyin\.com/share/mix/detail/(\d{19})/\S*?")  # mix share link
+    live_link = compile(r"\S*?https://live\.douyin\.com/([0-9]+)\S*?")  # live-stream link
     live_link_self = compile(
-        r".*?https://www\.douyin\.com/follow\?webRid=(\d+).*?"
+        r"\S*?https://www\.douyin\.com/follow\?webRid=(\d+)\S*?"
     )
     live_link_share = compile(
-        r"https://webcast\.amemv\.com/douyin/webcast/reflow/\S+")
+        r"\S*?https://webcast\.amemv\.com/douyin/webcast/reflow/\S+")
 
     # TikTok links
     works_link_tiktok = compile(
-        r".*?https://www\.tiktok\.com/@.+?/video/(\d{19}).*?")  # work link
+        r"\S*?https://www\.tiktok\.com/@\S+?/video/(\d{19})\S*?")  # work link
 
     def __init__(self, params: Parameter):
         self.share = Share(params.logger, params.proxies, params.max_retry)
@@ -270,6 +272,8 @@ def mix(self, text: str) -> tuple:
             return False, u
         elif u := self.mix_link.findall(urls):
             return True, u
+        elif u := self.mix_share.findall(urls):
+            return True, u
         return None, []
 
     def live(self, text: str) -> tuple:
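
The effect of the regular-expression changes in this file can be checked with a short sketch. The patterns are copied from the diff; the sample links are made up for illustration, and `extract_mix_ids` is a hypothetical helper that only mirrors the `mix_link` / `mix_share` fallback added to `mix()`.

```python
import re

# Short-link patterns before and after this commit.
OLD_SHARE = re.compile(r".*?(https://v\.douyin\.com/[A-Za-z0-9]+?/).*?")
NEW_SHARE = re.compile(r"\S*?(https://v\.douyin\.com/[^/\s]+)\S*?")

# Mix patterns after this commit.
MIX_LINK = re.compile(r"\S*?https://www\.douyin\.com/collection/(\d{19})\S*?")
MIX_SHARE = re.compile(
    r"\S*?https://www\.iesdouyin\.com/share/mix/detail/(\d{19})/\S*?")


def extract_mix_ids(text: str) -> list[str]:
    """Try the regular mix link first, then the share-link form."""
    return MIX_LINK.findall(text) or MIX_SHARE.findall(text)


# Made-up short link whose code contains "-" and "_".
share_text = "look https://v.douyin.com/abc-123_X/ now"
old_result = OLD_SHARE.findall(share_text)
new_result = NEW_SHARE.findall(share_text)
print(old_result)  # the old character class rejects "-" and "_"
print(new_result)

# Made-up mix share link with a 19-digit ID.
mix_text = "https://www.iesdouyin.com/share/mix/detail/1234567890123456789/?from=share"
mix_ids = extract_mix_ids(mix_text)
print(mix_ids)
```

The new short-link pattern captures everything up to the next `/` or whitespace, so it no longer depends on the link code being purely alphanumeric, and it drops the trailing slash from the captured URL.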

diff --git a/src/DataDownloader.py b/src/DataDownloader.py
@@ -162,7 +162,7 @@ def run_live(self, data: list[tuple]):
             self.downloader_chart(
                 download_tasks,
                 SimpleNamespace(),
-                self.__general_progress_object(),
+                self.__live_progress_object(),
                 len(download_tasks),
                 unknown_size=True,
                 headers=self.black_headers)
@@ -285,10 +285,12 @@ def download_image(
             if self.is_in_blacklist(id_):
                 count.skipped_image.add(id_)
                 self.log.info(f"图集 {id_} 存在下载记录，跳过下载")
+                count.skipped_image.add(id_)
                 break
             elif self.is_exists(p := actual_root.with_name(f"{name}_{index}.jpeg")):
                 self.log.info(f"图集 {id_}_{index} 文件已存在，跳过下载")
                 self.log.info(f"文件路径: {p.resolve()}", False)
+                count.skipped_image.add(id_)
                 continue
             tasks.append((
                 img,
@@ -452,6 +454,7 @@ def download_file(
             return False
         self.save_file(temp, actual)
         self.log.info(f"{show} 文件下载成功")
+        self.log.info(f"文件路径 {actual.resolve()}", False)
         self.blacklist.update_id(id_)
         self.add_count(show, id_, count)
         return True

diff --git a/src/FileManager.py b/src/FileManager.py
@@ -105,7 +105,20 @@ def rename_folder(
         new_folder = self.root.joinpath(f"{type_}{id_}_{mark}_{addition}")
         self.rename(old_folder, new_folder, "文件夹")
         self.log.info(f"文件夹 {old_folder} 已重命名为 {new_folder}", False)
-        return True
+
+    def __rename_works_folder(self,
+                              old_: Path,
+                              id_: str,
+                              mark: str,
+                              name: str,
+                              field: str) -> Path:
+        if (s := self.data[id_][field]) in old_.name:
+            new_ = old_.parent / old_.name.replace(
+                s, {"name": name, "mark": mark}[field], 1)
+            self.rename(old_, new_)
+            self.log.info(f"文件夹 {old_} 重命名为 {new_}", False)
+            return new_
+        return old_
 
     def scan_file(
             self,
@@ -120,7 +133,8 @@ def scan_file(
         item_list = root.iterdir()
         if solo_mode:
             for f in item_list:
-                if f.isdir():
+                if f.is_dir():
+                    f = self.__rename_works_folder(f, id_, mark, name, field)
                     files = f.iterdir()
                     self.batch_rename(f, files, id_, mark, name, field)
         else:
@@ -129,7 +143,7 @@ def scan_file(
     def batch_rename(
             self,
             root: Path,
-            files: tuple,
+            files,
             id_: str,
             mark: str,
             name: str,
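
The folder-renaming step added as `__rename_works_folder` can be illustrated in isolation. The sketch below is an assumption-laden stand-in: it uses `Path.rename` directly instead of the class's own `rename` helper, takes the cached and current values as plain arguments instead of looking them up in `self.data`, and adds a guard against an empty cached value that the original does not have.

```python
from pathlib import Path
from tempfile import TemporaryDirectory


def rename_works_folder(old: Path, cached: str, current: str) -> Path:
    """Replace the first occurrence of `cached` in the folder name with `current`.

    Returns the new path if a rename happened, otherwise the original path.
    """
    if cached and cached in old.name:
        new = old.with_name(old.name.replace(cached, current, 1))
        old.rename(new)
        return new
    return old


if __name__ == "__main__":
    with TemporaryDirectory() as tmp:
        folder = Path(tmp) / "2023-12-01_oldnick_title"
        folder.mkdir()
        renamed = rename_works_folder(folder, "oldnick", "newnick")
        print(renamed.name)
```

Returning the path matters for the caller: `scan_file` iterates over the folder's contents right after the rename, so it needs the new location.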

diff --git a/src/main_complete.py b/src/main_complete.py
@@ -235,7 +235,7 @@ def account_works_batch(self, root, params, logger):
             if not (sec_user_id := self.check_sec_user_id(data.url)):
                 self.logger.warning(
                     f"配置文件 accounts_urls 参数"
-                    f"第 {index} 条数据的 url 无效")
+                    f"第 {index} 条数据的 url {data.url} 错误，提取 sec_user_id 失败")
                 count.failed += 1
                 continue
             if not self.deal_account_works(
@@ -567,7 +567,9 @@ def mix_batch(self, root, params, logger):
         for index, data in enumerate(self.mix, start=1):
             mix_id, id_ = self._check_mix_id(data.url)
             if not id_:
-                self.logger.warning(f"{data.url} 获取作品 ID 或合集 ID 失败")
+                self.logger.warning(
+                    f"配置文件 mix_urls 参数"
+                    f"第 {index} 条数据的 url {data.url} 错误，获取作品 ID 或合集 ID 失败")
                 count.failed += 1
                 continue
             if not self._deal_mix_works(