Add a reference audio selection tool #1044

Open
wants to merge 77 commits into
base: main
77 commits
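7efdf31
Add reference audio selection UI
Downupanddownup Apr 23, 2024
29b8370
Add a method to convert reference audio from a list file
Downupanddownup Apr 23, 2024
e69e449
Complete functionality
Downupanddownup Apr 24, 2024
8c9627b
Complete functionality
Downupanddownup Apr 24, 2024
a1fc00a
Adjust directory structure
Downupanddownup Apr 24, 2024
4cbbe2a
Adjust directory structure
Downupanddownup Apr 24, 2024
2c8f6bd
Config file generation, audio sampling, audio inference testing
Downupanddownup Apr 24, 2024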
4daa9ad
Add text similarity comparison
Downupanddownup Apr 25, 2024
b6f0bb3
Add code to sync reference audio
Downupanddownup Apr 25, 2024
ecbc7d0
Add config file management
Downupanddownup Apr 25, 2024
441ab54
Adjust URL encoding
Downupanddownup Apr 25, 2024
f61a723
Add filtering for audio between 3 s and 10 s
Downupanddownup Apr 25, 2024
926dd6b
Adjust config management, remove writes
Downupanddownup Apr 25, 2024
d20bd37
Adjust config parameters, manage them centrally
Downupanddownup Apr 25, 2024
d855eec
Add directory saving
Downupanddownup Apr 25, 2024
c8be484
Add path cleaning
Downupanddownup Apr 25, 2024
1da23aa
Bug fix
Downupanddownup Apr 25, 2024
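2880e3a
Add performance monitoring
Downupanddownup Apr 26, 2024
878fef2
Bug fix
Downupanddownup Apr 26, 2024
684e1cf
Text similarity: add GPU acceleration
Downupanddownup Apr 26, 2024
ca9ffbf
Audio similarity comparison: add a pre-sampling step for reference audio
Downupanddownup Apr 26, 2024
e3e47d2
Audio similarity comparison: add a pre-sampling step for reference audio
Downupanddownup Apr 26, 2024
a291629
Audio similarity comparison: add a pre-sampling step for reference audio
Downupanddownup Apr 26, 2024
64cc2fd
Route printed messages through logging
Downupanddownup Apr 26, 2024
9fe20c1
Add audio pre-sampling switch
Downupanddownup Apr 26, 2024
1d434e1
Add default values for initial startup
Downupanddownup Apr 26, 2024
d8d551d
Bug fix
Downupanddownup Apr 26, 2024
d1e92ed
Add reading and saving of some parameters
Downupanddownup Apr 26, 2024
2a23f95
Bug fix
Downupanddownup Apr 26, 2024
c36d0a9
API inference: add multi-process requests
Downupanddownup Apr 26, 2024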
1a7cf58
Create log directory
Downupanddownup Apr 27, 2024
25b65cd
Adjust UI layout
Downupanddownup Apr 27, 2024
9264f7e
Add event bindings and implementations
Downupanddownup Apr 28, 2024
6cb3c15
Add ASR for non-Chinese languages
Downupanddownup Apr 28, 2024
27325f4
Adjust project structure, fix random sampling bug
Downupanddownup Apr 28, 2024
1356736
Extract some common components
Downupanddownup Apr 28, 2024
af0bd9f
Add UI initial values
Downupanddownup Apr 28, 2024
e89f986
Add UI parameter writing
Downupanddownup Apr 28, 2024
d6e255a
Add startup file for Windows
Downupanddownup Apr 28, 2024
b1ad8b5
Bug fix
Downupanddownup Apr 28, 2024
c9547ab
Add URL message prompt
Downupanddownup Apr 28, 2024
0146815
Add URL message prompt
Downupanddownup Apr 28, 2024
536c226
Add URL message prompt
Downupanddownup Apr 28, 2024
61db7f0
Test
Downupanddownup Apr 28, 2024
fe969ab
Test
Downupanddownup Apr 28, 2024
371a2d7
Bug adjustment
Downupanddownup Apr 29, 2024
5280d17
Adjust UI layout
Downupanddownup Apr 29, 2024
c26fa98
Reference type: add selection
Downupanddownup Apr 29, 2024
5081168
Add events for the switch and reset buttons
Downupanddownup Apr 29, 2024
8182908
Add speaker verification model switching
Downupanddownup Apr 29, 2024
b835688
Optimize code
Downupanddownup Apr 29, 2024
1de89fe
Optimize code
Downupanddownup Apr 29, 2024
ed8d276
Improve documentation
Downupanddownup Apr 29, 2024
f70fd8f
Improve documentation
Downupanddownup Apr 29, 2024
2dc36d3
Add workflow description
Downupanddownup Apr 29, 2024
fa45c5a
Bug fix
Downupanddownup Apr 30, 2024
7660f1c
Improve monitoring information
Downupanddownup Apr 30, 2024
5843d56
Bug fix
Downupanddownup Apr 30, 2024
4ebcb3b
Bug fix
Downupanddownup Apr 30, 2024
02fabe8
Bug fix
Downupanddownup Apr 30, 2024
8a10c52
Bug fix
Downupanddownup Apr 30, 2024
fdffd50
00
Downupanddownup May 1, 2024
7e1c40e
00
Downupanddownup May 1, 2024
036d828
Bug fix
Downupanddownup May 1, 2024
3ac7aad
Bug fix
Downupanddownup May 1, 2024
48cc70a
Bug fix
Downupanddownup May 2, 2024
12fa7d8
Bug fix
Downupanddownup May 2, 2024
18002ad
Bug fix
Downupanddownup May 2, 2024
61b21e1
Merge branch 'main' into ref_audio_selector_tool
Downupanddownup May 2, 2024
7c3c778
Add text that may produce high-frequency sibilance
Downupanddownup May 7, 2024
56d6ae6
Merge branch 'main' into ref_audio_selector_tool
Downupanddownup Jun 6, 2024
5ffb193
Initialize as a number
Downupanddownup Jun 6, 2024
9f418af
--
Downupanddownup Jun 6, 2024
16b3c2a
--
Downupanddownup Jun 14, 2024
2faf74b
Merge branch 'main' into ref_audio_selector_tool
Downupanddownup Jun 22, 2024
86e5b67
Reduce inference text
Downupanddownup Aug 5, 2024
50a88a5
Update to the latest official GSV version
Downupanddownup Sep 15, 2024
1 change: 1 addition & 0 deletions GPT_SoVITS/text/symbols.py
@@ -398,4 +398,5 @@
symbols = [pad] + c + v + ja_symbols + pu_symbols + list(arpa)
symbols = sorted(set(symbols))
if __name__ == "__main__":
    print(symbols)
    print(len(symbols))
Empty file added Ref_Audio_Selector/__init__.py
Empty file.
Empty file.
156 changes: 156 additions & 0 deletions Ref_Audio_Selector/common/common.py
@@ -0,0 +1,156 @@
from tools import my_utils
from config import python_exec, is_half
import subprocess
import sys
import os


class RefAudioListManager:
    def __init__(self, root_dir):
        self.audio_dict = {'default': []}
        absolute_root = os.path.abspath(root_dir)

        for subdir, dirs, files in os.walk(absolute_root):
            relative_path = os.path.relpath(subdir, absolute_root)

            if relative_path == '.':
                category = 'default'
            else:
                category = relative_path.replace(os.sep, '')

            for file in files:
                if file.endswith('.wav'):
                    # Build the absolute path of the audio file
                    audio_abs_path = os.path.join(subdir, file)
                    if category not in self.audio_dict:
                        self.audio_dict[category] = []
                    self.audio_dict[category].append(audio_abs_path)

    def get_audio_list(self):
        return self.audio_dict

    def get_flattened_audio_list(self):
        all_audio_files = []
        for category_audios in self.audio_dict.values():
            all_audio_files.extend(category_audios)
        return all_audio_files

    def get_ref_audio_list(self):
        audio_info_list = []
        for category, audio_paths in self.audio_dict.items():
            for audio_path in audio_paths:
                filename_without_extension = os.path.splitext(os.path.basename(audio_path))[0]
                audio_info = {
                    'emotion': f"{category}-{filename_without_extension}",
                    'ref_path': audio_path,
                    'ref_text': filename_without_extension,
                }
                audio_info_list.append(audio_info)
        return audio_info_list


def batch_clean_paths(paths):
    """
    Clean a list of paths by calling clean_path() on each one.

    Parameters:
        paths (list[str]): The paths to clean.

    Returns:
        list[str]: The paths after clean_path() has been applied.
    """
    cleaned_paths = []
    for path in paths:
        cleaned_paths.append(my_utils.clean_path(path))
    return cleaned_paths


def read_text_file_to_list(file_path):
    # Open the file as UTF-8 so that Chinese text is read correctly
    with open(file_path, mode='r', encoding='utf-8') as file:
        # Read all lines into a list
        lines = file.read().splitlines()
    return lines


def get_filename_without_extension(file_path):
    """
    Given a file path string, returns the file name without its extension.

    Parameters:
        file_path (str): The full path to the file.

    Returns:
        str: The file name without its extension.
    """
    base_name = os.path.basename(file_path)  # Get the base name (file name with extension)
    file_name, file_extension = os.path.splitext(base_name)  # Split the base name into file name and extension
    return file_name  # Return the file name without extension


def read_file(file_path):
    # Open and read the file inside a with statement
    with open(file_path, 'r', encoding='utf-8') as file:  # 'r' opens the file in read mode
        # Read the whole file at once
        file_content = file.read()

    # The file is closed automatically when the with block ends;
    # file_content now holds the full text of the file
    return file_content


def write_text_to_file(text, output_file_path):
    try:
        with open(output_file_path, 'w', encoding='utf-8') as file:
            file.write(text)
    except IOError as e:
        print(f"Error occurred while writing to the file: {e}")
    else:
        print(f"Text successfully written to file: {output_file_path}")


def check_path_existence_and_return(path):
    """
    Check whether the given path (file or directory) exists. Return the path if it exists; otherwise return an empty string.
    :param path: the file or directory path to check (str)
    :return: the original path if it exists; otherwise an empty string
    """
    if os.path.exists(path):
        return path
    else:
        return ""


def open_file(filepath):
    if sys.platform.startswith('darwin'):
        subprocess.run(['open', filepath])  # macOS
    elif os.name == 'nt':  # For Windows
        os.startfile(filepath)
    elif os.name == 'posix':  # For Linux, Unix, etc.
        subprocess.run(['xdg-open', filepath])


def start_new_service(script_path):
    # Windows
    if sys.platform.startswith('win'):
        cmd = f'start cmd /k {python_exec} {script_path}'
    # macOS or Linux
    else:
        cmd = f'xterm -e {python_exec} {script_path}'

    proc = subprocess.Popen(cmd, shell=True)

    # To stop the child process started above:
    # proc.terminate()

    # Or, to force it to stop:
    # proc.kill()

    return proc


if __name__ == '__main__':
    dir = r'C:\Users\Administrator\Desktop/test'
    dir2 = r'"C:\Users\Administrator\Desktop\test2"'
    dir, dir2 = batch_clean_paths([dir, dir2])
    print(dir, dir2)
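A note on how `RefAudioListManager` assigns categories: `.wav` files directly under the root go to `'default'`, and for nested directories the path separators are removed from the relative path, so `sad/deep` becomes the single category `saddeep`. The sketch below reproduces only that grouping logic in a standalone function so the behavior can be checked on a throwaway directory. The function names `collect_wavs_by_category` and `build_demo_tree`, and the demo file layout, are hypothetical and not part of the PR.

```python
import os
import tempfile


def collect_wavs_by_category(root_dir):
    """Group .wav files under root_dir by sub-directory, mirroring RefAudioListManager.__init__."""
    audio_dict = {'default': []}
    absolute_root = os.path.abspath(root_dir)
    for subdir, _dirs, files in os.walk(absolute_root):
        relative_path = os.path.relpath(subdir, absolute_root)
        # Files directly under the root are 'default'; nested paths lose their separators.
        category = 'default' if relative_path == '.' else relative_path.replace(os.sep, '')
        for name in sorted(files):
            if name.endswith('.wav'):
                audio_dict.setdefault(category, []).append(os.path.join(subdir, name))
    return audio_dict


def build_demo_tree(root_dir):
    """Create a small, hypothetical layout of empty placeholder files."""
    relative_files = (
        'hello.wav',
        os.path.join('happy', 'a.wav'),
        os.path.join('sad', 'deep', 'b.wav'),
        'notes.txt',  # ignored: not a .wav file
    )
    for rel in relative_files:
        path = os.path.join(root_dir, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, 'w').close()


with tempfile.TemporaryDirectory() as demo_root:
    build_demo_tree(demo_root)
    demo_result = collect_wavs_by_category(demo_root)
    demo_categories = sorted(demo_result)
    demo_counts = {category: len(paths) for category, paths in demo_result.items()}

print(demo_categories)
print(demo_counts)
```

One consequence worth keeping in mind: because separators are dropped rather than replaced, two different nested paths such as `a/bc` and `ab/c` would land in the same category `abc`.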
46 changes: 46 additions & 0 deletions Ref_Audio_Selector/common/model_manager.py
@@ -0,0 +1,46 @@
import os
import re

pretrained_sovits_name = "GPT_SoVITS/pretrained_models/s2G488k.pth"
pretrained_gpt_name = "GPT_SoVITS/pretrained_models/s1bert25hz-2kh-longer-epoch=68e-step=50232.ckpt"
SoVITS_weight_root = "SoVITS_weights"
GPT_weight_root = "GPT_weights"
os.makedirs(SoVITS_weight_root, exist_ok=True)
os.makedirs(GPT_weight_root, exist_ok=True)

speaker_verification_models = {
    'speech_campplus_sv_zh-cn_16k-common': {
        'task': 'speaker-verification',
        'model': 'Ref_Audio_Selector/tool/speaker_verification/models/speech_campplus_sv_zh-cn_16k-common',
        'model_revision': 'v1.0.0'
    },
    'speech_eres2net_sv_zh-cn_16k-common': {
        'task': 'speaker-verification',
        'model': 'Ref_Audio_Selector/tool/speaker_verification/models/speech_eres2net_sv_zh-cn_16k-common',
        'model_revision': 'v1.0.5'
    }
}

def custom_sort_key(s):
    # Split the string into numeric and non-numeric parts with a regular expression
    parts = re.split(r'(\d+)', s)
    # Convert the numeric parts to integers and leave the rest unchanged
    parts = [int(part) if part.isdigit() else part for part in parts]
    return parts


def get_gpt_model_names():
    gpt_names = [pretrained_gpt_name]
    for name in os.listdir(GPT_weight_root):
        if name.endswith(".ckpt"): gpt_names.append("%s/%s" % (GPT_weight_root, name))
    gpt_names.sort(key=custom_sort_key)  # natural sort, in place
    return gpt_names


def get_sovits_model_names():
    sovits_names = [pretrained_sovits_name]
    for name in os.listdir(SoVITS_weight_root):
        if name.endswith(".pth"): sovits_names.append("%s/%s" % (SoVITS_weight_root, name))
    sovits_names.sort(key=custom_sort_key)  # natural sort, in place
    return sovits_names
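
A note on the sorting in `get_gpt_model_names()` and `get_sovits_model_names()`: the built-in `sorted()` returns a new list and leaves its argument untouched, so its result has to be used (or `list.sort()` called instead) for the ordering to take effect. The sketch below illustrates this with the same key logic as `custom_sort_key`. The checkpoint names are made up for illustration, and `natural_sort_key` is a hypothetical stand-in name.

```python
import re


def natural_sort_key(s):
    """Split digits from non-digits so that 'e2' sorts before 'e10'."""
    parts = re.split(r'(\d+)', s)
    return [int(part) if part.isdigit() else part for part in parts]


# Hypothetical checkpoint names, not taken from the PR.
names = [
    "GPT_weights/voice-e10.ckpt",
    "GPT_weights/voice-e2.ckpt",
    "GPT_weights/voice-e5.ckpt",
]

ordered = sorted(names, key=natural_sort_key)  # new list; `names` is unchanged
plain = sorted(names)                          # plain string order, for comparison

print(ordered)
print(plain)
```

With a plain string sort, `e10` comes before `e2` because `'1' < '2'`; the key function avoids that by comparing the numeric parts as integers.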

72 changes: 72 additions & 0 deletions Ref_Audio_Selector/common/time_util.py
@@ -0,0 +1,72 @@
import time
import os
from Ref_Audio_Selector.config_param.log_config import p_logger
import Ref_Audio_Selector.config_param.config_params as params


def timeit_decorator(func):
    """
    Decorator that measures the execution time of the decorated function.

    Parameters:
        func (function): The function to time.

    Returns:
        function: A new function that includes the timing logic.
    """

    def wrapper(*args, **kwargs):
        if params.time_log_print_type != 'file':
            return func(*args, **kwargs)

        start_time = time.perf_counter()  # High-resolution start timestamp

        func_result = func(*args, **kwargs)  # Run the original function

        end_time = time.perf_counter()  # End timestamp
        elapsed_time = end_time - start_time  # Elapsed time

        # Log the measurement
        log_message = f"Process ID: {os.getpid()}, {func.__name__} took {elapsed_time:.6f} s"
        p_logger.info(log_message)

        return func_result

    return wrapper


def time_monitor(func):
    """
    Return the elapsed time together with the function's result, as (elapsed_time, result).
    """

    def wrapper(*args, **kwargs):

        start_time = time.perf_counter()  # High-resolution start timestamp

        func_result = func(*args, **kwargs)  # Run the original function

        end_time = time.perf_counter()  # End timestamp
        elapsed_time = end_time - start_time  # Elapsed time

        return elapsed_time, func_result

    return wrapper


# Using the decorator
@timeit_decorator
def example_function(n):
    time.sleep(n)  # Stand-in for the function being timed; simulates a slow operation
    return n * 2


def example_function2(n):
    time.sleep(n)  # Stand-in for the function being timed; simulates a slow operation
    return n * 2


if __name__ == "__main__":
    # Call the wrapped function
    # result = example_function(2)
    print(time_monitor(example_function2)(2))
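A note on `time_monitor`: unlike `timeit_decorator`, it changes the return value, so callers must unpack an `(elapsed_time, result)` tuple. The minimal sketch below shows that calling convention without the project's logging imports. The use of `functools.wraps` is an addition that is not in the PR (it only preserves the wrapped function's name and docstring), and `double` is a hypothetical example function.

```python
import functools
import time


def time_monitor(func):
    """Wrap func so that it returns (elapsed_seconds, original_result)."""

    @functools.wraps(func)  # assumption: not in the PR; keeps func.__name__ intact
    def wrapper(*args, **kwargs):
        start_time = time.perf_counter()
        func_result = func(*args, **kwargs)
        elapsed_time = time.perf_counter() - start_time
        return elapsed_time, func_result

    return wrapper


@time_monitor
def double(n):
    """Hypothetical function used only to demonstrate the decorator."""
    return n * 2


# Callers unpack the tuple instead of using the return value directly.
elapsed, value = double(21)
print(f"{double.__name__} returned {value} in {elapsed:.6f} s")
```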
57 changes: 57 additions & 0 deletions Ref_Audio_Selector/config.ini
@@ -0,0 +1,57 @@
# config.ini

[Base]
# Server port
server_port = 9423
# Reference audio directory
reference_audio_dir = refer_audio
# Temporary file directory
temp_dir = Ref_Audio_Selector/temp

[Log]
# Directory where logs are saved
log_dir = Ref_Audio_Selector/log/general
# Log level: CRITICAL, FATAL, ERROR, WARNING, WARN, INFO, DEBUG, NOTSET
log_level = INFO
# How function timing logs are output: file = write to a file; close = disabled
time_log_print_type = file
# Directory where function timing logs are saved
time_log_print_dir = Ref_Audio_Selector/log/performance

[AudioSample]
# Directory for candidate reference audio converted from a list file
list_to_convert_reference_audio_dir = refer_audio_all
# Audio similarity directory
audio_similarity_dir = similarity
# Whether to pre-sample the baseline audio: true or false
enable_pre_sample = true

[Inference]
# Default test text location
default_test_text_path = Ref_Audio_Selector/file/test_content/test_content.txt
# Inference audio directory
inference_audio_dir = inference_audio
# Directory of inference audio aggregated by text
inference_audio_text_aggregation_dir = text
# Directory of inference audio aggregated by emotion
inference_audio_emotion_aggregation_dir = emotion

[ResultCheck]
# ASR output file
asr_filename = asr
# Text similarity output directory
text_similarity_output_dir = text_similarity
# File name of the per-emotion average text similarity report
text_emotion_average_similarity_report_filename = average_similarity
# File name of the text similarity details grouped by emotion
text_similarity_by_emotion_detail_filename = emotion_group_detail
# File name of the text similarity details grouped by text
text_similarity_by_text_detail_filename = text_group_detail

[AudioConfig]
# Default template file location
default_template_path = Ref_Audio_Selector/file/config_template/ref_audio_template.txt
# Reference audio config file name
reference_audio_config_filename = refer_audio

[Other]
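
A note on consuming `config.ini`: the module that loads this file (`config_param/config_params.py`) is not shown in this part of the diff, so the following is only a sketch of how such a file could be read with the standard-library `configparser`, not the PR's actual loader. It parses an abbreviated copy of the file from a string so it runs on its own; the `fallback` default is an assumption for illustration.

```python
import configparser

# Abbreviated copy of config.ini with a few of its keys.
SAMPLE_INI = """
[Base]
# Server port
server_port = 9423
reference_audio_dir = refer_audio

[Log]
log_level = INFO
time_log_print_type = file

[AudioSample]
enable_pre_sample = true
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)  # for the real file: parser.read(path, encoding="utf-8")

# Typed getters convert the raw strings; lines starting with '#' are skipped as comments.
server_port = parser.getint("Base", "server_port")
enable_pre_sample = parser.getboolean("AudioSample", "enable_pre_sample")
# Hypothetical default: treat a missing key as timing logs being disabled.
time_log_print_type = parser.get("Log", "time_log_print_type", fallback="close")

print(server_port, enable_pre_sample, time_log_print_type)
```

Reading the real file with an explicit `encoding="utf-8"` matters when the comments contain non-ASCII text, since the platform default encoding may differ.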
Empty file.