Skip to content

Commit

Permalink
Dev gzf deepspeed (#1833)
Browse files Browse the repository at this point in the history
* update with main (#1817)

* add cmakelist

* add paraformer-torch

* add debug for funasr-onnx-offline

* fix redefinition of jieba StdExtension.hpp

* add loading torch models

* update funasr-onnx-offline

* add SwitchArg for wss-server

* add SwitchArg for funasr-onnx-offline

* update cmakelist

* update funasr-onnx-offline-rtf

* add define condition

* add gpu define for offlne-stream

* update com define

* update offline-stream

* update cmakelist

* update func CompileHotwordEmbedding

* add timestamp for paraformer-torch

* add C10_USE_GLOG for paraformer-torch

* update paraformer-torch

* fix func FunASRWfstDecoderInit

* update model.h

* fix func FunASRWfstDecoderInit

* fix tpass_stream

* update paraformer-torch

* add bladedisc for funasr-onnx-offline

* update comdefine

* update funasr-wss-server

* add log for torch

* fix GetValue BLADEDISC

* fix log

* update cmakelist

* update warmup to 10

* update funasrruntime

* add batch_size for wss-server

* add batch for bins

* add batch for offline-stream

* add batch for paraformer

* add batch for offline-stream

* fix func SetBatchSize

* add SetBatchSize for model

* add SetBatchSize for model

* fix func Forward

* fix padding

* update funasrruntime

* add dec reset for batch

* set batch default value

* add argv for CutSplit

* sort frame_queue

* sorted msgs

* fix FunOfflineInfer

* add dynamic batch for fetch

* fix FetchDynamic

* update run_server.sh

* update run_server.sh

* cpp http post server support (#1739)

* add cpp http server

* add some comment

* remove some comments

* del debug infos

* restore run_server.sh

* adapt to new model struct

* 修复了onnxruntime在macos下编译失败的错误 (#1748)

* Add files via upload

增加macos的编译支持

* Add files via upload

增加macos支持

* Add files via upload

target_link_directories(funasr PUBLIC ${ONNXRUNTIME_DIR}/lib)
target_link_directories(funasr PUBLIC ${FFMPEG_DIR}/lib)
添加 if(APPLE) 限制

---------

Co-authored-by: Yabin Li <wucong.lyb@alibaba-inc.com>

* Delete docs/images/wechat.png

* Add files via upload

* fixed the issues about seaco-onnx timestamp

* fix bug (#1764)

当语音识别结果包含 `http` 时,标点符号预测会把它会被当成 url

* fix empty asr result (#1765)

解码结果为空的语音片段,text 用空字符串

* update export

* update export

* docs

* docs

* update export name

* docs

* update

* docs

* docs

* keep empty speech result (#1772)

* docs

* docs

* update wechat QRcode

* Add python funasr api support for websocket srv (#1777)

* add python funasr_api supoort

* change little to README.md

* add core tools stream

* modified a little

* fix bug for timeout

* support for buffer decode

* add ffmpeg decode for buffer

* libtorch demo

* update libtorch infer

* update utils

* update demo

* update demo

* update libtorch inference

* update model class

* update seaco paraformer

* bug fix

* bug fix

* auto frontend

* auto frontend

* auto frontend

* auto frontend

* auto frontend

* auto frontend

* auto frontend

* auto frontend

* Dev gzf exp (#1785)

* resume from step

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* batch

* train_loss_avg train_acc_avg

* train_loss_avg train_acc_avg

* train_loss_avg train_acc_avg

* log step

* wav is not exist

* wav is not exist

* decoding

* decoding

* decoding

* wechat

* decoding key

* decoding key

* decoding key

* decoding key

* decoding key

* decoding key

* dynamic batch

* start_data_split_i=0

* total_time/accum_grad

* total_time/accum_grad

* total_time/accum_grad

* update avg slice

* update avg slice

* sensevoice sanm

* sensevoice sanm

* sensevoice sanm

---------

Co-authored-by: 北念 <lzr265946@alibaba-inc.com>

* auto frontend

* update paraformer timestamp

* [Optimization] support bladedisc fp16 optimization (#1790)

* add cif_v1 and cif_export

* Update SDK_advanced_guide_offline_zh.md

* add cif_wo_hidden_v1

* [fix] fix empty asr result (#1794)

* english timestamp for valilla paraformer

* wechat

* [fix] better solution for handling empty result (#1796)

* update scripts

* modify the qformer adaptor (#1804)

Co-authored-by: nichongjia-2007 <nichongjia@gmail.com>

* add ctc inference code (#1806)

Co-authored-by: haoneng.lhn <haoneng.lhn@alibaba-inc.com>

* Update auto_model.py

修复空字串进入speaker model时报raw_text变量不存在的bug

* Update auto_model.py

修复识别出空串后spk_model内变量未定义问题

* update model name

* fix paramter 'quantize' unused issue (#1813)

Co-authored-by: ZihanLiao <liaozihan1@xdf.cn>

* wechat

* Update cif_predictor.py (#1811)

* Update cif_predictor.py

* modify cif_v1_export

under extreme cases, max_label_len calculated by batch_len misaligns with token_num

* Update cif_predictor.py

torch.cumsum precision degradation, using float64 instead

* update code

---------

Co-authored-by: 雾聪 <wucong.lyb@alibaba-inc.com>
Co-authored-by: zhaomingwork <61895407+zhaomingwork@users.noreply.github.com>
Co-authored-by: szsteven008 <97944818+szsteven008@users.noreply.github.com>
Co-authored-by: Ephemeroptera <605686962@qq.com>
Co-authored-by: 彭震东 <zhendong.peng@qq.com>
Co-authored-by: Shi Xian <40013335+R1ckShi@users.noreply.github.com>
Co-authored-by: 维石 <shixian.shi@alibaba-inc.com>
Co-authored-by: 北念 <lzr265946@alibaba-inc.com>
Co-authored-by: xiaowan0322 <wanchen.swc@alibaba-inc.com>
Co-authored-by: zhuangzhong <zhuangzhong@corp.netease.com>
Co-authored-by: Xingchen Song(宋星辰) <xingchensong1996@163.com>
Co-authored-by: nichongjia-2007 <nichongjia@gmail.com>
Co-authored-by: haoneng.lhn <haoneng.lhn@alibaba-inc.com>
Co-authored-by: liugz18 <57401541+liugz18@users.noreply.github.com>
Co-authored-by: Marlowe <54339989+ZihanLiao@users.noreply.github.com>
Co-authored-by: ZihanLiao <liaozihan1@xdf.cn>
Co-authored-by: zhong zhuang <zhuangz@lamda.nju.edu.cn>

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

* sensevoice

---------

Co-authored-by: 雾聪 <wucong.lyb@alibaba-inc.com>
Co-authored-by: zhaomingwork <61895407+zhaomingwork@users.noreply.github.com>
Co-authored-by: szsteven008 <97944818+szsteven008@users.noreply.github.com>
Co-authored-by: Ephemeroptera <605686962@qq.com>
Co-authored-by: 彭震东 <zhendong.peng@qq.com>
Co-authored-by: Shi Xian <40013335+R1ckShi@users.noreply.github.com>
Co-authored-by: 维石 <shixian.shi@alibaba-inc.com>
Co-authored-by: 北念 <lzr265946@alibaba-inc.com>
Co-authored-by: xiaowan0322 <wanchen.swc@alibaba-inc.com>
Co-authored-by: zhuangzhong <zhuangzhong@corp.netease.com>
Co-authored-by: Xingchen Song(宋星辰) <xingchensong1996@163.com>
Co-authored-by: nichongjia-2007 <nichongjia@gmail.com>
Co-authored-by: haoneng.lhn <haoneng.lhn@alibaba-inc.com>
Co-authored-by: liugz18 <57401541+liugz18@users.noreply.github.com>
Co-authored-by: Marlowe <54339989+ZihanLiao@users.noreply.github.com>
Co-authored-by: ZihanLiao <liaozihan1@xdf.cn>
Co-authored-by: zhong zhuang <zhuangz@lamda.nju.edu.cn>
  • Loading branch information
18 people committed Jun 20, 2024
1 parent 45d7aa9 commit e65b1f7
Show file tree
Hide file tree
Showing 10 changed files with 270 additions and 64 deletions.
27 changes: 14 additions & 13 deletions examples/industrial_data_pretraining/llm_asr/demo_speech2text.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,19 +9,20 @@

from funasr import AutoModel

ckpt_dir = "/nfs/beinian.lzr/workspace/GPT-4o/Exp/exp6/5m-8gpu/exp6_speech2text_linear_ddp_0609"
ckpt_id = "model.pt.ep0.90000"
jsonl = (
"/nfs/beinian.lzr/workspace/GPT-4o/Data/Speech2Text/TestData/aishell1_test_speech2text.jsonl"
)
output_dir = f"{os.path.join(ckpt_dir, ckpt_id)}"
device = "cuda:0"

ckpt_dir = sys.argv[1]
ckpt_id = sys.argv[2]
jsonl = sys.argv[3]
output_dir = sys.argv[4]
device = sys.argv[5]
if len(sys.argv) > 1:
ckpt_dir = sys.argv[1]
ckpt_id = sys.argv[2]
jsonl = sys.argv[3]
output_dir = sys.argv[4]
device = sys.argv[5]
else:
ckpt_dir = "/nfs/beinian.lzr/workspace/GPT-4o/Exp/exp6/5m-8gpu/exp6_speech2text_linear_ddp_0609"
ckpt_id = "model.pt.ep0.90000"
jsonl = "/nfs/beinian.lzr/workspace/GPT-4o/Data/Speech2Text/TestData/aishell1_test_speech2text.jsonl"
dataset = jsonl.split("/")[-1]
output_dir = os.path.join(ckpt_dir, f"inference-{ckpt_id}", dataset)
device = "cuda:0"


model = AutoModel(
model=ckpt_dir,
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
#!/usr/bin/env python3
# -*- encoding: utf-8 -*-
# Copyright FunASR (https://github.com/alibaba-damo-academy/FunASR). All Rights Reserved.
# MIT License (https://opensource.org/licenses/MIT)

import json
import os
import sys

from funasr import AutoModel


if len(sys.argv) > 1:
ckpt_dir = sys.argv[1]
ckpt_id = sys.argv[2]
jsonl = sys.argv[3]
output_dir = sys.argv[4]
device = sys.argv[5]
else:
ckpt_dir = "/nfs/beinian.lzr/workspace/GPT-4o/Exp/exp7/5m-8gpu/exp5-1-0619"
ckpt_id = "model.pt.ep6"
jsonl = (
"/nfs/beinian.lzr/workspace/GPT-4o/Data/Speech2Text/TestData/s2tchat.v20240619.test.jsonl"
)
dataset = jsonl.split("/")[-1]
output_dir = os.path.join(ckpt_dir, f"inference-{ckpt_id}", dataset)


model = AutoModel(
model=ckpt_dir,
init_param=f"{os.path.join(ckpt_dir, ckpt_id)}",
output_dir=output_dir,
device=device,
fp16=False,
bf16=False,
llm_dtype="bf16",
)


with open(jsonl, "r") as f:
lines = f.readlines()

tearchforing = False
for i, line in enumerate(lines):

key_i = f"dialog_{i}"

data_dict = json.loads(line.strip())
data = data_dict["messages"]

contents = model.model.data_template(data)

system = contents["system"]
user = contents["user"]
assistant = contents["assistant"]

system_i, user_i, assistant_i = [], [], []

contents_i = []
for j, (system_prompt, user_prompt, target_out) in enumerate(zip(system, user, assistant)):
key = f"{key_i}_turn_{j}"

if j == 0:
contents_i.append({"role": "system", "content": system_prompt})

contents_i.append({"role": "user", "content": user_prompt})
contents_i.append({"role": "assistant", "content": target_out})

res = model.generate(
input=[contents_i],
tearchforing=tearchforing,
cache={},
key=key,
)

print(res)
2 changes: 0 additions & 2 deletions funasr/__init__.py
Original file line number Diff line number Diff line change
@@ -1,8 +1,6 @@
"""Initialize funasr package."""

import os
import pkgutil
import importlib

dirname = os.path.dirname(__file__)
version_file = os.path.join(dirname, "version.txt")
Expand Down
3 changes: 2 additions & 1 deletion funasr/auto/auto_model.py
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,8 @@ def prepare_data_iterator(data_in, input_len=None, data_type=None, key=None):
if isinstance(data_i, str) and os.path.exists(data_i):
key = misc.extract_filename_without_extension(data_i)
else:
key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
if key is None:
key = "rand_key_" + "".join(random.choice(chars) for _ in range(13))
key_list.append(key)

else: # raw text; audio sample point, fbank; bytes
Expand Down
20 changes: 14 additions & 6 deletions funasr/datasets/openai_datasets/datasets.py
Original file line number Diff line number Diff line change
Expand Up @@ -283,10 +283,11 @@ def __init__(

self.pattern = re.compile(r"(<\|startofspeech\|>.*?<\|endofspeech\|>)")
# self.kwargs = kwargs
self.max_token_length = kwargs.get("max_token_length", 1024)
self.max_token_length = kwargs.get("max_token_length", 1500)
self.batch_size_scale_ratio_max = kwargs.get("batch_size_scale_ratio_max", 1.5)
self.batch_size_token_max = kwargs.get("batch_size_token_max", 2500)
self.multiturn_num_max = kwargs.get("multiturn_num_max", 5)
self.max_source_length = kwargs.get("max_source_length", 3000)

def get_source_len(self, index):
item = self.index_ds[index]
Expand Down Expand Up @@ -334,6 +335,12 @@ def __getitem__(self, index):
):
if i >= self.multiturn_num_max:
break
if len(input_ids) > self.max_token_length:
logging.info(
f"input_ids > max_token_length: {len(input_ids)}>{self.max_token_length}, {item}"
)
break

if i == 0:
source_input = f"<|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{user_prompt}<|im_end|>\n<|im_start|>assistant\n"
else:
Expand Down Expand Up @@ -372,6 +379,11 @@ def __getitem__(self, index):
frontend=self.frontend,
is_final=True,
) # speech: [b, T, d]
if speech_lengths > self.max_source_length:
logging.info(
f"speech_lengths > max_source_length: {speech_lengths}>{self.max_source_length}, {item}"
)
badcase_flag = True
if self.permute:
speech = speech.permute(0, 2, 1)
# if speech_lengths > self.batch_size:
Expand Down Expand Up @@ -399,13 +411,9 @@ def __getitem__(self, index):
fbank_mask += fbank_mask_i
fbank_lens.append(speech_lengths)

if len(input_ids) > self.max_token_length:
logging.info(
f"input_ids > max_token_length: {len(input_ids)}>{self.max_token_length}, {item}"
)
badcase_flag = True
if badcase_flag:
continue

input_ids = torch.tensor(input_ids, dtype=torch.int64) # [: self.max_token_length]
attention_mask = torch.tensor([1] * len(input_ids), dtype=torch.int32)
labels = torch.tensor(labels, dtype=torch.int64) # [: self.max_token_length]
Expand Down
15 changes: 15 additions & 0 deletions funasr/datasets/openai_datasets/index_ds.py
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,12 @@ class OpenAIIndexDSJsonl(torch.utils.data.Dataset): # torch.utils.data.Dataset
def __init__(self, path: str, **kwargs):
super().__init__()

self.max_source_length = kwargs.get("max_source_length", 3000)
self.min_source_length = kwargs.get("min_source_length", 0)
self.max_target_length = kwargs.get("max_target_length", 2048)
self.min_target_length = kwargs.get("min_target_length", 0)
self.max_token_length = kwargs.get("max_token_length", 2200)

is_training = kwargs.get("is_training", True)
if not (path.endswith(".jsonl") or path.endswith(".json")):
# jsonl list file
Expand Down Expand Up @@ -47,6 +53,15 @@ def __init__(self, path: str, **kwargs):
data = data_dict["messages"]
speech_length = data_dict.get("speech_length", -1) // 8
text_length = data_dict.get("text_length", 0)
if speech_length > self.max_source_length:
logging.info(
"speech_length: {speech_length} > {self.max_source_length}, drop it"
)
continue
if text_length > self.max_target_length:
continue

self.max_target_length = kwargs.get("max_target_length", 2048)

system, user, assistant = [], [], []
for i, item in enumerate(data):
Expand Down
6 changes: 6 additions & 0 deletions funasr/download/download_from_hub.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,12 @@ def download_from_ms(**kwargs):
from funasr.utils.install_model_requirements import install_requirements

install_requirements(requirements)
if kwargs.get("trust_remote_code", False):

import model

# from funasr.register import tables
# tables.print("model")
return kwargs


Expand Down
Loading

0 comments on commit e65b1f7

Please sign in to comment.