## 1. Model Loading

In [1]:
import torch
from transformers import pipeline

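pipe = pipeline(
    "automatic-speech-recognition",
    model="/opt/Data/ModelWeight/openai/whisper-large-v3",  # local copy of the whisper-large-v3 weights
    chunk_length_s=30,  # split long audio into 30-second windows
    device="cuda:0"  # run on the first GPU
)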

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


## 2. Capability Tests

<b>English speech recognition</b>

speech transcription

In [2]:
prediction = pipe("/opt/WorkSpace/whisper/voice/sample1.flac",
                  return_timestamps=True)

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start} - {end}\t:{chunk['text']}")

Due to a bug fix in https://github.com/huggingface/transformers/pull/28687 transcription using a multilingual Whisper will default to language detection followed by transcription instead of translation to English.This might be a breaking change for your use case. If you want to instead always translate your audio to English, make sure to pass `language='en'`.


0.0 - 13.7	: going along slushy country roads and speaking to damp audiences in draughty schoolrooms day after day for a fortnight he'll have to put in an appearance at some place of worship on sunday morning and he can come to us immediately afterwards


In [3]:
prediction = pipe("/opt/WorkSpace/whisper/voice/sample1.flac",
                  return_timestamps=True,
                  generate_kwargs={"language": "english"})

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start} - {end}\t:{chunk['text']}")

0.0 - 13.7	: going along slushy country roads and speaking to damp audiences in draughty schoolrooms day after day for a fortnight he'll have to put in an appearance at some place of worship on sunday morning and he can come to us immediately afterwards
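
The same print loop appears in every cell of this notebook. As a minimal sketch, it could be factored into a helper. The name `format_chunks` is hypothetical. The code assumes `prediction["chunks"]` is a list of dicts with a `timestamp` pair and a `text` string, as the outputs above suggest. Showing `?` for a missing end time is also an assumption, not something this notebook demonstrates.

```python
def format_chunks(prediction, sep=" - "):
    """Return one 'start - end<TAB>:text' line per chunk of a pipeline result."""
    lines = []
    for chunk in prediction["chunks"]:
        start, end = chunk["timestamp"]
        # Assumption: the end time may be missing (None); show "?" in that case.
        end_label = "?" if end is None else str(end)
        lines.append(f"{start}{sep}{end_label}\t:{chunk['text']}")
    return lines


# Hand-written stand-in for a pipeline result, so the sketch runs without a model.
sample = {
    "chunks": [
        {"timestamp": (0.0, 13.7), "text": " going along slushy country roads"},
    ]
}
for line in format_chunks(sample):
    print(line)
```

With a real result, `print("\n".join(format_chunks(prediction)))` would replace the loop in each cell.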


<b>English speech translation</b>

speech translation

In [6]:
prediction = pipe("/opt/WorkSpace/whisper/voice/sample1.flac",
                  return_timestamps=True,
                  batch_size=4,
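                  # Whisper's "translate" task only targets English. Forcing the target
                  # language with task="transcribe" is a workaround for other languages.
                  generate_kwargs={"language": "chinese", "task": "transcribe"})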

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start} - {end}\t:{chunk['text']}")

0.0 - 2.12	:逛逛街道的街道
2.12 - 3.84	:和沙漠的观众
3.84 - 5.64	:在每天一周的学校房间
5.64 - 7.04	:在一天一夜
7.04 - 8.8	:他要在星期天早上
8.8 - 10.16	:在某个宗教的地方
10.16 - 11.28	:出席
11.28 - 13.7	:他可以马上来到我们那里


In [16]:
prediction = pipe("/opt/WorkSpace/whisper/voice/sample1.flac",
                  return_timestamps=True,
                  generate_kwargs={"language": "french", "task": "transcribe"})

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start} - {end}\t:{chunk['text']}")

0.0 - 6.88	: en passant par les rues de la nature et en parlant avec des spectateurs d'un quart d'heure dans des écoles de boulot.
6.88 - 13.7	: Il va devoir se mettre à l'apparition à un endroit de prière le dimanche matin, et il peut venir nous demander tout de suite.
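
The cells in this section differ only in the `generate_kwargs` dictionary. As a minimal sketch, a hypothetical helper `build_generate_kwargs` (not part of `transformers`) could assemble that dictionary and reject a misspelled task before the model runs. It assumes the only valid tasks are `transcribe` and `translate`, the two task names used in this notebook.

```python
VALID_TASKS = {"transcribe", "translate"}


def build_generate_kwargs(language=None, task="transcribe"):
    """Build the dict passed as generate_kwargs=... to the pipeline call."""
    if task not in VALID_TASKS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(VALID_TASKS)}")
    kwargs = {"task": task}
    # Leave the language out to let the model detect it from the audio.
    if language is not None:
        kwargs["language"] = language
    return kwargs


print(build_generate_kwargs("french"))
print(build_generate_kwargs("zh", task="translate"))
```

The result would be passed as `pipe(path, return_timestamps=True, generate_kwargs=build_generate_kwargs("french"))`.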


## 2.1 Automatic Speech-to-Text Conversion

<b>Speech to Chinese text</b>

In [4]:
prediction = pipe("/opt/WorkSpace/whisper/voice/c++对象模型探索课程详细介绍.aac",
                  batch_size=4,
                  return_timestamps=True)  # optionally: generate_kwargs={"language": "zh", "task": "transcribe"}

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start}-{end}:{chunk['text']}")

0.0-1.56:大家好
1.56-4.3:非常高兴和大家见面
4.3-8.18:这门课是老师的一门新课
8.18-11.14:名字叫西佳佳对象模型探索
11.14-13.92:也可以说是老师的第四门课
13.92-16.5:前三门课分别是
16.5-18.54:第一门课是西语言入门
18.54-22.12:第二门课是咱们西佳佳从入门到精通
22.12-24.56:西佳佳十一十四十七
24.56-29.0:第三门课是西佳佳十一并发与多线程
41.56-44.14:去微性的这么一门课那么咱们现在讲的是
44.14-45.8:这门课的第一章
45.8-49.58:西加加对象模型探索课程介绍的第一节
49.58-53.1:西加加对象模型探索课程详细介绍
53.1-54.66:那本节课呢
54.66-56.66:咱们会讲如下这六个问题
56.66-59.14:前沿和学习效果
59.14-61.84:西加加对象模型研究的是什么
61.84-64.52:学习这门课程的基础要求
64.52-66.3:简要自我介绍
66.3-67.9:讲解参照
67.9-92.88:演示环境说明咱们这边课的标题大家注意是西加加对象模型探索西加加对象模型这个概念
92.88-94.32:提起来的话
94.32-97.54:我相信有一部分朋友可能会非常陌生
97.54-99.84:甚至可能从来没听过
99.84-101.66:我同时也相信
101.66-105.32:还有另外一部分朋友非常熟悉
105.32-130.3:但实际上西加加对象模型有这么一本书叫深度探索
130.3-131.84:西加加对象模型
131.84-133.64:它的英文呢
133.64-134.8:就是inside
134.8-137.32:the
137.32-138.72:西加加
138.72-141.44:object model
141.44-142.98:它是这么一本书
142.98-144.64:那么这本书呢
144.64-146.52:可以说在咱们西加加界呀
146.52-170.0:具有很高的美誉度他主要研究的是很多不被外人所知的西佳佳对象内部工作原理
170.0-178.0:研究西佳佳对象内部工作原理
178.0-184.0:底层的一些具体实现机制方面的知识
184.0-211.52:所以大家可以这么理解西佳佳
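
The transcript above jumps from 29.0 s to 41.56 s, while the English translation of the same file later in this section has text for roughly that interval (29.72 to 41.62). This suggests some speech was skipped in this run. A minimal sketch for flagging such jumps automatically follows. The helper name `find_gaps` and the 1-second threshold are hypothetical choices, not part of `transformers`.

```python
def find_gaps(chunks, min_gap=1.0):
    """Return (previous_end, next_start) pairs that are more than min_gap seconds apart."""
    gaps = []
    previous_end = None
    for chunk in chunks:
        start, end = chunk["timestamp"]
        if previous_end is not None and start is not None and start - previous_end > min_gap:
            gaps.append((previous_end, start))
        # Keep the last known end time if this chunk has none.
        if end is not None:
            previous_end = end
    return gaps


# Three consecutive entries copied from the output above.
sample = [
    {"timestamp": (24.56, 29.0), "text": "第三门课是西佳佳十一并发与多线程"},
    {"timestamp": (41.56, 44.14), "text": "去微性的这么一门课那么咱们现在讲的是"},
    {"timestamp": (44.14, 45.8), "text": "这门课的第一章"},
]
print(find_gaps(sample))
```

Running `find_gaps(prediction["chunks"])` on a real result would list the intervals worth re-checking against the audio.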

<b>Speech translated into English</b>

In [7]:
prediction = pipe("/opt/WorkSpace/whisper/voice/c++对象模型探索课程详细介绍.aac",
                  batch_size=4,
                  return_timestamps=True,
                  generate_kwargs={"language": "zh", "task": "translate"})

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start}-{end}:{chunk['text']}")

0.0 - 2.48	: Hello, everyone
2.48 - 5.28	: Nice to meet you
5.28 - 8.96	: This class is a new class from the teacher
8.96 - 12.04	: The name is Xijiajia object model exploration
12.04 - 14.64	: It can also be said to be the fourth class of the teacher
14.64 - 19.16	: The first three classes are the first class is the west language entrance
19.16 - 22.68	: The second class is our Xijiajia from entry to the end
22.68 - 25.42	: Xijiajia 11 14 17 The third class is Xijiajia from entry to completion Xijiajia 11 14 17
25.44 - 27.68	: The third class is Xijiajia 11
27.68 - 29.72	: The occurrence and the multi-line
29.72 - 33.32	: So this class is called Xijiajia object model exploration
33.32 - 35.66	: It's the fourth class of the teacher
35.66 - 36.68	: So this class
36.68 - 39.44	: He's a little bit of a challenge
39.44 - 41.62	: There's also a little bit of a fun lesson
42.86 - 67.02	: So what we're talking about now is the first chapter of this class The first chapter of the exploration o
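
Timestamped translations like the one above map naturally onto subtitles. The following is a minimal sketch of converting chunks to SubRip (`.srt`) text. The helper names `srt_time` and `to_srt` are hypothetical, and the code assumes every chunk has both a start and an end time.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(chunks):
    """Render chunks as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        start, end = chunk["timestamp"]  # assumption: neither value is None
        text = chunk["text"].strip()
        blocks.append(f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(blocks)


# The first two entries of the output above.
sample = [
    {"timestamp": (0.0, 2.48), "text": " Hello, everyone"},
    {"timestamp": (2.48, 5.28), "text": " Nice to meet you"},
]
print(to_srt(sample))
```

Writing `to_srt(prediction["chunks"])` to a file with a `.srt` extension would give a subtitle track for the recording.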

In [5]:
prediction = pipe("/opt/WorkSpace/whisper/voice/c++模板及其实例化详细分析.aac",
                  batch_size=4,
                  return_timestamps=True)

chunks = prediction["chunks"]
for chunk in chunks:
    start, end = chunk["timestamp"]
    print(f"{start}-{end}:{chunk['text']}")

0.0-1.16:大家好
1.16-4.02:非常高兴又跟大家见面了
4.02-5.52:本节课是第七章
5.52-8.48:对象模型之巅的第一节
8.48-11.28:模板及其实力化分析
11.28-14.84:本节课咱们讲三个主要话题
14.84-16.38:第一函数模板
16.38-19.08:第二类模板的实力化分析
19.08-21.88:包括模板中的美举类型
21.88-25.3:类模板中的静态成员变量
25.3-27.36:类模板的实力化
27.36-50.0:成员函数的实力化开始咱们今天的课程第一个问题
50.0-51.0:函数模板
51.0-53.0:咱们拿着代码
55.0-57.0:tablet
57.0-59.0:class t
60.0-62.0:t f u n c
62.0-64.0:add 咱们做个加法
64.0-91.02:t a t就这么简单就这么一个函数模板然后咱们这个主函数中F1C中咱们这么来CRT
91.02-92.42:调用一下
92.42-93.62:F1C ADD
93.62-95.04:12给个14
95.04-101.18:1.2
101.18-102.36:好
102.36-103.4:考察F5走一个
103.4-105.62:12加14
105.62-107.56:看到了
107.56-108.22:得到26
108.22-130.74:得到26编译器它有一个推断它在编译的时候就根据咱们调用函数模板的这么一个调用那么编译器就能够
130.74-132.1:推断出来T的类型
132.1-133.52:推断出来T的类型
133.52-139.3:也就是说T类型的推断
139.3-141.68:是编译器在编译的时候
141.68-144.18:编译器在编译的时候
144.18-172.18:根据针对F1C型T的类型推断
172.18-179.36:是编译器在编译的时候
179.36-185.44:根据针对FNCADD的调用来确定
185.44-188.34:我们验证一下这事
188.34-210.82:理解一下这句话这个往下走走走走找到v367的文件夹里边有一个confirm命令提示符打开打开之后呢
210.82-213.94:我们找到这个prd100.cpp文件的位置
213.94-215.12:我们又看一看
215.1