-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathospac.1
301 lines (285 loc) · 8.42 KB
/
ospac.1
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
.\" Process this file with
.\" groff -man -Tascii ospac.1
.\"
.TH OSPAC 1 "May 2 2016"
.SH NAME
ospac \- open source podcast audio chain
.SH SYNOPSIS
.B ospac [-voice] [-mix]
.I input1.wav input2.wav
.B ...
.B --output
.I output.wav
.SH DESCRIPTION
.B ospac
will take a multi-channel recording of a conversation and master this to
a high-quality mix-down with support for intro and outro. It was developed
due to the need of a batch solution for audio podcast creation.
.SH "INFORMATION OPTIONS"
.IP --help
Display help
.IP "--verbosity [n]"
Set verbosity to level
.I [n]
.SH "OUTPUT OPTIONS"
.IP --spatial
Produce a stereo output with inter aural delay.
.IP --stereo
Standard mode to produce a stereo output.
.IP --mono
Create a mono output.
.IP --multi
Create a multichannel file where each first and further channels in
each input phases are mixed to each other
.IP "--set-stereo-level [n]"
Set maximum volume factor for stereo and spatial mode (0.9) -
the lower the stronger the effect is
.IP "--set-stereo-spatial [n]"
Set maximum inter-aural delay distance in meter in spatial mode (0.03) -
the higher the stronger - the average distance between human ears are
0.22 (meter) so you probably should not go above that limit
.SH "OUTPUT TARGETS"
.IP "--output [file]"
Write final output to
.I [file]
in wave format
.IP "--mp3 [file]"
Write final output to
.I [file]
in mp3 format using the external
.B lame
command which has to be in the path
.IP "--ogg [file]"
Write final output to
.I [file]
in ogg format using the external
.B oggenc
command which has to be in the path
.IP "--plot [file]"
Write a wave form image to
.I [file]
in netpbm PPM format
.SH "OUTPUT META DATA"
.IP "--title [text]"
Set title tag to
.I [text]
if this exists in output
.IP "--artist [text]"
Set artist tag to
.I [text]
if this exists in output
.IP "--album [text]"
Set album tag to
.I [text]
if this exists in output
.IP "--comment [text]"
Set comment tag to
.I [text]
if this exists in output
.IP "--category [text]"
Set category tag to
.I [text]
if this exists in output
.IP "--episode [text]"
Set episode tag to
.I [text]
if this exists in output
.IP "--year [text]"
Set year tag to
.I [text]
if this exists in output
.IP "--image [file]"
Set image tag to
.I [file]
if this exists in output
.SH "NEW SEGMENT OPTIONS"
.IP --voice
Start of voice channel segment (default)
.IP --mix
Start pre-mixed channel segment
.IP --raw
Start of raw channel segment
.SH "SEGMENT TRANSITION OPTIONS"
.IP "--fade [s]"
Fading transition over
.I [s]
seconds
.IP "--overlap [s]"
Overlapping transition over
.I [s]
seconds
.IP "--parallel"
Render the previous and next segment in parallel. By this, you can have
independent processing for parallel audio channels. For example, for music
or noise channels. Be aware, that filters such as
.I --skip
or
.I --noise
or
.I --trim
will most likely generate out-of-sync audio as the cuts are not synchronized
between parallel segments.
.SH "CROSSTALK FILTER OPTIONS"
.IP --xgate
Enable robust crosstalk gate: This tries to mute channels that do not
contribute significant input to cancel crosstalk.
Each channel is given an activity score computed by its windowed
energy multiple compared to its assumed silence level. The channel
with highest score has 100%, the rest are scaled down linearily
with their score compared to the highest.
.IP --no-xgate
Disable crosstalk gate
.IP --xfilter
Enable experimental crosstalk filter: This tries to mute channels that
contain mainly input that was recorded by other channels before, i.e.
crosstalk.
.IP --no-xfilter
Disable crosstalk filter for current input section
.SH "ADAPTIVE SILENCE SKIP OPTIONS"
.IP --skip
Soft skip silent passages over 0.5s length
.IP "--skip-level [n]"
Fraction of maximum level considered silence (0.01)
.IP "--skip-order [n]"
Order of time of detected silence that is to be skipped.
A level of 0 only skips the 0.5s,
a level 0.5 skips about the square root of the silence,
a level of 1 skips all of the silence (default: 0.75).
.IP "--skip-target [n]"
Try to adapt given parameters of level and order to achieve the given
fraction of length. A parameter of 0.9 will try to reduce the file
to 90% of its size. Note, that the result will sound less acceptible
as soon as all silence space was used up. The algorithm iterates to
the target size, and is therefore only suitable for small files.
.IP --no-skip
Do not skip any content
.IP --noise
Skip all actual signal to get an impression of the noise on the channel.
.IP --trim
Skip silence from the beginning or the end of channels.
.SH "LEVELING, EQUALIZER AND NORMALIZATION OPTIONS"
.IP --leveler
Enable selective leveler: Enable windowed energy based leveler for current input section.
Depending on the energy level in a window around the current position
the filter decideds whether this is an active channel or not. In
case of an active channel it tries to amplify the current position
such the a target energy for the window is achieved. If the channel
is deemed inactive, the channel is de-amplified. To avoid to fast
changes in amplification levels, the levels are averaged over time.
.IP "--level-mode {single|stereo|multi}"
The leveling can work on single channels, on each two consecutive stereo
channels, or on all channels at once. When using the single mode on a stereo
signal, it might destroy wanted differences in loudness between the two channels-
therefore they should be joined in the leveling. The multi mode joins all channels
for the analysis, and levels all channels by the same amount. The single mode
is default for voice segments, the stereo mode is default for mix segments.
.IP "--target [n]"
Set average target L2 energy
.I [n]
for leveler. The L2 energy is normed to one entry, so a level of
2000-6000 seems suitable. Default: 3000.
.IP --no-leveler
Disable selective leveler
.IP "--factor [n]"
Multiply channels by factor
.I [n]
with sigmoid limiter (default: 1.25):
Amplify the signal by given factor and pass the resulting signal
through a sigmoid function to prevent overdrive, but eventual
distortion cannot be avoided.
.IP --no-factor
Disable channel multiplier
.IP --eqvoice
Attenuate frequency bands for improved diction
.IP --no-eqvoice
Do not attenuate frequency bands
.IP --analyze
Analyze frequency band components of active segments.
.IP --normalize
Normalize final mix
.IP --no-normalize
Disable final normalization
.IP "--lowpass [frequency] [transition]"
Apply a lowpass filter to the current audio segment up to
.I [frequency]
Hertz. The
.I [transition]
specifies the quality of the filter in Hertz of transition.
.IP "--highpass [frequency] [transition]"
Apply a highpass filter to the current audio segment from
.I [frequency]
Hertz. The
.I [transition]
specifies the quality of the filter in Hertz of transition.
.IP "--bandpass [low] [high] [transition]"
Apply a bandpass filter to the current audio segment, starting from
.I [low]
Hertz up to
.I [high]
Hertz. The
.I [transition]
specifies the quality of the filter in Hertz of transition.
.SH "INPUT AUDIO FILES"
.IP "[wave file]"
Load the file
.I [wave file]
and regard each channel as an individual input.
.IP "--left [wave file]"
Load the file
.I [wave file]
and only use the left channel if it was in stereo.
.IP "--right [wave file]"
Load the file
.I [wave file]
and only use the right channel if it was in stereo.
.IP "--to-mono [wave file]"
Load the file
.I [wave file]
and use a mono-mixdown of all channels in that file.
.IP "--ascii [sample rate] [text file]"
Load the ascii file
.I [text file]
which is assumed to have a sample rate of
.I [sample rate]
Hertz. The values can be integer or float values separated by
white space. The input is rescaled to [-32000,32000] and comments
starting with '#' are discarded until the next end of line.
.SH EXAMPLES
Mix 2 mono voice recordings with crosstalk filter, leveling and normalization:
.PP
.nf
.RS
ospac person1.wav person2.wav --output target.wav
.RE
.fi
.PP
Mix podcast with stereo intro and outro:
.PP
.nf
.RS
ospac --mix in.wav --overlap 4 \\
--voice person1.wav person2.wav --overlap 4 \\
--mix out.wav --output target.wav
.RE
.fi
.PP
Again with shortened options:
.PP
.nf
.RS
ospac -mi in.wav -ov 4 -vo person1.wav person2.wav -ov 4 -mi out.wav -out target.wav
.RE
.fi
.PP
Just run the crosstalk filter and create an un-mixed multi-channel output:
.PP
.nf
.RS
ospac --multi --raw t1.wav t2.wav t3.wav t4.wav --xfilter --output multi.wav
.RE
.fi
.PP
.SH AUTHOR
Sebastian Ritterbusch <ospac at ritterbusch dot de>
.SH "SEE ALSO"
.BR sox (1), lame (1), oggenc (1), pnmtopng (1), ppm (5)