Sound Source Localization support for "T" microphone array #27

chengl2 · 2018-03-26T02:50:04Z

Hello,
I took one picture to show my microphone array:

I want to use 'Sound Source Localization' to estimate elevation and azimuth in my project.

We have 8 microphones. You can think of them as two linear microphone arrays, each with 4 microphones.

From microphones 1, 2, 3, 4 I get the azimuth (relative to the xOy plane).

From microphones 5, 6, 7, 8 I get the elevation (relative to the xOz plane).

With the azimuth and elevation, I can rotate the camera toward the sound location.

Can you tell me how to use ODAS to make SSL work?

Thank you.

FrancoisGrondin · 2018-03-26T13:40:09Z

Hi,

Thank you for the info. In this case, assuming omnidirectional microphones, I would do something like this. The origin could be between microphones 6 and 7 (what's labeled centre on your photo). All microphone xyz-coordinates would then be referenced to that point.

mics = (
        
        # Microphone 1
        { 
            mu = ( <mic1-x>, <mic1-y>, <mic1-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 2
        { 
            mu = ( <mic2-x>, <mic2-y>, <mic2-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },        

        # Microphone 3
        { 
            mu = ( <mic3-x>, <mic3-y>, <mic3-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 4
        { 
            mu = ( <mic4-x>, <mic4-y>, <mic4-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },               

        # Microphone 5
        { 
            mu = ( <mic5-x>, <mic5-y>, <mic5-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 6
        { 
            mu = ( <mic6-x>, <mic6-y>, <mic6-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },        

        # Microphone 7
        { 
            mu = ( <mic7-x>, <mic7-y>, <mic7-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 8
        { 
            mu = ( <mic8-x>, <mic8-y>, <mic8-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        }            
);

Moreover, to detect only sources in front of your cameras, I would aim the spatial filter in the direction of your cameras' field of view (if I understand your sketch, microphones 1-4 form a line on the z-axis, so the cameras would point in that direction). This should do it:

spatialfilter: {

    direction = ( +0.000, +0.000, +1.000 );
    angle = (80.0, 100.0);

};

Let me know if you have any questions,

Cheers

chengl2 · 2018-03-27T13:51:44Z

Thank you for your help. I have one more question.
There are 8 microphones on my board, and each microphone is one channel.
In fact, the numbers I drew on the picture above are not the real channel numbers.
How do the coordinates correspond to the channel numbers?

chengl2 · 2018-03-27T14:01:08Z

I recorded the audio of the 8 channels with Audacity and wrote the microphone numbers on the picture.

FrancoisGrondin · 2018-03-27T14:34:28Z

Hi, you can use the mapping parameter to achieve this:

mapping:
{

    map: (2, 1, 4, 3, 7, 8, 5, 6);

}

This maps mic 1 to channel 2, mic 2 to channel 1, mic 3 to channel 4, mic 4 to channel 3, mic 5 to channel 7, mic 6 to channel 8, mic 7 to channel 5 and mic 8 to channel 6.
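To double-check a mapping before putting it in the config, the same rule can be reproduced outside ODAS. The following is only a sketch of the rule as described in the comment above (position in the list is the microphone number, value is the channel number); the helper names `mic_to_channel` and `reorder_frame` are made up for illustration and are not part of ODAS.

```python
# The map from the comment above: position i (1-based) is the microphone,
# the value at that position is the channel it is read from.
MAP = (2, 1, 4, 3, 7, 8, 5, 6)


def mic_to_channel(mapping):
    """Return {microphone_number: channel_number}, both 1-based."""
    return {mic: channel for mic, channel in enumerate(mapping, start=1)}


def reorder_frame(frame, mapping):
    """Reorder one frame of per-channel samples into microphone order.

    frame[c - 1] is the sample recorded on channel c.
    The result r satisfies r[m - 1] == sample belonging to microphone m.
    """
    if sorted(mapping) != list(range(1, len(frame) + 1)):
        raise ValueError("mapping must be a permutation of the channel numbers")
    return [frame[channel - 1] for channel in mapping]


if __name__ == "__main__":
    print(mic_to_channel(MAP))
    # Dummy samples: channel 1 carries 10, channel 2 carries 20, and so on.
    print(reorder_frame([10, 20, 30, 40, 50, 60, 70, 80], MAP))
```

Tapping each microphone in turn while recording in Audacity is one way to find which channel responds, and therefore which value belongs at each position.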

chengl2 · 2018-03-28T00:50:44Z

Thank you. I got SSL data like this:

{
"timeStamp": 9,
"src": [
{ "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.284 },
{ "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.133 },
{ "x": 0.988, "y": -0.156, "z": 0.000, "E": 0.100 },
{ "x": 0.979, "y": -0.034, "z": 0.199, "E": 0.073 }
]
}
{
"timeStamp": 10,
"src": [
{ "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.309 },
{ "x": 0.972, "y": -0.104, "z": 0.209, "E": 0.138 },
{ "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.091 },
{ "x": 0.981, "y": 0.038, "z": 0.188, "E": 0.049 }
]
}

I don't know what x, y, z and E mean.

FrancoisGrondin · 2018-03-28T01:13:10Z

These are the xyz-coordinates of the direction of arrival of sound, and E is the energy level (between 0 and 1). A value of 0 means no energy, and a value of 1 means high energy. A potential source with high energy will most likely trigger the tracking of this source by the tracking module. Right now you are printing the results of the localization module in the terminal, and these can be quite noisy. If you want to look at the tracked sources, you should print the results of the tracking module instead.
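As a rough illustration of reading this output, here is a sketch that parses the back-to-back JSON objects and picks the potential source with the highest E in each frame. It assumes the stream looks exactly like the two frames posted above; `iter_json_objects` and `strongest_source` are hypothetical helper names, not ODAS functions.

```python
import json

# Two frames copied from the localization output posted above.
SAMPLE = """
{
"timeStamp": 9,
"src": [
{ "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.284 },
{ "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.133 },
{ "x": 0.988, "y": -0.156, "z": 0.000, "E": 0.100 },
{ "x": 0.979, "y": -0.034, "z": 0.199, "E": 0.073 }
]
}
{
"timeStamp": 10,
"src": [
{ "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.309 },
{ "x": 0.972, "y": -0.104, "z": 0.209, "E": 0.138 },
{ "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.091 },
{ "x": 0.981, "y": 0.038, "z": 0.188, "E": 0.049 }
]
}
"""


def iter_json_objects(text):
    """Yield each JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def strongest_source(frame):
    """Return the potential source with the highest energy E in a frame."""
    return max(frame["src"], key=lambda s: s["E"])


if __name__ == "__main__":
    for frame in iter_json_objects(SAMPLE):
        best = strongest_source(frame)
        print(frame["timeStamp"], best["x"], best["y"], best["z"], best["E"])
```

Since the localization output is noisy, the strongest potential source in a single frame is not necessarily a real talker; the tracked output is the better input for steering a camera.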

chengl2 · 2018-03-28T01:24:26Z

Thanks.
You mean I should look at the file tracks.txt? I don't understand the meaning of that data either.

{
"timeStamp": 145847,
"src": [
{ "id": 1, "tag": "dynamic", "x": 1.000, "y": -0.008, "z": 0.007, "activity": 0.998 },
{ "id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.006 },
{ "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
{ "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 }
]
}
{
"timeStamp": 145848,
"src": [
{ "id": 1, "tag": "dynamic", "x": 1.000, "y": -0.009, "z": 0.006, "activity": 0.998 },
{ "id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.000 },
{ "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
{ "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 }
]
}

Is there a document that explains the meaning of id, tag and activity?

FrancoisGrondin · 2018-03-28T01:28:15Z

Documentation is currently being written. I know it would help to have this info, which is why I'm speeding things up to get something out asap.

id is a unique id that is assigned to each newly tracked source.
tag identifies the type of tracked source: in this case "dynamic" means that the source "appeared" and was generated from the localization module, and was not set in advance by the user.
activity indicates, for the current frame, the probability that the source is active (between 0 and 1). From the log you showed me, it seems the source located at approx. x = 1, y = 0 and z = 0 is active, while the one located at approx. x = -1, y = 0 and z = 0 is inactive.
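A small sketch of how these fields might be used to keep only the sources that are currently active. Two things here are assumptions, not something stated in this thread: entries with id 0 and an empty tag are treated as unused slots (that is how they look in the log above), and the 0.5 activity threshold is an arbitrary choice.

```python
# One frame copied from the tracks.txt excerpt posted above.
FRAME = {
    "timeStamp": 145847,
    "src": [
        {"id": 1, "tag": "dynamic", "x": 1.000, "y": -0.008, "z": 0.007, "activity": 0.998},
        {"id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.006},
        {"id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000},
        {"id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000},
    ],
}


def active_sources(frame, threshold=0.5):
    """Return tracked sources whose activity is at or above the threshold.

    Assumption: id 0 marks an empty slot, so those entries are skipped.
    The default threshold is arbitrary and should be tuned.
    """
    return [
        src for src in frame["src"]
        if src["id"] != 0 and src["activity"] >= threshold
    ]


if __name__ == "__main__":
    for src in active_sources(FRAME):
        print(src["id"], src["x"], src["y"], src["z"], src["activity"])
```

On the frame above this keeps only the source near x = 1, which matches the reading of the log given in the comment.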

chengl2 · 2018-03-28T09:56:12Z

I don't quite understand the items in bold below:

#Microphone
{
mu = ( <mic-x>, <mic-y>, <mic-z> );
sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
direction = ( +0.000, +0.000, +1.000 );
angle = ( 180.0, 180.0 );
},

and

spatialfilter: {
direction = ( +0.000, +0.000, +1.000 );
angle = (80.0, 100.0);
};

Could you please explain these items with a picture for me?

chengl2 · 2018-03-28T10:23:49Z

It's a conference system. I want to detect the direction of a human voice, then control the PTZ of two cameras to point toward the speaker (PTZ here meaning up and down, left and right).
See the picture below:

Is ODAS a good fit for this?

GodCed · 2018-03-28T13:24:06Z

Hi, I suggest you take a look at this discussion about the spatial filter. What it does is adjust the gain according to the angle of arrival. When specified in the mic configuration, the gain applies to the microphone. When specified in the spatial filter, it applies to the system output.

It is used to account for the directivity of the microphones and to limit the sound "search area" to a specified zone. With your setup, I would suggest you use

direction = ( +1.000, +0.000, +0.000 );
angle = (80.0, 100.0);

for both, as your microphones and your array are listening in the X axis direction.
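To make "adjust the gain according to the angle of arrival" more concrete, here is a sketch of one plausible reading of the `direction` and `angle` pair. It is an assumption, not taken from the ODAS source: the first angle is treated as the limit up to which gain is 1, the second as the limit beyond which gain is 0, and the ramp between them is linear. The function names are made up for illustration.

```python
import math


def angle_between_deg(u, v):
    """Angle in degrees between two 3-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("vectors must be non-zero")
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def filter_gain(doa, direction, angle):
    """Gain in [0, 1] for a direction of arrival.

    ASSUMPTION: angle = (full_gain_limit, zero_gain_limit) in degrees,
    with a linear transition in between. The real ODAS curve may differ.
    """
    full_limit, zero_limit = angle
    theta = angle_between_deg(doa, direction)
    if theta <= full_limit:
        return 1.0
    if theta >= zero_limit:
        return 0.0
    return (zero_limit - theta) / (zero_limit - full_limit)


if __name__ == "__main__":
    direction = (1.0, 0.0, 0.0)
    for doa in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]:
        print(doa, filter_gain(doa, direction, (80.0, 100.0)))
```

Under this reading, `angle = ( 180.0, 180.0 )` gives a gain of 1 in every direction, which is consistent with the earlier suggestion to use it for omnidirectional microphones.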

GodCed · 2018-03-28T13:32:46Z

Considering your conference system, I personally used ODAS to add an overlay to a video stream to show audio sources and also to track an audio source with a PTZ camera.

With your setup, you should be able to use the tracking module to aim your camera by converting x, y and z, which represent a direction vector, to an azimuth and elevation. If you have an idea of the distance between the speaker and the camera, you may want to account for the offset between your array origin and your cameras' origin, to improve your aiming precision.
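A minimal sketch of that conversion, under an assumed convention: azimuth is measured in the x-y plane from the +x axis toward +y, and elevation is measured from the x-y plane toward +z. Your camera's pan and tilt zero positions may differ, so treat the axes as something to verify. `aim_from_camera` is a hypothetical helper that applies the offset correction when a rough speaker distance is known.

```python
import math


def direction_to_az_el(x, y, z):
    """Convert a direction vector to (azimuth, elevation) in degrees.

    Assumed convention: azimuth from +x toward +y in the x-y plane,
    elevation from the x-y plane toward +z.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.asin(max(-1.0, min(1.0, z / norm))))
    return azimuth, elevation


def aim_from_camera(x, y, z, distance, camera_offset):
    """Pan/tilt angles as seen from a camera that is not at the array origin.

    distance: assumed range from the array origin to the speaker (meters).
    camera_offset: camera position relative to the array origin (meters).
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    # Estimated speaker position in the array's coordinate frame.
    px, py, pz = (distance * c / norm for c in (x, y, z))
    cx, cy, cz = camera_offset
    return direction_to_az_el(px - cx, py - cy, pz - cz)


if __name__ == "__main__":
    # Tracked source from the log above, camera assumed 0.5 m above the array.
    print(direction_to_az_el(1.000, -0.008, 0.007))
    print(aim_from_camera(1.000, -0.008, 0.007, 2.0, (0.0, 0.0, 0.5)))
```

The farther away the speaker is relative to the camera offset, the less the correction matters, so a rough distance estimate is usually enough.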

chengl2 · 2018-03-29T02:20:47Z

Thank you very much.

If I want to check whether ODAS gives me the right direction in real time, what should I do?

GodCed · 2018-03-29T02:38:03Z

I suggest you have a look at ODAS Studio. It’s a desktop app built to display ODAS data in real time. You can see acoustic energy and tracked sources in real time, both in azimuth-elevation and unit-sphere x-y-z format.

chengl2 · 2018-03-29T09:28:59Z

I tried ODAS Studio, and it's powerful.

Even when nobody is talking, it draws a lot of points near the x axis:

I tried adjusting the "energy range", and it helps, but people's voices also seem to get filtered out a lot.

Standing 2 meters away, I have to speak very loudly to be detected by ODAS. How can I fine-tune this?

Is there a way to filter out noise while staying sensitive to human voice?

FrancoisGrondin · 2018-03-29T16:07:27Z

This is strange. Seems like there is always a noise source in front of your setup, which I doubt is true in reality. Can you provide us with your config file, and maybe raw recordings from the mic array?

GodCed · 2018-03-29T17:01:51Z

I would also suggest you stop ODAS, do a recording in Audacity and retry. Sometimes a weird glitch happens when opening the sound card through ODAS and the card output is corrupted. Opening it in another app seems to solve the issue.

chengl2 · 2018-03-30T01:36:14Z

Here are my cfg file and the audio data recorded with Audacity.

chengl.cfg.zip

audacity_project.zip

chengl2 · 2018-04-08T06:11:36Z

I made a recording in Audacity and retried, and I also tried the configuration below. The noise source is still there.

chengl.cfg.zip

FrancoisGrondin · 2018-04-08T22:12:37Z

Thank you for this feedback. I have been quite busy lately, but I'll try to run your data in the coming days! I'll keep you updated!

chengl2 · 2018-04-12T03:40:55Z

Hello,
I found that the DC offsets of my 8 audio channels are all below 0, and the offset values differ from one channel to another.

How can I adjust the DC offset to 0 in ODAS?
Could this DC offset be a cause of my SSL problem?
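For reference, a per-channel DC offset can be measured and removed outside ODAS as a preprocessing step on a recording. The sketch below assumes interleaved integer samples (channel 1, channel 2, ..., then the next frame) and simply subtracts each channel's mean. The function names are hypothetical, and nothing here says whether ODAS handles DC internally or whether the offset is the cause of the problem discussed in this thread.

```python
def dc_offsets(interleaved, n_channels):
    """Return the mean value (DC offset) of each channel.

    interleaved: flat sequence of samples, one frame after another.
    """
    if n_channels <= 0 or len(interleaved) % n_channels != 0:
        raise ValueError("sample count must be a multiple of n_channels")
    n_frames = len(interleaved) // n_channels
    if n_frames == 0:
        raise ValueError("no samples given")
    return [
        sum(interleaved[ch::n_channels]) / n_frames
        for ch in range(n_channels)
    ]


def remove_dc(interleaved, n_channels):
    """Return a copy of the samples with each channel's mean subtracted."""
    offsets = dc_offsets(interleaved, n_channels)
    return [
        sample - offsets[i % n_channels]
        for i, sample in enumerate(interleaved)
    ]


if __name__ == "__main__":
    # Dummy 2-channel signal with negative offsets, two frames long.
    samples = [1, -10, 3, -12]
    print(dc_offsets(samples, 2))
    print(remove_dc(samples, 2))
```

Subtracting the mean of a whole recording only works offline; for a live stream a high-pass filter would be the usual alternative.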

huotuichang1 · 2018-06-12T05:51:39Z

What tool did you use to calculate these values?

hritiksth764 · 2022-02-10T05:57:04Z

Hello, I found that the DC offsets of my 8 audio channels are all below 0, and the offset values differ from one channel to another.

How can I adjust the DC offset to 0 in ODAS? Could this DC offset be a cause of my SSL problem?

How are you getting these values?

chengl2 mentioned this issue Mar 26, 2018

Please tell us what you do with ODAS ;) #25

Open

FrancoisGrondin self-assigned this Mar 26, 2018

FrancoisGrondin added the help wanted label Mar 26, 2018
