Sound Source Localization support for "T" microphone array #27

Open
chengl2 opened this issue Mar 26, 2018 · 23 comments

@chengl2

chengl2 commented Mar 26, 2018

Hello,
I took one picture to show my microphone array:
the_construction

I want to use 'Sound Source Localization' to estimate elevation and azimuth in my project.

We have 8 microphones. You can think of them as two linear microphone arrays, each with 4 microphones.

From microphones 1, 2, 3, 4 I get the azimuth (relative to the xOy plane).

From microphones 5, 6, 7, 8 I get the elevation (relative to the xOz plane).

With the azimuth and elevation, I can rotate the camera toward the sound location.

Can you tell me how to use ODAS to make SSL work?

thank you.

@FrancoisGrondin
Member

Hi,

Thank you for the info. In this case, assuming omnidirectional microphones, I would do something like this. The origin could be between microphones 6 and 7 (what's labeled "centre" on your photo). All microphone xyz-coordinates would then be referenced to that point.

mics = (
        
        # Microphone 1
        { 
            mu = ( <mic1-x>, <mic1-y>, <mic1-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 2
        { 
            mu = ( <mic2-x>, <mic2-y>, <mic2-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },        

        # Microphone 3
        { 
            mu = ( <mic3-x>, <mic3-y>, <mic3-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 4
        { 
            mu = ( <mic4-x>, <mic4-y>, <mic4-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },               

        # Microphone 5
        { 
            mu = ( <mic5-x>, <mic5-y>, <mic5-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 6
        { 
            mu = ( <mic6-x>, <mic6-y>, <mic6-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },        

        # Microphone 7
        { 
            mu = ( <mic7-x>, <mic7-y>, <mic7-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        },

        # Microphone 8
        { 
            mu = ( <mic8-x>, <mic8-y>, <mic8-z> ); 
            sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
            direction = ( +0.000, +0.000, +1.000 );
            angle = ( 180.0, 180.0 );
        }            
);

Moreover, to detect only sources in front of your cameras, I would aim the spatial filter in the direction of your cameras' field of view (if I understand your sketch correctly, microphones 1-4 form a line on the z-axis, so the cameras would point in that direction). This should do it:

spatialfilter: {

    direction = ( +0.000, +0.000, +1.000 );
    angle = (80.0, 100.0);

};    

Let me know if you have any questions,

Cheers

@chengl2
Author

chengl2 commented Mar 27, 2018

Thank you for your help. I have one more question.
There are 8 microphones on my board, and each microphone is one channel.
In fact, the numbers I drew on the picture above are not the real channel numbers.
How do the coordinates correspond to the channel numbers?

@chengl2
Author

chengl2 commented Mar 27, 2018

I recorded the audio of the 8 channels with Audacity and drew the microphone numbers on the picture.
selection_002

@FrancoisGrondin
Member

Hi, you can use the mapping parameter to achieve this:

mapping:
{

    map: (2, 1, 4, 3, 7, 8, 5, 6);

}

This maps mic 1 to channel 2, mic 2 to channel 1, mic 3 to channel 4, mic 4 to channel 3, mic 5 to channel 7, mic 6 to channel 8, mic 7 to channel 5 and mic 8 to channel 6.
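
To make the mapping concrete, here is a minimal Python sketch of the reordering described above. It is only an illustration of the idea, not ODAS code: the function name `reorder_channels` and the list-based frame representation are assumptions made for this example.

```python
def reorder_channels(frame, channel_map):
    """Reorder one multichannel frame from hardware order into mic order.

    frame       -- one sample per hardware channel (channel 1 first)
    channel_map -- 1-based channel number for mic 1, mic 2, ... (as in `map`)
    This helper is hypothetical; it only mirrors the mapping described above.
    """
    return [frame[channel - 1] for channel in channel_map]


# Mapping from the thread: mic 1 <- channel 2, mic 2 <- channel 1, ...
channel_map = (2, 1, 4, 3, 7, 8, 5, 6)
hardware_frame = ["ch1", "ch2", "ch3", "ch4", "ch5", "ch6", "ch7", "ch8"]
mic_frame = reorder_channels(hardware_frame, channel_map)
print(mic_frame)
```

With this mapping, the first entry of `mic_frame` is the sample recorded on hardware channel 2, which is the signal the configuration treats as microphone 1.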

@chengl2
Author

chengl2 commented Mar 28, 2018

Thank you. I got SSL data like the following:

{
    "timeStamp": 9,
    "src": [
        { "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.284 },
        { "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.133 },
        { "x": 0.988, "y": -0.156, "z": 0.000, "E": 0.100 },
        { "x": 0.979, "y": -0.034, "z": 0.199, "E": 0.073 }
    ]
}
{
    "timeStamp": 10,
    "src": [
        { "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.309 },
        { "x": 0.972, "y": -0.104, "z": 0.209, "E": 0.138 },
        { "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.091 },
        { "x": 0.981, "y": 0.038, "z": 0.188, "E": 0.049 }
    ]
}

I don't know what x, y, z and E mean.

@FrancoisGrondin
Member

These are the xyz-coordinates of the direction of arrival of the sound, and E is the energy level (between 0 and 1). A value of 0 means no energy, and a value of 1 means high energy. A potential source with high energy will most likely trigger tracking of that source by the tracking module. Right now you are printing the results of the localization module in the terminal, which can be quite noisy. If you want to look at the tracked sources, you should print the results of the tracking module instead.
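
As a rough illustration of reading this output, the Python sketch below parses a stream of back-to-back JSON objects like the ones posted above and picks the potential source with the highest E in each frame. The helper names (`iter_json_objects`, `strongest_potential_source`) are hypothetical, and the sample text is copied from the first frame in the thread.

```python
import json


def iter_json_objects(text):
    """Yield each JSON object from a string of concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between consecutive objects.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def strongest_potential_source(frame):
    """Return the potential source with the highest energy E in one frame."""
    return max(frame["src"], key=lambda src: src["E"])


sample = """
{ "timeStamp": 9, "src": [
    { "x": 1.000, "y": -0.000, "z": 0.000, "E": 0.284 },
    { "x": -0.981, "y": 0.151, "z": 0.118, "E": 0.133 },
    { "x": 0.988, "y": -0.156, "z": 0.000, "E": 0.100 },
    { "x": 0.979, "y": -0.034, "z": 0.199, "E": 0.073 } ] }
"""

frames = list(iter_json_objects(sample))
best = strongest_potential_source(frames[0])
print(frames[0]["timeStamp"], best)
```

Because the localization output is noisy, the strongest potential source in a single frame is only a hint; the tracked sources are the more reliable signal.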

@chengl2
Author

chengl2 commented Mar 28, 2018

Thanks.
Do you mean I should look at the file tracks.txt? I don't know the meaning of that data either.

{
    "timeStamp": 145847,
    "src": [
        { "id": 1, "tag": "dynamic", "x": 1.000, "y": -0.008, "z": 0.007, "activity": 0.998 },
        { "id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.006 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 }
    ]
}
{
    "timeStamp": 145848,
    "src": [
        { "id": 1, "tag": "dynamic", "x": 1.000, "y": -0.009, "z": 0.006, "activity": 0.998 },
        { "id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 },
        { "id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000 }
    ]
}

Is there a document that explains the meaning of id, tag and activity?

@FrancoisGrondin
Member

Documentation is currently being written. I know it would help to have this info, which is why I'm speeding things up to get something out ASAP.

id is a unique ID assigned to each newly tracked source.
tag identifies the type of tracked source: in this case, "dynamic" means that the source "appeared" and was generated by the localization module, rather than being set in advance by the user.
activity indicates, for the current frame, the probability that the source is active (between 0 and 1). From the log you showed me, it seems the source located at approx. x = 1, y = 0, z = 0 is active, while the one located at approx. x = -1, y = 0, z = 0 is inactive.
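
The fields above could be used roughly as follows. This Python sketch keeps only the tracked sources that look active in one frame; the 0.5 threshold and the treatment of `id` 0 as an empty slot are assumptions drawn from the log above, not documented ODAS behaviour, and `active_sources` is a hypothetical helper.

```python
def active_sources(frame, threshold=0.5):
    """Return tracked sources whose activity is at or above the threshold.

    Entries with id 0 are skipped: in the log above they carry an empty
    tag and all-zero coordinates, so they are assumed to be unused slots.
    The default threshold of 0.5 is an arbitrary choice for illustration.
    """
    return [
        src for src in frame["src"]
        if src["id"] != 0 and src["activity"] >= threshold
    ]


# First frame from tracks.txt as posted in the thread.
frame = {
    "timeStamp": 145847,
    "src": [
        {"id": 1, "tag": "dynamic", "x": 1.000, "y": -0.008, "z": 0.007, "activity": 0.998},
        {"id": 411, "tag": "dynamic", "x": -0.996, "y": 0.084, "z": 0.007, "activity": 0.006},
        {"id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000},
        {"id": 0, "tag": "", "x": 0.000, "y": 0.000, "z": 0.000, "activity": 0.000},
    ],
}

active = active_sources(frame)
print([src["id"] for src in active])
```

For this frame only the source with id 1 passes the filter, which matches the reading given above that the source near x = 1 is active and the one near x = -1 is not.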

@chengl2
Author

chengl2 commented Mar 28, 2018

I don't quite understand the items below in bold:

#Microphone
{
mu = ( , , );
sigma2 = ( +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000, +0.000 );
direction = ( +0.000, +0.000, +1.000 );
angle = ( 180.0, 180.0 );
},

and

spatialfilter: {
direction = ( +0.000, +0.000, +1.000 );
angle = (80.0, 100.0);
};

Could you please explain these items with a picture for me?

@chengl2
Author

chengl2 commented Mar 28, 2018

It's a conference system. I want to detect the direction of a human voice, then control the PTZ of two cameras so they point toward the speaker. By PTZ I mean moving up and down, left and right.
See the picture below:
_003

Is ODAS well suited to this?

@GodCed
Contributor

GodCed commented Mar 28, 2018

Hi, I suggest you take a look at this discussion about the spatial filter. What it does is adjust the gain according to the angle of arrival. When specified in the mic configuration, the gain applies to that microphone. When specified in the spatial filter, it applies to the system output.

It is used to account for the directivity of the microphones and to limit the sound "search area" to a specified zone. With your setup, I would suggest you use

direction = ( +1.000, +0.000, +0.000 );
angle = (80.0, 100.0);

for both, as your microphones and your array are listening along the X-axis direction.
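
One way to picture a gain that depends on the angle of arrival is sketched below in Python. This is an assumption-laden illustration, not ODAS code: it assumes the two values in `angle` are a full-gain limit and a zero-gain limit in degrees, measured from `direction`, with a linear transition between them. The thread does not confirm that exact shape, and `directivity_gain` is a hypothetical name.

```python
import math


def directivity_gain(doa, direction, angle):
    """Gain in [0, 1] for a direction of arrival, under assumed semantics.

    doa, direction -- 3D vectors (need not be normalized)
    angle          -- (all_pass_deg, no_pass_deg); ASSUMED meaning: gain is 1
                      below the first angle, 0 above the second, and falls
                      linearly in between.
    """
    all_pass, no_pass = angle
    dot = sum(a * b for a, b in zip(doa, direction))
    norm = math.sqrt(sum(a * a for a in doa)) * math.sqrt(sum(b * b for b in direction))
    # Clamp to guard against rounding slightly outside [-1, 1].
    theta = math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
    if theta <= all_pass:
        return 1.0
    if theta >= no_pass:
        return 0.0
    return (no_pass - theta) / (no_pass - all_pass)


direction = (1.0, 0.0, 0.0)
angle = (80.0, 100.0)
for doa in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)]:
    print(doa, directivity_gain(doa, direction, angle))
```

Under these assumptions, sound arriving from in front of the array keeps full gain, sound from behind is rejected, and sound from the side falls in the transition band. With `angle = (180.0, 180.0)`, as in the microphone entries, every direction would get full gain, which matches the omnidirectional assumption stated earlier in the thread.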

@GodCed
Copy link
Contributor

GodCed commented Mar 28, 2018

Regarding your conference system, I have personally used ODAS to add an overlay to a video stream to show audio sources, and also to track an audio source with a PTZ camera.

With your setup, you should be able to use the tracking module to aim your camera by converting x, y and z, which represent a direction vector, to an azimuth and an elevation. If you have an idea of the distance between the speaker and the camera, you may want to account for the offset between your array's origin and your cameras' origin to improve your aiming precision.
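
A minimal Python sketch of that conversion follows. The axis convention is an assumption (azimuth measured in the x-y plane from +x toward +y, elevation measured up from the x-y plane toward +z), and both function names are hypothetical. The second helper illustrates the offset correction under the further assumption that the speaker distance and the camera position are known in the array's coordinate frame.

```python
import math


def direction_to_angles(x, y, z):
    """Convert a direction vector to (azimuth, elevation) in degrees.

    Assumed convention: azimuth is the angle in the x-y plane from +x
    toward +y; elevation is the angle above the x-y plane toward +z.
    """
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return azimuth, elevation


def aim_from_camera(direction, distance_m, camera_offset_m):
    """Angles from the camera to the source, given an estimated distance.

    direction       -- unit vector from the array origin (tracker output)
    distance_m      -- assumed distance from the array origin to the speaker
    camera_offset_m -- camera position in the array's frame (assumed known)
    """
    source = [distance_m * c for c in direction]
    relative = [s - o for s, o in zip(source, camera_offset_m)]
    return direction_to_angles(*relative)


# Tracked source from the log in this thread.
print(direction_to_angles(1.000, -0.008, 0.007))
# Same source, assumed 2 m away, camera assumed 0.5 m above the array origin.
print(aim_from_camera((1.000, -0.008, 0.007), 2.0, (0.0, 0.0, 0.5)))
```

The closer the speaker is to the array, the more the camera offset matters; for distant speakers the two results converge.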

@chengl2
Author

chengl2 commented Mar 29, 2018

Thank you very much.

If I want to check in real time whether ODAS gives me the right direction, what should I do?

@GodCed
Contributor

GodCed commented Mar 29, 2018

I suggest you have a look at ODAS Studio. It’s a desktop app built to display ODAS data in real time. You can see acoustic energy and tracked sources in real time, both in azimuth-elevation and in unit-sphere x-y-z format.

@chengl2
Author

chengl2 commented Mar 29, 2018

I tried ODAS Studio, and it's powerful.

Even when nobody is talking, it draws a lot of points near the x-axis:

_004

I tried adjusting the "energy range", and it helps, but people's voices also seem to be filtered out a lot.

Standing 2 meters away, I have to speak very loudly to be detected by ODAS. How can I fine-tune this?

Is there a way to filter out noise while staying sensitive to human voice?

@FrancoisGrondin
Member

This is strange. It seems like there is always a noise source in front of your setup, which I doubt is true in reality. Can you provide us with your config file, and maybe raw recordings from the mic array?

@GodCed
Contributor

GodCed commented Mar 29, 2018

I would also suggest you stop ODAS, do a recording in Audacity and retry. Sometimes a weird glitch happens when opening the sound card through ODAS and the card output is corrupted. Opening it in another app seems to solve the issue.

@chengl2
Author

chengl2 commented Mar 30, 2018

Here are my cfg file and the audio data recorded with Audacity.

chengl.cfg.zip

audacity_project.zip

@chengl2
Author

chengl2 commented Apr 8, 2018

I did a recording in Audacity and retried, and I also tried the configuration below. The noise source is still there.

_007

chengl.cfg.zip

@FrancoisGrondin
Member

Thank you for this feedback. I have been quite busy lately, but I'll try to run your data in the coming days! I'll keep you updated!

@chengl2
Author

chengl2 commented Apr 12, 2018

Hello,
I found that the DC offsets of my 8 channels of audio data are all below 0, and the offset values differ from one channel to another.

dcoffset

How can I adjust the DC offset to 0 in ODAS?
Could this DC offset be a cause of my SSL problem?
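
For readers unfamiliar with the term, the DC offset of a channel is simply the mean of its samples. The Python sketch below estimates and removes it per channel. This is a generic illustration, not an ODAS feature and not necessarily the method used for the screenshot above: the function names and the tiny sample values are made up for the example.

```python
from statistics import fmean


def dc_offsets(channels):
    """Return the mean sample value (DC offset) of each channel."""
    return [fmean(samples) for samples in channels]


def remove_dc(channels):
    """Subtract each channel's own mean so every channel is centred on 0."""
    offsets = dc_offsets(channels)
    return [
        [sample - offset for sample in samples]
        for samples, offset in zip(channels, offsets)
    ]


# Made-up samples for two channels, both biased below zero.
channels = [
    [-0.10, -0.30, -0.20, -0.20],
    [0.00, -0.20, -0.10, -0.10],
]

print(dc_offsets(channels))
centred = remove_dc(channels)
print(dc_offsets(centred))
```

Subtracting a single mean over a whole recording only works offline; for live audio, a high-pass filter applied before the signal reaches the localizer is the usual alternative.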

@huotuichang1

What tool did you use to calculate these values?

@hritiksth764

Hello, I found that the DC offsets of my 8 channels of audio data are all below 0, and the offset values differ from one channel to another.

dcoffset

How can I adjust the DC offset to 0 in ODAS? Could this DC offset be a cause of my SSL problem?

How are you getting these values?


5 participants