Question about Table 1 #32

Closed
Shoawen0213 opened this issue Apr 6, 2022 · 4 comments
Labels
question Further information is requested

Comments

@Shoawen0213

Hello! Thanks for sharing this, it's a masterpiece!
I'm just confused about what "oracle segmentation" means in Table 1.
I searched for it but couldn't find an answer.
Thanks!

@juanmc2005 added the "question" label on Apr 7, 2022
@juanmc2005
Owner

Hello @Shoawen0213 and thank you :)

The "oracle segmentation" system replaces the segmentation model output with the actual diarization ground truth for the chunk. Under a scenario with perfect speaker re-identification and new-speaker detection, the DER should be zero in this case. Of course this is not what happens because part of the errors come from the speaker embedding model and from incremental clustering. Also, notice that false alarm and missed detection are not zero because speaker tracking errors propagate to the final segmentation during output window aggregation (see Figure 4).
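As a rough illustration of what "replacing the segmentation output with the ground truth for the chunk" could look like, here is a minimal sketch. It is not part of this repository: the function name `oracle_segmentation`, the `(start, end, speaker)` tuple format and the frame layout are all assumptions made for the example.

```python
import numpy as np

def oracle_segmentation(reference, chunk_start, chunk_duration, num_frames, speakers):
    """Build a (num_frames, num_speakers) binary activity matrix from ground truth.

    `reference` is a list of (start, end, speaker) tuples in seconds.
    This mimics the shape of a frame-wise segmentation output, but the
    interface is hypothetical and not the library's API.
    """
    frame_duration = chunk_duration / num_frames
    # Center time of each frame inside the chunk
    frame_times = chunk_start + (np.arange(num_frames) + 0.5) * frame_duration
    activity = np.zeros((num_frames, len(speakers)), dtype=np.float32)
    for start, end, speaker in reference:
        column = speakers.index(speaker)
        # Mark every frame whose center falls inside the reference segment
        active = (frame_times >= start) & (frame_times < end)
        activity[active, column] = 1.0
    return activity

# Toy ground truth: speaker A talks in [0, 2), speaker B in [1.5, 4)
reference = [(0.0, 2.0, "A"), (1.5, 4.0, "B")]
chunk = oracle_segmentation(
    reference, chunk_start=0.0, chunk_duration=5.0, num_frames=10, speakers=["A", "B"]
)
print(chunk.T)
```

The resulting matrix would then be fed to speaker tracking in place of the segmentation model's predictions, so any remaining errors come from the later stages.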

Let me know if this answers your question.

@juanmc2005 changed the title from "question of Table.1" to "Question about Table 1" on Apr 7, 2022
@Shoawen0213
Author

Hello @juanmc2005! Thanks for your precise answer!
I still have some questions, sorry...
First, why did you run this test (with oracle segmentation)? Does it have a purpose, such as being more realistic or improving performance?
Second, why is it called "oracle"? I'm curious because I've seen this in many papers. Is it a proper noun, or does it have a specific meaning?
Finally, have you released the version with oracle segmentation?

Again, thanks for your answer! It has helped a lot, since I'm a rookie in this domain without a mentor to ask...
Looking forward to your reply.

@juanmc2005
Owner

1) Why do the experiment? It's an ablation study to understand how speaker tracking behaves when the segmentation is perfect (which is never the case). This lets us see which errors, and how many, come from speaker embedding, clustering and aggregation.

2) About the name. In general, "oracle" experiments mean that we assume we know the actual value of something that we don't know in reality. This is not a usable system because that information is only available in controlled scenarios.

3) Do we release the oracle version? No. This implementation is mostly for real use cases and having the possibility of injecting the ground truth is prone to errors (you can leak ground-truth information without knowing). This is why we chose not to include it. However it shouldn't be a very hard feature to add. If you want to try that I would suggest subclassing FrameWiseModel.
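To make point 1 more concrete, here is a minimal sketch of how a diarization error rate can be split into false alarm, missed detection and speaker confusion at the frame level. It is only an illustration under simplifying assumptions (speakers already mapped so that columns line up, no collar, equal-length frames), not the evaluation code used for Table 1. The function name `der_components` is made up for the example.

```python
import numpy as np

def der_components(reference, hypothesis):
    """Frame-level DER breakdown for (num_frames, num_speakers) binary matrices.

    Assumes reference and hypothesis columns refer to the same speakers.
    """
    ref = np.asarray(reference, dtype=bool)
    hyp = np.asarray(hypothesis, dtype=bool)
    n_ref = ref.sum(axis=1)               # active reference speakers per frame
    n_hyp = hyp.sum(axis=1)               # active hypothesis speakers per frame
    n_correct = (ref & hyp).sum(axis=1)   # speakers active in both
    total = float(n_ref.sum())
    if total == 0:
        raise ValueError("reference contains no speech")
    missed = float(np.maximum(0, n_ref - n_hyp).sum())
    false_alarm = float(np.maximum(0, n_hyp - n_ref).sum())
    confusion = float((np.minimum(n_ref, n_hyp) - n_correct).sum())
    return {
        "false_alarm": false_alarm / total,
        "missed_detection": missed / total,
        "confusion": confusion / total,
        "der": (false_alarm + missed + confusion) / total,
    }

# Toy example with 4 frames and 2 speakers
ref = [[1, 0], [1, 0], [0, 1], [0, 0]]
hyp = [[1, 0], [0, 1], [0, 0], [1, 0]]
result = der_components(ref, hyp)
print(result)
```

With an oracle segmentation as input, any non-zero component in such a breakdown can be attributed to the embedding, clustering or aggregation stages rather than to the segmentation model.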

@Shoawen0213
Author

Hi @juanmc2005, thanks for your answer!
It's very clear and useful!
