Example_5.1 - Predicting an entire song & Data generator details #10

Closed
jokandre opened this issue Jan 7, 2018 · 1 comment

jokandre commented Jan 7, 2018

While trying to visualize the predictions for this example, I am having difficulty understanding the data_gen() and y_sample_to_frame() functions.

  1. From what I have read, a data generator in image processing serves to provide the data in smaller chunks. Is there a recommended way to predict an entire song?

1.1. On test data consisting of one song:
n_hop = 256
nsp_y = 5637632
I end up receiving 20 chunks of length 22022, which does not cover the entire song.
Shouldn't I need 256 (5637632 // 256) of those?

1.2. Using predict_generator returns only 22022 predictions, which leads me back to question 1.

  2. On y_sample_to_frame():
n_hop = N_HOP
nsp_y = len(y)
ret = np.array([np.round(np.mean(y[max(0, (i - 1) * n_hop): min(nsp_y, (i + 1) * n_hop)]))
                for i in range(nsp_y // n_hop)], dtype=int)

Could you provide some comments on line 3?
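
For reference, here is a minimal sketch of what that expression appears to compute, unrolled into a loop. It assumes `y` holds per-sample 0/1 labels, and it uses a tiny hypothetical hop size (`N_HOP = 4`) and a made-up label vector purely for illustration; it is a reading of the snippet above, not confirmed behaviour of the original example.

```python
import numpy as np

N_HOP = 4  # hypothetical small hop size, chosen only to keep the example readable


def y_sample_to_frame(y, n_hop=N_HOP):
    """Reduce per-sample labels to one label per hop (frame)."""
    nsp_y = len(y)
    frames = []
    for i in range(nsp_y // n_hop):
        # Window of up to 2 * n_hop samples centred on sample i * n_hop,
        # clipped so it never runs past either end of y.
        start = max(0, (i - 1) * n_hop)
        stop = min(nsp_y, (i + 1) * n_hop)
        # For 0/1 labels, rounding the mean acts like a majority vote
        # over the window.
        frames.append(np.round(np.mean(y[start:stop])))
    return np.array(frames, dtype=int)


# Made-up labels: 6 samples of class 0 followed by 10 samples of class 1.
y = np.array([0] * 6 + [1] * 10)
frame_labels = y_sample_to_frame(y)
print(frame_labels)  # one label per hop: len(y) // N_HOP entries
```

With n_hop = 256 and nsp_y = 5637632, the same loop would yield 5637632 // 256 = 22022 frame labels, which matches the chunk length reported above.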

I am trying to modify your example to see how it performs on the SALAMI dataset, but understanding these two functions seems fundamental. I have found relatively little information about data pre-processing for music structure analysis.
Sorry if my questions are not clearly formulated; any extra information or sources would be helpful.
Thanks in advance.

@keunwoochoi
Owner

  1. There are many heuristics (e.g. averaging the predictions, majority voting). It's hard to pick one, though.
    1.1 Where exactly is this happening? The original code doesn't include lines for testing on other tracks, so it probably depends on how you implement it.
  2. I think (!) it's to generate labels whose rate is aligned with the prediction rate.
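
As an illustration of the averaging heuristic from point 1, the sketch below stitches frame-wise predictions from possibly overlapping chunks into one song-length array. It is only an assumption about how one might implement this: the function name `stitch_predictions`, the `starts` offsets, and the 0.5 threshold are hypothetical and not part of the original example.

```python
import numpy as np


def stitch_predictions(chunk_preds, starts, n_frames):
    """Average frame-wise predictions from (possibly overlapping) chunks.

    chunk_preds: list of 1-D sequences, one per chunk
    starts:      frame index at which each chunk begins (hypothetical bookkeeping)
    n_frames:    total number of frames in the song
    """
    total = np.zeros(n_frames)
    count = np.zeros(n_frames)
    for pred, start in zip(chunk_preds, starts):
        pred = np.asarray(pred, dtype=float)
        stop = min(n_frames, start + len(pred))  # clip chunks that overrun the song
        total[start:stop] += pred[: stop - start]
        count[start:stop] += 1
    # Frames not covered by any chunk stay NaN instead of silently becoming 0.
    avg = np.full(n_frames, np.nan)
    covered = count > 0
    avg[covered] = total[covered] / count[covered]
    return avg


# Two made-up chunks of three frames each, overlapping on frame 2.
chunk_preds = [[0.2, 0.4, 0.6], [0.8, 1.0, 0.6]]
starts = [0, 2]
song_pred = stitch_predictions(chunk_preds, starts, n_frames=5)
song_label = (song_pred > 0.5).astype(int)  # threshold is an arbitrary choice
```

Majority voting would follow the same pattern, accumulating thresholded 0/1 votes per frame instead of raw prediction values.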
