I want to construct a dataset(.npz) with my midi data. #108

Closed
bokyungJ opened this issue Nov 8, 2020 · 4 comments

@bokyungJ

bokyungJ commented Nov 8, 2020

Hi! Thanks for sharing the MuseGAN source code!

How can I build a dataset from my own MIDI data?
If you don't mind, could you share the source code or a reference (documentation) showing how you built 'train_x_lpd_5_phr.npz' from MIDI data?
I have already looked at 'https://github.com/wayne391/symbolic-musical-datasets/tree/master/5-track-pianoroll' and 'https://salu133445.github.io/pypianoroll/',
but I'm still not sure how to do it.

Thanks :)

@salu133445
Owner

Hi, train_x_lpd_5_phr.npz is simply a sparse-encoded version of train_x_lpd_5_phr.npy, which is too large to put on the web. You will end up with a .npy file after running the code in this repository. You can then store the data in a shared array by running python src/process_data.py train_x_lpd_5_phr.npy, or use the .npy file directly by setting data_source='npy' and data_filename='train_x_lpd_5_phr.npy' in config.yaml.

This paragraph in the README might also help.

As pianoroll matrices are generally sparse, we store only the indices of nonzero elements and the array shape in an npz file to save space, and later restore the original array. To save some training data in this format, simply run np.savez_compressed("data.npz", shape=data.shape, nonzero=data.nonzero())
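
The round trip described in that README paragraph can be sketched as follows. This is a minimal illustration, not the repository's own loader: the helper names save_sparse and load_sparse are made up here, and an in-memory buffer stands in for "data.npz" so the example does not touch the disk.

```python
import io

import numpy as np


def save_sparse(file, data):
    """Store only the array shape and the indices of the nonzero entries."""
    np.savez_compressed(file, shape=data.shape, nonzero=data.nonzero())


def load_sparse(file):
    """Rebuild the dense boolean array from the stored shape and indices."""
    with np.load(file) as f:
        data = np.zeros(tuple(f["shape"]), dtype=bool)
        # f["nonzero"] has one row of indices per array dimension.
        data[tuple(f["nonzero"])] = True
    return data


# Small random boolean array standing in for real piano-roll data.
rng = np.random.default_rng(0)
original = rng.random((2, 4, 48, 84, 5)) > 0.95

buffer = io.BytesIO()  # a filename such as "data.npz" works the same way
save_sparse(buffer, original)
buffer.seek(0)
restored = load_sparse(buffer)
```

Note that this encoding keeps only where the nonzero entries are, so the restored array is boolean even if the original held velocities.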

@bokyungJ
Author

Thank you! @salu133445

That repository converts .npz files to a .npy file,
but I don't know how to convert a MIDI file to a .npz file.

I tried pypianoroll.read() and pypianoroll.save(),
but I got an error: "TypeError: Object of type int32 is not JSON serializable"

Is there a way to convert a MIDI file to a .npz file?
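
For context, this TypeError is what Python's json module raises whenever it is handed a NumPy integer, because np.int32 is not a subclass of the built-in int. The sketch below reproduces the situation in isolation and shows one common workaround. It says nothing about where the value comes from inside Pypianoroll; the dictionary keys and the to_builtin helper are hypothetical.

```python
import json

import numpy as np


def to_builtin(obj):
    """Fallback for json.dumps: turn NumPy scalars and arrays into plain Python objects."""
    if isinstance(obj, np.generic):
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Hypothetical metadata holding NumPy values, as a stand-in for whatever is being saved.
info = {"resolution": np.int32(24), "downbeats": np.array([0, 96])}

# json.dumps(info) alone would raise the TypeError quoted above.
text = json.dumps(info, default=to_builtin)
```

If the offending value is reachable from user code, casting it with int(...) before saving has the same effect.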

@bokyungJ
Author

@salu133445
The data type of the <.npz> data is float16, while the data type of the <.npy> data is boolean.
In this case, is it OK to just set data_source='npy' and data_filename = train_x_lpd_5_phr.npy in config.yaml?

<.npz>
array([[69., 62., 54., 50.],
[69., 62., 54., 50.],
[69., 62., 54., 50.],
...,
[62., 62., 54., 50.],
[62., 62., 54., 50.],
[62., 62., 54., 50.]], dtype=float16)

<.npy> (derived from that repository)
[[False False False False False]
[False False False False False]
[False False False False False]
...
[False False False False False]
[False False False False False]
[False False False False False]]

@salu133445
Owner

Hi, both .npy and .npz files used in MuseGAN are raw piano rolls, in different formats. (The .npz file here is different from the .npz file in Pypianoroll.)

And yes, you are right. The output is a .npy file, so you can simply set data_source='npy' and data_filename = 'train_x_lpd_5_phr.npy' in config.yaml.
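
For anyone starting from their own MIDI files, here is a rough NumPy-only sketch of turning a stacked piano roll into a boolean phrase array. It is not the preprocessing code used for train_x_lpd_5_phr. It assumes the input has shape (n_tracks, n_timesteps, 128), for example as obtained from a MIDI file with Pypianoroll, and that the target layout is (n_phrases, 4 bars, 48 time steps per bar, 84 pitches, 5 tracks) with the pitch range starting at MIDI note 24. The function name to_phrases and all default values are assumptions to check against your own config.yaml.

```python
import numpy as np


def to_phrases(pianoroll, n_bars=4, steps_per_bar=48, lowest_pitch=24, n_pitches=84):
    """Cut a (n_tracks, n_timesteps, 128) piano roll into boolean phrases.

    Returns an array of shape (n_phrases, n_bars, steps_per_bar, n_pitches, n_tracks).
    The default values are assumptions, not values confirmed in this thread.
    """
    n_tracks, n_steps, _ = pianoroll.shape
    steps_per_phrase = n_bars * steps_per_bar
    n_phrases = n_steps // steps_per_phrase
    # Drop the trailing partial phrase, keep the chosen pitch range, binarize.
    used = pianoroll[:, : n_phrases * steps_per_phrase,
                     lowest_pitch : lowest_pitch + n_pitches]
    binary = used > 0
    # Split the time axis: (tracks, phrases, bars, steps, pitches).
    binary = binary.reshape(n_tracks, n_phrases, n_bars, steps_per_bar, n_pitches)
    # Move tracks to the last axis: (phrases, bars, steps, pitches, tracks).
    return binary.transpose(1, 2, 3, 4, 0)


# Toy input: 5 tracks, 400 time steps, one held note (pitch 60) on track 0.
stacked = np.zeros((5, 400, 128), dtype=np.uint8)
stacked[0, 0:24, 60] = 100
phrases = to_phrases(stacked)
```

The result can then be written with np.save("my_phrases.npy", phrases) (a hypothetical filename) and referenced through data_filename as described above.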
