
Training Model from Scratch #17

Closed
gstranger opened this issue Oct 26, 2020 · 5 comments
gstranger commented Oct 26, 2020

Hello, thank you for making your source code available for this project. Is it possible to train the model from scratch on our own MIDI dataset, without using one of the shared checkpoints? I've tried the finetune.py script you've included, but the resulting model is still too biased toward the initial training set for what I'm trying to do. Thanks again for sharing your work in this space.

remyhuang (Contributor) commented

See #99 in model.py, which is the line that restores the pre-trained checkpoint. You can remove it if you want to train from scratch.
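A minimal sketch of that change, assuming the restore call uses a `tf.train.Saver` on the model's `self.sess` session as in this repo's TF 1.x code (exact names may differ in your copy). Note that removing the restore alone is not enough, since the variables then start uninitialized:

```python
# model.py (sketch): the original line restores the shared checkpoint:
#   self.saver.restore(self.sess, checkpoint_path)
#
# To train from scratch, replace it with explicit variable initialization,
# otherwise the first sess.run() hits uninitialized weights (see the
# FailedPreconditionError reported later in this thread):
self.sess.run(tf.global_variables_initializer())
```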


gstranger commented Nov 2, 2020

Thank you for your help. In your dictionary.pkl file, are there any vocabulary semantics around MIDI events and bars? For example, if I add more events to the dictionary based on my training data, do I need to keep the current indices of existing events in mind, or can I just add them to the end of the dictionary as new keys? Also, do I need to adjust the data preparation in any way if my training pieces are each only around 16 bars? I'm assuming I need to adjust the section that creates the segments based on group size. Apologies for the long-winded comment, but I'm trying to cover all the issues I've run into while training from scratch.

remyhuang (Contributor) commented

  1. Python dictionaries are unordered mappings, so you can simply add your new tokens to the dictionaries as new items; the existing indices don't need special handling (see the sketch after this list). If you add to or revise the dictionary, you need to revise the data-preparation code and re-train the model yourself.

  2. "group_size" is a parameter specific to the Transformer-XL data preparation. model.py#18 defines the segment length for the auto-regressive training data.

gstranger (Author) commented

So I've revised the code in model.py above to not load the pre-trained checkpoint, and have verified that model.prepare_data() is generating an actual ndarray. However, when I run model.finetune I get the error below.

```
FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition: Attempting to use uninitialized value transformer/normal_softmax/bias
	 [[{{node transformer/normal_softmax/bias/read}}]]
	 [[transformer/StopGradient_8/_1103]]
  (1) Failed precondition: Attempting to use uninitialized value transformer/normal_softmax/bias
	 [[{{node transformer/normal_softmax/bias/read}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

FailedPreconditionError                   Traceback (most recent call last)
in <module>()
      1 model.finetune(
      2     training_data=training_data,
----> 3     output_checkpoint_folder=output_checkpoint_folder)

/content/remi/model.py in finetune(self, training_data, output_checkpoint_folder)
    278                 feed_dict[m] = m_np
    279             # run
--> 280             _, gs_, loss_, new_mem_ = self.sess.run([self.train_op, self.global_step, self.avg_loss, self.new_mem], feed_dict=feed_dict)
    281             batch_m = new_mem_
    282             total_loss.append(loss_)
```

Is there some additional initialization I need to run manually when training from scratch?
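FailedPreconditionError is what TF 1.x raises when a variable is read before anything assigns it a value; removing the restore line deletes the only assignment. A minimal sketch of the fix from the calling side, assuming the model object exposes its session as model.sess (as the traceback's self.sess suggests):

```python
import tensorflow as tf  # TF 1.x

# Initialize every graph variable before the first training step; this
# replaces the assignment that saver.restore() used to perform.
model.sess.run(tf.global_variables_initializer())

# Training can now proceed.
model.finetune(
    training_data=training_data,
    output_checkpoint_folder=output_checkpoint_folder)
```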

a1004123217 commented

Can you actually train from scratch after removing the restore line? I removed this line, but I can't train; it fails with: `self._traceback = tf_stack.extract_stack_for_node(self._c_op)`
