questions about log_prob implementation? #1

dragen1860 · 2019-08-28T23:36:58Z

Hi, thanks for your PG implementation.
I find this piece of code difficult to understand:

	#train model
	def log_prob(self, policy_param, acs):
		if self.is_discrete:
			logits = policy_param
			log_prob = tf.keras.losses.sparse_categorical_crossentropy(\
				y_true = acs, y_pred = logits, from_logits = True)

I expected the log_prob function to simply return tf.math.log(policy_param), so I do not understand why a cross-entropy loss is computed here. Could you explain? Thank you.
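For anyone reading along, here is a minimal sketch of the identity that seems relevant. With `from_logits = True`, sparse categorical cross-entropy against an integer label is `-log(softmax(logits)[label])`, which is the negative log-probability of the chosen action. The logits themselves are unnormalized scores, so taking `log` of them directly would not give a log-probability. The sketch uses plain Python rather than TensorFlow so it runs anywhere. The helper names `log_softmax` and `sparse_cross_entropy_from_logits` and the example numbers are made up for illustration and are not taken from the repository.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax: subtract the max before exponentiating.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def sparse_cross_entropy_from_logits(action, logits):
    # Hypothetical stand-in for the cross-entropy of an integer label
    # against logits: -sum_k onehot[k] * log p[k], which is -log p[action].
    log_p = log_softmax(logits)
    one_hot = [1.0 if k == action else 0.0 for k in range(len(logits))]
    return -sum(t * lp for t, lp in zip(one_hot, log_p))

# Example values chosen only for illustration.
logits = [2.0, 0.5, -1.0]   # unnormalized policy outputs for 3 actions
action = 1                  # index of the action that was taken

ce = sparse_cross_entropy_from_logits(action, logits)
log_prob = log_softmax(logits)[action]

# The two printed numbers agree: cross-entropy equals -log pi(action).
print(ce, -log_prob)
```

Under this reading, the quoted call returns the negative log-probability of `acs`. The quoted snippet ends before any return statement, so whether the sign is flipped afterwards cannot be told from it.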
