
About calculating transfer entropy details #3

Open
HongyuZhu-s opened this issue Jun 25, 2024 · 7 comments

Comments

@HongyuZhu-s

Hello! Why doesn't the code include the log(2π) part in the formula for calculating entropy in 'model_shrink.py'? This is a bit different from equation (5) in the paper.

@yuwentao88

The author has already answered this question in the paper: "𝐻(𝐹) is proportional to log(𝜎) plus two additional constants. Without loss of generality, the two constants are neglected in the following analysis."
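
For concreteness, assuming Eq. (5) is the differential entropy of a Gaussian (a standard identity; the paper's exact notation may differ), the decomposition and the cancellation look like this:

```latex
% Differential entropy of a Gaussian with standard deviation \sigma:
H(F) = \tfrac{1}{2}\log\!\left(2\pi e\,\sigma^{2}\right)
     = \log(\sigma) + \tfrac{1}{2}\log(2\pi) + \tfrac{1}{2}

% Subtracting two such entropies (as in Eq. (7)) cancels the constants,
% which is why log(2\pi) never needs to appear in the code:
H(F_{1}) - H(F_{2}) = \log(\sigma_{1}) - \log(\sigma_{2})
```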

@yuwentao88

Did you find the calculation code for the transfer entropy?

@sihaoevery
Owner

Hi,

Sorry for the confusion about the calculation of transfer entropy. The code in 'model_shrink.py' computes the entropy. The transfer entropy is then obtained via Eq. (7), where the constant terms cancel in the subtraction. You may refer to the discussion in #1.

Best,

Sihao

@HongyuZhu-s
Author

> The author has already answered this question in the paper: "𝐻(𝐹) is proportional to log(𝜎) plus two additional constants. Without loss of generality, the two constants are neglected in the following analysis."

Thank you for your reply! I had overlooked that passage.

@sihaoevery
Owner

Further to my previous answer, here are a few additional details.

  1. You may obtain the transfer entropy by subtraction:
     (entropy of the complete model − entropy given that some attention layers are skipped)

  2. The entropy results are appended to an array (e.g. self.trans_entropy_head). Once the array holds enough entries, the result is returned. In our implementation, we randomly sample 50,000 images from the training set. Since the batch size is 384 in inference mode, the result is printed at iteration 50,000/384 ≈ 130.

A minimal sketch of both points is given below.
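
For illustration only, here is a self-contained sketch of both points. It is not the repository's code: the `entropy` function simply assumes H(F) ∝ log(σ) as quoted earlier, the feature tensors are random stand-ins, and names such as `entropy`, `features_full`, and `features_skipped` are hypothetical.

```python
import torch

def entropy(features: torch.Tensor) -> torch.Tensor:
    """Entropy up to additive constants: H(F) ∝ log(σ).
    The log(2π)-style constants of Eq. (5) are dropped here because
    they cancel in the subtraction of Eq. (7)."""
    sigma = features.float().std(dim=0).clamp_min(1e-8)
    return sigma.log().mean()

NUM_SAMPLES, BATCH_SIZE = 50_000, 384
NUM_ITERS = NUM_SAMPLES // BATCH_SIZE  # 50,000 / 384 ≈ 130 iterations

trans_entropy_head = []  # mirrors the accumulation array described above
for step in range(NUM_ITERS):
    # Stand-ins for the features of the complete model and of the model
    # with some attention layers skipped (random tensors, illustration only).
    features_full = torch.randn(BATCH_SIZE, 768)
    features_skipped = 0.9 * torch.randn(BATCH_SIZE, 768)

    # Point 1: transfer entropy by subtraction; the constants cancel here.
    trans_entropy_head.append(
        (entropy(features_full) - entropy(features_skipped)).item()
    )

# Point 2: once the population is adequate (≈130 batches), report the result.
print(sum(trans_entropy_head) / len(trans_entropy_head))
```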

@HongyuZhu-s
Author

> Further to my previous answer, here are a few additional details.
>
> 1. You may obtain the transfer entropy by subtraction:
>    (entropy of the complete model − entropy given that some attention layers are skipped)
> 2. The entropy results are appended to an array (e.g. self.trans_entropy_head). Once the array holds enough entries, the result is returned. In our implementation, we randomly sample 50,000 images from the training set. Since the batch size is 384 in inference mode, the result is printed at iteration 50,000/384 ≈ 130.

Thank you for your careful answer. I also have a question: does the transfer entropy value, or the rank of a given layer, change as the number of training epochs increases?

@HongyuZhu-s
Author

> Did you find the calculation code for the transfer entropy?

Hello, may I ask: since calculating the transfer entropy requires comparing an entropy difference, does this mean the model has to be trained twice?
