
Question: How to reduce the memory in this project #7

Closed
yangyaofei opened this issue Jul 12, 2019 · 7 comments

Comments

@yangyaofei

Hi, I read your paper and it's great. I'm very interested in how the memory is actually reduced in the code.

I guess the memory-related part is here:

key_pe = key_pe[:, :, trim_len:]

But as far as I can see, you only cut key_pe, which saves only a little memory and, I think, doesn't help with the Q·K computation itself.

So, can you explain how the memory is reduced in the code?

thanks

@connection-ai

I think that unskewing the attention probs matrix (modeling.py line 73) can reduce the memory a lot.
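Roughly, the idea as I understand it (a rough sketch, not the exact code in modeling.py; the _unskew name and the shapes are my own): each of the M queries only needs the L keys ending at its own position, so the B x M x (M+L) score matrix can be "unskewed" into a B x M x L one, and that smaller matrix is what gets stored and softmaxed.

```python
import torch.nn.functional as F

def _unskew(X):
    # X: B x M x (M+L) scores of the M queries against the L cached keys
    # plus the M current keys. Keep, for query i, only the L entries ending
    # at its own position: out[b, i, j] = X[b, i, i + j].
    B, M, ML = X.size()
    L = ML - M
    X = X.reshape(B, -1)         # flatten the rows: B x M*(M+L)
    X = F.pad(X, (0, M))         # each batch row now has M*(M+L+1) elements
    X = X.view(B, M, M + L + 1)  # row i starts at offset i*(M+L+1), i.e. shifted by i
    return X[:, :, :L]           # B x M x L
```

So the attention probs are M x L per head instead of M x (M+L), which is where the saving would come from.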

@yangyaofei
Author

yangyaofei commented Jul 16, 2019 via email

@tesatory
Contributor

The key is being cut too, here:

key = F.pad(key, [0, 0, -trim_len_cache, 0])

@yangyaofei
Author

@tesatory Sorry, I don't get it. That code pads the key with zeros, it doesn't reduce the memory.

@tesatory
Contributor

Sorry, wrong line. It is cut here:

key = key[:, trim_len_cache:, :]
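Putting those lines together, the trimming looks roughly like this (a simplified sketch; names like current_span and max_span are placeholders, not exactly what is in the code):

```python
import torch.nn.functional as F

def trim_memory(query, key, value, key_pe, current_span, max_span):
    # query: B x M x H (current block); key, value: B x (L+M) x H (cache + current);
    # key_pe: relative position embeddings over the span.
    # Anything further back than the largest span currently in use is dropped
    # before the attention is computed.
    trim_len = max_span - current_span          # unused part of the maximum span
    cache_size = key.size(1) - query.size(1)    # L, the number of cached tokens
    trim_len_cache = trim_len - (max_span - cache_size)
    if trim_len_cache > 0:
        # the cache is longer than the span needs: drop the oldest entries
        key = key[:, trim_len_cache:, :]
        value = value[:, trim_len_cache:, :]
    elif trim_len_cache < 0:
        # the cache is shorter than expected: pad it back on the left
        key = F.pad(key, [0, 0, -trim_len_cache, 0])
        value = F.pad(value, [0, 0, -trim_len_cache, 0])
    if trim_len > 0:
        # position embeddings for the dropped positions are not needed either
        key_pe = key_pe[:, :, trim_len:]
    return key, value, key_pe
```

So the key/value cache, and with it the attention matrix built on top of it, shrinks whenever the learned spans are smaller than the maximum span.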

@yangyaofei
Author

@tesatory Thank you for the reply; I'm still a bit confused.
Is it something like: the 0:trim_len_cache part was already computed, and this time it just reuses the cache that was computed previously?

What I want to know is:

In your paper, you say you use a sub-network to help the attention pick a range over which to compute the attention. That should reduce memory, because the unnecessary part can be cut off. But I can't find that part in the code, because the range is different for every element of the attention: for example, for the 5th element with a span of 30, the range would be something like 0 to 35. I have a similar idea myself, but I can't find a proper way to do it (a mask is the best I can come up with, see the sketch below).
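As I read the paper, the span z is chosen through a soft mask m_z(x) = clamp((R + z - x) / R, 0, 1) over the distance x to each key, so the closest I can get is something like this (just my own sketch with made-up names):

```python
import torch

def span_mask(span, max_span, ramp=32):
    # span: the learned span z (per head, or per position in the dynamic
    # variant); ramp: the softness parameter R from the paper.
    # distance x runs from max_span-1 (oldest key) down to 0 (current token).
    distance = torch.arange(max_span - 1, -1, -1, dtype=torch.float32)
    return torch.clamp((ramp + span - distance) / ramp, min=0.0, max=1.0)
```

But multiplying the attention probs by such a mask only zeroes things out; the memory only goes down if the keys outside the largest span are actually trimmed, like in the key = key[:, trim_len_cache:, :] line you pointed to.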

I read your paper and I think your project can help.
Thank you

@yangyaofei
Author

@tesatory Oh, sorry, I realize your paper is not what I thought. It reduces the memory in generation. Thanks
