Hello, thank you for your excellent research work.
The results of this repo sound very impressive. The only problem is that it consumes a large amount of GPU memory: generating a text of just 200 characters already takes up 11 GB of VRAM, which makes generation very slow. Is there any way to reduce the memory usage?
Looking forward to your reply. ^^