Check out this Reddit post: https://www.reddit.com/r/StableDiffusion/comments/xmr3ic/speed_up_stable_diffusion_by_50_using_flash/ It claims a 50% speedup for Stable Diffusion by using Flash Attention. Would it be possible to implement this in this repo?