Question about the model design. #8

Yangr116 · 2024-07-04T06:25:56Z

Hi, this is great work!
I would like to know why you use bidirectional Mamba. Did unidirectional Mamba cause any problems in your experiments?

feizc · 2024-07-15T03:16:26Z

Hi. It is generally believed that unidirectional Mamba does not perform as well as bidirectional Mamba. In addition, different scan strategies can further improve generation performance; see the discussion in the ZigMa [1] and DiM [2] papers. For simplicity, we used bidirectional Mamba here.
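To make the difference concrete, here is a minimal sketch of unidirectional versus bidirectional scanning. It is not the code from this repository and not Mamba's actual selective scan: `causal_scan` is a toy linear recurrence with a fixed `decay`, and summing the forward and backward passes in `bidirectional_scan` is an assumed fusion rule chosen only for illustration. All function names are hypothetical.

```python
import numpy as np


def causal_scan(x, decay=0.9):
    """Toy stand-in for a unidirectional scan: h_t = decay * h_{t-1} + x_t.

    x has shape (sequence_length, channels). Output at position t depends
    only on inputs at positions <= t.
    """
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t in range(x.shape[0]):
        h = decay * h + x[t]
        out[t] = h
    return out


def bidirectional_scan(x, decay=0.9):
    """Run the scan in both directions and sum the results (assumed fusion)."""
    forward = causal_scan(x, decay)
    # Reverse the sequence, scan, then restore the original order.
    backward = causal_scan(x[::-1], decay)[::-1]
    return forward + backward


def first_token_sees_last(scan):
    """Return True if changing the last token alters the first output."""
    x = np.random.default_rng(0).normal(size=(6, 4))
    perturbed = x.copy()
    perturbed[-1] += 1.0
    return not np.allclose(scan(x)[0], scan(perturbed)[0])


if __name__ == "__main__":
    print("unidirectional:", first_token_sees_last(causal_scan))
    print("bidirectional:", first_token_sees_last(bidirectional_scan))
```

In the unidirectional case the first token never receives information from later tokens, which is a limitation for image patches that have no natural left-to-right order. The backward pass removes that restriction at roughly twice the scan cost.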

However, it is worth noting that there has been an increasing focus on autoregressive approaches, such as LlamaGen [3] and Kaiming He's recent work [4].

[1] ZigMa: A DiT-style Zigzag Mamba Diffusion Model
[2] DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis
[3] Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
[4] Autoregressive Image Generation without Vector Quantization

Yangr116 · 2024-07-15T03:31:23Z

Thanks! I have noticed these papers. 😊

Yangr116 closed this as completed Jul 18, 2024
