Currently working on:

| Idea | Status | References/Papers |
| --- | --- | --- |
| FIFO CogVideoX | In progress. Blocked at the moment. | https://jjihwan.github.io/projects/FIFO-Diffusion |
| Comparison of different CFG-like methods (see the sketch below the table) | Done. Need to put a blog post together. | Smoothed Energy Guidance, guidance embedding, CFG, and CFG++ |
| Flux image inversion using RNRI | Dropped. My understanding of the prior for flow matching is too low, and RF-Inversion makes it redundant anyway. | https://barakmam.github.io/rnri.github.io/ |
| Invertible MMDiT transformer with Diff Attention | Planned. The idea is to condition the flow predicted by the transformer on both t_n and t_(n-1), allowing perfect inversion in theory. | https://arxiv.org/pdf/2406.08929, https://arxiv.org/pdf/2410.05258 |
| Diffusion speed-ups by caching the model prediction (see the sketch below the table) | Done. Need to put a blog post together. The idea is to reuse the model prediction across timesteps. | |
| CogVideoX attention scaling | Paused. Need to recheck at higher resolutions. | https://arxiv.org/abs/2306.08645 |
| RB-Modulation for Flux | Dropped. RF-Inversion makes this redundant. | https://rb-inversion.github.io/ |
| CogVideoX distillation using FineVideo and PeRFlow | Planned. Needs a compute grant. May be scrapped once BFL's video model is out. | https://arxiv.org/abs/2405.07510 |
| Underwater image colorization as an inverse problem | Planned. Needs a better understanding of inverse problems. | https://github.com/LituRout/PSLD |
| Flux generation steering using an SAE for CLIP | Planned. Need a better understanding of SAEs, and to apply them to T5 as well. | https://www.lesswrong.com/posts/Quqekpvx8BGMMcaem/interpreting-and-steering-features-in-images |
| LoRA (move to MoRA?) for the ControlNet layer (see the sketch below the table) | Planned. Compute the ΔW between Flux dev and its ControlNet layer, decompose it into a LoRA, and check the decomposition error. If the error is low enough, a LoRA should be enough. | ChatGPT conversation |
| MoRA fine-tuning of Flux | Planned. I have a hypothesis: MoRA might give better samples than LoRA for Flux. I'll try it out sometime next week, maybe. TL;DR: (1) full fine-tuning > LoRA for personalization; (2) full fine-tuning > MoRA > DoRA > LoRA; (3) MoRA should converge fast like LoRA but give quality/diversity closer to full fine-tuning. There should be no free lunch, though. | 1. MoRA: High-Rank PEFT Approach 2. Full fine-tuning of Flux 3. GitHub: MoRA |
| Transformer layers as painters for DiTs | Complete. Results published here. | https://arxiv.org/abs/2407.09298 |
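
For context on the CFG comparison row: every method in that list builds on plain classifier-free guidance. A minimal sketch of the baseline, assuming a generic noise-prediction interface `model(x_t, t, text_emb)` (the names are placeholders, not from any particular codebase):

```python
def cfg_noise_pred(model, x_t, t, cond_emb, uncond_emb, scale=7.0):
    # Classifier-free guidance: extrapolate from the unconditional
    # prediction toward the conditional one by `scale`.
    eps_uncond = model(x_t, t, uncond_emb)
    eps_cond = model(x_t, t, cond_emb)
    return eps_uncond + scale * (eps_cond - eps_uncond)
```

The variants differ in how and where this combination is applied; CFG++, for instance, uses a small guidance scale and reuses the unconditional prediction in the renoising step.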

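The caching row rests on the observation that the model output drifts slowly across adjacent timesteps, so some forward passes can be skipped and the previous prediction reused. A minimal sketch written against a diffusers-style scheduler (the function name and the fixed `refresh_every` policy are illustrative assumptions, not a published recipe):

```python
def sample_with_cached_pred(model, scheduler, x, cond_emb, refresh_every=2):
    # Run the full forward pass only on every `refresh_every`-th step and
    # reuse the cached prediction in between; this works to the extent
    # that the prediction really does drift slowly across timesteps.
    cached_pred = None
    for i, t in enumerate(scheduler.timesteps):
        if cached_pred is None or i % refresh_every == 0:
            cached_pred = model(x, t, cond_emb)
        x = scheduler.step(cached_pred, t, x).prev_sample
    return x
```

Even `refresh_every=2` halves the number of model calls; the open question is on which part of the timestep schedule the reuse is actually safe.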
Check out my notes/blog here: abhinay1997.github.io

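The ΔW check from the LoRA/ControlNet row reduces to a truncated SVD per weight matrix: factor the weight difference at a fixed rank and measure what is lost. A minimal sketch (the weight pair is a stand-in for a matched Flux dev / ControlNet weight matrix, and `rank=64` is an arbitrary choice):

```python
import torch

def lora_decomposition_error(w_base, w_ctrl, rank=64):
    # Rank-`rank` (LoRA-style) approximation of dW = w_ctrl - w_base via
    # truncated SVD; returns the relative reconstruction error in
    # Frobenius norm. Low error suggests a plain LoRA is enough.
    dw = (w_ctrl - w_base).float()
    u, s, vh = torch.linalg.svd(dw, full_matrices=False)
    dw_lowrank = u[:, :rank] @ torch.diag(s[:rank]) @ vh[:rank, :]
    return (torch.linalg.norm(dw - dw_lowrank) / torch.linalg.norm(dw)).item()
```

Sweeping `rank` gives the error-vs-rank curve that decides between a plain LoRA and a higher-rank method like MoRA.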
Pinned repositories:

1. Transformer-layers-as-painters-DiT: Repo for the article "Extending transformer layers as painters to DiTs" (Python)

2. Attention-Scaling-CogVideoX: Attention scaling for inference-time entropy correction, as suggested in https://arxiv.org/abs/2306.08645 and https://jfischoff.github.io/blog/motion_control_with_attention_scaling.html

3. FIFO-CogVideoX: FIFO applied to the CogVideoX models (Jupyter Notebook)

4. Flux-cfg-comparisions: Comparing how different CFG-like methods work on Flux (Jupyter Notebook)