msdsm/paper-summary

Paper Summaries

  • Equations break in GitHub's Markdown preview, so please refer to the PDFs.

Papers

  • vision-basic
    • u-net : U-Net(Convolutional Networks for Biomedical Image Segmentation)
    • vision-transformer : VisionTransformer(AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE)
    • swin-transformer : SwinTransformer(Swin Transformer: Hierarchical Vision Transformer using Shifted Windows)
    • distillation : Distilling the Knowledge in a Neural Network
    • maxvit : MaxViT: Multi-Axis Vision Transformer
    • mae : Masked Autoencoders Are Scalable Vision Learners
    • simmim : SimMIM: a Simple Framework for Masked Image Modeling
    • revnet : The Reversible Residual Network: Backpropagation Without Storing Activations
    • rev-vit : Reversible Vision Transformers
  • diffusion
    • ddpm : Denoising Diffusion Probabilistic Models
    • palette : Palette: Image-to-Image Diffusion Models
    • ddim : Denoising Diffusion Implicit Models
    • improved-ddpm : Improved Denoising Diffusion Probabilistic Models
    • adm : Diffusion Models Beat GANs on Image Synthesis
    • glide : Guided Language to Image Diffusion for Generation and Editing
    • ldm : Latent Diffusion Model(Stable diffusion)
    • cdm : Cascaded Diffusion Model
    • inpaint-survey : Deep Learning-based Image and Video Inpainting: A Survey
  • super-resolution
    • srcnn-vdsr-fsrcnn-fspcn : History of super-resolution (from early CNNs up to the advent of GANs)
    • swinir : SwinIR(SwinIR: Image Restoration Using Swin Transformer)
    • hat : HAT-L(Hybrid Attention Transformer)
    • drct : DRCT(Dense Residual Connected Transformer)
    • sr3 : Image Super-Resolution via Iterative Refinement
    • ipg : Image Processing GNN: Breaking Rigidity in Super-Resolution
    • yonos-sr : You Only Need One Step: Fast Super-Resolution with Stable Diffusion via Scale Distillation
    • hmanet : HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution
    • diffusion-sr-survey : Diffusion Models, Image Super-Resolution And Everything: A Survey
    • tr-misr : TR-MISR: Multiimage Super-Resolution Based on Feature Fusion With Transformers
    • div2k : NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study
    • lsdir : LSDIR: A Large Scale Dataset for Image Restoration
    • df2k : DF2K
    • ntire-challenge-on-lfsr : NTIRE 2024 Challenge on Light Field Image Super-Resolution: Methods and Results
    • epit : (EPIT)Learning Non-Local Spatial-Angular Correlation for Light Field Image Super-Resolution
    • pixel-shuffle : Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network
    • datsr : Reference-based Image Super-Resolution with Deformable Attention Transformer
    • ais2024challenge-survey : Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey
  • image-restoration
    • aioir-survey : A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends
    • ram : Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration
  • deblurring
    • image-deblurring-survey : Deep Image Deblurring: A Survey
    • adarevd : AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring
  • 3dgs
    • 3dgs : 3D Gaussian Splatting for Real-Time Radiance Field Rendering
    • srgs : SRGS: Super-Resolution 3D Gaussian Splatting
    • gaussiansr : GaussianSR: 3D Gaussian Super-Resolution with 2D Diffusion Priors
    • supergaussian : SuperGaussian: Repurposing Video Models for 3D Super Resolution
    • supergs : SuperGS: Super-Resolution 3D Gaussian Splatting via Latent Feature Field and Gradient-guided Splitting
    • e-3dgs : Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting
    • deblurring-3dgs : Deblurring 3D Gaussian Splatting
  • nerf
    • nerf : NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
    • nerf-sr : NeRF-SR: High Quality Neural Radiance Fields using Supersampling
    • mip-nerf : Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
    • crop : Cross-Guided Optimization of Radiance Fields with Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis
  • video
    • adatad : End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
    • iaw : Aligning Step-by-Step Instructional Diagrams to Video Demonstrations
  • vision-and-language
    • clip : CLIP(Learning Transferable Visual Models From Natural Language Supervision)
    • lit : LiT: Zero-Shot Transfer with Locked-image text Tuning
    • blip : BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
    • blip2 : BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
    • siglip : Sigmoid Loss for Language Image Pre-Training
    • Flamingo: a Visual Language Model for Few-Shot Learning
    • video-llm-survey : Video Understanding with Large Language Models: A Survey (in progress)
    • llava : Visual Instruction Tuning
    • llava-next-video : blog
    • llava-next-stronger : blog
    • llava-video : VIDEO INSTRUCTION TUNING WITH SYNTHETIC DATA
    • long-vlm : LongVLM: Efficient Long Video Understanding via Large Language Models(ECCV2024)
    • tcr : Text-Conditioned Resampler For Long Form Video Understanding(ECCV2024)
  • nlp
    • keyword : Glossary of LLM terms
    • transformer : Transformer(Attention is all you need)
    • lora : LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS
    • auxiliary-loss-free : Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts
    • deepseek-v3 : DeepSeek-V3 Technical Report
