Janus-Series: Unified Multimodal Understanding and Generation Models

Python 16,899 2,210 Updated Feb 1, 2025

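nsfc - LaTeX template for National Natural Science Foundation of China (NSFC) project proposals (General, Youth, and Regional programs)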

TeX 470 129 Updated Mar 15, 2025

ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation"

Python 116 23 Updated Feb 11, 2022

I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models

206 4 Updated Dec 30, 2023

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Jupyter Notebook 572 31 Updated Oct 6, 2024
Python 77 7 Updated May 25, 2024

Code of "3D Shape Variational Autoencoder Latent Disentanglement via Mini-Batch Feature Swapping for Bodies and Faces"

Python 63 10 Updated Jul 2, 2022

Real3D-Portrait: One-shot Realistic 3D Talking Portrait Synthesis; ICLR 2024 Spotlight; Official code

Python 1,018 120 Updated Oct 18, 2024

CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!

1,586 147 Updated May 9, 2023
Python 184 24 Updated May 2, 2023

A Python library for audio data augmentation. Useful for making audio ML models work well in the real world, not just in the lab.

Python 1,985 197 Updated Mar 21, 2025

Source code for models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)

Python 805 98 Updated Sep 30, 2021
Jupyter Notebook 39 10 Updated Oct 17, 2023

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 43,225 4,816 Updated Mar 26, 2025

[CVPR-2022] Official implementation for "Knowledge Distillation with the Reused Teacher Classifier".

Python 92 18 Updated Jun 16, 2022

A PyTorch Implementation of AC-SUM-GAN from "AC-SUM-GAN: Connecting Actor-Critic and Generative Adversarial Networks for Unsupervised Video Summarization" (IEEE TCSVT 2021)

Python 29 10 Updated May 4, 2022

Source code for the paper "Unsupervised Video Summarization via Multi-source Features" published at ICMR 2021

Python 21 10 Updated Apr 5, 2022
Python 15 3 Updated Jul 10, 2024

The code for ICASSP23 paper "MHSCNet: A Multimodal Hierarchical Shot-aware Convolutional Network for Video Summarization"

Python 10 1 Updated Aug 12, 2024
Python 8 Updated Feb 29, 2024

video summarization research repo

Python 2 1 Updated Jul 10, 2022

PyTorch code for the paper "Contrastive Losses Are Natural Criteria for Unsupervised Video Summarization"

Python 21 3 Updated Jan 7, 2023

The official implementation of 'Align and Attend: Multimodal Summarization with Dual Contrastive Losses' (CVPR 2023)

Python 75 10 Updated Apr 24, 2023
Jupyter Notebook 15 3 Updated Mar 29, 2023

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

Python 13 Updated May 10, 2022
Python 27 7 Updated Oct 7, 2021
Python 76 9 Updated Mar 27, 2024

[CVPR 2023] Official implementation for "CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion."

Python 488 45 Updated Jan 15, 2025

A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization

Python 1,830 150 Updated Mar 19, 2025

Swift Parameter-free Attention Network for Efficient Super-Resolution

Python 159 9 Updated Jun 27, 2024