This repository contains code to train a Korean CLIP model on MS-COCO using the Korean annotations provided by AI-HUB. To obtain additional Korean captions, we translate the English captions of the VizWiz dataset into Korean with the Naver Papago translator.
The original CLIP was trained on a very large dataset, whereas ours is much smaller. Because Korean caption data is scarce, we start from pretrained language and vision models so that good representations can still be learned from the smaller dataset.
- The pretrained language model (PLM) is fixed to klue/roberta-large from Hugging Face to obtain strong Korean text representations.
- As pretrained vision models (PVMs), we use google/vit-base-patch16-224-in21k from Hugging Face and RN101 from torchvision to obtain image representations (see the sketch after this list).
- The images themselves do not depend on the amount of Korean data, but because CLIP is trained on text-image pairs, Ko-CLIP is trained only on the limited set of images that have Korean captions.
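The repository's actual training code may differ, but as a minimal sketch of the dual-encoder setup described above, the two pretrained encoders can be combined with small projection heads and a symmetric contrastive (InfoNCE) loss. The class name `KoCLIP`, the projection dimension, and the loss layout are illustrative assumptions; only the checkpoint names come from the list above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from transformers import AutoModel

class KoCLIP(nn.Module):
    """Dual encoder: a Korean PLM for text and a PVM for images,
    each followed by a linear projection into a shared embedding space."""
    def __init__(self, text_model="klue/roberta-large",
                 vision_model="google/vit-base-patch16-224-in21k",
                 embed_dim=512):  # embed_dim is an illustrative choice
        super().__init__()
        self.text_encoder = AutoModel.from_pretrained(text_model)
        self.vision_encoder = AutoModel.from_pretrained(vision_model)
        self.text_proj = nn.Linear(self.text_encoder.config.hidden_size, embed_dim)
        self.vision_proj = nn.Linear(self.vision_encoder.config.hidden_size, embed_dim)
        self.logit_scale = nn.Parameter(torch.tensor(2.6592))  # log(1/0.07), as in CLIP

    def encode_text(self, **text_inputs):
        cls = self.text_encoder(**text_inputs).last_hidden_state[:, 0]  # [CLS] token
        return F.normalize(self.text_proj(cls), dim=-1)

    def encode_image(self, pixel_values):
        cls = self.vision_encoder(pixel_values=pixel_values).last_hidden_state[:, 0]
        return F.normalize(self.vision_proj(cls), dim=-1)

    def forward(self, pixel_values, **text_inputs):
        image_emb = self.encode_image(pixel_values)
        text_emb = self.encode_text(**text_inputs)
        logits = self.logit_scale.exp() * image_emb @ text_emb.t()
        labels = torch.arange(len(logits), device=logits.device)
        # Symmetric InfoNCE over the in-batch image-text pairs
        return (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels)) / 2
```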
See the WandB dashboard to check the training records and to compare model performance across the pretrained vision models.
For zero-shot classification, we evaluate on the CIFAR-10 and CIFAR-100 datasets.
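The snippet below is only a hedged sketch of how such an evaluation could look, reusing the `KoCLIP` class from the previous example: each CIFAR-10 class is turned into a Korean prompt, every image is matched against all prompts by cosine similarity, and the highest-scoring class is taken as the prediction. The prompt templates, the `zero_shot_predict` helper, and the processor choices are assumptions, not the repository's actual evaluation code.

```python
import torch
from torchvision.datasets import CIFAR10
from transformers import AutoTokenizer, AutoImageProcessor

@torch.no_grad()
def zero_shot_predict(model, tokenizer, image_processor, images, class_prompts):
    """Match each image against one Korean prompt per class and
    return the index of the best-matching class."""
    text_inputs = tokenizer(class_prompts, padding=True, return_tensors="pt")
    text_emb = model.encode_text(**text_inputs)                      # (C, D)
    pixel_values = image_processor(images, return_tensors="pt")["pixel_values"]
    image_emb = model.encode_image(pixel_values)                     # (B, D)
    # Embeddings are L2-normalized, so the dot product is cosine similarity.
    return (image_emb @ text_emb.t()).argmax(dim=-1)                 # (B,)

# Illustrative usage: hypothetical Korean prompts for the 10 CIFAR-10 classes.
tokenizer = AutoTokenizer.from_pretrained("klue/roberta-large")
image_processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
prompts = ["비행기 사진", "자동차 사진", "새 사진", "고양이 사진", "사슴 사진",
           "개 사진", "개구리 사진", "말 사진", "배 사진", "트럭 사진"]
test_set = CIFAR10(root="./data", train=False, download=True)
images = [test_set[i][0] for i in range(8)]        # a few PIL images
model = KoCLIP().eval()                            # from the sketch above (untrained here)
preds = zero_shot_predict(model, tokenizer, image_processor, images, prompts)
```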
We refer to CLIP, the clip-training repository for the training code, the koclip idea, and the other pretrained models mentioned above.