## Installing a pretrained model from Hugging Face
The Hugging Face Hub is a very useful repository.  
However, downloading a model from the Hub requires an internet connection, so we need to save the model as a Kaggle dataset before we can use it in a submission notebook.  
In this notebook, I will show you how to clone a Hugging Face Hub model, output it, and save it as a Kaggle dataset.

The model saved from this notebook is available [here](https://www.kaggle.com/spshota/hugging-face-hub-xlmrobertalargesquad2).

## Cloning from the Hugging Face Hub

In [1]:
model_name = 'xlm-roberta-large-squad2'
model_path = f'deepset/{model_name}'

!git clone https://huggingface.co/{model_path}

## Let's take a look at the cloned files.

In [1]:
!ls -la {model_name}

The repository was cloned successfully.  
But "pytorch_model.bin" is very small.  
This happens because the larger files are managed by [Git Large File Storage (LFS)](https://git-lfs.github.com/).

When a repository is cloned without Git LFS, the files it manages are downloaded as small pointer files instead of the real files.  
To actually download the large files, you need to check out the repository using Git LFS.
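
To make the idea of a pointer concrete, here is a minimal Python sketch that parses the text of a Git LFS pointer. It relies on the pointer layout from the Git LFS specification (a `version` line followed by `oid` and `size` lines). The function name `parse_lfs_pointer` and the all-zero hash are placeholders for illustration, not part of any library. With a SHA-256 hash and the file size reported later in this notebook, such a pointer comes to 135 bytes.

```python
LFS_SPEC_LINE = "version https://git-lfs.github.com/spec/v1"


def parse_lfs_pointer(text):
    """Return {'oid': ..., 'size': ...} if text looks like a Git LFS pointer, else None."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != LFS_SPEC_LINE:
        return None
    # Remaining lines are "key value" pairs such as "oid sha256:<hash>" and "size <bytes>".
    fields = dict(line.split(" ", 1) for line in lines[1:] if " " in line)
    if "oid" not in fields or not fields.get("size", "").isdigit():
        return None
    return {"oid": fields["oid"], "size": int(fields["size"])}


# Placeholder hash (64 zeros); the real pointer carries the file's SHA-256.
sample_pointer = (
    LFS_SPEC_LINE + "\n"
    + "oid sha256:" + "0" * 64 + "\n"
    + "size 2239666418\n"
)
pointer_info = parse_lfs_pointer(sample_pointer)
print(pointer_info["size"], len(sample_pointer))
```

The `size` field holds the size of the real file, which is why a pointer of a hundred-odd bytes can stand in for a multi-gigabyte checkpoint.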

## Installing Git LFS

In [1]:
!curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | bash
!apt-get install -y git-lfs

## Checking out the repository using Git LFS

In [1]:
!git -C {model_name} lfs install
!git -C {model_name} lfs pull

## Let's take a look at the checked out files.

In [1]:
!ls -la {model_name}

The size of "pytorch_model.bin" has changed from 135 bytes to 2,239,666,418 bytes!!
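
Instead of comparing sizes by eye, the check can be automated. The sketch below walks a directory and reports any file that still begins with the Git LFS pointer header. The helper `find_lfs_pointers` is a hypothetical name, and the demo runs against a temporary directory with a fake pointer rather than the real clone.

```python
import os
import tempfile

LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(root):
    """Return sorted relative paths of files under root that still look like LFS pointers."""
    leftovers = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue
            # Only the first few bytes are read, so large files are cheap to check.
            with open(path, "rb") as fh:
                head = fh.read(len(LFS_PREFIX))
            if head == LFS_PREFIX:
                leftovers.append(os.path.relpath(path, root))
    return sorted(leftovers)


# Demo on a throwaway directory: one fake pointer, one ordinary file.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "pytorch_model.bin"), "wb") as fh:
        fh.write(LFS_PREFIX + b"\noid sha256:" + b"0" * 64 + b"\nsize 2239666418\n")
    with open(os.path.join(tmp, "config.json"), "w") as fh:
        fh.write("{}")
    demo_leftovers = find_lfs_pointers(tmp)
print(demo_leftovers)
```

In this notebook, calling `find_lfs_pointers(model_name)` after the pull should return an empty list if every large file was downloaded.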

## Removing files related to Git.

In [1]:
!rm -rf {model_name}/.git*
!ls -la {model_name}

## Let's save the model as a Kaggle Dataset!!  
![saved_as_detaset.png](attachment:046e591b-d064-4af9-b2cc-1ae69eaeca48.png)
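
Once the dataset is attached to a submission notebook, the model can be loaded from disk without internet access. The following is a rough sketch under several assumptions: that Kaggle mounts the dataset under `/kaggle/input/<dataset-slug>`, that the slug matches the dataset linked at the top of this notebook, and that the `transformers` library is available. The helpers `find_model_dir` and `load_offline` are hypothetical names introduced here.

```python
import os


def find_model_dir(root):
    """Return the first directory under root that contains config.json, or None."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        if "config.json" in filenames:
            return dirpath
    return None


def load_offline(model_dir):
    """Load the tokenizer and question-answering model from a local directory only."""
    # Imported lazily so the helper above works even without transformers installed.
    from transformers import AutoModelForQuestionAnswering, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir, local_files_only=True)
    model = AutoModelForQuestionAnswering.from_pretrained(model_dir, local_files_only=True)
    return tokenizer, model


# Assumed mount point; adjust it to the dataset you actually attached.
DATASET_ROOT = "/kaggle/input/hugging-face-hub-xlmrobertalargesquad2"
model_dir = find_model_dir(DATASET_ROOT)
if model_dir is not None:
    tokenizer, model = load_offline(model_dir)
```

Searching for `config.json` avoids hard-coding the folder layout inside the dataset, which depends on how the notebook output was saved.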