# 🚀 Sharing Pretrained Models on the Hugging Face Hub (PyTorch)

This notebook demonstrates the main ways to share pretrained models and tokenizers on the Hugging Face Hub.  
You'll push models with the `push_to_hub()` API, with the `huggingface_hub` Python library, and with Git, so that your work is reproducible and easy to share.




## 1️⃣ Install Required Dependencies

We start by installing the necessary libraries: `transformers`, `datasets`, and `evaluate` for modeling and metrics, as well as `git-lfs` for handling large files on the Hub.


In [18]:
# Install main libraries and git-lfs for model upload support
!pip install datasets evaluate transformers[sentencepiece]
!apt install git-lfs


Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
git-lfs is already the newest version (3.0.2-1ubuntu0.3).
0 upgraded, 0 newly installed, 0 to remove and 35 not upgraded.


## 2️⃣ Git Configuration

Set up your Git identity. This is required for version control and model pushes.  
Replace the email and name below with your own; Git records them as the author of every commit you push.


In [19]:
!git config --global user.email "lakshmi.adhikari26@gmail.com"
!git config --global user.name "lakshmi-adhikari-ai"

## 3️⃣ Authenticate with the Hugging Face Hub

Log in so that your notebook can create and update model repositories under your account.


In [20]:
from huggingface_hub import notebook_login
# Log in and authenticate this environment to access your Hugging Face account
notebook_login()

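
Outside a notebook (in a script or a CI job, for example) the interactive widget is not available. The following is a minimal sketch, assuming the access token is stored in the `HF_TOKEN` environment variable; the helper names `read_hub_token` and `login_from_env` are hypothetical and not part of any library.

```python
import os


def read_hub_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in `env_var`, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to a Hub access token.")
    return token


def login_from_env(env_var: str = "HF_TOKEN") -> None:
    """Authenticate without the notebook widget (hypothetical helper)."""
    # Imported lazily so the token check above works even without huggingface_hub installed.
    from huggingface_hub import login

    login(token=read_hub_token(env_var))
```

Calling `login_from_env()` once at the start of a script should have the same effect as completing the `notebook_login()` prompt.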

## 4️⃣ Define TrainingArguments for Push-to-Hub (if you use Trainer)

With `push_to_hub=True`, Hugging Face’s `Trainer` uploads the model to the Hub each time it is saved (here, at the end of every epoch), so checkpoints reach the repository without extra code.


In [21]:
from transformers import TrainingArguments
# Set up training arguments to enable automatic upload to the Hub
training_args = TrainingArguments(
    "bert-finetuned-mrpc",
    save_strategy="epoch",
    push_to_hub=True,
)
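
If the repository should not simply be named after the output directory, `TrainingArguments` also accepts `hub_model_id` and `hub_strategy`. The sketch below is illustrative only: `<namespace>` is a placeholder for your own user or organization name, and `build_training_args` is a hypothetical helper.

```python
# Keyword arguments for TrainingArguments, kept in a plain dict so they can be
# inspected before transformers or a GPU is involved.
hub_training_kwargs = {
    "output_dir": "bert-finetuned-mrpc",
    "save_strategy": "epoch",  # save (and therefore push) once per epoch
    "push_to_hub": True,
    "hub_model_id": "<namespace>/bert-finetuned-mrpc",  # placeholder repo id
    "hub_strategy": "every_save",  # push on every save
}


def build_training_args(kwargs=hub_training_kwargs):
    """Create TrainingArguments from the dict above (hypothetical helper)."""
    # Lazy import: requires transformers and a PyTorch installation.
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)
```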

## 5️⃣ Push Model and Tokenizer Using `.push_to_hub()`

Let's load a pretrained model and tokenizer and push them to a new repo on the Hub, all from code!


In [22]:
from transformers import AutoModelForMaskedLM, AutoTokenizer

checkpoint="camembert-base"

# Load pretrained model and tokenizer
model=AutoModelForMaskedLM.from_pretrained(checkpoint)
tokenizer=AutoTokenizer.from_pretrained(checkpoint)

# Push model and tokenizer to your namespace on the Hub
model.push_to_hub("dummy-model")
tokenizer.push_to_hub("dummy-model")

Some weights of the model checkpoint at camembert-base were not used when initializing CamembertForMaskedLM: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing CamembertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing CamembertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


Processing Files (0 / 0)                : |          |  0.00B /  0.00B            

New Data Upload                         : |          |  0.00B /  0.00B            

  /tmp/tmpc_mutfuz/model.safetensors    :   2%|1         | 8.35MB /  443MB            

No files have been modified since last commit. Skipping to prevent empty commit.


Processing Files (0 / 0)                : |          |  0.00B /  0.00B            

New Data Upload                         : |          |  0.00B /  0.00B            

  ...tmpc9ealn1p/sentencepiece.bpe.model: 100%|##########|  811kB /  811kB            

No files have been modified since last commit. Skipping to prevent empty commit.


CommitInfo(commit_url='https://huggingface.co/Lakshmi26/dummy-model/commit/a2f34864c15daa6930a6cdc0c37c4ab52c7f2d13', commit_message='Upload tokenizer', commit_description='', oid='a2f34864c15daa6930a6cdc0c37c4ab52c7f2d13', pr_url=None, repo_url=RepoUrl('https://huggingface.co/Lakshmi26/dummy-model', endpoint='https://huggingface.co', repo_type='model', repo_id='Lakshmi26/dummy-model'), pr_revision=None, pr_num=None)

### 🤝 Advanced: Uploading to Organizations or with Tokens

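You can also push to an organization namespace or authenticate with a specific access token. Replace `<TOKEN>` with a real token: the literal placeholder is rejected with the `401 Unauthorized` error shown in the output of this cell.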


In [23]:
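# Push tokenizer to an organization namespace (and/or use a specific auth token)
# Note: recent transformers releases deprecate `organization` and `use_auth_token`
# in favor of repo_id="<org>/<name>" and token="<TOKEN>".

tokenizer.push_to_hub("dummy-model", organization="huggingface")
tokenizer.push_to_hub("dummy-model", organization="huggingface", use_auth_token="<TOKEN>")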

Processing Files (0 / 0)                : |          |  0.00B /  0.00B            

New Data Upload                         : |          |  0.00B /  0.00B            

  ...tmp7ivgorv6/sentencepiece.bpe.model: 100%|##########|  811kB /  811kB            

No files have been modified since last commit. Skipping to prevent empty commit.


HfHubHTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/repos/create (Request ID: Root=1-68b6906d-6ec99081113a9a7058634379;f9c589e5-82f8-40cc-b6fc-4e311dce503c)

Invalid credentials in Authorization header
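
In recent `transformers` releases the namespace is written into the repo id and the token is passed as `token`. A small sketch assuming that newer API: `qualified_repo_id` and `push_tokenizer` are hypothetical helpers, and the `my-org` name in the usage note is made up.

```python
from typing import Optional


def qualified_repo_id(name: str, namespace: Optional[str] = None) -> str:
    """Return 'namespace/name', or just 'name' for the logged-in user's account."""
    if "/" in name:
        raise ValueError("Pass the bare repository name; use `namespace` for the owner.")
    return f"{namespace}/{name}" if namespace else name


def push_tokenizer(tokenizer, name: str, namespace: Optional[str] = None, token: Optional[str] = None):
    """Push a tokenizer to a user or organization namespace (hypothetical helper)."""
    # token=None falls back to the credentials stored at login time.
    return tokenizer.push_to_hub(qualified_repo_id(name, namespace), token=token)
```

For example, `push_tokenizer(tokenizer, "dummy-model", namespace="my-org")` would target `my-org/dummy-model`, provided your account has write access to that organization.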

## 6️⃣ Managing Repositories Programmatically with `huggingface_hub`

The `huggingface_hub` package gives you fine control: create/delete repos, upload/delete files, update metadata, and more.


In [None]:
from huggingface_hub import (
    # User management
    login,
    logout,
    whoami,

    # Repository management
    create_repo,
    delete_repo,
    update_repo_visibility,  # deprecated in recent releases in favor of update_repo_settings

    # Information
    list_models,
    list_datasets,
    # list_metrics,
    list_repo_files,
    upload_file,
    delete_file,
)



### Create a new model repository (your account or organization):

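`create_repo()` creates the repository under your own account by default. To create it in an organization, prefix the name with the organization's namespace.

In [None]:
from huggingface_hub import create_repo

create_repo("dummy-model1")
# Or, in an organization: put the namespace in the repo id
# (the old `organization=` keyword is no longer accepted by current huggingface_hub)
create_repo("huggingface/dummy-model1")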

### Upload files directly to the repository from Python (no git required):

`upload_file()` sends a single file to a repo over HTTP and records it as a commit, so no local clone is needed.


In [None]:
from huggingface_hub import upload_file
upload_file(
    path_or_fileobj="<path_to_file>/config.json",  # keyword-only in current huggingface_hub
    path_in_repo="config.json",
    repo_id="<namespace>/dummy-model",
)
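
To upload several files, the same call can be repeated once per file. The sketch below first builds a list of (local path, `path_in_repo`) pairs so the upload can be reviewed before anything is sent. `plan_uploads` and `upload_planned_files` are hypothetical helper names; `huggingface_hub` also provides `upload_folder()`, which sends a whole directory in a single commit.

```python
from pathlib import Path
from typing import List, Tuple


def plan_uploads(folder: str) -> List[Tuple[str, str]]:
    """List every file under `folder` with the path it would get in the repo."""
    root = Path(folder)
    pairs = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Repo paths use forward slashes, whatever the local OS.
            pairs.append((str(path), path.relative_to(root).as_posix()))
    return pairs


def upload_planned_files(folder: str, repo_id: str) -> None:
    """Upload each planned file with one upload_file() call (one commit per file)."""
    # Lazy import: only needed (along with network access) when actually uploading.
    from huggingface_hub import upload_file

    for local_path, path_in_repo in plan_uploads(folder):
        upload_file(path_or_fileobj=local_path, path_in_repo=path_in_repo, repo_id=repo_id)
```

Printing `plan_uploads("<path_to_file>")` before calling `upload_planned_files` is a cheap way to catch stray files such as caches or notebooks checkpoints.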

## 7️⃣ Full Local Git-like Control with `Repository` Class

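Use this class to manage a local clone of your repo with traditional Git operations (pull, add, commit, push). Note that recent `huggingface_hub` releases deprecate `Repository` in favor of the HTTP-based functions such as `upload_file()`.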


In [None]:
from huggingface_hub import Repository

# Clone your remote repo to a local folder
repo = Repository("<path_to_dummy_folder>", clone_from="<namespace>/dummy-model")

# Available Git-style methods: git_pull(), git_add(), git_commit(), git_push()
# Sync with the remote before making changes
repo.git_pull()

# Save your model/tokenizer to this folder, then stage, commit, and push.
# (Committing with nothing staged raises an error, so save the files first.)
model.save_pretrained("<path_to_dummy_folder>")
tokenizer.save_pretrained("<path_to_dummy_folder>")
repo.git_add()
repo.git_commit("Add model and tokenizer files")
repo.git_push()

## 8️⃣ (Optional) Manual Git/LFS Push

For advanced or very large projects, clone the repo with Git and use LFS to manage weights.
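
The manual workflow comes down to a handful of Git commands. The sketch below only assembles them as argument lists and prints them; nothing is executed. `manual_push_commands` is a hypothetical helper, and the repo id, folder name, and commit message are placeholders.

```python
import shlex
from typing import List


def manual_push_commands(repo_id: str, local_dir: str, message: str) -> List[List[str]]:
    """Return the Git/LFS commands for a manual push, in the order to run them."""
    return [
        ["git", "lfs", "install"],  # set up the Git LFS hooks
        ["git", "clone", f"https://huggingface.co/{repo_id}", local_dir],
        # ... save or copy the model files into `local_dir` at this point ...
        ["git", "-C", local_dir, "add", "."],
        ["git", "-C", local_dir, "commit", "-m", message],
        ["git", "-C", local_dir, "push"],
    ]


# Print the commands in a copy-pasteable, shell-quoted form.
for command in manual_push_commands("<namespace>/dummy-model", "dummy-model", "Add model files"):
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` if you prefer to drive Git from Python rather than from a terminal.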


# ✅ Summary

This notebook covers the main ways to share models and tokenizers on the Hugging Face Hub using Python and Git workflows.  
After uploading, remember to:
- Check your repo on [huggingface.co](https://huggingface.co/)  
- Edit your model card for clarity and completeness  
- Share the link on your portfolio or with collaborators!

You’re now ready to share any model you build, directly from a reproducible ML notebook!
