# Hub Quickstart

This notebook walks through the [Quickstart guide](https://huggingface.co/docs/huggingface_hub/main/en/quick-start#authentication) for the Hugging Face Hub.

The `devcontainer.json` for this repo sets up the Python environment with necessary packages. We'll validate those first and install any lab-specific extras on demand.

To run the notebook, launch GitHub Codespaces to get the development environment ready. Then open this notebook and select the Python 3 kernel (currently `Python 3.12.x`). Run the cells one at a time to understand each step, or use `Run All` if you want to execute everything first.

Note that the Visual Studio Code environment should have support for auto-completion and inline documentation for the Hugging Face libraries thanks to the installed `llm-vscode` extension.

---

### 1.1 Validate and Fix Setup

In [None]:
# Check huggingface-hub package is installed
!! pip show huggingface-hub

In [None]:
# Check if you are logged in (we can fix that later if needed)
!! huggingface-cli whoami

In [None]:
# Try downloading a file (should work without login)
from huggingface_hub import hf_hub_download
downloaded = hf_hub_download(filename="config.json", repo_id="facebook/bart-large-cnn")
print("File downloaded here: ", downloaded)

In [None]:
# Check that HF_TOKEN is defined in .env - else fix it now
import os
from dotenv import load_dotenv

load_dotenv()
token = os.getenv("HF_TOKEN")
if not token:
    print("Add your Hugging Face token to the .env file, then restart the kernel to apply it.")
else:
    print("Your Hugging Face token is correctly defined.")
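
Notebook outputs are saved with the notebook, so it is safer to confirm the token without echoing it in full. A minimal sketch, assuming a hypothetical helper `mask_token` (it is not part of `huggingface_hub` or `python-dotenv`) and a made-up token value:

```python
def mask_token(token, visible=4):
    """Return a display-safe version of a secret (hypothetical helper)."""
    if not token:
        return "<not set>"
    if len(token) <= visible:
        # Too short to reveal anything safely: hide every character
        return "*" * len(token)
    # Keep a short prefix so the token can be recognised, hide the rest
    return token[:visible] + "*" * (len(token) - visible)


# Made-up value for illustration; in the notebook you would pass `token`
print("HF_TOKEN:", mask_token("hf_exampleexample"))
```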


In [None]:
# Login programmatically if needed
# You can also login from CLI using `huggingface-cli login`
from huggingface_hub import login
login(token=token)

In [None]:
# Verify you are logged in using CLI
# Should identify your user name and HF orgs you belong to
!huggingface-cli whoami

### 1.2 Create & Use Repository

In [None]:
# Use HfApi to create a new repo, set to private for now
# Repos can be used for models, datasets or Spaces (apps)
# They get created at the following URL:
#     https://huggingface.co/<username>/<repo_id>
from huggingface_hub import HfApi
api = HfApi()
api.create_repo(repo_id="test-hf-repo", private=True)

Visit the repository in the browser.

- Since the repo was created with the default type, the Hub treats it as a "Model" repo
- The README.md is expected to reflect the [Model Card](https://huggingface.co/docs/hub/model-cards#model-card-metadata) format
- You can create/commit the default README.md from the website **now**
- Once done, return here and we can programmatically download and upload that file
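
The browser URL of a repo depends on its type: model repos sit directly under `https://huggingface.co/`, while dataset and Space repos are prefixed with `datasets/` and `spaces/`. A minimal sketch of that mapping, with `repo_url_for` as a hypothetical helper written for illustration only (`create_repo` already returns the URL of the repo it creates):

```python
HUB = "https://huggingface.co"

# URL prefix per repo type; models have no prefix
PREFIX = {"model": "", "dataset": "datasets/", "space": "spaces/"}


def repo_url_for(username, repo_name, repo_type="model"):
    """Build the browser URL for a repo (hypothetical helper)."""
    if repo_type not in PREFIX:
        raise ValueError(f"Unknown repo type: {repo_type!r}")
    return f"{HUB}/{PREFIX[repo_type]}{username}/{repo_name}"


print(repo_url_for("<username>", "test-hf-repo"))
print(repo_url_for("<username>", "test-hf-repo", repo_type="dataset"))
```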

In [None]:
# Download the model card README from the repo
# See: https://huggingface.co/docs/huggingface_hub/main/en/guides/download
#   for more options on download parameters
# By default, downloads are saved to the system's default `cache_dir`,
#   which is usually `~/.cache/huggingface/hub`
from huggingface_hub import hf_hub_download
download = hf_hub_download(filename="README.md", repo_id="nityan/test-hf-repo")
print("File downloaded here: ", download)
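
The path printed by `hf_hub_download` points into the Hub cache. As a rough sketch of how that location is derived, the hypothetical helpers below assume the cache root is taken from `HF_HUB_CACHE`, else `<HF_HOME>/hub`, else `~/.cache/huggingface/hub`, and that each repo gets a folder named `<type>s--<owner>--<name>`. The real logic lives inside `huggingface_hub`, handles more cases (for example `XDG_CACHE_HOME`), and may differ between versions.

```python
import os
from pathlib import Path


def hub_cache_root(env=None):
    """Guess the Hub cache root from environment variables (illustrative)."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    hf_home = env.get("HF_HOME") or "~/.cache/huggingface"
    return Path(hf_home).expanduser() / "hub"


def expected_cache_folder(repo_id, repo_type="model", env=None):
    """Return the folder where a repo's files are expected to be cached."""
    # e.g. "facebook/bart-large-cnn" -> "models--facebook--bart-large-cnn"
    parts = [f"{repo_type}s", *repo_id.split("/")]
    return hub_cache_root(env) / "--".join(parts)


print(expected_cache_folder("facebook/bart-large-cnn"))
```

Downloaded files sit below this folder, in a `snapshots/<revision>/` subfolder.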

In [None]:
# Let's change this to download to a specific local folder
#   You should now see a `README.md` in the local folder
#   And its contents should reflect that of the Model card created earlier
download = hf_hub_download(filename="README.md", repo_id="nityan/test-hf-repo", local_dir=".")
print("File downloaded here: ", download)

In [None]:
# Make some changes to the README.md file locally
# Then let's upload the changes back to the repo
# And verify the changes by visiting the repo in the browser
from huggingface_hub import HfApi
api = HfApi()
result = api.upload_file(
    path_or_fileobj="README.md",
    path_in_repo="README.md",
    repo_id="nityan/test-hf-repo",
)
print("Upload result: ", result)

### 1.3 Search Hub for Models

### 1.4 Search Hub for Datasets

### 1.5 Search Hub for Collections

This section works through the examples in [this documentation set](https://huggingface.co/docs/huggingface_hub/guides/collections) on Collections.

A [collection](https://huggingface.co/docs/hub/collections) is a group of related items on the Hub (models, datasets, Spaces, papers) that are organized together on the same page. You can create, update, collaborate on, control access to, organize, and delete collections via the Hub UI. Here's what you can do with the Hub API:

- `get_collection` to get collection details
- `list_collections` to list all collections for given criteria
- `create_collection` to create a new collection
- `add_collection_item`, `update_collection_item` to manage the items in a collection
- `delete_collection_item` or `delete_collection` to remove an item or a whole collection
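
As a rough sketch of how one of these calls can be wrapped, the hypothetical helper `collection_overview` below takes an API client as an argument and relies only on `list_collections(owner=..., limit=...)` and the `slug` and `title` attributes of each result. In the notebook you would pass `HfApi()`. The small `FakeApi` stand-in exists only so the snippet runs without network access, and its data is made up.

```python
from types import SimpleNamespace


def collection_overview(api, owner, limit=5):
    """Return (slug, title) pairs for an owner's collections."""
    return [(c.slug, c.title) for c in api.list_collections(owner=owner, limit=limit)]


class FakeApi:
    """Offline stand-in (illustrative only) that mimics `list_collections`."""

    def __init__(self, collections):
        self._collections = collections

    def list_collections(self, owner=None, limit=None):
        matches = [c for c in self._collections if c.slug.startswith(f"{owner}/")]
        return iter(matches[:limit])


# Made-up collection, used only to exercise the helper offline
demo = FakeApi([SimpleNamespace(slug="someone/demo-0123456789abcdef01234567", title="Demo")])
print(collection_overview(demo, owner="someone"))

# With the real client (needs network access):
#   from huggingface_hub import HfApi
#   collection_overview(HfApi(), owner="<username>")
```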

I have a collection of [#MustRead papers](https://huggingface.co/collections/nityan/mustread-papers-656253fe4e74b2075afd652b). Let me see how I can work with it.

In [29]:
# My collection
my_collection = "nityan/mustread-papers-656253fe4e74b2075afd652b"
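
The slug above appears to follow the pattern `<namespace>/<title-slug>-<id>`, where the trailing id is a 24-character hexadecimal string. A minimal sketch, assuming that pattern holds (the helpers `split_collection_slug` and `collection_url` are hypothetical and not part of `huggingface_hub`):

```python
import re

# namespace / title-slug - 24 hex characters (assumed pattern)
_SLUG_RE = re.compile(r"^(?P<namespace>[^/]+)/(?P<title>.+)-(?P<id>[0-9a-f]{24})$")


def split_collection_slug(slug):
    """Split a collection slug into namespace, title slug and id."""
    match = _SLUG_RE.match(slug)
    if match is None:
        raise ValueError(f"Unexpected collection slug: {slug!r}")
    return match.groupdict()


def collection_url(slug):
    """Build the browser URL for a collection slug."""
    return f"https://huggingface.co/collections/{slug}"


parts = split_collection_slug("nityan/mustread-papers-656253fe4e74b2075afd652b")
print(parts["namespace"], parts["title"], parts["id"])
```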

In [44]:
# Fetch it and display details
from huggingface_hub import get_collection
collection = get_collection(my_collection)
print("Has:", len(collection.items), "items")
print("From:", collection.owner["fullname"])
print("About:", collection.description)
print("Type:", collection.theme)
print("Link:", collection.url)


Has: 16 items
From: Nitya Narasimhan, PhD
About: Signature papers in AI/ML with focus on generative AI or large language models that bring unique perspectives and/or are highly cited by peers
Link: https://huggingface.co/collections/nityan/mustread-papers-656253fe4e74b2075afd652b


### 1.6 Access The Inference API