# HuggingFace Compare Models

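1. Get your Hugging Face access token and enter it in the `Token` field below, or just leave the default `hf_`.
2. To fill in "Base Like" and "Base Download", run the table cell once with the default values, then enter the like count and the download count taken from the same model.
3. After that, find the model whose name ranks nearest the top in both the likes column and the downloads column.
   - For example, model A is 2nd by likes and 13th by downloads, model B is 4th by likes and 7th by downloads, and model C is 8th by likes and 6th by downloads.
   - You should take model B.
   - If C were instead 6th by likes and 6th by downloads, you could take C, as it is nearest the top in both.
   - The second number in the downloads column is the download count rescaled by "Base Like" / "Base Download", so downloads can be read on the same scale as likes when judging which model is nearest the top.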
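
The rule in step 3 is consistent with picking the model whose worse rank (likes or downloads) is smallest. The sketch below is one possible reading, not part of the notebook: the helper `pick_nearest_top` and its tie-break (sum of both ranks, then name) are assumptions made for illustration.

```python
def pick_nearest_top(ranks):
    """Return the model whose worse rank (likes or downloads) is smallest.

    `ranks` maps a model name to a (like_rank, download_rank) tuple, 1 = top.
    Ties are broken by the sum of both ranks, then by name (an assumption).
    """
    return min(ranks, key=lambda name: (max(ranks[name]), sum(ranks[name]), name))


# The two scenarios from the example above.
first = {"A": (2, 13), "B": (4, 7), "C": (8, 6)}
second = {"A": (2, 13), "B": (4, 7), "C": (6, 6)}

print(pick_nearest_top(first))   # B: its worse rank is 7, versus 13 for A and 8 for C
print(pick_nearest_top(second))  # C: its worse rank is 6, versus 7 for B
```

Note that ranking by the sum of both ranks would still pick B in the second scenario (11 versus 12), which is why the worse rank is used here.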

References:
- https://huggingface.co/docs/hub/api
- https://huggingface.co/spaces/enzostvs/hub-api-playground
- https://huggingface.co/docs/hub/api#get-apimodels
- https://github.com/huggingface/huggingface_hub
- https://huggingface.co/docs/huggingface_hub/en/guides/search

In [17]:
from IPython.display import display, Image as dImage
import requests
import json
import math

import pandas as pd
import ipywidgets as widgets
from huggingface_hub import HfApi

pd.set_option('display.max_colwidth', None)  # Or a large number like 1000
pd.set_option('display.max_rows', None)  # Or a large number like 1000
pd.set_option('display.width', 1000)
pd.set_option('display.max_columns', None)

In [31]:
txt_token = widgets.Text(
  value='hf_',
  placeholder='Huggingface Token',
  description='Token',
)

txt_author = widgets.Text(
  value='huggingface',
  placeholder='Author / Owner / Orgs',
  description='Author',
)

txt_search = widgets.Text(
  placeholder='Queries and keywords...',
  description='Search',
)

int_limit = widgets.IntText(
    value=250,
    description='Limit:',
)

int_base_like = widgets.IntText(
    value=1,
    placeholder='The max like on the same model',
    description='Base Like:',
)

int_base_download = widgets.IntText(
    value=1,
    placeholder='The max download on the same model',
    description='Base Download:',
)

display(txt_token)
display(txt_author)
display(txt_search)
display(int_limit)

display(int_base_like)
display(int_base_download)

In [4]:
api = HfApi(
  token=txt_token.value,
)

In [None]:
models = requests.get(
  url="https://huggingface.co/api/models/huggingface/time-series-transformer-tourism-monthly",
  params={},
  headers={
    "Authorization": f"Bearer {txt_token.value}",
  }
)

models.json()

In [None]:
models = requests.get(
  url="https://huggingface.co/api/models",
  params={
    "author": txt_author.value,
    "sort": "likes",  # alternatives: "lastModified", "downloads"
  },
  headers={
    "Authorization": f"Bearer {txt_token.value}",
  }
)

models.json()

In [34]:
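# Models ranked by likes, shown as "name (likes)".
# The "<author>/" prefix is stripped from each model id.
likes = [
  f"{x.id[len(txt_author.value)+1:]} ({x.likes})"
  for x in api.list_models(
    author=txt_author.value,
    sort="likes",
    search=txt_search.value,
    limit=int_limit.value,
  )
]

# Models ranked by downloads, shown as "name (downloads; downloads rescaled to likes)".
# "Base Download" must be non-zero, otherwise the division fails.
downloads = [
  f"{x.id[len(txt_author.value)+1:]} ({x.downloads}; {math.ceil(x.downloads * int_base_like.value / int_base_download.value)})"
  for x in api.list_models(
    author=txt_author.value,
    sort="downloads",
    search=txt_search.value,
    limit=int_limit.value,
  )
]

# Models ranked by last modification time.
lastModified = [
  f"{x.id[len(txt_author.value)+1:]}"
  for x in api.list_models(
    author=txt_author.value,
    sort="lastModified",
    search=txt_search.value,
    limit=int_limit.value,
  )
]

# Each column is an independent ranking, so a row does not describe a single model.
df = pd.DataFrame(data={
  "likes": likes,
  "downloads": downloads,
  "lastModified": lastModified,
})

print(df.head(int_limit.value))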

                                                     likes                                                          downloads                                        lastModified
0                                  CodeBERTa-small-v1 (81)                                CodeBERTa-small-v1 (125225; 125225)             time-series-transformer-tourism-monthly
1                               CodeBERTa-language-id (62)                          autoformer-tourism-monthly (54858; 54858)                             timesfm-tourism-monthly
2             time-series-transformer-tourism-monthly (22)                            informer-tourism-monthly (54777; 54777)                               CodeBERTa-language-id
3                                     falcon-40b-gptq (12)                               CodeBERTa-language-id (51956; 51956)                                     falcon-40b-gptq
4                           autoformer-tourism-monthly (9)  prunebert-base-uncased-6-finepruned-w-distil-squad