Merge branch 'main' into implement_webhooks_api
Wauplin authored May 24, 2024
2 parents 6910d07 + 3900efc commit 6bb269c
Showing 15 changed files with 220 additions and 45 deletions.
10 changes: 5 additions & 5 deletions docs/source/en/guides/inference_endpoints.md
@@ -22,8 +22,8 @@ The first step is to create an Inference Endpoint using [`create_inference_endpoint`]
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i"
... instance_size="x2",
... instance_type="intel-icl"
... )
```
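For context, the hunk above only shows part of the call; a fuller sketch of the surrounding example from this guide (the endpoint name and repository are the guide's illustrative values) is:

```py
>>> from huggingface_hub import create_inference_endpoint

>>> endpoint = create_inference_endpoint(
...     "my-endpoint-name",         # illustrative endpoint name
...     repository="gpt2",          # model repository to deploy
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",         # new size naming (replaces "medium")
...     instance_type="intel-icl"   # new type naming (replaces "c6i")
... )
```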

@@ -58,8 +58,8 @@ By default the Inference Endpoint is built from a docker image provided by Hugging Face
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
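The `custom_image` hunk is cut off by the diff view. A hedged sketch of a complete call deploying a custom text-generation-inference container (the environment values and image URL below are illustrative, not taken from this diff) could look like:

```py
>>> endpoint = create_inference_endpoint(
...     "my-tgi-endpoint",                  # illustrative endpoint name
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",                 # new size naming (replaces "medium")
...     instance_type="nvidia-a10g",        # new type naming (replaces "g5.2xlarge")
...     custom_image={
...         "health_route": "/health",
...         "env": {
...             "MAX_INPUT_LENGTH": "1024",   # illustrative TGI settings
...             "MAX_TOTAL_TOKENS": "1512",
...             "MODEL_ID": "/repository",
...         },
...         "url": "ghcr.io/huggingface/text-generation-inference:latest",  # illustrative image
...     },
... )
```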
@@ -203,7 +203,7 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)

# Update to larger instance
>>> endpoint.update(accelerator="cpu", instance_size="large", instance_type="c6i")
>>> endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)
```
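After an update the endpoint goes back to a pending/updating state. A hedged sketch of waiting for deployment and running a request (this assumes the `wait()` helper and `client` attribute described elsewhere in this guide):

```py
>>> endpoint.wait()                            # block until the endpoint is deployed (assumed helper)
>>> endpoint.client.text_generation("I am")    # run a request through the attached InferenceClient
' a happy camper'                              # illustrative output
```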

8 changes: 4 additions & 4 deletions docs/source/ko/_toctree.yml
@@ -68,12 +68,12 @@
title: HfFileSystem
- local: package_reference/utilities
title: Utilities
- local: in_translation
title: (in translation) Discussions and Pull Requests
- local: package_reference/community
title: Discussions and Pull Requests
- local: package_reference/cache
title: Cache-system reference
- local: in_translation
title: (in translation) Repo Cards and Repo Card Data
- local: package_reference/cards
title: Repo Cards and Repo Card Data
- local: package_reference/collections
title: Managing collections
- local: package_reference/space_runtime
12 changes: 6 additions & 6 deletions docs/source/ko/guides/inference_endpoints.md
@@ -21,8 +21,8 @@
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i"
... instance_size="x2",
... instance_type="intel-icl"
... )
```

@@ -57,8 +57,8 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
@@ -202,7 +202,7 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)

# Update to larger instance
>>> endpoint.update(accelerator="cpu", instance_size="large", instance_type="c6i")
>>> endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)
```

@@ -254,4 +254,4 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2

# Pause the endpoint
>>> endpoint.pause()
```
```
72 changes: 72 additions & 0 deletions docs/source/ko/package_reference/cards.md
@@ -0,0 +1,72 @@
# Repository Cards[[repository-cards]]

The huggingface_hub library provides a Python interface to create, share, and update Model/Dataset Cards.
Visit the [dedicated documentation page](https://huggingface.co/docs/hub/models-cards) for a deeper look at what Model Cards on the Hub are and how they work under the hood. You can also check out the [Model Cards guide](../how-to-model-cards) to get a feel for how to use these utilities in your own projects.
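
As a quick, hedged illustration of that interface (the repository id below is only an example), loading and inspecting an existing Model Card could look like this:

```py
from huggingface_hub import ModelCard

card = ModelCard.load("gpt2")      # illustrative repo id

print(card.data.to_dict())         # structured metadata from the YAML block
print(card.text[:200])             # free-form Markdown body of the card
```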

## Repo Card[[huggingface_hub.RepoCard]]

The `RepoCard` object is the parent class of [`ModelCard`], [`DatasetCard`], and [`SpaceCard`].

[[autodoc]] huggingface_hub.repocard.RepoCard
- __init__
- all

## Card Data[[huggingface_hub.CardData]]

The [`CardData`] object is the parent class of [`ModelCardData`] and [`DatasetCardData`].

[[autodoc]] huggingface_hub.repocard_data.CardData
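
For illustration (the field values below are examples), a `ModelCardData` can be built directly and inspected as a plain dictionary:

```py
from huggingface_hub import ModelCardData

data = ModelCardData(language="en", license="mit", library_name="keras")
print(data.to_dict())
# {'language': 'en', 'license': 'mit', 'library_name': 'keras'}
```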

## Model Cards[[model-cards]]

### ModelCard[[huggingface_hub.ModelCard]]

[[autodoc]] ModelCard

### ModelCardData[[huggingface_hub.ModelCardData]]

[[autodoc]] ModelCardData

## Dataset Cards[[dataset-cards]]

In the ML community, Dataset Cards are also known as Data Cards.

### DatasetCard[[huggingface_hub.DatasetCard]]

[[autodoc]] DatasetCard

### DatasetCardData[[huggingface_hub.DatasetCardData]]

[[autodoc]] DatasetCardData
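
A hedged usage sketch (the dataset repository id is illustrative):

```py
from huggingface_hub import DatasetCard

card = DatasetCard.load("squad")   # illustrative dataset repo id
print(card.data.to_dict())         # structured metadata of the dataset card
```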

## Space Cards[[space-cards]]

### SpaceCard[[huggingface_hub.SpaceCard]]

[[autodoc]] SpaceCard

### SpaceCardData[[huggingface_hub.SpaceCardData]]

[[autodoc]] SpaceCardData

## Utilities[[utilities]]

### EvalResult[[huggingface_hub.EvalResult]]

[[autodoc]] EvalResult
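
For illustration, an `EvalResult` bundles a single evaluation into the fields used by the card's `model-index` (the score below is made up):

```py
from huggingface_hub import EvalResult

result = EvalResult(
    task_type="text-classification",   # task the model was evaluated on
    dataset_type="imdb",               # dataset identifier
    dataset_name="IMDb",               # human-readable dataset name
    metric_type="accuracy",            # metric identifier
    metric_value=0.93,                 # made-up score, for illustration only
)
```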

### model_index_to_eval_results[[huggingface_hub.repocard_data.model_index_to_eval_results]]

[[autodoc]] huggingface_hub.repocard_data.model_index_to_eval_results

### eval_results_to_model_index[[huggingface_hub.repocard_data.eval_results_to_model_index]]

[[autodoc]] huggingface_hub.repocard_data.eval_results_to_model_index

### metadata_eval_result[[huggingface_hub.metadata_eval_result]]

[[autodoc]] huggingface_hub.repocard.metadata_eval_result

### metadata_update[[huggingface_hub.metadata_update]]

[[autodoc]] huggingface_hub.repocard.metadata_update
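
A minimal, hedged example of `metadata_update` (the repository id and metadata are illustrative; `overwrite=True` is required to change a value that already exists):

```py
from huggingface_hub import metadata_update

metadata_update(
    "username/my-model",     # illustrative repo id
    {"license": "mit"},      # metadata keys to merge into the repo card
    overwrite=True,          # allow replacing an existing value
)
```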
33 changes: 33 additions & 0 deletions docs/source/ko/package_reference/community.md
@@ -0,0 +1,33 @@
<!--⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->

# Interacting with Discussions and Pull Requests[[interacting-with-discussions-and-pull-requests]]

Check out the [`HfApi`] documentation page for reference on how to interact with Discussions and Pull Requests on the Hub; a short usage sketch follows the list below.

- [`get_repo_discussions`]
- [`get_discussion_details`]
- [`create_discussion`]
- [`create_pull_request`]
- [`rename_discussion`]
- [`comment_discussion`]
- [`edit_discussion_comment`]
- [`change_discussion_status`]
- [`merge_pull_request`]
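
A minimal, hedged sketch of a few of these calls (the repository id, title, and comment text are illustrative):

```py
from huggingface_hub import HfApi

api = HfApi()

# List existing Discussions and Pull Requests on a repo
for discussion in api.get_repo_discussions(repo_id="username/my-model"):
    print(discussion.num, discussion.title, discussion.is_pull_request)

# Open a new Discussion and add a comment to it
new = api.create_discussion(repo_id="username/my-model", title="Question about the model")
api.comment_discussion(repo_id="username/my-model", discussion_num=new.num, comment="Thanks for sharing!")
```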

## Data structures[[huggingface_hub.Discussion]]

[[autodoc]] Discussion

[[autodoc]] DiscussionWithDetails

[[autodoc]] DiscussionEvent

[[autodoc]] DiscussionComment

[[autodoc]] DiscussionStatusChange

[[autodoc]] DiscussionCommit

[[autodoc]] DiscussionTitleChange
4 changes: 2 additions & 2 deletions src/huggingface_hub/_inference_endpoints.py
@@ -256,9 +256,9 @@ def update(
accelerator (`str`, *optional*):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`, *optional*):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`, *optional*):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
min_replica (`int`, *optional*):
The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (`int`, *optional*):
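As a hedged sketch, the same `update()` method also covers the replica settings documented above (assuming `endpoint` is an `InferenceEndpoint` obtained from `create_inference_endpoint` or `get_inference_endpoint`):

```py
# Allow scale-to-zero when idle and cap the endpoint at two replicas under load
endpoint.update(min_replica=0, max_replica=2)

# Move to a larger CPU instance using the new naming scheme
endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
```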
16 changes: 8 additions & 8 deletions src/huggingface_hub/hf_api.py
@@ -7190,9 +7190,9 @@ def create_inference_endpoint(
accelerator (`str`):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
region (`str`):
The cloud region in which the Inference Endpoint will be created (e.g. `"us-east-1"`).
vendor (`str`):
@@ -7236,8 +7236,8 @@ def create_inference_endpoint(
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i",
... instance_size="x2",
... instance_type="intel-icl",
... )
>>> endpoint
InferenceEndpoint(name='my-endpoint-name', status="pending",...)
@@ -7260,8 +7260,8 @@ def create_inference_endpoint(
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
@@ -7394,9 +7394,9 @@ def update_inference_endpoint(
accelerator (`str`, *optional*):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`, *optional*):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`, *optional*):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
min_replica (`int`, *optional*):
The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (`int`, *optional*):
15 changes: 12 additions & 3 deletions src/huggingface_hub/hf_file_system.py
@@ -74,8 +74,11 @@ class HfFileSystem(fsspec.AbstractFileSystem):
Access a remote Hugging Face Hub repository as if it were a local file system.
Args:
token (`str`, *optional*):
Authentication token, obtained with [`HfApi.login`] method. Will default to the stored token.
token (`str` or `bool`, *optional*):
A valid user access token (string). Defaults to the locally saved
token, which is the recommended method for authentication (see
https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
To disable authentication, pass `False`.
Usage:
@@ -105,7 +108,7 @@ def __init__(
self,
*args,
endpoint: Optional[str] = None,
token: Optional[str] = None,
token: Union[bool, str, None] = None,
**storage_options,
):
super().__init__(*args, **storage_options)
@@ -517,6 +520,9 @@ def info(self, path: str, refresh: bool = False, revision: Optional[str] = None,
else:
out = None
parent_path = self._parent(path)
if not expand_info and parent_path not in self.dircache:
# Fill the cache with cheap call
self.ls(parent_path, expand_info=False)
if parent_path in self.dircache:
# Check if the path is in the cache
out1 = [o for o in self.dircache[parent_path] if o["name"] == path]
@@ -681,6 +687,9 @@ def __init__(self, fs: HfFileSystem, path: str, revision: Optional[str] = None,
f"{e}.\nMake sure the repository and revision exist before writing data."
) from e
raise
# avoid an unnecessary .info() call with expensive expand_info=True to instantiate .details
if kwargs.get("mode", "rb") == "rb":
self.details = fs.info(self.resolved_path.unresolve(), expand_info=False)
super().__init__(fs, self.resolved_path.unresolve(), **kwargs)
self.fs: HfFileSystem

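Taken together, a hedged usage sketch of the pieces touched in this file (the paths are illustrative):

```py
from huggingface_hub import HfFileSystem

# Pass token=False to disable authentication, as described in the updated docstring
fs = HfFileSystem(token=False)

# Listing a repo fills the directory cache, so a follow-up info() call can be answered cheaply
print(fs.ls("gpt2", detail=False))   # illustrative model repo
print(fs.info("gpt2/config.json"))   # served from the cache when possible
```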
4 changes: 2 additions & 2 deletions src/huggingface_hub/hub_mixin.py
@@ -477,7 +477,7 @@ def from_pretrained(
model_kwargs[param.name] = config[param.name]

# Check if `config` argument was passed at init
if "config" in cls._hub_mixin_init_parameters:
if "config" in cls._hub_mixin_init_parameters and "config" not in model_kwargs:
# Check if `config` argument is a dataclass
config_annotation = cls._hub_mixin_init_parameters["config"].annotation
if config_annotation is inspect.Parameter.empty:
@@ -505,7 +505,7 @@ def from_pretrained(
model_kwargs[key] = value

# Finally, also inject if `_from_pretrained` expects it
if cls._hub_mixin_inject_config:
if cls._hub_mixin_inject_config and "config" not in model_kwargs:
model_kwargs["config"] = config

instance = cls._from_pretrained(
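The two added `"config" not in model_kwargs` guards avoid injecting `config` twice when a subclass declares it explicitly. A hedged sketch of such a subclass (class and field names are made up; requires `torch`):

```py
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: dict):
        super().__init__()
        # `config` is read from config.json by `from_pretrained` and passed here exactly once
        self.layer = nn.Linear(config["hidden_size"], 1)

# model = MyModel.from_pretrained("username/my-model")  # illustrative repo id
```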
48 changes: 46 additions & 2 deletions src/huggingface_hub/serialization/_base.py
@@ -14,7 +14,7 @@
"""Contains helpers to split tensors into shards."""

from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, TypeVar
from typing import Any, Callable, Dict, List, Optional, TypeVar, Union

from .. import logging

@@ -46,7 +46,7 @@ def split_state_dict_into_shards_factory(
get_tensor_size: TensorSizeFn_T,
get_storage_id: StorageIDFn_T = lambda tensor: None,
filename_pattern: str = FILENAME_PATTERN,
max_shard_size: int = MAX_SHARD_SIZE,
max_shard_size: Union[int, str] = MAX_SHARD_SIZE,
) -> StateDictSplit:
"""
Split a model state dictionary in shards so that each shard is smaller than a given size.
@@ -89,6 +89,9 @@ def split_state_dict_into_shards_factory(
current_shard_size = 0
total_size = 0

if isinstance(max_shard_size, str):
max_shard_size = parse_size_to_int(max_shard_size)

for key, tensor in state_dict.items():
# when bnb serialization is used the weights in the state dict can be strings
# check: https://github.com/huggingface/transformers/pull/24416 for more details
@@ -167,3 +170,44 @@ def split_state_dict_into_shards_factory(
filename_to_tensors=filename_to_tensors,
tensor_to_filename=tensor_name_to_filename,
)


SIZE_UNITS = {
"TB": 10**12,
"GB": 10**9,
"MB": 10**6,
"KB": 10**3,
}


def parse_size_to_int(size_as_str: str) -> int:
"""
Parse a size expressed as a string with digits and unit (like `"5MB"`) to an integer (in bytes).
Supported units are "TB", "GB", "MB", "KB".
Args:
size_as_str (`str`): The size to convert, expressed as a string with digits and a unit (e.g. `"5MB"`).
Example:
```py
>>> parse_size_to_int("5MB")
5000000
```
"""
size_as_str = size_as_str.strip()

# Parse unit
unit = size_as_str[-2:].upper()
if unit not in SIZE_UNITS:
raise ValueError(f"Unit '{unit}' not supported. Supported units are TB, GB, MB, KB. Got '{size_as_str}'.")
multiplier = SIZE_UNITS[unit]

# Parse value
try:
value = float(size_as_str[:-2].strip())
except ValueError as e:
raise ValueError(f"Could not parse the size value from '{size_as_str}': {e}") from e

return int(value * multiplier)
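
A few usage examples of the new helper, matching the units defined in `SIZE_UNITS` above; this is what allows `max_shard_size` to be passed as a human-readable string such as `"5GB"`:

```py
from huggingface_hub.serialization._base import parse_size_to_int

print(parse_size_to_int("5MB"))    # 5000000
print(parse_size_to_int("1.5GB"))  # 1500000000
print(parse_size_to_int("10kb"))   # 10000 -- unit matching is case-insensitive
```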
(The remaining changed files are not rendered in this view.)
