Merge branch 'main' into implement_webhooks_api
Wauplin authored May 24, 2024
2 parents 6910d07 + 3900efc commit 6bb269c
Showing 15 changed files with 220 additions and 45 deletions.
10 changes: 5 additions & 5 deletions docs/source/en/guides/inference_endpoints.md
@@ -22,8 +22,8 @@ The first step is to create an Inference Endpoint using [`create_inference_endpoint`]
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i"
... instance_size="x2",
... instance_type="intel-icl"
... )
```
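For context, the hunk above only shows part of the call; a fuller sketch of the surrounding example from this guide (the endpoint name and repository are the guide's illustrative values) is:

```py
>>> from huggingface_hub import create_inference_endpoint

>>> endpoint = create_inference_endpoint(
...     "my-endpoint-name",         # illustrative endpoint name
...     repository="gpt2",          # model repository to deploy
...     framework="pytorch",
...     task="text-generation",
...     accelerator="cpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x2",         # new size naming (replaces "medium")
...     instance_type="intel-icl"   # new type naming (replaces "c6i")
... )
```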

@@ -58,8 +58,8 @@ By default the Inference Endpoint is built from a docker image provided by Hugging Face
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
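The `custom_image` hunk is cut off by the diff view. A hedged sketch of a complete call deploying a custom text-generation-inference container (the environment values and image URL below are illustrative, not taken from this diff) could look like:

```py
>>> endpoint = create_inference_endpoint(
...     "my-tgi-endpoint",                  # illustrative endpoint name
...     repository="HuggingFaceH4/zephyr-7b-beta",
...     framework="pytorch",
...     task="text-generation",
...     accelerator="gpu",
...     vendor="aws",
...     region="us-east-1",
...     type="protected",
...     instance_size="x1",                 # new size naming (replaces "medium")
...     instance_type="nvidia-a10g",        # new type naming (replaces "g5.2xlarge")
...     custom_image={
...         "health_route": "/health",
...         "env": {
...             "MAX_INPUT_LENGTH": "1024",   # illustrative TGI settings
...             "MAX_TOTAL_TOKENS": "1512",
...             "MODEL_ID": "/repository",
...         },
...         "url": "ghcr.io/huggingface/text-generation-inference:latest",  # illustrative image
...     },
... )
```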
@@ -203,7 +203,7 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)

# Update to larger instance
>>> endpoint.update(accelerator="cpu", instance_size="large", instance_type="c6i")
>>> endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)
```
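After an update the endpoint goes back to a pending/updating state. A hedged sketch of waiting for deployment and running a request (this assumes the `wait()` helper and `client` attribute described elsewhere in this guide):

```py
>>> endpoint.wait()                            # block until the endpoint is deployed (assumed helper)
>>> endpoint.client.text_generation("I am")    # run a request through the attached InferenceClient
' a happy camper'                              # illustrative output
```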

8 changes: 4 additions & 4 deletions docs/source/ko/_toctree.yml
@@ -68,12 +68,12 @@
title: HfFileSystem
- local: package_reference/utilities
title: Utilities
- local: in_translation
title: (in translation) Discussions and Pull Requests
- local: package_reference/community
title: Discussions and Pull Requests
- local: package_reference/cache
title: Cache-system reference
- local: in_translation
title: (in translation) Repo Cards and Repo Card Data
- local: package_reference/cards
title: Repo Cards and Repo Card Data
- local: package_reference/collections
title: Managing collections
- local: package_reference/space_runtime
12 changes: 6 additions & 6 deletions docs/source/ko/guides/inference_endpoints.md
@@ -21,8 +21,8 @@
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i"
... instance_size="x2",
... instance_type="intel-icl"
... )
```

@@ -57,8 +57,8 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
@@ -202,7 +202,7 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)

# Update to larger instance
>>> endpoint.update(accelerator="cpu", instance_size="large", instance_type="c6i")
>>> endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2-large', status='pending', url=None)
```

@@ -254,4 +254,4 @@ InferenceEndpoint(name='my-endpoint-name', namespace='Wauplin', repository='gpt2

# Pause the endpoint
>>> endpoint.pause()
```
```
72 changes: 72 additions & 0 deletions docs/source/ko/package_reference/cards.md
@@ -0,0 +1,72 @@
# Repository Cards[[repository-cards]]

The huggingface_hub library provides a Python interface to create, share, and update Model/Dataset Cards.
Visit the [dedicated documentation page](https://huggingface.co/docs/hub/models-cards) for a deeper look at what Model Cards on the Hub are and how they work under the hood. You can also check out the [Model Cards guide](../how-to-model-cards) to get a feel for how to use these utilities in your own projects.
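
As a quick, hedged illustration of that interface (the repository id below is only an example), loading and inspecting an existing Model Card could look like this:

```py
from huggingface_hub import ModelCard

card = ModelCard.load("gpt2")      # illustrative repo id

print(card.data.to_dict())         # structured metadata from the YAML block
print(card.text[:200])             # free-form Markdown body of the card
```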

## Repo Card[[huggingface_hub.RepoCard]]

The `RepoCard` object is the parent class of [`ModelCard`], [`DatasetCard`], and [`SpaceCard`].

[[autodoc]] huggingface_hub.repocard.RepoCard
- __init__
- all

## Card Data[[huggingface_hub.CardData]]

The [`CardData`] object is the parent class of [`ModelCardData`] and [`DatasetCardData`].

[[autodoc]] huggingface_hub.repocard_data.CardData
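
For illustration (the field values below are examples), a `ModelCardData` can be built directly and inspected as a plain dictionary:

```py
from huggingface_hub import ModelCardData

data = ModelCardData(language="en", license="mit", library_name="keras")
print(data.to_dict())
# {'language': 'en', 'license': 'mit', 'library_name': 'keras'}
```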

## Model Cards[[model-cards]]

### ModelCard[[huggingface_hub.ModelCard]]

[[autodoc]] ModelCard

### ModelCardData[[huggingface_hub.ModelCardData]]

[[autodoc]] ModelCardData

## Dataset Cards[[dataset-cards]]

In the ML community, Dataset Cards are also known as Data Cards.

### DatasetCard[[huggingface_hub.DatasetCard]]

[[autodoc]] DatasetCard

### DatasetCardData[[huggingface_hub.DatasetCardData]]

[[autodoc]] DatasetCardData
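
A hedged usage sketch (the dataset repository id is illustrative):

```py
from huggingface_hub import DatasetCard

card = DatasetCard.load("squad")   # illustrative dataset repo id
print(card.data.to_dict())         # structured metadata of the dataset card
```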

## Space Cards[[space-cards]]

### SpaceCard[[huggingface_hub.SpaceCard]]

[[autodoc]] SpaceCard

### SpaceCardData[[huggingface_hub.SpaceCardData]]

[[autodoc]] SpaceCardData

## Utilities[[utilities]]

### EvalResult[[huggingface_hub.EvalResult]]

[[autodoc]] EvalResult
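
For illustration, an `EvalResult` bundles a single evaluation into the fields used by the card's `model-index` (the score below is made up):

```py
from huggingface_hub import EvalResult

result = EvalResult(
    task_type="text-classification",   # task the model was evaluated on
    dataset_type="imdb",               # dataset identifier
    dataset_name="IMDb",               # human-readable dataset name
    metric_type="accuracy",            # metric identifier
    metric_value=0.93,                 # made-up score, for illustration only
)
```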

### model_index_to_eval_results[[huggingface_hub.repocard_data.model_index_to_eval_results]]

[[autodoc]] huggingface_hub.repocard_data.model_index_to_eval_results

### eval_results_to_model_index[[huggingface_hub.repocard_data.eval_results_to_model_index]]

[[autodoc]] huggingface_hub.repocard_data.eval_results_to_model_index

### metadata_eval_result[[huggingface_hub.metadata_eval_result]]

[[autodoc]] huggingface_hub.repocard.metadata_eval_result

### metadata_update[[huggingface_hub.metadata_update]]

[[autodoc]] huggingface_hub.repocard.metadata_update
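
A minimal, hedged example of `metadata_update` (the repository id and metadata are illustrative; `overwrite=True` is required to change a value that already exists):

```py
from huggingface_hub import metadata_update

metadata_update(
    "username/my-model",     # illustrative repo id
    {"license": "mit"},      # metadata keys to merge into the repo card
    overwrite=True,          # allow replacing an existing value
)
```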
33 changes: 33 additions & 0 deletions docs/source/ko/package_reference/community.md
@@ -0,0 +1,33 @@
<!--⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->

# Interacting with Discussions and Pull Requests[[interacting-with-discussions-and-pull-requests]]

Check out the [`HfApi`] documentation page for reference on how to interact with Discussions and Pull Requests on the Hub; a short usage sketch follows the list below.

- [`get_repo_discussions`]
- [`get_discussion_details`]
- [`create_discussion`]
- [`create_pull_request`]
- [`rename_discussion`]
- [`comment_discussion`]
- [`edit_discussion_comment`]
- [`change_discussion_status`]
- [`merge_pull_request`]
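
A minimal, hedged sketch of a few of these calls (the repository id, title, and comment text are illustrative):

```py
from huggingface_hub import HfApi

api = HfApi()

# List existing Discussions and Pull Requests on a repo
for discussion in api.get_repo_discussions(repo_id="username/my-model"):
    print(discussion.num, discussion.title, discussion.is_pull_request)

# Open a new Discussion and add a comment to it
new = api.create_discussion(repo_id="username/my-model", title="Question about the model")
api.comment_discussion(repo_id="username/my-model", discussion_num=new.num, comment="Thanks for sharing!")
```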

## Data structures[[huggingface_hub.Discussion]]

[[autodoc]] Discussion

[[autodoc]] DiscussionWithDetails

[[autodoc]] DiscussionEvent

[[autodoc]] DiscussionComment

[[autodoc]] DiscussionStatusChange

[[autodoc]] DiscussionCommit

[[autodoc]] DiscussionTitleChange
4 changes: 2 additions & 2 deletions src/huggingface_hub/_inference_endpoints.py
@@ -256,9 +256,9 @@ def update(
accelerator (`str`, *optional*):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`, *optional*):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`, *optional*):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
min_replica (`int`, *optional*):
The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (`int`, *optional*):
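As a hedged sketch, the same `update()` method also covers the replica settings documented above (assuming `endpoint` is an `InferenceEndpoint` obtained from `create_inference_endpoint` or `get_inference_endpoint`):

```py
# Allow scale-to-zero when idle and cap the endpoint at two replicas under load
endpoint.update(min_replica=0, max_replica=2)

# Move to a larger CPU instance using the new naming scheme
endpoint.update(accelerator="cpu", instance_size="x4", instance_type="intel-icl")
```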
16 changes: 8 additions & 8 deletions src/huggingface_hub/hf_api.py
@@ -7190,9 +7190,9 @@ def create_inference_endpoint(
accelerator (`str`):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
region (`str`):
The cloud region in which the Inference Endpoint will be created (e.g. `"us-east-1"`).
vendor (`str`):
@@ -7236,8 +7236,8 @@ def create_inference_endpoint(
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="c6i",
... instance_size="x2",
... instance_type="intel-icl",
... )
>>> endpoint
InferenceEndpoint(name='my-endpoint-name', status="pending",...)
@@ -7260,8 +7260,8 @@ def create_inference_endpoint(
... vendor="aws",
... region="us-east-1",
... type="protected",
... instance_size="medium",
... instance_type="g5.2xlarge",
... instance_size="x1",
... instance_type="nvidia-a10g",
... custom_image={
... "health_route": "/health",
... "env": {
@@ -7394,9 +7394,9 @@ def update_inference_endpoint(
accelerator (`str`, *optional*):
The hardware accelerator to be used for inference (e.g. `"cpu"`).
instance_size (`str`, *optional*):
The size or type of the instance to be used for hosting the model (e.g. `"large"`).
The size or type of the instance to be used for hosting the model (e.g. `"x4"`).
instance_type (`str`, *optional*):
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"c6i"`).
The cloud instance type where the Inference Endpoint will be deployed (e.g. `"intel-icl"`).
min_replica (`int`, *optional*):
The minimum number of replicas (instances) to keep running for the Inference Endpoint.
max_replica (`int`, *optional*):
15 changes: 12 additions & 3 deletions src/huggingface_hub/hf_file_system.py
@@ -74,8 +74,11 @@ class HfFileSystem(fsspec.AbstractFileSystem):
Access a remote Hugging Face Hub repository as if it were a local file system.
Args:
token (`str`, *optional*):
Authentication token, obtained with [`HfApi.login`] method. Will default to the stored token.
token (`str` or `bool`, *optional*):
A valid user access token (string). Defaults to the locally saved
token, which is the recommended method for authentication (see
https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
To disable authentication, pass `False`.
Usage:
@@ -105,7 +108,7 @@ def __init__(
self,
*args,
endpoint: Optional[str] = None,
token: Optional[str] = None,
token: Union[bool, str, None] = None,
**storage_options,
):
super().__init__(*args, **storage_options)
@@ -517,6 +520,9 @@ def info(self, path: str, refresh: bool = False, revision: Optional[str] = None,
else:
out = None
parent_path = self._parent(path)
if not expand_info and parent_path not in self.dircache:
# Fill the cache with cheap call
self.ls(parent_path, expand_info=False)
if parent_path in self.dircache:
# Check if the path is in the cache
out1 = [o for o in self.dircache[parent_path] if o["name"] == path]
@@ -681,6 +687,9 @@ def __init__(self, fs: HfFileSystem, path: str, revision: Optional[str] = None,
f"{e}.\nMake sure the repository and revision exist before writing data."
) from e
raise
# avoid an unnecessary .info() call with expensive expand_info=True to instantiate .details
if kwargs.get("mode", "rb") == "rb":
self.details = fs.info(self.resolved_path.unresolve(), expand_info=False)
super().__init__(fs, self.resolved_path.unresolve(), **kwargs)
self.fs: HfFileSystem

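Taken together, a hedged usage sketch of the pieces touched in this file (the paths are illustrative):

```py
from huggingface_hub import HfFileSystem

# Pass token=False to disable authentication, as described in the updated docstring
fs = HfFileSystem(token=False)

# Listing a repo fills the directory cache, so a follow-up info() call can be answered cheaply
print(fs.ls("gpt2", detail=False))   # illustrative model repo
print(fs.info("gpt2/config.json"))   # served from the cache when possible
```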
4 changes: 2 additions & 2 deletions src/huggingface_hub/hub_mixin.py
@@ -477,7 +477,7 @@ def from_pretrained(
model_kwargs[param.name] = config[param.name]

# Check if `config` argument was passed at init
if "config" in cls._hub_mixin_init_parameters:
if "config" in cls._hub_mixin_init_parameters and "config" not in model_kwargs:
# Check if `config` argument is a dataclass
config_annotation = cls._hub_mixin_init_parameters["config"].annotation
if config_annotation is inspect.Parameter.empty:
@@ -505,7 +505,7 @@ def from_pretrained(
model_kwargs[key] = value

# Finally, also inject if `_from_pretrained` expects it
if cls._hub_mixin_inject_config:
if cls._hub_mixin_inject_config and "config" not in model_kwargs:
model_kwargs["config"] = config

instance = cls._from_pretrained(
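The two added `"config" not in model_kwargs` guards avoid injecting `config` twice when a subclass declares it explicitly. A hedged sketch of such a subclass (class and field names are made up; requires `torch`):

```py
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class MyModel(nn.Module, PyTorchModelHubMixin):
    def __init__(self, config: dict):
        super().__init__()
        # `config` is read from config.json by `from_pretrained` and passed here exactly once
        self.layer = nn.Linear(config["hidden_size"], 1)

# model = MyModel.from_pretrained("username/my-model")  # illustrative repo id
```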
48 changes: 46 additions & 2 deletions src/huggingface_hub/serialization/_base.py
@@ -14,7 +14,7 @@
"""Contains helpers to split tensors into shards."""

from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, TypeVar
from typing import Any, Callable, Dict, List, Optional, TypeVar, Union

from .. import logging

@@ -46,7 +46,7 @@ def split_state_dict_into_shards_factory(
get_tensor_size: TensorSizeFn_T,
get_storage_id: StorageIDFn_T = lambda tensor: None,
filename_pattern: str = FILENAME_PATTERN,
max_shard_size: int = MAX_SHARD_SIZE,
max_shard_size: Union[int, str] = MAX_SHARD_SIZE,
) -> StateDictSplit:
"""
Split a model state dictionary in shards so that each shard is smaller than a given size.
@@ -89,6 +89,9 @@ def split_state_dict_into_shards_factory(
current_shard_size = 0
total_size = 0

if isinstance(max_shard_size, str):
max_shard_size = parse_size_to_int(max_shard_size)

for key, tensor in state_dict.items():
# when bnb serialization is used the weights in the state dict can be strings
# check: https://github.com/huggingface/transformers/pull/24416 for more details
@@ -167,3 +170,44 @@ def split_state_dict_into_shards_factory(
filename_to_tensors=filename_to_tensors,
tensor_to_filename=tensor_name_to_filename,
)


SIZE_UNITS = {
"TB": 10**12,
"GB": 10**9,
"MB": 10**6,
"KB": 10**3,
}


def parse_size_to_int(size_as_str: str) -> int:
"""
Parse a size expressed as a string with digits and unit (like `"5MB"`) to an integer (in bytes).
Supported units are "TB", "GB", "MB", "KB".
Args:
size_as_str (`str`): The size to convert, expressed as a string with digits and a unit (e.g. `"5MB"`).
Example:
```py
>>> parse_size_to_int("5MB")
5000000
```
"""
size_as_str = size_as_str.strip()

# Parse unit
unit = size_as_str[-2:].upper()
if unit not in SIZE_UNITS:
raise ValueError(f"Unit '{unit}' not supported. Supported units are TB, GB, MB, KB. Got '{size_as_str}'.")
multiplier = SIZE_UNITS[unit]

# Parse value
try:
value = float(size_as_str[:-2].strip())
except ValueError as e:
raise ValueError(f"Could not parse the size value from '{size_as_str}': {e}") from e

return int(value * multiplier)
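
A few usage examples of the new helper, matching the units defined in `SIZE_UNITS` above; this is what allows `max_shard_size` to be passed as a human-readable string such as `"5GB"`:

```py
from huggingface_hub.serialization._base import parse_size_to_int

print(parse_size_to_int("5MB"))    # 5000000
print(parse_size_to_int("1.5GB"))  # 1500000000
print(parse_size_to_int("10kb"))   # 10000 -- unit matching is case-insensitive
```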
(The remaining changed files are not rendered in this view.)
