Translate torchrec_tutorial #599 #600

gorae17 · 2022-09-08T17:24:02Z

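License agreement

You must agree that the BSD 3-Clause License applies to the changes you contribute.

For details, see the contributing guide.

If you agree, change the [ ] below to [x].

I have read the contributing guide and agree that the BSD 3-Clause License applies to the contents of this PR.

PR type

Change the [ ] in front of the type that applies to this PR to [x].

A contribution that fixes typos or improves a translation
A contribution that translates an untranslated tutorial
A contribution that brings in changes from the official tutorial
A contribution that does not fit any of the types above

PR description

Briefly describe what this PR changes.
The torchrec_tutorial page is translated into Korean.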

bub3690

Thank you for the hard work. This is a difficult text.
I mainly checked for typos.

bub3690 · 2022-09-09T09:12:10Z

intermediate_source/torchrec_tutorial.rst

 `dlrm <https://github.com/pytorch/torchrec/tree/main/examples/dlrm>`__
-example, which includes multinode training on the criteo terabyte
-dataset, using Meta’s `DLRM <https://arxiv.org/abs/1906.00091>`__.
+예제를 참고하세요. 이 예제는 Meta’의 `DLRM <https://arxiv.org/abs/1906.00091>`__ 을 사용하여


Since the particle 의 is already there, the ’ can be dropped.

bub3690 · 2022-09-09T09:19:38Z

intermediate_source/torchrec_tutorial.rst


-We highly recommend CUDA when using TorchRec. If using CUDA: cuda >= 11.0
+TorchRec을 사용할 때는 CUDA를 적극 추천합니다. CUDA를 사용하는 경우: cuda > = 11.0


Unlike the original, a space has been added after the >.

intermediate_source/torchrec_tutorial.rst

-number of entity IDs per feature per example. In order to enable this
-“jagged” representation, we use the TorchRec datastructure
-|KeyedJaggedTensor|_ (KJT).
+예제 및 기능별로 엔티티 ID가 임의의 수인 다양한 예제를 효율적으료 나타내야 합니다. 


bub3690 · 2022-09-09T09:23:42Z

intermediate_source/torchrec_tutorial.rst




-Putting it all together, querying our distributed model with a KJT minibatch
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+총정리: KJT 미니배치를 사용하여 분산 모델 쿼리하기


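Is there anything in the original that corresponds to 총정리 ("overall summary")?

I understand it now, belatedly.
"총정리하여, KJT 미니배치를 사용하여 분산 모델 쿼리하기"
How about making it a single sentence like this instead of using a colon?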

hyoyoung

Thank you for translating such a long text.

I would like to suggest a few changes.

hyoyoung · 2022-09-11T15:55:12Z

intermediate_source/torchrec_tutorial.rst

-called |DistributedModelParallel|_,
-or DMP. Like PyTorch’s DistributedDataParallel, DMP wraps a model to
-enable distributed training.
+추천 시스템을 구축할 때, 제품이나 페이지와 같은 엔티티를 임베디드로 표현하고 싶은 경우가 많습니다. 


"entities" is usually translated as 존재 or 독립체,
but here a freer rendering as 객체 (object) would work well.
"embedding" could be rendered as 내장 (built in), or left untranslated as 임베딩.

hyoyoung · 2022-09-11T15:55:27Z

intermediate_source/torchrec_tutorial.rst

-enable distributed training.
+추천 시스템을 구축할 때, 제품이나 페이지와 같은 엔티티를 임베디드로 표현하고 싶은 경우가 많습니다. 
+Meta AI의 `딥러닝 추천 모델 <https://arxiv.org/abs/1906.00091>`__ 또는 DLRM을 예로 들 수 있습니다. 
+엔티티의 수가 증가함에 따라, 임베디드 테이블의 크기가 단일 GPU의 메모리를 초과할 수 있습니다. 


Wouldn't 임베딩 테이블 (embedding table) read a bit more naturally?

hyoyoung · 2022-09-11T15:56:06Z

intermediate_source/torchrec_tutorial.rst



-Distributed Setup
-~~~~~~~~~~~~~~~~~
+분산 셋업


셋업 (setup) could be replaced with the native word 설정. It looks like you already use it that way in the paragraph below.

hyoyoung · 2022-09-11T15:56:56Z

intermediate_source/torchrec_tutorial.rst

@@ -118,16 +110,15 @@ on device “meta”. This will tell EBC to not allocate memory yet.
 DistributedModelParallel
 ~~~~~~~~~~~~~~~~~~~~~~~~

-Now, we’re ready to wrap our model with |DistributedModelParallel|_ (DMP). Instantiating DMP will:
+이제 모델을 |DistributedModelParallel|_ (DMP)로 래핑할 준비가 되었습니다. 


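래핑 (wrapping) could be replaced with the native word 감싸두기.

@hyoyoung The glossary says to translate "wrapper" as 래퍼, so I used 래핑. Would it be better to replace it with a native word?

The glossary lists "wrapper" as 래퍼, and it is usually written with the original term alongside.
This is close to a matter of taste, but I find 래퍼 hard to translate on its own, so I recommend writing 래퍼(wrapper).
"wrapping", on the other hand, is easy to render as 감싸두기 or 포장하기, so I tend to ask for that change.

Please reread that passage. If 래핑 reads better, leave a comment saying so and feel free to keep it as-is.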

hyoyoung · 2022-09-11T15:57:13Z

intermediate_source/torchrec_tutorial.rst

-   embedding table(s) (i.e., the EmbeddingBagCollection).
-2. Actually shard the model. This includes allocating memory for each
-   embedding table on the appropriate device(s).
+1. 모델을 샤딩하는 방법을 결정합니다. DMP는 이용 가능한 ‘sharders’를 수집하고


It would be good to write 샤딩(shard) here, with the original term alongside.

hyoyoung · 2022-09-11T15:58:27Z

intermediate_source/torchrec_tutorial.rst

-Note that the KJT batch size is
-``batch_size = len(lengths)//len(keys)``. In the above example,
-batch_size is 3.
+KJT 배치 크기는 ``batch_size = len(lengths)//len(keys)`` 입니다. 


It would be good to keep the "Note" part in the translation.

KJT 배치 크기는 batch_size = len(lengths)//len(keys) 인 것을 눈여겨 봐주세요.

How about something like this?
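
For readers checking the formula in this hunk, here is a minimal sketch in plain Python. The `keys` and `lengths` values are illustrative assumptions modeled on the minibatch the tutorial describes (two keys, three examples); no TorchRec call is involved.

```python
# Illustrative values (assumed, not taken from this diff): two feature keys
# and one length entry per (key, example) pair, laid out key by key.
keys = ["product", "user"]
lengths = [
    2, 0, 1,  # "product": example 0 has 2 IDs, example 1 has none, example 2 has 1
    1, 1, 1,  # "user": one ID per example (assumed)
]

# Every key contributes exactly one length per example, so dividing the
# number of lengths by the number of keys recovers the batch size.
batch_size = len(lengths) // len(keys)
print(batch_size)
```

With these values the result matches the "batch_size is 3" statement in the original English text.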

hyoyoung · 2022-09-11T15:58:55Z

intermediate_source/torchrec_tutorial.rst


-Finally, we can query our model using our minibatch of products and
-users.
+마지막으로 제품과 사용자의 미니배치를 사용하여 모델을 쿼리합니다.  


쿼리 (query) could be replaced with the native word 질의.

hyoyoung

Here are a few additional suggestions.

hyoyoung · 2022-09-12T15:18:23Z

intermediate_source/torchrec_tutorial.rst

-called |DistributedModelParallel|_,
-or DMP. Like PyTorch’s DistributedDataParallel, DMP wraps a model to
-enable distributed training.
+추천 시스템을 구축할 때, 제품이나 페이지와 같은 객체를 임베딩으로 표현하고 싶은 경우가 많습니다. 


How about replacing 구축할 때 with the simpler 만들 때 ("when building")?

hyoyoung · 2022-09-12T15:18:38Z

intermediate_source/torchrec_tutorial.rst

-enable distributed training.
+추천 시스템을 구축할 때, 제품이나 페이지와 같은 객체를 임베딩으로 표현하고 싶은 경우가 많습니다. 
+Meta AI의 `딥러닝 추천 모델 <https://arxiv.org/abs/1906.00091>`__ 또는 DLRM을 예로 들 수 있습니다. 
+엔티티의 수가 증가함에 따라, 임베딩 테이블의 크기가 단일 GPU의 메모리를 초과할 수 있습니다. 


It is 객체 above, but 엔티티 here.

hyoyoung · 2022-09-12T15:19:04Z

intermediate_source/torchrec_tutorial.rst

+추천 시스템을 구축할 때, 제품이나 페이지와 같은 객체를 임베딩으로 표현하고 싶은 경우가 많습니다. 
+Meta AI의 `딥러닝 추천 모델 <https://arxiv.org/abs/1906.00091>`__ 또는 DLRM을 예로 들 수 있습니다. 
+엔티티의 수가 증가함에 따라, 임베딩 테이블의 크기가 단일 GPU의 메모리를 초과할 수 있습니다. 
+일반적인 방법은 모델 병렬화의 일종으로, 임베딩 테이블을 여러 디바이스로 샤딩하는 것입니다. 


I think it would be better to move the 샤딩(shard) annotation from further down to here.

hyoyoung · 2022-09-12T15:21:48Z

intermediate_source/torchrec_tutorial.rst

@@ -118,16 +110,15 @@ on device “meta”. This will tell EBC to not allocate memory yet.
 DistributedModelParallel
 ~~~~~~~~~~~~~~~~~~~~~~~~

-Now, we’re ready to wrap our model with |DistributedModelParallel|_ (DMP). Instantiating DMP will:
+이제 모델을 |DistributedModelParallel|_ (DMP)로 래핑할 준비가 되었습니다. 


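The glossary lists "wrapper" as 래퍼, and it is usually written with the original term alongside.
This is close to a matter of taste, but I find 래퍼 hard to translate on its own, so I recommend writing 래퍼(wrapper).
"wrapping", on the other hand, is easy to render as 감싸두기 or 포장하기, so I tend to ask for that change.

Please reread that passage. If 래핑 reads better, leave a comment saying so and feel free to keep it as-is.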

hyoyoung · 2022-09-12T15:25:14Z

intermediate_source/torchrec_tutorial.rst


-Let’s look at an example, recreating the product EmbeddingBag above:
+위의 EmbeddingBag를 만드는 예는 다음과 같습니다.


It would be better to preserve the sense of "recreating the product".
The EmbeddingBag here is built in a different way from the example above.

위의 EmbeddingBag을 다시 만들어보는, 예는 다음과 같습니다.

hyoyoung · 2022-09-12T15:25:30Z

intermediate_source/torchrec_tutorial.rst

-number of entity IDs per feature per example. In order to enable this
-“jagged” representation, we use the TorchRec datastructure
-|KeyedJaggedTensor|_ (KJT).
+예제 및 기능별로 엔티티 ID가 임의의 수인 다양한 예제를 효율적으로 나타내야 합니다. 


엔티티 -> 객체 (entity -> object)

hyoyoung · 2022-09-12T15:26:47Z

intermediate_source/torchrec_tutorial.rst

-bags, “product” and “user”. Assume the minibatch is made up of three
-examples for three users. The first of which has two product IDs, the
-second with none, and the third with one product ID.
+“product” 와 “user”, 2개의 임베딩 그룹의 컬렉션을 참조하는 방법을 살펴봅니다. 


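Above, "embedding bags" is mostly rendered as EmbeddingBag.
It would be better to match that here as well.

I tried to distinguish the places where the original uses EmbeddingBag from the places where it uses "embedding bags".
Since unifying them as you suggest does not change the meaning, I will unify them.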
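
The minibatch described in this hunk (three examples whose product bags hold two, zero, and one ID) can be pictured with a small plain-Python sketch. It only mimics the flat values-plus-lengths layout; it does not call TorchRec, and the helper `split_jagged`, the ID values, and the `"user"` lengths are assumptions made for illustration.

```python
from itertools import islice


def split_jagged(keys, values, lengths):
    """Group a flat ID list into {key: [IDs of example 0, IDs of example 1, ...]}.

    Hypothetical helper for illustration; it is not part of TorchRec.
    """
    batch_size = len(lengths) // len(keys)
    remaining = iter(values)
    grouped = {}
    for key_index, key in enumerate(keys):
        # Lengths are stored key by key, batch_size entries per key.
        key_lengths = lengths[key_index * batch_size:(key_index + 1) * batch_size]
        # Consume exactly n IDs for each example; n may be 0 (an empty bag).
        grouped[key] = [list(islice(remaining, n)) for n in key_lengths]
    return grouped


keys = ["product", "user"]
values = [101, 202, 303, 404, 505, 606]  # made-up IDs
lengths = [2, 0, 1, 1, 1, 1]             # "product" bags, then "user" bags (assumed)

bags = split_jagged(keys, values, lengths)
print(bags["product"])
print(bags["user"])
```

Under these assumptions the second product bag comes out empty, which is the case a dense, fixed-width tensor could not represent without padding.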

garam24 · 2022-09-13T10:59:28Z

intermediate_source/torchrec_tutorial.rst

-This tutorial will cover three pieces of TorchRec: the ``nn.module`` |EmbeddingBagCollection|_, the |DistributedModelParallel|_ API, and
-the datastructure |KeyedJaggedTensor|_.
+이 튜토리얼에서는 TorchRec의 3가지 부분을 다룹니다. 
+그 3가지는 ``nn.module`` |EmbeddingBagCollection|_, |DistributedModelParallel|_ API, 데이터 구조 |KeyedJaggedTensor|_ 입니다.


You translated this faithfully to the original,
but I would suggest that something like '이 튜토리얼에서는 A, B, C 이렇게 TorchRec의 세 가지 내용을 다룰 예정입니다.' would read more naturally.

garam24 · 2022-09-13T11:01:30Z

intermediate_source/torchrec_tutorial.rst

-Query vanilla nn.EmbeddingBag with input and offsets
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+입력과 오프셋이 있는 바닐라 nn.EmbeddingBag 질의
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


바닐라 (vanilla) is widely used as-is, but how about translating it as something like '기본' (basic)?
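
As background for the heading in this hunk: `nn.EmbeddingBag` takes a flat input plus `offsets`, where each offset is the starting index of one bag in the flat input. A minimal sketch of how such offsets relate to per-bag lengths, in plain Python with no PyTorch call; the helper name `lengths_to_offsets` and the sample lengths are assumptions for illustration.

```python
from itertools import accumulate


def lengths_to_offsets(lengths):
    """Return the start index of each bag (an exclusive prefix sum of lengths).

    Hypothetical helper for illustration; it is not a PyTorch function.
    """
    # Prepend 0, take running totals, then drop the final total,
    # which is the overall element count rather than a bag start.
    return list(accumulate([0] + list(lengths)))[:-1]


product_lengths = [2, 0, 1]  # assumed: two IDs, no IDs, one ID
offsets = lengths_to_offsets(product_lengths)
print(offsets)
```

An empty bag shows up as two equal consecutive offsets, since the next bag starts at the same index.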

garam24 · 2022-09-13T11:03:42Z

intermediate_source/torchrec_tutorial.rst

-“jagged” representation, we use the TorchRec datastructure
-|KeyedJaggedTensor|_ (KJT).
+예제 및 기능별로 엔티티 ID가 임의의 수인 다양한 예제를 효율적으로 나타내야 합니다. 
+이 “jagged” 표현을 사용하기 위해, TorchRec 데이터구조 |KeyedJaggedTensor|_ (KJT)를 사용합니다.


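I understand what you meant by '“jagged” 표현을 사용하기 위해' ("in order to use this “jagged” representation"), but in my personal opinion
it might be better to translate it along the lines of 다양한/유연한 표현이 가능하도록 ("so that varied/flexible representations are possible").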

hyoyoung

good

Translate torchrec_tutorial PyTorchKorea#599

0fe8835

bub3690 suggested changes Sep 9, 2022

hyoyoung added the 컨트리뷰톤 (issues/PRs related to the Open Source Contributhon) label Sep 9, 2022

hyoyoung requested changes Sep 11, 2022

add reviews

aded59b

gorae17 requested review from bub3690 and hyoyoung September 12, 2022 14:32

hyoyoung requested changes Sep 12, 2022

garam24 reviewed Sep 13, 2022

add reviews

722f573

gorae17 requested review from hyoyoung, bub3690 and garam24 and removed request for bub3690, hyoyoung and garam24 September 13, 2022 12:29

add reviews

79f4bf4

gorae17 removed the request for review from hyoyoung September 13, 2022 12:33

gorae17 requested review from garam24 and hyoyoung and removed request for garam24 September 13, 2022 12:33

hyoyoung approved these changes Sep 14, 2022

hyoyoung merged commit 601a656 into PyTorchKorea:master Sep 14, 2022
