From 312f0157048c61ca97c333f07281b2d33f67a593 Mon Sep 17 00:00:00 2001 From: MKhalusova Date: Mon, 13 Nov 2023 11:01:09 -0500 Subject: [PATCH] fixed a couple of broken links --- docs/source/concept_guides/performance.md | 2 +- docs/source/concept_guides/training_tpu.md | 2 +- docs/source/usage_guides/training_zoo.md | 4 ++-- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/source/concept_guides/performance.md b/docs/source/concept_guides/performance.md index 45494efa79a..81ac1009f9a 100644 --- a/docs/source/concept_guides/performance.md +++ b/docs/source/concept_guides/performance.md @@ -74,7 +74,7 @@ In this example, there are two GPUs for "Multi-GPU" and a TPU pod with 8 workers ## Learning Rates -As noted in multiple sources[[1](https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-deep-learning-training-using-gpus-in-the-aws-cloud/)][[2](https://docs.nvidia.com/clara/tlt-mi_archive/clara-train-sdk-v2.0/nvmidl/appendix/training_with_multiple_gpus.html)], the learning rate should be scaled *linearly* based on the number of devices present. The below +As noted in multiple sources[[1](https://aws.amazon.com/blogs/machine-learning/scalable-multi-node-deep-learning-training-using-gpus-in-the-aws-cloud/)][[2](https://docs.nvidia.com/clara/clara-train-sdk/pt/model.html#classification-models-multi-gpu-training)], the learning rate should be scaled *linearly* based on the number of devices present. The below snippet shows doing so with Accelerate: diff --git a/docs/source/concept_guides/training_tpu.md b/docs/source/concept_guides/training_tpu.md index b0b21c8e644..45c10f0384f 100644 --- a/docs/source/concept_guides/training_tpu.md +++ b/docs/source/concept_guides/training_tpu.md @@ -36,7 +36,7 @@ Below is an example of a training function passed to the [`notebook_launcher`] i - This code snippet is based off the one from the `simple_nlp_example` notebook found [here](https://github.com/huggingface/notebooks/blob/main/examples/accelerate/simple_nlp_example.ipynb) with slight + This code snippet is based off the one from the `simple_nlp_example` notebook found [here](https://github.com/huggingface/notebooks/blob/main/examples/accelerate_examples/simple_nlp_example.ipynb) with slight modifications for the sake of simplicity diff --git a/docs/source/usage_guides/training_zoo.md b/docs/source/usage_guides/training_zoo.md index 21e9057a141..42dfe18a9f3 100644 --- a/docs/source/usage_guides/training_zoo.md +++ b/docs/source/usage_guides/training_zoo.md @@ -154,12 +154,12 @@ Below contains a non-exhaustive list of papers utilizing 🤗 Accelerate. * Puijin Cheng, Li Lin, Yijin Huang, Huaqing He, Wenhan Luo, Xiaoying Tang: “Learning Enhancement From Degradation: A Diffusion Model For Fundus Image Enhancement”, 2023; [arXiv:2303.04603](http://arxiv.org/abs/2303.04603). * Shun Shao, Yftah Ziser, Shay Cohen: “Erasure of Unaligned Attributes from Neural Representations”, 2023; [arXiv:2302.02997](http://arxiv.org/abs/2302.02997). * Seonghyeon Ye, Hyeonbin Hwang, Sohee Yang, Hyeongu Yun, Yireun Kim, Minjoon Seo: “In-Context Instruction Learning”, 2023; [arXiv:2302.14691](http://arxiv.org/abs/2302.14691). -* Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar: “Prismer: A Vision-Language Model with An Ensemble of Experts”, 2023; [arXiv:2303.02506](http://arxiv.org/abs/2303.02506 ). +* Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar: “Prismer: A Vision-Language Model with An Ensemble of Experts”, 2023; [arXiv:2303.02506](http://arxiv.org/abs/2303.02506). * Haoyu Chen, Zhihua Wang, Yang Yang, Qilin Sun, Kede Ma: “Learning a Deep Color Difference Metric for Photographic Images”, 2023; [arXiv:2303.14964](http://arxiv.org/abs/2303.14964). * Van-Hoang Le, Hongyu Zhang: “Log Parsing with Prompt-based Few-shot Learning”, 2023; [arXiv:2302.07435](http://arxiv.org/abs/2302.07435). * Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Ana Brassard, Masashi Yoshikawa, Keisuke Sakaguchi, Kentaro Inui: “Do Deep Neural Networks Capture Compositionality in Arithmetic Reasoning?”, 2023; [arXiv:2302.07866](http://arxiv.org/abs/2302.07866). * Ruoyao Wang, Peter Jansen, Marc-Alexandre Côté, Prithviraj Ammanabrolu: “Behavior Cloned Transformers are Neurosymbolic Reasoners”, 2022; [arXiv:2210.07382](http://arxiv.org/abs/2210.07382). -* Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde: “Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection”, 2023; [arXiv:2304.13148](http://arxiv.org/abs/2304.13148 ). DOI: [https://dx.doi.org/10.1145/3539618.3591882 10.1145/3539618.3591882]. +* Martin Wessel, Tomáš Horych, Terry Ruas, Akiko Aizawa, Bela Gipp, Timo Spinde: “Introducing MBIB -- the first Media Bias Identification Benchmark Task and Dataset Collection”, 2023; [arXiv:2304.13148](http://arxiv.org/abs/2304.13148). DOI: [https://dx.doi.org/10.1145/3539618.3591882 10.1145/3539618.3591882]. * Hila Chefer, Yuval Alaluf, Yael Vinker, Lior Wolf, Daniel Cohen-Or: “Attend-and-Excite: Attention-Based Semantic Guidance for Text-to-Image Diffusion Models”, 2023; [arXiv:2301.13826](http://arxiv.org/abs/2301.13826). * Marcio Fonseca, Yftah Ziser, Shay B. Cohen: “Factorizing Content and Budget Decisions in Abstractive Summarization of Long Documents”, 2022; [arXiv:2205.12486](http://arxiv.org/abs/2205.12486). * Elad Richardson, Gal Metzer, Yuval Alaluf, Raja Giryes, Daniel Cohen-Or: “TEXTure: Text-Guided Texturing of 3D Shapes”, 2023; [arXiv:2302.01721](http://arxiv.org/abs/2302.01721).