Fixing file permissions #1
Merged
Conversation
No description provided.
ShadenSmith changed the title from "Adding executable perms to install.sh" to "Fixing file permissions" on Feb 3, 2020.
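The change itself is small: install.sh was tracked without its execute bit, so it could not be run directly. Below is a minimal sketch of the equivalent fix in Python (an assumption for illustration; the PR itself most likely just flips the mode bits that git tracks, and the script path is taken from the original PR title).

```python
import os
import stat

# Minimal sketch (assumption: the fix simply marks install.sh executable).
# Add the execute bits for owner, group, and others while keeping the
# file's existing permission bits.
script = "install.sh"
mode = os.stat(script).st_mode
os.chmod(script, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

In a git checkout, the new mode then shows up as a 100644 → 100755 file-mode change in the next commit.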
ShadenSmith force-pushed the install_perms branch from cf59f7f to 9bb70fb on February 3, 2020 at 18:54.
kouml pushed a commit to kouml/DeepSpeed that referenced this pull request on Apr 3, 2020:
Fixing file permissions.
ShadenSmith referenced this pull request in ShadenSmith/DeepSpeed on Sep 10, 2020:
- Tied module indexing bugfix.
- Train and inference pipeline schedules.
- Move code quality tests to Azure-hosted agents. (microsoft#368)
cli99 added a commit that referenced this pull request on Jan 13, 2021.
rraminen pushed a commit to rraminen/DeepSpeed that referenced this pull request on Apr 28, 2021:
…cript Added ds_train_bert_bsz32k_seq512_pipeclean.sh
liamcli referenced this pull request in determined-ai/DeepSpeed on Sep 27, 2021:
Fix all Pipeline Module Parameters being sent to cuda:0
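For context on that fix: if every pipeline stage builds or moves its layers onto a hard-coded cuda:0, all ranks end up sharing a single GPU. The general pattern for avoiding this, sketched below with plain PyTorch, is to bind each process to its own local device before moving parameters. This is an illustrative assumption about the class of fix, not the actual determined-ai patch; it presumes a launcher such as torchrun that sets LOCAL_RANK.

```python
import os
import torch
import torch.distributed as dist

# Illustrative sketch: place each rank's pipeline-stage parameters on its
# own GPU instead of letting everything default to cuda:0.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ.get("LOCAL_RANK", 0))
torch.cuda.set_device(local_rank)            # bind this process to its GPU
device = torch.device("cuda", local_rank)

stage = torch.nn.Linear(1024, 1024)          # stand-in for one pipeline stage
stage.to(device)                             # parameters land on cuda:<local_rank>
```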
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request on Oct 14, 2022:
- threaded tf_dl + presplit sentences + shuffled dataset with resume
- elaborate in readme
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request on Oct 14, 2022:
Megatron + DeepSpeed + Pipeline Parallelism
pengwa pushed a commit to pengwa/DeepSpeed that referenced this pull request on Oct 14, 2022:
- Enable Megatron-LM workload on ROCm (microsoft#1)
- Enable Megatron workload on ROCm
- Added ds_pretrain_gpt_350M_dense_pipeclean.sh
- removed a file
- Removed an extra line
- Fix to resolve the rsqrtf() error below on ROCm:
    /root/Megatron-DeepSpeed/megatron/fused_kernels/layer_norm_hip_kernel.hip:298:10: error: no matching function for call to 'rsqrtf'
        return rsqrtf(v);
               ^~~~~~
    /opt/rocm-5.2.0/llvm/lib/clang/14.0.0/include/__clang_hip_math.h:521:7: note: candidate function not viable: call to __device__ function from __host__ function
        float rsqrtf(float __x) { return __ocml_rsqrt_f32(__x); }
              ^
- Simplified code
- Simplified the code
- Removed extra spaces
guoyejun pushed a commit to guoyejun/DeepSpeed that referenced this pull request on Nov 10, 2022:
don't gather partitioned activations for mp size 1 (microsoft#2454)
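The idea behind that commit title is that when the model-parallel group has only one rank, the local partition already is the full activation, so the gather collective can be skipped. A hedged sketch of the pattern follows; the helper name and layout are hypothetical and not the code from microsoft#2454.

```python
import torch
import torch.distributed as dist

def gather_partitioned_activations(partition, group=None):
    """Hypothetical helper: gather activation partitions across the
    model-parallel group, skipping the collective when mp size is 1."""
    mp_size = dist.get_world_size(group=group)
    if mp_size == 1:
        # Nothing is partitioned across ranks, so there is nothing to gather.
        return partition
    parts = [torch.empty_like(partition) for _ in range(mp_size)]
    dist.all_gather(parts, partition, group=group)
    return torch.cat(parts, dim=-1)
```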
This was referenced Apr 28, 2023
loadams pushed a commit that referenced this pull request on Mar 6, 2024:
- Add workspace capability to DSKernel
- Add to injection pipeline
- Validated
loadams pushed a commit that referenced this pull request on Mar 6, 2024:
- Initialize the fp6-quant-kernel integration.
- Add necessary parameters of kernel interfaces and the linear layer selection logic.
- upload kernel code
- The simple script for debugging.
- fix typo
- update
- fix split k
- Fix some errors and add test case.
- Workspace for Inference Kernels (#1)
- Add transform_param functions and update format.
- kernel debug
- fix include
- Update core_ops.cpp
- Add split k support
- fix
- Fix kernel error
- update
- update
- Fix rebase errors.
- Add missed include.
- Fix the bug that the attribute uses the weight information for mem alloc.
- Avoid GPU preallocation during weight loading.
- Add support of larger shapes for gated activation kernel.
- update
- model update
- fix all weight preprocessing
- Add split-k heuristic.
- Avoid reading scale attribute on non-quantized tensors.
- Change the scales from attributes to new tensors. Provide the end-to-end script given HuggingFace model id.
- Hard-coded commented out the scales in the kernel to workaround the bug.
- Support the user config for quantization. Fix kernel bug.
- Per operator test functions.
- Multiply scales by 1e12 according to the kernel design.
- Revert "Workspace for Inference Kernels (#1)". This reverts commit 1528732.
- Remove the format-only changes.
- Put the quantization into the transform_param function.

Co-authored-by: Shiyang Chen <csycfl@gmail.com>
Co-authored-by: Haojun Xia <xhjustc@mail.ustc.edu.cn>