Merged
37 commits
694c317
Add decilm modelling code
danielkorzekwa Nov 3, 2025
991659f
Add decilm modelling code.
danielkorzekwa Nov 3, 2025
8489cee
Add transformers codebase
danielkorzekwa Nov 3, 2025
f0afefe
Add transformers code
danielkorzekwa Nov 3, 2025
b3ed5bc
Add decilm modelling code
danielkorzekwa Nov 3, 2025
a700da5
Add decilm modelling code
danielkorzekwa Nov 3, 2025
b59b679
Correct licence headers
danielkorzekwa Nov 4, 2025
1abdf3e
Correct licence headers
danielkorzekwa Nov 4, 2025
66609b1
Add decilm code
danielkorzekwa Nov 4, 2025
7da0a8a
Add decilm code
danielkorzekwa Nov 4, 2025
6e09a81
Add decilm code
danielkorzekwa Nov 4, 2025
2e3f5da
Add decilm code
danielkorzekwa Nov 4, 2025
418890e
Add decilm code
danielkorzekwa Nov 4, 2025
01f4fc1
Make llama3 converter self-contained (no deps on internal Nvidia code)
danielkorzekwa Nov 4, 2025
c57eed4
Add common module
danielkorzekwa Nov 4, 2025
3dc37b3
module refactoring
danielkorzekwa Nov 4, 2025
10ffdfe
refactoring
danielkorzekwa Nov 5, 2025
27a4456
add shared_checkpointing_utils
danielkorzekwa Nov 5, 2025
b0e22b7
Add json tools
danielkorzekwa Nov 5, 2025
52e7827
add logger
danielkorzekwa Nov 5, 2025
f5c1c87
import refactoring
danielkorzekwa Nov 5, 2025
0aa6320
add post_init_sparse module
danielkorzekwa Nov 5, 2025
35d0dbc
Add post_init_sparse
danielkorzekwa Nov 5, 2025
e39a1ad
merginy hydra.py and hydra_utils.py
danielkorzekwa Nov 5, 2025
1bd0c67
Add integrationt test for attention pruning
danielkorzekwa Nov 5, 2025
0ecd52b
add score_pruning_activations
danielkorzekwa Nov 5, 2025
278c6b7
import refactoring
danielkorzekwa Nov 5, 2025
7a0af16
add dist_utils
danielkorzekwa Nov 5, 2025
0f0cbbd
Add validate_model
danielkorzekwa Nov 5, 2025
cb5cf25
Add activation scoring hooks for pruning
danielkorzekwa Nov 5, 2025
fadda1b
Merge branch 'feature/compress' into dkorzekwa/score_pruning_activati…
danielkorzekwa Nov 18, 2025
31d01cc
Delete not needed tokenizer
danielkorzekwa Nov 18, 2025
5d8b5b2
Improve docs
danielkorzekwa Nov 18, 2025
19e3f94
remove unused hooks
danielkorzekwa Nov 19, 2025
c6560c1
Remove not needed mypy ignore
danielkorzekwa Nov 19, 2025
93cb161
Improve doc strings
danielkorzekwa Nov 19, 2025
5a5f566
Fix type in LayerNormlContributionHook name
danielkorzekwa Nov 20, 2025
2 changes: 1 addition & 1 deletion examples/compress/README.md
@@ -30,7 +30,7 @@ pip install -e .[hf,compress]
How to choose `intermediate_size_list`?
The list specifies the candidate FFN sizes that we wish to search over. It is recommended to choose several pruning sizes (e.g. 15%, 20%, 30% etc of the original). Note that the values must be hardware-friendly (divisible by a 256) to avoid issues with tensor operations in subsequent steps.

-Let's first shoot for 32% GPU memory reduction setting `target_memory = 78_000` GiB. This means that the algorithm will choose the candidates with highest accuracy that also meet the specified requirements.
+Let's first shoot for 32% GPU memory reduction setting `target_memory = 78_000` MiB. This means that the algorithm will choose the candidates with highest accuracy that also meet the specified requirements.

2. Download and prepare the [Nemotron-Post-Training-Dataset-v2](https://huggingface.co/datasets/nvidia/Nemotron-Post-Training-Dataset-v2).

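Reviewer note on the README hunk above: a minimal sketch of how one might derive an `intermediate_size_list` that satisfies the divisible-by-256 constraint, and sanity-check the corrected `target_memory` unit. The helper names, the example original FFN size of 14336, and the rounding-down policy are assumptions for illustration only; they are not taken from this PR's code.

```python
def candidate_ffn_sizes(original_size, fractions, multiple=256):
    """Return FFN sizes at the given fractions of the original size,
    each rounded down to a multiple of `multiple` (hypothetical helper)."""
    sizes = []
    for frac in fractions:
        size = int(original_size * frac) // multiple * multiple
        # Skip sizes that round down to zero and drop duplicates.
        if size >= multiple and size not in sizes:
            sizes.append(size)
    return sizes


def implied_baseline_mib(target_memory_mib, reduction):
    """Baseline memory implied by a target and a fractional reduction."""
    return target_memory_mib / (1.0 - reduction)


# Assumed original FFN size of 14336; 15%, 20% and 30% as in the README text.
intermediate_size_list = candidate_ffn_sizes(14336, [0.15, 0.20, 0.30])
print(intermediate_size_list)

# A 32% reduction down to 78_000 MiB implies a baseline of roughly 114_706 MiB,
# which is only plausible in MiB, not GiB.
print(round(implied_baseline_mib(78_000, 0.32)))
```

Rounding down rather than to the nearest multiple keeps each candidate at or below the requested fraction; rounding to nearest would be an equally reasonable choice.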
@@ -0,0 +1,15 @@
# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
