Changing the hashing methodology for cache folder creation of models. #481
base: main
Conversation
review WIP.
QEfficient/base/modeling_qeff.py
Outdated
@@ -5,7 +5,7 @@
 #
 # ----------------------------------------------------------------------------

-import hashlib
+# import hashlib
Commented-out code. Make sure that commented-out code is not left in ready-for-review PRs.
QEfficient/base/modeling_qeff.py
Outdated
self.model_params.update(kwargs)
self.model_params["config"] = self.model.config.to_diff_dict()
self.model_params["_transform_names"] = self._transform_names()
self.compile_params = {}
Initialize this only when compile is called.
There is no point in creating this dictionary if the user never calls compile.
QEfficient/base/modeling_qeff.py
Outdated
self.model_params = {}
self.model_params.update(kwargs)
Better to do self.model_params = copy.deepcopy(kwargs).
This lets other methods safely mutate kwargs afterwards; otherwise we would need to ensure that no other method mutates the kwargs.
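For illustration, a minimal sketch of the aliasing hazard the deep copy avoids (the dict contents here are made up):

```python
import copy

kwargs = {"generation_config": {"max_length": 128}}

shallow = dict(kwargs)        # copies the outer dict, still shares the nested dict
deep = copy.deepcopy(kwargs)  # fully independent snapshot

kwargs["generation_config"]["max_length"] = 256
print(shallow["generation_config"]["max_length"])  # 256 -- mutation leaked in
print(deep["generation_config"]["max_length"])     # 128 -- snapshot preserved
```

With a deep copy, a later mutation of kwargs cannot silently change the stored params, and therefore cannot change the hash derived from them.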
QEfficient/base/modeling_qeff.py
Outdated
if export_kwargs is not None:
    self.model_params.update(export_kwargs)
if onnx_transform_kwargs is not None:
    self.model_params.update(onnx_transform_kwargs)
One-liners are better:
self.model_params.update(export_kwargs) if export_kwargs is not None else None
self.model_params.update(onnx_transform_kwargs) if onnx_transform_kwargs is not None else None
QEfficient/base/modeling_qeff.py
Outdated
self.model_params["output_names"] = output_names
self.model_params["dynamic_axes"] = dynamic_axes
Better to keep them one more level down, as
self.model_params["export_params"] = export_params,
and add all export params in export_params, which is another dict.
This makes the dumped JSON readable for the user.
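For illustration, a sketch of the flat layout versus the suggested nested layout (values are placeholders, not taken from the PR):

```python
flat = {
    "config": {"...": "..."},
    "output_names": ["logits"],
    "dynamic_axes": {"input_ids": {0: "batch_size"}},
}

nested = {
    "config": {"...": "..."},
    "export_params": {  # one extra level groups all export arguments
        "output_names": ["logits"],
        "dynamic_axes": {"input_ids": {0: "batch_size"}},
    },
}
```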
QEfficient/base/modeling_qeff.py
Outdated
model_params_json = export_dir / "model_params.json"
with open(model_params_json, "w") as fp:
    json.dump(
        {
            "model_params": [
                {k: make_serializable(self.model_params[k]) for k in sorted(self.model_params.keys())}
            ]
        },
        fp,
        indent=4,
    )
Dumping should happen after export. If the model errors out during export and we still dump the JSON, it does not make sense.
self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
# self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
?
To ensure the user does not initialize the model directly via the __init__ of the modeling class, use metaclass control to govern the flow and print a warning or raise an error. For example:

class NoInitMeta(type):
    def __call__(cls, *args, **kwargs):
        raise RuntimeError("Use `from_pretrained` to create an instance.")

class MyModel(metaclass=NoInitMeta):
    def __init__(self, data):
        self.data = data

    @classmethod
    def from_pretrained(cls, path):
        instance = object.__new__(cls)
        instance.__init__(f"Loaded from {path}")
        return instance

You can read more about this here: https://stackoverflow.com/questions/100003/what-are-metaclasses-in-python.
Put that metaclass in the utils and use it for all the modeling classes.
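Usage of the pattern above: direct construction is blocked, while from_pretrained still works because object.__new__ bypasses the metaclass __call__.

```python
try:
    MyModel("data")
except RuntimeError as err:
    print(err)  # Use `from_pretrained` to create an instance.

model = MyModel.from_pretrained("/some/path")
print(model.data)  # Loaded from /some/path
```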
QEfficient/base/modeling_qeff.py
Outdated
@@ -357,6 +388,19 @@ def _compile(
     logger.info(f"Running compiler: {' '.join(command)}")
     try:
         subprocess.run(command, capture_output=True, check=True)

+        # Dumping compile paramters in a JSON file after successful QPC compilation
Remove all the code related to compile_param_json from here, including the dumping, and handle these inside the dump_qconfig decorator. Let's keep the base methods clean.
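A rough sketch of what such a decorator could look like; the attribute names (self.compile_params, self.qpc_path) and the decorator name are assumptions for illustration, not the actual dump_qconfig implementation:

```python
import functools
import json
import os


def dump_compile_params(func):
    """Run the wrapped compile method, then dump its params to JSON only on success."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        result = func(self, *args, **kwargs)  # raises if compilation fails
        params = {k: str(v) for k, v in sorted(self.compile_params.items())}
        with open(os.path.join(self.qpc_path, "compile_params.json"), "w") as fp:
            json.dump({"compile_params": params}, fp, indent=4)
        return result
    return wrapper
```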
- There is a typo in "paramters" on line 391.
- Define a method for creating the compile/export params JSON in utils, use the same method for creating both the export and compile JSON, and dump it using the dump_json method.
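A sketch of the shared utils helpers this suggests; dump_json exists in utils per the review, but the exact signatures here are assumed:

```python
import json
from pathlib import Path
from typing import Any, Dict


def create_params_json(params: Dict[str, Any]) -> Dict[str, str]:
    # One builder for both export and compile params; sorting keeps output stable.
    return {k: str(params[k]) for k in sorted(params)}


def dump_json(data: Dict[str, Any], json_path: Path) -> None:
    with open(json_path, "w") as fp:
        json.dump(data, fp, indent=4)
```

Both the export and the compile paths would then call, e.g., dump_json(create_params_json(params), target_path).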
QEfficient/base/modeling_qeff.py
Outdated
model_params_json = export_dir / "model_params.json"
with open(model_params_json, "w") as fp:
    json.dump(
        {
Same as compile: create a decorator and handle all these param updates and dumping inside it.
Also use the dump_json from the utils.
QEfficient/base/modeling_qeff.py
Outdated
export_params["output_names"] = output_names
export_params["dynamic_axes"] = dynamic_axes

self.model_params["export_params"] = export_params
Handle in a decorator. Let's keep our base methods clean.
Agreed, let's write a decorator implementation to handle this.
Do we need a decorator here? How would we return the hash from the decorator and construct the export dir?
Wouldn't it be better to move the hash creation to a separate function and call it from here?
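A minimal sketch of the separate-function alternative (hypothetical names), so _export can use the returned hash to build the export directory:

```python
import hashlib
import json


def create_export_hash(params: dict, length: int = 16) -> str:
    # Stable serialization first, then a truncated SHA-256 digest.
    serialized = json.dumps(params, sort_keys=True, default=str).encode()
    return hashlib.sha256(serialized).hexdigest()[:length]


# Illustrative call site inside _export:
# export_hash = create_export_hash(self.model_params)
# export_dir = cache_dir / f"{model_architecture}-{export_hash}"
```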
QEfficient/base/modeling_qeff.py
Outdated
# Store Model parameters to Calculate Hash for caching
self.model_params = {}
self.model_params = copy.deepcopy(kwargs)
- Do we need self.model_params = {}? The deepcopy on the next line immediately overwrites it.
- Move the model_params updates into a method; it keeps __init__ clean.
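A sketch of what factoring this out of __init__ could look like (method and class names here are illustrative assumptions):

```python
import copy


class QEffModelSketch:  # illustrative host class, not the real base class
    def _store_model_params(self, kwargs: dict) -> None:
        # Single place that builds the params later used for hashing.
        self.model_params = copy.deepcopy(kwargs)
        self.model_params["config"] = self.model.config.to_diff_dict()
        self.model_params["_transform_names"] = self._transform_names()
```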
Force-pushed from 99ff668 to 7b9cfba
QEfficient/base/modeling_qeff.py
Outdated
model_params = copy.deepcopy(kwargs)

model_params["config"] = self.model.config.to_diff_dict()
model_params["_transform_names"] = self._transform_names()
You can name it applied_transforms instead of "_transform_names". Also, move this method below __init__.
QEfficient/base/modeling_qeff.py
Outdated
if hasattr(self.model.config, "architectures"):
    self.model_architecture = self.model.config.architectures[0]
If the config doesn't have architectures, this leaves self.model_architecture undefined, which would raise an AttributeError later. Instead you can use:
self.model_architecture = getattr(self.model.config, "architectures", [None])[0]
QEfficient/base/modeling_qeff.py
Outdated
dynamic_axes=dynamic_axes,
export_kwargs=export_kwargs,
onnx_transform_kwargs=onnx_transform_kwargs,
export_dir=export_dir,
Why do we need export_dir in the hash params?
QEfficient/base/modeling_qeff.py
Outdated
@@ -210,6 +236,11 @@ def _export(
     finally:
         shutil.rmtree(tmp_onnx_dir, ignore_errors=True)

+    # Dump JSON file with hashed parameters
+    hashed_params_export_path = export_dir / "hashed_model_params.json"
(Suggestion) Would it be better to name it export_model_params or something similar, since we have compile_model_params as well?
# Check if already compiled
compile_hash = compile_hash.hexdigest()[:16]
compile_hash, hashed_params = hash_compile_params(
Remove the above comment, as it's not needed here: "# Check if already compiled".
# Make Embedding specific transforms like appending pooling
if pooling:
    self.model, _ = PoolingTransform.apply(self.model, pooling)

self.model.base_model.config.use_cache = True

self.pretrained_model_name_or_path = kwargs.get("pretrained_model_name_or_path", None)
self.hash_params["qeff_class"] = self.__class__.__name__
nit: Which would read better, "qeff_class" or "qeff_auto_class"?
QEfficient/utils/cache.py
Outdated
@@ -39,3 +43,11 @@ def to_hashable(obj) -> bytes:
         default=json_serializable,
         sort_keys=True,
     ).encode()


+def hash_dict_params(dict_items: Dict, hash_string_size: int = HASH_HEXDIGEST_STR_LEN):
It would be better to create a new Python file in utils and maintain all hash-related modules there.
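A sketch of what such a dedicated module could contain (e.g. a hypothetical QEfficient/utils/hash_utils.py); to_hashable and hash_dict_params mirror the diff above, but the internals here are simplified assumptions:

```python
import hashlib
import json
from typing import Dict

HASH_HEXDIGEST_STR_LEN = 16


def to_hashable(obj) -> bytes:
    # Sorted keys give an order-independent, stable byte serialization.
    return json.dumps(obj, sort_keys=True, default=str).encode()


def hash_dict_params(dict_items: Dict, hash_string_size: int = HASH_HEXDIGEST_STR_LEN) -> str:
    # Truncated SHA-256 digest used as the cache-folder suffix.
    return hashlib.sha256(to_hashable(dict_items)).hexdigest()[:hash_string_size]
```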
Force-pushed from b28d33e to a07406b
… QNN compilation not included yet. The cache folder mechanism has been modified to have a parent directory for a model based on the architecture that we retrieve from the model config. The hash calculation for the ONNX export now incorporates all model kwargs as well as export kwargs and parameters. The parameters that were used to create the hash also get dumped as a serialized JSON file in the ONNX folder; the same happens for the compile parameters inside the respective qpc folder. Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
This fixed the issue with higher BS compilation for SwiftKV models ``` Compiler command: ['/opt/qti-aic/exec/qaic-exec', '-aic-hw', '-aic-hw-version=2.0', '-m=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/LlamaSwiftKVForCausalLM.onnx', '-compile-only', '-retained-state', '-convert-to-fp16', '-aic-num-cores=16', '-network-specialization-config=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/specializations.json', '-custom-IO-list-file=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/custom_io.yaml', '-mdp-load-partition-config=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/mdp_ts_4.json', '-aic-binary-dir=/prj/qct/aisyssol_scratch/users/shagsood/quic_shagun/LlamaSwiftKVForCausalLM-a5879ebc0e59ab40/qpc-60f86f912a187346/qpc'] Compiler exitcode: 1 Compiler stderr: QAIC_ERROR: Error message: [Operator-'/model/layers.16/self_attn/Reshape'] : Reshape: input shape (4, 4, 4096) and output shape (4, 1, 32, 128) have different number of elements (in 65536 vs. out 16384) Unable to AddNodesToGraphFromModel ``` Tested with BS4. Able to compile now. Signed-off-by: quic-shagun <quic_shagsood@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
CI enablement and other minor fixes for Gemma3 --------- Signed-off-by: Ann Kuruvilla <quic_akuruvil@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Added fix for spdtransform due to change in hash --------- Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
- Enabled CI tests for Finetuning. - Updated Jenkins file to install torch_qaic as it is required during FT tests. - Added finetune as a new pytest flag and updated other existing tests not to trigger for this flag. --------- Signed-off-by: meetkuma <meetkuma@qti.qualcomm.com> Co-authored-by: Meet Patel <meetkuma@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
CI enablement and other minor fixes for Gemma3 --------- --------- Signed-off-by: Ann Kuruvilla <quic_akuruvil@quicinc.com> Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com> Co-authored-by: Dipankar Sarkar <dipankar@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Reverts quic#484 Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…boarded features in docs (quic#423) This PR is created for updating the readme and docs for adding the latest features added in this release. --------- Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…next file. (quic#475) Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Rishin Raj <rishinr@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Padding the dataset with dummy samples (they won't contribute to total_loss) to make the number of samples a multiple of (degree of DDP * batch_size) in case of 1) fine tuning through DDP, 2) train_batch_size > 1 or val_batch_size > 0. --------- Signed-off-by: Swati Allabadi <sallabad@qti.qualcomm.com> Co-authored-by: Swati Allabadi <sallabad@qti.qualcomm.com> Co-authored-by: Mamta Singh <168400541+quic-mamta@users.noreply.github.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Generating data format config file fails for encoder onnx graph without past key or past value. Fixed a coding bug in the function. --------- Signed-off-by: Shubham Agrawal <shubhagr@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Changed Total (E2E) inference time from decode/sec to sec. Signed-off-by: Asmita Goswami <quic_asmigosw@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…ng optimizer step only. (quic#477) Disabling gradient sync is necessary when using gradient_accumulation_steps > 1 with DDP enabled. Currently, we are syncing gradients at every loss.backward() call, which happens at every step. When using gradient accumulation, the weights update only during the opt.step() call; only at that step should the gradients across devices be synced with each other. The model.no_sync() context manager solves this issue; here, we are not using it but instead setting ddp_model.require_backward_grad_sync to True or False depending on which step we are on. --------- Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com> Signed-off-by: meetkuma <meetkuma@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…uic#371) 1. Implement logger for finetuning 2. Enable dumping logs via a given flag --------- Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com> Co-authored-by: Mamta Singh <mamtsing@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Falcon modeling fix to accommodate multiple configs. This is a fix for Falcon 40B. Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…ic#482) - Removed all the references of samsum dataset from finetuning code. - Samsum dataset can be used via custom dataset path. --------- Signed-off-by: Meet Patel <meetkuma@qti.qualcomm.com> Signed-off-by: meetkuma <meetkuma@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Rishin <rishinr@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Upgrading onnx, onnxruntime, onnxscript and protobuf. Also updating transformers to 4.52.3: 1. onnx==1.18.0 2. onnxruntime==1.22 3. onnxscript==0.2.5 4. protobuf==6.31.0 --------- Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
1. fix task_type variable in configs 2. enabled passing peft_config yaml/json file from command line. 3. updated run_ft_model.py --------- Signed-off-by: Mamta Singh <mamtsing@qti.qualcomm.com> Co-authored-by: Mamta Singh <mamtsing@qti.qualcomm.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <quic_dhirajku@quicinc.com> Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…or export Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Functions made to filter and modify hashes. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…m_pretrained based model creation. Will add that functionality separately after the hashing methodology is finalized. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…ents on naming and ordering as well. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…file to contain all hashing related methods and tools. Minor code clean ups. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Force-pushed from a07406b to c5bed92
…' instead of 'model_params'. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…into an error if that parameter doesn't exist in the config. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Minor edits to the modeling scripts. Need to confirm why we're using an exclusion list rather than an inclusion list of parameters for kwargs to be hashed. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
…ule for all models in QEfficient. This PR contains changes made to modeling_qeff and modeling_auto to allow usage of certain export parameters and kwargs passed during model creation. The hashing module is now made independent of the calling class, and the test scripts are updated accordingly to test for this functionality. Added functionality to have an overarching parent directory in cache to contain all the different exported model configs belonging to the same architecture. In case the architecture isn't present in the config of the model, we instead proceed with self.model_name-based parent directory creation. The hash is now created during export, so as to incorporate all the additional params needed for unique hash creation; thus, the test scripts have been modified to test the hashing functionality accordingly. We maintain an exclusion list of params for kwargs to be discarded during hashing parameter selection. We'll need to look into the alternate approach of maintaining an inclusion list instead. There was a comment to use metaclasses to raise a warning whenever someone loads a model without using the 'from_pretrained' method, but the current class architecture of VLMs and SpeechSeq2Seq models doesn't allow for this; this use case will be handled in a different PR. Signed-off-by: Dhiraj Kumar Sah <dhirajku@qti.qualcomm.com>
Detaching the hash function for model cache path calculation. Changes for QNN compilation are not included yet.
The cache folder mechanism has been modified to have a parent directory for a model based on the architecture that we retrieve from the model config. The hash calculation for the ONNX export now incorporates all model kwargs as well as export kwargs and parameters. The parameters that were used to create the hash also get dumped as a serialized JSON file in the ONNX folder; the same happens for the compile parameters inside the respective qpc folder.
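For illustration, a sketch of how the pieces described above could combine into cache paths; the directory layout and names are assumptions based on this description (the hash values are taken from the compile log earlier in the thread):

```python
from pathlib import Path

cache_dir = Path.home() / ".cache" / "qeff_models"

architecture = "LlamaSwiftKVForCausalLM"  # from model.config.architectures[0]
export_hash = "a5879ebc0e59ab40"          # hash over model kwargs + export params
compile_hash = "60f86f912a187346"         # hash over compile params

export_dir = cache_dir / architecture / f"{architecture}-{export_hash}"
qpc_dir = export_dir / f"qpc-{compile_hash}"

# hashed_model_params.json would sit in export_dir; the compile params JSON in qpc_dir.
print(qpc_dir)
```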