Model files output by LoRA fine-tuning have changed, causing errors when calling the fine-tuned model #390
Comments
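Are you fine-tuning on multiple GPUs?
No, on a single GPU. It is fixed now; it seems to have been an environment problem. What I can't explain is that adapter_model.bin has shrunk from over 400 MB to a little over 200 MB, yet the model's quality does not seem to have changed.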
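A file that halves in size while quality stays the same is consistent with the same weights being stored at a lower precision, though nothing in this thread confirms that this is the cause. The sketch below is arithmetic only. The parameter count is a hypothetical value chosen for illustration, not one measured from the adapter discussed here.

```python
# Rough size estimate for a saved adapter checkpoint.
# NUM_PARAMS is a hypothetical value for illustration, not taken from the thread.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def checkpoint_size_mb(num_params: int, dtype: str) -> float:
    """Approximate size in MB (10**6 bytes), ignoring file-format overhead."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


NUM_PARAMS = 110_000_000  # hypothetical adapter size

for dtype in ("float32", "float16"):
    print(f"{dtype}: about {checkpoint_size_mb(NUM_PARAMS, dtype):.0f} MB")
```

Comparing the tensor dtypes in the old and new adapter_model.bin would confirm or rule out this explanation.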
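How much memory does a single GPU need to run the fine-tuning? I have one machine with 4 GPUs and 90 GB of GPU memory in total, and I keep getting OOM whether I use LoRA or QLoRA, distributed or single-GPU.
I run on a single GPU. For LoRA on Qwen 7B on a single GPU, memory usage depends on the model_max_length parameter in finetune_lora_single_gpu.sh: the larger the value, the more memory is used. In my case a value of 384 uses 24 GB and a value of 2048 uses 45 GB.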
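The two figures reported above can give a rough estimate for other model_max_length values. This is a minimal sketch that assumes memory grows linearly between and beyond those two points. That is a simplification based on one commenter's setup, so the result is a ballpark figure, not a guarantee. The function name is made up for this example.

```python
# Two (model_max_length, GPU memory in GB) points reported in the comment above.
P1 = (384, 24.0)
P2 = (2048, 45.0)


def estimate_memory_gb(max_length: int) -> float:
    """Linearly interpolate or extrapolate memory use from the two reported points."""
    (x1, y1), (x2, y2) = P1, P2
    slope = (y2 - y1) / (x2 - x1)  # GB per extra token of maximum length
    return y1 + slope * (max_length - x1)


for length in (384, 1024, 2048):
    print(f"model_max_length={length}: about {estimate_memory_gb(length):.1f} GB")
```

Under this assumption a large fixed cost remains even at very short lengths, so lowering model_max_length alone may not be enough on GPUs with less memory.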
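After lowering the length, I get a tensor error straight away. Has anyone else run into this? The dataset is the official one, with just those two images, and batch_size is 1, so in principle this error should not occur:
I also lowered model_max_length, but I am running LoRA fine-tuning on the chat model with two 3090s, and the results are poor. My understanding is that this is because the parameters of the visual module were not fine-tuned. Has anyone fine-tuned the visual module? How much compute is needed to fine-tune only the visual module?
@InvincibleMinions How did you solve this problem? I am running into the same issue. Could you share how you fixed it? 😊
For the IMAGE_SET error raised when calling the model, you can refer to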
The model files originally produced by LoRA fine-tuning:
![image](https://private-user-images.githubusercontent.com/153627466/332292203-3c2ec18a-12cc-4049-8dd8-78cd85cfeb06.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA2NzcxOTQsIm5iZiI6MTcyMDY3Njg5NCwicGF0aCI6Ii8xNTM2Mjc0NjYvMzMyMjkyMjAzLTNjMmVjMThhLTEyY2MtNDA0OS04ZGQ4LTc4Y2Q4NWNmZWIwNi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxMVQwNTQ4MTRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1mZDI5MzJiMjk3MGNkOTAwYzU2MjliOGZjZTAwOThjNTZjYmIxYmMxNTZkOWEwZTMxYzA0ZjBkMmMzZDBlYTEyJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.D5h4x0VMlnbLOnuJxeCq1EcUE4toACvRh_jmwfXkB-k)
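I don't know what happened. I did not change anything else, but when I fine-tuned again recently, the output turned into the following: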
![image](https://private-user-images.githubusercontent.com/153627466/332291995-30d940a3-4b87-4adc-b514-ffb3d07fa5cc.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA2NzcxOTQsIm5iZiI6MTcyMDY3Njg5NCwicGF0aCI6Ii8xNTM2Mjc0NjYvMzMyMjkxOTk1LTMwZDk0MGEzLTRiODctNGFkYy1iNTE0LWZmYjNkMDdmYTVjYy5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxMVQwNTQ4MTRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT02ODJjOGU5Y2I3NjVkM2VmOGNhYzJiNGNiYmE5MTgwZTE0N2FiMzdjNzk1NjkwNTcxNzI1Y2I5MjgxYjdiZDNjJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.4ke2RhAo7uKeawYFlWESh1xfBP_sJCL_7yysQLYz_9w)
![image](https://private-user-images.githubusercontent.com/153627466/332292441-27e86cfa-8979-4095-86b3-591694bf0081.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA2NzcxOTQsIm5iZiI6MTcyMDY3Njg5NCwicGF0aCI6Ii8xNTM2Mjc0NjYvMzMyMjkyNDQxLTI3ZTg2Y2ZhLTg5NzktNDA5NS04NmIzLTU5MTY5NGJmMDA4MS5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxMVQwNTQ4MTRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT0wMDYyMDBjY2JhNTM1ZjUyNjY2YzVhNGUyZDhmOTA3Mzc4YzQ2ODJjZTg1ZThjMzU2NTgyZDQwZmUxZGQzNDU0JlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.0UmAuRVCjTK9N8TEvaQcoTOyell2SROa0m40DhzzJag)
![image](https://private-user-images.githubusercontent.com/153627466/332292566-e2f7d605-51c9-41bd-953a-d51fadc5f96b.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjA2NzcxOTQsIm5iZiI6MTcyMDY3Njg5NCwicGF0aCI6Ii8xNTM2Mjc0NjYvMzMyMjkyNTY2LWUyZjdkNjA1LTUxYzktNDFiZC05NTNhLWQ1MWZhZGM1Zjk2Yi5wbmc_WC1BbXotQWxnb3JpdGhtPUFXUzQtSE1BQy1TSEEyNTYmWC1BbXotQ3JlZGVudGlhbD1BS0lBVkNPRFlMU0E1M1BRSzRaQSUyRjIwMjQwNzExJTJGdXMtZWFzdC0xJTJGczMlMkZhd3M0X3JlcXVlc3QmWC1BbXotRGF0ZT0yMDI0MDcxMVQwNTQ4MTRaJlgtQW16LUV4cGlyZXM9MzAwJlgtQW16LVNpZ25hdHVyZT1kMjc0MDNhODNlNmM3Y2RkNTlmMzI0ZDZmMzRmYjM4MDkyOTQ5MTg4Yzc4ZTc2Mzk5NmFmZmRlNjgzNTU3YmFiJlgtQW16LVNpZ25lZEhlYWRlcnM9aG9zdCZhY3Rvcl9pZD0wJmtleV9pZD0wJnJlcG9faWQ9MCJ9.OyuOMxvpWyohXpU3ZdDVBAkkR_TfvfUfY-6B8zK2KA4)
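Fine-tuning then also reports the following error:
Calling the fine-tuned model also reports an error:
Everything worked before, and then this suddenly started happening.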