
After pre-training, merging the LoRA weights fails (only the LoRA parameters were trained); merging with the alpaca-lora you provide works fine #53

Closed
3 tasks done
musellama opened this issue Aug 2, 2023 · 11 comments

@musellama

Required checks before submitting

  • Please make sure you are using the latest code from the repository (git pull); some problems have already been solved and fixed.
  • I have read the FAQ section of the project documentation, searched the existing issues for this problem, and found no similar problem or solution.
  • For third-party plugin problems (e.g. llama.cpp, text-generation-webui), it is also recommended to look for a solution in the corresponding project.

Issue type

Model inference

Base model

Alpaca-2-7B

Operating system

Windows

Describe the problem in detail

After pre-training finished, I merged the LoRA weights; only the LoRA parameters were trained. After the merge, however, running inference fails. Merging with the alpaca-lora you provide works correctly.
lr=2e-4
lora_rank=64
lora_alpha=128
lora_trainable="q_proj,v_proj"
modules_to_save="embed_tokens,lm_head"  # this line has been removed
lora_dropout=0.05

pretrained_model=path/model/chinese-alpaca-2-7b-hf
chinese_tokenizer_path=path/tokenizer
dataset_dir=path/yuliao
data_cache=temp_data_cache_dir
per_device_train_batch_size=1
per_device_eval_batch_size=1
gradient_accumulation_steps=3
output_dir=output_dir
block_size=512
These are the PT (pre-training) parameters.
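For reference, a quick sanity check on the settings above (values copied from this issue; the single-GPU count is an assumption, since the hardware is not stated):

```python
# Derived quantities from the PT hyperparameters listed above.
lora_rank, lora_alpha = 64, 128
scaling = lora_alpha / lora_rank          # LoRA output is scaled by alpha/r
per_device_bs, grad_accum, n_gpus = 1, 3, 1  # n_gpus = 1 is an assumption
effective_bs = per_device_bs * grad_accum * n_gpus

print(scaling)       # 2.0
print(effective_bs)  # 3
```

Nothing here is unusual (alpha = 2r is a common choice), which points the suspicion at the merge step rather than the training hyperparameters.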

Dependencies (must be provided for code-related issues)

 pip list | grep -E 'transformers|peft|torch'
peft                     0.3.0.dev0
torch                    2.0.1
transformers             4.31.0

Run logs or screenshots

Loading path_to_output_dir...
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.56s/it]
Loaded the model in 7.85 seconds.
Loading the extension "gallery"... Ok.
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\generation\utils.py:1219: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1005,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1005,0,0], thread: [97,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1005,0,0], thread: [98,0,0] Assertion srcIndex < srcSelectDimSize failed.
(… the same assertion repeats for every thread of blocks [1005,0,0], [1014,0,0], and [1019,0,0]; roughly 200 identical lines omitted …)
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [41,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [42,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [43,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [44,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [45,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [46,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [47,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [48,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [49,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [50,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [51,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [52,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [53,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [54,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [55,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [56,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [57,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [58,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [59,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [60,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [61,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [62,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [63,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [96,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [97,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [98,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [99,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [100,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [101,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [102,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [103,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [104,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [105,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [106,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [107,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [108,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [109,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [110,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [111,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [112,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [113,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [114,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [115,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [116,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [117,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [118,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [119,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [120,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [121,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [122,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [123,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [124,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [125,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [126,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [127,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [64,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [65,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [66,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [67,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [68,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [69,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [70,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [71,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [72,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [73,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [74,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [75,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [76,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [77,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [78,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [79,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [80,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [81,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [82,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [83,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [84,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [85,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [86,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [87,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [88,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [89,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [90,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [91,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [92,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [93,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [94,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [95,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [0,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [1,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [2,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [3,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [4,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [5,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [6,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [7,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [8,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [9,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [10,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [11,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [12,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [13,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [14,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [15,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [16,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [17,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [18,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [19,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [20,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [21,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [22,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [23,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [24,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [25,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [26,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [27,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [28,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [29,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [30,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1019,0,0], thread: [31,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [32,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [33,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [34,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [35,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [36,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [37,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [38,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [39,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [40,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [41,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [42,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [43,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [44,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [45,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [46,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [47,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [48,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [49,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [50,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [51,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [52,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [53,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [54,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [55,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [56,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [57,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [58,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [59,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [60,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [61,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [62,0,0] Assertion srcIndex < srcSelectDimSize failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:1146: block: [1004,0,0], thread: [63,0,0] Assertion srcIndex < srcSelectDimSize failed.
Traceback (most recent call last):
File "I:\oobabooga_windows\text-generation-webui\modules\callbacks.py", line 66, in gentask
ret = self.mfunc(callback=_callback, **self.kwargs)
File "I:\oobabooga_windows\text-generation-webui\modules\text_generation.py", line 290, in generate_with_callback
shared.model.generate(**kwargs)
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
return self.sample(
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
outputs = self(
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 687, in forward
outputs = self.model(
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 536, in forward
attention_mask = self._prepare_decoder_attention_mask(
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 464, in _prepare_decoder_attention_mask
combined_attention_mask = _make_causal_mask(
File "I:\oobabooga_windows\installer_files\env\lib\site-packages\transformers\models\llama\modeling_llama.py", line 49, in _make_causal_mask
mask = torch.full((tgt_len, tgt_len), torch.tensor(torch.finfo(dtype).min, device=device), device=device)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Output generated in 0.69 seconds (0.00 tokens/s, 0 tokens, context 38, seed 2094177833)
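As background on the log above: the repeated `Assertion srcIndex < srcSelectDimSize failed` in `Indexing.cu` typically means an embedding lookup received a token id greater than or equal to the embedding table's row count, e.g. an extended Chinese tokenizer (55296 tokens) driving a base Llama-2 embedding that still has only 32000 rows. The following is an illustrative sketch of that condition, not project code; the concrete id and row values are assumptions for demonstration.

```python
# Sketch: find token ids that would overflow the embedding table and
# trigger the device-side assert "srcIndex < srcSelectDimSize".
def out_of_range_ids(token_ids, embedding_rows):
    """Return the token ids >= the number of embedding rows."""
    return [t for t in token_ids if t >= embedding_rows]

# Base Llama-2 has a 32000-row embedding; ids from an extended
# tokenizer (here hypothetically 40000 and 55000) overflow it.
print(out_of_range_ids([1, 29871, 40000, 55000], 32000))  # → [40000, 55000]
```

If such ids appear, the merged model's embedding size does not match the tokenizer's vocabulary, which is consistent with dropping `modules_to_save="embed_tokens,lm_head"` during training.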


@musellama
Author

Please help me.

@airaria
Contributor

airaria commented Aug 3, 2023

Please provide the complete training and merging shell scripts.

@musellama
Author

Please provide the complete training and merging shell scripts.

#--modules_to_save ${modules_to_save}
#--gradient_checkpointing
#lora_trainable="q_proj,v_proj,k_proj,o_proj,gate_proj,down_proj,up_proj"
lr=2e-4
lora_rank=64
lora_alpha=128
lora_trainable="q_proj,v_proj"
modules_to_save="embed_tokens,lm_head"
lora_dropout=0.05

pretrained_model=path/model/chinese-alpaca-2-7b-hf
chinese_tokenizer_path=path/tokenizer
dataset_dir=path/yuliao
data_cache=temp_data_cache_dir
per_device_train_batch_size=1
per_device_eval_batch_size=1
gradient_accumulation_steps=3
output_dir=output_dir
block_size=512

deepspeed_config_file=ds_zero2_no_offload.json

torchrun --nnodes 1 --nproc_per_node 1 run_clm_pt_with_peft.py \
    --deepspeed ${deepspeed_config_file} \
    --model_name_or_path ${pretrained_model} \
    --tokenizer_name_or_path ${chinese_tokenizer_path} \
    --dataset_dir ${dataset_dir} \
    --data_cache_dir ${data_cache} \
    --validation_split_percentage 0.001 \
    --per_device_train_batch_size ${per_device_train_batch_size} \
    --per_device_eval_batch_size ${per_device_eval_batch_size} \
    --do_train \
    --seed $RANDOM \
    --fp16 \
    --num_train_epochs 1 \
    --lr_scheduler_type cosine \
    --learning_rate ${lr} \
    --warmup_ratio 0.05 \
    --weight_decay 0.01 \
    --logging_strategy steps \
    --logging_steps 10 \
    --save_strategy steps \
    --save_total_limit 3 \
    --save_steps 200 \
    --gradient_accumulation_steps ${gradient_accumulation_steps} \
    --preprocessing_num_workers 8 \
    --block_size ${block_size} \
    --output_dir ${output_dir} \
    --overwrite_output_dir \
    --ddp_timeout 30000 \
    --logging_first_step True \
    --lora_rank ${lora_rank} \
    --lora_alpha ${lora_alpha} \
    --trainable ${lora_trainable} \
    --lora_dropout ${lora_dropout} \
    --torch_dtype float16 \
    --ddp_find_unused_parameters False
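Since `modules_to_save` was removed from the command above, the saved LoRA checkpoint should contain only `lora_A`/`lora_B` tensors and no full `embed_tokens` or `lm_head` weights. A hypothetical sanity check along these lines (the key names are assumptions based on PEFT's usual naming, not taken from this repo):

```python
# Hypothetical check: does the LoRA state dict carry full embedding or
# output-head weights in addition to the low-rank adapter tensors?
def has_full_modules(state_dict_keys):
    return any("embed_tokens" in k or "lm_head" in k for k in state_dict_keys)

keys = [
    "base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight",
    "base_model.model.model.layers.0.self_attn.v_proj.lora_B.weight",
]
print(has_full_modules(keys))  # → False
```

If the check returns False but the tokenizer has more tokens than the base model's embedding, the merged model's vocabulary will be too small for the tokenizer's ids, matching the CUDA indexing assert above.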

Merge script:
python scripts/merge_llama2_with_chinese_lora_low_mem.py \
    --base_model path_to_original_llama2_hf_dir \
    --lora_model pt_lora_model \
    --output_type huggingface \
    --output_dir path_to_output_dir

The merge script itself was not modified:
"""
Usage:
python merge_llama2_with_chinese_lora_low_mem.py \
    --base_model path/to/llama2-hf-model \
    --lora_model path/to/chinese-llama2-or-alpaca2-lora \
    --output_type [huggingface|pth] \
    --output_dir path/to/output-dir
"""
import argparse
import json
import os
import gc
import torch
import peft
from transformers import LlamaTokenizer
from transformers.modeling_utils import dtype_byte_size
from huggingface_hub import snapshot_download
import re

parser = argparse.ArgumentParser(description='Script to merge Llama-2-hf with Chinese LLaMA-2 or Alpaca-2 LoRA weights')
parser.add_argument('--base_model', default=None, required=True,
                    type=str, help="Base model path (basically Llama-2-hf)")
parser.add_argument('--lora_model', default=None, required=True,
                    type=str, help="LoRA model path (Chinese-LLaMA-2-LoRA, Chinese-Alpaca-2-LoRA)")
parser.add_argument('--output_type', default='huggingface', choices=['huggingface', 'pth'],
                    type=str, help="Output model type can be 'huggingface' (default) or 'pth' format")
parser.add_argument('--output_dir', default='./merged_model',
                    type=str, help="Output path for the merged model")
parser.add_argument('--verbose', default=False, action='store_true',
                    help="Show detailed debugging messages")

emb_to_model_size = {
    4096: '7B',
    5120: '13B',
    8192: '70B',
}
num_shards_of_models = {'7B': 1, '13B': 2, '70B': 8}
params_of_models = {
    '7B': {
        "dim": 4096,
        "multiple_of": 256,
        "n_heads": 32,
        "n_layers": 32,
        "norm_eps": 1e-05,
        "vocab_size": -1,
    },
    '13B': {
        "dim": 5120,
        "multiple_of": 256,
        "n_heads": 40,
        "n_layers": 40,
        "norm_eps": 1e-05,
        "vocab_size": -1,
    },
    '70B': {
        "dim": 8192,
        "multiple_of": 4096,
        "ffn_dim_multiplier": 1.3,
        "n_heads": 64,
        "n_kv_heads": 8,
        "n_layers": 80,
        "norm_eps": 1e-05,
        "vocab_size": -1,
    },
}

def transpose(weight, fan_in_fan_out):
    return weight.T if fan_in_fan_out else weight

# Borrowed and modified from https://github.com/tloen/alpaca-lora

def translate_state_dict_key(k):
    k = k.replace("base_model.model.", "")
    if k == "model.embed_tokens.weight":
        return "tok_embeddings.weight"
    elif k == "model.norm.weight":
        return "norm.weight"
    elif k == "lm_head.weight":
        return "output.weight"
    elif k.startswith("model.layers."):
        layer = k.split(".")[2]
        if k.endswith(".self_attn.q_proj.weight"):
            return f"layers.{layer}.attention.wq.weight"
        elif k.endswith(".self_attn.k_proj.weight"):
            return f"layers.{layer}.attention.wk.weight"
        elif k.endswith(".self_attn.v_proj.weight"):
            return f"layers.{layer}.attention.wv.weight"
        elif k.endswith(".self_attn.o_proj.weight"):
            return f"layers.{layer}.attention.wo.weight"
        elif k.endswith(".mlp.gate_proj.weight"):
            return f"layers.{layer}.feed_forward.w1.weight"
        elif k.endswith(".mlp.down_proj.weight"):
            return f"layers.{layer}.feed_forward.w2.weight"
        elif k.endswith(".mlp.up_proj.weight"):
            return f"layers.{layer}.feed_forward.w3.weight"
        elif k.endswith(".input_layernorm.weight"):
            return f"layers.{layer}.attention_norm.weight"
        elif k.endswith(".post_attention_layernorm.weight"):
            return f"layers.{layer}.ffn_norm.weight"
        elif k.endswith("rotary_emb.inv_freq") or "lora" in k:
            return None
        else:
            print(layer, k)
            raise NotImplementedError
    else:
        print(k)
        raise NotImplementedError

def unpermute(w):
    return (
        w.view(n_heads, 2, dim // n_heads // 2, dim).transpose(1, 2).reshape(dim, dim)
    )

def save_shards(model_sd, num_shards: int, prefix="", verbose=False):
    """
    Convert and save the HF format weights to PTH format weights
    """
    with torch.no_grad():
        if num_shards == 1:
            new_state_dict = {}
            for k, v in model_sd.items():
                new_k = translate_state_dict_key(k)
                if new_k is not None:
                    if "wq" in new_k or "wk" in new_k:
                        new_state_dict[new_k] = unpermute(v)
                    else:
                        new_state_dict[new_k] = v

            os.makedirs(output_dir, exist_ok=True)
            print(f"Saving shard 1 of {num_shards} into {output_dir}/{prefix}consolidated.00.pth")
            torch.save(new_state_dict, output_dir + f"/{prefix}consolidated.00.pth")
        else:
            new_state_dicts = [dict() for _ in range(num_shards)]
            for k in list(model_sd.keys()):
                v = model_sd[k]
                new_k = translate_state_dict_key(k)
                if new_k is not None:
                    if new_k == 'tok_embeddings.weight':
                        assert v.size(1) % num_shards == 0
                        splits = v.split(v.size(1) // num_shards, dim=1)
                    elif new_k == 'output.weight':
                        if v.size(0) % num_shards == 0:
                            splits = v.split(v.size(0) // num_shards, dim=0)
                        else:
                            size_list = [v.size(0) // num_shards] * num_shards
                            size_list[-1] += v.size(0) % num_shards
                            splits = v.split(size_list, dim=0)  # 13B: size_list == [24976, 24977]
                    elif new_k == 'norm.weight':
                        splits = [v] * num_shards
                    elif 'ffn_norm.weight' in new_k:
                        splits = [v] * num_shards
                    elif 'attention_norm.weight' in new_k:
                        splits = [v] * num_shards
                    elif 'w1.weight' in new_k:
                        splits = v.split(v.size(0) // num_shards, dim=0)
                    elif 'w2.weight' in new_k:
                        splits = v.split(v.size(1) // num_shards, dim=1)
                    elif 'w3.weight' in new_k:
                        splits = v.split(v.size(0) // num_shards, dim=0)
                    elif 'wo.weight' in new_k:
                        splits = v.split(v.size(1) // num_shards, dim=1)
                    elif 'wv.weight' in new_k:
                        splits = v.split(v.size(0) // num_shards, dim=0)
                    elif "wq.weight" in new_k or "wk.weight" in new_k:
                        v = unpermute(v)
                        splits = v.split(v.size(0) // num_shards, dim=0)
                    else:
                        print(f"Unexpected key {new_k}")
                        raise ValueError
                    if verbose:
                        print(f"Processing {new_k}")
                    for sd, split in zip(new_state_dicts, splits):
                        sd[new_k] = split.clone()
                        del split
                    del splits
                del model_sd[k], v
                gc.collect()    # Effectively enforce garbage collection

            os.makedirs(output_dir, exist_ok=True)
            for i, new_state_dict in enumerate(new_state_dicts):
                print(f"Saving shard {i+1} of {num_shards} into {output_dir}/{prefix}consolidated.0{i}.pth")
                torch.save(new_state_dict, output_dir + f"/{prefix}consolidated.0{i}.pth")

def merge_shards(output_dir, num_shards: int):
    ckpt_filenames = sorted([f for f in os.listdir(output_dir) if re.match(r'L(\d+)-consolidated.(\d+).pth', f)])

    for i in range(num_shards):
        shards_filenames = sorted([f for f in ckpt_filenames if re.match(rf'L(\d+)-consolidated.0{i}.pth', f)])
        print(f"Loading {shards_filenames} ...")
        shards_dicts = [torch.load(os.path.join(output_dir, fn)) for fn in shards_filenames]
        shards_merged = {}
        for d in shards_dicts:
            shards_merged |= d

        print("Saving the merged shard to " + os.path.join(output_dir, f"consolidated.0{i}.pth"))
        torch.save(shards_merged, os.path.join(output_dir, f"consolidated.0{i}.pth"))

        print("Cleaning up...")
        del shards_merged
        for d in shards_dicts:
            del d
        del shards_dicts
        gc.collect()    # Effectively enforce garbage collection
        for fn in shards_filenames:
            os.remove(os.path.join(output_dir, fn))

if __name__ == '__main__':
    args = parser.parse_args()
    base_model_path = args.base_model
    lora_model_path = args.lora_model
    output_dir = args.output_dir
    output_type = args.output_type
    os.makedirs(output_dir, exist_ok=True)

    print("="*80)
    print(f"Base model: {base_model_path}")
    print(f"LoRA model: {lora_model_path}")

    tokenizers_and_loras = []
    print(f"Loading {lora_model_path}")
    if not os.path.exists(lora_model_path):
        print("Cannot find lora model on the disk. Downloading lora model from hub...")
        lora_model_path = snapshot_download(repo_id=lora_model_path)
    tokenizer = LlamaTokenizer.from_pretrained(lora_model_path, legacy=True)
    lora_config = peft.LoraConfig.from_pretrained(lora_model_path)
    lora_state_dict = torch.load(os.path.join(lora_model_path, 'adapter_model.bin'), map_location='cpu')
    if 'base_model.model.model.embed_tokens.weight' in lora_state_dict:
        lora_vocab_size = lora_state_dict['base_model.model.model.embed_tokens.weight'].shape[0]
        assert lora_vocab_size == len(tokenizer), \
            (f"The vocab size of the tokenizer {len(tokenizer)} does not match the vocab size of the LoRA weight {lora_vocab_size}!\n")
    tokenizers_and_loras.append(
        {
            "tokenizer": tokenizer,
            "state_dict": lora_state_dict,
            "config": lora_config,
            "scaling": lora_config.lora_alpha / lora_config.r,
            "fan_in_fan_out": lora_config.fan_in_fan_out,
        })

    if not os.path.exists(base_model_path):
        print("Cannot find base model on the disk. Downloading base model from hub...")
        base_model_path = snapshot_download(repo_id=base_model_path)
    ckpt_filenames = sorted([f for f in os.listdir(base_model_path) if re.match(r'pytorch_model-(\d+)-of-(\d+).bin', f)])

    embedding_size = None
    model_size = None
    total_size = 0
    for index, filename in enumerate(ckpt_filenames):
        print(f"Loading ckpt {filename}")
        state_dict = torch.load(os.path.join(base_model_path, filename), map_location='cpu')
        if index == 0:
            embedding_size = state_dict['model.embed_tokens.weight'].shape[1]
            model_size = emb_to_model_size[embedding_size]
            if output_type == 'pth':
                params = params_of_models[model_size]
                num_shards = num_shards_of_models[model_size]
                n_layers = params["n_layers"]
                n_heads = params["n_heads"]
                dim = params["dim"]
                dims_per_head = dim // n_heads
                base = 10000.0
                inv_freq = 1.0 / (base ** (torch.arange(0, dims_per_head, 2).float() / dims_per_head))
        print("Merging...")
        for k in state_dict:
            for tl_idx, t_and_l in enumerate(tokenizers_and_loras):
                saved_key = 'base_model.model.' + k
                lora_key_A = saved_key.replace('.weight', '.lora_A.weight')
                if saved_key in t_and_l['state_dict']:
                    if args.verbose:
                        print(f"copying {saved_key} from {tl_idx}-th LoRA weight to {k}")
                    state_dict[k] = t_and_l['state_dict'][saved_key].half().clone()  # do we need half()?
                if lora_key_A in t_and_l['state_dict']:
                    lora_key_B = lora_key_A.replace('lora_A.weight', 'lora_B.weight')
                    if args.verbose:
                        print(f"merging {lora_key_A} and lora_B.weight from {tl_idx}-th LoRA weight to {k}")
                    state_dict[k] += (
                        transpose(
                            t_and_l['state_dict'][lora_key_B].float()
                            @ t_and_l['state_dict'][lora_key_A].float(), t_and_l['fan_in_fan_out']) * t_and_l['scaling']
                    )
            weight_size = state_dict[k].numel() * dtype_byte_size(state_dict[k].dtype)
            total_size += weight_size

        if output_type == 'huggingface':
            print(f"Saving ckpt {filename} to {output_dir} in HF format...")
            torch.save(state_dict, os.path.join(output_dir, filename))
        elif output_type == 'pth':
            print("Converting to pth format...")
            save_shards(model_sd=state_dict, num_shards=num_shards, prefix=f"L{index+1}-", verbose=args.verbose)
        del state_dict
        gc.collect()    # Effectively enforce garbage collection

    print("Saving tokenizer")
    tokenizers_and_loras[-1]['tokenizer'].save_pretrained(output_dir)
    if output_type == 'pth':
        with open(output_dir + "/params.json", "w") as f:
            print(f"Saving params.json into {output_dir}/params.json")
            json.dump(params, f)
        merge_shards(output_dir, num_shards=num_shards)

    if output_type == 'huggingface':
        configs = ('config.json', 'generation_config.json', 'pytorch_model.bin.index.json')
        for config in configs:
            if os.path.exists(os.path.join(base_model_path, config)):
                print(f"Saving {config}")
                with open(os.path.join(base_model_path, config), 'r') as f:
                    obj = json.load(f)
                if config == 'config.json':
                    obj['vocab_size'] = len(tokenizers_and_loras[-1]['tokenizer'])
                if config == 'pytorch_model.bin.index.json':
                    obj['metadata']['total_size'] = total_size
                with open(os.path.join(output_dir, config), 'w') as f:
                    json.dump(obj, f, indent=2)
    print("Done.")
    print(f"Check output dir: {output_dir}")
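For reference, the per-key arithmetic inside the merge loop is the standard LoRA update W' = W + (alpha/r) * B @ A. A minimal sketch of just that step, with hypothetical names and a small illustrative size (only the rank and alpha mirror the lora_rank=64 / lora_alpha=128 settings from the training script):

```python
import torch

# Hypothetical example: a square projection weight with a rank-64 LoRA,
# mirroring lora_rank=64 / lora_alpha=128 from the training script above.
dim, r, alpha = 512, 64, 128
W = torch.randn(dim, dim)         # base weight, e.g. q_proj.weight
A = torch.randn(r, dim) * 0.01    # lora_A.weight: (r, in_features)
B = torch.randn(dim, r) * 0.01    # lora_B.weight: (out_features, r)

scaling = alpha / r               # same formula as lora_config.lora_alpha / lora_config.r
W_merged = W + (B @ A) * scaling  # the update the merge loop applies to each matched key

print(W_merged.shape)             # torch.Size([512, 512])
```

Because alpha is twice the rank here, the low-rank delta is applied with a scaling factor of 2 before being folded into the base weight.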

@iMountTai (Collaborator)

The error message above is from text-generation-webui, not from the LoRA merge. Do you mean the merge itself succeeded, but inference failed?

@musellama (Author)

Right, the problem is not in the LoRA merge but at inference time. The model merged with alpaca-lora runs fine, but the one merged after pre-training (PT) does not.

@airaria (Contributor) commented Aug 3, 2023

> Merge script invocation:
> python scripts/merge_llama2_with_chinese_lora_low_mem.py \
>     --base_model path_to_original_llama2_hf_dir \
>     --lora_model pt_lora_model \
>     --output_type huggingface \
>     --output_dir path_to_output_dir

Are you merging against the original Llama-2? You trained the LoRA on chinese-alpaca-2, so --base_model should point to chinese-alpaca-2.

@musellama (Author)

Yes, I merged against the original Llama-2, and the LoRA was trained on chinese-alpaca-2. So at merge time, --base_model should point to chinese-alpaca-2, right?

@airaria (Contributor) commented Aug 3, 2023

> Yes, I merged against the original Llama-2, and the LoRA was trained on chinese-alpaca-2. So at merge time, --base_model should point to chinese-alpaca-2, right?

Yes.

@musellama (Author)

OK, thanks for clearing that up. Then for merging multiple LoRAs, I can no longer pass them comma-separated as before, and instead have to merge several times, right?
That is: train on llama2, merge pointing at llama2; then train on the merged model, and merge again pointing at that merged model?
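If that reading is right, the iterative workflow would look like the following sketch. All paths and LoRA names here are placeholders; the rule is simply that each round merges against the exact model the LoRA was trained on:

```shell
# Round 1: LoRA trained on chinese-alpaca-2, so merge against chinese-alpaca-2
python scripts/merge_llama2_with_chinese_lora_low_mem.py \
    --base_model path/to/chinese-alpaca-2-hf \
    --lora_model path/to/first_pt_lora_model \
    --output_type huggingface \
    --output_dir path/to/merged_round1

# Round 2: a LoRA trained on merged_round1 must be merged against merged_round1
python scripts/merge_llama2_with_chinese_lora_low_mem.py \
    --base_model path/to/merged_round1 \
    --lora_model path/to/second_pt_lora_model \
    --output_type huggingface \
    --output_dir path/to/merged_round2
```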

@github-actions bot

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your consideration.

@github-actions github-actions bot added the stale label Aug 13, 2023
@github-actions bot

Closing the issue, since no updates observed. Feel free to re-open if you need any further assistance.

@github-actions github-actions bot closed this as not planned Aug 18, 2023
@ymcui ymcui closed this as completed Aug 21, 2023