fix view_copy kernel striding check logic #81553
Conversation
[ghstack-poisoned]
✅ No failures (0 pending) as of commit 2a9c14b (more details on the Dr. CI page). 💚 Looks good so far! There are no failures yet. 💚 This comment was automatically generated by Dr. CI. Please report bugs/suggestions to the (internal) Dr. CI Users group.
@pytorchbot merge
@pytorchbot successfully started a merge job. Check the current status here.
Hey @bdhirsh. |
Summary:
The composite kernel that we generate for `view_copy` is special-cased a bit for efficiency, to avoid doing extra clones in some cases.
That logic was slightly wrong, and it is fixed here: it needs to mirror the logic in `reshape()`.
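For intuition, here is a small eager-mode illustration (plain PyTorch, not the generated kernel itself) of the rule that `reshape()` applies and that the generated kernel has to mirror: whether a zero-copy view is possible depends on whether a compatible set of strides can be computed for the requested shape, not on contiguity alone.

```python
import torch

# A non-contiguous tensor can still admit a zero-copy view...
t = torch.randn(4, 5)[:, :3]          # shape (4, 3), strides (5, 1)
assert not t.is_contiguous()
v = t.view(2, 2, 3)                   # succeeds: compatible strides exist
assert v.data_ptr() == t.data_ptr()   # same storage, nothing was copied

# ...while for other layouts no compatible strides exist: view() refuses,
# and reshape() falls back to making a copy.
u = t.permute(1, 0)                   # shape (3, 4), strides (1, 5)
try:
    u.view(12)
except RuntimeError:
    pass                              # "view size is not compatible ..."
r = u.reshape(12)                     # works, but materializes a copy
assert r.data_ptr() != u.data_ptr()
```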
It manifested as a debug assert firing for Lazy Tensor, which I confirmed no longer fires when running this script:
```python
# ran with "python test_ltc_only_torch.py --device=lazy --sync=1 --nvtx=1"
import torch
import torch._lazy
from torch._lazy.ts_backend import init as init_ts_backend
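# Register the TorchScript-based lazy tensor backend so device='lazy' is usable.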
init_ts_backend()
torch.manual_seed(42)
from transformers import BertForSequenceClassification
def parse_args():
    import argparse
    parser = argparse.ArgumentParser(description='')
    parser.add_argument('--device', type=str, default='cuda')
    parser.add_argument('--sync', type=bool, default=False)
    parser.add_argument('--nvtx', type=bool, default=False)
    return parser.parse_args()
args = parse_args()
device = args.device
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', return_dict=True)
from transformers import AdamW
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
text_batch = ["I love Pixar.", "I don't care for Pixar."]
encoding = tokenizer(text_batch, return_tensors='pt', padding=True, truncation=True)
input_ids = encoding['input_ids'].to(device)
attention_mask = encoding['attention_mask'].to(device)
model = model.to(device)
model.train()
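# Standard BERT fine-tuning setup: exclude biases and LayerNorm weights from weight decay.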
no_decay = ['bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in model.named_parameters() if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in model.named_parameters() if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}
]
optimizer = AdamW(optimizer_grouped_parameters, lr=1e-5)
labels = torch.tensor([1,0]).unsqueeze(0).to(device)
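# Each mark_step() marks a step boundary for the lazy backend and kicks off
# compilation/execution of the traced graph; wait_device_ops() blocks until the
# resulting device work finishes, so the NVTX ranges bracket real compute.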
for _ in range(6):
    torch.cuda.nvtx.range_push(f'Iter{_}')
    torch.cuda.nvtx.range_push('F')
    outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
    if args.sync:
        torch._lazy.mark_step()
        torch._lazy.wait_device_ops()
    torch.cuda.nvtx.range_pop()
    loss = outputs.loss
    torch.cuda.nvtx.range_push('B')
    optimizer.zero_grad()
    loss.backward()
    if args.sync:
        torch._lazy.mark_step()
        torch._lazy.wait_device_ops()
    torch.cuda.nvtx.range_pop()
    torch.cuda.nvtx.range_push('O')
    optimizer.step()
    if args.sync:
        torch._lazy.mark_step()
        torch._lazy.wait_device_ops()
    torch.cuda.nvtx.range_pop()
    torch.cuda.nvtx.range_pop()
    torch._lazy.mark_step()
    torch._lazy.wait_device_ops()
```
Pull Request resolved: #81553
Approved by: https://github.com/ezyang
Test Plan: contbuild & OSS CI, see https://hud.pytorch.org/commit/pytorch/pytorch/ed0091f8db1265449f13e2bdd1647bf873bd1fea
Reviewed By: jeanschmidt
Differential Revision: D37990671
Pulled By: bdhirsh
fbshipit-source-id: 76a8292f7050e3f24cbe5bacdc6cb8c392ddd4fd
Stack from ghstack: