Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR] cover static runtime for ops that uses getitem instead of listunpack #124148

Closed

Conversation

lakshmananrm1993
Copy link

@lakshmananrm1993 lakshmananrm1993 commented Apr 16, 2024

Summary: cover static runtime for ops that uses getitem instead of listunpack

Differential Revision: D55924256

Copy link

pytorch-bot bot commented Apr 16, 2024

This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @lakshmananrm1993, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team.

Copy link

linux-foundation-easycla bot commented Apr 16, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: lakshmananrm1993 (e7bbf7d)

@pytorch-bot pytorch-bot bot added the release notes: jit release notes category label Apr 16, 2024
Copy link

pytorch-bot bot commented Apr 16, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/124148

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit e7bbf7d with merge base 402b289 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@lakshmananrm1993 lakshmananrm1993 changed the title [SR] fix SR for op gb::equally_split for jit scripted model [SR] cover static runtime for ops that uses getitem instead of listtupleunpack Apr 16, 2024
@lakshmananrm1993 lakshmananrm1993 changed the title [SR] cover static runtime for ops that uses getitem instead of listtupleunpack [SR] cover static runtime for ops that uses getitem instead of listunpack Apr 16, 2024
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

Copy link
Contributor

@swolchok swolchok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall concerned about it being easier to create cases it looks like this code didn't consider using __getitem__

Comment on lines 815 to 829
Node* list_unpack_node = value_out->uses()[0].user;
if (list_unpack_node->kind() != prim::ListUnpack) {
Node* user_node = value_out->uses()[0].user;
if (user_node->kind() != prim::ListUnpack && user_node->kind() != aten::__getitem__) {
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this worries me. 1) what if there is more than one __getitem__? as written it looks like we will just leave the other ones around?
2) what if the number of __getitem__ calls in the input program doesn't match the size of the output list?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i changed it to work only fb::equally_split. This has an argument num_splits and I am using that to do all the following checks

  1. same as number of uses of this module
  2. all the uses are get_item
  3. get_item uses is not duplicated

only after this I do the fuse

torch/csrc/jit/runtime/static/passes.cpp Outdated Show resolved Hide resolved
torch/csrc/jit/runtime/static/passes.cpp Outdated Show resolved Hide resolved
torch/csrc/jit/runtime/static/passes.cpp Show resolved Hide resolved

std::vector<std::tuple<Node*, ArrayRef<Value*>>> user_to_remove;

for(auto index = 0; index < value_out->uses().size(); index++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use c10::irange

torch/csrc/jit/runtime/static/passes.cpp Outdated Show resolved Hide resolved
torch/csrc/jit/runtime/static/passes.cpp Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@lakshmananrm1993
Copy link
Author

overall concerned about it being easier to create cases it looks like this code didn't consider using __getitem__

I have changed it to cover the scenarios. I do the fuse only if the following validations succeed and I changed it to work only for equally_split

same as number of uses of this module
all the uses are get_item
get_item uses is not duplicated


const auto num_splits_ivalue = toIValue(value_num_splits);

// check if equally_split num of splits is 1.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment repeats the code; delete

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

// Its only for fb::equally_split operator
// the operators results are being accessed by get_item rather than ListUnpack
void FuseEquallySplitGetItemUnpack(std::shared_ptr<torch::jit::Graph>& graph) {
// replacement contains (old_node, new_node, list_unpack_node)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stale comment, delete

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

const Value* value_num_splits = node->inputs()[1];

// check if equally_split num of splits is equal to the number of output uses
if(!toIValue(value_num_splits).has_value() || !toIValue(value_num_splits).value().isInt() || toIValue(value_num_splits).value().toInt() != value_out->uses().size()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't call toIValue repeatedly; we should not be requiring LTO for good performance

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replaced it with constant_as()


const Value* value_num_splits = node->inputs()[1];

// check if equally_split num of splits is equal to the number of output uses
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment repeats the code; delete

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

break;
}

auto getItem_indice_out = toIValue(user->inputs()[1]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be getItem_index_out

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use constant_as<int>() as recommended above

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 973 to 983
bool should_fuse = true;

std::vector<UserToRemove> user_to_remove_list;

std::unordered_set<std::int64_t> getItem_indices;

for(auto index = 0; index < value_out->uses().size(); index++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please do not put so many blank lines; keep style consistent with the surrounding code

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the not needed blank lines


std::vector<UserToRemove> user_to_remove_list;

std::unordered_set<std::int64_t> getItem_indices;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use c10::FastSet

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


LOG(INFO) << "Found fb::equally_split node2: " << user->kind().toQualString();

getItem_indice_out.emplace(getItem_indice_out);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this does nothing

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should have been getItem_Indices. Updated it now

break;
}

LOG(INFO) << "Found fb::equally_split node2: " << user->kind().toQualString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why node2 and not just node?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was testing comment. removed it

Comment on lines 1025 to 1054
for(const auto& replacement : replacement_list) {
auto* new_node = graph->create(replacement.new_sym, 0);
for (Value* in : replacement.node_to_be_fused->inputs()) {
new_node->addInput(in);
}
for(const auto& user_to_remove_node : replacement.user_to_remove) {
for (Value* out : user_to_remove_node.user_outputs) {
Value* new_out = new_node->addOutput();
new_out->copyMetadata(out);
out->replaceAllUsesWith(new_out);
}
user_to_remove_node.user->destroy();
}
new_node->insertAfter(replacement.node_to_be_fused);
replacement.node_to_be_fused->destroy();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be easier for me to review this for correctness if it ended up looking the same as the FuseListUnpack path. If you need to send a separate stacked PR to get rid of the std::tuple in FuseListUnpack, go for it

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

lakshmananrm1993 added a commit to lakshmananrm1993/pytorch that referenced this pull request May 23, 2024
…pack (pytorch#124148)

Summary:

cover static runtime for ops that uses getitem instead of listunpack.

Reviewed By: snabelkabiya, swolchok

Differential Revision: D55924256
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

lakshmananrm1993 added a commit to lakshmananrm1993/pytorch that referenced this pull request Jun 4, 2024
…pack (pytorch#124148)

Summary:

cover static runtime for ops that uses getitem instead of listunpack.

Reviewed By: snabelkabiya, swolchok

Differential Revision: D55924256
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

lakshmananrm1993 added a commit to lakshmananrm1993/pytorch that referenced this pull request Jun 4, 2024
…pack (pytorch#124148)

Summary:

cover static runtime for ops that uses getitem instead of listunpack.

Reviewed By: snabelkabiya, swolchok

Differential Revision: D55924256
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

1 similar comment
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

…pack (pytorch#124148)

Summary:
Pull Request resolved: pytorch#124148

cover static runtime for ops that uses getitem instead of listunpack.

Reviewed By: snabelkabiya, swolchok

Differential Revision: D55924256
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D55924256

Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the Stale label Aug 10, 2024
@github-actions github-actions bot closed this Sep 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/trunk Trigger trunk jobs on your pull request fb-exported release notes: jit release notes category Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants