
Add calls to reserve() before populating vectors #51739

Closed
wants to merge 24 commits

Conversation

SamuelMarks
Contributor

…and am at capacity

PS: WiP. Will finish going through your codebase adding capacity hints to all vectors with an obvious opportunity for this optimisation.

…sorflow/cc/ops/while_loop.cc,tensorflow/compiler/jit/extract_outside_compilation_pass.cc,tensorflow/compiler/mlir/tools/kernel_gen/transforms/gpu_kernel_to_blob_pass.cc] Space height
@google-ml-butler google-ml-butler bot added the size:XS CL Change Size: Extra Small label Aug 30, 2021
@google-ml-butler google-ml-butler bot added the awaiting review Pull request awaiting review label Aug 30, 2021
@google-cla google-cla bot added the cla: yes label Aug 30, 2021
@gbaned gbaned self-assigned this Aug 30, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Aug 30, 2021
@gbaned gbaned requested a review from cheshire October 1, 2021 14:43
@cheshire cheshire removed their request for review October 1, 2021 16:13
@SamuelMarks
Contributor Author

Following the huge amount of interest from @sanjoy @kkimdev @sherhut @qqfish and @joker-eph over the past 5 weeks, I merged in the latest master and added vector reservations in another 82 files of the TensorFlow codebase.

@joker-eph
Contributor

Nice! I had missed this pull-request originally.

It is likely that I read the title as spam somehow though; can you retitle this explicitly, e.g. "Add calls to `reserve()` before populating vectors" or something like that?

@SamuelMarks SamuelMarks changed the title I have reservations Add calls to reserve() before populating vectors Oct 4, 2021
@SamuelMarks
Contributor Author

123 CAN HAZ REZERVATIONS

Sure thing @joker-eph; and whilst I was at it I finished the .cc files in the compiler dir.

…l_ir_emitter}.cc,tensorflow/compiler/tf2xla/kernels/cross_op.cc] Properly reserve vector space
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Oct 4, 2021
PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Oct 4, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Oct 4, 2021
@gbaned gbaned removed awaiting review Pull request awaiting review ready to pull PR ready for merge process labels Oct 4, 2021
@joker-eph
Contributor

so I didn't have much option left 😕

There is a wide spectrum between "everything in one PR" and "one PR per file": you went from one extreme to the other; I'm saying there is also the possibility of exercising some reasonable judgement.
For example, it is quite common that a large piece of software has many components, with different people working on those various components. Sharding along these lines helps get the right people reviewing the right PRs, without exploding the overhead of many PRs to click through individually.

(I'm puzzled that you still like the one big PR after suffering through rebase/merge conflicts... the larger the code change, the more likely it is to suffer through these.)

@bhack
Contributor

bhack commented Oct 15, 2021

For example, it is quite common that a large piece of software has many components, with different people working on those various components.

This is true, but if we took more care in the maintainership of our CODEOWNERS file it could be easier to identify a one-PR-per-component logic:

https://github.com/tensorflow/tensorflow/blob/master/CODEOWNERS

Currently it seems to me quite partial.
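For anyone unfamiliar with the format: a CODEOWNERS file maps path patterns to the GitHub users or teams responsible for them, and GitHub auto-requests reviews from the matching owners. A hypothetical fragment (the team names here are invented for illustration, not taken from TensorFlow's actual file):

```
# Hypothetical entries -- owner names are illustrative only.
# The last matching pattern wins for any given path.
/tensorflow/compiler/mlir/  @tensorflow/mlir-reviewers
/tensorflow/compiler/xla/   @tensorflow/xla-reviewers
/tensorflow/core/           @tensorflow/core-maintainers
```

A file like this doubles as documentation of the component boundaries, which is the segmentation question being debated here.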

@james-martens
Member

james-martens commented Oct 15, 2021 via email

@bhack
Contributor

bhack commented Oct 15, 2021

@james-martens I don't see your user name in any comment.

@james-martens
Member

james-martens commented Oct 15, 2021 via email

@bhack
Contributor

bhack commented Oct 15, 2021

OK, I think it was a typo from the alias auto-completion search: mark → mart.

@joker-eph
Contributor

For example, it is quite common that a large piece of software has many components, with different people working on those various components.

This is true, but if we took more care in the maintainership of our CODEOWNERS file it could be easier to identify a one-PR-per-component logic:

https://github.com/tensorflow/tensorflow/blob/master/CODEOWNERS

Currently it seems to me quite partial.

Thanks @bhack, I wasn't aware of this file. There is another tool we have that auto-assigns based on path (for example, I get all reviews in tensorflow/compiler/mlir auto-assigned). Seems like we could use this CODEOWNERS file instead!

@mihaimaruseac
Collaborator

The auto-assignment is via a GitHub bot configured by the GitHub team (cc @gbaned): https://github.com/tensorflow/tensorflow/blob/master/.github/bot_config.yml

@bhack
Contributor

bhack commented Oct 15, 2021

Thanks @bhack, I wasn't aware of this file. There is another tool we have that auto-assigns based on path (for example, I get all reviews in tensorflow/compiler/mlir auto-assigned). Seems like we could use this CODEOWNERS file instead!

I think it could be useful to maintain a reference GitHub account for every folder; if not in CODEOWNERS (as we don't want to notify team members directly), then in another reference file.

This could help us write, in the contribution guide, how to segment a code contribution into multiple PRs (like this one), one per team review unit, if and when this is possible.

@bhack
Contributor

bhack commented Oct 15, 2021

This was transformed into 116 open PRs.
I suggest that they could be aggregated by folder, as this approach is also going to spam CI resources.

@bhack
Contributor

bhack commented Oct 15, 2021

Just to make an example:

git fetch origin pull/51739/head:reserve
git diff --dirstat=files,0 reserve
   0.3% tensorflow/c/eager/
   0.3% tensorflow/c/experimental/filesystem/plugins/gcs/
   1.9% tensorflow/c/
   0.7% tensorflow/cc/gradients/
   0.3% tensorflow/cc/ops/
   0.7% tensorflow/cc/saved_model/
   0.3% tensorflow/compiler/aot/
   2.2% tensorflow/compiler/jit/
   1.1% tensorflow/compiler/mlir/hlo/include/mlir-hlo/Dialect/mhlo/IR/
   0.3% tensorflow/compiler/mlir/hlo/lib/Analysis/
   0.3% tensorflow/compiler/mlir/hlo/
   0.3% tensorflow/compiler/mlir/lite/ir/
   0.3% tensorflow/compiler/mlir/lite/transforms/
   0.3% tensorflow/compiler/mlir/python/
   1.1% tensorflow/compiler/mlir/tensorflow/ir/
   2.2% tensorflow/compiler/mlir/tensorflow/tests/
  13.3% tensorflow/compiler/mlir/tensorflow/transforms/
   0.3% tensorflow/compiler/mlir/tensorflow/utils/
   0.3% tensorflow/compiler/mlir/tensorflow/
   0.3% tensorflow/compiler/mlir/tfr/examples/mnist/
   0.3% tensorflow/compiler/mlir/tfr/integration/
   0.3% tensorflow/compiler/mlir/tfrt/jit/
   0.3% tensorflow/compiler/mlir/tfrt/python_tests/regression_tests/
   0.3% tensorflow/compiler/mlir/tfrt/python_tests/
   0.3% tensorflow/compiler/mlir/tfrt/tests/tf_to_corert/
   0.7% tensorflow/compiler/mlir/tfrt/tests/tf_to_tfrt_data/
   0.3% tensorflow/compiler/mlir/tfrt/transforms/lmhlo_to_gpu/
   0.3% tensorflow/compiler/mlir/tfrt/
   1.5% tensorflow/compiler/mlir/tools/kernel_gen/transforms/
   0.3% tensorflow/compiler/mlir/tools/kernel_gen/
   0.3% tensorflow/compiler/mlir/xla/experimental/conv_emitter/
   0.3% tensorflow/compiler/mlir/xla/ir/
   0.3% tensorflow/compiler/mlir/xla/tests/
   0.7% tensorflow/compiler/mlir/xla/transforms/
   0.3% tensorflow/compiler/mlir/
   0.3% tensorflow/compiler/tests/
   1.1% tensorflow/compiler/tf2tensorrt/convert/
   0.3% tensorflow/compiler/tf2tensorrt/kernels/
   4.1% tensorflow/compiler/tf2xla/kernels/
   0.3% tensorflow/compiler/tf2xla/lib/
   2.2% tensorflow/compiler/tf2xla/
   2.2% tensorflow/compiler/xla/client/lib/
   1.5% tensorflow/compiler/xla/client/
   0.3% tensorflow/compiler/xla/pjrt/
   0.7% tensorflow/compiler/xla/python/tpu_driver/
   1.5% tensorflow/compiler/xla/python/xla_extension/
   1.5% tensorflow/compiler/xla/python/
   1.9% tensorflow/compiler/xla/service/cpu/
   0.7% tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/
   1.1% tensorflow/compiler/xla/service/gpu/tests/
   2.2% tensorflow/compiler/xla/service/gpu/
   0.3% tensorflow/compiler/xla/service/interpreter/
   1.9% tensorflow/compiler/xla/service/spmd/
  12.9% tensorflow/compiler/xla/service/
   4.1% tensorflow/compiler/xla/tests/
   1.9% tensorflow/compiler/xla/
   0.3% tensorflow/compiler/xrt/kernels/
   0.3% tensorflow/compiler/xrt/tests/
   0.3% tensorflow/core/common_runtime/eager/
   0.3% tensorflow/core/common_runtime/
   0.3% tensorflow/core/distributed_runtime/eager/
   0.3% tensorflow/core/framework/
   0.3% tensorflow/core/kernels/
   0.3% tensorflow/core/ops/
   0.7% tensorflow/core/profiler/utils/
   1.1% tensorflow/core/protobuf/
   0.3% tensorflow/core/public/
   1.1% tensorflow/core/runtime_fallback/kernel/
   0.3% tensorflow/core/runtime_fallback/runtime/
   0.3% tensorflow/core/runtime_fallback/
   0.3% tensorflow/go/op/
   0.3% tensorflow/lite/kernels/
   0.3% tensorflow/lite/toco/graph_transformations/
   0.3% tensorflow/python/autograph/impl/
   0.3% tensorflow/python/compat/
   0.3% tensorflow/python/data/ops/
   0.3% tensorflow/python/distribute/integration_test/
   1.1% tensorflow/python/distribute/
   1.5% tensorflow/python/eager/
   1.1% tensorflow/python/framework/
   0.7% tensorflow/python/keras/estimator/
   0.3% tensorflow/python/keras/
   0.3% tensorflow/python/training/
   0.3% tensorflow/python/types/
   0.7% tensorflow/tools/api/golden/v1/
   0.7% tensorflow/tools/api/golden/v2/
   0.3% tensorflow/tools/api/lib/
   0.7% tensorflow/tools/ci_build/
   0.7% tensorflow/
   0.3% third_party/llvm/
   0.3% third_party/tf_runtime/

So what kind of PR clustering by folder do you suggest?

  1. tensorflow/c
  2. tensorflow/cc
  3. tensorflow/compiler/jit/
  4. tensorflow/compiler/mlir
  5. tensorflow/compiler/xla
  6. ....ETC...
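A mechanical first cut at such a clustering can be derived from the changed paths themselves, e.g. by truncating each path to its first two components and deduplicating. A rough sketch on a few sample paths (the depth cutoff of two is only a heuristic; as discussed in this thread, the right split is ultimately semantic):

```shell
#!/bin/sh
# In the real workflow the path list would come from
# `git diff --name-only master..reserve`; sample paths are used here.
printf '%s\n' \
  tensorflow/c/eager/a.cc \
  tensorflow/cc/gradients/b.cc \
  tensorflow/compiler/jit/c.cc \
  tensorflow/compiler/mlir/d.cc \
| cut -d/ -f1-2 | sort -u
# Output:
# tensorflow/c
# tensorflow/cc
# tensorflow/compiler
```

Each resulting prefix would then become one candidate PR; deeper cutoffs (`-f1-3`) would split tensorflow/compiler into jit, mlir, xla, etc.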

@bhack
Contributor

bhack commented Oct 16, 2021

P.S.: now we have 118 PRs.

@mihaimaruseac
Collaborator

5 PRs (compiler/mlir, compiler/xla, python, core, everything else) would be a good compromise, I think.

@SamuelMarks
Contributor Author

@mihaimaruseac So I made #52532 through:

git checkout master
git checkout -b 'tensorflow.compiler.xla'
git branch -a | grep -F ' tensorflow.compiler.xla' | xargs -n 1 git merge
git push --set-upstream offscale
gh pr create --title '[tensorflow/compiler/xla/**/*.cc] Add calls to `reserve()` before populating vectors' \
             --body '#51739#issuecomment-945027209 told me to merge into one PR per "large module/namespace"'

Is that correct? - If so, I'll do the same for the others.

PS: I purposefully didn't squash… do you want me to, or are you happy to just use the GitHub button shortcut?

@bhack
Contributor

bhack commented Oct 18, 2021

Now I think that your changes are better aggregated.

@SamuelMarks
Contributor Author

So what do you want me to do?

@bhack
Contributor

bhack commented Oct 18, 2021

So what do you want me to do?

That you run the tests for your PRs locally, as I've already commented at #52532 (comment).

@mihaimaruseac
Collaborator

Let's keep it now to one PR per file, as those have been reviewed already and are in the pipeline. Assuming they build, things should progress from here.

In the future though, please split per directory; #52532 is still quite large. Also, please run at least a bazel build locally with the PR to make sure it builds.

@bhack
Contributor

bhack commented Oct 18, 2021

If we adopt a split per directory as general advice, we are still going to generate 91 PRs in cases like this one. Not too different from the 118 PRs generated by one file per PR.

@joker-eph
Contributor

I don't think there is an automated way to tell where to split: this is a semantic kind of thing; for example, under tensorflow/compiler/ every single directory could be a separate component, and similarly under tensorflow/core, but not under tensorflow/c.
In general, with minimal judgement it isn't too hard to figure out a reasonable grouping.
It is also likely not the common case to have contributions from people who don't really understand the software's high-level component organization.

@bhack
Contributor

bhack commented Oct 18, 2021

I don't think there is an automated way to tell where to split: this is a semantic kind of thing; for example, under tensorflow/compiler/ every single directory could be a separate component, and similarly under tensorflow/core, but not under tensorflow/c.

That's why in my previous comment I preferred to have a reference file on GitHub with a reference team or user account for each component. You could also use a file different from CODEOWNERS if you don't want to be notified with that logic.

But in that way we are at least aware of how the code is organized on your side at a semantic level.

@mihaimaruseac
Collaborator

I think CODEOWNERS is orthogonal: one is for automatic assignment of reviewers, the other is about using judgement to split large PRs.

@bhack
Contributor

bhack commented Oct 18, 2021

But right now we partially use the default GitHub CODEOWNERS assignment/notification logic, while in other cases we use a more complex triaging/notification/assignment logic, as you mentioned in #51739 (comment).

What I meant here is that, as we don't have a unique traditional assignment file, it would be nice to have a file in the repository where we maintain the proxy ownership and segmentation of folders/components. With a project as large as TensorFlow this is also not documented on the official website, so we don't have any source on this topic.

If you think that the community would then abuse notifications on these GitHub aliases, you could also omit them and just push, in that file, an overview of the folder/component semantics.

@mihaimaruseac
Collaborator

So most PRs have been merged. Can you go through the comments on the remaining ones and try to fix them? There are a few with no comments that are going through the pipeline right now, but given we've had 100 PRs this resulted in somewhat of a denial of service on the CI runners, so it will take a while.

@SamuelMarks
Contributor Author

Happy to field-test your CI runners =)

@mihaimaruseac Just double-checked and did a bunch of comments & commits. That should cover all open (and a couple of closed) PRs.

copybara-service bot pushed a commit to tensorflow/mlir-hlo that referenced this pull request Oct 26, 2021
…is.cc] Add calls to `reserve()` before populating vector

Imported from GitHub PR tensorflow/tensorflow#52466

tensorflow/tensorflow#51739 (comment) told me to split the larger PR into one PR per file; thus this (thanks `bash`, `git` and `gh`!)
Copybara import of the project:

--
cc99fd8ad768f2354674d130357a3efcce9ba475 by Samuel Marks <807580+SamuelMarks@users.noreply.github.com>:

[tensorflow/compiler/mlir/hlo/lib/Analysis/userange_analysis.cc] Add calls to `reserve()` before populating vectors

PiperOrigin-RevId: 405540491