Add calls to `reserve()` before populating vectors
#51739
…sorflow/cc/ops/while_loop.cc,tensorflow/compiler/jit/extract_outside_compilation_pass.cc,tensorflow/compiler/mlir/tools/kernel_gen/transforms/gpu_kernel_to_blob_pass.cc] Space height
…vice/hlo_instruction.cc}] Correctly constrain vectors to explicit reserves
Following the huge amount of interest from @sanjoy @kkimdev @sherhut @qqfish and @joker-eph—over the past 5 weeks—I merged in the latest master and reserved vector allocation to another 82 files in the TensorFlow codebase. |
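For readers skimming the thread, the kind of change under discussion can be sketched as follows. This is a minimal illustration only: `MakeLabels` and the `node_` prefix are hypothetical and are not taken from the TensorFlow sources. It assumes the final element count is known before the loop starts, which is the case the PR targets.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not TensorFlow code: builds one label per input id.
std::vector<std::string> MakeLabels(const std::vector<int>& ids) {
  std::vector<std::string> labels;
  // The final size is known up front, so request the storage once instead of
  // letting push_back grow the buffer step by step.
  labels.reserve(ids.size());
  assert(labels.capacity() >= ids.size());
  for (int id : ids) {
    labels.push_back("node_" + std::to_string(id));
  }
  return labels;
}
```

Calling `MakeLabels({1, 2, 3})` returns three strings; the only difference from the version without `reserve()` is that the loop does not need to reallocate.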
Nice! I had missed this pull-request originally. It is likely that I read the title as some spam somehow though, can you title this explicitly: |
reserve()
before populating vectors
Sure thing @joker-eph; and whilst I was at it I finished the cc files in the compiler dir |
…l_ir_emitter}.cc,tensorflow/compiler/tf2xla/kernels/cross_op.cc] Properly reserve vector space
There is a wide spectrum between "everything in one PR" and "one PR per file": you went from one extreme to the other ; I'm saying there is also the possibility to exercise some reasonable judgement. (I'm puzzled how you still like the one big PR after suffering through rebase / merge conflicts... the larger the code change the more likely it is to suffer through these). |
This is true, but if we took more care in maintaining our CODEOWNERS file it could be easier to identify a one-PR-per-component logic: https://github.com/tensorflow/tensorflow/blob/master/CODEOWNERS Currently it seems quite partial to me. |
Hi,
I'm not sure why I was added to this thread. Perhaps someone typed the wrong name?
James
|
@james-martens I don't see your user name in any comment. |
It was in Thea Lamkin's message from yesterday:
"@james-martens I recommend you follow @akuegel and @joker-eph's request to break this PR up, at least by narrowing down to your working changes."
|
OK I think it was a typo searching for the alias auto-completion |
Thanks @bhack, I wasn't aware of this file. There is another tool we have that auto-assigns based on path (for example, I get all reviews in tensorflow/compiler/mlir auto-assigned). Seems like we could use this CODEOWNERS file instead! |
The auto-assignment is done by a GitHub bot configured by the GitHub team (cc @gbaned): https://github.com/tensorflow/tensorflow/blob/master/.github/bot_config.yml |
I think it could be useful to maintain a reference GitHub account for every folder; if not in CODEOWNERS, as we don't want to notify team members directly, then in another reference file. This could help us to write in the contribution guide how to segment a code contribution into multiple PRs, like this one, by single team review unit, if and when this is possible. |
This was transformed into 116 open PRs. |
Just to make an example:
So what kind of PR clustering on folders do you suggest?:
|
P.s. now we have 118 PRs |
5 PRs ( |
@mihaimaruseac So I made #52532 through:
git checkout master
git checkout -b 'tensorflow.compiler.xla'
git branch -a | grep -F ' tensorflow.compiler.xla' | xargs -n 1 git merge
git push --set-upstream offscale
gh pr create --title '[tensorflow/compiler/xla/**/*.cc] Add calls to `reserve()` before populating vectors' \
  --body '#51739#issuecomment-945027209 told me to merge into one PR per "large module/namespace"'
Is that correct? If so, I'll do the same for the others. PS: I purposefully didn't squash… do you want me to, or are you happy to just use the GitHub button shortcut? |
Now I think that your changes are better aggregated. |
So what do you want me to do? |
That you run tests for your PR locally as I've already commented at #52532 (comment) |
Let's keep it now to one PR per file as those have been reviewed already and are in the pipeline. Assuming they build things should progress from here. In the future though, please split per directories, #52532 is still quite large. Also, please run at least a |
If we take a split per directory as general advice, we are still going to generate 91 PRs in cases like this one. Not too different from the 118 PRs generated by one file per PR |
I don't think there is an automated way to tell where to split: this is a semantic kind of thing, for example under |
That's why in my previous comment I preferred to have a reference file on GitHub with a GitHub reference team or user account for each component. You could also use a file different from But in that way we are almost aware how the code is organized on your side at a semantic level. |
I think CODEOWNERS is orthogonal. One is for automatic assignment to reviewers, the other is using judgement to split large PRs. |
But also now we partially use the What I meant here is that, as we don't have a unique traditional assignment file, it would be nice to have a file in the repository where we maintain the proxy ownership and segmentation of folders/components for a project as large as TensorFlow, as this is also not documented on the official website, so we don't have any source on this topic. If you think that the community will then abuse notifications on these GitHub aliases, I think that you could also omit them and just push, in that file, an overview of the folder/component semantics. |
So most PRs have been merged. Can you go through the comments on the remaining ones and try to fix them? There are a few with no comments that are going through the pipeline right now but given we've had 100 PRs this resulted in somewhat of a denial of service on CI runners so it will take a while. |
Happy to field test your CI runners =) @mihaimaruseac Just double-checked and did a bunch of comments & commits. That should cover all open—and a couple of closed—PRs. |
…is.cc] Add calls to `reserve()` before populating vector Imported from GitHub PR tensorflow/tensorflow#52466 tensorflow/tensorflow#51739 (comment) told me to split the larger PR into one PR per file; thus this (thanks `bash`, `git` and `gh`!) Copybara import of the project: -- cc99fd8ad768f2354674d130357a3efcce9ba475 by Samuel Marks <807580+SamuelMarks@users.noreply.github.com>: [tensorflow/compiler/mlir/hlo/lib/Analysis/userange_analysis.cc] Add calls to `reserve()` before populating vectors PiperOrigin-RevId: 405540491
…and am at capacity
PS: WiP. Will finish going through your codebase, adding capacity hints to all vectors with an obvious opportunity for this optimisation.
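As a rough way to see why a capacity hint counts as an optimisation, the sketch below counts how often a vector's capacity changes while elements are appended, with and without a prior `reserve()`. `CountReallocations` is a hypothetical name introduced for illustration, and watching `capacity()` is used here as a stand-in for observing reallocations; it is not how the PR itself was measured.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends n ints and returns how many times the
// vector's capacity changed during the appends.
std::size_t CountReallocations(std::size_t n, bool reserve_first) {
  std::vector<int> v;
  if (reserve_first) {
    v.reserve(n);  // single up-front allocation, made before counting starts
  }
  std::size_t reallocations = 0;
  std::size_t last_capacity = v.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    v.push_back(static_cast<int>(i));
    if (v.capacity() != last_capacity) {
      // The buffer grew, which means the elements were moved to new storage.
      ++reallocations;
      last_capacity = v.capacity();
    }
  }
  assert(v.size() == n);
  return reallocations;
}
```

With `reserve_first` set, the standard guarantees that no reallocation happens until the size exceeds the reserved capacity, so the count is zero. Without it, the count depends on the implementation's growth factor.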