Update the instructions for upgrade of LLVM/Clang. #1146
Also added a new file LLVM-Upgrade-Notes.md to track important information related to upgrades.
You can now branch your baseline branches to create a new master branch:

git checkout -b updated-master
git merge updated_baseline
Suggested change: replace `git merge updated_baseline` with `git merge --ff-only updated_baseline`.
We expect `updated_baseline` to be ahead of `baseline`. If it's not, we should flag the problem for investigation rather than generate a merge commit.
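To illustrate the point about `--ff-only`, here is a minimal sketch in a throwaway repository. The branch names `baseline`, `updated_baseline`, and `updated-master` follow the document; the `diverged` branch, the empty commits, and the demo identity are assumptions made purely for illustration. It shows that `git merge --ff-only` advances a branch when it is strictly behind, and refuses (instead of creating a merge commit) when the histories have diverged.

```shell
set -eu

# Throwaway repository; nothing here touches a real checkout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/baseline   # name the initial branch explicitly
git config user.email "demo@example.com"    # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false

git commit -q --allow-empty -m "LLVM 11 baseline"

# updated_baseline is strictly ahead of baseline.
git checkout -q -b updated_baseline
git commit -q --allow-empty -m "LLVM 12 baseline"

# Case 1: the new master branch starts at baseline, so a fast-forward succeeds.
git checkout -q -b updated-master baseline
git merge -q --ff-only updated_baseline
ff_result=ok

# Case 2: a branch with its own commit has diverged, so --ff-only refuses.
git checkout -q -b diverged baseline
git commit -q --allow-empty -m "stray commit"
if git merge --ff-only updated_baseline >/dev/null 2>&1; then
  diverged_result=merged
else
  diverged_result=refused
fi

echo "fast-forward case: $ff_result, diverged case: $diverged_result"
```

In the refused case the branch is left untouched, which is the signal to stop and investigate.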
branch into `master` and push to the remote repository.
```
git checkout master
git merge updated_baseline_master_12
Suggested change: replace `git merge updated_baseline_master_12` with `git merge --ff-only updated_baseline_master_12`.
You probably also want to make sure that this command advances `master` exactly to `updated_baseline_master_12` (which was tested in the previous step) rather than generating a merge commit (which could fail the tests if Checked C feature development on `master` interfered somehow with the baseline upgrade).
which was merged into `updated_baseline_master_12`.
- The pristine LLVM/Clang source at <branch-point-of-11-on-main>
- The pristine LLVM/Clang source at <branch-point-of-12-on-main>
I wholeheartedly agree with the value of being able to see all three versions of a file involved in a merge. However, the description of the three versions here is accurate only for the initial merge from LLVM 11 to 12 in phase 1 step 6. For a merge of additional Checked C feature development in phase 1 step 8, the relevant versions are `master`, the merge of the previous `master` with the baseline LLVM 12 source that was completed in step 6 or an earlier iteration of step 8, and the previous `master` that was merged.
We are already assuming that `git merge` correctly identifies the three file versions to merge for files without conflicts, so for files that have pending conflicts, it is probably best to view the three file versions already identified by Git rather than try to look them up manually (which could be prone to mistakes/misunderstandings). I know of a few ways to do that, which might be worth documenting in this file:
- Best: Run `git mergetool`, assuming you have a suitable merge tool installed and configured. I currently use kdiff3, but I don't know if it works on Windows or if other tools are better.
- Set `git config merge.conflictstyle diff3` before running `git merge` so that the conflict markup written to the file shows all three versions rather than just two. I recommend leaving this option on all the time; I've found that the extra information never hurts.
- Run `git show :1:clang/path/to/some/File.cpp >clang/path/to/some/File.1.cpp` (and similarly with 2 or 3 in place of 1) to write out the three versions for inspection with other tools.
```
9. After all the merge conflicts and test failures on the local machine are
fixed, push the `baseline` and `updated_baseline_master_12` branches to the
remote repository and excute the ADO tests on `updated_baseline_master_12`.
Suggested change: replace "excute" with "execute":
remote repository and execute the ADO tests on `updated_baseline_master_12`.
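The earlier suggestion in this thread about viewing the three file versions that Git has already identified can be tried out in a scratch repository. The sketch below is only an illustration: the file name `File.cpp`, its one-line contents, and the demo identity are assumptions, while the branch names, the `merge.conflictstyle diff3` setting, and the `git show :N:path` form come from the review comment. Stage 1 is the common ancestor, stage 2 is the current branch, and stage 3 is the branch being merged.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false
git config merge.conflictstyle diff3       # show the ancestor in conflict markers

# Common ancestor.
echo "base" > File.cpp
git add File.cpp
git commit -q -m "base"

# Upgrade side.
git checkout -q -b updated_baseline_master_12
echo "ours" > File.cpp
git commit -q -am "upgrade side"

# Feature development side.
git checkout -q master
echo "theirs" > File.cpp
git commit -q -am "feature side"

# Merge master into the upgrade branch; a conflict is expected here.
git checkout -q updated_baseline_master_12
git merge master >/dev/null 2>&1 || true

# Write out the three versions recorded in the index for the conflicted file.
git show :1:File.cpp > File.1.cpp   # common ancestor
git show :2:File.cpp > File.2.cpp   # current branch (HEAD)
git show :3:File.cpp > File.3.cpp   # branch being merged (MERGE_HEAD)

cat File.cpp   # conflict markup, including the ancestor section
```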
**Phase 1**: We upgrade the `master` branch of the `checkedc-clang` repository
up to the commit hash on the `main` branch of the LLVM/Clang repository at which
the branch `release/12.x` is created - we shall refer to this commit hash as
<branch-point-of-12-on-main>. Note that <branch-point-of-12-on-main>
Current LLVM practice seems to be to create a tag `llvmorg-13-init` referring to the commit after the branch point for the LLVM 12 release, so you should be able to use `llvmorg-13-init^` as a commit expression referring to the branch point (`^` means "previous commit") rather than looking up the commit ID manually.
We are not looking up the commit ID manually - we used the command `git merge-base main release/12.x` in a clone of the pristine LLVM/Clang repository. I have updated this in the document (I had missed it earlier).
Additionally,
- the commands `git merge-base main release/12.x` and `git show llvmorg-12-init^` show different commit hashes, and
- we want <branch-point-of-12-on-main> to be a commit hash on the `main` branch.
Therefore, the document currently does not recommend the use of `llvmorg-12-init^` as <branch-point-of-12-on-main>. We may investigate this further during the next LLVM/Clang upgrade.
All other review comments have been incorporated.
> - the commands `git merge-base main release/12.x` and `git show llvmorg-12-init^` show different commit hashes, and

That should be `git show llvmorg-13-init^`, which is indeed the same as `git merge-base main release/12.x`. Still, `git merge-base main release/12.x` may be a clearer way to express what we want; good idea.
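The relationship between the two expressions can be checked on a toy history. This is a sketch under stated assumptions: the repository below is a stand-in built from empty commits, with a `main` branch, a `release/12.x` branch, and an `llvmorg-13-init` tag placed on the first `main` commit after the branch point, as the comment above describes. It says nothing about the real upstream history beyond that description.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false

# The commit at which the release branch is created.
git commit -q --allow-empty -m "last commit before branching"
git branch release/12.x

# First commit on main after the branch point, carrying the init tag.
git commit -q --allow-empty -m "bump version to 13"
git tag llvmorg-13-init

# Both branches keep moving independently.
git checkout -q release/12.x
git commit -q --allow-empty -m "release branch fix"
git checkout -q main
git commit -q --allow-empty -m "more development on main"

branch_point=$(git merge-base main release/12.x)
tag_parent=$(git rev-parse 'llvmorg-13-init^')

echo "merge-base:       $branch_point"
echo "llvmorg-13-init^: $tag_parent"
```

Both commands print the same hash in this toy history. `git merge-base` states the intent directly and does not depend on a tagging convention.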
Okay! Thanks for the clarification.
the 12.0.0 release. For this, we need to get the commit hash associated with the
12.0.0 release from https://github.com/llvm/llvm-project/releases. Let this
commit hash be <12.0.0-commit-hash>. Note that it should be the full
commit hash. Resolve merge conflicts and test failures if any.
Suggested change: replace
the 12.0.0 release. For this, we need to get the commit hash associated with the
12.0.0 release from https://github.com/llvm/llvm-project/releases. Let this
commit hash be <12.0.0-commit-hash>. Note that it should be the full
commit hash. Resolve merge conflicts and test failures if any.
with
the 12.0.0 release, given by the `llvmorg-12.0.0` tag. Resolve
merge conflicts and test failures if any.
Another case where you can use a Git ref provided by LLVM rather than manually looking up a commit ID.
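If the full commit hash behind a release tag is still wanted (for example, to record it in the upgrade notes), Git can resolve it from the tag name. The sketch below uses a scratch repository with an annotated tag named `llvmorg-12.0.0`; whether the real upstream tag is annotated or lightweight is not stated in this thread, so treat that as an assumption. The `^{commit}` suffix yields the commit in either case.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false
git config tag.gpgsign false

git commit -q --allow-empty -m "release commit"
release_commit=$(git rev-parse HEAD)

# An annotated tag is its own object that points at the commit.
git tag -a llvmorg-12.0.0 -m "stand-in for the 12.0.0 release tag"

tag_object=$(git rev-parse llvmorg-12.0.0)
resolved_commit=$(git rev-parse 'llvmorg-12.0.0^{commit}')

echo "tag object:      $tag_object"
echo "resolved commit: $resolved_commit"
```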
commit hash. Resolve merge conflicts and test failures if any.
```
git remote add upstream https://github.com/llvm/llvm-project.git
git pull upstream <12.0.0-commit-hash>
Suggested change: replace `git pull upstream <12.0.0-commit-hash>` with `git pull upstream llvmorg-12.0.0`.
Thanks. This is good work! I identified a few more issues.
```
git checkout master
git merge updated_baseline_master_12
git merge -ff-only updated_baseline_master_12
Suggested change: replace `git merge -ff-only updated_baseline_master_12` with `git merge --ff-only updated_baseline_master_12`.
Checked C feature development in phase 1 step 8, the relevant versions are a
snapshot of branch `updated_baseline_master_12` after step 6 or an earlier
iteration of step 8 (i.e. the common ancestor), a snapshot of the current
`master` branch that is being merged, and a snapshot of the current
`updated_baseline_master_12` branch before starting the merge.
I don't think this description of the relevant versions is correct. See this diagram of two successive merges:
LLVM 11 -> master (step 6) -------------------------> master (step 8)
   |             |                                            |
   v             v                                            v
LLVM 12 -> ubm12 (after step 6) -> ubm12 (after step 7) -> ubm12 (after step 8)
IIUC, the versions in the document correspond to "ubm12 (after step 6)", "master (step 8)", and "ubm12 (after step 7)". But from the diagram, it seems clear that the common ancestor is "master (step 6)" and the other two versions are "master (step 8)" and "ubm12 (after step 7)".
Perhaps this illustrates the point that we should use the three versions identified by Git rather than trying to determine them manually and potentially making mistakes. (Update: In fact, I made a mistake above! I think I fixed it now.) I realize now that it's possible to get the commit IDs during the merge via `$(git merge-base HEAD MERGE_HEAD)`, `HEAD`, and `MERGE_HEAD`. So we can still "use source snapshots" but query the commit IDs from the in-progress merge rather than determine them manually.
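As a rough illustration of querying the commit IDs from an in-progress merge, the sketch below sets up a conflicted merge in a scratch repository and then reads the three commits from `git merge-base HEAD MERGE_HEAD`, `HEAD`, and `MERGE_HEAD`. The file name `File.cpp`, its contents, and the demo identity are assumptions for the demo; only the three Git expressions come from the comment above. `MERGE_HEAD` exists only while a merge is in progress, which is why the conflict is needed.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false

echo "base" > File.cpp
git add File.cpp
git commit -q -m "common ancestor"
ancestor=$(git rev-parse HEAD)

git checkout -q -b updated_baseline_master_12
echo "upgrade" > File.cpp
git commit -q -am "upgrade side"

git checkout -q master
echo "feature" > File.cpp
git commit -q -am "feature side"
master_tip=$(git rev-parse HEAD)

# Start the merge; the conflict leaves it in progress, so MERGE_HEAD exists.
git checkout -q updated_baseline_master_12
git merge master >/dev/null 2>&1 || true

base_id=$(git merge-base HEAD MERGE_HEAD)
ours_id=$(git rev-parse HEAD)
theirs_id=$(git rev-parse MERGE_HEAD)

# Source snapshots of one file at each of the three commits.
base_text=$(git show "$base_id:File.cpp")
ours_text=$(git show "$ours_id:File.cpp")
theirs_text=$(git show "$theirs_id:File.cpp")

echo "base=$base_text ours=$ours_text theirs=$theirs_text"
```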
Very true! I have incorporated your review comments. Thanks for the review!
Two more minor issues I noticed. Otherwise, the discussion of all the topics that I'm familiar with (generally, the `checkedc-clang` branching strategy and related procedures) looks good. There are some topics specific to the subject matter of `checkedc-clang` development that I'm not qualified to review.
CONFLICT (rename/delete): ... in master renamed to ... in HEAD. Version HEAD ...
left in tree.
CONFLICT (rename/delete): ... in master renamed to ... in HEAD. Version master
... left in tree.
Format this as a code block rather than letting the two lines run together in the paragraph?
repository.
```
git fetch upstream <cherry-pick-branch>
git cherry-pick <commit-to-cherry-pick>
Suggested change: replace `git cherry-pick <commit-to-cherry-pick>` with `git cherry-pick -x <commit-to-cherry-pick>`.
Recording `(cherry picked from commit X)` may be helpful. I think your team has done this in other contexts.
LGTM.
Two small typos - otherwise LGTM!
LLVM/Clang repository:

- Commit hash 0024efc69ea6cd0b630cd11cef5991b7edb73ffc on the `main` branch
of the upstream LLVM/Clang repo (See LLVm/Clang bug
Nit: `LLVm` => `LLVM`
git remote add upstream https://github.com/llvm/llvm-project.git
git pull upstream <branch-point-of-12-on-main>
```
3. Execute LLVM/Clang tests on the branch `updated_baseline` on your local
Nit: extra space between "Execute" and "LLVM/Clang"
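For the `git cherry-pick -x` suggestion made earlier in this review, the effect of `-x` can be seen in a scratch repository. This is a minimal sketch: the `cherry-source` branch, the `fix.txt` file, and the demo identity are hypothetical stand-ins for the upstream branch and commit the document refers to as `<cherry-pick-branch>` and `<commit-to-cherry-pick>`.

```shell
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"   # hypothetical identity for the demo
git config user.name "Demo"
git config commit.gpgsign false

git commit -q --allow-empty -m "base"

# A commit on another branch, standing in for the upstream fix.
git checkout -q -b cherry-source
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "upstream fix"
picked=$(git rev-parse HEAD)

# Cherry-pick it onto master, recording where it came from.
git checkout -q master
git cherry-pick -x "$picked" >/dev/null

# The new commit message ends with a provenance line naming the original commit.
git log -1 --format=%B
```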
* Revert "[BoundsWidening] Determine checked scope specifier per statement (#1139)" (#1141) This reverts commit 980321d.
* Determine checked scopes per statement (#1142) We introduce a 2-bit field called CheckedScopeSpecifier in the Stmt class. During parsing when a compound statement is created we iterate the elements (statements) of the compound statement and set the checked scope specifier for each element to the checked scope specifier of the compound statement. We can get the checked scope specifier for a statement by calling the getCheckedScopeSpecifier method on the statement.
* Update the instructions for upgrade of LLVM/Clang. (#1146)
* Updated the instructions for upgrade of LLVM/Clang. Also added a new file LLVM-Upgrade-Notes.md to track important information related to upgrades.
* Fixed typos.
* Addressed review comments.
* Fixed an inadvertent deletion.
* Addressed review comments.
* Incorporated review comments.
* Fixed minor typos.
* Fixed typos.
* Add new flags for available facts analysis
* Add the analysis into the build script and the sema bounds
* Add utility functions to check whether a var is used in a Expr and a BoundsExpr
* Add AbstractFact as a basic available fact; Add InferredFact and adjust WhereClauseFact to be a subclass of AbstractFact
* Add data structures used in the analysis
* Add print and dump functions
* Add utility functions which are also used by BoundsWideningAnalysis.
* Add other utility functions. `IsSwitchCaseBlock`: use `dyn_cast_or_null` to cover the null pointer case. `ConditionOnEdge`: do not test if there is no edge between pred to curr since it will only be called if there is an edge. `GetModifiedVars`: use `TranspareCasts` to bypass some casting. The feature to deal with membership access and the array indexing is still TODO.
* Add fact comparison and fact-related set operations (contains TODO).
* Add test cases (one covers basic features and the other is converted from the previous available facts analysis)
* Dataflow analysis: Add statement-based Gen/Kill.
* Dataflow analysis: Add block-edge-based Gen set.
* Dataflow analysis: Add function to compute In and Out set.
* Dataflow analysis: Add worklist algorithm.
* Add destructors to release the memory
* Fix: modify the Gen/Kill rules to match the design doc; It also fixes a bug to visit dead blocks.
* Cleanup comments
* Fix: use the existing functions to find a `VarDecl` in an expr
* Change the equal check on fact collections to equal size check
* Update the testcases with the updated Gen/Kill
* Remove debug flag for available facts.
* Use lexco-compare for `EqualityOpFact` and `InferredFact`.
* Add a map to store the comparison results of facts.
* Change the source location of a fact to its near expr.
* Use a dedicated list to collect created facts and clean them finally.
* Verify if an expr contains errors before checking invertibility (#1154) The community has introduced a new annotation called "contains-errors" on AST nodes that contain semantic errors. As a result, after the upgrade of Checked C sources to LLVM 12 we need to check if an expr contains errors before operating on the expr. One such place is in InverseUtil::IsInvertible where we need to check if the input modifying expr contains errors.
* Added containsErrors checks to InverUtil::Inverse
* [BoundsWidening] Handle complex conditionals in bounds widening (#1149) Support bounds widening in presence of complex conditionals like: "if (*p != 0)", "if ((c = *p) == 'a')", etc.
* Don't record temporary equality between expressions such as x and x + 1 in TargetSrcEquality (#1162)
* Add AllowTempEquality parameter to RecordEqualityWithTarget
* Use a ModifiedSameValue variable to determine the return value for UpdateSameValueAfterAssignment
* Rename ModifiedSameValue to RemovedAnyExprs and clean up comments
* Treat address-of array subscripts the same way as address-of dereferences (#1163)
* In CheckAddressOfOperand, add case for address-of array subscripts to C99-specific logic
* Move address-of array subscript check after other checks such as taking the address of an lvalue
* Adjust expected AST output to account for different types of address-of array subscripts
* Restore deleted comment about checking for array subscript expressions
* Add comment explaining the placement of the address-of array subscript logic
* Put &e1[e2] typing rules under a Checked C flag
* Update the available facts analysis.

Co-authored-by: Mandeep Singh Grang <magrang@microsoft.com>
Co-authored-by: Sulekha Kulkarni <Sulekha.Kulkarni@microsoft.com>
Co-authored-by: Katherine Kjeer <6687333+kkjeer@users.noreply.github.com>
This PR updates the instructions to perform an LLVM/Clang upgrade. It also contains a new file called LLVM-Upgrade-Notes.md to track important information related to upgrades.