Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

execute-as-user code change #517

Closed
wants to merge 16 commits into from

Conversation

johnyu0520
Copy link
Contributor

No description provided.

John Yu and others added 15 commits September 14, 2015 12:08
2. modified ProcessJob to call execute-as-user
3. modified AzkabanProcess so that log consumption starts earlier (and don't get gobbled up during execution failures)
4. Made type=command read sysProps as well
… default. 2. Changing the C code to print less verbose log outputs
Update Gradle wrapper version to 2.7.
* Add gradle-errorprone-plugin to build.gradle and enable for all Java builds.
* Fix compile-time errors raised by error-prone.
* Add @ignore to PythonJobTests.testPythonJob because it appeared to hang.

Fixes azkaban#501
Conflicts:
	azkaban-common/src/main/java/azkaban/jobExecutor/ProcessJob.java
@@ -101,7 +101,7 @@
private ExecutingManagerUpdaterThread executingManager;
// 12 weeks
private static final long DEFAULT_EXECUTION_LOGS_RETENTION_MS = 3 * 4 * 7
* 24 * 60 * 60 * 1000l;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that your diff base is old since this patch contains a number of changes from #508. Can you please rebase?

@davidzchen
Copy link
Collaborator

Can you rebase against master instead? That way, you do not have all the old commits polluting your diff.

Ideally, I think it would be better to squash all of the commits into one instead of keeping all the intermediate patches:

git rebase -i master

@johnyu0520
Copy link
Contributor Author

Hey David,

Do you mean to merge the code from master branch? Or do you mean for me to squash my commits so the log looks more clean?

Thanks!
John

@davidzchen
Copy link
Collaborator

Hi John,

I took a look at your fork and the revision history for this PR. It looks like your master branch diverged from the azkaban/azkaban master a while ago and the commits for this PR are interleaved with more recent commits from upstream.

I actually tried cloning your fork of Azkaban, adding azkaban/azkaban as upstream, and doing git rebase -i upstream/master. Given the number of merge conflicts you would get compared to the size of your patch, I think it would be much simpler if you simply apply the net diff for this PR on top of a clean master branch that is identical to the upstream master.

Here is how you can do this:

  1. In your repository, assuming you have azkaban/azkaban added as upstream, first sync your upstream:

    git fetch upstream
    
  2. Create a new branch; let's call it clean-master:

    git checkout -b clean-master
    
  3. Reset this branch to make it identical to upstream/master:

    git reset --hard upstream/master
    
  4. Download the diff for this PR:

    wget https://github.com/azkaban/azkaban/pull/517.diff
    
  5. Apply the diff:

    git apply 517.diff
    
  6. The diff for your PR has now been applied on top of the clean-master branch, which is identical to upstream/master, but is not committed. Because you added azkaban-execserver/src/main/resources/execute-as-user.c as a new file, you would need to git add it. Now, remove the diff file, add execute-as-user.c, and commit:

    rm 517.diff
    git add azkaban-execserver/src/main/resources/execute-as-user.c
    git commit -am "<write your commit message here>"
    
  7. Now, you can make your master branch identical to clean-master and then do a force-push to push your squashed patch to this Pull Request:

    git checkout master
    git reset --hard clean-master
    git push origin master -f
    

Once you have done this:

  • This Pull Request will only contain one commit for your changes on top of master.
  • Your master branch will be identical to the upstream master with the addition of your patch.

Until this PR is merged, if there are new commits that go into master before your commit, I would recommend rebasing so that your patch will come after those new commits:

git fetch upstream
git rebase -i upstream/master

After your patch is merged, sync your master branch with upstream using:

git fetch upstream
git merge upstream/master

After your patch is merged, I would recommend doing development in feature branches in your fork and keeping your master branch identical to upstream/master at all times, and when new changes land in master, rebase your feature branch on top of master. To illustrate:

git checkout -b new-feature
# Do some work.
# Some new commits land in master.
# Fetch upstream and sync master with upstream:
git checkout master
git fetch upstream
git merge upstream/master
# Rebase your feature branch on top of master:
git checkout new-feature
git rebase master
# If there are any merge conflicts, resolve the merge conflicts and then run
# git rebase --continue until the rebase is successful.

This way, your master branch will always be identical to upstream, and your commits are always on top of master and not interleaved with older commits. It is good to rebase often, whenever new commits land in master. This will greatly reduce the number of painful merge conflicts and will allow you to work on more than one change by using different feature branches.

The only caveat is that if you want to push your feature branch to your remote branch (such as if you want to continue working on a different machine), you would need to do a force push after you do a rebase:

git push origin new-feature -f

I think force-pushes for feature branches in this case are fine, but force pushes for master branches should be avoided whenever possible. :)

Anyway, sorry for the hassle. I think we should try to rebase and squash commits before merging pull requests. This way, we would reduce the likelihood of particularly nasty merge conflicts, keep the revision history cleaner, and make it easier to track down why a certain line of code is changed. This is also closer to what Apache projects do.

Feel free to email me if you run into any issues. I am usually online in the evenings as well.

@johnyu0520
Copy link
Contributor Author

Hey David,

Thanks for your such detailed instructions! I've copied them down somewhere, as it will come in handy when merging into master :)

I'll do the squash, and that should reduce a lot of cluttering on the logs.
However, for the branch, I've purposely worked off multiexecutor-canary. That is the version currently deployed in Linkedin, and if I want to push my change, I'll need to build on top of multiexecutor-canary so that I can push my changes into Linkedin production.

We plan to merge multiexecutor-canary into master after my execute-as-user feature request. That will be another whole big operation, and we can watch out to use rebase at that time.

Thanks!
John

@davidzchen
Copy link
Collaborator

Thanks for the clarification. I didn't notice that your diff-base was multipleexecutors_canary and not master. That makes more sense now why the recent changes in master were showing up in this diff. :)

Sounds good. Feel free to let me know if you need any help with the rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants