This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

V1.8.x #19262

Merged on Oct 2, 2020 (8 commits)

Conversation

leezu
Contributor

@leezu leezu commented Oct 1, 2020

Use magit-cherry mode to check for commits that are present in v1.7.x but missing from v1.8.x. A "-" marker indicates that the commit is also present in v1.8.x, whereas a "+" marker indicates that the commit is missing.
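For readers without Emacs: magit-cherry is a front end to `git cherry`, which prints the same `+`/`-` markers. The following is a minimal sketch of running and parsing that output, not the workflow actually used in this PR. The helper names `parse_cherry` and `cherry` and the sample hashes and subjects are hypothetical.

```python
import subprocess


def parse_cherry(output):
    """Split `git cherry -v` output into (missing, present) lists of (sha, subject)."""
    missing, present = [], []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        marker, _, rest = line.partition(" ")
        sha, _, subject = rest.partition(" ")
        if marker == "+":
            # No equivalent patch found on the upstream branch.
            missing.append((sha, subject))
        elif marker == "-":
            # An equivalent patch already exists on the upstream branch.
            present.append((sha, subject))
    return missing, present


def cherry(upstream, head, repo="."):
    """Run `git cherry -v <upstream> <head>` in `repo` and parse the result."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return parse_cherry(result.stdout)


# Hypothetical sample output; hashes and subjects are made up for illustration.
SAMPLE = """\
- 1111111111111111111111111111111111111111 commit already applied on both branches
+ 2222222222222222222222222222222222222222 commit only on the older branch
"""

missing, present = parse_cherry(SAMPLE)
```

In a local clone, `cherry("v1.8.x", "v1.7.x")` would list the commits on v1.7.x, with the "+" entries being those that have no equivalent on v1.8.x. The branch names may need a remote prefix depending on how the clone is set up.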

[Image: Screenshot from 2020-09-30 20-14-33]

There are a couple of false positives (declared missing but actually present), as those commits were forward ported to v1.8.x in a squashed form. I went through all + marked commits and applied them to the v1.8.x branch.
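The false positives arise because `git cherry` matches commits by their diff, so a squashed forward port no longer matches the individual commits it contains. One possible way to pre-filter such cases is to compare the PR numbers that appear in commit subjects. This is a heuristic sketch only, not what was done in this PR, where the commits were checked by hand. The function names and the second sample subject are hypothetical.

```python
import re

# Matches PR references such as "(#18080)" or "(apache#18080)".
PR_REF = re.compile(r"\((?:[\w-]+)?#(\d+)\)")


def pr_numbers(text):
    """Return the set of PR numbers referenced in `text`."""
    return {int(n) for n in PR_REF.findall(text)}


def likely_false_positives(missing_subjects, target_log):
    """Return subjects whose PR numbers all appear somewhere in `target_log`.

    `missing_subjects` are subjects of commits reported as "+";
    `target_log` is the full commit log text of the target branch.
    """
    ported = pr_numbers(target_log)
    flagged = []
    for subject in missing_subjects:
        refs = pr_numbers(subject)
        if refs and refs <= ported:
            flagged.append(subject)
    return flagged


# Illustrative input: a squashed commit body on the target branch.
TARGET_LOG = """\
* add zero grad for npi_unique (#18080)
* fix np.clip scalar input case (#17788)
"""

MISSING = [
    "add zero grad for npi_unique (apache#18080)",
    "hypothetical change not yet ported (#99999)",  # made-up subject
]

flagged = likely_false_positives(MISSING, TARGET_LOG)
```

A matching PR number only suggests that a commit was forward-ported. The diff still needs a manual check, since a port may be partial or may have been adapted.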

The only conflict to resolve was in 8c7c2f1, which had to take into account the change made by ce0a518.

hanke580 and others added 5 commits October 1, 2020 03:24
* add zero grad for npi_unique (apache#18080)

* fix np.clip scalar input case (apache#17788)

* fix true_divide (apache#18393)

Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>
* Fix Windows GPU CI (apache#17962)

Update Windows CI to use VS 2019 and enable the x64 toolchain. Previously we were using an older 32-bit toolchain, which caused OOM errors during linking. Switching to the x64 toolchain on the older VS version previously used by the CI was attempted in apache#17912 and did not work. Update to CUDA 10.2, as it is required by VS 2019. Switch to ninja-build on Windows to speed up the build, as ninja-build is now preinstalled. Remove the logic to install cmake 3.16 on every PR, as cmake 3.17 is now preinstalled. Add build retries due to CUDA Thrust + VS2019 flakiness.

Co-authored-by: vexilligera <vexilligera@gmail.com>

* backport mixed type

Co-authored-by: Leonard Lausen <lausen@amazon.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>
… variable input shapes (apache#18632) (apache#18703)

* Fix the monitor_callback invalid issue during calibration with variable input shapes

* retrigger CI

* Add UT for monitor check and disable codecov

Co-authored-by: Tao Lv <tao.a.lv@intel.com>
@mxnet-bot

Hey @leezu, thanks for submitting the PR.
All tests are already queued to run once. If tests fail, you can trigger one or more tests again with the following commands:

  • To trigger all jobs: @mxnet-bot run ci [all]
  • To trigger specific jobs: @mxnet-bot run ci [job1, job2]

CI supported jobs: [website, centos-cpu, edge, miscellaneous, windows-cpu, unix-gpu, clang, sanity, unix-cpu, centos-gpu, windows-gpu]


Note:
Only the following 3 categories can trigger CI: PR Author, MXNet Committer, Jenkins Admin.
All CI tests must pass before the PR can be merged.

@sxjscience
Member

Also CC @samskalicky @sandeep-krishnamurthy

@sandeep-krishnamurthy
Contributor

Thank you so much @leezu

@sxjscience
Member

In fact, should we ensure that each release needs to include all the commits of the previous release? @samskalicky @sandeep-krishnamurthy

@samskalicky
Contributor

samskalicky commented Oct 1, 2020

> In fact, should we ensure that each release needs to include all the commits of the previous release? @samskalicky @sandeep-krishnamurthy

I like the idea of comparing to the previous release to ensure no PRs are missing (comparing backwards against all previous releases seems to be a bit too much, though). But there's already a lot of work required of the release manager: https://cwiki.apache.org/confluence/display/MXNET/Release+Process . It's already a significant time commitment.

Maybe we should consider adding instructions for committers/release managers to verify that commits to release branches (e.g. 1.7.x or 1.8.x) are just cherry-picks/ports of PRs from the base branch (i.e. 1.x), to avoid the problem in the first place.
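Such a verification could in principle be automated with `git cherry <base> <release>`, where a "+" line marks a release-branch commit that has no equivalent patch on the base branch. The sketch below assumes that output is already captured as text. The function names and the sample hashes are hypothetical, and squashed ports would still show up as false positives, as noted earlier in this thread.

```python
def commits_missing_from_base(cherry_output):
    """Return SHAs marked "+": on the release branch, no equivalent on the base branch."""
    return [
        line.split()[1]
        for line in cherry_output.splitlines()
        if line.startswith("+ ") and len(line.split()) >= 2
    ]


def check_release_branch(cherry_output):
    """Return 0 if every release-branch commit has an equivalent on base, else 1."""
    return 1 if commits_missing_from_base(cherry_output) else 0


# Hypothetical output of `git cherry <base> <release>`; the hashes are made up.
CLEAN = "- aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa\n"
DIRTY = CLEAN + "+ bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb\n"

clean_status = check_release_branch(CLEAN)
dirty_status = check_release_branch(DIRTY)
```

The return value is shaped like a process exit status so that a CI job could fail when a release branch carries a commit that never landed on the base branch.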

@sxjscience
Member

@samskalicky I agree. We should revise the guideline for cherry-picking commits. That means we need to cherry-pick both to the specific branch and to 1.x / 2.x in the future.

* Update to thrust 1.9.8 on Windows

* Remove debug logic
Updating thrust alone did not help. Similar issues (though less often) still
occur with updated thrust, and also with nvidia cub. Tracked upstream at
NVIDIA/thrust#1090
@leezu
Contributor Author

leezu commented Oct 2, 2020

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@leezu leezu added the pr-awaiting-review (PR is waiting for code review) label Oct 2, 2020
@leezu
Contributor Author

leezu commented Oct 2, 2020

@mxnet-bot run ci [unix-gpu]

@mxnet-bot

Jenkins CI successfully triggered : [unix-gpu]

@samskalicky
Contributor

samskalicky commented Oct 2, 2020

@leezu how do you wanna handle this? Should we merge this PR so we have a single commit we can cherry-pick to v1.x? Or do you wanna open a separate PR there first before merging this one? Just want to make sure we don't merge this into a feature branch without having it in the main v1.x branch again...

And what reviews do you need before merging this PR?

@samskalicky samskalicky merged commit 371b312 into apache:v1.8.x Oct 2, 2020
samskalicky pushed a commit to samskalicky/incubator-mxnet that referenced this pull request Oct 2, 2020
* Fix einsum gradient (apache#18482)

* [v1.7.x] Backport PRs of numpy features (apache#18653)

* add zero grad for npi_unique (apache#18080)

* fix np.clip scalar input case (apache#17788)

* fix true_divide (apache#18393)

Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>

* [v1.7.x] backport mixed type binary ops to v1.7.x (apache#18649)

* Fix Windows GPU CI (apache#17962)

Update Windows CI to use VS 2019 and enable the x64 toolchain. Previously we were using an older 32-bit toolchain, which caused OOM errors during linking. Switching to the x64 toolchain on the older VS version previously used by the CI was attempted in apache#17912 and did not work. Update to CUDA 10.2, as it is required by VS 2019. Switch to ninja-build on Windows to speed up the build, as ninja-build is now preinstalled. Remove the logic to install cmake 3.16 on every PR, as cmake 3.17 is now preinstalled. Add build retries due to CUDA Thrust + VS2019 flakiness.

Co-authored-by: vexilligera <vexilligera@gmail.com>

* backport mixed type

Co-authored-by: Leonard Lausen <lausen@amazon.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>

* revise activations (apache#18700)

* [v1.6] Fix the monitor_callback invalid issue during calibration with variable input shapes (apache#18632) (apache#18703)

* Fix the monitor_callback invalid issue during calibration with variable input shapes

* retrigger CI

* Add UT for monitor check and disable codecov

Co-authored-by: Tao Lv <tao.a.lv@intel.com>

* Fail build_windows.py if all retries failed (apache#18177)

* Update to thrust 1.9.8 on Windows (apache#18218)

* Update to thrust 1.9.8 on Windows

* Remove debug logic

* Re-enable build retries on MSVC (apache#18230)

Updating thrust alone did not help. Similar issues (though less often) still
occur with updated thrust, and also with nvidia cub. Tracked upstream at
NVIDIA/thrust#1090

Co-authored-by: Ke Han <38852697+hanke580@users.noreply.github.com>
Co-authored-by: Xingjian Shi <xshiab@connect.ust.hk>
Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>
Co-authored-by: Yijun Chen <chenyijun0902@gmail.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>
Co-authored-by: ciyong <ciyong.chen@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
samskalicky added a commit that referenced this pull request Oct 3, 2020
* Fix einsum gradient (#18482)

* [v1.7.x] Backport PRs of numpy features (#18653)

* add zero grad for npi_unique (#18080)

* fix np.clip scalar input case (#17788)

* fix true_divide (#18393)

Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>

* [v1.7.x] backport mixed type binary ops to v1.7.x (#18649)

* Fix Windows GPU CI (#17962)

Update Windows CI to use VS 2019 and enable the x64 toolchain. Previously we were using an older 32-bit toolchain, which caused OOM errors during linking. Switching to the x64 toolchain on the older VS version previously used by the CI was attempted in #17912 and did not work. Update to CUDA 10.2, as it is required by VS 2019. Switch to ninja-build on Windows to speed up the build, as ninja-build is now preinstalled. Remove the logic to install cmake 3.16 on every PR, as cmake 3.17 is now preinstalled. Add build retries due to CUDA Thrust + VS2019 flakiness.

Co-authored-by: vexilligera <vexilligera@gmail.com>

* backport mixed type

Co-authored-by: Leonard Lausen <lausen@amazon.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>

* revise activations (#18700)

* [v1.6] Fix the monitor_callback invalid issue during calibration with variable input shapes (#18632) (#18703)

* Fix the monitor_callback invalid issue during calibration with variable input shapes

* retrigger CI

* Add UT for monitor check and disable codecov

Co-authored-by: Tao Lv <tao.a.lv@intel.com>

* Fail build_windows.py if all retries failed (#18177)

* Update to thrust 1.9.8 on Windows (#18218)

* Update to thrust 1.9.8 on Windows

* Remove debug logic

* Re-enable build retries on MSVC (#18230)

Updating thrust alone did not help. Similar issues (though less often) still
occur with updated thrust, and also with nvidia cub. Tracked upstream at
NVIDIA/thrust#1090

Co-authored-by: Ke Han <38852697+hanke580@users.noreply.github.com>
Co-authored-by: Xingjian Shi <xshiab@connect.ust.hk>
Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>
Co-authored-by: Yijun Chen <chenyijun0902@gmail.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>
Co-authored-by: ciyong <ciyong.chen@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>

Co-authored-by: Leonard Lausen <lausen@amazon.com>
Co-authored-by: Ke Han <38852697+hanke580@users.noreply.github.com>
Co-authored-by: Xingjian Shi <xshiab@connect.ust.hk>
Co-authored-by: Hao Jin <hjjn.amzn@gmail.com>
Co-authored-by: Xi Wang <xidulu@gmail.com>
Co-authored-by: Yijun Chen <chenyijun0902@gmail.com>
Co-authored-by: vexilligera <vexilligera@gmail.com>
Co-authored-by: ciyong <ciyong.chen@intel.com>
Co-authored-by: Tao Lv <tao.a.lv@intel.com>
@leezu leezu deleted the v1.8.x branch October 5, 2020 15:34
Labels
pr-awaiting-review PR is waiting for code review

8 participants