{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":346494449,"defaultBranch":"master","name":"pytorch","ownerLogin":"lithuak","currentUserCanPush":false,"isFork":true,"isEmpty":false,"createdAt":"2021-03-10T21:17:13.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/225642?v=4","public":true,"private":false,"isOrgOwned":false},"refInfo":{"name":"","listCacheKey":"v0:1641830627.71297","currentOid":""},"activityList":{"items":[{"before":"0cfc5899f9bade72c7e18666e2006b003b5848bc","after":"0f88d93b10fbe715afa7affb7ee0a3da90f406cd","ref":"refs/heads/master","pushedAt":"2023-09-09T13:37:47.000Z","pushType":"push","commitsCount":352,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"decomposition spectral ops fixes (#108360)\n\nFixes https://github.com/pytorch/pytorch/issues/105986, https://github.com/pytorch/pytorch/issues/108204, https://github.com/pytorch/pytorch/issues/108205\n\nFix all issues flagged when making changes for https://github.com/pytorch/pytorch/pull/107421\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/108360\nApproved by: https://github.com/ezyang","shortMessageHtmlLink":"decomposition spectral ops fixes (pytorch#108360)"}},{"before":"5251ae6fb7995f0d26f2438179fb5d5a75b9eafd","after":"d4230e55748c66c72e7a17b1cd08540b742b20a5","ref":"refs/heads/viable/strict","pushedAt":"2023-09-09T13:37:38.000Z","pushType":"push","commitsCount":365,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"reland [finishing colesbury's PR 100642] Guard on nn.Module dicts and type (#108883)\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/108883\nApproved by: https://github.com/voznesenskym, https://github.com/huydhn","shortMessageHtmlLink":"reland [finishing 
colesbury's PR 100642] Guard on nn.Module dicts and…"}},{"before":"11602ac564c0e3178b38a65e09be13644322d303","after":"5251ae6fb7995f0d26f2438179fb5d5a75b9eafd","ref":"refs/heads/viable/strict","pushedAt":"2023-08-29T06:39:28.000Z","pushType":"push","commitsCount":389,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"Explicitly include iostream (#108103)\n\nSummary: Similar to D48568760\n\nTest Plan: Sandcastle\n\nReviewed By: osalpekar\n\nDifferential Revision: D48758708\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/108103\nApproved by: https://github.com/osalpekar","shortMessageHtmlLink":"Explicitly include iostream (pytorch#108103)"}},{"before":"5ed60477a7cd930298dbdab71046ec62024427c4","after":"0cfc5899f9bade72c7e18666e2006b003b5848bc","ref":"refs/heads/master","pushedAt":"2023-08-29T06:39:05.000Z","pushType":"push","commitsCount":370,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"[inductor] Improved grid_sampler_2d decomposition for cuda (#104710)\n\nDescription:\n- Improved grid_sampler_2d decomposition code to generate single cuda kernel instead of two\n\nRelated to https://github.com/pytorch/pytorch/issues/104296\n\nPerfs:\n- speed-up on cuda (~x5) and cpu (~x2) for bicubic mode\n\n```\nSpeed-up PR vs Nightly = ratio between columns \"Compiled (2.1.0a0+git52598e9) PR\" and \"Compiled (2.1.0a0+gitcf76938) Nightly\"\n\n[------------------------------------------------------------------------------------------------------------------------------- Affine grid sampling, cpu -------------------------------------------------------------------------------------------------------------------------------]\n | Eager (2.1.0a0+git52598e9) PR | Compiled (2.1.0a0+git52598e9) PR | Compiled (2.1.0a0+gitcf76938) Nightly | 
speed-up PR vs Nightly | Eager (2.1.0a0+gitcf76938) Nightly\n1 threads: --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=bilinear | 38.010 (+-0.118) | 51.466 (+-1.257) | 47.867 (+-0.124) | 0.930 (+-0.000) | 33.654 (+-0.411)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=bilinear | 35.532 (+-0.236) | 52.189 (+-0.093) | 58.979 (+-0.206) | 1.130 (+-0.000) | 32.543 (+-0.198)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=bilinear | 38.187 (+-0.112) | 47.892 (+-0.117) | 45.833 (+-0.081) | 0.957 (+-0.000) | 33.752 (+-0.116)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=bilinear | 36.708 (+-0.244) | 51.680 (+-0.104) | 58.360 (+-0.108) | 1.129 (+-0.000) | 32.576 (+-0.751)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=nearest | 24.201 (+-0.088) | 27.451 (+-0.059) | 27.937 (+-0.081) | 1.018 (+-0.000) | 24.367 (+-0.074)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=nearest | 19.266 (+-0.105) | 26.070 (+-0.085) | 26.092 (+-0.054) | 1.001 (+-0.000) | 20.144 (+-0.064)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=nearest | 24.293 (+-0.125) | 26.085 (+-0.064) | 26.575 (+-0.061) | 1.019 (+-0.000) | 24.515 (+-0.095)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=nearest | 19.440 (+-0.075) | 25.252 (+-0.059) | 25.259 (+-0.051) | 1.000 (+-0.000) | 19.770 (+-0.070)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=bicubic | 114.900 (+-0.508) 
| 113.416 (+-1.271) | 248.679 (+-1.431) | 2.193 (+-0.000) | 114.609 (+-0.515)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=bicubic | 115.973 (+-0.555) | 124.711 (+-1.596) | 282.187 (+-2.418) | 2.263 (+-0.000) | 115.368 (+-0.652)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=bicubic | 111.730 (+-0.562) | 110.914 (+-0.865) | 253.899 (+-2.226) | 2.289 (+-0.000) | 111.285 (+-1.226)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=bicubic | 112.859 (+-0.487) | 131.696 (+-1.298) | 294.124 (+-1.963) | 2.233 (+-0.000) | 110.910 (+-0.969)\n\nTimes are in milliseconds (ms).\n\n[------------------------------------------------------------------------------------------------------------------------------- Affine grid sampling, cuda ------------------------------------------------------------------------------------------------------------------------------]\n | Eager (2.1.0a0+git52598e9) PR | Compiled (2.1.0a0+git52598e9) PR | Compiled (2.1.0a0+gitcf76938) Nightly | speed-up PR vs Nightly | Eager (2.1.0a0+gitcf76938) Nightly\n1 threads: --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=bilinear | 228.811 (+-0.037) | 92.990 (+-0.446) | 92.648 (+-0.286) | 0.996 (+-0.000) | 228.274 (+-0.067)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=bilinear | 222.107 (+-0.076) | 93.247 (+-0.387) | 92.528 (+-0.423) | 0.992 (+-0.000) | 221.922 (+-0.297)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=bilinear | 235.654 (+-0.055) | 75.781 (+-0.566) | 115.865 (+-0.419) | 1.529 
(+-0.000) | 236.032 (+-0.111)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=bilinear | 226.752 (+-0.088) | 76.312 (+-0.328) | 116.468 (+-0.477) | 1.526 (+-0.000) | 226.950 (+-0.027)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=nearest | 225.540 (+-0.013) | 75.638 (+-0.341) | 72.621 (+-0.292) | 0.960 (+-0.000) | 225.937 (+-0.017)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=nearest | 217.425 (+-0.024) | 75.484 (+-0.545) | 73.518 (+-0.296) | 0.974 (+-0.000) | 217.793 (+-0.008)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=nearest | 231.474 (+-0.020) | 75.972 (+-0.339) | 73.030 (+-0.387) | 0.961 (+-0.000) | 231.991 (+-0.184)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=nearest | 223.408 (+-0.016) | 75.622 (+-0.279) | 73.542 (+-0.336) | 0.973 (+-0.000) | 223.893 (+-0.021)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=True, mode=bicubic | 319.382 (+-0.023) | 149.060 (+-0.190) | 772.116 (+-0.266) | 5.180 (+-0.000) | 320.549 (+-0.387)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=True, mode=bicubic | 319.987 (+-0.134) | 154.443 (+-0.014) | 797.651 (+-0.232) | 5.165 (+-0.000) | 320.665 (+-0.397)\n Input: (8, 3, 345, 456) torch.float32, torch.contiguous_format, align_corners=False, mode=bicubic | 326.138 (+-0.439) | 149.092 (+-0.036) | 772.508 (+-0.259) | 5.181 (+-0.000) | 325.751 (+-0.398)\n Input: (8, 3, 345, 456) torch.float32, torch.channels_last, align_corners=False, mode=bicubic | 326.024 (+-0.118) | 154.452 (+-0.209) | 797.756 (+-0.229) | 5.165 (+-0.000) | 326.870 (+-0.372)\n\nTimes are in microseconds (us).\n\n```\n\n[Source](https://raw.githubusercontent.com/vfdev-5/pth-inductor-dev/master/output/20230828-134459-affine-grid-sampler-PR-vs-Nightly-speedup.md)\n\nPull Request resolved: 
https://github.com/pytorch/pytorch/pull/104710\nApproved by: https://github.com/lezcano","shortMessageHtmlLink":"[inductor] Improved grid_sampler_2d decomposition for cuda (pytorch#1…"}},{"before":"89de0485638bd1e4c8819f9e6a349a58a54b8ab9","after":"11602ac564c0e3178b38a65e09be13644322d303","ref":"refs/heads/viable/strict","pushedAt":"2023-08-21T11:49:08.000Z","pushType":"push","commitsCount":50,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"[dynamo] fix disable_saved_tensors_hooks - graph break (#106875)\n\n```python\ndef wrapper_fn(x):\n with torch.autograd.graph.disable_saved_tensors_hooks(\"ERROR\"):\n y = x + 1\n print(\"HI\")\n return y + 2\n\nx = torch.randn(())\n\na = wrapper_fn(x)\nopt = torch.compile(wrapper_fn, backend='eager', fullgraph=False)\ne = opt(x)\n```\n\nWithout the fix fails with,\n```\nTraceback (most recent call last):\n File \"/home/kshiteej/Pytorch/pytorch_functorch/test/test_trace_grad.py\", line 182, in \n e = opt(x)\n File \"/home/kshiteej/Pytorch/pytorch_functorch/torch/_dynamo/eval_frame.py\", line 333, in _fn\n return fn(*args, **kwargs)\n File \"/home/kshiteej/Pytorch/pytorch_functorch/test/test_trace_grad.py\", line 165, in wrapper_fn\n def wrapper_fn(x):\nAttributeError: module 'torch.autograd.graph' has no attribute 'disable_saved_tensors_hook'\n```\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/106875\nApproved by: https://github.com/zou3519","shortMessageHtmlLink":"[dynamo] fix disable_saved_tensors_hooks - graph break (pytorch#106875)"}},{"before":"fd214aa8be7160cba5c6efec7a1b06c9c33b37a4","after":"5ed60477a7cd930298dbdab71046ec62024427c4","ref":"refs/heads/master","pushedAt":"2023-08-21T11:48:53.000Z","pushType":"push","commitsCount":169,"pusher":{"login":"lithuak","name":"Ilya 
Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"Optimize load inline via pch (#106696)\n\nAdd PreCompiled Header(PCH) to reduce load_inline build time.\nPCH is gcc built-in mechanism: https://gcc.gnu.org/onlinedocs/gcc-4.0.4/gcc/Precompiled-Headers.html\n\nAdd PCH for '#include '. This file will used in all load_inline modules. All load_inline modules can take benifit from this PR.\n\nChanges:\n1. Add PCH signature to guarantee PCH(gch) file take effect.\n2. Unification get cxx compiler funtions.\n3. Unification get build flags funtions.\n\nBefore this PR:\n![image](https://github.com/pytorch/pytorch/assets/8433590/f190cdcb-236c-4312-b165-d419a7efafe3)\n\nAdded this PR:\n![image](https://github.com/pytorch/pytorch/assets/8433590/b45c5ad3-e902-4fc8-b450-743cf73505a4)\n\nCompiling time is reduced from 14.06s to 7.36s.\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/106696\nApproved by: https://github.com/jgong5, https://github.com/jansel","shortMessageHtmlLink":"Optimize load inline via pch (pytorch#106696)"}},{"before":"2624da638d989c902ef9e1a5cff6028ab816605c","after":"89de0485638bd1e4c8819f9e6a349a58a54b8ab9","ref":"refs/heads/viable/strict","pushedAt":"2023-08-18T07:58:37.000Z","pushType":"push","commitsCount":96,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"[BE] Use allocator to allocate workspace (#107178)\n\nAs suggested in https://github.com/pytorch/pytorch/pull/106844#discussion_r1293839247 it's better to just allocate DataPtr than the whole tensor\nPull Request resolved: https://github.com/pytorch/pytorch/pull/107178\nApproved by: https://github.com/albanD\nghstack dependencies: #106977, #106844","shortMessageHtmlLink":"[BE] Use allocator to allocate workspace 
(pytorch#107178)"}},{"before":"e787872a477b688c9cf0e38b090b7db3d0d7dc3c","after":"2624da638d989c902ef9e1a5cff6028ab816605c","ref":"refs/heads/viable/strict","pushedAt":"2023-08-15T13:45:46.000Z","pushType":"push","commitsCount":10000,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"Support third-party devices to use the init_process_group method with… (#107113)\n\n…out specifying the Backend\n\nWhen init_process_group is not been done before, it will automatically apply init_process_group within Devicemesh without specifying the backend. Thus, when a third-party device want to use Devicemesh without doing init_process_group before, there comes a problem. In this PR, add a default_device_backend_map for third-party device users to add their backends to this map when they register their backends to pytorch firstly. When doing init_process_group without parameter backend, it will init the backends in this map. 
Thus, a third-party user can use init_process_group method without specifying the Backend.\n\nPull Request resolved: https://github.com/pytorch/pytorch/pull/107113\nApproved by: https://github.com/wanchaol","shortMessageHtmlLink":"Support third-party devices to use the init_process_group method with… ("}},{"before":"9266b2af73311792a746a2f741cc13145b3701df","after":"fd214aa8be7160cba5c6efec7a1b06c9c33b37a4","ref":"refs/heads/master","pushedAt":"2023-08-15T11:29:42.000Z","pushType":"push","commitsCount":10000,"pusher":{"login":"lithuak","name":"Ilya Persky","path":"/lithuak","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/225642?s=80&v=4"},"commit":{"message":"Revert \"Add OnCompletion Hook to ProcessGroup (#106988)\"\n\nThis reverts commit ba1da47e8fa95ca0dd8b2d63430f7eb54fdbbccb.\n\nReverted https://github.com/pytorch/pytorch/pull/106988 on behalf of https://github.com/huydhn due to Sorry for reverting you change, but it is failing Windows build with some linker error. The Windows failures on PR looks legit ([comment](https://github.com/pytorch/pytorch/pull/106988#issuecomment-1678580899))","shortMessageHtmlLink":"Revert \"Add OnCompletion Hook to ProcessGroup (pytorch#106988)\""}}],"hasNextPage":false,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAADfOB9egA","startCursor":null,"endCursor":null}},"title":"Activity · lithuak/pytorch"}
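The grid_sampler_2d entry above concerns the inductor decomposition behind `torch.nn.functional.grid_sample`. A minimal sketch of the bicubic path that the PR speeds up (the small shapes, identity transform, and tolerance here are illustrative assumptions, not the PR's benchmark configuration):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Small stand-in for the benchmarked op: affine grid sampling in bicubic
# mode, the case the PR reports ~5x faster on cuda when compiled.
inp = torch.rand(1, 3, 8, 8)                  # (N, C, H, W)
theta = torch.tensor([[[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]]])     # identity affine transform
grid = F.affine_grid(theta, size=list(inp.shape), align_corners=True)
out = F.grid_sample(inp, grid, mode="bicubic", align_corners=True)

# An identity grid with align_corners=True samples at pixel centers, so
# bicubic interpolation reproduces the input up to float rounding.
assert out.shape == inp.shape
assert torch.allclose(out, inp, atol=1e-3)
```

Wrapping the same function in `torch.compile` would route it through the decomposition the PR improves; in eager mode it dispatches to the ATen kernel instead.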