Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Smart transparency & Dark mode effects works slow on 4K resulotion #19

Closed
gileli121 opened this issue Jun 5, 2020 · 11 comments
Closed
Assignees
Labels
high-priority High priority v5

Comments

@gileli121
Copy link
Member

gileli121 commented Jun 5, 2020

Known Issue:

New laptops and monitors can support 4K resolution.
The transparency algorithm and dark mode pro algorithms (included in the pro version) are not fast enough for 4K due to some bottleneck in the algorithms that prevent a better usage of the GPU.

It works fast on 1920x1080 resolution (with NVIDIA GPU).

For now, the only "solution" is to reduce the computer resolution to 1920x1080.

Trying the best to fix this issue.

@gileli121 gileli121 added the high-priority High priority label Jun 5, 2020
@gileli121 gileli121 self-assigned this Jun 5, 2020
@gileli121
Copy link
Member Author

gileli121 commented Jun 8, 2020

A solution has been found.
The solution is to replace as much as possible memory movements between GPU and CPU

The current code in v3.4.5 is
copying memory from GPU VRAM to RAM [ID3D11Texture2D to raw buffer] (20ms) --> copying memory from RAM to GPU VRAM [raw buffer to cuda buffer] (+20ms) --> process inside the GPU using CUDA (up to 18ms) --> copying processed buffer from GPU VRAM to CPU RAM [cuda buffer to cpu buffer] (20ms) --> copy from CPU RAM to GPU VRAM [cpu buffer to ID3D11Texture2D]

Since the API already returns the capture frame to the GPU, there must be a solution to how to avoid copying so much to RAM from VRAM and vice versa.
Instead of copying, I use
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/online/group__CUDART__D3D11_gf0545f2dd459ba49cfd6bcf2741c5ebd.html
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__INTEROP.html#group__CUDART__INTEROP_1gad8fbe74d02adefb8e7efb4971ee6322
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__INTEROP.html#group__CUDART__INTEROP_1g0dd6b5f024dfdcff5c28a08ef9958031
http://developer.download.nvidia.com/compute/cuda/3_1/toolkit/docs/online/group__CUDART__MEMORY_g1620c76fb3337df8dc7186fd88f40b1a.html
https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__MEMORY__DEPRECATED.html#group__CUDART__MEMORY__DEPRECATED_1g15b5d20cedf31dd13801c6015da0e828
http://developer.download.nvidia.com/compute/cuda/3_1/toolkit/docs/online/group__CUDART__INTEROP_g2e105eddc2ad91633df3ddad95e7ad6f.html

The new flow will not do copies at all between CPU and GPU. it will use almost directly the ID3D11Texture2D that returned from API and stored in GPU.

As a result, FPS improved from 15 FPS to 30FPS on 4K.

A full solution will be developed in the next days/weeks.
Once it will be done, the next version will work fine also on 4K displays.

@gileli121
Copy link
Member Author

gileli121 commented Jun 11, 2020

The other part of the solution:

  • Do all processing on 4K frame inside the GPU
  • When the frame is bigger than 1080P, use 64 threads per block instead of 25 threads per block (for Cuda kernel functions)

The full solution will be implemented in the next weeks / up to 2 months
Until then you can use the effect:

https://windowtop.info/2020/05/07/how-to-make-your-ide-or-code-editor-with-transparent-background/

On up to 1080P resolution with Nvidia GPU.
It will not work fast on 4K for now.

I can't work on it at the moment (too busy for now)

@gileli121
Copy link
Member Author

Should work better in v5
Closed

@gileli121 gileli121 reopened this Apr 17, 2021
@gileli121
Copy link
Member Author

Reopened.
Even after all of these optimizations it still work a little slow on 4K
I opened request to Microsoft here:
microsoft/Windows.UI.Composition-Win32-Samples#60

If they will do what I asked it will open the option to resolve the problem

@gileli121
Copy link
Member Author

I found some hack in Windows API. Maybe I will able to solve it

@gileli121
Copy link
Member Author

Fixed.
I implemented the fix I found and wrote about here
microsoft/Windows.UI.Composition-Win32-Samples#60

@gileli121
Copy link
Member Author

The performance improved only for the dark-mode effect in 4k.
Not for the glass-mode effect.

@gileli121
Copy link
Member Author

Beta release with the optimization for dark-mode

WindowTop 5.2.7-beta - setup.exe.zip
WindowTop 5.2.7 - Beta - Portable.zip

@gileli121
Copy link
Member Author

User confirmed that he got very big improvement in performance

@gileli121
Copy link
Member Author

Found another way to handle this issue - #105

@gileli121 gileli121 removed the fixed label May 30, 2021
@gileli121
Copy link
Member Author

This is a duplicate of #105
So I am closing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
high-priority High priority v5
Projects
None yet
Development

No branches or pull requests

1 participant