VS 2022 - Keep getting error -mavx2 -mfma #344

Open
newcapricasean opened this issue Mar 14, 2023 · 16 comments

@newcapricasean

newcapricasean commented Mar 14, 2023

VS 2022 does not seem to support the options for compiling with AVX2 and FMA. Does anyone know why, and how to fix it?

    • I'm currently downstairs, away from my compiler desktop, so I'm not copying/pasting the exact error messages. It'll still compile, and work, but noticeably slower than the pre-compiled release version I can download, which I presume is SSE2 enabled.
    • I noticed that it is using ninja, rather than make. I suppose that is normal?
    • My desktop is an AMD Ryzen Threadripper 3990x, which is Zen 2, if I'm not mistaken. I know that the x265 CLI clearly states it's using AVX, AVX2, and a bunch of other CPU capabilities.
    • I keep trying with MSYS2 UCRT, and it'll compile the DLL files, but when I put them in the SYSTEM32 folder and try to run the script in avspmod, it'll suddenly say all my other filters don't exist. When I put the pre-compiled ones back, the filters function as normal.
@qyot27
Member

qyot27 commented Mar 14, 2023

Those are GCC optimization flags. Since this is clearly a continuation/offshoot of #342, let's completely start over.

What steps are you taking when trying to compile? Copy/paste the actual CMake commands. Because there is definitely a configuration problem occurring in either what you're telling CMake to run, or its autodetection of features.

How did you set up the environment? That would be another strong indicator of where any possible autodetection problems are. Since it's MSys2, what exact environment are you launching from the install area, and what packages have you already installed with pacman?

Is the end goal here just to keep up with current development instead of relying on prebuilt releases, or is it for trying to integrate into something else? A client program, developing a plugin?

@newcapricasean
Author

newcapricasean commented Mar 14, 2023

First, I decompressed the downloaded source code, from the latest version. I modified the "CMakeLists.txt" line 272 to MSVC_CPU_ARCH "AVX2" CACHE STRING to make AVX2 the default. In Visual Studio 2022, I load the decompressed source code folder. I didn't change any cmake settings. I select "build all". It results in "build all succeeded". However, the following error shows up in the build output repeatedly...
Z:\AviSynthPlus-3.7.2\out\build\x64-Debug\cl : Command line warning D9002: ignoring unknown option '-mavx2'
Z:\AviSynthPlus-3.7.2\out\build\x64-Debug\cl : Command line warning D9002: ignoring unknown option '-mfma'
The resulting libraries work, in this case, but are much slower than the pre-compiled ones I download from here. I don't believe that it is implementing the AVX2. I even tried compiling it with the default SSE2, and it was also much slower than the one I downloaded pre-compiled from here. I just re-compiled it with the SSE2 default, and it shows the exact same errors as mentioned previously. Something isn't right. I'll mention my MSYS2 in the next comment...

@newcapricasean
Author

newcapricasean commented Mar 14, 2023

As for MSYS2 UCRT64, here is the list of precisely what I currently have installed, in the order in which I installed it. I used to have many more packages installed, but I uninstalled MSYS2 entirely, then reinstalled it, and installed fewer packages this time. I had downloaded a lot for building other things...

pacman -S mingw-w64-ucrt-x86_64-gcc
pacman -S mingw-w64-ucrt-x86_64-toolchain
pacman -S mingw-w64-ucrt-x86_64-make
pacman -S mingw-w64-ucrt-x86_64-cmake
pacman -S mingw-w64-ucrt-x86_64-ninja
pacman -S mingw-w64-ucrt-x86_64-qt6
pacman -S mingw-w64-ucrt-x86_64-qt6-base
pacman -S mingw-w64-ucrt-x86_64-python
pacman -S mingw-w64-ucrt-x86_64-python-sphinx
pacman -S mingw-w64-ucrt-x86_64-python-pip-tools
pacman -S mingw-w64-ucrt-x86_64-python-distutils-extra
pacman -S mingw-w64-ucrt-x86_64-binutils
pacman -S mingw-w64-ucrt-x86_64-autotools
pacman -S mingw-w64-ucrt-x86_64-python-pluginbase
pacman -S mingw-w64-ucrt-x86_64-python-pybase64
pacman -S mingw-w64-ucrt-x86_64-boost
pacman -S mingw-w64-ucrt-x86_64-openlibm
pacman -S mingw-w64-ucrt-x86_64-python-pythran
pacman -S mingw-w64-ucrt-x86_64-unicode-character-database
pacman -S mingw-w64-ucrt-x86_64-extra-cmake-modules
pacman -S mingw-w64-ucrt-x86_64-git-repo
pacman -S mingw-w64-ucrt-x86_64-github-cli
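
A shorter setup is probably enough for building AviSynth+ itself. The sketch below is a guess at a minimal set, assuming that the `toolchain` group already provides gcc, binutils and make, and that Qt, Boost and the Python packages are not needed for this particular build; `install_build_deps` is a hypothetical helper name, not something from the project.

```shell
# Hypothetical helper: install a reduced set of UCRT64 build packages in one call.
# Assumption: only the compiler toolchain, CMake and Ninja are required here.
install_build_deps() {
    # --needed skips packages that are already installed and up to date
    pacman -S --needed \
        mingw-w64-ucrt-x86_64-toolchain \
        mingw-w64-ucrt-x86_64-cmake \
        mingw-w64-ucrt-x86_64-ninja
}
```

Run `install_build_deps` from the UCRT64 shell; any other package can still be added later if CMake reports it as missing.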

x265 and openlibm-0.8.1 I've had no difficulty compiling, whatsoever...

My end goal is to increase my encoding speed, by optimizing avisynth, and all of my filters, to take full advantage of my processor. When I see it at least compile successfully, without any customizations, I then try adding -march=znver2 -O3 -mtune=znver2 -O3. With x265 and openlibm-0.8.1, everything went smoothly.

I just re-compiled avisynth with my MSYS2 UCRT64, without any customizations, and it compiled successfully, but when I put the resulting DLLs in my C:\Windows\System32 folder, once again, avisynth no longer sees all of my installed filters in the C:\Program Files (x86)\Avisynth+\plugins64 folder. Do the compiled versions expect the plugins to be in another location, perhaps? Or, alternatively, do I have to declare them (and their paths) in the scripts when using my own build?

@pinterf

pinterf commented Mar 14, 2023

I've never really tried to use MinGW-built versions under Windows, so you are probably the first one :). I usually run the MinGW build only to check that the source code is GCC-compliant.

To tell the truth, I did not even know that the msvcrt vs. ucrt distinction existed.
https://stackoverflow.com/questions/67848972/differences-between-msvcrt-ucrt-and-vcruntime-libraries
Should CMakeLists.txt somehow be told which one to use?

This is how I use it:

@rem cd avisynth-build
del ..\CMakeCache.txt
del .\CMakeCache.txt
cmake .. -G "MinGW Makefiles" -DBUILD_DIRECTSHOWSOURCE:bool=off -DENABLE_PLUGINS:bool=on -DENABLE_INTEL_SIMD:bool=ON
cmake --build . --config Debug --clean-first
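
For comparison, an optimized GCC build from an MSYS2 shell might look like the sketch below. It assumes the Ninja generator (it is in the pacman list earlier in this thread) and passes -march=znver2 through the standard CMake variables CMAKE_C_FLAGS and CMAKE_CXX_FLAGS, with CMAKE_BUILD_TYPE=Release supplying the optimization level. The function name and the two directory arguments are hypothetical, and this has not been checked against the project's CMakeLists.txt.

```shell
# Hypothetical helper: configure and build a Release GCC build with Zen 2 tuning.
# $1 = source directory, $2 = build directory (both chosen by the caller).
build_release_gcc() {
    src_dir="$1"
    build_dir="$2"
    # Ninja is a single-config generator, so the build type is fixed at configure time.
    # -march=znver2 also implies -mtune=znver2, so it is passed only once.
    cmake -S "$src_dir" -B "$build_dir" -G Ninja \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_C_FLAGS="-march=znver2" \
        -DCMAKE_CXX_FLAGS="-march=znver2" \
        -DBUILD_DIRECTSHOWSOURCE:bool=off \
        -DENABLE_PLUGINS:bool=on &&
    cmake --build "$build_dir"
}
```

Example call, with paths that are only placeholders: `build_release_gcc /z/AviSynthPlus-3.7.2 /z/avisynth-build`.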

@newcapricasean
Author

newcapricasean commented Mar 14, 2023

Well... Are you suggesting that, if I switch over to the non-UCRT MinGW in MSYS2, it might work? Where do you do your compiling? Is it in Linux? I am new to compiling, and rather new to Linux, but I've been obsessed, for about a month now, with trying to learn and figure out how to compile all stages of my encoding process with -march=znver2 -O3 -mtune=znver2 -O3 (I know that is not possible with Visual Studio)... My CPU is the AMD Ryzen Threadripper 3990x. Fortunately, I had my desktop built custom, which included 256 GB of ECC RAM. Therefore, I use IMDisk to create a 128 GB ramdisk, and do all this tinkering on it. I also use that ramdisk whenever encoding, actually, and have the Windows virtual memory disabled, since I have so much RAM to spare. That saves on the wear and tear of the drives. The stages I want to compile would include...
(1) avisynthplus
(2) fftw3
(3) DGDecodeNV
(4) DGDecode
(5) fft3dfilter
(6) TIVTC
(7) fft3dgpu
(8) all individual filters of TemporalDegrain script
(9) x265 (already done)

@qyot27
Member

qyot27 commented Mar 14, 2023

> My end goal is to increase my encoding speed, by optimizing avisynth, and all of my filters, to take full advantage of my processor. When I see it at least compile successfully, without any customizations, I then try adding -march=znver2 -O3 -mtune=znver2 -O3 With x265 and openlibm-0.8.1, everything went smoothly.

How are you adding -march=znver2 -O3 -mtune=znver2 -O3 when using MSys2? In any case, I'm almost certain the reason for the slowdown is that you're compiling Debug versions in MSVC; regardless of compiler, Debug builds disable compiler optimizations. It has nothing to do with the setting of MSVC_CPU_ARCH.
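
To rule out the Debug configuration, a Release build can be requested explicitly from the command line. This is only a sketch: it assumes the Visual Studio 2022 generator and the MSVC_CPU_ARCH cache variable mentioned earlier in this thread, and the function name and directory arguments are hypothetical.

```shell
# Hypothetical helper: configure with the VS 2022 generator, then build Release.
# $1 = source directory, $2 = build directory (both chosen by the caller).
build_release_msvc() {
    src_dir="$1"
    build_dir="$2"
    # Visual Studio is a multi-config generator: Debug/Release is selected
    # at build time with --config, not at configure time.
    cmake -S "$src_dir" -B "$build_dir" -G "Visual Studio 17 2022" -A x64 \
        -DMSVC_CPU_ARCH=AVX2 &&
    cmake --build "$build_dir" --config Release
}
```

Passing `-DMSVC_CPU_ARCH=AVX2` on the command line overrides the cached default, so CMakeLists.txt itself would not need to be edited.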

The plugins may or may not even support being built by GCC, which would be the primary obstacle here given the next section. I would say the list of Linux-supporting plugins on Doom9 is probably about the same number that can be reasonably expected to build with GCC for Windows. It's not a comprehensive list, though.

> I just re-compiled avisynth with my MSYS2 UCRT64, without any customizations, and it compiled successfully, but when I put the resulting DLLs in my C:\Windows\System32 folder, once again, avisynth no longer sees all of my installed filters in the C:\Program Files (x86)\Avisynth+\plugins64 folder. Does the compiled versions expect the plugins to be in another location, perhaps? Or, alternatively, do I have to declare them (and their paths), in the scripts, when compiled?

The MSys2 UCRT environment uses GCC, at least if it finds it. MSVC and GCC are ABI incompatible for C++, which means that you cannot use MSVC-built C++ plugins in a GCC-built* AviSynth+, or vice-versa. GCC-built* AviSynth+ looks for its plugins in plugins_gcc and plugins64_gcc, and the installer neither creates those directories nor adds their registry entries (I think; it's been quite a while). The C ABI is compatible, so C plugins (what few of them there are) can be used by either MSVC or GCC builds of AviSynth+.

*also includes Clang in its default configuration, since it uses GCC conventions.
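
One rough way to tell which toolchain produced a given DLL is to look at its import table. The sketch below assumes `objdump` from binutils is available in the MSYS2 shell; `list_dll_imports` is a hypothetical helper name. A GCC-built C++ DLL will often import libstdc++-6.dll, while an MSVC-built one will often import MSVCP140.dll or VCRUNTIME140.dll, although static linking of the runtime can hide both.

```shell
# Hypothetical helper: print the libraries a PE file (DLL/EXE) imports.
# $1 = path to the DLL to inspect.
list_dll_imports() {
    # objdump -p dumps the PE headers; each import appears as "DLL Name: <file>"
    objdump -p "$1" | grep "DLL Name:"
}
```

For example, `list_dll_imports avisynth.dll` run against both the self-built and the pre-compiled file should make any difference in runtime libraries visible.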

@newcapricasean
Author

newcapricasean commented Mar 14, 2023

So... Do I need to go back to having a separate partition with Fedora Linux? Would that resolve my issues? For a while, I was fiddling around with Fedora... I had also figured out how to use a ramdisk, in fedora.

@newcapricasean
Author

Unfortunately, I've got to log off, for now, as I have an appointment soon... I'll probably be back online, in a few hours... If not, this evening... I work an overnight job, and so, I'm up all night (US EST), and sleep during the day... LoL...

@newcapricasean
Author

https://www.msys2.org/docs/environments/

    • Would using MSYS2's CLANG environments overcome the issue you mentioned, qyot27?

@newcapricasean
Author

newcapricasean commented Mar 14, 2023

OMG!!! I think I succeeded with Visual Studio 2022!!! I ended up renaming the configuration from x64-Debug to Release build. Then, I changed the configuration type over to Release, in the CMake editor. I had already switched the /arch: to AVX2, in the CMakeLists.txt file, so it showed that in the CMake editor. I also updated the AviSynth plugin directory to match my Windows installation, and changed the CMake generator from the ninja default to Visual Studio 17 2022 Win64.

It compiled successfully, it works, and it has restored at least the speed of the pre-compiled one I'd downloaded. I'd been using avspmod to simply move frame by frame to see what it was doing. My previous compiles would be jerky and slow, but this build goes smoothly.

I didn't see any mention of AVX2 in the build output this time, though. In debug mode, the output showed the warnings about -mavx2 and -mfma; in this build, I got no such warnings, but also no mention of AVX2, even though the CMake editor shows it set to AVX2. The resulting avisynth.dll file is smaller than the pre-compiled one, and I don't know whether that is good or bad.

    • warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
      I don't know whether that is something to be concerned about... It showed up, several times, during the compilation process.

I'm going to try compiling, in the same way, but with arch type set to IA32 to see if file size is different, and whether it moves slower, again. That should let me know whether the compile is, indeed, AVX2 optimized.

    • Learning from this experience will, quite possibly, enable me to do the same with fftw3, fft3dfilter, fft3dgpu, neo_f3kdb-r9, AvsPmod, tivtc, and the rest of the TemporalDegrain.avsi script filters... I'm assuming, of course, that they all have to, first, have the support built into the source code to optimize for different architectures.

@newcapricasean
Author

The resulting file sizes were, in fact, different, though the speed of playback was the same, and the CPU utilization was the same. I suppose that could be normal considering I have the 3990x CPU, but in debug mode, no matter what I did, it produced larger files, and playback was choppy. I think I read somewhere online that debug mode adds extra code for debugging purposes, so that could've been the cause. Next project... FFTW3!

@newcapricasean
Author

Oh, I just re-compiled it again, and I did see where it added AVX2 and SSE. I wonder if I can somehow add the SSE3 & SSE4 optimizations...
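
For the GCC builds discussed in this thread, one way to see which SIMD extensions a flag such as -march=znver2 switches on is to dump the compiler's predefined macros. This is a sketch only: `show_simd_macros` is a hypothetical helper name, and it says nothing about what the MSVC /arch setting does.

```shell
# Hypothetical helper: list the SIMD feature macros GCC defines for given flags.
# All arguments are passed straight to gcc, e.g. -march=znver2.
show_simd_macros() {
    # -dM -E preprocesses an empty input and prints every predefined macro;
    # the grep keeps only the SSE3/SSSE3/SSE4/AVX/AVX2/FMA feature macros.
    gcc "$@" -dM -E - < /dev/null |
        grep -E '__(SSE3|SSSE3|SSE4_1|SSE4_2|AVX|AVX2|FMA)__'
}
```

Comparing `show_simd_macros` with `show_simd_macros -march=znver2` shows which of these extensions the architecture flag adds on top of the compiler's defaults.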

@newcapricasean
Author

Compiling the directshowsource filter has been problematic. Is it even needed, anymore, to use DirectShowSource?

@qyot27
Member

qyot27 commented Mar 14, 2023

> Compiling the directshowsource filter has been problematic. Is it even needed, anymore, to use DirectShowSource?

No, DirectShowSource should only ever be used as a last resort when other plugins fail to give meaningful output (which is rare). FFMS2 and LSMASHSource have been the recommended go-to general purpose source filters for over a decade. And they support passing frame properties in their recent versions, so HDR content is correctly preserved when giving the script to FFmpeg.

> Would using MSYS2's CLANG environments overcome the issue you mentioned, qyot27?

No, because like I said, Clang aligns with GCC by default, and that's how MSys2 configures their Clang environments. MSVC compatibility with Clang is achieved by using Clang-cl, but to use Clang-cl, you have to use Visual Studio as the generator.
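
As a sketch of the Clang-cl route: with the Visual Studio generator, CMake's `-T` option selects the platform toolset, and `ClangCL` is the toolset name Visual Studio uses for its bundled clang-cl. This assumes the optional Clang tools component is installed in Visual Studio; the function name and directory arguments are hypothetical, and whether AviSynth+ builds cleanly this way is not confirmed here.

```shell
# Hypothetical helper: configure an MSVC-ABI-compatible build using clang-cl.
# $1 = source directory, $2 = build directory (both chosen by the caller).
configure_clang_cl() {
    # -T ClangCL asks the Visual Studio generator for the LLVM (clang-cl) toolset
    cmake -S "$1" -B "$2" -G "Visual Studio 17 2022" -A x64 -T ClangCL
}
```

The build step afterwards is the usual `cmake --build <build directory> --config Release`.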

@newcapricasean
Author

I just succeeded in compiling a functional fftw3.dll, using MSYS2's MINGW64!!! I'm going to lie down to take a nap, but it works!!! It's the first fftw3.dll compile I've done that actually works!!! It also seems to have accelerated FFT3DFILTER quite a bit!!! I installed a lot of packages into MSYS2's MINGW64, first, as I'd done research on most of the individual components of FFTW and downloaded the vast majority into MSYS2's MINGW64. This is the command I then used to successfully compile a working Windows fftw3.dll...
cd Z:/fftw-3.3.10
./configure CFLAGS="-march=znver2 -O3 -mtune=znver2 -O3" --enable-shared --disable-static CMAKE_SYSTEM_NAME=Windows CMAKE_SYSTEM_PROCESSOR=amd64
make -j 128

@newcapricasean
Author

Actually, I made a mistake, and realized that there was the pre-compiled version of libfftw3f-3.dll still in my system32 folder that was making it work. However, I then realized that, perhaps, the problem with my compiles had to do with which version of the fftw3 was being compiled. I knew the other two, of the three, that were pre-compiled, also gave the same error, in avisynth. I should have realized this, from the beginning, but I did not. Long story short, I guess avisynth only likes single precision. My setting that parameter, and building with the MSYS2 MINGW64, was successful! Now, I get to try various settings, and compiles, to see what effect it has on it! Exciting!
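
The post does not name the parameter, so the following is an assumption: FFTW's configure script documents `--enable-float` for building the single-precision library (libfftw3f), and that is presumably what was set. A sketch of such a configure call, with a hypothetical function name and only one -O3:

```shell
# Hypothetical helper: configure FFTW for a shared, single-precision build.
# Must be run from the top of the FFTW source tree (where ./configure lives).
configure_fftw_single() {
    # --enable-float  : build libfftw3f (float) instead of the default double
    # --enable-shared : produce a DLL; --disable-static skips the static library
    ./configure --enable-float --enable-shared --disable-static \
        CFLAGS="-O3 -march=znver2"
}
```

After it finishes, `make -j 128` as used above would build the library.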
