
pcsx2 1.4.0 + gsdx-ogl + nvidia 260GTX: hardware mode unusable #1355

Closed

Soukyuu opened this issue May 9, 2016 · 40 comments

Comments

@Soukyuu

Soukyuu commented May 9, 2016

Running up-to-date Arch Linux x64, I currently have the issue that hardware mode only runs at 1 fps, while software mode runs at 40-60 fps. No special settings (speed hacks and sound off), Ar tonelico 2 (US). I'm using the nvidia 340xx blob, lib32-libgl is installed, and zzogl doesn't complain about anything. Hardware OpenGL support is 3.3.0:

OpenGL vendor string: NVIDIA Corporation
OpenGL renderer string: GeForce GTX 260/PCIe/SSE2
OpenGL core profile version string: 3.3.0 NVIDIA 340.96
OpenGL core profile shading language version string: 3.30 NVIDIA via Cg compiler
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.3.0 NVIDIA 340.96
OpenGL shading language version string: 3.30 NVIDIA via Cg compiler
OpenGL context flags: (none)
OpenGL profile mask: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 2.0 NVIDIA 340.96 340.96
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.00
OpenGL ES profile extensions:

The pcsx2 log has the following:

Opening GS
glX-Version 1.4 with Direct Rendering
Failed to find glCreateTextures
Failed to find glTextureStorage2D
Failed to find glTextureSubImage2D
Failed to find glCopyTextureSubImage2D
Failed to find glBindTextureUnit
Failed to find glGetTextureImage
Failed to find glTextureParameteri
Failed to find glCreateFramebuffers
Failed to find glClearNamedFramebufferfv
Failed to find glClearNamedFramebufferuiv
Failed to find glClearNamedFramebufferiv
Failed to find glNamedFramebufferTexture
Failed to find glNamedFramebufferDrawBuffers
Failed to find glNamedFramebufferReadBuffer
Failed to find glCheckNamedFramebufferStatus
Failed to find glCreateBuffers
Failed to find glNamedBufferStorage
Failed to find glNamedBufferData
Failed to find glNamedBufferSubData
Failed to find glMapNamedBuffer
Failed to find glMapNamedBufferRange
Failed to find glUnmapNamedBuffer
Failed to find glFlushMappedNamedBufferRange
Failed to find glCreateSamplers
Failed to find glCreateProgramPipelines
Failed to find glClipControl
Failed to find glTextureBarrier
DSA is not supported. Replacing the GL function pointer to emulate it
DSA is not supported. Replacing the GL function pointer to emulate it
Error GL_ARB_texture_barrier is not supported by your driver. You can't emulate correctly the GS blending unit! Sorry!
Error GL_ARB_texture_barrier is not supported by your driver. You can't emulate correctly the GS blending unit! Sorry!
    Loading GS
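
For context on the "Failed to find ..." lines above: GSdx resolves the GL 4.5 direct-state-access (DSA) entry points at runtime, and when a pointer comes back null it swaps in a bind-to-edit fallback, which is what "Replacing the GL function pointer to emulate it" refers to. A minimal sketch of the idea, with hypothetical wrapper names rather than GSdx's actual code:

// Minimal sketch: resolve a GL 4.5 DSA entry point at runtime and fall back
// to classic bind-to-edit when the driver doesn't expose it. The names
// my_TextureParameteri/emu_TextureParameteri are hypothetical.
#include <GL/gl.h>
#include <GL/glx.h>
#include <cstdio>

typedef void (*PFN_TexParami)(GLuint texture, GLenum pname, GLint param);
static PFN_TexParami my_TextureParameteri = nullptr;

// Emulation path: bind the texture, then use the old GL 1.x entry point.
static void emu_TextureParameteri(GLuint texture, GLenum pname, GLint param)
{
    glBindTexture(GL_TEXTURE_2D, texture); // assumes a 2D texture target
    glTexParameteri(GL_TEXTURE_2D, pname, param);
}

static void load_TextureParameteri()
{
    my_TextureParameteri = reinterpret_cast<PFN_TexParami>(
        glXGetProcAddress(reinterpret_cast<const GLubyte*>("glTextureParameteri")));
    if (!my_TextureParameteri) {
        std::fprintf(stderr, "Failed to find glTextureParameteri\n");
        my_TextureParameteri = emu_TextureParameteri; // "replacing the GL function pointer to emulate it"
    }
}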

GPU utilization in the NVIDIA X Server Settings application drops from 7% to 1% on the first animation (the "checking memory card" message box). Clocks are at max speed.

Any idea how to fix that? Or is hardware mode really that "useless" on older hardware?
I saw issue #1347, but the -DGSDX_LEGACY flag seems to be for git snapshots only.

@gregory38
Contributor

The error messages aren't important. GSdx will try to use GL4 features if they are available. Technically your GPU supports them, but your driver is limited on purpose (i.e. you would be able to use them with Mesa's Nouveau driver).

I'm surprised by the speed. Are you sure your 32-bit driver is correctly installed? Delete your GSdx.ini file (depending on your installation, it's either in your home directory or in the inis dir):

find ~/.config -iname "GSdx.ini"

@gregory38
Contributor

Ah, forgot to say: yes, legacy is only for the latest git. 1.4 will always work with OpenGL 3.3.

@Soukyuu
Author

Soukyuu commented May 9, 2016

I'm as surprised as you are; the card still rocks on Windows. Deleted the inis, but nothing changed. I'm quite sure the 32-bit driver is installed:

$ pacman -Qs libgl
local/lib32-mesa 11.2.1-1
    an open-source implementation of the OpenGL specification (32-bit)
local/lib32-nvidia-340xx-libgl 340.96-1
    NVIDIA drivers libraries symlinks (32-bit)
local/mesa 11.2.1-1
    an open-source implementation of the OpenGL specification
local/nvidia-340xx-libgl 340.96-1
    NVIDIA drivers libraries symlinks

No matter how much I mess with the options, the performance in hardware mode is bad: GS utilization is stuck at 99%, while GPU utilization is just not there:
[screenshot: gsdx-ogl3]

@gregory38
Contributor

Which game? And could you post your ini settings? A screenshot + GSdx.ini.
My guess is that your GPU memory is full.

@gregory38
Contributor

Maybe not the memory; I just saw the value...

@gregory38
Contributor

How do you get all that GPU information, by the way?

@Soukyuu
Author

Soukyuu commented May 9, 2016

That's nvidia-settings, the proprietary configuration tool of the NVIDIA blob.
The game is Ar tonelico 2 (US).

GSdx.ini.txt <- to satisfy github

[screenshot: gsdx-oglhw]

@gregory38
Contributor

Thanks for the tip, I never noticed that info. It could be useful for me to measure memory usage.

Anyway, let's try removing full depth emulation. Then go to the advanced settings tab and try disabling some extensions, such as:

  • buffer storage
  • geometry shader
  • others?

@Soukyuu
Author

Soukyuu commented May 10, 2016

I tried messing around with the settings yesterday in several combinations; not even the slightest change in behavior.

@Soukyuu
Author

Soukyuu commented May 10, 2016

Overlooked the "advanced tab" part; I'll try it out when I get home.

@Soukyuu
Author

Soukyuu commented May 10, 2016

OK, messing with the advanced hacks doesn't fix anything. I tried:

  • disabling all: no change, 1 fps, GS 99%
  • enabling Image Load Store: black screen, but performance ~60%, GS 99%
  • enabling Clip Control: segfault in MTGS
  • enabling Texture Barrier: 1 fps, GS 0%
  • enabling the rest: no change, 1 fps, GS 99%

@gregory38
Contributor

How do you start PCSX2? Depending on the script, it may set NVIDIA's multithreaded optimization, namely the variable __GL_THREADED_OPTIMIZATIONS=1.
Maybe your driver doesn't support it well. Try running PCSX2 directly, or with the variable set to 0 instead.
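
For reference, the same thing can also be forced from inside a program before any GL initialization; a tiny sketch (assuming Linux/glibc), mirroring what a launch script would do:

// Sketch: hard-disable the NVIDIA threaded optimization before libGL starts.
// Equivalent to launching with __GL_THREADED_OPTIMIZATIONS=0 in the shell.
#include <cstdlib>

int main()
{
    // Must happen before the first GL/GLX call in the process.
    setenv("__GL_THREADED_OPTIMIZATIONS", "0", /*overwrite=*/1);

    // ... continue with normal startup (window, GL context, plugins) ...
    return 0;
}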

@Soukyuu
Author

Soukyuu commented May 10, 2016

Sadly, no dice. I ran it as __GL_THREADED_OPTIMIZATIONS=0 PCSX2, but not much of a change: around 7% speed now instead of 1-3%. Also tried disabling kwin (compositing).

@gregory38
Contributor

Hum, where did you get the plugin? Are you sure you're not running an unoptimized build? Besides a broken driver, I don't have any more clues. You're the first one to report a perf issue, so it is likely on your side ;)

@Soukyuu
Author

Soukyuu commented May 10, 2016

I'm running the official Arch Linux package; the plugin was bundled with it.
What do you mean by "unoptimized"? Building it with -march=native?

The NVIDIA proprietary driver is said to have the best 3D performance on Linux, so I'm rather confused why it seems e.g. Intel cards perform better...

As for the issue being on my side: likely. I'm a weird bug magnet.

@gregory38
Contributor

I can tell you that on my NVIDIA card, the driver is as fast as a rocket.

I mean zero optimization, such as -O0.

Reset all settings. And let's check on the 3rd tab that debug options aren't enabled. Just click/unclick each option so they are registered in the ini file properly. Then try capture mode both enabled and disabled; there is potentially a bug in the button.

Do you have any 32-bit application to test the driver speed?

@Soukyuu
Author

Soukyuu commented May 11, 2016

I'll try building it myself; the AUR git package fails to compile for me at the moment, though.

I'm using the newest 340xx driver there is; NVIDIA moved the GTX 260 to legacy. You're probably using their current branch?

Will try resetting everything. Can you recommend any 32-bit applications to test with?

@gregory38
Contributor

glxgears might be enough. Yes, I'm using the recent branch. Honestly, I don't know; the results are quite strange.

@Hirato

Hirato commented May 11, 2016

I'm very certain that you're not using the correct driver at all.

I'd suggest you install lib32-mesa-demos and see if glxinfo32 gives you similarly disappointing results.
Also make sure you have the right version of lib32-nvidia-utils installed, as that's where all libraries for the symlinks in lib32-nvidia-libgl come from.

EDIT: I'm partly mistaken; the symlinks trace back to lib32-libglvnd.

@gregory38
Contributor

Actually, what is your Mesa version?

@Soukyuu
Author

Soukyuu commented May 11, 2016

Mesa version:

pacman -Qs mesa
local/glu 9.0.0-4
    Mesa OpenGL Utility library
local/lib32-glu 9.0.0-3
    Mesa OpenGL utility library (32 bits)
local/lib32-libtxc_dxtn 1.0.1-5
    S3 Texture Compression (S3TC) library for Mesa (32-bit)
local/lib32-mesa 11.2.1-1
    an open-source implementation of the OpenGL specification (32-bit)
local/lib32-mesa-demos 8.3.0-1
    Mesa demos and tools (32-bit)
local/libtxc_dxtn 1.0.1-6
    S3 Texture Compression (S3TC) library for Mesa
local/mesa 11.2.1-1
    an open-source implementation of the OpenGL specification
local/mesa-demos 8.3.0-1
    Mesa demos and tools

all lib32-nvidia I have:

pacman -Qs lib32-nvidia
local/lib32-nvidia-340xx-libgl 340.96-1
    NVIDIA drivers libraries symlinks (32-bit)
local/lib32-nvidia-340xx-utils 340.96-1
    NVIDIA drivers utilities (32-bit)
local/lib32-nvidia-cg-toolkit 3.1-4
    NVIDIA Cg libraries

glxgears:

glxgears
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
299 frames in 5.0 seconds = 59.796 FPS
301 frames in 5.0 seconds = 60.002 FPS
300 frames in 5.0 seconds = 59.998 FPS

glxgears32:

glxgears32
Running synchronized to the vertical refresh.  The framerate should be
approximately the same as the monitor refresh rate.
300 frames in 5.0 seconds = 59.847 FPS
300 frames in 5.0 seconds = 59.997 FPS
300 frames in 5.0 seconds = 59.999 FPS

¯\_(ツ)_/¯

@Soukyuu
Author

Soukyuu commented May 11, 2016

If it matters, I'm running two monitors:

xrandr
Screen 0: minimum 8 x 8, current 3200 x 1084, maximum 8192 x 8192
DVI-I-0 disconnected (normal left inverted right x axis y axis)
DVI-I-1 disconnected (normal left inverted right x axis y axis)
DVI-I-2 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 531mm x 299mm
   1920x1080     59.93 +  60.00* 
   1680x1050     59.95  
   1280x1024     75.02    60.02  
   1280x960      60.00  
   1152x864      75.00  
   1024x768      75.03    60.00  
   800x600       75.00    60.32    56.25  
   640x480       75.00    59.94  
DVI-I-3 connected 1280x1024+1920+60 (normal left inverted right x axis y axis) 330mm x 270mm
   1280x1024     60.02*+
   1024x768      60.00  
   800x600       60.32  
   640x480       59.95    59.94

Edit: resetting all settings (deleted ~/.config/PCSX2) doesn't do anything. Would running any of the debug options generate anything that could help you debug this?

@Hirato

Hirato commented May 12, 2016

The output of glxinfo32 would be far more useful than a vsync'd glxgears (any CPU can push 60+ frames).
It's most likely something the Arch package maintainer should be looking at rather than Gregory.

@Soukyuu
Author

Soukyuu commented May 12, 2016

Sorry, I misread. However:

$ glxinfo > /tmp/glxinfo64
$ glxinfo32 > /tmp/glxinfo32
$ md5sum /tmp/glxinfo*
60e8b759e436e530e093ee0a48bc6cd2  /tmp/glxinfo32
60e8b759e436e530e093ee0a48bc6cd2  /tmp/glxinfo64

I know that, for example, warzone2100 runs fine, as does Diablo III (over WINE, however), so either they are using different features or something is broken somewhere else.

@Soukyuu
Author

Soukyuu commented May 12, 2016

glxinfo32.txt

@Soukyuu
Author

Soukyuu commented May 12, 2016

I've now built the current git snapshot with -DGSDX_LEGACY='ON', using

CPPFLAGS="-D_FORTIFY_SOURCE=2"
CFLAGS="-march=native -O3 -pipe -fstack-protector-strong --param=ssp-buffer-size=4"
CXXFLAGS="${CFLAGS}"
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro"

Still the same behavior. Nice side-effect: software mode is buttery smooth now (edit: at least most of the time :x)

@gregory38
Contributor

The speed likely comes from AVX. Please use standard flags; the best will be to use the profiler target of CMake (I will come back to you with the right name).

Then, could you install perf (perf_events) from the Linux kernel? Hopefully a quick profiling run will highlight the issue.

@Soukyuu
Author

Soukyuu commented May 13, 2016

I don't have AVX; my CPU is an old-ish Phenom II X4 970 :p

The flags are what Arch Linux's makepkg system uses. GCC now defaults to C++14, btw, so imagine those flags have that.

I'll take a look at perf; do you have a link to a quick intro on how to use it?

@gregory38
Contributor

So the speed boost is -O3. I need to benchmark it properly one day. I would prefer that you use standard flags (at least for test purposes). For CMake:

-DCMAKE_BUILD_TYPE=Prof

It is like a release build, without losing a single optimization, but it allows profiling.

From the top of my head:

perf record -- ./bin/PCSX2
=> Do the GSdx/PCSX2 configuration before perf.
=> Start your game in HW mode and stay around 1 minute in the sluggish mode.
=> Then exit PCSX2.
perf report
=> and screenshot / copy-paste the terminal output.

Note: don't recompile after the recording, so I can ask you for further analysis of the perf report.
Note 2: there is a way to enable call graphs in perf record, but let's try something basic first.

@Soukyuu
Author

Soukyuu commented May 13, 2016

Hmm, I'm getting

Failed to open /tmp/perf-20268.map, continuing without symbols

when running perf report. The result is still displayed, but without symbols, obviously...
perf_report.txt

Current git snapshot, built by running

./build.sh --prof -DGSDX_LEGACY=ON

It should be using standard flags, as they are only overridden when using makepkg.

@Soukyuu
Author

Soukyuu commented May 13, 2016

Just to rule out some weird GUST-game-related behavior, I tested .hack//G.U. and Zone of the Enders: same abysmal performance.

@gregory38
Contributor

gregory38 commented May 13, 2016

Failed to open /tmp/perf-20268.map, continuing without symbols

Not important; that map is only useful for recompiler (JIT) code.

Anyway, the issue is your driver: all the time is spent inside it. So either your install is broken, or I use some extensions that behave badly on your driver. Maybe your GPU remains in a low-power state.

Edit: by the way, maybe you can run lsmod to check whether drm or nouveau (the free driver) is loaded.

@Soukyuu
Author

Soukyuu commented May 14, 2016

My GPU is always at its highest clock (I only found that out by chance a while ago), so that shouldn't be it. However:

lsmod | grep drm
drm                   290816  3 nvidia
lsmod | grep nouveau

Does that make sense? What is drm?

@gregory38
Contributor

https://en.wikipedia.org/wiki/Direct_Rendering_Manager

drm is normally the kernel part of the free driver. It smells fishy. Maybe they kept the same name, but I doubt it.
You can find the directory that contains the nvidia driver with this command:

find /lib/modules/`uname -r`/ -iname "*nvidia*.ko"

Here is the result on my box. They recently changed the module names, but you get the idea:

/lib/modules/4.1.10-1-gregory/updates/dkms/nvidia-modeset.ko
/lib/modules/4.1.10-1-gregory/updates/dkms/nvidia-uvm.ko
/lib/modules/4.1.10-1-gregory/updates/dkms/nvidia.ko

@Soukyuu
Author

Soukyuu commented May 14, 2016

I think the packaging is different on Arch; the only nvidia.ko I have is nvidia.ko.gz in /usr/lib/modules/extramodules-4.5-ARCH/

@gregory38
Contributor

OK. All nvidia modules ought to be in /usr/lib/modules/extramodules-4.5-ARCH/, and I guess it won't have any drm.ko file.

Anyway, I will close this issue, as it is likely a driver issue. And I think it will be better to use nouveau, so you can use the latest OpenGL features.

@Soukyuu
Author

Soukyuu commented May 14, 2016

Yes, and since it's the legacy branch, NVIDIA does not give a damn about it. The main reason I wasn't using nouveau was its very bad performance: even mpv with opengl-hq wouldn't run smoothly. That has changed since I last tried it, so I think I will stick with nouveau for now.

@gregory38
Contributor

It depends a lot on the low-power features.

Mesa misses various CPU optimizations and multithreading stuff, but those are mostly useful for the GL4.5 features, which you don't have with NVIDIA anyway. However, rendering quality is much better with Nouveau (due to the extra extensions).

IMHO, it's time to save money ;)

@Soukyuu
Author

Soukyuu commented May 16, 2016

Well, it still has enough power for my uses; it's just that the bug with PCSX2 is annoying. Theoretically, I could buy a new one today, but I don't like throwing away working stuff.

In any case, regarding

Error GL_ARB_texture_barrier is not supported by your driver. You can't emulate correctly the GS blending unit! Sorry!

I brought it up with someone on the Nouveau IRC channel, and they commented that GL_ARB_texture_barrier is not missing; with the blob it's just called NV_texture_barrier. Any chance of making it work by checking for the NV version instead? Would that solve my issue?
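
For what it's worth, here is a hedged sketch of what such a check could look like: probe the extension list for GL_ARB_texture_barrier first, then fall back to GL_NV_texture_barrier, whose glTextureBarrierNV entry point has the same no-argument signature. Whether GSdx's blending path could actually use the NV variant is a separate question:

// Sketch: prefer the ARB texture barrier, fall back to the NV variant.
// Assumes a core-profile context where extensions come from glGetStringi.
#define GL_GLEXT_PROTOTYPES
#include <GL/gl.h>
#include <GL/glext.h>
#include <GL/glx.h>
#include <cstring>

typedef void (*PFN_TextureBarrier)(void);
static PFN_TextureBarrier my_TextureBarrier = nullptr;

static bool has_extension(const char* name)
{
    GLint count = 0;
    glGetIntegerv(GL_NUM_EXTENSIONS, &count);
    for (GLint i = 0; i < count; ++i) {
        const char* ext = reinterpret_cast<const char*>(glGetStringi(GL_EXTENSIONS, i));
        if (ext && std::strcmp(ext, name) == 0)
            return true;
    }
    return false;
}

static void load_texture_barrier()
{
    if (has_extension("GL_ARB_texture_barrier")) {
        my_TextureBarrier = reinterpret_cast<PFN_TextureBarrier>(
            glXGetProcAddress(reinterpret_cast<const GLubyte*>("glTextureBarrier")));
    } else if (has_extension("GL_NV_texture_barrier")) {
        my_TextureBarrier = reinterpret_cast<PFN_TextureBarrier>(
            glXGetProcAddress(reinterpret_cast<const GLubyte*>("glTextureBarrierNV")));
    }
    // If still null, the blending-unit emulation that needs it stays disabled.
}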

@gregory38
Contributor

On the legacy branch, this feature is optional. It doesn't explain the speed impact, though.
