Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List of broken examples #26

Open
ijacquez opened this Issue Nov 18, 2018 · 11 comments

Comments

Projects
None yet
3 participants
@ijacquez
Copy link

ijacquez commented Nov 18, 2018

I ran through these examples, and those marked FAILED are the ones that caused some sort of exception. I compiled each example with PSP_FW_REVISION=660.

Below is the list.

Status Filename
FAILED morphskin/morphskin.prx
PASSED shadowprojection/shadowprojection.prx
PASSED logic/logic.prx
PASSED speed/speed.prx
FAILED envmap/envmap.prx
FAILED beginobject/beginobject.prx
FAILED splinesurface/splinesurface.prx
PASSED blend/blend.prx
FAILED sprite/sprite.prx
FAILED spharm/spharm.prx
PASSED skinning/skinning.prx
PASSED blit/blit.prx
PASSED copy/copy.prx
PASSED text/gufont.prx
FAILED morph/morph.prx
FAILED reflection/reflection.prx
FAILED zbufferfog/zbufferfog.prx
FAILED cube/cube.prx
FAILED integerdrawing/integerdrawing.prx
FAILED vertex/vertex.prx
FAILED timing/timing.prx
FAILED lights/lights.prx
FAILED signals/signals.prx
PASSED lines/lines.prx
PASSED clut/clut.prx
FAILED ortho/ortho.prx
FAILED rendertarget/rendertarget.prx
FAILED celshading/celshading.prx
@carstene1ns

This comment has been minimized.

Copy link

carstene1ns commented Nov 20, 2018

Just a quick guess:
Did you build them with a (recent) toolchain version that has the fix for double precision crash?
See pspdev/psptoolchain#85 and pspdev/psptoolchain#65 for reference.

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Nov 20, 2018

I just double checked, and I'm at a435ee0df19d8e07fa4c0a01f1f47fb7aeca834d, which is the HEAD.

@yoanlcq

This comment has been minimized.

Copy link

yoanlcq commented Nov 30, 2018

This looks similar to the situation I was in when I started out (that is, recently), and I remember that setting USE_PSPSDK_LIBC = 1 in the Makefiles (or passing it as argument to make) did the trick for me.
Some seemingly basic floating-point operations caused a crash otherwise (these were easy to figure out by sprinkling printf() calls through one of the samples).

I didn't dig too deep and this does sound like a bug, but hopefully that could help.

(edit: I'm always building with PSP_FW_VERSION = 635.)

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Dec 1, 2018

Thanks for that. I'll give it a try.

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Dec 2, 2018

First run of cube.prx and a few other tests I've marked as failed do work.

Yeah, this may be a bug. Can anyone confirm?

@carstene1ns

This comment has been minimized.

Copy link

carstene1ns commented Dec 2, 2018

So, something wrong with our newlib? Very strange.

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Dec 2, 2018

I haven't checked, but Newlib being built with floating point support?

@yoanlcq

This comment has been minimized.

Copy link

yoanlcq commented Dec 8, 2018

I tried to understand the issue with the cube sample. It turns out that sceGumPerspective() is the culprit in this particular case, because the remainder of the code works just fine.
Since the sample links against pspgum and not pspgum_vfpu, the source for sceGumPerspective() is the following :

void sceGumPerspective(float fovy, float aspect, float near, float far)
{
	float angle = (fovy / 2) * (GU_PI/180.0f);
	float cotangent = cosf(angle) / sinf(angle);
	float delta_z = near-far;
	register ScePspFMatrix4* t __asm("a0") = GUM_ALIGNED_MATRIX();

	memset(t,0,sizeof(ScePspFMatrix4));
	t->x.x = cotangent / aspect;
	t->y.y = cotangent;
	t->z.z = (far + near) / delta_z; // -(far + near) / delta_z
	t->w.z = 2.0f * (far * near) / delta_z; // -2 * (far * near) / delta_z
	t->z.w = -1.0f;
	t->w.w = 0.0f;

	sceGumMultMatrix(t);
}

With

// these macros are because GCC cannot handle aligned matrices declared on the stack
#define GUM_ALIGNED_MATRIX() (ScePspFMatrix4*)((((unsigned int)alloca(sizeof(ScePspFMatrix4)+64)) + 63) & ~63)
#define GUM_ALIGNED_VECTOR() (ScePspFVector4*)((((unsigned int)alloca(sizeof(ScePspFVector4)+64)) + 63) & ~63)

Debugging in psp-gdb with PSPLINK shows that a store exception happens at the assignment immediately following the call to memset(). From the disassembly, I'm assuming that alloca() did resolve to GCC's __builtin_alloca(), because there's no jal alloca instruction. I also see that the call to memset() was indeed compiled as a library call, and was not inlined by GCC.

Here's the thing: newlib's memset() clobbers the a0 register, meaning that t points to garbage after the call. PSPSDK libc's memset() does not; I've reached this conclusion by looking at the disassembly of both functions in psp-gdb.

It looks to me like the fix would be to remove the __asm("a0") part when declaring t. Why would that be so important? In fact I don't see the point in doing an alloca() either. ScePspFMatrix4 is declared with an alignment of 16. If "GCC cannot handle aligned matrices declared on the stack" as the comment says, then isn't all user code screwed anyway? That looks unlikely to me.

What I would do is simply declare t just like a regular variable, and spare the call to memset() altogether via zero-initialization. Is there something critical I'm missing?

Worth noting is that sceGumOrtho() does a similar trick.

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Dec 10, 2018

Yeah, this all seems rather odd. I'm fairly sure that __attribute__ ((aligned(64))) would be sufficient for ScePspFVector4?

@yoanlcq

This comment has been minimized.

Copy link

yoanlcq commented Dec 10, 2018

I'm fairly sure that attribute ((aligned(64))) would be sufficient for ScePspFVector4?

Maybe you meant ScePspFMatrix4? Such a requirement on ScePspFVector4 would forbid contiguous arrays of it... Or maybe you mean putting the attribute at the declaration site instead of the type itself, which looks reasonable (way better than resorting to alloca()... yikes!)

From what I gather, 16-byte alignment is a requirement for VFPU "unaligned" load/stores (ulv.q/usv.q), and 64-byte alignment is a requirement for VFPU "aligned" load/stores (lv.q/sv.q) (none of which pspgum actually needs....)

There are these pieces of info in the chapter 4 of Hitmen's Yet Another PSP Documentation:

VFPU load/store instructions seem to support only 16-byte-aligned accesses (similiar to Altivec and SSE).

But right below, in the section for lv and sv (Load/Store Vector Quadword) instructions :

Final Address needs to be 64-byte aligned.

That second statement was confusing to me at first, but I think I "got it" by looking at how pspgum_vfpu.c manages its loads and stores.

About the a0 stuff, I'm surprised GCC doesn't do "the right thing" by saving it or something, if it knows that the ABI allows clobbering it, but oh well. This makes me wonder which other pieces of code are doing such a thing in the SDK, probably accidents waiting to happen.
In any case let's just remove that __asm("a0") and call it a day; that's the only problematic part really x).

(For posterity, I think this issue's title should be changed to match what we know now :) )

@ijacquez

This comment has been minimized.

Copy link
Author

ijacquez commented Dec 10, 2018

Yes, I referred to the wrong one. Adding the attribute to the declaration site is preferred.

I'll run a few tests, removing the need for the a0 register.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.