Skip to content

Commit

Permalink
Allow unaligned load/store in QVM interpreter/x86 compiler
Browse files Browse the repository at this point in the history
 constructions like (dataMask & ~3) was used to protect against out-of-bound load/store when address is 4-byte closer to dataMask
 but at the same time it effectively cut low address bits for ALL load/store operations which is totally wrong in terms of conformance to ALLOWED (i.e. generated by q3lcc from C sources) low-level operations like packed binary data parsing
  • Loading branch information
ec- authored and timangus committed May 25, 2017
1 parent abce150 commit 566fb0e
Show file tree
Hide file tree
Showing 4 changed files with 17 additions and 14 deletions.
8 changes: 5 additions & 3 deletions code/qcommon/vm.c
Expand Up @@ -451,13 +451,15 @@ vmHeader_t *VM_LoadQVM( vm_t *vm, qboolean alloc, qboolean unpure)
if(alloc)
{
// allocate zero filled space for initialized and uninitialized data
vm->dataBase = Hunk_Alloc(dataLength, h_high);
// leave some space beyound data mask so we can secure all mask operations
vm->dataAlloc = dataLength + 4;
vm->dataBase = Hunk_Alloc(vm->dataAlloc, h_high);
vm->dataMask = dataLength - 1;
}
else
{
// clear the data, but make sure we're not clearing more than allocated
if(vm->dataMask + 1 != dataLength)
if(vm->dataAlloc != dataLength + 4)
{
VM_Free(vm);
FS_FreeFile(header.v);
Expand All @@ -467,7 +469,7 @@ vmHeader_t *VM_LoadQVM( vm_t *vm, qboolean alloc, qboolean unpure)
return NULL;
}

Com_Memset(vm->dataBase, 0, dataLength);
Com_Memset(vm->dataBase, 0, vm->dataAlloc);
}

// copy the intialized data
Expand Down
14 changes: 7 additions & 7 deletions code/qcommon/vm_interpreted.c
Expand Up @@ -436,31 +436,31 @@ int VM_CallInterpreted( vm_t *vm, int *args ) {
return 0;
}
#endif
r0 = opStack[opStackOfs] = *(int *) &image[r0 & dataMask & ~3 ];
r0 = opStack[opStackOfs] = *(int *) &image[ r0 & dataMask ];
goto nextInstruction2;
case OP_LOAD2:
r0 = opStack[opStackOfs] = *(unsigned short *)&image[ r0&dataMask&~1 ];
r0 = opStack[opStackOfs] = *(unsigned short *)&image[ r0 & dataMask ];
goto nextInstruction2;
case OP_LOAD1:
r0 = opStack[opStackOfs] = image[ r0&dataMask ];
r0 = opStack[opStackOfs] = image[ r0 & dataMask ];
goto nextInstruction2;

case OP_STORE4:
*(int *)&image[ r1&(dataMask & ~3) ] = r0;
*(int *)&image[ r1 & dataMask ] = r0;
opStackOfs -= 2;
goto nextInstruction;
case OP_STORE2:
*(short *)&image[ r1&(dataMask & ~1) ] = r0;
*(short *)&image[ r1 & dataMask ] = r0;
opStackOfs -= 2;
goto nextInstruction;
case OP_STORE1:
image[ r1&dataMask ] = r0;
image[ r1 & dataMask ] = r0;
opStackOfs -= 2;
goto nextInstruction;

case OP_ARG:
// single byte offset from programStack
*(int *)&image[ (codeImage[programCounter] + programStack)&dataMask&~3 ] = r0;
*(int *)&image[ (codeImage[programCounter] + programStack) & dataMask ] = r0;
opStackOfs--;
programCounter += 1;
goto nextInstruction;
Expand Down
1 change: 1 addition & 0 deletions code/qcommon/vm_local.h
Expand Up @@ -170,6 +170,7 @@ struct vm_s {

byte *dataBase;
int dataMask;
int dataAlloc; // actually allocated

int stackBottom; // if programStack < stackBottom, error

Expand Down
8 changes: 4 additions & 4 deletions code/qcommon/vm_x86.c
Expand Up @@ -790,7 +790,7 @@ qboolean ConstOptimize(vm_t *vm, int callProcOfsSyscall)
return qtrue;

case OP_STORE4:
EmitMovEAXStack(vm, (vm->dataMask & ~3));
EmitMovEAXStack(vm, vm->dataMask);
#if idx64
EmitRexString(0x41, "C7 04 01"); // mov dword ptr [r9 + eax], 0x12345678
Emit4(Constant4());
Expand All @@ -805,7 +805,7 @@ qboolean ConstOptimize(vm_t *vm, int callProcOfsSyscall)
return qtrue;

case OP_STORE2:
EmitMovEAXStack(vm, (vm->dataMask & ~1));
EmitMovEAXStack(vm, vm->dataMask);
#if idx64
Emit1(0x66); // mov word ptr [r9 + eax], 0x1234
EmitRexString(0x41, "C7 04 01");
Expand Down Expand Up @@ -1377,7 +1377,7 @@ void VM_Compile(vm_t *vm, vmHeader_t *header)
case OP_STORE4:
EmitMovEAXStack(vm, 0);
EmitString("8B 54 9F FC"); // mov edx, dword ptr -4[edi + ebx * 4]
MASK_REG("E2", vm->dataMask & ~3); // and edx, 0x12345678
MASK_REG("E2", vm->dataMask); // and edx, 0x12345678
#if idx64
EmitRexString(0x41, "89 04 11"); // mov dword ptr [r9 + edx], eax
#else
Expand All @@ -1389,7 +1389,7 @@ void VM_Compile(vm_t *vm, vmHeader_t *header)
case OP_STORE2:
EmitMovEAXStack(vm, 0);
EmitString("8B 54 9F FC"); // mov edx, dword ptr -4[edi + ebx * 4]
MASK_REG("E2", vm->dataMask & ~1); // and edx, 0x12345678
MASK_REG("E2", vm->dataMask); // and edx, 0x12345678
#if idx64
Emit1(0x66); // mov word ptr [r9 + edx], eax
EmitRexString(0x41, "89 04 11");
Expand Down

4 comments on commit 566fb0e

@aufau
Copy link

@aufau aufau commented on 566fb0e May 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain when a valid, conforming to the standard C (without undefined behaviour) code causes unaligned ASGN instruction to be evaluated? Sounds unlikely to me considering this would cause unaligned memory access and valid C doesn't allow this. Perhaps compiler can do this if it knows target architecture always supports unaligned memory access? This would make LCC not suitable for PPC though.

What do you mean by packed binary data parsing? Packed structs, bitfields?

@aufau
Copy link

@aufau aufau commented on 566fb0e Jun 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen an example you gave in pull request. It depends on an undefined behaviour and is a very not portable code. Why should we change this behaviour when the original one was there since first q3 code release?

Few excerpts from ANSI C standard draft, C99 is much more strict and verbose on these things:

 * Alignment --- a requirement that objects of a particular type be
   located on storage boundaries with addresses that are particular
   multiples of a byte address.

(…)

 * Undefined behavior --- behavior, upon use of a nonportable or
   erroneous program construct, of erroneous data, or of
   indeterminately-valued objects, for which the Standard imposes no
   requirements.  Permissible undefined behavior ranges from ignoring the
   situation completely with unpredictable results, to behaving during
   translation or program execution in a documented manner characteristic
   of the environment (with or without the issuance of a diagnostic
   message), to terminating a translation or execution (with the issuance
   of a diagnostic message).

   If a ``shall'' or ``shall not'' requirement that appears outside of
   a constraint is violated, the behavior is undefined.  Undefined 
   behavior is otherwise indicated in this Standard by the words
   ``undefined behavior'' or by the omission of any explicit definition
   of behavior.  There is no difference in emphasis among these three;
   they all describe ``behavior that is undefined.''

(…)

   Conversions that involve pointers (other than as permitted by the
constraints of $3.3.16.1) shall be specified by means of an explicit
cast; they have implementation-defined aspects: A pointer may be
converted to an integral type.  The size of integer required and the
result are implementation-defined.  If the space provided is not long
enough, the behavior is undefined.  An arbitrary integer may be
converted to a pointer.  The result is implementation-defined./37/ A
pointer to an object or incomplete type may be converted to a pointer
to a different object type or a different incomplete type.  The
resulting pointer might not be valid if it is improperly aligned for
the type pointed to.  It is guaranteed, however, that a pointer to an
object of a given alignment may be converted to a pointer to an object
of the same alignment or a less strict alignment and back again; the
result shall compare equal to the original pointer.  (An object that
has character type has the least strict alignment.) A pointer to a
function of one type may be converted to a pointer to a function of
another type and back again; the result shall compare equal to the
original pointer.  If a converted pointer is used to call a function
that has a type that is not compatible with the type of the called
function, the behavior is undefined.

@NuclearMonster
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should we change this behaviour when the original one was there since original q3 code release?

I don't think we've met yet, what project are you referring to when you use "we"?

@aufau
Copy link

@aufau aufau commented on 566fb0e Jun 1, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a form of speech, maybe incorrect here. English is not my mother tonque. I'm not affiliated with ioq3, although I co-maintain a GPL2 project that ports ioq3 vm improvements and fixes among other things: https://github.com/mvdevs/jk2mv

Please sign in to comment.