Program misbehaves when using -zm - was: Miscompilation when using (-0) 8086 real-mode DOS target, working for (-1) 80186 higher targets #862

Closed
johnsonjh opened this issue May 9, 2022 · 26 comments

Comments

@johnsonjh

johnsonjh commented May 9, 2022

I'm having a bit of trouble building an 8086 version of the G editor for DOS.

(TL;DR - Exact same code broken using -0, and OK with -1)

I regularly build G with many compilers (new and old), but when I took over maintenance, the DOS ports had not been regularly compiled for a very long time (since ~1995). (The output of my make test script is available here if you'd like to see exactly how each variant is built.)

This test script does not automate all possible DOS builds (for example, those using MS-DOS based compilers). However, I can confirm that 8086 binaries built with Microsoft C 7.00b, Microsoft C 8.00a (MSVC 1.52c), and Borland C 5.02 work as expected, with no issues. (Unrelated, but a note to anyone looking into this: the IA16-GCC builds done by this testing script aren't working yet, due to the lack of a large memory model in that compiler.)

Since the Microsoft C builds work, and are also smaller, I'm currently using Microsoft C to produce the reference binaries available for download; these can be used to double-check the correct behavior, since they are quite extensively tested.

To build the (unusable) 8086 binary using OpenWatcom v2, I use the following:

» gmake g86 V=1
+ env WATCOM="/opt/watcom" INCLUDE="/opt/watcom/h" PATH="/opt/watcom/binl64:${PATH}"  \
     wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=0 -0 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls  \
     -zp2 -fe=g86.exe g.c

» ls -la g86.exe 
.rwxr-xr-x jhj jhj 113 KB Sun May  8 21:58:04 2022 g86.exe*

» file g86.exe     
g86.exe: MS-DOS executable

Once compiled, the quickest way to identify a bad build is to load a file from the command line, and then attempt to save it using F7, producing the following result:

Screenshot from 2022-05-08 22-06-51

Rebuilding exactly as above, but using -1 instead of -0 results in a compiled binary that works fine:

» env WATCOM="/opt/watcom" INCLUDE="/opt/watcom/h" PATH="/opt/watcom/binl64:${PATH}"  \
     wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=0 -1 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls  \
     -zp2 -fe=g86.exe g.c

» ls -la g86.exe 
.rwxr-xr-x jhj jhj 113 KB Sun May  8 22:16:55 2022 g86.exe*

» file g86.exe 
g86.exe: MS-DOS executable

Pressing F7 works exactly as expected:

Screenshot from 2022-05-08 22-18-30

Note that the problems aren't limited to F7 to save - the editor is completely unusable using -0, and perfectly fine with -1.

Compiler version:

Open Watcom C/C++ x86 16-bit Compile and Link Utility
Version 2.0 beta Mar 13 2022 23:43:44 (64-bit)
Copyright (c) 2002-2022 The Open Watcom Contributors. All Rights Reserved.
Portions Copyright (c) 1988-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
See http://www.openwatcom.org/ for details.

I've tried a slightly newer build but still had the same problems - I haven't yet been able to update to a "0-day" version. I hope this helps in tracking down the problem.

@johnsonjh johnsonjh changed the title Miscompilation when using 8086 real-mode DOS target (working for 80186 and later) Miscompilation when using (-0) 8086 real-mode DOS target, working for (-1) 80186 higher targets May 9, 2022
@jmalak
Member

jmalak commented May 9, 2022

Hi, I am not sure what you are trying to achieve.
I looked at the WATCOM support in G and it looks strange.
Are you compiling for 16-bit DOS or for 32-bit extended DOS?

@johnsonjh
Author

johnsonjh commented May 9, 2022

@jmalak

I regularly compile for 16-bit and 32-bit DOS and OS/2.

However, this bug affects only 16-bit real-mode MS-DOS and only 8086 (-0) builds.

G compiles, the resulting binary starts, but G itself is completely unusable - almost every function of the program misbehaves (as described above).

8086 DOS G build is as follows:

wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=0      \
  -0 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls  \
     -zp2 -fe=g86.exe g.c

If I change the -0 to -1 (to build for 80186) or higher, everything is fine.

Every other OpenWatcom build is fine.

@jmalak
Member

jmalak commented May 9, 2022

I tried to compile the current git version for 16-bit and 32-bit DOS, and the WATCOM support is completely broken.

result of 16-bit DOS compilation

Open Watcom C x86 16-bit Optimizing Compiler
Version 2.0 beta May  4 2022 17:35:04 (32-bit)
Copyright (c) 2002-2022 The Open Watcom Contributors. All Rights Reserved.
Portions Copyright (c) 1984-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
See http://www.openwatcom.org/ for details.
g.c(441): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(444): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(447): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(451): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(455): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(455): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(456): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(460): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(466): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(470): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(470): Error! E1156: Assembler error: 'Cannot use 386 operand size with current CPU setting'
g.c(471): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(472): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(8763): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(8763): Error! E1156: Assembler error: 'Cannot use 386 operand size with current CPU setting'
g.c(8764): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(8767): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(8767): Error! E1156: Assembler error: 'Cannot use 386 operand size with current CPU setting'
g.c(8768): Error! E1156: Assembler error: 'Invalid instruction with current CPU setting'
g.c(8771): Error! E1156: Assembler error: 'Cannot use 386 register with current CPU setting'
g.c(8771): Error! E1147: Too many errors: compilation aborted

result of 32-bit DOS compilation

Open Watcom C x86 32-bit Optimizing Compiler
Version 2.0 beta May  4 2022 17:35:25 (32-bit)
Copyright (c) 2002-2022 The Open Watcom Contributors. All Rights Reserved.
Portions Copyright (c) 1984-2002 Sybase, Inc. All Rights Reserved.
Source code is available under the Sybase Open Watcom Public License.
See http://www.openwatcom.org/ for details.
g.c(15479): Error! E1122: Illegal register modified by 'movelr' #pragma
g.c(15479): Error! E1122: Illegal register modified by 'movelr' #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 2 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 2 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1122: Illegal register modified by 'mmovelr' #pragma
g.c(15479): Error! E1122: Illegal register modified by 'mmovelr' #pragma
g.c(15479): Error! E1121: Procedure 'mmovelr' has invalid return register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 2 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 2 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1122: Illegal register modified by 'space_fill' #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1122: Illegal register modified by 'mspace_fill' #pragma
g.c(15479): Error! E1121: Procedure 'mspace_fill' has invalid return register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1120: Parameter number 1 - invalid register in #pragma
g.c(15479): Error! E1147: Too many errors: compilation aborted
g.c: 15479 lines, included 3469, 0 warnings, 0 errors

@johnsonjh
Author

johnsonjh commented May 9, 2022

It works just fine here - those errors indicate you didn't supply the correct arguments, and they are the expected output if you are missing at least the "-DDOS=1 -DASM86=0" flags.

Building as follows (assuming the WATCOM, PATH, INCLUDE, and other relevant variables are correctly set):

DOS 8086 16-bit build:
wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=0 -0 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls -zp2 -fe=g8086.exe g.c

DOS 80186 16-bit build:
wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=0 -1 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls -zp2 -fe=g186.exe g.c

DOS 80386 32-bit build:
wcl386 -l=DOS32A -bt=DOS -DDOS=1 -DASM86=0 -3r -mf -fpi -zm @g.lkr -k32767 -d0 -s -zq -oabls -zp2 -DWCL386=1 -fe=g386p.exe g.c

I tested just now using a fresh checkout of G, and the g8086.exe binary misbehaves, but the g186.exe and g386p.exe binaries work fine.

Using OpenWatcom V2, I regularly build six variants: 16-bit DOS 8086, 16-bit DOS 80186, 16-bit DOS 80286, 16-bit DOS 80386, 16-bit OS/2 80286, and 32-bit OS/2 80386. The only build that misbehaves at runtime is the 16-bit DOS 8086 build.

Edit: I am building on 64-bit Linux, if this is relevant. The resulting DOS binaries are tested on a real DOS system, as well as DOSBox and 86Box emulators.

@johnsonjh
Author

johnsonjh commented May 9, 2022

In case this helps, here are three fresh builds of G (commit 6fc8d86e175e1281cda0bab94795f39a87e3587e), side by side, all built with OpenWatcom V2, exactly as shown:

Left to right is 16-bit 8086, 16-bit 80186, and 32-bit 80386, showing what happens after pressing F7:

Screenshot from 2022-05-09 04-38-39

@jmalak
Member

jmalak commented May 9, 2022

Sorry, how can you mix instructions for different architectures?
You assume that you will run 16-bit code on a 32-bit CPU, and it cannot work on a 16-bit CPU!

@johnsonjh
Author

johnsonjh commented May 9, 2022

I'm not sure what you mean?

The 8086 and 80186 binaries are 16-bit MS-DOS only. I always test them on a real IBM XT running DOS.

There is no 32-bit code at all used in the 16-bit builds.

(G supports just shy of 70 different platforms, ranging from 16-bit to 64-bit.)

@johnsonjh
Author

johnsonjh commented May 9, 2022

These are the exact three DOS EXE binaries, which I just compiled in the screenshot above, using OpenWatcom V2 (gzip compressed).

The first one, the 16-bit 8086 binary, malfunctions at runtime as explained above (both in emulation as shown, as well as on real 8086 hardware - an IBM PC XT running MS-DOS 5):

16-bit MS-DOS 8086: g86.exe.gz
16-bit MS-DOS 80186: g186.exe.gz
32-bit MS-DOS 80386 DOS32/A: g386p.exe.gz

For completeness, this is the exact same source code, but built for 8086 16-bit MS-DOS using Microsoft C 8.00a, which works fine. I now regularly build for DOS using Microsoft C 7.00b, Borland C 5.02, and many other compilers, all work fine.

@jmalak
Member

jmalak commented May 9, 2022

How you can run movsd on 16-bit CPU, it is movsw on 16-bit CPU (same opcode) but different operand and different functionality?
It has nothing to do with your code on other compilers they don't use #pragma aux for in-line code. They will work, but OW not.
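
For readers unfamiliar with the encoding point raised here: MOVSW and MOVSD share the opcode byte 0xA5, and the operand width is chosen by the code segment's default size together with the 0x66 operand-size prefix, which only exists on the 80386 and later. The following is a minimal illustrative sketch in portable C; the function name `movs_operand_bytes` and its interface are invented for this example and are not part of G or Open Watcom, and REP prefixes are deliberately not handled.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: report how many bytes one MOVS iteration copies
 * for the instruction starting at code[0].
 * default32 is nonzero for a 32-bit code segment, zero for a 16-bit one.
 * Returns 0 if the bytes are not a plain (optionally 0x66-prefixed) MOVS.
 */
static unsigned movs_operand_bytes(const unsigned char *code, size_t len,
                                   int default32)
{
    size_t i = 0;
    int wide = default32;

    assert(code != NULL);

    /* 0x66 is the operand-size override prefix (80386 and later only). */
    if (i < len && code[i] == 0x66) {
        wide = !wide;
        i++;
    }
    if (i >= len)
        return 0;
    if (code[i] == 0xA4)            /* MOVSB: always one byte */
        return 1;
    if (code[i] == 0xA5)            /* MOVSW or MOVSD, depending on size */
        return wide ? 4u : 2u;
    return 0;
}
```

In a 16-bit code segment, passing the single byte 0xA5 yields a 2-byte (MOVSW) transfer, while the sequence 0x66 0xA5 yields a 4-byte (MOVSD) transfer.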

@jmalak
Member

jmalak commented May 9, 2022

Anyway, you should fix your code for in-line memory functions/macros to use an unsigned type, not short.
Next, you should use a byte size rather than an element count, or create specialized functions/macros, but you complicate your life with so many variants.
For 16-bit in-line code can not use instruction movsd, stosd, etc. it doesn't work properly on 16-bit CPU.
For example, bzero should be something like

extern void  bzero(void *start, unsigned len);
#ifdef _M_I86
# pragma aux bzero =                                       \
  "xor ax,ax"                                              \
  "shr cx,1"                                               \
  "rep stosw" parm[es di][cx] modify[ax];
#else
# pragma aux bzero =                                       \
  "xor eax,eax"                                            \
  "shr ecx,2"                                              \
  "rep stosd" parm[edi][ecx] modify[eax];
#endif
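
One detail worth keeping in mind with this approach: halving a byte count (as `shr cx,1` does) leaves the final byte untouched when the length is odd. A portable C sketch of the same word-at-a-time idea, with the odd byte handled explicitly, might look like the following. The name `zero_bytes_wordwise` is hypothetical, and this is ordinary C rather than a `#pragma aux` definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical portable stand-in for a word-wise zero fill. */
static void zero_bytes_wordwise(void *start, unsigned len)
{
    static const unsigned char zero_word[2] = { 0, 0 };
    unsigned char *p = (unsigned char *)start;
    unsigned words = len >> 1;      /* same halving as "shr cx,1" */

    assert(start != NULL || len == 0);

    while (words--) {               /* stands in for "rep stosw" */
        memcpy(p, zero_word, 2);
        p += 2;
    }
    if (len & 1u)                   /* odd length: clear the final byte */
        *p = 0;
}
```

Calling `zero_bytes_wordwise(buf, 5)` clears exactly five bytes: two full words plus the trailing byte.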

@johnsonjh
Author

johnsonjh commented May 9, 2022

How you can run movsd on 16-bit CPU, it is movsw on 16-bit CPU (same opcode) but different operand and different functionality?

I'm not sure what you mean here.

You can verify that there are actually no such instructions compiled in the code whatsoever.

Since these builds for DOS are made using -DDOS=1 -DASM86=0, there is zero assembly code in use.

I think you are being confused by code which is not compiled at all for DOS. You can confirm this by running just the preprocessor.

image

As you can see, those instructions (appearing 18 times in the input code), are used zero times in the output code.

These instructions cannot be related because they are never compiled in this configuration.
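
The argument here rests on conditional compilation: code inside a false `#if` branch never reaches the compiler proper. A minimal sketch of that pattern follows, assuming a guard macro named `ASM86` as in the build flags above. The function `demo_space_fill` and the layout of the branches are invented for illustration and are not taken from g.c.

```c
#include <assert.h>
#include <stddef.h>

#ifndef ASM86
# define ASM86 0                 /* mirrors -DASM86=0 from the build line */
#endif

#if ASM86
/* An inline-assembly variant would be declared here. With ASM86=0 the
   preprocessor discards this branch before the compiler ever sees it. */
# error "this sketch only provides the plain C branch"
#else
/* Plain C fallback (hypothetical): fill n bytes with spaces. */
static void demo_space_fill(char *dst, size_t n)
{
    assert(dst != NULL || n == 0);
    while (n--)
        *dst++ = ' ';
}
#endif
```

With `ASM86` defined as 0, only the plain C function exists in the translation unit, which is the situation the preprocessor output is meant to demonstrate.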

It has nothing to do with your code on other compilers they don't use #pragma aux for in-line code. They will work, but OW not.

I'm not sure what you mean. I use OpenWatcom V2 for this code, and it works just fine. I even included screenshots and the compiled binaries.

It is, in fact, the primary compiler used for building G on MS-DOS and OS/2.

@johnsonjh
Author

None of that code is relevant - it is not used at all when you compile G for DOS as I've shown above - you can confirm this by looking at OpenWatcom's C preprocessor output.

OpenWatcom compiles G just fine.

The only issue is that the 8086 EXE (compiled with -0) misbehaves.

Using -1 or anything else, the problem goes away.

@jmalak
Member

jmalak commented May 9, 2022

Sorry you are ignoring what I tried to explain you, your OW in-line code is buggy.

@johnsonjh
Author

johnsonjh commented May 9, 2022

Sorry you are ignoring what I tried to explain you, your OW in-line code is buggy.

You are not understanding me.

I am NOT ignoring you. Not at all.

You seem to be missing the fact the code you are talking about is not used at all. Again, it is not even compiled as part of the binary that is causing the problem. You can simply remove it - since it is not compiled, it cannot matter.

Please look at the exact commands I'm running above, with unedited output.

In fact, I'll create a branch with all that code removed but the problem remains ...

@johnsonjh
Author

johnsonjh commented May 9, 2022

I've removed all that code, and the binary which OpenWatcom produces is identical, which confirms that it's impossible that code could be a problem, as it is never compiled.

This doesn't help or change anything, unfortunately. Compiling this revised code, with all that removed, gives me the exact binary with the exact same problems:

-rwxr-xr-x 1 jhj jhj 116170 May  9 06:52 g86.exe
-rw-r--r-- 1 jhj jhj 116170 May  9 05:13 g86-old.exe
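
The listing above shows matching sizes; a byte-for-byte comparison would be stronger confirmation that the two builds are identical. A minimal sketch of such a check in C is given below. The function name `files_identical` is hypothetical, and a read error is simply treated like end of file.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if both files have identical contents, 0 if they differ,
   and -1 if either file cannot be opened. */
static int files_identical(const char *path_a, const char *path_b)
{
    FILE *a, *b;
    int ca, cb;

    assert(path_a != NULL && path_b != NULL);

    a = fopen(path_a, "rb");
    if (a == NULL)
        return -1;
    b = fopen(path_b, "rb");
    if (b == NULL) {
        fclose(a);
        return -1;
    }
    /* Walk both files in lockstep until a mismatch or the end. */
    do {
        ca = getc(a);
        cb = getc(b);
    } while (ca == cb && ca != EOF);
    fclose(a);
    fclose(b);
    return ca == cb;             /* equal only if both reached the end */
}
```

For the two files listed above, the call would be `files_identical("g86.exe", "g86-old.exe")`.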

(Edit: Use the https://github.com/johnsonjh/g/blob/asm2/src/g.c file, as I kind of botched the commit referenced above.)

@johnsonjh
Author

johnsonjh commented May 9, 2022

See https://github.com/johnsonjh/g/blob/asm2/src/g.c for the preprocessed version of the code.

You can see, there are no assembly instructions at all.

I compiled it again with:

env WATCOM="/opt/watcom" INCLUDE="/opt/watcom/h" PATH="/opt/watcom/binl64:${PATH}" wcl -k32767 -bcl=DOS -0 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls -zp2 -fe=g86.exe g.c

This binary misbehaves.

I compiled with:

env WATCOM="/opt/watcom" INCLUDE="/opt/watcom/h" PATH="/opt/watcom/binl64:${PATH}" wcl -k32767 -bcl=DOS -1 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls -zp2 -fe=g186.exe g.c

This is just fine.

Again, I think you are misunderstanding the problem I'm trying to report. There is no problem compiling this code.

It's only running the binaries.

The binary compiled with -1 works fine and with -0 does not.

Edit: These binaries are the same as the ones in #862 (comment) but feel free to try to reproduce.

@johnsonjh
Author

johnsonjh commented May 9, 2022

For 16-bit in-line code can not use instruction movsd, stosd, etc. it doesn't work properly on 16-bit CPU.

I'm aware - it's impossible to even compile that code for 8086. The code you are talking about is never used except when building for 386+ CPUs, and even then it is only enabled with -DASM86=1, for example:

wcl -k32767 -bcl=DOS -DDOS=1 -DASM86=1 -3 -ml -fpi -zm @g.lkr -d0 -s -zq -oabls -zp2 -fe=g386r.exe g.c

This code has never been a problem, and it isn't related to the 8086 issue because it's not compiled in.

@jmalak
Member

jmalak commented May 9, 2022

I don't want to study your code.
If you give me a simple compilable example that reproduces your problem, then I can look at what happens.

@jmalak
Member

jmalak commented May 9, 2022

Anyway your code is terrible.
Remove following lines in your code and all will be OK.

# pragma aux main     aborts;
# pragma aux _exit    aborts;

What about the C and POSIX standards for main and _exit?
Start with debugging first to find your own bugs.
The next bug is in the preprocessor condition

# if ( TINY_G && !defined(WCL386) ) \
    || ( DOS && defined(_MSC_VER) )

TINY_G must be defined as

#ifndef TINY_G
# if DOS
#  if defined( __WATCOMC__ ) && !defined( _M_I86 )
#   define TINY_G  0
#  else
#   define TINY_G  1  /* small buffers etc for real mode DOS */
#  endif
# else  /* if DOS */
#  define TINY_G  0
# endif  /* if DOS */
#endif  /* ifndef TINY_G */

and condition must be

# if TINY_G || ( DOS && defined(_MSC_VER) )

Your way of using OW in-line code is terrible: a mixture of 32-bit CPU and 16-bit CPU instructions for 16-bit DOS code etc., without any condition.
Your code cannot be compiled for 32-bit DOS extenders.

@johnsonjh
Author

johnsonjh commented May 9, 2022

@jmalak

This code compiles just fine - I actually support multiple DOS extenders - but that's beside the point here.

None of this code is even compiled or present in the binaries in question.

You really need to understand this to be able to help me. Once again, you MUST be getting confused I think, because the code you are mentioning is not used in any extended DOS configuration. It's used ONLY in real-mode 386 configurations.

But, just to show you, OpenWatcom V2 has zero issues at all with this code.

Here is a 32-bit DOS32/A version:
Screenshot from 2022-05-09 18-07-49

I regularly build DJGPP versions as well, no problems. I simply cannot understand why you say this cannot compile? Not only does it compile, but the 32-bit DOS extender versions are the ones I use daily.

I've even attached the compiled DJGPP and OpenWatcom V2 DOS extended versions to this post. You don't even have to believe me here. You can replicate my compiles in less than 10 seconds using a fresh checkout of the git repo!

OpenWatcom V2 DOS32/A 32-bit build: g386p.exe.gz

I really need you to believe the code compiles.

I'm very very frustrated here, because this code compiles just fine on literally every system I've tried it on. I mean, my mind is blown here. I'm not trying to be obstinate, but what you are telling me has no bearing whatsoever on reality! Here is a VIDEO of me compiling G just fine using a DOS extender as above - it takes literally seconds:

Here is the unedited video proof that it compiles and runs using a 32-bit DOS extender ...

G-DOS.mp4

However, none of this is remotely related to the problem!!!!

@johnsonjh
Author

@jmalak

I've actually worked to try to narrow down the problem, and I've been able to narrow it down considerably.

The problem happens only when using -zm.

If I don't use -zm there are no problems.

If I use -zm there is.

Here is a video of the entire process from a fresh git checkout.

In the first 60 seconds, I build G for MS-DOS 8086 WITH -zm and again WITHOUT -zm.

In the final 60 seconds, I demonstrate that the -zm build does not work, but the build without it works just fine.

ZM-problem.mp4

@johnsonjh
Author

johnsonjh commented May 9, 2022

# pragma aux main aborts;
# pragma aux _exit aborts;

I removed this, but it doesn't fix anything. Here is video proof of removing it and the problem 100% remains.

When I change the compilation to remove -zm, the problem is fixed.

The following 100 second video shows proof of the broken build even with that code removed, and shows that removing -zm fixes it.

ZM-OWC2.mp4

@jmalak
Member

jmalak commented May 9, 2022

It is a problem with your code, not mine.
After fixing your bugs in the G editor, it compiles and works.

@jmalak jmalak closed this as completed May 9, 2022
@johnsonjh
Author

johnsonjh commented May 9, 2022

@jmalak

Did you even watch the video?

The problem is -zm - it has nothing to do with the above code.

I showed you video proof of this. I can't understand what our disconnect is here!

I don't care if you close the issue out, but I want to understand the problem. I did exactly what you said fixes the problem. It does not fix the problem. I made a video of myself doing this.

How is this a bug in G? I did exactly as you said and the problem remains. If the problem is a bug in this code, how can the bug remain in the code when the code you said was buggy has been removed and no longer exists? I'm not trying to be difficult here. I really want to understand how deleted code can affect this?

Forget EVERYTHING else:

  • I am able to avoid the bugs by not using -zm.
  • The problem that happens when using -zm happens both with your fix and without.

Please - I'm really trying to understand here! I applied exactly the fix you suggested, the one you say works, and the problem is still there?!

I feel like I'm going insane here. :(

@johnsonjh johnsonjh changed the title Miscompilation when using (-0) 8086 real-mode DOS target, working for (-1) 80186 higher targets Program misbehaves when using -zm - was: Miscompilation when using (-0) 8086 real-mode DOS target, working for (-1) 80186 higher targets May 9, 2022
@johnsonjh
Author

Regardless, feel free to leave it closed. I won't use -zm anymore.

@johnsonjh
Author

johnsonjh commented May 9, 2022

@jmalak

Now you are really going to hate me.

I cannot replicate the problem at all using the OpenWatcom V2 build from 2022-05-05. I was using 2022-03-13.

So, for anyone who happens to care or has a similar issue: upgrading to 2022-05-05 made the problem go away without other changes.

johnsonjh added a commit to johnsonjh/g that referenced this issue May 9, 2022
… issues - at least for me - with 8086 builds. The OpenWatcom V2 upstream claims they could not replicate it, however. (open-watcom/open-watcom-v2#862) - While here, removed the old and now unused mini-crt0.asm and associated code to avoid further confusion from anyone thinking it was in use, that was for old WATCOM versions only - removed references to the mini-crt0 from the documentation as well.  Closes #16