Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrap-rebuild #37

Open
wants to merge 25 commits into
base: main
Choose a base branch
from
Open

Bootstrap-rebuild #37

wants to merge 25 commits into from

Conversation

partouf
Copy link
Contributor

@partouf partouf commented Aug 14, 2020

At least fixes #7

Although no idea if this is the correct way of course - With these changes there is no reference to errno anymore in the symbol table of the object files (well, the 1 I tested with readelf) and now contains a reference to __errno_location instead.

Either way, the original error doesn't occur anymore, but here's the next one:

==> Linking src/tcc
/opt/compiler-explorer/tendra-source/tendra/obj.ubi2-bootstrap/bin/tcc -Yposix -Xp  -o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/tcc /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/_partial/src/shared/_partial.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/archive.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/compile.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/environ.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/execute.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/filename.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/flags.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/lexer.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/list.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/main.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/options.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/stages.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/startup.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/utility.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/table.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/hash.o /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/temp.o 
/usr/lib/i386-linux-gnu/libc_nonshared.a(atexit.oS): In function `atexit':
(.text+0x11): undefined reference to `__dso_handle'
/usr/bin/ld: /opt/compiler-explorer/tendra-source/tendra/obj.ubi2-rebuild/obj/tcc/src/tcc: hidden symbol `__dso_handle' isn't defined
/usr/bin/ld: final link failed: Bad value
*** Error code 1

@partouf
Copy link
Contributor Author

partouf commented Aug 14, 2020

Trying to grasp some of this atexit/dso_handle thing - reading https://stackoverflow.com/questions/34308720/where-is-dso-handle-defined and https://itanium-cxx-abi.github.io/cxx-abi/abi.html#dso-dtor-runtime-api .

From what I'm getting is that it's perhaps just a matter of defining __dso_handle and a dummy implementation (unless a c++ program), will try that out.

tspec/base/c/c89/errno.h.ts Outdated Show resolved Hide resolved
@partouf
Copy link
Contributor Author

partouf commented Aug 15, 2020

I was pondering about __dso_handle some more and realized that maybe we could just do what Tendra handles crt0 crt1 and crtn.
But I don't know what the logic should be of when to add them and when not.

But that (ff82032) was the last step I had to do into making bootstrap-rebuild work.

I do wonder if there's not a better way of handling this. The paths for crtbegin and end are very specific to which gcc and libraries are used. Maybe there should be some sort of a --gcc-toolchain thing like Clang has.

@partouf
Copy link
Contributor Author

partouf commented Aug 15, 2020

btw I'm using Linux version 4.15.0-88-generic (buildd@lgw01-amd64-036) (gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)) #88-Ubuntu SMP

And this is my system's gcc atm:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)

But this is my compiler-explorer VM, so for testing purposes, I have most of the compilers the live site has too.

@katef
Copy link
Member

katef commented Aug 17, 2020

Hm! I see what you're doing here. But we definitely can't depend on gcc's crt files :) (And yes, the paths will be different for each gcc).

We could provide our own. We already do that for some systems - e.g. https://github.com/tendra/tendra/tree/main/osdep/machines/solaris/sparc/src

If this is the way to go, I think it would be just the same idea.
I think crtbegin.s (and therefore __dso_handle) would be for C++, and I'm not sure what we'd need to provide for it.

There also seems to be /usr/lib/i386-linux-gnu/crt1.o

@katef
Copy link
Member

katef commented Aug 17, 2020

/usr/lib/i386-linux-gnu/crt1.o comes from libc6-dev:i386, I think this is the right thing to depend on.

@partouf
Copy link
Contributor Author

partouf commented Aug 17, 2020

Hm! I see what you're doing here. But we definitely can't depend on gcc's crt files :) (And yes, the paths will be different for each gcc).

We could provide our own. We already do that for some systems - e.g. https://github.com/tendra/tendra/tree/main/osdep/machines/solaris/sparc/src

If this is the way to go, I think it would be just the same idea.
I think crtbegin.s (and therefore __dso_handle) would be for C++, and I'm not sure what we'd need to provide for it.

There also seems to be /usr/lib/i386-linux-gnu/crt1.o

Although it's true that dso_handle is for the C++ ABI, it is required by libc.

Here's a tiny discussing between some LLVM people that needed to start implementing it http://llvm.1065342.n5.nabble.com/llvm-dev-Providing-dso-handle-in-LLVM-td109516.html

The best thing todo would probably be to put it in TCC as a generation step when __dso_handle is not defined by the user code (as I think you're supposed to be able to implement it yourself if you want alternate behaviour), but I think that's a little complicated.
So perhaps the best thing would be to have some custom code to be linked alongside like crt1.s in the example of sparc, indeed.

There are some more things aside from __dso_handle though that might be more complicated. Will try it out and include the other symbols that might be needed, and then figure out what they're for.

@katef
Copy link
Member

katef commented Aug 18, 2020

The llvm thread there is great, good find. What they're saying about being part of the C++ ABI makes sense. It sounds like providing our own __dso_handle is the "right" thing to do.

One last question, do you think we should be using /usr/lib/i386-linux-gnu/crt1.o for any other reason?

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

Took me a while to realize a lot of code is not actually used or built any more in the project, plus the fact that the code needs to be compiled for multiple architectures..
And there doesn't seem to be a good way to determine the right directory from the default files for this either.

@katef
Copy link
Member

katef commented Aug 21, 2020

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

I would assume so.

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

Ok, long story short. I have an implementation (that compiles, but segfaults) - but am now realizing the value of __dso_handle needs to be filled with an address. Not sure how I'm going to do that yet, will need to sleep on it.

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

Figured out how to fill the address, but it didn't solve the segfault. So I guess I'll be debugging gcc+glibc sources to find out what's going on :'(

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

Ok, long story short. I have an implementation (that compiles, but segfaults) - but am now realizing the value of __dso_handle needs to be filled with an address. Not sure how I'm going to do that yet, will need to sleep on it.

Now fixed, apparently putting the data in .sbss causes issues in 32bit mode or something, so I now put it in .data instead and that seems to work fine.

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

I think the path for /usr/lib/i386-linux-gnu/crt1.o is supposed to be a stable name, isn't it?

Finding out that it's not. /usr/lib32 seems to be the better choice. But this is a native path for x86/64 systems, so it wouldn't work for cross compilation from other platforms I think?

I think for now it's better to go for /usr/lib32 instead.

Note: Added bootstrap-rebuild to the Github Actions

@partouf
Copy link
Contributor Author

partouf commented Aug 21, 2020

So the last step in this is to re-examine the errno problem and test it on other platforms

@katef
Copy link
Member

katef commented Aug 22, 2020

But this is a native path for x86/64 systems, so it wouldn't work for cross compilation from other platforms I think?

I think this is fine - because we'd only be setting this path in an env file under the x32_64 target. We wouldn't set this path for either the native i386 or amd64 (when it exists) targets.

@partouf
Copy link
Contributor Author

partouf commented Aug 22, 2020 via email

@partouf
Copy link
Contributor Author

partouf commented Aug 24, 2020

Ok then, maybe fixed?

I found out there really is no need to refer to __errno_location, all I needed to do was +IMPLEMENT "c/c89", "errno.h.ts" ;
I saw a lot of other h.ts files do that, but errno.h.ts did not.

Will this cause any issues you think @katef ?

@partouf partouf changed the title [WIP] Attempt at fixing bootstrap-rebuild Bootstrap-rebuild Aug 25, 2020
@@ -3,8 +3,7 @@
#
# See doc/copyright/ for the full copyright terms.


+EXP (extern) int errno ;
+IMPLEMENT "c/c89", "errno.h.ts" ;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really interesting. If there's a difference from C89 here, it is likely to be because this version of POSIX did actually specify errno that way. I'll try to check this. Thank you!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C89 §4.1.3 p2:

and errno which expands to a modifiable lvalue/83/ that has type int
so our C89 definition is correct (of course)

I can't find a copy of the 1988 posix text, IEEE Std 1003.1-1988. But I can find 1990, which does indeed say:

extern int errno;

with no possibility for a macro.

So we cannot +IMPLEMENT c89's definition here, unfortunately.

Instead I think the proper thing to do, is to mark errno as __WRONG in the osdep stuff for Linux systems where this is a macro (i.e. all of them).

I also see that the tspec definitions for posix1 +IMPLEMENT c89, which I now believe to be incorrect.

Perhaps we have the option to provide some compatibility glue in the osdep/ hacked includes, such that our posix and posix1 TDF tokens are somehow mapped to the underlying libc macro. I'm not sure exactly how to do that, but maybe with PL_TDF.

If we can't do that, this is a great shame. In practice that means we'll need to depend on whatever version of posix for rebuild did introduce the macro form, not on posix. Or we find a way to drop the dependency there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc #11 and #7

Copy link
Contributor Author

@partouf partouf Aug 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(From what I understood from the C definition is that up until C11, it was allowed to be either an assignable thing or a macro. (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1338.htm) While the posix implementations mostly had macros for a very long time.)

However, the problem not so much lies in the definition, but the result of the actual code.

Because +EXP (extern) int errno causes an actual implementation in the header file that is output (share/tspec/TenDRA/include/posix.api/errno.h extern int errno ;), as opposed to the +EXP lvalue int errno; in c89 that only causes a #pragma (#pragma token EXP lvalue : int : errno # c89.errno.errno ).
So in the case of using the c89 header, it will allow the "actual" errno.h (/usr/include/errno.h) implementation of the system have ownership of the symbols.
While in the original posix spec, the errno symbol is reimplemented as extern int errno ; in the resulting errno.h.

It would be fine if the spec would just output an appropriate pragma.

(I say "implementation" while it's not actual implementation just 'code to say hey I want the symbol here that is defined elsewhere')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From irc:

errno is an lvalue equivalent to extern int, issue 5 made changes to allow for a thread local errno more of which was done in issue 6 with issue 7 completing the task and making it no longer potentially different from c90.

@partouf

This comment has been minimized.

@partouf

This comment has been minimized.

@partouf
Copy link
Contributor Author

partouf commented Aug 27, 2020

In IEEE Std 1003.1-2008 (and later editions as well) this is said https://pubs.opengroup.org/onlinepubs/9699919799.2008edition/basedefs/errno.h.html#tag_13_10 :
The <errno.h> header shall provide a declaration or definition for errno. The symbol errno shall expand to a modifiable lvalue of type int. It is unspecified whether errno is a macro or an identifier declared with external linkage. If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined.

Interesting to note: Some of the functionality described on this reference page extends the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2008 defers to the ISO C standard.

I realize was different from IEEE Std 1003.1-1988, but TenDRA only has 1 spec implementation. And the way it's implemented in TenDRA seems to be incompatible with most modern unix's in the case of errno.

And since technically +EXP lvalue int errno; isn't totally incorrect (extern int errno is also an lvalue int), and it works for both old and new posix versions, and that TenDRA requires it in the makefiles for building - how about this:
cdf27f1

So that with just posix, you get:
tcc -Yposix -S main.c

main:
        call __errno_location
        movl %eax,%edx
        movl $12,(%edx)
        call __errno_location
        movl %eax,%edx
        movl (%edx),%eax
        ret

and with strict_posix you get:
tcc -Ystrict_posix -S main.c

main:
        movl $12,errno
        movl errno,%eax
        ret

@katef
Copy link
Member

katef commented Aug 28, 2020

That's a clever idea, with -Ystrict_posix, but I'm afraid I'm not comfortable with that. We already have an idea of strictness, (by way of the -X flags), and I don't like mixing that up with well-defined APIs.

About errno, i propose a few things:

  1. -Yposix is correct, we should leave that as it is. It represents the 1988 spec.
  2. We should provide additional posix revisions. We have a few already, but we'll need the rest anyway.
  3. tdfc2 doesn't actually use errno in the same files which depend on posix - for example in system.c. So let's just have -Yposix for those files specifically.
  4. I would still like to try to remove the dependency where we can, which is what Remove POSIX dependencies #11 is about.

Especially, the idea of API specs being correct is important for TenDRA; it's what the whole system is based around.

Let's do (3) first (and on a separate branch), I think it makes life easier. I'll try to do that myself this evening, if I can.

@katef
Copy link
Member

katef commented Aug 28, 2020

I moved out discussion of -Yposix's errno to #57 - because it doesn't help here at all, but it's its own problem regardless. And in #58 I've made the -Yposix flags per file.

Sorry if it seems like a lot of fuss, but I'd like to spend a little more time thinking about this. I'm sure we can find a solution somehow that doesn't depend on implementing a whole new tspec API (yet), and also doesn't involve changing what the -Yposix API claims.

@katef katef force-pushed the main branch 3 times, most recently from 7c8347f to 46c4502 Compare November 9, 2022 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

make bootstrap-rebuild fails for TLS eglibc/glibc
2 participants