Linker #91

Closed
HitTheSticks opened this Issue · 26 comments

3 participants

@HitTheSticks

Are y'all planning on using an existing linker? Has any work gone into that?

I have some interest in writing my own. Specifically because I don't want to use ELF or PE or any of the fat object formats. I specifically intend for my project's results to be statically-linked blobs. Load the binary image, fix up relocations, and set the PC to the entry point.
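For illustration only, the "load, fix up, jump" sequence described above might look roughly like the sketch below. The relocation list (a list of image offsets whose words hold image-relative addresses) and the `load_image` name are assumptions; the thread does not define a format. The `0x7DC1` word is `SET PC, <next word>` under the 1.1 spec encoding.

```python
# Hypothetical flat-image loader. The relocation format is an assumption,
# not something defined in this thread.
MEMORY_WORDS = 0x10000  # the DCPU-16 addresses 0x10000 16-bit words


def load_image(image, relocations, load_address, entry_offset):
    """Copy `image` into memory, patch relocated words, return (memory, pc)."""
    if load_address + len(image) > MEMORY_WORDS:
        raise ValueError("image does not fit in memory")
    memory = [0] * MEMORY_WORDS
    memory[load_address:load_address + len(image)] = image
    for offset in relocations:
        # Each relocated word holds an address relative to the image start,
        # so adding the load address turns it into an absolute address.
        address = load_address + offset
        memory[address] = (memory[address] + load_address) & 0xFFFF
    pc = load_address + entry_offset  # "set the PC to the entry point"
    return memory, pc


# SET PC, <next word> followed by an image-relative jump target of 0x0002.
image = [0x7DC1, 0x0002, 0x0000]
memory, pc = load_image(image, relocations=[1], load_address=0x1000, entry_offset=0)
```

With a load address of 0, the fix-up loop leaves every word unchanged, which is the "copy it in and jump" case.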

I may actually combine the assembler and linker into one program. 'dcpuasm input1.s input2.s input3.s input4.s -o linked.bin' directly generates the 'linked.bin' binary, fully assembled and linked.

@krasin
Owner

Hi there!

The linker work is free to pick up. It would be great to be able to produce binaries from C without the manual assembly-level linking we do now.

Have you seen the binutils port for DCPU16? I believe it uses ELF, but on the bright side, all the tools are working. So it might be useful (beware of the GPL, though).

@krasin
Owner

Actually, generating .o files hasn't been started either, and should probably be implemented before the linker.

@ghost

It seems they have a full array of tools, meaning assembler and linker! I don't care if it's GPL, since we won't be using it for commercial stuff and won't be distributing it.

@krasin
Owner

Well, using GPL software for commercial stuff is perfectly valid, but that's probably beside the point.

@HitTheSticks

The GPL isn't an issue if we don't actually include GPL code in the output. You are free to use GPL gear to build proprietary software. You just aren't free to extend GPL code with private code (and distribute it) without releasing your code, too.

As for binutils... We don't need the pain that ELF is going to bring. One of the simpler image file formats is probably a better choice. Otherwise, the final step of the build process is going to have to be an ELF->DCPU.bin translator. And you're going to have to do it from a fully-linked, fixed-up image.

Meanwhile, if static linking into a single binary object is the goal, nearly all of that bloat can be stripped away.

After work, I'm going to look into lightweight object file formats from the embedded development world.

@krasin
Owner

The first goal is to have just plain static binaries. The old model of shared objects won't work for DCPU16, because it's designed around the idea that "many programs use the same shared libraries, so let's save memory with mmap". In the DCPU16 case, it's likely that we'll want to keep some parts of a program on secondary storage ("floppies") and load them on demand, but it's too early to say.

Please make sure that your object file format supports relocations, because that would make everything easier. But you definitely don't need the 100500 types of relocations that ELF has to support.

@aubreyrjones

Here's my plan. [http://0x10cforum.com/forum/m/4932880/viewthread/2746719-alternative-to-operating-systems]

My goal is to build a simple object format, and a simple program image format. Something with no bloat that can literally just be dropped into the memory of the DCPU16 and jumped to. Like, copy it in and press "play".

If I definitely need to do a traditional linker, I want to support relocations, simply because there's no other good way to do incremental linking.

But, there's another option. Almost by definition, DCPU16 programs cannot be very big. Especially compared to the desktop computers most of us are using for development. Incremental build and link is not necessary, or even especially desirable.

Could the "object" files simply be assembly files? The toolchain would look like this: clang -> llvm (optimize) -> .S files -> nos_link. Then the .S files are concatenated, linked and assembled (in that order).

Basically the entire link stage would operate on assembly code, with textual symbols matched in a big hashtable. Then the assembly is rewritten to coalesce segments, and finally actually assembled.
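A minimal sketch of that idea, assuming Notch-style `:label` definitions and the 1.1 mnemonic and register names. The `link_sources` function and the reserved-word list are illustrative and are not taken from nos_link.

```python
import re

LABEL_DEF = re.compile(r"^\s*:(\w+)")       # assumes ":label" definition syntax
WORD = re.compile(r"\b[A-Za-z_]\w*\b")      # identifiers; skips hex like 0x10
RESERVED = {
    "SET", "ADD", "SUB", "MUL", "DIV", "MOD", "SHL", "SHR", "AND", "BOR",
    "XOR", "IFE", "IFN", "IFG", "IFB", "JSR",
    "A", "B", "C", "X", "Y", "Z", "I", "J", "PC", "SP", "O",
    "POP", "PEEK", "PUSH",
}


def link_sources(modules):
    """modules: dict of file name -> assembly text. Returns one linked text."""
    defined = {}  # the "big hashtable": symbol -> defining module
    for name, text in modules.items():
        for line in text.splitlines():
            match = LABEL_DEF.match(line)
            if match:
                symbol = match.group(1)
                if symbol in defined:
                    raise ValueError(
                        f"{symbol} defined in both {defined[symbol]} and {name}")
                defined[symbol] = name
    for name, text in modules.items():
        for line in text.splitlines():
            code = LABEL_DEF.sub("", line.split(";", 1)[0])  # drop comment, label
            for word in WORD.findall(code):
                if word.upper() not in RESERVED and word not in defined:
                    raise ValueError(f"undefined symbol {word} in {name}")
    # With every symbol resolved, concatenation yields one assemblable file.
    return "\n".join(text.strip("\n") for text in modules.values()) + "\n"


linked = link_sources({
    "main.s": ":main\n  JSR helper\n  SET PC, main\n",
    "util.s": ":helper\n  SET A, 0x10\n  SET PC, POP\n",
})
```

Everything stays text until the final assembly step, which is the readability argument made in the next comment paragraph.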

Now, I'm down for doing a traditional binary assembler-then-linker pipeline, if that's what y'all demand. But, what I'm thinking is that this might not only be easier to write but might also be more accessible to somebody using 0x10c to learn programming. That is, every step of the process is readable, useful text up until the very last product. There is no stage where mysterious half-finished programs suddenly appear in your workspace.

When was the last time you actually wanted to fiddle with a .o file? They exist only as an artifact of incremental build tools (and dynamic, runtime linking). Incremental linking when you're building maybe a few thousand lines of code on a quad-core desktop supercomputer seems unnecessary.

@krasin
Owner

An assembly-level linker looks fine to me.

.o files might be important if you want to be able to store rarely used functions on secondary storage. In that case, you don't know the load address in advance, and you will have to implement a simple relocation scheme.

But this is something that is not even in the plans right now, so please go ahead and build the assembly-level linker. It will be useful for all of us.

@aubreyrjones

That is a good point, that you might want to page in/out some seldom-used functionality.

But that would need special attention anyway. Either it needs to be position-independent code (which must be generated during compile and link), or it must be relocatable code (with reloc metadata included).

One alternative to dynamic loading that people over in the forum have mentioned to me is the idea of overlays. I think it works like this: several different parts of a program are assembled/linked so that they occupy the same address. So, if foo() and bar() are overlayed, they're both assembled and linked to 0x220e. The linker/assembler then inserts a call to the pager before any invocation of foo() or bar(). The pager checks whether or not the proper function is loaded, and loads if necessary (overwriting the current contents of the overlayed memory), then returns. The foo() or bar() call then proceeds.
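To make the pager's job concrete, here is a small host-side simulation of the scheme as described. The `OverlayPager` class and the placeholder word values are hypothetical; only the shared address 0x220e comes from the comment above.

```python
class OverlayPager:
    """Simulates a pager that swaps overlays into one shared memory region."""

    def __init__(self, memory, region_start, overlays):
        self.memory = memory
        self.region_start = region_start
        self.overlays = overlays   # name -> list of words as stored on disk
        self.resident = None       # which overlay currently occupies the region
        self.loads = 0             # counts simulated disk loads

    def ensure_loaded(self, name):
        """Called before every invocation of an overlaid function."""
        if self.resident == name:
            return                 # already in place, nothing to do
        words = self.overlays[name]
        end = self.region_start + len(words)
        self.memory[self.region_start:end] = words  # overwrite old overlay
        self.resident = name
        self.loads += 1


memory = [0] * 0x10000
pager = OverlayPager(memory, 0x220E, {
    "foo": [0x1111, 0x2222],   # placeholder words, not real code
    "bar": [0x3333],
})
pager.ensure_loaded("foo")   # first call: loads foo
pager.ensure_loaded("foo")   # second call: already resident
pager.ensure_loaded("bar")   # evicts foo, loads bar at the same address
```

On the real machine the `ensure_loaded` check would be DCPU-16 code that the linker inserts before each call; the bookkeeping is the same.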

Now, this is way slick, from an implementation POV and from a limited-systems/rich-programs POV. But, it means that the binary format cannot be identical to the in-memory image of that executable, since some resident module needs to unbox it.

I'm not opposed to that. But it does make trivial bootloaders like "read linearly from disk into identical offsets in memory" more difficult.

Actually, **** it.

No object files. But I'll support overlays in my loader if you'll put in the #pragma.

@aubreyrjones

Actually, no. The trivial bootloader can still work, if you don't compress the null space where the overlay goes. So it's compatible with overlay-free code without any penalty.

I'm totally doing it.

@krasin
Owner

Cool. Please keep us posted, because it would be nice if clang knew how to invoke your linker, so that instead of

clang -ccc-host-triple dcpu16 -S foo.c -o foo.s
clang -ccc-host-triple dcpu16 -S bar.c -o bar.s
nos_link foo.s bar.s -o binary

You will be able to say:

clang -ccc-host-triple dcpu16 foo.c bar.c -o binary

(Unrelated, but anyway.) Sure, -ccc-host-triple will go away soon. By default, clang will assume it, but the clang binary will probably have a dcpu16- prefix. So,

dcpu16-clang foo.c bar.c -o binary

is the future, if everything goes as planned.

@aubreyrjones

I'm hacking on this in ruby. I promise to write readable code, not monkey patch anything, and document it. I also promise not to require 600 rubygems. Only a couple. If you want it in C, we can arrange payment details. :)

I will tell you the basic invocation right now, because it will be simple at first. For a basic, make-this-a-loadable-program-at-standard-offset, it will be (note the order, options up front):

nos_link -o cwd/relative/output.x10b cwd/relative/first_asm_file.s [space separated list of more files]

Anyway, hacking time! I'll tell you when it's time to pull a copy.

@aubreyrjones

Hey folks.

My linker is about 80% functional. It seems to be generating a few invalid ops (well, more specifically, it seems to generate a few 3-byte ops when they should be 2-byte).

So, right now, if you take the -l listing, it will link in nos_link and properly assemble in an assembler that isn't broken. I expect to fix my assembler tomorrow.

I implemented '.hidden' (aliased to '.private') symbols; they will be guaranteed not to overlap with symbols defined by another module. (I just mangle names around module file names for .hidden symbols.) '.hidden' must appear in the same source file as the definition it's hiding; it need not appear before the definition.

The default visibility for a definition is global. Symbols not defined in a given module are considered extern. If they're not found in the program as a whole, it's considered a link error and the world explodes.
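As a rough illustration of the visibility rules above, the sketch below mangles `.hidden`/`.private` symbols with a prefix derived from the module file name. The `prefix__symbol` scheme and the `mangle_hidden` name are assumptions; the actual mangling in nos_link may differ.

```python
import re

HIDDEN = re.compile(r"^\s*\.(?:hidden|private)\s+(\w+)")
WORD = re.compile(r"\b[A-Za-z_]\w*\b")


def mangle_hidden(module_name, text):
    """Rename symbols marked .hidden so they cannot clash across modules."""
    lines = text.splitlines()
    # Collect directives first: .hidden may appear after the definition.
    hidden = {m.group(1) for m in map(HIDDEN.match, lines) if m}
    prefix = re.sub(r"\W", "_", module_name)  # "foo.s" -> "foo_s" (assumed scheme)

    def rename(match):
        word = match.group(0)
        return f"{prefix}__{word}" if word in hidden else word

    out = []
    for line in lines:
        if HIDDEN.match(line):
            continue  # the directive itself is consumed by the linker
        out.append(WORD.sub(rename, line))
    return "\n".join(out)


source = ":loop\n  SET PC, loop\n  JSR shared\n.hidden loop"
mangled = mangle_hidden("foo.s", source)
```

Symbols without a `.hidden` directive, such as `shared` here, keep their names and stay global, matching the stated default.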

Data is currently unimplemented. That's for tomorrow night.

You can get debugging listings with '-l' (for assembly listing), '-p' (for pre-binary), and '-b' (for asm/hex side-by-side).

I'm going to bed.

https://github.com/netzapper/nos_link

@aubreyrjones

Oh, to be clear. The linker does not break on any of the pseudo ops I've seen in the clang code I've generated. And the output does not include any pseudo-ops. So, at the very least, nos_link can already be used as a cleaner (that handles linkage visibility).

@ghost

Good job! I will have a look this evening (UTC+2).

@krasin
Owner

@netzapper Cool! You were fast!

@arbaal I'm going to bed soon too. Please share your experience after you've tried it.

@ghost

Okay, gave it a quick shot. After I installed your dependencies, I could run some basic tests, and so far it seems to work. But right now I can't get the output running in an emulator (dcpu16py). What is a good emulator right now? I mean one with binary support.

@aubreyrjones

I'm using dcpustudio (or whatever that's called).

As I said, I don't think the assembler is properly assembling right now. But, try doing "-l" on it and get the listing. That listing should be clean and properly mangled so that another assembler will take it and generate a proper binary image.

I'll be fixing the assembler and adding data word support tonight.

@aubreyrjones

Hey, just wanted to let y'all know that I've fixed the binary issue. It was outputting both a short literal and a parameter word (thereby introducing a bad instruction in the stream).
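For readers unfamiliar with the encoding, the sketch below shows the choice the assembler has to make for a literal operand, assuming the 1.1 spec layout (operand codes 0x20-0x3f for inline literals 0x00-0x1f, code 0x1f for "next word"). It illustrates the rule and is not nos_link's code; the function names are hypothetical.

```python
SHORT_LITERAL_MAX = 0x1F   # largest value that fits inline (1.1 spec, assumed)
NEXT_WORD_LITERAL = 0x1F   # operand code meaning "literal is in the next word"


def encode_literal(value):
    """Return (operand_code, extra_words) for a literal operand.

    Exactly one form is chosen: a short literal has no extra word, and a
    next-word literal has exactly one. Emitting both is the bug described.
    """
    value &= 0xFFFF
    if value <= SHORT_LITERAL_MAX:
        return 0x20 + value, []
    return NEXT_WORD_LITERAL, [value]


def assemble_set_a(value):
    """Assemble SET A, value: opcode 0x1, a = register A (0x00), b = literal."""
    code, extra = encode_literal(value)
    return [(code << 10) | (0x00 << 4) | 0x1] + extra


short_form = assemble_set_a(0x10)   # fits inline: one word
long_form = assemble_set_a(0x30)    # needs a parameter word: two words
```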

Check that out. It's handled everything I've given it, and it's all running as well as expected in the emulator.

I'm adding data word support in a couple hours.

@aubreyrjones

Alright, data word and string support is now in nos_link.

I'm using 16-bit chars for strings. The regex that picks them up currently does not allow for escaped quotes. I also have no syntax for the color control codes. We're going to need to develop that, since it'd be nice if it were consistent between clang and the assemblinker.
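One possible way to lift the escaped-quote limitation is sketched below. The regex, the escape table, and the `DAT` directive in the example are assumptions for illustration, and color control codes are not handled.

```python
import re

# A quoted string whose body is any mix of ordinary characters and
# backslash escapes, so \" no longer terminates the literal.
STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')
ESCAPES = {"n": "\n", "t": "\t", "0": "\0", '"': '"', "\\": "\\"}


def string_to_words(line):
    """Return one 16-bit word per character of the first string literal."""
    match = STRING.search(line)
    if match is None:
        raise ValueError("no string literal found")
    body = match.group(1)
    # Unknown escapes fall back to the escaped character itself.
    text = re.sub(r"\\(.)", lambda m: ESCAPES.get(m.group(1), m.group(1)), body)
    return [ord(ch) & 0xFFFF for ch in text]


words = string_to_words(r'DAT "a\"b"')   # DAT is a hypothetical directive name
```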

My next goal is to write the loader/preamble and define the basic program layout. Without any alignment constraints, my inclination is simply to spew forth instructions in a row. But, there are details to be concerned with. And if I'm going to support overlays, I need to do some thinking there.

Where do y'all want the stack? Located just above program space? Is there anything that needs to go into a live program header (program size/stack start, [0] = 0, scratch/context registers, etc.)? Or am I free to squeeze it down as tiny as I can?

@krasin
Owner

@netzapper I'm not really familiar with Ruby gems and how to install them. Could you please save me (and many of your prospective users) a couple of minutes and add a more step-by-step guide to the Prerequisites section in README.md?

@krasin
Owner

@netzapper location of the stack does not matter as long as it can grow down, but putting it on top of the address space is de-facto standard for DCPU-16.

@aubreyrjones

@krasin That work for you? It's pretty simple.

@aubreyrjones

Okay, stack goes up top. Didn't realize it grew down. Looking at the spec, of course it does.
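The layout can be sketched on the host as follows. PUSH is `[--SP]` per the spec; the initial SP of 0 and the `free_words` helper are assumptions made for the example.

```python
MEMORY_WORDS = 0x10000


def push(memory, sp, value):
    """PUSH is [--SP]: decrement first (wrapping at 16 bits), then store."""
    sp = (sp - 1) & 0xFFFF
    memory[sp] = value & 0xFFFF
    return sp


def free_words(program_end, sp):
    """Words left between the end of the program and the stack pointer."""
    # SP == 0 is treated as "nothing pushed yet" (assumed initial value).
    top = sp if sp != 0 else MEMORY_WORDS
    return top - program_end


memory = [0] * MEMORY_WORDS
sp = 0                              # assumed initial SP
sp = push(memory, sp, 0xBEEF)       # first push lands at the top of memory
```

The program grows up from low addresses and the stack grows down from the top, so the gap reported by `free_words` is the shared headroom between them.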

There anything else I'm missing? This is my first linker and my first assembler. (Although it's not my first assembly-level metaprogramming tool.)

@krasin
Owner

@netzapper nos_link works for me on the inputs from the test/ directory (nice that you have one!)

I will do a more thorough test when the new binary distribution of Clang is ready (I hope it will be within 24 hours; I just need to fix one or two crashes).

@ghost

I will close this issue, since there is binutils and we haven't heard anything from the original poster for a long time.

@ghost ghost closed this
This issue was closed.