Some thoughts #1

Open
minexew opened this issue Sep 8, 2018 · 30 comments

Comments

@minexew
Collaborator

minexew commented Sep 8, 2018

Keep in mind that this will be no small undertaking. Unless you're a schizophrenic with 10 years of free time, we should start with minimalist goals, e.g.

  • instead of re-targeting the x86_64 HolyC JIT compiler to RISC-V right away, use/build a HolyC transpiler for now
  • don't bother with building actual hardware, use QEMU for testing (I believe it has excellent RV support) -- the chip, which future generations of mankind will use to run this OS, doesn't exist yet anyway
  • basically figure out the bare minimum subset to get something running, and then the details can be filled in
  • also note that RV has several variants, ideally we would choose just one to support

The first thing that is loaded after the bootloader is the kernel. Now I am not sure how much of the GUI is compiled in there and how much is JITted at boot. The second thing to get loaded is the compiler, which will need to be stubbed out until retargeted. Anyway, if and when we get any GUI running at all, it will be a big morale boost and could also attract more developers.

God says...
...damn, i need to take a massive dump now

@Byzantian
Owner

Byzantian commented Sep 8, 2018

Thank you for doing this. Your concerns pretty much line up with my own thoughts on the subject.

>compiler
Since most of TempleOS is written in HolyC, transpiling the bootloader and compiler to RISC-V assembly would be a good start. How familiar are you with Terry's compiler? I know that HolyC has some limited support for x86 assembly. Does it compile to that as an intermediate representation, or does it generate the machine code directly? If it does use an intermediate representation, all we would need to do is implement Terry's limited x86 subset as a "high level" language on top of RISC-V. But I have a hunch that it won't be that easy, since it's more likely that Terry just generates the machine code directly. I would like to avoid parsing HolyC if I don't have to. If we have to, it would be wise to use a parser generator like ANTLR instead of reinventing the wheel.

>hardware
I believe the only one manufacturing "open" RISC-V computers right now is SiFive. Their Freedom U540 actually seems capable of running TempleOS with its four 1.4 GHz cores. But I agree about targeting an emulator first, because:

a) The board hasn't even shipped yet
and
b) It is way too expensive at $1000

>RISC-V variants
I believe we should target RV64GC, as that is likely to be the standard for any upcoming high-performance single-board computers. Having bit manipulation or SIMD would be nice to make rendering easier, but it would be a gamble to rely on them (and we can always reimplement that later, should the ideal hardware magically come along). For now we should target the basic 64-bit ISA (G aka IMAFD). I've found this neat overview of the different variants, but I don't know how accurate it is. I've also read somewhere (probably in some official RISC-V documents, but I don't remember) that the low-level ring 0 privileged instruction set is not finalized yet, which is another strong argument for targeting an emulator first.
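To make the variant naming concrete, here is a minimal Python sketch that expands a name such as RV64GC into its base width and single-letter extensions. It relies only on the "G aka IMAFD" shorthand mentioned above; the helper names `parse_isa` and `supports` are made up for illustration, and multi-letter extensions are deliberately ignored.

```python
import re

# "G" is shorthand for the general-purpose set IMAFD, as noted above.
# Assumption: multi-letter extensions (Z*, X*) are out of scope for this sketch.
G_EXPANSION = "IMAFD"


def parse_isa(name):
    """Split a name like 'RV64GC' into (xlen, [single-letter extensions])."""
    match = re.fullmatch(r"RV(32|64|128)([A-Z]+)", name.upper())
    if match is None:
        raise ValueError(f"not a recognised ISA string: {name!r}")
    xlen = int(match.group(1))
    letters = match.group(2).replace("G", G_EXPANSION)
    # Drop duplicates while keeping first-seen order.
    extensions = list(dict.fromkeys(letters))
    return xlen, extensions


def supports(name, required):
    """True if every extension letter in `required` is part of `name`."""
    _, extensions = parse_isa(name)
    return set(required.upper()) <= set(extensions)
```

A check like `supports("RV64GC", "IMAFD")` could guard any code path that needs floating point or atomics, while anything relying on bit manipulation or SIMD would have to be tested for separately.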

@minexew
Collaborator Author

minexew commented Sep 8, 2018

I've only looked into the compiler briefly. Its front-end does generate intermediate bytecode (see Compiler/PrsExp.HC), which is not bound to x86 too much. Not sure how inline assembly is done, but it's probably very complete and doesn't use the bytecodes at all.
In theory the front-end could be split off fairly cleanly. Then a RV64 back-end could be built. Of course, you would only be able to run the compiler under TOS, which is somewhat impractical. Continuous integration would be a nightmare. But this would be the purest way.
An interesting variant would be to dump the bytecode into a file and have a linux interpreter that could execute this bytecode in linux userspace. That gets you the best of both worlds.

Then there is https://github.com/jamesalbert/HolyC-for-Linux, but it doesn't even go as far as parsing structs. Thus useless, unless the author would be willing to step up. But https://github.com/eliben/pycparser is fantastic and it wouldn't take too much effort to bend it for HolyC. Then you could spit out standard C and let clang chew on it.

@Byzantian
Owner

So I guess the first step would be reverse engineering Terry's bytecode and properly documenting it.

One of the bigger hurdles I see is the boot code. I've looked at the source code of TempleOS from time to time over the past year, but I never found the boot code. It seems to me that this early bootstrap stage is only available in binary form (please correct me if I'm wrong). Could we use the TempleOS disassembler to generate x86 code that we could then transpile by hand, or does that thing only work for JIT?

I'm currently digging through the complete video archive, trying to get a bigger picture of the OS architecture. Older videos, while potentially providing outdated information, are especially valuable because Terry seems a lot more lucid there.

@minexew
Collaborator Author

minexew commented Sep 8, 2018

Oh yeah, the bootloader code is hidden pretty well. It should all be in Adam/Opt/Boot.

QEMU should let us boot straight from ROM, that would save a lot of headaches. Then the bootloader can be minimal.

@Byzantian
Owner

Great find. Does that make Temple OS 100% self hosting, then?

I suggest writing a modified version of Terry's disassembler that displays his bytecode instead of x86 assembly. That could be our first way to get a foot in the door.

As to your suggestion about porting HolyC to linux, I am not quite sure how I feel about this. After all we are trying to port Temple OS to a new platform, as opposed to creating some sort of Temple OS on linux subsystem. What would be the benefit of doing so aside from making version control a bit easier? If we really have to interface with the compiler from outside the OS, we could communicate with the VM via sockets to start the compilation process.

@minexew
Collaborator Author

minexew commented Sep 8, 2018

Yeah, as far as I know TOS has been completely self-hosting for a while.

I don't have any particular reasons for wanting to run HC on linux aside from enabling CI. It just seems like a fun thing to do.

@Byzantian
Owner

Well that should be easy once we've cracked the bytecode.

Also, I guess now is as good a time as any to admit that I never fully managed to get TempleOS running in a VM. The installation process always produced an error. That was in July of 2017, so maybe I caught a bad ISO or something (what is the latest "build" that actually works?)

What is the preferred way of emulating TempleOS? VMWare or QEMU? I remember there is support to transport files between the host OS and TempleOS, is that possible on both emulators or only VMWare?

@minexew
Collaborator Author

minexew commented Sep 8, 2018

There are multiple ways to interact with the OS in a VM. Terry has scripts to copy files offline (when the VM is shut down) and those are for Vmware. Tramplersheikhs made a tool that lets you mount the FS while the VM is running (then he purged all his repos AGAIN).
I myself use MFA which requires you to run a server within TOS and then you can connect from the outside through TCP. Kinda like using FTP. It works at least in Virtualbox & QEMU.

@minexew
Collaborator Author

minexew commented Sep 8, 2018

https://archive.org/download/TempleOS_ISO_Archive/TempleOS_V5.03/Tos_distro_2017.11.20_19-52.iso is probably the latest release and should work just fine

@Byzantian
Owner

Thanks. It will take some time before I can start playing around with the disassembler, because I need to spend the next 4 weeks on my semester finals. After that I'll try to make weekly contributions. The only reason I started this project so early was because of the news that Terry had died, so I thought I should seize that opportunity to get potentially interested people on board.
See you then.

And yeah, it's fucking infuriating that Tramplersheikhs keeps deleting his shit for no reason. I actually got him to work on a UEFI port of Temple OS about a year ago, but then he just deleted his repos and disappeared. I hope that he will at least eventually reupload his Temple OS Demo.

@minexew
Collaborator Author

minexew commented Sep 9, 2018

This will be harder than I'd thought. The simplest Hello World compiles down to this:

Some of it is probably just repeats from different optimization passes? Idk.

@Byzantian
Owner

Is this the IR? How did you produce this? At first glance, the left column appears to be an instruction and the numbers some sort of parameter. CALL_START and CALL_END are very likely function/stack frame definitions.

It defines
17AB4C48
17AC1A48
17B2AF30
17CAF908

(the last two are identical)

The question is whether those numbers are actual addresses or just entries in the global symbol table (or maybe both?). What is also interesting is that it doesn't show the callee functions like 17D04FA0 (assuming my initial guess is correct).

I can't make heads or tails of the rightmost column. It seems completely unrelated to the instructions and numbers. Especially that colon.

@minexew
Collaborator Author

minexew commented Sep 9, 2018

There's a function ICPut that prints out a single IR instruction. You have to hack the compiler to actually use it.
From what I've figured out, the 3rd block is actually my code being compiled.
17xxxxx addresses are on the heap so they can represent anything; 2DD30h is the address of the kernel Print function. Basically the code is doing Print("hello world\n", 0); because that's how the TempleOS vararg calling convention for printf works.

What the other blocks represent, I have no clue. Perhaps the right column is not as important as it seems at first.
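As a rough illustration of what "interpreting the IR" for that Print call could look like, here is a toy Python interpreter over a linear instruction list. The opcode names CALL_START, CALL_END, IMM_I64 and STR_CONST are taken from this thread; the `CALL` opcode, the (opcode, operand) tuple layout, the argument order and all stack semantics are assumptions made for the sketch, not the compiler's real behaviour.

```python
def run(program, builtins):
    """Interpret a linear list of (opcode, operand) pairs."""
    stack = []   # evaluation stack shared by all instructions
    frames = []  # stack height recorded at each CALL_START
    for opcode, operand in program:
        if opcode == "CALL_START":
            frames.append(len(stack))
        elif opcode in ("IMM_I64", "STR_CONST"):
            stack.append(operand)
        elif opcode == "CALL":  # hypothetical opcode, not confirmed by the dump
            base = frames[-1]
            args = stack[base:]
            del stack[base:]
            stack.append(builtins[operand](*args))
        elif opcode == "CALL_END":
            frames.pop()
            stack.pop()  # discard the unused return value
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack


output = []


def fake_print(fmt, argc):
    """Stand-in for the kernel Print; records text instead of drawing it."""
    output.append(fmt)
    return len(fmt)


# Assumed encoding of Print("hello world\n", 0);
HELLO = [
    ("CALL_START", None),
    ("STR_CONST", "hello world\n"),
    ("IMM_I64", 0),
    ("CALL", "Print"),
    ("CALL_END", None),
]

leftover = run(HELLO, {"Print": fake_print})
```

The `builtins` table plays the role of the kernel symbol table here, so a real interpreter would map addresses like 2DD30h to host-side implementations instead of names.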

@Byzantian
Owner

Great work. Can you commit your compiler hack to the repository? Also if I could take another blind guess, I'd say that IMM_I64 and STR_CONST are local stackframe variables, with STR_CONST probably just being a pointer to the actual string.

@Byzantian
Owner

Byzantian commented Sep 9, 2018

Also, I'm assuming that Terry just invented his own calling convention and ABI, so figuring those out could help.

@minexew
Collaborator Author

minexew commented Sep 9, 2018

I have to figure out how to tell apart the JIT blocks and AOT blocks in an AOT project. Probably a flag somewhere.

@Byzantian
Owner

Byzantian commented Sep 10, 2018

Not sure if it's really any new information for you, but I've found the IC encoding.

185 different instructions total.

[screenshot: tos_ic_1]

Left is CInit.HC
Right is CompilerA.HH

@Byzantian
Owner

As great as DolDoc is, navigating the code without vim is kind of a pain. Do you know of any port? If not, we could compile vim into a position-independent, statically linked binary and run it on Temple OS. This guy has done it for a Z-Machine interpreter. This is obviously an ugly solution, but it could speed up development. However, I am not sure how big of a hassle this would actually be.

@Byzantian
Owner

If I understand this correctly, instructions and their arguments share the same struct.

[screenshot: tos_ic_2]

@Byzantian
Owner

Wait, this seems more like some sort of AST.

@minexew
Collaborator Author

minexew commented Sep 10, 2018

You're on the right track. But the arguments use CICArg, they don't point back to CIntermediateCode instructions

@minexew
Collaborator Author

minexew commented Sep 10, 2018

I sure hope the IC is always just a linear sequence before it's passed to the optimization stages

@Byzantian
Owner

Byzantian commented Sep 12, 2018

Huh, I totally missed that. Still CIntermediateCode contains a CICTreeLinks, which contains two CIntermediateCode pointers, arg1_tree and arg2_tree. What could their purpose be if the args themselves are already stored in two CICArgs? Also, what on earth is ic_body?
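To have something concrete to argue about, here is a Python model of the layout as described so far: an instruction with two CICArg operands plus a CICTreeLinks holding arg1_tree and arg2_tree. Only those type and link names come from the source; the fields `opcode`, `kind`, `value`, `arg1`, `arg2` and `tree` are invented. One possible reading (a guess, not a finding) is that the tree links remember which earlier instructions produced each operand, so the same nodes can be viewed either as a tree or as a linear sequence.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CICArg:
    """Operand descriptor; `kind` and `value` are invented fields."""
    kind: str = "none"
    value: object = None


@dataclass
class CICTreeLinks:
    """Links to the instructions that (by assumption) produce each operand."""
    arg1_tree: Optional["CIntermediateCode"] = None
    arg2_tree: Optional["CIntermediateCode"] = None


@dataclass
class CIntermediateCode:
    opcode: str
    arg1: CICArg = field(default_factory=CICArg)
    arg2: CICArg = field(default_factory=CICArg)
    tree: CICTreeLinks = field(default_factory=CICTreeLinks)


def linearize(root):
    """Post-order walk: operand producers first, then the instruction itself."""
    if root is None:
        return []
    return (
        linearize(root.tree.arg1_tree)
        + linearize(root.tree.arg2_tree)
        + [root.opcode]
    )


# Hypothetical example: 1 + 2, with "ADD_I64" as a made-up opcode name.
one = CIntermediateCode("IMM_I64", arg1=CICArg("imm", 1))
two = CIntermediateCode("IMM_I64", arg1=CICArg("imm", 2))
add = CIntermediateCode("ADD_I64", tree=CICTreeLinks(one, two))
```

If the real structure works anything like this, a linear dump and an AST-like view would not contradict each other; checking that against the compiler source is still open.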

@codingdandy

I know RISC-V is the Holy Grail and perfect for TempleOS goals, but what do you guys think of ARM64 as a first port? I guess that would be less painful, due to existing kernel examples and available boards, and it would give you experience with the TempleOS internals before something even more ambitious.

Correct me if I'm wrong, but this port would only reuse the HolyC userland and demos; all the rest (the actual hard part) needs to be rewritten almost from scratch: compiler, bootloader, video driver, rendering internals, keyboard and mouse drivers. Terry had it "easy" because he could use simple standards like VGA instead of HDMI and PS/2 instead of USB, which are the only interfaces available on most ARM boards. On the HiFive Unleashed there's not even that: the only communication is over serial. I guess some video can be done with the pins, but whatever.

The GPIO pins are nice, that could be the simple IO that Terry wanted to replace USB.

Either way, you gotta choose one board (let's say HiFive or Tinkerboard) and make it the official and only supported hardware for the port. Hopefully that works for QEMU as well.

@minexew
Collaborator Author

minexew commented Sep 25, 2018

The initial port target should not be the HiFive or any currently available hardware board. It should be a virtual QEMU-based system as simple as possible so that full focus can be on the CPU side until that has been dealt with.

@codingdandy

codingdandy commented Sep 26, 2018 via email

@Byzantian
Owner

Byzantian commented Sep 26, 2018

Thank you for your interest in the TempleOS-V project. It's encouraging to see that there are people out there who see value in this project. Welcome.

However, I am going to have to disagree with some of your premises:

>Correct me if I'm wrong, but this port would only reuse the HolyC userland and demos, all the rest (the actual hard part) needs to be rewritten almost from scratch: compiler, bootloader, video driver, rendering internals, keyboard and mouse drivers.

Not quite accurate. The bootloader of Temple OS is deprecated and full of x86 cruft. Most of the stuff it does, like identity-mapping memory pages, dealing with the A20 line, entering 32-bit mode from 16-bit mode and then jumping to long mode, is not going to be necessary on RISC-V, which should make writing a new bootloader a much easier task than it seems. We can just load the kernel into memory and then jump to it.

The compiler does not have to be rewritten, just re-targeted to produce RISC-V machine code. This is a good opportunity to explicitly state my ideas about porting the compiler: After reverse engineering the IR, we could write a C++-based interpreter which runs on bare-metal RISC-V, into which we feed an IR dump of the kernel and then also the compiler. Now we have an x86-producing compiler, running on a HolyC IR interpreter, running on RISC-V hardware. All that needs to be done is to rewrite the backend to produce RISC-V. After that, build the OS as usual and we should be set.

However, that does leave out the unfortunate reality that some parts of the kernel are x86 assembly; those need to be rewritten entirely by hand. (Even some stuff that is not inherently tied to the x86 platform is written in assembly for some reason. Guess Terry just liked writing assembly.)

Mouse and keyboard input is indeed going to be an issue, but IMO we should just use some addresses and interrupts as placeholders in our QEMU environment and then see if we can make it play nice with the VM (someone should look into the RISC-V emulator and see if there is any stuff about I/O). Once an actual RISC-V board with a proper south bridge and standardized MMIO comes along, we can just retrofit it to that.

As far as video is concerned, we are just going to write everything into a software framebuffer and forget that VGA exists. No VGA registers, no PCI, just write into a framebuffer. How we get the framebuffer to generate an actual video signal is stuff we can only worry about once there is actual hardware we could target.

I did not take a close look at the rendering system, but I don't see how it is tied to VGA at all. At the end of the day, it just writes into a framebuffer at a specific address.
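The "just write into a framebuffer" idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `Framebuffer` class, the one-byte-per-pixel row-major layout, and the 640x480 example dimensions. How the buffer would eventually reach a display is left open, as described above.

```python
class Framebuffer:
    """Flat software framebuffer: one byte per pixel, row-major order."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.pixels = bytearray(width * height)

    def put_pixel(self, x, y, color):
        # Writes outside the buffer are silently ignored.
        if 0 <= x < self.width and 0 <= y < self.height:
            self.pixels[y * self.width + x] = color

    def fill_rect(self, x, y, w, h, color):
        # Clip the rectangle to the buffer, then fill it row by row.
        for row in range(max(y, 0), min(y + h, self.height)):
            start = row * self.width + max(x, 0)
            end = row * self.width + min(x + w, self.width)
            if start < end:
                self.pixels[start:end] = bytes([color]) * (end - start)


# Example dimensions only; a real port would take these from its target.
fb = Framebuffer(640, 480)
fb.put_pixel(1, 2, 15)
fb.fill_rect(638, 0, 5, 1, 7)  # partly off-screen, so it gets clipped
```

Because rendering code only ever touches `pixels`, swapping the backing store for a memory-mapped region later should not require changes to the drawing routines.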

@Byzantian
Owner

About the actual hardware: There is a project to create a 100% libre RISC-V SOC, with an actual GPU that is also RISC-V with some custom extensions. Should this ever see the light of day, it could be the perfect hardware to develop this project for.

https://content.riscv.org/wp-content/uploads/2018/07/1130-19.07.18-Luke-Leighton-Software-Libre-Engineer-Advocate.pdf

https://www.youtube.com/watch?v=ojeHX33MPSc

@Byzantian
Owner

Okay, finals are over and I can finally begin to work on this project proper. This thread is getting kind of too long. Should we open a new issue for the IR reverse engineering?

@Epitome-of-Abnegation

What happened to this project?

4 participants