Decoupling of the front/middle/backend #297

Closed
chuggafan opened this issue Dec 6, 2018 · 62 comments

Comments

@chuggafan
Contributor

So, essentially what I'm asking for here is to change OCC's internals dramatically. I know this is a lot of work, but I think the benefits outweigh the cost of the rewrite.

What I'm asking for is to turn the frontend of OCC into a separate program from the middle/backend, similar to how Clang emits LLVM IR that is then compiled, or how GCC lowers GENERIC to GIMPLE and then to RTL before emitting assembly. The main goal is to decouple the compiler frontend from the backend and make the two less reliant on each other. That would let me (personally) work on #235 in stages: instead of writing the entire compiler and then re-verifying it, it would be simpler to write the frontend and output something standard that the middle end can consume and pass on to the backend. Decoupling these parts also has the distinct advantage of making OCC easier to modify for different platforms, because there would be a standard interchange format between the parts.

This issue is similar in nature to #227 in that it means a massive change to the internals, and I expect something like this could take months at a minimum. My personal feeling is that it would be a significant benefit in the longer term, for the reasons listed above and more. I can completely understand if you see it as infeasible, but I think it's an optimistic goal to work towards.

@LADSoft
Owner

LADSoft commented Dec 6, 2018

It actually sounds good. Having it split out that way is something I've thought of before, especially as I want to expand the 'adl' file to more directly generate the compiler output, similar to how it does ASM. I suppose the easy thing is to have the compiler generate an unoptimized version of the intermediate code to a file (in a more compact format), and then the middle portion could just read that in and do all the things it already does. We'd need some kind of profile to tell it how to allocate registers and some other stuff, I guess... As for what the middle layer outputs, I really don't know. Again, the easy thing is to just have it output the same intermediate code, but on the other hand it would be nice to do some more advanced optimizations around moving code around to avoid pipeline stalls... eventually! lol!
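
As a rough illustration of the file-based hand-off described above, the sketch below has a "front end" write unoptimized three-address instructions to a file, and a "middle end" read them back and apply one trivial optimization. The instruction layout, the JSON-lines encoding, and every name here (`Instr`, `write_ir`, `read_ir`, `fold_constants`, `demo`) are hypothetical; none of this is OCC's actual intermediate code or file format.

```python
import json
import os
import tempfile
from dataclasses import dataclass, asdict
from typing import List, Union

Operand = Union[int, str]  # an integer constant or a temporary/variable name


@dataclass
class Instr:
    """One hypothetical three-address instruction: dest = left <op> right."""
    op: str
    dest: str
    left: Operand
    right: Operand


def write_ir(path: str, code: List[Instr]) -> None:
    """Front end: dump unoptimized instructions, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for ins in code:
            f.write(json.dumps(asdict(ins)) + "\n")


def read_ir(path: str) -> List[Instr]:
    """Middle end: load the instructions the front end wrote."""
    with open(path, encoding="utf-8") as f:
        return [Instr(**json.loads(line)) for line in f if line.strip()]


def fold_constants(code: List[Instr]) -> List[Instr]:
    """Replace an 'add' of two integer constants with a 'mov' of their sum."""
    out = []
    for ins in code:
        if ins.op == "add" and isinstance(ins.left, int) and isinstance(ins.right, int):
            # The unused right operand of 'mov' is set to 0 in this toy format.
            out.append(Instr("mov", ins.dest, ins.left + ins.right, 0))
        else:
            out.append(ins)
    return out


def demo() -> List[Instr]:
    """Run front end -> file -> middle end on a two-instruction program."""
    program = [Instr("add", "t1", 2, 3), Instr("add", "t2", "t1", "x")]
    fd, path = tempfile.mkstemp(suffix=".ir")
    os.close(fd)
    try:
        write_ir(path, program)
        return fold_constants(read_ir(path))
    finally:
        os.remove(path)
```

Because the two halves only share the file format, either side could be replaced or tested on its own, which is the decoupling this issue is asking for.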

@chuggafan
Contributor Author

Yeah, the way Clang/GCC do it is that the frontend first parses everything and does basic IR generation with no optimizations applied. The IR then goes to the middle end, which does all of the optimizations using the hints dropped in the IR, and the result is handed to the backend to generate the final code. I know MSVC at least has a frontend/backend split, but I don't know whether it has a distinct middle end.

@LADSoft
Owner

LADSoft commented Dec 7, 2018

Yeah, when I thought about it a while back I concluded a three-tiered approach would be good. Seems like I'm in good company :) Well, I like this little project, but I need to get back to #227 as soon as I clear a few more items from the list :)

@LADSoft
Owner

LADSoft commented Oct 8, 2019

I've started working on this. It will take a while as there is a lot to it...

@LADSoft
Owner

LADSoft commented Oct 20, 2019

So I have done most of the infrastructure work to separate occ into multiple programs. The main thing missing is streaming compilation-related data in and out of files to pass it between the separate programs. Still a lot of details to take care of though.

@LADSoft
Owner

LADSoft commented Nov 4, 2019

This weekend I finished the rewrite and did some minor testing. At this point I've got 'hello world' compiling properly, but larger programs are problematic.

@chuggafan
Contributor Author

Nice! Hoping you work out the rest of the kinks so that it can become "production worthy"!

@LADSoft
Owner

LADSoft commented Nov 4, 2019

I'm actually very happy; I got pretty far with it given how much was rewritten. Really wasn't expecting to be able to generate any EXE file so fast :) I have an idea what might be wrong though: I went crazy rewriting the local stack allocation routine and it shows lol! I think there may be variable collisions...
But even so, there is still a lot of testing to do...

LADSoft added a commit that referenced this issue Nov 10, 2019
LADSoft added a commit that referenced this issue Nov 15, 2019
@LADSoft
Owner

LADSoft commented Feb 13, 2020

So I was going to punt and get rid of one of the compile passes, comparing the outputs of the last two compile passes for validation purposes. Unfortunately they don't match, even though everything works properly... It seems like there is some small problem in the code generated for the linker which changes the link order (well, that is what it seems like). I'll look into that soon... If we can get past this, it might be time to close this issue.
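
The validation idea above (checking that two successive compile passes produce identical output) can be sketched as a byte-for-byte comparison that reports the first mismatching offset, which helps when tracking down things like a changed link order. This is a generic illustration; the helper name `first_difference` is hypothetical and this is not the script the OCC build uses.

```python
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk_size: int = 65536) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the returned offset is the
    length of the shorter file.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the exact mismatching byte within this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No byte differs, so one chunk is shorter than the other.
                return offset + min(len(a), len(b))
            if not a:  # both files ended at the same point
                return None
            offset += len(a)
```

A result of `None` for every output file of the last two passes would mean the compiler reproduces itself exactly.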

LADSoft added a commit that referenced this issue Apr 19, 2020
@LADSoft
Owner

LADSoft commented Apr 19, 2020

I got back to this after a hiatus and fixed a couple of things... Bugs in the new code were a large part of why I couldn't just compare the files. So it is good I went down this road...

There are a couple more issues. One is that the template engine is recompiling stuff it doesn't need to recompile (after compiling the compiler with itself). Another is that the __DATE__/__TIME__ usage in all the modules needs to be reworked, since it is another problem when comparing files built at different times... Will get back to it in a few days.

LADSoft added a commit that referenced this issue Apr 22, 2020
@LADSoft
Owner

LADSoft commented Apr 22, 2020

Down to one more issue: the __DATE__ and __TIME__ references throughout the project cause comparison problems. Will fix it over the next few days...

@GitMensch
Contributor

I highly suggest not changing anything in OrangeC's sources, and possibly also not in the generated binaries, concerning timestamps. Instead, add support for the environment variable SOURCE_DATE_EPOCH. You can then set the variable in the build before running (ideally to a correct value), and all following runs will have the same but still correct timestamps.
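
A minimal sketch of the idea: a tool that embeds a build timestamp first checks SOURCE_DATE_EPOCH (an integer number of seconds since the Unix epoch, per the reproducible-builds convention) and falls back to the current time only when the variable is unset or empty. The helper name `build_timestamp` and its `env` parameter are hypothetical; this is not OrangeC's implementation.

```python
import os
import time
from datetime import datetime, timezone
from typing import Mapping, Optional


def build_timestamp(env: Optional[Mapping[str, str]] = None) -> datetime:
    """Return the UTC time a build tool should embed in its outputs.

    Uses SOURCE_DATE_EPOCH when it is set and non-empty, otherwise the
    current time. A non-integer value raises ValueError.
    """
    env = os.environ if env is None else env
    raw = env.get("SOURCE_DATE_EPOCH", "").strip()
    seconds = int(raw) if raw else int(time.time())
    return datetime.fromtimestamp(seconds, tz=timezone.utc)
```

With the variable exported once at the start of a build, every tool that calls a helper like this embeds the same timestamp, so outputs built at different times can still be compared byte for byte.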

@LADSoft
Owner

LADSoft commented Apr 23, 2020

that sounds like a really good idea that will save me a lot of work... I suppose all the major compilers and linkers support it? Looks like MSVC does at least...

@chuggafan
Contributor Author

As far as I can tell, MSVC does not support SOURCE_DATE_EPOCH; that is a GCC and Clang thing. As you can see from bazelbuild/bazel#9466, MSVC doesn't have deterministic debug builds. They're working on it, and the last update they gave, on Feb 14th, is at https://developercommunity.visualstudio.com/idea/426033/support-for-cls-experimentaldeterministic-becoming.html, so it might come eventually, but currently it is not there.

@LADSoft
Owner

LADSoft commented Apr 24, 2020

hm should have read the article more completely I guess lol!

Well, I found a couple more bugs: localtime not generating the right values in the presence of Daylight Saving Time, and something where dlpe could fill unused portions of the executable with garbage... But after fixing those, the files compare nicely now. Hope to be able to get the build fixed and working this weekend so I can close out this issue...

@GitMensch
Contributor

> Hope to be able to get the build fixed and working this weekend so I can close out this issue...

Sounds really cool. You may want to open a new one for SOURCE_DATE_EPOCH if this isn't already "in process" (or maybe even then, to be able to reference it in the Changelog).

Do I understand correctly that after this one is closed (and possibly after some tests, like the calculation test we had before and maybe even a rough GC test [to see that it did not make things much worse for those "special uses"]), it is time for a new release?

@LADSoft
Owner

LADSoft commented Apr 24, 2020

there was issue #195 for adding SOURCE_DATE_EPOCH to the preprocessor... and it was something already done. I added support so that dlpe would also use it for time stamps in the executable files and it will be checked in against #195 as well...

The corresponding milestone won't be complete as there are a lot of small issues that got added to it that I need to address. But we could have a release anyway as there have been some useful bug fixes done in the meantime. Yeah but I need to do some testing this weekend, in any case. I haven't made major changes since running the various automake scripts but it would be a good test anyway... I guess lets see what it looks like early next week?

@GitMensch
Contributor

> there was issue #195 for adding SOURCE_DATE_EPOCH to the preprocessor... and it was something already done.

Wow, how time flies by...

> I added support so that dlpe would also use it for time stamps in the executable files and it will be checked in against #195 as well...

Sounds good. As was necessary back then, you may need to update some docs together with this.

> I guess lets see what it looks like early next week?

Completely fine with me. My point was mainly about the many internal changes that have already happened since the last release (and it seems that the milestone may need more months to finish [which is not bad, but a "compare with version X" option may still be useful]).

@LADSoft
Owner

LADSoft commented Apr 25, 2020

I thought I should note that the general math tests and various other test files that we came up with previously (including the c-testsuite and the mcpp preprocessor tests and various C++ language stressors) were rolled into the compiler tests. They get run every time we do a checkin and have been stable since the new code went to the master branch. So I figured if I just do basic testing with the autoconfig projects to make sure nothing is broken there that will pretty much cover all the bases.

@LADSoft
Owner

LADSoft commented Apr 26, 2020

There was one more issue found while compiling mpir, but it was probably pre-existing, and I've fixed it. The builds on appveyor succeeded last night and now it is in the process of building the release binaries, so I'm closing this.

LADSoft closed this as completed Apr 26, 2020