make case tables immutable #1636

Merged
merged 4 commits into dlang:master on Oct 15, 2013


MartinNowak
Member

  • Avoids redundant object copies and semantic
    analysis for every usage.
  • Multilib archives or gc-sections will take care
    of the binary size issue.

@DmitryOlshansky
Member

LGTM, I'll adjust the generator accordingly.

@MartinNowak
Member Author

> LGTM, I'll adjust the generator accordingly.

I see, where is the generator?

- This is mainly for consistency with other tables.
- Store the static immutable CodepointTries in separate functions.
- The semantic analysis and object generation only
  need to be done once, when building Phobos.
  Using those overloads then becomes a simple link dependency.

- add overloads for most common cases
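
As a rough illustration of the overload idea in these commit messages, here is a minimal sketch. The name `demoIcmp` and its ASCII-only comparison are hypothetical stand-ins, not the Phobos code. The assumption is that when such a plain overload lives in a precompiled library, callers with `string` arguments only need to link against it instead of instantiating the template themselves.

```d
// Generic version (hypothetical): instantiated and analysed again in every
// module that calls it with a new combination of argument types.
int demoIcmp(S1, S2)(S1 a, S2 b)
{
    import std.ascii : toLower;

    for (size_t i = 0; i < a.length && i < b.length; ++i)
    {
        immutable int x = toLower(a[i]);
        immutable int y = toLower(b[i]);
        if (x != y)
            return x - y;
    }
    // All shared positions are equal: the shorter input sorts first.
    if (a.length == b.length)
        return 0;
    return a.length < b.length ? -1 : 1;
}

// Plain (non-template) overload for the most common case. It forwards to one
// explicit instantiation of the template above.
int demoIcmp(string a, string b)
{
    return demoIcmp!(string, string)(a, b);
}

unittest
{
    assert(demoIcmp("Hello", "hELLO") == 0);   // string arguments
    assert(demoIcmp("abc", "abd") < 0);
    assert(demoIcmp("abc".dup, "ABC") == 0);   // char[] argument uses the template
}
```

Other argument types still go through the template, which matches the trade-off discussed later in the thread.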
@MartinNowak
Member Author

I found a few more ways to optimize compile time and object sizes.

@MartinNowak
Member Author

If making a .di file turns out to be too hard, we might be able to import std.internal.unicode_tables only inside the remaining templates, so that it is rarely imported.

@DmitryOlshansky
Member

> I see, where is the generator?

ATM here:
https://github.com/blackwhale/gsoc-bench-2012/blob/master/gen_uni.d

I plan to move it to the DLang tools repo one day.

@DmitryOlshansky
Member

Another thought: split it up (e.g. into 3 pieces: tries, sets and case entries) and use local imports in specific functions.
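
For readers unfamiliar with the feature, a minimal sketch of a function-local import in D follows. The helper `lowerOne` is hypothetical, and `std.uni` stands in for whichever table module would actually be imported. Because the helper is a template, its body, including the import, is only analysed when the template is instantiated.

```d
// Hypothetical helper; the point is only where the import statement sits.
dchar lowerOne(C)(C c)
    if (is(C : dchar))
{
    // Scoped to this function body rather than to the whole module.
    import std.uni : toLower;

    dchar d = c;
    return toLower(d);
}

unittest
{
    assert(lowerOne('A') == 'a');
    assert(lowerOne('\u00C4') == '\u00E4');
}
```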

@DmitryOlshansky
Member

The last thing that worries me is inlining, or more precisely the lack of it; see also:
dlang/dmd#2561

@WalterBright
Member

There's nothing hard about writing a .di file. It's just a list of declarations.

@DmitryOlshansky
Member

> There's nothing hard about writing a .di file. It's just a list of declarations.

But it doesn't help matters, as it kills the CTFE-ability of std.uni (and consequently of all the string/formatting code in Phobos).

@MartinNowak
Member Author

The function overload trick already heavily reduces compile time and template bloat for common use cases.
I think we can live with the fact that calling sicmp with non-strings requires some template instantiations.
The only performance penalty left is parsing std.internal.unicode_tables, which can be mitigated by localizing the import.
Can we merge this? I'll make a pull for the generator.

WalterBright added a commit that referenced this pull request Oct 15, 2013
@WalterBright WalterBright merged commit 5b5b5bb into dlang:master Oct 15, 2013
WalterBright added a commit that referenced this pull request Oct 15, 2013
@WalterBright
Member

Merging this did NOT decrease "hello world" size, it INCREASED it by 100Kb !

(On Windows 32)

@MartinNowak
Member Author

> Merging this did NOT decrease "hello world" size, it INCREASED it by 100Kb !
>
> (On Windows 32)

Ouch. I made this pull request to address the compile time issue and found some ways to deduplicate tables across different template instantiations. The hello world size was not what I optimized for.
The size of hello world increased because of something I didn't know: multilib only works for functions, not for data.
Part of this pull request was to write immutable tables into Phobos instead of generating them in every template instance (MartinNowak@45c873f).
The problem now is that Phobos contains a monolithic unicode_tables.o (343 kB according to nm -S). We could put the tables into functions to work around the issue, but the real solution is to extend multilib to data or to fix the outstanding gc-sections issues.

@WalterBright
Member

Sure, but extending multilib to data is not going to happen for 2.064. We need a fix for 2.064.

@MartinNowak
Member Author

> Sure, but extending multilib to data is not going to happen for 2.064. We need a fix for 2.064.

Wrapping the data in @property functions will fix the size issue, though it might cause inlining problems.
I'll have to wait for @blackwhale to fix gen_uni or explain how it works, because I can't get it to run and I don't want to spend hours editing the generated file by hand.
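
A minimal sketch of that wrapping, assuming a made-up table (`demoCaseTable` and its values are hypothetical, not the generated Phobos data): the table becomes a function-local static, so, per the reasoning in this thread, it is tied to a function symbol that the linker can pull in on demand.

```d
// Hypothetical stand-in for one generated table; the values are made up.
@property immutable(uint)[] demoCaseTable()
{
    // Function-local static data instead of a module-level variable.
    static immutable uint[] table = [0x41, 0x61, 0x42, 0x62];
    return table;
}

unittest
{
    // Thanks to @property, callers can keep using it like a plain variable.
    assert(demoCaseTable.length == 4);
    assert(demoCaseTable[1] == 0x61);
}
```

Whether the extra call is inlined is left to the compiler, which is the inlining concern raised above.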

@MartinNowak MartinNowak deleted the issue10866 branch October 15, 2013 23:24
@MartinNowak
Member Author

> Multilib archives or gc-sections will take care
> of the binary size issue.

Here is the pull that makes use of multilib: #1647. It trims hello world by ~300 kB.
