Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Header file restructuring #238

Closed
wants to merge 21 commits into from
Closed

Header file restructuring #238

wants to merge 21 commits into from

Conversation

lcastelli
Copy link
Contributor

Currently the header files generated by hiphop contain extensive inclusions of other header files to satisfy various kinds of dependencies. Since the header files that get included contain in turn other inclusions, the total number of files that get included in each cpp file can easily become a substantial portion of the whole codebase that is being compiled. This makes incremental compilation much less effective, increases the memory required to compile each cpp file, and can even hit compiler hard limits with large codebases, like the maximum of 200 recursive inclusions of GCC.

This patch series is aimed at restructuring how header files are generated by hiphop with the primary purpose of substantially decreasing the occurrences of recursive inclusions of header files. Using these patches it's been observed roughly a 10 times reduction in the total amount of code that get included in each cpp file (mostly by recursive inclusions) when compiling a large codebase. This in turn has decreased the memory required by gcc for each compilation by roughly 30% and the compilation time by about 10%.

The series goes through some steps to add direct dependencies in situations where the code now relies on recursive inclusions, restructures the header files to keep track of their dependencies and satisfy them without additional inclusions when possible, and finally removes all the legacy inclusions rendered obsolete by the previous modifications.

The series also tries to simplify how header files can be managed by removing the requirement for the includers of class header files to include additional headers before them, in order to properly resolve the dependencies in the class headers themselves. That is, after these patches class headers can be directly included at the top of a file and the result should be valid C++ without requiring any additional inclusion. In order to do this forward headers are introduced for class headers as well, similarly to what is already done for file headers.

In general after these patches the organization of the generated header files is as follow:

  • cls/classname.h: contains the complete declaration of a single class/interface (including its constants).
  • cls/classname.fw.h: included at the top of the corresponding class header file, contains the code required to satisfy the dependencies of the header file.
  • php/filename.h: contains the complete declaration of everything that is defined in the corresponding php file, including its classes (that get declared by including the corresponding cls/classname.h file).
  • php/filename.fw.h: included at the top of the corresponding file header, contains the code required to satisfy the dependencies of that file.
  • php/filename.fws.h: basically a forward header for the cpp files, this file has not changed at all in this patch series.

Potentially all the forward headers could be collapsed with their corresponding header files since they only get included once, thus reducing the total amount of files generated. This has not been done because some dependencies are detected only when the header files are written, and it's thus easier to insert them afterwards in a separate file. This also maintains the same organization as before.

Although not strictly required, after all the patches have been applied system should be regenerated. This has not been done in a patch of this series to avoid creating too many conflicts for merges.

Feedback/reviews are welcome :)

This adds some more direct dependencies in a few places, where the code
before relied on the fact that the dependency would be satisfied by an
indirect inclusion. It is intended to prepare for eliminating most of
the (unneeded) indirect inclusions.
This adds a forward header file for classes. This file is intended to
declare all the constants, static strings, etc, that are required by
the class header.

This patch is a preparation for making the class header files
completely self-contained, so that it is not required to include
anything else before a class header file in order to get valid C++
code. This way the inclusion system can be simpler.
This adds two simple helper functions to simplify printing a namespace
block only in case it is needed.
This splits the literal string declarations to either go to the class
forward header or the file forward header, according to where they are
required. It also adds a function isFileOrClassHeader to CodeGenerator
to simplify the detection of whether we need to add header file
dependencies.
This splits the static array declarations between the class and the
file forward headers.
This adds explicit constant declarations in the forward headers in case
constants are used in the corresponding header file. Eventually this
can allow removing the inclusion of the whole header file that declares
those constants (which in turn could include other header files as
well).
Apparently they were completely missing, and things usually worked
because most of the times they are optimized away.
This adds class constant declarations to forward headers.

In addition the replacement of a class constant with its value is
removed from outputCPPImpl. This could create dependency problems since
the value could be an expression dependent on other declarations, which
would have not been added in the analysis/optimization phases. Note
that in case the value is static the replacement is already done during
the pre-optimization, so this additional occasion for replacement is
not needed in that case.
This adds class forward declarations when needed by header files.
When a class constant expression calls the lazy_initializer of the
corresponding class, its header file must have been included before (a
simple forward declaration is not sufficient, since a method is
called). This is not correctly done when the constant expression is a
default parameter, so it gets expanded in a header file. This patch
covers that case.
The rationale is to have everything defined by a class in its class
header for consistency.
Group everything that is related to a class in its class header.
Basic header files are already included in the forward headers, so they
are not needed in the file headers.
The modifications done before have made this inclusion superfluous.
This allows for a dramatic reduction in the total number of header
files included in each cpp file, especially because by removing the
inclusions in the header files this prunes a lot of indirect inclusions
that were performed before.
This is now possible because of all the previous modifications. This
further substantially reduce the number of indirect inclusions that are
performed.
This is done for consistency: the forward headers are reserved for
declarations required by the corresponding header files only.
…aders

The macros for forward declarations are also useful to declare the
p_##cls typedefs which are not declared anywhere else. Thus it makes
sense to have a forward class declaration of a class in its header
file, and just leave the forward header for the forward declarations
needed to satisfy the dependencies.

This patch also removes FORWARD_DECLARE_REDECLARED_CLASS since that
seems unused.
This restructures how files forward headers are laid out to make them
more compact and consistent with class forward headers.
It's not necessary to include fws files of all the dependencies, so
this patch removes them. It also divides the inclusions so that it's
clearer if a file was included to satisfy a dependency or not.
This patch refactors some header includes that no longer obtain the
intended definitions after the header file restructuring. This allows
regenerating system without errors.
Interfaces were forward declared in the depedencies the same ways as
classes are. This patch fixes those declarations.
@markw65
Copy link
Contributor

markw65 commented Jan 22, 2011

Great set of changes. Thanks for doing this. I just merged it to facebook's internal repo, so it should get out to github today or monday.

I'm seeing more like a 20-25% compile time speedup with our code base

@scottmac
Copy link
Contributor

It's merged now.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants