Implement reference counted AST nodes #2222

Merged on Dec 18, 2016 (1 commit)

mgreter
Contributor

@mgreter mgreter commented Oct 25, 2016

This is some work I started maybe a year ago to change the current memory handling to a reference-counted implementation (now rebased onto current master). It is really just a POC and probably covers only about 10% of the changes that would be needed. Nonetheless it is already a workable POC and might serve as a starting point for anyone who would like to complete this task. At the moment libsass can take up an absurd amount of memory in certain situations (mostly with loops), and the only solution I see is to change the current memory handling to a more fine-grained implementation. As there is no way to create closures or references in libsass, a reference-counting implementation seems to fit perfectly (there should be no need for a multi-generation GC implementation).

The basic idea is to hold a stack object instead of the raw pointer to an AST_Node. The stack object is in charge of holding the pointer and deallocating it once the reference count drops to zero. This is a "smart pointer" implementation. We could perhaps use shared_ptr from C++11, but I'm not sure how portable that would be (gcc 4.4?). A custom implementation gives us more freedom to apply it to the existing code base and also has better L1 cache locality, because the reference counter is attached directly to every object.

I might pick up this work myself, but I'm not sure when (next year or next decade), as I would estimate at least 160h of work until this is fully usable for libsass (the rebasing alone took me 30h to get to a working state again). Posting it here anyway, so the base work is not lost in the GitHub nirvana.

The full glory, with individual rebase commits, can be seen at
https://github.com/mgreter/libsass/commits/memory/ref-count-poc
POC: https://github.com/mgreter/libsass/blob/memory/ref-count-poc/src/context.cpp#L554-L584
https://github.com/mgreter/libsass/blob/memory/ref-count-poc/src/ast_fwd_decl.hpp#L33-L56

@mgreter
Contributor Author

mgreter commented Oct 30, 2016

Progressed at https://github.com/mgreter/libsass/tree/poc/memory-ref-count
Got a working end to end POC for the following sample:

foo {
  @for $i from 0 to 1000000 {
    $q: $i !global;
  }
  bar: $q;
}

This now uses a constant amount of memory (~14MB), while the old version peaked at ~210MB.

@mgreter mgreter force-pushed the memory/ref-count branch 8 times, most recently from 90853bf to 6c3c5be Compare November 20, 2016 04:56
@mgreter
Contributor Author

mgreter commented Nov 20, 2016

This is getting close to being ready to merge. I've put a lot of effort into it over the last month (100h+) and it seems feasible now. I changed a lot of stuff along the way and still have some major refactoring on the roadmap before this work is ready.

@chriseppstein
Contributor

This seems like a good cleanup. The memory usage at LinkedIn is very high and it would be great to see an improvement in this area. It also seems like this infrastructure will be pretty critical to making first-class functions from sass 3.4 work well.

@mgreter
Contributor Author

mgreter commented Nov 21, 2016

@chriseppstein the memory usage will improve significantly. Other than that there is not much more benefit to it, as we currently just record each memory allocation and delete it once the whole context is gone (therefore accumulating a lot of temporary and intermediate memory chunks, but not leaking any memory either). This may also have some minor performance drawbacks, since maintaining the reference counter does involve some cost. But I hope that in the long run we can regain that through clearer memory management (i.e. by avoiding unnecessary cloning).
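
To make the "record each allocation, delete with the context" behaviour concrete, here is a minimal sketch of that ownership model. It is an illustration only: `ContextMemory`, `Node` and `peak_live_nodes` are hypothetical names, not the LibSass classes, and the real memory manager may differ in detail.

```cpp
#include <cstddef>
#include <vector>

// Node type with a live-instance counter so the effect can be observed.
struct Node {
  static int live;
  Node() { ++live; }
  ~Node() { --live; }
};
int Node::live = 0;

// Hypothetical stand-in for a context-wide memory manager: every
// allocation is recorded and only freed when the context goes away.
class ContextMemory {
 public:
  ContextMemory() {}
  ~ContextMemory() {
    for (std::size_t i = 0; i < nodes_.size(); ++i) delete nodes_[i];
  }
  Node* create() {
    nodes_.push_back(nullptr);   // reserve the slot first (exception safe)
    nodes_.back() = new Node();
    return nodes_.back();
  }
 private:
  ContextMemory(const ContextMemory&);             // non-copyable
  ContextMemory& operator=(const ContextMemory&);  // non-copyable
  std::vector<Node*> nodes_;
};

// Allocates `iterations` short-lived nodes and reports how many are still
// alive at the end of the loop, before the context is destroyed.
int peak_live_nodes(int iterations) {
  ContextMemory mem;
  for (int i = 0; i < iterations; ++i) {
    mem.create();  // temporary value, nothing refers to it afterwards
  }
  return Node::live;
}
```

Nothing leaks here, since the context destructor frees everything, but the number of live nodes grows with the loop count. That is the pattern a per-object reference count is meant to avoid.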

BTW, the current state of this PR is stable in terms of the spec tests. According to my internal memory tracking, there are no leaks and there shouldn't be any dangling pointers. Nonetheless I'm quite sure there are still some edge cases that may leak some memory, which would indicate that we lack spec tests for these cases.

@mgreter
Contributor Author

mgreter commented Nov 22, 2016

Created a first draft for some documentation.

LibSass smart pointer implementation

LibSass uses smart pointers very similar to the shared_ptr known
from Boost or C++11. The implementation is a bit less modular since
that was not needed. Various compile time debug options are
available if you need to debug memory life-cycles.

Memory Classes

SharedObj

Base class for the actual node implementations. This ensures
that every object has a reference counter and other values.

class AST_Node_Impl : public SharedObj { ... };

SharedPtr (base class for SharedImpl)

Base class that holds on to the pointer. The reference counter
is stored directly inside the pointed-to object (SharedObj).

SharedImpl (inherits from SharedPtr)

This is the main base class for objects you use in your code. It
will make sure that the memory it points at will be deleted once
all copies to the same object/memory go out of scope.

Class* pointer = new Class(...);
SharedImpl<Class> obj(pointer);

To spare developers from typing the templated class every time,
we created typedefs for each available AST Node specialization.

typedef SharedImpl<Number> Number_Obj;
Number_Obj number = SASS_MEMORY_NEW(...);
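
The relationship between the three classes can be sketched roughly as follows. This is a simplified illustration under assumptions, not the actual LibSass code: `RefCounted` and `Handle` are hypothetical stand-ins for SharedObj and SharedImpl, the counter is public for brevity, and the debug features are left out.

```cpp
#include <cstddef>

// Hypothetical stand-in for SharedObj: the counter lives in the object.
class RefCounted {
 public:
  RefCounted() : refcount_(0) {}
  virtual ~RefCounted() {}
  std::size_t refcount_;  // public only to keep the sketch short
};

// Hypothetical stand-in for SharedImpl<T>: a stack object that owns a share.
template <class T>
class Handle {
 public:
  Handle(T* p = 0) : ptr_(p) { retain(); }
  Handle(const Handle& other) : ptr_(other.ptr_) { retain(); }
  Handle& operator=(const Handle& other) {
    Handle tmp(other);  // retain the new target first
    T* old = ptr_;      // then swap, so tmp's destructor
    ptr_ = tmp.ptr_;    // releases the old target
    tmp.ptr_ = old;
    return *this;
  }
  ~Handle() { release(); }
  T* operator->() const { return ptr_; }
  T* get() const { return ptr_; }
 private:
  void retain() { if (ptr_) ++ptr_->refcount_; }
  void release() {
    if (ptr_ && --ptr_->refcount_ == 0) delete ptr_;
    ptr_ = 0;
  }
  T* ptr_;
};

// Example node type with a live-instance counter for demonstration.
struct Number : RefCounted {
  static int live;
  double value;
  explicit Number(double v) : value(v) { ++live; }
  ~Number() { --live; }
};
int Number::live = 0;

typedef Handle<Number> Number_Obj;  // mirrors the typedef convention above
```

With this sketch, `Number_Obj a = new Number(42);` takes the first share, copying `a` takes another, and the node is deleted when the last handle goes out of scope.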

Circular references

Reference-counted memory implementations are prone to leaks caused by
circular references. This can be addressed by using a multi-generation
garbage collector, but for our use case that seems overkill. So far there
is no way for users (Sass code) to create circular references, so we can
code around this possible issue. But developers should be aware of it.

There are AFAIR two places where circular references could happen. One is
the sources member on every Selector. The other one can happen in the
extend code (Node handling). The easy way to avoid this is to only assign
complete object clones to these members. If you know the object's lifetime
is longer than the reference you create, you can also just store the raw
pointer. If needed, this could be solved with weak pointers.
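
The following sketch shows why a cycle defeats reference counting and how a non-owning back link avoids the problem. It uses std::shared_ptr and std::weak_ptr from C++11 purely as a stand-in for the LibSass classes (which have no weak pointer at this point); `Node` and both function names are made up for the illustration.

```cpp
#include <memory>

struct Node {
  static int live;
  std::shared_ptr<Node> strong_peer;  // owning link: can form a cycle
  std::weak_ptr<Node> weak_peer;      // non-owning link: keeps nothing alive
  Node() { ++live; }
  ~Node() { --live; }
};
int Node::live = 0;

// Returns how many nodes survive after all local handles are gone.
int leaked_with_strong_cycle() {
  const int before = Node::live;
  Node* raw_a = 0;
  {
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = std::make_shared<Node>();
    a->strong_peer = b;
    b->strong_peer = a;  // a <-> b: each keeps the other's count above zero
    raw_a = a.get();
  }
  const int leaked = Node::live - before;  // both nodes are still alive
  // Break the cycle by hand so the demo itself does not leak.
  std::shared_ptr<Node> cut;
  cut.swap(raw_a->strong_peer);
  return leaked;
}

int leaked_with_weak_back_link() {
  const int before = Node::live;
  {
    std::shared_ptr<Node> a = std::make_shared<Node>();
    std::shared_ptr<Node> b = std::make_shared<Node>();
    a->strong_peer = b;  // a owns b
    b->weak_peer = a;    // b only observes a
  }
  return Node::live - before;
}
```

Storing a raw pointer for the back link, as suggested above, has the same effect on the counts as the weak link; it just gives no way to check whether the target is still alive.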

Addressing the invalid covariant return types problems

If you are not familiar with the mentioned problem, you may want
to read up on covariant return types and virtual functions.

We hit this issue at least with the CRTP visitor pattern (eval, expand,
listize and so forth). This means we cannot return reference counted
objects directly. We are forced to return raw pointers or we would need
to have a lot of explicit and expensive upcasts by callers/consumers.
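
A small example of the language rule behind this: an override may narrow a raw pointer return type, but not a class-template wrapper around it. `Expression`, `Number` and `Handle` below are illustrative names only, not the LibSass AST classes.

```cpp
struct Expression {
  virtual ~Expression() {}
  // Raw pointer return: derived classes may narrow the return type.
  virtual Expression* clone() const { return new Expression(*this); }
};

struct Number : Expression {
  double value;
  explicit Number(double v) : value(v) {}
  // OK: Number* is covariant with Expression*.
  virtual Number* clone() const { return new Number(*this); }
};

// By contrast, a handle return type cannot be narrowed. With a hypothetical
// Handle<T> wrapper the override below is rejected by the compiler, because
// Handle<Number> and Handle<Expression> are unrelated types as far as the
// covariance rule is concerned:
//
//   struct Expression { virtual Handle<Expression> clone() const; };
//   struct Number : Expression {
//     Handle<Number> clone() const;  // error: invalid covariant return type
//   };
```

With raw pointers, a caller that holds a `Number` gets a `Number*` back from `clone()` without any cast, which is the convenience the visitor code relies on.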

Simple functions that allocate new AST Nodes

In the parser step we often create new objects and can just return a
unique pointer (meaning ownership clearly shifts back to the caller).

typedef Number* Number_Ptr;
int parse_integer() {
  ... // do the parsing
  return 42;
}
Number_Ptr parse_number() {
  Number_Ptr p_nr = SASS_MEMORY_NEW(...);
  p_nr->value(parse_integer());
  return p_nr;
}
Number_Obj nr = parse_number();

The above would be the encouraged pattern for such simple cases.

Allocate new AST Nodes in functions that can throw

There is a major caveat with the previous example. Consider this
more realistic implementation that throws an error. The throw may
happen deep down in another function.

int parse_integer() {
  ... // do the parsing
  if (error) throw(error);
  return 42;
}

With this parse_integer function the previous example would leak memory.
If it throws, the Number allocated in parse_number is never freed, because
it was never assigned to a reference-counted object (Number_Obj). Therefore
the above code would better be written as:

typedef Number* Number_Ptr;
int parse_integer() {
  ... // do the parsing
  if (error) throw(error);
  return 42;
}
// this returns a dangling pointer:
// should return Number_Obj instead
// though not possible for virtuals!
Number_Ptr parse_number() {
  Number_Obj nr = SASS_MEMORY_NEW(...);
  nr->value(parse_integer());
  return &nr;
}
Number_Obj nr = parse_number();

The example above unfortunately will not work as is, since we return a
Number_Ptr from that function. Therefore the object allocated inside
the function is already gone when it is picked up again by the caller.
The easy fix for the given simplified use case would be to change the
return type of parse_number to Number_Obj. Indeed we do it exactly
this way in the parser. But as stated above, this will not work for
virtual functions due to invalid covariant return types.

Return managed objects from virtual functions

The easy fix would be to just create a new copy on the heap and return
that. But this seems like a very inelegant solution to this problem. I
mean can't we just tell the object to treat it like a newly allocated
object? And indeed we can. I've added a detach method that will tell
the object to survive deallocation until the next pickup.

typedef Number* Number_Ptr;
int parse_integer() {
  ... // do the parsing
  if (error) throw(error);
  return 42;
}
Number_Ptr parse_number() {
  Number_Obj nr = SASS_MEMORY_NEW(...);
  nr->value(parse_integer());
  return nr.detach();
}
Number_Obj nr = parse_number();
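
For illustration, the detach idea can be sketched with a single flag next to the counter. This is a guess at the mechanics based on the description above, not a copy of the real implementation: the `detached` flag, the simplified `Number` and `Number_Obj` types, and the `fail` parameter are all assumptions made for the example.

```cpp
#include <cstddef>
#include <stdexcept>

struct Number {
  static int live;
  std::size_t refcount;
  bool detached;  // set by detach(), cleared again on the next pickup
  int value;
  Number() : refcount(0), detached(false), value(0) { ++live; }
  ~Number() { --live; }
};
int Number::live = 0;

class Number_Obj {
 public:
  Number_Obj(Number* p = 0) : ptr_(p) { retain(); }
  Number_Obj(const Number_Obj& o) : ptr_(o.ptr_) { retain(); }
  ~Number_Obj() { release(); }
  Number* operator->() const { return ptr_; }
  // Hand out the raw pointer; the node survives this handle's destruction.
  Number* detach() {
    if (ptr_) ptr_->detached = true;
    return ptr_;
  }
 private:
  Number_Obj& operator=(const Number_Obj&);  // not needed in this sketch
  void retain() {
    if (ptr_) { ptr_->detached = false; ++ptr_->refcount; }
  }
  void release() {
    if (ptr_ && --ptr_->refcount == 0 && !ptr_->detached) delete ptr_;
  }
  Number* ptr_;
};

int parse_integer(bool fail) {
  if (fail) throw std::runtime_error("parse error");
  return 42;
}

Number* parse_number(bool fail) {
  Number_Obj nr = new Number();     // owned from the first line on
  nr->value = parse_integer(fail);  // if this throws, nr frees the node
  return nr.detach();               // otherwise hand it to the caller
}
```

The caller picks the node up again with `Number_Obj nr = parse_number(false);`. If the returned pointer is never assigned to an object, the node leaks, so in this sketch a detached pointer must always be picked up.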

Compile time debug options

Note: not yet finalized

  • compile time option to add file/line for each allocation
  • compile time option to track all allocated/leaked memory
  • compile time option to trace (specific) memory life-cycles

Why reinvent the wheel when there is shared_ptr from C++11

First, implementing a smart pointer class is not really that hard. It
was indeed also a learning experience for myself. But there are more
profound advantages:

  • Better GCC 4.4 compatibility (which most code still has OOTB)
  • Not thread safe (gives us some free performance on some compilers)
  • Being able to track memory allocations for debugging purposes
  • Adding additional features if needed (as seen in detach)
  • Optional: optimized weak pointer implementation possible

Thread Safety

As said above, this is not thread safe currently. But we don't need
this ATM anyway. And I guess we probably never will share AST Nodes
across different threads.

@mgreter
Contributor Author

mgreter commented Nov 23, 2016

This PR is ready in technical terms. The code only needs more cleanup and API polishing. I'm quite sure I can get this ready before X-Mas. I also tested performance and it is on par with LibSass before this change, but with a dramatically smaller memory footprint. I only compared spec test runtimes, but I would guess that more complex use cases might profit from this performance-wise.

@mgreter
Contributor Author

mgreter commented Nov 25, 2016

This could be merged right now as it is. I want to do more refactoring and split at least certain compile units into more fine-grained ones. But I guess this PR is already big enough. Prolonging this just forces me to put more effort into maintaining the commits than into making productive improvements to the code base.

I would also like us (LibSass) to follow commonly used naming conventions (e.g. the Google C++ style guide) and rename our classes to camel case (e.g. Binary_Expression => BinaryExpression). There might be some rules for special memory objects (e.g. BinaryExpression_Ptr). Methods should also be CamelCase, while private things can still be in underscore notation. But honestly I don't care too much about very strict code conventions, as long as there is some consistency.

And I just want to stress that English is only my third language, so please bear with me!
Regards
Marcel

@mgreter
Contributor Author

mgreter commented Nov 25, 2016

@mgreter mgreter self-assigned this Nov 25, 2016
@xzyfer
Contributor

xzyfer commented Nov 25, 2016 via email

@drewwells
Contributor

great work @mgreter this looks awesome!

@@ -1,10 +1,12 @@
#include "sass.hpp"
#include "ast.hpp"
#include "context.hpp"
#include "debugger.hpp"
Contributor

dangling debugger?

class Keyframe_Rule : public Has_Block {
ADD_PROPERTY(Selector*, selector)
class Keyframe_Rule_Ref : public Has_Block_Ref {
ADD_PROPERTY(Selector_Obj, selector2)
Contributor

what's with the 2 suffix?

Contributor Author

Used during dev, since I had to replace each class with an updated one without breaking stuff. Left for now (will rename shortly before the PR is finally ready). Used to differentiate the regular selector from the Keyframe_Rule one. Makes it easier to track where variables are passed and used.

@xzyfer
Contributor

xzyfer commented Dec 11, 2016

These massive PRs are both conceptually and technically difficult to review, but I gave it a shot. It took well over an hour, so I've just left comments in place as I worked through the diff. Some of the questions may have been answered further down in the diff.

First, there are some good quality of life improvements here. Not needing to pass around ctx.mem to constructors anymore is a win in and of itself.

Feedback

  • document survive alongside detach
  • I'm not convinced there's value in having our DSL in _PTR_CONST typedefs
    • I'm not bothered either way
  • SASS_MEMORY_CAST is used seemingly haphazardly.
    • As much as I'd hate to see this PR grow, we should go all in or not at all in this PR

Questions

You've expanded some operations, which in many cases has made the code harder to grok. Is this for technical reasons, to debug, or personal taste?

e.g.

// expanded form
Statement_Obj stm = b->at(i);
Statement_Obj ith = stm->perform(this);
// compact form
Statement_Obj ith = b->at(i)->perform(this);

In places like CSSize this IMHO has negatively affected the readability of the code.

Asides

This requires further offline discussion, but I'm not convinced aiming for GCC 4.4 is worthwhile. We currently state GCC 4.6+ as a requirement, but in practice (I think) we require 4.7 or 4.8.

IMO there's a lot of value in adopting C++11 language features, e.g. shared_ptr and move semantics, where they make sense.

@xzyfer
Contributor

xzyfer commented Dec 11, 2016

Is it worth changing the error handling model of the parser in order to simplify this implementation? It shouldn't be difficult to change the parser to not throw unexpectedly.

@mgreter
Contributor Author

mgreter commented Dec 13, 2016

Rebased to master. Addressed pretty much all of your comments. Thanks for the pre-review.

  • The XYZ_PTR_CONST typedefs are needed, as the code would not compile otherwise, and writing const XYZ_PTR is not the same thing. I guess it is something for further investigation. But it works OK as it is now and there seem to be bigger things to look at in this initial stage.

  • Renamed survive to detach. Will update the documentation when I have time.

  • SASS_MEMORY_CAST - I would just change it as we go along. Not really that important IMO.
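
On the first point, the difference between a dedicated pointer-to-const typedef and const applied to a pointer typedef is plain C++ and is presumably what is meant here. A small sketch, with `Number_Ptr_Const` as a hypothetical spelling and a made-up `Number` struct:

```cpp
#include <type_traits>

struct Number { int value; };

typedef Number* Number_Ptr;              // as in the documentation above
typedef const Number* Number_Ptr_Const;  // hypothetical pointer-to-const typedef

// const applied to the typedef makes the pointer itself const ...
static_assert(std::is_same<const Number_Ptr, Number* const>::value,
              "const Number_Ptr is a const pointer to a mutable Number");
// ... which is a different type from a pointer to a const Number.
static_assert(!std::is_same<const Number_Ptr, Number_Ptr_Const>::value,
              "const Number_Ptr is not a pointer-to-const");

// Compiles: the pointee is still mutable through a const Number_Ptr.
inline void bump(const Number_Ptr p) { p->value += 1; }

// Read-only access needs the pointer-to-const typedef.
inline int peek(Number_Ptr_Const p) { return p->value; }
```

So a function that takes `const Number_Ptr` can still modify the node, and it cannot be called with a pointer to a const node at all. That may be why the code would not compile without the extra typedefs.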

You've expanded some operations, which in many cases has made the code harder to grok. Is this for technical reasons, to debug, or personal taste?

There is a technical reason. Chances are that we would otherwise not pick up all objects. But it depends on the root object. I think by now pretty much all lists also hold reference-counted objects, so this may no longer be needed in all places. I can check whether I find some obvious spots.

GCC 4.4 - FYI: I have released perl-libsass in a GCC 4.4 compatible form for a few versions now, so I do have some interest in keeping it compatible if possible. Anyway, I believe this implementation leaves us much more freedom than we would get with a raw shared_ptr. The implementation is also rather short (most of it is debug-mode stuff).

Is it worth changing the error handling model of the parser in order to simplify this implementation? It shouldn't be difficult to change the parser to not throw unexpectedly.

I don't really see the point. IMO throwing is the right way, and I don't see how it would be easy to change it to anything else. Besides, we still need to account for our memory allocations now. With detach we have the right tool to solve this quite easily (assign to an obj, call detach if you need to return a ptr).

Further ToDo: debug mode needs some more work to be really usable. I think I've lost some bits along the way.

@xzyfer
Contributor

xzyfer commented Dec 13, 2016

All in all this looks good to me. I won't realistically be able to re-review after you've addressed my feedback, but I trust it's fine. Given the scope of these changes, and the scope of the changes I need to make to get custom property support landed for 3.5, I think we should aim to merge this ASAP so I'm not blocked on 3.5 feature work.

My preference would be to merge #2134 and then this, purely because rebasing #2134 will be a PITA. What are your thoughts? Are you OK with rebasing your work after #2134, or would you prefer to land this first?

@mgreter
Contributor Author

mgreter commented Dec 13, 2016

I'd rather rebase #2134 afterwards. Doesn't look too hard (100 lines) ...

@xzyfer
Contributor

xzyfer commented Dec 13, 2016

No worries. Feel free to squash and merge when you're ready.

@xzyfer
Contributor

xzyfer commented Dec 13, 2016

@mgreter heads up master is failing due to sass/sass#2211. You can safely ignore the failing spec for spec/libsass-closed-issues/issue_2132

@mgreter
Contributor Author

mgreter commented Dec 14, 2016

@mgreter heads up master is failing due to sass/sass#2211. You can safely ignore the failing spec for spec/libsass-closed-issues/issue_2132

Is there anyone looking into these? It's a pity that master is broken ATM!

@mgreter mgreter force-pushed the memory/ref-count branch 2 times, most recently from a966cb2 to eea0896 Compare December 15, 2016 00:46
@xzyfer
Contributor

xzyfer commented Dec 15, 2016

Fixed master. Kicked off this build again and it's green.

@mgreter mgreter force-pushed the memory/ref-count branch 2 times, most recently from 9491853 to c1f30f2 Compare December 16, 2016 01:02
@mgreter mgreter removed the Dev - WIP label Dec 16, 2016
@mgreter mgreter force-pushed the memory/ref-count branch 7 times, most recently from 6d88286 to 9d2d6b4 Compare December 16, 2016 19:19
@mgreter
Contributor Author

mgreter commented Dec 16, 2016

Updated documentation a little and added my valgrind test script too.
Made 2 valgrind test runs on linux (Atom CPU and Virtualbox - both x86_64).

Will merge this in a few hours if nothing more pops up.

@mgreter mgreter merged commit c32ce55 into sass:master Dec 18, 2016
xzyfer added a commit that referenced this pull request Dec 28, 2016
This fix was annoyingly bundled with the reference counted AST PR (#2222).

As a result it was not back ported to 3.4, which caused
sass/sass-spec#1022 to start failing the 3.4 branch.