Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Document class #3586

Draft
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

fabian-thomas
Copy link
Contributor

@fabian-thomas fabian-thomas commented Nov 18, 2021

  • removed dubious locking code from Document initialisation

I need your input on the code I removed from the Document initialisation. I'm gonna comment on what I think the lines were intended to do.

TODO:

src/model/Document.cpp Outdated Show resolved Hide resolved
src/model/Document.cpp Outdated Show resolved Hide resolved
src/model/Document.cpp Outdated Show resolved Hide resolved
src/model/Document.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@bhennion bhennion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! LGTM. Please check that the document is always locked by the caller when the operator= is called, just to make sure.

src/model/Document.cpp Outdated Show resolved Hide resolved
src/model/Document.cpp Outdated Show resolved Hide resolved
@fabian-thomas
Copy link
Contributor Author

Thanks for the PR! LGTM. Please check that the document is always locked by the caller when the operator= is called, just to make sure.

Do you know how I can do something like assert(thread_has_lock(gmutex))?

Overwriting the = operator seems to be really bad style to me. It took me a bit to find out about that definiton while debugging. Is there any benefit by doing that?

@fabian-thomas
Copy link
Contributor Author

I've found a lot more code that uses tryLock and some other dubious lines in the Document class. I'm gonna look into that once I know how to assert for having the lock.

@bhennion
Copy link
Contributor

Do you know how I can do something like assert(thread_has_lock(gmutex))?

I don't think you can.
Also, and I should have mentionned it earlier, a PR #3382 that changed every g_mutex into std::mutex was merged in the release-1.1 branch. It has not yet been cherry-picked to master, so it will cause some merge conflicts once it is.
@xournalpp/core Maybe we could do this cherry-picking in advance?

Overwriting the = operator seems to be really bad style to me. It took me a bit to find out about that definiton while debugging. Is there any benefit by doing that?

There are pros and cons. In this case, more cons than pros, I think. The operator is used only in Control.cpp (3 times) and every time it plays the role of a std::swap rather than of an assignment. I think it'd be reasonable to make it Document::swap and actually swap everything in it. We'd save some unnecessary copy operations and make things clearer, I think.

@bhennion
Copy link
Contributor

Also, and I should have mentionned it earlier, a PR #3382 that changed every g_mutex into std::mutex was merged in the release-1.1 branch. It has not yet been cherry-picked to master, so it will cause some merge conflicts once it is. @xournalpp/core Maybe we could do this cherry-picking in advance?

For the moment, you could cherry-pick c8d690c at the root of your PR to work with those changes already.

@Febbe
Copy link
Collaborator

Febbe commented Nov 20, 2021

Why not merging release-1.1 into master?
Cherry picking has the disadvantage, that every change is at least twice in our history.

@fabian-thomas fabian-thomas changed the title Update page update handling Wip: refactor Document class Nov 20, 2021
@fabian-thomas fabian-thomas changed the title Wip: refactor Document class Draft: refactor Document class Nov 20, 2021
@fabian-thomas
Copy link
Contributor Author

I decided to move the actual bug fix to another PR since this one will get big.

@bhennion
Copy link
Contributor

Why not merging release-1.1 into master?

Yes, of course, merging is better!

@fabian-thomas fabian-thomas marked this pull request as draft November 20, 2021 14:38
@fabian-thomas fabian-thomas changed the title Draft: refactor Document class Refactor Document class Nov 20, 2021
@fabian-thomas fabian-thomas changed the base branch from master to release-1.1 November 20, 2021 16:10
@fabian-thomas
Copy link
Contributor Author

Why not merging release-1.1 into master?

Yes, of course, merging is better!

I've done it a bit different. I've cherry-picked my commits to release-1.1.

@LittleHuba
Copy link
Member

That is really not what you want. This PR can never be merged into a minor release as it has a way too high potential to break something. This should be seen more like a new feature than a bugfix and is therefore best merged into master.

@Febbe
Copy link
Collaborator

Febbe commented Nov 20, 2021

You could write some enclosing guard named Document with the following function:

template
void doTask(Fn&& fn){
std::lock_guard guard{mutex};
fn(internalDocument);
}

With the private member internalDocument, which contains all the fields in the doc.

@LittleHuba
Copy link
Member

We could actually overload the -> operator for the wrapper class. That is a common way to do something like that.

@fabian-thomas
Copy link
Contributor Author

That is really not what you want. This PR can never be merged into a minor release as it has a way too high potential to break something. This should be seen more like a new feature than a bugfix and is therefore best merged into master.

Oh, yes. So how do we do that?

I base this PR on master and merge release-1.1. And then we merge this PR when we merge release-1.1 onto master?

@fabian-thomas fabian-thomas changed the base branch from release-1.1 to master November 21, 2021 19:42
@LittleHuba
Copy link
Member

No, we will have to merge release-1.1 in its own PR. If you include it in this PR that is sub-optimal as you will mix a lot of different changes. I'll see if I have the time to do this merge in the next few days. Hopefully, there are not that many conflicts.

Once I merged both branches, you can just fast-forward your branch to the HEAD of master. I'll mention you, so you get notified once I merged the branches.

@Febbe
Copy link
Collaborator

Febbe commented Nov 22, 2021

@fabian-thomas you can start your work on top of master now.

@fabian-thomas
Copy link
Contributor Author

I've began to look into this. I started by searching for a monitor like implementation that doesn't use some crazy cpp syntax and is easy to understand. What also mattered was that we need some reentrant functionality (locking the document for multiple function calls). Also a reader-writer functionality would be nice to have.

I've settled on the following:
We have one recursive_mutex for the Document class.
In each function in Document we either do a
std::shared_lock<std::recursive_mutex> lock(mutex); (function only reads)
or a
std::unique_lock<std::recursive_mutex> lock(mutex); (function also writes).
This solves concurrency and reader-writer differentiation.

For the reentrant functionality we just do something like:

std::lock_guard<std::recursive_mutex> lock(doc.mutex);
doc.function1();
doc.function2();

What do you think about this approach?

@fabian-thomas

This comment has been minimized.

@Febbe

This comment has been minimized.

@Febbe
Copy link
Collaborator

Febbe commented Dec 5, 2021

I've began to look into this. I started by searching for a monitor like implementation that doesn't use some crazy cpp syntax and is easy to understand.

You can ask your questions here if you have one.

For the reentrant functionality we just do something like:

std::lock_guard<std::recursive_mutex> lock(doc.mutex);
doc.function1();
doc.function2();

What do you think about this approach?

The document's Mutex must not be accessible. It would be better to implement the lock and try_lock function, but not unlock.
Those functions shall return a guard like: std::unique_lock / std::shared_lock.

Using a recursive lock has a smell, but it's ok, to fix a deadlock (when a better aproach is e.g. not feasible because a large amount of code has to be refactored).

That's why an observer based approach would be better, but your approach with my change request is way better than the current situation.

Comment on lines 198 to 201
cairo_surface_t* Document::getPreview() {
std::shared_lock<std::recursive_mutex> lock(mutex);
return this->preview;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a reference or a pointer, even a const& is not thread safe. You have to transfer the guard also.

Suggested change
cairo_surface_t* Document::getPreview() {
std::shared_lock<std::recursive_mutex> lock(mutex);
return this->preview;
}
using ReadGuard = std::shared_lock<std::recursive_mutex>;
std::pair<cairo_surface_t*, ReadGuard> Document::getPreview() {
return {this->preview, mutex};
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've rewritten it a little bit to make sure that I will take a look at this later.
In the best case we would write a wrapper for this cairo_surface that holds the document lock until it itself is freed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think this is not feasible. There are some other classes that would need this treatment too. Does somebody know another way to do this?

@fabian-thomas
Copy link
Contributor Author

Using a recursive lock has a smell, but it's ok, to fix a deadlock (when a better aproach is e.g. not feasible because a large amount of code has to be refactored).

What is wrong with recursive locks, apart from their minimal overhead?

That's why an observer based approach would be better, but your approach with my change request is way better than the current situation.

Could you elaborate that observer approach further? I guess we should go for the best pattern, since we'll already refactor a lot of code.

@Febbe
Copy link
Collaborator

Febbe commented Dec 5, 2021

What is wrong with recursive locks, apart from their minimal overhead?

It's a sign of bad design: It basically means, that none knows, and can say, which thread has when and where access to the data.
But this is actually something which can be controlled when the access patterns are well-defined.
It also has an overhead.

The best solution, would be, to have the class Document; itself, without any Lockable's, a class DocumentHandle;, owning a (shared)reference to a Document and a Lockable. The class DocumentHandle; implements all functions of the Document with a lock_guard. It also implements a template function, which accepts a Function, acquires a lock_guard, and executes that function with a reference to the real Document.

@fabian-thomas
Copy link
Contributor Author

fabian-thomas commented Dec 5, 2021

But that would require a lot of redundant definitions of each function. In DocumentHandle header and source + Document header and source.
How does this pattern solve reentrance?

@Febbe
Copy link
Collaborator

Febbe commented Dec 5, 2021

But that would require a lot of redundant definitions of each function. In DocumentHandle header and source + Document header and source.

Yep, that's a solution which is easy, understandable and explicit.
Another approach would be something like a

template<class std::shared_ptr<std::pair<T, std::mutex>>> WriteHandle

, which overloads the auto operator->() -> T*.
The Handle releases the lock on destruction.

How does this pattern solve reentrance?

The document is itself not accessible from outside, it is also not lockable. But a function which is transferred to the DocumentHandle will retrieve a reference to the Document.
So, inside the function, the Document can be modified without acquiring a lock.
This is safe, because the function can only be executed by the Document Handle, which will acquire the lock.

@fabian-thomas
Copy link
Contributor Author

fabian-thomas commented Dec 8, 2021

This will take a while to complete...
After experimenting some more with different monitor implementations I found the one with overloading the -> operator to be the best.

The most interesting class to review by now is therefore the Monitor one. Pls provide feedback on that one (as stated in the class I've copied the most interesting code from stackoverflow).

For the process of refactoring I need some more information on what is working concurrently on the Document. Currently all calls to the Document will be synchronized. If it would turn out that only some fields in the Document need to be synchronized this would add a big(???) overhead.

If there are no bigger flaws with my work until now I would continue refactoring from time to time to get to a point where I can build the application again.

Comment on lines 1 to 57
#include "Monitor.h"

/*
* Credit to Mike Vine on StackOverflow.
* https://stackoverflow.com/a/48408987
*/

template <class T>
template<typename ...Args>
Monitor<T>::Monitor(Args&&... args) : model(std::forward<Args>(args)...) {}


template <class T>
Monitor<T>::LockedMonitor::LockedMonitor(Monitor* monitor) : mon(monitor), lock(monitor->mutex) {}

template <class T>
T* Monitor<T>::LockedMonitor::operator->() { return &mon->model;}

template <class T>
void Monitor<T>::LockedMonitor::ReplaceModel(T model) {
mon->model = model;
}


template <class T>
Monitor<T>::TryLockedMonitor::TryLockedMonitor(Monitor* monitor) {
this->lock = std::unique_lock<std::mutex>(monitor->mutex, std::defer_lock);
this->lockAcquired = lock.try_lock();
if (this->lockAcquired) {
this->mon = monitor;
}
}

template <class T>
T* Monitor<T>::TryLockedMonitor::operator->() {
assert(lockAcquired);
return &mon->model;
};

template <class T>
void Monitor<T>::TryLockedMonitor::ReplaceModel(T model) {
assert(lockAcquired);
mon->model = model;
}


template <class T>
auto Monitor<T>::operator->() -> typename Monitor<T>::LockedMonitor { return LockedMonitor(this); }

template <class T>
auto Monitor<T>::lock() -> typename Monitor<T>::LockedMonitor { return LockedMonitor(this); }

template <class T>
auto Monitor<T>::tryLock() -> typename Monitor<T>::TryLockedMonitor { return TryLockedMonitor(this); }

template <class T>
T& Monitor<T>::getUnsafeAccess() { return model; }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C++ template code must be either in the header file, or directly in the source, where it is used. (The compiler don't know how to generate the code).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually works (it compiles for me) but is really annoying to read. Should I just remove the header and move everything to the source file again?

src/core/model/Monitor.h Outdated Show resolved Hide resolved
src/core/model/Monitor.h Outdated Show resolved Hide resolved

private:
T model;
std::mutex mutex;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want a read and write access, to reduce contention.

Suggested change
std::mutex mutex;
std::shared_mutex mutex;



LockedMonitor operator->();

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add

SharedMonitor lock_shared();
std::optional<SharedMonitor> try_lock_shared();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this differentiation? This will add a lot of complexity.

When we add something like that the caller is again responsible for only doing read operations while holding the shared monitor. We have no way of forcing a uniquely locked monitor for some function calls to Document.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the SharedMonitor only returns a Document const* the caller can't access non const functions.

Comment on lines 21 to 30
struct TryLockedMonitor
{
TryLockedMonitor(Monitor* monitor);
T* operator->();
void ReplaceModel(T model);
bool lockAcquired;
private:
Monitor* mon = nullptr;
std::unique_lock<std::mutex> lock;
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
struct TryLockedMonitor
{
TryLockedMonitor(Monitor* monitor);
T* operator->();
void ReplaceModel(T model);
bool lockAcquired;
private:
Monitor* mon = nullptr;
std::unique_lock<std::mutex> lock;
};
struct SharedMonitor
{
SharedMonitor(Monitor* monitor);
T const * operator->();
void ReplaceModel(T model);
bool lockAcquired;
private:
Monitor* mon = nullptr;
std::shared_lock<std::shared_mutex> lock;
};

Comment on lines 40 to 44
template <class T>
void Monitor<T>::TryLockedMonitor::ReplaceModel(T model) {
assert(lockAcquired);
mon->model = model;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, overloading operator== is what we want. Also, for the Monitor class itself.

fabian-thomas and others added 3 commits December 9, 2021 11:21
Co-authored-by: Fabian Keßler <fabian_kessler@gmx.de>
Co-authored-by: Fabian Keßler <fabian_kessler@gmx.de>
@bhennion
Copy link
Contributor

bhennion commented May 3, 2022

@fabian-thomas What is the status of this PR?

Making the document thread-safe(r) seems like an important thing to do, so merging a PR along those lines would be great!

@fabian-thomas
Copy link
Contributor Author

I absolutely agree. Currently I have no time for working on this. Additionnaly, I think that it might be better if someone who has a better overview of the whole codebase would do this.

I'm gonna try to finish and merge the other PRs that I've currently open in the next few weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants