
About scheduler_lock #82

Closed
4kangjc opened this issue Dec 11, 2022 · 11 comments

@4kangjc
Contributor

4kangjc commented Dec 11, 2022

// This lock is held when the fiber is in a state transition (e.g., from
// running to suspended). This is required since it's inherently racy when we
// add ourselves to some wait-chain (to eventually be woken up by someone
// else) and go to sleep. The one who wakes us up can be running in a
// different pthread, and therefore might try to wake us up even before we
// have actually gone to sleep. So we always grab this lock before
// transitioning the fiber's state, to ensure that nobody else can change the
// fiber's state concurrently.
//
// For waking up a fiber, this lock is grabbed by whoever the waker is;
// for a fiber to go to sleep, this lock is grabbed by the fiber itself and
// released by `SchedulingGroup` (by the time we're sleeping, we cannot
// release the lock ourselves).
//
// This lock also protects us from being woken up by several pthreads
// concurrently (in case we waited on several waitables and have not removed
// ourselves from all of them before more than one of them has fired).
Spinlock scheduler_lock;
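
To make the protocol described in that comment concrete, here is a minimal, self-contained sketch. It uses std::mutex and made-up types as stand-ins, so none of the names below are flare's actual code; the point is only that the waker and the sleeper take the same per-fiber lock before touching `state`, and that the lock taken by the sleeping side is released on its behalf after the switch completes.

#include <mutex>

// Hypothetical stand-ins for illustration only; not flare's real types.
enum class State { Ready, Running, Waiting };
struct Fiber {
  std::mutex scheduler_lock;  // stands in for flare's Spinlock
  State state = State::Running;
};

// The waker serializes with the sleeper before touching its state.
void Wake(Fiber* f) {
  std::scoped_lock lk(f->scheduler_lock);
  f->state = State::Ready;
  // ...queue `f` for execution on some worker pthread...
}

// The sleeper grabs the same lock, transitions to Waiting, and switches
// away while still holding it; whoever finishes the switch (the
// SchedulingGroup in flare) unlocks on its behalf afterwards.
void Sleep(Fiber* f) {
  f->scheduler_lock.lock();
  f->state = State::Waiting;
  // ...add `f` to the wait-chain and jump back to the master fiber;
  // f->scheduler_lock.unlock() happens only after `f` is truly off-CPU.
}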

// Argument `context` (i.e., `this`) is only used the first time the context
// is jumped to (in `FiberProc`).
jump_context(&caller->state_save_area, state_save_area, this);

Perhaps we could make use of this `context` argument: after the fiber switches over, the target changes the caller fiber's state, so this lock wouldn't be needed for the state change?

inline void FiberEntity::Resume() noexcept {
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // Pass `caller` as the `context` argument.
  auto caller_ = jump_context(&caller->state_save_area, state_save_area, caller);
  // `caller_` is nullptr when the switch came from a returning fiber (see
  // `M_return` below).
  if (caller_) {
    static_cast<FiberEntity*>(caller_)->state = FiberState::Waiting;
  }
  ...
}

static void FiberProc(void* context) {
  auto caller = reinterpret_cast<FiberEntity*>(context);
  caller->state = FiberState::Waiting;
  //....
  current_fiber->state = FiberState::Dead;
  GetMasterFiberEntity()->M_return([](){...});
}

void FiberEntity::M_return(Function<void()>&& cb) noexcept {
  // set `resume_proc` ....
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // Pass nullptr as the `context` argument to mark a returning fiber.
  jump_context(&caller->state_save_area, state_save_area, nullptr);
}
@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

Setting `current_fiber` before the yield should be fine, right? If that does turn out to be a problem, the `context` could be set to this instead:

struct jump_context_data {
  FiberEntity* self;
  FiberEntity* caller;
};
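
As a rough sketch of how that struct might be consumed on the other side of the switch (purely illustrative; the member names and the exact `jump_context` contract here are assumptions, not flare's actual API), the resumed fiber would update both states only after the jump:

inline void FiberEntity::Resume() noexcept {
  // `data` lives on the caller's stack, which stays valid while the caller
  // is suspended, so passing its address through `jump_context` is safe.
  jump_context_data data{/*self=*/this, /*caller=*/GetCurrentFiberEntity()};
  auto* from = static_cast<jump_context_data*>(
      jump_context(&data.caller->state_save_area, state_save_area, &data));
  if (from) {
    // We are running again; only now do the states change, so no
    // scheduler_lock would be needed for this particular transition.
    SetCurrentFiberEntity(from->self);
    from->self->state = FiberState::Running;
    from->caller->state = FiberState::Waiting;
  }
}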

@0x804d8000
Collaborator

The issue here isn't (only) about FiberProc (the first time a fiber runs after creation). In scenarios involving fiber::Mutex and the like, after the fiber adds itself to the wait-chain from pthread1, it may be popped from the chain by pthread2 and prepared for execution almost immediately. At that point some synchronization mechanism is needed so that "pthread2 waits until pthread1 is no longer touching the fiber_entity". Whether that's a spinlock or reusing some other field (e.g., fiber->state), the impact on performance should end up about the same; compared with the alternatives, a spinlock makes the overall code easier to understand.

That said, performance-wise the main cost of fiber scheduling is waking up pthread2; one atomic operation more or less here should not be noticeable in practice.
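
For what it's worth, here is a minimal sketch of what "reusing some other field such as fiber->state" could look like (hypothetical code, not flare's): the waker spins until the sleeping side has published its final state, which costs about the same as acquiring the spinlock.

#include <atomic>

// Hypothetical states; `Transitioning` marks "pthread1 is still touching
// the fiber_entity".
enum class State { Running, Transitioning, Waiting };

// pthread1 (the fiber going to sleep):
//   state.store(State::Transitioning);
//   ...enqueue on the wait-chain, finish the switch...
//   state.store(State::Waiting, std::memory_order_release);

// pthread2 (the waker): busy-wait until pthread1 has finished, which is
// morally the same as acquiring a spinlock held by pthread1.
void WaitUntilParked(std::atomic<State>& state) {
  while (state.load(std::memory_order_acquire) != State::Waiting) {
    // spin
  }
}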

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

With the current approach, though, the fiber's state is tightly coupled to SchedulingGroup; you need it in order to correctly observe the fiber's state.

@0x804d8000
Collaborator

I may well be misremembering something, but the fiber's state should be accessible by anyone who can acquire scheduler_lock; it isn't required to be tied to the scheduling group.

For example, SchedulingGroup::RemoteAcquireFiber exists precisely to migrate fibers between scheduling groups.

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

I may well be misremembering something, but the fiber's state should be accessible by anyone who can acquire scheduler_lock; it isn't required to be tied to the scheduling group.

For example, SchedulingGroup::RemoteAcquireFiber exists precisely to migrate fibers between scheduling groups.

Oh, right, it seems a fiber used on its own can't yield anyway; it needs a scheduling_group to yield, so the Waiting state transition isn't involved. What I originally meant was that when a fiber is used standalone and yields, its state never becomes Waiting.

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

// GetId()?
Wouldn't it work for this one in this_fiber to just return the debug id?

@0x804d8000
Collaborator

https://github.com/Tencent/flare/blob/master/flare/fiber/alternatives.h#L35

This one is used to get the current thread id without relying on that __const__ hack.
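
For background on why that matters: a fiber can suspend on one pthread and resume on another, so a thread-local value or address obtained before the suspension (the kind of thing the __const__ hack keeps the compiler from caching) can be stale afterwards. A minimal illustration; SuspendAndMaybeMigrate is a made-up placeholder for a fiber suspension point:

#include <cstdint>

thread_local std::uint64_t tls_slot;  // any thread-local datum, e.g. errno

// Hypothetical: in a real fiber runtime this would yield, and the fiber
// might resume on a different pthread afterwards.
void SuspendAndMaybeMigrate() {}

void FiberBody() {
  std::uint64_t* cached = &tls_slot;  // address of this pthread's TLS slot
  SuspendAndMaybeMigrate();
  // If we resumed on another pthread, `cached` still points into the old
  // pthread's TLS block, so this write lands in the wrong thread's slot.
  *cached = 42;
}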

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

https://github.com/Tencent/flare/blob/master/flare/fiber/alternatives.h#L35

This one is used to get the current thread id without relying on that __const__ hack.

Does the thread id have the same errno problem?

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

// GetId()?
Wouldn't it work for this one in this_fiber to just return the debug id?

// `GetId()`?

std::uint64_t debugging_fiber_id;

@0x804d8000
Collaborator

Does the thread id have the same errno problem?

Yes. In fact, anything that uses thread-local storage can potentially have this problem.

Wouldn't it work for this one in this_fiber to just return the debug id?

This id could be exposed if there is a need; it is the fiber's id. But the goal of that test in async_test.cc is to verify that "a fiber launched with Dispatch runs directly in the caller's pthread", so it compares thread ids.

@4kangjc
Contributor Author

4kangjc commented Dec 12, 2022

Does the thread id have the same errno problem?

Yes. In fact, anything that uses thread-local storage can potentially have this problem.

Wouldn't it work for this one in this_fiber to just return the debug id?

This id could be exposed if there is a need; it is the fiber's id. But the goal of that test in async_test.cc is to verify that "a fiber launched with Dispatch runs directly in the caller's pthread", so it compares thread ids.

No, that's a separate topic. What I meant is: why isn't there a GetId() function in this_fiber.h? Couldn't it be implemented with FiberEntity's debugging_fiber_id?

// `GetId()`?

I see there's just a question mark here.

@4kangjc 4kangjc closed this as completed Dec 12, 2022