
About scheduler_lock #82

Closed
4kangjc opened this issue Dec 11, 2022 · 11 comments

@4kangjc
Contributor

4kangjc commented Dec 11, 2022

// This lock is held when the fiber is in a state transition (e.g., from
// running to suspended). This is required since it's inherently racy when we
// add ourselves to some wait-chain (to eventually be woken up by someone
// else) and go to sleep. The one who wakes us up can be running in a
// different pthread, and therefore might try to wake us up even before we
// have actually gone to sleep. So we always grab this lock before
// transitioning the fiber's state, to ensure that nobody else can change the
// fiber's state concurrently.
//
// For waking up a fiber, this lock is grabbed by whoever the waker is;
// for a fiber to go to sleep, this lock is grabbed by the fiber itself and
// released by `SchedulingGroup` (by the time we're sleeping, we cannot
// release the lock ourselves).
//
// This lock also protects us from being woken up by several pthreads
// concurrently (in case we waited on several waitables and have not removed
// ourselves from all of them before more than one of them has fired).
Spinlock scheduler_lock;
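
To make the protocol described in that comment concrete, here is a minimal, self-contained sketch. It uses std::mutex and made-up types as stand-ins, so none of the names below are flare's actual code; the point is only that the waker and the sleeper take the same per-fiber lock before touching `state`, and that the lock taken by the sleeping side is released on its behalf after the switch completes.

#include <mutex>

// Hypothetical stand-ins for illustration only; not flare's real types.
enum class State { Ready, Running, Waiting };
struct Fiber {
  std::mutex scheduler_lock;  // stands in for flare's Spinlock
  State state = State::Running;
};

// The waker serializes with the sleeper before touching its state.
void Wake(Fiber* f) {
  std::scoped_lock lk(f->scheduler_lock);
  f->state = State::Ready;
  // ...queue `f` for execution on some worker pthread...
}

// The sleeper grabs the same lock, transitions to Waiting, and switches
// away while still holding it; whoever finishes the switch (the
// SchedulingGroup in flare) unlocks on its behalf afterwards.
void Sleep(Fiber* f) {
  f->scheduler_lock.lock();
  f->state = State::Waiting;
  // ...add `f` to the wait-chain and jump back to the master fiber;
  // f->scheduler_lock.unlock() happens only after `f` is truly off-CPU.
}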

// Argument `context` (i.e., `this`) is only used the first time the context
// is jumped to (in `FiberProc`).
jump_context(&caller->state_save_area, state_save_area, this);

Perhaps we could make use of this `context` argument: after the fiber switches over, the target changes the caller fiber's state, so this lock wouldn't be needed for the state change?

inline void FiberEntity::Resume() noexcept {
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // Pass `caller` as the `context` argument.
  auto caller_ = jump_context(&caller->state_save_area, state_save_area, caller);
  // `caller_` is nullptr when the switch came from a returning fiber (see
  // `M_return` below).
  if (caller_) {
    static_cast<FiberEntity*>(caller_)->state = FiberState::Waiting;
  }
  ...
}

static void FiberProc(void* context) {
  auto caller = reinterpret_cast<FiberEntity*>(context);
  caller->state = FiberState::Waiting;
  //....
  current_fiber->state = FiberState::Dead;
  GetMasterFiberEntity()->M_return([](){...});
}

void FiberEntity::M_return(Function<void()>&& cb) noexcept {
  // set `resume_proc` ....
  SetCurrentFiberEntity(this);
  state = FiberState::Running;
  // Pass nullptr as the `context` argument to mark a returning fiber.
  jump_context(&caller->state_save_area, state_save_area, nullptr);
}
@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

Setting `current_fiber` before the yield should be fine, right? If that does turn out to be a problem, the `context` could be set to this instead:

struct jump_context_data {
  FiberEntity* self;
  FiberEntity* caller;
};
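
As a rough sketch of how that struct might be consumed on the other side of the switch (purely illustrative; the member names and the exact `jump_context` contract here are assumptions, not flare's actual API), the resumed fiber would update both states only after the jump:

inline void FiberEntity::Resume() noexcept {
  // `data` lives on the caller's stack, which stays valid while the caller
  // is suspended, so passing its address through `jump_context` is safe.
  jump_context_data data{/*self=*/this, /*caller=*/GetCurrentFiberEntity()};
  auto* from = static_cast<jump_context_data*>(
      jump_context(&data.caller->state_save_area, state_save_area, &data));
  if (from) {
    // We are running again; only now do the states change, so no
    // scheduler_lock would be needed for this particular transition.
    SetCurrentFiberEntity(from->self);
    from->self->state = FiberState::Running;
    from->caller->state = FiberState::Waiting;
  }
}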

@0x804d8000
Collaborator

The issue here isn't (only) about FiberProc (the first time a fiber runs after creation). In scenarios involving fiber::Mutex and the like, after the fiber adds itself to the wait-chain from pthread1, it may be popped from the chain by pthread2 and prepared for execution almost immediately. At that point some synchronization mechanism is needed so that "pthread2 waits until pthread1 is no longer touching the fiber_entity". Whether that's a spinlock or reusing some other field (e.g., fiber->state), the impact on performance should end up about the same; compared with the alternatives, a spinlock makes the overall code easier to understand.

That said, performance-wise the main cost of fiber scheduling is waking up pthread2; one atomic operation more or less here should not be noticeable in practice.
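
For what it's worth, here is a minimal sketch of what "reusing some other field such as fiber->state" could look like (hypothetical code, not flare's): the waker spins until the sleeping side has published its final state, which costs about the same as acquiring the spinlock.

#include <atomic>

// Hypothetical states; `Transitioning` marks "pthread1 is still touching
// the fiber_entity".
enum class State { Running, Transitioning, Waiting };

// pthread1 (the fiber going to sleep):
//   state.store(State::Transitioning);
//   ...enqueue on the wait-chain, finish the switch...
//   state.store(State::Waiting, std::memory_order_release);

// pthread2 (the waker): busy-wait until pthread1 has finished, which is
// morally the same as acquiring a spinlock held by pthread1.
void WaitUntilParked(std::atomic<State>& state) {
  while (state.load(std::memory_order_acquire) != State::Waiting) {
    // spin
  }
}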

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

With the current approach, though, the fiber's state is tightly coupled to SchedulingGroup; you need it in order to correctly observe the fiber's state.

@0x804d8000
Collaborator

I may well be misremembering something, but the fiber's state should be accessible by anyone who can acquire scheduler_lock; it isn't required to be tied to the scheduling group.

For example, SchedulingGroup::RemoteAcquireFiber exists precisely to migrate fibers between scheduling groups.

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

I may well be misremembering something, but the fiber's state should be accessible by anyone who can acquire scheduler_lock; it isn't required to be tied to the scheduling group.

For example, SchedulingGroup::RemoteAcquireFiber exists precisely to migrate fibers between scheduling groups.

Oh, right, it seems a fiber used on its own can't yield anyway; it needs a scheduling_group to yield, so the Waiting state transition isn't involved. What I originally meant was that when a fiber is used standalone and yields, its state never becomes Waiting.

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

// GetId()?
Wouldn't it work for this one in this_fiber to just return the debug id?

@0x804d8000
Collaborator

https://github.com/Tencent/flare/blob/master/flare/fiber/alternatives.h#L35

This one is used to get the current thread id without relying on that __const__ hack.
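
For background on why that matters: a fiber can suspend on one pthread and resume on another, so a thread-local value or address obtained before the suspension (the kind of thing the __const__ hack keeps the compiler from caching) can be stale afterwards. A minimal illustration; SuspendAndMaybeMigrate is a made-up placeholder for a fiber suspension point:

#include <cstdint>

thread_local std::uint64_t tls_slot;  // any thread-local datum, e.g. errno

// Hypothetical: in a real fiber runtime this would yield, and the fiber
// might resume on a different pthread afterwards.
void SuspendAndMaybeMigrate() {}

void FiberBody() {
  std::uint64_t* cached = &tls_slot;  // address of this pthread's TLS slot
  SuspendAndMaybeMigrate();
  // If we resumed on another pthread, `cached` still points into the old
  // pthread's TLS block, so this write lands in the wrong thread's slot.
  *cached = 42;
}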

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

https://github.com/Tencent/flare/blob/master/flare/fiber/alternatives.h#L35

This one is used to get the current thread id without relying on that __const__ hack.

Does the thread id have the same errno problem?

@4kangjc
Contributor Author

4kangjc commented Dec 11, 2022

// GetId()?
Wouldn't it work for this one in this_fiber to just return the debug id?

// `GetId()`?

std::uint64_t debugging_fiber_id;

@0x804d8000
Collaborator

Does the thread id have the same errno problem?

Yes. In fact, anything that uses thread-local storage can potentially have this problem.

Wouldn't it work for this one in this_fiber to just return the debug id?

This id could be exposed if there is a need; it is the fiber's id. But the goal of that test in async_test.cc is to verify that "a fiber launched with Dispatch runs directly in the caller's pthread", so it compares thread ids.

@4kangjc
Contributor Author

4kangjc commented Dec 12, 2022

Does the thread id have the same errno problem?

Yes. In fact, anything that uses thread-local storage can potentially have this problem.

Wouldn't it work for this one in this_fiber to just return the debug id?

This id could be exposed if there is a need; it is the fiber's id. But the goal of that test in async_test.cc is to verify that "a fiber launched with Dispatch runs directly in the caller's pthread", so it compares thread ids.

No, that's a separate topic. What I meant is: why isn't there a GetId() function in this_fiber.h? Couldn't it be implemented with FiberEntity's debugging_fiber_id?

// `GetId()`?

I see there's just a question mark here.

@4kangjc 4kangjc closed this as completed Dec 12, 2022