[TODO] Overhaul scheduling and thread / process lifecycle #48

Open · 6 tasks
byteduck opened this issue Feb 4, 2023 · 5 comments
Assignees: byteduck
Labels: critical (This issue affects the core functionality of duckOS), kernel (The issue is about the kernel)

Comments

@byteduck (Owner)

byteduck commented Feb 4, 2023

The current way that threads and processes are managed is unsafe and breaks easily. Thread, Process, and TaskManager contain some of the oldest code in the entire kernel, and I've learned a lot since writing most of it. In no particular order, here is a list of things that need to be done:

  • Manage the deallocation / destruction of processes using reference counting (rough sketch at the end of this comment).
  • Overhaul thread blocking - blockers should unblock a thread on-demand instead of needing to iterate over every blocked thread on every preemption to check whether they're ready to be unblocked.
  • Replace the current round-robin scheduling with something better that allows for different thread priorities.
  • Keep track of locks held by threads and the order in which they were acquired to prevent deadlocks.
  • Make exec work in-place instead of creating a new Process and setting the old one's pid to -1.
  • Reduce preemption overhead as much as possible.

I'm sure there's more I'm forgetting, and I'll add to this list as I think of things.
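
For the reference-counting item, here's a rough sketch of the shape this could take, using std::shared_ptr / std::weak_ptr as stand-ins for whatever ref-counting primitive ends up in the kernel. None of these names are the real duckOS API, and the ownership direction shown is just one possible choice:

```cpp
#include <memory>
#include <vector>

// Illustrative sketch only: a Process whose lifetime is governed by
// reference counting rather than manual deletion in the scheduler.
class Process : public std::enable_shared_from_this<Process> {
public:
    static std::shared_ptr<Process> create(int pid) {
        return std::shared_ptr<Process>(new Process(pid));
    }

    void add_child(const std::shared_ptr<Process>& child) {
        m_children.push_back(child);          // parent keeps children alive until reaped
        child->m_parent = weak_from_this();   // child only weakly references the parent
    }

    void reap() {
        // Dropping the shared_ptrs held by the parent (and by the scheduler's
        // run queues) is what actually frees the Process; nobody calls delete.
        m_children.clear();
    }

private:
    explicit Process(int pid): m_pid(pid) {}

    int m_pid;
    std::weak_ptr<Process> m_parent;                 // weak to avoid a refcount cycle
    std::vector<std::shared_ptr<Process>> m_children;
};
```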

@byteduck added the suggestion, kernel, and critical labels on Feb 4, 2023
@byteduck self-assigned this on Feb 4, 2023
@MarcoCicognani (Contributor)

To improve the performance of the scheduler and the waiting infrastructure, it's a good idea to keep the waiting-state logic inside the execution context of the blocked thread.

What I mean is: instead of removing the blocked thread from the scheduling list and putting it into a special list that is iterated on every context switch to check the waiting state, keep the thread in the scheduling list; when it's in the waiting state, it spends its time slice in the kernel context doing a while (need_waiting()) { yield() }.

This simplifies the Scheduler code and reduces unsafe code.
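
Roughly this shape, as a sketch; need_waiting() and yield() are placeholders rather than real duckOS functions, and yield() is mapped to std::this_thread::yield() here just so the sketch compiles as a user-space analogy:

```cpp
#include <atomic>
#include <thread>

// Minimal sketch: the blocked thread stays in the run queue and spends its
// time slice re-checking its wait condition in kernel context, yielding
// until the condition clears.
struct Blocker {
    std::atomic<bool> ready { false };
    bool need_waiting() const { return !ready.load(std::memory_order_acquire); }
};

inline void yield() { std::this_thread::yield(); }

void block_on(Blocker& blocker) {
    while (blocker.need_waiting())
        yield(); // hand the CPU back; re-check on the next time slice
}
```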

@byteduck (Owner, Author)

byteduck commented Feb 5, 2023

To improve the performance of the scheduler and the waiting infrastructure, it's a good idea to keep the waiting-state logic inside the execution context of the blocked thread.

What I mean is: instead of removing the blocked thread from the scheduling list and putting it into a special list that is iterated on every context switch to check the waiting state, keep the thread in the scheduling list; when it's in the waiting state, it spends its time slice in the kernel context doing a while (need_waiting()) { yield() }.

This simplifies the Scheduler code and reduces unsafe code.

That could be a possible solution; I hadn't thought of that! My only worry with doing it that way is that context switches are fairly expensive (we have to invalidate the TLB, swap out registers, etc.).

The way I was thinking of doing it was this: instead of having the blocked thread (in waitpid with a WaitBlocker, for example) constantly check whether any children have died and then unblock itself, we could just have the children unblock any pertinent waiting thread(s) when they die.
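
Something along these lines, as a very rough sketch; the class and method names here are illustrative, not the actual duckOS ones:

```cpp
#include <vector>

// Sketch of on-demand unblocking for waitpid: the waiting thread registers a
// WaitBlocker, goes to sleep, and a dying child wakes it up directly instead
// of the scheduler polling the wait condition on every preemption.
class Thread {
public:
    void block()   { m_blocked = true;  /* scheduler removes us from the run queue */ }
    void unblock() { m_blocked = false; /* scheduler puts us back on the run queue  */ }
private:
    bool m_blocked = false;
};

class WaitBlocker {
public:
    explicit WaitBlocker(Thread& waiter): m_waiter(waiter) {}
    void notify_child_died(int pid, int status) {
        m_dead_pid = pid;
        m_status = status;
        m_waiter.unblock(); // wake the waiter immediately; no polling involved
    }
private:
    Thread& m_waiter;
    int m_dead_pid = -1;
    int m_status = 0;
};

class Process {
public:
    void add_wait_blocker(WaitBlocker* blocker) { m_wait_blockers.push_back(blocker); }
    void die(int status) {
        // On exit, poke every blocker that a waiting parent registered on us.
        for (auto* blocker : m_wait_blockers)
            blocker->notify_child_died(m_pid, status);
    }
private:
    int m_pid = 0;
    std::vector<WaitBlocker*> m_wait_blockers;
};
```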

@MarcoCicognani (Contributor)

This could also be a solution.
The one I proposed tries to simplify the Scheduler code and the way waiting is handled.
Other OS kernels use this approach; I don't know about SerenityOS, but GhostOS and EscapeOS do for sure.

@byteduck (Owner, Author)

byteduck commented Feb 5, 2023

This could also be a solution.

The one I proposed tries to simplify the Scheduler code and the way waiting is handled.

Other OS kernels use this approach; I don't know about SerenityOS, but GhostOS and EscapeOS do for sure.

Ah okay, I wasn't aware they did that! Doing it the way you suggested would be easier and require rewriting a lot less code, so it's definitely worth looking into.

The main issue with the way it's done now is that we cannot acquire any locks, yield, etc. while evaluating block conditions, since that happens in the preemption logic; doing it as you suggested would fix that.

@byteduck (Owner, Author)

byteduck commented Feb 10, 2023

What I mean is: instead of removing the blocked thread from the scheduling list and putting it into a special list that is iterated on every context switch to check the waiting state, keep the thread in the scheduling list; when it's in the waiting state, it spends its time slice in the kernel context doing a while (need_waiting()) { yield() }.

This simplifies the Scheduler code and reduces unsafe code.

Just an update - I tried this method and compared it to the old one of looping through blocked threads on every preemption. Essentially, after preemption, if a thread is blocked, it checks whether it should unblock; if not, it yields to the next thread in the queue.

On my machine, it is about 30% slower to boot this way, probably due to the overhead of having those blocked threads in the queue, since the vast majority of threads are blocked at any given time. Preempting without switching out the page tables when the thread is blocked (since we only care about kernel memory anyway) helps a little, but it's still a lot slower than the old way. This could be alleviated by moving blocked threads to a lower-priority queue, but that would also hurt responsiveness, since threads would take longer on average to wake back up.

Ultimately, I think an on-demand thread blocking system would be best, where a thread is immediately added back to the queue once it's ready instead of constantly polling whether or not it's ready. This will require a lot of changes, but it should result in higher responsiveness.
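
In that design the ready queue only ever holds runnable threads, and unblocking pushes a thread straight back onto it. Very roughly, something like this (made-up names, not the actual scheduler code):

```cpp
#include <deque>
#include <memory>

// Sketch of the "on-demand" shape: blocked threads never sit in the ready
// queue, so preemption doesn't have to look at them at all; unblock()
// re-inserts a thread the moment whatever it was waiting on completes.
struct Thread {
    int tid = 0;
    bool blocked = false;
};

class Scheduler {
public:
    // Called by a blocker (e.g. when a child dies or I/O finishes).
    void unblock(std::shared_ptr<Thread> thread) {
        thread->blocked = false;
        m_ready.push_back(std::move(thread)); // runnable again, effective immediately
    }

    // The preemption path only ever sees runnable threads.
    std::shared_ptr<Thread> pick_next() {
        if (m_ready.empty())
            return nullptr;              // nothing runnable: run the idle task
        auto next = m_ready.front();     // plain round-robin for this sketch
        m_ready.pop_front();
        return next;
    }

private:
    std::deque<std::shared_ptr<Thread>> m_ready; // runnable threads only
};
```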

@byteduck removed the suggestion label on Feb 10, 2023