Roadmap 0.2 #57

Closed
10 tasks done
huachaohuang opened this issue Nov 2, 2021 · 25 comments
Comments

@huachaohuang
Contributor

huachaohuang commented Nov 2, 2021

Overview

Goals:

  • Set up basic project management
  • Present the fundamental ideas and usages of Engula

Non-goals:

  • Reliability
  • Performance

Project

Modules

Documents

@huachaohuang huachaohuang pinned this issue Nov 2, 2021
@tisonkun
Contributor

tisonkun commented Nov 2, 2021

Thanks for starting this work! I have one question about the layout and one question about microunit.

About the layout: if the kernel contains integrations with microunit, we set up a dependency from kernel to framework. Then, per discussion #54, the microunit component requires basic consistent storage. Where is the basic consistent storage? If it stays in the kernel, we introduce a circular dependency.

About microunit, I think one of us can create a dedicated subtask issue tracked here, and then we can go into details about HTTP APIs and node communication (if any).

@huachaohuang
Contributor Author

The dependency should be something like this:

- API
  - Kernel
    - Platform
      - Framework
        - Microunit
        - Consensus

So upper-level layers can depend on lower-level layers.
I will create another issue for microunit and may lay down some foundation first. You can try to take some tasks when you have enough confidence about how things work :)

@tisonkun
Contributor

tisonkun commented Nov 2, 2021

@huachaohuang thanks for your explanation.

the microunit component requires basic consistent storage. Where is the basic consistent storage?

Is there a dependency from Microunit to Consensus? From the graph it seems microunit and consensus are independent.

@huachaohuang
Contributor Author

@tisonkun I think a module can depend on another module at the same level. If we need a consistent store in microunit, then microunit can depend on consensus.
BTW, I think we don't need a consistent store for v0.2. We can store all data in memory for now.
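
A minimal sketch of what "microunit can depend on consensus" with an in-memory store might look like. Everything here is an assumption for illustration: the `Store` trait, `MemStore`, and `UnitRegistry` are hypothetical names, not Engula APIs. The point is only that microunit code depends on a store interface, so a replicated implementation could later replace the in-memory one.

```rust
use std::collections::HashMap;

/// Hypothetical store interface that a consensus module might expose.
pub trait Store {
    fn put(&mut self, key: String, value: String);
    fn get(&self, key: &str) -> Option<&String>;
}

/// In-memory stand-in: no replication and no durability.
#[derive(Default)]
pub struct MemStore {
    data: HashMap<String, String>,
}

impl Store for MemStore {
    fn put(&mut self, key: String, value: String) {
        self.data.insert(key, value);
    }

    fn get(&self, key: &str) -> Option<&String> {
        self.data.get(key)
    }
}

/// Hypothetical microunit registry that depends only on the `Store` trait.
pub struct UnitRegistry<S: Store> {
    store: S,
}

impl<S: Store> UnitRegistry<S> {
    pub fn new(store: S) -> Self {
        Self { store }
    }

    /// Records the address of a unit under its name.
    pub fn register(&mut self, unit: &str, addr: &str) {
        self.store.put(unit.to_string(), addr.to_string());
    }

    /// Looks up the address of a previously registered unit.
    pub fn lookup(&self, unit: &str) -> Option<&String> {
        self.store.get(unit)
    }
}
```

With this shape, `UnitRegistry::new(MemStore::default())` is enough for v0.2-style experiments, and a consistent store would only need to implement `Store`.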

@huachaohuang huachaohuang changed the title Roadmap for v0.2 Engula v0.2 Nov 2, 2021
@huachaohuang huachaohuang added this to the Version 0.2 milestone Nov 3, 2021
@tisonkun tisonkun mentioned this issue Nov 3, 2021
5 tasks
@huachaohuang
Contributor Author

We need to take care of the async runtime. There are a few questions to be answered:

  • Do we need to build our own runtime?
    • If we don't, what existing runtimes do we choose?
      • There are two major runtimes I can see: tokio and async-std
        • tokio is more mature, but async-std seems to pop up now and then
      • How do we do simulation tests with existing runtimes?
    • If we do, then everything is workable except that we have a lot of work to do. And we should be worried about the maintainability of the project.

I think the best thing we can do in this stage is to abstract the runtime so that we can switch to a different runtime in the future.
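
One way to read "abstract the runtime" is a small trait that the rest of the code depends on, with the concrete runtime hidden behind it. The sketch below is an assumption, not Engula's actual API: `Runtime`, `ParkRuntime`, and `run_app` are hypothetical names, and the implementation uses only the standard library (a thread-parking `block_on`) so that it stays self-contained. A type backed by tokio or async-std could implement the same trait.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical runtime abstraction: callers only see this trait.
pub trait Runtime {
    fn block_on<F: Future>(&self, fut: F) -> F::Output;
}

/// Wakes the blocked thread by unparking it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy runtime that drives a single future on the current thread.
pub struct ParkRuntime;

impl Runtime for ParkRuntime {
    fn block_on<F: Future>(&self, fut: F) -> F::Output {
        let mut fut = Box::pin(fut);
        let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
        let mut cx = Context::from_waker(&waker);
        loop {
            match fut.as_mut().poll(&mut cx) {
                Poll::Ready(out) => return out,
                // Sleep until the waker unparks this thread, then poll again.
                Poll::Pending => thread::park(),
            }
        }
    }
}

/// Application code is generic over the runtime, so the runtime can be swapped.
pub fn run_app<R: Runtime>(rt: &R) -> u32 {
    rt.block_on(async { 40 + 2 })
}
```

This does not address the contagion problem: a dependency that internally requires a specific runtime still needs that runtime to be present.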

@huachaohuang
Contributor Author

huachaohuang commented Nov 6, 2021

The bad thing is that runtime is contagious. If our dependencies rely on a specific runtime, so do we.

@PsiACE

PsiACE commented Nov 6, 2021

For now, it is more expensive to build our own runtime and there are performance, compatibility and other issues to consider. The best and most popular choice at the moment is still tokio.

@PsiACE

PsiACE commented Nov 6, 2021

How do we do simulation tests with existing runtimes?

What does this mean? Perhaps you could explain it in more detail.

@huachaohuang
Contributor Author

@PsiACE There is a discussion about simulation here. I think some people are working on simulation for Tokio too.

@PsiACE

PsiACE commented Nov 6, 2021

@PsiACE There is a discussion about simulation here. I think some people are working on simulation for Tokio too.

I generally understand. There seems to be a lack of available, reliable and complete work for this type of testing. Also, raft-rs seems to have done some tests with datadriven and failpoint, which I think might be a reference.

@huachaohuang
Contributor Author

@PsiACE There is a discussion about simulation here. I think some people are working on simulation for Tokio too.

I generally understand. There seems to be a lack of available, reliable and complete work for this type of testing. Also, raft-rs seems to have done some tests with datadriven and failpoint, which I think might be a reference.

Yes, this is a very hard topic. I think we can lay down the foundation of the project layout and module APIs first. Then we will try to figure out which way to go for testing, tracing, etc.

@huachaohuang
Contributor Author

huachaohuang commented Nov 6, 2021

Considering that the project scope is very large, it is not feasible for me to design every module beforehand. So for better collaboration, I think I should focus on writing down the top-level design of the overall architecture and some design guidelines of different components first. Then we can open discussions for different components. Contributors who want to do serious design and implementation for a specific module can send an RFC to make further design decisions.

As for v0.2, our primary goal is to lay down the foundation of the project. So we only need to deliver the most basic implementations (like a memory version or something) for different modules.

@huachaohuang huachaohuang changed the title Engula v0.2 Roadmap v0.2 Nov 9, 2021
@w41ter
Contributor

w41ter commented Nov 19, 2021

  • How do we do simulation tests with existing runtimes?

It is too difficult to manage the time and event sequences in a distributed system's simulation tests. It is more efficient to use code injection (e.g., failpoints) to ensure the ordering of key events.
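
A minimal sketch of the failpoint idea, assuming a hand-rolled registry rather than any existing crate's API. `FailPoints`, `append`, and the point name `journal::before_append` are hypothetical names chosen for illustration. A test enables a named point to force an error at a chosen step, which is how a known failure path can be reproduced deterministically.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical registry of enabled failure points (not the `fail` crate API).
#[derive(Default)]
pub struct FailPoints {
    enabled: Mutex<HashSet<&'static str>>,
}

impl FailPoints {
    /// Turns on the named failure point.
    pub fn enable(&self, name: &'static str) {
        self.enabled.lock().unwrap().insert(name);
    }

    /// Turns off the named failure point.
    pub fn disable(&self, name: &'static str) {
        self.enabled.lock().unwrap().remove(name);
    }

    /// Returns an error if the named point is currently enabled.
    pub fn check(&self, name: &'static str) -> Result<(), String> {
        if self.enabled.lock().unwrap().contains(name) {
            Err(format!("injected failure at {}", name))
        } else {
            Ok(())
        }
    }
}

/// Hypothetical journal append with a failure point just before the write.
pub fn append(fp: &FailPoints, log: &mut Vec<String>, record: &str) -> Result<(), String> {
    fp.check("journal::before_append")?;
    log.push(record.to_string());
    Ok(())
}
```

While `journal::before_append` is enabled, `append` returns an error and leaves the log untouched. After `disable`, appends succeed again.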

@w41ter
Contributor

w41ter commented Nov 19, 2021

There are two major runtimes I can see: tokio and async-std

  • tokio is more mature, but async-std seems to pop up now and then

In my impression, tokio is the future.

@huachaohuang
Contributor Author

huachaohuang commented Nov 19, 2021

It is too difficult to manage the time and event sequences in a distributed system's simulation tests. It is more efficient to use code injection (e.g., failpoints) to ensure the ordering of key events.

Yes, it is very difficult. I don't have a solution yet. But I personally don't like failure injection. It takes a lot of effort to design the failure cases and it only works when you already understand the failure paths.

@tisonkun
Contributor

@huachaohuang you can update "Modules" items from * xxx to * [ ] xxx so that subtasks will show a "Tracked in #57" button to jump back to this issue.

@w41ter
Contributor

w41ter commented Nov 26, 2021

It takes a lot of effort to design the failure cases and it only works when you already understand the failure paths.

Yes, but I personally think that fault injection is necessary. It can be used to reconstruct the corner cases that have already been discovered. As you said, it is indeed more suitable for known issues.

@w41ter
Contributor

w41ter commented Nov 26, 2021

A simulation method that aims to cover all failure paths is very complicated. In the short term, fault injection and chaos testing should be able to help solve most problems.

@huachaohuang
Contributor Author

Related discussions on zulip:

@huachaohuang We have developed a lot of designs, concepts, and abstractions recently. We apply the principle of public designs and discussions (Engula has no internal documents or discussion channels at all), which means that we talk about a lot of early ideas that are subject to change. While these ideas give the community more opportunities to get involved, they also confuse the community if they change rapidly. So in order to converge the work we have done so far, I suggest that we cut a simpler version 0.2 with the following features at the end of Dec 2021:

  • A design document that explains existing concepts
  • A hash engine that can use different kinds of kernels
  • A kernel, journal, and storage abstraction
  • A memory kernel that integrates the memory journal and memory storage
  • A file kernel that integrates the file journal and file storage
  • A grpc kernel that integrates the grpc journal and grpc storage
  • A blog post that announces the v0.2 release, demonstrates the hash engine and explains some future plans.

Specifically, I suggest cutting microunit off at v0.2 since we don't have a clear design about it for now. Then we can focus on resource management and deployment in v0.3.

@tisonkun I considered this topic today as well. I agree with you that we don't have to deliver multiple concepts about deployment before we have a clear design, and the same goes for microunit. However, we should still deliver a basic usable executable in v0.2 so that early users can explore the software and inspire ideas.
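
The kernel, journal, and storage abstraction in the feature list above could be sketched roughly as follows. This is an illustrative assumption, not the actual Engula v0.2 traits: all trait and type names are hypothetical, and the interfaces are synchronous and simplified to keep the example self-contained. It shows a memory kernel that bundles a memory journal with a memory storage, and a hash engine that is generic over the kernel.

```rust
use std::collections::HashMap;

/// Hypothetical journal: an append-only sequence of events.
pub trait Journal {
    /// Appends an event and returns its sequence number (starting at 1).
    fn append(&mut self, event: Vec<u8>) -> u64;
}

/// Hypothetical storage: a keyed object store.
pub trait Storage {
    fn put(&mut self, key: &str, value: Vec<u8>);
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}

/// Hypothetical kernel: integrates one journal and one storage.
pub trait Kernel {
    type Journal: Journal;
    type Storage: Storage;
    fn journal(&mut self) -> &mut Self::Journal;
    fn storage(&mut self) -> &mut Self::Storage;
}

#[derive(Default)]
pub struct MemJournal {
    events: Vec<Vec<u8>>,
}

impl Journal for MemJournal {
    fn append(&mut self, event: Vec<u8>) -> u64 {
        self.events.push(event);
        self.events.len() as u64
    }
}

#[derive(Default)]
pub struct MemStorage {
    objects: HashMap<String, Vec<u8>>,
}

impl Storage for MemStorage {
    fn put(&mut self, key: &str, value: Vec<u8>) {
        self.objects.insert(key.to_string(), value);
    }

    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.objects.get(key).cloned()
    }
}

/// Memory kernel: memory journal plus memory storage.
#[derive(Default)]
pub struct MemKernel {
    journal: MemJournal,
    storage: MemStorage,
}

impl Kernel for MemKernel {
    type Journal = MemJournal;
    type Storage = MemStorage;

    fn journal(&mut self) -> &mut MemJournal {
        &mut self.journal
    }

    fn storage(&mut self) -> &mut MemStorage {
        &mut self.storage
    }
}

/// Toy hash engine that works with any kernel.
pub struct HashEngine<K: Kernel> {
    kernel: K,
}

impl<K: Kernel> HashEngine<K> {
    pub fn new(kernel: K) -> Self {
        Self { kernel }
    }

    /// Logs the write to the journal, then applies it to storage.
    pub fn put(&mut self, key: &str, value: &[u8]) -> u64 {
        let seq = self.kernel.journal().append(key.as_bytes().to_vec());
        self.kernel.storage().put(key, value.to_vec());
        seq
    }

    pub fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        self.kernel.storage().get(key)
    }
}
```

Under these assumptions, a file or gRPC kernel would be another `Kernel` implementation, and `HashEngine` would not need to change.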

@huachaohuang
Contributor Author

Here's my plan towards v0.2 so far:

  • I will start cleaning up the current codebase, maybe remove some unnecessary code for v0.2
  • Then I will clean up the kernel, journal, storage traits
  • Someone can help to fix some existing journal, storage implementations, as well as work on A file journal implementation #128
  • Someone can help to implement a file kernel
  • I will implement a grpc kernel and provide a binary to start a kernel server
  • Figure out how to release
  • Prepare the release post and future plans

@huachaohuang
Contributor Author

I think we get people on every issue except #66 now. So let me see how to tackle it.

@huachaohuang
Contributor Author

Despite some new discussions we had in the past few days, I think we should still deliver v0.2 on time (12-17). So let's still move forward with the current codebase. So far, if we drop #66, what's left is #128, #147, and #148. Other issues can be resolved trivially. So I am going to take over those three issues from now on.

@huachaohuang
Contributor Author

The draft release post is here.

@huachaohuang
Contributor Author

Added a tutorial with the release post here. You can help to review engula/engula.github.io#17.

@huachaohuang huachaohuang changed the title Roadmap v0.2 Roadmap 0.2 Dec 16, 2021
@huachaohuang
Contributor Author

Congratulations! We have just released v0.2.0! The release post is here.
Thank you everyone for contributing to this release. I hope you enjoy this journey. And Engula 0.3 is on the road now :)

@huachaohuang huachaohuang unpinned this issue Dec 17, 2021
4 participants