
feat: community history #36

Open · wants to merge 1 commit into base: master

Conversation

0x-r4bbit
Member

No description provided.

This adds a raw spec proposal for the community history problem.
These are some rough notes that still need to be turned into Cockburn-style use cases:

- As a user I want to be able to read messages of community channels prior to my membership
- As a user I want to be able to read messages older than 30 days
Member Author

@iurimatias I think I need some help putting this into the desired format.

#### General

- The community owner MUST run store nodes
- Store nodes MUST implement [WAKU2-STORE](https://rfc.vac.dev/spec/13/) and [WAKU2-FTSTORE](https://rfc.vac.dev/spec/21/)
Member Author

Probably not needed to specify, given that store nodes implement these protocols by definition

staheri14

It makes sense to be explicit about the required protocols at the specs level

0x-r4bbit (Member Author) left a comment

@John-44 @staheri14 @iurimatias I tried putting the document in the desired shape from Status Desktop's perspective.

I've left some comments that I'd love to get some input on. Also, feel free to review the entire thing and leave some feedback if stuff should be changed/improved.


#### Editing communities
- The community owner MAY change the list of store nodes for the community
- The Status client MUST update its known store nodes
- The Status client MUST fetch history data when known store nodes were changed
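The two client-side requirements above could be sketched as follows. This is only an illustrative sketch in Python; the type and method names (`CommunityStoreConfig`, `apply_update`) are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field

# Hypothetical names for illustration only; the real Status client
# data structures are not specified in this proposal.
@dataclass
class CommunityStoreConfig:
    """Store nodes a Status client currently knows for one community."""
    community_id: str
    store_nodes: set = field(default_factory=set)

    def apply_update(self, new_nodes):
        """Replace the known store node list (the MUST-update rule) and
        report whether it actually changed, i.e. whether the MUST-fetch
        rule would trigger a history re-fetch."""
        new_nodes = set(new_nodes)
        changed = new_nodes != self.store_nodes
        self.store_nodes = new_nodes
        return changed

cfg = CommunityStoreConfig("community-1", {"store-a"})
print(cfg.apply_update({"store-a", "store-b"}))  # True: list changed
print(cfg.apply_update({"store-a", "store-b"}))  # False: unchanged
```

Whether the re-fetch on change is actually necessary is debated in the review comments.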
Member Author

Is this true? I assume that, when store nodes have been changed, there's a possibility they might have more history data. So re-fetching at this point might be desired. Thoughts?

staheri14

Why should a new store node have more history than the other existing nodes? Or do you mean the new store node should fetch history from the old store nodes?

Member Author

Good question. Maybe not a certainty, but I was thinking there might be some nodes that received messages that others didn't, and the owner then decides to promote those more "complete" nodes to store nodes for their community.

But now that I think of it, that would mean the other nodes that were set before live in their own network, isolated in some way, and therefore never had access to those other messages.

So yeah, we can probably remove this requirement.

- The community owner MAY remove the list of store nodes from the community
- The Status client MUST remove the known store nodes from its cluster
Member Author

@John-44 @staheri14 do you see any issue with this? I think ideally communities will have their own store nodes, but we'd still fall back to Status' default store fleet, so removing them should be allowed, right?

staheri14

Supporting removal is fine, however, we should be more specific about how this removal happens. What you are suggesting is the support for a dynamic group of store nodes, so we need to think of a protocol via which Status clients become aware of the arrival and departure of store nodes

Member Author

> What you are suggesting is the support for a dynamic group of store nodes

I could be wrong, but AFAIK Status clients know the fleets they are connected to. So removing store nodes from a community would mean removing them from the fleet configuration they were added to when the community was created.
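To make the removal-and-fallback behaviour discussed in this thread concrete, here is a minimal sketch under stated assumptions: the fleet names and the function are entirely hypothetical, and only the idea matters, namely that a client with no community store nodes falls back to the default Status store fleet.

```python
# Hypothetical fleet names; only illustrates the fallback idea from the
# thread above, not any real Status configuration.
DEFAULT_STORE_FLEET = frozenset({"status-store-1", "status-store-2"})

def effective_store_nodes(community_store_nodes):
    """Return the store nodes a client should query: the community's own
    list if the owner set one, otherwise the default Status fleet."""
    if community_store_nodes:
        return set(community_store_nodes)
    return set(DEFAULT_STORE_FLEET)

print(effective_store_nodes({"community-store-1"}))             # own node
print(effective_store_nodes(None) == set(DEFAULT_STORE_FLEET))  # True
```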


### Designs

TODO
Member Author

@John-44 we probably need designs for specifying the store node list when creating and editing communities.



### Additional notes
Member Author

General question /cc @staheri14 @John-44:

Should community-specific store nodes only allow storing messages/data related to that community? Obviously that will have some second-order effects, one of them being that every community would run its own network.

staheri14 left a comment

Thanks @PascalPrecht for the doc, I have left some comments :)


oskarth commented Dec 3, 2021

This is more for the archival storage part and perhaps not directly related to the PR: One thing to consider is how this will work with Waku v2 production store nodes in nim-waku.

As a reminder, all Waku v2 production nodes will run nim-waku in line with the strategic long term plan for the Status Network.

Considering the goal of Communities is to ultimately be run by the Community owner, presumably they'll run their own store nodes as well. For these nodes, there's not necessarily status-go available (not until go<>nim messaging is prioritized). Can we create the torrent directly from the store DB then?

In the interim, it seems like we are making an assumption that:

  1. Community owners will run Status Desktop and produce a torrent file from status-go local client DB
  2. Store nodes, which are the 99% access case and where most hard problems will be encountered, are run separately, e.g. by Status or possibly community people

Is this accurate?

I think that's fine for trying things out (similar to using S3 buckets to enable internal dogfooding with minimal effort), but it'd be useful to make sure we have a path towards something more sustainable and in line with the goal of Communities beyond a minimal internal dogfooding MVP stage. For example, the torrent creation interface should be minimal and be generic enough that we have some flexibility here and can have tighter integration with store nodes going forward, e.g.

0x-r4bbit (Member Author)
@oskarth thank you for your feedback here!
Let me try to address your comments:

> This is more for the archival storage part and perhaps not directly related to the PR: One thing to consider is how this will work with Waku v2 production store nodes in nim-waku.

At the time this PR was created, I was under the assumption that we'd be going with the MVP proposal by @staheri14 created in vacp2p/research#83, which expects Waku v2 to be used for this solution. After various discussions (that you have also been part of) we're settling on going straight for the archival storage problem and not dealing with Waku v2 usage for now (although this should happen soon afterwards IMO). Therefore, this PR needs to be updated to reflect the assumptions and expectations of how this will work. As part of that particular effort, there'll also be new specs in Status and possibly Vac to discuss the implementation of archival nodes using protocols like torrent.

> For these nodes, there's not necessarily status-go available (not until go<>nim messaging is prioritized). Can we create the torrent directly from the store DB then?

This is a very good point and also one that @staheri14 has brought up in one of our discussions. The specs that need to be created for archival storage can most likely live in Vac so they'd apply to Waku v2 nodes (possibly even Waku v1), and Status nodes would implement that protocol the same way. Doing it that way, both Waku and Status nodes will be able to create and run torrents.
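As a rough illustration of the "torrent directly from the DB" idea: whatever node holds the messages could export them into an archive blob and compute fixed-size piece hashes, the way a .torrent file's `pieces` field is built. The schema, table name, and piece size below are all made up for the example; real torrents use much larger pieces (e.g. 256 KiB), and status-go's actual message DB schema differs.

```python
import hashlib
import sqlite3

# Toy piece size for the example; a production value would be far larger.
PIECE_SIZE = 16

def export_archive(db):
    """Serialize all stored community messages into one archive blob."""
    rows = db.execute(
        "SELECT payload FROM messages ORDER BY timestamp").fetchall()
    return b"".join(row[0] for row in rows)

def piece_hashes(blob, piece_size=PIECE_SIZE):
    """SHA-1 of each fixed-size piece, as in BitTorrent's `pieces` field."""
    return [hashlib.sha1(blob[i:i + piece_size]).hexdigest()
            for i in range(0, len(blob), piece_size)]

# Hypothetical stand-in for a local message database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (timestamp INTEGER, payload BLOB)")
db.executemany("INSERT INTO messages VALUES (?, ?)",
               [(1, b"hello community "), (2, b"history archive!")])
blob = export_archive(db)
print(len(blob), len(piece_hashes(blob)))  # 32 bytes -> 2 pieces
```

The point of the sketch is only that the export interface can stay generic: any node with the message DB, Waku or Status, could produce the same archive and hashes.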

> Community owners will run Status Desktop and produce a torrent file from status-go local client DB

That is correct.

> Store nodes, which are the 99% access case and where most hard problems will be encountered, are run separately, e.g. by Status or possibly community people

The "Status nodes" that will contribute to archival storage availability will be Status Desktop clients run by community members plus the community owner. Status as an organisation will not have to provide torrents via mailserver nodes (although it could, if we create the spec in a way that it can be used in both Waku and Status nodes). That way of doing it has a bunch of tradeoffs that John is willing to take. I'm gonna lay them out in the specs so we can discuss them in more detail there if needed.

> but it'd be useful to make sure we have a path towards something more sustainable and in line with the goal of Communities beyond a minimal internal dogfooding MVP stage. For example, the torrent creation interface should be minimal and be generic enough that we have some flexibility here and can have tighter integration with store nodes going forward, e.g.

100% agree.

Does this answer all your questions?

I guess it's fair to say that this draft needs to be reworked now that we're going for something that doesn't require community owners to run their own nodes.

John-44 (Collaborator) commented Dec 3, 2021

> Considering the goal of Communities is to ultimately be run by the Community owner, presumably they'll run their own store nodes as well. For these nodes, there's not necessarily status-go available (not until go<>nim messaging is prioritized). Can we create the torrent directly from the store DB then?

Even if a Community Owner node is running nim-waku in the future to provide store node services, that same Community Owner node will also be running go-waku in parallel in order to power Status Desktop's chat functionality. Therefore I would have thought it is a safe assumption that status-go will always be available in all cases, and therefore that the Status client's local message database will be available and up to date in all cases. I don't think we need to worry about server-hosted nim-waku nodes creating torrents, and even if we wanted to do this, it's impossible unless we expand the scope of nim-waku to include all of the Communities protocol that we are building on top of Waku.

nim-waku by itself isn't sufficient for receiving Community messages, decrypting them, and putting them in unencrypted form into a local message database, because it's missing the whole Communities protocol layer, which is where the encryption of all community messages takes place. So unless we are planning on expanding the scope of nim-waku to include all the Communities protocol stuff we are layering on top of Waku (and I can't see any reason why we would want to do this), I really can't see any point in worrying about interfacing directly with a store node (so the interface with Status Desktop's local message database is the only thing we need to define).

--- Looking ahead at potential future directions so we don't paint ourselves into a corner ---

At some point in the future we may replace status-go with a rewrite in another language, but any rewrite would target being a drop-in replacement for status-go, so I don't think this would change the assumption that each Status Desktop client will have a local message database populated with messages it has received from the Status chat and Communities protocol messaging layer (which is built on top of the Waku messaging layer). The interface we define for exporting and importing messages from this local message database is the key thing; any new local client database would need to be able to conform to the interface we are going to define.

Another future path could be to add the store node functionality to go-waku (instead of running nim-waku and go-waku in parallel inside Status Desktop), but we need to complete the initial integration of Waku v2 into Status Mobile and Desktop first before choosing the next step.

John-44 (Collaborator) commented Dec 3, 2021

> In the interim, it seems like we are making an assumption that:
>
> 1. Community owners will run Status Desktop and produce a torrent file from status-go local client DB

Yes, this is accurate.

> 2. Store nodes, which are the 99% access case and where most hard problems will be encountered, are run separately, e.g. by Status or possibly community people

Today Status the org is running Waku v1 mailservers.

When we make the initial transition to Waku v2, I assume we will initially be replicating the Waku v1 model with Waku v2, e.g. Status the org will run Waku v2 store nodes (replacing the Waku v1 mailservers run by Status the org). Is this a correct assumption?

Then after we've got this initial transition to Waku v2 up and running, I assume the next step is to start working on the items needed to further decentralize and scale Waku v2. Splitting each Community into its own Waku topic, and placing the load of running the Waku store nodes for that particular topic on the community itself, seems like one good possible direction. I think it would be useful for us to put together a list of all the possible future enhancements to Waku v2 that we want to see after the initial integration of Waku v2 into Status Desktop and Mobile goes live, and to arrange this list in priority order so we have a shared understanding of what we will work on next.


oskarth commented Dec 3, 2021

Thanks @PascalPrecht! Just mentioned one correction but we talked about it in PM already. (Re "Store nodes, which are the 99% access case and where most hard problems will be encountered, are run separately, e.g. by Status or possibly community people")

@John-44 Already mentioned in all group contexts but mentioning in a public comment: Can we please let @staheri14 and @PascalPrecht own this problem? They are the domain experts. The requirements and goals are clear.

John-44 (Collaborator) commented Dec 3, 2021

> @John-44 Already mentioned in all group contexts but mentioning in a public comment: Can we please let @staheri14 and @PascalPrecht own this problem? They are the domain experts. The requirements and goals are clear.

I was replying to your message, @oskarth, not to @staheri14's or @PascalPrecht's messages!

The reason I replied to your message is that I think it is important we constrain the Community History Archive Service to the smallest possible scope, with the smallest number of cross-project dependencies, that still delivers a solution to our Community History Archive problem.

  • Every additional consideration we add that is not absolutely essential to meeting our immediate Community History Archive needs adds scope to this project, which will in turn push back the delivery date of the Community History Archive Service MVP.

  • Any non-essential cross-project dependency we add to the Community History Archive Service adds complexity and coordination cost, which will also push back the delivery date of the Community History Archive Service MVP.

  • Every dependency we add on an external service or component that isn't proven by being live and working in a production environment today adds execution risk.

  • Everything we build ourselves as part of this project (as opposed to leveraging something that has already been built and proven to work) increases execution time and risk.

Our goal with the Community History Archive Service MVP isn't to build a perfect solution.

Our goal is to build the smallest, simplest, lowest risk solution (even if this means making a whole bunch of compromises) so that we can get the Community History Archive Service MVP built and working in product asap. IMHO we need to be ruthless with scope on this project until the MVP is delivered. If something isn't essential for the MVP it should be cut from the scope of things we are worrying about.

This is why I'm responding to your comment. For this project specifically, I think we need to become much more ruthless about challenging what needs to be in scope and reducing scope wherever possible.

It might be useful if I take a step back and explain my underlying motivation for challenging any new considerations brought up with regard to this project: I'm challenging them to test whether each consideration is actually essential for an absolute-minimum-scope Community History Archive Service MVP.

I’m hoping that we will be able to restart our internal dogfooding effort in January or at the latest in February next year. Once we start our internal dogfooding, every day we don’t have the Community History Archive Service MVP up and running in our Status Desktop product we will be feeling pain for all the reasons mentioned in the vacp2p/rfc#420 problem statement.

Also it is very highly desirable that we have the Community History Archive Service MVP working and in product before we start early beta testing of Communities with ‘pathfinder partners’.

Note that both our internal dogfooding of Status Communities and the early beta testing of Communities with ‘pathfinder partners’ will work absolutely fine with the Waku v1 we have in the product today, we don’t need to block on waku v2 integration to take either of these steps.

This is why I'm challenging any new considerations that are brought up. When a new consideration is mentioned, I'm asking myself: "is this consideration strictly relevant for the MVP?" If I think there is a chance that the answer might be no, then I'm challenging it, not because I have any attachment to any particular solution, but because I want to push us collectively towards finding ways to make the Community History Archive problem easier, quicker, and less risky to solve.

I think we need to make a shift in our collective mindset about how we are approaching this particular problem. Instead of thinking about all possible considerations and then trying to devise a solution that meets as many of the needs as possible of all the considerations we have listed, we need to ruthlessly exclude every consideration that isn’t essential to meet the MVP needs. E.g. we’ve got to hunt for every simplification we can find, at every step of the process. How can we make the requirements easier to meet while still solving the problem? How can we minimise the number of considerations we need to design and engineer for? How can we maximise use of already written and proven production quality code? How can we minimise the amount of new code we need to write to solve this problem? How can we reduce and ideally eliminate cross project dependencies? Etc…

The proviso on this aggressive reduction of considerations (and therefore scope) is that we should still list all the considerations we can think of (even if we then rule many of these considerations out of scope), and we should keep a list of the compromises we know we are making. A “Now, Next, Never” table is one of my fav tools for doing this type of exercise.

Once the MVP is delivered and in product, the immediate pain we are going to face when we re-start internally dogfooding Status Communities without the Community History Archive Service is going to be mitigated, and we will then have the luxury of being able to go back to the list of considerations we previously descoped and decide which of them is the next most important thing to bring back into scope.

I’m feeling a burning urgency to get Communities to market, and to do that we first need to internally dogfood, and then do early beta testing, and for both of these things to happen (internal dogfooding and early beta testing) we really could do with having the Community History Archive Service up and running. So let’s challenge ourselves to make this problem as small as possible so we can solve it as quickly as possible.


On a different subject:

Re. "Store nodes, which are the 99% access case and where most hard problems will be encountered, are run separately, e.g. by Status or possibly community people": I think it would be useful for us to have a discussion about what Waku v2 functionality we should prioritise next after the initial Waku v2 integration is done. IMHO this should be a separate discussion from the Community History Archive Service discussion we are having here.

John-44 (Collaborator) commented Dec 3, 2021

Hi @r4bbit.eth and @Sanaz ,

@oskarth and I had a good and productive conversation an hour or so ago, and I think we are aligned regarding next steps for the Community History Archive Service project.

I’ll outline what we’ve agreed below, @oskarth please correct me if I’ve missed anything or if you think anything I’m saying here is incorrect:

  • Next step is for @r4bbit.eth and @Sanaz to complete an Implementation Specification and a Protocol Specification for the Community History Archive Service.

  • When working on these specifications, try to keep the scope of the specifications to the absolute minimum scope needed for the Community History Archive Service MVP.

  • In addition to the above, do feel free to list any and all considerations that you feel could be relevant to the Community History Archive Service, even if these considerations are not essential for the MVP. But please mark any considerations that might not be absolutely essential must-haves for the MVP as either "might be outside of MVP scope" or "outside of MVP scope". I've used "Now, Next, Never" tables to do this type of scoping before, but any way of doing it is fine as long as what falls inside the minimum MVP and what falls outside of it is clearly marked.

  • While you are working on the first draft of these specs, @oskarth and I will get out of your way and remove ourselves from this discussion. But feel free to ping me and/or @oskarth if you have any questions while working on this first draft.

  • Then once a first draft of these specs is complete, and before implementation starts, @oskarth, @iurimatias and I will review the specs to see if anything marked as inside the MVP scope could be moved outside of it, or conversely if there is something outside of the MVP scope that we feel should probably be inside it.

  • Once we are all happy with the specs, @Sanaz will return to her work on Waku and @r4bbit.eth will take charge of the implementation of the Community History Archive Service.

  • As time is of the essence, @r4bbit.eth and @Sanaz could you provide us with an estimate of how long you think it will take you to write these specs? This is so that we can timebox this specification work.

@oskarth, I hope I’ve represented everything from our discussion correctly, feel free to jump in if I’ve missed anything or if anything I’ve said in this message needs correcting.

Thanks all! Let's get this Community History Archive Service spec'd and built 🙂
