replication #50

Closed
juliangruber opened this issue Feb 11, 2013 · 16 comments
Labels
discussion Discussion stale This issue or pull request is old

Comments

@juliangruber
Member

@gedw99: "I am building a 3D CAD modelling system, with tons of JSON data that I need to
store on servers in many data centers.
I run offline using IndexedDB and so also need to sync.

Originally I used PouchDB and CouchDB.

But I want to replace all of it with LevelDB."

  • what's the merge strategy?
  • will it be master-only?
  • what does your topology look like?
@ghost

ghost commented Feb 11, 2013

The merge strategy will need a difference engine based on date-time stamps, between client and server as well as server to server. Using the same merge strategy everywhere makes sense because it's peer-to-peer based.
This is possible because all data changes are timestamped and kept. It works like an edit list: the user can always go forward or back in time through the edits in the CAD software. So it makes the data replication easy.

NOT master-only. Peer-to-peer multi-master.

The topology is client to server, server to server, and maybe client to client. I say maybe because I doubt that JavaScript running ONLY in a browser will work, given the various security implications, but it can certainly work with a replication token supplied from the server.

@juliangruber
Member Author

Please avoid so many typos; your text is difficult to read.

@ghost

ghost commented Feb 11, 2013

OK, fixed it up. Sorry.

@juliangruber
Member Author

So for every data point of two merging bodies you accept the one with the newest timestamp. That's easy; that is exactly what scuttlebutt / crdt will do for you. Or do you have any special requirements for the underlying data structure?
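
As a rough illustration of the newest-timestamp-wins rule described above, here is a minimal sketch in plain JavaScript. It does not use the scuttlebutt or crdt APIs; the record shape (`{ value, ts, source }`), the function names, and the tie-break on `source` are all assumptions made for the example.

```javascript
// Hypothetical record shape: { value, ts, source }.
// `ts` is the timestamp of the edit, `source` identifies the peer that made it.

// Returns true when record x should win over record y.
function isNewer(x, y) {
  if (x.ts !== y.ts) return x.ts > y.ts;
  // Assumed tie-break: equal timestamps are ordered by source id,
  // so every peer picks the same winner.
  return x.source > y.source;
}

// Merge two keyed sets of records, keeping the newest record per key.
function mergeLww(a, b) {
  const merged = { ...a };
  for (const [key, theirs] of Object.entries(b)) {
    const ours = merged[key];
    if (!ours || isNewer(theirs, ours)) merged[key] = theirs;
  }
  return merged;
}

const clientData = {
  'wall:1': { value: { height: 2.4 }, ts: 100, source: 'client' },
  'wall:2': { value: { height: 3.0 }, ts: 120, source: 'client' },
};
const serverData = {
  'wall:1': { value: { height: 2.7 }, ts: 110, source: 'server' },
  'roof:1': { value: { pitch: 30 }, ts: 90, source: 'server' },
};

const mergedData = mergeLww(clientData, serverData);
console.log(mergedData['wall:1'].value); // the server's edit, since ts 110 > 100
```

Because the winner depends only on the two records being compared, merging in either direction gives the same result, which is what makes this rule convenient for multi-master replication.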

@ghost

ghost commented Feb 11, 2013

Yes, this is true. I take the latest.
I read a little about scuttlebutt but have not used it in anger yet.

I also have to store images for the CAD materials.
I would want to save the image files in the DB too. This is dumb, I know, from a speed point of view, but it's needed because the images have to be saved in the offline database on the client side.

Do you know if scuttlebutt works client-side on top of IndexedDB?

Server-to-server replication
From the point of view of server-to-server multi-master, I need to make sure I have 3 copies in each data center.
I'm not sure what is appropriate here. I am wondering if there are any levelup modules that handle this for me?

Data-level security
I will store the user ID against the data. Simple.

@dominictarr
Contributor

sweet, this is super cool!

I have also been thinking about a CAD system. I've built a few boats,
such as http://www.flickr.com/photos/dominictarr/sets/72157594180332221/

And so it's basically inevitable that one day I'll need to write my own boat design software, then use it to design a boat.

So, you are basically gonna need vector data, correct?

There isn't any scuttlebutt for vector data yet, but there certainly could be.
(Scuttlebutt itself is a superclass that handles the replication part; see the links in the repo.)

I also have not yet implemented a scuttlebutt that has roll-back/checkout/undo, but it would basically be a matter of
keeping more history, which is fairly simple.
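
To make the "keeping more history" idea concrete, here is a minimal sketch, assuming every update is kept as a `{ key, value, ts }` entry instead of being overwritten. The `History` class and its `set` / `checkout` methods are hypothetical names for this example, not part of scuttlebutt.

```javascript
// Hypothetical append-only history: nothing is overwritten, so any
// earlier state can be rebuilt by replaying updates up to a timestamp.
class History {
  constructor() {
    this.log = []; // every update ever made, ordered by timestamp
  }

  // Record an update; updates may arrive out of order from other peers.
  set(key, value, ts) {
    this.log.push({ key, value, ts });
    this.log.sort((a, b) => a.ts - b.ts);
  }

  // Rebuild the state as it was at time `ts` (default: latest state).
  checkout(ts = Infinity) {
    const state = {};
    for (const update of this.log) {
      if (update.ts > ts) break;
      state[update.key] = update.value;
    }
    return state;
  }
}

const hull = new History();
hull.set('hull.length', 6.0, 1);
hull.set('hull.beam', 2.1, 2);
hull.set('hull.length', 6.5, 3);

console.log(hull.checkout(2)); // state before the length was changed
console.log(hull.checkout());  // latest state
```

Undo then becomes a `checkout` at an earlier timestamp; the cost is that the log grows with every edit.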

What kind of data structures are you planning? I am very happy to help figure out something that will work well with replication!

See also https://github.com/dominictarr/snob, which uses a more git-like architecture that may suit your application.
(It can probably be updated to fit a scuttlebutt-type model, but that will be more work than just using scuttlebutt.)

@ghost

ghost commented Feb 12, 2013

Hey Dom,

this is great news.

I should also mention that we are a non-profit in Germany. Our main thing is biomimicry.
So we are really trying to change the world with this thing.

For me this all came about because I wanted to make building easier and to be able to actually build generative houses.

So the CAD system is generative as well as "traditional drawing".
This means we also need to hold code in the database. Both JS and binary.
The binary is run on GPUs using WebCL; I can discuss more of that later.

As for data structures: yes, it's vector, but we also need to hold binary images. It all needs to be part of the database.

But Snob looks damn useful. CAD and version control are often lumped together in a not-too-nice marriage,
so I very much like having version control in from day one.

Then there is CRDT.
Non-destructive editing is also one way of doing version control in design systems.
I assume you can go back to a point in time with it. You might think of it as a repo where every change is a snapshot. But can you merge designs with it, like we do on GitHub, as with your SNOB?

So I am trying to work out whether SNOB or CRDT is the best one.

@navaru

navaru commented Feb 22, 2013

@gedw99 can you provide some real JSON as an example? I wonder how the data model looks. Your project sounds interesting. Is it open source? Is there a website?

@dominictarr
Contributor

@gedw99 sorry for the late response!

CRDT and SNOB have different use cases.

CRDT is designed for the case where you don't need the full history, although it would be possible to add backtracking. I have considered this; I just haven't had a use case for it yet.

SNOB has the same architecture as git, with branches and merges and that stuff, only you have pluggable diff tools instead of only handling text. If you can write diff, patch, and diff3 operations for your data structure, then snob can version it.
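
To give a feel for what such a diff / patch / diff3 triple could look like, here is a minimal sketch for flat objects with primitive values. The function signatures and the change format (a changed key maps to its new value, `undefined` marks a deletion) are assumptions for this example and are not snob's actual plugin interface.

```javascript
// diff: the changes that turn `from` into `to`.
// A deleted key is recorded with the value `undefined`.
function diff(from, to) {
  const changes = {};
  for (const key of new Set([...Object.keys(from), ...Object.keys(to)])) {
    if (from[key] !== to[key]) changes[key] = to[key];
  }
  return changes;
}

// patch: apply a set of changes to a document, returning a new document.
function patch(doc, changes) {
  const out = { ...doc };
  for (const [key, value] of Object.entries(changes)) {
    if (value === undefined) delete out[key];
    else out[key] = value;
  }
  return out;
}

// diff3: three-way merge of two versions that share an ancestor `base`.
// Keys changed differently on both sides are reported as conflicts,
// and in that case our side's value is kept in `changes`.
function diff3(ours, base, theirs) {
  const a = diff(base, ours);
  const b = diff(base, theirs);
  const changes = { ...a };
  const conflicts = [];
  for (const [key, value] of Object.entries(b)) {
    if (key in a && a[key] !== value) conflicts.push(key);
    else changes[key] = value;
  }
  return { changes, conflicts };
}

const ancestor = { length: 6, beam: 2, mast: 'wood' };
const mine = { length: 6.5, beam: 2, mast: 'wood' }; // changed the length
const yours = { length: 6, beam: 2.2 };              // changed the beam, removed the mast

const result = diff3(mine, ancestor, yours);
const mergedBoat = patch(ancestor, result.changes);
console.log(mergedBoat, result.conflicts);
```

Here the two sides touched different keys, so the merge is clean. If both had changed `length` to different values, `conflicts` would list that key and the application would have to decide what to do with it.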

I started with snob, but then realized that most applications were much simpler, and wrote crdt and scuttlebutt.

What do you mean by generative?

@ghost

ghost commented Feb 23, 2013

Thanks, Dom, for the explanation.

SNOB sounds good, but I doubt I can do real-time updates with it when 2 users are looking
at the same data?

Do a Google image search for "generative architecture" and you will see what
sort of things we can print and make with it.
As an architect I reject the idea that everything we make and surround ourselves
with must be orthogonal. 3D printing and 2D CNC techniques have opened the
door to making non-orthogonal things.

@dominictarr
Contributor

@gedw99 both snob and crdt are realtime!

Aha, generative architecture is what the name suggests! This is really cool!

The tricky part in data replication is handling the case where two users have updated the data concurrently.
"Concurrent" means something like "at the same time", but it relates to the synchronizations between the users rather than the clock time. So if two users make an update starting from the same version, they create two parallel versions.

Snob and Crdt use two different approaches to merging these parallel versions.

Hmm, it's probably easier to get started with crdt, and I think you could probably port something that works with crdt to snob. It all depends on what the data structure looks like.
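
The notion of "concurrent" described above can be sketched with version vectors, one common way to track which edits each peer has seen. This is an illustration only: the `{ peerId: counter }` format and the `compareVersions` function are assumptions for the example, not a description of how snob or crdt work internally.

```javascript
// A version vector maps each peer id to the number of that peer's
// edits included in a version, e.g. { alice: 2, bob: 1 }.

// Returns 'equal', 'before', 'after', or 'concurrent' for a relative to b.
function compareVersions(a, b) {
  let aAhead = false;
  let bAhead = false;
  for (const peer of new Set([...Object.keys(a), ...Object.keys(b)])) {
    const x = a[peer] || 0;
    const y = b[peer] || 0;
    if (x > y) aAhead = true;
    if (y > x) bAhead = true;
  }
  if (aAhead && bAhead) return 'concurrent'; // each side has edits the other lacks
  if (aAhead) return 'after';
  if (bAhead) return 'before';
  return 'equal';
}

const shared = { alice: 1, bob: 1 };    // the version both users started from
const aliceEdit = { alice: 2, bob: 1 }; // Alice edits without seeing Bob's edit
const bobEdit = { alice: 1, bob: 2 };   // Bob edits without seeing Alice's edit

console.log(compareVersions(shared, aliceEdit));  // 'before'
console.log(compareVersions(aliceEdit, bobEdit)); // 'concurrent'
```

Note that no wall-clock time appears anywhere: the two edits are concurrent because neither peer had synchronized with the other, regardless of how far apart in time they happened.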

@ghost

ghost commented Feb 23, 2013

I will look into both.

I have about 10 years' experience programming. I know the patterns and
theories. I just wanted to hear what the real-world, low-level
differences are from you, because you wrote it.

I will use CRDT as the operational transformation and see how it goes.

At this stage we are working on the WebCL aspects for the CAD kernel.

These guys have really cracked it:
http://www.hastaladesign.com/?cat=22968

Do a video search for "Softkill Design".

The approach they are taking is based on biomimicry. It will be very
successful and really solve many problems in the world.
The biochemists will be busy though. We need organic-based polymers now to
make them cheaper.

@dominictarr
Contributor

@gedw99 Okay, Great!

I'm super-busy until Monday, but after that I'll write up a wiki page about the differences between the replication approaches.

@ghost

ghost commented Feb 23, 2013

thanks mate - I will look at it.

Gerard

@navaru

navaru commented Feb 23, 2013

I think you'll need a CRDT-based approach since you need to handle specific operations.

I learned Operational Transformation a year ago. To apply OT to a document editor, you define:

  • A document represents a string of characters, but when it comes to OT, a document is a list of changesets.
  • A changeset is a group of edits made within a certain time by one user (~500ms), that may be canceled or propagated as
  • Operational transformation deals with actions (operations) that will be performed on the document. An operation is a sequence of changesets (operation components)

There are two types of OT:

  • primitive operation model (most implemented model)
    • string-wise: insert, delete, update
    • map app specific logic to primitives
  • app-specific operation model (has a more complex transaction layer)
    • n different operations => n * n transformation functions
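
For the string-wise primitive model, one of those transformation functions could look like the sketch below, which handles insert against insert only. The operation shape (`{ pos, text, site }`), the function names, and the tie-break on `site` are assumptions for this example and are not taken from any particular OT library.

```javascript
// Hypothetical insert operation: { pos, text, site }.
// `site` identifies the user and is only used to break ties.

function applyInsert(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so it can be applied after `other` has already been applied.
// Both operations were originally made against the same document.
function transformInsert(op, other) {
  const otherGoesFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  // If the other insert lands before ours, our position shifts right.
  return otherGoesFirst ? { ...op, pos: op.pos + other.text.length } : op;
}

const doc = 'abc';
const opA = { pos: 1, text: 'X', site: 'A' };  // user A inserts at position 1
const opB = { pos: 1, text: 'YY', site: 'B' }; // user B inserts at the same position

// Each user applies their own edit first, then the transformed remote edit.
const atA = applyInsert(applyInsert(doc, opA), transformInsert(opB, opA));
const atB = applyInsert(applyInsert(doc, opB), transformInsert(opA, opB));

console.log(atA, atB); // both users end up with the same text
```

A full string-wise model would also need insert/delete, delete/insert, and delete/delete transforms, which is where the n * n growth comes from.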

OT has the downside that you need a server to handle the transactions, so no peer-to-peer.

I'm still researching CRDT.

How complex is your data model?

Anyway, biomimicry is awesome!

@ghost

ghost commented Feb 23, 2013

Thanks, Eugen.

OK, this pretty much aligns with what I have learnt about OT patterns too.
I too still do not fully understand how that differs from CRDT.

There is an interesting group that has written a system that does OT
patterns but without using OT.
It's called substance.io.
They have it working well.

P2P is not vital. It was just a nice-to-have. An authoritative server can do the
merging transactions.

I built a stereolithographic printer last year and played around with
different materials for printing houses.
It's a fast process, but the photopolymers are expensive and break down
over time.
You can mix carbon nanotube emulsions into them to help, but they will still
break down.
The key is finding an organic material.

Right now I am playing around with a LightScribe machine and the production of
graphene.
I intend to use this process to make 3D structures out of graphene.

The great thing is that it also makes a great battery. Handy for anything
you want to make, since the structure itself is the battery.

G

@ralphtheninja ralphtheninja reopened this Dec 18, 2018
@ralphtheninja ralphtheninja transferred this issue from Level/levelup Dec 18, 2018
@vweevers vweevers added the discussion Discussion label Jan 1, 2019
@vweevers vweevers added the stale This issue or pull request is old label Nov 2, 2022
@vweevers vweevers closed this as not planned Won't fix, can't repro, duplicate, stale Nov 2, 2022

5 participants