Data persistence on IPFS #86

Open
cbruguera opened this issue Mar 22, 2018 · 10 comments

Comments

@cbruguera

Hello, I'm curious to know how data persistence is handled on IPFS, given that (if my understanding is correct) files must be "pinned" by an IPFS node in order to achieve persistence, yet the data is prone to loss (or unavailability) if it's hosted on a single node that goes offline.

Where can I find details on how persistence (and availability) is ensured by OrbitDB? Also, on a related note, does OrbitDB work as a "private" cluster of IPFS nodes, or does it connect to the IPFS network as a whole via some public gateway?

Thanks in advance for any feedback on the matter.

@fazo96

fazo96 commented Mar 26, 2018

Whether OrbitDB uses a private cluster of IPFS nodes depends on how you configure IPFS/libp2p. By default the nodes connect to the public network, but I think there's some early experimental support for private networks.

Persistence in IPFS works like this: you only replicate the data you read, so you will never end up caching, uploading or replicating something that you didn't explicitly request from IPFS.

In OrbitDB, when your local copy updates, it fetches all the new entries from IPFS so they get copied to your local node, and your local node then helps serve them to the rest of the network.

Pinning means that when the garbage collector of IPFS runs (to free some space), it won't ever delete your pinned data. OrbitDB, as far as I know, never pins anything.
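
To make the pin/GC relationship concrete, here is a minimal toy model. It is not the IPFS API: `ToyBlockStore`, its methods and the `Qm...` names are made up for illustration. It only shows the rule described above, that garbage collection drops unpinned blocks and leaves pinned ones alone.

```javascript
// Toy model only: this is NOT the IPFS API, just an illustration of pin vs. GC.
class ToyBlockStore {
  constructor() {
    this.blocks = new Map(); // hash -> data held by this node
    this.pins = new Set();   // hashes protected from garbage collection
  }

  put(hash, data) {
    this.blocks.set(hash, data);
  }

  pin(hash) {
    if (!this.blocks.has(hash)) throw new Error(`unknown block: ${hash}`);
    this.pins.add(hash);
  }

  has(hash) {
    return this.blocks.has(hash);
  }

  // Remove every block that is not pinned and return the removed hashes.
  gc() {
    const removed = [];
    for (const hash of [...this.blocks.keys()]) {
      if (!this.pins.has(hash)) {
        this.blocks.delete(hash);
        removed.push(hash);
      }
    }
    return removed;
  }
}

const store = new ToyBlockStore();
store.put('QmPinned', 'kept on purpose');
store.put('QmCached', 'only cached after a read');
store.pin('QmPinned');

const removed = store.gc();
console.log(removed, store.has('QmPinned'), store.has('QmCached'));
// [ 'QmCached' ] true false
```

In this model, data that was merely read and cached disappears on the next GC run, which is why unpinned data on a node with GC enabled cannot be relied on for persistence.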

@balupton

Is there an option then to do a full clone of the data, and to keep it up to date with new data, to ensure persistence across multiple nodes that are programmed to do the same?

@fazo96

fazo96 commented Mar 28, 2018

By default, when you open a database with orbit-db, it syncs so that all of the nodes replicate all the data. They also cache it locally, so if you restart them you can load what they synced from local storage instead of having to replicate it all over again from the network.

For this to work there has to be at least one reachable online node to sync from, otherwise new nodes won't be able to get the data. If nobody is online or you delete the local storage of all nodes, then you will have lost the database.
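
The availability condition above can be sketched with a toy simulation. Nothing here is the OrbitDB or IPFS API: `ToyPeer`, `online` and `syncFrom` are hypothetical names, and real replication is driven by the network rather than by an explicit call. The sketch only illustrates that a new node can catch up as long as at least one peer holding the data is reachable.

```javascript
// Toy model only: peers are plain objects, not real OrbitDB/IPFS nodes.
class ToyPeer {
  constructor(name) {
    this.name = name;
    this.online = true;
    this.entries = new Map(); // entry hash -> payload (the "local storage")
  }

  add(hash, payload) {
    this.entries.set(hash, payload);
  }

  // Copy every entry we are missing from peers that are currently online.
  // Returns the number of entries fetched.
  syncFrom(peers) {
    let fetched = 0;
    for (const peer of peers) {
      if (peer === this || !peer.online) continue;
      for (const [hash, payload] of peer.entries) {
        if (!this.entries.has(hash)) {
          this.entries.set(hash, payload);
          fetched += 1;
        }
      }
    }
    return fetched;
  }
}

const a = new ToyPeer('a');
a.add('entry1', { text: 'hello' });
a.add('entry2', { text: 'world' });

const b = new ToyPeer('b');
console.log(b.syncFrom([a]));    // 2: b now holds a full copy

a.online = false;                // the original writer goes offline
const c = new ToyPeer('c');
console.log(c.syncFrom([a, b])); // 2: c can still sync because b is online

b.online = false;                // now nobody holding the data is reachable
const d = new ToyPeer('d');
console.log(d.syncFrom([a, b])); // 0: d cannot obtain the database
```

The data is not lost in the last step as long as `a` or `b` keeps its local storage and comes back online; it is only unavailable in the meantime.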

So @balupton just by opening the same database on multiple machines/nodes they will keep up to date by themselves.

Of course, if you write multihashes into the database (for example, you want to keep a feed of videos, so you write the multihashes of the videos to an orbit-db-feed like this: { videoMultihash: 'Qm...' }), orbit-db will not open them, because to it they are just strings. You will have to ensure those are replicated yourself. orbit-db only keeps the objects you put into it synced and won't follow links or multihashes.
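
One way an application might handle this itself is to walk the entry payloads, collect the linked multihashes and hand them to whatever pinning mechanism it uses. The sketch below is an assumption, not part of orbit-db: `collectLinkedHashes` and `pinLinkedContent` are hypothetical helpers, the hashes are placeholders, and the `pin` callback is a stub. With js-ipfs the callback could wrap `ipfs.pin.add`, but that wiring is not shown in this thread.

```javascript
// Hypothetical helper: collects the multihash strings stored under `field`
// in each entry payload, de-duplicated, in first-seen order.
function collectLinkedHashes(payloads, field) {
  const hashes = new Set();
  for (const payload of payloads) {
    const value = payload && payload[field];
    if (typeof value === 'string' && value.length > 0) hashes.add(value);
  }
  return [...hashes];
}

// `pin` is supplied by the caller (an assumption): any async function that
// makes sure the content behind one hash is kept by the local node.
async function pinLinkedContent(payloads, field, pin) {
  const hashes = collectLinkedHashes(payloads, field);
  for (const hash of hashes) {
    await pin(hash);
  }
  return hashes;
}

// Usage with placeholder hashes and a stub pin function that only records calls.
const feedPayloads = [
  { videoMultihash: 'QmExampleVideoA' },
  { videoMultihash: 'QmExampleVideoB' },
  { videoMultihash: 'QmExampleVideoA' }, // same video linked twice
  { title: 'entry without a video' },
];

const pinnedByStub = [];
pinLinkedContent(feedPayloads, 'videoMultihash', async (hash) => {
  pinnedByStub.push(hash);
}).then((hashes) => console.log('pinned:', hashes));
```

Every node that is meant to keep the videos available would need to run something like this, since syncing the feed only copies the `{ videoMultihash: ... }` objects themselves.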

@haadcode
Member

@fazo96 has already done a great job of explaining persistence, but I wanted to add a note that js-ipfs currently doesn't have GC, so nothing gets removed, meaning everything is effectively pinned by default.

However, this will change in the future as js-ipfs gets GC, and we want to make sure that OrbitDB actually persists everything (by default), so some work on pinning needs to happen. If you're using OrbitDB with go-ipfs (through js-ipfs-api), then GC does happen and data may no longer be persisted after some time. This is a known issue and we're planning to implement actual pinning (from the IPFS perspective) soon.

@balupton

> So @balupton just by opening the same database on multiple machines/nodes they will keep up to date by themselves.

Sweet. And what about the option of having it so new nodes can add new data without replicating past data?

@aphelionz
Member

Moving to the Field Manual for more details / discussion

@aphelionz aphelionz transferred this issue from orbitdb/orbitdb Sep 27, 2019
@revolunet

Hi, does anyone have an example of setting up a "backup" IPFS server for OrbitDB persistence?

@aphelionz
Member

There are a few different efforts going. The one I've been using and working on is https://github.com/Jon-Biz/orbitdb-pinner

@Belz-tech

Hi.
I am really new to IPFS and OrbitDB. I have read quite a lot of articles and have done extensive research, but am still confused. Could you use it for inventory management, and if so, how do you keep the data for all transactions, stock movements, available stock, etc.? Surely everything can't sit in nodes and be dependent on someone reading or pinning it. What happens if all nodes go down?

@bitcard

bitcard commented Mar 1, 2021

mark
