
Create a db for creation bytecode #111

Closed
edisinovcic opened this issue Mar 30, 2020 · 13 comments

@edisinovcic
Contributor

edisinovcic commented Mar 30, 2020

  • address (PK)
  • creation bytecode
  • chain id
  • block number
  • deployed code hash
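
For reference, a minimal sketch of one record as a TypeScript type; the field names are illustrative assumptions, only the fields themselves come from the list above.

```typescript
// Hypothetical shape of one row in the proposed creation-bytecode DB.
interface CreationBytecodeRecord {
  address: string;          // contract address, primary key, e.g. "0x1234..."
  creationBytecode: string; // hex-encoded creation (init) bytecode
  chainId: number;          // e.g. 1 for mainnet, 5 for Goerli
  blockNumber: number;      // block in which the contract was created
  deployedCodeHash: string; // hash (e.g. keccak256) of the deployed runtime code
}
```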


@ligi
Member

ligi commented Mar 30, 2020

We might want to try OrbitDB for this: https://github.com/orbitdb/orbit-db
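
For context, a minimal sketch of what storing these records in OrbitDB's key-value store could look like, based on the orbit-db 0.x API; the DB name and the record values are assumptions, not the project's actual data.

```typescript
import * as IPFS from 'ipfs';
import OrbitDB from 'orbit-db';

async function main() {
  // Spin up an in-process js-ipfs node and an OrbitDB instance on top of it
  const ipfs = await IPFS.create();
  const orbitdb = await OrbitDB.createInstance(ipfs);

  // One key-value store, keyed by contract address
  const db = await orbitdb.keyvalue('creation-bytecode');
  await db.load();

  // Hypothetical record; fields follow the list in the opening comment
  await db.put('0x0000000000000000000000000000000000000001', {
    creationBytecode: '0x6080604052...',
    chainId: 5, // Goerli
    blockNumber: 1234567,
    deployedCodeHash: '0xabcdef...',
  });

  console.log(db.get('0x0000000000000000000000000000000000000001'));
  // The store's address is what peers would replicate from:
  console.log(db.address.toString());
}

main().catch(console.error);
```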

@edisinovcic
Contributor Author

edisinovcic commented Mar 31, 2020

Doing a small test of orbit-db on Goerli data here: https://github.com/Shard-Labs/orbit-db
Go implementation: https://github.com/berty/go-orbit-db
Test implementation: https://codesandbox.io/s/orbitdb-starter-template-4c8qu

@chriseth
Collaborator

While OrbitDB sounds interesting, I think a simple local db with fast inserting and indexing could be the better choice here... :)

@edisinovcic
Contributor Author

It's definitely an interesting tool; I'm not sure how ready it is for what we need. I've been playing with it a bit and it's relatively simple to use. The only thing I'm not sure about is long-term data persistence, which is something we need.

@edisinovcic
Contributor Author

@ligi if you want, feel free to experiment with it; the code is on the repo. Publishing is implemented; reading still needs to be done.

@tjayrush

> only thing I'm not sure about is data persistence in the long run which is something necessary for us

Sorry to jump in, but can you please elaborate on this comment a bit? Do you currently have any plans for how this super-critical data might be 'persisted in the long run'? To me, this is one of the most important design criteria.

@edisinovcic
Contributor Author

edisinovcic commented Mar 31, 2020

Data will be published on IPFS and a backup will be held on S3, so if one of them fails the other will take over.

Data will be fetched from IPFS; S3 is just a backup. We will also keep multiple replicas of the S3 data, and we will support and encourage other users to run the whole system on their own machines. We don't want to be the only holders of the data, and the whole flow will be reproducible by any user who runs it themselves. The S3 buckets will also be public.

The system will always give priority to decentralized storage (IPFS) before accessing centralized storage (S3). Only if IPFS is down or unreachable will the centralized storage be tried.
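
As an illustration of the "decentralized first" retrieval order described above, a sketch in TypeScript; the gateway URL and bucket name are placeholders, and a runtime with a global `fetch` (Node 18+ or a browser) is assumed.

```typescript
// Try IPFS first; fall back to the public S3 backup only if IPFS fails.
async function fetchData(cid: string): Promise<Uint8Array> {
  try {
    const res = await fetch(`https://ipfs.io/ipfs/${cid}`);
    if (!res.ok) throw new Error(`IPFS gateway returned ${res.status}`);
    return new Uint8Array(await res.arrayBuffer());
  } catch {
    // Centralized fallback: hypothetical public bucket mirroring the IPFS data
    const res = await fetch(`https://example-backup.s3.amazonaws.com/${cid}`);
    if (!res.ok) throw new Error(`S3 fallback failed with ${res.status}`);
    return new Uint8Array(await res.arrayBuffer());
  }
}
```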

Hope this helps :)

@tjayrush

It totally helps. I'm running into the same issues with my work, which has the same sort of 'this data belongs to the community and shouldn't be held by any one person or group of people' aspect to it.

IPFS seems like the solution, especially for immutable data that can be easily put into a content-addressable store. One of the things I've done is create 'snapshots' of the data (I'm building an index of address appearances): every so often, I stop adding data to the index and write a snapshot to IPFS.
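
A periodic snapshot like the one described could be pinned to IPFS with a few lines of ipfs-http-client; the file path and daemon endpoint here are assumptions.

```typescript
import { readFile } from 'fs/promises';
import { create } from 'ipfs-http-client';

// Pin one finished snapshot file to a local IPFS daemon and return its CID
async function publishSnapshot(path: string): Promise<string> {
  const ipfs = create({ url: 'http://127.0.0.1:5001' }); // local IPFS daemon
  const { cid } = await ipfs.add(await readFile(path), { pin: true });
  return cid.toString();
}

publishSnapshot('./index-snapshot-000123.bin').then(console.log);
```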

How do you plan on publishing the location (IPFS hashes) of the data, and, is the data you're creating immutable?

Thanks for taking the time to answer. I feel like we could solve the same problem in the same way. Your project (collecting source) has a side effect (automatically collecting all the function and event signatures) that my project needs.

If it works out, I'd gladly run one of your 'nodes' to produce ABI data as my users need it.

@edisinovcic
Contributor Author

Basically, we've implemented a cron job that does this periodically for both IPFS (still some polishing to do) and S3. We also plan to add ENS and IPNS support so the data is more easily discoverable by humans.
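
A sketch of what the planned IPNS part could look like with ipfs-http-client: after each cron run, the node's IPNS name is repointed at the latest snapshot. The CID value and endpoint are placeholders.

```typescript
import { create } from 'ipfs-http-client';

// After each periodic publish, point our IPNS name at the newest snapshot,
// so consumers can resolve a stable name instead of tracking raw CIDs.
async function repointIpns(latestCid: string) {
  const ipfs = create({ url: 'http://127.0.0.1:5001' });
  const { name, value } = await ipfs.name.publish(`/ipfs/${latestCid}`);
  console.log(`IPNS ${name} now resolves to ${value}`);
}

repointIpns('QmPlaceholderCid...');
```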

Awesome. We are currently running this fork of geth; if you want, you can make changes there if they are not too complex, and we can probably do this to conserve resources.

By the way, we have a call every Monday at 3pm UTC; if you are able to join the next one, that would be great. :)

@chriseth
Collaborator

chriseth commented Apr 1, 2020

@tjayrush we talked about this project in Osaka, right?

@tjayrush

tjayrush commented Apr 2, 2020

Yes. We met through Griff Green.

For TrueBlocks, we don't really need the source code (yet), but from the source we can extract the ABI, and the ABI is super useful to us. Our code watches every block (as part of its index creation). I'd love to add a feature that extracts the same data you're extracting, publishes that data somewhere in some format (that you define), and, as a by-product, extracts the ABIs I need (and also shares them).

My biggest engineering concern is "making sure whoever gathers the data doesn't become the only group with the data." Or maybe better called the "data capture problem."

I'm convinced that all old-fashioned, Web 2.0 methods of data delivery result in data capture, eventually. I'd love to figure out how to deliver data (and thereby build apps) so that an app can get every piece of data it needs without any possibility of 'the system' being captured out from under it. That has to be engineered on purpose.

Sorry to rant -- over and out.

@edisinovcic edisinovcic self-assigned this Apr 6, 2020
@edisinovcic
Contributor Author

edisinovcic commented Apr 6, 2020

In the end, in my opinion, the best option would be to go with PostgreSQL (we need indexed data here), using Hasura for easier access to the database (optional); basically, an API on top of PostgreSQL. For replication to other clients, a mechanism called Hot Standby can be used if direct database replication is desired.
This is only for the creation bytecode, not the verified contracts themselves. Currently, the size of the text file is around 50 GB on mainnet (not optimized) and still syncing, so storing this on IPFS needs to be considered after optimizing. This file itself will be converted into the database.
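
A minimal sketch of what that table could look like, created through node-postgres; the table and column names and the index choice are assumptions based on the fields listed in the opening comment.

```typescript
import { Client } from 'pg';

async function createSchema() {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  await client.query(`
    CREATE TABLE IF NOT EXISTS creation_bytecode (
      address            BYTEA PRIMARY KEY,   -- contract address
      creation_bytecode  BYTEA NOT NULL,      -- init code
      chain_id           INTEGER NOT NULL,
      block_number       BIGINT  NOT NULL,
      deployed_code_hash BYTEA   NOT NULL     -- hash of the runtime code
    );
    -- Index the runtime-code hash so "which creation produced this code?"
    -- lookups stay fast as the table grows.
    CREATE INDEX IF NOT EXISTS creation_bytecode_deployed_code_hash_idx
      ON creation_bytecode (deployed_code_hash);
  `);
  await client.end();
}

createSchema().catch(console.error);
```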

@edisinovcic edisinovcic changed the title Create a db with columns Create a db for creation metadata Apr 6, 2020
@edisinovcic edisinovcic changed the title Create a db for creation metadata Create a db for creation bytecode Apr 6, 2020