
roadmapping #1

Open
8 of 31 tasks
Tracked by #70
serapath opened this issue Aug 14, 2019 · 0 comments

Comments

@serapath
Member

serapath commented Aug 14, 2019

@todo

  • Run the scraper locally and make sure it runs and stores all contracts
  • Write a tutorial on how to use the updated GitHub scraper repository and how to publish and run it on Digital Ocean
  • Run the scraper server on a Digital Ocean droplet and make sure it keeps running
  • replace MySQL with the filesystem and store all smart contract source codes in sourcecodes/ (see the storage sketch after this list)
  • create REST API server with routes to check (see the server sketch after this list):
    1. how much space the sourcecodes/**/* folder is using
    2. how many contracts are stored in total (maybe: https://stackoverflow.com/a/22957193)
    3. respond with the result of ls ./sourcecodes
  • share/publish scraped data
    • dat share the folder of stored contracts using the dat CLI
  • datdot-tcir#4: scraper until presentation for EthIndia
  • datdot-cir#7 AST/type/data analysis
  • datdot-cir#5 p2p publish data archive
  • datdot-cir#2 AST import statement analysis and content hash replacement
  • datdot-tcir#3 index structure (e.g. hypertrie(s))
  • check that the scraper continues to run
  • once all content has been scraped, try adding a new contract to see if the scraper receives the update - if not, make it work :-)
  • datdot-tcir#6 maintain an always up-to-date list of proxy servers
  • add an additional /dat API route to the server to tell whether dat share ... is still serving
    • if so, curl https://<digitaloceanserver ip>/dat should return e.g. true
  • also http://104.248.146.46:3001/list has broken formatting and the output seems to be formatted with HTML, but it would be better to return raw text with no HTML tags at all
  • also, can you change the README.MD, delete the old content in it, and instead write the scraper server address and document the API? :-)
    • http://104.248.146.46:3001
      • /dat - returns true or false to tell whether dat is still active
      • /daturl - returns the dat://<address> which can be used to download all contracts
      • /count - returns the total number of contracts currently stored on the server
      • /size - shows how much disk space contracts occupy
      • /list - shows a list of all contracts on the server
  • write a tutorial (with screencast) on how to update the Digital Ocean droplet with the changed code and keep it running
  • scrape source codes manually from URLs below to add them to the database once
    • contact @serapath for some additional sources to scrape
  • e.g. typeverify#1: verify items
  • datdot-tcir#3: hypertrie(s) structure (see the index sketch after this list)
    • start with a small subset of contracts (...maybe 20 items)
    • create a modified verifier module to still be able to verify them
    • process all items and verify them once the small subset of 20 works and can be verified
  • #5 store all items in dat and publish on DHT (see the publishing sketch after this list)
  • make a repository for a pure dat-based p2p database node which can run on DappNode
  • proof of concept: create a datdot-tcir instance on Digital Ocean to pin the items archive
  • work out a fully featured concept of scsen
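
Below are a few sketches for the tasks above; all of them are assumptions about the eventual shape of the code, not the scraper's actual implementation. First, the filesystem storage that replaces MySQL, plus the count/size helpers the API routes need:

```js
// a minimal sketch, assuming contracts arrive as (address, source) pairs;
// storeContract, the flat sourcecodes/ layout and the .sol naming are
// illustrative assumptions, not the scraper's actual code
const fs = require('fs')
const path = require('path')

const DIR = path.join(__dirname, 'sourcecodes')

function storeContract (address, source) {
  fs.mkdirSync(DIR, { recursive: true }) // needs Node >= 10.12
  fs.writeFileSync(path.join(DIR, address + '.sol'), source)
}

function countContracts () { // total number of stored contracts (for /count)
  return fs.readdirSync(DIR).length
}

function totalSize () { // bytes used by sourcecodes/ (for /size)
  return fs.readdirSync(DIR)
    .map(name => fs.statSync(path.join(DIR, name)).size)
    .reduce((sum, size) => sum + size, 0)
}

module.exports = { storeContract, countContracts, totalSize }
```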
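
Next, the REST API server with the routes documented above. express is an assumption (any Node HTTP server works), ./store is the storage sketch above, and datUrl is a hypothetical variable; every route answers with raw text, no HTML, per the /list fix above:

```js
// a minimal sketch of the documented routes, using express (an assumption)
const express = require('express')
const fs = require('fs')
const { countContracts, totalSize } = require('./store') // the storage sketch above

const app = express()
let datUrl = null // hypothetical: set to 'dat://<address>' once dat share is up

app.get('/dat', (req, res) => res.type('text/plain').send(String(datUrl !== null)))
app.get('/daturl', (req, res) => res.type('text/plain').send(datUrl || ''))
app.get('/count', (req, res) => res.type('text/plain').send(String(countContracts())))
app.get('/size', (req, res) => res.type('text/plain').send(totalSize() + ' bytes'))
app.get('/list', (req, res) => {
  res.type('text/plain').send(fs.readdirSync('./sourcecodes').join('\n'))
})

app.listen(3001)
```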
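
For sharing the folder, the task above uses the dat CLI; as a programmatic sketch of the same step, the dat-node package can import the files and announce the archive on the DHT:

```js
// a programmatic sketch using the dat-node package (an assumption - the task
// above uses the dat CLI, which does the same)
const Dat = require('dat-node')

Dat('./sourcecodes', (err, dat) => {
  if (err) throw err
  dat.importFiles()  // import the stored contract files into the archive
  dat.joinNetwork()  // announce on the DHT so peers can replicate
  console.log('sharing dat://' + dat.key.toString('hex')) // value for /daturl
})
```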
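
Finally, a sketch of the hypertrie index structure for datdot-tcir#3, assuming contracts are keyed by address; the 'contracts/' key prefix and the metadata shape are illustrative, not a decided format:

```js
// a minimal sketch of a hypertrie-based index of stored contracts
const hypertrie = require('hypertrie')

const trie = hypertrie('./index.db', { valueEncoding: 'json' })

const address = '0x0000000000000000000000000000000000000000' // placeholder

// index one contract: address -> metadata
trie.put('contracts/' + address, { file: 'sourcecodes/' + address + '.sol', verified: false }, err => {
  if (err) throw err
  trie.get('contracts/' + address, (err, node) => {
    if (err) throw err
    console.log(node.key, node.value) // the stored key and metadata
  })
})
```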

maybe we even need multiple servers - one on each continent? :-)

We could either use:

serapath changed the title from "concept" to "scraper concept & updates" on Aug 16, 2019
serapath pinned this issue on Aug 17, 2019
serapath changed the title from "scraper concept & updates" to "roadmapping" on May 5, 2020