privacy and self-sovereign identity #3013

Closed
synctext opened this issue Jun 23, 2017 · 62 comments

@synctext
Member

synctext commented Jun 23, 2017

draft thesis direction

Investigate the revealing of privacy-sensitive attributes of your identity.
Should be usable even in the sensitive medical domain with electronic patient files.

First step, find related work in this area (irma, zero-knowledge, zkSNARKs)

For next meeting: gather all general docs (weforum, Djuri's thesis, etc.) in PDF form.

@synctext
Member Author

synctext commented Jun 26, 2017

Fascinating 2012 report from the World Economic Forum

To restore trust, this report proposes three separate, but related
questions, which need to be addressed by all stakeholders:

1. Protection and Security: How can personal data be protected
   and secured against intentional and unintentional security
   breach and misuse?
2. Rights and Responsibilities for Using Data: How can rights
   and responsibilities, and therefore appropriate permissions, be
   established for personal data to flow in ways that both respect
   its context and balance the interests of all stakeholders?
3. Accountability and Enforcement: How can organizations
   be held accountable for protecting, securing and using
   personal data, in accordance with the rights and established
   permissions for the trusted flow of data?

Our government is already conducting various initiatives within this area. See this letter to parliament about the various ongoing matters.

@synctext synctext changed the title Self-sovereign identity and privacy privacy and aelf-sovereign identity Jun 29, 2017
@synctext synctext changed the title privacy and aelf-sovereign identity privacy and self-sovereign identity Jun 29, 2017
@AngelaBarrio

AngelaBarrio commented Jul 7, 2017

Read literature:

  • The inevitable rise of self-sovereign identity - Sovrin Foundation, 2016
  • Towards self-sovereign identity using blockchain technology - D. Baars, 2016
  • Portable trust: Biometric-based authentication and blockchain storage for self-sovereign identity systems - Hammudoglu, J. et al., 2017
  • Decentralizing privacy: using blockchain to protect personal data - Zyskind et al., 2015
  • Healthcare data gateways: found healthcare intelligence on blockchain with novel privacy risk control - Yue, X. et al., 2016
  • A case study for blockchain in healthcare: "MedRec" prototype for electronic health records and medical research data - Ekblaw et al., 2016
  • Rethinking personal data: strengthening trust - World Economic Forum, 2012
  • Portable reputation toolkit use cases - Allen et al., 2016
  • The path to self-sovereign identity - Allen, C. et al., 2016

Attended events:

  • Techruption meeting (June 15)
  • ECP special Blockchain: wet- en regelgeving (June 29)

@synctext
Member Author

synctext commented Jul 7, 2017

W3C is creating a group on self-sovereign identity matters. However, it seems too heavy and leans towards a browser-only perspective.

Their charter

Please also read detailed (privacy-minded) crypto papers:

@synctext
Member Author

synctext commented Jul 19, 2017

Key paper discussing 15 years of decentralization, with 189 citations from the literature (including Tribler).
Titled: "Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments"

This paper is a beautiful version of work done 3 years earlier by TU Delft students, titled "The fifteen year struggle of decentralizing privacy-enhancing technology": https://cryptome.org/2014/04/tor-vulns-attacks.pdf

After 15 years of struggling scientists, we now see the emergence of numerous standards groups, which sadly also make little progress. https://github.com/WebOfTrustInfo/ID2020DesignWorkshop/blob/master/topics-and-advance-readings/DID-Whitepaper.md#decentralized-identifiers-dids-and-decentralized-identity-management-didm

@synctext
Member Author

Real patient open dataset:
A sample anonymized data set, including 5,000 patients and 500,000 observations, is available for the following Platform releases (note that the demo data is for OpenMRS Platform releases and will not work for OpenMRS Reference Application releases)

@synctext
Member Author

synctext commented Aug 25, 2017

About self-sovereign IDs..
@qstokkink can you please provide us with a reference to the Stanford paper which is not that popular but seems to have the right crypto magic we need for key attestation and zero-knowledge proof fun?

@qstokkink
Contributor

https://crypto.stanford.edu/%7Edabo/papers/2dnf.pdf

@AngelaBarrio

Read literature:

  • Enigma: Decentralized computation platform with guaranteed privacy
  • Systematizing Decentralization and Privacy: Lessons from 15 Years of Research and Deployments
  • The fifteen year struggle of decentralizing privacy-enhancing technology
  • Evaluating 2-DNF Formulas on Ciphertexts

@AngelaBarrio

Read literature:

  • Wat moet ik nou met blockchain? - Idius Felix, 2017
  • De kwaliteit van het elektronisch patientendossier van huisartsen gemeten - NIVEL, 2011
  • Several presentations and news articles (MedMij, ZorgLog, UMCG blockchain vision etc)

Meeting with Idius Felix. He advised me to look for another domain because I lack domain knowledge about healthcare. He thought my research question was too broad.

Tried to contact Ron van den Bosch (UMCG) but he retired this month.

@AngelaBarrio

@synctext
Member Author

synctext commented Oct 2, 2017

Felix consult: a single electronic patient file does not exist. The data is spread out across hospitals, special clinics, general practitioners, elderly care, etc. Future directions:

  • broader, beyond medical. Generic attribute hiding.
  • narrow, focus on 1 medical case study. Align with the app world.
  • narrow, focus on the electronic switching point of EPD.
  • narrow, focus on 1 group, like general practitioners. Existing open source: company in this area (http://www.elatro.nl/opensource) and ideas.
  • Shift to implementation + policy of self-sovereign ID, internal affairs ministry

ToDo: Learn Qt for Android and Python.

@AngelaBarrio

Activities:

  • Python course at Codecademy (currently at 67%)
  • Read 2 papers about TrustChain

@synctext
Member Author

synctext commented Oct 18, 2017

Medicit input: self-sovereign identity where patient can see all available data, append corrections, control sharing, and control removal. Defined format and medical patient data. Demo app https://mxi.nl/upload/documenten/presentatie_doorontwikkeling_blockchain_zin_izo_mxi.pdf

Next meeting: first Qt app, expand with MijnZorgLog basic features. Possible future steps: TrustChain + self-sovereign ID, IPv8. Smartphone becomes your personal medical data server. Problems: reliability, backup, and theft-locking. Brainstorm idea: spread your encrypted patient files among your social network. Features: remotely disable, your TrustChain locks, TrustChain goes into recovery mode, recover with friends (mechanism to support 1-5 friends?).

This is the Facebook account recovery using trusted contacts:

1. Tap Forgot Password? on the login page.
2. If prompted, find your account by entering your email, phone, username or full name and tap Search.
3. Look at the list of email addresses listed on your account. If you don't have access to any of these, tap No longer have access to these?
4. Enter a new email or phone that you know you can access and tap Continue.
5. Tap Reveal My Trusted Contacts and type the full name of one of your trusted contacts.
6. You'll see a set of instructions that includes a URL. The URL contains a special security code that only your trusted contacts can access. Call your friends and give them the URL so that they can open the link and give the security code to you.
7. Use the security codes from your trusted contacts to access your account.

@synctext
Member Author

synctext commented Oct 25, 2017

Please read this fascinating thesis from yesterday. It links various domains and has an excellent problem description. No code, no engineering.
opportunities for blockchain based identities in healthcare THESIS
Included illustration, copied from above thesis:
dental_care_and_identity_workflow

@AngelaBarrio

Activities:

  • Finished Python course
  • Started Android app course
  • Read report "Fifth Annual Study on Medical Identity Theft"
  • Read paper "Exploring medical identity theft"
  • Read paper "Cybersecurity in healthcare: A systematic review of modern threats and trends"

@synctext
Member Author

synctext commented Nov 6, 2017

  • Smartphone becomes your personal medical data server. Problem, reliability, backup, and theft-locking.
  • medical records storage
  • do not store on a single device
  • decentralised trusted data storage environment: use your friends
  • possible solution: threshold encryption, old 1992 paper, 2001, "recent" 2006 work and 2008
  • app dev suggestion tutorial

@AngelaBarrio

Read literature:

  • Threshold cryptosystems: Authentication and secret sharing (1992)
  • Strength in Numbers: Threshold ECDSA to Protect Keys in the Cloud (2015)

@synctext
Member Author

synctext commented Nov 22, 2017

Buzzword bingo for possible thesis title and storyline:

  • Blockchain-based distributed tamper-proof filesystem using threshold encryption
  • Smartphone-based blockchain storage primitive with theft locking, disaster recovery, side-channel protection, and distributed key storage using threshold encryption
  • Blockchain-based medical file sharing protected using threshold encryption

Suggested ToDo:

  • continue reading papers in more detail
  • try to find a Python library for threshold encryption, try a 2 out of 3 prototype
  • try to understand the possibility of going beyond signatures with encrypted block storage
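
A minimal sketch of what a 2 out of 3 prototype could look like, using plain Shamir secret sharing over a prime field and only the Python standard library. This illustrates the secret-sharing building block, not threshold encryption or threshold signatures themselves; the names `split_secret`, `recover_secret` and the choice of field modulus are assumptions made for this example.

```python
import secrets
from typing import List, Tuple

# Field modulus (assumption for this sketch): 2**127 - 1 is a Mersenne prime.
PRIME = 2**127 - 1

Share = Tuple[int, int]  # (x, y) point on the secret polynomial


def split_secret(secret: int, threshold: int, num_shares: int) -> List[Share]:
    """Split `secret` so that any `threshold` of `num_shares` shares recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")
    # Random polynomial of degree threshold-1 with the secret as constant term.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: List[Share]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    key = secrets.randbelow(PRIME)            # e.g. a key protecting a patient file
    friend_shares = split_secret(key, 2, 3)   # one share per trusted friend
    assert recover_secret(friend_shares[:2]) == key
    assert recover_secret(friend_shares[1:]) == key
```

Any two of the three shares are enough to recover the key, while a single share on its own is just one point on a random line and reveals nothing about the constant term.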

This morning's discussion inspired the following design. A simple party trick, also used in the Enigma design by MIT for distributed encrypted storage. Simple designs like this might actually work!
threshold_encryption_with_blockchain-based_distributed_storage

@synctext
Member Author

synctext commented Dec 14, 2017

  • keeping busy with Distributed Algorithms course
  • Student team Shamir secret sharing on Android
  • elliptic curve detailed reading
  • End-goal: still mentioned above, prototype self-sovereign identity where patient can see all available data, append corrections, control sharing, and control removal.
  • ToDo next sprints:
    • 8-page thesis draft
    • architecture + algorithm + experiment description
    • Create prototype and thesis in final 2 months (theft locking experiment, backup procedure, key storage at friends, loss recovery experiment)

@synctext
Member Author

please read https://link.springer.com/chapter/10.1007/978-3-319-67816-0_21

4 Definitions and Threat Model
We consider a setting in which users maintain attributes about themselves and require registrars to
vouch for these attributes. For example, in order to register the attribute “over 18 years old,” a user
reveals their identity to the government, who verifies their age.
Definition 1 (Passive/active verification)
Definition 2 (Attribute integrity)
Definition 3 (Attribute privacy)

@synctext
Member Author

Related work on zero-knowledge proof, Nature papers. Zero-Knowledge Nuclear Warhead Verification.

@AngelaBarrio

Activities:

  • Christmas holiday
  • Finished Distributed algorithms lab
  • Read paper "Who Am I? Secure Identity Registration on Distributed Ledgers" by Azouvi, Al-Bassam and Meiklejohn (2017)
  • Started writing 8-page thesis draft. Mainly chapter on related work. Now on 5 pages.

@AngelaBarrio

AngelaBarrio commented May 17, 2018

Activities:

  • Read paper "Patient-centered transparency requirements for medical data sharing systems" (Spagnuelo & Lenzini, 2016)
  • Expanded thesis introduction, problem statements and related work section
  • Tried to 100% understand Green & Eisenbarth TECDSA paper. CONCERN: nothing about signature verification in the paper?
  • Attempt to improve my blockchain, not very successful. Made an appointment with a Python/Flask wizard.

thesis_mei.pdf

@synctext
Member Author

@AngelaBarrio

Activities:

  • Expanded mini blockchain with Flask functionality that adds the content of a http post to the blockchain
  • Read papers "Care.data and access to UK health records: patient privacy and public trust" and "Database State" about the privacy law compliance of state databases including EMR
  • Expanded thesis text in several areas
  • Problem: got stuck thinking about software architecture
  • Compared 3 EMR system papers, found one useful, one medium and one quite vague
  • Useful one is MedRec, uses PyEthereum, maybe switch to that
    thesis_6_juni.pdf
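
For reference, a minimal sketch of the hash-chained "mini blockchain" idea, without the Flask layer: a payload (for instance the body of an HTTP POST) is appended as a block, and each block's hash covers its own fields plus the previous block's hash. The class name `MiniChain`, the field names and the example payload are assumptions for illustration, not the prototype's actual code.

```python
import hashlib
import json
import time
from typing import Any, Dict, List

GENESIS_PREVIOUS_HASH = "0" * 64


def block_hash(index: int, timestamp: float, payload: Dict[str, Any], previous_hash: str) -> str:
    """SHA-256 over the block fields; the hash field itself is not part of the input."""
    body = json.dumps(
        {"index": index, "timestamp": timestamp, "payload": payload, "previous_hash": previous_hash},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


class MiniChain:
    """Append-only list of blocks, linked backwards through `previous_hash`."""

    def __init__(self) -> None:
        self.blocks: List[Dict[str, Any]] = []
        self.append({"type": "genesis"})

    def append(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        previous_hash = self.blocks[-1]["hash"] if self.blocks else GENESIS_PREVIOUS_HASH
        index = len(self.blocks)
        timestamp = time.time()
        block = {
            "index": index,
            "timestamp": timestamp,
            "payload": payload,
            "previous_hash": previous_hash,
            "hash": block_hash(index, timestamp, payload, previous_hash),
        }
        self.blocks.append(block)
        return block

    def is_valid(self) -> bool:
        """Recompute every hash and check every backward link."""
        expected_previous = GENESIS_PREVIOUS_HASH
        for b in self.blocks:
            if b["previous_hash"] != expected_previous:
                return False
            if b["hash"] != block_hash(b["index"], b["timestamp"], b["payload"], b["previous_hash"]):
                return False
            expected_previous = b["hash"]
        return True


if __name__ == "__main__":
    chain = MiniChain()
    chain.append({"file": "scan.pdf", "uploader": "gp"})  # hypothetical POST body
    print(chain.is_valid())
```

In a Flask handler, the `append` call would receive the parsed request body; changing any stored payload afterwards makes `is_valid()` fail, which is the tamper-evidence property.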

@synctext
Member Author

synctext commented Jun 6, 2018

Dedicated prototype. Completely from scratch, without any unneeded features and functionality.
Solve the deniability problem: irrefutable, tamper-proof medical records. Mental note: ask Wolter Pieters as 3rd committee member? There is much related work (you found 3), but the field is very fresh and immature.

  • Todo use library for Joint Random Secret Sharing and get some implementation experience.

  • form of multi-party agreement in the encrypted domain. Form a group public key and a private key of the group. Additionally can offer 2 out of 3 redundancy. Smart usage within blockchain/trustchain.

Validation of EMR entries means that an entry becomes official only when both the
patient and the health care provider have agreed to the entry. This is similar to a
person sending a registered letter and the recipient signing for delivery. The patient
cannot claim not to know the content of the entry.
  • The thesis core has now been defined:
The private key is chosen by all the nodes together using Joint Random Secret Sharing
(JRSS). In this technique, each node chooses a random local secret value and
shares it with the group, using Shamir’s Secret Sharing (Gennaro et al. 1996). Every
node adds all the shares together (including its own), resulting in the joint random
secret share. Just one of the nodes needs to introduce randomness to keep the joint
secret unknown.
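
The JRSS description above can be sketched as follows, as a single-process simulation in which all nodes are entries in a dictionary. Function names, the field modulus and the node identifiers are assumptions for this example; in a real deployment the shares would travel over authenticated channels and the joint secret would never be reconstructed, which is done here only to check the arithmetic.

```python
import secrets
from typing import Dict, List, Tuple

PRIME = 2**127 - 1  # field modulus (assumption for this sketch)


def shamir_shares(secret: int, threshold: int, node_ids: List[int]) -> Dict[int, int]:
    """Deal one Shamir share of `secret` to every node id (ids must be non-zero)."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = {}
    for x in node_ids:
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        shares[x] = y
    return shares


def joint_random_shares(node_ids: List[int], threshold: int) -> Tuple[Dict[int, int], Dict[int, int]]:
    """Every node deals a random local secret; every node sums what it received."""
    local_secrets = {i: secrets.randbelow(PRIME) for i in node_ids}
    dealt = {i: shamir_shares(local_secrets[i], threshold, node_ids) for i in node_ids}
    # Node j's joint share is the sum of the shares dealt to j, its own included.
    joint = {j: sum(dealt[i][j] for i in node_ids) % PRIME for j in node_ids}
    # local_secrets is returned only so this sketch can be checked.
    return joint, local_secrets


def interpolate_at_zero(shares: Dict[int, int]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    total = 0
    for xi, yi in shares.items():
        num, den = 1, 1
        for xj in shares:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        total = (total + yi * num * pow(den, -1, PRIME)) % PRIME
    return total


if __name__ == "__main__":
    joint, local_secrets = joint_random_shares([1, 2, 3], threshold=2)
    group_secret = sum(local_secrets.values()) % PRIME
    assert interpolate_at_zero({1: joint[1], 3: joint[3]}) == group_secret
```

Because the joint secret is the sum of all local secrets, it stays uniformly random as long as at least one node picked its local value at random, which matches the remark that just one honest source of randomness is enough.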

@maxvisser

I subscribed to this post and saw that you may be looking at switching to PyEthereum. I see you are trying to do these steps:

  1. Secret splitting
  2. Secret fragment distribution
  3. Secret fragment publishing
  4. Secret reconstruction

This sounds to me like Shamir Secret Sharing. There are currently two projects in the Ethereum ecosystem working on this. As of 31 May 2018, Kimono has released their smart contract on Rinkeby for trustless secret sharing; Kimono Medium post: https://medium.com/@pfh/kimono-trustless-secret-sharing-using-time-locks-on-ethereum-8e7e696494d

The Kimono platform uses a commit-and-reveal scheme that distributes the secret for each piece of data across a network of peers, who share it with the public only once the time-lock is complete. It relies on Shamir Secret Sharing.

uPort also has a bounty open for an implementation of Shamir Secret Sharing (SSS) in WebAssembly, which is almost complete. 3box/sss-wasm#2

@synctext
Member Author

synctext commented Jun 21, 2018

Roadmap for thesis. Concrete tasks

  • Thesis direction and Problem Description draft - Samantha "Barbie" de Jong
  • Related Work - draft
  • initial code - Completely from scratch, without any unneeded features and functionality.
  • Library Joint Random Secret Sharing
  • Chapter 5: Self-sovereign identity prototype
    • related work screenshots
    • offer 2 out of 3 redundancy
  • Chapter 6: Using real medical records with Standard Shamir or "Strength in Numbers: Threshold ECDSA to Protect Keys in the Cloud"
  • FINAL CHAPTER: Select algorithm beyond Standard Shamir
  • Chapter 7: Implement or improve the algorithm (? by making it distributed) {For AngelaChain}
  • DONE

@AngelaBarrio

AngelaBarrio commented Jul 6, 2018

thesis_6_juli.pdf
Activities:

  • Made a homepage where a user can upload a file. The user can indicate who needs to sign this file. This event gets added to the blockchain.
  • The blockchain is displayed on another page
  • Made a button to make an ECDSA signature using the python-ecdsa library. Still needs to be linked to the actual document.
  • Made some additions and improvements to the thesis draft.

@synctext
Member Author

synctext commented Jul 6, 2018

  • 4.3 Digital signature algorithm: are there also various crypto libraries to choose from?
  • 2.2.1 Accountability on access: are these the requirements for your prototype or research goals?
  • Another project: move from approving invoices, towards approving hours and try to create jointly agreed facts. Simplify administration greatly.
  • mention loss recovery or out of thesis scope?
  • thesis storyline?
    • tamper-proof access log
    • Irrefutable, tamper-proof medical records
  • make it nice by taking inspiration and color code from others? Note the power of anatomy pictures
    image
    image
    image

@AngelaBarrio

AngelaBarrio commented Jul 26, 2018

thesis_26_juli.pdf
Activities:
Created basic login page.
Improved prototype by adding working sign functionality. Uploader can check boxes to determine who is required to sign the event. Logged-in user can sign the event, this gets added to the blockchain as well.
Made a start with the access monitoring functionality.
Some small additions to thesis draft.

@synctext
Member Author

synctext commented Jul 26, 2018

next steps

  • making it distributed (e.g. multiple VMs or multiple instances with local sockets)
  • define end goal or target features
    • fancy zero-knowledge proof, threshold crypto
    • Barbie storyline
      • tamper-proof access log
      • Irrefutable, tamper-proof medical records
      • create 1 or 2 screens to make full functional for thesis inclusion
      • Thesis scenario. Beyond single patient and GP: multiple specialists, GP and visit abroad
      • complex care scenario
        A 74-year-old Latino patient with a history of hypertension, diabetes mellitus, dyslipidemia, multivessel coronary and peripheral arterial diseases, and chronic osteomyelitis was recently discharged from our hospital after undergoing several toe amputations. His hospital course was complicated by contrast-induced nephropathy that required hemodialysis; heparin-induced thrombocytopenia; lower extremity cellulitis; and significant functional decline.
  • create superior version of MijnZorgLog blockchain on central Azure servers
    • focus on hour validation and registration
      image
    • use their Figure 10 as guideline
    • prove that there is no need for smart contracts, Azure servers, and mining nodes.
  • real patient data? (from test DB mentioned above)
  • MedMij direction, persoonlijke zorg omgeving, mybeing example
  • performance analysis (correctness is bottleneck, how to make graphs of that?)
  • compare with requirements

@AngelaBarrio

AngelaBarrio commented Sep 3, 2018

Activities last month:

  • Created a page where the users can see and download the entries.
  • Added sign-button (but it does not work 100% because it needs to be connected to the logged-in user)
  • Sign-events are added to the blockchain
  • Made download button, but download (read) event is not added to blockchain yet
  • Made blockchain semi-distributed (there is one master node which hosts the website)
  • Changed login page (just finished, was needed for signing functionality).
  • Some additions to thesis draft. Tried to find out how existing systems do their logging but they don't respond to my emails.

thesis_3_sept.pdf

@synctext
Member Author

synctext commented Sep 3, 2018

  • "This problem cannot be solved with a smart algorithm: it must be brute forced." add nuance
  • "This problem cannot be solved with a smart algorithm: it must be brute forced. A
    consequence of this is the effectiveness against Sybil attacks. The high costs of creating
    pseudonymous identities prevents cheap attacks (Vukolić 2015). A drawback
    is that it is (obviously) computationally intensive, and therefore uses much energy,
    which is an unnecessary strain on the environment" storyline switches from brute-force, to Sybil, and
    back.
  • Design choices or System architecture and design space.
  • who serves the webpages is an architecture or implementation issue
  • Todo: how to increase visibility of the engineering hours of prototyping?
  • others also failed to find a lib. Seems a fancy lib to mention: https://petlib.readthedocs.io/en/latest/
  • key issue "5.3.1 Using threshold signatures", very hidden.
  • Solid progress, so far. Thesis is growing. FINAL CHAPTER... What is the key contribution of the thesis? Prove that it solves the Barbie problem? Threshold in a medical (emergency) access scenario? Or creation of efficient threshold EC sigs in the distributed prototype? Performance analysis for 1 million patients? Real-time updates from any medical professional and full replication of your patient data?
    • build storyline in your final chapter, brainstorm:
    • Doctor consults your medical file to formulate treatment plan ( you can read this in log)
    • outcome medical test notification ( you can read this in log)
    • approve renewal medication usage ( you can read this in log)
    • emergency room checks your allergy for penicillin ( you can read this in log)
  • Becomes part-time thesis

@AngelaBarrio

Activities last 3 weeks:

  • everything works except for retrieving blockchain from another node
  • made website ok looking
  • changed and added some parts in thesis draft
    thesis-24-sept.pdf

@synctext
Member Author

Discovered initiatives:

Feedback:

  • Chapter 2, intro line. For instance, "the central problem of this thesis is a new method of access to medical files using tamper-proof logging, non-repudiation and guaranteed patient informing. The patient owns all data in our model and records all access by medical staff, instead of trying to create a fault-free security moat. Ease of access during an emergency is key to usability in the real world."
  • "2.1 Barbie’s medical records in HiX" Call it the Barbie '2.1 leading thesis use-case'; it then has more emphasis
  • Single mention in 4.1 of magical essence "auditable access trail"
  • use Medrec scholar citation to find additional related work
    • "Secure and Trustable Electronic Medical Records Sharing using Blockchain"
    • "A Blockchain-based Approach to the Secure Sharing of Healthcare"..
      image
  • shamelessly be inspired by their scenarios: "Scenario 1: Primary Patient Care. Using blockchain technology for primary patient care can help to address the following problems of the current healthcare systems. 1) A patient often visits multiple disconnected hospitals. He has to keep the history of all his data and maintain the updates. This leads to the situation when required information may not be available."
  • Figure 4.1 [ToDo Citations], Fig 4.2,..
  • Make this more prominent as, possibly, section 5.1 Patient Data Ownership
  • plus Who Owns the Data? Open Data for Healthcare
  • 5.2 "Analysis of functionality and choices of current systems"
  • unnamed page 21 table: No. 1 property "maturity level" "screenshots-only", "prototype exists"
  • Stars on GitHub (16-07-18), "Github community visibility", "540 stars"
  • 6.1 : event-driven architecture, blockchain primitives, user-experience, tokens and password management (e.g. UZI pas)
  • use-cases, own chapter or inside a certain chapter...
  • chapter 7, giving patients data ownership represents a complete reorganization of medical practices. change data governance (non-TBM version of term).

@synctext
Member Author

synctext commented Oct 15, 2018

Performance analysis:

  • run in cmdline scripted form
  • enhance the prototype so that it does something with an x-amount of people in a fixed scenario
  • X-axis, increase number of patients, doctors from 1 to 30 (or try 50?)
  • Y-Axis, determine CPU usage, memory, number of messages,...
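
A rough sketch of such a scripted command-line measurement loop, using only the standard library. `run_scenario` is a hypothetical stand-in workload (a chain of hashed events); in practice it would be replaced by a call into the prototype. The counts, field names and CSV layout are assumptions for illustration.

```python
import hashlib
import time
import tracemalloc
from typing import Dict, Iterable, List


def run_scenario(num_users: int, events_per_user: int = 10) -> int:
    """Hypothetical fixed scenario: every user emits events that are hashed into a chain.

    Returns the number of messages produced.
    """
    previous = b"\x00" * 32
    messages = 0
    for user in range(num_users):
        for event in range(events_per_user):
            previous = hashlib.sha256(previous + f"{user}:{event}".encode("utf-8")).digest()
            messages += 1
    return messages


def measure(user_counts: Iterable[int], events_per_user: int = 10) -> List[Dict[str, float]]:
    """One row per X-axis value: CPU seconds, peak Python memory, message count."""
    rows = []
    for n in user_counts:
        tracemalloc.start()
        cpu_start = time.process_time()
        messages = run_scenario(n, events_per_user)
        cpu_seconds = time.process_time() - cpu_start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        rows.append(
            {"users": n, "cpu_seconds": cpu_seconds, "peak_bytes": peak_bytes, "messages": messages}
        )
    return rows


if __name__ == "__main__":
    print("users,cpu_seconds,peak_bytes,messages")
    for row in measure(range(1, 31)):
        print(f"{row['users']},{row['cpu_seconds']:.6f},{row['peak_bytes']},{row['messages']}")
```

Repeating each X-axis value several times and keeping all samples, rather than only an average, makes it possible to plot the spread afterwards.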

Think about more decentralised approach. Can every medical institute run their own portal for login and coordination?

@AngelaBarrio

Activities:

  • implemented kind of Proof of Elapsed Time consensus

  • conducted some experiments for system without consensus

thesis-21-nov.pdf

@synctext
Member Author

synctext commented Nov 21, 2018

Progress meeting minutes:

  • Accountability on access implemented
  • Validation of entries implemented
  • no long-lived identities yet for doctor or patient
  • possible inclusion of blockchain baby architecture picture
  • draw.io or xfig
    1. MediTrail prototype global overview / implementation / architecture and implementation
  • idea behind "Proof of Elapsed Time consensus" focus of this sprint
    • 1 node is elected to extend the blockchain
    • unique blockchain for each medical patient file
    • random rounds to process pending transaction
    • buffer in node contain the pending transaction, then committed to the blockchain
    • unclear to me what the buffering and commitment-delay offers
  • "For the purpose of performance evaluation we added a consensus algorithm to our prototype. This consensus mechanism is not designed to add features or business functionality. It is added purely as an enabler for determining the cost of consensus in a concrete use-case such as medical patient file access auditing."
  • possible outcome: snail-slow

@qstokkink qstokkink added this to the Backlog milestone Nov 27, 2018
@AngelaBarrio

Activities:

  • attached private key to user
  • conducted experiments
  • improved thesis draft
    thesis-9-dec.pdf

@synctext
Member Author

synctext commented Dec 11, 2018

Meeting minutes:

  • 24 pages for first 4 chapters

  • Chapter 2 research statement

    1. Accountability on access: knowing who has accessed the file;
      1. The medical record system must provide the patient with accountability mechanisms.
      1. The medical record system must provide the patient with evidence regarding permissions history for auditing purposes.
      1. The medical record system must provide the patient with evidence of security breaches.
    1. Validation of EMR entries by both the health care provider and the patient.
  • Section 2.4 Requirements:

      1. Accountability on access: Every access to an entry in the EMR system is recorded. The log contains information on the name of the user who accessed the file, the name of the file itself, and the timestamp of the event.
      1. Validation of entries: A user should be able to sign an entry with a secure digital signature. The digital signatures should be verifiable by anyone in the system.
  • Section 5.1 "From requirements to use cases", no link to above research questions

  • "The system that is being designed in this master thesis is a prototype" strange defensive wording

  • POET: "For a permissioned blockchain, it is a secure and efficient algorithm" mention unlimited identity creation and cheating at the lottery. Mention commitment scheme. Magic Intel silicon can't be the magic solution (e.g. tamper-proof).

  • Section "5.2 Blockchains", selecting blockchain technology

  • Section "5.4 Digital signature algorithm", encryption implementation

  • A block contains its own hash? "6. Hash of this block"

  • FIGURE 6.4: Screenshot of the my logs page (btw never referenced in text)

    • Shows 4 message types without documentation
    • Genesis_block, upload_notification, signature_request, file_access_notification
    • 5th: signature_placement
  • Pseudocode is tricky (strict: all variables in italic, defined, and used); for instance, undefined SigningBlock. Magic variable "

  • single linked list, so no backwards traversal without (documented) index

  • Section 7 lacks intro sentence

  • 7.1.1: please integrate these 2 lists

  • 7.4 Speed of MediTrail features; performance analysis

  • Fig 7.2: not scientific to draw a line through 4 data-points! Plus line smoothing is considered bad practice.

  • "Distance to latest signing block" please conduct experiments with larger files; up to 1000.

  • 7.5.3 Discussion of file access performance, please integrate into single experiment section

  • debugging of sleep time problem.

  • without consensus performance should be easily 10k blocks per second

@AngelaBarrio

New thesis draft, processed feedback from last time.
thesis-7-jan.pdf

@synctext
Member Author

synctext commented Jan 8, 2019

Review notes:

  • where is the open source repository?

  • thesis is "file oriented" and not 'any digital information' contained within a digital patient database.

  • Section 5.3.1, "this high resilience may not be fully needed." perhaps replace with something more technical, like, this solution to the double spending problem in an anonymous network is highly inefficient and not appropriate in our medical context.

  • 5.5 Monitoring access to files; please reduce this section or make it more scientific by introducing a "5.5 event-based architecture" where events can be remotely placed signatures, blockchain block creations, or file access. Void of all expensive continuous polling.

  • 5.3.4 Proof of Elapsed Time; "it is a secure and efficient algorithm" is a hard claim without citations or mathematical proof! Please explain this simple 'sleep-and-wake-up' algorithm and compare it to leader election in Byzantine Fault Tolerant systems and Nakamoto consensus.

  • "every file that has been uploaded should be downloadable by every user in the system." This formulation sounds very privacy invasive.

  • "To create a minimum level of aesthetic appeal" :-)

  • 7.4.2 "Unfortunately, performing the same large experiments with the PoET configuration would cost too much time compared to the value of the data."; "no data" entries are unclear. why was this experiment stopped?

  • 7.4.3: "Unfortunately, performing the same large experiments with the PoET"; again not a clean experiment.

  • The 2 magic parameters determine the PoET performance. Make that more explicit and clear, "which makes sense because the parameters for choosing the sleeping time stay set during the experiments."

  • "Using the PoET consensus algorithm with the chosen parameters adds tremendous overhead."; no motivation for magic parameter values.

  • Tip for scientific depth. Key to consensus is the expected outcome for an n participant lottery with drawing numbers between (min, max) interval. The expected duration of a consensus round can be expressed mathematically. Good theoretical depth to final chapter.
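
One way to make this tip concrete, as a sketch: assume each of the n participants draws an independent waiting time uniformly from (min, max) and the round ends when the shortest timer expires. Under those assumptions the expected round duration is min + (max - min) / (n + 1), which the small simulation below checks. The function names and the example interval of 1 to 10 seconds are illustrative, and whether the prototype's rounds actually follow this model is not established here.

```python
import random
from statistics import mean


def expected_round_duration(n: int, low: float, high: float) -> float:
    """Expected minimum of n independent Uniform(low, high) draws."""
    return low + (high - low) / (n + 1)


def simulate_round(n: int, low: float, high: float, rng: random.Random) -> float:
    """One lottery round: every participant draws a sleep time, the shortest wins."""
    return min(rng.uniform(low, high) for _ in range(n))


def simulate_mean(n: int, low: float, high: float, rounds: int = 20000, seed: int = 1) -> float:
    """Average round duration over many simulated rounds (seeded for repeatability)."""
    rng = random.Random(seed)
    return mean(simulate_round(n, low, high, rng) for _ in range(rounds))


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        theory = expected_round_duration(n, 1.0, 10.0)
        measured = simulate_mean(n, 1.0, 10.0)
        print(f"n={n:2d} theory={theory:.3f}s simulated={measured:.3f}s")
```

The formula also shows why the two interval parameters dominate the measured cost: with few participants the round time stays close to the middle of the interval, and only with many participants does it approach the lower bound.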

Future work ideas & brainstorm: run a test network using public datasets, like prescribed medication for all doctors in the UK for a whole year. Emulate all hospitals, doctors, and medication prescription events.

@synctext
Member Author

On Security Analysis of Proof-of-Elapsed-Time (PoET). Solid related work for a security analysis. It introduces a likelihood of being a cheater.
The basic idea is to use a z-test to check whether a node is generating blocks too fast (winning too frequently in the competition with other nodes for block creation).
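
A minimal sketch of that idea as a generic one-proportion z-test, assuming a fair lottery in which each of `num_nodes` nodes wins a block with probability 1/num_nodes. This is not necessarily the exact statistic used in the paper; the function names and the 1% one-sided threshold are assumptions for illustration.

```python
import math


def win_rate_z_score(wins: int, total_blocks: int, num_nodes: int) -> float:
    """z-score of a node's win count against a fair lottery (normal approximation)."""
    if num_nodes < 2 or total_blocks < 1:
        raise ValueError("need at least 2 nodes and 1 block")
    p = 1.0 / num_nodes                       # fair win probability per block
    expected = total_blocks * p               # expected number of wins
    std = math.sqrt(total_blocks * p * (1 - p))
    return (wins - expected) / std


def looks_too_fast(wins: int, total_blocks: int, num_nodes: int, z_threshold: float = 2.326) -> bool:
    """Flag a node whose win count is improbably high (one-sided, about 1% level)."""
    return win_rate_z_score(wins, total_blocks, num_nodes) > z_threshold


if __name__ == "__main__":
    # 10 nodes, 100 blocks: a fair node expects about 10 wins.
    print(looks_too_fast(12, 100, 10))  # slightly lucky
    print(looks_too_fast(30, 100, 10))  # far above the fair share
```

The normal approximation only holds for a reasonably large number of blocks, so a check like this would be applied over a sliding window of recent blocks rather than to the first few.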

@synctext
Member Author

synctext commented Jan 28, 2019

No more comments on thesis text and chapter wording. Except usage of Sybil term for cheating in a lottery (7.2.1 Sybil attacks). Advised to make "7.2 Resistance against attacks" a single text paragraph without sub-sub-headings.

Final thesis addition suggestion: link outcome of experimental results with the appropriate theory.

Great addition to the theoretical depth of the thesis (1+ page):

  • PoET consensus results: between 5315 ms and 5864 ms
  • can be explained with the theory and z-test
  • model parameter: between 1 and 10 seconds for each blockchain round.
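
A quick sanity check on the numbers above, as a sketch: assume the duration $T$ of a round is a single draw from a uniform distribution on $[a, b] = [1\,\text{s}, 10\,\text{s}]$ (an assumption; only the 1 to 10 second range is given here), and let $N$ be the number of measured rounds (not stated here) with sample mean $\bar{T}$.

```latex
\mathbb{E}[T] = \frac{a+b}{2} = \frac{1 + 10}{2}\,\text{s} = 5.5\,\text{s} = 5500\,\text{ms},
\qquad
\sigma_T = \frac{b-a}{\sqrt{12}} = \frac{9}{\sqrt{12}}\,\text{s} \approx 2.6\,\text{s},
\qquad
z = \frac{\bar{T} - \mathbb{E}[T]}{\sigma_T / \sqrt{N}}
```

Under that assumption the expected 5500 ms lies inside the measured range of 5315 ms to 5864 ms, and the z-statistic indicates whether a measured mean over $N$ rounds deviates by more than sampling noise would explain.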

@erikvandenakker

Review notes:

General impression: Extensively introduced topic (good!; clear!); relatively short description of the actual work and its evaluation (but I'm not familiar with the standards in your field, so I'm not the best to judge).

  • Small remark: "[TODO: what is a replica??]" I encountered this in your thesis ...
  • Suggestion: "Many patients feel that they do not control access to their data, but would like to be able to access the data themselves, look at the history of data access and give or deny access permissions to healthcare providers (World Economic Forum 2012)."
    While you have developed and tested the required software to make this feasible, I have a feeling it is not tested for the described scenario; to make it more concrete:
    If I were a patient, I would like to get a comprehensive overview of all my data with annotations who touched what when and when was this agreed upon. How feasible is this given a typical patient (elderly individual first admitted 15 years ago) with a typical amount of medical data (say at least 20 items that are independently tracked)? How long would it take to produce [1] the complete information as presented in 6.4 [2] Do you have suggestions to graphically represent this information?
  • 7.4.1 Results have been presented as averages. Why not use boxplots to indicate the variance of the elapsed time? Or is this variance meaningless?
  • 7.4.2 "Unfortunately, performing the same large experiments with the PoET configuration would cost too much time compared to the value of the data." Can you explain this?
  • "Unfortunately, users cannot verify each other’s signatures." What would you propose to incorporate this?
  • How to deal with re-use of patient data for scientific research or quality aspects? Do these types of access get the same 'stamp', and is this desirable? And if not, how to deal with this?

I would be happy to discuss more practical aspects of implementing MediTrail in a medical environment such as an academic hospital. Looking forward to hearing more!

Cheers,

Erik

@AngelaBarrio

AngelaBarrio commented Feb 10, 2019

  • I fixed the bug!!! performance is much better now

  • Had to re-do almost all experiments due to fixing the bug, but it enabled me to include the experiments that would take too much time previously

  • Represented Upload and Read event experiments with box plots

  • Implemented some text feedback
    THESIS-10feb.pdf

  • Need to fix the box plots (somehow add right numbers on horizontal axis and converting to ms)

  • Code: https://github.com/AngelaBarrio/meditrail

@synctext
Member Author

  • nice addition to depth of thesis
  • FIGURE 7.1: Time needed for adding upload blocks (N=100), no consensus (battle with .xls)
  • explain standard deviations and size of boxplot.
  • mixing seconds and milliseconds
  • expand section 7.5

@AngelaBarrio

Presentation draft:
presentatie.pptx

@synctext
Member Author

synctext commented Oct 4, 2022

Thesis FINISHED 👏

@synctext synctext closed this as completed Oct 4, 2022

6 participants