Getting ready to work with the European Blockchain Services Infrastructure (EBSI)
==============

This text is focused on facilitating a dialogue between IT staff / developers and decision makers in organizations. To help the former group, the text provides a step-by-step guide on how to use Aries, Ursa, and Indy tools, libraries, and reusable components in order to learn about the core focus of the European Blockchain Services Infrastructure (EBSI): enabling user controlled authentic data (UCAD) using a Verifiable Data Registry (VDR) that is jointly maintained and supported by the member states. To help the latter group, the text provides an overview of core concepts and helps explain how the various parts of the solution contribute to enabling UCAD. The text also includes tips and suggestions from the past two years of experience working with EBSI, the Toolbox project, and the Proof of Business project at Bolagsverket.

# About this learning material

Our increasingly digital world is fueled by data. In this emerging data driven world, trust is a key challenge. Trust means 1) ensuring data security and data privacy, 2) having data that is easy to verify, 3) making it difficult to forge and tamper with data and data records, and 4) ensuring that the data subject is in sole and total control over their own data. To distinguish between data exchanges that satisfy these four points and those that do not, this text will refer to the former using the term 'user controlled authentic data' (UCAD). New technologies like those introduced by the European Blockchain Services Infrastructure (EBSI), i.e., distributed ledgers (DLT), and work on Verifiable Credential (VC) formats and proof mechanisms, are enabling UCAD.

To understand this text on VC and the EBSI, it is helpful to clarify a few points:

1. This text is focused on facilitating a dialogue between IT staff (specialists and developers) and decision makers (who may or may not have a technical background). Consequently, the text will focus on both technical topics and the implications of certain technical choices for decision makers.
2. The assumed context is the public sector. While most parts in the text are context agnostic, the focus on EBSI merits a specific focus on the public sector context.
3. The text was developed following experiences working with VC and EBSI in the Swedish public sector, more specifically with The Swedish Company Registration Offices.
4. The aim is to let developers get hands on experience with key concepts so that they can get familiar with VC and supporting infrastructure.

The motivation for the focus on facilitating a dialogue between IT staff and decision makers is that educational material on EBSI and on VCs already exists, but it targets either a general audience (cf. the three chapter EBSI Explained series found [here](https://ec.europa.eu/digital-building-blocks/wikis/display/EBSI/What+is+ebsi)) or a technical audience (cf. the [W3C specifications for Verifiable Credentials data model](https://www.w3.org/TR/vc-data-model/), the Linux Foundation [LFS173x course](https://training.linuxfoundation.org/training/becoming-a-hyperledger-aries-developer-lfs173/) on becoming an Aries developer, and/or the [EBSI demonstrator](https://ec.europa.eu/digital-building-blocks/wikis/display/EBSIDOC/Demonstrator)). However, the VC concept and the infrastructure it relies on (e.g., EBSI) represent (at least in parts) fundamental shifts in how data is managed. To fully understand these shifts and to realize many of the potential benefits of VC and EBSI, it is important both that technical staff understand the technical requirements and that non-technical decision makers are equipped with the knowledge they need to assess the merits of proposed uses of the technology (with a focus on helping decision makers avoid improper uses).

The reason for the public sector focus has less to do with VC and more to do with EBSI. Specifically, EBSI is a joint initiative from the European Commission and the European Blockchain Partnership with the aim to leverage blockchain to accelerate the creation of cross-border services for public administrations and their ecosystems to verify information and to make services more trustworthy (cf. the [EBP page](https://digital-strategy.ec.europa.eu/en/policies/blockchain-partnership)). It is the focus of EBSI on public administration and public services that motivates the focus on the public sector context in this text. Arguably, this context focus does not impact the VC parts of this text. Perhaps the biggest impact is on the adversarial assumptions and how these impact the underlying infrastructure choices of EBSI, which, in turn, impact how trust is established in VC based data exchanges.

The focus on the experiences of the Swedish Company Registration Offices follows from where the interest was found. This text was developed as the end result of a CEF project on EBSI training. Early efforts in the project aimed to map the interest for EBSI and VCs in the Swedish public sector. This mapping exercise revealed that the interest in the Swedish public sector was on a rather general level. The notable exception was the Swedish Company Registration Offices (hereafter referred to by their Swedish name Bolagsverket). Bolagsverket had a strategy to decentralize information flows related to organizations and a vision to let the organizations be in control of their respective flows. Bolagsverket also wanted to explore EBSI and to learn how to leverage the infrastructure's capabilities to deliver public services. Finally, Bolagsverket had other blockchain initiatives, and a project focused on VC and EBSI would reach a broad audience in the Swedish public sector since many actors closely follow the work Bolagsverket does.

Finally, the aim to improve knowledge through hands on experience means that large portions of this text will be developed around mature projects that developers can use as a foundation for testing and trialing UCAD. This means that most of the exercises and the work done here will leverage the following three Hyperledger open source projects: [Aries](https://www.hyperledger.org/use/aries), [Ursa](https://www.hyperledger.org/use/ursa), and [Indy](https://www.hyperledger.org/use/hyperledger-indy). Together, these projects provide libraries, tools, and reusable components for creating decentralized applications for UCAD. Most major enterprise focused UCAD projects today leverage the three Hyperledger projects (e.g., the [Hyperledger Labs project Business Partner Agent](https://github.com/hyperledger-labs/business-partner-agent) and the [OrgBook BC project](https://www.orgbook.gov.bc.ca/search) that has issued over 4 million VCs to date) and there is extensive documentation and developer support and training. In particular, Aries helps developers work with different VC formats and the protocols required to establish secure connections and to exchange UCAD. In turn, Ursa is a cryptographic library that supports certain cryptographic primitives and algorithms that are specifically designed with user privacy in mind. Finally, Indy provides the required infrastructure to anchor trust in a decentralized and verifiable data registry. This fourth point is particularly relevant to discuss as it represents a compromise between developer friendly material and relevance for decision makers.

## Learning with the Hyperledger projects: Benefits and words of caution

The needs of IT staff and developers trying to learn about VC and DLT differ from those of decision makers. The examples in this text are mainly based on the three aforementioned Hyperledger projects Aries, Ursa, and Indy. The reason for this choice is twofold. First, the three Hyperledger projects are far more mature than EBSI and its VC ecosystem. Together, the projects have over 44 000 commits from 650 code contributors (for up to date information use https://insights.lfx.linuxfoundation.org/projects/health). Furthermore, the three Hyperledger projects have been used as a foundation for several large scale pilots and many organizations have relied on them for their trial and proof of concept needs. Second, the choice of DLT specifically, or verifiable data registry (VDR) in general, is not necessarily an important consideration for developers or policy makers attempting to understand UCAD. What matters is that there exists a VDR that can act as a trust anchor. This text is focused on helping IT staff and developers communicate with decision makers, and the choice of VDR is a rather trivial matter for IT staff and developers (often as simple as changing a few lines in a configuration file or picking a certain API and using the right access tokens, neither of which is important for understanding VC and EBSI).

There are, however, certain contextual factors that make a focus on the three Hyperledger projects questionable. Perhaps most important is that [Verifiable Credentials come in many flavors](https://www.lfph.io/wp-content/uploads/2021/04/Verifiable-Credentials-Flavors-Explained-Infographic.pdf). The ones initially developed for Aries were developed with a strong emphasis on privacy, and this strong privacy focus was realized using relatively complex cryptography. For the private sector, the adversarial assumptions may warrant such a high focus on privacy for UCAD exchanges. But for the public sector, using complex cryptography has several limitations, and it is questionable whether the costs are worthwhile, considering that the public sector is a very different adversarial environment from the private sector. It is questionable to assume a malicious public actor since these actors are often the authorized sources of UCAD and the principal providers of services and infrastructure.

For the public sector, it makes a lot more sense to optimize for ease of implementation and to use cryptography that public sector actors are very familiar with. This is especially so considering how both EBSI and the ongoing work in the common union Toolbox for a coordinated effort toward a European digital identity framework (cf. [C/2021/3968](https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A32021H0946)) are likely to focus on UCAD enabled using less complex cryptography (e.g., [EBSI only requires ES256](https://ec.europa.eu/digital-building-blocks/wikis/display/EBSIDOC/E-signing+and+e-sealing+Verifiable+Credentials+and+Verifiable+Presentations), which is [basic ECDSA with P-256 and SHA-256](https://ldapwiki.com/wiki/ES256)). Consequently, the text herein will explain in more detail how to enable UCAD using ES256 and focus more on the technologies that both the Toolbox group and the EBSI group are emphasizing for use in the public sector.
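
To make the ES256 reference above more concrete, the following minimal Python sketch assembles the part of a JWT that an ES256 signer would sign. It uses only the standard library and deliberately stops before the signing step, which needs an ECDSA P-256 implementation from a cryptographic library. The header and payload values, including the `did:example` identifier, are placeholders chosen for illustration and are not taken from EBSI.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWS/JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jws_signing_input(header: dict, payload: dict) -> str:
    """Return the ASCII string that an ES256 signer would sign."""
    encoded_header = b64url(json.dumps(header, separators=(",", ":")).encode("utf-8"))
    encoded_payload = b64url(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    return f"{encoded_header}.{encoded_payload}"


# Hypothetical header and payload; the DID and claim values are placeholders.
header = {"alg": "ES256", "typ": "JWT"}
payload = {"iss": "did:example:issuer", "vc": {"type": ["VerifiableCredential"]}}

signing_input = jws_signing_input(header, payload)
print(signing_input)
```

In a complete JWT, the signature is appended as a third base64url segment. For ES256, that segment encodes the 64-byte concatenation of the two ECDSA values r and s.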

Another important consideration is that EBSI is a DLT agnostic platform. At the writing of this document, EBSI supports two DLTs: [Hyperledger Besu](https://www.hyperledger.org/use/besu) and [Hyperledger Fabric](https://www.hyperledger.org/use/fabric). Besu is an Ethereum client specifically designed for enterprise and consortium environments. Since this text is educational, ease of use is prioritized. Using a general purpose DLT like Ethereum is less attractive than using a DLT purpose built for VC support and UCAD. 

Hyperledger Indy is a DLT designed specifically for UCAD. So, while EBSI does not support an Indy based ledger, it is easier to use an Indy based DLT as a VDR for learning purposes. The alternative is to: 1) develop the necessary functionality using a development environment for Ethereum, or 2) wait for EBSI to become production ready with full feature support. And while the work is rapidly progressing, and the list of [EBSI conformant wallets](https://ec.europa.eu/digital-building-blocks/wikis/display/EBSI/Conformant+wallets) keeps growing, there are still many features missing on EBSI. Also, most EBSI conformant wallets are still focused on desktop and/or mobile and not server environments, the latter being more suitable for our organization focused context. Finally, the test suites for EBSI, the wallets, and the conformance tests are far less suitable for development and learning than their Aries equivalents.

Aries thus offers both a solution for Enterprise environments called [Hyperledger Aries Cloud Agent Python](https://github.com/hyperledger/aries-cloudagent-python), or ACA-Py for short, and simpler [interoperability testing](https://aries-interop.info/acapy.html). It is also far more mature and most enterprise tests have been developed using ACA-Py. A few words of caution are in order here:

* Most Aries based implementations do not support any of the formats and proof mechanisms that EBSI currently supports. Aries initially focused on supporting the AnonCreds VC format and relied on Indy as a VDR. In Q4 2020, focus shifted to also support other ledgers and VC formats, most notably the VC format W3C JSON-LD ZKP with BBS+ signatures. By Q1 2022, Aries offered full support for AnonCreds and W3C VC in JSON-LD ZKP using BBS+.
* Multi-message signature schemes, such as CL (used in AnonCreds) and BBS+, are very different from the signature schemes that the public sector is used to working with. Neither EBSI nor the Toolbox group currently supports them.
* As of Q1 2022, efforts in Hyperledger Aries are shifting away from the W3C VC data model. The reason for this is practical (cf. the Q1 report [here](https://wiki.hyperledger.org/display/TSC/TSC+Project+Updates)). Many who want to deploy solutions quickly and test out VC and DLT appreciate the full stack solution that is AnonCreds. To this end, work is progressing on developing an [open specification for the AnonCreds VC format](https://anoncreds-wg.github.io/anoncreds-spec/).
* The Q2 2022 update reiterates that support for W3C standard VCs is still in progress and that developers wanting quick development should focus on AnonCreds.

Succinctly put: using Aries, Ursa, and Indy is very helpful if the main goal is learning about VC and how decentralized infrastructures like EBSI can support UCAD. Experiences from both Bolagsverket and from other enterprise focused UCAD projects have shown that AnonCreds is highly suitable for deploying solutions quickly. There are still many development efforts ongoing with EBSI, and organizations would do well to keep up to date with recent developments. The assumption that underlies the text herein is that an organization can use Aries, Ursa, and Indy to get familiar with the core concepts, and thus be far better prepared to onboard EBSI once EBSI becomes mature.

## Disposition

This text is structured as follows. First, the VC model is introduced both in general and with a specific focus on what the EBSI VC lifecycle looks like. The text aims to explain not only the model, but also the problem the model was introduced to solve. Common concepts and key terms will be described, followed by an introduction of the Trust over IP (ToIP) stack. By the end of this first VC focused section, decision makers should be able to understand what VC and DLT enable and be ready to have a discussion with their IT colleagues on how to realize UCAD and how to leverage EBSI.

Then, the text takes a technical deep dive into many core concepts required for developers to navigate VC and DLT in general, and EBSI VC specifically. In this section, the text describes each layer of the ToIP stack in more depth. The text also details how Aries and Aries agents work and covers the major solution components of a UCAD ecosystem. Special attention will be given to helping developers set up a test network for rapid development and quick deployment. Again, the focus is on enabling learning. This section also includes a description of EBSI conformant wallets and of which EBSI services are usable today, and explains major differences between Aries and the likely direction that EBSI is heading (e.g., with respect to revocation).

Finally, we focus on the controller component for UCAD. The controller is the organization specific codified business logic that provides an interface between the organization's existing systems and a UCAD ecosystem. This chapter will use ACA-Py and the Bolagsverket case as an example.

Throughout the text, there will be a strong emphasis on labs and active development and using code as a pedagogical tool for learning about VC and DLT.

# Verifiable Credentials

## The Verifiable Credentials data model

The concept relies heavily on [the Verifiable Credentials data model](https://www.w3.org/TR/vc-data-model/). 

<img src="https://courses.edx.org/assets/courseware/v1/e6878fa7000fb538a7e8b9dec066d0b1/asset-v1:LinuxFoundationX+LFS173x+3T2021+type@asset+block/LFS172x_CourseGraphics_V1-04.png" alt="VC data model" width="500"/>

**Fig 1**. *The Verifiable Credentials Data Model.*
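
As a rough illustration of the data model in Fig 1, the sketch below builds a credential as a plain Python dictionary and checks for the top-level properties that version 1.1 of the W3C data model requires. The identifiers, the date, and the subject attributes are placeholders, and the helper `missing_properties` is a hypothetical name introduced here. The check only looks at the presence of properties; it is not a full validator.

```python
import json

# Base context required by the W3C VC data model v1.1.
VC_CONTEXT = "https://www.w3.org/2018/credentials/v1"
REQUIRED = ("@context", "type", "issuer", "issuanceDate", "credentialSubject")


def missing_properties(credential: dict) -> list:
    """Return the required top-level properties that are absent."""
    return [name for name in REQUIRED if name not in credential]


# Hypothetical credential; all identifiers and values are placeholders.
credential = {
    "@context": [VC_CONTEXT],
    "type": ["VerifiableCredential"],
    "issuer": "did:example:registry",
    "issuanceDate": "2022-01-01T00:00:00Z",
    "credentialSubject": {"id": "did:example:company", "name": "Example AB"},
}

print(json.dumps(credential, indent=2))
print(missing_properties(credential))
```

A credential only becomes verifiable once the issuer adds a proof, for example an embedded `proof` property or an enclosing signed JWT. That step is left out of this sketch.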

## The Trust over IP stack

The Trust over IP stack has two tracks: Technology and Governance. The technology stack contains all the technical components that make it possible to exchange verifiable data between two actors. The governance track contains the rules and policies that govern each layer of the technical solution.

<img src = 'fig/toip.png' width = 700>

Below, we look specifically at the technology stack (layers 1-3) and run everything using the default governance settings and in "God mode".

This text aims to document the design process with the Authentic Company Data project at Bolagsverket. Digital identity is rapidly evolving. Open source projects like Hyperledger Aries and Ursa provide a set of protocols that, when paired with DLTs such as Hyperledger Indy or Besu, can be used to build distributed applications based on authentic and secure data.






A core element of this text is the [Trust over IP concept](https://trustoverip.org/toip-model/). The Linux Foundation added ToIP to its projects in 2020. The mission of the ToIP foundation is to simplify and standardize how trust is established online, i.e., to provide mechanisms for how interacting entities can trust the credential exchanges they are engaged in. At its core, ToIP introduces three actors: an issuer of credentials, a holder of credentials, and a verifier of credentials.

The concept introduces a technology and governance stack that affords a very high degree of privacy and user control to the credential holder. The ToIP concept can be viewed as an alternative to existing centralized and federated identity models where the user has very limited control over their online identity and data. However, the ToIP concept does not assume any trust model for identity related data and can be configured to support many kinds of trust models.
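
The interplay between the three actors can be sketched in a few lines of Python. The example below is a toy model built on loud assumptions: a dictionary stands in for the verifiable data registry, the `did:example` identifiers are placeholders, and a Lamport one-time signature built from SHA-256 stands in for the signature schemes used in practice (such as ES256, CL, or BBS+), simply because it can be written with the standard library alone. It is meant to show who does what, not how Aries or EBSI implement it.

```python
import hashlib
import json
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def digest_bits(message: bytes) -> list:
    """Return the 256 bits of SHA-256(message), most significant first."""
    digest = int.from_bytes(sha256(message), "big")
    return [(digest >> (255 - i)) & 1 for i in range(256)]


def generate_keypair():
    """Lamport one-time key pair: 256 pairs of secrets and their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(sha256(a), sha256(b)) for a, b in private]
    return private, public


def sign(private, message: bytes) -> list:
    """Reveal one secret per message bit (a key must sign only once)."""
    return [private[i][bit] for i, bit in enumerate(digest_bits(message))]


def verify(public, message: bytes, signature: list) -> bool:
    bits = digest_bits(message)
    return len(signature) == 256 and all(
        sha256(signature[i]) == public[i][bit] for i, bit in enumerate(bits)
    )


def canonical(claims: dict) -> bytes:
    """Serialize claims deterministically so signer and verifier agree."""
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


# A dict stands in for the verifiable data registry (VDR).
registry = {}

# Issuer: publishes its public key, then signs claims for the holder.
issuer_id = "did:example:issuer"  # hypothetical identifier
issuer_private, registry[issuer_id] = generate_keypair()
claims = {"subject": "did:example:holder", "registered": True}
credential = {
    "issuer": issuer_id,
    "claims": claims,
    "signature": sign(issuer_private, canonical(claims)),
}

# Holder: stores the credential and later hands it to a verifier.
presentation = credential


def check(presentation: dict) -> bool:
    """Verifier: look up the issuer key in the registry, never call the issuer."""
    public = registry[presentation["issuer"]]
    return verify(public, canonical(presentation["claims"]), presentation["signature"])


print(check(presentation))
```

The point to notice is that the verifier only reads from the registry. The issuer is never contacted during verification, which is what gives the holder control over the data flow.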

## Objectives

The specific objective of this text is to build a good foundation for understanding ToIP compliant digital identities. The text will use a series of labs to demonstrate how Verifiable Credentials (or equivalent authenticated data) work and to motivate design choices along the way. The main focus of the text is Layers 1-3 of the ToIP technology stack.

<img src="https://miro.medium.com/max/1400/1*DgPBpnT_RxDEnd_ptdynkQ.png" alt="drawing" width="800"/>

**Fig 2.** *The dual stack ToIP ([source](https://www.dizme.io/)).*

Also, the default settings for the governance stacks are often used since the focus is on the technology stack. More specifically:

* Layer 1: establishing cryptographic roots of trust labs:
  1. How to establish a secure and private connection between two actors.
  2. How to verify the identity of the party you are connecting to.
  3. Explain the governance choices necessary to enable technical trust on Layer 1.
* Layer 2: the agent labs:
  1. How to implement digital wallets and agents
* Layer 3: data exchange labs
  1. Offering, requesting, and creating verifiable credentials
  2. Verifying verifiable credentials
  3. Enabling advanced features like selective disclosure and predicate proofs

Note that the lab numbering will not correspond to the layer numbering.

## Terminology and key concepts

See this [link](https://docs.google.com/document/d/1gfIz5TT0cNp2kxGMLFXr19x1uoZsruUe_0glHst2fZ8/edit) for an exhaustive list of terms and definitions. Key concepts used in the labs are as follows:

* **Self-sovereign identity (SSI)**. An idea that digital identity should be privacy preserving and identity subject controlled by design. Identity related data flows should be solely in the control of the identity subject, i.e., there is no communication between the issuer and the verifier.
* **Trust over IP**. A Linux Foundation organization that has developed the dual stack model of how to establish trust online.
* **Decentralized identifiers (DID)**. A DID is a universally unique identifier that can be cryptographically verified in such a way that does not rely on a central authority. See the [W3C proposed standards page](https://www.w3.org/TR/did-core/) for more information.
* **Zero Knowledge Proof (ZKP)**. A proof system that proves only whether a fact or a statement is true or not without revealing any additional details about the fact or statement.
*  **Selective disclosure**. A capability of some verifiable credential formats that allows the holder to select which attributes from an issued verifiable credential to share with a verifier.
*  **Revocation**. The capability of an issuer to publish information used to verify the status of an issued credential. Revocation is either done using traditional techniques, or in a ZKP fashion.
*  **Verifiable Credential formats**. The specific way a verifiable credential is formatted. For an in-depth reading see [Young. Verifiable Credentials Flavors Explained](https://www.lfph.io/wp-content/uploads/2021/02/Verifiable-Credentials-Flavors-Explained.pdf) and for an overview see this [infographic](https://www.lfph.io/wp-content/uploads/2021/04/Verifiable-Credentials-Flavors-Explained-Infographic.pdf). The choice of credential format and signature scheme is one major design choice. For selective disclosure, you need a signature scheme capable of multi-message signing, e.g., CL or BBS+. Of the two, CL signatures are more widely deployed, but are not compliant with the W3C VC data model. In contrast, BBS+ is W3C VC compliant, but does not support ZKP revocation or predicate proofs. Each lab details the VC format it uses.
    - *AnonCreds*. An early privacy focused VC format centered on attributes. Not compatible with the W3C VC format but is easier to use and has more extensive privacy features. See details [here](https://github.com/PeterAltmann/SSIdemo/blob/main/VC_formats.md).
* **Secure storage**. A way to securely store secrets. This involves the management of cryptographic secrets handled within a key management service, and the storage of other sensitive data.
* **Agent**. An agent is software that interacts with other entities in order to facilitate the handling of VCs.
* **Blockchains and Distributed Ledger Technology (DLT)**. An identity ecosystem must rely on some sort of verifiable data registry (a VDR) to maintain certain data that must be public. The exact nature of the VDR is another major design choice.
* **Framework**. The framework is one of the two logical components that enable an agent to interact. The framework is what knows how to establish a connection, send a message, create a credential, etc. Many frameworks exist; the most popular are arguably those that build on the Hyperledger Aries protocols. Unless otherwise specified, the labs herein build on the Aries framework called ACA-Py, a Python based framework focused on enterprise agents.
* **Controller**. The controller is the second logical component of an agent. The framework does not know when to establish a connection or when to issue a credential. The controller encodes the organization's business rules and is responsible for telling the framework what to do and when to do it.
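
To make the DID entry in the list above more tangible, the following sketch splits a DID into its method and method-specific identifier, following the generic syntax `did:<method>:<method-specific-id>` from the W3C DID Core specification. The function name `parse_did` and the example identifiers are assumptions made for illustration. The sketch only checks the syntax; it does not resolve the DID or verify anything cryptographically.

```python
import re

# One identifier character per DID Core: letters, digits, ".", "-", "_",
# or a percent-encoded byte.
IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"

# Method names are lowercase letters and digits; the method-specific id is a
# colon-separated sequence whose last segment must be non-empty.
DID_PATTERN = re.compile(rf"did:([a-z0-9]+):((?:{IDCHAR}*:)*{IDCHAR}+)")


def parse_did(did: str):
    """Return (method, method_specific_id), or None if the syntax is invalid."""
    match = DID_PATTERN.fullmatch(did)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_did("did:example:123456789abcdefghi"))
print(parse_did("not-a-did"))
```

The method name tells a resolver which verifiable data registry and which rules to use when looking up the keys and service endpoints associated with the identifier.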

There are many protocols, technologies, implementations, etc., mentioned in the text below. To facilitate reading, please see the terminology and concepts map in [this link](https://user-images.githubusercontent.com/30799110/150362394-5d0319ae-7bad-4674-863c-d6dd35346ea9.png).

# General setup guide for the labs

The prerequisites for the labs are a computer (with access to Ubuntu 18.04) and a smart phone. All the labs can be run locally on your machine, or in the browser using a service called "Play with Docker", which gives you access to a terminal command line without having to install anything locally. If you want to run the labs locally, you will need a terminal CLI running a bash shell, [docker](https://docs.docker.com/get-docker/) and [docker-compose](https://docs.docker.com/compose/install/), and [git](https://www.linode.com/docs/guides/how-to-install-git-on-linux-mac-and-windows/). To run the labs in your browser, go to the docker playground http://play-with-von.vonx.io/ (which already has all the prerequisites installed). Optionally, if you are not comfortable with a CLI, there is this guide that is focused on [openAPI](https://medium.com/@khalifa.toumi/how-to-use-hyperledger-aries-cloud-agent-for-a-classical-workflow-issuer-holder-verifier-9dd595f2f847).
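
For a local setup, a quick way to confirm that the prerequisites listed above are installed is to look them up on the `PATH`. The following is a small convenience sketch, not part of the labs themselves; the function name `find_missing` is made up for this example, and the tool names are the executable names assumed for a typical Linux installation.

```python
import shutil

# Executable names assumed for the prerequisites listed above.
REQUIRED_TOOLS = ("bash", "docker", "docker-compose", "git")


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which) -> list:
    """Return the tools that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing prerequisites: " + ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Note that newer Docker installations provide Compose as the `docker compose` subcommand rather than as a separate `docker-compose` executable, in which case this check reports `docker-compose` as missing even though Compose is available.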

## Lab 1 

In Lab 1, we use the demos provided to get a basic feel for how agents work and interact. Follow [this link](https://github.com/PeterAltmann/SSIdemo/blob/main/LAB1.md) for the lab.

## Lab 2

In Lab 2, we move beyond the demos and take a closer look at how we can use ACA-Py to provision agents and do basic interactions. Follow [this link](https://github.com/PeterAltmann/SSIdemo/blob/main/LAB2.md) to get to the lab.

## Lab 3

In Lab 3, we continue taking a closer look at ACA-Py and see how we can use it to start our own agents and issue a VC within the context of Authentic Company Data. See this [link](https://github.com/PeterAltmann/SSIdemo/blob/main/LAB3.md).

## Lab 4

In [Lab 4](https://github.com/PeterAltmann/SSIdemo/blob/main/LAB4.md), we take a closer look at DIDComm and the ways agents can connect.

## Lab 5

In Lab 5, we take a closer look at a simple presentation flow.

## Lab 6

In [Lab 6](https://github.com/PeterAltmann/SSIdemo/blob/main/LAB6.md), we take a closer look at revocation.

## Lab 7

In Lab 7, we take a closer look at selective disclosure.

## Lab 8

In Lab 8, we look at the OpenAPI demo that exists in the [ACA-Py github demo folder](https://github.com/hyperledger/aries-cloudagent-python/blob/main/demo/AriesOpenAPIDemo.md). 

## Lab 9

In Lab 9, we showcase webhooks and how the ACA-Py agent framework interacts with the controller.
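
As a preview of the idea, the sketch below shows the decision-making half of a controller. It assumes that the framework delivers events as HTTP POST requests to a path ending in `/topic/<topic>/` with a JSON body, which is how ACA-Py webhooks are commonly described. The state names follow our understanding of the Aries connection and issue-credential v1.0 protocols and should be treated as assumptions, and the returned action names are purely hypothetical business rules. No HTTP server is started; only the routing and the decision logic are shown.

```python
import json


def topic_from_path(path: str):
    """Extract the topic from a path shaped like '/webhooks/topic/<topic>/'."""
    parts = [p for p in path.split("/") if p]
    if "topic" in parts and parts.index("topic") + 1 < len(parts):
        return parts[parts.index("topic") + 1]
    return None


def decide(topic: str, payload: dict) -> str:
    """Hypothetical business rule: map an incoming event to a next step."""
    if topic == "connections" and payload.get("state") == "active":
        # A connection is ready, so the organization may offer a credential.
        return "offer_credential"
    if topic == "issue_credential" and payload.get("state") == "credential_acked":
        # The holder confirmed receipt, so the issuance can be recorded.
        return "log_issuance"
    return "ignore"


# Simulated webhook delivery; the connection id is a placeholder.
body = json.dumps({"connection_id": "hypothetical-id", "state": "active"})
topic = topic_from_path("/webhooks/topic/connections/")
print(decide(topic, json.loads(body)))
```

In a real controller, the returned action would be translated into a call to the framework's admin API. The separation is the point of the sketch: the framework reports what happened, and the controller decides what should happen next.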

## Lab 10

In Lab 10, we look at JSON-LD VC formats signed with BBS+ (see the [source](https://github.com/hyperledger/aries-cloudagent-python/blob/main/demo/AliceWantsAJsonCredential.md)).



# A technical deep dive

## Revocation

# An illustrative example using ACA-Py and Proof of Business