
PSI integration #25

Open
bcebere opened this issue Jun 7, 2020 · 2 comments
Assignees
Labels
Type: Epic 🤙 Describes a large amount of functionality that will likely be broken down into smaller issues

Comments


bcebere commented Jun 7, 2020

Description

The current PSI implementation supports two operations:

  • Get the intersection elements.
  • Get the intersection size.

In order to finish the PSI integration, we need:

On completion, it should be straightforward to integrate PIR in https://github.com/OpenMined/PSI

The current PSI flow:

1. Setup phase

The server encrypts all its elements x under a commutative encryption scheme, computing H(x)^s where s is its secret key. The encrypted elements are then inserted into a Bloom filter, which is sent to the client encoded as JSON. The message has the following form:

{
  "num_hash_functions": <int>,
  "bits": <string>
}

Here, bits is a Base64-encoded string.
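The setup phase can be sketched as follows. This is a toy illustration, not the library's implementation: modular exponentiation over a prime stands in for the commutative encryption scheme (a real implementation would use an elliptic-curve group), and the Bloom filter, element names, and parameters are all made up for the example:

```python
import base64, hashlib, json, secrets

P = 2**127 - 1  # toy prime modulus: exponentiation mod P commutes

def h(x: str) -> int:
    """Hash an element into the multiplicative group mod P (stand-in for H)."""
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % P

class BloomFilter:
    """Minimal Bloom filter with k hash functions over an m-bit array."""
    def __init__(self, m_bits: int = 1024, k: int = 4):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _indices(self, item: bytes):
        for i in range(self.k):
            d = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(d, "big") % self.m

    def add(self, item: bytes):
        for j in self._indices(item):
            self.bits[j // 8] |= 1 << (j % 8)

    def __contains__(self, item: bytes) -> bool:
        return all(self.bits[j // 8] & (1 << (j % 8)) for j in self._indices(item))

s = secrets.randbelow(P - 2) + 2  # server's secret key
bf = BloomFilter()
for x in ["alice@example.com", "bob@example.com"]:  # server's elements
    bf.add(pow(h(x), s, P).to_bytes(16, "big"))     # insert H(x)^s

setup_message = json.dumps({
    "num_hash_functions": bf.k,
    "bits": base64.b64encode(bytes(bf.bits)).decode(),
})
```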

2. Client request

The client encrypts all their elements x using the commutative encryption scheme, computing H(x)^c, where c is the client's secret key. The encrypted elements are sent to the server as a JSON array of Base64 strings, together with a boolean reveal_intersection that indicates whether the client wants to learn the elements in the intersection or only its size.

{
  "reveal_intersection": <bool>,
  "encrypted_elements": [ Base64(H(x_1)^c), Base64(H(x_2)^c), ... ]
}
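A client-side sketch under the same toy assumptions (modular exponentiation stands in for the commutative scheme; the key c is drawn invertible mod P-1 so it can be stripped again in step 4):

```python
import base64, hashlib, json, math, secrets

P = 2**127 - 1  # toy prime modulus; a real implementation uses an EC group

def h(x: str) -> int:
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % P

# Draw c invertible mod P-1 so (H(x)^(cs))^(1/c) is computable in step 4.
while True:
    c = secrets.randbelow(P - 2) + 2
    if math.gcd(c, P - 1) == 1:
        break

client_elements = ["alice@example.com", "carol@example.com"]
request_message = json.dumps({
    "reveal_intersection": False,
    "encrypted_elements": [
        base64.b64encode(pow(h(x), c, P).to_bytes(16, "big")).decode()
        for x in client_elements
    ],
})
```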

3. Server response

For each encrypted element H(x)^c received from the client, the server encrypts it again under the commutative encryption scheme with its secret key s, computing (H(x)^c)^s = H(x)^(cs). The result is sent back to the client as a JSON array of strings:

{
  "encrypted_elements": [ Base64(H(x_1)^(cs)), Base64(H(x_2)^(cs)), ... ]
}

If reveal_intersection is false, the array is sorted to hide the order of entries from the client.
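The server side of this step might look like the following sketch (hypothetical function name; s and P as in the earlier toy setup, not the library's actual API):

```python
import base64, json

def respond(request_json: str, s: int, P: int) -> str:
    """Re-encrypt each client element H(x)^c into (H(x)^c)^s = H(x)^(cs)."""
    request = json.loads(request_json)
    reencrypted = []
    for b64 in request["encrypted_elements"]:
        v = int.from_bytes(base64.b64decode(b64), "big")
        reencrypted.append(
            base64.b64encode(pow(v, s, P).to_bytes(16, "big")).decode()
        )
    if not request["reveal_intersection"]:
        # Hide which response entry maps to which request entry.
        reencrypted.sort()
    return json.dumps({"encrypted_elements": reencrypted})
```

The sort only runs in the size-only case, since a client that wants the actual intersection needs the response order to match its request.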

4. Client computes intersection

The client decrypts each element received from the server's response using its secret key c, computing (H(x)^(cs))^(1/c) = H(x)^s. It then checks if each element is present in the Bloom filter, and reports the number of matches as the intersection size.
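Putting the four steps together end to end under the same toy assumptions (a plain Python set stands in for the Bloom filter, and all element values are invented):

```python
import hashlib, math, secrets

P = 2**127 - 1  # toy prime modulus; exponentiation mod P is commutative

def h(x: str) -> int:
    return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % P

def keygen() -> int:
    """Secret exponent invertible mod P-1, so it can be stripped later."""
    while True:
        k = secrets.randbelow(P - 2) + 2
        if math.gcd(k, P - 1) == 1:
            return k

s, c = keygen(), keygen()
server_set = {"alice@example.com", "bob@example.com"}
client_set = ["alice@example.com", "carol@example.com"]

# Step 1: the server publishes H(x)^s for its elements (a set stands in
# for the Bloom filter of the real protocol).
server_encrypted = {pow(h(x), s, P) for x in server_set}

# Steps 2-3: client encrypts with c, server re-encrypts with s.
doubly_encrypted = [pow(pow(h(x), c, P), s, P) for x in client_set]

# Step 4: the client strips c, recovering H(x)^s, and counts matches.
c_inv = pow(c, -1, P - 1)
intersection_size = sum(
    pow(y, c_inv, P) in server_encrypted for y in doubly_encrypted
)
print(intersection_size)  # → 1 (only alice@example.com is shared)
```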

@bcebere bcebere added the Type: Epic 🤙 Describes a large amount of functionality that will likely be broken down into smaller issues label Jun 7, 2020
@bcebere bcebere self-assigned this Jun 7, 2020

kshehata commented Jun 9, 2020

A couple of thoughts here:

  1. I don't think we should do cardinality within PIR. Whatever integrates PIR should do it on that end.
  2. For our current use case, I think that bucketing might be better than directly supporting sparse databases. The latter will require cuckoo hashing or something similar, which is inefficient. See Support bucketing in PIR Database #27


bcebere commented Jun 9, 2020

> A couple of thoughts here:
>
>   1. I don't think we should do cardinality within PIR. Whatever integrates PIR should do it on that end.
>   2. For our current use case, I think that bucketing might be better than directly supporting sparse databases. The latter will require cuckoo hashing or something similar, which is inefficient. See Support bucketing in PIR Database #27
  1. The cardinality is useful for not revealing the actual database content to the user. This should be done on the server side, as in the current PSI implementation.
  2. As far as I understand, the bucketing approach imposes restrictions on the database content/indexes, and we might want a generic solution for that.
> Since tokens are expected to have a uniform distribution (both before and after transformation), tokens should be uniformly distributed across shards and buckets. As such, the bucket addresses can simply be the first log2(n_shards · n_buckets) most significant bits of the transformed token itself.

I don't think we should enforce this assumption in a general-purpose library.
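For reference, the quoted addressing scheme is only well-defined when n_shards · n_buckets is a power of two, which is exactly the kind of restriction at issue. A sketch, with illustrative names and a hash standing in for the token transformation:

```python
import hashlib

def bucket_address(token: bytes, n_shards: int, n_buckets: int) -> int:
    """Address = top log2(n_shards * n_buckets) bits of the hashed token.
    Requires a power-of-two total, and is only balanced when tokens are
    uniformly distributed."""
    total = n_shards * n_buckets
    assert total & (total - 1) == 0, "requires a power-of-two bucket count"
    bits = total.bit_length() - 1  # log2(total)
    digest = int.from_bytes(hashlib.sha256(token).digest(), "big")
    return digest >> (256 - bits)

addr = bucket_address(b"some-token", n_shards=4, n_buckets=16)
```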
