
A question about the computation of total number of blocks #12

Closed
AmbitionXiang opened this issue Nov 30, 2021 · 7 comments

Comments

@AmbitionXiang

Hi, I have a question about the implementation of PathORAM. In the conference paper, N is the working set size, i.e., the number of distinct data blocks stored in the ORAM. The journal version also derives the total number of blocks: the height is L = ceil(log_2(N)), and the total server storage is Z*2^L blocks. That derivation seems inconsistent with your implementation. What are the considerations here?
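For reference, the textbook sizing being described can be worked out in a short sketch. This is not code from mc-oblivious; `log2_ceil` is a stand-in reimplementation, and the formula follows the paper's statement quoted in this question:

```rust
// Sketch of the textbook Path ORAM sizing: height L = ceil(log2(N)),
// total server storage = Z * 2^L blocks. Not the mc-oblivious helpers.

fn log2_ceil(n: u64) -> u32 {
    // ceil(log2(n)) for n >= 1
    if n <= 1 { 0 } else { 64 - (n - 1).leading_zeros() }
}

fn textbook_storage(n: u64, z: u64) -> u64 {
    let l = log2_ceil(n); // tree height L = ceil(log2(N))
    z * (1u64 << l)       // Z * 2^L blocks on the server
}

fn main() {
    // N = 2048 distinct blocks, Z = 4 slots per bucket:
    // L = 11, storage = 4 * 2^11 = 8192 blocks.
    println!("{}", textbook_storage(2048, 4));
}
```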


cbeck88 commented Nov 30, 2021

> The derivation of the total number of blocks is also illustrated in the journal version, that is, height L=ceil(log_2(N)), and the total server storage is Z*2^L. The derivation seems inconsistent with your implementation.

I believe that this accurately describes our implementation.

The implementation of PathORAM in this repo is generic over an ORAMStorage object which stores the entire tree. The PathORAM object only cares that it can check out any branch it wants. This is why you won't find a container of size Z*2^L being allocated in path_oram/mod.rs.

However we can see the checked-out branch here: https://github.com/mobilecoinofficial/mc-oblivious/blob/e9be7af384654c5467fb28f4396eed6c255f0e15/mc-oblivious-ram/src/path_oram/mod.rs#L250

It contains a vector of byte chunks of size Z * ValueSize, and when we check out a branch, we can see that it resizes itself to leaf.height() + 1: https://github.com/mobilecoinofficial/mc-oblivious/blob/e9be7af384654c5467fb28f4396eed6c255f0e15/mc-oblivious-ram/src/path_oram/mod.rs#L377

So if this is a branch in the tree, this should tell us that the tree has height equal to leaf.height() + 1.

The ORAMStorage object that we actually use in the enclaves is here:
https://github.com/mobilecoinfoundation/mobilecoin/tree/master/fog/ocall_oram_storage/trusted

This object does two things:
(1) Implements tree-top caching, which caches the upper portion of the tree in the enclave (essentially, whatever fits in the EPC).
(2) For the lower portion of the tree, it stores data in untrusted memory, but encrypts and authenticates each segment whenever it enters or leaves the enclave.
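To make the tree-top caching idea concrete, here is a hedged sketch of the split, not the actual mc-oblivious/mobilecoin code: it computes how many top levels of the bucket tree fit within an assumed in-enclave budget. The `bucket_bytes` and `epc_budget` parameters are purely illustrative:

```rust
// Hedged sketch of tree-top caching: keep the top `t` levels of the bucket
// tree inside the enclave, where `t` is the largest number of levels whose
// buckets fit in an assumed EPC budget. Illustrative parameters only.

fn cached_levels(height: u32, bucket_bytes: u64, epc_budget: u64) -> u32 {
    let mut levels = 0u32;
    let mut used = 0u64;
    while levels <= height {
        // level `levels` (root = level 0) holds 2^levels buckets
        let level_cost = (1u64 << levels) * bucket_bytes;
        if used + level_cost > epc_budget {
            break;
        }
        used += level_cost;
        levels += 1;
    }
    levels
}

fn main() {
    // Toy numbers: 4 KiB buckets, 1 MiB budget, tree of height 20.
    // Levels 0..=7 cost (2^8 - 1) * 4096 = 1_044_480 bytes <= 1 MiB;
    // adding level 8 would exceed the budget, so 8 levels are cached.
    println!("{}", cached_levels(20, 4096, 1 << 20));
}
```

Everything below the cached levels would then be fetched on demand from untrusted memory, with encryption and authentication applied at the boundary as described above.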

The logic for this encryption / authentication scheme is in the trusted crate.
The edl file which describes the interface by which the enclave may call to untrusted to obtain or return tree segments is here: https://github.com/mobilecoinfoundation/mobilecoin/blob/master/fog/ocall_oram_storage/edl/src/oram_storage.edl
The untrusted side, which simply allocates storage on the heap at time of writing, is here: https://github.com/mobilecoinfoundation/mobilecoin/blob/master/fog/ocall_oram_storage/untrusted/src/lib.rs

That allocation occurs here: https://github.com/mobilecoinfoundation/mobilecoin/blob/21aabfb46cf750817c35696fee3a42e505ca49a1/fog/ocall_oram_storage/untrusted/src/lib.rs#L105
This is expected to be the largest allocation and indicative of total memory use when the tree is large relative to the EPC.

There are a bunch of additional considerations about exactly how big the tree needs to be:

- There's at least one important optimization over the conference paper, given by Gentry et al.: https://eprint.iacr.org/2013/239
- The tree-top caching across the SGX boundary is very important, and was described by the ZeroTrace authors: https://eprint.iacr.org/2017/549.pdf
- There is a lot more work for us to do here -- we still have not implemented CircuitORAM, although we expect a lot of performance gain from that, based on the ZeroTrace results and discussions with the authors. The main thing we have done to improve performance over a "naive" implementation is the use of SIMD instructions in the aligned-cmov crate.

The tree-top caching stuff that we did can be roughly compared with the ZeroTrace implementation here: https://github.com/sshsshy/ZeroTrace/blob/master/ZT_Enclave/ORAMTree.cpp, it's not exactly the same but it's similar in spirit.

Please let me know if you still think there is an inconsistency or if you have any other questions, I'm happy to share my thoughts.


AmbitionXiang commented Nov 30, 2021

Thank you for your comprehensive explanation. Yeah, I'm reading your code and the ZeroTrace code simultaneously, and the two are indeed similar in spirit, as you said.

According to my understanding, for PathORAM4096Z2Creator, bucket size = 4096, block size = 2048, and Z = 2, and the parameter size in the new function of PathORAM corresponds to the N in the paper. I know your implementation already implies the Z*2^L total, and I have no doubt about it. However, my question is about the computation of L.

In your implementation, let height = log2_ceil(size).saturating_sub(log2_ceil(Z::U64)); seems to compute L' = ceil(log_2(N/Z)), not L = ceil(log_2(N)). For N = 2048 and Z = 2, L' = 10, and the derived total number of blocks is 2*2^10 = 2048; it seems that no dummy blocks can be placed in the tree, which seems unsafe. With L = 11, the total number of blocks would be 2*2^11 = 4096. I do not know whether my understanding is correct.
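The height computation being quoted can be checked numerically with a small sketch. `log2_ceil` is reimplemented here; the real crate has its own helper with the same intent:

```rust
// Sketch of the height computation quoted from path_oram/mod.rs:
// height = log2_ceil(size) - log2_ceil(Z), i.e. L' = ceil(log2(N / Z)).

fn log2_ceil(n: u64) -> u32 {
    // ceil(log2(n)) for n >= 1
    if n <= 1 { 0 } else { 64 - (n - 1).leading_zeros() }
}

fn implemented_height(size: u64, z: u64) -> u32 {
    log2_ceil(size).saturating_sub(log2_ceil(z))
}

fn main() {
    // N = 2048, Z = 2: L' = 11 - 1 = 10, versus the textbook L = 11.
    println!("{}", implemented_height(2048, 2));
}
```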

Moreover, I plan to adapt RingORAM to your framework, and the way you compute the total number of blocks does not seem compatible with RingORAM.

By the way, your implementation is concise, beautiful, and powerful. I like it.


cbeck88 commented Nov 30, 2021

> In your implementation, let height = log2_ceil(size).saturating_sub(log2_ceil(Z::U64)); seems to compute L'=ceil(log_2(N/Z)), not L=ceil(log_2(N)). For N=2048, Z=2, L' = 10, and the derived total number of blocks is 2*2^10=2048, it seems that no dummy blocks can be placed in the total tree, which seems unsafe; while L = 11, and the total number of blocks is 2*2^11=4096. I do not know whether my understanding is correct.

I see what you are saying.

I think I was trying to implement the optimization described in Gentry et al. in section 3: https://eprint.iacr.org/2013/239

> The construction from [20] uses an N-leaf tree to store an N-element database. We first observe that
> increasing just slightly the capacity of the buckets, one can achieve the same overflow probability
> for many more elements than leaves. Note that the buckets in [20] are large enough to keep O(k)
> elements, but the expected number of elements in each bucket is only O(1), which is the reason for
> the O(k) overhead in storage. We observe, however, that we can increase the expected number of
> elements per bucket to any desired number m, while keeping the overflow probability at exp(−k),
> simply by making the bucket capacity O(m + k), thus reducing the overhead to O(k/m). Hence by
> setting m = k, we reduce the storage overhead to only O(1).
> Specifically, by making the capacity of each bucket be 2k (rather than k as in [20]), we can use
> a shallower tree with only n/k leaves to store n elements. We reduce the number of nodes by a
> factor of k while increasing the bucket size by only a factor of 2, thus getting a factor k/2 saving
> in storage.

But it's possible that I messed it up.

I didn't remember this part

> by making the capacity of each bucket be 2k

But nevertheless the implementation seems to work -- we create trees with thousands of items and exercise them millions of times with the Z=4 ORAM.

It might be that the correct way to think about it is: the Z in our code is the 2k from Gentry, so we are effectively doing a Z=2 parameterization of Path ORAM? And for some reason that works with the Gentry optimization (although it isn't supposed to work otherwise?)
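Reading our Z as Gentry et al.'s 2k, the quoted optimization works out as follows. This is a sketch of the formula from the quote, not code from either paper or from mc-oblivious:

```rust
// Sketch of the Gentry et al. optimization as quoted above: store n
// elements in a shallower tree with n/k leaves, doubling the bucket
// capacity from k to 2k. Returns (leaves, slots in the leaf level).

fn gentry_tree(n: u64, k: u64) -> (u64, u64) {
    let leaves = n / k;          // shallower tree: n/k leaves instead of n
    let bucket_capacity = 2 * k; // each bucket holds 2k elements
    (leaves, leaves * bucket_capacity)
}

fn main() {
    // Reading our Z = 2 case as k = 1: for n = 2048 elements there are
    // 2048 leaves and 2048 * 2 = 4096 slots in the leaf level alone.
    let (leaves, leaf_slots) = gentry_tree(2048, 1);
    println!("{} {}", leaves, leaf_slots);
}
```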

I need to think more about this and read these papers again, sorry.

> it seems that no dummy blocks can be placed in the total tree, which seems unsafe;

I agree it does seem unsafe -- although there is also the stash, that shouldn't really matter if the entire tree is full.

In this test, we are making an n=8192 ORAM, and exercising it 20,000 times:

https://github.com/mobilecoinofficial/mc-oblivious/blob/e9be7af384654c5467fb28f4396eed6c255f0e15/mc-oblivious-ram/src/lib.rs#L332

If there really were no space for dummy blocks, then I would expect to see overflow after 20,000 accesses. So I am missing something here. I will write again later. Thanks for your questions!

> By the way, your implementation is concise, beautiful, and powerful. I like it.

Thank you for your kind words!

> Moreover, I plan to adapt RingORAM to your framework; the way you compute the total number of blocks does not seem compatible with RingORAM.

That sounds like a really interesting project!

@AmbitionXiang

> If there is really no space for dummy blocks then I would expect that after 20,000 times we would see overflow.

Yeah, it is confusing that you observe no overflow. But I notice that you have observed it when Z=2.

> I will write again later.

Thank you for your patience. I'm looking forward to your helpful explanation.

cbeck88 added a commit that referenced this issue Dec 1, 2021
this adds a version of exercise_path_oram that queries consecutive
locations over and over, rather than using the "progressive"
strategy.

the progressive strategy was good initially when we would find bugs
quickly when items are queried again, and it's also good for testing
very large ORAMs in a way that is interesting.

this strategy helps to answer questions posed in issue #12, like,
is it the case that when all items exist in the tree, there is no
space for any dummy blocks, due to our choices of parameters.
In case of such an issue, we would expect this test to fail when
the number of rounds is significantly larger than the size of the
ORAM, because with high probability when an item is mapped
there would be no space on the branch that it selects or in the
stash.

cbeck88 commented Dec 1, 2021

It occurred to me that the test that I mentioned was using a "progressive" probing strategy which doesn't actually access all locations in the ORAM, so it's conceivable that the ORAM has that problem and the test passes.

I have added some more tests that always probe consecutive locations, and they seem to be passing locally:
#13

This just gives me confidence that what's in prod is not broken, though -- it doesn't yet explain why it's not broken.


cbeck88 commented Dec 1, 2021

Ok, I think the reason is here:

https://github.com/mobilecoinofficial/mc-oblivious/blob/e9be7af384654c5467fb28f4396eed6c255f0e15/mc-oblivious-ram/src/path_oram/mod.rs#L142

We are actually creating the ORAM storage object with 2<<h and not 1<<h, so this is actually 2 * 2^h.

The code comment above this explains:

    // This is 2u64 << height because it must be 2^{h+1}, we have defined
    // the height of the root to be 0, so in a tree where the lowest level
    // is h, there are 2^{h+1} nodes.

So I think we have effectively done the Gentry et al. optimization -- instead of 2^n nodes, we have 2 * 2^n / Z nodes.
But what we are calling Z is what Gentry et al. are calling 2k, so that's a documentation problem.
And the strategy appears to be working in practice even when our Z is set to 2 (i.e., Gentry et al.'s k = 1). I don't know why that is, but I don't know that it is expected to fail in this case either.
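Putting the pieces together, a sketch of the resulting total slot count (reimplementing `log2_ceil` here; not the crate's own helper):

```rust
// Sketch combining the two observations from this thread: height
// h = ceil(log2(N/Z)) as in path_oram/mod.rs, and a storage allocation
// of 2 << h buckets, i.e. 2^{h+1}, per the code comment quoted above.

fn log2_ceil(n: u64) -> u32 {
    // ceil(log2(n)) for n >= 1
    if n <= 1 { 0 } else { 64 - (n - 1).leading_zeros() }
}

fn total_slots(n: u64, z: u64) -> u64 {
    let h = log2_ceil(n).saturating_sub(log2_ceil(z));
    let buckets = 2u64 << h; // 2^{h+1} nodes, root at height 0
    buckets * z
}

fn main() {
    // N = 2048, Z = 2: h = 10, buckets = 2^11 = 2048, slots = 4096 = 2 * N,
    // so half of the slots remain free for dummy blocks after all.
    println!("{}", total_slots(2048, 2));
}
```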

Another problem I remember now is, what I have called "height" of a node in all of the comments is actually called the "depth" in computer science literature, and I really should fix that.

@AmbitionXiang

I understand now! Thank you for your explanation! I've learned a lot.

cbeck88 added a commit that referenced this issue Dec 1, 2021
* add additional test functions

this adds a version of exercise_path_oram that queries consecutive
locations over and over, rather than using the "progressive"
strategy.

the progressive strategy was good initially when we would find bugs
quickly when items are queried again, and it's also good for testing
very large ORAMs in a way that is interesting.

this strategy helps to answer questions posed in issue #12, like,
is it the case that when all items exist in the tree, there is no
space for any dummy blocks, due to our choices of parameters.
In case of such an issue, we would expect this test to fail when
the number of rounds is significantly larger than the size of the
ORAM, because with high probability when an item is mapped
there would be no space on the branch that it selects or in the
stash.

* cargo fmt