## OpenDP Context API

The Context API preserves the core elements of the OpenDP framework but simplifies the construction of DP analyses by taking a data-oriented approach that will feel familiar to users of libraries like Pandas and NumPy.

A `Context` is a privacy accountant that mediates query access to your sensitive dataset. You can create a `Context` via the `compositor` method, which requires the following parameters:
- `data`: The data to be analyzed.
- `privacy_unit`: A tuple consisting of a metric and a dataset distance.
- `privacy_loss`: A tuple consisting of a privacy measure and a privacy loss parameter.

In addition, `compositor` requires:
- `split_evenly_over` or `split_by_weights`: One of these two parameters must be specified. When you make multiple DP queries on the same dataset, it defines how the privacy budget is split across those queries.

The `unit_of` and `loss_of` helper functions can be used to construct the `privacy_unit` and `privacy_loss` parameters, respectively. The `unit_of` helper lets you specify the contributions for a dataset, that is, the greatest number of records a single individual may contribute. By default, specifying the `contributions` parameter yields a symmetric distance.

In [1]:
from typing import List
import opendp.prelude as dp
dp.enable_features("contrib")

dp.unit_of(contributions=3)

(SymmetricDistance(), 3)

The `loss_of` wrapper allows you to define privacy measures and loss for different forms of DP. This example defines a privacy loss of epsilon=1.0 for pure DP.

In [2]:
# Pure DP
dp.loss_of(epsilon=1.0)

(<opendp.mod.Measure at 0x7fb63403bd40>, 1.0)

Now, let's create a context via the `compositor` method. Note that leaving the `domain` parameter unspecified assumes that the structure of the data is public knowledge. In some cases, specifying the domain explicitly can improve utility (e.g., by setting the dataset size).

In [3]:
context = dp.Context.compositor(
    data=[1, 2, 3],
    privacy_unit=dp.unit_of(contributions=1),
    privacy_loss=dp.loss_of(epsilon=3.0),
    domain=dp.domain_of(List[int]),
    split_evenly_over=1
)

Once you have created the `Context` object, you can submit DP queries to it by calling the `query` method and chaining operations onto the result. This example clamps the data, computes the sum, and adds Laplace noise calibrated to the privacy budget of epsilon=3.0. The query is not applied to the data until `.release()` is called. More documentation for these operations can be found below:
- [clamp()](https://docs.opendp.org/en/stable/user/transformations.html#clamping)
- [sum()](https://docs.opendp.org/en/stable/user/transformations.html#aggregators)
- [laplace()](https://docs.opendp.org/en/stable/user/transformations.html#aggregators)

In [4]:
dp_sum = context.query().clamp((0, 5)).sum().laplace()
dp_sum.release()

6
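
To make the clamp, sum, and noise steps concrete, here is a minimal plain-Python sketch of the same idea. It is an illustration only and does not use or reproduce OpenDP's implementation: the helper names (`clamp`, `laplace_noise`, `noisy_clamped_sum`) are hypothetical, the sampler is not hardened against floating-point attacks, and it assumes that adding or removing one record changes a sum clamped to `(lower, upper)` by at most `max(|lower|, |upper|)`.

```python
import random


def clamp(values, lower, upper):
    """Clip each value into [lower, upper] so one record's influence is bounded."""
    return [min(max(v, lower), upper) for v in values]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_clamped_sum(values, bounds, epsilon, contributions=1, rng=None):
    """Return (noisy sum, noise scale) for a clamped sum (illustrative only)."""
    rng = rng or random.Random()
    lower, upper = bounds
    # Assumed sensitivity: each contributed record moves the sum by at most
    # max(|lower|, |upper|) when it is added or removed.
    sensitivity = contributions * max(abs(lower), abs(upper))
    scale = sensitivity / epsilon
    exact = sum(clamp(values, lower, upper))
    return exact + laplace_noise(scale, rng), scale


release, scale = noisy_clamped_sum(
    [1, 2, 3], bounds=(0, 5), epsilon=3.0, rng=random.Random(0)
)
print(f"noise scale = {scale:.3f}, release = {release:.3f}")
```

Under these assumptions the noise scale is 5 / 3, so a larger epsilon or tighter clamping bounds would both reduce the noise. The released value is random and will differ from run to run unless the generator is seeded.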

Note that attempting to release another DP query results in an error stating that we have exhausted our number of queries, because the context was created with `split_evenly_over=1`.

In [5]:
from opendp.mod import OpenDPException

try:
    print(dp_sum.release())
except OpenDPException as err:
    print(err.message)

out of queries
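
The accounting behavior seen here can be pictured with a toy accountant. The `ToyAccountant` class below is hypothetical and far simpler than OpenDP's `Context`; it only assumes that the total budget is divided into a fixed number of equal shares and that each release consumes one share.

```python
class ToyAccountant:
    """Hypothetical stand-in for a privacy accountant; not OpenDP's Context."""

    def __init__(self, epsilon, split_evenly_over):
        # Each query receives an equal share of the total budget.
        self.per_query_epsilon = epsilon / split_evenly_over
        self.remaining = split_evenly_over

    def spend(self):
        """Hand out one share of the budget, or fail once all shares are used."""
        if self.remaining == 0:
            raise RuntimeError("out of queries")
        self.remaining -= 1
        return self.per_query_epsilon


accountant = ToyAccountant(epsilon=3.0, split_evenly_over=1)
first = accountant.spend()  # the only share available

try:
    accountant.spend()  # no shares left, so this raises
    message = None
except RuntimeError as err:
    message = str(err)

print(first, message)
```

The point of the sketch is that the limit is enforced by the accountant, not by the analyst: once every share has been handed out, further releases are refused.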


You can allow for more queries by changing the `split_evenly_over` parameter. In this example, we define a new context, but in a real data release, your entire data analysis should be conducted within a single root `Context`.

In [6]:
context = dp.Context.compositor(
    data=[1, 2, 3],
    privacy_unit=dp.unit_of(contributions=1),
    privacy_loss=dp.loss_of(epsilon=3.0),
    domain=dp.domain_of(List[int]),
    split_evenly_over=2
)

Now, we can split our privacy budget evenly over 2 queries.

In [7]:
# Release a DP sum
dp_sum = context.query().clamp((0, 5)).sum().laplace()
dp_sum.release()

9

In [8]:
# Release a DP count
dp_count = context.query().clamp((0, 5)).count().laplace()
dp_count.release()

3

The `split_evenly_over` parameter splits the privacy loss evenly across the queries. If you want to give more of the privacy budget to one of the queries, you can specify the `split_by_weights` parameter instead. For example, when computing a DP count and a DP sum, you may want to allocate more of the privacy budget to the sum query, since the sensitivity of a sum is greater than that of a count.

In [9]:
context = dp.Context.compositor(
    data=[1, 2, 3],
    privacy_unit=dp.unit_of(contributions=1),
    privacy_loss=dp.loss_of(epsilon=1.0),
    domain=dp.domain_of(List[int]),
    # Give more privacy loss to the sum query
    split_by_weights=[2, 1]
)
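
As a quick check of the arithmetic, the sketch below shows how weights of `[2, 1]` translate into per-query shares of epsilon=1.0. The `split_budget` helper is hypothetical and assumes the budget is divided in proportion to the weights, which matches the 2/3 and 1/3 shares used in the cells that follow; it is not a description of OpenDP's internals.

```python
def split_budget(epsilon, weights):
    """Divide a total epsilon among queries in proportion to their weights."""
    total = sum(weights)
    return [epsilon * w / total for w in weights]


# Weighted split: the first query (the sum) gets twice the share of the second.
sum_eps, count_eps = split_budget(1.0, [2, 1])
print(f"sum query epsilon = {sum_eps:.4f}, count query epsilon = {count_eps:.4f}")

# An even split is the special case where all weights are equal.
even = split_budget(3.0, [1, 1])
print(even)
```

Because less epsilon means more noise, the count query in this configuration is released with noticeably more relative noise than the sum query.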

In [10]:
# Release a DP sum, using 2/3 of the privacy loss
dp_sum = context.query().clamp((0, 5)).sum().laplace()
dp_sum.release()

5

In [11]:
# Release a DP count, using 1/3 of the privacy loss
dp_count = context.query().clamp((0, 5)).count().laplace()
dp_count.release()

-2
