# Introduction to Federated Core (FC)

This tutorial is the first part of a two-part series that demonstrates how to implement custom types of federated algorithms in TensorFlow Federated (TFF) using the Federated Core (FC) - a set of lower-level interfaces that serve as a foundation upon which we have implemented the Federated Learning (FL) layer.

This first part is more conceptual; we introduce some of the key concepts and programming abstractions used in TFF, and we demonstrate their use on a very simple example with a distributed array of temperature sensors. In the second part of this series, we use the mechanisms we introduce here to implement a simple version of federated training and evaluation algorithms. As a follow-up, we encourage you to study the implementation of federated averaging in tff.learning.

By the end of this series, you should be able to recognize that the applications of Federated Core are not necessarily limited to learning. The programming abstractions we offer are quite generic, and could be used, e.g., to implement analytics and other custom types of computations over distributed data.

## What is Federated Core (FC)

Federated Core (FC) is a development environment that makes it possible to compactly express program logic that combines TensorFlow code with distributed communication operators, such as those that are used in Federated Averaging - computing distributed sums, averages, and other types of distributed aggregations over a set of client devices in the system, broadcasting models and parameters to those devices, etc.

## Who should use Federated Core API?

The primary target audience for TFF's FC API is researchers and practitioners who want to experiment with new federated learning algorithms and evaluate the consequences of subtle design choices that affect how the flow of data in the distributed system is orchestrated, without getting bogged down in system implementation details.

The level of abstraction that FC API is aiming for roughly corresponds to pseudocode one could use to describe the mechanics of a federated learning algorithm in a research publication - what data exists in the system and how it is transformed, but without dropping to the level of individual point-to-point network message exchanges.

In [59]:
import collections

import numpy as np
import tensorflow as tf
import tensorflow_federated as tff


In [60]:
# required to run TFF inside Jupyter notebooks
import nest_asyncio
nest_asyncio.apply()

In [61]:
@tff.federated_computation
def hello_world():
    return 'Hello, World!'


In [62]:
hello_world()

b'Hello, World!'

# Federated data

One of the distinguishing features of TFF is that it allows you to compactly express TensorFlow-based computations on federated data. We will be using the term *federated data* in this tutorial to refer to a collection of data items hosted across a group of devices in a distributed system. The following are some examples.

* Applications running on mobile devices collect data and store it locally without uploading to a centralized location.
* IoT sensors and other edge computing devices collect and store data locally without uploading to a centralized location.

Federated data are treated as first-class citizens in TFF, meaning that they have types and may appear as *parameters* and *results* of functions. To reinforce this notion, we will refer to federated data sets as *federated values*, or as *values of federated types*.

For example, here's how one would define the type of a federated float hosted by a group of client devices in TFF. A collection of temperature readings that materialize across an array of distributed sensors could be modeled as a value of this federated type.

In [5]:
tff.FederatedType?

Init signature: tff.FederatedType(member, placement, all_equal=None)
Docstring:      An implementation of `tff.Type` representing federated types in TFF.
Init docstring:
Constructs a new federated type instance.

Args:
  member: An instance of `tff.Type` (or something convertible to it) that
    represents the type of the member components of each value of this
    federated type.
  placement: The specification of placement that the member components of
    this federated type are hosted on. Must be either a placement literal
    such as `tff.SERVER` or `tff.CLIENTS` to refer to a globally defined
    placement, or a placement label to refer to a placement defined in other
    parts of a type signature. Specifying placement labels is not
    implemented yet.
  all_equal: A `bool` value that indicates whether all 

In [63]:
# collection of data items across all devices is modeled as a single federated value
temperature_type = tff.FederatedType(member=tf.float32, placement=tff.CLIENTS)

A federated type with member constituents `T` and placement `G` can be represented compactly as `{T}@G`, as shown below.

In [65]:
# use print to get the str repr
print(temperature_type)

{float32}@CLIENTS


The curly braces `{}` serve as a reminder that the member constituents (items of data on different devices) may differ (as you would expect of temperature sensor readings). The clients as a group are jointly hosting a multi-set of `T`-typed items that together constitute the federated value.

It is important to note that the member constituents of a federated value should not be thought of as a simple dict keyed by an identifier of a client device in the system - federated values are intended to be collectively transformed only by federated operators representing various kinds of distributed communication protocols for performing computations (such as aggregation). 

Federated types in TFF come in two flavors: those where the member constituents of a federated value may differ (as just seen above), and those where they are known to be all equal. This is controlled by the third, optional `all_equal` parameter in the [`tff.FederatedType`](https://www.tensorflow.org/federated/api_docs/python/tff/FederatedType) constructor (defaulting to `False`).

The following are examples of federated values where `all_equal=False`.

* Raw data stored locally on client devices.
* Local metrics summarizing local training progress on client devices.
* Set of parameters for a machine learning model trained on a client device that will eventually be communicated to the server.

A federated type with a placement `G` in which all of the `T`-typed member constituents are known to be equal can be compactly represented as `T@G` (as opposed to `{T}@G`, that is, with the curly braces dropped to reflect the fact that the multi-set of member constituents consists of a single item).

The following are examples of federated values where `all_equal=True`.

* Hyperparameters (such as a learning rate, a clipping norm, etc.) that have been broadcast by a server to a group of devices that participate in federated training.
* Set of parameters for a machine learning model pre-trained at the server that will eventually be broadcast to a group of client devices.

In [68]:
hyperparameter_type = tff.FederatedType(member=tf.float32, placement=tff.CLIENTS, all_equal=True)

In [69]:
print(hyperparameter_type)

float32@CLIENTS
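
To make the compact notation concrete, here is a minimal sketch in plain Python that mimics how the two flavors of federated type are rendered as strings. The helper `compact_type_str` is a hypothetical name invented for this illustration; it is not part of TFF and only reproduces the `{T}@G` versus `T@G` convention described above.

```python
def compact_type_str(member, placement, all_equal=False):
    """Format a federated type in the compact notation (illustration only)."""
    if all_equal:
        # All member constituents are equal: drop the curly braces.
        return f"{member}@{placement}"
    # Member constituents may differ: keep the curly braces.
    return f"{{{member}}}@{placement}"


temperature_repr = compact_type_str("float32", "CLIENTS")
hyperparameter_repr = compact_type_str("float32", "CLIENTS", all_equal=True)

print(temperature_repr)     # {float32}@CLIENTS
print(hyperparameter_repr)  # float32@CLIENTS
```

The two printed strings match the outputs of `print(temperature_type)` and `print(hyperparameter_type)` shown earlier.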


Now a slightly more complicated example. Suppose we have a pair of `tf.float32` parameters `m` and `b` for a simple one-dimensional linear regression model. We can construct the (non-federated) type of such models for use in TFF as follows.

In [70]:
tff.NamedTupleType?

Init signature: tff.NamedTupleType(elements)
Docstring:
An implementation of `tff.Type` representing named tuple types in TFF.

Elements initialized by name can be accessed as `foo.name`, and otherwise by
index, `foo[index]`.
Init docstring:
Constructs a new instance from the given element types.

Args:
  elements: An iterable of element specifications. Each element
    specification is either a type spec (an instance of `tff.Type` or
    something convertible to it via `tff.to_type`) for the element, or a
    (name, spec) for elements that have defined names. Alternatively, one
    can supply here an instance of `collections.OrderedDict` mapping element
    names to their types (or things that are convertible to types).
File:           ~/Training/data-science-project-templates/tensorflow-federated-data-science-project/env/lib/python3.7/site-packages/tenso

In [14]:
linear_regression_type = tff.NamedTupleType([
    ('m', tf.float32),
    ('b', tf.float32)
])


In [15]:
# angle braces `<>` in type string are TFF notation for named or unnamed tuples.
print(linear_regression_type)

<m=float32,b=float32>


Non-scalar types are also supported. In the above code `tf.float32` is actually a shortcut notation for the more general type `tff.TensorType(dtype=tf.float32, shape=[])`. When this model is broadcast to clients, the type of the resulting federated value can be represented as shown below.

In [16]:
model_parameters = tff.FederatedType(linear_regression_type, tff.CLIENTS, all_equal=True)

In [17]:
print(model_parameters)

<m=float32,b=float32>@CLIENTS


Now, coming back to `float32@CLIENTS` - while it appears replicated across multiple devices, it is actually a single `float32` (since all members are the same). In general, you may think of any all-equal federated type, i.e., one of the form `T@G`, as isomorphic to a non-federated type `T`, since in both cases, there's actually only a single (albeit potentially replicated) item of type `T`.

# Placements

In the preceding section we introduced the concept of placements - groups of system participants that might be jointly hosting a federated value - and we demonstrated the use of `tff.CLIENTS` as an example specification of a placement. The notion of a placement is so fundamental to TFF that placements needed to be incorporated directly into the TFF type system.

### Placements help us reason about where data *currently* reside and where we *intend* data to materialize

Although in this tutorial you will only see TFF code being executed locally in a simulated environment, the primary goal of TFF is to enable developers to write code that could be deployed for execution on groups of physical devices in a distributed system. Each of those devices would receive a separate set of instructions to execute locally depending on the role it plays in the system (an end-user device, a centralized coordinator, an intermediate layer in a multi-tier architecture, etc.). It is important to be able to reason about which subsets of devices execute what code and where different portions of the data might physically materialize.

This is especially important when dealing with client devices that generate data that is private and/or sensitive. Developers need the ability to *statically* verify that this data will never leave the client device (and possibly even *prove* assurances about how the data is being processed). The placement specifications are one of the mechanisms designed to support this.

Representing the type of a certain value as `T@G` or `{T}@G` (as opposed to just `T`) makes data placement decisions explicit. Furthermore, lifting placements into the TFF type system potentially allows for the use of formal verification tools to automatically provide privacy guarantees for sensitive client data.

### TFF encourages us to focus on *data* placement, rather than on *operations* placement 

TFF has been designed as a *data-centric* programming environment. Unlike many existing frameworks that focus on operations and where those operations might run, TFF focuses on data, where that data materializes, and how it's being transformed. Consequently, data placement is modeled as a property of *data* in TFF, rather than as a property of operations on data.

An important thing to note at this point, however, is that while we encourage TFF users to be explicit about groups of participating devices that host the data (the placements), developers will *never* deal with the raw data or identities of the individual participants. Within the body of TFF code, by design, there's no way to enumerate the devices that constitute the group represented by `tff.CLIENTS`, or to probe for the existence of a specific device in the group. There's no concept of a device or client identity anywhere in the Federated Core API, the underlying set of architectural abstractions, or the core runtime infrastructure we provide to support simulations. All the computation logic you write will be expressed as operations on the entire group of clients.

## Specifying Placements

TFF provides two basic placement literals, `tff.CLIENTS` and `tff.SERVER`, to make it easy to express a variety of practical scenarios that are naturally modeled as client-server architectures, with multiple client devices (mobile phones, embedded devices, distributed databases, sensors, etc.) orchestrated by a single centralized server coordinator. TFF is designed to also support custom placements, multiple client groups, multi-tiered and other, more general distributed architectures, but discussing them is outside the scope of this tutorial.

Note that TFF doesn't prescribe what either the `tff.CLIENTS` or the `tff.SERVER` actually represent.

* `tff.SERVER` might be a single physical device.
* `tff.SERVER` might be a group of replicas in a fault-tolerant cluster running state machine replication.

Rather, we use the `all_equal=True` mentioned in the preceding section to express the fact that we're generally dealing with only a single item of data at the server.

Likewise, `tff.CLIENTS` in some applications might represent all clients in the system - what in the context of federated learning we sometimes refer to as the population - or, as in a production implementation of Federated Averaging, it may represent a cohort - a subset of the clients selected for participation in a particular round of training. The abstractly defined placements are given concrete meaning when a computation in which they appear is deployed for execution (or simply invoked like a Python function in a simulated environment, as is demonstrated in this tutorial). In our local simulations, the group of clients is determined by the federated data supplied as input.

# Federated computations

## Declaring federated computations

TFF is designed as a strongly-typed functional programming environment. The basic unit of composition in TFF is a *federated computation* - a section of logic that may accept federated values as input and return federated values as output. 

Here's an example of how you can define a federated computation that calculates the average of the temperatures reported by the sensor array from our previous example.

In [72]:
tff.federated_computation?

Signature: tff.federated_computation(*args)
Docstring:
Decorates/wraps Python functions as TFF federated/composite computations.

The term *federated computation* as used here refers to any computation that
uses TFF programming abstractions. Examples of such computations may include
federated training or federated evaluation that involve both client-side and
server-side logic and involve network communication. However, this
decorator/wrapper can also be used to construct composite computations that
only involve local processing on a client or on a server.

The main feature that distinguishes *federated computation* function bodies
in Python from the bodies of TensorFlow defuns is that whereas in the latter,
one slices and dices `tf.Tensor` instances using a variety of TensorFlow ops,
in the former one slices and dices `tff.Value` instances using TFF operators.

The suppor

In [71]:
SENSOR_READINGS_TYPE = tff.FederatedType(member=tf.float32, placement=tff.CLIENTS)

@tff.federated_computation(SENSOR_READINGS_TYPE)
def compute_average_temperature(sensor_readings):
    return tff.federated_mean(sensor_readings)


Code generated by the [`tff.federated_computation`](https://www.tensorflow.org/federated/api_docs/python/tff/federated_computation) decorator is neither TensorFlow nor Python - it's a specification of a distributed system in an internal platform-independent glue language. What we mean by this is that when the Python interpreter encounters a function decorated with `tff.federated_computation`, TFF does two things.

1. It traces the statements in this function's body once (at definition time).
2. It constructs a serialized representation of the computation's logic for future use (e.g., for execution, or for incorporation as a sub-component into another computation).

TFF computations are modeled as functions. While these functions may or may not have parameters, functions defining TFF computations will *always* have well-defined type signatures.

In [73]:
print(compute_average_temperature.type_signature)

({float32}@CLIENTS -> float32@SERVER)


Type signatures are represented as (`T` -> `U`) for types `T` and `U` of inputs and outputs, respectively. The type of the formal parameter, such as `sensor_readings` in this case, is specified as the argument to the decorator. You don't need to specify the type of the result - it's determined automatically.

TFF programmers are strongly encouraged to be explicit about the types of data they work with, as that makes understanding, debugging, and formally verifying properties of your code easier.

## Question?

What does this type signature tell us about the computation?

## Answer!

The type signature tells us that the computation accepts a collection of different sensor readings on client devices and returns a single average on the server.

Before we go any further, let's reflect on this for a minute - the input and output of this computation are in different places (on `CLIENTS` vs. at the `SERVER`). Recall what we said in the preceding section on placements about how TFF operations may span across locations and what we just said about federated computations as representing abstract specifications of distributed systems. We have just defined one such computation - a simple distributed system in which data is consumed at client devices, and the aggregate results emerge at the server.

In many practical scenarios, the computations that represent top-level tasks will tend to accept their inputs and report their outputs at the server - this reflects the idea that computations might be triggered by queries that originate and terminate on the server.

However, FC API does not impose this assumption, and many of the building blocks we use internally (including numerous `tff.federated_*` operators you may find in the API) have inputs and outputs with distinct placements, so in general, you should not think about a federated computation as something that runs on the server or is executed by a server. The server is just one type of participant in a federated computation. In thinking about the mechanics of such computations, it's best to always default to the global network-wide perspective, rather than the perspective of a single centralized coordinator.

In [155]:
# what federated operators are currently available?
[name for name in dir(tff) if name.startswith('federated_')]

## Executing federated computations

In order to support development and debugging, TFF allows you to directly invoke computations defined this way as Python functions, as shown below. Where the computation expects a value of a federated type with the `all_equal=False`, you can feed it as a plain list in Python, and for federated types with the `all_equal=True`, you can just directly feed the (single) member constituent. This is also how the results are reported back to you.

In [23]:
compute_average_temperature([2.3, 4.5, 6.7])

4.5
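
As a rough mental model of what this invocation simulates, the sketch below reproduces the same arithmetic in plain Python: one value per client goes in, and a single value comes out at the server. The function `simulated_federated_mean` is a hypothetical stand-in written for this illustration; it says nothing about how TFF actually executes `tff.federated_mean`.

```python
def simulated_federated_mean(client_values):
    """Average one value per client into a single server-side result."""
    if not client_values:
        raise ValueError("at least one client value is required")
    return sum(client_values) / len(client_values)


# The plain Python list plays the role of a value of type {float32}@CLIENTS.
average = simulated_federated_mean([2.3, 4.5, 6.7])
print(round(average, 4))  # 4.5
```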

You can think of Python code that defines a federated computation similarly to how you would think of Python code that builds a TensorFlow graph in a non-eager context (if you're not familiar with the non-eager uses of TensorFlow, think of your Python code defining a graph of operations to be executed later, but not actually running them on the fly). The non-eager graph-building code in TensorFlow is Python, but the TensorFlow graph constructed by this code is platform-independent and serializable. Likewise, TFF computations are defined in Python, but the Python statements in their bodies, such as `tff.federated_mean` in the example we've just shown, are compiled into a portable and platform-independent serializable representation under the hood.

As a developer, you don't need to concern yourself with the details of this representation, as you will never need to directly work with it, but you should be aware of its existence, the fact that TFF computations are fundamentally non-eager, and *cannot* capture arbitrary Python state. Python code contained in a TFF computation's body is executed at definition time, when the body of the Python function decorated with `tff.federated_computation` is traced before getting serialized. It's not generally retraced again at invocation time.
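
To make the idea of definition-time tracing more tangible, here is a toy sketch in plain Python. It is not how TFF is implemented: the `Symbol` class and the `trace_once` decorator are hypothetical names invented for this illustration, and the "serialized representation" is just a tuple of recorded operations. The sketch only shows the general pattern that the Python body runs once, when the function is defined, and that later changes to Python state are not seen at invocation time.

```python
class Symbol:
    """Placeholder for a value that is only known at invocation time."""

    def __init__(self, ops=()):
        self.ops = tuple(ops)

    def __add__(self, other):
        return Symbol(self.ops + (("add", other),))

    def __mul__(self, other):
        return Symbol(self.ops + (("mul", other),))


def trace_once(fn):
    """Run fn's Python body once, now, and keep only the recorded operations."""
    recorded = fn(Symbol()).ops

    def run(value):
        # Replay the recorded operations; fn's Python body is not called again.
        for name, operand in recorded:
            value = value + operand if name == "add" else value * operand
        return value

    run.recorded = recorded
    return run


trace_log = []
offset = 10


@trace_once
def shift_and_double(x):
    trace_log.append("body executed")  # ordinary Python side effect
    return (x + offset) * 2            # `offset` is read at definition time


offset = 1000  # changing Python state after definition has no effect

result_a = shift_and_double(1)
result_b = shift_and_double(2)
print(result_a, result_b, trace_log)  # 22 24 ['body executed']
```

Even though the function is invoked twice, the side effect happens once, and both results use the value of `offset` that was captured during tracing.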

You may wonder why we've chosen to introduce a dedicated internal non-Python representation. One reason is that ultimately, TFF computations are intended to be deployable to real physical environments, and hosted on mobile or embedded devices, where Python may not be available.

Another reason is that TFF computations express the global behavior of distributed systems, as opposed to Python programs which express the local behavior of individual participants. You can see that in the simple example above, with the special operator `tff.federated_mean` that accepts data on client devices, but deposits the results on the server.

The operator `tff.federated_mean` cannot be easily modeled as an ordinary operator in Python, since it doesn't execute locally - as noted earlier, it represents a distributed system that coordinates the behavior of multiple system participants. We will refer to such operators as federated operators, to distinguish them from ordinary (local) operators in Python.

The TFF type system, and the fundamental set of operations supported in TFF's language, thus deviate significantly from those in Python, necessitating the use of a dedicated representation.

## Composing federated computations

As noted above, federated computations and their constituents are best understood as models of distributed systems, and you can think of composing federated computations as composing more complex distributed systems from simpler ones. You can think of the `tff.federated_mean` operator as a kind of built-in template federated computation with a type signature `{T}@CLIENTS -> T@SERVER`.

The same is true of composing federated computations. The computation `compute_average_temperature` may be invoked in a body of another Python function decorated with `tff.federated_computation` - doing so will cause it to be embedded in the body of the parent, much in the same way `tff.federated_mean` was embedded in its own body earlier.

An important restriction to be aware of is that bodies of Python functions decorated with `tff.federated_computation` must consist only of federated operators: they *cannot* directly contain TensorFlow operations. TensorFlow code must be confined to blocks of code decorated with a `tff.tf_computation` discussed in the following section. Only when wrapped in this manner can the wrapped TensorFlow code be invoked in the body of a `tff.federated_computation`.

# TensorFlow logic

TFF is designed for use with TensorFlow. As such, the bulk of the code you will write in TFF is likely to be ordinary (i.e., locally-executing) TensorFlow code. In order to use such code with TFF, as noted above, it just needs to be decorated with [`tff.tf_computation`](https://www.tensorflow.org/federated/api_docs/python/tff/tf_computation).

In [193]:
tff.tf_computation?

Signature: tff.tf_computation(*args)
Docstring:
Decorates/wraps Python functions and defuns as TFF TensorFlow computations.

This symbol can be used as either a decorator or a wrapper applied to a
function given to it as an argument. The supported patterns and examples of
usage are as follows:

1. Convert an existing function inline into a TFF computation. This is the
   simplest mode of usage, and how one can embed existing non-TFF code for
   use with the TFF framework. In this mode, one invokes
   `tff.tf_computation` with a pair of arguments, the first being a
   function/defun that contains the logic, and the second being the TFF type
   of the parameter:

   ```python
   foo = tff.tf_computation(lambda x: x > 10, tf.int32)
   ```

   After executing the above code snippet, `foo` becomes an instance of the
   abstract base class `Computation`. Like all computations, 

## Declaring TensorFlow computations

Recall that communication between clients and the server is expensive in federated learning, so we may want to minimize the amount of data sent between them. Compression operators that reduce this traffic are an active area of research. Here's how we could implement a naive compression operator using TensorFlow code.

In [196]:
COMPRESSOR_TYPE = tff.TensorType(tf.float32, shape=(None,))

@tff.tf_computation(COMPRESSOR_TYPE)
def top_1_compression(gradients):
    _, indices = tf.math.top_k(tf.abs(gradients), k=1)
    updates = tf.gather(gradients, indices)
    compressed_gradients = tf.tensor_scatter_nd_add(tf.zeros_like(gradients),
                                                    tf.expand_dims(indices, axis=1),
                                                    updates)
    return compressed_gradients


Why does TFF define yet another decorator [`tff.tf_computation`](https://www.tensorflow.org/federated/api_docs/python/tff/tf_computation) instead of simply using an existing mechanism such as [`tf.function`](https://www.tensorflow.org/api_docs/python/tf/function) (note that, unlike in the preceding section, here we are dealing with an ordinary block of TensorFlow code)?

There are a few reasons for this, the full treatment of which goes beyond the scope of this tutorial, but it's worth naming the main two:

* In order to embed reusable building blocks implemented using TensorFlow code in the bodies of federated computations, they need to satisfy certain properties such as getting traced and serialized at definition time, having type signatures, etc. This generally requires some form of a decorator.

* In addition, TFF needs the ability for computations to be able to accept data streams (represented as `tf.data.Dataset`s), such as streams of training example batches in machine learning applications, as either inputs or outputs. This capability currently does not exist in TensorFlow; the `tff.tf_computation` decorator offers partial (and for now still experimental) support for it.

In general, we recommend using TensorFlow's native mechanisms for composition, such as `tf.function`, wherever possible, as the exact manner in which TFF's decorator interacts with eager functions can be expected to evolve.

Now, coming back to the example code snippet above, the computation `top_1_compression` we just defined can be treated by TFF just like any other TFF computation. In particular, it has a TFF type signature.

In [197]:
print(top_1_compression.type_signature)

(float32[?] -> float32[?])


Note that this type signature does not have placements: TensorFlow computations cannot consume or return federated types. We can now use `top_1_compression` as a building block in other computations. For example, here's how you can use the `tff.federated_map` operator to apply `top_1_compression` pointwise to all member constituents of a federated float vector on client devices.

In [195]:
tff.federated_map?

Signature: tff.federated_map(mapping_fn, value)
Docstring:
Maps a federated value pointwise using a mapping function.

The function `mapping_fn` is applied separately across the group of devices
represented by the placement type of `value`. For example, if `value` has
placement type `tff.CLIENTS`, then `mapping_fn` is applied to each client
individually. In particular, this operation does not alter the placement of
the federated value.

Args:
  mapping_fn: A mapping function to apply pointwise to member constituents of
    `value`. The parameter of this function must be of the same type as the
    member constituents of `value`.
  value: A value of a TFF federated type (or a value that can be implicitly
    converted into a TFF federated type, e.g., by zipping) placed at
    `tff.CLIENTS` or `tff.SERVER`.

Returns:
  A federated value with the same placement as `v

In [201]:
@tff.federated_computation(tff.FederatedType(COMPRESSOR_TYPE, tff.CLIENTS))
def federated_top_1_compression(gradients):
    return tff.federated_map(top_1_compression, gradients)


In [202]:
print(federated_top_1_compression.type_signature)

({float32[?]}@CLIENTS -> {float32[?]}@CLIENTS)


## Executing TensorFlow computations

Execution of computations defined with [`tff.tf_computation`](https://www.tensorflow.org/federated/api_docs/python/tff/tf_computation) follows the same rules as those we described for `tff.federated_computation`. They can be invoked as ordinary callables in Python, as follows.

In [198]:
top_1_compression([1, 2, 3, 4, -5])

array([ 0.,  0.,  0.,  0., -5.], dtype=float32)

In [203]:
federated_top_1_compression([[1, 2, 3, 4, -5], [10, 2, 3, 4, -5, 6], [1, 2, 3, 4]])

[<tf.Tensor: shape=(5,), dtype=float32, numpy=array([ 0.,  0.,  0.,  0., -5.], dtype=float32)>,
 <tf.Tensor: shape=(6,), dtype=float32, numpy=array([10.,  0.,  0.,  0.,  0.,  0.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0., 0., 0., 4.], dtype=float32)>]

Once again, it is worth noting that invoking the computation `federated_top_1_compression` in this manner simulates a distributed process where data is consumed on clients and returned to those clients. Put another way, each client performs a local computation; there is no `tff.SERVER` explicitly mentioned in this system (even if in practice, orchestrating such processing might involve one). Think of a computation defined this way as conceptually analogous to the "map" stage in a "map-reduce" computation.
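
The "map" analogy can be sketched in plain Python as a list comprehension over per-client values. Both `top_1_compress` and `simulated_federated_map` below are hypothetical helpers written for this illustration, not TFF APIs; the first mirrors the logic of the TensorFlow compression operator defined earlier (keep the largest-magnitude entry, zero out the rest), and the second applies a function to each client's value independently.

```python
def top_1_compress(gradients):
    """Keep the largest-magnitude entry and zero out all the others."""
    keep = max(range(len(gradients)), key=lambda i: abs(gradients[i]))
    return [float(g) if i == keep else 0.0 for i, g in enumerate(gradients)]


def simulated_federated_map(mapping_fn, client_values):
    """Apply mapping_fn to each client's value; results stay with the clients."""
    return [mapping_fn(value) for value in client_values]


compressed = simulated_federated_map(
    top_1_compress,
    [[1, 2, 3, 4, -5], [10, 2, 3, 4, -5, 6], [1, 2, 3, 4]],
)
for client_result in compressed:
    print(client_result)
# [0.0, 0.0, 0.0, 0.0, -5.0]
# [10.0, 0.0, 0.0, 0.0, 0.0, 0.0]
# [0.0, 0.0, 0.0, 4.0]
```

Note that no value ever crosses from one client's list to another's, which is the property that distinguishes a map from an aggregation such as a mean.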

Also, keep in mind that what we said in the preceding section about TFF computations getting serialized at the definition time remains true for `tff.tf_computation` code as well - the Python body of `federated_top_1_compression` gets traced once at definition time. On subsequent invocations, TFF uses its serialized representation.

The only difference between Python methods decorated with `tff.federated_computation` and those decorated with `tff.tf_computation` is that the latter are serialized as TensorFlow graphs (whereas the former are not allowed to contain TensorFlow code directly embedded in them).

Under the hood, each method decorated with `tff.tf_computation` temporarily disables eager execution in order to allow the computation's structure to be captured. While eager execution is locally disabled, you are welcome to use eager TensorFlow, AutoGraph, TensorFlow 2.0 constructs, etc., so long as you write the logic of your computation in a manner such that it can get correctly serialized.

# Working with tf.data.Datasets

As noted earlier, a unique feature of `tff.tf_computation` is that it allows you to work with `tf.data.Dataset`s defined abstractly as formal parameters by your code. Parameters to be represented in TensorFlow as data sets need to be declared using the `tff.SequenceType` constructor.

For example, the type specification `tff.SequenceType(tf.float32)` defines an abstract sequence of float elements in TFF. Sequences can contain either tensors, or complex nested structures (we'll see examples of those later). The concise representation of a sequence of `T`-typed items is `T*`.

In [47]:
tff.SequenceType(tf.float32)

SequenceType(TensorType(tf.float32))

Suppose that in our temperature sensor example, each sensor holds not just one temperature reading, but multiple. Here's how you can define a TFF computation in TensorFlow that calculates the average of temperatures in a single local data set using the `tf.data.Dataset.reduce` operator.

In [204]:
@tff.tf_computation(tff.SequenceType(tf.float32))
def compute_local_average(temperatures):
    total, n_obs = temperatures.reduce((0.0, 0.0), lambda x, y: (x[0] + y, x[1] + 1))
    return total / n_obs


@tff.tf_computation(tff.SequenceType(tf.float32))
def compute_number_obs(temperatures):
    size = temperatures.reduce(0.0, lambda x, _: x + 1)
    return size

In [205]:
compute_local_average([1., 2., 3., 4., 5.])

3.0

In [206]:
compute_number_obs([1., 2., 3., 4., 5.])

5.0
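
If the `reduce` pattern is unfamiliar, the following sketch expresses the same two folds with `functools.reduce` from the Python standard library. The function names are hypothetical and chosen for this illustration; a plain list stands in for the sequence of readings. Note the argument order differs from `tf.data.Dataset.reduce`: here the iterable comes second and the initial state last.

```python
from functools import reduce


def local_average(temperatures):
    """Fold the readings into a (running total, count) pair, then divide."""
    total, count = reduce(
        lambda state, reading: (state[0] + reading, state[1] + 1.0),
        temperatures,
        (0.0, 0.0),  # initial state: nothing summed, nothing counted
    )
    return total / count


def number_of_observations(temperatures):
    """Count the readings by adding one per element, ignoring its value."""
    return reduce(lambda count, _: count + 1.0, temperatures, 0.0)


readings = [1.0, 2.0, 3.0, 4.0, 5.0]
print(local_average(readings))           # 3.0
print(number_of_observations(readings))  # 5.0
```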

In the body of a method decorated with `tff.tf_computation`, formal parameters of a TFF sequence type are represented simply as objects that behave like `tf.data.Dataset`, i.e., support the same properties and methods (they are currently not implemented as subclasses of that type - this may change as the support for data sets in TensorFlow evolves).

In [207]:
print(compute_local_average.type_signature)

(float32* -> float32)


# Putting it all together

Now, let's try again to use our TensorFlow computations in a federated setting. Suppose we have a group of sensors that each have a local sequence of temperature readings. We can compute the global temperature average by averaging the sensors' local averages, weighted by the number of readings each sensor holds, as follows.

In [42]:
@tff.federated_computation(tff.FederatedType(tff.SequenceType(tf.float32), tff.CLIENTS))
def federated_local_average(sensor_readings):
    return tff.federated_map(compute_local_average, sensor_readings)


@tff.federated_computation(tff.FederatedType(tff.SequenceType(tf.float32), tff.CLIENTS))
def federated_number_obs(sensor_readings):
    return tff.federated_map(compute_number_obs, sensor_readings)


@tff.federated_computation(tff.FederatedType(tff.SequenceType(tf.float32), tff.CLIENTS))
def federated_global_average(sensor_readings):
    weights = federated_number_obs(sensor_readings)
    return tff.federated_mean(federated_local_average(sensor_readings), weights)


Also note that the input to `federated_global_average` is now a federated float sequence. Federated sequences are how we will typically represent on-device data in federated learning, with sequence elements typically representing data batches (you will see examples of this shortly).

In [43]:
print(federated_local_average.type_signature)

({float32*}@CLIENTS -> {float32}@CLIENTS)


In [45]:
print(federated_number_obs.type_signature)

({float32*}@CLIENTS -> {float32}@CLIENTS)


In [46]:
print(federated_global_average.type_signature)

({float32*}@CLIENTS -> float32@SERVER)


In [47]:
federated_global_average([[68.0, 70.0], [71.0], [68.0, 72.0, 70.0]])

69.833336

In [48]:
# quick check by hand
(2 * 69 + 71 + 3 * 70) / 6

69.83333333333333
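
The hand check above can also be written out step by step in plain Python. The sketch below uses hypothetical helper names and is only meant to show why the weights matter: weighting each local average by the number of readings behind it gives the same answer as pooling all readings and averaging them directly, whereas an unweighted mean of the local averages would not.

```python
import math


def local_average(readings):
    """Average of the readings held by a single sensor."""
    return sum(readings) / len(readings)


def weighted_mean(values, weights):
    """Mean of values in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


client_data = [[68.0, 70.0], [71.0], [68.0, 72.0, 70.0]]

local_averages = [local_average(r) for r in client_data]  # [69.0, 71.0, 70.0]
weights = [float(len(r)) for r in client_data]             # [2.0, 1.0, 3.0]

global_average = weighted_mean(local_averages, weights)
pooled_average = local_average([x for r in client_data for x in r])
unweighted_average = local_average(local_averages)

print(math.isclose(global_average, pooled_average))  # True
print(unweighted_average)                            # 70.0
```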