These notes are based on the [tensorflow probability](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/Ch2_MorePyMC_TFP.ipynb) notebook.

### tensorflow-probability
- tf 2.2 and tfp 0.10 are used as the baseline versions.
- Data visualization is implemented with plotly.

In [1]:
## Basics
from __future__ import absolute_import, division, print_function
warning_status = "ignore" #@param ["ignore", "always", "module", "once", "default", "error"]
import warnings
warnings.filterwarnings(warning_status)
with warnings.catch_warnings():
    warnings.filterwarnings(warning_status, category=DeprecationWarning)
    warnings.filterwarnings(warning_status, category=UserWarning)
    
## python packages
import os
import numpy as np

## visualization packages
import plotly.express as px
import plotly.graph_objs as go
import plotly.figure_factory as ff
from plotly.offline import iplot

## import tensorflow
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

## Color map
class _TFColor(object):
    """Enum of colors used in TF docs."""
    red = '#F15854'
    blue = '#5DA5DA'
    orange = '#FAA43A'
    green = '#60BD68'
    pink = '#F17CB0'
    brown = '#B2912F'
    purple = '#B276B2'
    yellow = '#DECF3F'
    gray = '#4D4D4D'
    def __getitem__(self, i):
        return [
            self.red,
            self.orange,
            self.green,
            self.blue,
            self.pink,
            self.brown,
            self.purple,
            self.yellow,
            self.gray,
        ][i % 9]
TFColor = _TFColor()

print(tf.__version__)
print(tfp.__version__)

2.2.0
0.10.0


**Usage of XLA(Accelerated Linear Algebra)**

- XLA is a domain-specific compiler for linear algebra that optimizes Tensorflow computations.
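- To use it, apply the `tf.function` decorator shown below to a Python function that performs tensor operations. In tf 2.2 the argument is named `experimental_compile`; tf 2.5 and later call it `jit_compile`.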
```python
@tf.function(experimental_compile=True)
def train_mnist(images, labels):
    images, labels = cast(images, labels)

    with tf.GradientTape() as tape:
        predicted_labels = layer(images)
        loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=predicted_labels, labels=labels
        ))
    layer_variables = layer.trainable_variables
    grads = tape.gradient(loss, layer_variables)
    optimizer.apply_gradients(zip(grads, layer_variables))
```
- [XLA performance experiment on the MNIST example](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/g3doc/tutorials/compile.ipynb)
    - `TRAIN_STEPS=10000` on a 2019 MacBook Air
    - tf2 $\longrightarrow$ 97 seconds
    - tf2 with xla $\longrightarrow$ 37 seconds


---

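# 2.1 Introduction
The original chapter is reorganized here around tensorflow-probability.
## 2.1.1 Parent and child relationships
In Bayesian probability theory, the relationships between random variables can be described as parent-child relationships.
- A **parent variable** is a random variable that influences another random variable.
- A **child variable** is a random variable that is influenced by another random variable; that is, it is dependent on its parent variable.
- Any variable can be a parent variable and a child variable at the same time.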

### Relationships between random variables

In [2]:
rv_lambda_ = tfd.Exponential(rate=1., name='poisson_param')
lambda_ = rv_lambda_.sample()
rv_data_generator = tfd.Poisson(lambda_, name='data_generator')
data_generator = rv_data_generator.sample()
print("Value of sample from data generator random variable\n\n", data_generator)
print()
data_plus_one = data_generator + 1
print("data_generator plus one\n\n", data_plus_one)

Value of sample from data generator random variable

 tf.Tensor(1.0, shape=(), dtype=float32)

data_generator plus one

 tf.Tensor(2.0, shape=(), dtype=float32)


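- `rv_lambda_` $\longrightarrow$ `rv_data_generator`
    - `rv_lambda_` determines the parameter of `rv_data_generator`, so it is the parent variable of `rv_data_generator`.
    - Conversely, `rv_data_generator` is a child variable of `rv_lambda_`.

- Open question: can the relationship `rv_data_generator` $\longrightarrow$ `data_plus_one` be defined as a `tfp.distributions` object?
- Open question: can tfp explicitly express parent-child relationships between random variables?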

## 2.1.2 The variables of TensorFlow Probability

### TFP Distributions

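- stochastic variables
    - Stochastic random variables are represented with the distribution classes in `tfp.distributions`.
    - Examples include `Poisson`, `Uniform`, and `Exponential`.
- deterministic variables
    - Values sampled from stochastic variables, and anything computed from them; once their inputs are known, they are no longer random.
    - In tfp these are `tf.Tensor` objects.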

#### Initializing a Distribution (stochastic)

In [4]:
my_distribution = tfd.Uniform(low=0., high=4.)

In [18]:
my_distribution.sample()

<tf.Tensor: shape=(), dtype=float32, numpy=1.3115563>

#### Initializing a Distribution (deterministic)

In [19]:
lambda_1 = tfd.Exponential(rate=1., name="lambda_1") #stochastic variable
lambda_2 = tfd.Exponential(rate=1., name="lambda_2") #stochastic variable
tau = tfd.Uniform(name="tau", low=0., high=10.) #stochastic variable

# deterministic variable since we are getting results of lambda's after sampling    
new_deterministic_variable = tfd.Deterministic(name="deterministic_variable", 
                                               loc=(lambda_1.sample() + lambda_2.sample()))

In [29]:
new_deterministic_variable.sample()

<tf.Tensor: shape=(), dtype=float32, numpy=2.0690525>

The use of the deterministic variable was seen in the previous chapter's text-message example.  Recall the model for $\lambda$ looked like: 

$$
\lambda = 
\begin{cases}\lambda_1  & \text{if } t \lt \tau \cr
\lambda_2 & \text{if } t \ge \tau
\end{cases}
$$

And in TFP code:

In [32]:
# Number of days (Chapter 1 had ~70 data points)
n_data_points = 5
idx = np.arange(n_data_points, dtype=np.float32)

# For each day t, select lambda_1 if t < tau and lambda_2 if t >= tau
rv_lambda_deterministic = tfd.Deterministic(
    tf.gather([lambda_1.sample(), lambda_2.sample()],
              indices=tf.cast(tau.sample() <= idx, dtype=tf.int32)))
lambda_deterministic = rv_lambda_deterministic.sample()

# TF 2.x executes eagerly, so the tensor converts to NumPy directly
lambda_deterministic_ = lambda_deterministic.numpy()

# Show results
print("{} samples from our deterministic lambda model: \n".format(n_data_points), lambda_deterministic_)

Clearly, if $\tau, \lambda_1$ and $\lambda_2$ are known, then $\lambda$ is known completely, hence it is a deterministic variable. We use indexing here to switch from $\lambda_1$ to $\lambda_2$ at the appropriate time. 
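To see the indexing with concrete numbers, the sketch below fixes hypothetical values $\lambda_1 = 1$, $\lambda_2 = 3$, and $\tau = 2.5$ (chosen only for illustration) and follows the piecewise definition above: index 0 selects $\lambda_1$ for days before $\tau$, and index 1 selects $\lambda_2$ from $\tau$ onward.

```python
import tensorflow as tf

# Hypothetical known values, chosen only for illustration.
lambda_1, lambda_2, tau = 1., 3., 2.5

days = tf.range(5, dtype=tf.float32)       # t = 0, 1, 2, 3, 4
indices = tf.cast(days >= tau, tf.int32)   # 0 while t < tau, 1 once t >= tau
lambda_ = tf.gather([lambda_1, lambda_2], indices)

print(indices.numpy())  # [0 0 0 1 1]
print(lambda_.numpy())  # [1. 1. 1. 3. 3.]
```

With all three inputs fixed, the output never changes between runs, which is what makes $\lambda$ deterministic.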