From 93066043e16c1fc6754fcdf5df1f7a7275a1062d Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Fri, 19 Jun 2020 01:27:57 +0300
Subject: [PATCH 1/7] Add troubleshooting guide for TF session

---
 docs/summary.md                                        |  1 +
 .../tf-session-in-different-thread.md                  | 56 +++++++++++++++++++
 2 files changed, 57 insertions(+)
 create mode 100644 docs/troubleshooting/tf-session-in-different-thread.md

diff --git a/docs/summary.md b/docs/summary.md
index e4652889e6..45781f8258 100644
--- a/docs/summary.md
+++ b/docs/summary.md
@@ -45,6 +45,7 @@
 * [API is stuck updating](troubleshooting/stuck-updating.md)
 * [NVIDIA runtime not found](troubleshooting/nvidia-container-runtime-not-found.md)
+* [TF session called in predict method](troubleshooting/tf-session-in-different-thread.md)

 ## Guides

diff --git a/docs/troubleshooting/tf-session-in-different-thread.md b/docs/troubleshooting/tf-session-in-different-thread.md
new file mode 100644
index 0000000000..42c93ff4b1
--- /dev/null
+++ b/docs/troubleshooting/tf-session-in-different-thread.md
@@ -0,0 +1,56 @@
+# TensorFlow session called in predict method
+
+_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
+
+## Context
+
+When doing inferences with TensorFlow using the Python Predictor, it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
+
+The error you will get as a consequence of having run the 2 methods (constructor and `predict` method) in different threads is:
+`
+TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(1, ?), dtype=int32) is not an element of this graph.
+` + +## Use session.graph.as_default() + +For this error to be avoided, you need to set the default graph and session right before running the prediction in the `predict` method: +```python +def predict(self, payload): + with self.sess.graph.as_default(): + # your implementation of predict_on_session + self.prediction = predict_on_sess(sess, payload) + return self.prediction +``` + +## Is it slow? + +It has been observed that calling `self.session.graph.as_default()` takes about *0.4 microseconds* on average. This works for any value of `threads_per_worker`. + +The following only applies when `threads_per_worker` is set to 1. Use this when you want to have a minimal computational cost. For this, you can have a separate method that loads the default session and graph once for the given thread. Here's one approach: +```python +class PythonPredictor: + def __init__(self, config): + self.sess = tf.Session(...) + self.default_graph_loaded = False + + def load_graph_if_not_present(self): + """ + Placeholder for "with self.sess.graph.as_default()" + + Relevant sources of information: + https://stackoverflow.com/a/49468139/2096747 + https://stackoverflow.com/a/57397201/2096747 + https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/client/session.py#L1591-L1601 + """ + if not self.default_graph_loaded: + self.sess.__enter__() + self.default_graph_loaded = True + + def predict(self, payload): + self.load_graph_if_not_present() + # your implementation of predict_on_session + prediction = predict_on_sess(self.sess, payload) + return prediction +``` + +The above implementation calls `__enter__()` method only once and after that it only checks if a flag is set. The computational cost for this is minimal, and therefore, a higher throughput could be achieved. This implementation is about 2 orders of magnitude faster than the other one. 
\ No newline at end of file

From e98639d75d60436806c4d83cc8630c0d06bbfa2a Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Fri, 19 Jun 2020 01:31:02 +0300
Subject: [PATCH 2/7] Add spaces before the indented code

---
 docs/troubleshooting/tf-session-in-different-thread.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/troubleshooting/tf-session-in-different-thread.md b/docs/troubleshooting/tf-session-in-different-thread.md
index 42c93ff4b1..d0e9cbbfdc 100644
--- a/docs/troubleshooting/tf-session-in-different-thread.md
+++ b/docs/troubleshooting/tf-session-in-different-thread.md
@@ -4,7 +4,7 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t
 
 ## Context
 
-When doing inferences with TensorFlow using the Python Predictor, it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
+When doing inferences with TensorFlow using the Python Predictor, it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
 
 The error you will get as a consequence of having run the 2 methods (constructor and `predict` method) in different threads is:
 `
@@ -15,6 +15,7 @@ TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:
 For this error to be avoided, you need to set the default graph and session right before running the prediction in the `predict` method:
 ```python
+
 def predict(self, payload):
     with self.sess.graph.as_default():
         # your implementation of predict_on_session
         self.prediction = predict_on_sess(sess, payload)
     return self.prediction
 ```
@@ -27,6 +28,7 @@
 It has been observed that calling `self.session.graph.as_default()` takes about *0.4 microseconds* on average. This works for any value of `threads_per_worker`.
 
 The following only applies when `threads_per_worker` is set to 1. Use this when you want to have a minimal computational cost. For this, you can have a separate method that loads the default session and graph once for the given thread. Here's one approach:
+
 ```python
 class PythonPredictor:
     def __init__(self, config):
@@ -53,4 +55,4 @@ class PythonPredictor:
     return prediction
 ```
 
-The above implementation calls `__enter__()` method only once and after that it only checks if a flag is set. The computational cost for this is minimal, and therefore, a higher throughput could be achieved. This implementation is about 2 orders of magnitude faster than the other one.
\ No newline at end of file
+The above implementation calls `__enter__()` method only once and after that it only checks if a flag is set. The computational cost for this is minimal, and therefore, a higher throughput could be achieved. This implementation is about 2 orders of magnitude faster than the other one.
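The flag-guarded `load_graph_if_not_present` optimization that patches 1 and 2 document can be exercised without TensorFlow installed. The sketch below is illustrative only: `FakeSession` is a hypothetical stand-in for `tf.Session` (which is likewise a context manager whose `__enter__()` installs the thread's default session and graph); here it merely counts how often it is entered, to show that the guard pays the context-manager cost exactly once across many `predict` calls.

```python
# Hypothetical stand-in for tf.Session: any context manager whose
# __enter__() installs thread-level default state behaves the same way
# with respect to the flag guard shown in the guide.
class FakeSession:
    def __init__(self):
        self.enter_calls = 0

    def __enter__(self):
        self.enter_calls += 1  # tf.Session would set the default session/graph here
        return self

    def __exit__(self, *exc):
        return False


class PythonPredictor:
    def __init__(self, config):
        self.sess = FakeSession()
        self.default_graph_loaded = False

    def load_graph_if_not_present(self):
        # Enter the session once; afterwards only a boolean is checked,
        # which is why this is cheaper than re-entering a context manager
        # on every request. Note: a plain bool flag is only safe with a
        # single serving thread, as the guide states.
        if not self.default_graph_loaded:
            self.sess.__enter__()
            self.default_graph_loaded = True

    def predict(self, payload):
        self.load_graph_if_not_present()
        # placeholder for predict_on_sess(self.sess, payload)
        return payload


predictor = PythonPredictor(config={})
for i in range(1000):
    predictor.predict(i)
print(predictor.sess.enter_calls)  # 1 — entered exactly once across 1000 calls
```

The same structure applies to the real `tf.Session`: `sess.__enter__()` is what `with sess:` calls under the hood, so deferring the exit and guarding with a flag trades proper cleanup for per-request speed.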
From 1ba3d9ea84ae1224143341aced201de100c851a8 Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Fri, 19 Jun 2020 01:34:02 +0300
Subject: [PATCH 3/7] Add another hyperlink to PythonPredictor's docs

---
 docs/troubleshooting/tf-session-in-different-thread.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/troubleshooting/tf-session-in-different-thread.md b/docs/troubleshooting/tf-session-in-different-thread.md
index d0e9cbbfdc..8222961c07 100644
--- a/docs/troubleshooting/tf-session-in-different-thread.md
+++ b/docs/troubleshooting/tf-session-in-different-thread.md
@@ -4,7 +4,7 @@ _WARNING: you are on the master branch, please refer to the docs on the branch t
 
 ## Context
 
-When doing inferences with TensorFlow using the Python Predictor, it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
+When doing inferences with TensorFlow using the [Python Predictor](../deployments/predictors.md#python-predictor), it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
 
 The error you will get as a consequence of having run the 2 methods (constructor and `predict` method) in different threads is:
 `

From 69ea1d520c01ddaf98a50a6da1a8fc4037297ec7 Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Fri, 19 Jun 2020 01:50:33 +0300
Subject: [PATCH 4/7] Change filename of troubleshooting page

---
 docs/summary.md                                                 | 2 +-
 ...n-in-different-thread.md => tf-session-in-predict-method.md} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename docs/troubleshooting/{tf-session-in-different-thread.md => tf-session-in-predict-method.md} (100%)

diff --git a/docs/summary.md b/docs/summary.md
index 45781f8258..7094dddf78 100644
--- a/docs/summary.md
+++ b/docs/summary.md
@@ -45,7 +45,7 @@
 * [API is stuck updating](troubleshooting/stuck-updating.md)
 * [NVIDIA runtime not found](troubleshooting/nvidia-container-runtime-not-found.md)
-* [TF session called in predict method](troubleshooting/tf-session-in-different-thread.md)
+* [TF session called in predict method](troubleshooting/tf-session-in-predict-method.md)

 ## Guides

diff --git a/docs/troubleshooting/tf-session-in-different-thread.md b/docs/troubleshooting/tf-session-in-predict-method.md
similarity index 100%
rename from docs/troubleshooting/tf-session-in-different-thread.md
rename to docs/troubleshooting/tf-session-in-predict-method.md

From df4c25252a35292e1c214142e9dcd66cf473eb06 Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Wed, 24 Jun 2020 07:49:55 +0300
Subject: [PATCH 5/7] Edit troubleshooting to include #1146

---
 .../tf-session-in-predict-method.md | 42 ++++---------------
 1 file changed, 7 insertions(+), 35 deletions(-)

diff --git a/docs/troubleshooting/tf-session-in-predict-method.md b/docs/troubleshooting/tf-session-in-predict-method.md
index 8222961c07..72d85adcde 100644
--- a/docs/troubleshooting/tf-session-in-predict-method.md
+++ b/docs/troubleshooting/tf-session-in-predict-method.md
@@ -11,9 +11,15 @@ The error you will get as a consequence of having run the 2 methods (constructor and `predict` method) in different threads is:
 TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(1, ?), dtype=int32) is not an element of this graph.
 `
 
+## Use one-threaded processes
+
+When `threads_per_process` is set to 1 and `processes_per_replica` >= 1, both the constructor and the `predict` method of Python Predictor run on the same thread, meaning that the above error won't occur anymore. The downside to this is that the API replica is limited to 1 thread per process.
+
+For it to work with multiple threads, check the following section.
+
 ## Use session.graph.as_default()
 
-For this error to be avoided, you need to set the default graph and session right before running the prediction in the `predict` method:
+For this error to be avoided on any number of threads, you need to set the default graph and session right before running the prediction in the `predict` method:
 ```python
 
 def predict(self, payload):
@@ -22,37 +28,3 @@ def predict(self, payload):
     with self.sess.graph.as_default():
         # your implementation of predict_on_session
         self.prediction = predict_on_sess(sess, payload)
     return self.prediction
 ```
-
-## Is it slow?
-
-It has been observed that calling `self.session.graph.as_default()` takes about *0.4 microseconds* on average. This works for any value of `threads_per_worker`.
-
-The following only applies when `threads_per_worker` is set to 1. Use this when you want to have a minimal computational cost. For this, you can have a separate method that loads the default session and graph once for the given thread. Here's one approach:
-
-```python
-class PythonPredictor:
-    def __init__(self, config):
-        self.sess = tf.Session(...)
-        self.default_graph_loaded = False
-
-    def load_graph_if_not_present(self):
-        """
-        Placeholder for "with self.sess.graph.as_default()"
-
-        Relevant sources of information:
-        https://stackoverflow.com/a/49468139/2096747
-        https://stackoverflow.com/a/57397201/2096747
-        https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/client/session.py#L1591-L1601
-        """
-        if not self.default_graph_loaded:
-            self.sess.__enter__()
-            self.default_graph_loaded = True
-
-    def predict(self, payload):
-        self.load_graph_if_not_present()
-        # your implementation of predict_on_session
-        prediction = predict_on_sess(self.sess, payload)
-        return prediction
-```
-
-The above implementation calls `__enter__()` method only once and after that it only checks if a flag is set. The computational cost for this is minimal, and therefore, a higher throughput could be achieved. This implementation is about 2 orders of magnitude faster than the other one.

From c569d98bce618188c3ed9c9a22ac32afa2d08937 Mon Sep 17 00:00:00 2001
From: David Eliahu
Date: Thu, 25 Jun 2020 11:56:23 -0700
Subject: [PATCH 6/7] Update docs

---
 docs/summary.md                               |  2 +-
 .../tf-session-in-predict-method.md           | 30 -------------------
 docs/troubleshooting/tf-session-in-predict.md | 20 +++++++++++++
 3 files changed, 21 insertions(+), 31 deletions(-)
 delete mode 100644 docs/troubleshooting/tf-session-in-predict-method.md
 create mode 100644 docs/troubleshooting/tf-session-in-predict.md

diff --git a/docs/summary.md b/docs/summary.md
index a91835e291..10c2b8bdea 100644
--- a/docs/summary.md
+++ b/docs/summary.md
@@ -47,7 +47,7 @@
 * [API is stuck updating](troubleshooting/stuck-updating.md)
 * [NVIDIA runtime not found](troubleshooting/nvidia-container-runtime-not-found.md)
-* [TF session called in predict method](troubleshooting/tf-session-in-predict-method.md)
+* [TF session in predict()](troubleshooting/tf-session-in-predict.md)

 ## Guides

diff --git a/docs/troubleshooting/tf-session-in-predict-method.md b/docs/troubleshooting/tf-session-in-predict-method.md
deleted file mode 100644
index 72d85adcde..0000000000
--- a/docs/troubleshooting/tf-session-in-predict-method.md
+++ /dev/null
@@ -1,30 +0,0 @@
-# TensorFlow session called in predict method
-
-_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
-
-## Context
-
-When doing inferences with TensorFlow using the [Python Predictor](../deployments/predictors.md#python-predictor), it has to be noted that Python Predictor constructor and its `predict` method run on different threads. This means that when the program enters the `predict` method, the current session which has presumably been saved as an attribute inside the [`PythonPredictor`](../deployments/predictors.md#python-predictor) object won't point to the default graph the session has been initialized with.
-
-The error you will get as a consequence of having run the 2 methods (constructor and `predict` method) in different threads is:
-`
-TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(1, ?), dtype=int32) is not an element of this graph.
-`
-
-## Use one-threaded processes
-
-When `threads_per_process` is set to 1 and `processes_per_replica` >= 1, both the constructor and the `predict` method of Python Predictor run on the same thread, meaning that the above error won't occur anymore. The downside to this is that the API replica is limited to 1 thread per process.
-
-For it to work with multiple threads, check the following section.
-
-## Use session.graph.as_default()
-
-For this error to be avoided on any number of threads, you need to set the default graph and session right before running the prediction in the `predict` method:
-```python
-
-def predict(self, payload):
-    with self.sess.graph.as_default():
-        # your implementation of predict_on_session
-        self.prediction = predict_on_sess(sess, payload)
-    return self.prediction
-```
diff --git a/docs/troubleshooting/tf-session-in-predict.md b/docs/troubleshooting/tf-session-in-predict.md
new file mode 100644
index 0000000000..683e032796
--- /dev/null
+++ b/docs/troubleshooting/tf-session-in-predict.md
@@ -0,0 +1,20 @@
+# Using TensorFlow session in predict method
+
+_WARNING: you are on the master branch, please refer to the docs on the branch that matches your `cortex version`_
+
+When doing inferences with TensorFlow using the [Python Predictor](../deployments/predictors.md#python-predictor), it should be noted that your Python Predictor's `__init__()` constructor is only called on one thread, whereas its `predict()` method can run on any of the available threads (which is configured via the `threads_per_process` field in the API's `predictor` configuration). If `threads_per_process` is set to `1` (the default value), then there is no concern, since `__init__()` and `predict()` will run on the same thread. However, if `threads_per_process` is greater than `1`, then only one of the inference threads will have executed the `__init__()` function. This can cause issues with TensorFlow because the default graph is a property of the current thread, so if `__init__()` initializes the TensorFlow graph, only the thread that executed `__init__()` will have the default graph set.
+
+The error you may see if the default graph is not set (as a consequence of `__init__()` and `predict()` running in separate threads) is:
+
+```text
+TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(1, ?), dtype=int32) is not an element of this graph.
+```
+
+To avoid this error, you can set the default graph and session before running the prediction in the `predict()` method:
+
+```python
+
+def predict(self, payload):
+    with self.sess.graph.as_default():
+        # perform your inference here
+```

From 5b3dd03d34e7fb7530dd24044032e949b660e842 Mon Sep 17 00:00:00 2001
From: Robert Lucian Chiriac
Date: Thu, 25 Jun 2020 22:37:18 +0300
Subject: [PATCH 7/7] Remove session word in troubleshooting guide

---
 docs/troubleshooting/tf-session-in-predict.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/troubleshooting/tf-session-in-predict.md b/docs/troubleshooting/tf-session-in-predict.md
index 683e032796..fe724927a3 100644
--- a/docs/troubleshooting/tf-session-in-predict.md
+++ b/docs/troubleshooting/tf-session-in-predict.md
@@ -10,7 +10,7 @@ The error you may see if the default graph is not set (as a consequence of `__in
 TypeError: Cannot interpret feed_dict key as Tensor: Tensor Tensor("Placeholder:0", shape=(1, ?), dtype=int32) is not an element of this graph.
 ```
 
-To avoid this error, you can set the default graph and session before running the prediction in the `predict()` method:
+To avoid this error, you can set the default graph before running the prediction in the `predict()` method:
 
 ```python
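The root cause described in the final version of the guide — the default graph being a property of the current thread — can be demonstrated with plain `threading.local()`, no TensorFlow required. This is an illustrative sketch: `Predictor`, `_default`, and `_default.graph` are hypothetical names standing in for TF1's thread-local default-graph mechanism, not part of Cortex or TensorFlow.

```python
import threading

# Thread-local storage models how TF1 tracks the "default graph":
# it is per-thread state, not per-object state.
_default = threading.local()


class Predictor:
    def __init__(self):
        # Set only on the thread that runs the constructor,
        # just like entering a tf.Session in __init__().
        _default.graph = "my-graph"

    def predict(self):
        # Returns None on any thread that never set the attribute.
        return getattr(_default, "graph", None)


p = Predictor()
print(p.predict())  # my-graph — same thread as __init__()

results = []
t = threading.Thread(target=lambda: results.append(p.predict()))
t.start()
t.join()
print(results[0])  # None — the worker thread never saw __init__()'s state
```

This mirrors why `with self.sess.graph.as_default():` inside `predict()` fixes the error: it re-installs the graph as the *current* thread's default instead of relying on the state that `__init__()` installed on a different thread.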