diff --git a/sdk/cosmos/azure-cosmos/docs/ClientRetryPolicy.png b/sdk/cosmos/azure-cosmos/docs/ClientRetryPolicy.png
index 3afd18b7c0977..1ba9a6d18b21c 100644
Binary files a/sdk/cosmos/azure-cosmos/docs/ClientRetryPolicy.png and b/sdk/cosmos/azure-cosmos/docs/ClientRetryPolicy.png differ
diff --git a/sdk/cosmos/azure-cosmos/docs/ErrorCodesAndRetries.md b/sdk/cosmos/azure-cosmos/docs/ErrorCodesAndRetries.md
index ee6f0ceb3f0d1..0b3bde6b62743 100644
--- a/sdk/cosmos/azure-cosmos/docs/ErrorCodesAndRetries.md
+++ b/sdk/cosmos/azure-cosmos/docs/ErrorCodesAndRetries.md
@@ -1,16 +1,18 @@
## Cosmos DB Java SDK – Detailing Exceptions and Retries
-| Status code | Cause of exception and retry behavior |
-| :--- | :--- |
-| 400 | For all operations:
- This exception is encountered when the request is invalid, which could be for any of the following reasons:
- Syntax error in query text
- Malformed JSON document for a write request
- Incorrectly formatted REST API request body etc.
- The client does NOT retry the request when a Bad Request (400) exception is thrown by the server.
|
-| 401 | For all operations: - This is an unauthorized exception due to invalid auth tokens being used for the request. The client does NOT retry requests when this exception is encountered.
|
-| 403 | For all operations: - This is a forbidden exception due to invalid permissions and the client does NOT retry requests when a 403 is encountered.
|
+| Status code | Cause of exception and retry behavior |
+|:------------| :--- |
+| 400 | For all operations: - This exception is encountered when the request is invalid, which could be for any of the following reasons:
- Syntax error in query text
- Malformed JSON document for a write request
 - Incorrectly formatted REST API request body, etc.
- The client does NOT retry the request when a Bad Request (400) exception is thrown by the server.
|
+| 401 | For all operations: - This is an unauthorized exception due to invalid auth tokens being used for the request. The client does NOT retry requests when this exception is encountered.
|
+| 403/3 (Write Forbidden) | - For Write Operations:
 - The client will refresh the database account information and then retry.
|
+| 403/1008 (Database Account Not Found) | - For Read Operations:
 - The client will refresh the database account information and then retry.
|
+| 403/Others | For all operations: - This is a forbidden exception due to insufficient permissions, and the client does NOT retry requests when a 403 is encountered.
|
| 404/1002 (Read Session Not Available) | - For Write Operations:
 - For a **single-write region** account, this exception is only applicable to read operations.
 - For a **multi-write region** account, the SDK will retry a few times in the same region, then retry in the other regions.
- For Query Operations:
- When using Session Consistency:
 - The Cosmos DB SDK does retry the read request against a second replica for the partition (in the same region) with the specified session token. - If the second replica also throws a 404:
- If there are additional regions for the Cosmos DB account:
- For a single-write region account, the client retries the request against the write region for the account (if the first request targeted the read region). The client will follow the same path of targeting 1 replica followed by a retry against another replica if the read against the first replica resulted in a 404.
- If all regions have 2 replicas that returned a 404, the exception is bubbled back up to the calling code.
- When using all other Consistency Levels:
- N/A as a query operation will return an empty set instead of a Resource Not Found (404) exception.
- For Point Read Operations:
- When using Eventual Consistency:
- The SDK sends the read request to a single replica for the partition. If the replica does not contain the data, a 404 (Resource Not Found) exception is thrown.
- The Cosmos DB SDK does not retry this exception and bubbles the exception to the calling code. This is because Eventual consistency favors latency over data consistency.
- When using Session Consistency:
- The SDK sends the read request to a single replica for the partition along with a session token. If the replica does not contain data that is more recent than the specified session token, a 404 (Resource Not Found) exception is thrown.
- The Cosmos DB SDK does retry the read request against a second replica for the partition (in the same region) with the specified session token.
- If the second replica also throws a 404:
- If there are additional regions for the Cosmos DB account:
- For a single-write region account, the client retries the request against the write region for the account (if the first request targeted the read region). The client will follow the same path of targeting 1 replica followed by a retry against another replica if the read against the first replica resulted in a 404.
- If all regions have 2 replicas that returned a 404, the exception is bubbled back up to the calling code.
- When using Bounded Staleness and Strong Consistency:
- The SDK sends the read request to 2 replicas for the partition in the specified region.
 - If one of the 2 replicas returns a 404, data from the other replica is returned by the SDK.
- If both replicas return a 404, the exception is bubbled up to the application.
 - The Cosmos DB SDK does NOT retry the read against a remote region if both replicas in the specified region return a 404 for the point read operation (a point-read handling sketch follows this row).
|
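For illustration, a minimal point-read sketch showing where the 404 described above surfaces to the application once the SDK's internal retries are exhausted. The `container` parameter and the `MyItem` POJO are hypothetical placeholders, not part of this document:

```java
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.CosmosException;
import com.azure.cosmos.models.CosmosItemResponse;
import com.azure.cosmos.models.PartitionKey;

// MyItem is a hypothetical POJO standing in for the application's item type.
public MyItem readOrNull(CosmosContainer container, String id, String pk) {
    try {
        CosmosItemResponse<MyItem> response =
            container.readItem(id, new PartitionKey(pk), MyItem.class);
        return response.getItem();
    } catch (CosmosException e) {
        if (e.getStatusCode() == 404) {
            // The SDK has already retried per this row (a second replica, and
            // remote regions for session reads) before surfacing the 404.
            return null;
        }
        throw e; // other status codes follow their own rows in this table
    }
}
```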
-| 408 | - For Write Operations:
- Timeout exceptions can be encountered by both the client as well as the server.
- Client-side timeouts are manifested internally as 410 exceptions due to intermittent network connectivity issues and follow the path specified below in the section detailing 410 status codes.
- Server-side timeout exceptions are not retried for write operations as it is not possible to determine if the write was in fact successfully committed on the server. To avoid overwriting a previously committed write, the client does NOT retry this operation and bubbles up the exception to the application as a Request Timeout Exception (408).
- For a client-generate timeout exception, there are two scenarios:
- The request was sent over the wire to the server by the client, but the network request timeout exceeded, while waiting for a response. In this case, it is unclear if the write was received and committed by the server and thus, the operation is not retried. A timeout exception is bubbled up to the application by the client.
- The request was not sent over the wire to the server which resulted in a client-generated timeout. If this is due to a network error:
- For a server-generated timeout exception: The client DOES NOT retry.
- For Query and Point Read Operations:
- The request is retried locally for up to 30 seconds with an exponential backoff for subsequent retries. If all retries are exhausted, the client bubbles up the exception back to the application as a Request Timeout Exception (408).
|
-| 409 | - For Write Operations:
- This exception occurs when an attempt is made by the application to Create/Insert an Item that already exists.
- This exception can occur regardless of the Consistency level set for the account.
- This exception can occur for write operations when an attempt is made to create an existing item or when a unique key constraint violation occurs.
- The client does NOT retry on Conflict exceptions
- For Query Operations:
- N/A as this exception is only encountered for Create/Insert operations.
- For Point Read Operations:
- N/A as this exception is only encountered for Create/Insert operations.
|
-| 410 | - For both read and write operations, a 410 (Gone Exception) can be thrown in the following scenarios:
- When a partition is split (or merged in the future) and no longer exists
- When a replica has been moved to another address. Replicas are moved to ensure load balancing of request volume on the server. However, this is a very rarely triggered operation.
- For Write Operations:
- A 410 can be thrown by both the client as well as the server.
- For a server-generated 410:
- The client retries the write operation after triggering an address resolution operation. Retries are executed for up to 30 seconds with an exponential back off retry between successive retries.
- For a client-generated 410:
- For a client-generated 410 when the request was **NOT** sent over the wire:
- **For a single-write region account**, the client retries the request in the local region for up to 30 seconds with an exponential back off between successive retries. After all the retries are exhausted, the exception is bubbled up to the application as Service Unavailable Exception (503).
- **For a multi-write region account**, the client first retries the request for up to 30 seconds in the local region with an exponential back off between successive retries. After all the retries in the local region have been exhausted, the client retries the request against the next region in the list of Preferred Locations (and if **usingMultipleWriteLocations** is set to true). If all the retries in the next region also result in 410 exceptions, the exception is bubbled up to the application as a Service Unavailable exception (503).
- For a client-generated 410 when the request **WAS** sent over the wire:
- The client does NOT retry these requests as it cannot be determined if the server received the request and committed the write operation.
- Thus, when the 410 was due to a networking timeout when waiting on a response from the server, the client bubbles up the exception to the application as a Request Timeout (408).
- If the 410 was due to a network connectivity issue after sending the request over the write, the client bubbles up the exception to the application as a Service Unavailable Exception (503).
For Read Operations: - When using Eventual Consistency:
- The client triggers an Address Resolution to refresh the addresses of the replicas for the partition. However, as of today this does not refresh the connection itself. The connection is refreshed when the first request to that endpoint is made. This is done for performance reasons and to ensure the number of established connections does not exceed the limits of the VM.
- The client then retries the read request against a random replica for the partition.
- The previous two steps are repeated if they continue to see Gone Exceptions for a maximum of 30 seconds (when using Direct Mode) and 60 seconds (when using Gateway mode).
- This exception typically occurs due to intermittent network connectivity to the server.
- **For a multi-region account**, after all the retries against the local region are exhausted, the exception is retried against the next region for the account (for a multi-region account). The order in which regions are selected is based on the list of Preferred Locations, configured in the client.
- If all retries are exhausted against all subsequent regions for the account, then the exception is bubbled up to the application as a Service Unavailable Exception (503).
- Important points to note about the request timeout value that is configured by the application when the Cosmos client is initialized:
- The setting is only applicable to calls using Direct Mode.
- Gateway mode timeout settings are not exposed externally and controlled internally by the client as there are additional internal operations that rely on the Gateway and an incorrect timeout setting can lead to adverse side effects.
- The range of possible values that can be set for network timeouts in Direct Mode are between 5 and 10 seconds (inclusive).
- This timeout setting applies to each network request.
- Thus, each retry (in a scenario with multiple retries issued by the client) will have its own timer. The timeout duration is not cumulative across all retries.
When using Session Consistency: - Same behavior as Eventual Consistency with one addition: the client will first retry the request on other replicas.
When using Bounded Staleness or Strong Consistency: - A maximum of 60 seconds is spent retrying the request if needed.
- All other behavior is the same as Session Consistency
|
-| 412 | - For Write Operations:
- This exception is encountered when the etag that is sent to the server for validation prior to updating an Item, does not match the etag of the Item on the server.
- The client does NOT retry this operation locally or against any of the remote regions for the account as retries would not help alleviate the etag mismatch.
- The application would need to trigger a retry by first reading the Item, fetching the latest etag and issuing the Upsert/Replace operation.
- This operation can continue to fail with the same exception when multiple updates are executed concurrently for the same Item.
- An upper bound on the number of retries before handing off the Item to a dead letter queue should be implemented by the application.
- For Query and point read Operations:
- N/A as this exception is only encountered for Create/Insert/Replace/Upsert operations.
|
-| 429 | For all Operations: - By default, the client retries the request for a maximum of 9 times (or for a maximum of 30 seconds, whichever limit is reached first).
- The client can also be initialized with a custom retry policy, which overrides the two limits mentioned above.
- After all the retries are exhausted, the client bubbles up the exception to the application.
- **For a multi-region account**, the client does NOT retry the request against a remote region for the account.
- When the application receives a Request Rate too large exception (429), the application would need to instrument its own retry logic and dead letter queues.
|
-| 449 | - For Write Operations:
- This exception is encountered when a resource is concurrently updated on the server, which can happen due to concurrent writes, user triggered while conflicts are concurrently being resolved etc.
- Only one update can be executed at a time per item. The other concurrent requests will fail with a Concurrent Execution Exception (449).
- The client does retry requests that failed with a 449 with the first retry triggered after 10ms, followed by an exponential backoff for subsequent retries for up to 30 seconds. If all retries are exhausted, the client bubbles up the exception to the application.
- For Query and point read Operations:
- N/A as this exception is only encountered for Create/Insert/Replace/Upsert operations.
|
-| 500 | For all Operations: - The occurrence of an Invalid Exception (500) is extremely rare, and the client does NOT retry a request that encounters this exception.
|
-| 503 | - For all Operations using Direct Mode:
- By this point, the client has already retried the operation multiple times locally and in some case across another region (see 410 section above) and bubbles up the exception to the application as a Service Unavailable Exception (503), which can be retried by the application.
- For all Gateway Operations:
- These operations can be any of the following:
- Data Plane operations using Gateway mode
- Internal operations triggered by the client for the following:
- Address Resolution to refresh the address for an endpoint after partition splits and replica movements
- Query Plan retrieval – the Cosmos DB Java SDK retrieves (and caches) the query plan from the Gateway prior to executing query operations.
- When a Service Unavailable exception is encountered:
- The client does retry the request up to 2 times against the same Gateway endpoint.
- For data plane Write Operations using Gateway mode:
- **For accounts with a single-write region** configuration, if both retries result in failures, the exception is bubbled up to the application.
- **For accounts with a multi-write region** configuration, after both retries are exhausted, the exception is retried against the next region for the account if **usingMultipleWriteLocations** is set to true.
- If all retries are exhausted against all subsequent regions for the account, then the exception is bubbled up to the application as a Service Unavailable Exception (503).
- For data plane Read/Query operations using Gateway mode:
- **For a single-region account**, the request is retried for up to 2 times, and after both retries are exhausted the exception is bubbled up to the application as a Service Unavailable Exception (503).
- **For a multi-region account**, the request is retried locally, followed by retries against subsequent regions for the account. After all the retries are exhausted against 1 more region for the account, the exception is bubbled up to the application as a Service Unavailable Exception (503).
- For a metadata operation to retrieve the address of replicas and query plans:
- **For a single-region account**, the request is retried for up to 2 times, and after both retries are exhausted the exception is bubbled up to the application as a Service Unavailable Exception (503).
- **For a multi-region account**, the request is retried locally, followed by retries against subsequent regions for the account. Addresses will resolve to the Gateway region’s endpoints.
- After all the retries are exhausted against all subsequent regions, the exception is bubbled up to the application as a Service Unavailable Exception (503).
|
\ No newline at end of file
+| 408 | - For Write Operations:
- Timeout exceptions can be encountered by both the client as well as the server.
- Client-side timeouts are manifested internally as 410 exceptions due to intermittent network connectivity issues and follow the path specified below in the section detailing 410 status codes.
- Server-side timeout exceptions are not retried for write operations as it is not possible to determine if the write was in fact successfully committed on the server. To avoid overwriting a previously committed write, the client does NOT retry this operation and bubbles up the exception to the application as a Request Timeout Exception (408).
 - For a client-generated timeout exception, there are two scenarios:
 - The request was sent over the wire to the server by the client, but the network request timeout was exceeded while waiting for a response. In this case, it is unclear whether the write was received and committed by the server and thus the operation is not retried. A timeout exception is bubbled up to the application by the client.
 - The request was not sent over the wire to the server, which resulted in a client-generated timeout. If this is due to a network error, it manifests internally as a 410 and follows the retry path described in the 410 section below.
 - For a server-generated timeout exception: The client DOES NOT retry.
 - For Query and Point Read Operations:
 - The request is retried locally for up to 30 seconds with an exponential backoff between subsequent retries. If all retries are exhausted, the client bubbles the exception back up to the application as a Request Timeout Exception (408), which the application may choose to retry (see the sketch below).
|
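A minimal sketch of such an application-level retry after the SDK surfaces a 408 on a read, reusing the hypothetical `container` and `MyItem` from the earlier sketch; the attempt count and backoff are illustrative choices, not SDK defaults:

```java
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.CosmosException;
import com.azure.cosmos.models.PartitionKey;

public MyItem readWithAppRetry(CosmosContainer container, String id, String pk)
        throws InterruptedException {
    int attempts = 0;
    while (true) {
        try {
            return container.readItem(id, new PartitionKey(pk), MyItem.class).getItem();
        } catch (CosmosException e) {
            if (e.getStatusCode() == 408 && ++attempts < 3) {
                Thread.sleep(200L * attempts); // simple linear backoff, illustrative
                continue;
            }
            throw e; // not a timeout, or app-level retries exhausted
        }
    }
}
```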
+| 409 | - For Write Operations:
- This exception occurs when an attempt is made by the application to Create/Insert an Item that already exists.
- This exception can occur regardless of the Consistency level set for the account.
- This exception can occur for write operations when an attempt is made to create an existing item or when a unique key constraint violation occurs.
 - The client does NOT retry on Conflict exceptions; the application decides how to handle the conflict (see the sketch below).
- For Query Operations:
- N/A as this exception is only encountered for Create/Insert operations.
- For Point Read Operations:
- N/A as this exception is only encountered for Create/Insert operations.
|
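A minimal sketch of handling the conflict at the application level, with the same hypothetical `container` and `MyItem`:

```java
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.CosmosException;

public void createIfAbsent(CosmosContainer container, MyItem item) {
    try {
        container.createItem(item);
    } catch (CosmosException e) {
        if (e.getStatusCode() != 409) {
            throw e;
        }
        // 409: the item id (or a unique-key value) already exists and the SDK
        // did not retry; e.g. read the existing item, or use upsertItem() when
        // overwriting is acceptable.
    }
}
```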
+| 410 | - For both read and write operations, a 410 (Gone Exception) can be thrown in the following scenarios:
- When a partition is split (or merged in the future) and no longer exists
- When a replica has been moved to another address. Replicas are moved to ensure load balancing of request volume on the server. However, this is a very rarely triggered operation.
- For Write Operations:
- A 410 can be thrown by both the client as well as the server.
- For a server-generated 410:
- The client retries the write operation after triggering an address resolution operation. Retries are executed for up to 30 seconds with an exponential back off retry between successive retries.
- For a client-generated 410:
- For a client-generated 410 when the request was **NOT** sent over the wire:
- **For a single-write region account**, the client retries the request in the local region for up to 30 seconds with an exponential back off between successive retries. After all the retries are exhausted, the exception is bubbled up to the application as Service Unavailable Exception (503).
- **For a multi-write region account**, the client first retries the request for up to 30 seconds in the local region with an exponential back off between successive retries. After all the retries in the local region have been exhausted, the client retries the request against the next region in the list of Preferred Locations (and if **usingMultipleWriteLocations** is set to true). If all the retries in the next region also result in 410 exceptions, the exception is bubbled up to the application as a Service Unavailable exception (503).
- For a client-generated 410 when the request **WAS** sent over the wire:
- The client does NOT retry these requests as it cannot be determined if the server received the request and committed the write operation.
- Thus, when the 410 was due to a networking timeout when waiting on a response from the server, the client bubbles up the exception to the application as a Request Timeout (408).
 - If the 410 was due to a network connectivity issue after sending the request over the wire, the client bubbles up the exception to the application as a Service Unavailable Exception (503).
For Read Operations: - When using Eventual Consistency:
- The client triggers an Address Resolution to refresh the addresses of the replicas for the partition. However, as of today this does not refresh the connection itself. The connection is refreshed when the first request to that endpoint is made. This is done for performance reasons and to ensure the number of established connections does not exceed the limits of the VM.
- The client then retries the read request against a random replica for the partition.
 - The previous two steps are repeated, if the client continues to see Gone exceptions, for a maximum of 30 seconds (when using Direct Mode) or 60 seconds (when using Gateway mode).
- This exception typically occurs due to intermittent network connectivity to the server.
- **For a multi-region account**, after all the retries against the local region are exhausted, the exception is retried against the next region for the account (for a multi-region account). The order in which regions are selected is based on the list of Preferred Locations, configured in the client.
- If all retries are exhausted against all subsequent regions for the account, then the exception is bubbled up to the application as a Service Unavailable Exception (503).
- Important points to note about the request timeout value that is configured by the application when the Cosmos client is initialized:
- The setting is only applicable to calls using Direct Mode.
 - Gateway mode timeout settings are not exposed externally and are controlled internally by the client, as there are additional internal operations that rely on the Gateway and an incorrect timeout setting can lead to adverse side effects.
 - The range of possible values that can be set for network timeouts in Direct Mode is between 5 and 10 seconds (inclusive); see the configuration sketch after this row.
- This timeout setting applies to each network request.
- Thus, each retry (in a scenario with multiple retries issued by the client) will have its own timer. The timeout duration is not cumulative across all retries.
When using Session Consistency: - Same behavior as Eventual Consistency with one addition: the client will first retry the request on other replicas.
When using Bounded Staleness or Strong Consistency: - A maximum of 60 seconds is spent retrying the request if needed.
- All other behavior is the same as Session Consistency
|
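A minimal sketch of setting the Direct Mode network request timeout discussed above when building the client; the endpoint and key values are hypothetical placeholders:

```java
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.DirectConnectionConfig;
import java.time.Duration;

public static CosmosClient buildDirectModeClient() {
    // The timeout applies per network request (per retry), not cumulatively,
    // and must fall within the 5-10 second range noted above.
    DirectConnectionConfig directConfig = DirectConnectionConfig.getDefaultConfig()
        .setNetworkRequestTimeout(Duration.ofSeconds(5));

    return new CosmosClientBuilder()
        .endpoint("<your-account-endpoint>") // hypothetical placeholder
        .key("<your-account-key>")           // hypothetical placeholder
        .directMode(directConfig)
        .buildClient();
}
```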
+| 412 | - For Write Operations:
- This exception is encountered when the etag that is sent to the server for validation prior to updating an Item, does not match the etag of the Item on the server.
- The client does NOT retry this operation locally or against any of the remote regions for the account as retries would not help alleviate the etag mismatch.
- The application would need to trigger a retry by first reading the Item, fetching the latest etag and issuing the Upsert/Replace operation.
- This operation can continue to fail with the same exception when multiple updates are executed concurrently for the same Item.
 - An upper bound on the number of retries before handing off the Item to a dead letter queue should be implemented by the application (see the sketch below).
- For Query and point read Operations:
- N/A as this exception is only encountered for Create/Insert/Replace/Upsert operations.
|
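A minimal read-modify-replace sketch with an etag precondition and a bounded retry count, assuming hypothetical `container`, `id`, and `pk` values and a `MyItem` POJO with a hypothetical `getCounter`/`setCounter` pair:

```java
import com.azure.cosmos.CosmosContainer;
import com.azure.cosmos.CosmosException;
import com.azure.cosmos.models.CosmosItemRequestOptions;
import com.azure.cosmos.models.CosmosItemResponse;
import com.azure.cosmos.models.PartitionKey;

public void incrementWithEtagRetry(CosmosContainer container, String id, String pk) {
    for (int attempt = 0; attempt < 5; attempt++) { // illustrative upper bound
        CosmosItemResponse<MyItem> read =
            container.readItem(id, new PartitionKey(pk), MyItem.class);
        MyItem item = read.getItem();
        item.setCounter(item.getCounter() + 1); // hypothetical mutation

        CosmosItemRequestOptions options =
            new CosmosItemRequestOptions().setIfMatchETag(read.getETag());
        try {
            container.replaceItem(item, id, new PartitionKey(pk), options);
            return; // replace succeeded with a matching etag
        } catch (CosmosException e) {
            if (e.getStatusCode() != 412) {
                throw e;
            }
            // 412: a concurrent update changed the etag; loop to re-read.
        }
    }
    // Retries exhausted; hand the item off to a dead letter queue here.
}
```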
+| 429 | For all Operations: - By default, the client retries the request for a maximum of 9 times (or for a maximum of 30 seconds, whichever limit is reached first).
 - The client can also be initialized with a custom retry policy, which overrides the two limits mentioned above (see the sketch below).
- After all the retries are exhausted, the client bubbles up the exception to the application.
- **For a multi-region account**, the client does NOT retry the request against a remote region for the account.
- When the application receives a Request Rate too large exception (429), the application would need to instrument its own retry logic and dead letter queues.
|
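A minimal sketch of initializing the client with custom throttling retry limits via `ThrottlingRetryOptions`; the endpoint/key placeholders and the chosen limits are hypothetical:

```java
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.ThrottlingRetryOptions;
import java.time.Duration;

public static CosmosClient buildClientWithCustomThrottlingRetries() {
    // Overrides the default limits of 9 retries / 30 seconds described above.
    ThrottlingRetryOptions retryOptions = new ThrottlingRetryOptions()
        .setMaxRetryAttemptsOnThrottledRequests(15)
        .setMaxRetryWaitTime(Duration.ofSeconds(60));

    return new CosmosClientBuilder()
        .endpoint("<your-account-endpoint>") // hypothetical placeholder
        .key("<your-account-key>")           // hypothetical placeholder
        .throttlingRetryOptions(retryOptions)
        .buildClient();
}
```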
+| 449 | - For Write Operations:
 - This exception is encountered when a resource is concurrently updated on the server, which can happen due to concurrent writes, user-triggered operations while conflicts are concurrently being resolved, etc.
- Only one update can be executed at a time per item. The other concurrent requests will fail with a Concurrent Execution Exception (449).
- The client does retry requests that failed with a 449 with the first retry triggered after 10ms, followed by an exponential backoff for subsequent retries for up to 30 seconds. If all retries are exhausted, the client bubbles up the exception to the application.
- For Query and point read Operations:
- N/A as this exception is only encountered for Create/Insert/Replace/Upsert operations.
|
+| 500 | For all Operations: - The occurrence of an Internal Server Error (500) is extremely rare, and the client does NOT retry a request that encounters this exception.
|
+| 503 | - For all Operations using Direct Mode:
 - By this point, the client has already retried the operation multiple times locally and, in some cases, against another region (see the 410 section above), and bubbles up the exception to the application as a Service Unavailable Exception (503), which can be retried by the application.
- For all Gateway Operations:
- These operations can be any of the following:
- Data Plane operations using Gateway mode
- Internal operations triggered by the client for the following:
- Address Resolution to refresh the address for an endpoint after partition splits and replica movements
- Query Plan retrieval – the Cosmos DB Java SDK retrieves (and caches) the query plan from the Gateway prior to executing query operations.
- When a Service Unavailable exception is encountered:
- The client does retry the request up to 2 times against the same Gateway endpoint.
- For data plane Write Operations using Gateway mode:
- **For accounts with a single-write region** configuration, if both retries result in failures, the exception is bubbled up to the application.
- **For accounts with a multi-write region** configuration, after both retries are exhausted, the exception is retried against the next region for the account if **usingMultipleWriteLocations** is set to true.
- If all retries are exhausted against all subsequent regions for the account, then the exception is bubbled up to the application as a Service Unavailable Exception (503).
- For data plane Read/Query operations using Gateway mode:
- **For a single-region account**, the request is retried for up to 2 times, and after both retries are exhausted the exception is bubbled up to the application as a Service Unavailable Exception (503).
 - **For a multi-region account**, the request is retried locally, followed by retries against subsequent regions for the account (region order follows the Preferred Locations list; see the client configuration sketch after this table). After all the retries against one additional region are exhausted, the exception is bubbled up to the application as a Service Unavailable Exception (503).
- For a metadata operation to retrieve the address of replicas and query plans:
- **For a single-region account**, the request is retried for up to 2 times, and after both retries are exhausted the exception is bubbled up to the application as a Service Unavailable Exception (503).
- **For a multi-region account**, the request is retried locally, followed by retries against subsequent regions for the account. Addresses will resolve to the Gateway region’s endpoints.
- After all the retries are exhausted against all subsequent regions, the exception is bubbled up to the application as a Service Unavailable Exception (503).
|
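Finally, a minimal sketch of the client-side region configuration referenced throughout this table. The endpoint, key, and region names are hypothetical, and `multipleWriteRegionsEnabled` is the public builder-level counterpart of the internal **usingMultipleWriteLocations** flag mentioned above:

```java
import com.azure.cosmos.CosmosClient;
import com.azure.cosmos.CosmosClientBuilder;
import java.util.Arrays;

public static CosmosClient buildMultiRegionClient() {
    return new CosmosClientBuilder()
        .endpoint("<your-account-endpoint>")                   // hypothetical placeholder
        .key("<your-account-key>")                             // hypothetical placeholder
        .preferredRegions(Arrays.asList("East US", "West US")) // retry/failover order
        .multipleWriteRegionsEnabled(true) // allows cross-region retries for writes
        .buildClient();
}
```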
diff --git a/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientRetryPolicy.java b/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientRetryPolicy.java
index 24abb18d9def1..c6b31417e7225 100644
--- a/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientRetryPolicy.java
+++ b/sdk/cosmos/azure-cosmos/src/main/java/com/azure/cosmos/implementation/ClientRetryPolicy.java
@@ -33,8 +33,8 @@ public class ClientRetryPolicy extends DocumentClientRetryPolicy {
final static int RetryIntervalInMS = 1000; //Once we detect failover wait for 1 second before retrying request.
final static int MaxRetryCount = 120;
private final static int MaxServiceUnavailableRetryCount = 1;
- // Query Plan and Address Refresh will be re-tried 3 times, please check the if condition carefully :)
- private final static int MAX_QUERY_PLAN_AND_ADDRESS_RETRY_COUNT = 2;
+ // Address Refresh will be retried 3 times, please check the if condition carefully :)
+ private final static int MAX_ADDRESS_REFRESH_RETRY_COUNT = 2;
private final DocumentClientRetryPolicy throttlingRetry;
private final GlobalEndpointManager globalEndpointManager;
@@ -50,7 +50,7 @@ public class ClientRetryPolicy extends DocumentClientRetryPolicy {
private CosmosDiagnostics cosmosDiagnostics;
private AtomicInteger cnt = new AtomicInteger(0);
private int serviceUnavailableRetryCount;
- private int queryPlanAddressRefreshCount;
+ private int addressRefreshCount;
private RxDocumentServiceRequest request;
private RxCollectionCache rxCollectionCache;
@@ -126,7 +126,7 @@ public Mono shouldRetry(Exception e) {
WebExceptionUtility.isReadTimeoutException(clientException) &&
Exceptions.isSubStatusCode(clientException, HttpConstants.SubStatusCodes.GATEWAY_ENDPOINT_READ_TIMEOUT)) {
- boolean canFailoverOnTimeout = canGatewayRequestFailoverOnTimeout(request, clientException);
+ boolean canFailoverOnTimeout = canGatewayRequestFailoverOnTimeout(request);
//if operation is data plane read, metadata read, or query plan it can be retried on a different endpoint.
if(canFailoverOnTimeout) {
@@ -135,7 +135,7 @@ public Mono shouldRetry(Exception e) {
// if operationType AddressRefresh then just retry
if (this.request.isAddressRefresh()) {
- return shouldRetryQueryPlanAndAddress();
+ return shouldRetryAddressRefresh();
}
} else {
logger.warn("Backend endpoint not reachable. ", e);
@@ -160,7 +160,7 @@ public Mono shouldRetry(Exception e) {
return this.throttlingRetry.shouldRetry(e);
}
- private boolean canGatewayRequestFailoverOnTimeout(RxDocumentServiceRequest request, CosmosException clientException) {
+ private boolean canGatewayRequestFailoverOnTimeout(RxDocumentServiceRequest request) {
//Query Plan requests
if(request.getOperationType() == OperationType.QueryPlan) {
return true;
@@ -186,22 +186,22 @@ private boolean canGatewayRequestFailoverOnTimeout(RxDocumentServiceRequest requ
return false;
}
- private Mono shouldRetryQueryPlanAndAddress() {
+ private Mono shouldRetryAddressRefresh() {
- if (this.queryPlanAddressRefreshCount++ > MAX_QUERY_PLAN_AND_ADDRESS_RETRY_COUNT) {
+ if (this.addressRefreshCount++ > MAX_ADDRESS_REFRESH_RETRY_COUNT) {
logger
.warn(
- "shouldRetryQueryPlanAndAddress() No more retrying on endpoint {}, operationType = {}, count = {}, " +
+ "shouldRetryAddressRefresh() No more retrying on endpoint {}, operationType = {}, count = {}, " +
"isAddressRefresh = {}",
- this.locationEndpoint, this.request.getOperationType(), this.queryPlanAddressRefreshCount, this.request.isAddressRefresh());
+ this.locationEndpoint, this.request.getOperationType(), this.addressRefreshCount, this.request.isAddressRefresh());
return Mono.just(ShouldRetryResult.noRetry());
}
logger
- .warn("shouldRetryQueryPlanAndAddress() Retrying on endpoint {}, operationType = {}, count = {}, " +
+ .warn("shouldRetryAddressRefresh() Retrying on endpoint {}, operationType = {}, count = {}, " +
"isAddressRefresh = {}, shouldForcedAddressRefresh = {}, " +
"shouldForceCollectionRoutingMapRefresh = {}",
- this.locationEndpoint, this.request.getOperationType(), this.queryPlanAddressRefreshCount,
+ this.locationEndpoint, this.request.getOperationType(), this.addressRefreshCount,
this.request.isAddressRefresh(),
this.request.shouldForceAddressRefresh(),
this.request.forceCollectionRoutingMapRefresh);