Performance issues for queries returning larger arrays #310
Labels
api: spanner
Issues related to the googleapis/python-spanner API.
priority: p2
Moderately-important priority. Fix may not be included in next release.
type: feature request
‘Nice-to-have’ improvement, new feature or different behavior or design.
We recently upgraded the google-cloud-spanner library to the latest version (1.17.1 -> 3.3.0) because it includes a fix for a bug we were experiencing. Unfortunately, we now see a significant performance regression when executing queries that return rows with large arrays: one specific query went from 4 to 17 seconds. This results in longer calculation times in our pipelines.
We investigated the issue and found that the performance drop comes from _merged_values taking longer in the latest version of the library. This function parses the protobuf using _parse_value_pb, and timing the parsing of the query mentioned above on both versions of the library gives the following result:

As you can see, parsing an array has become a lot slower, which is a problem because the function is called quite often. This seems similar to another issue in this repo, but the performance fixes mentioned there are already merged.
Environment details
Steps to reproduce
Execute a query that returns one or more array columns, with the arrays containing a large number of elements (for example, 999).
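A minimal sketch of how the reproduction can be timed is shown below. The helper time_row_consumption and the stand-in fake_query are hypothetical names. The stand-in only mimics the shape of the result (rows holding one 999-element array each) so that the harness runs without a Spanner instance.

```python
import time
from typing import Callable, Iterable, Tuple


def time_row_consumption(
    run_query: Callable[[], Iterable], repeat: int = 3
) -> Tuple[float, int]:
    """Return (best wall-clock seconds, row count) for fully consuming the rows.

    time_row_consumption is a hypothetical helper, not part of the library.
    """
    best = float("inf")
    row_count = 0
    for _ in range(repeat):
        start = time.perf_counter()
        rows = list(run_query())  # consuming the iterator forces every row to be parsed
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        row_count = len(rows)
    return best, row_count


def fake_query():
    # Stand-in for a real query: 100 rows, each with one 999-element array column.
    return ([list(range(999))] for _ in range(100))


if __name__ == "__main__":
    seconds, count = time_row_consumption(fake_query)
    print(f"consumed {count} rows, best of 3 runs: {seconds:.4f}s")
```

With the real library, run_query would be a function that opens database.snapshot() and returns list(snapshot.execute_sql(sql)), where database and sql are supplied by the caller. Running the same script under 1.17.1 and 3.3.0 gives directly comparable numbers.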