
# BigQuery Advisor - Using LLMs to Understand And Rewrite Queries

**Rewriting Is A Summarization Task**

Rewriting is a form of summarization.  The task takes an input and regenerates it according to an instruction such as:
- make this shorter
- make this more readable
- make this more efficient (code)

When the input is code, a rewrite can make it easier to understand or cheaper to compute.

**LLMs For Code - Vertex AI Codey**

Vertex AI Generative AI has a series of models known as [Vertex AI Codey APIs](https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-models-overview).  They are fit-for-purpose LLMs:
- `code-bison`
    - Generate code from a natural language description
    - Generate unit tests for code
    - Fix a code sample
    - Optimize code
    - Translate code
- `codechat-bison`
    - Generate code documentation with comments
    - Generate release notes
    - Like `code-bison`, can also help with generation, unit tests, fixing, optimization, and translation
- `code-gecko`
    - Autocomplete a section of code that has been started

**BigQuery SQL**

SQL is a popular language for code.  The SQL dialect of BigQuery is GoogleSQL ([read more here](https://cloud.google.com/bigquery/docs/introduction-sql)).  A very useful feature of BigQuery is the [Information Schema](https://cloud.google.com/bigquery/docs/information-schema-intro).  The jobs views of the information schema can be queried from the perspective of a [project](https://cloud.google.com/bigquery/docs/information-schema-jobs), [user](https://cloud.google.com/bigquery/docs/information-schema-jobs-by-user), [folder](https://cloud.google.com/bigquery/docs/information-schema-jobs-by-folder), or [organization](https://cloud.google.com/bigquery/docs/information-schema-jobs-by-organization).  Each view has one row per submitted job, with information like:
- `creation_time`, `start_time`, `end_time` - when the job was submitted, started, and finished
- `total_bytes_processed` - how many bytes of data were processed
- `total_slot_ms` - the slot usage in milliseconds over the entire run time
- `job_id` - a unique id for the job
- `job_type` - QUERY, LOAD, EXTRACT, COPY, NULL (internal jobs for materialized view refresh, ...)
- `query` - the text of the job's SQL
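
The fields listed above can be combined into a single ranking query. The following is a minimal sketch rather than the query this notebook runs later: the helper name `build_jobs_query`, its parameters, and the placeholder project `my-project` are hypothetical, and the function only builds the SQL text so it can be inspected before being handed to a BigQuery client.

```python
def build_jobs_query(project: str, region: str, days: int = 365, limit: int = 10) -> str:
    """Build (but do not run) a query over INFORMATION_SCHEMA.JOBS_BY_USER.

    Hypothetical helper: ranks finished query jobs by slot usage, using
    only the columns described in the list above plus `state`.
    """
    if days <= 0 or limit <= 0:
        raise ValueError("days and limit must be positive")
    return f"""
SELECT
    job_id,
    creation_time,
    TIMESTAMP_DIFF(end_time, start_time, SECOND) AS duration_seconds,
    total_bytes_processed,
    total_slot_ms,
    CHAR_LENGTH(query) AS query_length
FROM `{project}.region-{region}.INFORMATION_SCHEMA.JOBS_BY_USER`
WHERE job_type = 'QUERY'
    AND state = 'DONE'
    AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {days} DAY)
ORDER BY total_slot_ms DESC
LIMIT {limit}
""".strip()


# 'my-project' is a placeholder project ID
jobs_sql = build_jobs_query("my-project", "us", days=30, limit=5)
print(jobs_sql)
```

Ordering by `total_slot_ms` is one choice among several. Swapping in `total_bytes_processed` or `query_length` gives a different view of which queries might deserve attention.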

**BigQuery ML For Direct Access to Vertex AI LLMs**

BigQuery ML is SQL for doing ML tasks directly inside of BigQuery.  One of the statements is [`CREATE MODEL ...`](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create), which can train a model, import a model, or connect to a remote model. A remote model can be a model hosted at a Vertex AI Prediction Endpoint (using `OPTIONS(ENDPOINT = '')`) or a remote service (using `OPTIONS(REMOTE_SERVICE_TYPE = '')`).  These [remote service types](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model#remote_service_type) include some Vertex AI Generative AI Foundational models.
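
As a rough illustration of the two remote-model options, the sketch below assembles the `CREATE MODEL` text for either case. The helper `build_remote_model_ddl` and the example identifiers are hypothetical. Only the `REMOTE WITH CONNECTION` clause and the two `OPTIONS` keys come from the description above.

```python
from typing import Optional


def build_remote_model_ddl(
    model: str,
    connection: str,
    remote_service_type: Optional[str] = None,
    endpoint: Optional[str] = None,
) -> str:
    """Return CREATE MODEL text for a remote model (hypothetical helper).

    Exactly one of `remote_service_type` or `endpoint` must be given.
    """
    if (remote_service_type is None) == (endpoint is None):
        raise ValueError("provide exactly one of remote_service_type or endpoint")
    if remote_service_type is not None:
        option = f"REMOTE_SERVICE_TYPE = '{remote_service_type}'"
    else:
        option = f"ENDPOINT = '{endpoint}'"
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"    REMOTE WITH CONNECTION `{connection}`\n"
        f"    OPTIONS({option})"
    )


# placeholder identifiers, not real resources
ddl = build_remote_model_ddl(
    model="my-project.my_dataset.my_model",
    connection="my-project.us.my_connection",
    remote_service_type="CLOUD_AI_LARGE_LANGUAGE_MODEL_V1",
)
print(ddl)
```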

**Building A BigQuery Advisor**

This notebook builds a BigQuery Advisor for me, a user of BigQuery.  The flow built out below is:

- Use the BigQuery `INFORMATION_SCHEMA.JOBS_BY_USER` view to find unique SQL queries I have submitted and filter by signs that efficiency could improve, like run time, resources used, and length of query syntax.
- Use BigQuery ML to connect to Vertex AI LLMs directly inside of BigQuery
- Use `ML.GENERATE_TEXT` to call Vertex AI LLMs to describe and attempt to rewrite any potentially inefficient queries
- Use the Vertex AI SDK to connect directly to Vertex AI Codey APIs and generate a better rewrite of inefficient queries

**REQUIREMENTS**

This notebook will examine BigQuery queries run by the user in the specified Google Cloud Project over the last 365 days.  If the authenticated user does not have any queries in the project or region it will return null results.


---
## Colab Setup

To run this notebook in Colab click [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Vertex%20AI%20GenAI%20For%20Rewriting%20-%20BigQuery%20Advisor%20With%20Codey.ipynb) and run the cells in this section.  Otherwise, skip this section.

This cell will authenticate to GCP (follow prompts in the popup).

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [2]:
try:
    import google.colab
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
except Exception:
    pass

---
## Installs and API Enablement

The client packages may need to be installed in this environment.

### Installs (If Needed)

In [None]:
# tuples of (import name, install name)
packages = [
    ('google.cloud.aiplatform', 'google-cloud-aiplatform')
]

import importlib.util  # import the submodule explicitly so find_spec is available
install = False
for package in packages:
    if not importlib.util.find_spec(package[0]):
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user

### API Enablement

In [79]:
!gcloud services enable aiplatform.googleapis.com
!gcloud services enable bigqueryconnection.googleapis.com

### Restart Kernel (If Installs Occurred)

After the kernel restarts, continue from the cell that follows this one.

In [None]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

---
## Setup

In [1]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [2]:
REGION = 'us-central1'
SERIES = 'applied-genai'
EXPERIMENT = 'bq-advisor'

# change the following if the GCS bucket has a different name than the PROJECT_ID
GCS_BUCKET = PROJECT_ID

In [3]:
# make this the BQ Project / Dataset / Table prefix to store results
BQ_PROJECT = PROJECT_ID
BQ_DATASET = SERIES.replace('-', '_')
BQ_TABLE = EXPERIMENT
BQ_REGION = REGION[0:2] # subset to first two characters for multi-region
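
The two-character slice above works for `us-...` and `europe-...` regions, since BigQuery's multi-regions are `US` and `EU`, but it silently yields a non-existent location for other regions (for example `asia-east1` becomes `as`). A small guard, sketched here with a hypothetical helper name `bq_multi_region`, makes that assumption explicit:

```python
def bq_multi_region(region: str) -> str:
    """Map a region name to a BigQuery multi-region using its first two characters.

    Hypothetical helper: only 'us' and 'eu' are accepted, because those are
    the multi-regions this shortcut can produce correctly.
    """
    prefix = region.lower()[0:2]
    if prefix in {"us", "eu"}:
        return prefix
    raise ValueError(
        f"no multi-region shortcut for {region!r}; set the BigQuery location explicitly"
    )


print(bq_multi_region("us-central1"))   # prints us
print(bq_multi_region("europe-west4"))  # prints eu
```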

In [4]:
import json
from google.cloud import bigquery
from google.cloud import bigquery_connection_v1 as bq_connection
import vertexai.language_models

In [5]:
bq = bigquery.Client(project = PROJECT_ID)
vertexai.init(project = PROJECT_ID, location = REGION)

---
## BigQuery Information Schema

Review jobs by user with [INFORMATION_SCHEMA.JOBS_BY_USER](https://cloud.google.com/bigquery/docs/information-schema-jobs-by-user)

In [6]:
CURRENT_USER = !gcloud config list --format='value(core.account)'
print("The Current User is:\n", f"{CURRENT_USER[0]}")

The Current User is:
 1026793852137-compute@developer.gserviceaccount.com


### Find Longest Queries

Not by time, but by syntax length!

This Query:
- Shows 10 queries with most characters
- For Current User
- In Current Project
- For Datasets in Regions defined in BQ_REGION
- Created in the last 365 days and now in the `DONE` state

In [7]:
query = f"""
SELECT
    query,
    count(job_id) as n_runs,
    avg(total_bytes_processed) as avg_bytes_processed,
    avg(TIMESTAMP_DIFF(end_time, start_time, SECOND)) as avg_duration_seconds

FROM `{BQ_PROJECT}.region-{BQ_REGION}.INFORMATION_SCHEMA.JOBS_BY_USER`
WHERE 
    state = "DONE"
    AND statement_type != 'SCRIPT'
    AND TIMESTAMP_TRUNC(creation_time, day) >= TIMESTAMP_ADD(TIMESTAMP_TRUNC(CURRENT_TIMESTAMP(), day), INTERVAL -365 day)
GROUP BY query
ORDER BY CHAR_LENGTH(query) DESC
LIMIT 10
"""
results = bq.query(query = query).to_dataframe()
results

Unnamed: 0,query,n_runs,avg_bytes_processed,avg_duration_seconds
0,\n SELECT 'Central Park North & Adam Cl...,2,223933.5,1.0
1,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,2,0.0,3.0
2,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
3,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
4,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,2.0
5,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,4.0
6,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
7,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
8,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
9,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0


Review the longest query:

In [8]:
print(f"This query is {len(results['query'][0])} characters long.  Here are the first 500 characters:\n\n\n\n", results['query'][0][0:500])

This query is 7345 characters long.  Here are the first 500 characters:



 
        SELECT 'Central Park North & Adam Clayton Powell Blvd' as start_station_name, *
        FROM ML.EVALUATE(
            MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_1`,
            (
                SELECT starttime, num_trips,
                    avg_tripduration, pct_subscriber, ratio_gender
                FROM `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped`
                WHERE splits = 'TEST'
                    AND sta


### Create Table With Longest Queries

Create Dataset (if new):

In [9]:
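# create/link to dataset
# location and labels are set on a Dataset; create_dataset() ignores extra attributes on a bare DatasetReference
ds = bigquery.Dataset(bigquery.DatasetReference(BQ_PROJECT, BQ_DATASET))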
ds.location = BQ_REGION
ds.labels = {'series': f'{SERIES}'}
ds = bq.create_dataset(dataset = ds, exists_ok = True) 

Create Table in Dataset with Results:

In [10]:
job = bq.query(query = f"CREATE OR REPLACE TABLE `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` AS\n" + query)
job.result()
job.state

'DONE'

Retrieve the results from the table.  These should match, since it's the same query saved as a table:

In [11]:
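# table rows have no guaranteed order, so sort again to keep the longest query in row 0
results = bq.query(f'SELECT * FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}` ORDER BY CHAR_LENGTH(query) DESC').to_dataframe()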
results

Unnamed: 0,query,n_runs,avg_bytes_processed,avg_duration_seconds
0,\n SELECT 'Central Park North & Adam Cl...,2,223933.5,1.0
1,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,2,0.0,3.0
2,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
3,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
4,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,2.0
5,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,4.0
6,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
7,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
8,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0
9,\nCREATE OR REPLACE TABLE `statmike-mlops-3499...,1,0.0,3.0


---
## BigQuery ML: Connect To Vertex AI LLMs with ML.GENERATE_TEXT

BigQuery ML `CREATE MODEL` statements can create models that are actually connections to remote models. [Reference](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model)

Using the `REMOTE_SERVICE_TYPE = "CLOUD_AI_LARGE_LANGUAGE_MODEL_V1"` option will link to LLMs in Vertex AI!

### Connection Requirement

To make a remote connection using BigQuery ML, BigQuery uses a CLOUD_RESOURCE connection. [Reference](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model#connection)

Create a new connection with type `CLOUD_RESOURCE`, first checking for an existing connection:

In [67]:
try:
    response = bq_connection.ConnectionServiceClient().get_connection(
            request = bq_connection.GetConnectionRequest(
                name = f"projects/{BQ_PROJECT}/locations/{BQ_REGION}/connections/{SERIES}_{EXPERIMENT}"
            )
    )
    print(f'Found existing connection with service account: {response.cloud_resource.service_account_id}')
    service_account = response.cloud_resource.service_account_id
except Exception:
    request = bq_connection.CreateConnectionRequest(
        {
            "parent": f"projects/{BQ_PROJECT}/locations/{BQ_REGION}",
            "connection_id": f"{SERIES}_{EXPERIMENT}",
            "connection": bq_connection.types.Connection(
                {
                    "friendly_name": f"{SERIES}_{EXPERIMENT}",
                    "cloud_resource": bq_connection.CloudResourceProperties({})
                }
            )
        }
    )
    response = bq_connection.ConnectionServiceClient().create_connection(request)
    print(f'Created new connection with service account: {response.cloud_resource.service_account_id}')
    service_account = response.cloud_resource.service_account_id

Created new connection with service account: bqcx-1026793852137-a2ne@gcp-sa-bigquery-condel.iam.gserviceaccount.com


Assign the service account the Vertex AI User role:

In [62]:
!gcloud projects add-iam-policy-binding {BQ_PROJECT} --member=serviceAccount:{service_account} --role=roles/aiplatform.user

Updated IAM policy for project [statmike-mlops-349915].
bindings:
- members:
  - serviceAccount:service-1026793852137@gcp-sa-aiplatform-cc.iam.gserviceaccount.com
  role: roles/aiplatform.customCodeServiceAgent
- members:
  - serviceAccount:service-1026793852137@gcp-sa-aiplatform.iam.gserviceaccount.com
  role: roles/aiplatform.serviceAgent
- members:
  - serviceAccount:bqcx-1026793852137-dyw1@gcp-sa-bigquery-condel.iam.gserviceaccount.com
  - serviceAccount:bqcx-1026793852137-zfly@gcp-sa-bigquery-condel.iam.gserviceaccount.com
  role: roles/aiplatform.user
- members:
  - serviceAccount:service-1026793852137@gcp-sa-artifactregistry.iam.gserviceaccount.com
  role: roles/artifactregistry.serviceAgent
- members:
  - serviceAccount:1026793852137-compute@developer.gserviceaccount.com
  role: roles/bigquery.admin
- members:
  - serviceAccount:1026793852137@cloudservices.gserviceaccount.com
  role: roles/bigquery.dataOwner
- members:
  - serviceAccount:1026793852137@cloudbuild.gserviceaccount

### Create The Remote Model In BigQuery

In [63]:
# Create Remote Model In BigQuery
query = f"""
CREATE OR REPLACE MODEL `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}_MODEL`
    REMOTE WITH CONNECTION `{BQ_PROJECT}.{BQ_REGION}.{SERIES}_{EXPERIMENT}`
    OPTIONS(REMOTE_SERVICE_TYPE = 'CLOUD_AI_LARGE_LANGUAGE_MODEL_V1')
"""
job = bq.query(query = query)
job.result()
job.state

'DONE'

---
## Annotate and Rewrite Queries

Use [ML.GENERATE_TEXT](https://cloud.google.com/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-text) to annotate and rewrite queries!

**NOTE** Currently, this links directly to [`text-bison`](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text); the model cannot be switched to one of the other [Vertex AI foundational models](https://cloud.google.com/vertex-ai/docs/generative-ai/learn/models).

### Describe The Longest Query

In [79]:
query = f"""
SELECT *
FROM ML.GENERATE_TEXT(
    MODEL `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}_MODEL`,
    (
        SELECT CONCAT('Describe the operation of the following Google SQL query:', query) AS prompt
        FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
        ORDER BY CHAR_LENGTH(query) DESC
        LIMIT 1
    ),
    STRUCT(500 AS max_output_tokens, TRUE AS flatten_json_output)
)
"""
describe = bq.query(query = query).to_dataframe()
describe

Unnamed: 0,ml_generate_text_llm_result,ml_generate_text_rai_result,ml_generate_text_status,prompt
0,The given Google SQL query uses the ML.EVALUA...,"{""blocked"":false,""categories"":[],""scores"":[]}",,Describe the operation of the following Google...
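
The `ml_generate_text_rai_result` column in the output above carries the safety attributes for each response. Before using a generated description or rewrite, it can be worth checking that the response was not blocked. The sketch below assumes the column arrives as a JSON string shaped like the one shown. The helper name `rai_blocked` and the choice to treat an empty value as "not blocked" are assumptions, not documented behavior.

```python
import json


def rai_blocked(rai_result: str) -> bool:
    """Return True when the safety JSON reports a blocked response.

    Assumption: an empty or missing value is treated as not blocked.
    """
    if not rai_result:
        return False
    return bool(json.loads(rai_result).get("blocked", False))


# the value shown in the output above
sample = '{"blocked":false,"categories":[],"scores":[]}'
print(rai_blocked(sample))  # prints False
```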


In [80]:
print(describe['ml_generate_text_llm_result'][0])

 The given Google SQL query uses the ML.EVALUATE function to evaluate multiple ARIMA+XREG models on a test dataset. The query is structured as a series of UNION ALL statements, which combine the results of each model evaluation into a single result set.

The ML.EVALUATE function takes three arguments:

- The model to be evaluated, specified as a MODEL clause.
- The data to be used for evaluation, specified as a subquery.
- A STRUCT clause that specifies the evaluation options. In this case, the perform_aggregation option is set to TRUE, which indicates that the model should be evaluated on aggregated data.

The subquery used in the ML.EVALUATE function selects the following features from the forecasting-data_prepped table:

- starttime: The start time of the trip.
- num_trips: The number of trips.
- avg_tripduration: The average trip duration.
- pct_subscriber: The percentage of trips made by subscribers.
- ratio_gender: The ratio of male to female riders.

The query then uses the star

### Rewrite The Longest Query

In [136]:
query = f"""
SELECT *
FROM ML.GENERATE_TEXT(
    MODEL `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}_MODEL`,
    (
        SELECT CONCAT('Rewrite the following BigQuery GoogleSQL query to be as short as possible. query:', query) AS prompt
        FROM `{BQ_PROJECT}.{BQ_DATASET}.{BQ_TABLE}`
        ORDER BY CHAR_LENGTH(query) DESC
        LIMIT 1
    ),
    STRUCT(1000 AS max_output_tokens, TRUE AS flatten_json_output)
)
"""
rewrite = bq.query(query = query).to_dataframe()
rewrite

Unnamed: 0,ml_generate_text_llm_result,ml_generate_text_rai_result,ml_generate_text_status,prompt
0,"```sql\n SELECT start_station_name, *\n ...","{""blocked"":false,""categories"":[],""scores"":[]}",,Rewrite the following BigQuery GoogleSQL query...


In [138]:
print(rewrite['ml_generate_text_llm_result'][0])

 ```sql
    SELECT start_station_name, *
    FROM ML.EVALUATE(
        MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_*`,
        (
            SELECT starttime, num_trips,
                avg_tripduration, pct_subscriber, ratio_gender
            FROM `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped`
            WHERE splits = 'TEST'
                AND start_station_name = 'Central Park S & 6 Ave'
        ),
        STRUCT(TRUE AS perform_aggregation)
    )
```


In [145]:
print(results['query'][0])


        SELECT 'Central Park North & Adam Clayton Powell Blvd' as start_station_name, *
        FROM ML.EVALUATE(
            MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_1`,
            (
                SELECT starttime, num_trips,
                    avg_tripduration, pct_subscriber, ratio_gender
                FROM `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped`
                WHERE splits = 'TEST'
                    AND start_station_name = 'Central Park S & 6 Ave'
            ),
            STRUCT(TRUE AS perform_aggregation)
        )
    UNION ALL
        SELECT 'Central Park S & 6 Ave' as start_station_name, *
        FROM ML.EVALUATE(
            MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_2`,
            (
                SELECT starttime, num_trips,
                    avg_tripduration, pct_subscriber, ratio_gender
                FROM `statmike-mlops-349915.appli

### Try Running The Revised Query

As long as the referenced objects still exist and the query does not overwrite an object, give the syntax a try:

In [146]:
# remove the first and last line
rewrite_query = '\n'.join(rewrite['ml_generate_text_llm_result'][0].split('\n')[1:-1])
print(rewrite_query)

    SELECT start_station_name, *
    FROM ML.EVALUATE(
        MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_*`,
        (
            SELECT starttime, num_trips,
                avg_tripduration, pct_subscriber, ratio_gender
            FROM `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped`
            WHERE splits = 'TEST'
                AND start_station_name = 'Central Park S & 6 Ave'
        ),
        STRUCT(TRUE AS perform_aggregation)
    )
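
Dropping the first and last line works when the response is exactly one fenced block, but it removes real SQL if the model answers without a fence or if the closing fence is missing because the output was truncated. A slightly more defensive sketch, using a hypothetical helper `extract_sql` built on the standard `re` module:

```python
import re

# opening fence with optional language tag, then the body up to a closing fence or end of text
_FENCE = re.compile(r"```[a-zA-Z]*[ \t]*\n(.*?)(?:\n[ \t]*```|\Z)", re.DOTALL)


def extract_sql(llm_text: str) -> str:
    """Return the body of the first fenced block, or the whole text if unfenced."""
    match = _FENCE.search(llm_text)
    body = match.group(1) if match else llm_text
    return body.strip()


fenced = " ```sql\n    SELECT 1\n```\n"
print(extract_sql(fenced))  # prints SELECT 1
```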


In [147]:
bq.query(rewrite_query).to_dataframe()

BadRequest: 400 Model not found: `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_*` at [3:15]

Location: US
Job ID: d5b06d30-aaa0-4be3-aa72-8cff91b05249


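The query does not run successfully.  The rewrite replaced the numbered model names with a wildcard (`arimaplusxreg_*`), but wildcards apply to tables and these objects are models, so BigQuery reports that the model is not found. A different LLM, one focused on code, is probably better at this rewrite step.  Let's try the Vertex AI Codey APIs!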
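
A failure like this can be caught before the query is submitted. The sketch below is a crude text-level pre-flight check, not a SQL parser: it flags wildcards inside `MODEL` references and keywords that suggest the statement writes to an object. The helper name `preflight_issues` and the keyword list are assumptions chosen for illustration, and keywords that appear inside string literals would also be flagged.

```python
import re
from typing import List

# backtick-quoted name that follows the MODEL keyword
_MODEL_REF = re.compile(r"MODEL\s+`([^`]+)`", re.IGNORECASE)
# keywords that usually mean the statement creates or changes an object
_WRITE_KEYWORDS = re.compile(
    r"\b(CREATE|DROP|ALTER|TRUNCATE|INSERT|UPDATE|DELETE|MERGE)\b", re.IGNORECASE
)


def preflight_issues(sql: str) -> List[str]:
    """Return a list of human-readable concerns about a generated query."""
    issues = []
    for ref in _MODEL_REF.findall(sql):
        if "*" in ref:
            issues.append(f"wildcard in model reference: {ref}")
    found = sorted({kw.upper() for kw in _WRITE_KEYWORDS.findall(sql)})
    if found:
        issues.append("statement may modify objects: " + ", ".join(found))
    return issues


# placeholder query with the same kind of problem as the rewrite above
bad = "SELECT * FROM ML.EVALUATE(MODEL `p.d.model_*`, (SELECT 1))"
print(preflight_issues(bad))
```

An empty list means only that these two checks found nothing. The query can still fail for other reasons.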

---
## Vertex AI Codey APIs

CodeGenerationModel [Guide](https://cloud.google.com/vertex-ai/docs/generative-ai/code/code-generation-prompts), [API](https://cloud.google.com/python/docs/reference/aiplatform/latest/vertexai.language_models.CodeGenerationModel)
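
The rewrite prompt used with Codey in this section mixes the task with domain hints (use the `TABLE` keyword, avoid wildcards in BigQuery ML functions). One way to keep those hints easy to adjust is to assemble the prompt from parts. This is a sketch only: `build_rewrite_prompt` is a hypothetical helper, the hint wording is paraphrased, and the resulting string would still be passed to the model's `predict` method as in the cells that follow.

```python
from typing import Sequence


def build_rewrite_prompt(sql: str, hints: Sequence[str] = ()) -> str:
    """Combine the rewrite instruction, optional hints, and the query text."""
    lines = [
        "Rewrite the following query for BigQuery with GoogleSQL and optimize for length."
    ]
    # one bullet per hint so individual hints can be added or dropped
    lines.extend(f"- {hint}" for hint in hints)
    lines.append("Query:")
    lines.append(sql.strip())
    return "\n".join(lines)


prompt = build_rewrite_prompt(
    "SELECT 1",
    hints=[
        "With BigQuery ML functions, use the TABLE keyword to refer to tables defined as subqueries.",
        "BigQuery ML functions do not accept wildcards in model names.",
    ],
)
print(prompt)
```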

In [15]:
codegen_model = vertexai.language_models.CodeGenerationModel.from_pretrained('code-bison@latest')

In [27]:
rewrite_codey = codegen_model.predict(f"Rewrite the following query for BigQuery with GoogleSQL and optimize for length. When working the BigQuery ML functions make sure to use the TABLE keyword to refer tables as subqueries and note the BQML function dont use wildcards.:\n {results['query'][0]}", max_output_tokens = 2000).text
print(rewrite_codey)

```
WITH base_subquery AS (
  SELECT starttime, num_trips, avg_tripduration, pct_subscriber, ratio_gender
  FROM `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped`
  WHERE splits = 'TEST'
    AND start_station_name = 'Central Park S & 6 Ave'
)
SELECT
  'Central Park North & Adam Clayton Powell Blvd' AS start_station_name,
  *
FROM ML.EVALUATE(
  MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_1`,
  TABLE base_subquery,
  STRUCT(TRUE AS perform_aggregation)
)

UNION ALL

SELECT
  'Central Park S & 6 Ave' AS start_station_name,
  *
FROM ML.EVALUATE(
  MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_2`,
  TABLE base_subquery,
  STRUCT(TRUE AS perform_aggregation)
)

UNION ALL

SELECT
  'Central Park W & W 96 St' AS start_station_name,
  *
FROM ML.EVALUATE(
  MODEL `statmike-mlops-349915.applied_forecasting.forecasting-data_prepped_arimaplusxreg_3`,
  TABLE base_subquery,
  STRUCT(TRUE AS perf

In [28]:
bq.query('\n'.join(rewrite_codey.split('\n')[1:-1])).to_dataframe()

Unnamed: 0,start_station_name,mean_absolute_error,mean_squared_error,root_mean_squared_error,mean_absolute_percentage_error,symmetric_mean_absolute_percentage_error
0,Grand Army Plaza & Central Park S,119.004419,19427.65759,139.383132,41.644377,41.80421
1,Central Park West & W 68 St,154.101324,33857.047915,184.002848,48.928761,56.025046
2,W 82 St & Central Park West,238.753145,72013.375008,268.353079,67.112579,105.230112
3,Central Park West & W 72 St,121.883951,22554.630553,150.181991,41.739618,42.696262
4,Central Park West & W 102 St,260.029256,83697.364612,289.304968,74.224959,122.457784
5,Central Park W & W 96 St,202.442318,53031.090207,230.284802,56.376148,81.270972
6,Central Park West & W 100 St,257.895012,83302.784732,288.622218,72.864692,119.704992
7,Central Park West & W 76 St,214.290854,57026.50145,238.802222,62.764186,90.689374
8,Central Park S & 6 Ave,84.885617,12792.523379,113.104038,36.41994,31.547787
9,Central Park North & Adam Clayton Powell Blvd,200.693028,47953.942385,218.983886,64.146976,93.515945


## Compare the Query Lengths

In [32]:
print(f"""The Codey rewrite is {len(rewrite_codey)} characters.  Compared to the original with {len(results['query'][0])} characters this is a {1 - len(rewrite_codey) / len(results['query'][0]):.0%} reduction in length.""")

The Codey rewrite is 3311 characters.  Compared to the original with 7345 characters this is a 55% reduction in length.
