# Next Steps

- ✅ Parsed ~3,500 documents into `contracts_dataset.contracts`.  
- 📌 Goal: 
  - Use `AI.GENERATE` and `AI.GENERATE_TABLE` on the table.  
  - Create vector embeddings for `contract_summary` with `ML.GENERATE_EMBEDDING`.

- 🔧 Requirement:  
  To use these AI functions on the files in GCS, I need an `ObjectRef`.  
  Therefore, I'll transform `file_path` into an ObjectRef (`ref`) and expose it in a view.
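
The embedding goal above is not demonstrated in the cells that follow, so here is a minimal sketch of what the `ML.GENERATE_EMBEDDING` call might look like. It assumes a remote embedding model named `contracts_dataset.embedding_model` already exists; that name is hypothetical and is not created anywhere in this notebook. The helper only builds the SQL string.

```python
def build_embedding_sql(
    source_table: str = "contracts_dataset.contracts",
    model: str = "contracts_dataset.embedding_model",  # hypothetical model name
    text_column: str = "contract_summary",
) -> str:
    """Build a query that embeds one text column with ML.GENERATE_EMBEDDING."""
    # Text embedding models read their input from a column named `content`,
    # so the text column is aliased accordingly.
    return f"""
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL `{model}`,
  (
    SELECT file_path, {text_column} AS content
    FROM `{source_table}`
    WHERE {text_column} IS NOT NULL
  ),
  STRUCT(TRUE AS flatten_json_output)
)
""".strip()


embedding_sql = build_embedding_sql()
print(embedding_sql)
```

Once the BigQuery `client` is initialized and the model exists, the string could be run with `client.query(embedding_sql).to_dataframe()`.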


In [1]:
import os
import json
import pandas as pd
from google.cloud import bigquery
from google.cloud.exceptions import NotFound
import time

In [2]:
# Configuration
PROJECT_ID = "cool-automata-386721"  # Your Google Cloud Project ID
GCS_BUCKET_NAME = "contracts-mcc"  # Your GCS bucket with contract documents
DATASET_NAME = "contracts_dataset"

# Initialize BigQuery client
client = bigquery.Client(project=PROJECT_ID)


### Create BigQuery Cloud resource connection and service account permissions

In [3]:
!bq mk --connection --location=us \
    --connection_type=CLOUD_RESOURCE contracts_ai_connection

BigQuery error in mk operation: Already Exists: Connection
projects/625945884811/locations/us/connections/contracts_ai_connection


In [25]:
SERVICE_ACCT = !bq show --format=prettyjson --connection us.contracts_ai_connection | grep "serviceAccountId" | cut -d '"' -f 4
SERVICE_ACCT_EMAIL = SERVICE_ACCT[-1]
print(SERVICE_ACCT_EMAIL)

bqcx-625945884811-s8gk@gcp-sa-bigquery-condel.iam.gserviceaccount.com
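
The `grep`/`cut` pipeline above depends on how the JSON happens to be pretty-printed. As a possible alternative, the sketch below parses the JSON and searches for the `serviceAccountId` key (the key name is taken from the `grep` in the cell above). The sample payload is made up for illustration and its layout is an assumption; the search does not rely on it.

```python
import json
from typing import Any, Optional


def find_service_account(node: Any) -> Optional[str]:
    """Return the first `serviceAccountId` string found anywhere in parsed JSON."""
    if isinstance(node, dict):
        if isinstance(node.get("serviceAccountId"), str):
            return node["serviceAccountId"]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None  # scalars cannot contain the key
    for child in children:
        found = find_service_account(child)
        if found:
            return found
    return None


# Made-up payload for illustration; the real one comes from `bq show`.
sample_output = (
    '{"name": "example", '
    '"cloudResource": {"serviceAccountId": "sa@example.iam.gserviceaccount.com"}}'
)
service_account_email = find_service_account(json.loads(sample_output))
print(service_account_email)
```

In the notebook, the input could come from `raw = !bq show --format=prettyjson --connection us.contracts_ai_connection`, joined with `"\n".join(raw)` before calling `json.loads`.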


In [6]:
import time

!gcloud projects add-iam-policy-binding --format=none $PROJECT_ID --member=serviceAccount:$SERVICE_ACCT_EMAIL --role='roles/storage.objectViewer'
!gcloud projects add-iam-policy-binding --format=none $PROJECT_ID --member=serviceAccount:$SERVICE_ACCT_EMAIL --role='roles/aiplatform.user'

# Wait ~60 seconds, to give IAM updates time to propagate. Otherwise, subsequent cells may fail.
time.sleep(60)

Updated IAM policy for project [cool-automata-386721].
Updated IAM policy for project [cool-automata-386721].
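
A fixed 60-second sleep works, but it either waits too long or not long enough. One alternative is to retry the first dependent call with a growing delay. The helper below is a generic sketch, not part of the BigQuery client; `retry_with_backoff` and `flaky` are hypothetical names, and in practice the `except` clause should be narrowed to the permission error you expect.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, doubling the wait after each failure."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception:  # narrow this to the expected permission error
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
    raise ValueError("attempts must be at least 1")


# Demo with a stand-in action that fails twice before succeeding.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("permission not visible yet")
    return "ok"


waits: List[float] = []
result = retry_with_backoff(flaky, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)  # prints: ok [5.0, 10.0]
```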


In [6]:
%load_ext google.cloud.bigquery



## Step 1: Add an ObjectRef to a view

The `contracts` table already has all the necessary fields. In the queries that follow I focus on a few of them, and the view adds an `ObjectRef` column (`ref`) so the AI functions can read the underlying files.


In [9]:
%%bigquery --project {PROJECT_ID}

CREATE OR REPLACE VIEW `contracts_dataset.v_contracts` AS
SELECT
  c.*,
  OBJ.FETCH_METADATA(OBJ.MAKE_REF(c.file_path, 'us.contracts_ai_connection')) AS ref
FROM `contracts_dataset.contracts` c;

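
Before calling the AI functions, it may help to confirm that `ref` was populated. The sketch below builds a small inspection query; it assumes the ObjectRef struct exposes `uri`, `authorizer`, and `details` fields, so check the BigQuery documentation if any of them is rejected. `build_ref_check_sql` is a hypothetical helper name.

```python
def build_ref_check_sql(
    view: str = "contracts_dataset.v_contracts", limit: int = 5
) -> str:
    """Build a query that shows ObjectRef fields next to the source path."""
    return f"""
SELECT
  file_path,
  ref.uri AS ref_uri,
  ref.authorizer AS ref_authorizer,
  ref.details AS ref_details
FROM `{view}`
LIMIT {int(limit)}
""".strip()


ref_check_sql = build_ref_check_sql()
print(ref_check_sql)
```

The string could then be run with `client.query(ref_check_sql).to_dataframe()`; a `ref_uri` that matches `file_path` suggests the view is wired up as intended.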

### Using AI.GENERATE_TABLE with objectRef
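
The query in this section references `MODEL contracts_dataset.gemini`, which is not created anywhere in this notebook. A minimal sketch of how such a remote model might be defined follows. It assumes the same connection and the same `gemini-2.0-flash` endpoint that the `AI.GENERATE` cell uses; the actual model may have been created differently.

```python
def build_remote_model_sql(
    model: str = "contracts_dataset.gemini",
    connection: str = "us.contracts_ai_connection",
    endpoint: str = "gemini-2.0-flash",  # assumed; matches the AI.GENERATE cell
) -> str:
    """Build a CREATE MODEL statement for a remote model over a connection."""
    return f"""
CREATE OR REPLACE MODEL `{model}`
REMOTE WITH CONNECTION `{connection}`
OPTIONS (ENDPOINT = '{endpoint}')
""".strip()


remote_model_sql = build_remote_model_sql()
print(remote_model_sql)
```

Running it once with `client.query(remote_model_sql).result()` should be enough; `CREATE OR REPLACE` overwrites any existing model with the same name.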

In [None]:
%%bigquery --project {PROJECT_ID}

SELECT
company_name,
state_of_incorp,
contract_type,
clauses,
clause_explanation,
file_path,
FROM AI.GENERATE_TABLE(
  MODEL `contracts_dataset.gemini`,
  (
    SELECT (
      'What is this?'
      , ref
    ) AS prompt,
    company_name,
    contract_type,
    contract_summary,
    state_of_incorp,
    parties,
    clauses,
    file_path
    FROM `contracts_dataset.v_contracts`
    WHERE company_name = 'Waters Corporation'
  ),
  STRUCT(
     "clause_explanation STRING" AS output_schema
)
)


| | company_name | state_of_incorp | contract_type | clauses | clause_explanation | file_path |
|---|---|---|---|---|---|---|
| 0 | Waters Corporation | DE | Award Agreement | [Change of Control, Termination, Vesting, Requ... | This document is an exhibit of a Performance S... | gs://contracts-mcc/2020/Q1/1000697..0001193125... |


### Using AI.GENERATE with objectRef

In [18]:
%%bigquery --project {PROJECT_ID}

SELECT
company_name,
state_of_incorp,
contract_type,
clauses,
file_path,
AI.GENERATE(
  (
    'Explain the clause in detail',ref
  ),
  connection_id => 'us.contracts_ai_connection',
  endpoint => 'gemini-2.0-flash'
).result as clause_explanation
FROM `contracts_dataset.v_contracts`
where company_name like '%Insperity%'
limit 3


| | company_name | state_of_incorp | contract_type | clauses | file_path | clause_explanation |
|---|---|---|---|---|---|---|
| 0 | Insperity, Inc. | | Restricted Stock Unit Agreement | [Vesting, Change of Control, Termination, Forf... | gs://contracts-mcc/2020/Q1/1000753..0001000753... | Okay, let's break down this Restricted Stock U... |
| 1 | Insperity, Inc. | | Restricted Stock Unit Agreement | [Change of Control, Forfeiture, Clawback, Gove... | gs://contracts-mcc/2020/Q1/1000753..0001000753... | Okay, let's break down this document, which is... |
| 2 | Insperity, Inc. | | Restricted Stock Unit Agreement | [Vesting, Forfeiture, Change of Control, Termi... | gs://contracts-mcc/2020/Q1/1000753..0001000753... | Okay, let's break down this Restricted Stock U... |
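
The query above keeps only `.result`. `AI.GENERATE` returns a struct that also carries a `status` field, so a variant that surfaces it can make failed rows easier to spot. The sketch below is one way to write that; the CTE name `generated` and the `@company_pattern` query parameter are choices made for this example, not part of the original query.

```python
# Variant of the query above that keeps the whole AI.GENERATE struct in a CTE
# and then exposes both the generated text and the per-row status.
GENERATE_WITH_STATUS_SQL = """
WITH generated AS (
  SELECT
    company_name,
    file_path,
    AI.GENERATE(
      ('Explain the clause in detail', ref),
      connection_id => 'us.contracts_ai_connection',
      endpoint => 'gemini-2.0-flash'
    ) AS response
  FROM `contracts_dataset.v_contracts`
  WHERE company_name LIKE @company_pattern
  LIMIT 3
)
SELECT
  company_name,
  file_path,
  response.result AS clause_explanation,
  response.status AS status
FROM generated
""".strip()

print(GENERATE_WITH_STATUS_SQL)

# To run it with the notebook's `client`, bind the parameter, for example:
#   job_config = bigquery.QueryJobConfig(query_parameters=[
#       bigquery.ScalarQueryParameter("company_pattern", "STRING", "%Insperity%"),
#   ])
#   df = client.query(GENERATE_WITH_STATUS_SQL, job_config=job_config).to_dataframe()
```

Binding the pattern as a query parameter avoids pasting user-supplied text into the SQL string.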
