## Install required dependencies and SDK

In [8]:
!pip install --quiet google-cloud-bigquery google-cloud-aiplatform pandas

## Create the environment for the notebook
1. Create a source connection and grant IAM permissions (Steps from https://www.cloudskillsboost.google/course_templates/1210/labs/529948)

2. Create the remote embedding model that will be used to generate embeddings


```
CREATE OR REPLACE MODEL `Alaska_Dataset.Embeddings`
REMOTE WITH CONNECTION `us.embedding_conn`
OPTIONS (ENDPOINT = 'text-embedding-005');
```

3. Create a BigQuery client so that BigQuery SQL can be run directly from the code. (ref: https://cloud.google.com/python/docs/reference/bigquery/latest/google.cloud.bigquery.client.Client#google_cloud_bigquery_client_Client_query)

In [9]:
from google.cloud import bigquery
from vertexai.preview.language_models import ChatModel
import vertexai
import pandas as pd

PROJECT_ID = "qwiklabs-gcp-04-d9bb68112d04"
LOCATION = "global"
BQ_DATASET = "Alaska_Dataset"
TABLE_RAW = "Alaska_Dataset_data"
TABLE_EMBEDDED = "Alaska_Dataset_embedded"
EMBED_MODEL = "Alaska_Dataset_embeddings"
TABLE_ID = f"{PROJECT_ID}.{BQ_DATASET}.{TABLE_EMBEDDED}"
RAW_TABLE_ID = f"{PROJECT_ID}.{BQ_DATASET}.{TABLE_RAW}"
EMBED_MODEL_ID = f"{PROJECT_ID}.{BQ_DATASET}.{EMBED_MODEL}"

vertexai.init(project=PROJECT_ID, location=LOCATION)
bq_client = bigquery.Client(project=PROJECT_ID)
print("Environment set up.", TABLE_ID, EMBED_MODEL_ID)

Environment set up. qwiklabs-gcp-04-d9bb68112d04.Alaska_Dataset.Alaska_Dataset_embedded qwiklabs-gcp-04-d9bb68112d04.Alaska_Dataset.Alaska_Dataset_embeddings


## CSV upload to the BigQuery table

1. Load the data from gs://labs.roitraining.com/alaska-dept-of-snow/alaska-dept-of-snow-faqs.csv into the table (Alaska_Dataset_data)

In [10]:
uri = "gs://labs.roitraining.com/alaska-dept-of-snow/alaska-dept-of-snow-faqs.csv"
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
    write_disposition="WRITE_TRUNCATE"
)
load_job = bq_client.load_table_from_uri(uri, RAW_TABLE_ID, job_config=job_config)
load_job.result()
print(f"FAQ CSV loaded into BigQuery table: {RAW_TABLE_ID}")

FAQ CSV loaded into BigQuery table: qwiklabs-gcp-04-d9bb68112d04.Alaska_Dataset.Alaska_Dataset_data


## Create Embedding Model
1. Create a remote model in BigQuery ML.
   The SQL command registers the remote Vertex AI model (text-embedding-005) as a BigQuery ML model so that it can be used in ML.GENERATE_EMBEDDING() queries.

2. Use a preconfigured connection.
   The REMOTE WITH CONNECTION clause refers to the BigQuery connection (us.embedding_conn) that links BigQuery to Vertex AI's embedding endpoint.

3. Execute the SQL via the Python client.
   The Python code runs the SQL command with BigQuery's Python client and waits for it to complete by calling .result().

In [11]:
create_model_sql = f"""
CREATE OR REPLACE MODEL `{EMBED_MODEL_ID}`
REMOTE WITH CONNECTION `us.embedding_conn`
OPTIONS (ENDPOINT = 'text-embedding-005');
"""
bq_client.query(create_model_sql).result()
print("Remote embedding model created.")

Remote embedding model created.


## Generate Embeddings Using BigQuery ML

1. Generate embeddings with BigQuery ML: ML.GENERATE_EMBEDDING produces an embedding vector for the `content` of each row using the referenced model.
2. Store the results in a new table.
3. Execute via the BigQuery Python client.

### Note
1. Because the CSV was loaded with schema autodetection, the column names were not known in advance; BigQuery generated **string_field_0** and **string_field_1** automatically.
2. Here:
   1. **string_field_0** - Question
   2. **string_field_1** - Answer
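
One possible way to avoid the generated column names is to pass an explicit schema when loading the CSV instead of relying on autodetection. The sketch below is only an illustration: the column names `question` and `answer` and the helper `build_faq_load_config` are hypothetical, and `skip_leading_rows=1` is simply carried over from the load cell above.

```python
# Hypothetical column names: the notebook only states that column 0 holds
# the question and column 1 holds the answer.
FAQ_COLUMNS = [("question", "STRING"), ("answer", "STRING")]


def build_faq_load_config():
    """Return a LoadJobConfig that names the columns explicitly.

    This is an alternative to autodetect=True, which produced
    string_field_0 / string_field_1 in this notebook.
    """
    # Imported lazily so the module can be inspected without the SDK installed.
    from google.cloud import bigquery

    return bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,  # same setting as the original load cell
        schema=[
            bigquery.SchemaField(name, column_type)
            for name, column_type in FAQ_COLUMNS
        ],
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
```

If the table were loaded this way, the embedding query would select `question` and `answer` directly rather than aliasing `string_field_0` and `string_field_1`.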

In [12]:
generate_embed_sql = f"""
CREATE OR REPLACE TABLE `{TABLE_ID}` AS
SELECT *, ml_generate_embedding_result AS embedding
FROM ML.GENERATE_EMBEDDING(
  MODEL `{EMBED_MODEL_ID}`,
  (
    SELECT CONCAT(string_field_0, ' ', string_field_1) AS content,
           string_field_0 AS question,
           string_field_1 AS answer
    FROM `{RAW_TABLE_ID}`
  )
);
"""
bq_client.query(generate_embed_sql).result()
print("Embeddings generated and stored.")

Embeddings generated and stored.
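
As a quick sanity check on the new table, something like the following could be run. It is a sketch rather than part of the original notebook: the helper name `build_embedding_check_sql` is hypothetical, and it assumes the `ml_generate_embedding_status` column returned by ML.GENERATE_EMBEDDING is still present in the table (the `SELECT *` above should keep it), where an empty status means the row succeeded.

```python
def build_embedding_check_sql(table_id):
    """Build a query that summarises the embedded table.

    Reports the total row count, the rows whose embedding call reported a
    non-empty status (treated here as failures), and the smallest and
    largest embedding length.
    """
    return f"""
SELECT
  COUNT(*) AS total_rows,
  COUNTIF(ml_generate_embedding_status != '') AS failed_rows,
  MIN(ARRAY_LENGTH(embedding)) AS min_dim,
  MAX(ARRAY_LENGTH(embedding)) AS max_dim
FROM `{table_id}`
"""


def check_embeddings(bq_client, table_id):
    """Run the summary query and return its single row as a dict."""
    rows = bq_client.query(build_embedding_check_sql(table_id)).result()
    return [dict(row.items()) for row in rows][0]
```

With the variables defined earlier, `check_embeddings(bq_client, TABLE_ID)` would return one summary row; a non-zero `failed_rows` or differing `min_dim` and `max_dim` would suggest that some rows need to be regenerated.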


Loading vector search results from the BigQuery table is handled in the Python app for the chatbot.
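
The app code is not shown in this notebook, so the following is only a minimal sketch of how such a lookup might be written with BigQuery's `VECTOR_SEARCH` function. The helper names `build_vector_search_sql` and `search_faqs`, the `@question` parameter, the default `top_k` of 3, and the cosine distance are all assumptions made for illustration.

```python
def build_vector_search_sql(table_id, model_id, top_k=3):
    """Build a VECTOR_SEARCH query over the embedded FAQ table.

    The user question is passed as the @question query parameter, embedded
    with the same remote model, and compared against the `embedding` column.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    return f"""
SELECT base.question AS question, base.answer AS answer, distance
FROM VECTOR_SEARCH(
  TABLE `{table_id}`,
  'embedding',
  (
    SELECT ml_generate_embedding_result AS embedding
    FROM ML.GENERATE_EMBEDDING(
      MODEL `{model_id}`,
      (SELECT @question AS content)
    )
  ),
  top_k => {int(top_k)},
  distance_type => 'COSINE'
)
ORDER BY distance
"""


def search_faqs(bq_client, question, table_id, model_id, top_k=3):
    """Run the search and return the closest FAQ rows as a list of dicts."""
    # Imported lazily so the SQL builder can be used without the SDK installed.
    from google.cloud import bigquery

    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("question", "STRING", question)
        ]
    )
    sql = build_vector_search_sql(table_id, model_id, top_k)
    rows = bq_client.query(sql, job_config=job_config).result()
    return [dict(row.items()) for row in rows]
```

With the variables defined earlier, a call might look like `search_faqs(bq_client, "some user question", TABLE_ID, EMBED_MODEL_ID)`. Passing the question as a query parameter, rather than formatting it into the SQL string, keeps user input out of the statement text.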