## BigFrames LLM

### Overview

Use this notebook to walk through an example use case of generating sample code by using BigQuery DataFrames and its integration with Generative AI support on Vertex AI.

Learn more about [BigQuery DataFrames](https://cloud.google.com/python/docs/reference/bigframes/latest).

## Installation

Install the following packages, which are required to run this notebook:


In [2]:
!pip install bigframes --upgrade --quiet


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m23.1.2[0m[39;49m -> [0m[32;49m23.2.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


## Before you begin

### Set up your Google Cloud project

**Set your project ID**

If you don't know your project ID, try the following:

- Run ``gcloud config list``.

- Run ``gcloud projects list``.

See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113).
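If you would rather look up the project ID from Python, the following is a minimal sketch. It assumes the ``GOOGLE_CLOUD_PROJECT`` environment variable or an installed ``gcloud`` CLI is available; the helper name ``resolve_project_id`` is hypothetical and not part of any library.

```python
import os
import shutil
import subprocess


def resolve_project_id(env=None):
    """Return a project ID from the environment or the gcloud config, else None."""
    env = os.environ if env is None else env
    # GOOGLE_CLOUD_PROJECT is the variable Google client libraries consult.
    project = env.get("GOOGLE_CLOUD_PROJECT")
    if project:
        return project
    # Fall back to the active gcloud configuration, if the CLI is installed.
    if shutil.which("gcloud") is None:
        return None
    result = subprocess.run(
        ["gcloud", "config", "get-value", "project"],
        capture_output=True,
        text=True,
        check=False,
    )
    value = result.stdout.strip()
    if result.returncode != 0 or not value or value == "(unset)":
        return None
    return value


# "my-project" is a placeholder value used only for this demonstration.
print(resolve_project_id({"GOOGLE_CLOUD_PROJECT": "my-project"}))
```

Returning ``None`` instead of raising lets the notebook fall back to a hard-coded ``PROJECT_ID`` when neither source is available.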

In [3]:
PROJECT_ID = "bigframes-dev"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}



**Set the region**

You can also change the ``REGION`` variable, which sets the location that BigQuery uses.

Learn more about [BigQuery regions](https://cloud.google.com/bigquery/docs/locations#supported_locations).

In [4]:
REGION = "US"  # @param {type: "string"}

### Authenticate your Google Cloud account

Run the following cell to authenticate:

In [5]:
! gcloud auth login

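Google client libraries authenticate through Application Default Credentials (ADC), so it can help to check whether a local ADC file exists before creating any clients. The sketch below is an assumption-laden check, not an official API: ``find_adc_file`` is a hypothetical helper, it only looks at the ``GOOGLE_APPLICATION_CREDENTIALS`` variable and the default Linux/macOS location, and it ignores other sources such as an attached service account.

```python
import os
from pathlib import Path


def find_adc_file(env=None, home=None):
    """Return the path of a local ADC credentials file, or None if none is found."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # 1. An explicit key file named by GOOGLE_APPLICATION_CREDENTIALS wins.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit and Path(explicit).is_file():
        return Path(explicit)
    # 2. Otherwise look for the file written by
    #    `gcloud auth application-default login` (Linux/macOS location).
    well_known = home / ".config" / "gcloud" / "application_default_credentials.json"
    return well_known if well_known.is_file() else None


adc = find_adc_file()
print(adc if adc else "No local ADC file found; run `gcloud auth application-default login`.")
```

If the check finds nothing, running ``gcloud auth application-default login`` is the usual way to create the file on a local machine.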


### Import libraries

In [6]:
import bigframes.pandas as bf
from google.cloud import bigquery_connection_v1 as bq_connection

### Set BigQuery DataFrames options

In [7]:
bf.options.bigquery.project = PROJECT_ID
bf.options.bigquery.location = REGION

If you want to reset the location of the created DataFrame or Series objects, reset the session by executing ``bf.reset_session()``. After that, you can reuse ``bf.options.bigquery.location`` to specify another location.
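The order of operations described above can be sketched as a small helper. This is an illustration only: ``switch_location`` is a hypothetical name, ``reset_session`` is taken from the text above and may be named differently in other BigQuery DataFrames versions, and a stand-in object replaces ``bigframes.pandas`` so that the example runs without Google Cloud access.

```python
from types import SimpleNamespace


def switch_location(bf_module, new_location):
    """Reset the global session, then point BigQuery DataFrames at a new location."""
    # Reset first: the location is only set again after the session is reset.
    bf_module.reset_session()
    bf_module.options.bigquery.location = new_location


# Stand-in for `bigframes.pandas`; it records calls instead of contacting BigQuery.
calls = []
fake_bf = SimpleNamespace(
    reset_session=lambda: calls.append("reset_session"),
    options=SimpleNamespace(bigquery=SimpleNamespace(location="US")),
)

switch_location(fake_bf, "EU")
print(calls, fake_bf.options.bigquery.location)
```

In the notebook itself you would pass the real module, for example ``switch_location(bf, "EU")``.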

## Define the LLM model

BigQuery DataFrames integrates with the ``text-bison`` [model of the PaLM API](https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/text) through Vertex AI.


This section walks through the steps required to use the model in your notebook.

### Create a BigQuery Cloud resource connection

You need to create a [Cloud resource connection](https://cloud.google.com/bigquery/docs/create-cloud-resource-connection) to enable BigQuery DataFrames to interact with Vertex AI services.

In [8]:
CONN_NAME = "bqdf-llm"

client = bq_connection.ConnectionServiceClient()
new_conn_parent = f"projects/{PROJECT_ID}/locations/{REGION}"
exists_conn_parent = f"projects/{PROJECT_ID}/locations/{REGION}/connections/{CONN_NAME}"
cloud_resource_properties = bq_connection.CloudResourceProperties({})

from google.api_core.exceptions import NotFound

try:
    # Reuse the connection if it already exists.
    existing = client.get_connection(
        request=bq_connection.GetConnectionRequest(name=exists_conn_parent)
    )
    CONN_SERVICE_ACCOUNT = f"serviceAccount:{existing.cloud_resource.service_account_id}"
except NotFound:
    # The connection does not exist yet, so create it.
    connection = bq_connection.types.Connection(
        {"friendly_name": CONN_NAME, "cloud_resource": cloud_resource_properties}
    )
    request = bq_connection.CreateConnectionRequest(
        {
            "parent": new_conn_parent,
            "connection_id": CONN_NAME,
            "connection": connection,
        }
    )
    response = client.create_connection(request)
    CONN_SERVICE_ACCOUNT = (
        f"serviceAccount:{response.cloud_resource.service_account_id}"
    )

# The service account is needed in the next step to grant permissions.
print(CONN_SERVICE_ACCOUNT)



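The cell above uses two resource-name formats, and a third, dotted form is used later in this notebook when the connection is passed to the model. The following sketch collects all three in one place; ``ConnectionNames`` and ``connection_names`` are hypothetical helpers, and ``my-project`` is a placeholder project ID.

```python
from typing import NamedTuple


class ConnectionNames(NamedTuple):
    parent: str         # where a new connection is created
    resource_name: str  # full name used to look up an existing connection
    dotted_id: str      # project.location.name form passed to the model


def connection_names(project_id, region, conn_name):
    """Build the connection identifiers used in this notebook."""
    parent = f"projects/{project_id}/locations/{region}"
    return ConnectionNames(
        parent=parent,
        resource_name=f"{parent}/connections/{conn_name}",
        dotted_id=f"{project_id}.{region}.{conn_name}",
    )


names = connection_names("my-project", "US", "bqdf-llm")
print(names.parent)
print(names.resource_name)
print(names.dotted_id)
```

Keeping the three strings together makes it harder to mix up the slash-separated API name with the dotted identifier.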

### Set permissions for the service account

The resource connection service account requires certain project-level permissions:

- ``roles/aiplatform.user`` and ``roles/bigquery.connectionUser``: These roles are required for the connection to create a model definition using the LLM model in Vertex AI ([documentation](https://cloud.google.com/bigquery/docs/generate-text#give_the_service_account_access)).
- ``roles/run.invoker``: This role is required for the connection to have read-only access to Cloud Run services that back custom/remote functions ([documentation](https://cloud.google.com/bigquery/docs/remote-functions#grant_permission_on_function)).

Set these permissions by running the following ``gcloud`` commands:

In [8]:
!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/bigquery.connectionUser'
!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/aiplatform.user'
!gcloud projects add-iam-policy-binding {PROJECT_ID} --condition=None --no-user-output-enabled --member={CONN_SERVICE_ACCOUNT} --role='roles/run.invoker'

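The three role bindings can also be generated from a list, which avoids repeating the long command. This is a minimal sketch under stated assumptions: ``build_binding_commands`` and ``grant_roles`` are hypothetical helpers, the service account shown is a placeholder, and the default is a dry run that only prints the commands.

```python
import shlex
import shutil
import subprocess

# Roles listed in the section above.
ROLES = (
    "roles/bigquery.connectionUser",
    "roles/aiplatform.user",
    "roles/run.invoker",
)


def build_binding_commands(project_id, member, roles=ROLES):
    """Return one gcloud argument list per role for the given member."""
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            "--condition=None",
            "--no-user-output-enabled",
            f"--member={member}",
            f"--role={role}",
        ]
        for role in roles
    ]


def grant_roles(project_id, member, dry_run=True):
    """Print each command; run it only when dry_run is False and gcloud exists."""
    for cmd in build_binding_commands(project_id, member):
        print(shlex.join(cmd))
        if not dry_run:
            if shutil.which("gcloud") is None:
                raise RuntimeError("gcloud CLI not found on PATH")
            subprocess.run(cmd, check=True)


# Placeholder values; in the notebook, pass PROJECT_ID and CONN_SERVICE_ACCOUNT.
grant_roles("my-project", "serviceAccount:example@my-project.iam.gserviceaccount.com")
```

Checking for ``gcloud`` before running gives a clear error when the CLI is missing, instead of a shell "command not found" message.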


### Using LangChain BigFramesLLM

In [9]:
session = bf.get_global_session()
connection = f"{PROJECT_ID}.{REGION}.{CONN_NAME}"



In [1]:
from langchain.llms import BigFramesLLM

llm = BigFramesLLM(
    session=session,
    connection=connection,
    model="PaLM2TextGenerator",
    max_new_tokens=128,
    top_k=10,
    top_p=0.95,
    temperature=0.8,
)

print(llm("What is the capital of France?"))

ImportError: cannot import name 'BigFramesLLM' from 'langchain.llms' (/home/vscode/.local/lib/python3.11/site-packages/langchain/llms/__init__.py)

### Integrate the model in an LLMChain

In [26]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

template = """Question: {question}

Answer: Let's think step by step."""
prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(prompt=prompt, llm=llm)

# answer is a bigframes dataframe
answer = llm_chain.run("What is BigFrames?")

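To see what text the chain would send to the model, the template can be filled in with plain Python. This sketch assumes the template uses f-string style placeholders, which is the default format for ``PromptTemplate``; ``render_prompt`` is a hypothetical helper and does not call LangChain or the model.

```python
template = """Question: {question}

Answer: Let's think step by step."""


def render_prompt(template, **variables):
    """Fill the {placeholders} in the template with the given values."""
    return template.format(**variables)


prompt_text = render_prompt(template, question="What is BigFrames?")
print(prompt_text)
```

Printing the rendered prompt is a quick way to confirm that the placeholder names match ``input_variables`` before running the chain.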

### Set BigQuery DataFrames options

In [4]:
import bigframes.pandas as bpd

In [5]:
bpd.options.bigquery.project = "bigframes-dev"
bpd.options.bigquery.location = "us"