<h1 align="center"> Generative AI Hackathon</h1>
<table align="center">
    <td>
        <a href="https://colab.research.google.com/github/teamdatatonic/gen-ai-hackathon/blob/main/hackathon.ipynb">
            <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo">
            <span style="vertical-align: middle;">Run in Colab</span>
        </a>
    </td>
    <td>
        <a href="https://github.com/teamdatatonic/gen-ai-hackathon/blob/main/hackathon.ipynb">
            <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
            <span style="vertical-align: middle;">View on GitHub</span>
        </a>
    </td>
    <td>
        <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/teamdatatonic/gen-ai-hackathon/main/hackathon.ipynb">
            <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"> 
            <span style="vertical-align: middle;">Open in Vertex AI Workbench</span>
        </a>
    </td>
</table>
<hr>

**➡️ Your task:** Learn about Generative AI by building your own database analytics tool using Python and LangChain!

**❗ Note:** This workshop is designed to run in Google Colab. Running it locally or in Vertex AI Workbench is supported, but we strongly recommend Colab for the best experience.


**❗ Note:** If your kernel doesn't restart automatically, click the "Restart Runtime" button above your notebook.
If you don't see a restart button, open the "Runtime" menu and select "Restart Runtime". After restarting, continue running the notebook from below this cell.

## Accessing the Vertex AI Endpoint

Currently, Vertex AI LLMs are accessible via Google Cloud projects. 

1. Set the variables `project_id` and `dataset_id` to your Google Cloud project ID and BigQuery dataset ID (**❗ Note:** in Colab, the `/content/` folder is where uploaded files are stored by default).

In [23]:
# Replace these values with your own Google Cloud project ID and BigQuery dataset ID
project_id = 'dt-genai-analytics-dev'
dataset_id = 'database_analytics_demo_v2'
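
If you would rather not hard-code these values in the notebook, one option is to read them from environment variables and fall back to the defaults above. This is only a minimal sketch: the variable names `GCP_PROJECT_ID` and `BQ_DATASET_ID` and the helper `load_settings` are assumptions made for illustration, not part of the workshop.

```python
import os


def load_settings(env=None):
    """Return (project_id, dataset_id), preferring environment variables."""
    # `env` defaults to the real environment; a plain dict can be passed instead
    env = os.environ if env is None else env
    # GCP_PROJECT_ID and BQ_DATASET_ID are hypothetical names chosen for this sketch
    project_id = env.get("GCP_PROJECT_ID", "dt-genai-analytics-dev").strip()
    dataset_id = env.get("BQ_DATASET_ID", "database_analytics_demo_v2").strip()
    if not project_id or not dataset_id:
        raise ValueError("project_id and dataset_id must both be non-empty")
    return project_id, dataset_id


project_id, dataset_id = load_settings()
```

Failing early on an empty value tends to give a clearer error than a later BigQuery connection failure would.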

### Setting up the BigQuery client

In [24]:
from google.cloud import bigquery

# Initialize the BigQuery client
client = bigquery.Client(project=project_id)

# Test the connection by listing datasets in the project
datasets = list(client.list_datasets())
print(f"Datasets in project {project_id}:")
if datasets:
    for dataset in datasets:
        print(dataset.dataset_id)
else:
    print(f"No datasets found in project {project_id}")

Datasets in project dt-genai-analytics-dev:
database_analytics_demo_v2
logging

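Once the dataset shows up in the listing, a natural next check is to see which tables it contains. The sketch below assumes the `client`, `project_id` and `dataset_id` defined in the cells above. `list_table_ids` is a hypothetical helper name; it relies on `Client.list_tables`, which accepts a `project.dataset` string.

```python
def list_table_ids(client, project_id, dataset_id):
    """Return the sorted table IDs found in one BigQuery dataset."""
    # list_tables accepts a fully qualified "project.dataset" string
    tables = client.list_tables(f"{project_id}.{dataset_id}")
    return sorted(table.table_id for table in tables)


# Usage in the notebook (needs Google Cloud credentials):
# print(list_table_ids(client, project_id, dataset_id))
```
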

### Pip install package dependencies

In [25]:
# Uncomment and run these lines once if the packages are not already installed
# %pip install pybigquery
# %pip install SQLAlchemy==1.4.49
# %pip install pyarrow
# %pip install google-cloud-bigquery-storage
# %pip install sqlalchemy-bigquery==1.7.0
# %pip install google-cloud-aiplatform
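
Before moving on, it can help to confirm which of these packages are actually present in the current runtime. The following is a small sketch using the standard library's `importlib.metadata`; the helper `installed_versions` is a hypothetical name, and the package list is copied from the cell above.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


required = [
    "SQLAlchemy",
    "sqlalchemy-bigquery",
    "pyarrow",
    "google-cloud-bigquery-storage",
    "google-cloud-aiplatform",
]
for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as missing can then be installed with the matching `%pip install` line.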

### Create database engine and SQL chain

In [26]:

from langchain.chains import SQLDatabaseSequentialChain
from langchain.llms import VertexAI
from langchain.sql_database import SQLDatabase
from sqlalchemy.engine import create_engine

# Connect SQLAlchemy to the BigQuery dataset
engine = create_engine(f"bigquery://{project_id}/{dataset_id}")

# temperature=0 keeps the generated SQL as deterministic as possible
llm = VertexAI(model_name='text-bison@001',
               temperature=0, max_output_tokens=1024)

# Wrap the engine so LangChain can inspect the tables and run queries
db = SQLDatabase(engine=engine)

# The sequential chain first picks the relevant tables, then generates and runs the SQL
db_chain = SQLDatabaseSequentialChain.from_llm(llm, db, verbose=True)

# Test questions
# db_chain("How many employees are there?")
# db_chain("total_revenue")
db_chain("How many customers are there?")




> Entering new  chain...

Table names to use:
['customers']

> Entering new  chain...
How many customers are there?
SQLQuery:SELECT count(*) FROM customers
SQLResult: [(1000,)]
Answer:1000
> Finished chain.

> Finished chain.


{'query': 'How many customers are there?', 'result': '1000'}
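
The trace above shows the steps the chain goes through: pick the relevant tables, generate a SQL query, run it, and turn the result into an answer. The sketch below mimics that flow against an in-memory SQLite database so it can run without Google Cloud access. It is not LangChain's implementation: `fake_llm_to_sql` is a hypothetical stand-in for the LLM, and the `customers` table with 1000 rows is made up to mirror the demo output.

```python
import sqlite3


def build_demo_db(n_customers=1000):
    """Create an in-memory database with a made-up customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(i, f"customer_{i}") for i in range(1, n_customers + 1)],
    )
    return conn


def fake_llm_to_sql(question):
    """Hypothetical stand-in for the LLM: a fixed question-to-SQL lookup."""
    known = {"How many customers are there?": "SELECT count(*) FROM customers"}
    if question not in known:
        raise ValueError(f"No canned SQL for question: {question!r}")
    return known[question]


def answer(conn, question):
    """Translate the question to SQL, run it, and return a chain-style dict."""
    sql = fake_llm_to_sql(question)
    rows = conn.execute(sql).fetchall()
    # The query returns a single row with a single value, e.g. [(1000,)]
    return {"query": question, "result": str(rows[0][0])}


conn = build_demo_db()
print(answer(conn, "How many customers are there?"))
# {'query': 'How many customers are there?', 'result': '1000'}
```

In the real chain, the lookup table is replaced by a prompt to the LLM that includes the table schemas, which is why the quality of the generated SQL depends on how clearly the tables and columns are named.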