# IBM & DataStax Demo - Banking AI Agent

## Before start

Check the prereqs on the README.md file.

## Part III - Agents and NoSQL

- Create a CQL table to store banking transactions
- Load sample data
- Connect to Astra using the Data API
- Create an Banking Agent Flow on Langflow.
- Connect the Astra DB Tools to the agent.
- Run the Flow through the Langflow API


# Astra DB NoSQL Tables

In the previous steps, we focused on the **Collection** data model — a semi-structured format that allows customers to store and query flexible JSON documents.

However, for some use cases where very low latency and throughput are required, tables should be the data model to choose.

In the next steps, we will create a table, insert data using CQL (Cassandra Query Language) and Data-API/Astrapy (Python client library).


# Astra DB - Banking use case with NoSQL

We will generate a banking statement with transactions. To do that, we will create a table on the the same database from the previous steps and load some sample data.

## Astra DB - Driver connection with CQL

To start, you will need to download the [Secure Connect Bundle](https://docs.datastax.com/en/astra-db-serverless/databases/secure-connect-bundle.html) on the Astra DB Dashboard.

<img src="./img/scb.png" alt="Secure Connect Bundle" width="600"/>

CLick on the "Download SCB" button. Copy the cURL command:

<img src="./img/scb1.png" alt="Secure Connect Bundle" width="600"/>


In [None]:
# Run the curl command on the line below (add the exclamation mark at the start of the line) to download the secure-connect bundle for the Astra DB instance.
# This bundle is used to connect to the Astra DB instance from the Python code.
# The bundle contains the necessary certificates and configuration files.
!curl -o secure-connect-astra-ibm-demo.zip '<your secure-connect bundle URL>'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 12323  100 12323    0     0  24218      0 --:--:-- --:--:-- --:--:-- 24210


You should see the `secure-connect-astra-ibm-demo.zip`file on this folder

With the file and environment variables set, lets connect to the DB


In [1]:
# Import necessary libraries
from cassandra.query import SimpleStatement, PreparedStatement, BatchStatement
from cassandra.cluster import Cluster, PlainTextAuthProvider

In [43]:
# Make sure the environment variables are set
import os
from dotenv import load_dotenv
load_dotenv(override=True)

if os.getenv("ASTRA_DB_APPLICATION_TOKEN") is None:
    raise ValueError("Environment variable ASTRA_DB_APPLICATION_TOKEN not set")

if os.getenv("ASTRA_DB_APPLICATION_TOKEN")[:8] != "AstraCS:":
    raise ValueError(
        "Environment variable ASTRA_DB_APPLICATION_TOKEN invalid format")

if ".apps.astra.datastax.com" not in os.getenv("ASTRA_DB_API_ENDPOINT"):
    raise ValueError("Environment variable ASTRA_DB_API_ENDPOINT invalid")

print(
    f'Astra Token: {os.getenv("ASTRA_DB_APPLICATION_TOKEN")[:10]}...{os.getenv("ASTRA_DB_APPLICATION_TOKEN")[-5:]}')
print(f'Astra Endpoint: ...{os.getenv("ASTRA_DB_API_ENDPOINT")[-25:]}')
print("Good to go!")

Astra Token: AstraCS:bf...35ec6
Astra Endpoint: ...2.apps.astra.datastax.com
Good to go!


In [3]:
# create a connection to the Astra DB
cluster = Cluster(
    cloud={
        "secure_connect_bundle": "./secure-connect-astra-ibm-demo.zip",
    },
    auth_provider=PlainTextAuthProvider(
        "token",
        os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
    ),
)
session = cluster.connect("default_keyspace")

# Execute a simple query to test the connection
rs = session.execute("SELECT * FROM system_schema.tables where keyspace_name = 'default_keyspace' limit 1")
print(rs.one())

Row(keyspace_name='default_keyspace', table_name='banking_knowledge_layer', additional_write_policy='99p', bloom_filter_fp_chance=0.01, caching=OrderedMapSerializedKey([('keys', 'ALL'), ('rows_per_partition', 'NONE')]), cdc=None, comment='{"collection":{"name":"banking_knowledge_layer","schema_version":1,"options":{"vector":{"dimension":1024,"metric":"cosine","sourceModel":"OTHER","service":{"provider":"nvidia","modelName":"NV-Embed-QA"}},"defaultId":{"type":""},"lexical":{"enabled":true,"analyzer":"standard"},"rerank":{"enabled":true,"service":{"provider":"nvidia","modelName":"nvidia/llama-3.2-nv-rerankqa-1b-v2","authentication":null,"parameters":null}}}}}', compaction=OrderedMapSerializedKey([('class', 'org.apache.cassandra.db.compaction.UnifiedCompactionStrategy')]), compression=OrderedMapSerializedKey([('chunk_length_in_kb', '16'), ('class', 'org.apache.cassandra.io.compress.LZ4Compressor')]), crc_check_chance=1.0, dclocal_read_repair_chance=0.0, default_time_to_live=0, extensions=

Before creating the table, a summary about the Cassandra Data Modeling:

<img src="./img/data_modeling.png" alt="Data Modeling" width="600"/>

In [40]:
# Lets create a table to store transactions by month

create_table = """
CREATE TABLE IF NOT EXISTS transactions_by_month (
    account_id UUID,
    year_month TEXT,                -- Format: 'YYYYMM', e.g., '202504'
    transaction_timestamp TIMESTAMP,
    transaction_id UUID,
    amount DECIMAL,
    currency TEXT,
    channel TEXT,                  -- e.g., 'app', 'credit_card'
    location TEXT,                 -- e.g., 'Geo: -23.5,-46.6'
    merchant TEXT,
    description TEXT,
    balance_after DECIMAL,         -- NEW: balance after this transaction

    PRIMARY KEY ((account_id, year_month), transaction_timestamp, transaction_id)
) WITH CLUSTERING ORDER BY (transaction_timestamp DESC, transaction_id ASC);

"""

session.execute(create_table)
print("Table transactions_by_month created successfully.")

Table transactions_by_month created successfully.


Note the PRIMARY KEY definition:

```
PRIMARY KEY ((account_id, year_month), transaction_timestamp, transaction_id)
```

The combined account_id and year_month are the partition key
transaction_timestamp and transaction_id are the clustering keys

This allows us to query transactions by account_id and year_month, and sort them by transaction_timestamp and transaction_id


### Astra DB CQL Console

The Astra DB CQL Console is a web-based tool that allows you to interact with the database using the CQL (Cassandra Query Language). It provides a user-friendly interface for executing queries, managing tables, and monitoring the database.

To access the CQL Console, navigate to the Astra DB Dashboard and click on the "CQL Console" button.

<img src="./img/console.png" alt="Console" width="600"/>

The DESCRIBE command is used to get the table schema.

```
USE default_keyspace;
DESCRIBE transactions_by_month;
```

<img src="./img/console1.png" alt="Console" width="600"/>

Now, lets load some data into the table.

In [None]:
# Run this cell to generate transactions
import uuid
import random
from datetime import datetime, timedelta
from faker import Faker
import random

fake = Faker()
start_date = datetime(2025, 6, 1)
end_date = datetime(2025, 7, 1)
# Define possible channels and categories
channels = ['mobile_app', 'web_banking', 'atm', 'branch', 'credit_card']
categories = ['groceries', 'dining', 'transportation',
              'entertainment', 'shopping', 'utilities', 'healthcare', 'travel']

# Prepare the insert statement
cmd = session.prepare("""INSERT INTO transactions_by_month (
    account_id, 
    year_month, 
    transaction_timestamp, 
    transaction_id, 
    amount, 
    currency, 
    channel, 
    location, 
    merchant, 
    description, 
    balance_after) VALUES (
    :account_id, 
    :year_month, 
    :transaction_timestamp, 
    :transaction_id, 
    :amount, 
    :currency, 
    :channel, 
    :location, 
    :merchant, 
    :description, 
    :balance_after)""")

# Prepare the batch statement
batch = BatchStatement()

# Generate 50 transactions for each customer
batch = BatchStatement()
customers = [uuid.uuid4() for _ in range(10)]
merchants = ['Walmart', 'Target', 'Amazon', 'Starbucks', 'McDonalds', 'Uber', 'Lyft', 'Netflix', 'Spotify', 'Apple Store']

for customer in customers:
    account_id = uuid.uuid4() 
    year_month = '202506'  # June 2025
    balance = random.uniform(1000.00, 10000.00)  # Starting balance
    current_date = start_date
    time_increment = (end_date - start_date) / 60  # Divide the month into 50 equal intervals
    
    
    for _ in range(50):
        transaction_id = uuid.uuid4()
        transaction_timestamp = current_date
        current_date += time_increment
        amount = round(random.uniform(-500.00, 500.00), 2)
        balance += amount
        merchant = random.choice(merchants)
        
        batch.add(cmd, {
            'account_id': account_id,
            'year_month': year_month,
            'transaction_timestamp': transaction_timestamp,
            'transaction_id': transaction_id,
            'amount': amount,
            'currency': 'USD',
            'channel': random.choice(channels),
            'location': 'Geo: -23.5,-46.6',
            'merchant': random.choice(merchants),
            'description': f"{random.choice(merchants)} - {random.choice(categories)}",
            'balance_after': balance
        })

    # Execute the batch
    session.execute(batch)
    print(f"Successfully inserted {len(batch) } transactions for {account_id} customer")
    batch.clear()
    
   

Successfully inserted 50 transactions for 0a662701-cb5d-4253-834c-b28233a832fa customer
Successfully inserted 50 transactions for a93236cc-2495-479b-8154-4eced1c1df87 customer
Successfully inserted 50 transactions for 7049e58a-36ab-44e6-8a59-66dd2bf40fff customer
Successfully inserted 50 transactions for 67c3b9fb-3c76-4b4d-8a69-0905ce046388 customer
Successfully inserted 50 transactions for 28d3991d-3e61-41fb-921f-4b247156acdd customer
Successfully inserted 50 transactions for 1ed051a5-d1db-4909-9c94-3fec92abf00c customer
Successfully inserted 50 transactions for c45b4f5f-5c7b-41b3-b969-5cf53476e2f7 customer
Successfully inserted 50 transactions for a34fd963-648b-4730-9069-38c7dab512bb customer
Successfully inserted 50 transactions for 0ba7bedf-94cd-409d-bed7-123715e884e1 customer
Successfully inserted 50 transactions for aabdf051-bc6b-4c8a-820f-cfa094ba78ad customer


Check the result on the CQL Console.

```
SELECT * FROM default_keyspace.transactions_by_month LIMIT 1;
```

<img src="./img/console2.png" alt="Console" width="600"/>

## Astra DB Data API for NoSQL Tables

It is possible to access the Astra DB NoSQL tables through the the [Data-API](https://docs.datastax.com/en/astra-db-serverless/api-reference/dataapiclient.html), a REST API service available on the Astra DB.

This connection method is recommended when the developer is used to simplify the app development by avoiding CQL commands and abstract from driver complexities.

https://docs.datastax.com/en/astra-db-serverless/api-reference/tables.html

The Data API is also available with clients for the languages:

- Python: [Astrapy](https://github.com/datastax/astrapy)
- TypeScript: [astra-db-ts](https://github.com/datastax/astra-db-ts)
- Java: [astra-db-java](https://github.com/datastax/astra-db-java)

Other languages can use HTTP commmands to interact with the Data API - check the [Postman collection](https://www.postman.com/datastax/stargate-cassandra/collection/knuwhcu/data-api-vector-collection?tab=overview) for examples.

### Astrapy

*Astrapy* is the Python client we will use in the next section.

It can be used to create a connection with Astra DB, create objects (tables, indexes and collections) and interact with tables and collections.

As we already created a table, let's connect to Astra and query the table.

In [None]:
from astrapy import DataAPIClient
client = DataAPIClient()
astra_db = client.get_database(os.getenv("ASTRA_DB_API_ENDPOINT"), 
                               token=os.getenv("ASTRA_DB_APPLICATION_TOKEN"),
                               keyspace="default_keyspace")

With the connection established, we can create a reference to the table and query it.

In [None]:
transactions_table = astra_db.get_table( "transactions_by_month")
rs = transactions_table.find({"account_id": "0a662701-cb5d-4253-834c-b28233a832fa",
                        "year_month": "202506"},
                             limit=3)

for row in rs:
    print(row)

{'account_id': UUID('0a662701-cb5d-4253-834c-b28233a832fa'), 'year_month': '202506', 'transaction_timestamp': DataAPITimestamp(timestamp_ms=1750852800000 [2025-06-25T12:00:00.000Z]), 'transaction_id': UUID('8eea0b07-6e58-4a7b-ae74-86b518918168'), 'amount': Decimal('394.16000000000002501110429875552654266357421875'), 'balance_after': Decimal('6458.32253184610817697830498218536376953125'), 'channel': 'credit_card', 'currency': 'USD', 'description': 'Lyft - groceries', 'location': 'Geo: -23.5,-46.6', 'merchant': 'Netflix'}
{'account_id': UUID('0a662701-cb5d-4253-834c-b28233a832fa'), 'year_month': '202506', 'transaction_timestamp': DataAPITimestamp(timestamp_ms=1750809600000 [2025-06-25T00:00:00.000Z]), 'transaction_id': UUID('82311281-8893-4951-9feb-07266821a63f'), 'amount': Decimal('330.05000000000001136868377216160297393798828125'), 'balance_after': Decimal('6064.1625318461083224974572658538818359375'), 'channel': 'branch', 'currency': 'USD', 'description': 'Spotify - entertainment', 'l

# Langflow Agent

We will create an AI Agent that access the table with banking transactions.

The flow is available on this repo (file: Part_III - Langflow Agent with NoSQL data from Astra DB.json). 

After accessing Langflow, create a blank flow:

<img src="./img/lf_blank_flow.png" alt="Console" width="600"/>

Then, you can drag the json file to the canvas or use the import option on the menu:

<img src="./img/lf_import.png" alt="Console" width="600"/>

After import, your flow will be something like this:

<img src="./img/lf_agent.png" alt="Console" width="600"/>

## Customizing the flow.

### Text Input

Change the content of the component "Text Input" with a account_id stored on yout database. Execute the next cell to retriev this value:


In [44]:
rs = session.execute("SELECT account_id FROM default_keyspace.transactions_by_month limit 1")
print(rs.one())

Row(account_id=UUID('c45b4f5f-5c7b-41b3-b969-5cf53476e2f7'))


Click on the edit button and input the ID.

<img src="./img/lf_agent_edit_acc.png" alt="Console" width="600"/>

### Astra DB CQL

On the "Astra DB CQL" component, click on the options for "Astra DB Application Token", then select "+ Add New Variable":

<img src="./img/lf_agent_var.png" alt="Console" width="300"/>

Then, create a variable named "ASTRA_IBM_TOKEN" and fill the value with "ASTRA_DB_APPLICATION_TOKEN" available on your .env file and save the variable.

<img src="./img/lf_agent_var2.png" alt="Console" width="300"/>

Do the same with the API Endpoint. Create a variable named "ASTRA_IBM_API" and fill the value with "ASTRA_DB_API_ENDPOINT" available on your .env file and save the variable.

Now, click on the button "Tool Parameters". These are the parameters that the LLM will fill in order to run the tool.

<img src="./img/lf_agent_tool_param.png" alt="Console" width="600"/>

### Agent

As we are using the Open AI model to run our agent, it is needed define the Open AI API KEY the agent will use. 

Execute the same procedure for creating a variable named "OPENAI_API_KEY" to save your key and select it on the component.

## Running the flow

Click on the ```Playground``` button and enter the question:

> How much did I spend between June 15 and 25, 2025?

Your result should be something like this (but the numbers and dates may be different):

<img src="./img/lf_agent_run1.png" alt="Console" width="600"/>

Now, click on the "detail" button and inspect the execution.

The "Input" section shows how the LLM filled the tool parameters:

<img src="./img/lf_agent_run2.png" alt="Console" width="600"/>

The "Output" section shows the data returned from the tool (Astra DB) to the LLM:

<img src="./img/lf_agent_run3.png" alt="Console" width="600"/>

Then, the LLM had real time context to answer the question:

<img src="./img/lf_agent_run4.png" alt="Console" width="600"/>


# Recap

In this exercise, you learned:

**Using Astra:**
- The main features available for companies
- Common use cases
- How to connect to Astra
- How to create tables in Astra
- How to use the Astra DB Console
- How to load data into Astra

**Using Langflow:**
- How to import flows
- How to customize flow execution
- How to configure the Astra DB Tool
- How to create and reuse variables
- How to inspect agent execution
