# Congress Bot

## Problem Statement
This is a generative AI agent that queries the publicly available congressional API to find out what is going on in Congress.
It uses that API to retrieve information about members of Congress and the legislation before them. The data is available in a machine-readable format, but the text of the bills themselves is written in natural language, so an LLM is well suited to interpreting it.

In [1]:
import os
from google import genai
from google.genai import types
from dotenv import load_dotenv, find_dotenv
import congress
import agent
import db
import datetime

genai.__version__

'1.7.0'

## Setup

Uncomment this if running on Kaggle, and make sure you have secrets for `GOOGLE_API_KEY` (available [here](https://aistudio.google.com/app/apikey)) and `CONGRESS_API_KEY` (available [here](https://api.congress.gov/sign-up/)).


In [2]:
#from kaggle_secrets import UserSecretsClient
#os.environ["GOOGLE_API_KEY"] = UserSecretsClient().get_secret("GOOGLE_API_KEY")
#os.environ["CONGRESS_API_KEY"] = UserSecretsClient().get_secret("CONGRESS_API_KEY")

If you are running locally, make sure those keys are set in your `.env` file and load it here.

In [3]:
load_dotenv(find_dotenv())

True

## The agent
CongressBot will use a gen AI agent to respond to queries from the user. Let's first look at the prompt.

In [4]:
print(agent.INSTRUCTION)

You are a helpful chatbot designed to help the user learn about the activities congress. 
You will use the congressional api to access information about bills and members of congress.
If you don't know what value to include for an optional parameter, don't include anything.
For any object in a response that contains a url, call call_endpoint to get more information on it


The prompt is pretty basic. It does provide some instructions on how to use the provided API.

In [5]:
agent.MODEL

'gemini-2.0-flash'

The agent uses the Gemini 2.0 Flash model by default.

In [6]:
a = agent.CongressAgent(temperature=0)
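
The notebook does not show the agent's internals. As a rough sketch only, a tool-calling agent usually keeps a table from function names to Python callables and runs whichever one the model requests. Everything below is hypothetical (`TOOLS`, `dispatch`, and the stub `get_members`), and the log format simply mirrors the `-- calling ...` lines that appear in the agent output later in this notebook.

```python
def get_members(state=None, district=None, current=None):
    """Hypothetical stub standing in for a real API wrapper."""
    return {"state": state, "district": district, "current": current}


# Hypothetical registry: function name (as the model sees it) -> callable.
TOOLS = {"get_members": get_members}


def dispatch(name, args):
    """Run the tool the model asked for and return its result."""
    if name not in TOOLS:
        # Report the problem back to the model instead of raising.
        return {"error": f"unknown function {name}"}
    print(f"-- calling {name}(**{args})")
    return TOOLS[name](**args)


result = dispatch("get_members", {"current": True, "district": 2, "state": "NC"})
```

Returning an error dictionary for an unknown name, rather than raising, gives the model a chance to recover on its next turn.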

## The API
For this project we will use the Congressional API found [here](https://api.congress.gov/). It provides several endpoints for getting information about Congress. I wrote a few functions for accessing it. The most basic one simply calls an endpoint.

In [7]:
import pprint
pprint.pp(congress.call_endpoint_schema)

{'name': 'call_endpoint',
 'description': 'call an endpoint on the congress API',
 'parameters': {'type': 'object',
                'properties': {'endpoint': {'type': 'string',
                                            'description': 'The endpoint to '
                                                           'access. Can be '
                                                           'absolute or '
                                                           'relative to '
                                                           'https://api.congress.gov/v3'}},
                'required': ['endpoint']}}


This function is more powerful than it first appears. Many of the responses from the API use [HATEOAS](https://en.wikipedia.org/wiki/HATEOAS) links to related resources. So if a bill response has a link to a cosponsor resource, the agent can automatically find that without having to know details about that API.
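
To illustrate why one generic function is enough, here is a minimal sketch of the two pieces involved: resolving an absolute or relative endpoint into a request URL, and collecting the `url` values from a response so they can be followed. The helper names `resolve_endpoint` and `find_links` are hypothetical and are not the notebook's actual implementation. The sample response is made up, and passing the key as an `api_key` query parameter is an assumption about how the request is authenticated.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://api.congress.gov/v3"


def resolve_endpoint(endpoint, api_key):
    """Turn an absolute or relative endpoint into a full request URL."""
    if endpoint.startswith(("http://", "https://")):
        url = endpoint
    else:
        url = BASE_URL + "/" + endpoint.lstrip("/")
    parts = urlsplit(url)
    # Preserve any query parameters the link already carries.
    query = dict(parse_qsl(parts.query))
    query["format"] = "json"
    query["api_key"] = api_key  # assumption: key sent as a query parameter
    return urlunsplit(parts._replace(query=urlencode(query)))


def find_links(obj):
    """Collect every string stored under a 'url' key in nested JSON."""
    links = []
    if isinstance(obj, dict):
        for key, value in obj.items():
            if key == "url" and isinstance(value, str):
                links.append(value)
            else:
                links.extend(find_links(value))
    elif isinstance(obj, list):
        for item in obj:
            links.extend(find_links(item))
    return links


# Made-up response fragment, for illustration only.
sample = {"bill": {"cosponsors": {"count": 3, "url": "https://example.com/cosponsors"}}}
print(find_links(sample))
print(resolve_endpoint("/bill/119/s/151", "YOUR_KEY"))
```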

I also have some entry points to the API.

In [8]:
pprint.pp(congress.list_bills_schema)
pprint.pp(congress.get_bill_schema)
pprint.pp(congress.get_members_schema)

{'name': 'list_bills',
 'description': 'Lists bills being considered by Congress',
 'parameters': {'type': 'object',
                'properties': {'offset': {'type': 'integer',
                                          'description': 'offset for the list '
                                                         'of bills'},
                               'limit': {'type': 'integer',
                                         'description': 'the number of bills '
                                                        'to return'},
                               'fromDate': {'type': 'string',
                                            'description': 'the date of the '
                                                           'earliest bills to '
                                                           'return, leave '
                                                           'blank for no '
                                                           'restriction'},
                 

Each of these calls the `call_endpoint` function with the details specific to that endpoint.
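
A thin wrapper of this kind might look like the sketch below. It is an illustration, not the code in `congress`. `build_endpoint` and `list_bills_endpoint` are hypothetical names, the parameter names are copied from the schema above, and how they map onto the real API's query parameters is not shown in this notebook. Optional parameters left as `None` are dropped, which matches the prompt's instruction to omit values the model does not know.

```python
from urllib.parse import urlencode


def build_endpoint(path, **params):
    """Append only the parameters that were actually supplied."""
    supplied = {k: v for k, v in params.items() if v is not None}
    if not supplied:
        return path
    return f"{path}?{urlencode(supplied)}"


def list_bills_endpoint(offset=None, limit=None, fromDate=None):
    """Hypothetical entry point: build the relative endpoint for a bill list."""
    return build_endpoint("/bill", offset=offset, limit=limit, fromDate=fromDate)


print(list_bills_endpoint(limit=5))
```

The resulting relative endpoint can then be handed to the generic endpoint-calling function.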

There is one exception: getting the bill text. That endpoint does not follow the same pattern as the others. It lists several versions of the bill text that are identical except for their format (HTML, PDF, and XML), all hosted outside the API, so `call_endpoint` cannot access them correctly. I wrote a separate function for it.
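
The format-selection step can be sketched as follows. The response shape (a list of versions, each with a `formats` list of `type`/`url` pairs), the format labels, and the preference order are all assumptions made for illustration. `pick_text_url` is a hypothetical helper, and the sample data uses placeholder URLs.

```python
# Assumed labels, ordered from easiest to hardest for an LLM to read.
PREFERRED_FORMATS = ("Formatted Text", "Formatted XML", "PDF")


def pick_text_url(text_versions, preferred=PREFERRED_FORMATS):
    """Return the URL of the best available format, or None if there is none."""
    for version in text_versions:
        by_type = {f["type"]: f["url"] for f in version.get("formats", [])}
        for fmt in preferred:
            if fmt in by_type:
                return by_type[fmt]
    return None


# Placeholder data in the assumed shape.
sample_versions = [
    {
        "formats": [
            {"type": "PDF", "url": "https://example.com/bill.pdf"},
            {"type": "Formatted Text", "url": "https://example.com/bill.htm"},
        ]
    }
]
print(pick_text_url(sample_versions))
```

Since the formats carry the same text, fetching just one of them avoids spending context on duplicates.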

In [9]:
pprint.pp(congress.get_bill_text_schema)

{'name': 'get_bill_text',
 'description': 'Gets the bill text',
 'parameters': {'type': 'object',
                'properties': {'congress': {'type': 'integer',
                                            'description': 'the number of the '
                                                           'congress which '
                                                           'considered the '
                                                           'bill'},
                               'billType': {'type': 'string',
                                            'description': 'the type of bill',
                                            'enum': ['hr',
                                                     's',
                                                     'hjres',
                                                     'sjres',
                                                     'hconres',
                                                     'sconres',
                           

Putting this all together we can use the agent to get information from the API.

In [10]:
a.run(["Who represents North Carolina in district 2?",
       "Can you list some bills she has sponsored?",
       "Can you give some information about the first one?",
       "What reason does the bill text give for its passage?",
       "q"])

-- calling get_members(**{'current': True, 'district': 2, 'state': 'NC'})
The representative for North Carolina in district 2 is Deborah K. Ross.
-- calling call_endpoint(**{'endpoint': 'https://api.congress.gov/v3/member/R000305?format=json'})
-- calling call_endpoint(**{'endpoint': 'https://api.congress.gov/v3/member/R000305/sponsored-legislation'})
Here are the first 20 bills sponsored by Deborah Ross:
Expressing the sense of Congress that the votes of overseas servicemembers must be counted and honored as required under the Uniformed and Overseas Citizens Absentee Voting Act. (HCONRES-28, 119th Congress)
To amend the Outer Continental Shelf Lands Act to withdraw the outer Continental Shelf in the Mid-Atlantic Planning Area from disposition, and for other purposes. (HR-2886, 119th Congress)
Insurance Fraud Accountability Act (HR-2079, 119th Congress)
Safer Sports for Athletes Act of 2024 (HR-10326, 118th Congress)
Expressing support for designating the week of November 4 through Nov

This is a nice way to learn about what your congressperson is doing. But I found the bill entry points a little underwhelming. I don't usually know the identifier for a given bill, and going through the list of bills is tedious. I would like a way to search for interesting bills, but the API offers little in the way of search.

## Embeddings and the Vector DB
This can be solved by using the "bulk data" system to store embeddings for the bills in a vector DB.
For brevity's sake, we will only load the bill summaries, which are much shorter than the bills themselves. Loading can still take some time, so we will only load documents dated since April 1, 2025. The full dataset for a given Congress can be loaded from the command line with `python load_db -c {congress_number}`.
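
As a toy illustration of the retrieval idea, the sketch below ranks documents by cosine similarity to a query. It swaps in a bag-of-words vector for the real embedding model and a plain list for the vector DB, so it only shows the ranking step. `toy_embed`, `cosine`, and `query` are hypothetical helpers, and the documents are made-up placeholders.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for a real embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def query(docs, text, n=2):
    """Return the n documents most similar to the query text."""
    q = toy_embed(text)
    ranked = sorted(docs, key=lambda d: cosine(q, toy_embed(d["summary"])), reverse=True)
    return ranked[:n]


# Made-up placeholder summaries.
docs = [
    {"id": "doc-a", "summary": "limits presidential authority to impose tariffs on imports"},
    {"id": "doc-b", "summary": "funds research on coastal erosion"},
    {"id": "doc-c", "summary": "designates a week to recognize school nurses"},
]
print([d["id"] for d in query(docs, "president tariffs on imports")])
```

A real embedding model would also match "president" with "presidential", which word counts cannot do. That semantic matching is the reason to use embeddings rather than keyword search.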

In [11]:
a.db.embed_fn.set_doc_mode()
a.db.load_bill_summaries(119, datetime.datetime(2025,4,1))
a.db.embed_fn.set_query_mode()

Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/hr
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/s
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/hres
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/hjres
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/sconres
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/sjres
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/hconres
Loading from https://www.govinfo.gov/bulkdata/json/BILLSUM/119/sres
Inserted 168 documents

I also expose a query function for searching bills.

In [12]:
pprint.pp(db.query_bill_summaries_schema)

{'name': 'query_bill_summaries',
 'description': 'Finds bill summaries relevant to a given query',
 'parameters': {'type': 'object',
                'properties': {'query': {'type': 'string',
                                         'description': 'A natural language '
                                                        'query to use to find '
                                                        'relevant bills. It '
                                                        'will return a map '
                                                        'with the bill '
                                                        'summary, the '
                                                        'congress, bill type, '
                                                        'bill number, and the '
                                                        'endpoint you can '
                                                        'access it with the '
                                    

In [13]:
a.run(["Are there any bills involving the president's ability to set tariff rates?",
       "Tell me more about S. 151", "q"])

-- calling query_bill_summaries(**{'n': 5, 'query': 'president tariff rates'})
Yes, there are a few bills related to the president's ability to set tariff rates. Here are some of them:

1.  H.R.505 directs the President to impose additional duties on all imports entering the United States.
2.  S.151, the Protecting Americans from Tax Hikes on Imported Goods Act of 2025, prohibits the President from exercising authorities under the International Emergency Economic Powers Act (IEEPA) to impose or increase duties or impose tariff-rate quotas on imports.
-- calling call_endpoint(**{'endpoint': '/bill/119/s/151'})
S. 151, also known as the "Protecting Americans from Tax Hikes on Imported Goods Act of 2025," was introduced on January 17, 2025. It was read twice and referred to the Committee on Banking, Housing, and Urban Affairs. The bill's primary sponsor is Senator Shaheen from New Hampshire. The bill prohibits the President from exercising authorities under the International Emergency Eco

## Going forward
The vector DB is promising. Other data sources, including the full bill text, could be added to it and queried. In addition, more metadata could be stored with each document, allowing more complex filtering.