In [1]:
import pandas as pd # works with tabular data (DataFrames)
import numpy as np # works with arrays and matrices
import requests # sends HTTP requests
import json # parses JSON data
import dotenv # loads variables from .env files
import os # accesses operating system features such as environment variables

API documentation: https://legiscan.com/gaits/documentation/legiscan

Step 1: Find the registration for the API key

Step 2: Bring that key into the code in a way that does NOT copy-paste the key into this file

In [2]:
dotenv.load_dotenv() # finds and loads (silently) the .env file
legiscan_key = os.getenv('legiscan_key')
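
If the .env file is missing or the variable name is misspelled, `os.getenv` returns `None` without complaint, and the problem only surfaces later as a failed API call. A minimal sketch of a fail-fast check follows; the helper name `require_env` is hypothetical and not part of dotenv or this notebook.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        # Covers both a missing variable (None) and an empty string
        raise RuntimeError(f"Environment variable {name!r} is not set; check the .env file")
    return value
```

After `dotenv.load_dotenv()`, this could be called as `legiscan_key = require_env('legiscan_key')`.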

Step 3: Use the API key to access the data we want

An API is a URL constructed generally as: root / endpoint ? parameters

1. Find the right root
2. Find the right endpoint (this one isn't using endpoints, it's using a parameter called 'op')
3. Find the right parameters
4. Learn how this API wants us to supply the API key
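
To make the root / endpoint ? parameters pattern concrete, here is a small sketch that builds the query string by hand with the standard library. The key value `APIKEY` is a placeholder, and the URL that `requests` assembles from `params` should look roughly like this one.

```python
from urllib.parse import urlencode

root = 'https://api.legiscan.com'
params = {'key': 'APIKEY',   # placeholder, not a real key
          'op': 'getBill',   # the operation stands in for an endpoint
          'id': '1167968'}

# Join the root and the encoded parameters with '?'
url = root + '/?' + urlencode(params)
print(url)
# https://api.legiscan.com/?key=APIKEY&op=getBill&id=1167968
```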

In [3]:
root = 'https://api.legiscan.com'
params = {'key': legiscan_key,
         'op': 'getBill',
         'id': '1167968'}
r = requests.get(root, params=params)
r

<Response [200]>

In [4]:
myjson = json.loads(r.text)
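
A `<Response [200]>` only confirms that the HTTP exchange succeeded, not that the API accepted the request. The sketch below assumes the decoded payload carries a top-level `'status'` field equal to `'OK'` on success and an `'alert'` field describing failures. Both field names are assumptions to verify against the API documentation, and `check_payload` is a hypothetical helper.

```python
def check_payload(payload):
    """Raise if the decoded JSON does not report success."""
    status = payload.get('status')  # assumed field name
    if status != 'OK':
        # 'alert' is an assumed field name for the error details
        raise ValueError(f"API reported status {status!r}: {payload.get('alert')}")
    return payload
```

It could wrap the decoding step, for example `myjson = check_payload(json.loads(r.text))`.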

In [5]:
pd.json_normalize(myjson, record_path = ['bill','texts'])

Unnamed: 0,doc_id,date,type,type_id,mime,mime_id,url,state_link,text_size,text_hash,alt_bill_text,alt_mime,alt_mime_id,alt_state_link,alt_text_size,alt_text_hash
0,1868195,2019-01-23,Introduced,1,application/pdf,2,https://legiscan.com/MD/text/SB181/id/1868195,https://mgaleg.maryland.gov/2019RS/bills/sb/sb...,85467,423ba752efdfa002d991006e2b358a7f,0,,0,,0,
1,1917357,2019-02-19,Engrossed,4,application/pdf,2,https://legiscan.com/MD/text/SB181/id/1917357,https://mgaleg.maryland.gov/2019RS/bills/sb/sb...,86322,d23002e61f4eba5cfb4e3c4d25890b18,0,,0,,0,
2,2034836,2019-06-07,Chaptered,6,application/pdf,2,https://legiscan.com/MD/text/SB181/id/2034836,https://mgaleg.maryland.gov/2019RS/Chapters_no...,77961,083666935344581572f6df42e7fe6c5c,0,,0,,0,


The next two questions:

1. How to find the bill ID in a systematic and automated way
2. How to find machine readable bill text without webscraping or pulling off a PDF if at all possible

To do:
* Install packages for Python on your computer. Type on the command line:
    * pip3 install pandas
    * pip3 install numpy
    * pip3 install requests
    * pip3 install python-dotenv
    * pip3 install jupyter
    * pip3 install jupyterlab
    * pip3 install PyPDF2
* Open the terminal and use cd to move into the folder where you want to save the legiscan_api folder. Then type: git clone https://github.com/jkropko/legiscan_api
* On the command line, type: jupyter lab (this launches Jupyter lab) then open the api_access.ipynb file
* On the command line type: touch .env
* Then: open .env
* Inside the .env file type legiscan_key=.... where the dots are your own key. Then save
* Then run every cell in api_access.ipynb to check that it works

## How to find the bill ID in a systematic and automated way


In [6]:
#https://api.legiscan.com/?key=APIKEY&op=getSessionList
params = {'key': legiscan_key,
         'op': 'getSessionList'}
r = requests.get(root, params=params)
r

<Response [200]>

In [10]:
myjson = json.loads(r.text)
session_df = pd.json_normalize(myjson, record_path = ['sessions'])

In [13]:
session_df

Unnamed: 0,session_id,state_id,year_start,year_end,prefile,sine_die,prior,special,session_tag,session_title,session_name,dataset_hash,session_hash,name
0,2148,1,2025,2025,1,0,0,0,Regular Session,2025 Regular Session,2025 Regular Session,ff7569b944bb45724bc269e86cd225e4,ff7569b944bb45724bc269e86cd225e4,2025 Regular Session
1,2103,1,2024,2024,0,1,1,0,Regular Session,2024 Regular Session,Regular Session 2024,8163cf95b36859f0f327c0576e729221,8163cf95b36859f0f327c0576e729221,Regular Session 2024
2,2060,1,2023,2023,0,1,1,1,2nd Special Session,2023 2nd Special Session,Second Special Session 2023,c9c74795f8e51bd3b8e4da60ce85d424,c9c74795f8e51bd3b8e4da60ce85d424,Second Special Session 2023
3,2048,1,2023,2023,0,1,1,1,1st Special Session,2023 1st Special Session,First Special Session 2023,72239fe0272a1f14b4f5de3370c32172,72239fe0272a1f14b4f5de3370c32172,First Special Session 2023
4,2014,1,2023,2023,0,1,1,0,Regular Session,2023 Regular Session,Regular Session 2023,d17b533ce7196680956ab4bbbb4cfb99,d17b533ce7196680956ab4bbbb4cfb99,Regular Session 2023
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
884,1435,52,2017,2018,0,1,1,0,Regular Session,2017-2018 Regular Session,115th Congress,0f28f2be28536b4e03913d72f88a5451,0f28f2be28536b4e03913d72f88a5451,115th Congress
885,1156,52,2015,2016,0,1,1,0,Regular Session,2015-2016 Regular Session,114th Congress,84eb8ad3003a508f91ad26247ad731ef,84eb8ad3003a508f91ad26247ad731ef,114th Congress
886,1026,52,2013,2014,0,1,1,0,Regular Session,2013-2014 Regular Session,113th Congress,3c139f1f45314574d030410e6acb24d0,3c139f1f45314574d030410e6acb24d0,113th Congress
887,84,52,2011,2012,0,1,1,0,Regular Session,2011-2012 Regular Session,112th Congress,0474b6bbcefe5106043ce4161d446ba0,0474b6bbcefe5106043ce4161d446ba0,112th Congress


In [15]:
session_ids = session_df['session_id']
session_ids

0      2148
1      2103
2      2060
3      2048
4      2014
       ... 
884    1435
885    1156
886    1026
887      84
888      77
Name: session_id, Length: 889, dtype: int64
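
The session list covers every state, so picking `session_ids[0]` depends on row order. A sketch of a more deliberate selection follows, working on the list of session dictionaries (the same records that feed `session_df`) and using the `state_id`, `special`, and `year_start` fields shown in the output above. The helper name `regular_sessions` is hypothetical, and the sample rows are copied from that output.

```python
def regular_sessions(sessions, state_id):
    """Return session_ids of non-special sessions for one state, newest first."""
    matches = [s for s in sessions
               if s['state_id'] == state_id and s['special'] == 0]
    matches.sort(key=lambda s: s['year_start'], reverse=True)
    return [s['session_id'] for s in matches]

sessions = [  # a few rows copied from the session_df output
    {'session_id': 2148, 'state_id': 1, 'year_start': 2025, 'special': 0},
    {'session_id': 2103, 'state_id': 1, 'year_start': 2024, 'special': 0},
    {'session_id': 2060, 'state_id': 1, 'year_start': 2023, 'special': 1},
    {'session_id': 1435, 'state_id': 52, 'year_start': 2017, 'special': 0},
]
print(regular_sessions(sessions, state_id=1))
# [2148, 2103]
```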

In [18]:
session_ids[0]

2148

In [19]:
params = {'key': legiscan_key,
         'op': 'getMasterList',
         'id': session_ids[0]}
r = requests.get(root, params=params)
r

<Response [200]>

In [20]:
myjson = json.loads(r.text)

In [31]:
myjson = myjson['masterlist']

In [32]:
del myjson['session']

In [37]:
bill_df = pd.DataFrame(myjson).T

In [38]:
bill_ids = bill_df['bill_id']

In [39]:
bill_ids

0     1886114
1     1886274
2     1886289
3     1886100
4     1886187
5     1886231
6     1886217
7     1886129
8     1885983
9     1886085
10    1886245
11    1885954
12    1886303
13    1885895
14    1886173
15    1885968
16    1885881
17    1885997
18    1886202
19    1886070
20    1886158
21    1885939
22    1886259
23    1886385
24    1886386
25    1886388
26    1886387
27    1886383
28    1886382
29    1886384
30    1886580
31    1886582
32    1886583
33    1886584
34    1886577
35    1886579
36    1886581
37    1886578
38    1886869
39    1886875
40    1886885
41    1886864
42    1886859
43    1886881
44    1886027
45    1886056
46    1886041
47    1886144
48    1885924
49    1886318
50    1885910
51    1886012
52    1886486
53    1886487
Name: bill_id, dtype: object
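
The cells above reshape the master list by overwriting `myjson` and deleting its `'session'` entry, which makes them awkward to rerun. A non-destructive sketch of the same extraction is shown here; `bill_ids_from_masterlist` is a hypothetical helper, and the sample payload only mimics the structure seen above.

```python
def bill_ids_from_masterlist(payload):
    """Collect bill_id values without modifying the decoded JSON."""
    masterlist = payload['masterlist']
    # Every key except 'session' holds one bill record
    return [entry['bill_id']
            for key, entry in masterlist.items()
            if key != 'session']

sample = {'masterlist': {
    'session': {'session_id': 2148},
    '0': {'bill_id': 1886114, 'number': 'HB1'},
    '1': {'bill_id': 1886274, 'number': 'HB2'},
}}
print(bill_ids_from_masterlist(sample))
# [1886114, 1886274]
```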

In [40]:
bill_df

Unnamed: 0,bill_id,number,change_hash,url,status_date,status,last_action_date,last_action,title,description
0,1886114,HB1,d42adee19655c73ebd7e7522bf9d497b,https://legiscan.com/AL/bill/HB1/2025,2024-07-08,1,2024-07-08,"Pending House Ports, Waterways & Intermodal Tr...","Seafood, to assess a fee on seafood dealers de...","Seafood, to assess a fee on seafood dealers de..."
1,1886274,HB2,717ee4fdc6b222769e2d4f58cc6817ba,https://legiscan.com/AL/bill/HB2/2025,2024-07-08,1,2024-07-08,Pending House Judiciary,"Vaccines, parental consent for minor to receiv...","Vaccines, parental consent for minor to receiv..."
2,1886289,HB3,c01cc61a9285559fa8fc6d3f441b6960,https://legiscan.com/AL/bill/HB3/2025,2024-07-08,1,2024-07-08,Pending House Judiciary,Crimes and offenses; conviction of illegal ali...,Crimes and offenses; conviction of illegal ali...
3,1886100,HB4,b18f049d983984ab618795fd37fe74c1,https://legiscan.com/AL/bill/HB4/2025,2024-07-08,1,2024-07-08,Pending House Judiciary,"Crimes and offenses, further provides for obsc...","Crimes and offenses, further provides for obsc..."
4,1886187,HB5,2fd9f18470208d86a4c9e80d9f2af1e5,https://legiscan.com/AL/bill/HB5/2025,2024-07-08,1,2024-07-08,Pending House Ways and Means General Fund,Alabama State Law Enforcement Agency; salary a...,Alabama State Law Enforcement Agency; salary a...
5,1886231,HB6,5aad27e2be09b3180d2cd6e2c22e48f8,https://legiscan.com/AL/bill/HB6/2025,2024-07-08,1,2024-07-08,"Pending House Constitution, Campaigns and Elec...",Political parties; disqualifying candidate fro...,Political parties; disqualifying candidate fro...
6,1886217,HB7,fdee76f6cf666b08ade72333dfa85787,https://legiscan.com/AL/bill/HB7/2025,2024-07-08,1,2024-07-08,Pending House Judiciary,"Illegal immigration; procedures for arrest, de...","Illegal immigration; procedures for arrest, de..."
7,1886129,HB8,fefd3b1483871b302189608caf63f4fa,https://legiscan.com/AL/bill/HB8/2025,2024-07-08,1,2024-07-08,Pending House Judiciary,"Alcoholic Beverage Control Board, regulation o...","Alcoholic Beverage Control Board, regulation o..."
8,1885983,HB9,4fd1afa8ca4f74ff65a4cdae5a8ae115,https://legiscan.com/AL/bill/HB9/2025,2024-07-08,1,2024-07-08,Pending House Education Policy,Three cueing system prohibited in public K-12 ...,Three cueing system prohibited in public K-12 ...
9,1886085,HB10,ec5b8823c3e961155fcb830e0a45a287,https://legiscan.com/AL/bill/HB10/2025,2024-07-08,1,2024-07-08,Pending House Public Safety and Homeland Security,Body-worn and dashboard cameras; delay in disc...,Body-worn and dashboard cameras; delay in disc...


In [48]:
params = {'key': legiscan_key,
         'op': 'getBill',
         'id': bill_ids.iloc[0]}  # .iloc selects by position; the index labels here are strings
r = requests.get(root, params=params)
myjson = json.loads(r.text)
textlink = myjson['bill']['texts'][0]['state_link']
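
Taking `texts[0]` gives the first listed version of a bill, which may be the introduced draft rather than the final one. A sketch that picks the most recent version by its `date` field follows, assuming dates are ISO-formatted strings as in the getBill output earlier. The helper name `latest_text` is hypothetical, and the sample rows are abbreviated from that output.

```python
def latest_text(texts):
    """Return the text record with the most recent date."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings
    return max(texts, key=lambda t: t['date'])

texts = [  # abbreviated rows from the getBill output shown earlier
    {'doc_id': 1868195, 'date': '2019-01-23', 'type': 'Introduced'},
    {'doc_id': 1917357, 'date': '2019-02-19', 'type': 'Engrossed'},
    {'doc_id': 2034836, 'date': '2019-06-07', 'type': 'Chaptered'},
]
print(latest_text(texts)['type'])
# Chaptered
```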


## How to find machine readable bill text without webscraping or pulling off a PDF if at all possible

In [52]:
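import io
from PyPDF2 import PdfReader # requests is already imported above

response = requests.get(url=textlink, timeout=120) # download the PDF from the state site
on_fly_mem_obj = io.BytesIO(response.content) # wrap the bytes in an in-memory file
pdf_file = PdfReader(on_fly_mem_obj) # parse the PDF without writing it to disk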
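
The reader exposes its pages through `.pages`, and each page has an `extract_text()` method. A sketch for pulling the text of the whole document rather than a single page is given below; `extract_all_text` is a hypothetical helper that works on any object with that interface.

```python
def extract_all_text(reader):
    """Join the text of every page in a PdfReader-like object."""
    # 'or ""' guards against pages that yield no text
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

With the objects above, this would be called as `full_text = extract_all_text(pdf_file)`.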

In [54]:
print(pdf_file)

<PyPDF2._reader.PdfReader object at 0x10ae67da0>


In [62]:
print(pdf_file.pages[3].extract_text())

HB1 INTRODUCED
Page 3place of business,  then  an additional  dealer's licenses must
license shall  be purchased for each separate place of
business, providing the location of each. A vehicle used
solely for transporting  seafoods  seafood  to or from an Alabama
seafood dealer is not considered a place of business. Each
vehicle from which seafood is sold to or purchased from any
person,  firm, or corporation  other than an Alabama seafood
dealer, is a place of business and shall be licensed under
this section.  The  A seafood dealer shall purchase a license
for each  such  vehicle for a fee of one hundred dollars ($100)
per license and the operator of the vehicle shall have the
original license in his or her possession when selling or
buying seafood from that vehicle. Seafood dealers may purchase
seafoods  seafood  only from commercial fishermen validly
licensed in Alabama, Alabama seafood dealers, and any
nonresident seller who is validly licensed to sell  seafoods
seafood  under the 
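
The extracted text contains doubled spaces, which can get in the way of later text processing. A small cleanup sketch using the standard library follows; `tidy` is a hypothetical helper. It does not resolve the repeated words such as "seafoods seafood", which appear to come from the bill's markup of deleted and inserted language.

```python
import re

def tidy(text):
    """Collapse runs of spaces and remove spaces before punctuation."""
    text = re.sub(r'[ \t]+', ' ', text)        # squeeze repeated spaces and tabs
    text = re.sub(r' ([,.;:])', r'\1', text)   # no space before punctuation
    return text.strip()

print(tidy("person,  firm, or corporation  other than an Alabama seafood"))
# person, firm, or corporation other than an Alabama seafood
```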