# CaLPA Scratch Folder

This scratch notebook is used to test the code and functionality of the AI California Legislative Policy Analysis (CaLPA) system. It is not intended for production use and may contain incomplete or experimental code. Its purpose is to support development and testing of the CaLPA system's data processing, analysis, and visualization components, so it may include code snippets, comments, and development notes. For usage and features, refer to the official CaLPA documentation and user guides.

In [None]:
#%reset

## Initialization

In [2]:
# Import required libraries
import os
import time
from datetime import date
from datetime import datetime
import json
import mimetypes
import glob
import base64
import zipfile
import io
import dotenv
import requests
import pandas as pd
import feedparser
import webbrowser
from mrkdwn_analysis import MarkdownAnalyzer

# Import the calpa library
from calpa import calpa

In [3]:
# Load environment variables from .env file
dotenv.load_dotenv(os.path.join(os.getcwd(), '.env'))

# Create project metadata for the AI project
prjMetadata = calpa.projectMetadata(1)

# Create the project directories dictionary
prjDirs = calpa.projectDirectories(os.getcwd())

# Instantiate the LegiScan class
legiscan = calpa.LegiScan()


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 AI Legislative Policy Analysis (CaLPA-AI)
 California Legislative Policy Analysis for Artificial Intelligence Related Bills
 Part 1 - Preliminary Operations
 Version 1.0 (MIT License), Dr. Kostas Alexandridis, GISP
 GitHub Repository: https://github.com/ktalexan/CaLPA
 Last Updated: Apr 28, 2025
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Dates: 2010-12-02 through 2025-04-28
Periods: 2009-2010, 2011-2012, 2013-2014, 2015-2016, 2017-2018, 2019-2020, 2021-2022, 2023-2024, 2025-2026
Directory Global Settings:

General:
- Project (pathPrj): c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA
- Admin (pathAdmin): c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\admin
- Metadata (pathMetadata): c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\metadata
- Analysis (pathAnalysis): c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\analysis
- Obsidian Vault (pathObsidian): C:\Users\ktal

In [4]:
# Codebook lookup variables
codebookLookupVars = [var for var in dir(calpa.codebook) if var.startswith('lookup')]
codebookDictVars = [var for var in dir(calpa.codebook) if var.startswith('dict')]
print(f"Codebook Lookup Variables:\n{codebookLookupVars}\n")
print(f"Codebook Dictionary Variables:\n{codebookDictVars}\n")

Codebook Lookup Variables:
['lookupBillCode', 'lookupBillTextType', 'lookupBillType', 'lookupBodyType', 'lookupEventType', 'lookupMimeType', 'lookupPartyType', 'lookupProgressType', 'lookupReasonType', 'lookupRoleType', 'lookupSastType', 'lookupSponsorType', 'lookupStateType', 'lookupStatusType', 'lookupSupplementType', 'lookupVoteType']

Codebook Dictionary Variables:
['dictGetAmendment', 'dictGetBill', 'dictGetBillText', 'dictGetPerson', 'dictGetRollCall', 'dictGetSessionList', 'dictGetSupplement']
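
The prefix filter used in the cell above can be tried without the calpa package installed. This is a minimal sketch, assuming a stand-in namespace: `fakeCodebook` and `namesWithPrefix` are hypothetical names, not part of calpa, and only the attribute names are borrowed from the output above.

```python
from types import SimpleNamespace

# Hypothetical stand-in for calpa.codebook (assumption: the real module
# exposes its lookups and dictionaries as plain module-level attributes)
fakeCodebook = SimpleNamespace(
    lookupBillType={},
    lookupVoteType={},
    dictGetBill={},
)

def namesWithPrefix(obj, prefix):
    """Return the attribute names of obj that start with prefix."""
    # dir() returns names in sorted order; dunder names start with "__",
    # so they never match the "lookup" or "dict" prefixes
    return [name for name in dir(obj) if name.startswith(prefix)]

print(namesWithPrefix(fakeCodebook, "lookup"))
print(namesWithPrefix(fakeCodebook, "dict"))
```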



In [5]:
# Obtain the stored sessions list from JSON dictionary on disk (data/lookup directory)
sessionListStored = legiscan.getStoredData(dataType = "session")

# Obtain the stored session People list from JSON dictionary on disk (data/lookup directory)
sessionPeopleStored = legiscan.getStoredData(dataType = "people")

# Obtain the stored dataset list from JSON dictionary on disk (data/lookup directory)
datasetListStored = legiscan.getStoredData(dataType = "dataset")

# Get the stored raw master list from JSON dictionary on disk (data/lookup directory)
masterListRawStored = legiscan.getStoredData(dataType = "master", raw = True)
# Get the stored master list from JSON dictionary on disk (data/lookup directory)
masterListStored = legiscan.getStoredData(dataType = "master", raw = False)

# Get the AI monitoring list from disk (data/lookup directory)
aiBillListStored = legiscan.getStoredData(dataType = "bills", project = "AI")

# Get the AI full list of bills from disk (data/legis/json directory)
aiBills = legiscan.getStoredData(dataType = "data", project = "AI")

# Get the AI bill summaries list from disk (data/lookup directory)
aiBillsSummariesStored = legiscan.getStoredData(dataType = "summaries", project = "AI")

## End of Initialization

In [None]:
# Open test["status"] in a web browser (test must be defined in an earlier cell)
webbrowser.open(test["status"])

In [None]:
webbrowser.open(aiBills["2013-2014"]["AB1465"]["url"], new=2, autoraise=True)

In [None]:
webbrowser.open(aiBills["2013-2014"]["AB1465"]["state_link"], new=2, autoraise=True)

In [None]:

analyzer = MarkdownAnalyzer(samplePath)

headers = analyzer.identify_headers()
paragraphs = analyzer.identify_paragraphs()
blockquotes = analyzer.identify_blockquotes()
links = analyzer.identify_links()
codeBlocks = analyzer.identify_code_blocks()

In [None]:
analysis = analyzer.analyse()

print(analysis)

In [None]:
headers

In [None]:
paragraphs

In [None]:
links

In [None]:
import io
import pypandoc
import panflute

def action(elem, doc):
    if isinstance(elem, panflute.Image):
        doc.images.append(elem)
    elif isinstance(elem, panflute.Link):
        doc.links.append(elem)

if __name__ == '__main__':
    data = pypandoc.convert_file('example.md', 'json')
    doc = panflute.load(io.StringIO(data))
    doc.images = []
    doc.links = []
    # No prepare hook is defined in this cell, so only the action is passed
    doc = panflute.run_filter(action, doc=doc)

    print("\nList of image URLs:")
    for image in doc.images:
        print(image.url)

In [51]:
legiscan.aiSearchQuery(sessionId = 993)

3 bills added


{'SB836': {'relevance': 100,
  'state': 'CA',
  'bill_number': 'SB836',
  'bill_id': 577638,
  'change_hash': '3638ac09cd094bfa00ceb018d027ba1a',
  'url': 'https://legiscan.com/CA/bill/SB836/2013',
  'text_url': 'https://legiscan.com/CA/text/SB836/2013',
  'research_url': 'https://legiscan.com/CA/research/SB836/2013',
  'last_action_date': '2014-06-24',
  'last_action': 'Set, first hearing. Hearing canceled at the request of author.',
  'title': 'Brain research: Cal-BRAIN program.'},
 'SB860': {'relevance': 93,
  'state': 'CA',
  'bill_number': 'SB860',
  'bill_id': 581712,
  'change_hash': 'e38cd0ab53209fc7aaf182ffc8ba1f5b',
  'url': 'https://legiscan.com/CA/bill/SB860/2013',
  'text_url': 'https://legiscan.com/CA/text/SB860/2013',
  'research_url': 'https://legiscan.com/CA/research/SB860/2013',
  'last_action_date': '2014-06-20',
  'last_action': 'Chaptered by Secretary of State. Chapter 34, Statutes of 2014.',
  'title': 'Education finance: education omnibus trailer bill.'},
 'AB146

In [57]:
print("Test string 1")
print("Test string 2")

Test string 1
Test string 2


In [61]:
print("Test string 1", end="| ")
print("Test string 2", end="; ", flush=True)

Test string 1| Test string 2; 

In [None]:
colors = ["red", "green", "blue"]
for c in colors:
    print(c, end=", ")

red, green, blue, 
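
The loop above leaves a trailing separator after the last item. A minimal sketch of two standard ways to avoid it, using the same `colors` list (the `joined` name is only illustrative):

```python
colors = ["red", "green", "blue"]

# str.join places the separator only between items, so nothing trails the last one
joined = ", ".join(colors)
print(joined)

# Unpacking the list with sep prints the same line without building a string first
print(*colors, sep=", ")
```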

In [167]:
def head(label, type = "general", t=0):
    """Print a foldable region banner for label, indented by t levels of three spaces."""

    # Build a camelCase region name: lowercase the first word, title-case the rest, join with nothing in between
    words = label.split()
    regionLabel = words[0].lower()+"".join(word.title() for word in words[1:])
    
    # Prefix both labels with the kind of block being documented ("general" adds no prefix)
    match type:
        case "function":
            label = f"Function: {label}"
            regionLabel = f"Function: {regionLabel}"
        case "class":
            label = f"Class: {label}"
            regionLabel = f"Class: {regionLabel}"
        case "method":
            label = f"Method: {label}"
            regionLabel = f"Method: {regionLabel}"

    # Print the region markers around an 80-character centered title and rule
    print(t*"   ")
    print(t*"   ",f"# region {regionLabel}")
    print(t*"   ","#", f" {label} ".center(80, "~"))
    print(t*"   ","#", 80 * "~")
    print(t*"   ",f"# endregion {regionLabel}")
    print(t*"   ")

In [192]:
head("Part 1: Define Variables and Input Data", type = "general", t=2)

      
       # region part1:DefineVariablesAndInputData
       # ~~~~~~~~~~~~~~~~~~~ Part 1: Define Variables and Input Data ~~~~~~~~~~~~~~~~~~~~
       # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       # endregion part1:DefineVariablesAndInputData
      


In [199]:
print("#", " Part 3: Create the Main Markdown Content Section of the File ".center(80, "~"))

# ~~~~~~~~~ Part 3: Create the Main Markdown Content Section of the File ~~~~~~~~~


In [210]:
print("#", " Webpage (iframe) ".center(50, "~"))


# ~~~~~~~~~~~~~~~~ Webpage (iframe) ~~~~~~~~~~~~~~~~
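
The banner lines printed above could also be built as strings, which makes them easier to reuse when writing to a file. This is a sketch only: `bannerLine` is a hypothetical helper, not part of calpa, and it mirrors the `print("#", ...center(...))` pattern from the cells above.

```python
def bannerLine(label, width=80, fill="~"):
    """Return a comment line with label centered in a run of fill characters."""
    # print("#", text) joins its arguments with one space, so "# " + text matches it
    return "# " + f" {label} ".center(width, fill)

line = bannerLine("Webpage (iframe)", width=50)
print(line)

# The result is always width + 2 characters long when the padded label fits within width
print(len(line))
```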


In [8]:
myNotesPath = os.path.join(
            prjDirs["pathScriptsMd"], "notes", "2013-2014", "AB1465.md"
        )

In [10]:
# Get the parent directory of the myNotesPath
myNotesDir = os.path.dirname(myNotesPath)
# Get the parent directory of the myNotesDir
myNotesDirParent = os.path.dirname(myNotesDir)
print(myNotesPath)
print(myNotesDir)
print(myNotesDirParent)

c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\markdown\notes\2013-2014\AB1465.md
c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\markdown\notes\2013-2014
c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\markdown\notes
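
The same parent lookups can be written with `pathlib` instead of nested `os.path.dirname` calls. A minimal sketch, assuming the note path printed above; `notePath` is an illustrative name, and `PureWindowsPath` is used so the Windows-style path parses the same way on any operating system.

```python
from pathlib import PureWindowsPath

# Pure paths only manipulate the string; they never touch the file system
notePath = PureWindowsPath(
    r"c:\Users\ktale\OneDrive\Documents\GitHub\CaLPA\markdown\notes\2013-2014\AB1465.md"
)

print(notePath.parent)       # folder holding the note (same as os.path.dirname)
print(notePath.parents[1])   # one level further up
print(notePath.stem)         # file name without the extension
```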
