# Transfer Learning using a Pre-trained Transformer Model for Text Summarization

**Approach**

In this notebook, we tackle text summarization, a natural language processing (NLP) task, using T5, a pre-trained model built on the **encoder-decoder** Transformer architecture. We apply transfer learning by fine-tuning the pre-trained T5 model, which we obtain from the Hugging Face library.


- Step 1: Load the raw dataset
- Step 2: Tokenize the raw dataset
- Step 3: Instantiate a pre-trained model from the model checkpoint
- Step 4: Create a data collator
- Step 5: Create train and test dataset loader objects
- Step 6: Compile the model
- Step 7: Train the model
- Step 8: Test the model by generating a summary of a sample text



**Dataset**

BillSum is a text summarization dataset of US Congressional and California state bills. This notebook uses its "ca_test" split, which has 1237 samples, each containing the bill text ("text"), a summary of the bill ("summary"), and the title of the bill ("title").


**Acknowledgment**

This notebook is adapted from the following resources.
- https://huggingface.co/learn/nlp-course/chapter7/5?fw=tf
- https://huggingface.co/docs/transformers/tasks/summarization
- https://github.com/huggingface/notebooks/blob/main/examples/summarization-tf.ipynb
- https://keras.io/examples/nlp/t5_hf_summarization/



In [1]:
! pip install transformers datasets
! pip install rouge-score
! pip install huggingface_hub
! pip install keras_nlp


**Store the Fine-Tuned Model on the Hugging Face Hub**

We can store the fine-tuned model in a Hugging Face Hub repository, which makes it easier to reuse.

We will use the push_to_hub API for this purpose. This requires a Hugging Face account (sign up at https://huggingface.co/welcome). Create an authentication token, then enter it when prompted after running the following cell.


In [2]:
from huggingface_hub import notebook_login

notebook_login()

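The interactive prompt from notebook_login is not available in a script or scheduled job. One possible alternative, sketched here under assumptions, reads the token from an environment variable and passes it to the login function of huggingface_hub. The variable name HF_TOKEN and the helpers read_hub_token and login_from_env are choices made for this sketch, not part of the original notebook.

```python
import os
from typing import Optional


def read_hub_token(env_var: str = "HF_TOKEN") -> Optional[str]:
    """Return the token stored in env_var, or None if it is unset or blank."""
    token = os.environ.get(env_var, "").strip()
    return token or None


def login_from_env(env_var: str = "HF_TOKEN") -> bool:
    """Log in to the Hugging Face Hub if a token is available.

    Returns True when a login was attempted, False when no token was found.
    """
    token = read_hub_token(env_var)
    if token is None:
        return False
    # Imported lazily so the helper can be defined without huggingface_hub installed.
    from huggingface_hub import login

    login(token=token)
    return True
```

Calling login_from_env() at the top of a script would then replace the notebook_login() cell; when it returns False, falling back to notebook_login() keeps the interactive path working.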

In [3]:
import os
import logging

import random
from IPython.display import display, HTML

import numpy as np
import pandas as pd

from sklearn.metrics import classification_report

import tensorflow as tf
import keras_nlp

import datasets
from datasets import load_dataset

import transformers
from transformers.keras_callbacks import PushToHubCallback, KerasMetricCallback
from transformers import AutoTokenizer, DataCollatorForSeq2Seq
from transformers import TFAutoModelForSeq2SeqLM
from transformers import create_optimizer, AdamWeightDecay


# Only log error messages
# tf.get_logger().setLevel(logging.ERROR)

# os.environ["TOKENIZERS_PARALLELISM"] = "false"


print(transformers.__version__)


Using TensorFlow backend
4.34.0


**Useful Variables**

In [4]:
TRAIN_TEST_SPLIT = 0.2  # Fraction of the dataset held out for evaluation
MAX_INPUT_LENGTH = 1024  # Maximum length of the input to the model
MAX_TARGET_LENGTH = 128  # Maximum length of the output from the model

# The t5-small checkpoint from the Hugging Face Model Hub
MODEL_CHECKPOINT = "t5-small"
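
To make the two length limits concrete, here is a rough sketch of what truncation does to an example. It is an illustration only: whitespace splitting stands in for the real T5 subword tokenizer, the texts are placeholders, and truncate_tokens is a hypothetical helper rather than a library function.

```python
MAX_INPUT_LENGTH = 1024  # Maximum length of the input to the model
MAX_TARGET_LENGTH = 128  # Maximum length of the output from the model


def truncate_tokens(text: str, max_length: int) -> list:
    """Split on whitespace (a stand-in for subword tokenization)
    and keep at most max_length tokens."""
    return text.split()[:max_length]


long_bill = "word " * 3000  # placeholder input, longer than the input limit
short_summary = "This bill would require signage."  # placeholder target

input_tokens = truncate_tokens(long_bill, MAX_INPUT_LENGTH)
target_tokens = truncate_tokens(short_summary, MAX_TARGET_LENGTH)

print(len(input_tokens), len(target_tokens))  # 1024 5
```

Anything past the limit is dropped, so the tail of a long bill never reaches the model. A real subword tokenizer usually produces more tokens than a whitespace split, so the limit is reached sooner than this sketch suggests.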

**Step 1: Load the raw dataset**

Because we pass the split argument, the load_dataset function returns a single Dataset object rather than a dictionary object of type DatasetDict.


In [5]:
raw_datasets = load_dataset("billsum", split="ca_test")
raw_datasets


Downloading builder script:   0%|          | 0.00/3.66k [00:00<?, ?B/s]

Downloading metadata:   0%|          | 0.00/1.80k [00:00<?, ?B/s]

Downloading readme:   0%|          | 0.00/6.70k [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/67.3M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/18949 [00:00<?, ? examples/s]

Generating test split:   0%|          | 0/3269 [00:00<?, ? examples/s]

Generating ca_test split:   0%|          | 0/1237 [00:00<?, ? examples/s]

Dataset({
    features: ['text', 'summary', 'title'],
    num_rows: 1237
})

In [6]:
# Split the dataset into a train and test set with the train_test_split method

raw_datasets = raw_datasets.train_test_split(test_size=TRAIN_TEST_SPLIT)
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['text', 'summary', 'title'],
        num_rows: 989
    })
    test: Dataset({
        features: ['text', 'summary', 'title'],
        num_rows: 248
    })
})
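
The split sizes shown above follow from TRAIN_TEST_SPLIT and the 1237 rows. A minimal sketch of the arithmetic, assuming the fractional test size is rounded up and the remaining rows go to training (which matches the 989/248 output); split_sizes is a hypothetical helper, not part of the datasets library.

```python
import math
from typing import Tuple


def split_sizes(num_rows: int, test_fraction: float) -> Tuple[int, int]:
    """Return (n_train, n_test) for a fractional test size."""
    n_test = math.ceil(num_rows * test_fraction)  # round the test share up
    n_train = num_rows - n_test                   # everything else is training data
    return n_train, n_test


n_train, n_test = split_sizes(1237, 0.2)
print(n_train, n_test)  # 989 248
```

Since train_test_split shuffles by default, the rows in each split change from run to run. Passing a fixed seed, for example train_test_split(test_size=TRAIN_TEST_SPLIT, seed=42), should make the split reproducible; the value 42 is arbitrary.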

In [7]:

# Access a sample of the training raw_datasets object by indexing
raw_train_dataset = raw_datasets["train"]
print(raw_train_dataset[0])

{'text': 'The people of the State of California do enact as follows:\n\n\nSECTION 1.\nArticle 3 (commencing with Section 115810) is added to Chapter 4 of Part 10 of Division 104 of the Health and Safety Code, to read:\nArticle  3. The Consideration of Alternatives for Artificial Turf Infill Act of 2016\n115810.\nThe Legislature finds and declares all of the following:\n(a) Thousands of schools, parks, and local governments have installed artificial turf fields throughout the state. It has allowed them to use fields year round, save water, and save money, among other benefits.\n(b) Not all artificial turf fields are made from the same materials. While most artificial turf fields use less expensive crumb rubber infill from groundup used car and truck tires, many companies now offer artificial turf infill alternatives made from coconut fibers, rice husks, cork, sand, or virgin crumb rubber. Organic alternative infills can help reduce synthetic turf field temperatures on hot days by as muc

In [8]:
# Inspect the features of the raw_train_dataset object
raw_train_dataset.features

{'text': Value(dtype='string', id=None),
 'summary': Value(dtype='string', id=None),
 'title': Value(dtype='string', id=None)}

**Display some random samples from the dataset**

In [9]:
# A function to display some random samples from the dataset
def show_random_elements(dataset, num_examples=5):
    assert num_examples <= len(
        dataset
    ), "Can't pick more elements than there are in the dataset."
    # Draw unique indices without replacement
    picks = random.sample(range(len(dataset)), num_examples)

    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))


# Display some random samples from the training dataset
show_random_elements(raw_datasets["train"])

Unnamed: 0,text,summary,title
0,"The people of the State of California do enact as follows:\n\n\nSECTION 1.\nSection 79707.5 is added to the Water Code, to read:\n79707.5.\nIn furtherance of subdivision (g) of Section 79707, all recipients of funding pursuant to Chapter 6 (commencing with Section 79730) shall post signs acknowledging the source of funds in accordance with guidelines that the secretary shall develop. For the purposes of this section, state funding shall be listed first on the sign if the state is the source of 50 percent or more of the total project costs.\nSECTION 1.\nChapter 11.2 (commencing with Section 8852) is added to Division 1 of Title 2 of the\nGovernment Code\n, to read:\n11.2.\nState Facilities Renewal Bond Act of 2016\n1.\nGeneral Provisions\n8852.\nThis chapter shall be known as the State Facilities Renewal Bond Act of 2016.\n8852.1.\nAs used in this chapter, the following terms have the following meanings:\n(a)“Committee” means the State Facilities Renewal Bond Finance Committee created pursuant to Section 8852.31.\n(b)“Fund” means the State Facilities Renewal Bond Fund created pursuant to Section 8852.2.\n(c)“State agency” means any state agency, department, office, division, bureau, board, commission, district agricultural association, the California State University, the University of California, and the Judicial Council.\n(d)“Deferred maintenance projects” means delayed projects to replace infrastructure and building components in order to preserve or maintain these assets in an acceptable condition.\n2.\nState Facilities Renewal Bond Fund and Program\n8852.2.\n(a)The proceeds of bonds issued and sold pursuant to this chapter shall be deposited in the State Facilities Renewal Bond Fund, which is hereby created. Fund moneys shall only be used to address deferred maintenance projects on state-owned property and shall be made available for expenditure only upon appropriation by the Legislature in the annual Budget Act. 
Funds shall be appropriated to state agencies as part of their respective agency budgets for state operations. Fund moneys appropriated to a state agency shall supplement, not supplant, an agency’s existing deferred maintenance expenditures. It is the intent of the Legislature that the projects funded by these bonds shall have a useful life of at least 20 years.\n(b)The Governor shall propose appropriations from the State Facilities Renewal Bond Fund as part of his or her January 10 budget proposal.\n(1)Within 10 days following release of the budget proposal, the Department of Finance shall report all of the following to the respective budget committees of the Legislature:\n(A)The administration’s methodology for allocating the bond funds among the various state agencies.\n(B)The criteria used for establishing deferred maintenance project funding priorities.\n(2)A state agency for which the Governor proposes an appropriation from the State Facilities Renewal Bond Fund shall report, within 30 days following the release of the budget proposal, the following to the respective budget committees of the Legislature:\n(A)The agency’s total deferred maintenance backlog.\n(B)The agency’s deferred maintenance expenditures in the prior fiscal year.\n(C)A list of deferred maintenance projects proposed to be undertaken by the agency with moneys from the fund proposed for appropriation.\n(D)The agency’s expenditures in the prior fiscal year for maintenance other than deferred maintenance.\n(E)The extent to which the agency’s current budget for maintenance is insufficient to prevent an increase in the agency’s deferred maintenance backlog.\n3.\nFiscal\n8852.3.\nBonds in the total amount of two billion dollars ($2,000,000,000), or so much thereof as is necessary, not including the amount of any refunding bonds, or so much thereof as is necessary, may be issued and sold to provide a fund to be used for carrying out the purposes expressed in this chapter and to reimburse the General 
Obligation Bond Expense Revolving Fund pursuant to Section 16724.5. The bonds, when sold, shall be and constitute a valid and binding obligation of the State of California, and the full faith and credit of the State of California is hereby pledged for the punctual payment of both principal of, and interest on, the bonds as the principal and interest become due and payable. The bonds issued pursuant to this chapter shall be repaid within 20 years from the date they are issued.\n8852.31.\nThe bonds authorized by this chapter shall be prepared, executed, issued, sold, paid, and redeemed as provided in the State General Obligation Bond Law (Chapter 4 (commencing with Section 16720) of Part 3 of Division 4 of Title 2), and all of the provisions of that law apply to the bonds and to this chapter and are hereby incorporated in this chapter as though set forth in full in this chapter, except subdivisions (a) and (b) of Section 16727.\n8852.32.\n(a)Solely for the purpose of authorizing the issuance and sale pursuant to the State General Obligation Bond Law of the bonds authorized by this chapter, the State Facilities Renewal Bond Finance Committee is hereby created. For purposes of this chapter, the State Facilities Renewal Bond Finance Committee is “the committee” as that term is used in the State General Obligation Bond Law. The committee consists of the Controller, Director of Finance, and Treasurer, or their designated representatives.\n(b)The Treasurer shall serve as chairperson of the committee.\n(c)A majority of the committee may act for the committee.\n8852.33.\nThe committee shall determine whether or not it is necessary or desirable to issue bonds authorized pursuant to this chapter in order to carry out the actions specified in Section 8852.2 and, if so, the amount of bonds to be issued and sold. 
Successive issues of bonds may be authorized and sold to carry out those actions progressively, and it is not necessary that all of the bonds authorized to be issued be sold at any one time.\n8852.34.\nThere shall be collected each year and in the same manner and at the same time as other state revenue is collected, in addition to the ordinary revenues of the state, a sum in an amount required to pay the principal of, and interest on, the bonds each year. It is the duty of all officers charged by law with any duty in regard to the collection of the revenue to do and perform each and every act that is necessary to collect that additional sum.\n8852.35.\nNotwithstanding Section 13340, there is hereby appropriated from the General Fund in the State Treasury, for the purposes of this chapter, an amount that will equal the total of the following:\n(a)The sum annually necessary to pay the principal of, and interest on, bonds issued and sold pursuant to this chapter, as the principal and interest become due and payable.\n(b)The sum necessary to carry out Section 8852.36, appropriated without regard to fiscal years.\n8852.36.\nFor the purposes of carrying out this chapter, the Director of Finance may authorize the withdrawal from the General Fund of an amount not to exceed the amount of the unsold bonds that have been authorized by the committee to be sold for the purpose of carrying out this chapter. Any amounts withdrawn shall be deposited in the fund. 
Any moneys made available under this section shall be returned to the General Fund, with interest at the rate earned by the moneys in the Pooled Money Investment Account, from proceeds received from the sale of bonds for the purpose of carrying out this chapter.\n8852.37.\nAll moneys deposited in the fund that is derived from premium and accrued interest on bonds sold shall be reserved in the fund and shall be available for transfer to the General Fund as a credit to expenditures for bond interest.\n8852.38.\nPursuant to Chapter 4 (commencing with Section 16720) of Part 3 of Division 4 of Title 2, the cost of bond issuance shall be paid out of the bond proceeds. These costs shall be shared proportionally by each program funded through this bond act.\n8852.39.\nThe committee may request the Pooled Money Investment Board to make a loan from the Pooled Money Investment Account, including other authorized forms of interim financing that include, but are not limited to, commercial paper, in accordance with Section 16312, for purposes of carrying out this chapter. The amount of the request shall not exceed the amount of the unsold bonds that the committee, by resolution, has authorized to be sold for the purpose of carrying out this chapter. The committee shall execute any documents required by the Pooled Money Investment Board to obtain and repay the loan. Any amounts loaned shall be deposited in the fund to be allocated by the board in accordance with this chapter.\n8852.40.\nThe bonds may be refunded in accordance with Article 6 (commencing with Section 16780) of Chapter 4 of Part 3 of Division 4 of Title 2, which is a part of the State General Obligation Bond Law. 
Approval by the voters of the state for the issuance of the bonds described in this chapter includes the approval of the issuance of any bonds issued to refund any bonds originally issued under this chapter or any previously issued refunding bonds.\n8852.41.\nNotwithstanding any other provision of this chapter, or of the State General Obligation Bond Law, if the Treasurer sells bonds pursuant to this chapter that include a bond counsel opinion to the effect that the interest on the bonds is excluded from gross income for federal tax purposes, subject to designated conditions, the Treasurer may maintain separate accounts for the investment of bond proceeds and for the investment of earnings on those proceeds. The Treasurer may use or direct the use of those proceeds or earnings to pay any rebate, penalty, or other payment required under federal law or take any other action with respect to the investment and use of those bond proceeds required or desirable under federal law to maintain the tax exempt status of those bonds and to obtain any other advantage under federal law on behalf of the funds of this state.\n8852.42.\nThe Legislature hereby finds and declares that, inasmuch as the proceeds from the sale of bonds authorized by this chapter are not “proceeds of taxes” as that term is used in Article XIII B of the California Constitution, the disbursement of these proceeds is not subject to the limitations imposed by that article.\nSEC. 2.\nSection 1 of this act shall take effect upon the approval by the voters of the State Facilities Renewal Bond Act of 2016, as set forth in Section 1 of this act.\nSEC. 
3.\nSection 1 of this act shall be submitted to the voters at the June 7, 2016, statewide primary election in accordance with provisions of the Government Code and the Elections Code governing the submission of a statewide measure to the voters.","Existing law, the Water Quality, Supply, and Infrastructure Improvement Act of 2014, approved by the voters as Proposition 1 at the November 4, 2014, statewide general election, authorizes the issuance of general obligation bonds in the amount of $7,545,000,000 to finance a water quality, supply, and infrastructure improvement program. The act provides that it is the intent of the people that, to the extent practicable, a project supported by the funds made available by the act will include signage informing the public that the project received funds from the act.\nThis bill would require certain recipients of funding pursuant to the act to post signs acknowledging the source of funds in accordance with guidelines that the Secretary of the Natural Resources Agency would be required to develop.\nThe annual Budget Act appropriates funds to state agencies for operations as part of their respective agency budgets. 
Existing law requires the Department of General Services to report to the Legislature, as specified, on expenditures for seismic hazard abatement for state buildings and facilities, in connection with the Earthquake Safety and Public Buildings Rehabilitation Bond Act of 1990.\nThis bill would enact the State Facilities Renewal Bond Act of 2016, which, if adopted by the voters at the June 7, 2016, statewide primary election, would authorize the issuance of bonds in the amount of $2,000,000,000, pursuant to the State General Obligation Bond Law, to finance deferred maintenance on state-owned property, subject to appropriation by the Legislature in the annual Budget Act.","An act to\nadd Chapter 11.2 (commencing with Section 8852) to Division 1 of Title 2 of the Government\nadd Section 79707.5 to the Water\nCode, relating to\nfinancing deferred maintenance on state facilities, by providing the funds necessary therefor through an election for the issuance and sale of bonds of the State of California and for the handling and disposition of those funds.\nwater."
1,"The people of the State of California do enact as follows:\n\n\nSECTION 1.\nIt is the intent of the Legislature to further the purposes of the federal Stephen Beck Jr., Achieving a Better Life Experience Act to ensure that people with disabilities may save for the future to achieve greater independence.\nSEC. 2.\nThis act shall be known, and may be cited, as the California Achieving a Better Life Experience Act.\nSEC. 3.\nSection 23711.4 is added to the Revenue and Taxation Code, to read:\n23711.4.\nFor taxable years beginning on or after January 1, 2016, Section 529A of the Internal Revenue Code, relating to qualified ABLE programs, added by Section 102 of Division B of Public Law 113-295, shall apply, except as otherwise provided.\n(a) Section 529A(a) of the Internal Revenue Code is modified as follows:\n(1) By substituting the phrase “under Part 10 (commencing with Section 17001) and this part” in lieu of the phrase “under this subtitle.”\n(2) By substituting “Article 2 (commencing with Section 23731)” in lieu of “Section 511.”\n(b) Section 529A(c)(3)(A) of the Internal Revenue Code is modified by substituting “2.5 percent” in lieu of “10 percent.”\n(c) A copy of the report required to be filed with the Secretary of the Treasury under Section 529A(d) of the Internal revenue Code, relating to reports shall be filed with the Franchise Tax Board at the same time and in the same manner as specified in that section.\nSEC. 
4.\nSection 4877 is added to the Welfare and Institutions Code, to read:\n4877.\n(a) There is hereby created an instrumentality of the State of California to be known as the California ABLE Program Trust.\n(b) The purposes, powers, and duties of the California ABLE Program Trust are vested in, and shall be exercised by, the board.\n(c) The board, in the capacity of trustee, shall have the power and authority to do all of the following:\n(1) Sue and be sued.\n(2) Make and enter into contracts necessary for the administration of the ABLE program trust, and engage personnel, including consultants, actuaries, managers, counsel, and auditors, as necessary for the purpose of rendering professional, managerial, and technical assistance and advice.\n(3) Adopt a corporate seal and change and amend it from time to time.\n(4) Cause moneys in the program fund to be held and invested and reinvested.\n(5) Accept any grants, gifts, appropriations, and other moneys from any unit of federal, state, or local government or any other person, firm, partnership, or corporation for deposit to the administrative fund or the program fund.\n(6) Enter into agreements with designated beneficiaries or eligible individuals to establish and maintain an ABLE account.\n(7) Make provisions for the payment of costs of administration and operation of the ABLE program trust.\n(8) Carry out the duties and obligations of the ABLE program trust pursuant to this chapter and the federal ABLE Act pursuant to Section 529A of the Internal Revenue Code and federal regulations issued pursuant to that code, and have any other powers as may be reasonably necessary for the effectuation of the purposes, objectives, and provisions of this chapter.\n(9) Carry out studies and projections in order to advise designated beneficiaries or eligible individuals regarding present and estimated future qualified disability expenses and the levels of financial participation in the ABLE program trust required in order to assist 
designated beneficiaries or eligible individuals.\n(10) Participate in any other way in any federal, state, or local governmental program for the benefit of the ABLE program trust.\n(11) Promulgate, impose, and collect administrative fees and charges in connection with transactions of the ABLE program trust, and provide for reasonable service charges, including penalties for cancellations.\n(12) Set minimum and maximum investment levels.\n(13) Administer the funds of the ABLE program trust.\n(14) Procure insurance against any loss in connection with the property, assets, or activities of the ABLE program trust.\n(15) Procure insurance indemnifying any member of the board from personal loss or liability resulting from a member’s action or inaction as a member of the board.\n(d) The Treasurer shall, on behalf of the board, appoint an executive director, who shall not be a member of the board and who shall serve at the pleasure of the board. The Treasurer shall determine the duties of the executive director and other staff as necessary and set his or her compensation. The board may authorize the executive director to enter into contracts on behalf of the board or conduct any business necessary for the efficient operation of the board.\nSEC. 5.\nSection 4878 is added to the Welfare and Institutions Code, to read:\n4878.\n(a) The board shall segregate moneys received by the ABLE program trust into two funds, which shall be identified as the program fund and the administrative fund.\n(1) Notwithstanding Section 13340 of the Government Code, the program fund is hereby continuously appropriated, without regard to fiscal years, to the ABLE Act Board for the purposes specified in this act.\n(2) The moneys in the administrative fund shall be available for the ABLE Act Board, upon appropriation, for administration of the act. 
Administrative costs shall not exceed 3 percent of the incoming funds for each fiscal year for the first five fiscal years following the opening of the first ABLE Act account. After the five-year period, administrative costs shall not exceed 1 percent of the incoming funds for each fiscal year.\n(3) Funding for startup and administrative costs for the board shall be provided in the form of a loan from the General Fund sufficient to cover the board’s projected administrative costs for its first two years of implementing the program. Once the loan has been expended and revenues from the program are sufficient to cover the board’s ongoing costs, the board shall repay, within five years, the amount loaned, plus interest calculated at the rate earned by the Pooled Money Investment Account.\n(b) Not later than 30 days after the close of each month, the investment manager shall place on file for public inspection during business hours a report with respect to investment performance. The investment manager shall report the following information, to the extent applicable, to the board within 30 days following the end of each month:\n(1) The type of investment, name of the issuer, date of maturity, and the par and dollar amount invested in each security, investment, and money within the program fund.\n(2) The weighted average maturity of the investments within the program fund.\n(3) Any amounts in the program fund that are under the management of an investment manager.\n(4) The market value as of the date of the report and the source of this valuation for any security within the program fund.\n(5) A description of the compliance with the statement of investment policy.\n(c) Moneys in the program fund may be invested or reinvested by the Treasurer or may be invested in whole or in part under contract with an investment manager, as determined by the board.\n(d) The board shall annually prepare and adopt a written statement of investment policy. 
The board shall consider the statement of investment policy and any changes in the investment policy at a public hearing. The board shall approve the investment management entity or entities consistent with subdivision (c).\n(e) Transfers may be made from the program fund to the administrative fund for the purpose of paying operating costs associated with administering the ABLE program trust and as required by this chapter. All costs of administration of the ABLE program trust shall be paid out of the administrative fund.\n(f) All moneys paid by designated beneficiaries or eligible individuals in connection with ABLE accounts shall be deposited as received into the program fund, and shall be promptly invested and accounted for separately. Deposits and interest thereon accumulated on behalf of designated beneficiaries in the program fund of the ABLE program trust may be used for qualified disability expenses.\n(g) The board shall maintain separate accounting for each designated beneficiary.\n(h) Any designated beneficiary may, directly or indirectly, direct the investment of any contributions to his or her ABLE account, or any earnings thereon, no more than two times in any calendar year.\n(i) The assets of the trust, including the program fund, shall at all times be preserved, invested, and expended solely and only for the purposes of the trust and shall be held in trust for the designated beneficiaries and no property rights therein shall exist in favor of the state. The assets shall not be transferred or used by the state for any purposes other than the purposes of the trust and consistent with the provisions of the federal ABLE Act.\nSEC. 
6.\nSection 4880 is added to the Welfare and Institutions Code, to read:\n4880.\nNotwithstanding any other law, moneys in, contributions to, and any distribution for qualified disability expenses from, an ABLE account, not to exceed one hundred thousand dollars ($100,000), shall not count toward determining eligibility for a state or local means-tested program.\nSEC. 7.\nSection 4882 is added to the Welfare and Institutions Code, to read:\n4882.\n(a) The board shall adopt regulations as it deems necessary to implement this chapter consistent with the federal Internal Revenue Code and regulations issued pursuant to that code to ensure that this program meets all criteria for federal tax-exempt benefits.\n(b) The board may adopt regulations to implement this chapter as emergency regulations in accordance with the rulemaking provisions of the Administrative Procedure Act (Chapter 3.5 (commencing with Section 11340) of Part 1 of Division 3 of Title 2 of the Government Code). The adoption of the regulations shall be deemed to be an emergency and necessary for the immediate preservation of the public peace, health and safety, or general welfare.\nSEC. 8.\nSection 4884 is added to the Welfare and Institutions Code, to read:\n4884.\nThe board shall market this program to residents of the State of California to the extent funds are available to do so.\nSEC. 
9.\nThis act shall become operative only if Senate Bill 324 of the 2015–16 Regular Session is enacted and takes effect on or before January 1, 2016.","The Personal Income Tax Law and the Corporation Tax Law, in specified conformity with federal income tax laws regarding qualified tuition programs, provide that distributions from a qualified tuition program are generally not included in the income of the donor or the beneficiary, as specified.\nExisting federal law, the Stephen Beck Jr., Achieving a Better Life Experience Act of 2014 (ABLE Act), for taxable years beginning on or after January 1, 2015, encourages and assists individuals and families to save private funds for the purpose of supporting persons with disabilities to maintain their health, independence, and quality of life by excluding from gross income distributions used for qualified disability expenses by a beneficiary of a Qualified ABLE Program established and maintained by a state, as specified.\nThis bill would, for taxable years beginning on or after January 1, 2016, conform to these federal income tax law provisions relating to the ABLE Act under the Corporation Tax Law, as provided. The bill would also establish in state government the ABLE program trust for purposes of implementing the federal ABLE Act. The bill would authorize the ABLE Act Board to adopt regulations to implement the program. The bill would create the program fund, a continuously appropriated fund, thereby making an appropriation, and the administrative fund, as specified. The bill would require the board to administer the program in compliance with the requirements of the federal ABLE Act.\nThis bill would become operative only if SB 324 is enacted and takes effect on or before January 1, 2016.","An act to add Section 23711.4 to the Revenue and Taxation Code, and to add Sections 4877, 4878, 4880, 4882, and 4884 to the Welfare and Institutions Code, relating to taxation, and making an appropriation therefor."
2,"The people of the State of California do enact as follows:\n\n\nSECTION 1.\nSection 17140.4 is added to the Revenue and Taxation Code, to read:\n17140.4.\nFor taxable years beginning on or after January 1, 2016, Section 529A of the Internal Revenue Code, relating to qualified ABLE programs, added by Section 102 of Division B of Public Law 113-295, shall apply, except as otherwise provided.\n(a) Section 529A(a) of the Internal Revenue Code is modified as follows:\n(1) By substituting the phrase “under this part and Part 11 (commencing with Section 23001)” in lieu of the phrase “under this subtitle.”\n(2) By substituting “Article 2 (commencing with Section 23731)” in lieu of “Section 511.”\n(b) Section 529A(c)(3)(A) of the Internal Revenue Code is modified by substituting “2.5 percent” in lieu of “10 percent.”\n(c) A copy of the report required to be filed with the Secretary of the Treasury under Section 529A(d) of the Internal Revenue Code, relating to reports, shall be filed with the Franchise Tax Board at the same time and in the same manner as specified in that section.\nSEC. 2.\nChapter 15 (commencing with Section 4875) is added to Division 4.5 of the Welfare and Institutions Code, to read:\nCHAPTER 15. 
Qualified ABLE Program\n4875.\nFor purposes of this chapter:\n(a) “ABLE account” or “account” means the account established and owned by a designated beneficiary pursuant to this chapter for the purpose of meeting the qualified disability expenses of the designated beneficiary of the account.\n(b) “Administrative fund” means the fund used to administer this chapter.\n(c) “Board” means the California ABLE Act Board established under this chapter.\n(d) “California ABLE Program Trust” or “ABLE program trust” means the trust created pursuant to this chapter.\n(e) “Designated beneficiary” means the eligible individual who established an ABLE account and is the owner of the account.\n(f) “Eligible individual” means an individual who is eligible under the program for a taxable year if during that taxable year both of the following criteria are met:\n(1) The individual is entitled to benefits based on blindness or disability under Title II or XVI of the federal Social Security Act, and that blindness or disability occurred before the date on which the individual attained 26 years of age.\n(2) A disability certification, as defined in the federal ABLE Act, with respect to the individual is filed pursuant to the requirements set forth in the federal ABLE Act.\n(g) “Federal ABLE Act” means the federal Stephen Beck, Jr., Achieving a Better Life Experience Act of 2014.\n(h) “Investment management” means the functions performed by a manager contracted to perform functions delegated by the board.\n(i) “Investment manager” means a manager contracted to perform functions delegated by the board.\n(j) “Program fund” means the program fund established by this chapter, which shall be held as a separate fund within the California ABLE Program Trust.\n(k) “Qualified ABLE Program” or “program” means the program established by this chapter to implement the federal ABLE Act pursuant to Section 529A of the Internal Revenue Code.\n(l) “Qualified disability expenses” means any expenses related 
to the eligible individual’s blindness or disability that are made for the benefit of an eligible individual who is the designated beneficiary, including expenses related to education, housing, transportation, employment training and support, assistive technology and personal support services, health, prevention and wellness, financial management and administrative services, legal fees, expenses for oversight and monitoring, funeral and burial expenses, and other expenses, which are approved by the Secretary of the Treasury under regulations and consistent with the purposes of the federal ABLE Act.\n4876.\nThere is hereby created the California ABLE Act Board that consists of the Treasurer, the Director of Finance, the Controller, the Director of Developmental Services, the Chairperson of the State Council on Developmental Disabilities, the Director of Rehabilitation, and the Chair of the State Independent Living Council, or their designees. The Treasurer shall serve as chair of the board.\n4879.\n(a) Under the program, a person may make contributions for a taxable year, for the benefit of an individual who is an eligible individual for that taxable year, to an ABLE account that is established for the purpose of meeting the qualified disability expenses of the designated beneficiary of the account if both of the following criteria are met:\n(1) The designated beneficiary is limited to one ABLE account for purposes of this chapter.\n(2) The ABLE account is established only for a designated beneficiary who is a resident of this state.\n(b) A contribution shall not be accepted if either of the following occurs:\n(1) The contribution is not in cash.\n(2) Except in the case of contributions under Section 529A(c)(1)(C) of the Internal Revenue Code, relating to change in designated beneficiaries or programs, the contribution to an ABLE account would result in aggregate contributions from all contributors to the ABLE account for the taxable year exceeding the amount in 
effect under Section 2503(b) of the Internal Revenue Code, relating to exclusion from gifts, for the calendar year in which the taxable year begins.\n(c) The designated beneficiary shall retain ownership of all contributions made to the designated beneficiary’s ABLE account to the date of utilization for qualified disability expenses, and all interest derived from the investment of the contributions to the designated beneficiary’s ABLE account shall be deemed to be held in the ABLE program trust for the benefit of the designated beneficiary. Neither the contributions, nor any interest derived therefrom, may be pledged as collateral for any loan.\n(d) The board shall develop adequate safeguards to prevent aggregate contributions on behalf of a designated beneficiary in excess of the maximum contribution limits necessary to provide for the qualified disability expenses of the designated beneficiary. For purposes of this subdivision, aggregate contributions include contributions under any prior qualified ABLE program of any state or agency or instrumentality thereof.\n4881.\n(a) The board shall provide an annual listing of distributions to individuals with respect to an interest in an ABLE account to the Franchise Tax Board at a time and in a manner and form as specified by the Franchise Tax Board. 
The taxpayers’ identification numbers obtained in connection with an ABLE account shall be used exclusively for state and federal tax administration purposes.\n(b) The board shall make a report to the appropriate individual of any distribution to any individual with respect to an interest in an ABLE account, at a time and in a form and manner as required by the Franchise Tax Board.\n(c) The board shall report annually to each designated beneficiary all of the following:\n(1) The value of the designated beneficiary’s account.\n(2) The interest earned thereon.\n(3) The rate of return of the investments in the designated beneficiary’s account for that reporting period.\n(4) Information on investments and qualified disability expenses that designated beneficiaries can use to set savings goals and contribution amounts.\n(d) The board shall provide a means for designated beneficiaries to express concerns or comments regarding the ABLE program trust and any information required to be reported by this section.\n4883.\nThis act shall be construed liberally in order to effectuate its legislative intent. The purposes of this act and all of its provisions with respect to powers granted shall be broadly interpreted to effectuate the intent and purposes of the federal ABLE Act and not as a limitation of those powers.\nSEC. 
3.\nThis act shall only become effective if Assembly Bill 449 of the 2015–16 Regular Session is enacted and becomes effective.","The Personal Income Tax Law and the Corporation Tax Law, in specified conformity with federal income tax laws regarding qualified tuition programs, provide that distributions from a qualified tuition program are generally not included in the income of the donor or the beneficiary, as specified.\nExisting federal law, the Stephen Beck, Jr., Achieving a Better Life Experience Act of 2014 (ABLE Act), for taxable years beginning on or after January 1, 2014, encourages and assists individuals and families to save private funds for the purpose of supporting persons with disabilities to maintain their health, independence, and quality of life by excluding from gross income distributions used for qualified disability expenses by a beneficiary of a Qualified ABLE Program established and maintained by a state, as specified.\nThis bill, for taxable years beginning on or after January 1, 2016, would conform to these federal income tax law provisions relating to the ABLE Act under the Personal Income Tax Law, as provided. The bill would create the ABLE Act Board and would require the board provide an annual listing of distributions to individuals that have an interest in an ABLE account to the Franchise Tax Board, as provided.\nThis bill would provide that it will only become effective if AB 449 is enacted and becomes effective.","An act to add Section 17140.4 to the Revenue and Taxation Code, and to add Chapter 15 (commencing with Section 4875) to Division 4.5 of the Welfare and Institutions Code, relating to taxation."
3,"The people of the State of California do enact as follows:\n\n\nSECTION 1.\nSection 3104 of the Elections Code is amended to read:\n3104.\nApplications for the ballots of military or overseas voters shall be received and, except as provided in\nSection 3106,\nSections 3106, 3106.2, and 3106.5,\nthe ballots shall be received and canvassed, at the same time and under the same procedure as vote by mail ballots, insofar as that procedure is not inconsistent with this chapter.\nSEC. 2.\nSection 3105 of the Elections Code is amended to read:\n3105.\n(a)\nAny\nAn\napplication made pursuant to this chapter that is received by the elections official prior to the 60th day before the election shall be kept and processed on or after the 60th day before the election.\n(b) (1) The elections official shall send the ballot not earlier than 60 days but not later than 45 days before the election and shall include with the ballot a list of all candidates who have qualified for the ballot and a list of all measures that are to be submitted to the voters and on which the voter is qualified to vote. 
The voter shall be entitled to write in the name of any specific candidate seeking nomination or election to any office listed on the ballot.\n(2) The military or overseas voter may, in the alternative to the ballot provided pursuant to paragraph (1), use a federal write-in absentee ballot to vote in any election in which the military or overseas voter is qualified to vote.\n(c) Notwithstanding Section 15341 or any other\nprovision of\nlaw, any name written upon a ballot for a particular office pursuant to subdivision (b) shall be counted for the office or nomination, providing the candidate whose name has been written on the ballot has, as of the date of the election, qualified to have his or her name placed on the ballot for the office, or has qualified as a write-in candidate for the office.\n(d) Except as provided in\nSection 3106,\nSections 3106, 3106.2, and 3106.5,\nthe elections official shall receive and canvass military or overseas voter ballots described in this section under the same procedure as vote by mail ballots, insofar as that procedure is not inconsistent with this section.\n(e) In the event that a military or overseas voter executes a ballot pursuant to this section and an application for a vote by mail ballot pursuant to Section 3102, the elections official shall process the application and the ballot in accordance with this chapter.\n(f) Notwithstanding any other\nprovision of\nlaw, a military or overseas voter who qualifies pursuant to this chapter may, by facsimile transmission, register to vote and apply for a ballot pursuant to this section or a vote by mail ballot. Upon request, the elections official shall send the ballot to the qualified military or overseas voter either by mail, facsimile, or electronic transmission, as requested by the voter.\nSEC. 
3.\nSection 3106.2 is added to the Elections Code, to read:\n3106.2.\n(a) A military or overseas voter, as described in subdivision (b) of Section 300, may return his or her vote by mail ballot by electronic mail in the manner prescribed in subdivision (b). To be counted, the ballot returned by electronic mail must be received by the voter’s elections official no later than the closing of the polls on election day and must be accompanied by a copy of an identification envelope containing all of the information required by Section 3011 and an oath of voter declaration in substantially the form described in subdivision (a) of Section 3106.\n(b) To submit a ballot by electronic mail, the ballot and accompanying identification envelope and oath of voter declaration must be scanned to create electronic copies of the documents. The electronic copies of the documents shall be included in the electronic mail sent to the elections official as attachments. The Secretary of State shall adopt uniform regulations for the use of electronic mail in returning ballots.\n(c) Notwithstanding the voter’s waiver of the right to a secret ballot, each elections official shall adopt appropriate procedures to protect the secrecy of ballots returned by electronic mail.\n(d) Upon receipt of a ballot returned by electronic mail, the elections official shall determine the voter’s eligibility to vote by comparing the signature on the scanned copy of the identification envelope with the signature on the voter’s affidavit of registration. The ballot shall be duplicated and all materials preserved according to procedures set forth in this code.\nSEC. 
4.\nSection 3106.5 is added to the Elections Code, to read:\n3106.5.\n(a) Notwithstanding any other law, a military or overseas voter, as described in subdivision (b) of Section 300, may cast his or her vote on the Internet by electronically marking his or her ballot and securely transmitting the voted ballot to the appropriate elections official using the Internet. To be counted, the voted ballot must be received by the voter’s elections official no later than the closing of the polls on election day.\n(b) The Secretary of State shall adopt uniform regulations for military and overseas voters to cast votes using the Internet.\n(c) This section shall become operative only if the Secretary of State certifies that he or she has identified and addressed all issues regarding the security of casting a vote using the Internet.\nSEC. 5.\nNo reimbursement is required by this act pursuant to Section 6 of Article XIII B of the California Constitution for certain costs that may be incurred by a local agency or school district because, in that regard, this act creates a new crime or infraction, eliminates a crime or infraction, or changes the penalty for a crime or infraction, within the meaning of Section 17556 of the Government Code, or changes the definition of a crime within the meaning of Section 6 of Article XIII B of the California Constitution.\nHowever, if the Commission on State Mandates determines that this act contains other costs mandated by the state, reimbursement to local agencies and school districts for those costs shall be made pursuant to Part 7 (commencing with Section 17500) of Division 4 of Title 2 of the Government Code.\nSECTION 1.\nChapter 1.5 (commencing with Section 3050) is added to Division 3 of the\nElections Code\n, to read:\n1.5.\nElectronic Ballot Transmission\n3050.\n(a)An elections official may send a voter a ballot by secure electronic transmission if the election is conducted wholly within the county.","Existing law requires that a vote by 
mail ballot be available to any registered voter and specifies the manner by which the ballot must be returned. Existing law permits a military or overseas voter who is temporarily living outside of the territorial limits of the United States or the District of Columbia, or is called to military service, to return his or her vote by mail ballot by facsimile transmission to the elections official. The ballot must be received by the closing of the election day polls and accompanied by an identification envelope and an oath of voter declaration in a prescribed form. Existing law requires a military or overseas voter who returns a ballot by facsimile transmission to agree in an oath of voter declaration under penalty of perjury to waive his or her right to a secret ballot and that he or she has not applied for a vote by mail ballot from any other jurisdiction for the election. The elections official is required to determine the voter’s eligibility to vote by comparing the voter’s signature from the materials returned by facsimile transmission to the signature on the voter’s affidavit of registration.\nThis bill would permit a military or overseas voter to return his or her ballot by electronic mail, as prescribed. The bill would require the ballot to be accompanied by a copy of an identification envelope and an oath of voter declaration in substantially the form described with respect to facsimile transmission of ballots. This bill would require the elections official to determine the voter’s eligibility to vote by comparing the signature on the scanned copy of the identification envelope with the signature on the voter’s affidavit of registration.\nThis bill would permit a military or overseas voter to cast his or her vote on the Internet by electronically marking his or her ballot and securely transmitting the voted ballot to the appropriate elections official. 
To be counted, the voted ballot must be received by the voter’s elections official no later than the closing of the polls on election day. These provisions would become operative only if the Secretary of State certifies that he or she has identified and addressed all issues regarding the security of casting a vote using the Internet.\nBecause the bill requires elections officials to provide a higher level of service and expands the scope of the crime of perjury, it would impose a state-mandated local program.\nThe California Constitution requires the state to reimburse local agencies and school districts for certain costs mandated by the state. Statutory provisions establish procedures for making that reimbursement.\nThis bill would provide that with regard to certain mandates no reimbursement is required by this act for a specified reason.\nWith regard to any other mandates, this bill would provide that, if the Commission on State Mandates determines that the bill contains costs so mandated by the state, reimbursement for those costs shall be made pursuant to the statutory provisions noted above.\nExisting law requires that a vote by mail ballot be available to any registered voter. Under existing law, a voter may: (1) return the vote by mail ballot by mail or in person to the elections official from whom it came; (2) return the vote by mail ballot in person to a member of a precinct board at a polling place within the jurisdiction, or (3), if unable to return the vote by mail ballot, designate his or her spouse, child, parent, grandparent, grandchild, brother, sister, or a person residing in the same household as the vote by mail voter to return the ballot to the elections official from whom it came or to the precinct board at a polling place within the jurisdiction.\nThis bill would authorize an elections official to send a voter a ballot by secure electronic transmission for an election conducted wholly within the county. 
This bill would also require a voter who receives a ballot in this manner to print the ballot for return to the elections official.","An act to\namend Sections 3104 and 3105 of, and to\nadd\nChapter 1.5 (commencing with Section 3050) to Division 3 of\nSections 3106.2 and 3106.5 to,\nthe Elections Code, relating to elections."
4,"The people of the State of California do enact as follows:\n\n\nSECTION 1.\nSection 23612 of the Vehicle Code is amended to read:\n23612.\n(a) (1) (A) A person who drives a motor vehicle is deemed to have given his or her consent to chemical testing of his or her blood or breath for the purpose of determining the alcoholic content of his or her blood, if lawfully arrested for an offense allegedly committed in violation of Section 23140, 23152, or 23153. If a blood or breath test, or both, are unavailable, then paragraph (2) of subdivision (d) applies.\n(B) A person who drives a motor vehicle is deemed to have given his or her consent to chemical testing of his or her blood for the purpose of determining the drug content of his or her blood, if lawfully arrested for an offense allegedly committed in violation of Section 23140, 23152, or 23153. If a blood test is unavailable, the person shall be deemed to have given his or her consent to chemical testing of his or her urine and shall submit to a urine test.\n(C) The testing shall be incidental to a lawful arrest and administered at the direction of a peace officer having reasonable cause to believe the person was driving a motor vehicle in violation of Section 23140, 23152, or 23153.\n(D) The person shall be told that his or her failure to submit to, or the failure to complete, the required chemical testing will result in a fine, mandatory imprisonment if the person is convicted of a violation of Section 23152 or 23153, and (i) the suspension of the person’s privilege to operate a motor vehicle for a period of one year, (ii) the revocation of the person’s privilege to operate a motor vehicle for a period of two years if the refusal occurs within 10 years of a separate violation of Section 23103 as specified in Section 23103.5, or of Section 23140, 23152, or 23153 of this code, or of Section 191.5 or subdivision (a) of Section 192.5 of the Penal Code that resulted in a conviction, or if the person’s privilege to 
operate a motor vehicle has been suspended or revoked pursuant to Section 13353, 13353.1, or 13353.2 for an offense that occurred on a separate occasion, or (iii) the revocation of the person’s privilege to operate a motor vehicle for a period of three years if the refusal occurs within 10 years of two or more separate violations of Section 23103 as specified in Section 23103.5, or of Section 23140, 23152, or 23153 of this code, or of Section 191.5 or subdivision (a) of Section 192.5 of the Penal Code, or any combination thereof, that resulted in convictions, or if the person’s privilege to operate a motor vehicle has been suspended or revoked two or more times pursuant to Section 13353, 13353.1, or 13353.2 for offenses that occurred on separate occasions, or if there is any combination of those convictions, administrative suspensions, or revocations.\n(2) (A) If the person is lawfully arrested for driving under the influence of an alcoholic beverage, the person has the choice of whether the test shall be of his or her blood or breath and the officer shall advise the person that he or she has that choice. If the person arrested either is incapable, or states that he or she is incapable, of completing the chosen test, the person shall submit to the remaining test. 
If a blood or breath test, or both, are unavailable, then paragraph (2) of subdivision (d) applies.\n(B) If the person is lawfully arrested for driving under the influence of any drug or the combined influence of an alcoholic beverage and any drug, the person has the choice of whether the test shall be of his or her blood or breath, and the officer shall advise the person that he or she has that choice.\n(C) A person who chooses to submit to a breath test may also be requested to submit to a blood test if the officer has reasonable cause to believe that the person was driving under the influence of a drug or the combined influence of an alcoholic beverage and a drug and if the officer has a clear indication that a blood test will reveal evidence of the person being under the influence. The officer shall state in his or her report the facts upon which that belief and that clear indication are based. The officer shall advise the person that he or she is required to submit to an additional test. The person shall submit to and complete a blood test. If the person arrested is incapable of completing the blood test, the person shall submit to and complete a urine test.\n(3) If the person is lawfully arrested for an offense allegedly committed in violation of Section 23140, 23152, or 23153, and, because of the need for medical treatment, the person is first transported to a medical facility where it is not feasible to administer a particular test of, or to obtain a particular sample of, the person’s blood or breath, the person has the choice of those tests, including a urine test, that are available at the facility to which that person has been transported. 
In that case, the officer shall advise the person of those tests that are available at the medical facility and that the person’s choice is limited to those tests that are available.\n(4) The officer shall also advise the person that he or she does not have the right to have an attorney present before stating whether he or she will submit to a test or tests, before deciding which test or tests to take, or during administration of the test or tests chosen, and that, in the event of refusal to submit to a test or tests, the refusal may be used against him or her in a court of law.\n(5) A person who is unconscious or otherwise in a condition rendering him or her incapable of refusal is deemed not to have withdrawn his or her consent and a test or tests may be administered whether or not the person is told that his or her failure to submit to, or the noncompletion of, the test or tests will result in the suspension or revocation of his or her privilege to operate a motor vehicle. A person who is dead is deemed not to have withdrawn his or her consent and a test or tests may be administered at the direction of a peace officer.\n(b) A person who is afflicted with hemophilia is exempt from the blood test required by this section, but shall submit to, and complete, a urine test.\n(c) A person who is afflicted with a heart condition and is using an anticoagulant under the direction of a licensed physician and surgeon is exempt from the blood test required by this section, but shall submit to, and complete, a urine test.\n(d) (1) A person lawfully arrested for an offense allegedly committed while the person was driving a motor vehicle in violation of Section 23140, 23152, or 23153 may request the arresting officer to have a chemical test made of the arrested person’s blood or breath for the purpose of determining the alcoholic content of that person’s blood, and, if so requested, the arresting officer shall have the test performed.\n(2) If a blood or breath test is not 
available under subparagraph (A) of paragraph (1) of subdivision (a), or under subparagraph (A) of paragraph (2) of subdivision (a), or under paragraph (1) of this subdivision, the person shall submit to the remaining test in order to determine the percent, by weight, of alcohol in the person’s blood. If both the blood and breath tests are unavailable, the person shall be deemed to have given his or her consent to chemical testing of his or her urine and shall submit to a urine test.\n(e) If the person, who has been arrested for a violation of Section 23140, 23152, or 23153, refuses or fails to complete a chemical test or tests, or requests that a blood or urine test be taken, the peace officer, acting on behalf of the department, shall serve the notice of the order of suspension or revocation of the person’s privilege to operate a motor vehicle personally on the arrested person. The notice shall be on a form provided by the department.\n(f) If the peace officer serves the notice of the order of suspension or revocation of the person’s privilege to operate a motor vehicle, the peace officer shall take possession of all driver’s licenses issued by this state that are held by the person. The temporary driver’s license shall be an endorsement on the notice of the order of suspension and shall be valid for 30 days from the date of arrest.\n(g) (1) The peace officer shall immediately forward a copy of the completed notice of suspension or revocation form and any driver’s license taken into possession under subdivision (f), with the report required by Section 13380, to the department. If the person submitted to a blood or urine test, the peace officer shall forward the results immediately to the appropriate forensic laboratory. 
The forensic laboratory shall forward the results of the chemical tests to the department within 15 calendar days of the date of the arrest.\n(2) (A) Notwithstanding any other law, a document containing data prepared and maintained in the governmental forensic laboratory computerized database system that is electronically transmitted or retrieved through public or private computer networks to or by the department is the best available evidence of the chemical test results in all administrative proceedings conducted by the department. In addition, any other official record that is maintained in the governmental forensic laboratory, relates to a chemical test analysis prepared and maintained in the governmental forensic laboratory computerized database system, and is electronically transmitted and retrieved through a public or private computer network to or by the department is admissible as evidence in the department’s administrative proceedings. In order to be admissible as evidence in administrative proceedings, a document described in this subparagraph shall bear a certification by the employee of the department who retrieved the document certifying that the information was received or retrieved directly from the computerized database system of a governmental forensic laboratory and that the document accurately reflects the data received or retrieved.\n(B) Notwithstanding any other law, the failure of an employee of the department to certify under subparagraph (A) is not a public offense.\n(h) A preliminary alcohol screening test that indicates the presence or concentration of alcohol based on a breath sample in order to establish reasonable cause to believe the person was driving a vehicle in violation of Section 23140, 23152, or 23153 is a field sobriety test and may be used by an officer as a further investigative tool.\n(i) A preliminary oral fluid screening test that indicates the presence or concentration of a drug or controlled substance based on a sample 
in order to establish reasonable cause to believe the person was driving a vehicle in violation of Section\n23140, 23152,\n23152\nor 23153 is a field sobriety test and may be used by an officer as a further investigative tool.\n(j) If the officer decides to use a preliminary alcohol or oral fluid screening test, the officer shall advise the person that he or she is requesting that person to take a preliminary alcohol or oral fluid screening test to assist the officer in determining if that person is under the influence of alcohol or drugs, or a combination of alcohol and drugs. The person’s obligation to submit to a blood, breath, or urine test, as required by this section, for the purpose of determining the alcohol or drug content of that person’s blood, is not satisfied by the person submitting to a preliminary alcohol or oral fluid screening test. The officer shall advise the person of that fact and of the person’s right to refuse to take the preliminary alcohol or oral fluid screening test.","Existing law provides that a person who drives a motor vehicle is deemed to have given his or her consent to chemical testing of his or her blood for the purpose of determining the drug content of his or her blood if lawfully arrested for driving under the influence of alcohol or drugs. Existing law provides that if a blood test is unavailable, the person shall be deemed to have given his or her consent to chemical testing of his or her urine and shall submit to a urine test. 
Existing law authorizes an officer to use a preliminary alcohol screening test that indicates the presence or concentration of alcohol based on a breath sample as a further investigatory tool in order to establish reasonable cause to believe the person was driving a vehicle in violation of certain prohibitions against driving under the influence of alcohol or drugs.\nThis bill would authorize an officer to use a preliminary oral fluid screening test that indicates the presence or concentration of a drug or controlled substance as a further investigatory tool in order to establish reasonable cause to believe the person was driving a vehicle in violation of certain prohibitions against driving under the influence of drugs.","An act to amend Section 23612 of the Vehicle Code, relating to vehicles."


**Step 2: Tokenize the raw dataset**

Tokenization is the process of converting text into sequences of integer token IDs that the model can consume.
This is done via a tokenizer object.

- Step 2(a): Instantiate a tokenizer object
- Step 2(b): Define a function to preprocess and tokenize the input
- Step 2(c): Tokenize the batches
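
To make the idea of "text to numbers" concrete, here is a minimal sketch of tokenization with truncation. It assumes a toy whitespace vocabulary; the real T5 tokenizer uses a learned subword vocabulary, so the function names (`build_vocab`, `toy_tokenize`) and the resulting IDs are purely illustrative.

```python
def build_vocab(texts):
    """Assign an integer ID to every distinct word; ID 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def toy_tokenize(text, vocab, max_length=None):
    """Convert text to a list of IDs and optionally truncate it to max_length."""
    ids = [vocab.get(word, 0) for word in text.lower().split()]
    if max_length is not None:
        ids = ids[:max_length]  # Same effect as truncation=True with a max_length
    return ids


# Toy corpus (hypothetical); the prefix mirrors the prompt used for T5
vocab = build_vocab(["summarize: the bill amends the vehicle code"])

# "tax" is not in the vocabulary, so it maps to the unknown ID 0
ids = toy_tokenize("summarize: the bill amends the tax code", vocab, max_length=6)
print(ids)  # [1, 2, 3, 4, 2, 0]
```

The seventh word ("code") is dropped because the sequence is truncated to six tokens.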


In [10]:
'''
Step 2(a): Instantiate a tokenizer object
This is done by using a suitable model checkpoint
'''
tokenizer = AutoTokenizer.from_pretrained(MODEL_CHECKPOINT)



'''
Step 2(b): Define a function to preprocess and tokenize the input

- Preprocess: Prefix the input with a prompt so T5 knows this is a summarization task.
  Some models capable of multiple NLP tasks require prompting for specific tasks.

- Tokenize
- Apply truncation to both the input text and their labels (summaries).


The truncation=True argument of the tokenizer object will truncate the sequences
    that are longer than the model max length
    (e.g., 512 for BERT or DistilBERT)
model_inputs = tokenizer(sequences, truncation=True)

For truncating the sequences that are longer than the specified max length,
use the following code:
tokenizer(sequences, max_length=8, truncation=True)

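Use the as_target_tokenizer() context manager of the tokenizer object
  to tokenize the labels (summaries) alongside the inputs.
  Note: recent versions of Transformers deprecate this context manager
  in favor of passing text_target=... to the regular tokenizer call.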

'''

# Prefix the input with a prompt so T5 knows this is a summarization task
if MODEL_CHECKPOINT in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""


def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["text"]]
    model_inputs = tokenizer(inputs,
                             max_length=MAX_INPUT_LENGTH,
                             truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["summary"],
                           max_length=MAX_TARGET_LENGTH,
                           truncation=True)

    model_inputs["labels"] = labels["input_ids"]

    return model_inputs




'''
- Step 2(c): Tokenize the batches

We want to keep the data as a dataset object after tokenization.
So, we use the Dataset map() method for tokenization,
by enabling it to utilize the tokenizer function.
The map() method works by applying a function to each element of the dataset.
This gives us the flexibility to apply additional preprocessing
via the map() method.

We set batched=True so that the function is applied to a batch of samples at a time,
which is typically faster than processing the samples one by one.
Only the current batch needs to be loaded into RAM. This is possible because
datasets from the 🤗 Datasets library are stored on disk as Apache Arrow files.

Note that we didn't pad the samples here,
because it is more efficient to apply padding when the batches are created.
In that case, we only need to pad to the maximum length in each batch,
and not to the maximum length in the entire dataset.
This saves time and processing power
when the inputs have highly variable lengths!
'''
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)
tokenized_datasets




DatasetDict({
    train: Dataset({
        features: ['text', 'summary', 'title', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 989
    })
    test: Dataset({
        features: ['text', 'summary', 'title', 'input_ids', 'attention_mask', 'labels'],
        num_rows: 248
    })
})
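
The sketch below illustrates what a batched map does conceptually: the function receives a dictionary of columns (lists) rather than a single sample, and the columns it returns are added next to the existing ones. This is a simplified pure-Python stand-in, not the 🤗 Datasets implementation; `batched_map` and `add_prefix_and_length` are hypothetical names.

```python
def batched_map(rows, function, batch_size=2):
    """Apply `function` to column-oriented batches and merge the new columns into each row."""
    output_rows = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Turn a list of rows into a dict of columns, e.g. {"text": [..., ...]}
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        new_columns = function(batch)
        for i, row in enumerate(chunk):
            merged = dict(row)  # Keep the original columns
            merged.update({key: values[i] for key, values in new_columns.items()})
            output_rows.append(merged)
    return output_rows


def add_prefix_and_length(examples):
    """Like preprocess_function, this receives lists of values, one entry per sample."""
    inputs = ["summarize: " + doc for doc in examples["text"]]
    return {"prefixed": inputs, "num_words": [len(text.split()) for text in inputs]}


rows = [{"text": "a b"}, {"text": "c d e"}, {"text": "f"}]
result = batched_map(rows, add_prefix_and_length, batch_size=2)
print(result[1])  # {'text': 'c d e', 'prefixed': 'summarize: c d e', 'num_words': 4}
```

This is why the tokenized dataset above still contains the original "text", "summary", and "title" columns in addition to the new ones.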

In [11]:
# Inspect a preprocessed sample
preprocess_function(raw_datasets["train"][:2])

{'input_ids': [[21603, 10, 37, 151, 13, 8, 1015, 13, 1826, 103, 3, 35, 2708, 38, 6963, 10, 180, 3073, 9562, 1300, 7491, 220, 41, 287, 526, 4733, 28, 5568, 850, 3449, 16968, 19, 974, 12, 8647, 314, 13, 2733, 335, 13, 6022, 3, 15442, 13, 8, 1685, 11, 6859, 3636, 6, 12, 608, 10, 7491, 1877, 37, 9151, 257, 13, 13661, 7, 21, 24714, 7538, 89, 86, 16326, 1983, 13, 1421, 850, 3449, 10415, 37, 28204, 12902, 11, 15884, 7, 66, 13, 8, 826, 10, 41, 9, 61, 3, 29541, 13, 2061, 6, 9307, 6, 11, 415, 10524, 43, 3029, 7353, 23802, 4120, 1019, 8, 538, 5, 94, 65, 2225, 135, 12, 169, 4120, 215, 1751, 6, 1097, 387, 6, 11, 1097, 540, 6, 859, 119, 1393, 5, 41, 115, 61, 933, 66, 7353, 23802, 4120, 33, 263, 45, 8, 337, 1397, 5, 818, 167, 7353, 23802, 4120, 169, 705, 2881, 5764, 5937, 7859, 16, 16326, 45, 1591, 413, 261, 443, 11, 4072, 12541, 6, 186, 688, 230, 462, 7353, 23802, 16, 16326, 10336, 263, 45, 9417, 6433, 7, 6, 6605, 3, 11823, 157, 7, 6, 4301, 157, 6, 3, 7, 232, 6, 42, 24556, 5764, 5937, 7859, 5, 15045

**Hyperparameters**

In [12]:
BATCH_SIZE = 8

MAX_EPOCHS = 4

'''
The initial learning rate is used by the optimizer, e.g., SGD, Adam, Nadam, etc.

Note that transformer models benefit from a much lower learning rate than the Adam default of 1e-3.
A much smaller rate, e.g., 5e-5, is a better starting point; here we use 2e-5.
'''
INITIAL_LEARNING_RATE = 2e-5

WEIGHT_DECAY = 0.01


**Step 3: Instantiate a pre-trained model from the model checkpoint**


In [13]:
model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_CHECKPOINT)


All PyTorch model weights were used when initializing TFT5ForConditionalGeneration.

All the weights of TFT5ForConditionalGeneration were initialized from the PyTorch model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFT5ForConditionalGeneration for predictions without further training.


**Step 4: Create a data collator**

Put together the samples in a batch by using a collate function.
By default, this function converts the samples to tf.Tensor and
concatenates them (recursively if the elements are lists,
tuples, or dictionaries).

We can't use the default function because our inputs have variable lengths.
To address this, we apply padding during batching.
Padding during batching is efficient because each batch is padded only to the length
of its longest sample, which avoids over-long inputs with a lot of padding.

Note that we need a special kind of data collator that pads not only the inputs
to the maximum length in the batch, but also the labels.

DataCollatorForSeq2Seq applies the correct amount of padding to both the inputs
and the labels in a batch. By default, it pads the labels with -100 so that
the padded positions are ignored when the loss is computed.

Set the return_tensors='np' argument to get the output as a NumPy array.
We use "np" instead of "tf", because "np" is more reliable and performant.
The TF dataset pipeline uses a NumPy loader internally.
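
As a rough illustration of what padding during batching means, the following pure-Python sketch pads the inputs and the labels of one batch to the longest sample in that batch. It is not the DataCollatorForSeq2Seq implementation; `pad_batch` is a hypothetical helper, the token IDs are made up, and the pad values (0 for inputs, -100 for labels) are assumed to match the defaults for T5.

```python
PAD_TOKEN_ID = 0      # Assumed pad token ID of the T5 tokenizer
LABEL_PAD_ID = -100   # Assumed label pad value, ignored by the loss


def pad_batch(samples):
    """Pad input_ids and labels to the longest sequence in this batch only."""
    max_input_len = max(len(s["input_ids"]) for s in samples)
    max_label_len = max(len(s["labels"]) for s in samples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for s in samples:
        n_input = len(s["input_ids"])
        n_label = len(s["labels"])
        batch["input_ids"].append(s["input_ids"] + [PAD_TOKEN_ID] * (max_input_len - n_input))
        # 1 marks a real token, 0 marks padding
        batch["attention_mask"].append([1] * n_input + [0] * (max_input_len - n_input))
        batch["labels"].append(s["labels"] + [LABEL_PAD_ID] * (max_label_len - n_label))
    return batch


samples = [
    {"input_ids": [21603, 10, 37], "labels": [71, 1]},
    {"input_ids": [21603, 10, 37, 151, 13], "labels": [71, 2876, 12, 1]},
]
batch = pad_batch(samples)
print(batch["input_ids"][0])  # [21603, 10, 37, 0, 0]
print(batch["labels"][0])     # [71, 1, -100, -100]
```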


In [14]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model, return_tensors="np")


'''
Define another data collator to use during training via a callback function.
It generates the evaluation text data needed to compute the ROUGE score as a training metric.
The ROUGE metrics require generating text from the model.
To speed things up, we can compile our generation loop with XLA.
This results in a huge speedup - up to 100X!
The downside of XLA generation, though, is that it doesn't like variable input shapes,
because it needs to run a new compilation for each new input shape!
To compensate for that, let's use pad_to_multiple_of for the dataset we use for text generation.
This will reduce the number of unique input shapes a lot, meaning we can get
the benefits of XLA generation with only a few compilations.

Note:
pad_to_multiple_of (int, optional) — If set will pad the sequence to a multiple of the provided value.
'''
generation_data_collator = DataCollatorForSeq2Seq(tokenizer,
                                                  model=model,
                                                  return_tensors="np",
                                                  pad_to_multiple_of=128)
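
To see why pad_to_multiple_of reduces the number of XLA compilations, the sketch below rounds a few batch lengths up to the next multiple of 128 and counts the distinct shapes. The batch lengths are made-up values and `padded_length` is a hypothetical helper.

```python
def padded_length(length, multiple):
    """Round `length` up to the nearest multiple of `multiple`."""
    return ((length + multiple - 1) // multiple) * multiple


# Hypothetical longest-sequence lengths of eight evaluation batches
batch_max_lengths = [37, 90, 128, 129, 200, 255, 300, 511]

exact_shapes = set(batch_max_lengths)
rounded_shapes = {padded_length(n, 128) for n in batch_max_lengths}

print(len(exact_shapes))       # 8 distinct shapes without rounding
print(sorted(rounded_shapes))  # [128, 256, 384, 512]
```

With rounding, the eight batches share only four input shapes, so a shape-specialized generation function would need to be compiled four times instead of eight.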



**Step 5: Create train and test dataset loader objects**

Create the train and validation datasets by putting together the
tokenized dataset and the data collator via the model's prepare_tf_dataset() method.

It will wrap a tf.data.Dataset around the dataset, with an optional collation function.

The tf.data.Dataset is a native TensorFlow format that Keras can use for model.fit().


In [15]:
tf_train_dataset = model.prepare_tf_dataset(
    tokenized_datasets["train"],
    batch_size=BATCH_SIZE,
    shuffle=True,
    collate_fn=data_collator,
)

tf_validation_dataset = model.prepare_tf_dataset(
    tokenized_datasets["test"],
    batch_size=BATCH_SIZE,
    shuffle=False,
    collate_fn=data_collator,
)

generation_dataset = model.prepare_tf_dataset(
    tokenized_datasets["test"],
    batch_size=8,
    shuffle=False,
    collate_fn=generation_data_collator
)

You're using a T5TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.


**Step 6: Compile the Model**

Before we compile the model, we need to define the optimizer. No explicit loss function is needed because the model's internal loss computation is used.


In [16]:
'''
Reset all states generated by Keras.
It deletes the TensorFlow graph before creating a new model,
otherwise memory overflow will occur.
'''
tf.keras.backend.clear_session()

'''
To reproduce the same results across runs, fix the seeds used for random number generation
by uncommenting the two lines below.
'''
# np.random.seed(42)
# tf.random.set_seed(42)



###################### Optimizer ##########################
# We provide various choices for the optimizer

'''
For the learning schedule, we need to set how long training is going to be, i.e., the number of training steps.
num_of_training_steps = (num_of_training_samples // batch_size) *  epochs

Since the tf_train_dataset is batched, its len() is already num_of_training_samples // batch_size
'''
num_of_training_steps = len(tf_train_dataset) * MAX_EPOCHS

'''
Scheduler: ExponentialDecay
'''
# lr_scheduler = tf.keras.optimizers.schedules.ExponentialDecay(
#     initial_learning_rate=INITIAL_LEARNING_RATE,
#     decay_steps=num_of_training_steps,
#     decay_rate=WEIGHT_DECAY,
#     staircase=True)

'''
Scheduler: PolynomialDecay
'''
lr_scheduler = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=INITIAL_LEARNING_RATE,
    decay_steps=num_of_training_steps,
    end_learning_rate=0.0,
    power=1.0,
    cycle=False,
    name=None
)

'''
Optimizer:

Instantiate an optimizer. Use one of the following choices.
- Fixed LR: learning_rate=INITIAL_LEARNING_RATE
- Scheduled LR: learning_rate=lr_scheduler
'''
#optimizer = tf.keras.optimizers.SGD(learning_rate=lr_scheduler, momentum=0.9, nesterov=False)
#optimizer=tf.keras.optimizers.Adam(learning_rate=lr_scheduler)
#optimizer=tf.keras.optimizers.Nadam(learning_rate=lr_scheduler)
#optimizer=tfa.optimizers.AdamW(learning_rate=lr_scheduler, weight_decay=WEIGHT_DECAY)
#optimizer=tfa.optimizers.LAMB(learning_rate=lr_scheduler, weight_decay_rate=WEIGHT_DECAY)

optimizer = AdamWeightDecay(learning_rate=INITIAL_LEARNING_RATE, weight_decay_rate=WEIGHT_DECAY)

# optimizer, schedule = create_optimizer(
#     init_lr=INITIAL_LEARNING_RATE,
#     num_warmup_steps=0,
#     num_train_steps=num_of_training_steps,
#     weight_decay_rate=WEIGHT_DECAY,
# )




###################### Compile the Model ##########################

'''
No loss specified in compile().
The model's internal loss computation will be used as the loss.
This is a common way to train TensorFlow models in Transformers!
'''
model.compile(optimizer=optimizer)
model.summary()

Model: "tft5_for_conditional_generation"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 shared (Embedding)          multiple                  16449536  
                                                                 
 encoder (TFT5MainLayer)     multiple                  35330816  
                                                                 
 decoder (TFT5MainLayer)     multiple                  41625344  
                                                                 
Total params: 60506624 (230.81 MB)
Trainable params: 60506624 (230.81 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
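
For intuition about the PolynomialDecay schedule defined above, here is a minimal pure-Python sketch of the decay formula. With power=1.0 it is a linear ramp from the initial learning rate down to the end learning rate. The function name `polynomial_decay` is hypothetical, and the step count assumes the formula stated in the cell above (989 training samples, batch size 8, 4 epochs).

```python
def polynomial_decay(step, initial_lr, decay_steps, end_lr=0.0, power=1.0):
    """Learning rate at `step` for a non-cycling polynomial decay schedule."""
    step = min(step, decay_steps)  # The rate stays at end_lr after decay_steps
    fraction_left = 1.0 - step / decay_steps
    return (initial_lr - end_lr) * fraction_left ** power + end_lr


steps_per_epoch = 989 // 8         # 123 full batches per epoch
total_steps = steps_per_epoch * 4  # 492 training steps

lr_start = polynomial_decay(0, 2e-5, total_steps)
lr_mid = polynomial_decay(total_steps // 2, 2e-5, total_steps)
lr_end = polynomial_decay(total_steps, 2e-5, total_steps)

print(lr_start, lr_mid, lr_end)  # 2e-05 1e-05 0.0
```

Note that the schedule only takes effect if it is passed to the optimizer as learning_rate; an optimizer created with a fixed learning rate ignores it.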


**Evaluate the training performance using the ROUGE metric from Keras NLP**

- ROUGE-L is a score based on the length of the longest common subsequence
in the reference text and the hypothesis text.

- ROUGE-N is a score based on the number of matching n-grams between
the reference text and the hypothesis text.
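
The two definitions above can be sketched in a few lines of pure Python. This is a simplified illustration that assumes lowercased whitespace tokens and reports F1 scores; the Keras NLP metric may tokenize and normalize text differently, so its values can differ. The function names are hypothetical.

```python
from collections import Counter


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f1_from_counts(matches, ref_total, hyp_total):
    """Combine precision (matches / hyp_total) and recall (matches / ref_total)."""
    if matches == 0:
        return 0.0
    precision = matches / hyp_total
    recall = matches / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_l_f1(reference, hypothesis):
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    return f1_from_counts(lcs_length(ref, hyp), len(ref), len(hyp))


def rouge_n_f1(reference, hypothesis, n=1):
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    hyp_ngrams = Counter(tuple(hyp[i:i + n]) for i in range(len(hyp) - n + 1))
    overlap = sum((ref_ngrams & hyp_ngrams).values())  # Clipped n-gram matches
    return f1_from_counts(overlap, sum(ref_ngrams.values()), sum(hyp_ngrams.values()))


reference = "the bill lowers prescription drug costs"
hypothesis = "the bill lowers drug costs"

score_l = rouge_l_f1(reference, hypothesis)       # LCS has 5 tokens: F1 = 10/11
score_2 = rouge_n_f1(reference, hypothesis, n=2)  # 3 shared bigrams: F1 = 2/3
print(round(score_l, 3), round(score_2, 3))  # 0.909 0.667
```

ROUGE-L rewards the hypothesis for keeping the reference words in order even when a word is skipped, whereas ROUGE-2 is penalized because the skipped word breaks two bigrams.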


In [17]:
rouge_l = keras_nlp.metrics.RougeL()

'''
A function to calculate the ROUGE metric using the predictions and labels
'''
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    for label in labels:
        label[label < 0] = tokenizer.pad_token_id  # Replace masked label tokens
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Compute the ROUGE score
    result = rouge_l(decoded_labels, decoded_preds)
    #use_stemmer=True

    # Report the F1 score; other options are "precision" and "recall"
    result = {"RougeL": result["f1_score"]}

    return result

**Callback Functions**

Define the following callback functions.
- PushToHubCallback
- KerasMetricCallback



In [18]:
'''
PushToHubCallback
It will sync up the fine-tuned model with the Hugging Face Hub.
First, the model will be stored (serialized) on the disk (output_dir).
Then, it will be synced.

This callback enables model reuse.
- The locally stored model can be loaded from "output_dir"
- The cloud-stored model can be loaded from the Hub

It also makes it possible to resume training from other machines,
share the model after training is finished,
and even test the model's inference quality midway through training!
'''

# Define a name of the fine-tuned model for the callback function
model_name = MODEL_CHECKPOINT.split("/")[-1]
push_to_hub_model_id = f"{model_name}-finetuned-summarization-billsum"


push_to_hub_callback = PushToHubCallback(
    output_dir="./model_summarization_save",
    tokenizer=tokenizer,
    hub_model_id=push_to_hub_model_id,
)


'''
KerasMetricCallback is a callback for computing advanced metrics.
There are a number of common metrics in NLP like ROUGE which are hard to fit into
our compiled training loop because they depend on decoding predictions and labels
back to strings with the tokenizer, and calling arbitrary Python functions to compute the metric.
The KerasMetricCallback will wrap a metric function, outputting metrics as training progresses.
'''
metric_callback = KerasMetricCallback(
    metric_fn=compute_metrics, eval_dataset=generation_dataset,
    predict_with_generate=True, use_xla_generation=True
)

callbacks = [metric_callback, push_to_hub_callback]

Cloning https://huggingface.co/hasan-mr/t5-small-finetuned-summarization-billsum into local empty directory.



**Step 7: Train the model**


In [19]:
'''
Train in mixed-precision float16
Mixed precision is the use of both 16-bit and 32-bit floating-point types
in a model during training to make it run faster and use less memory.
'''
tf.keras.mixed_precision.set_global_policy("mixed_float16")


# Fine-tune the model
model.fit(tf_train_dataset,
          validation_data=tf_validation_dataset,
          epochs=MAX_EPOCHS,
          callbacks=callbacks)


Epoch 1/4



Epoch 2/4
Epoch 3/4
Epoch 4/4


<keras.src.callbacks.History at 0x7bb761180400>

**Step 8: Test the model by generating summary of a sample text**

We will use the fine-tuned model to summarize a sample text in three ways.
- Use the current fine-tuned model
- Load the saved fine-tuned model from the disk
- Load the saved fine-tuned model from the Hugging Face Hub

**Step 8 (a): Use the current fine-tuned model**

In [20]:
# A sample text for summarization
text = "summarize: The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes."

# Tokenize the text
inputs = tokenizer(text, return_tensors="tf").input_ids

# Generate output token ids
outputs = model.generate(inputs, max_new_tokens=100, do_sample=False)

# Generate output text by decoding the token ids
tokenizer.decode(outputs[0], skip_special_tokens=True)

"The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history. It'll ask the ultra-wealthy and corporations to pay their fair share."

**Step 8 (b): Load the saved fine-tuned model from the disk**

In [None]:
# Path to the saved model on the disk
output_dir="./model_summarization_save"

# Load tokenizer and model from the saved model on the disk
tokenizer_saved_model = AutoTokenizer.from_pretrained(output_dir)
saved_model = TFAutoModelForSeq2SeqLM.from_pretrained(output_dir)

# A sample text for summarization
text = "summarize: The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes."

# Tokenize the text
inputs_saved_model = tokenizer_saved_model(text, return_tensors="tf").input_ids

# Generate output token ids
outputs_saved_model = saved_model.generate(inputs_saved_model, max_new_tokens=100, do_sample=False)

# Generate output text by decoding the token ids
tokenizer_saved_model.decode(outputs_saved_model[0], skip_special_tokens=True)


**Step 8 (c): Load the saved fine-tuned model from the Hugging Face Hub**

In [None]:
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

# Fine-tuned model stored on the Hugging Face Hub
FINE_TUNED_MODEL = "hasan-mr/t5-small-finetuned-summarization-billsum"

# Load tokenizer and model from the Hugging Face Hub
tokenizer_hgg = AutoTokenizer.from_pretrained(FINE_TUNED_MODEL)
model_hgg = TFAutoModelForSeq2SeqLM.from_pretrained(FINE_TUNED_MODEL)

# A sample text for summarization
text = "summarize: The Inflation Reduction Act lowers prescription drug costs, health care costs, and energy costs. It's the most aggressive action on tackling the climate crisis in American history, which will lift up American workers and create good-paying, union jobs across the country. It'll lower the deficit and ask the ultra-wealthy and corporations to pay their fair share. And no one making under $400,000 per year will pay a penny more in taxes."

# Tokenize the text
inputs_hgg = tokenizer_hgg(text, return_tensors="tf").input_ids

# Generate output token ids
outputs_hgg = model_hgg.generate(inputs_hgg, max_new_tokens=100, do_sample=False)

# Generate output text by decoding the token ids
tokenizer_hgg.decode(outputs_hgg[0], skip_special_tokens=True)
