If you're opening this notebook in Colab, you will probably need to install 🤗 Transformers and 🤗 Datasets as well as other dependencies. Uncomment the following cell and run it.

In [4]:
#! pip install datasets evaluate transformers rouge-score nltk

If you're opening this notebook locally, make sure your environment has the latest version of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First you have to store your authentication token from the Hugging Face website (sign up [here](https://huggingface.co/join) if you haven't already!). Then execute the following cell and paste your access token when prompted:

In [3]:
from huggingface_hub import notebook_login

notebook_login()


Then you need to install Git-LFS. Uncomment and run the following instruction:

In [4]:
# !apt install git-lfs

Make sure your version of Transformers is at least 4.11.0, since the model-sharing functionality used here was introduced in that version:

In [5]:
import transformers

print(transformers.__version__)

4.53.3
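
The version check above is done by eye. As a minimal sketch, the comparison can also be automated with the standard library alone. The helper names `version_tuple` and `is_at_least` are hypothetical (they are not part of Transformers), and pre-release suffixes such as `.dev0` are simply ignored, which is a simplification; `packaging.version.parse` is the more robust choice if the `packaging` package is available.

```python
import re


def version_tuple(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            # Stop at the first non-numeric component (e.g. "dev0").
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """True if `installed` is greater than or equal to `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


# In the notebook you would pass transformers.__version__ as the first argument.
print(is_at_least("4.53.3", "4.11.0"))
```

Comparing tuples of integers avoids the classic string-comparison pitfall where `"4.9.2"` sorts after `"4.11.0"`.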


You can find a script version of this notebook to fine-tune your model in a distributed fashion using multiple GPUs or TPUs [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq).

We also quickly upload some telemetry - this tells us which examples and software versions are getting used so we know where to prioritize our maintenance efforts. We don't collect (or care about) any personally identifiable information, but if you'd prefer not to be counted, feel free to skip this step or delete this cell entirely.

In [6]:
from transformers.utils import send_example_telemetry

send_example_telemetry("summarization_notebook", framework="pytorch")

# Fine-tuning a model on a summarization task

In this notebook, we will see how to fine-tune one of the [🤗 Transformers](https://github.com/huggingface/transformers) models for a summarization task. We will use the BillSum dataset (`billsum` on the Hub), which contains the text of US Congressional and California state bills paired with human-written summaries.

![Widget inference on a summarization task](https://github.com/huggingface/notebooks/blob/main/examples/images/summarization.png?raw=1)

We will see how to easily load the dataset for this task using 🤗 Datasets and how to fine-tune a model on it using the `Trainer` API.

In [7]:
model_checkpoint = "t5-small"

This notebook is built to run with any model checkpoint from the [Model Hub](https://huggingface.co/models) as long as that model has a sequence-to-sequence version in the Transformers library. Here we picked the [`t5-small`](https://huggingface.co/t5-small) checkpoint.

## Loading the dataset

We will use the [🤗 Datasets](https://github.com/huggingface/datasets) library to download the data and the 🤗 Evaluate library to get the metric we need for evaluation (to compare our model to the benchmark). This can be easily done with the functions `load_dataset` (from `datasets`) and `load` (from `evaluate`).

In [9]:
!pip install datasets evaluate transformers[sentencepiece] rouge_score

Collecting evaluate
  Downloading evaluate-0.4.5-py3-none-any.whl.metadata (9.5 kB)
Collecting rouge_score
  Downloading rouge_score-0.1.2.tar.gz (17 kB)
  Preparing metadata (setup.py) ... done
Downloading evaluate-0.4.5-py3-none-any.whl (84 kB)
Building wheels for collected packages: rouge_score
  Building wheel for rouge_score (setup.py) ... done
  Created wheel for rouge_score: filename=rouge_score-0.1.2-py3-none-any.whl size=24934 sha256=dd5f2747e63bce202ad3136796075ef7b28a75c5b67104a8077b168b12846ee7
  Stored in directory: /root/.cache/pip/wheels/1e/19/43/8a442dc83660ca25e163e1bd1f89919284ab0d0c1475475148
Successfully built rouge_score
Installing collected packages: rouge_score, evaluate
Successfully installed evaluate-0.4.5 rouge_score-0.1.2


In [14]:
from datasets import load_dataset
from evaluate import load

# Download the BillSum dataset from the Hub, then load the ROUGE metric.
raw_datasets = load_dataset("billsum")
metric = load("rouge")
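
To make concrete what the ROUGE metric loaded above measures, here is a deliberately simplified sketch of ROUGE-1 (unigram overlap) in plain Python. It is an illustration only, not the `rouge_score` implementation: the function name `rouge1_f1` is hypothetical, and lowercasing plus whitespace tokenization is an assumption made to keep the example short.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: harmonic mean of unigram precision and recall."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared word.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the bill amends the act", "the bill repeals the act"))
```

Four of the five words match in each direction here, so precision and recall are both 0.8. In the notebook itself, the real scores come from the `metric` object rather than from a helper like this one.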


The `raw_datasets` object itself is a [`DatasetDict`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasetdict), which contains one key per split. For this dataset the splits are `train`, `test` and `ca_test`; there is no separate validation split:

In [15]:
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['text', 'summary', 'title'],
        num_rows: 18949
    })
    test: Dataset({
        features: ['text', 'summary', 'title'],
        num_rows: 3269
    })
    ca_test: Dataset({
        features: ['text', 'summary', 'title'],
        num_rows: 1237
    })
})

To access an actual element, you need to select a split first, then give an index:

In [16]:
raw_datasets["train"][0]

{'text': "SECTION 1. LIABILITY OF BUSINESS ENTITIES PROVIDING USE OF FACILITIES \n              TO NONPROFIT ORGANIZATIONS.\n\n    (a) Definitions.--In this section:\n            (1) Business entity.--The term ``business entity'' means a \n        firm, corporation, association, partnership, consortium, joint \n        venture, or other form of enterprise.\n            (2) Facility.--The term ``facility'' means any real \n        property, including any building, improvement, or appurtenance.\n            (3) Gross negligence.--The term ``gross negligence'' means \n        voluntary and conscious conduct by a person with knowledge (at \n        the time of the conduct) that the conduct is likely to be \n        harmful to the health or well-being of another person.\n            (4) Intentional misconduct.--The term ``intentional \n        misconduct'' means conduct by a person with knowledge (at the \n        time of the conduct) that the conduct is harmful to the health \n        or w
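
Each element returned this way is a plain Python dict keyed by feature name (`text`, `summary`, `title`). As a rough sketch, the helper below compares the word counts of a document and its summary, which gives a first feel for how much compression the model has to learn. The function name `length_summary` and the toy record are hypothetical, and splitting on whitespace is only a crude proxy for the model's tokenizer.

```python
def length_summary(example: dict) -> dict:
    """Count whitespace-separated words in the text and summary of one record."""
    text_words = len(example["text"].split())
    summary_words = len(example["summary"].split())
    return {
        "text_words": text_words,
        "summary_words": summary_words,
        # Guard against an empty summary to avoid dividing by zero.
        "ratio": text_words / max(summary_words, 1),
    }


# Made-up record with the same keys as a row of the dataset.
toy_example = {
    "text": "SECTION 1. SHORT TITLE. This Act may be cited as the Example Act.",
    "summary": "Names the Example Act.",
    "title": "Example Act",
}
print(length_summary(toy_example))
```

In the notebook, the same function could be called on `raw_datasets["train"][0]`.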

To get a sense of what the data looks like, the following function will show some examples picked randomly in the dataset.

In [17]:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=5):
    """Display `num_examples` distinct, randomly chosen rows of `dataset` as an HTML table."""
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    # Sample distinct row indices without replacement.
    picks = random.sample(range(len(dataset)), num_examples)

    df = pd.DataFrame(dataset[picks])
    # Replace ClassLabel integer ids with their human-readable names.
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))

In [18]:
show_random_elements(raw_datasets["train"])

Unnamed: 0,text,summary,title
0,"SECTION 1. SHORT TITLE.\n\n This Act may be cited as the ``Improvement of Information Access \nAct of 1993''.\n\nSEC. 2. FINDINGS.\n\n The Congress finds the following:\n (1) A well-informed citizenry is essential for the well-\n being of a democratic society.\n (2) Access to Government information is essential for \n citizens who seek to make the Federal Government accountable \n for its actions.\n (3) The public should have timely, complete, equitable, and \n affordable access to Government information.\n (4) Federal agencies should use modern information \n technology for the benefit of citizens of the United States.\n (5) Government information is a national resource that \n should be treated as a public good.\n (6) Government information is a valuable economic asset \n that belongs to the public.\n (7) Taxpayers pay for the creation, collection, and \n organization of Government information and should not be \n required to pay excessive fees to receive and use that \n information.\n (8) It is unnecessarily difficult for citizens to provide \n Federal agencies with comments and suggestions on Federal \n information policies. As a result, many Federal agencies do not \n take into account the public interest in the information \n resources they manage.\n (9) Federal agencies have been slow in developing standards \n for record and file formats, software query command structures, \n and other important topics that will make Government \n information easier to obtain and use.\n (10) Many Federal agencies do not provide timely access to \n Government information products and services at reasonable \n costs.\n\nSEC. 3. IMPROVED PUBLIC ACCESS TO GOVERNMENT INFORMATION.\n\n (a) In General.--Title 44, United States Code, is amended by adding \nat the end the following new chapter:\n\n ``CHAPTER 41--INFORMATION DISSEMINATION POLICIES AND PRACTICES\n\n``Sec.\n``4101. Ensuring public access to Government information products and \n services.\n``Sec. 4101. 
Ensuring public access to Government information products \n and services\n ``(a) Each executive department, military department, and \nindependent establishment shall prepare by not later than February 1 of \neach year, and make freely available to the public upon request and at \nno charge, a report which describes the information dissemination \npolicies and practices of the department or establishment, including--\n ``(1) plans of the department or establishment to introduce \n new information products and services or discontinue old ones;\n ``(2) efforts of the department or establishment to develop \n or implement standards for file and record formats, software \n query command structures, user interfaces, and other matters \n that make information easier to obtain and use;\n ``(3) progress of the department or establishment in \n creating and disseminating comprehensive indexes and \n bibliographies of information products and services, including \n coordinated efforts conducted with other agencies;\n ``(4) the methods to be used by the public for accessing \n information, including the modes and outlets available to the \n public;\n ``(5) provisions for protecting access to records stored \n with technologies that are superseded or obsolete;\n ``(6) methods used to make the public aware of information \n resources, services, and products; and\n ``(7) a summary of the comments received from the public \n under subsection (b) in the year preceding the report, and the \n response of the department or establishment to those comments.\n ``(b)(1) Not later than February 1 of each year, each executive \ndepartment, military department, and independent establishment shall \npublish in the Federal Register, and provide in such other manner as \nwill notify users of information of the department or establishment, a \nnotice of--\n ``(A) the availability of the report prepared under \n subsection (a); and\n ``(B) a period of not less than 90 days for submission by \n 
the public of comments regarding the information dissemination \n policies and practices of the department or establishment, \n including comments regarding--\n ``(i) the types of information the department or \n establishment collects and disseminates;\n ``(ii) the methods and outlets the department or \n establishment uses to store and disseminate \n information;\n ``(iii) the prices charged by the department or \n establishment, or such outlets, for the information; \n and\n ``(iv) the validity, reliability, timeliness, and \n usefulness to the public of the information.\n ``(2) Comments received under this subsection by a department or \nindependent establishment shall be available for inspection to the \npublic. Each year the department or establishment shall provide a \nreasonable opportunity for dialogue between responsible agency \nofficials and interested members of the public, including through \nhearings and informal forums, regarding both proposed and existing \npolicies, procedures, and mechanisms for disseminating information \nunder this section and for otherwise implementing this section.\n ``(c) Before discontinuing an information product or service, an \nagency shall--\n ``(1) publish in the Federal Register, or provide by other \n means adequate to inform users of information of the agency, a \n notice of a period of not less than 120 days for submission by \n the public of comments regarding that discontinuation;\n ``(2) include in that notice an explanation of the reasons \n for the discontinuation; and\n ``(3) consider comments received pursuant to the notice.\n ``(d) Each agency shall--\n ``(1) disseminate information in diverse modes and through \n appropriate outlets that will reinforce statutory requirements \n for depository distribution, as well as offering other channels \n of distribution, with adequate documentation software, indexes, \n or other resources that will permit and broaden public access \n to Government information;\n 
``(2) disseminate information in a manner that ensures the \n timeliness, usefulness, and reliability of the information for \n the public;\n ``(3) store and disseminate information products and \n services in standardized record formats; and\n ``(4) use depository libraries, national computer networks, \n and other distribution channels that improve and assure free or \n low-cost public access to Government information.\n ``(e)(1) Except as specifically authorized by statute, an agency \nmay not--\n ``(A) charge to depository libraries the costs of \n distributing information products and services;\n ``(B) charge more than the incremental cost of distributing \n an information product or service regardless of channels \n utilized by the agency; or\n ``(C) charge any royalty or other fee for any use or \n redissemination of Government information.\n ``(2) For purposes of this subsection, the incremental cost of \ndistributing an information product or service does not include any \nportion of the cost of collecting, organizing, or processing \ninformation disseminated through the product or service.\n ``(f)(1) The Archivist of the United States and the Director of the \nNational Institute of Standards and Technology shall jointly issue and \nperiodically revise model performance standards under which agencies \nshall be encouraged to provide access to public records.\n ``(2) Standards issued under this subsection shall include the \nestablishment of a period within which an agency, upon request, shall \nprovide by mail a copy of any decision, rule, notice, docket filing, \npress release, or other public document of the agency.''.\n (b) Clerical Amendment.--The table of chapters at the beginning of \ntitle 44, United States Code, is amended by adding at the end the \nfollowing:\n\n``41. Government Information Products and Services.......... 4101''.\n\nSEC. 4. 
STANDARDS FOR ACCESS TO PUBLIC RECORDS.\n\n The Archivist of the United States and the Director of the National \nInstitute of Standards and Technology shall jointly issue model \nperformance standards for providing access to agency records under \nsection 4101(f) of title 44, United States Code (as added by section \n3), by not later than 1 year after the date of the enactment of this \nAct.","Improvement of Information Access Act of 1993 - Amends Federal law to require each executive and military department and independent establishment to prepare and make available to the public upon request a report which describes its information dissemination policies and practices. Requires each such entity to provide an opportunity for dialogue between responsible agency officials and interested members of the public regarding both proposed and existing policies, procedures, and mechanisms and disseminating information under this Act. \nSpecifies the actions an agency must take before discontinuing an information product or service. \nRequires agencies to: (1) disseminate information in diverse modes and through appropriate outlets that will permit and broaden public access to Government information; and (2) use depository libraries, national computer networks, and other distribution channels that improve and assure free or low-cost public access to Government information. \nProvides that except as specifically authorized by statute, an agency may not: (1) charge to depository libraries the costs of distributing information products and services; (2) charge more than the incremental cost of distributing an information product or service regardless of channels utilized; or (3) charge any royalty or other fee for any use or redissemination of Government information. 
\nRequires the Archivist of the United States and the Director of the National Institute of Standards and Technology to jointly issue and periodically revise model performance standards under which agencies shall be encouraged to provide access to public records.",Improvement of Information Access Act of 1993
1,"SECTION 1. TRANSFER BY CERTAIN MEMBERS OF THE ARMED FORCES OF PORTION \n OF ENTITLEMENT TO EDUCATIONAL ASSISTANCE UNDER THE \n MONTGOMERY GI BILL.\n\n (a) In General.--Chapter 30 of title 38, United States Code, is \namended--\n (1) by redesignating section 3020 as section 3020A; and\n (2) by inserting after section 3019 the following new \n section 3020:\n``Sec. 3020. Transfer of entitlement to basic educational assistance: \n certain members of the Armed Forces agreeing to \n additional service\n ``(a) In General.--Subject to the provisions of this section, an \nindividual described in subsection (b) who is entitled to basic \neducational assistance under this subchapter may transfer to one or \nmore of the dependents specified in subsection (c) a portion of such \nindividual's entitlement to such assistance, subject to the limitation \nunder subsection (d).\n ``(b) Eligible Individuals.--An individual referred to in \nsubsection (a) is any member of the Armed Forces who--\n ``(1) has completed at least six years of service in the \n Armed Forces; and\n ``(2) enters into an agreement to serve at least four more \n years as a member of the Armed Forces.\n ``(c) Eligible Dependents.--An individual referred to in subsection \n(a) may transfer entitlement to basic educational assistance under this \nsection as follows:\n ``(1) To the individual's spouse.\n ``(2) To one or more of the individual's children.\n ``(3) To a combination of the individuals referred to in \n paragraphs (1) and (2).\n ``(d) Limitation on Months Transferable.--The total number of \nmonths of entitlement to basic educational assistance transferable by \nan individual under this section may not exceed the lesser of--\n ``(1) the number of months equal to one quarter of the \n aggregate number of months of basic educational assistance to \n which the individual is entitled under this subchapter (as \n determined under section 3013 of this title); or\n ``(2) the number of months of entitlement 
to basic \n educational assistance which remain unused by the individual at \n the time of transfer under this section.\n ``(e) Designation of Transferee.--An individual transferring an \nentitlement to basic educational assistance under this section shall--\n ``(1) designate the dependent or dependents to whom such \n entitlement is being transferred;\n ``(2) designate the number of months of such entitlement to \n be transferred to each such dependent; and\n ``(3) specify the period for which the transfer shall be \n effective for each dependent designated under paragraph (1).\n ``(f) Time for Transfer; Revocation and Modification.--(1) Subject \nto the time limitation for use of entitlement under section 3031 of \nthis title, an individual entitled to transfer basic educational \nassistance under this subchapter may transfer such entitlement at any \ntime, without regard to whether the individual is a member of the Armed \nForces when the transfer is executed.\n ``(2)(A) An individual transferring entitlement under this section \nmay modify or revoke at any time the transfer of any unused portion of \nthe entitlement so transferred.\n ``(B) The modification or revocation of the transfer of entitlement \nunder this paragraph shall be made by the submittal of written notice \nof the action to both the Secretary concerned and the Secretary of \nVeterans Affairs.\n ``(g) Additional Administrative Matters.--(1) The use of any \nentitlement to basic educational assistance transferred under this \nsection shall be charged against the entitlement of the individual \nmaking the transfer at the rate of one month for each month of \ntransferred entitlement that is used.\n ``(2) Except as provided under subsection (f)(2) and subject to \nparagraphs (5) and (6), a dependent to whom entitlement is transferred \nunder this section is entitled to basic educational assistance under \nthis subchapter in the same manner and at the same rate as the \nindividual from whom the 
entitlement was transferred.\n ``(3) The monthly rate of educational assistance payable to a \ndependent to whom entitlement is transferred under this section shall \nbe the monthly rate payable under sections 3105 and 3022 of this title \nat the time of the use of such entitlement by the dependent.\n ``(4) The death of an individual transferring an entitlement under \nthis section shall not affect the use of the entitlement by the \ndependent to whom the entitlement is transferred.\n ``(5) Notwithstanding section 3031 of this title, a child to whom \nentitlement is transferred under this section may not use any \nentitlement so transferred after attaining the age of 26 years.\n ``(6) The administrative provisions of this chapter (including the \nprovisions set forth in section 3034(a)(1) of this title) shall apply \nto the use of entitlement transferred under this section, except that \nthe dependent to whom the entitlement is transferred shall be treated \nas the eligible veteran for purposes of such provisions.\n ``(7) The purposes for which a dependent to whom entitlement is \ntransferred under this section may use such entitlement shall include \nthe pursuit and completion of the requirements of a secondary school \ndiploma (or equivalency certificate).\n ``(h) Overpayment.--(1) In the event of an overpayment of basic \neducational assistance with respect to a dependent to whom entitlement \nis transferred under this section, the dependent and the individual \nmaking the transfer shall be jointly and severally liable to the United \nStates for the amount of the overpayment for purposes of section 3685 \nof this title.\n ``(2) Except as provided in paragraph (3), if an individual \ntransferring entitlement under this section fails to complete the \nservice agreed to by the individual under subsection (b)(2) in \naccordance with the terms of the agreement of the individual under that \nsubsection, the amount of any transferred entitlement under this \nsection 
that is used by a dependent of the individual as of the date of \nsuch failure shall be treated as an overpayment of basic educational \nassistance under paragraph (1).\n ``(3) Paragraph (2) shall not apply in the case of an individual \nwho fails to complete service agreed to by the individual--\n ``(A) by reason of the death of the individual; or\n ``(B) for a reason referred to in section \n 3011(a)(1)(A)(ii)(I) of this title.\n ``(i) Construction With Other Transfer Authority.--The authority of \nan individual to transfer entitlement to basic educational assistance \nunder this section is in addition to the authority, if any, of the \nindividual to transfer entitlement to basic educational assistance \nunder section 3020A of this title.\n ``(j) Regulations.--The Secretary of Defense shall prescribe \nregulations for purposes of this section. Such regulations shall \nspecify the manner and effect of an election to modify or revoke a \ntransfer of entitlement under subsection (f)(2) and shall specify the \nmanner of the applicability of the administrative provisions referred \nto in subsection (g)(6) to a dependent to whom entitlement is \ntransferred under this section.\n ``(k) Secretary Concerned Defined.--Notwithstanding section 101(25) \nof this title, in this section, the term `Secretary concerned' means--\n ``(1) the Secretary of the Army with respect to matters \n concerning the Army;\n ``(2) the Secretary of the Navy with respect to matters \n concerning the Navy or the Marine Corps;\n ``(3) the Secretary of the Air Force with respect to \n matters concerning the Air Force; and\n ``(4) the Secretary of Defense with respect to matters \n concerning the Coast Guard, or the Secretary of Transportation \n when it is not operating as a service in the Navy.''.\n (b) Source of Funds for Increased Usage.--(1) Section 3035(b)(4) of \ntitle 38, United States Code, is amended by inserting ``or 3020A'' \nafter ``section 3020''.\n (2) Section 2006(b)(2)(D) of title 
10, United States Code, is \namended by inserting ``or 3020A'' after ``section 3020''.\n (c) Clerical Amendment.--The table of sections at the beginning of \nchapter 30 of title 38, United States Code, is amended by striking the \nitem relating to section 3020 and inserting the following new items:\n\n``3020. Transfer of entitlement to basic educational assistance: \n certain members of the Armed Forces \n agreeing to additional service.\n``3020A. Transfer of entitlement to basic educational assistance: \n members of the Armed Forces with critical \n military skills.''.","Authorizes military personnel who have completed at least six years of service and who agree to serve for at least four more years to transfer a portion of their entitlement to veterans' basic educational assistance to a spouse, child, or combination of such individuals. Limits the transferable number of months of such assistance. Requires the member to designate the dependent(s) to whom such assistance is being transferred as well as the number of months being transferred. Allows such members to make, revoke, or modify such transfers at any time. Requires a pro rata repayment of transferred assistance for any of the four-year service period not successfully served by the member (with exceptions in the case of member death or release or discharge for a service-connected disability, for hardship, or for a physical or mental condition).","A bill to amend title 38, United States Code, to permit the transfer to spouses and children of a portion of the entitlement of certain members of the Armed Forces to educational assistance under the Montgomery GI Bill, and for other purposes."
2,"SECTION 1. SHORT TITLE.\n\n This Act may be cited as the ``Reform Americans Can Afford Act of \n2010''.\n\nSEC. 2. FINDINGS.\n\n Congress finds the following:\n (1) The nonpartisan Congressional Budget Office (referred \n to in this section as the ``CBO'') predicts that health \n insurance premiums will increase by $2,100 for millions of \n families by 2016 as a result of the Democrats' health overhaul.\n (2) The Obama Administration's own actuaries at the Centers \n for Medicare & Medicaid Services (referred to in this section \n as the ``CMS'') predict that, ``[N]ational health expenditures \n under the health reform act would increase by a total of \n $311,000,000,000 (0.9 percent) during calendar years 2010-\n 2019'' as a result of the Democrats' health overhaul.\n (3) The CMS actuaries predict that 14,000,000 Americans \n would lose their employer-sponsored insurance as a result of \n the Democrats' health overhaul.\n (4) The Democrats' health overhaul penalizes Americans who \n save money to pay for their health care and threatens to reduce \n the value of the health benefits of 43,000,000 Americans with \n Flexible Spending Arrangements and Health Savings Accounts.\n (5) CBO estimates the Democrats' health overhaul slashes \n Medicare by more than one-half trillion dollars in order to \n fund a new Government entitlement program.\n (6) The Medicare actuaries found these Medicare cuts to be \n so drastic that they caution, ``providers for whom Medicare \n constitutes a substantive portion of their business could find \n it difficult to remain profitable and, absent legislative \n intervention, might end their participation in the program \n (possibly jeopardizing access to care for beneficiaries)''.\n (7) The CMS actuaries predict 7,000,000 Medicare \n beneficiaries will no longer be enrolled in a Medicare \n Advantage plan and millions of seniors who are currently \n enrolled in a Medicare Advantage plan will see their benefits \n slashed and 
out-of-pocket costs increase.\n (8) According to the Joint Committee on Taxation and the \n CBO, the Democrats' health law contains a total of \n $569,200,000,000 in tax increases, including a dozen separate \n provisions that break President Obama's pledge to avoid tax \n increases on middle-class Americans earning less than $200,000 \n per year and families earning less than $250,000 per year.\n (9) The national unemployment rate remains near 10 percent.\n (10) CBO estimates that the Democrats' health overhaul will \n raise taxes on employers who fail to provide Government-\n approved health insurance to their employees by \n $52,000,000,000.\n (11) CBO said that ``employees largely bear the cost of . . \n . [employer mandate] fees in the form of lower wages''.\n (12) The costs incurred by businesses who avoid the tax by \n complying with the employer mandate may also be felt by \n potential workers (who will have fewer employment opportunities \n as businesses respond to the mandate by reducing additional \n hiring) and by consumers (who may have to pay more for goods \n and services to offset the higher costs imposed on businesses \n by the mandate).\n (13) The U.S. 
Chamber of Commerce, which represents more \n than 3,000,000 businesses and organizations, said the \n Democrats' health overhaul, ``will not increase coverage-rather \n it will lead to out-sourcing, off-shoring, hiring of \n independent contractors, spinning-off small new companies, \n reducing workforces, and reducing wages''.\n (14) The National Federation of Independent Business, which \n represents 350,000 small businesses, said through mandates, \n ``employees ultimately bear the cost of their health insurance \n through lower employment, depressed wages, depressed \n productivity, and loss of economic opportunities''.\n (15) CBO found that 3,900,000 Americans would pay \n $17,000,000,000 in taxes for not purchasing Government-approved \n health insurance and that nearly half of these taxes would be \n paid by families earning less than 300 percent of the Federal \n poverty level.\n (16) The Internal Revenue Service may have to hire as many \n as 16,500 additional agents, auditors, and other workers to \n enforce all the new taxes and penalties in the Democrats' \n health overhaul, dangerously expanding the Government's reach \n into the lives of virtually every American.\n (17) The CMS actuaries predict the nearly $110,000,000,000 \n in new health care industry taxes in the Democrats' health \n overhaul will be passed onto consumers in the form of higher \n premiums and out-of-pocket costs.\n (18) The subsidies for individuals and families (who earn \n less than 400 percent of the Federal poverty level) in the \n Democrats' health overhaul are structured in a way that will \n financially punish married couples. 
For example, a woman \n earning $32,000 in 2016 who gets married to a man earning the \n same amount will pay an average marriage penalty of $9,640 \n versus what they would have paid for health coverage had they \n not married.\n (19) The rapid phase-out of the premium tax credits, when \n combined with existing income and payroll tax rates, create \n effective marginal tax rates exceeding 100 percent in certain \n cases, thus destroying any incentive to work harder and earn \n more income.\n (20) The so-called ``Patient-Centered Outcomes Research \n Institute'' paves the way for Government-sanctioned rationing \n of life-saving treatments by allowing the coverage of health \n care treatments and services to be based on how much those \n treatments and services cost.\n (21) The CMS actuaries predict the program to help cover \n the sickest Americans will be so inadequately funded that \n premiums will have to increase ``substantially'' to maintain \n solvency.\n (22) The CMS actuaries estimate 18,000,000 Americans will \n be dumped into Medicaid, a program in which they are likely to \n have a difficult time finding a doctor to treat them, as a \n result of the Democrats' health overhaul.\n (23) The Medicaid expansion in the Democrats' health \n overhaul will force States to spend an additional \n $20,000,000,000 on their Medicaid programs at time where the \n vast majority of States are facing a budget crisis.\n (24) The 2010 budget deficit currently stands at \n $1,400,000,000,000 and the national debt totals \n $12,000,000,000,000.\n (25) The CMS actuaries exposed the Democrats' budget \n gimmicks, saying a new Government-run long-term care program \n that Democrats have touted as saving $72,000,000,000 over the \n next ten years will ``face a significant risk of failure'' and \n also that ``the improved [Medicare] financing cannot be \n simultaneously used to finance other Federal outlays (such as \n the coverage expansions) and to extend the trust fund''.\n 
(26) CBO estimates the House Republican health reform bill \n would reduce premiums across the board by up to $1,050 \n annually.\n (27) The House Republican health reform bill would not cut \n Medicare or increase taxes.\n (28) CBO estimates the Republican health reform bill would \n reduce the Federal deficit by $68,000,000,000 over the next 10 \n years.\n (29) As of introduction of this bill, 21 State attorneys \n general are suing the Federal Government, challenging the \n constitutionality of the Democrats' new health care law.\n\nSEC. 3. REPEAL OF THE PATIENT PROTECTION AND AFFORDABLE CARE ACT AND \n THE HEALTH CARE AND EDUCATION RECONCILIATION ACT OF 2010.\n\n (a) Patient Protection and Affordable Care Act.--Effective as of \nthe enactment of the Patient Protection and Affordable Care Act, such \nAct is repealed, and the provisions of law amended or repealed by such \nAct are restored or revived as if such Act had not been enacted.\n (b) Health Care and Education Reconciliation Act of 2010.--\nEffective as of the enactment of the Health Care and Education \nReconciliation Act of 2010, such Act is repealed, and the provisions of \nlaw amended or repealed by such Act are restored or revived as if such \nAct had not been enacted.\n\nSEC. 4. ENACTMENT OF THE COMMON SENSE HEALTH CARE REFORM AND \n AFFORDABILITY ACT.\n\n H.R. 4038, entitled the ``Common Sense Health Care Reform and \nAffordability Act'', as introduced in the House of Representatives on \nNovember 6, 2009, is enacted into law.","Reform Americans Can Afford Act of 2010 - Repeals the Patient Protection and Affordable Care Act and the Health Care and Education Reconciliation Act of 2010, effective as of their enactment. Restores provisions of law amended by such Acts.\n\nEnacts the Common Sense Health Care Reform and Affordability Act (H.R. 
4038), as introduced in the House of Representatives on November 9, 2009.",To repeal the Patient Protection and Affordable Care Act and the Health Care and Education Reconciliation Act of 2010 and enact the Common Sense Health Care Reform and Affordability Act.
3,"SECTION 1. SHORT TITLE.\n\n This Act may be cited as the ``Greater Middle East and Central Asia \nDevelopment Act of 2004''.\n\nSEC. 2. PURPOSE.\n\n The purpose of this Act is to authorize assistance for political \nfreedom and economic development, particularly through private sector \ndevelopment, in the Greater Middle East and Central Asia, including \ncontributions to and participation in 3 new entities: a Trust for \nDemocracy, a Development Foundation, and a Development Bank.\n\nSEC. 3. FINDINGS.\n\n Congress makes the following findings:\n (1) The terrorist attacks of September 11, 2001, signaled a \n turning point in United States foreign policy.\n (2) Al Qaeda and affiliated groups have established a \n terrorist network with linkages in Afghanistan, Pakistan, \n throughout the Greater Middle East and Central Asia, and around \n the world.\n (3) The war on terrorism requires that the United States \n consider the Greater Middle East and Central Asia as a \n strategic region with its own political, economic, and security \n dynamics.\n (4) While rich in cultural, geographic, and language \n diversity, the Greater Middle East and Central Asia face common \n impediments to economic development and political freedom.\n (5) Although poverty and economic underdevelopment do not \n alone cause terrorism, the expansion of economic growth, free \n trade, and private sector development can contribute to an \n environment that undercuts radical political tendencies that \n give rise to terrorism.\n (6) Given the relationship between economic and political \n development and winning the global war on terror, America's \n support for freedom in the Greater Middle East and Central Asia \n must be matched with expanded and new programs of partnership \n with the people and governments of the region to promote good \n governance, political freedom, private sector development, and \n more open economies.\n (7) The United States and other donors should support those \n 
citizens of the Greater Middle East and Central Asia who share \n our desire to undertake reforms that result in more open \n political and economic systems.\n (8) Turkey, which should be supported in its aspirations \n for membership in the European Union, plays a pivotal and \n unique role in efforts to bring economic development and \n stability to the Greater Middle East and Central Asia.\n (9) The President should seek new mechanisms to work \n together with European and other nations, as well as with the \n countries of the Greater Middle East and Central Asia to \n promote political and economic development in the Greater \n Middle East and Central Asia.\n (10) Because the dynamics of the Greater Middle East and \n Central Asia have a serious impact on global security, the \n North Atlantic Treaty Organization (NATO) should now shift its \n strategic focus to the region, including expanded roles in \n Iraq, Afghanistan, and the Mediterranean.\n\nSEC. 4. DEFINITION; SPECIAL RULE.\n\n (a) Greater Middle East and Central Asia Defined.--In this Act, the \nterm ``Greater Middle East and Central Asia'' means the 22 members of \nthe Arab League (Algeria, Bahrain, Comoros, Djibouti, Egypt, Iraq, \nJordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, the \nPalestinian Authority, Qatar, Saudi Arabia, Somalia, Sudan, Syria, \nTunisia, United Arab Emirates, and Yemen), Afghanistan, Iran, Israel, \nKazakhstan, Kyrgyzstan, Pakistan, Tajikistan, Turkey, Turkmenistan, and \nUzbekistan.\n (b) Special Rule.--A country listed in subsection (a) may not \nreceive assistance under this Act if such country is identified as a \ncountry supporting international terrorism pursuant to section \n6(j)(1)(A) of the Export Administration Act of 1979 (as in effect \npursuant to the International Emergency Economic Powers Act; 50 U.S.C. \n1701 et seq.), section 40(d) of the Arms Export Control Act (22 U.S.C. \n2780(d)), section 620A of the Foreign Assistance Act of 1961 (22 U.S.C. 
\n2371), or any other provision of law.\n\nSEC. 5. AUTHORIZATION OF ASSISTANCE.\n\n Notwithstanding any other provision of law, the President is \nauthorized to provide assistance to the Greater Middle East and Central \nAsia for the purpose of promoting economic and political freedoms, free \ntrade, and private sector development, including the programs described \nin the following paragraphs:\n (1) United states contribution to and membership in a \n greater middle east and central asia development bank.--The \n President is authorized to work with other donors and \n representatives from the Greater Middle East and Central Asia \n to establish a Greater Middle East and Central Asia Development \n Bank to promote private sector development, trade, including \n intra-regional trade, and investment in the Greater Middle East \n and Central Asia.\n (2) Creation of a greater middle east and central asia \n development foundation.--The President is authorized to work \n with other donors and representatives from the Greater Middle \n East and Central Asia to establish a multilateral Greater \nMiddle East and Central Asia Development Foundation to assist in the \nadministration and implementation of assistance programs, including \npublic-private programs, pursuant to this Act, with specific emphasis \non programs at the grass-roots level, to include volunteer-based \norganizations and other nongovernmental organizations that support \nprivate sector development, entrepreneurship, and development of small- \nand medium-size enterprises and exchanges.\n (3) Creation of trust for democracy.--The President is \n authorized to establish, together with other donors and private \n sector and nongovernmental leaders from the Greater Middle East \n and Central Asia, a multilateral, public-private Trust for \n Democracy to support grass-roots development of civil society, \n democratic reform, good governance practices, and rule of law \n reform in the Greater Middle East and 
Central Asia. Private \n foundations shall be encouraged to participate in the Trust \n through the provision of matching funds.\n\nSEC. 6. SENSE OF CONGRESS REGARDING COORDINATION OF ASSISTANCE TO THE \n GREATER MIDDLE EAST AND CENTRAL ASIA.\n\n Recognizing the importance of coordination of assistance to the \nGreater Middle East and Central Asia, and the strategic imperatives \nrequired by the war on terrorism, it is the sense of Congress that--\n (1) the Secretary of State and the heads of other relevant \n Government agencies should consider new approaches to the \n coordination of the provision of political and economic support \n for the Greater Middle East and Central Asia; and\n (2) the Secretary of State should consider appointing a \n Coordinator for Assistance to the Greater Middle East and \n Central Asia.\n\nSEC. 7. PROGRAM REPORTS.\n\n (a) Requirement for Reports.--Beginning on January 31, 2005, and \nannually thereafter, the President shall submit to Congress a report on \nthe progress of the Greater Middle East and Central Asia, the Greater \nMiddle East and Central Asia Development Bank, the Greater Middle East \nand Central Asia Development Foundation, and the Trust for Democracy in \ndeveloping more open political and economic systems and the degree to \nwhich United States assistance has been effective at promoting these \nchanges.\n (b) Content.--The reports required by subsection (a) shall include \ngeneral information regarding such progress and specific information on \nthe progress of each of the Greater Middle East and Central Asia \nDevelopment Bank, the Greater Middle East and Central Asia Development \nFoundation, and the Trust for Democracy in--\n (1) encouraging entrepreneurial development and supporting \n growth of small- and medium-size enterprises in the Greater \n Middle East and Central Asia;\n (2) promoting private sector development, democratic \n political reform, good governance building, rule of law reform, \n and other 
appropriate goals in the Greater Middle East and \n Central Asia;\n (3) fostering intra-regional trade and investment by United \n States businesses and financial institutions in the Greater \n Middle East and Central Asia;\n (4) developing public-private partnerships to carry out the \n purpose of this Act; and\n (5) encouraging the involvement of the Greater Middle East \n and Central Asia, and other donors in each institution.\n\nSEC. 8. ENTERPRISE FUNDS REPORTS TO CONGRESS.\n\n Not later than 1 year after the date of enactment of this Act, the \nPresident shall submit to Congress a comprehensive report evaluating \nthe appropriateness of the establishment of enterprise funds in the \nGreater Middle East and Central Asia. The report shall evaluate whether \nand to what extent enterprise funds might be an effective mechanism for \npromoting economic reform and investment in the Greater Middle East and \nCentral Asia.\n\nSEC. 9. REPORT ON COORDINATION OF ASSISTANCE TO THE GREATER MIDDLE EAST \n AND CENTRAL ASIA.\n\n Not later than 1 year after the date of enactment of this Act, the \nPresident shall submit to Congress a report that describes the measures \nthat have been employed, and the measures that are planned to be \nemployed, to improve the coordination within the Department of State \nand among the heads of the relevant Government agencies of the \nprovision of support to the Greater Middle East and Central Asia.\n\nSEC. 10. NOTIFICATIONS TO CONGRESS REGARDING ASSISTANCE.\n\n Section 634A of the Foreign Assistance Act of 1961 (22 U.S.C. 2394-\n1) (relating to reprogramming notifications) shall apply with respect \nto obligations of funds made available to carry out this Act.\n\nSEC. 11. 
AUTHORIZATION OF APPROPRIATIONS.\n\n (a) Authorization of Appropriations.--In addition to funds \notherwise available for such purpose and for the countries to which \nthis Act applies, there are authorized to be appropriated to the \nDepartment of State to carry out the provisions of this Act, \n$1,000,000,000 for each of the fiscal years 2005 through 2009.\n (b) Availability of Funds.--Amounts appropriated pursuant to \nsubsection (a) shall remain available until expended.","Greater Middle East and Central Asia Development Act of 2004 - Authorizes the President to provide assistance to countries (excluding those countries supporting international terrorism) in the Greater Middle East and Central Asia to promote economic and political freedoms, free trade, and private sector development, including working with other donors and the countries of the Greater Middle East and Central Asia to establish: (1) a Greater Middle East and Central Asia Development Bank to promote private sector development, trade, including intra-regional trade, and investment in the Greater Middle East and Central Asia; (2) a multilateral Greater Middle East and Central Asia Development Foundation to assist in the administration and implementation of assistance programs, including public-private programs, with emphasis on programs at the grass-roots level; and (3) a multilateral, public-private Trust for Democracy to support grass-roots development of civil society, democratic reform, good governance practices, and rule of law reform in the Greater Middle East and Central Asia.\n\nDefines ""Greater Middle East and Central Asia'' as the 22 nations of the Arab world (Algeria, Bahrain, Comoros, Djibouti, Egypt, Iraq, Jordan, Kuwait, Lebanon, Libya, Mauritania, Morocco, Oman, Palestine/West Bank/Gaza, Qatar, Saudi Arabia, Somalia, Sudan, Syria, Tunisia, United Arab Emirates, and Yemen), Afghanistan, Iran, Israel, Kazakhstan, Kyrgyzstan, Pakistan, Tajikistan, Turkey, Turkmenistan, and 
Uzbekistan.\n\nExpresses the sense of Congress that: (1) the Secretary of State and the heads of other Government agencies should consider new approaches for the coordination of political and economic support for the countries of the Greater Middle East and Central Asia; and (2) the Secretary should consider appointing a Coordinator for Assistance to the Greater Middle East and Central Asia.\n\nAmends the Foreign Assistance Act of 1961 to require congressional notification of fund obligations under this Act.","A bill to authorize programs that support economic and political development in the Greater Middle East and Central Asia and support for three new multilateral institutions, and for other purposes."
4,"SECTION 1. SHORT TITLE.\n\n This Act may be cited as the ``Tonto and Coconino National Forests \nLand Exchange Act''.\n\n TITLE I--TONTO AND COCONINO NATIONAL FORESTS LAND EXCHANGE\n\nSEC. 101. FINDINGS; PURPOSE.\n\n (a) Findings.--Congress finds the following:\n (1) Certain private lands adjacent to the Montezuma Castle \n National Monument in Yavapai County, Arizona, are desirable for \n Federal acquisition to protect important riparian values along \n Beaver Creek and the scenic backdrop for the National Monument.\n (2) Certain other inholdings in the Coconino National \n Forest are desirable for Federal acquisition to protect \n important public values near Double Cabin Park.\n (3) Approximately 108 acres of land within the Tonto \n National Forest, northeast of Payson, Arizona, are currently \n occupied by 45 residential cabins under special use permits \n from the Secretary of Agriculture, and have been so occupied \n since the mid-1950s, rendering such lands of limited use and \n enjoyment potential for the general public. 
Such lands are, \n therefore, appropriate for transfer to the cabin owners in \n exchange for lands that will have higher public use values.\n (4) In return for the privatization of such encumbered \n lands the Secretary of Agriculture has been offered \n approximately 495 acres of non-Federal land (known as the Q \n Ranch) within the Tonto National Forest, east of Young, \n Arizona, in an area where the Secretary has completed previous \n land exchanges to consolidate public ownership of National \n Forest lands.\n (5) The acquisition of the Q Ranch non-Federal lands by the \n Secretary will greatly increase National Forest management \n efficiency and promote public access, use, and enjoyment of the \n area and surrounding National Forest System lands.\n (b) Purpose.--The purpose of this title is to authorize, direct, \nfacilitate, and expedite the consummation of the land exchanges set \nforth herein in accordance with the terms and conditions of this title.\n\nSEC. 102. DEFINITIONS.\n\n As used in this title:\n (1) DPSHA.--The term ``DPSHA'' means the Diamond Point \n Summer Homes Association, a nonprofit corporation in the State \n of Arizona.\n (2) Federal land.--The term ``Federal land'' means land to \n be conveyed into non-Federal ownership under this title.\n (3) FLPMA.--The term ``FLPMA'' means the Federal Land \n Policy Management Act of 1976.\n (4) MCJV.--The term ``MCJV'' means the Montezuma Castle \n Land Exchange Joint Venture Partnership, an Arizona \n Partnership.\n (5) Non-federal land.--The term ``non-Federal land'' means \n land to be conveyed to the Secretary of Agriculture under this \n title.\n (6) Secretary.--The term ``Secretary'' means the Secretary \n of Agriculture, unless otherwise specified.\n\nSEC. 103. 
MONTEZUMA CASTLE LAND EXCHANGE.\n\n (a) Land Exchange.--Upon receipt of a binding offer from MCJV to \nconvey title acceptable to the Secretary to the land described in \nsubsection (b), the Secretary shall convey to MCJV all right, title, \nand interest of the United States in and to the Federal land described \nin subsection (c).\n (b) Non-Federal.--The land described in this subsection is the \nfollowing:\n (1) The approximately 157 acres of land adjacent to the \n Montezuma Castle National Monument, as generally depicted on \n the map entitled ``Montezuma Castle Contiguous Lands'', dated \n May 2002.\n (2) Certain private land within the Coconino National \n Forest, Arizona, comprising approximately 108 acres, as \n generally depicted on the map entitled ``Double Cabin Park \n Lands'', dated September 2002.\n (c) Federal Land.--The Federal land described in this subsection is \nthe approximately 222 acres in the Tonto National Forest, Arizona, and \nsurveyed as Lots 3, 4, 8, 9, 10, 11, 16, 17, and Tract 40 in section \n32, Township 11 North, Range 10 East, Gila and Salt River Meridian, \nArizona.\n (d) Equal Value Exchange.--The values of the non-Federal and \nFederal land directed to be exchanged under this section shall be equal \nor equalized as determined by the Secretary through an appraisal \nperformed by a qualified appraiser mutually agreed to by the Secretary \nand MCJV and performed in conformance with the Uniform Appraisal \nStandards for Federal Land Acquisitions (U.S. Department of Justice, \nDecember 2000), and section 206(d) of the FLPMA (43 U.S.C. 1716(d)). 
If \nthe values are not equal, the Secretary shall delete Federal lots from \nthe conveyance to MCJV in the following order and priority, as \nnecessary, until the values of Federal and non-Federal land are within \nthe 25 percent cash equalization limit of 206(b) of FLPMA:\n (1) Lot 3.\n (2) Lot 4.\n (3) Lot 9.\n (4) Lot 10.\n (5) Lot 11.\n (6) Lot 8.\n (e) Cash Equalization.--Any difference in value remaining after \ncompliance with subsection (d) shall be equalized by the payment of \ncash to the Secretary or MCJV, as the circumstances dictate, in \naccordance with section 206(b) of FLPMA (43 U.S.C. 1716(b)). Public Law \n90-171 (16 U.S.C. 484a; commonly known as the ``Sisk Act'') shall, \nwithout further appropriation, apply to any cash equalization payment \nreceived by the United States under this section.\n\nSEC. 104. DIAMOND POINT--Q RANCH LAND EXCHANGE.\n\n (a) In General.--Upon receipt of a binding offer from DPSHA to \nconvey title acceptable to the Secretary to the land described in \nsubsection (b), the Secretary shall convey to DPSHA all right, title, \nand interest of the United States in and to the land described in \nsubsection (c).\n (b) Non-Federal Land.--The land described in this subsection is the \napproximately 495 acres of non-Federal land generally depicted on the \nmap entitled ``Diamond Point Exchange--Q Ranch Non-Federal Lands'', \ndated May 2002.\n (c) Federal Land.--The Federal land described in this subsection is \nthe approximately 108 acres northeast of Payson, Arizona, as generally \ndepicted on a map entitled ``Diamond Point Exchange--Federal Land'', \ndated May 2002.\n (d) Equal Value Exchange.--The values of the non-Federal and \nFederal land directed to be exchanged under this section shall be equal \nor equalized as determined by the Secretary through an appraisal \nperformed by a qualified appraiser mutually agreed to by the Secretary \nand DPSHA and in conformance with the Uniform Appraisal Standards for \nFederal Land 
Acquisitions (U.S. Department of Justice, December 2000), \nand section 206(d) of FLPMA (43 U.S.C. 1716(d)). If the values are not \nequal, they shall be equalized by the payment of cash to the Secretary \nor DPSHA pursuant to section 206(b) of FLPMA (43 U.S.C. 1716(b)). \nPublic Law 90-171 (16 U.S.C. 484a; commonly known as the ``Sisk Act'') \nshall, without further appropriation, apply to any cash equalization \npayment received by the United States under this section.\n (e) Special Use Permit Termination.--Upon execution of the land \nexchange authorized by this section, all special use cabin permits on \nthe Federal land shall be terminated.\n\nSEC. 105. MISCELLANEOUS PROVISIONS.\n\n (a) Exchange Timetable.--Not later than 6 months after the \nSecretary receives an offer under section 103 or 104, the Secretary \nshall execute the exchange under section 103 or 104, respectively, \nunless the Secretary and MCJV or DPSHA, respectively, mutually agree to \nextend such deadline.\n (b) Exchange Processing.--Prior to executing the land exchanges \nauthorized by this title, the Secretary shall perform any necessary \nland surveys and required preexchange clearances, reviews, and \napprovals relating to threatened and endangered species, cultural and \nhistoric resources, wetlands and floodplains and hazardous materials. \nIf 1 or more of the Federal land parcels or lots, or portions thereof, \ncannot be transferred to MCJV or DPSHA due to hazardous materials, \nthreatened or endangered species, cultural or historic resources, or \nwetland and flood plain problems, the parcel or lot, or portion \nthereof, shall be deleted from the exchange, and the values of the \nlands to be exchanged adjusted in accordance with subsections (d) and \n(e) of section 103 or section 104(d), as appropriate. 
In order to save \nadministrative costs to the United States, the costs of performing such \nwork, including the appraisals required pursuant to this title, shall \nbe paid by MCJV or DPSHA for the relevant property, except for the \ncosts of any such work (including appraisal reviews and approvals) that \nthe Secretary is required or elects to have performed by employees of \nthe Department of Agriculture.\n (c) Federal Land Reservations and Encumbrances.--The Secretary \nshall convey the Federal land under this title subject to valid \nexisting rights, including easements, rights-of-way, utility lines and \nany other valid encumbrances on the Federal land as of the date of the \nconveyance under this title. If applicable to the land conveyed, the \nSecretary shall also retain any right of access as may be required by \nsection 120(h) of the Comprehensive Environmental Response, \nCompensation and Liability Act of 1980 (42 U.S.C. 9620(h)) for remedial \nor corrective action relating to hazardous substances as may be \nnecessary in the future.\n (d) Administration of Acquired Land.--The land acquired by the \nSecretary pursuant to this title shall become part of the Tonto or \nCoconino National Forest, as appropriate, and be administered as such \nin accordance with the laws, rules, and regulations generally \napplicable to the National Forest System. 
Such land may be made \navailable for domestic livestock grazing if determined appropriate by \nthe Secretary in accordance with the laws, rules, and regulations \napplicable thereto on National Forest System land.\n (e) Transfer of Land to Park Service.--Upon their acquisition by \nthe United States, the ``Montezuma Castle Contiguous Lands'' identified \nin section 103(d)(1) shall be transferred to the administrative \njurisdiction of the National Park Service, and shall thereafter be \npermanently incorporated in, and administered by the Secretary of the \nInterior as part of, the Montezuma Castle National Monument.\n\n TITLE II--MENDOCINO NATIONAL FOREST LAND CONVEYANCE\n\nSEC. 201. LAND CONVEYANCE, FARAWAY RANCH, MENDOCINO NATIONAL FOREST, \n CALIFORNIA.\n\n (a) Conveyance Required.--Subject to subsection (b), the Secretary \nof Agriculture shall convey to the owner of the property known as the \nFaraway Ranch in Lake County, California (in this section referred to \nas the ``recipient''), by quitclaim deed, all right, title, and \ninterest of the United States in and to the following National Forest \nSystem lands in Mendocino National Forest in Lake County, California:\n (1) ``Faraway Ranch, Tract 39'' (approximately 15.8 acres) \n consisting of a portion of lot 6 of section 4, township 18 \n north, range 10 west, Mount Diablo base and meridian, as \n generally depicted on the map entitled ``Faraway Ranch, Tracts \n 39 and 40'' and dated June 30, 2002.\n (2) ``Faraway Ranch, Tract 40'' (approximately 105.1 acres) \n consisting of a portion of the N\1/2\SW\1/4\ and lot 7 of \n section 4, and a portion of lots 15 and 16 of section 5, \n township 18 north, range 10 west, Mount Diablo base and \n meridian, as generally depicted on the map entitled ``Faraway \n Ranch, Tracts 39 and 40'' and dated June 30, 2002.\n (b) Time for Conveyance.--The Secretary shall make the conveyance \nunder subsection (a) not later than 120 days after the date on which \nthe recipient 
deposits sufficient funds with the Bureau of Land \nManagement, California State Office, Branch of Geographic Services, to \ncover survey work costs and with the Forest Service, Mendocino National \nForest, to cover Forest Service direct transaction costs described in \nsubsection (e).\n (c) Corrections.--With the agreement of the recipient, the \nSecretary may make minor corrections to the legal descriptions and map \nof the lands to be conveyed pursuant to this section.\n (d) Consideration.--As consideration for the conveyance under \nsubsection (a), the recipient shall pay to the Secretary an amount \nequal to the fair market value of the National Forest System lands \nconveyed under such subsection. The fair market value of such lands \nshall be determined by an appraisal that is acceptable to the Secretary \nand conforms with the Federal appraisal standards, as defined in the \nUniform Appraisal Standards for Federal Land Acquisitions developed by \nthe Interagency Land Acquisition Conference.\n (e) Payment of Costs.--All direct transaction costs associated with \nthe conveyance under section (a), including the costs of appraisal, \ntitle, and survey work, shall be paid by the recipient.\n (f) Use of Proceeds.--\n (1) Deposit.--The Secretary shall deposit the amounts \n received by the Secretary as consideration under subsection (d) \n in the fund established by Public Law 90-171 (commonly known as \n the Sisk Act; 16 U.S.C. 484a).\n (2) Use.--Funds deposited under paragraph (1) shall be \n available to the Secretary until expended, without further \n appropriation--\n (A) for the acquisition of land and interests in \n land for National Forest System purposes in the State \n of California; and\n (B) for reimbursement of costs incurred by the \n Forest Service in making the conveyance under \n subsection (a).\n (3) Status of acquired land.--Notwithstanding Public Law \n 85-862 (16 U.S.C. 
521a), any lands acquired under paragraph \n (2)(A) shall be managed as lands acquired under the March 1, \n 1911 (commonly known as the Weeks Act; 16 U.S.C. 480, 500, 515 \n et seq.), regardless of whether any of the lands conveyed under \n subsection (a) were reserved from the public domain.\n (g) Withdrawal.--Subject to valid existing rights, the lands to be \nconveyed under subsection (a) are hereby withdrawn from all forms of \nlocation, entry, and patent\n\nunder the public land laws and the mining and mineral leasing laws of \nthe United States.\n\n Passed the House of Representatives September 24, 2002.\n\n Attest:\n\n JEFF TRANDAHL,\n\n Clerk.","Tonto and Coconino National Forests Land Exchange Act - Title I: Tonto and Coconino National Forests Land Exchange - Directs the Secretary of Agriculture to convey to certain private land owners specified lands in the Tonto National Forest in exchange for the conveyance by such land owners of certain lands adjacent to the Montezuma Castle National Monument and certain lands within the Coconino National Forest. Requires that the values of Federal and non-Federal lands be equalized.Directs the Secretary of Agriculture to convey to certain private land owners specified lands northeast of Payson, Arizona, in exchange for the conveyance by such land owners of certain lands within the Tonto National Forest. Requires that the values of Federal and non-Federal lands be equalized. Terminates all special use cabin permits on the Federal land upon execution of the exchange.Deletes from an exchange Federal land parcels that cannot be transferred due to hazardous materials, threatened or endangered species, cultural or historic resources, or wetland and flood plain problems and requires making appropriate adjustments to equalize land values being exchanged.Provides that the land acquired by the Secretary become part of the Tonto or Coconino National Forest, as appropriate. 
Provides for: (1) the Secretary of Agriculture to transfer all or a portion of the lands acquired adjacent to the Montezuma Castle National Monument to the administrative jurisdiction of the National Park Service; and (2) the incorporation of such lands in the Montezuma Castle National Monument.Title II: Mendocino National Forest Land Conveyance - Directs the Secretary of Agriculture to convey to the owner of Faraway Ranch in Lake County, California (""the recipient""), by quitclaim deed, all right, title, and interest of the United States in and to specified National Forest System (NFS) lands in Mendocino National Forest in Lake County. Directs the recipient to pay the Secretary an amount equal to the fair market value of the NFS lands. Assigns all transaction costs associated with the conveyance to the recipient. Requires the funds received by the Secretary to be used for the acquisition of land and interests in land for NFS purposes in California and for reimbursement of costs incurred by the Forest Service in making the conveyance. Withdraws, subject to valid existing rights, the lands being conveyed from all forms of location, entry, and patent under the public land laws and the mining and mineral leasing laws of the United States.","To provide for the exchange of certain lands in the Coconino and Tonto National Forests in Arizona, and for other purposes."


The metric is an instance of [`evaluate.EvaluationModule`](https://huggingface.co/docs/evaluate/package_reference/main_classes), as the output below shows:

In [19]:
metric

EvaluationModule(name: "rouge", module_type: "metric", features: [{'predictions': Value('string'), 'references': List(Value('string'))}, {'predictions': Value('string'), 'references': Value('string')}], usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each prediction
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLsum"`: rougeLsum splits text using `"\n"`.
        See details in https://github.com/huggingface/datasets/issues/617
    use_stemmer: Bool indicating whether Porter stemmer should be used to strip word suffixes.
 

You can call its `compute` method with your predictions and labels, which need to be lists of decoded strings:

In [20]:
fake_preds = ["hello there", "general kenobi"]
fake_labels = ["hello there", "general kenobi"]
metric.compute(predictions=fake_preds, references=fake_labels)

{'rouge1': np.float64(1.0),
 'rouge2': np.float64(1.0),
 'rougeL': np.float64(1.0),
 'rougeLsum': np.float64(1.0)}
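
To give an intuition for what these scores measure, here is a minimal sketch of a ROUGE-1 F1 computation. It is a simplification and not the implementation used by the metric: it assumes lowercased whitespace tokens and no stemming, and the helper name `rouge1_f1` is made up for this illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on whitespace tokens (no stemming)."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a token counts at most as often as it appears on both sides.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("hello there", "hello there"))     # identical strings score 1.0
print(rouge1_f1("hello there", "general kenobi"))  # no shared tokens score 0.0
```

ROUGE-2 applies the same idea to pairs of consecutive tokens, and ROUGE-L replaces the n-gram overlap with the longest common subsequence.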

## Preprocessing the data

Before we can feed those texts to our model, we need to preprocess them. This is done by a 🤗 Transformers `Tokenizer`, which will (as the name indicates) tokenize the inputs (including converting the tokens to their corresponding IDs in the pretrained vocabulary) and put them in a format the model expects, as well as generate the other inputs that the model requires.

To do all of this, we instantiate our tokenizer with the `AutoTokenizer.from_pretrained` method, which will ensure:

- we get a tokenizer that corresponds to the model architecture we want to use,
- we download the vocabulary used when pretraining this specific checkpoint.

That vocabulary will be cached, so it's not downloaded again the next time we run the cell.

In [21]:
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

tokenizer_config.json:   0%|          | 0.00/2.32k [00:00<?, ?B/s]

spiece.model:   0%|          | 0.00/792k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.39M [00:00<?, ?B/s]

By default, the call above will use one of the fast tokenizers (backed by Rust) from the 🤗 Tokenizers library.

You can directly call this tokenizer on one sentence or a pair of sentences:

In [22]:
tokenizer("Hello, this one sentence!")

{'input_ids': [8774, 6, 48, 80, 7142, 55, 1], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}

Depending on the model you selected, you will see different keys in the dictionary returned by the cell above. They don't matter much for what we're doing here (just know they are required by the model we will instantiate later). You can learn more about them in [this tutorial](https://huggingface.co/transformers/preprocessing.html) if you're interested.

Instead of one sentence, we can pass along a list of sentences:

In [23]:
tokenizer(["Hello, this one sentence!", "This is another sentence."])

{'input_ids': [[8774, 6, 48, 80, 7142, 55, 1], [100, 19, 430, 7142, 5, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}

To prepare the targets for our model, we need to tokenize them using the `text_target` parameter. This will make sure the tokenizer uses the special tokens corresponding to the targets:

In [24]:
print(tokenizer(text_target=["Hello, this one sentence!", "This is another sentence."]))

{'input_ids': [[8774, 6, 48, 80, 7142, 55, 1], [100, 19, 430, 7142, 5, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}


If you are using one of the five T5 checkpoints, we have to prefix the inputs with "summarize:" (the model can also translate, and it needs the prefix to know which task it has to perform).

In [25]:
if model_checkpoint in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""

We can then write the function that will preprocess our samples. We just feed them to the `tokenizer` with the arguments `truncation=True` and `max_length`. This ensures that an input longer than the chosen maximum length is truncated to that length. The padding will be dealt with later on (in a data collator) so we pad examples to the longest length in the batch and not the whole dataset.

In [26]:
max_input_length = 1024
max_target_length = 128

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["document"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    labels = tokenizer(text_target=examples["summary"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

This function works with one or several examples. In the case of several examples, the tokenizer will return a list of lists for each key. The dataset loaded here stores its documents under the `text` key rather than `document`, so we redefine the function to read from that column:

In [28]:
def preprocess_function(examples):
    # Read the documents from the "text" column to match this dataset.
    inputs = [prefix + doc for doc in examples["text"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Tokenize the summaries as targets
    labels = tokenizer(text_target=examples["summary"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

To apply this function to all the examples in our dataset, we just use the `map` method of the `raw_datasets` object we created earlier. This will apply the function to all the elements of all the splits in `raw_datasets`, so every split will be preprocessed in one single command.

In [29]:
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

Map:   0%|          | 0/18949 [00:00<?, ? examples/s]

Map:   0%|          | 0/3269 [00:00<?, ? examples/s]

Map:   0%|          | 0/1237 [00:00<?, ? examples/s]

Even better, the results are automatically cached by the 🤗 Datasets library to avoid spending time on this step the next time you run your notebook. The 🤗 Datasets library is normally smart enough to detect when the function you pass to `map` has changed (and thus that the cached data should not be used). For instance, it will properly detect if you change the model checkpoint and rerun the notebook. 🤗 Datasets warns you when it uses cached files; you can pass `load_from_cache_file=False` in the call to `map` to ignore the cached files and force the preprocessing to be applied again.

Note that we passed `batched=True` to encode the texts in batches. This is to leverage the full benefit of the fast tokenizer we loaded earlier, which will use multi-threading to treat the texts in a batch concurrently.
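
To make the batched calling convention concrete, here is a rough sketch of what happens when a function is applied in batches: it receives a dictionary of lists (one list per column) and returns a dictionary of lists. The helpers `toy_tokenize` and `batched_map` are hypothetical stand-ins written for this illustration. They are not part of 🤗 Datasets, and unlike the real `map`, this sketch only keeps the newly created columns.

```python
def toy_tokenize(texts):
    """Hypothetical stand-in for a tokenizer: one fake ID (the word length) per token."""
    return {"input_ids": [[len(word) for word in text.split()] for text in texts]}


def batched_map(columns, function, batch_size=1000):
    """Apply `function` to slices of a dict of equal-length columns and merge the outputs."""
    num_rows = len(next(iter(columns.values())))
    output = {}
    for start in range(0, num_rows, batch_size):
        # Each batch is a dict mapping a column name to a list of values.
        batch = {name: values[start:start + batch_size] for name, values in columns.items()}
        for name, values in function(batch).items():
            output.setdefault(name, []).extend(values)
    return output


data = {"text": ["a bb ccc", "dddd", "ee f"]}
result = batched_map(data, lambda batch: toy_tokenize(batch["text"]), batch_size=2)
print(result)
```

Because the function sees many texts at once, a tokenizer that can process a list of strings in parallel has something to parallelize over.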

## Fine-tuning the model

Now that our data is ready, we can download the pretrained model and fine-tune it. Since our task is of the sequence-to-sequence kind, we use the `AutoModelForSeq2SeqLM` class. Like with the tokenizer, the `from_pretrained` method will download and cache the model for us.

In [30]:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

config.json:   0%|          | 0.00/1.21k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/242M [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/147 [00:00<?, ?B/s]

Note that we don't get a warning like in our classification example. This means we used all the weights of the pretrained model and there is no randomly initialized head in this case.

To instantiate a `Seq2SeqTrainer`, we will need to define three more things. The most important is the [`Seq2SeqTrainingArguments`](https://huggingface.co/transformers/main_classes/trainer.html#transformers.Seq2SeqTrainingArguments), which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model, and all other arguments are optional:

In [42]:
batch_size = 16
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-xsum",
    eval_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=1,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=False,
    report_to="none",  # do not report to any logging integration
)

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the `batch_size` defined at the top of the cell and customize the weight decay. Since the `Seq2SeqTrainer` will save the model regularly and our dataset is quite large, we tell it to make three saves maximum. Lastly, we use the `predict_with_generate` option (to properly generate summaries) and activate mixed precision training (to go a bit faster).
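
As a quick sanity check on these settings, the following sketch estimates how many optimizer steps one epoch takes. It assumes a single device, no gradient accumulation, and that the 18,949-example split shown in the `map` progress bars above is the training split. The helper name `optimizer_steps` is made up for this illustration.

```python
import math


def optimizer_steps(num_examples, batch_size, num_epochs=1):
    """Estimate optimizer steps, assuming one device and no gradient accumulation."""
    # The last batch of an epoch may be smaller, hence the ceiling.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs


print(optimizer_steps(18949, 16))  # steps for one epoch with batch_size = 16
```

With more devices or gradient accumulation, the effective batch size grows and the step count shrinks accordingly.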

The `push_to_hub` argument controls whether the model is pushed to the [Hub](https://huggingface.co/models) regularly during training. It is set to `False` here; set it to `True` only if you followed the installation steps at the top of the notebook. If you want to save your model locally under a name that is different from the name of the repository it will be pushed to, or if you want to push your model under an organization and not your namespace, use the `hub_model_id` argument to set the repo name (it needs to be the full name, including your namespace: for instance `"sgugger/t5-finetuned-xsum"` or `"huggingface/t5-finetuned-xsum"`). Finally, `report_to="none"` turns off reporting to logging integrations.

Then, we need a special kind of data collator, which will not only pad the inputs to the maximum length in the batch, but also the labels:

In [32]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
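
The sketch below illustrates the idea behind this kind of dynamic padding using plain Python lists. It is not the collator's actual implementation: the function name `pad_batch` is hypothetical, `0` is assumed to be the pad token ID, and the real collator additionally returns tensors. Labels are padded with `-100`, the value that is later swapped back to the pad ID in `compute_metrics`.

```python
def pad_batch(features, pad_token_id=0, label_pad_token_id=-100):
    """Pad `input_ids` and `labels` to the longest sequence in this batch only."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_input - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # Real tokens get 1, padding positions get 0.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
        n_label_pad = max_label - len(f["labels"])
        batch["labels"].append(f["labels"] + [label_pad_token_id] * n_label_pad)
    return batch


features = [
    {"input_ids": [8774, 6, 48, 80, 7142, 55, 1], "labels": [100, 19, 1]},
    {"input_ids": [100, 19, 430, 7142, 5, 1], "labels": [8774, 1]},
]
batch = pad_batch(features)
print(batch["input_ids"][1])  # shorter input, padded with the pad ID
print(batch["labels"][1])     # shorter label, padded with -100
```

Padding per batch rather than to a global maximum keeps short batches short, which saves computation.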

The last thing to define for our `Seq2SeqTrainer` is how to compute the metrics from the predictions. We need to define a function for this, which will just use the `metric` we loaded earlier, and we have to do a bit of pre-processing to decode the predictions into texts:

In [33]:
import nltk
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]

    # Note that other metrics may not have a `use_aggregator` parameter
    # and thus will return a list, computing a metric for each sentence.
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True, use_aggregator=True)
    # Scale the scores to percentages
    result = {key: value * 100 for key, value in result.items()}

    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)

    return {k: round(v, 4) for k, v in result.items()}
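
Two steps in this function are easy to miss: swapping `-100` back to the pad ID before decoding, and counting non-pad tokens to get the generated length. The sketch below reproduces both with plain Python lists instead of NumPy arrays, on made-up token IDs. `PAD_ID = 0` is an assumption for this illustration, and the helper names are hypothetical.

```python
PAD_ID = 0           # assumed pad token ID for this sketch
IGNORE_INDEX = -100  # value used to mask padded label positions


def restore_pad(labels):
    """Replace the ignore index with the pad ID so the sequences can be decoded."""
    return [[PAD_ID if tok == IGNORE_INDEX else tok for tok in seq] for seq in labels]


def mean_generated_length(predictions):
    """Average number of non-pad tokens per generated sequence."""
    lengths = [sum(tok != PAD_ID for tok in seq) for seq in predictions]
    return sum(lengths) / len(lengths)


labels = [[100, 19, 1, -100, -100], [8774, 1, -100, -100, -100]]
predictions = [[0, 100, 19, 1, 0], [0, 8774, 6, 1, 0]]
print(restore_pad(labels))
print(mean_generated_length(predictions))
```

Without the first step, decoding would fail because `-100` does not correspond to any entry in the vocabulary.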

Then we just need to pass all of this along with our datasets to the `Seq2SeqTrainer`:

In [3]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["test"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

We can now fine-tune our model by just calling the `train` method:

In [1]:
trainer.train()

You can now upload the result of the training to the Hub by executing this instruction:

In [2]:
trainer.push_to_hub()

You can now share this model with all your friends, family, favorite pets: they can all load it with the identifier `"your-username/the-name-you-picked"` so for instance:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("sgugger/my-awesome-model")
```