If you're opening this notebook on Colab, you will probably need to install 🤗 `Transformers` and 🤗 `Datasets` as well as other dependencies.

* `datasets`
* `transformers`
* `rouge-score`
* `nltk`
* `pytorch`
* `ipywidgets`

*Note*: Since training runs on the GPU, `CUDA` needs to be installed on the machine.
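As a quick sanity check, the sketch below reports whether PyTorch can see a CUDA-capable GPU. The helper name `cuda_available` is an illustrative assumption and not part of the original notebook; it simply returns `False` when `torch` is not installed yet.

```python
def cuda_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return False
    return torch.cuda.is_available()


print("CUDA available:", cuda_available())
```

If this prints `False` on a machine with a GPU, the driver or the CUDA-enabled build of PyTorch is the first thing to check.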

In [1]:
! pip install datasets transformers rouge-score nltk ipywidgets

Collecting datasets
  Downloading datasets-1.18.4-py3-none-any.whl (312 kB)
Collecting transformers
  Downloading transformers-4.17.0-py3-none-any.whl (3.8 MB)
Collecting rouge-score
  Downloading rouge_score-0.0.4-py2.py3-none-any.whl (22 kB)
Collecting nltk
  Downloading nltk-3.7-py3-none-any.whl (1.5 MB)
Collecting responses<0.19
  Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting huggingface-hub<1.0.0,>=0.1.0
  Downloading huggingface_hub-0.4.0-py3-none-any.whl (67 kB)

When using `nltk`, the `punkt` tokenizer data also needs to be downloaded, since it is not installed together with the package. Without `punkt`, the analysis will fail with an error.

In [2]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /home/user/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

If you're opening this notebook locally, make sure your environment has the latest versions of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First you have to get your authentication token from the Hugging Face website (sign up [here](https://huggingface.co/join) if you haven't already!), then execute the following cell and paste the token when prompted:

In [3]:
from huggingface_hub import notebook_login

notebook_login()

VBox(children=(HTML(value='<center>\n<img src=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…

Then you need to install `Git-LFS`.

If you are not using `Google Colab`, you may need to install `Git-LFS` manually, since the command below may not work depending on your operating system. You can read about `Git-LFS` and how to install it [here](https://git-lfs.github.com/).
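To find out whether `Git-LFS` is already available before installing anything, a minimal sketch using only the standard library is shown here. The function name `git_lfs_installed` is a hypothetical helper, and the check assumes the executable is called `git-lfs` and is on the `PATH`.

```python
import shutil


def git_lfs_installed() -> bool:
    """Return True if a `git-lfs` executable is found on the PATH."""
    return shutil.which("git-lfs") is not None


print("git-lfs found:", git_lfs_installed())
```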

In [4]:
! sudo apt install git-lfs

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  git-lfs
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 3316 kB of archives.
After this operation, 11.1 MB of additional disk space will be used.
Get:1 http://fin1.clouds.archive.ubuntu.com/ubuntu focal/universe amd64 git-lfs amd64 2.9.2-1 [3316 kB]
Fetched 3316 kB in 1s (2576 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 143519 files and directories currently installed.)
Preparing to unpack .../git-lfs_2.9.2-1_amd64.deb ...
Unpacking git-lfs (2.9.2-1) ...

Make sure your version of `Transformers` is at least 4.11.0, since the functionality used to share the model was introduced in that version:

In [5]:
import transformers

print(transformers.__version__)

4.17.0
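Comparing version strings directly can be misleading, because `"4.9.2" >= "4.11.0"` is true for plain strings. The sketch below compares the numeric parts instead. The helpers `version_tuple` and `meets_minimum` are illustrative names, not part of `transformers`, and they only handle dotted numeric versions (a suffix such as `.dev0` is ignored).

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '4.17.0' or '4.18.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "4.11.0") -> bool:
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


print(meets_minimum("4.17.0"))  # True
print(meets_minimum("4.9.2"))   # False
```

In the notebook, the check would be called as `meets_minimum(transformers.__version__)`.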


You can find a script version of this notebook to fine-tune your model in a distributed fashion using multiple GPUs or TPUs [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq).

# Fine-tuning a model on a summarization task

In this notebook, we will see how to fine-tune one of the [🤗 `Transformers`](https://github.com/huggingface/transformers) models for a summarization task. We will use the [PubMed Summarization dataset](https://huggingface.co/datasets/ccdv/pubmed-summarization), which contains PubMed articles paired with their abstracts.

![Widget inference on a summarization task](https://github.com/huggingface/notebooks/blob/master/examples/images/summarization.png?raw=1)

We will see how to easily load the dataset for this task using 🤗 `Datasets` and how to fine-tune a model on it using the `Trainer` API.

In [6]:
model_checkpoint = "google/pegasus-arxiv"

This notebook is built to run with any model checkpoint from the [Model Hub](https://huggingface.co/models) as long as that model has a sequence-to-sequence version in the Transformers library. Here we picked the [`google/pegasus-arxiv`](https://huggingface.co/google/pegasus-arxiv) checkpoint.

## Loading the dataset

We will use the [🤗 `Datasets`](https://github.com/huggingface/datasets) library to download the data and get the metric we need to use for evaluation (to compare our model to the benchmark). This can be easily done with the functions `load_dataset` and `load_metric`.

In [7]:
from datasets import load_dataset, load_metric

raw_datasets = load_dataset("ccdv/pubmed-summarization")
metric = load_metric("rouge")

No config specified, defaulting to: pub_med_summarization_dataset/document


Downloading and preparing dataset pub_med_summarization_dataset/document to /home/user/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30...


Dataset pub_med_summarization_dataset downloaded and prepared to /home/user/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30. Subsequent calls will reuse this data.
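To give an intuition for what the ROUGE metric measures, here is a simplified sketch of ROUGE-1 F1, which scores the unigram overlap between a generated summary and a reference. This is not the implementation behind `load_metric("rouge")`, which relies on the `rouge-score` package and also reports other variants such as ROUGE-2 and ROUGE-L. The function name `rouge1_f1` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split text."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps, for each word, the smaller of the two counts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the cat", "the cat sat on the mat"))  # 0.5
```

In that example every predicted word appears in the reference (precision 1.0), but only 2 of the 6 reference words are covered (recall 1/3), which gives an F1 of 0.5.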



The `raw_datasets` object is a [`DatasetDict`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasetdict), which contains one key each for the training, validation and test sets:

In [8]:
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['article', 'abstract'],
        num_rows: 119924
    })
    validation: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6633
    })
    test: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6658
    })
})

To access an actual element, you need to select a split first, then give an index:

In [9]:
raw_datasets["train"][0]

{'article': "a recent systematic analysis showed that in 2011 , 314 ( 296 - 331 ) million children younger than 5 years were mildly , moderately or severely stunted and 258 ( 240 - 274 ) million were mildly , moderately or severely underweight in the developing countries . in iran a study among 752 high school girls in sistan and baluchestan showed prevalence of 16.2% , 8.6% and 1.5% , for underweight , overweight and obesity , respectively . the prevalence of malnutrition among elementary school aged children in tehran varied from 6% to 16% . anthropometric study of elementary school students in shiraz revealed that 16% of them suffer from malnutrition and low body weight . snack should have 300 - 400 kcal energy and could provide 5 - 10 g of protein / day . nowadays , school nutrition programs are running as the national programs , world - wide . national school lunch program in the united states there are also some reports regarding school feeding programs in developing countries . 

Since the PubMed dataset is very large, we keep only a subset of each split: 2,000 examples for training, 500 for validation and 500 for testing.

In [10]:
raw_datasets["train"] = raw_datasets["train"].select(range(1, 2001))
raw_datasets["validation"] = raw_datasets["validation"].select(range(1, 501))
raw_datasets["test"] = raw_datasets["test"].select(range(1, 501))
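The sizes of the subsets follow directly from the index ranges passed to `select`. The short check below, which uses only built-in `range` objects, shows that each range starts at index 1, so the first row of every split is left out. The variable names are illustrative, and `range(2000)` is shown only as a hypothetical alternative that would keep row 0.

```python
# Index ranges used in the cell above (the end of a range is exclusive).
train_indices = range(1, 2001)
eval_indices = range(1, 501)

print(len(train_indices), train_indices[0], train_indices[-1])  # 2000 1 2000
print(len(eval_indices), eval_indices[0], eval_indices[-1])     # 500 1 500

# Hypothetical alternative that also keeps the first row of the split.
train_indices_from_zero = range(2000)
print(train_indices_from_zero[0], train_indices_from_zero[-1])  # 0 1999
```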

To get a sense of what the data looks like, the following function shows a few examples picked at random from the dataset.

In [11]:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=5):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))
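The retry loop in `show_random_elements` draws indices until it has enough distinct ones. As a possible simplification, `random.sample` from the standard library returns distinct picks in a single call. The helper `pick_random_indices` below is a hypothetical name used only to illustrate the idea; it is not part of the notebook.

```python
import random


def pick_random_indices(num_items: int, num_examples: int = 5) -> list:
    """Return `num_examples` distinct indices drawn from range(num_items)."""
    if num_examples > num_items:
        raise ValueError("Can't pick more elements than there are in the dataset.")
    # random.sample draws without replacement, so no duplicates are possible.
    return random.sample(range(num_items), num_examples)


picks = pick_random_indices(2000, 5)
print(picks)
```

Inside `show_random_elements`, the result could replace the `picks` list built by the loop, for example `picks = pick_random_indices(len(dataset), num_examples)`.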

In [12]:
show_random_elements(raw_datasets["train"])

Unnamed: 0,article,abstract
0,"the experimental animals , feeding and management : the experimental \n procedures complied with the guide for the care and use of agricultural animals of obihiro \n university . the experiment was carried out at the field center of animal science and agriculture , \n obihiro university of agriculture and veterinary medicine . fifty multiparous holstein cows , \n in which parity was from 1st to 5th at dry period , were used in this study and had calved \n between september 2011 and august 2012 . parity and body condition score ( bcs ) of the \n experimental cows at initiation of the study were 2.4 0.2 and 3.41 0.04 , respectively . \n the study was performed from 3 weeks before the expected parturition to 100 days pp . cows \n close to the dry period , about 1 month before the expected calving date , were moved to a \n paddock and fed a limited total mixed ration [ dry matter ( dm ) basis : 127 g of crude protein \n ( cp)/kg and 6.6 mj of net energy for lactation ( nel)/kg ] consisting of grass silage ( 3.5 kg , \n dm basis : 165 g of cp / kg and 5.5 mj of nel / kg ) , maize silage ( 5.1 kg , dm basis : 86 g of \n cp / kg and 6.0 mj of nel / kg ) , concentrate for dry cows ( 2.0 kg , dm basis : 170 g of cp / kg and \n 6.8 mj of nel / kg ) and grass hay ( dm basis : 125 g of cp / kg and 5.7 mj of nel / kg ) ad \n libitum until parturition . after parturition , cows were housed in a free - stall \n barn and fed a lactation diet , which was a mixed ration ( dm basis : 155 g of cp / kg and 6.2 mj \n of nel / kg ) consisting of grass ( 6.5 kg , dm basis : 165 g of cp / kg and 5.5 mj of nel / kg ) , \n maize silage ( 12.5 kg , dm basis : 84 g of cp / kg and 6.4 mj of nel / kg ) and concentrate for \n dairy cows ( 8.0 kg , dm basis : 180 g of cp / kg and 7.1 mj of nel / kg ) ad \n libitum . in addition , the diets were supplemented with minerals , and the dairy \n cow concentrate was prepared according to each cow s specific requirements for milk \n production . 
grass hay ( dm basis : 104 g of cp / kg and 5.2 mj of nel / kg ) and water were \n available ad libitum . cows were milked twice daily between 05:00 and 06:30 \n hr and between 17:00 and 18:30 hr . the experimental itt and sampling : the itt was performed 3 weeks before \n the expected calving date . the cows were weighed the day before the initiation of the itt , \n and bw was used to determine the doses of insulin for the itt . immediately before the itt , an extension catheter was inserted into the \n right or left jugular vein . the itt was performed by intravenously administering 0.05 iu / kg \n bw of insulin ( novolin r 100 iu / ml ; novo nordisk pharma , tokyo , japan ) , \n followed by administration of 5 ml heparinized saline ( 100 \n iu / ml ) . blood samples were \n collected via the jugular vein at 0 ( before insulin injection ) , 30 , 45 and 60 min relative \n to the administration of insulin via caudal venipuncture to measure glucose and insulin . bcs was assessed twice a week from 3 weeks before the expected parturition to 3 weeks after \n calving by the same operator by using a 1 to 5 scale with 0.25 intervals , where 1=thin and \n 5=very fat . blood samples were obtained by \n caudal venipuncture twice a week from 3 weeks before the expected parturition to 3 weeks \n after calving . blood samples were collected via the jugular vein from the calves immediately \n after birth . nonheparinized and silicone - coated 9-ml tubes ( venoject , \n autosep , gel + clot . act . vp - as109k ; terumo corporation , tokyo , japan ) were used for \n biochemical analysis , and sterile 10-ml tubes containing 200 \n l of stabilizer solution ( 0.3 m edta and 1% acetyl salicylic acid , ph \n 7.4 ) were used for hormonal analysis . serum was obtained by centrifuging the blood samples \n for 15 min at 38c in an incubator . all the tubes were centrifuged at 2,000 \n g for 20 min at 4c , and plasma samples were maintained at 30c until \n analysis . 
in addition , milk samples were collected twice a week after milking until the \n onset of luteal activity . the milk samples were centrifuged at 1,500 g \n for 15 min at 4c , and the skim milk samples were stored at 30c until analysis for \n progesterone concentration . daily milk yield was recorded until 100 days pp . peripartum \n diseases , such as milk fever , hypocalcemia , ketosis , ruminal acidosis , displaced abomasum , \n lameness , retained placenta , endometritis and mastitis , were recorded when that has been \n diagnosed from 3 weeks prepartum to 3 weeks postpartum by veterinarian in the experimental \n farm . the experimental measurement of hormones and metabolites : plasma and skim \n milk progesterone concentrations were determined using enzyme immunoassay ( eia ) after \n extraction with diethyl ether , as described previously ; the extraction efficiency was 90% . the standard curve ranged from 0.05 to 50 \n ng / ml , and the 50% effective dose ( ed50 ) of \n the assay was 0.66 ng / ml . the mean intra - assay and \n inter - assay coefficients of variation ( cvs ) were 6.0% and 9.2% , respectively . the total \n plasma insulin - like growth factor 1 ( igf-1 ) concentration was determined using eia by using \n the biotin streptavidin amplification technique \n after protein extraction by using acid ethanol ( 87.5% ethanol and 12.5% 2 n hydrochloric \n acid ) to obtain igf-1 free from binding proteins . intra- and inter - assay cvs were 5.9% and 6.1% , \n respectively , and the ed50 of this assay system was 7.2 \n ng / ml . the plasma gh concentrations were determined \n using eia as described previously ; the standard \n curve ranged from 0.78 to 100 ng / ml , and the \n ed50 was 21 ng / ml . intra- and inter - assay cvs \n were 3.1% and 8.2% , respectively . the plasma insulin concentrations were determined using an \n enzyme - linked immunosorbent assay ( elisa ) kit ( bovine insulin elisa 10 - 1201 - 01 ; mercodia , \n uppsala , sweden ) . 
the serum concentrations of glucose , non - esterified fatty acids ( nefa ) , -hydroxybutyrate \n ( bhba ) , total protein ( tp ) , albumin ( alb ) , blood urea nitrogen ( bun ) and total cholesterol \n ( t - cho ) and the activities of aspartate aminotransferase ( ast ) were measured using a \n clinical chemistry automated analyzer ( tba120fr ; toshiba medical systems co. , ltd . , tochigi , \n japan ) . the experimental identification of the onset of luteal activity : when the \n progesterone concentration in the plasma or skim milk had increased to more than 1 \n ng / ml , the cows were considered to show luteal activity \n . the experimental statistical analysis : sixteen cows were excluded from \n data analysis , because of the following reasons : a pregnancy period of more than 287 or less \n than 273 days ( n=6 ) , severe mastitis ( n=3 ) , twin calving ( n=1 ) , blood collection loss at itt \n ( n=3 ) and mistakes in insulin injection at itt ( n=3 ) . cows were divided into two groups \n based on the time required for glucose to reach the minimum levels after insulin injection . \n cows with a minimum glucose at 60 min after insulin injection were considered to have lower \n insulin sensitivity and/or lower glucose metabolism compared to cows with a minimum glucose \n level by 45 min after insulin injection . therefore , cows with a minimum glucose level at 60 \n min after insulin injection were defined as the insulin resistant group ( ir group ) , whereas \n those with a minimum glucose level by 45 min after insulin injection were defined as the \n non - insulin resistant ( nir group ) in this study . before data analysis , bcs , plasma igf-1 , gh \n and insulin concentrations , and serum metabolite concentrations were averaged weekly . the \n period of 06 days after calving was considered as the parturient week ( 0 week pp ) , and the \n kolmogorov smirnov test ( sas enterprise guide version 4.3 ; sas institute inc . , cary , nc , \n u.s.a . 
) was used for statistical testing of normality . in addition , the data were analyzed \n separately for the prepartum and pp periods . stat view ( stat view 5.0 software ; abacus \n concepts inc . , berkeley , ca , u.s.a . ) was used for data analysis by using the repeated \n measures of the anova procedure , including time ( week ) , group ( nir or ir ) and their \n interaction in the model as fixed effects . diagnosis of peripartum diseases and sex of calves in the nir and ir groups were analyzed \n using the chi - square test , and other data , including results for calves between nir and ir , \n were analyzed using the student s t - test or wilcoxon s signed rank test \n ( sas enterprise guide version 4.3 ; sas institute inc . ) . results are presented as mean \n standard error of the mean ( sem ) ; differences with p<0.05 were \n considered significant . in 28 of the 34 experimental cows , the time required for glucose to reach the minimum level \n was 45 min after insulin injection with one exception ( 30 min ; n=1 , 45 min ; n=27 , nir \n group ) . the remaining experimental cows ( n=6 ) required 60 min after insulin injection to \n attain the minimum glucose levels ( ir group ) . serum glucose concentrations at 60 min after \n insulin injection were higher in the nir group than in the ir group , although glucose levels \n at the other time points did not differ between the nir and ir groups ( fig . 1.the change in serum glucose concentration after insulin injection at insulin \n tolerance test ( itt ) in the nir ( n=28 ) and ir ( n=6 ) groups . p<0.05 ) . the change in serum glucose concentration after insulin injection at insulin \n tolerance test ( itt ) in the nir ( n=28 ) and ir ( n=6 ) groups . 
table 1table 1.parity , calving difficulty , sex of calves , peripartum disease , luteal activity \n onset and milk yield in the nir and ir groupsnir groupir groupp - value(n=28)(n=6)parity at the onset of experiment2.4 0.32.2 0.70.460calving difficulty1.1 0.11.0 0.00.700sex of calves ( male / female)14/143/31.000diagnosis of peripartum disease9/28 ( 32%)1/6 ( 17%)0.645days to the onset of luteal activity ( days)38.3 3.820.3 3.60.039average of daily milk yield between days 7 and 100 pp ( kg)41.4 0.935.9 2.00.013total milk yield from days 7 to 100 pp ( kg)3,888.1 81.23,375.5 185.90.013values are the mean sem . a ) nir group ; cows with a minimum glucose level by 45 min \n after insulin injection . ir group ; cows with a minimum glucose level at 60 min after \n insulin injection . b ) 1 , unassisted birth ( natural , without human assistance ) ; 2 , easy \n calving with human assistance ; 3 , difficult calving with a few humans ; 4 , dystocia \n ( requiring considerably more force than normal ) ; and 5 , surgical treatment or death of \n cow . c ) milk fever , hypocalcemia , ketosis , ruminal acidosis , displaced abomasum , \n lameness , retained placenta , endometritis and mastitis from 3 weeks prepartum to 3 \n weeks postpartum . shows the parity , calving difficulty , sex of calves , peripartum disease \n diagnosis , luteal activity onset and milk yield until 100 days pp in the nir and ir groups . \n days until the onset of luteal activity in the ir group were fewer than those in the nir \n group ( p<0.05 ) . in addition , the average ( p<0.05 ) \n and total ( p<0.05 ) milk yields until 100 days pp were lower in the ir \n group than in the nir group . peripartum diseases were diagnosed as mastitis ( n=6 ) , \n hypocalcemia ( n=1 ) and milk fever ( n=2 ) in nir group , and as mastitis ( n=1 ) in ir group , and \n there was no significant difference in the number of cows with the peripartum diseases \n between nir and ir groups . 
no significant difference was noted in other factors between the \n nir and ir groups . a ) nir group ; cows with a minimum glucose level by 45 min \n after insulin injection . ir group ; cows with a minimum glucose level at 60 min after \n insulin injection . b ) 1 , unassisted birth ( natural , without human assistance ) ; 2 , easy \n calving with human assistance ; 3 , difficult calving with a few humans ; 4 , dystocia \n ( requiring considerably more force than normal ) ; and 5 , surgical treatment or death of \n cow . c ) milk fever , hypocalcemia , ketosis , ruminal acidosis , displaced abomasum , \n lameness , retained placenta , endometritis and mastitis from 3 weeks prepartum to 3 \n weeks postpartum . 2.serum metabolite concentrations , activities of enzymes and plasma metabolic hormones , \n and bcs during the experimental period [ mean sem : solid , nir ( n=28 ) ; open , ir ( n=6 ) \n groups ] . * indicates differences of p<0.05 , and indicates \n differences of p<0.1 between the nir and ir groups . shows the circulating serum metabolite concentrations , enzyme levels , plasma \n metabolic concentrations and bcs during the experimental period . during the prepartum \n period , bcs ( p<0.05 ) , and serum bun concentrations \n ( p<0.05 ) were lower , whereas serum glucose ( p=0.05 ) and \n alb concentrations ( p=0.10 ) tended to be lower in the ir group than in the \n nir group . during the pp period , cows of the nir group had higher serum nefa \n ( p<0.05 ) and bhba ( p=0.09 ) concentrations than those \n in the ir group . in addition , treatment and time effects were observed \n ( p<0.05 ) for bcs during the pp period : bcs at 0 \n ( p=0.08 ) and 1 ( p<0.05 ) week pp were lower in the ir \n group than in the nir group . no significant differences were noted in the other factors \n between the nir and ir groups in each period . 
serum metabolite concentrations , activities of enzymes and plasma metabolic hormones , \n and bcs during the experimental period [ mean sem : solid , nir ( n=28 ) ; open , ir ( n=6 ) \n groups ] . * indicates differences of p<0.05 , and indicates \n differences of p<0.1 between the nir and ir groups . bw , plasma metabolic hormone levels and serum glucose concentrations at birth in the calves \n of cows of the nir and ir groups are shown in table \n 2table 2.bw and plasma metabolic hormones and serum glucose concentrations at birth in the \n calves of the nir and ir groupscalves of nircalves of irp - value(n=28)(n=6)bw at the birth ( kg)47.2 0.942.1 1.70.020plasma gh concentration \n ( ng / ml)13.6 1.315.2 4.80.653plasma igf-1 concentration \n ( ng / ml)121.5 6.369.8 5.60.001plasma insulin concentration \n ( ng / ml)0.3 0.00.7 0.20.061serum glucose concentration \n ( mg / dl)77.4 5.272.1 14.80.684values are the mean sem . a ) nir group ; cows with a minimum glucose level by 45 min \n after insulin injection . ir group ; cows with a minimum glucose level at 60 min after \n insulin injection .. bw at birth in the calves of the ir group was lower than that in the calves \n of the nir group ( p<0.05 ) . furthermore , the calves of the ir group \n showed lower plasma igf-1 concentration ( p<0.001 ) and higher plasma \n insulin concentration ( p=0.06 ) . no significant differences were noted in \n the plasma gh and serum glucose levels at birth between the calves of the nir and ir \n groups . a ) nir group ; cows with a minimum glucose level by 45 min \n after insulin injection . ir group ; cows with a minimum glucose level at 60 min after \n insulin injection . in this study , the six cows that reached the minimum glucose levels at 60 min after insulin \n injection were considered to be ir ; the reason for ir was thought to be the slow recovery of \n glucose after insulin injection , which is consistent with the findings of a previous study \n by lee et al . . 
in general , bcs \n and blood glucose and bun concentrations are known to be associated with energy status and \n feed intake [ 7 , 8 , 38 ] . during the prepartum period , ir \n cows showed lower energy status and feed intake owing to the lower bcs and glucose and bun \n concentrations . although ir by itt was confirmed at 3 weeks before calving , a difference in \n energy status between ir and nir cows was noted . in particular , bcs can not be evaluated on \n the basis of the change in energy status of the real - time feed intake ; therefore , in this study , ir cows might have become insulin \n resistant during an earlier time . malnutrition causes imbalance in glucose homeostasis , and \n the decrease of insulin in circulation induces the reduction of feed intake in dairy cows \n . therefore , the feed intake reduction in ir \n cows might have inhibited the volatile fatty acid production in the rumen and thus \n suppressed gluconeogenesis in the liver . lower \n energy status , such as lower bcs , before calving of ir group in this study might be caused \n by long - term malnutrition from previous lactation . however , in this study , the reasons for \n the lower energy status in the ir cows were not clear ; thus , further studies are warranted \n to confirm the onset of insulin resistance in pregnant dairy cows . bcs at 0 and 1 week pp were lower in the ir group than in the nir group . nir cows showed \n higher serum nefa and bhba concentrations than those in ir cows , although the levels of \n metabolic hormones did not differ between the ir and nir cows . furthermore , the average and \n total milk yield until 100 days pp were lower in the ir group than in the nir group . higher \n nefa and bhba indicate greater mobilization of adipose tissue and failure of lipid \n metabolism in the liver [ 14 , 15 ] . however , cows with lower bcs had sustained reduced plasma nefa and \n bhba concentrations after calving compared to cows with higher bcs [ 28 , 33 ] . 
cows with lower bcs \n produce milk by protein mobilization , because of the limited body fat ; thus , fat - corrected \n milk yield in those cows was lower than moderate and fat cows . conversely , it was indicated that higher bcs cows have ability to \n mobilize fat to maintain energetic homeostasis after feed restriction . additionally , roche et al . have concluded that bcs at calving had positive effect on milk yield , \n and optimal bcs at calving was 3.5 in the 5-point scale . in the present study , greater bcs \n and better gluconeogenesis in nir group might produce greater milk yield compared with ir \n group , although the differences of them between nir and ir groups were not so greater . days \n to the onset of luteal activity in the ir group were fewer than in the nir group . in dairy \n cows , lowered energy status during the peripartum period is known to delay the first \n ovulation after parturition [ 2 , 19 ] . butler and smith showed \n that a negative energy balance was directly related to the pp interval to the first \n ovulation and that the differences in the energy balance were reflected in the milk yield . \n in addition , cows with a delayed first ovulation showed higher nefa and bhba concentrations \n after parturition [ 21 , 31 , 39 ] . therefore , in this study , \n higher nefa and bhba concentrations of nir cows during the pp period might have delayed the \n onset of luteal activity , and the lowered milk yield of ir cows might induce earlier \n resumption of ovarian activity . the maternal endocrine and metabolic milieu transferred through the placenta during late \n pregnancy affects the environment of the fetus [ 17 , \n 24 , 30 ] . in \n humans , ir of the mother is associated with low birth weight of the infant ; in cattle , maternal malnutrition during gestation is \n related to the lowered development of both the placenta and the fetus [ 24 , 30 ] . 
further , in ewes , \n restricted maternal feeding during gestation was related to lower bw and plasma igf-1 , \n insulin and glucose concentrations in the fetus , although maternal igf-1 concentrations were \n not affected . in the present study , calves of \n the ir cows showed lowered bw at birth and a lower plasma igf-1 concentration , supporting \n the findings of previous studies . in addition , they showed higher insulin levels than those \n of nir cows , despite the similar glucose levels . in the late gestation , fetal growth is \n mainly regulated by igf-1 , and the dominant regulator of igf-1 production in the fetus is \n fetal glucose and insulin . thus , the differences \n in blood metabolic hormones and glucose concentrations between the calves of the nir and ir \n groups might be attributed to the fetal nutritional condition that was affected by maternal \n endocrine and metabolic milieu . in humans , lower bw at birth is known to be associated with \n a wide range of adverse outcomes later in life , including diabetes ; further , obese children with low birth weight have higher blood \n insulin to glucose concentration and show higher insulin resistance as revealed by the \n homeostasis model assessment compared with obese children with normal birth weight . therefore , calves of ir cows might develop insulin \n resistance in the future . in conclusion , the findings of the present study suggest that ir at 3 weeks before \n parturition in dairy cows is related to the pp metabolic status , milk production and \n resumption of ovarian activity along with growth , as well as the metabolic status of their \n calves . therefore , ir evaluated on the basis of the recovery of glucose after an injection \n of a small dose of insulin during the dry period might be an indication of the pp \n performance of pregnant dairy cows , as well as the growth , fertility and milk production of \n their calves . 
in addition , the reason for ir in the present study was thought to be the slow \n recovery of glucose after insulin injection as well as the previous study . therefore , the enhancement of the gluconeogenesis in \n the liver by energy supplementation , such as glycerol , or hepatic stimulant , such as amino \n acids , should be confirmed in order to improve the ir .","<S> this study aimed to investigate the effects of insulin resistance ( ir ) during the \n close - up dry period on the metabolic status and performance of dairy cows as well as to \n determine the effects on body weight ( bw ) and metabolic status of their calves . an insulin \n tolerance test ( itt ) was conducted by administering 0.05 iu / kg bw of insulin to 34 \n multiparous holstein cows at 3 weeks prepartum . </S> <S> blood samples were collected at 0 , 30 , 45 \n and 60 min after insulin injection , and cows were divided into two groups based on the \n time required for glucose to reach the minimum levels [ non - ir ( nir ) , 45 min ( n=28 ) ; and \n ir , 60 min ( n=6 ) ] . </S> <S> blood or milk sampling and body condition score ( bcs ) estimation were \n performed twice weekly during the experimental period . </S> <S> blood samples from calves were \n collected immediately after birth . </S> <S> cows with ir showed lower bcs \n ( p<0.05 ) and serum urea nitrogen ( p<0.05 ) and \n glucose concentration ( p=0.05 ) before calving , and lower serum \n non - esterified fatty acid concentration ( p<0.05 ) and milk yield \n ( p<0.05 ) and earlier resumption of luteal activity \n ( p<0.05 ) after calving ; their calves showed lower bw \n ( p<0.05 ) and plasma insulin - like growth factor - i concentration \n ( p<0.001 ) and higher plasma insulin concentration \n ( p<0.05 ) . in conclusion , ir at 3 weeks prepartum in dairy cows is \n related to postpartum metabolic status and performance along with growth and metabolic \n status of their calves . </S>"
1,"direct coronary stent implantation is an elegant technique for coronary artery revascularization.1 however , calcified coronary lesions , often seen in older patients suffering from diabetes mellitus , renal failure and hypertension , are challenging to deal with , as they require optimal lesion preparation prior stenting for avoiding stent underexpansion which is related to in - stent restenosis , target lesion revascularization and subsequent stent thrombosis.2 several strategies and technologies have been developed to address the problem of heavily calcified coronary lesions . these include simple dilatation using standard non - compliant balloon , cutting balloon and plaque modification using rotational atherectomy . we report on the management of an underexpanded bare - metal stent in a patient with heavily calcified lesion not amenable to high - pressure balloon - dilatation . a 72-year old man suffering from progressive angina over the past 8 weeks presented to our chest pain unit . he had previously documented insulin - dependent diabetes , alimentary obesity , hyperlipidemia and arterial hypertension . an ambulatory performed myocardial perfusion scintigraphy revealed a reduced tracer - uptake in the apex , left posterior and antero - lateral wall during physical examination ( 100 watt - cycling ) . coronary angiography , which was performed via right radial access using a 5f sheath , revealed a 50% stenosis of the left anterior descending artery ( lad ) and ramus posterolateralis sinistra ( rpls ) while the right coronary artery ( rca ) had a critical 90% stenosis ( fig . due to the patients symptoms and the angiographic findings we decided to perform a percutaneous coronary intervention ( pci ) . the patient received 600 mg clopidogrel , 500 mg aspirin and 5000u heparin followed by primary pci and direct stenting of a bare - metal stent ( bms ) ( coroflex blue 3.5 mm/8 mm , b. braun , melsungen , germany ) with 18 atm for 30 sec ( fig . 
post - pci angiography revealed a 75% stenosis in the mid - portion of the stent ( fig . 1c ) . a subsequent dilatation with a semi - compliant balloon ( pantera 3,5/10 mm with [ biotronik , berlin , germany ] 18 atm over 30 sec ) , a non - compliant balloon ( quantum 3,5/8 mm [ boston scientific , natick , usa ] with 20 atm over 30 sec ) and a cutting - balloon ( 3,0/10 mm [ boston scientific , natick , usa ] with 18 atm over 30 sec ) could not expand the stent further ; pointing out the heavily calcified nature of this lesion . due to the fact that an underexpanded stent is a predictor for worse clinical outcome we decided on rotablation . additionally we introduced a 5f sheath in right femoral vein and inserted a transient pacemaker lead . after passing across the stenosis with the 0.009 rotawire we ablade the heavily calcified stenosis as well as the stent struts ( stentablation ) ( fig . all ablations were performed with a 1.75 mm burr with at least 150,000 rpm and ablation times < 30 sec without a decrease in rotational speed of > 5,000 rpm . the procedure was free of complications and we continued with dilatation with a non - compliant balloon ( quantum 3,5/8 mm with 20 atm over 30 sec ) and a cutting - balloon ( 3,0/10 mm with 16 atm over 30 sec ) . with complete expansion of the balloons the procedure was continued with implantation of a drug - eluting stent ( taxus libert 4.0/12 mm [ boston scientific , natick , usa ] with 16 atm over 30 sec ) ( rotastenting ) ( fig . finally , there was timi 3 without evidence of dissection or residual stenosis ( fig . 2c ) . following uneventful hospital stay without evidence of myocardial necrosis the patient was discharged after 3 days on 100 mg aspirin , 75 mg clopidogrel , 5 mg bisoprolol , 5 mg of ramipril and 40 mg simvastatin with a recommendation for dual antiplatelet therapy of 1 year without any change in his extra - cardiovascular medications . 
a routine coronary angiography performed 6 months after index - pci revealed a good result with a mild ( 25% ) restenosis ( fig . 2d ) . direct stenting is the implanation of stents in coronary lesions without predilatation.1 from animal restenosis models , direct stenting without the need for predilatation appears to reduce vessel trauma , in particular as a result of less endothelial denudation , resulting in less neointimal hyperplasia subsequently.1 pci of calcified and complex lesions has been associated with lower success rates , an increased frequency of acute complications , and higher restenosis rates than pci of simple lesions.2 as seen in our case , delivering the stent may be difficult and stent expansion may be inadequate in heavily calcified lesions , resulting in smaller acute gain compared to non - calcified lesions.2 it is widely accepted that achieving postprocedural residual stenosis is a major determinant of restenosis during follow - up and optimal stent expansion is a crucial factor in minimizing the risk of stent thrombosis pointed out by the fact that only 22% of patients that experienced subacute stent thrombosis have an acceptable pci result as assessed by ivus.2,3 a variety of strategies and technologies have been developed to address the problem of an underexpanded stent . the postit trial revealed that in case of using only the stent delivery balloon over 70% of patients did not achieve optimal stent deployment.4 use of non - compliant balloon to achieve full distension in resitant lesions is a reasonable first - step . however , focal points of resistance within a lesion result in non - uniform balloon expansion and characteristic dog - boning with overexpansion in the more compliant segments . in this non - uniform expansion may cause vessel dissection and rupture acutely as well as restenosis due to deep - wall injury in the follow - up . 
cutting - balloon , designed to score the vessel longitudinally rather than causing uncontrolled plaque disruption , have been used successfully in the treatment of undilatable lesions.5 in our case , none of these techniques were successful in reducing the underexpansion , demonstrating the nature of the heavily calcification , which was not assumed on initial fluoroscopy . thus , despite the existence of limited data,6 we decided to rotablade the remaining calcification and the underexpanded stent struts to avoid aforementioned complications . high - speed rotational atherectomy preferentially cuts hard plaque , increases plaque compliance and thereby renders the lesion more amenable to balloon dilatation.7 the rotablator is able to ablate inelastic tissue selectively while maintaining the integrity of elastic tissue due to the principle of differential cutting . these particles are small enough to pass through the coronary microcirculation and ultimately undergo phagocytosis in the liver , spleen , and lung.7 the procedure performed in our case was uneventful with no dissection , slow - flow , heamodynamic compromise or myocardial necrosis . we had applied a transient pacemaker via the right femoral vein to overcome possible conduction disturbances when handling in the right coronary artery . 
several observational studies have confirmed that rotational atherectomy prior to stent deployment in severely calcified lesions does facilitate stent delivery and expansion , but incidence of restenosis remains unsatisfactory ( 23% ) when bms are used.8 there is limited information about rotational atherectomy followed by des implantation , but initial results seem promising.9 a comparison of bms ( n = 84 ) and des ( n = 213 ) after rotablation with cardiac death and recurrent myocardial infarction being defined as primary endpoint and binary restenosis as secondary endpoint revealed lower rates for primary endpoint in des group ( 2.3% versus 7.1% ; p = 0.04 ) during a follow - up of 1300 days.10 despite our procedural success and good midterm result , there are no data on long - term follow - up after stentablation and rotastening . thus , it should be emphasized that a better lesion preparation is needed to avoid stent underexpansion in undilatable lesions .","<S> calcified coronary lesions are challenging to deal with , as they require optimal lesion preparation . </S> <S> direct stenting in this scenario is associated with risk of stent - underexpansion , which is related to in - stent restenosis , target lesion revascularization and stent - thrombosis . </S> <S> we report on the interventional management of an underexpanded bare - metal stent not amenable to high - pressure balloon dilation and cutting - balloon . </S> <S> by using rotablation we could abrade the underexpanded stent struts and the calcification with subsequent implantation of a drug - eluting stent . </S> <S> follow - up of 6 months revealed good results without evidence of significant restenosis . </S> <S> our clinical experience and case reports in the literature suggest that this strategy might be an option for underexpanded stents not amenable to conventional techniques . </S>"
2,"total joint arthroplasty , in particular tha , has had a revolutionary role in improving the quality of life . the use of bone cement for the implant stability versus biologic fixation with bone ingrowth in cementless tha has been a controversial issue for years . while immediate cement fixation in very old people or in those with poor bone stock might provide a quicker return to daily activity , cementless implants have gained more popularity over the years . the relative superiority of cementless acetabular component over cemented ones is nowadays a well - accepted fact . the femoral stem , however , has very good long - term reports both in cemented and cementless forms . efforts to decrease the rate of loosening have included the use of newer materials , improvement in the design of the implant ; and modification of operative techniques . after the midterm follow - up results of cementless tha , the long - term results with impressive survival rates , are being reported more and more . the purpose of this study is to evaluate the efficacy and prosthesis survival in an iranian society , with its unique cultural lifestyle and social differences from western societies . the cases of cementless total hip arthroplasty performed in nemazee hospital by a single surgeon from may 1997 to june 2007 were included in a retrospective outcome study . from the total of 63 hips in 52 consecutive patients , 3 patients had died at the time of the last follow up due to problems unrelated to the operation and 7 patients could not be reached . the information from medical records of all the cases , including radiographs , was collected and the patients were called in for an interview , physical examination , and radiographic assessment . 
the patients filled the general - health assessment form , short form 36 ( sf-36 ) ; the arthritis specific functional instrument womac ( western ontario and mcmaster universities osteoarthritis index ) , and patient reference disability questionnaire mactar ( mcmaster toronto arthritis ) . harris hip rating scale was also filled for all the hips , ( where 90 - 100 points would be excellent , 80 - 89 good , 70 - 79 fair , and below 70 is assumed a poor result ) . the radiographic assessment was by measurement of the cup and stem alignment in the immediate post - operative anteroposterior and frog - lateral views . the same views in the final follow - ups were specifically evaluated for any possible change in cup orientation , loosening in cup or stem ( based on gruen classification for zones of the femoral stem and martell et.al . osseous integration of the acetabular and femoral components were assessed using the proposed criteria of moore et al . considering the five criteria of absence of radiolucent lines , superolateral and inferomedial buttressing , medial stress - shielding , and radial trabeculae for acetabular shell and stem bony ingrowth according to engh et al . osseo - integration in stem was evaluated by the presence of spot welds , cortical hypertrophy ; and absence of radiolucent lines , and pedestal formation . harris - galante ii ( hg ii ) porous - coated prosthesis ( hgp , zimmer , warsaw , indiana ) was implanted in 15 and versys - trilogy ( v - t ) system ( zimmer , warsaw , indiana ) in the remaining 37 hips . the straight stem is tivanium ti-6 al-4v alloy with a proximal pure titanium fiber metal mesh coating ; with a collar and a modula.r morse tapered neck , which is available in three lengths . there are porous pads , made of commercially pure titanium wire , proximally on the anterior and posterior surfaces and a small medial pad immediately distal to the collar . 
the shell is a partial hemisphere made of titanium alloy with fiber metal mesh coating with variable number of holes for screw fixation . versys femoral stem ( zimmer , warsaw , indiana ) is a collarless proximally and circumferentially coated prosthesis for cementless use . trilogy shells are coated with commercially pure titanium fiber metal , which is clinically proven to enhance fixation through bone ingrowths . infection prophylaxis was with cephalosporin and gentamicin at the time of surgery and 48 hours post - surgery . thromboembolic prophylaxis was mostly by warfarin for 6 weeks with the intended inr ( international normalized ratio ) of 1.7 to 2 . early mobilization and first post - surgery day ambulation and crutch walking for 6 weeks were the uniform care received by all the patients . the prosthesis survivorship limit was defined as implant life span and revision - ready state with progressive symptoms . definite when subsidence , varus , or valgus orientation change was observed in femoral component and angle change , or migration of two millimeter or more seen in two views for the acetabular component . the 42 patients ( 52 hips ) included fourteen males ( 33.3% ) and twenty - eight females ( 66.7% ) , with the mean age of 48.83 years ( 13 years , range 22 - 75 ) at surgery . harris - galante ii prosthesis was used in 15 cases and versys - trilogy prosthesis in 37 hips . the average duration of follow - up was 65 months ( range 26 - 136 ) . the hg ii group of prostheses had a longer follow - up of 105 months ( range 52 - 136 ) . this figure was 49 months for versys - trilogy group ( range 26 - 78 ) . the overall mean follow - up was 65 months ( 32 , range 26 - 136 ) . the overall arthroplasty survival ( i.e. , well - functioning prosthesis with no clinical or radiographic evidence of wear , loosening , infection , etc . ) , which would suggest the need for revision was 65 months . 
therefore , 43 hips in 34 patients were in good and functional status by the time of last follow - up . post - operatively , hips had a mean flexion arc of 114 degrees and 9 degrees of flexion contracture . the overall hhs with a mean of 85 ( 15 , range 24 - 100 ) was excellent in 65.9% , good in 27.3% , fair in 4.5% and poor in 2.3% of cases . the womac score had a mean of 22.7 ( 13 , range 3 - 62 ) , with 3 being the best and 62 the worst case scenario . the pain subscore of womac was 2.87 , joint stiffness 2.21 and functional subscore 17.62 . the items in the function , which were of most concern to the patients were , in a descending order : inability in stair climbing ; sitting or getting up from the floor or from flat - top toilets ; picking up objects from the floor ; and putting on or taking off socks . sf 36 measurement had a total mean score of 61.33 ( range 18 - 95 ) . out of the 8 items in sf 36 , the patient expectation questionnaire of mactar had the following findings : pain relief was achieved in 41 cases ( 97.6% ) , improvement in walking in 39 ( 92.8% ) , and improved ability in performing daily living activities in 37 ( 88% ) . the correlation of the above scoring system in this group of patients was evaluated . a close correlation between harris hip score and total womac score pain and function items , but not as much with stiffness item in womac ( p values 0.002 , 0.001 and 0.45 , respectively ) . sf 36 and harris hip score were closely correlated and value of r was 0.67 . sf 36 was more closely correlated with pain and function subscores of womac ( r=-0.77 and 0.78 , respectively ) . there was no infection , and no thromboembolic event in any of the 52 hips . in the last follow - up assessments , 44 hips ( 84.6% ) were functional and well fixed ; 8 cases had undergone revision and one patient is suspected of the early stage of loosening and is being followed . 
pedestal was seen in 3 ( 7% ) stems , and 1 - 2 millimeter non - progressive radiolucent lines in 20% of femurs and 4.5% of acetabular components . heterotopic ossification as a late complication was found in 35 hips ( 67.3% ) , 29 ( 82.7% ) of which were brookers i and ii , 5 brookers iii and one brookers iv . since there were two groups of prostheses from the same company used in this study , they also were separately evaluated : among the 15 cases of harris - galante ii prosthesis , with average follow - up of 105 months ( range 52 - 136 ) , 8 cases had developed problems , all of which had been already revised . all of the revised cases had problems in the acetabulum with cup wear , loosening , and polyethylene fracture and two of them had a simultaneous femoral loosening and osteolysis secondary to polyethylene wear debris . the etiology of hip disease in these 8 revisions , included five acetabular dysplasia , one avascular necrosis and systemic lupus erythematosus , one multiple epiphyseal dysplasia , and one primary osteoarthritis . in reviewing the original radiographs , no initial radiographic malposition was present and the stems were in normal orientation and mean shell inclination angle was 47 degrees ( range 40 - 57 degrees ) , which was not statistically different from the versys - trilogy group ( p=0.51 ) . the primary etiology of hip disease , in terms of distribution in these two groups , was different ( table 1 ) . comparison of etiology of hip replacement in two groups broken tines of fiber metal - coated acetabular shells were seen in 5 patients , all in the failed acetabular components ( figure 1 ) . broken tine of the cup the harris hip score , womac and sf36 in the 15 hg ii cases were significantly poorer than the 37 cases with versys - trilogy prosthesis : harris hip score of 66 versus 92 , womac 30 versus 20 ; and sf36 of 49 versus 66 ( p value 0.009 ) . 
the versys - trilogy prostheses are all surviving in a mean follow - up of 49 months ( range 26 - 78 ) with no radiographic or clinical evidence of loosening or wear . the five early complications , mentioned above , were all in this group of prostheses . this is a small group of cases with a midterm follow - up on porous - coated hip arthroplasty in a society with unique social habits and customs . the charnley prosthesis reported by ranawat had 90% survival of the femoral component , while harris had about 80% survival with revision mainly on the acetabular side . porous - coated implants were used with the idea of removing the so - called weak link in the hip replacement from 1971 . this has survived as a very good hip arthroplasty option for young active individuals with good bone stock . the results with non - circumferential proximal porous - coating prosthesis like porous - coated anatomic , pca ( howmedica , rutherford , new jersey ) and harris - galante i ( zimmer , warsaw , indiana ) were not satisfactory : failures of 43% and only 57% survival in 8 years . kim recently reported that pca prosthesis ( howmedica ) had 21% revision in 20 years for the acetabular component and 9% for femoral component . after generally satisfactory short and midterm results of the second generation of cementless implants ( with proximal circumferential porous - coating ) , clohisy and harris reported a 96% 10-year survival rate for acetabular component and archibeck et al . in a study of 92 patients with the same follow - up had a 96.4% survival rate for acetabular and 100% for femoral components . most studies have evaluated the functional results with hhs with 83 - 95 points on average . reported 100% ten years survival in a hydroxyapatite - coated , proximally and circumferentially coated prosthesis . engh et al . , using an extensively coated prosthesis reported on 5-year , 10-year , and 15-year follow - ups . 
the acetabular component in most reports is the one with more problems and the responsible section for loosening ( kim , engh , archibeck ) . the number of holes for temporary screw fixation has also been a point of concern , as more holes might provide better access for migration of polyethylene debris behind the shell and into the femoral canal . the locking of the polyethylene cup into the metal shell is variable in different designs of prostheses . poor locking mechanism can cause micromotion between the liner and the shell , causing more wear and subsequent dislodgment of the liner . the survival rate in the present series with 96.2% for the femoral component and 84.6% for the acetabular component in 5.5 years is not a very promising result . the high revision rate of 15.4% was primarily in the hg ii components and all were related to the acetabular side with wear , breakage , and dislodgment of polyethylene liner . louwerse et al . in 1999 reported 26 cases of liner failure , 13 of which belonged to hg cups . curry et al . in a 10-year follow - up reported 271 cases of hg ii prosthesis in 2008 . our hg ii group of arthroplasty in 9 years average follow - up had 46.7% overall prosthesis survival . the femoral stems were revised in only two cases that had severe bone lyses secondary to the acetabular liner problem , and the remaining 50 ( 96.2% ) stems are stable and functioning well . although the revised cups , except one , did not have primary osteoarthritis or inflammatory arthritis , the numbers are too few to draw any conclusion as to whether the primary etiology could have had any bearing on the high rate of liner problem in hg ii cups . the appearance of broken tines was visible on radiographs , one to two years before the hips became symptomatic . broken tines are probably early warning signs of instability . excessive motion will cause wear of the liner and material debris will initiate retro acetabular and proximal femoral osteolysis . 
this would eventually lead to failure . at the same time , the trilogy cups ( zimmer , warsaw , indiana ) with versys circumferentially coated stems have 100% survival in 4 years average follow - up . the locking mechanism in the trilogy is split - ring mechanism , which has been used in several other designs with a good track record . the harris hip score in the v - t group and surviving hg ii ( not revised ) were excellent or good in 93.2% and good in , but in the total group including the revised hips were 85 . the adjusted general health measures ( sf36 ) and disease specific outcome measures ( womac ) and patients expectations have been previously studied for knee arthroplasty in this region , but not for hip arthroplasty . the hg ii group had , understandably , a significant drop in their womac and sf36 scores due to the inclusion of the 15.6% revision . expectations of the patients , that were mainly relief of pain and ability to walk comfortably , were fulfilled in nearly all the patients ( 97.6% and 92.8% , respectively ) . some preoperative problems relatively unique to our culture , flattop toilet and cross - legged sitting on the floor , were not the expectations of the patients and seem to be modified after surgery by the patients . the radiographic evaluation in the present paper showed good positioning of cups and stems in accordance with established standards . the radiolucent lines and pedestal formation in those few cases were not indicative of loosening . there were only 4 cases ( 9.3% ) of thigh pain in this series that had no correlation with the size of the femoral stem . the incidence of thigh pain , which is related to the stability of the prosthesis , is reported between 0 and 28% in different articles . in spite of the literature report of 0.28 - 4% infection and 2.2 - 14.7% thromboembolic events the ulnar nerve injuries were in the contralateral upper limbs from arm malpositioning during anesthesia . 
the tibial and peroneal nerve injuries from the traction effect of lengthening observed in this report is a recognized problem , and has been reported in the literature with an incidence of 0.3 - 3.7% in hip arthroplasty , usually associated with lengthening of over 1.7 centimeters . the main limitations of the present study are its retrospective nature and small numbers of cases , however , the merits are that it is a single surgeon s experience with uniform technique and post - operative care and being a unique study in iran with the special cultural and daily living habits . the generally satisfactory results of hip arthroplasty as demonstrated by harris hip scores and functional assessments with womac , sf 36 and mactar are shown in iranian society in spite of some cultural and social differences . the outcome of cementless tha is satisfactory and comparable with the literature based on the results of function and survival of this small comparative group . the use of hgii acetabular component should be abandoned , because of the poor locking mechanism of the shell with the liner .","<S> background : cementless hip prosthesis was designed to provide biologic fixation , without the use of cement . </S> <S> the second generation components have shown more reliable bone ingrowths and survival rates . </S> <S> we are reporting a midterm result of two designs of cementless prosthesis in a unique culture with different social habits and expectations.methods:52 primary cementless total hip arthroplasty in 42 patients with the mean age of 48.8 years were retrospectively studied . </S> <S> two groups of prosthesis had been implanted : harris - galante ii ( hgii ) in 15 and versys - trilogy ( v - t ) in 37 hips , both from zimmer company . </S> <S> the patients were assessed clinically , radiographically and with harris hip score , sf36 , womac , and mactar questionnaires , with 65 months ( 26 - 136 ) mean follow-up.results:all the v - t prostheses had survived well . 
</S> <S> eight of hg ii were revised by the last follow - up in 19 - 102 months . </S> <S> all had undergone acetabular revision and 2 combined with femoral revision . broken tines of hgii cups </S> <S> were seen in 4 radiographs . </S> <S> the 65 months overall survival was 96.2% for femoral and 84.6% for acetabular components . </S> <S> 90% had good or excellent harris hip scores . </S> <S> the functional scores were poorer in the hg ii group . </S> <S> pain relief and improved walking were the two main patients expectations fulfilled in 97.6% and 92.8% , respectively.conclusions:the outcome of cementless total hip arthroplasty ( tha ) is satisfactory and comparable with the literature based on the results of function and survival of this small comparative group . the use of hgii acetabular component should be abandoned . </S>"
3,"out of the 10,807 patients enrolled for the original survey , access survival data were available for 7058 ( 65% ) patients . these patients resided in portugal , the united kingdom , ireland , italy , turkey , romania , slovenia , poland , and spain . the mean age was 63.515.0 years , 38.5% were female , 27.1% were diabetic , 90.6% had a native fistula , and 9.4% had a graft . median dialysis vintage was 43.2 months ( minimum : 0.1 months ; maximum : 419.6 months ) . access location was lower arm for 51.2% of patients . during the follow - up , prevalent needle sizes were 15 and 16 g for 63.7% and 32.2% of the patients , respectively ( 14 g : 2.7% 17 g : 1.4% ) . in spain , 98% of patients were treated with 15-g needles , and in romania 75% of patients were treated with 16-g needles . cannulation technique was area for 65.8% , rope - ladder for 28.2% , and buttonhole for 6% of patients , with some country preferences clearly visible : area technique was applied in as much as 77% of patients in romania , and rope - ladder was more common in poland than in the total study population ( 44% ) . the direction of arterial puncture was antegrade for 57.3% of patients ; this was the preference for 99% of patients in poland . the bevel orientation was upward for 70.2% of the patients , peaking in poland with 95% . the practice of needle rotation after insertion was practiced for 42% of patients , with a much higher percentage in italy ( 82% ) . the prevalent combination between arterial needle puncturing and bevel direction was antegrade with bevel upward ( 43.1% ) , followed by retrograde with bevel up ( 27.1% ) . the proportion of the two other combinations , that is , antegrade and retrograde with bevel downward , was 14.2% and 15.6% , respectively . the 15.6% with retrograde and bevel down were mainly treated in two countries ( spain and portugal ) . median blood flow was 350400 ml / min . 
in italy and spain , 40% and 38% of patients conversely , in slovenia and in poland 5455% of patients were treated with blood flows below 300 ml / min . figure 3 shows the distribution of patients according to the prescribed needle size , blood flow , and venous pressure levels . the primary outcome event ( i.e. , surgery for a new va during the follow - up period ) was observed in 1485 patients ( 21% ) . univariate survival analysis revealed a significant benefit for access survival for patients who are younger , nondiabetic , male , have lower body mass index , do not take platelet antiaggregants , do not have heart failure , and are able to assist with compression . a significant benefit was also seen for patients with fistula ( vs. graft ) , smaller needles , distal location of the access , and low venous pressure . with regard to cannulation technique , positive effects were observed for antegrade needle direction ( vs. retrograde ) , bevel up ( vs. down ) , nonalcohol - based disinfection , and application of local anesthesia . although not statistically significant , a potential survival benefit was indicated for higher blood flow ( p=0.056 ) and buttonhole technique ( vs. rope - ladder and area , p=0.11 ) . needle rotation did not affect the access survival ( p=0.81 ) , neither did access vintage ( age < 1 month before baseline vs. 1 month before baseline ; p=0.29 ) . meier access survival curves according to blood flows , venous pressures , needle sizes , and cannulation techniques are presented in figure 4 . in a second step , after adjustment for age , gender , diabetes , va type , access location ( proximal vs. distal ) , dialysis vintage and heart failure , and incorporation of country differences , the use of a 16-g needle was associated with a significantly higher risk of access failure ( hazard ratio ( hr ) 1.21 ) compared with the use of a 15-g needle . 
very few ( 1.4% ) patients were treated with the even smaller 17-g needles , but the direction of the results is the same , that is , increased hr for smaller needle size . using a blood flow of 300350 ml / min as a reference , the hr tended to decrease as the blood flow increased . with regard to cannulation technique , both rope - ladder and buttonhole techniques performed significantly better than the area technique . considering antegrade with the bevel up as reference , the retrograde direction of the arterial needle with bevel down is associated with a significant increase of access failure risk of 18% . all other options , that is , antegrade direction with bevel down or retrograde direction with bevel up , were not associated with a hr significantly different from 1.00 . with regard to venous pressure , using as reference the range between 100 and 150 mm hg , the hrs increased proportionally to 1.4 , 1.87 , and 2.09 with the increase of venous pressure from 150 to 200 mm hg , 200 to 300 mm hg , and > 300 mm hg , respectively ( all p0.008 ) . of note , venous pressures of > 300 mm hg are extreme cases and were only recorded in 0.6% of the patients . in addition , a venous pressure of < 100 mm hg was associated with a significantly higher hr of 1.51 . to investigate this further , we also looked for interaction effects between blood flow and venous pressure , as well as between arterial and venous pressures ; no significant associations were found . finally , the use of a tourniquet and not applying any pressure at the time of cannulation were associated with hrs of 1.30 and 1.25 ( p<0.008 and < 0.02 ) , respectively , compared with exertion of arm compression by the patient at the time of cannulation ( labeled patient assistance ' in table 1 ) . 
in summary , this study revealed that area cannulation technique , albeit being identified as the most commonly used technique in this population of over 7000 patients , was inferior to rope - ladder and to buttonhole for maintenance of va functionality . with regard to the effect of needle and bevel direction , the combination of antegrade positioning of the arterial needle with bevel - up orientation was significantly associated with better access survival than retrograde positioning with bevel down . the use of larger needles tended to favor access patency , with 15 g being superior to 16 or 17 g. the application of arm pressure by the patient at the time of cannulation had a favorable effect on access longevity compared with not applying pressure or using a tourniquet . results pertaining to the type and location of the access and the technical parameters ( i.e. , blood flow and venous pressure ) were as follows : there was an increased risk for access failure for grafts vs. fistulas , proximal location vs. distal , right arm vs. left arm , blood flows below 300 ml / min vs. those in the range of 300–350 ml / min , and for the presence of a venous pressure > 150 mm hg vs. pressures between 100 and 150 mm hg . tissue reparative processes triggered by cannulation procedures may cause enlargement of the fistula and the formation of aneurysms and scars that , in turn , can favor the development of stenotic lesions and ultimately impact fistula survival . repetitive punctures at the va site cause vessel wall defects that are initially filled by thrombi before finally healing . of the three cannulation techniques , the buttonhole approach has the theoretical advantage of limiting the process of dilatation and fibrosis because the thrombus is displaced while being formed , favoring the formation of a cylindrical scar from the subcutaneous and vessel wall tissues . 
the rope - ladder technique may have the initial advantage of favoring progressive maturation along the entire length of the fistula , but it requires a fistula with sufficiently long segments suitable for cannulation . the area puncture technique weakens the fistula wall and is associated with the least favorable consequences , that is , localized dilation , disruption of the vessel wall , and subsequent development of ( pseudo)aneurysms and strictures . despite this and the fact that area cannulation has been discouraged for over two decades , it was disheartening to observe that this was the predominant practice in almost two - thirds of patients . according to the ebpg and the clinical practice guidelines for va , the rope - ladder technique should be used for cannulation of grafts . specifically , according to the latter , this study showed a 22% lower risk for va failure in those patients whose va was cannulated with the buttonhole technique as opposed to area , confirming the results of a recently published randomized controlled clinical trial . although the buttonhole technique is associated with good results , one should also take into consideration that it is a practice performed in centers with highly trained personnel that work with strict protocols and that it may also be used for fistulas with only short segments available for cannulation . in our study , this practice is used in 22 centers , mainly in portugal , turkey , the united kingdom , and italy . research questions that arise from current guidelines address the effectiveness of structured cannulation training , increased remuneration for expert cannulators , and whether self - cannulation can lead to better outcomes . indeed , as buttonhole cannulation requires the designation of a reference nurse , especially for the initial 4–6 weeks , it is likely that this technique benefits from its association with centers offering the necessary training ( i.e. 
, centers capable of stemming the increased organizational effort and assigning the right cannulator to the right patient ) . in addition , once the tunnel is created , cannulation can be performed directly by patients . however , irrespective of the influence of cannulator training and center organizational issues , the underlying question to be addressed , optimally in a well - designed clinical study , is which cannulation techniques can be recommended to ensure long - term va functionality . this study showed that retrograde direction of arterial needle with bevel down is associated with the least favorable outcome . this is consistent with the findings of woodson and shapiro who reported that retrograde puncturing may be associated with an increased risk of hematoma formation , possibly owing to the related venous return of the blood ( i.e. , retrograde filling ) . antegrade puncturing , on the other hand , may be considered fistula - protective by the same reasoning , that is , tract closure through flow force . therefore , retrograde direction of the arterial needle is more likely to be associated with a higher risk for aneurysm . despite recommendations by kdoqi to rotate the needle during insertion , the univariate analysis performed here found no evidence of any benefit of this practice . on the contrary , the authors share the opinion of many cannulators that the 180° rotation of the needle is unnecessary and may constitute an additional trauma to the va . further studies are needed to clarify whether rotation of the va needle during cannulation should be recommended or not . there are a number of possible reasons for the association of the higher failure risk with smaller needle sizes . while increased trauma and prolonged bleeding time are generally associated with the use of large needles , the use of small needles at the same blood flow results in a higher speed of the blood returning to the vasculature , possibly damaging the intima of the avfs . 
for example , at an operative blood flow of 350 ml / min , the maximum speed of the injected blood will be 8.79 m / s with a 17-g needle and 5.80 m / s with a 15-g needle ( presented by ralf jungmann at vascular access course – stockholm , 11–12 october 2012 , stockholm , sweden ) . furthermore , the shear forces created by returning blood can have a role in inflammation and stenosis formation . stenotic fistula and graft lesions are associated with the induction of the expression of profibrotic cytokines , local inflammation , and neointimal proliferation . however , we can not exclude that this association may be a consequence of bias by indication . needles of smaller inner dimension are generally prescribed not only for a new va but also for problematic avfs , that is , those likely to fail in the following months . therefore , it is difficult to derive a conclusion from this association , but on the basis of figure 3 , 17-g needles are clearly linked to blood flow levels below 300 ml / min and , on the contrary , 14-g needles are mainly prescribed to patients with 350–400 ml / min or greater blood flows . it is also of interest to underline that higher venous pressure is mainly associated with the 16-g needles , which have a wider distribution of different blood flows . measurement of venous pressure during dialysis is currently used as a surveillance tool within the dialysis session , and not as a standard monitoring strategy . this study showed a significant and proportionally increasing risk of va failure with venous pressures higher than 150 mm hg . an increased hr was also detected for venous pressure below 100 mm hg . as shown in figure 3 , an association between needle size , blood flow , and venous pressure is indicated in that for needle gauges 15 , 16 , and 17 low venous pressures appear to be associated with low blood flows . such an association could be an indication of stenosis in the artery . 
venous pressure is crucially dependent on the characteristics of the needle ( e.g. , the needle gauge , the length of the metallic portion , and the length and the thickness of the needle shaft ) , which vary among manufacturers . in this network , at the time of the study , the vast majority of the needles ( 85% ) were from a single producer and the length of the needle was 25 mm . the unexpectedly high hr associated with a venous pressure of under 100 mm hg compared with 150–200 mm hg should motivate reflection on the currently accepted limits . one could consider integration of venous pressure monitoring into an algorithm for the detection of increased risk of access failure . this study has certain limitations over and beyond those inherent to observational studies , for example , that residual confounding can not be completely ruled out . being a retrospective study , patient data for those patients on dialysis before admission to the nephrocare clinic were not collectable , and thus robust information on the number of prior vaes , on their respective lengths , and on first cannulation was not available . particularly , the missing information on the length of the va , its depth , and the access flow constitute a major weakness because a particular cannulation technique could have been chosen on the basis of what is possible with the given access characteristics . in addition , the length of the access can influence the way in which the needles are placed . despite these missing data , we feel that this study has its merits , as it shows that traditional local practices have a significant influence on procedures exercised . a further limitation is that the va practice was surveyed in april 2009 and was assumed not to have been changed during the follow - up ( 31 march 2012 ) . however , as nursing practices in this field are strongly related to the clinic culture and experience , we have reason to believe that it should not constitute a significant bias . 
of course , some cannulation particulars , such as needle size and arterial blood flow , may vary over time , in that smaller needle sizes and low blood flow rate are used for initial access use and that large needles are taken for mature accesses . however , we feel that the model selected here is also justified because it is an explanatory model , based on the association of baseline characteristics with access survival . other limitations are that we had follow - up of 65% of the patients and that most countries were in europe ( owing to deployment of the electronic reporting system ) . as reported , an association between clinical practice patterns and country has been detected , and consequently not all different practices were covered by our model . however , according to the results of this analysis , each country has a combination of practices that positively and negatively influence the va survival . for example , in romania , positive influences were the puncture direction being antegrade ( 82% ) , bevel orientation being predominantly upward ( 95% ) , and needle not rotated ( 84% ) ; negative associations were the use of area technique ( 77% ) , preferred needle size ( 75% with 16 g ) , and the use of blood flows < 300 ml / min ( 47% ) . for this reason , intracountry correlations were considered using a sandwich estimator in the multivariate model . to assess the influence of individual center practices , we also performed a sensitivity analysis by applying the sandwich estimator at the center level . there were only negligible differences to the results obtained with the original model at the country level , raising our confidence that there is no severe confounding of the model by center practice effects . 
given the relevant impact of the investigated variables on the survival of the va , itself a key driver of hemodialysis patient survival , we believe it is time to organize a large - scale randomized clinical trial to facilitate the formulation of practical and comprehensive cannulation practice guidelines . as the associations between practice patterns and va survival reported here are mainly related to national procedures and only partially related to actual patient limitations , they offer some promising indications for improving clinical practice . in april 2009 , a cross - sectional survey was conducted in 171 dialysis units located in europe , the middle east , and africa to collect details on va cannulation practices on a clinic by clinic level . all patients who were on double - needle hemodialysis or online hemodiafiltration during the week of the survey were selected for analysis , as long as a fistula or graft was used for va , survey data were complete , and follow - up data were available in our clinical database . primary outcome was time until the first surgical access intervention resulting in the generation of a new access ( i.e. , as opposed to any surgical intervention done just for revision , thrombectomy , etc . , or any endovascular intervention ) . patients were censored for transplantation , death , loss of follow - up , or end of the follow - up period ( 31 march 2012 ) . information on cannulation retrieved from the survey comprised fistula type and location , cannulation technique , needle size , needle and bevel direction , needle rotation , blood flow , arterial and venous pressure , use of disinfectants , use of local anesthesia , and application of arm compression at the time of cannulation . 
to adjust for individual patient characteristics , the following information was extracted from the clinical database : patient age , gender and body mass index , prevalence of diabetes , and the use of ace inhibitors , platelet antiaggregants , and anticoagulants . in addition , the median blood flow prescription was documented at a center level at the time of the survey . for univariate analysis , kaplan meier curves were calculated and comparisons were performed using the log - rank test . by combining univariate results with medical and statistical experience , a set of variables for multivariable analysis was determined . in particular , specific interaction terms ( e.g. , bevel vs. arterial needle direction ) were defined for statistical examination , and decisions were made regarding their inclusion or omission in the cox model depending on their significance or collinearity , respectively . a final cox model based on these variables was calculated , using the sandwich estimator to account for within - country correlation . step by step , the final model was reduced , setting a p - value of 0.1 for variable inclusion . all analyses were performed with sas v9.2 ( sas institute , cary , nc ) .","<S> hemodialysis patient survival is dependent on the availability of a reliable vascular access . in clinical practice , </S> <S> procedures for vascular access cannulation vary from clinic to clinic . </S> <S> we investigated the impact of cannulation technique on arteriovenous fistula and graft survival . </S> <S> based on an april 2009 cross - sectional survey of vascular access cannulation practices in 171 dialysis units , a cohort of patients with corresponding vascular access survival information was selected for follow - up ending march 2012 . of the 10,807 patients enrolled in the original survey , access survival data were available for 7058 patients from nine countries . </S> <S> of these , 90.6% had an arteriovenous fistula and 9.4% arteriovenous graft . 
</S> <S> access needling was by area technique for 65.8% , rope - ladder for 28.2% , and buttonhole for 6% . </S> <S> the most common direction of puncture was antegrade with bevel up ( 43.1% ) . </S> <S> a cox regression model was applied , adjusted for within - country effects , and defining as events the need for creation of a new vascular access . </S> <S> area cannulation was associated with a significantly higher risk of access failure than rope - ladder or buttonhole . </S> <S> retrograde direction of the arterial needle with bevel down was also associated with an increased failure risk . </S> <S> patient application of pressure during cannulation appeared more favorable for vascular access longevity than not applying pressure or using a tourniquet . </S> <S> the higher risk of failure associated with venous pressures under 100 or over 150 mm hg should open a discussion on limits currently considered acceptable . </S>"
4,"health is not only related to the absence of the disease , therefore we need to conceptualize and operationalize what health is . increasingly , we have come to understand that information about functional status is needed in order to appreciate the full picture regarding the health of an individual or a population . an individual 's health fundamentally includes their capacity to carry out the full range of actions , activities and tasks required to fully engage in all areas of human life . the health state of a person can be described in terms of capacity to carry out a set of tasks or actions . in addition , the health state also includes changes in body functions and/or structures arising from a health condition . the impact of the health state on a person 's life can be understood by measuring performance of tasks and actions in the person 's real - life or actual environment . the full picture of the health experience can further be appreciated by taking into cognizance the value that people place on levels of functioning in given domains in association with a health condition . plainly , the concept of functional status is integral to health and its achievement . two individuals with identical diagnoses may have utterly different levels of functioning that determine their actual health status . without fsi , our picture of the health of an individual , or a population , is flawed and incomplete . fsi has , of course , long been collected in various ways and used clinically , especially in rehabilitative medicine ; physical , occupational and speech and language therapy ; and in nursing home and home care settings . fsi is essential for needs assessment as well as the development and monitoring of rehabilitative interventions to restore or maintain functions . it is also essential in this area of health care because the aim of therapy is to assist patients in maximizing their capacities to perform activities needed for their lives . 
although no one doubts that restoring functioning is restoring health ( the ultimate purpose of all forms of health care ) some clinicians , focusing exclusively on acute - care needs , do not see the need to collect or utilize fsi . in most countries with a sophisticated health administrative data collection and utilization infrastructure , a wide variety of information what is often missing is information that would link diagnosis and treatment with health outcomes that are fully meaningful to the patient 's life , namely information about the presence of decrements in capacity to carry out tasks and actions in areas of life as well as how these decrements play out in the person 's actual , real - life environment ( deyo and patrick , 1989 ; lubetkin et al . , 2003 ) . there is growing recognition that there is a gap in health administrative records : the failure to collect or disseminate fsi across all health care settings . unless fsi becomes an essential part of administrative records , the potential value of these data will be lost , not merely to clinicians , but to health administrators concerned about management and quality of care issues , health researchers , and public health agencies . this insight is clearly expressed in a report by the national committee on vital and health statistics ( ncvhs ) ( 2001 ) : without functional status information , the researchers , policymakers , and others who are already using administrative data have at best a rough idea of how people , individually and collectively , are doing and at worst they are making erroneous assumptions and decisions . the report outlines in some detail the benefits of routinely collecting fsi across the entire health care delivery system and throughout all care settings . fsi can serve management needs of all the stakeholders in the health care system clinicians , providers , payers , patients , and government regulatory bodies . 
this is true especially with respect to evaluating outcomes , comparing treatment modalities , and predicting and managing costs . this links directly to debates of modes of service provision , single or multiple payer , managed care , fee - for - service , or some hybrid mixture . the policy and research applications of fsi are evident for local health management and quality control , and in the broader arena of public health . policy decisions about priorities must be made at the level of individual clinics or hospitals , local or regional health care agencies , or at the level of government planning and budgeting . given the importance of getting the complete picture of health outcomes , fsi is an essential input into evidence - based policy decisionmaking . researchers in all areas of health and social policy , at all levels , need valid and reliable data about functional status in order to make informed decisions . for example , it is a matter of debate whether , as the world 's population lives longer and ages , they will be unhealthy and pose a greater burden on health systems . there is some evidence suggesting that elderly persons today are functioning at higher levels than before . without reliable information on levels of functioning , this debate would be unresolvable because it would not be possible to detect functional status , since the disease morbidity may not have changed very much . compression of morbidity occurs when disability or decrement in functioning is postponed more than longevity is extended , as for example with the effects of exercise or better eating habits . the direct test of compression ( or extension ) of morbidity depends on the effects of reduced health risks on cumulative lifetime disability ( fries , 1980 , 2001 ; vita et al . , , fsi is a crucial element for the description of health states and quantification of overall health status in individuals that can be aggregated to a summary measure of population health . 
at the who , the use of fsi is in this area , in particular , because this data ( collected by the world health survey now in the field in more than 70 countries ) feeds into ongoing endeavors to determine levels and distributions of health this survey would be inconceivable without information on health outcomes that describe health on multiple dimensions in terms of levels of functioning in a parsimonious set of domains . it is commonly known that the demographic trends toward an older population , at least in developed countries , will create unprecedented burdens on all age - sensitive social policies , such as social security and other pensions , retirement , unemployment , and long - term care . aging , according to a recent organization for economic cooperation and development ( oecd ) ( 2001 ) report , is the principal factor currently driving pension spending costs . since age - sensitive social programming constitutes between 40 and 60 percent of total public spending , the impact of aging is considerable . to comprehend the nature and magnitude of its social impact , those responsible for policies from transportation and housing to employment and taxation , will need reliable data on functional status and how it plays out in the lives of the aging population . for fsi to be available for this wide variety of uses , however , it must be routinely and consistently collected across the entire health care delivery system , preferably in some electronic format . nonetheless , before contemplating the systemwide changes required to collect fsi , a classification that provides a common language and framework to describe the universe of functioning and disability is required . 
in order to complement the classification scheme , a comprehensive coding system that creates consistent and comparable data across all settings of care and a method of routinely capturing and disseminating these data ( in a mode and manner consistent with social interests in preserving privacy ) linked to measurement tools for clinical and related encounters the foundation of a new structure for collecting fsi is , therefore , a standard classification and coding system that will make it feasible for fsi to be included in administrative data . as the ncvhs report stated : while the international classification of diseases ( icd ) has served us well for more than a century in characterizing diagnoses , it is now time to complement it with a parallel system for characterizing functional status . although the committee argued that more research , analysis , testing , and demonstration projects are required before final recommendations can be made , it concluded that : the concepts and conceptual framework of the icf have promise as a code set for reporting functional status information in administrative records and computerized medical records . in the committee 's view , the icf is the only existing classification system that could be used to code functional status across the age span . in this article , we want to briefly describe the extensive international developmental process that lead to the revision of the original international classification of impairments , disabilities and handicaps ( icidh ) ( world health organization , 1980 ) and produced the icf . we also want to describe the basic principles and structure of the icf , in particular , to show its value in the context of collecting fsi for administrative records . the primary mandate of who is the production and dissemination of reliable and timely information about the health of populations . 
who 's 1947 constitution requires that : each member shall provide statistical and epidemiological reports in a manner to be determined by the health assembly . countries have long reported causes of death or mortality statistics based on who 's ( 1992 ) \n international statistical classification of diseases and related health problems ( icd-10 ) . though useful for calculating life expectancy for different countries , however , who recognized that these data did not capture the overall health status of living populations . missing was information about non - fatal health outcomes , i.e. , functioning and disability across all areas of life . to meet this need , who ( 1980 ) issued a tool for the classification of the consequences of disease , namely the icidh . a considerable academic literature built up around clinical and other uses of the icidh , but much of this literature was critical of the underlying model of disability . responding to these critiques and an international call for an updated version , who launched a revision process in 1993 to address what many viewed as an urgent international need for a framework for measuring and reporting the health as functional status at both individual and population levels . over the next 10 years , who 's international collaborating centers and governmental and non - governmental organizations , including groups representing persons with disabilities , engaged in the systematic revision of the icidh . from an exhaustive literature search of existing classifications and assessment tools , the who revision team developed a 3,000-plus item pool of potential classification domain names for areas of human functioning at the body , person , and social levels . all efforts were made to ensure that the icidh-2 , as it was initially named , would be a suitable classification for all domains of functioning associated with both physical and mental health conditions . 
adopting the strategy of computer software development , alpha and beta drafts were prepared from 1996 forward . the original 1980 icidh had only been approved for field - trial purposes . in light of that , the who team felt for icidh-2 to have the necessary credibility and legitimacy to serve as the international standard language of health and functioning , that the revision process should include several years of field trials and other tests . the first phase of field trials concentrated on the cross - cultural and linguistic applicability of the model and classificatory structure and language of the icidh-2 . the intent of this phase of field trials was to establish the conceptual and functional equivalence of the items contained within the classification . üstün et al. ( 1999a , b ; 2000 ) provide the rationale for the methodologies and presentation and analysis of the 15-country field trials . these results fed into further international collaboration in which the who team relied on a global network of who collaborating centers , non - governmental organizations , disability groups , and individual experts and key informants . the next revision phase began in 1999 when a series of expert drafting teams were assembled in geneva to produce the beta 2 draft . this draft was used for the second round of international field trials , these focusing on questions of reliability , utility , and feasibility of use . once the results of these tests were collected and analyzed , a pre - final draft was produced in early fall 2000 as a result of an intensive editing process grounded in the expert input being received from around the world . the icidh-2 , unlike its predecessor , was from the outset developed in multiple languages , primarily to identify and respond to cross - cultural and linguistic differences that might affect the usefulness of the classification . 
the collaborating centers and others provided constant input at this stage as the language and classification structures were redrafted and refined in multiple iterations . the draft was put on the internet for comment from a wide range of individuals , including both providers and consumers . after presentation before the executive board in december 2000 , the classification was put on the agenda of the fifty - fourth world health assembly and renamed the icf . the new title reflected the philosophy of moving beyond the consequence of disease approach and highlighted functioning as a component of health . in may 2001 , it was unanimously endorsed , and member states were urged to use the icf in their research , surveillance and reporting as appropriate . with its approval , the icf became a member of the who family of international classifications . whereas icd-10 provides the codes for mortality and morbidity , icf provides the codes to describe the complete range of functional states that capture the complete experience of health . the icd-10 and icf are , therefore , complementary and who encourages users to utilize both together , wherever applicable . this will ensure a more meaningful and complete picture of the health of people or populations . soon after its official release , who 's director general , gro harlem brundtland , announced that the icf is who 's framework for measuring health and disability at both the individual and population levels . who has already implemented icf as the basis for its extensive world health survey program , demonstrating its use as a global and universal tool . to improve health , tools are needed to measure health , and in particular to measure the changes in health brought about by interventions . icf is the ruler with which we will take precise measurements of health and disability . ( brundtland , 2002 . 
) from the public health perspective , the usefulness of icf goes beyond that of the measuring of population health and the effectiveness of internationally coordinated interventions funded by initiatives , such as the global fund to fight aids , tuberculosis and malaria . in addition , with the icf as their framework , countries will be able to identify social factors such as education , transportation , or housing , both as determinants of health , and social factors influenced by improvements in health . making these links will further support the relationship between health and economic development . in short , we have in the shape of a little red book , an extraordinarily versatile tool a swiss army knife for health ministries , researchers and decision - makers . ( brundtland , 2002 . ) undoubtedly the primary reason that icf can plausibly claim to be a universal tool for classifying states of functioning and disability is that the underlying model of the icf reflects our best understanding of the complex phenomena of functioning and disability in a manner that is , to the greatest extent possible , theory - neutral and therefore compatible with whichever theoretical account of how disability arises , at the individual and population levels , that evidence may confirm . it is the conceptual basis for the definition , measurement , and policy formulations for all aspects of disability . a paradigmatic shift in the thinking with regard to disability that is captured in the icf is the stress placed on health and levels of functioning . heretofore , disability has been construed as an all or none phenomenon : a distinct category to which an individual either belonged or not . the icf , on the other hand , presents disability as a continuum , relevant to the lives of all people to different degrees and at different times in their lives . 
disability is not something that happens only to a minority of humanity , it is a common ( indeed natural ) feature of the human condition . the icf is for all people , not just people traditionally referred to as disabled and isolated as a separate group . icf thus mainstreams the experience of disability and recognizes it as a universal human experience . by shifting the focus from cause to the full range of lived experiences , it places all health conditions on an equal footing , allowing them to be compared using a common metric the ruler of health and disability . from emphasizing people 's disabilities , and labeling people as disabled , we now focus on the level of health and functional capacity of all people . decrements in functioning may be the result of decrements in intrinsic capacity or problems with body functions or structures ; or they can result from features of the person 's physical , human - built or social environment that lead to problems in performance over and above decrements in capacity . very likely , decrements in functioning are the result of both processes . yet , the extent to which intrinsic decrements in capacity or environmental factors are the cause is not a matter that can be determined a priori . moreover , icf is grounded in the principle of universality , namely that functioning and disability are applicable to all people , irrespective of health condition , and in particular that disability or decrement in functioning at one or more levels is not the mark of a specific minority class of people , but is a feature of the human condition , which is , epidemiologically speaking , over the lifespan , a universal phenomena . in addition , icf is committed to the principle of parity , which states that the functional status is not determined by background etiology , and in particular by whether one has a physical rather than mental health condition . 
much time , effort , and international collaboration has gone into the development of the icf . it is no longer plausible to insist that the icf is a medical classification of people with disability , that it reduces all issues of functional status to underlying medical conditions , that it ignores the often salient role of the physical and social environment in the creation of restrictions of participation experienced by persons with functional problems . the revision process has produced a classification that has already stood up to rigorous tests of validity , reliability , and cross - cultural applicability . it is , as the ncvhs has concluded , the only existing classification system that could be used to code functional status across the age span . we now turn to the structure of icf as a classification system , in part to show why the committee has correctly assessed the value of the icf as a coding system for functional status , suitable for use in administrative records . the model that informs icf , portrays functioning and decrements in functioning , or disability , as a dynamic interaction between health conditions ( diseases , disorders , and injuries ) and contextual factors . contextual factors include environmental factors , that is , all aspects of the physical , human - built , social , and attitudinal environment that create the lived experience of functioning and disability . although not classified in icf , contextual factors also include personal factors such as sex , age , coping styles , social background , education , and overall behavior patterns that may influence how disability is experienced by the individual . the terms functioning and disability in the icf are the general or umbrella terms for , respectively , the positive and negatives aspects of the interaction between an individual ( with a health condition ) and that individual 's contextual factors ( environmental and personal factors ) . 
in the icf , health condition is the umbrella term for disease ( acute or chronic ) , disorder , injury or trauma . a health condition may also include other circumstances such as pregnancy , aging , stress , congenital anomaly , or genetic predisposition . the icf interactive model identifies three levels of human functioning : functioning at the level of body or body part , the whole person , and the whole person in their complete environment . these levels in turn define three aspects of functioning : body functions and structures , activities , and participation . disability similarly denotes a decrement in functioning at one or more of these levels that is , an impairment , activity limitation or participation restrictions . table 2 shows the complete list of all of the chapters found in the three classifications included in icf . under each of these chapters are second , third , and in some instances , fourth levels of categories , arranged in a hierarchical , tree - branch - stem - leaf , arrangement . this structure makes it possible for icf to be used as a classification tool for systematically describing situations of human functioning and problems with functioning . this complex information is organized by icf by means of a hierarchical coding system , thereby creating a common international language for functioning and disability . icf organizes information by means of several classifications distributed into two parts : ( 1 ) a component of functioning and disability that includes the component of the body with the body function and body structure classifications , and the component of activities and participation that includes all domains denoting aspects of functioning from an individual and social perspective organized into a single classification , and ( 2 ) a component of contextual factors that has a list of environmental factors organized from the individual 's most immediate to the wider environment . 
the classifications in the first part identify all of the domains of functioning from basic physiological functions and body structures , to simple and complex actions , tasks , social performances and relationships . the environmental factors list provides a tool for identifying those features of a person 's physical , human - built , social and attitudinal environment that , in interaction with the domains of functioning , constitute the complete lived experience of human functioning and disability . within the contextual factors part , besides the environmental factors , the icf recognizes the existence of personal factors as another component , but provides no classification of these . domains are a practical , meaningful set of related physiological functions , anatomical structures , actions , tasks , or areas of life . domains make up the different chapters and blocks within each component ( world health organization , 2001 ) . in order for these domains to capture descriptive information about functioning and disability in particular cases , they must be used in conjunction with qualifiers that record the presence and severity of a problem or decrement in functioning at the body , person , and social levels . for the classifications of body function and structure , the primary qualifier indicates the presence of an impairment and , on a five - point scale , the degree of the impairment of function or structure ( no impairment , mild , moderate , severe , and complete ) . in the case of the activity and participation list of domains , two essential qualifiers are provided to capture the full range of relevant information about disability . the performance qualifier is used to describe what an individual does in their current or actual environment , including whatever assistive devices or other accommodations the person may use to perform actions or tasks and whatever barriers and hindrances exist in the person 's actual environment . 
because the current environment always incorporates the overall social context , performance might be understood as involvement in the lived experience of disability . the capacity qualifier describes an individual 's inherent ability to execute a task or an action . operationally , this qualifier identifies the highest probable level of functioning of a person in a given functional domain at a given moment without any specific assistance . for measurement purposes , this level of capacity presumes a standardized assessment environment , namely one that reveals the inherent capacity of a person in a specific functional domain without any particular enhancements . the environmental factors list can be used to describe such a standard assessment environment in order to ensure that results across different studies can be compared by holding this environment constant . intuitively , the performance qualifier captures what people actually do in their lives , whereas the capacity qualifier identifies the person 's inherent capacity without explicit environmental facilitation ( or hindrance ) . who is developing a standard application guide that will operationalize the constructs of capacity and performance with respect to individual items that form the classification . table 3 shows how data can be organized to reflect the role of these two qualifiers used for the domains of the activity and participation classification . as a general matter of describing functioning and disability phenomena fully and accurately , the performance / capacity having access to both performance and capacity data enables icf users to determine the gap between capacity and performance . if capacity is less than performance , then the person 's actual or current environment has enabled him or her to perform better than what data about their capacity would predict : the environment has facilitated performance . 
on the other hand , if capacity is greater than performance , then some aspect of the environment is acting as a barrier to a level of performance that is feasible in a more suitable environment . icf thus makes it possible to measure the effect of a person 's environment on their decrement in functioning , given their health condition . the environmental factors classification can be used to identify specific features of the person 's actual environment that are barriers or facilitators in general for the person or with specific regard to each item of the person 's body functions , body structures or activities and participation that have been described . it can also be used , as previously stated , to describe specific testing environments where capacity has been measured . for its use as a classification of functional status relevant for health administrative records , icf provides a complete classification of both body and person level domains of functioning . given that it has been designed for a multiplicity of uses and users , there is far more in icf than could ever be plausibly integrated into a viable coding system for health records , although it remains the ultimate lexicon to which any coder , for clinical or research purposes , could turn . clearly , for implementation purposes in this area , a simplified checklist of items is needed . such a checklist was produced and used during the beta 1 and 2 field - testing phase in the revision process ( world health organization , 2001 ) . this checklist , which takes less than 30 minutes to complete , is currently being extensively tested in clinical studies in different disorders in order to study its feasibility , reliability , and concurrent validity with existing assessment instruments as part of a larger project to define core sets of items that may be used in rehabilitation settings for specific conditions and across several disorders ( stucki et al . , 2002 ) . 
the core sets of items with their corresponding scales could also be then converted into even shorter assessment instruments . the challenge for incorporating the icf into clinical and administrative records beyond a lexicon and framework lies in identifying this parsimonious set of domains or items that captures decrements in functioning across different health conditions and a smaller subset of domains or items that uniquely describe the decrements of functioning that typify a given health condition . in addition , the mapping of instruments ( that measure functioning and disability that are already in use ) onto icf categories will allow a ready crosswalk between measurements already being made at points of encounter to a common framework ( cieza et al . , 2002 ) . the use of the icf in larger population based surveys will also provide data on norms and distributions of health , functioning and disability that will enable the setting of appropriate thresholds for a multitude of purposes . table 4 maps the domains of the icf that have been included in different waves of the world health survey that ought to be included as a minimum or ideal set for information systems . these domains are also included on the icf checklist , which is designed to be a clinical tool . primary data collection strategies with regard to functional status , in a manner that is truly comparable , are in their infancy especially for international use and for use across population groups . further tools need to be developed , and standards and procedures established , so that these data become meaningful and usable . as a final issue , it must be mentioned that the icf has been conceived as a dynamic classification that will not only serve multiple users requiring different levels of detail , but also will continue to evolve with advancements in science . 
the classification is flexible in its structure such that it can be expanded in the level of detail ( for example , the fourth level ) for specific uses , or new codes added where gaps have been left in the numbering system . a set of operational rules will specify the procedure for this evidence - based expansion , adaptation , or revision of the classification . a common language for describing fsi is the key to ensuring comparability of data from a myriad of sources as well as in providing users with a tool for precise and accurate communication with each other . the recognition that a description of health and health - related outcomes must go beyond a narrow view of health restricted to the absence of disease , as well as that the definition of disability must move beyond the narrow impairment - based view that has been traditionally adopted to define a minority population , will go a long way in bridging the gap between health and disability data . it will also fill the void in existing health outcomes data while measuring the impact of interventions and monitoring them over time . health records must include functioning information in order to ensure a complete description of health states . the icf is the common language and framework that users will employ from now on . in the same way that all languages grow , evolve , and flourish over time and are adapted and modified to express new ideas , the icf will have a multitude of applications where it will be creatively used such that it continues to be a living classification . as with all new languages , it will be important to develop tools to learn this new language . toward this end , who is developing standardized application manuals and web - based learning courses that will use state - of - the art pedagogic methodology to assist end users . 
its usefulness in describing functional health status information will be one of the measures of its success .","<S> a common framework for describing functional status information ( fsi ) in health records is needed in order to make this information comparable and of value . </S> <S> the world health organization 's ( who 's ) international classification of functioning , disability and health ( icf ) , which has been approved by all its member states , provides this common language and framework . the biopsychosocial model of functioning and disability embodied in the icf goes beyond disease and conceptualizes functioning from the individual 's body , person , and lived experience vantage points , thereby allowing for planning interventions targeted at the individual 's body , the individual as a whole or toward the environment . </S> <S> this framework then permits the evaluation of both the effectiveness and cost effectiveness of these different interventions in devising programs at the personal or societal level . </S>"


The metric is an instance of [`datasets.Metric`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Metric):

In [13]:
metric

Metric(name: "rouge", features: {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}, usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each predictions
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLSum"`: rougeLsum splits text using `"\n"`.
        See details in https://github.com/huggingface/datasets/issues/617
    use_stemmer: Bool indicating whether Porter stemmer should be used to strip word suffixes.
    use_agregator: Return aggregates if this is set to True
Retu

You can call its `compute` method with your predictions and labels, which need to be lists of decoded strings:

In [14]:
fake_preds = ["hello there", "general kenobi"]
fake_labels = ["hello there", "general kenobi"]
metric.compute(predictions=fake_preds, references=fake_labels)

{'rouge1': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rouge2': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeL': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeLsum': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0))}
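To make the numbers above more concrete, here is a minimal sketch of how a ROUGE-1 score can be computed for a single prediction/reference pair. It is a simplified illustration, not the `rouge_score` implementation: it assumes lowercased whitespace tokenization, applies no stemming and does no bootstrap aggregation, and the helper name `rouge1` is hypothetical.

```python
from collections import Counter

def rouge1(prediction, reference):
    """Unigram-overlap precision, recall and F-measure (simplified ROUGE-1)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    precision = overlap / len(pred_tokens) if pred_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        fmeasure = 0.0
    else:
        fmeasure = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}

identical = rouge1("hello there", "hello there")
disjoint = rouge1("hello there", "general kenobi")
print(identical, disjoint)
```

Identical strings score 1.0 on every field, which matches the perfect scores returned for `fake_preds` and `fake_labels`, while texts with no shared words score 0.0.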

## Preprocessing the data

Before we can feed those texts to our model, we need to preprocess them. This is done by a 🤗 `Transformers` `Tokenizer`, which will (as the name indicates) tokenize the inputs (including converting the tokens to their corresponding IDs in the pretrained vocabulary) and put them in the format the model expects, as well as generate the other inputs that the model requires.

To do all of this, we instantiate our tokenizer with the `AutoTokenizer.from_pretrained` method, which will ensure:

- we get a tokenizer that corresponds to the model architecture we want to use,
- we download the vocabulary used when pretraining this specific checkpoint.

That vocabulary will be cached, so it's not downloaded again the next time we run the cell.

To tokenize the inputs for this particular model, we need to have `sentencepiece` installed.

In [15]:
! pip install sentencepiece



Now we can instantiate our tokenizer.

In [16]:
from transformers import AutoTokenizer
    
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

By default, the call above will use one of the fast tokenizers (backed by Rust) from the 🤗 `Tokenizers` library.

You can directly call this tokenizer on one sentence or a pair of sentences:

In [17]:
tokenizer("Hello, this one sentence!")

{'input_ids': [8087, 108, 136, 156, 5577, 147, 1], 'attention_mask': [1, 1, 1, 1, 1, 1, 1]}

Depending on the model you selected, you will see different keys in the dictionary returned by the cell above. They don't matter much for what we're doing here (just know they are required by the model we will instantiate later). You can learn more about them in [this tutorial](https://huggingface.co/transformers/preprocessing.html) if you're interested.

Instead of one sentence, we can pass along a list of sentences:

In [18]:
tokenizer(["Hello, this one sentence!", "This is another sentence."])

{'input_ids': [[8087, 108, 136, 156, 5577, 147, 1], [182, 117, 372, 5577, 107, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}

To prepare the targets for our model, we need to tokenize them inside the `as_target_tokenizer` context manager. This will make sure the tokenizer uses the special tokens corresponding to the targets:

In [19]:
with tokenizer.as_target_tokenizer():
    print(tokenizer(["Hello, this one sentence!", "This is another sentence."]))

{'input_ids': [[8087, 108, 136, 156, 5577, 147, 1], [182, 117, 372, 5577, 107, 1]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]}


If you are using one of the five T5 checkpoints, we have to prefix the inputs with "summarize:" (the model can also translate, and it needs the prefix to know which task it has to perform).

In [20]:
if model_checkpoint in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""

We can then write the function that will preprocess our samples. We just feed them to the `tokenizer` with the argument `truncation=True`. This ensures that an input longer than what the selected model can handle is truncated to the maximum length accepted by the model. Padding is dealt with later on (in a data collator), so that we pad examples to the longest length in the batch and not in the whole dataset.

The max input length of `google/pegasus-arxiv` is 1024, so `max_input_length = 1024`.

In [21]:
max_input_length = 1024
max_target_length = 256

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["article"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["abstract"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

This function works with one or several examples. In the case of several examples, the tokenizer will return a list of lists for each key:

In [22]:
preprocess_function(raw_datasets['train'][:2])

{'input_ids': [[126, 4403, 115, 154, 197, 4567, 113, 1044, 111, 218, 1111, 8895, 115, 878, 1020, 113, 15791, 110, 108, 704, 115, 1044, 12857, 16020, 111, 191, 490, 7755, 2495, 107, 740, 32680, 117, 3365, 130, 142, 14069, 22021, 476, 113, 58117, 143, 110, 55654, 110, 158, 143, 110, 55654, 110, 105, 665, 3957, 943, 110, 20815, 110, 158, 111, 218, 6860, 130, 114, 711, 113, 109, 5910, 1568, 110, 108, 11300, 110, 108, 2111, 5173, 110, 108, 16020, 110, 108, 132, 7755, 2495, 110, 107, 8823, 1683, 2298, 120, 5690, 111, 49159, 233, 2881, 562, 244, 7755, 2495, 110, 108, 704, 115, 693, 111, 3464, 15791, 110, 108, 218, 129, 12409, 141, 32680, 107, 6304, 32680, 432, 64142, 2775, 253, 130, 8466, 110, 108, 10353, 110, 108, 111, 35368, 1379, 28247, 110, 108, 111, 2297, 218, 133, 114, 2404, 1298, 124, 348, 113, 271, 143, 15593, 6045, 110, 158, 111, 637, 1932, 115, 1044, 122, 1695, 110, 107, 2297, 110, 108, 112, 927, 1312, 7233, 110, 108, 15593, 6045, 110, 108, 111, 32261, 115, 1044, 122, 1695, 110, 108

To apply this function to all the examples in our dataset, we just use the `map` method of the `raw_datasets` object we created earlier. This will apply the function to all the elements of all the splits in `raw_datasets`, so our training, validation and testing data will be preprocessed in one single command.

In [23]:
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

Even better, the results are automatically cached by the 🤗 `Datasets` library to avoid spending time on this step the next time you run your notebook. The 🤗 `Datasets` library is normally smart enough to detect when the function you pass to `map` has changed (and thus that the cached data must not be used). For instance, it will properly detect if you change the task in the first cell and rerun the notebook. 🤗 `Datasets` warns you when it uses cached files. You can pass `load_from_cache_file=False` in the call to `map` to ignore the cached files and force the preprocessing to be applied again.

Note that we passed `batched=True` to encode the texts by batches together. This is to leverage the full benefit of the fast tokenizer we loaded earlier, which will use multi-threading to treat the texts in a batch concurrently.
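As a rough illustration of what `batched=True` means for the function being mapped, the sketch below emulates the behavior in plain Python. It is not the 🤗 `Datasets` implementation: `batched_map` and `toy_preprocess` are hypothetical helpers, whitespace splitting stands in for the real tokenizer, and the tiny length limits are chosen only to make truncation visible.

```python
def toy_preprocess(examples, max_input_length=8, max_target_length=4):
    # `examples` is a dict of lists (one list per column), as in a batched map call.
    # Whitespace splitting stands in for the real tokenizer; slicing mimics truncation.
    input_ids = [doc.split()[:max_input_length] for doc in examples["article"]]
    labels = [doc.split()[:max_target_length] for doc in examples["abstract"]]
    return {"input_ids": input_ids, "labels": labels}

def batched_map(rows, fn, batch_size=2):
    """Apply `fn` to chunks of rows, handing it a dict of lists each time."""
    output = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Turn a list of row dicts into one dict of column lists.
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        result = fn(batch)
        # Split the processed columns back into individual rows.
        for i in range(len(chunk)):
            output.append({key: values[i] for key, values in result.items()})
    return output

rows = [
    {"article": "a b c d e f g h i j", "abstract": "x y z w v"},
    {"article": "one two three", "abstract": "uno dos"},
    {"article": "alpha beta", "abstract": "gamma"},
]
processed = batched_map(rows, toy_preprocess)
print(processed[0])
```

Each call to `toy_preprocess` sees up to `batch_size` examples at once, which is what lets a fast tokenizer process the texts of a batch together.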

## Fine-tuning the model

Now that our data is ready, we can download the pretrained model and fine-tune it. Since our task is of the sequence-to-sequence kind, we use the `AutoModelForSeq2SeqLM` class. Like with the tokenizer, the `from_pretrained` method will download and cache the model for us.

In [24]:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

Note that we don't get a warning like in our classification example. This means we used all the weights of the pretrained model and there is no randomly initialized head in this case.

To instantiate a `Seq2SeqTrainer`, we will need to define three more things. The most important is the [`Seq2SeqTrainingArguments`](https://huggingface.co/transformers/main_classes/trainer.html#transformers.Seq2SeqTrainingArguments), which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model, and all other arguments are optional:

In [25]:
batch_size = 2
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-pubmed",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=5,
    predict_with_generate=True,
    push_to_hub=True,
    seed=42,
)

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the `batch_size` defined at the top of the cell and customize the weight decay. Since the `Seq2SeqTrainer` will save the model regularly, we tell it to keep at most three checkpoints. We train for five epochs with a fixed seed, and we use the `predict_with_generate` option to properly generate summaries during evaluation. Mixed precision training is not activated here, since `fp16=True` is not passed in the arguments above.

The `push_to_hub=True` argument sets everything up so we can push the model to the [Hub](https://huggingface.co/models) regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally under a name that is different from the name of the repository it will be pushed to, or if you want to push your model under an organization rather than your own namespace, use the `hub_model_id` argument to set the repo name (it needs to be the full name, including your namespace: for instance `"sgugger/t5-finetuned-xsum"` or `"huggingface/t5-finetuned-xsum"`).

Then, we need a special kind of data collator, which will not only pad the inputs to the maximum length in the batch, but also the labels:

In [26]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
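The padding behavior described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the `DataCollatorForSeq2Seq` code: `pad_batch` is a hypothetical helper, a pad token ID of 0 is assumed, and labels are padded with -100, the value that is later replaced in `compute_metrics` and that the loss ignores.

```python
LABEL_PAD_ID = -100  # label positions with this value are ignored by the loss

def pad_batch(features, pad_token_id=0):
    """Pad inputs and labels to the longest sequence in this batch only."""
    max_input = max(len(f["input_ids"]) for f in features)
    max_label = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_input - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        # 1 marks real tokens, 0 marks padding.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
        n_label_pad = max_label - len(f["labels"])
        batch["labels"].append(f["labels"] + [LABEL_PAD_ID] * n_label_pad)
    return batch

features = [
    {"input_ids": [8087, 108, 136, 156, 5577, 147, 1], "labels": [182, 117, 1]},
    {"input_ids": [182, 117, 372, 5577, 107, 1], "labels": [8087, 1]},
]
padded = pad_batch(features)
print(padded)
```

Because the target length is computed per batch, short batches stay short instead of being padded to the longest example in the dataset.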

The last thing to define for our `Seq2SeqTrainer` is how to compute the metrics from the predictions. We need to define a function for this, which will just use the `metric` we loaded earlier, and we have to do a bit of pre-processing to decode the predictions into texts:

In [27]:
import nltk
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    
    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]
    
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    # Extract a few results
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}
    
    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)
    
    return {k: round(v, 4) for k, v in result.items()}
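For readers less familiar with NumPy, the two array expressions in `compute_metrics` can be spelled out in plain Python. This is a sketch for illustration only: the helper names `restore_pad_ids` and `mean_generated_length` are hypothetical, and a pad token ID of 0 is assumed in the example values.

```python
IGNORE_INDEX = -100  # padding value used for labels

def restore_pad_ids(labels, pad_token_id):
    # Equivalent in spirit to np.where(labels != -100, labels, pad_token_id).
    return [
        [pad_token_id if token == IGNORE_INDEX else token for token in sequence]
        for sequence in labels
    ]

def mean_generated_length(predictions, pad_token_id):
    # Count the non-padding tokens of each prediction, then average the counts.
    lengths = [
        sum(1 for token in sequence if token != pad_token_id)
        for sequence in predictions
    ]
    return sum(lengths) / len(lengths)

restored = restore_pad_ids([[182, 117, 1, -100, -100]], pad_token_id=0)
gen_len = mean_generated_length([[5, 6, 1, 0, 0], [7, 1, 0, 0, 0]], pad_token_id=0)
print(restored, gen_len)
```

The replacement is needed because -100 is not a valid vocabulary ID, so the tokenizer cannot decode it.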

Then we just need to pass all of this along with our datasets to the `Seq2SeqTrainer`:

In [28]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Cloning https://huggingface.co/Kevincp560/pegasus-arxiv-finetuned-pubmed into local empty directory.



We can now finetune our model by just calling the `train` method:

In [29]:
trainer.train()

The following columns in the training set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 2000
  Num Epochs = 5
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 5000


| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
|---|---|---|---|---|---|---|---|
| 1 | 2.65 | 1.984835 | 40.6984 | 16.387 | 25.0097 | 36.4831 | 215.294 |
| 2 | 2.1317 | 1.852423 | 43.6431 | 18.6794 | 26.7571 | 39.6642 | 224.646 |
| 3 | 2.0591 | 1.825366 | 43.6707 | 18.5176 | 26.6015 | 39.6325 | 225.894 |
| 4 | 2.0109 | 1.813821 | 44.1244 | 18.8866 | 26.8313 | 40.0913 | 229.656 |
| 5 | 1.9894 | 1.811831 | 44.286 | 19.0477 | 27.1122 | 40.2609 | 230.586 |


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-500
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-500/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-500/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-500/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-500/special_tokens_map.json
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-1000
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-1000/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-1000/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-1000/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-1000/special_tokens_map.json
The following columns in the evaluation set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 500
  Batch size = 2
Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-1500
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-1500/config.json
Model weights saved in pegasus-arxiv-finet

tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-2000
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-2000/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-2000/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-2000/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-2000/special_tokens_map.json
Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-500] due to args.save_total_limit
The following columns in the evaluation set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 500
  Batch size = 2
Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-2500
Configuration saved i

tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-1000] due to args.save_total_limit


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-3000
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-3000/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-3000/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-3000/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-3000/special_tokens_map.json
Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-1500] due to args.save_total_limit
The following columns in the evaluation set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 500
  Batch size = 2
Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-3500
Configuration saved 

tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-2000] due to args.save_total_limit


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-4000
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-4000/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-4000/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-4000/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-4000/special_tokens_map.json
Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-2500] due to args.save_total_limit
The following columns in the evaluation set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 500
  Batch size = 2
Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-4500
Configuration saved 

tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-3000] due to args.save_total_limit


Saving model checkpoint to pegasus-arxiv-finetuned-pubmed/checkpoint-5000
Configuration saved in pegasus-arxiv-finetuned-pubmed/checkpoint-5000/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/checkpoint-5000/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-5000/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/checkpoint-5000/special_tokens_map.json
Deleting older checkpoint [pegasus-arxiv-finetuned-pubmed/checkpoint-3500] due to args.save_total_limit
The following columns in the evaluation set  don't have a corresponding argument in `PegasusForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `PegasusForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 500
  Batch size = 2


Training completed. Do not forget to share your model on huggingface.co/models =)




TrainOutput(global_step=5000, training_loss=2.2452167724609375, metrics={'train_runtime': 8517.4138, 'train_samples_per_second': 1.174, 'train_steps_per_second': 0.587, 'total_flos': 2.885677635649536e+16, 'train_loss': 2.2452167724609375, 'epoch': 5.0})
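As a sanity check, the step count and throughput figures in the log can be reproduced from the reported values. The sketch below assumes a single device and no gradient accumulation, as the log states (`Gradient Accumulation steps = 1`); the variable names are illustrative and are not part of the `Trainer` API:

```python
import math

# Values copied from the training log above.
num_examples = 2000
num_epochs = 5
per_device_batch_size = 2
grad_accum_steps = 1
train_runtime = 8517.4138  # seconds, from TrainOutput.metrics

# One optimizer step consumes batch_size * grad_accum_steps examples.
steps_per_epoch = math.ceil(num_examples / (per_device_batch_size * grad_accum_steps))
total_steps = steps_per_epoch * num_epochs

# Throughput over the whole run.
samples_per_second = num_examples * num_epochs / train_runtime
steps_per_second = total_steps / train_runtime

print(total_steps)                   # 5000
print(round(samples_per_second, 3))  # 1.174
print(round(steps_per_second, 3))    # 0.587
```

These match `Total optimization steps = 5000`, `train_samples_per_second`, and `train_steps_per_second` in the output, and the runtime of about 8517 seconds corresponds to roughly 2 hours 22 minutes.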

You can now upload the result of the training to the Hub by executing this instruction:

In [None]:
trainer.push_to_hub()

Saving model checkpoint to pegasus-arxiv-finetuned-pubmed
Configuration saved in pegasus-arxiv-finetuned-pubmed/config.json
Model weights saved in pegasus-arxiv-finetuned-pubmed/pytorch_model.bin
tokenizer config file saved in pegasus-arxiv-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in pegasus-arxiv-finetuned-pubmed/special_tokens_map.json


Upload file pytorch_model.bin:   0%|          | 32.0k/2.13G [00:00<?, ?B/s]

You can now share this model with all your friends, family, and favorite pets: they can all load it with the identifier `"your-username/the-name-you-picked"`, for instance:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("sgugger/my-awesome-model")
```