If you're opening this notebook on Colab, you will probably need to install 🤗 `Transformers` and 🤗 `Datasets`, as well as the other dependencies listed below:

* `datasets`
* `transformers`
* `rouge-score`
* `nltk`
* `pytorch`
* `ipywidgets`

*Note*: Since we train the model on a GPU to speed things up, `CUDA` needs to be installed on the device.
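
A quick way to confirm that a GPU is visible before training can be sketched as follows. This assumes PyTorch is the backend; the helper name `cuda_is_available` is hypothetical and not part of the original notebook, and it simply reports `False` when `torch` is not installed.

```python
import importlib.util


def cuda_is_available() -> bool:
    """Return True only if PyTorch is installed and can see a CUDA device."""
    # Hypothetical helper: avoid an ImportError when torch is missing.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return bool(torch.cuda.is_available())


print("CUDA available:", cuda_is_available())
```

If this prints `False` on Colab, the runtime type is probably not set to GPU.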

In [1]:
! pip install datasets transformers rouge-score nltk ipywidgets

Collecting datasets
  Downloading datasets-1.18.3-py3-none-any.whl (311 kB)
Collecting transformers
  Downloading transformers-4.17.0-py3-none-any.whl (3.8 MB)
Collecting rouge-score
  Downloading rouge_score-0.0.4-py2.py3-none-any.whl (22 kB)
Collecting fsspec[http]>=2021.05.0
  Downloading fsspec-2022.2.0-py3-none-any.whl (134 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
Collecting xxhash
  Downloading xxhash-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
Collecting huggingface-hub<1.0.0,>=0.1.0
  Downloading huggingface_hub-0.4.0-p

When using `nltk`, the `punkt` tokenizer data also needs to be downloaded, since it is not installed automatically with the package. Without `punkt`, the analysis will fail with an error.

In [2]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

If you're opening this notebook locally, make sure your environment has the latest version of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First, you have to store your authentication token from the Hugging Face website (sign up [here](https://huggingface.co/join) if you haven't already!). Then execute the following cell and input your username and password:

In [3]:
from huggingface_hub import notebook_login

notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-credential store but this isn't the helper defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default

git config --global credential.helper store


Then you need to install `Git-LFS`.

If you are not using `Google Colab`, you may need to install `Git-LFS` manually, since the command below may not work, depending on your operating system. You can read about `Git-LFS` and how to install it [here](https://git-lfs.github.com/).

In [4]:
! apt install git-lfs

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  libnvidia-common-470
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  git-lfs
0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.
Need to get 2,129 kB of archives.
After this operation, 7,662 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 git-lfs amd64 2.3.4-1 [2,129 kB]
Fetched 2,129 kB in 1s (2,456 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 155320 files and directories currently installed.)
Preparing to unpack .../git-lfs_2.3.4-1_amd64.deb ...
Unpacking git-lfs (2.3.4-1) ...
Setting up git-lfs (2.3.4-1) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
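
Before installing, it can be useful to check whether `Git-LFS` is already present. The following is a minimal sketch using only the Python standard library; the helper name `git_lfs_installed` is hypothetical, and it assumes the executable is named `git-lfs` and is on the `PATH`.

```python
import shutil


def git_lfs_installed() -> bool:
    """Return True if a `git-lfs` executable is found on the PATH."""
    # Hypothetical helper; `shutil.which` returns None when nothing is found.
    return shutil.which("git-lfs") is not None


if git_lfs_installed():
    print("git-lfs is already installed")
else:
    print("git-lfs not found; install it for your operating system")
```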


Make sure your version of `Transformers` is at least 4.11.0, since the functionality for sharing models on the Hub used in this notebook was introduced in that version:

In [5]:
import transformers

print(transformers.__version__)

4.17.0
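
Rather than checking the printed version by eye, the comparison can be made explicit. The sketch below uses a hypothetical helper, `version_tuple`, that only understands plain dotted versions; for anything more complex, a dedicated version-parsing library would be a safer choice. The `installed` string is hard-coded here so the example runs on its own.

```python
def version_tuple(version: str) -> tuple:
    """Turn '4.17.0' into (4, 17, 0), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


MINIMUM = "4.11.0"
installed = "4.17.0"  # in the notebook this would be transformers.__version__

# Tuples compare element by element, so (4, 9, 2) < (4, 11, 0) as intended.
assert version_tuple(installed) >= version_tuple(MINIMUM), (
    f"Transformers {installed} is older than the required {MINIMUM}"
)
```

Comparing tuples of integers avoids the pitfall of comparing strings, where `"4.9.0" > "4.11.0"`.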


You can find a script version of this notebook to fine-tune your model in a distributed fashion using multiple GPUs or TPUs [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq).

# Fine-tuning a model on a summarization task

In this notebook, we will see how to fine-tune one of the [🤗`Transformers`](https://github.com/huggingface/transformers) models for a summarization task. We will use the [PubMed Summarization dataset](https://huggingface.co/datasets/ccdv/pubmed-summarization), which contains PubMed articles accompanied by their abstracts.

![Widget inference on a summarization task](https://github.com/huggingface/notebooks/blob/master/examples/images/summarization.png?raw=1)

We will see how to easily load the dataset for this task using 🤗 `Datasets` and how to fine-tune a model on it using the `Trainer` API.

In [6]:
model_checkpoint = "sshleifer/distilbart-cnn-12-6"

This notebook is built to run with any model checkpoint from the [Model Hub](https://huggingface.co/models), as long as that model has a sequence-to-sequence version in the Transformers library. Here we picked the [`sshleifer/distilbart-cnn-12-6`](https://huggingface.co/sshleifer/distilbart-cnn-12-6) checkpoint.

## Loading the dataset

We will use the [🤗 `Datasets`](https://github.com/huggingface/datasets) library to download the data and get the metric we need to use for evaluation (to compare our model to the benchmark). This can be easily done with the functions `load_dataset` and `load_metric`.  

In [7]:
from datasets import load_dataset, load_metric

raw_datasets = load_dataset("ccdv/pubmed-summarization")
metric = load_metric("rouge")

Downloading:   0%|          | 0.00/4.88k [00:00<?, ?B/s]

No config specified, defaulting to: pub_med_summarization_dataset/document


Downloading and preparing dataset pub_med_summarization_dataset/document to /root/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30...


Downloading:   0%|          | 0.00/779M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/43.7M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/43.8M [00:00<?, ?B/s]

0 examples [00:00, ? examples/s]

0 examples [00:00, ? examples/s]

0 examples [00:00, ? examples/s]

Dataset pub_med_summarization_dataset downloaded and prepared to /root/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

Downloading:   0%|          | 0.00/2.16k [00:00<?, ?B/s]
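
To give an intuition for what the ROUGE metric measures, here is a deliberately simplified sketch of a ROUGE-1 F-measure based on unigram overlap. It is not the implementation used by the `rouge-score` package loaded above; the function name `rouge1_f` and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def rouge1_f(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F-measure: unigram overlap on whitespace tokens."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f("the cat sat on the mat", "the cat is on the mat")
print(round(score, 3))  # 5 of 6 tokens overlap on each side -> 0.833
```

A generated summary that shares more words with the reference abstract scores higher, which is why ROUGE is a common choice for summarization.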

The `raw_datasets` object itself is a [`DatasetDict`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasetdict), which contains one key each for the training, validation and test sets:

In [8]:
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['article', 'abstract'],
        num_rows: 119924
    })
    validation: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6633
    })
    test: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6658
    })
})

To access an actual element, you need to select a split first, then give an index:

In [9]:
raw_datasets["train"][0]

{'abstract': "<S> background : the present study was carried out to assess the effects of community nutrition intervention based on advocacy approach on malnutrition status among school - aged children in shiraz , iran.materials and methods : this case - control nutritional intervention has been done between 2008 and 2009 on 2897 primary and secondary school boys and girls ( 7 - 13 years old ) based on advocacy approach in shiraz , iran . </S> <S> the project provided nutritious snacks in public schools over a 2-year period along with advocacy oriented actions in order to implement and promote nutritional intervention . for evaluation of effectiveness of the intervention growth monitoring indices of pre- and post - intervention were statistically compared.results:the frequency of subjects with body mass index lower than 5% decreased significantly after intervention among girls ( p = 0.02 ) . </S> <S> however , there were no significant changes among boys or total population . </S> <S> 

Since the `pubmed` data is very large (about 120,000 training examples), we keep only a subset of the rows: 8,000 for training, 2,000 for validation, and 2,000 for testing.

In [10]:
raw_datasets["train"] = raw_datasets["train"].select(range(1, 8001))
raw_datasets["validation"] = raw_datasets["validation"].select(range(1, 2001))
raw_datasets["test"] = raw_datasets["test"].select(range(1, 2001))
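
The index arithmetic behind these `select` calls can be checked without loading any data. The variable names below are only for illustration.

```python
train_indices = range(1, 8001)
eval_indices = range(1, 2001)

# `range(start, stop)` excludes `stop`, so these hold 8,000 and 2,000 indices.
print(len(train_indices), len(eval_indices))

# The first kept row is index 1, so row 0 of each split is left out.
print(train_indices[0], train_indices[-1])
```

If row 0 should be included as well, `range(8000)` and `range(2000)` would give subsets of the same size starting from the first row.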

To get a sense of what the data looks like, the following function will show some examples picked randomly in the dataset.

In [11]:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=5):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))
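
The retry loop used to pick distinct indices could also be written with `random.sample`, which draws without replacement. This is an alternative sketch, not the notebook's own code; the helper name `pick_unique_indices` is hypothetical.

```python
import random


def pick_unique_indices(dataset_len: int, num_examples: int = 5) -> list:
    """Return `num_examples` distinct indices in [0, dataset_len)."""
    assert num_examples <= dataset_len, "Can't pick more elements than there are in the dataset."
    # random.sample draws without replacement, so no retry loop is needed.
    return random.sample(range(dataset_len), num_examples)


picks = pick_unique_indices(8000)
print(picks)
```

The resulting list could be passed to `dataset[picks]` in the same way as in the function above.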

In [12]:
show_random_elements(raw_datasets["train"])

Unnamed: 0,article,abstract
0,"the prevalence of obesity is on the increase globally , both in developed and developing nations . in the united states , it is estimated that approximately 64.5% of adults can be classified as overweight or obese individuals . in addition to the morbidity associated with obesity , approximately 325,000 deaths in the united states each year among nonsmokers are attributable to obesity . bakari et al . evaluated obesity rates using body mass index ( bmi ) and waist - to - hip ratio ( whr ) among type 2 diabetic hausa - fulanis . among this group , 35% and 5% were overweight and obese , respectively , whereas 95% had central obesity when gender - specific whr was used . , in a study on type 2 diabetic patients in enugu , southeast nigeria , reported central obesity [ determined by waist circumference ( wc ) ] in 22.8% of the subjects studied . the use of different indices to determine or measure obesity clearly explains why the prevalence rates of obesity differ globally . among 6208 type 2 diabetic subjects seen in india , over 50% were found to be overweight or obese when bmi of > 25 kg / m was used , whereas 59.66% and 95.57% of men and women , respectively , had central obesity when gender - specific whr cut - off points of > 0.95 and > 0.85 , respectively , were used . insulin resistance with compensatory hyperinsulinemia has been suggested to underlie the clustering of cardiovascular risk factors , including glucose intolerance , hypertension , elevated serum triglycerides , low serum high density lipoprotein cholesterol and central obesity . central obesity has been shown to worsen the degree of insulin resistance . in the earlybird study , wc correlated significantly with homa - ir in both genders , while bmi correlated significantly with homa - ir in girls only . 
similarly , wc seems to be particularly associated with the risk for non - communicable diseases as shown by many other studies.[810 ] recently , the international diabetes federation ( idf ) proposed the use of ethnic - specific cut - off values for wc , having made it a compulsory criterion in its definition of the metabolic syndrome . several indices such as wc , whr , waist to height ratio ( whtr ) and sagittal abdominal diameter ( sad ) have been used as clinical measures of central obesity . magnetic resonance imaging ( mri ) and computed tomography ( ct ) are however considered the gold standard methods for determining the quantity of subcutaneous abdominal adipose tissue ( saat ) and intra - abdominal adipose tissue ( iaat ) . this gold standard , however , can not be routinely used in a clinic setting to measure these indices . wc is a simple and valid measure that may be used independently as an estimate of abdominal fat and it has been found to be more strongly associated with cardiovascular health risk . in this study , we compared the performances of two measures of central obesity , namely , wc and whr , in predicting the presence of cardiovascular risk markers in an apparently healthy nigerian population . this study was a cross - sectional descriptive survey of the inhabitants of enugu ( urban ) . it was formerly the capital of the former eastern nigeria and is currently the capital city of enugu state .. the state is one of the five states in the southeast geopolitical zone of nigeria . it is geographically located between longitude 726 e and 730 e and latitude 625 n and 628 n. enugu city is predominantly an urban christian community with a population of 722,664 people out of the total population of 3,257,298 for the whole state ( 2006 population census figure ) . its work force consists mainly of civil servants , business men , industrialists , farmers and students . 
the town is richly endowed with large deposits of coal which served as a major revenue source for nigeria before the era of oil boom . ethical clearance for the study was obtained from the ethics committee of the university of nigeria teaching hospital ( unth ) , and consent was obtained from all the participants . the subjects comprised apparently healthy individuals ( not known hypertensive or diabetic patients ) who were residents of enugu ( urban ) . a total of 1000 subjects aged 1870 years and who were of the igbo tribe were recruited through a multi - stage sampling procedure . subjects who were physically challenged ( either wheelchair bound or unable to stand ) and female subjects who were pregnant , based on their date of last menstrual period , were excluded . in stage one , five areas of the town , namely , emene , abakpa , trans - ekulu , asata and ogui layout , were selected by simple random sampling ( using the balloting technique ) . in the second stage , 200 participants were selected from among those who reported and got registered on the day of recruitment in each of the five areas . these subjects were invited for registration following health awareness campaigns executed in some of the churches within the five selected areas . those who were not selected were not interviewed but had their anthropometric indices measured and medical advice given based on the individual 's cardiovascular risk status . a total of 898 subjects ( 318 males and 580 females ) were used for analysis after cleaning up of the data . data collection and physical measurements were based on world health organization ( who ) 's steps instrument . it is a simple , standardized method for collecting and analyzing data for chronic disease risk factors in who member countries . it involves a sequential process which starts with gathering information on key behavioural risk factors ( step 1 ) and then moving to simple physical measurements ( step 2 ) . 
the physical measurements undertaken included height ( measured to the nearest 0.1 cm ) , weight ( recorded to the nearest 0.1 kg ) , wc and hip circumference ( recorded to the nearest 0.1 cm ) using non - stretching flexible linear tapes , and blood pressure ( recorded to the nearest whole number in mmhg ) using mercury sphygmomanometers ( accosons , essex , england ) . the landmark for the measurement of wc was the midpoint between the lowest rib and the iliac crest , as recommended by who . measurement was done at the end of expiration , with the arms by the side and patient standing with the feet together . both wc and hip circumference were measured in privacy with the subject in light clothing . the 1 and 5 korotkoff sounds were used to mark the systolic blood pressure ( sbp ) and diastolic blood pressure ( dbp ) respectively . generalized obesity was defined as bmi 30 kg / m , while central obesity was defined according to the international diabetes federation ( idf ) ethnic - specific criteria as wc 94 cm and 80 cm for males and females , respectively . blood pressure was classified using the jnc 7 classification . risk category was defined using the hypertensive category ( sbp 140 mmhg and/or dbp 90 mmhg ) . comparison of means between two groups was done using the independent t - test while test of association / independence between categorical variables was performed using chi - square test of independence . the receiver operating characteristic ( roc ) analysis was used to compare the performance of wc and whr ( measures of central obesity ) as determined by the area under the curve ( auc ) . data analysis was conducted using statistical package for social sciences for windows ( spss ) version 10 ( chicago , il , usa ) . the subjects comprised apparently healthy individuals ( not known hypertensive or diabetic patients ) who were residents of enugu ( urban ) . 
a total of 1000 subjects aged 1870 years and who were of the igbo tribe were recruited through a multi - stage sampling procedure . subjects who were physically challenged ( either wheelchair bound or unable to stand ) and female subjects who were pregnant , based on their date of last menstrual period , were excluded . in stage one , five areas of the town , namely , emene , abakpa , trans - ekulu , asata and ogui layout , were selected by simple random sampling ( using the balloting technique ) . in the second stage , 200 participants were selected from among those who reported and got registered on the day of recruitment in each of the five areas . these subjects were invited for registration following health awareness campaigns executed in some of the churches within the five selected areas . those who were not selected were not interviewed but had their anthropometric indices measured and medical advice given based on the individual 's cardiovascular risk status . a total of 898 subjects ( 318 males and 580 females ) were used for analysis after cleaning up of the data . data collection and physical measurements were based on world health organization ( who ) 's steps instrument . it is a simple , standardized method for collecting and analyzing data for chronic disease risk factors in who member countries . it involves a sequential process which starts with gathering information on key behavioural risk factors ( step 1 ) and then moving to simple physical measurements ( step 2 ) . the physical measurements undertaken included height ( measured to the nearest 0.1 cm ) , weight ( recorded to the nearest 0.1 kg ) , wc and hip circumference ( recorded to the nearest 0.1 cm ) using non - stretching flexible linear tapes , and blood pressure ( recorded to the nearest whole number in mmhg ) using mercury sphygmomanometers ( accosons , essex , england ) . 
the landmark for the measurement of wc was the midpoint between the lowest rib and the iliac crest , as recommended by who . measurement was done at the end of expiration , with the arms by the side and patient standing with the feet together . both wc and hip circumference were measured in privacy with the subject in light clothing . the 1 and 5 korotkoff sounds were used to mark the systolic blood pressure ( sbp ) and diastolic blood pressure ( dbp ) respectively . generalized obesity was defined as bmi 30 kg / m , while central obesity was defined according to the international diabetes federation ( idf ) ethnic - specific criteria as wc 94 cm and 80 cm for males and females , respectively . risk category was defined using the hypertensive category ( sbp 140 mmhg and/or dbp 90 mmhg ) . comparison of means between two groups was done using the independent t - test while test of association / independence between categorical variables was performed using chi - square test of independence . the receiver operating characteristic ( roc ) analysis was used to compare the performance of wc and whr ( measures of central obesity ) as determined by the area under the curve ( auc ) . data analysis was conducted using statistical package for social sciences for windows ( spss ) version 10 ( chicago , il , usa ) . males were older than females and had higher whr and dbp , whereas females had higher wc , bmi and sbp . the gender differences observed in all the variables except for sbp and dbp were significant [ table 1 ] . general characteristics of the subjects central obesity was more prevalent when whr ( 76.9% ) was used than when wc ( 66.5% ) was used . chi - square analysis showed that central obesity determined using both whr ( ( 1 ) = 5.15 ; p < 0.05 ) and wc ( ( 1 ) = 185.6 ; p < generalized obesity was found in 190 ( 21.2% ) subjects , while 339 ( 37.8% ) were overweight [ figure 1 ] . six ( 0.7% ) persons were undernourished ( bmi < 18.5 kg / m ) . 
weight categorization among the subjects according to body mass index using the jnc 7 classification , 430 ( 47.9% ) , 324 ( 36.1% ) and 144 ( 16% ) subjects were classified as having hypertension , pre - hypertension and normal blood pressure , respectively . greater proportion ( 42.7% ) of individuals had hypertension when sbp was used alone compared to 30.8% when dbp was used ( ( 1 ) = 9.2 ; p = 0.0024 ) . the auc showing the performances of wc and whr in predicting the presence of cardiovascular risk ( generalized obesity , hypertension and both obesity and hypertension ) is shown in figures 24 , respectively . receiver operating characteristic curve for waist circumference and waist - to - hip ratio for generalized obesity receiver operating characteristic curve for waist circumference and waist - to - hip ratio for hypertension receiver operating characteristic curve for waist circumference and waist - to - hip ratio for obesity and hypertension the values of the auc for each of the indices for generalized obesity , hypertension and hypertension / obesity are summarized in table 2 . based on its higher auc , areas under the curve for predicting the presence of obesity , hypertension , and obesity / hypertension the auc showing the performances of wc and whr in predicting the presence of cardiovascular risk ( generalized obesity , hypertension and both obesity and hypertension ) is shown in figures 24 , respectively . receiver operating characteristic curve for waist circumference and waist - to - hip ratio for generalized obesity receiver operating characteristic curve for waist circumference and waist - to - hip ratio for hypertension receiver operating characteristic curve for waist circumference and waist - to - hip ratio for obesity and hypertension the values of the auc for each of the indices for generalized obesity , hypertension and hypertension / obesity are summarized in table 2 . 
based on its higher auc , areas under the curve for predicting the presence of obesity , hypertension , and obesity / hypertension in this study , the performances of wc and whr were compared using roc analysis . roc curves are frequently used in several medical disciplines such as biomedical informatics , clinical chemistry and radiology . the roc curve plots sensitivity versus ( 1 specificity ) of a test as the threshold varies over its entire range . each data point on the plot represents a particular setting of the threshold , and each threshold setting defines a particular set of true - positive ( tp ) , false - positive ( fp ) , true - negative ( tn ) and false - negative ( fn ) frequencies , and consequently a particular pair of sensitivity and ( 1 specificity ) values . it was originally developed for radar applications in the 1940s , but roc analysis became widely used in medical diagnostics , where complex and weak signals needed to be distinguished from a noisy background . the area under an roc curve is equal to the probability that a randomly selected positive case will receive a higher score than a randomly selected negative case . the roc curve area is a good summary measure of test accuracy because it does not depend on the prevalence of disease or the cut points used to derive the curve . it is however suggested that once a test has been able to classify patients as either having a disease or not , the performance of the test for particular uses such as diagnosis or screening needs to be evaluated . as regards its use when comparing the accuracies of two tests as in this study , caution should be exercised as the roc curve area may be misleading if the curves cross each other . this study revealed that wc had higher aucs compared to whr in subjects who were classified as having generalized obesity alone , hypertension alone or both . 
the higher aucs therefore suggest that wc may be more useful and reliable than whr in predicting the presence of generalized obesity and hypertension or cardiovascular risk . in a similar study by pouliot et al . , sagittal abdominal diameter ( sad ) was identified to be better than other clinical measures or determinants of abdominal obesity wc among the various measures of central adiposity is very easy to measure and perhaps more time saving in a routine clinic setting . it is neither cumbersome at all compared to the gold standards ( ct , mri ) nor labourious requiring further measurements and calculations . the simplicity of its measurement and its relation to both body weight and fat distribution as a major advantage over bmi and waist - to - hip circumference ratio was highlighted by lean et al . studies have shown that anthropometric measures , such as bmi , whr , and wc cut - off levels , are not comparable across different racial populations . apart from the fact that wc is a good indicator of both the degree of obesity and the accumulation of visceral adipose tissue , the threshold values of wc corresponding to critical amounts of visceral adipose tissue do not appear to be influenced by sex or by the degree of obesity . taylor et al . , in a screening for regional fat distribution among adult women , found that wc significantly classified the subjects better than whr . al - sendi et al . also reported that wc is useful in identifying children ( 1217 years of age ) at risk of developing hypertension . among 768 middle - aged men from the olivetti heart study , wc was the strongest predictor of blood pressure , and also was related to heart rate , insulin concentrations , and insulin sensitivity . in this study , the aucs for both wc and whr for the subjects who had only hypertension were smaller when compared to those represented by obesity alone and by both obesity and hypertension . 
this can be explained by the fact that bmi is better related to body fat than hypertension though they can cluster in a single individual . this association is evidenced by the degree of relationships shown by the correlation coefficients ( not shown in the results ) . though the aucs were smaller in the hypertension group , wc still performed better than whr in this group . with respect to gender comparisons , females had a higher mean wc and bmi values , whereas the reverse was the case for whr . this pattern had earlier been observed among diabetic patients in the same ethnic region of nigeria and in cuban female scholars . hormonal differences particularly involving the adrenal and sex steroids have also been noted to influence body fat distribution.[3234 ] these gender patterns also appear to be established in childhood , especially with pubertal development . though the role of obesity as a health hazard in adults has been well recognized , its presence in adolescence has been associated with obesity in adulthood , thus emphasizing the importance of early detection and intervention directed at its treatment to avert the long - term consequences of obesity and development of cardiovascular diseases . adipose tissue is mostly distributed as subcutaneous fat ( 85% of total adipose tissue mass ) and then a smaller amount as intra - abdominal fat ( 15% ) in lean and obese persons . the relative contribution of intra - abdominal fat mass to total body fat is influenced by sex , age , race - ethnicity , physical activity , and total adiposity . the term visceral fat is commonly used to describe intra - abdominal fat , and intra - abdominal fat is made up of both intraperitoneal fat ( mesenteric and omental fat ) , directly draining into the portal circulation , and retroperitoneal fat , draining into the systemic circulation . 
currently , there is no universally accepted site for measuring intra - abdominal ( iaat ) and subcutaneous adipose ( saat ) tissue distributions . this remains a source of variation in data obtained from different studies and needs to be harmonized for effective comparison of data . abdominal circumference as measured by wc , actually measures the fat contributed by subcutaneous tissue which is under the skin of the abdomen and that deposited intra - abdominally . both saat and iaat have been found to correlate with insulin resistance . despite the obvious strengths of wc over whr and bmi , it is to be noted that adoption of different landmarks for its measurement may pose some limitations on the comparison of wc data generated from different studies . though some studies on wc involving subjects of african descent have been reported , characterization of wc among african populations still appears to be deficient , especially as it relates to diagnostic indices and threshold values . wc provides a unique indicator of body fat distribution , which can identify patients who are at increased risk of central obesity - related cardiometabolic disease , above and beyond the measurement of bmi . from this study , being a simpler index to measure , less time consuming and devoid of any calculations , we propose that clinicians will find wc a reliable means of assessing individuals cardiovascular risk status especially in a busy routine clinic setting .","<S> objective : to compare the performance of waist circumference ( wc ) and waist - to - hip ratio ( whr ) in predicting the presence of cardiovascular risk factors ( hypertension and generalized obesity ) in an apparently healthy population.materials and methods : we recruited 898 apparently healthy subjects ( 318 males and 580 females ) of the igbo ethnic group resident in enugu ( urban ) , southeast nigeria . 
</S> <S> data collection was done using the world health organization stepwise approach to surveillance of risk factors ( steps ) instrument . </S> <S> subjects had their weight , height , waist and hip circumferences , systolic and diastolic blood pressures measured according to the guidelines in the step 2 of steps instrument . </S> <S> generalized obesity and hypertension were defined using body mass index ( bmi ) and jnc 7 classifications , respectively . </S> <S> quantitative and qualitative variables were analyzed using t - test and chi - square analysis , respectively , while the performance of wc and whr was compared using the receiver operating characteristic ( roc ) analysis . </S> <S> p value was set at < 0.05.results:the mean age of the subjects was 48.7 ( 12.9 ) years . </S> <S> central obesity was found in 76.9% and 66.5% of subjects using whr and wc , respectively . </S> <S> wc had a significantly higher area under the curve ( auc ) than whr in all the cardiovascular risk groups , namely , generalized obesity ( auc = 0.88 vs. 0.62 ) , hypertension alone ( auc = 0.60 vs. 0.53 ) , and both generalized obesity and hypertension ( auc = 0.86 vs. 0.57).conclusion : wc performed better than whr in predicting the presence of cardiovascular risk factors . </S> <S> being a simple index , it can easily be measured in routine clinic settings without the need for calculations or use of cumbersome techniques . </S>"
1,"glucose production ( gp ) is regulated by gluconeogenesis and glycogenolysis . in type 2 diabetes , elevation of gp an elevation of plasma gluconeogenic substrate precursors such as free fatty acids ( ffa ) or lactate is also seen in type 2 diabetes [ 25 ] . these findings led to the working hypothesis that elevation of gluconeogenic substrate precursors increases gluconeogenesis and gp . the findings as a whole indicate that during the pancreatic clamp , when gluco - regulatory hormones and glucose levels are maintained at basal , intravenous ( i.v . ) infusion of ffa or lactate increases hepatic gluconeogensis but , contrary to the hypothesis , not gp because of a compensatory inhibition of glycogenolysis [ 69 ] . the underlying nutrient - sensing mechanisms that inhibit glycogenolysis and counteract the direct effects of circulating ffa or lactate on hepatic gluconeogenesis remain to be explored . in this study , we began to assess the underlying nutrient - sensing mechanisms that counteract the direct effect of circulating lactate on hepatic gluconeogenesis in vivo . the lipid - sensing mechanisms that counteract the direct effect of circulating ffa on hepatic gluconeogenesis have recently been studied . it is demonstrated that the inhibition of hepatic glycogenolysis induced by circulating ffa to restrain gp is mediated by hypothalamic lipid - sensing mechanisms in rodents . the hypothalamic lipid - sensing mechanisms to restrain glycogenolysis and gp are disrupted in diet - induced insulin resistance and obesity , leading to hyperglycaemia in response to systemic lipid infusions . this is in line with previous reports indicating that in patients with metabolic stress conditions such as type 2 diabetes , reciprocal changes in glycogenolysis do not compensate for changes in gluconeogenesis when plasma ffa concentrations are experimentally manipulated . 
unlike the lipid - sensing mechanisms in the body , even in metabolic stress patients with hyperglycaemia , an elevation of circulating lactate still does not increase gp because of an inhibition of glycogenolysis . consistent with this , in contrast to hypothalamic lipid - sensing mechanisms , activation of central lactate metabolism via direct delivery of lactate into the hypothalamus lowers gp in normal and in early onset of diabetic and obese rodents [ 14 , 15 ] . on the basis of these observations , we postulate that ( i ) the hypothalamic - sensing mechanism of circulating lactate that regulate gp is intact in normal , diabetic and obese individuals , and ( ii ) these hypothalamic nutrient - sensing mechanisms are designed to counteract the direct stimulatory effect of circulating lactate on hepatic gluconeogenesis to maintain glucose homeostasis . given that the ability of the hypothalamus to sense circulating lactate to regulate gp and maintain glucose homeostasis is yet to be evaluated , this will be the first study to address our working hypothesis in normal rodents . to examine whether the hypothalamus senses circulating lactate to regulate gp and maintain glucose homeostasis in vivo , we infused i.v . l - lactate to elevate plasma lactate levels in the presence or absence of direct inhibition of brain lactate - sensing mechanisms . to directly inhibit central / hypothalamic lactate - sensing mechanisms , we independently infused ( i ) lactate dehydrogenase inhibitor oxamate ( oxa ) into the third cerebral ventricle ( intracerebroventricular i.c.v . ) , ( ii ) i.c.v . atp - sensitive potassium ( katp ) channel blocker glibenclamide ( gli ) or ( iii ) oxa directly into the mediobasal hypothalamus ( mbh ) ( fig . 1a and b ) that has been previously described to prevent the inhibitory effects of central lactate administration on gp . tracer - dilution methodology and the pancreatic clamp technique were used to assess the effect of i.v . 
and i.c.v./mbh administrations on whole body glucose kinetics independent of changes in circulating gluco - regulatory hormones . based on this cross - discipline experimental approach , we provided the first evidence , to our knowledge , that direct inhibition of central / hypothalamic lactate - sensing mechanisms by three independent methods increased gp and disrupted glucose homeostasis in response to systemic lactate elevations in vivo . the hypothalamic metabolism of lactate to pyruvate and the subsequent activation of the katp channels are required to maintain glucose homeostasis in response to systemic lactate elevations . somatostatin ( srif ) . during the final 30 min . of the clamps , ( c ) intravenous lactate infusion elevated plasma lactate levels compared to control ( saline , sal ) by 2- to 2.5-fold . * p < 0.001 versus sal . ( d ) plasma glucose , ( e ) plasma insulin , ( f ) plasma adiponectin and glucagon levels were comparable in all groups during the clamps . adult 8-week - old male sprague dawley rats ( 250280 g ) were obtained from charles river laboratories ( montreal , qc , canada ) and maintained on a standard light - dark cycle with access to rat chow and water ad libitum . rats underwent stereotaxic surgery to insert single catheters in the third cerebral ventricle ( i.c.v . ) or bilateral catheters into the mbh 2 weeks before the experiments in vivo as previously described [ 14 , 16 , 17 ] . one week later , catheters were placed in the internal jugular vein and the carotid artery for infusion and sampling during the clamp procedures as previously described [ 14 , 16 , 18 ] . recovery from surgery was monitored by measuring daily food intake and weight gain for 45 days after surgery . the study animal protocol was approved by the institutional animal care and use committee of the university health network . all the rats were restricted to 20 g of food the night before the experiments to ensure the same nutritional status . 
to groups of conscious unrestrained rats , we infused i.v . sodium l - lactate ( 100 mol / kg min . , ph 7.0 ) to elevate plasma lactate concentrations by 2- to 2.5-fold as seen in exercise for 4 hrs ( fig . infusions consisted of the ( i ) lactate dehydrogenase inhibitor oxa ( dissolved in artificial cerebrospinal fluid [ acsf ] to 50 mm [ 5 l / hr ] and was first given as an i.c.v . bolus [ 3 l ] ) , ( ii ) katp channel blocker gli ( dissolved in 5% dmso to 100 m ; 5 l / hr ) or ( iii ) vehicle ( 5% dmso or acsf ; 5 l / hr ) ( fig . mbh infusions consisted of ( i ) oxa ( 50 mm ; 0.33 l / hr and was first given as a mbh bolus [ 0.33 l ] ) or ( ii ) vehicle ( acsf ; 0.33 l / hr ) ( fig . oxa or gli and mbh oxa administered at these concentrations alone do not affect glucose kinetics but were sufficient to abolish the gp - lowering effect of direct brain lactate administrations . saline ( sal ) or l - lactate infusion and a primed - continuous infusion of [ 3-h]-glucose ( perkin elmer , woodbridge , on , canada ; 40 ci bolus ; 0.4 ci / min ) were initiated at 120 min . and maintained throughout the study to assess the rate of gp and glucose uptake based on the tracer - dilution methodology . ( steady - state basal period ) , the rate of gp ( mg / kg min . ) and plasma glucose levels ( mm ) were 11.9 0.5 and 8.3 0.1 ( i.c.v . lactate ) , 12.6 0.2 and 8.2 0.2 ( mbh vehicle + i.v . lactate ) and 12.2 1.2 and 8.9 0.6 ( mbh oxa + i.v . lactate ) . a pancreatic clamp was performed in the final 2 hrs of the study starting at 240 min ; a continuous infusion of insulin ( 1.5 mu / kg min . ) and somatostatin ( 3 g / kg min . ) was administered , and a variable infusion of a 25% glucose solution was started and periodically adjusted to maintain the plasma glucose concentration at 8 mm . saline ) were performed , and the gp during the clamps in both groups ( gp : i.c.v . 
plasma samples for the determination of lactate , insulin , adiponectin and glucagon concentrations were obtained at 10-min . somatostatin . in the presence of systemic lactate elevation , direct inhibition of lactate metabolism within the mediobasal hypothalamus ( mbh ) via mbh administration of oxamate ( compared to control ) ( b ) decreased exogenous glucose infusion rate and ( c ) increased gp during the final 30 min . 0.001 versus lactate with mbh vehicle . central sensing mechanisms of circulating lactate regulate glucose production ( gp ) . in the presence of systemic elevation of lactate , direct inhibition of central lactate metabolism via i.c.v . administration of lactate dehydrogenase inhibitor oxamate ( oxa ) ( compared to control ) led to a marked ( a ) decrease in exogenous glucose infusion rate and ( b ) increase in gp during the final 30 min . of the clamps . administration of blocker glibenclamide ( a ) decreased exogenous glucose infusion rate and ( b ) increased gp in response to systemic lactate infusions . plasma lactate concentrations were determined by a kit in accordance with manufacturer s instructions ( sigma diagnostics . plasma glucose concentrations were measured by the glucose oxidase method ( glucose analyzer , analox instruments , lunenberg , ma , usa ) . statistical analysis was done by anova or unpaired student s t - test as appropriate . adult 8-week - old male sprague dawley rats ( 250280 g ) were obtained from charles river laboratories ( montreal , qc , canada ) and maintained on a standard light - dark cycle with access to rat chow and water ad libitum . rats underwent stereotaxic surgery to insert single catheters in the third cerebral ventricle ( i.c.v . ) or bilateral catheters into the mbh 2 weeks before the experiments in vivo as previously described [ 14 , 16 , 17 ] . 
one week later , catheters were placed in the internal jugular vein and the carotid artery for infusion and sampling during the clamp procedures as previously described [ 14 , 16 , 18 ] . recovery from surgery was monitored by measuring daily food intake and weight gain for 45 days after surgery . the study animal protocol was approved by the institutional animal care and use committee of the university health network . all the rats were restricted to 20 g of food the night before the experiments to ensure the same nutritional status . to groups of conscious unrestrained rats , we infused i.v . ph 7.0 ) to elevate plasma lactate concentrations by 2- to 2.5-fold as seen in exercise for 4 hrs ( fig infusions consisted of the ( i ) lactate dehydrogenase inhibitor oxa ( dissolved in artificial cerebrospinal fluid [ acsf ] to 50 mm [ 5 l / hr ] and was first given as an i.c.v . bolus [ 3 l ] ) , ( ii ) katp channel blocker gli ( dissolved in 5% dmso to 100 m ; 5 l / hr ) or ( iii ) vehicle ( 5% dmso or acsf ; 5 l / hr ) ( fig . mbh infusions consisted of ( i ) oxa ( 50 mm ; 0.33 l / hr and was first given as a mbh bolus [ 0.33 l ] ) or ( ii ) vehicle ( acsf ; 0.33 l / hr ) ( fig . oxa or gli and mbh oxa administered at these concentrations alone do not affect glucose kinetics but were sufficient to abolish the gp - lowering effect of direct brain lactate administrations . saline ( sal ) or l - lactate infusion and a primed - continuous infusion of [ 3-h]-glucose ( perkin elmer , woodbridge , on , canada ; 40 ci bolus ; 0.4 ci / min ) were initiated at 120 min . and maintained throughout the study to assess the rate of gp and glucose uptake based on the tracer - dilution methodology . ( steady - state basal period ) , the rate of gp ( mg / kg min . ) and plasma glucose levels ( mm ) were 11.9 0.5 and 8.3 0.1 ( i.c.v . lactate ) , 12.6 0.2 and 8.2 0.2 ( mbh vehicle + i.v . lactate ) and 12.2 1.2 and 8.9 0.6 ( mbh oxa + i.v . lactate ) . 
a pancreatic clamp was performed in the final 2 hrs of the study starting at 240 min ; a continuous infusion of insulin ( 1.5 mu / kg min . ) and somatostatin ( 3 g / kg min . ) was administered , and a variable infusion of a 25% glucose solution was started and periodically adjusted to maintain the plasma glucose concentration at 8 mm . saline ) were performed , and the gp during the clamps in both groups ( gp : i.c.v . plasma samples for the determination of lactate , insulin , adiponectin and glucagon concentrations were obtained at 10-min . somatostatin . in the presence of systemic lactate elevation , direct inhibition of lactate metabolism within the mediobasal hypothalamus ( mbh ) via mbh administration of oxamate ( compared to control ) ( b ) decreased exogenous glucose infusion rate and ( c ) increased gp during the final 30 min . 0.001 versus lactate with mbh vehicle . central sensing mechanisms of circulating lactate regulate glucose production ( gp ) . in the presence of systemic elevation of lactate , direct inhibition of central lactate metabolism via i.c.v . administration of lactate dehydrogenase inhibitor oxamate ( oxa ) ( compared to control ) led to a marked ( a ) decrease in exogenous glucose infusion rate and ( b ) increase in gp during the final 30 min . administration of blocker glibenclamide ( a ) decreased exogenous glucose infusion rate and ( b ) increased gp in response to systemic lactate infusions . plasma lactate concentrations were determined by a kit in accordance with manufacturer s instructions ( sigma diagnostics . plasma glucose concentrations were measured by the glucose oxidase method ( glucose analyzer , analox instruments , lunenberg , ma , usa ) . statistical analysis was done by anova or unpaired student s t - test as appropriate . here we first examined whether selectively blocking the central effects of lactate through the i.c.v . 
administration of lactate dehydrogenase inhibitor oxa or katp channel blocker gli is sufficient to alter the metabolic response of i.v . . systemic infusion of lactate at 100 mol / kg min . for 4 hrs elevated plasma lactate by 2- to 2.5-fold compared to i.v . 1c ) . in the final 2 hrs of the lactate elevation , we performed pancreatic - insulin clamps with tracer dilution methodology to assess the effects of i.v . lactate on glucose kinetics independent of changes in plasma gluco - regulatory hormones ( fig . lactate infusion did not increase gp during the pancreatic clamps when plasma glucose ( fig . 1e and f ) were maintained near basal levels . to assess whether central sensing mechanisms restrain plasma lactate from elevating gp , we administered i.c.v . f ) , central oxa administration decreased exogenous glucose infusion rate required to maintain euglycaemia ( fig . this decrease in glucose infusion rate was due entirely to an elevation of gp ( fig . together , these data indicate that direct inhibition of central lactate metabolism disrupted glucose homeostasis and increased gp in the presence of systemic lactate elevations . to complement the function findings obtained with i.c.v . katp channels blocker gli to alternatively negate central lactate effects and examine whether central sensing mechanisms restrain plasma lactate to increase gp . similar to the effects of i.c.v . oxa , central gli administration decreased the exogenous glucose infusion rate required to maintain euglycaemia ( fig . 2a ) in the presence of similar degree of plasma lactate elevation as observed in the other groups ( fig . the drop in glucose infusion rate was caused by an increase in gp ( fig . these findings indicate that direct inhibition of central katp channels disrupted glucose homeostasis in response to systemic lactate elevation . 
to investigate the central neuroanatomical localization of the cns - sensing mechanisms of circulating lactate on gp , we negated the lactate - sensing mechanisms specifically within the mbh . lactate infusions ( 100 mol / kg min . ) with the administration of oxa ( 50 mm ) bilaterally into the mbh ( fig . , mbh oxa administration decreased glucose infusion rate required to maintain euglycaemia in the same degree of plasma lactate elevation ( fig . again , the decrease in glucose infusion rate was due to increased gp ( fig . 3d ) . thus , direct inhibition of lactate metabolism within the mbh disrupted glucose homeostasis and increased gp in the presence of systemic lactate infusions . it remains to be assessed whether the hypothalamic restraining effect on gp in response to circulating lactate is mediated by an inhibition of hepatic glycogenolysis and/or gluconeogenesis . the hypothalamus has been recently demonstrated to detect a rise in both hormones and nutrients to regulate peripheral metabolic processes [ 9 , 15 , 1929 ] . specifically , direct activation of either ( i ) insulin / leptin signalling pathways [ 19 , 2125 , 28 , 29 ] or ( ii ) lipid / lactate metabolism [ 9 , 15 , 30 ] in the hypothalamus regulates gp . furthermore , the demonstration of the ability of the hypothalamus to sense circulating lipids to lower hepatic glycogenolysis and restrain gp ( 9 ) has added to the existing knowledge of nutrient - sensing mechanisms in the body . these findings , at least in rodents , suggest that the hypothalamus senses circulating lipids to counteract the direct stimulatory effects of lipids on hepatic gluconeogenesis . more importantly , the hypothalamic lipid - sensing mechanisms are disrupted in the early onset of diet - induced obese rodents , leading to a rise in gp and glucose levels induced by circulating lipids . 
these observations , if extended to human beings , could in part begin to address the inability of the body to compensate for a lipid - induced increase in gluconeogenesis in type 2 diabetes . in direct contrast to lipid - sensing mechanisms , the body has the ability to inhibit hepatic glycogenolysis and restrain gp to compensate for the circulating lactate - induced increase in hepatic gluconeogenesis in both normal individuals and metabolic stress individuals with hyperglycaemia [ 8 , 12 ] . in an attempt to elucidate the underlying mechanisms of lactate - sensing that are responsible for this metabolic restraining effect on gp , we assessed whether the hypothalamus senses an elevation of plasma lactate by 2- to 2.5-fold for 4 hrs to restrain gp and maintain glucose homeostasis in normal rodents . lactate infusion on glucose metabolism during pancreatic clamps when gluco - regulatory hormones ( fig . consistent with previous reports in human beings , an elevation of circulating gluconeogenic substrate precursor lactate did not increase gp . we postulated that the hypothalamic lactate - sensing mechanisms are responsible for this glucose homeostatic control . if this is true , direct inhibition of cns / hypothalamic lactate - sensing mechanisms should disrupt glucose homeostasis and lead to an elevation of gp in the presence of systemic lactate infusion . to test this hypothesis , we directly inhibited cns / hypothalamic lactate - sensing mechanisms that regulate gp by three independent approaches . during the pancreatic clamps , all three experimental interventions in the brain ( comparing to control ) led to a marked increase in gp in response to systemic lactate infusion . together , these data indicated that hypothalamic sensing mechanisms restrain systemic lactate to increase gp and is required to maintain glucose homeostasis . 
this is the first report that demonstrates the ability of the hypothalamus to sense circulating lactate to restrain gp and maintain glucose homeostasis in normal rodents . in light of the fact that ( i ) direct central lactate administration lowers gp in normal and early onset of diabetic and obese rodents [ 14 , 15 ] , ( ii ) an elevation of circulating lactate does not increase gp in normal individuals and in patients with hyperglycaemia [ 8 , 12 ] , we postulate that the hypothalamic nutrient sensing mechanisms of circulating lactate are intact in at least the early onset of diabetes and obesity . however , future studies are required to address this hypothesis . in summary , our data suggest that therapeutic strategies designed to activate cns lactate - sensing mechanisms by physiological route could eventually be proven useful to maintain glucose homeostasis in vivo .","<S> emerging studies indicate that hypothalamic hormonal signalling pathways and nutrient metabolism regulate glucose homeostasis in rodents . </S> <S> although hypothalamic lactate - sensing mechanisms have been described to lower glucose production ( gp ) , it is currently unknown whether the hypothalamus senses lactate in the blood circulation to regulate gp and maintain glucose homeostasis in vivo . to examine </S> <S> whether hypothalamic sensing of circulating lactate is required to regulate gp , we infused intravenous ( i.v . ) </S> <S> lactate in the absence or presence of inhibition of central / hypothalamic lactate - sensing mechanisms in normal rodents . </S> <S> inhibition of central / hypothalamic lactate - sensing mechanisms was achieved by three independent approaches . </S> <S> tracer - dilution methodology in combination with the pancreatic clamp technique was used to assess the effect of i.v . and central / hypothalamic administrations on glucose metabolism in vivo . 
</S> <S> in the presence of physiologically relevant increases in the levels of plasma lactate , inhibition of central lactate - sensing mechanisms by lactate dehydrogenase inhibitor oxamate ( oxa ) or atp - sensitive potassium channels blocker glibenclamide increased gp . </S> <S> furthermore , direct administration of oxa into the mediobasal hypothalamus increased gp in the presence of similar elevation of circulating lactate . together , these data indicate that hypothalamic sensing of circulating lactate regulates gp and is required to maintain glucose homeostasis . </S>"
2,"in recent studies , drug - eluting stents ( des ) were more widely used than bare - metal stents ( bms ) in patients who underwent percutaneous coronary intervention ( pci).1 recently , several significant complications after implantation of des have been reported , and stent thrombosis ( st ) is a rare but fatal complication among them . discontinuation of dual antiplatelet therapy is known to be a risk factor for st in patients after implantation of des.2 - 4 therefore , dual antiplatelet therapy is recommended to be maintained for at least 12 months after stent implantation to prevent late stent thrombosis ( lst ) or very late stent thrombosis ( vlst ) . however , it is still unclear how long dual antiplatelet therapy is needed and when the antiplatelet therapy can be safely stopped . here we report a case of a patient with acute myocardial infarction due to vlst that occurred 1 week after discontinuation of 5 years of dual antiplatelet therapy after implantation of a sirolimus - eluting stent . a man with no risk factors for coronary artery disease except smoking underwent pci in may 2005 at age 44 . a 3.023 mm cypher stent was deployed in the proximal right coronary artery . seven days after the discontinuation of clopidogrel , he experienced severe chest pain and was transported to the emergency room by ambulance . a physical examination revealed a temperature of 36.0 , blood pressure of 90/60 mmhg , and a regular heart rate of 100 beats / min without murmur or gallop . pulmonary rales , peripheral edema , or other clinical signs of congestive heart failure were not present . an electrocardiogram revealed st - segment elevation in lead ii , iii , and avf . the peak level of creatine kinase ( ck ) was 1,334 u / l , ck - mb was 75.4 u / l , and troponin - i was 38.1 ng / ml . transthoracic echocardiography in the emergency room revealed inferior wall akinesia , and his ejection fraction was 45% . 
the patient received 300 mg of aspirin and 600 mg of clopidogrel as a loading dose and was transferred to the cardiac catheterization laboratory . his pain to door time was about 60 minutes , and the door to balloon time was 72 minutes . an emergent coronary angiogram revealed a thrombotic total occlusion at the proximal right coronary artery ( fig . the occluded right coronary artery was revascularized successfully by balloon angioplasty with a 3.020 mm balloon ( fig . after pci , ultegra rapid platelet function assay ( rpfa)-asa and ultegra rpfa - p2y12 ( verifynowassay ) were performed to determine aspirin and clopidogrel resistance . the patient was discharged uneventfully and followed up with dual antiplatelet therapy of aspirin and clopidogrel . there have been no adverse events with clinical follow - up for 1 year , and a follow - up coronary angiogram revealed no significant stent restenosis . des interrupt re - endothelialization of the vessels , which results in a lower rate of target lesion revascularization than with bms . long and multiple stents , stent under - expansion or stent malposition , residual dissection , and resistance to aspirin and clopidogrel have been suggested as other possible causes of st.5 - 7 several studies revealed a higher rate of lst and vlst after des implantation than after bms implantation . dual antiplatelet therapy reduces subacute thrombotic events after pci , and at least 12 months of dual antiplatelet therapy after pci is recommended in the current guidelines to prevent st . whether long - term maintenance of dual antiplatelet therapy can prevent lst or vlst is still controversial . 
park et al reported that clopidogrel continuation beyond 1 year did not appear to decrease stent thrombosis and clinical events after des implantation,8 although tanzili et al concluded that 2 years of dual antiplatelet therapy can prevent the occurrence of vlst after des implantation.9 triple antiplatelet therapy adding cilostazol is considered to be another choice for preventing stent thrombosis in patients with clopidogrel resistance or for the prevention of recurrent stent thrombosis . the disadvantage of triple therapy is the bleeding tendency , but the bleeding tendency of triple therapy was reported to be not much higher than that of dual antiplatelet therapy in several studies . however , the effect of triple therapy on long - term survival or cardiac events is controversial as well.10,11 in this patient , there were no risk factors for st except former smoking and no evidence of aspirin or clopidogrel resistance . the patient had been treated with dual antiplatelet agents for enough time ( 5 years ) after des implantation , but vlst occurred at 1 week after discontinuation of clopidogrel . we concluded that 12 months of dual antiplatelet therapy may not be enough for the prevention of lst or vlst after des implantation in some patients who do not show complete re - endothelialization of the coronary artery . concerning the decision to cease clopidogrel therapy , we suggest that it should depend on the condition of the patient and risk factors for st such as underlying disease , the length or location of the lesion , the resistance of aspirin or clopidogrel , and so on . perhaps the development of imaging systems such as optical coherence tomography will give us more precise information . 
we suggest that physicians educate their patients about the hazards of premature cessation of dual antiplatelet therapy and delay performing surgical procedures until 1 year after implantation of des .","<S> drug - eluting stents ( des ) have reduced the rate of repeated revascularization of target lesions . for this reason , </S> <S> des are considered to be superior to bare - metal stents in reducing the restenosis rate . </S> <S> however , some problems have been reported after implantation of des . </S> <S> one of them , stent thrombosis , has arisen as a fatal complication . </S> <S> dual antiplatelet therapy is recommended for at least 12 months after implantation of des to prevent stent thrombosis . here , </S> <S> we report a case of very late stent thrombosis that occurred 1 week after discontinuation of clopidogrel at 5 years ( 1832 days ) after implantation of a sirolimus - eluting stent . </S>"
3,"the cerebral cortex of mammals , including humans , displays oscillatory spindle activity ( 1015 hz ) across the lifespan . in adults , sleep spindles , a specific type of spindle activity , occur exclusively during non - rem sleep [ 14 ] and have been implicated in memory retention and skill learning [ 57 ] . in humans , sleep spindles first appear 49 weeks postterm and become more prominent and frequent over the next several months . there exists another type of spindle activity called spindle bursts that is phenomenologically similar to sleep spindles in that they share a similar frequency range , duration , and spindle - shaped waveform . for example , spindle bursts predominate early in development : they are readily observed in premature human infants and are thought to disappear soon after birth . in rats , spindle bursts have been recorded in sensorimotor and visual cortex soon after birth through at least the end of the second postnatal week [ 8 , 10 , 11 ] . in addition , unlike sleep spindles , spindle bursts are not exclusive to non - rem sleep but rather can occur across the sleep - wake cycle . finally , spindle bursts are most closely associated with self - generated or evoked activity in the sensory periphery ; for example , manually stimulating the limb of a rat or human , during sleep or wakefulness , elicits spindle bursts in sensorimotor cortex [ 8 , 9 , 11 , 12 ] . across all sensory modalities , however , self - generated activity in the sensory periphery is the predominant trigger of spindle bursts during early development . in the visual system , for example , retinal ganglion cells fire spontaneously before eye opening , producing waves of retinal activity that provide downstream afferent input to visual brain areas [ 10 , 13 , 14 ] . 
in the auditory system , cochlear spiral ganglion cells fire bursts of activity before the onset of hearing that provide downstream afferent input to the auditory system ( although spindle bursts have not yet been recorded in auditory cortex , it is likely that they occur ) [ 1517 ] . by providing substantial sensory experience during early development , retinal waves and cochlear bursts shape and refine functional links between the sensory periphery and developing brain networks [ 14 , 18 , 19 ] . in the developing sensorimotor system , spontaneous activity takes the form of myoclonic twitching , which is characterized by jerky movements of the forelimbs and hindlimbs , tail , whiskers , and eyes during rem sleep [ 8 , 9 , 11 , 20 , 21 ] . in neonatal rats , sensory feedback from twitching limbs triggers activity throughout the nervous system and has been hypothesized to contribute to the development of sensorimotor circuits [ 8 , 11 , 20 , 22 , 23 ] . moreover , just as retinal waves trigger spindle bursts in visual cortex , sensory feedback from twitching limbs ( i.e. , reafference ) triggers spindle bursts in somatosensory [ 8 , 2426 ] and motor [ 11 , 12 ] cortex . twitches , however , differ from retinal waves in two fundamental ways : first , although retinal waves are transient features of development , twitches persist into adulthood [ 28 , 29 ] and second , although retinal waves seem to occur independently of behavioral state , twitches are an exclusive feature of rem sleep . to reiterate , twitches trigger spindle bursts in sensorimotor cortex [ 8 , 11 , 12 , 24 , 25 ] , occur exclusively during rem sleep , originate prenatally [ 9 , 31 ] , and persist into adulthood [ 28 , 29 ] . curiously , however , spindle bursts have not been reported during rem sleep in adults . one possible explanation for this absence is that twitches in adults are somehow prevented from triggering spindle bursts in sensorimotor cortex . 
given how reliably twitches activate the cerebral cortex in early development , this possibility requires that a mechanism emerges at some point between infancy and adulthood that gates twitch - triggered reafference . for example , there is evidence in adult cats of spinally mediated sensory gating that is stronger during rem sleep than during wakefulness . however , even during rem sleep , this gating mechanism only partially blocks sensory feedback . therefore , it remains possible , if not likely , that twitches continue to trigger spindle bursts in the adult sensorimotor cortex . \n figure 1 presents local field potentials ( lfps ) in infants rats recorded from the hindlimb region of sensorimotor cortex in relation to twitches detected in the hindlimb electromyogram ( emg ) ( data from ) . at postnatal days ( p ) 4 and 810 , when background lfp activity is low , hindlimb twitches trigger easily discernible cortical spindle bursts . indeed , because of the low background activity at these ages , visual inspection of the lfp alone that is , without guidance provided by the hindlimb emg allows one to confidently state when a spindle burst occurred . also , the lfp alone is sufficient to predict with high confidence when hindlimb twitches occurred . in contrast , by p12 , lfp background activity has increased such that spindle bursts are now much harder to detect . in this case , the lfp alone is not sufficient to predict with high confidence when hindlimb twitches occurred . given this dramatic increase in background activity between p810 and p12 , we expect spindle bursts to disappear further into the background over the ensuing days and weeks . but the precise relationship between twitches and spindle bursts is revealed using a twitch - triggered averaging method . 
in the case of sleeping infant rats at p4 , p810 , and p12 , averaging lfp power in relation to hindlimb twitches reveals increased activity in the hindlimb region of sensorimotor cortex within ~150 ms of a twitch ( figure 2(a ) ) . time - frequency spectrograms reveal that these peaks in twitch - related power occur at spindle burst frequency ( figure 2(b ) ) . note that , at p4 , the spectrogram displays a discrete hotspot immediately after hindlimb twitches . by contrast , at p12 , there is increased background activity in the spindle frequency range ; nonetheless , a concentrated twitch - related hotspot in spindle frequency is still apparent . these results suggest that , with increasing age , spindle bursts are harder to detect because they are obscured by increases in background lfp activity . event - triggered averaging , such as that illustrated in figure 2 , is routinely used to reveal hidden or obscured cortical events and oscillations . for example , cognitive neuroscientists use event - related potentials ( erps ) to reveal cortical activity associated with sensory , motor , or cognitive process ( figure 3 ; ) . because eeg electrodes reflect ongoing activity from thousands of neurons , the resulting signals are necessarily noisy . as a result , an individual event ( e.g. , a flash of light ) may not be evident in the raw eeg signal . however , by averaging over many event - triggered trials , random noise that is not associated with neural processing of the stimulus cancels out , leaving behind a distinct and stereotyped cortical pattern the erp that reflects underlying neural processes . we propose that twitch - related spindle bursts have not yet been detected in the activated eeg of human adults during rem sleep because event - triggered averaging would be necessary to observe them . 
undoubtedly , there are many published studies in human and nonhuman adult animals in which cortical activity recorded across behavioral states can be related to peripheral motor activity , including twitching ( e.g. , see ) . critically , to our knowledge , no study in adults has specifically used twitches for event - triggered averaging of cortical activity in somatotopically related areas of sensorimotor cortex . species - typical and behaviorally relevant body parts the bill of a platypus , the star appendage of a star - nosed mole , and the digits of a raccoon have magnified representations in sensorimotor cortex . the degree of magnification reflects the innervation of these peripheral morphological features as well as their use . if twitches are functionally important for the development and maintenance of sensorimotor circuits [ 8 , 23 ] , then we expect the quantity and patterning of twitching to reflect an appendage 's innervation density , biomechanics , and behavioral importance across species [ 23 , 36 ] . like other primates , humans are a highly visual species that execute hundreds of thousands of saccadic eye movements ( 3 per second ) during the day . interestingly , much like the skeletal muscles that control the limbs , the extraocular skeletal muscles that move the eyes twitch during rem sleep . in fact , the resulting rapid eye movements give rem sleep its name . moreover , much like the limbs of the body , proprioceptive reafference from eye movements in rats and humans is relayed to sensorimotor cortex [ 39 , 40 ] . since twitches of all skeletal muscles studied thus far trigger spindle bursts [ 8 , 11 , 21 ] because rapid eye movements are a prominent feature of rem sleep in human adults , the sensorimotor system that controls eye movement could be an ideal model for exploring the contributions of spindle bursts to the calibration , maintenance , and repair of neural networks . 
whereas humans , which are diurnal , rely heavily on the visual system for spatial navigation , rats , which are nocturnal , rely heavily on the whiskers . the whiskers of rats and other rodents are controlled by an elaborate set of extrinsic and intrinsic striated muscles [ 41 , 42 ] . importantly , in infant rats , these muscles twitch during rem sleep and reafference from twitches activates the whisker thalamus and barrel cortex . spindle bursts are readily detected in the infant rat 's barrel cortex , occurring immediately after twitching ; the developmental onset of whisker - related cortical activity could contribute to the refinement of the somatotopic map in barrel cortex . whisker twitching has also been observed in adult rats but has not been systematically investigated . therefore , similar to rapid eye movements in humans , we expect whisker twitching and associated spindle bursts to reflect the outsized role of this sensory modality in rats . extrapolating to other species , highly specialized sensorimotor appendages should exhibit relatively high rates of twitching . the bill contains a high density of electro- and mechanoreceptors and , consequently , a large proportion of the platypus ' cerebral cortex is devoted to processing sensory input from the bill [ 46 , 47 ] . as we would expect , platypuses exhibit vigorous twitches of the bill ( as well as the eyes and limbs ) during rem sleep , even as adults [ 48 , 49 ] . the star - nosed mole uses a specialized set of 22 facial appendages that it uses like a tactile fovea to forage for food . similar to the bill of the platypus , the star appendages contain a high density of somatosensory receptors and , consequently , a large proportion of this animal 's cerebral cortex is devoted to each appendage . 
although there is no information on the sleeping habits of the star - nosed mole , we predict that it will be found to exhibit high rates of twitching in the star appendages during rem sleep , as well as twitch - associated spindle bursts in sensorimotor cortex . it has been argued that local changes in cortical slow - wave activity can be triggered by learning processes that differentially involve specific brain regions . similarly , after the development of a new waking skill or in response to changes in the sensory periphery ( e.g. , due to injury ) , perhaps twitches contribute to the process of learning , adaptation , or recovery of function . for example , in one study , human adults wore goggles during the day that decreased their visual field to 5 degrees . when these subjects were observed over several days during sleep , the duration of rem sleep was unaffected , but the frequency and amplitude of rapid eye movements increased significantly . because the changes in rapid eye movements were a transient response to a waking perturbation of visual experience , these findings suggest to us that twitching in this case in the form of rapid eye movements contributes to the process of visual adaptation . this experimental approach that is , the manipulation of waking motor experience in human and non - human subjects could provide the opportunity to explore the contributions of twitching and twitch - related spindle bursts to learning and adaptation in health and disease . until recently , twitches were widely perceived as functionless byproducts of dreaming , providing little reason to search for their neural consequences . only with the discovery of twitch - related neural activity in infant rats , it has become apparent that twitches could play a functional role in sensorimotor development , akin to spontaneous activity in other sensory systems . 
it is the persistence of twitching into adulthood , in humans and other mammals , that raises the intriguing possibility that twitching has much more to reveal to us about its functional contributions to neural plasticity within the sensorimotor system . although the idea that twitch - related spindle bursts persist into adulthood is speculative , it can easily be tested using new or even perhaps existing recordings . if evidence of twitch - related spindle bursts is found , there will be a strong basis for expanding our understanding of the functions of spindle activity during non - rem sleep about which we currently know a lot to include rem sleep as well .","<S> sleep spindles are brief cortical oscillations at 1015 hz that occur predominantly during non - rem ( quiet ) sleep in adult mammals and are thought to contribute to learning and memory . </S> <S> spindle bursts are phenomenologically similar to sleep spindles , but they occur predominantly in early infancy and are triggered by peripheral sensory activity ( e.g. , by retinal waves ) ; accordingly , spindle bursts are thought to organize neural networks in the developing brain and establish functional links with the sensory periphery . whereas the spontaneous retinal waves that trigger spindle bursts in visual cortex are a transient feature of early development , the myoclonic twitches that drive spindle bursts in sensorimotor cortex persist into adulthood . </S> <S> moreover , twitches and their associated spindle bursts occur exclusively during rem ( active ) sleep . curiously , despite the persistence of twitching into adulthood , twitch - related spindle bursts have not been reported in adult sensorimotor cortex . </S> <S> this raises the question of whether such spindle burst activity does not occur in adulthood or , alternatively , occurs but has yet to be discovered . 
</S> <S> if twitch - related spindle bursts do occur in adults , they could contribute to the calibration , maintenance , and repair of sensorimotor systems . </S>"
4,"the anterior cruciate ligament ( acl ) is one of major knee ligaments and is critical to knee stability . injury to the acl can be a debilitating musculoskeletal injury seen most often in athletes . the incidence of acl injuries is currently estimated at approximately 200,000 in usa annually , with 100,000 acl reconstructions performed each year . in general , the incidence of acl injury is higher in people who participate in high - risk sports , such as basketball , football , skiing , and soccer . the goal of the acl reconstruction surgery is to prevent instability and restore the function of the torn ligament , creating a stable knee so that the young can go back to the sporting activities . acl reconstruction is uasually performed using either the patellar bone tendon and semitendinosis and gracilis tendons , and both are not free from complications . the most common is the graft failure and stretching of the graft due to delay in the tendon - bone healing and tendon - bone incorporation of a tendon - graft within the bone tunnel . improvement of graft healing to bone is crucial to facilitate early and aggressive rehabilitation and a rapid return to full activity . to counteract this bone morphogenetic factors have been used with good results to improve the bone in growth in the tendon . ( 2012 ) reported a potential role of growth factors and bio - scaffolds for improving healing and mechanical integrity of the acl injury that is reconstructed with a tendon - graft . it was shown that the use of a collagen - platelet - rich plasma scaffold stimulated healing of a defect in the canine acl . many other growth factors have been used in the early and better bone ingrowth at the site of acl reconstruction with bone tendon - graft and tendon - graft . 
sadat - habdan msenchymal stimulating peptide ( shmsp ) was discovered at university of dammam , dammam and king fahd hospital of the university , alkhobar and was patented in 2008 ( united states patency and trade office , us 7,399,826 , b1 given on july 15 , 2008 ) . shmp is a 13 amino acids with a molecular weight of 1460 kd , which is now available in the synthesized form . a recent study showed that when topically applied there was early and better healing in diabetic animals . the objective of this study is to assess the efficacy of a bone growth factor ( shmsp ) in the rate of healing of bone tendon interface and osteo - integration of the tendon at the tunnel . rabbits were procured and were left in the animal house for 2 weeks for acclimatization to the surrounding . under ketamine 50 ml / kg weight and xylazine 35 ml / kg weight animals were anesthetized . a 3 cm long lateral half of tendoachilles tendon of the left side was harvested and a bone tunnel was made in the region of acl and a 1/0 ethilon suture was passed through the end of the harvested tendon [ figure 1 ] . the tendon was place in the amorphous powder of the shmsp as a growth factor at a dose of 5 ml / kg body weight a 2.5 mm drill hole was made at the acl going between the tibia and the femur . the tendon was passed through the bony tunnel made and secured to each end of the tunnel with a 2/0 dexon [ figures 2 and 3 ] . in the control group , the procedure was repeated without the addition of the shmsp . both groups of animals were kept in the similar circumstances and monitored on a regular basis . after 4 weeks , 5 animals from each group were euthanized and 8 weeks the rest of the animals were euthanized . the lower limb was disarticulated at the hip joint and stored in 2% formalin at a temperature of 4c and before histopathological analysis was done . 
harvesting of the tendon harvesting and drilling of the tendon positioning of the tendon the two groups were compared specifically for the bone in growth in the drill hole made at the tibial end through , which the tendon - graft was passed . the study was approved by the institutional review board of the university of dammam and funded by the deanship of scientific research of university of dammam , saudi arabia . figures 14 shows harvesting of the tendon , drilling , and position of the graft . in all the animals of the study group at 4 weeks showed , newly formed osteoid was observed at places early of the bone formation encroaching the tunnel , where in the control group tunnel was filled with the granulation tissue [ figures 5 and 6 ] . photomicrograph of control group at 4 weeks showing fibro - collagenous tissue ( fibrosis ) ; ( h and e , 40 ) photomicrograph of study group at 4 weeks showing early bony exostosis in the tunnel ; ( h and e , 40 ) by 8 weeks in the study group , the canal was totally obliterated with increased mineralization of the new bone and at places seen extending onto the periosteal surface . in the control , there was minimal change in the formation of the new bone formation but there was more granulation tissue leading to form the connective tissue [ figures 7 and 8 ] . photomicrograph of control group at 8 weeks showing the whole tunnel is filled with granulation tissue with no signs of any new bone formation ; ( h and e , 40 ) photomicrograph of study group at 8 weeks showing abundant new bony formation and little granulation tissue in the tunnel ; ( h and e , 40 ) our study showed that in animals in which shmsp was used to augment healing of the tendon - graft in the osseous tunnel there was exuberant bone formation , which was higher and more organized when compared to the control group of animals . the changes were subtle initially but 8 weeks the difference was more pronounced and appreciable . 
the histological specimens showed increased trabecular bone close to the grafted tendons as early as 4 weeks after implantation . different methods and growth factors were used to improve healing of the bone tendon interface . found that improved bone formation around a tendon - graft using recombinant human bone morphogenetic protein-2 ( bmp-2 ) in an extra - articular bone tunnel in a dog model . similarly , nicklin et al . showed that exogenous osteogenic protein-1 results in improved bone formation at the tendon - bone interface in a sheep model . injected fibrin sealant ( ifs ) combined with bmp after acl reconstruction showed that the rate of new bone formation of ifs - bmp composite was significantly and achieved a more prolonged osteogenic effect . coated mesenchymal stem cells to the tendon - graft thereby enhancing the of tendon - graft osteo - integration . in our study , we coated the tendon with the shmsp to stimulate the osteo - integration in the tunnel with satisfactory results . in this study , we used shmsp a small polypeptide , which was reported as an angiogenesis factor and showed results comparable to other growth factors . the healing pattern between the tendon and the drill hole in the bone through which the tendon passes is not clearly understood but one this is certain that it takes many months probably to heal and incorporate and till that time certain activities are to be curtailed postsurgery . for a young athlete to be away from sporting activities is quite difficult and early activity to jeopardize the repair causing up to 10,000 revision yearly of acl reconstruction in the usa alone . if a osteo - integration of the bone there is still no clinical evidence regarding the use of growth factors , due to the fact that dosage of these factors still remain undeterminable as most of the half - life of growth factors is too short to stimulate the healing for weeks . 
with regard to shmsp which was used on a daily basis and we believe that our study has some limitations and the one which stands out that we did not perform bio - mechanical tests to assess the strength of the healing which was so convincing histologically and secondly a small sample of 10 animals on each side of the study arm . our study shows that local application of shmsp on the tendon - graft and instilling the growth factor in the tunnel itself enhanced the oseto - integration of the tendon into the bony tunnel created . we believe there is opportunity to convert this animal - based study into a clinical study when the safety of shmsp is established .","<S> background : reconstruction of the anterior cruciate ligament ( acl ) involves use of semintendinosis and gracilis tendons graft that is transplanted into bone tunnels at the femoral and tibial insertion sites and the sites and the bone tendon interface is a weak link in the early healing period due to slow rate of healing . </S> <S> we hypothesized that an addition of bone growth factor like sadat - habdan mesenchymal stimulating peptide ( shmsp ) could enhance bone tendon healing rate so that re - rupture of the tendon does not take place.methodology:twenty skeletally mature rabbits underwent acl reconstruction of the right knee . in 10 of the rabbits at the site of the tendon - graft 5 mg / kg body weight of shmsp </S> <S> was put in the bone tunnel . in 10 other animals , nothing was added . at eight and 12 weeks </S> <S> 5 animals from each group were sacrificed . </S> <S> the tendon - graft site was harvested and sent for histopathological examination to assess the healing at the tendon - bone graft to the tibial tunnel.results:there were no deaths in both the groups . </S> <S> one rabbit of the control group developed an infection . </S> <S> in all the animals of the study group from 4 weeks onward showed bone formation , wherein the control group only granulation tissue was observed . 
by 8 weeks in the study group , </S> <S> the canal was totally obliterated with the new bone formation which extended onto the periosteal area . in the control , there was minimal change in the formation of the new bone formation.conclusion:addition of a growth factor like shmsp would enhance the osteo - integration of the tendon - graft in the bony tunnel after acl reconstruction in vivo . </S>"


The metric is an instance of [`datasets.Metric`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Metric):

In [13]:
metric

Metric(name: "rouge", features: {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}, usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each predictions
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLSum"`: rougeLsum splits text using `"
"`.
        See details in https://github.com/huggingface/datasets/issues/617
    use_stemmer: Bool indicating whether Porter stemmer should be used to strip word suffixes.
    use_agregator: Return aggregates if this is set to True
Retu

You can call its `compute` method with your predictions and labels, which need to be lists of decoded strings:

In [14]:
fake_preds = ["hello there", "general kenobi"]
fake_labels = ["hello there", "general kenobi"]
metric.compute(predictions=fake_preds, references=fake_labels)

{'rouge1': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rouge2': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeL': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeLsum': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0))}
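
To make the scores above less opaque, here is a minimal sketch of how a ROUGE-1 score can be computed from unigram overlap. It is a simplification, not the `rouge_score` implementation: it splits on whitespace only and skips stemming and the low/mid/high aggregation. The function name `rouge1_sketch` is hypothetical.

```python
from collections import Counter

def rouge1_sketch(prediction: str, reference: str):
    """Return (precision, recall, fmeasure) based on unigram overlap."""
    pred_counts = Counter(prediction.split())
    ref_counts = Counter(reference.split())
    # Clipped overlap: a token counts at most as often as it appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    precision = overlap / max(sum(pred_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    fmeasure = 2 * precision * recall / (precision + recall)
    return precision, recall, fmeasure

# Identical strings give a perfect score, as in the fake example above.
print(rouge1_sketch("hello there", "hello there"))
# No shared tokens gives zero everywhere.
print(rouge1_sketch("hello there", "general kenobi"))
```

A prediction that contains the whole reference plus extra tokens keeps a recall of 1.0 but loses precision, which is why the F-measure is the value usually reported.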

## Preprocessing the data

Before we can feed those texts to our model, we need to preprocess them. This is done by a 🤗 `Transformers` `Tokenizer`, which (as the name indicates) tokenizes the inputs, converts the tokens to their corresponding IDs in the pretrained vocabulary, puts them in the format the model expects, and generates the other inputs that the model requires.

To do all of this, we instantiate our tokenizer with the `AutoTokenizer.from_pretrained` method, which will ensure:

- we get a tokenizer that corresponds to the model architecture we want to use,
- we download the vocabulary used when pretraining this specific checkpoint.

That vocabulary will be cached, so it's not downloaded again the next time we run the cell.

In [15]:
from transformers import AutoTokenizer
    
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

Downloading:   0%|          | 0.00/26.0 [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.76k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/878k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]

By default, the call above will use one of the fast tokenizers (backed by Rust) from the 🤗 `Tokenizers` library.

You can directly call this tokenizer on one sentence or a pair of sentences:

In [16]:
tokenizer("Hello, this one sentence!")

{'input_ids': [0, 31414, 6, 42, 65, 3645, 328, 2], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}

Depending on the model you selected, you will see different keys in the dictionary returned by the cell above. They don't matter much for what we're doing here (just know they are required by the model we will instantiate later). You can learn more about them in [this tutorial](https://huggingface.co/transformers/preprocessing.html) if you're interested.

Instead of one sentence, we can pass along a list of sentences:

In [17]:
tokenizer(["Hello, this one sentence!", "This is another sentence."])

{'input_ids': [[0, 31414, 6, 42, 65, 3645, 328, 2], [0, 713, 16, 277, 3645, 4, 2]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]]}
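
The structure of these outputs can be illustrated with a toy tokenizer. This is only a sketch: the vocabulary and its IDs are made up, and splitting on whitespace is far cruder than the subword tokenization the real tokenizer performs. The only detail taken from the outputs above is that every sequence starts with ID 0 and ends with ID 2, which the sketch treats as start and end markers.

```python
# Hypothetical special-token IDs and vocabulary, for illustration only.
BOS_ID, EOS_ID, UNK_ID = 0, 2, 3
toy_vocab = {"hello": 10, "this": 11, "is": 12, "one": 13, "another": 14, "sentence": 15}

def toy_encode(text):
    """Encode one sentence into input IDs plus an attention mask of ones."""
    for mark in ",.!":
        text = text.replace(mark, " ")
    words = text.lower().split()
    input_ids = [BOS_ID] + [toy_vocab.get(w, UNK_ID) for w in words] + [EOS_ID]
    return {"input_ids": input_ids, "attention_mask": [1] * len(input_ids)}

def toy_encode_batch(texts):
    """Encode several sentences: each key maps to a list of lists."""
    encoded = [toy_encode(t) for t in texts]
    return {key: [e[key] for e in encoded] for key in ("input_ids", "attention_mask")}

print(toy_encode("Hello, this one sentence!"))
print(toy_encode_batch(["Hello, this one sentence!", "This is another sentence."]))
```

As with the real tokenizer, a single sentence gives flat lists while a list of sentences gives one inner list per sentence, and the inner lists can have different lengths because no padding has been applied yet.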

To prepare the targets for our model, we need to tokenize them inside the `as_target_tokenizer` context manager. This will make sure the tokenizer uses the special tokens corresponding to the targets:

In [18]:
with tokenizer.as_target_tokenizer():
    print(tokenizer(["Hello, this one sentence!", "This is another sentence."]))

{'input_ids': [[0, 31414, 6, 42, 65, 3645, 328, 2], [0, 713, 16, 277, 3645, 4, 2]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]]}


If you are using one of the five T5 checkpoints, we have to prefix the inputs with "summarize:" (the model can also translate, and it needs the prefix to know which task it has to perform).

In [19]:
if model_checkpoint in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""

We can then write the function that will preprocess our samples. We just feed them to the `tokenizer` with the argument `truncation=True`. This ensures that any input longer than what the selected model can handle is truncated to the maximum length accepted by the model. Padding will be dealt with later on (in a data collator), so that we pad examples to the longest length in the batch and not in the whole dataset.

The max input length of `sshleifer/distilbart-cnn-12-6` is 1024, so `max_input_length = 1024`.
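
As a rough illustration of what truncation does to a single long example, the sketch below cuts a list of token IDs so that the total length, special tokens included, never exceeds `max_length`. The helper name and the start/end IDs (0 and 2) are assumptions made for this example; the real tokenizer handles this internally.

```python
def truncate_with_specials(token_ids, max_length, bos_id=0, eos_id=2):
    """Keep at most `max_length` IDs in total, counting the two special tokens."""
    room = max(max_length - 2, 0)  # space left for content tokens
    return [bos_id] + list(token_ids[:room]) + [eos_id]

long_doc = list(range(100, 1400))   # 1300 made-up content token IDs
short_doc = [100, 101, 102, 103, 104]

print(len(truncate_with_specials(long_doc, max_length=1024)))   # 1024
print(len(truncate_with_specials(short_doc, max_length=1024)))  # 7
```

Everything beyond the limit is simply dropped, so for long articles the model only ever sees the beginning of the text.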

In [20]:
max_input_length = 1024
max_target_length = 256

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["article"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["abstract"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

This function works with one or several examples. In the case of several examples, the tokenizer will return a list of lists for each key:

In [21]:
preprocess_function(raw_datasets['train'][:2])

{'input_ids': [[0, 405, 11493, 11, 55, 87, 654, 207, 9, 1484, 8, 189, 1338, 1814, 207, 11, 1402, 3505, 9, 16640, 2156, 941, 11, 1484, 11793, 17930, 8, 73, 368, 13785, 5804, 4, 134, 41, 23249, 16, 6533, 25, 41, 15650, 17215, 672, 9, 23385, 43202, 36, 1368, 428, 4839, 36, 1368, 428, 28696, 316, 821, 1589, 385, 462, 4839, 8, 189, 16072, 25, 10, 898, 9, 5, 7482, 2199, 2156, 13162, 2156, 2129, 10894, 2156, 17930, 2156, 50, 13785, 5804, 479, 6104, 3218, 3608, 14, 7967, 8, 18327, 139, 111, 2174, 797, 71, 13785, 5804, 2156, 941, 11, 471, 8, 5397, 16640, 2156, 189, 28, 13969, 30, 41, 23249, 4, 1978, 41, 23249, 747, 41089, 1290, 5298, 215, 25, 16069, 2156, 8269, 2156, 8, 25599, 642, 22423, 2156, 8, 4634, 189, 33, 10, 2430, 1683, 15, 1318, 9, 301, 36, 2231, 1168, 4839, 8, 819, 2194, 11, 1484, 19, 1668, 479, 4634, 2156, 7, 1477, 2166, 13838, 2156, 2231, 1168, 2156, 8, 17618, 32444, 11, 1484, 19, 1668, 2156, 24, 74, 28, 5701, 7, 185, 10, 16300, 1548, 11, 9397, 9883, 54, 240, 1416, 13, 1668, 111, 30

To apply this function to all the article/abstract pairs in our dataset, we just use the `map` method of the `raw_datasets` object we created earlier. This applies the function to all the elements of all the splits in `raw_datasets`, so our training, validation and testing data are preprocessed with a single command.

In [22]:
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

  0%|          | 0/8 [00:00<?, ?ba/s]

  0%|          | 0/2 [00:00<?, ?ba/s]

  0%|          | 0/2 [00:00<?, ?ba/s]

Even better, the results are automatically cached by the 🤗 `Datasets` library to avoid spending time on this step the next time you run your notebook. The 🤗 `Datasets` library is normally smart enough to detect when the function you pass to `map` has changed (and thus that the cached data should not be used). For instance, it will properly detect if you change the task in the first cell and rerun the notebook. 🤗 `Datasets` warns you when it uses cached files; you can pass `load_from_cache_file=False` in the call to `map` to ignore the cached files and force the preprocessing to be applied again.

Note that we passed `batched=True` to encode the texts by batches together. This is to leverage the full benefit of the fast tokenizer we loaded earlier, which will use multi-threading to treat the texts in a batch concurrently.
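
The batching behaviour can be sketched in plain Python. This is not how 🤗 `Datasets` is implemented; it only mimics the contract: the function receives a dictionary of lists for each batch and returns a dictionary of lists. The helper names are hypothetical, and the batch size of 1000 is assumed to be the default used by `map`, which would be consistent with the `0/8` progress bar for 8,000 training examples.

```python
import math

def batched_map_sketch(rows, function, batch_size=1000):
    """Call `function` on dict-of-lists batches and collect the new columns."""
    output = {}
    num_batches = math.ceil(len(rows) / batch_size)
    for b in range(num_batches):
        chunk = rows[b * batch_size:(b + 1) * batch_size]
        # Turn a list of row dicts into one dict of column lists.
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        for key, values in function(batch).items():
            output.setdefault(key, []).extend(values)
    return output, num_batches

def count_words(batch):
    # Stand-in for preprocess_function: one output value per input example.
    return {"n_words": [len(text.split()) for text in batch["article"]]}

rows = [{"article": "a b c", "abstract": "a"}] * 8000
result, num_batches = batched_map_sketch(rows, count_words)
print(num_batches, len(result["n_words"]))  # 8 8000
```

Unlike this sketch, the real `map` also keeps the original columns next to the new ones.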

## Fine-tuning the model

Now that our data is ready, we can download the pretrained model and fine-tune it. Since our task is of the sequence-to-sequence kind, we use the `AutoModelForSeq2SeqLM` class. Like with the tokenizer, the `from_pretrained` method will download and cache the model for us.

In [23]:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

Downloading:   0%|          | 0.00/1.14G [00:00<?, ?B/s]

Note that we don't get a warning like in our classification example. This means we used all the weights of the pretrained model and there is no randomly initialized head in this case.

To instantiate a `Seq2SeqTrainer`, we will need to define three more things. The most important is the [`Seq2SeqTrainingArguments`](https://huggingface.co/transformers/main_classes/trainer.html#transformers.Seq2SeqTrainingArguments), which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model, and all other arguments are optional:

In [24]:
batch_size = 2
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-pubmed",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
    seed=42,
)

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the `batch_size` defined at the top of the cell and customize the weight decay. Since the `Seq2SeqTrainer` saves the model regularly and each checkpoint takes up a lot of disk space, we tell it to keep at most three checkpoints. Lastly, we use the `predict_with_generate` option (to properly generate summaries) and activate mixed precision training (to go a bit faster).
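
A quick back-of-the-envelope calculation shows what these settings imply. The sketch below assumes the 8,000 training examples reported in the training log of this notebook, a single device, no gradient accumulation, and a checkpoint every 500 steps (assumed to be the default, which would be consistent with the `checkpoint-500` folder in the log). The helper name is hypothetical.

```python
import math

def training_schedule(num_examples, batch_size, num_epochs,
                      save_steps=500, save_total_limit=3):
    """Return total optimization steps, checkpoints written, and checkpoints kept."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    total_steps = steps_per_epoch * num_epochs
    saved = list(range(save_steps, total_steps + 1, save_steps))
    # Only the most recent `save_total_limit` checkpoints stay on disk.
    kept = saved[-save_total_limit:]
    return total_steps, len(saved), kept

total_steps, n_saved, kept = training_schedule(8000, batch_size=2, num_epochs=5)
print(total_steps, n_saved, kept)  # 20000 40 [19000, 19500, 20000]
```

Under these assumptions, 40 checkpoints are written over the run, which is why limiting the number kept on disk matters for a model of this size.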

The `push_to_hub=True` argument sets everything up so we can push the model to the [Hub](https://huggingface.co/models) regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally under a name different from the name of the repository it will be pushed to, or if you want to push your model under an organization rather than your own namespace, use the `hub_model_id` argument to set the repo name (it needs to be the full name, including your namespace: for instance `"sgugger/t5-finetuned-xsum"` or `"huggingface/t5-finetuned-xsum"`). Finally, `seed=42` fixes the random seed, which makes runs easier to reproduce.

Then, we need a special kind of data collator, which will not only pad the inputs to the maximum length in the batch, but also the labels:

In [25]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
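
To show what padding to the longest example in the batch looks like, here is a simplified sketch in plain Python. It is not the `DataCollatorForSeq2Seq` implementation, which also returns tensors and can prepare decoder inputs. The pad token ID of 1 is an assumption for this checkpoint, the label IDs are made up, and labels are padded with -100, the value that is replaced again in the metrics function of this notebook.

```python
def pad_batch_sketch(features, pad_token_id=1, label_pad_token_id=-100):
    """Pad inputs and labels to the longest sequence in this batch only."""
    max_in = max(len(f["input_ids"]) for f in features)
    max_lab = max(len(f["labels"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_in = max_in - len(f["input_ids"])
        n_lab = max_lab - len(f["labels"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_in)
        # The mask is 1 for real tokens and 0 for padding.
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_in)
        batch["labels"].append(f["labels"] + [label_pad_token_id] * n_lab)
    return batch

features = [
    {"input_ids": [0, 31414, 6, 42, 65, 3645, 328, 2], "labels": [0, 713, 2]},
    {"input_ids": [0, 713, 16, 277, 3645, 4, 2], "labels": [0, 713, 16, 277, 2]},
]
batch = pad_batch_sketch(features)
print(batch["input_ids"][1])       # one pad token appended
print(batch["attention_mask"][1])  # last position masked out
print(batch["labels"][0])          # two -100 entries appended
```

Padding per batch rather than to a global maximum keeps short batches short, which saves computation when most examples are well under the length limit.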

The last thing to define for our `Seq2SeqTrainer` is how to compute the metrics from the predictions. We need to define a function for this, which will just use the `metric` we loaded earlier, and we have to do a bit of pre-processing to decode the predictions into texts:

In [26]:
import nltk
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    
    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]
    
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    # Extract a few results
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}
    
    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)
    
    return {k: round(v, 4) for k, v in result.items()}
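
Two steps of `compute_metrics` are easy to check in isolation: replacing `-100` in the labels and computing the mean generated length. The sketch below runs them on small hand-made arrays; the token ids and the pad id of `1` are assumptions for illustration, and no tokenizer or metric is involved.

```python
import numpy as np

PAD_TOKEN_ID = 1  # assumed pad id, for illustration only

# Hypothetical label ids for two references; -100 marks padded positions.
labels = np.array([[0, 21, 2, -100, -100],
                   [0, 22, 23, 24, 2]])
# Swap -100 for the pad id so a tokenizer could decode (and skip) it.
decodable = np.where(labels != -100, labels, PAD_TOKEN_ID)

# Hypothetical generated ids, already padded to a common length.
predictions = np.array([[0, 31, 32, 2, 1],
                        [0, 33, 2, 1, 1]])
# Count the non-pad tokens in each prediction, then average them.
prediction_lens = [np.count_nonzero(pred != PAD_TOKEN_ID) for pred in predictions]
gen_len = np.mean(prediction_lens)
print(decodable.tolist(), prediction_lens, gen_len)
```

Note that `gen_len` counts every token that is not padding, so special tokens such as the start and end markers are included in the length.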

Then we just need to pass all of this along with our datasets to the `Seq2SeqTrainer`:

In [27]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

Cloning https://huggingface.co/Kevincp560/distilbart-cnn-12-6-finetuned-pubmed into local empty directory.
Using amp half precision backend


We can now fine-tune our model by just calling the `train` method:

In [28]:
trainer.train()

The following columns in the training set  don't have a corresponding argument in `BartForConditionalGeneration.forward` and have been ignored: abstract, article. If abstract, article are not expected by `BartForConditionalGeneration.forward`,  you can safely ignore this message.
***** Running training *****
  Num examples = 8000
  Num Epochs = 5
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 20000


| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|
| 1 | 2.1709 | 2.025711 | 38.1012 | 15.112 | 23.4064 | 33.9373 | 141.9195 |
| 2 | 1.9495 | 1.959253 | 39.529 | 16.1693 | 24.487 | 35.5238 | 141.9785 |
| 3 | 1.756 | 1.948793 | 39.9623 | 16.5799 | 24.949 | 35.9194 | 141.8855 |
| 4 | 1.6032 | 1.973226 | 39.672 | 16.1994 | 24.5996 | 35.7021 | 141.921 |
| 5 | 1.4817 | 1.989518 | 40.0985 | 16.5016 | 24.8319 | 36.0775 | 141.884 |


Saving model checkpoint to distilbart-cnn-12-6-finetuned-pubmed/checkpoint-500
Configuration saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-500/config.json
Model weights saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-500/pytorch_model.bin
tokenizer config file saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-500/tokenizer_config.json
Special tokens file saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-500/special_tokens_map.json
tokenizer config file saved in distilbart-cnn-12-6-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in distilbart-cnn-12-6-finetuned-pubmed/special_tokens_map.json
Saving model checkpoint to distilbart-cnn-12-6-finetuned-pubmed/checkpoint-1000
Configuration saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-1000/config.json
Model weights saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-1000/pytorch_model.bin
tokenizer config file saved in distilbart-cnn-12-6-finetuned-pubmed/checkpoint-1000/token

TrainOutput(global_step=20000, training_loss=1.8202964630126952, metrics={'train_runtime': 21129.4689, 'train_samples_per_second': 1.893, 'train_steps_per_second': 0.947, 'total_flos': 6.17961574416384e+16, 'train_loss': 1.8202964630126952, 'epoch': 5.0})
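
The numbers in the training log can be cross-checked with a little arithmetic. The sketch below recomputes the total number of optimization steps and the throughput figures from the logged example count, epoch count, batch size and runtime; it assumes a single device, which is what the identical per-device and total batch sizes in the log suggest.

```python
# Values copied from the training log above.
num_examples = 8000
num_epochs = 5
batch_size = 2              # per-device batch size, single device assumed
grad_accum_steps = 1
train_runtime = 21129.4689  # seconds

steps_per_epoch = num_examples // (batch_size * grad_accum_steps)
total_steps = steps_per_epoch * num_epochs
samples_per_second = round(num_examples * num_epochs / train_runtime, 3)
steps_per_second = round(total_steps / train_runtime, 3)
hours = train_runtime / 3600

print(total_steps, samples_per_second, steps_per_second, round(hours, 2))
```

The results agree with `global_step`, `train_samples_per_second` and `train_steps_per_second` in the `TrainOutput`, and the runtime works out to a little under six hours.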

You can now upload the result of the training to the Hub by executing this instruction:

In [29]:
trainer.push_to_hub()

Saving model checkpoint to distilbart-cnn-12-6-finetuned-pubmed
Configuration saved in distilbart-cnn-12-6-finetuned-pubmed/config.json
Model weights saved in distilbart-cnn-12-6-finetuned-pubmed/pytorch_model.bin
tokenizer config file saved in distilbart-cnn-12-6-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in distilbart-cnn-12-6-finetuned-pubmed/special_tokens_map.json
Several commits (2) will be pushed upstream.
The progress bars may be unreliable.



To https://huggingface.co/Kevincp560/distilbart-cnn-12-6-finetuned-pubmed
   13aa666..75d69bd  main -> main

To https://huggingface.co/Kevincp560/distilbart-cnn-12-6-finetuned-pubmed
   75d69bd..0dc1cd0  main -> main



'https://huggingface.co/Kevincp560/distilbart-cnn-12-6-finetuned-pubmed/commit/75d69bdac44b79d65558693140639296ddc51a6e'

You can now share this model with all your friends, family, and favorite pets: they can all load it with the identifier `"your-username/the-name-you-picked"`, for instance:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("sgugger/my-awesome-model")
```