If you're opening this notebook on Colab, you will probably need to install 🤗 `Transformers` and 🤗 `Datasets`, as well as the other dependencies listed below:

* `datasets`
* `transformers`
* `rouge-score`
* `nltk`
* `torch` (PyTorch)
* `ipywidgets`

*Note*: Training runs on the GPU to keep fine-tuning time reasonable, so `CUDA` needs to be installed on the device.
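As a quick sanity check before training, the sketch below reports which device PyTorch would use. The helper name `describe_device` is an assumption introduced for illustration; it relies only on `torch.cuda.is_available()` and `torch.cuda.get_device_name()`, and it degrades gracefully if `torch` has not been installed yet.

```python
import importlib.util


def describe_device() -> str:
    """Report which device PyTorch would use, without failing if torch is missing."""
    # Hypothetical helper for this notebook, not part of any library.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch  # imported lazily so the check above runs first

    if torch.cuda.is_available():
        return f"cuda ({torch.cuda.get_device_name(0)})"
    return "cpu"


print(describe_device())
```

If this prints `cpu` on Colab, the runtime type is probably not set to GPU.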

In [1]:
! pip install datasets transformers rouge-score nltk torch ipywidgets

Collecting datasets
  Downloading datasets-1.18.3-py3-none-any.whl (311 kB)
Collecting transformers
  Downloading transformers-4.16.2-py3-none-any.whl (3.5 MB)
Collecting rouge-score
  Downloading rouge_score-0.0.4-py2.py3-none-any.whl (22 kB)
Collecting xxhash
  Downloading xxhash-3.0.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
Collecting fsspec[http]>=2021.05.0
  Downloading fsspec-2022.2.0-py3-none-any.whl (134 kB)
Collecting aiohttp
  Downloading aiohttp-3.8.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.1 MB)
Collecting huggingface-hub<1.0.0,>=0.1.0
  Downloading huggingface_hub-0.4.0-p

When using `nltk`, the `punkt` tokenizer data also needs to be downloaded, because it is not installed automatically with the package. Without `punkt`, the analysis will fail with an error.

In [2]:
import nltk
nltk.download('punkt')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True
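To avoid downloading the data on every run, you can first check whether it is already present. The sketch below is one way to do that; `punkt_is_available` is a hypothetical helper name, and it relies on `nltk.data.find` raising `LookupError` when a resource is missing.

```python
import importlib.util


def punkt_is_available() -> bool:
    """Return True if the punkt tokenizer data can be found locally."""
    # Hypothetical helper for this notebook, not part of nltk.
    if importlib.util.find_spec("nltk") is None:
        return False

    import nltk

    try:
        nltk.data.find("tokenizers/punkt")
        return True
    except LookupError:
        # nltk raises LookupError when the resource is not in any data path.
        return False


print(punkt_is_available())
```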

If you're opening this notebook locally, make sure your environment has the latest versions of those libraries installed.

To be able to share your model with the community and generate results like the one shown in the picture below via the inference API, there are a few more steps to follow.

First, you have to store your authentication token from the Hugging Face website (sign up [here](https://huggingface.co/join) if you haven't already!). To do so, execute the following cell and input your username and password:

In [3]:
from huggingface_hub import notebook_login

notebook_login()

Login successful
Your token has been saved to /root/.huggingface/token
Authenticated through git-credential store but this isn't the helper defined on your machine.
You might have to re-authenticate when pushing to the Hugging Face Hub. Run the following command in your terminal in case you want to set this credential helper as the default

git config --global credential.helper store


Then you need to install `Git-LFS`.

If you are not using `Google Colab`, you may need to install `Git-LFS` manually, since the command below depends on your operating system and may not work on yours. You can read about `Git-LFS` and how to install it [here](https://git-lfs.github.com/).

In [4]:
! apt install git-lfs

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following package was automatically installed and is no longer required:
  libnvidia-common-470
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  git-lfs
0 upgraded, 1 newly installed, 0 to remove and 39 not upgraded.
Need to get 2,129 kB of archives.
After this operation, 7,662 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 git-lfs amd64 2.3.4-1 [2,129 kB]
Fetched 2,129 kB in 1s (2,429 kB/s)
Selecting previously unselected package git-lfs.
(Reading database ... 155320 files and directories currently installed.)
Preparing to unpack .../git-lfs_2.3.4-1_amd64.deb ...
Unpacking git-lfs (2.3.4-1) ...
Setting up git-lfs (2.3.4-1) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...


Make sure your version of `Transformers` is at least 4.11.0, since the model-sharing functionality used here was introduced in that version:

In [5]:
import transformers

print(transformers.__version__)

4.16.2
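If you would rather check the requirement programmatically than by eye, here is a minimal sketch that uses only the standard library. The helpers `version_tuple` and `is_at_least` are assumptions introduced for illustration; they compare the leading numeric components and ignore suffixes such as `.dev0`.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn '4.16.2' or '4.17.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def is_at_least(installed: str, minimum: str) -> bool:
    """Compare versions numerically, component by component."""
    return version_tuple(installed) >= version_tuple(minimum)


print(is_at_least("4.16.2", "4.11.0"))  # True
print(is_at_least("4.9.2", "4.11.0"))   # False
```

Comparing tuples of integers matters here: a plain string comparison would rank `"4.9.2"` above `"4.11.0"`.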


You can find a script version of this notebook to fine-tune your model in a distributed fashion using multiple GPUs or TPUs [here](https://github.com/huggingface/transformers/tree/master/examples/seq2seq).

# Fine-tuning a model on a summarization task

In this notebook, we will see how to fine-tune one of the [🤗`Transformers`](https://github.com/huggingface/transformers) models for a summarization task. We will use the [PubMed Summarization dataset](https://huggingface.co/datasets/ccdv/pubmed-summarization), which contains PubMed articles paired with their abstracts.

![Widget inference on a summarization task](https://github.com/huggingface/notebooks/blob/master/examples/images/summarization.png?raw=1)

We will see how to easily load the dataset for this task using 🤗 `Datasets` and how to fine-tune a model on it using the `Trainer` API.

In [6]:
model_checkpoint = "facebook/bart-large-cnn"

This notebook is built to run with any model checkpoint from the [Model Hub](https://huggingface.co/models), as long as that model has a sequence-to-sequence version in the Transformers library. Here we picked the [`facebook/bart-large-cnn`](https://huggingface.co/facebook/bart-large-cnn) checkpoint.

## Loading the dataset

We will use the [🤗 `Datasets`](https://github.com/huggingface/datasets) library to download the data and get the metric we need to use for evaluation (to compare our model to the benchmark). This can be easily done with the functions `load_dataset` and `load_metric`.  

In [7]:
from datasets import load_dataset, load_metric

raw_datasets = load_dataset("ccdv/pubmed-summarization")
metric = load_metric("rouge")

Downloading:   0%|          | 0.00/4.88k [00:00<?, ?B/s]

No config specified, defaulting to: pub_med_summarization_dataset/document


Downloading and preparing dataset pub_med_summarization_dataset/document to /root/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30...


Downloading:   0%|          | 0.00/779M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/43.7M [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/43.8M [00:00<?, ?B/s]

0 examples [00:00, ? examples/s]

0 examples [00:00, ? examples/s]

0 examples [00:00, ? examples/s]

Dataset pub_med_summarization_dataset downloaded and prepared to /root/.cache/huggingface/datasets/ccdv___pub_med_summarization_dataset/document/1.0.0/5792402f4d618f2f4e81ee177769870f365599daa729652338bac579552fec30. Subsequent calls will reuse this data.


  0%|          | 0/3 [00:00<?, ?it/s]

Downloading:   0%|          | 0.00/2.16k [00:00<?, ?B/s]
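To give a feel for what the ROUGE metric measures, here is a deliberately simplified sketch of a ROUGE-1 F1 score based on whitespace tokens. It is not the `rouge_score` implementation (which has its own tokenization and optional stemming), and the function name `rouge1_f1` is an assumption introduced for illustration.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two whitespace-tokenized texts."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Counter intersection keeps, for each token, the smaller of the two counts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(rouge1_f1("the cat", "the cat sat on"))
```

In this example both predicted tokens appear in the reference (precision 1.0), but only half of the reference is covered (recall 0.5), so the F1 score lands between the two.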

The `raw_datasets` object is a [`DatasetDict`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasetdict), which contains one key each for the training, validation, and test sets:

In [8]:
raw_datasets

DatasetDict({
    train: Dataset({
        features: ['article', 'abstract'],
        num_rows: 119924
    })
    validation: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6633
    })
    test: Dataset({
        features: ['article', 'abstract'],
        num_rows: 6658
    })
})

To access an actual element, you need to select a split first, then give an index:

In [9]:
raw_datasets["train"][0]

{'abstract': "<S> background : the present study was carried out to assess the effects of community nutrition intervention based on advocacy approach on malnutrition status among school - aged children in shiraz , iran.materials and methods : this case - control nutritional intervention has been done between 2008 and 2009 on 2897 primary and secondary school boys and girls ( 7 - 13 years old ) based on advocacy approach in shiraz , iran . </S> <S> the project provided nutritious snacks in public schools over a 2-year period along with advocacy oriented actions in order to implement and promote nutritional intervention . for evaluation of effectiveness of the intervention growth monitoring indices of pre- and post - intervention were statistically compared.results:the frequency of subjects with body mass index lower than 5% decreased significantly after intervention among girls ( p = 0.02 ) . </S> <S> however , there were no significant changes among boys or total population . </S> <S> 

Since the PubMed dataset is very large (119,924 training articles), we are going to keep only a subset of the rows: a training set of 8,000, a validation set of 2,000, and a test set of 2,000.

In [10]:
raw_datasets["train"] = raw_datasets["train"].select(range(1, 8001))
raw_datasets["validation"] = raw_datasets["validation"].select(range(1, 2001))
raw_datasets["test"] = raw_datasets["test"].select(range(1, 2001))
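The index arithmetic behind these calls can be checked with plain Python, no dataset required. Note that `range(1, 8001)` starts at index 1, so the row at index 0 is left out of the subset; `range(8000)` would give the same number of rows starting from the first one.

```python
# Indices passed to select() for the training split.
train_indices = range(1, 8001)

print(len(train_indices))                   # 8000
print(train_indices[0], train_indices[-1])  # 1 8000
print(0 in train_indices)                   # False: row 0 is skipped

# The validation and test splits use the same pattern with 2,000 rows each.
eval_indices = range(1, 2001)
print(len(eval_indices))                    # 2000
```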

To get a sense of what the data looks like, the following function shows a few examples picked at random from the dataset.

In [11]:
import datasets
import random
import pandas as pd
from IPython.display import display, HTML

def show_random_elements(dataset, num_examples=5):
    assert num_examples <= len(dataset), "Can't pick more elements than there are in the dataset."
    picks = []
    for _ in range(num_examples):
        pick = random.randint(0, len(dataset)-1)
        while pick in picks:
            pick = random.randint(0, len(dataset)-1)
        picks.append(pick)
    
    df = pd.DataFrame(dataset[picks])
    for column, typ in dataset.features.items():
        if isinstance(typ, datasets.ClassLabel):
            df[column] = df[column].transform(lambda i: typ.names[i])
    display(HTML(df.to_html()))
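The retry loop above draws indices until it has enough distinct ones. As an alternative sketch, `random.sample` from the standard library picks distinct indices in a single call. The helper `pick_unique_indices` and its `seed` parameter are assumptions added for illustration, not part of the original function.

```python
import random


def pick_unique_indices(dataset_length: int, num_examples: int = 5, seed=None) -> list:
    """Pick `num_examples` distinct row indices from a dataset of the given length."""
    if num_examples > dataset_length:
        raise ValueError("Can't pick more elements than there are in the dataset.")
    rng = random.Random(seed)  # a local generator keeps the global random state untouched
    return rng.sample(range(dataset_length), num_examples)


picks = pick_unique_indices(8000, num_examples=5, seed=0)
print(picks)
```

The resulting list could be passed to `dataset[picks]` in the same way as in `show_random_elements`.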

In [12]:
show_random_elements(raw_datasets["train"])

Unnamed: 0,article,abstract
0,"the alarming increase in overweight and obesity has played a pivotal role in the rise of type 2 diabetes prevalence . the obesity epidemic has also been associated with an increased prevalence of sleep disturbances , particularly obstructive sleep apnea ( osa ) ( 2,3 ) . over the past decade , both laboratory and epidemiologic studies have identified poor sleep quality and osa as putative novel risk factors for type 2 diabetes ( 46 ) . osa is a treatable chronic sleep disorder characterized by recurrent episodes of complete ( apnea ) or partial ( hypopnea ) obstruction of the upper airway . osa leads to intermittent hypoxemia and hypercapnia , increased oxidative stress , cortical microarousals , sleep fragmentation , and chronic sleep loss . hypopnea index ( ahi ) 5 events per hour , in nondiabetic obese adults have ranged from 50 to 68% ( 7,8 ) . in recent years , evidence has accumulated to indicate that osa is both a risk factor for type 2 diabetes and an exceptionally frequent comorbidity with an adverse impact on glycemic control . five independent studies , totaling nearly 1,400 patients with type 2 diabetes , have shown that the prevalence of osa ( assessed by polysomnography [ psg ] ) ranges between 58 and 86% ( 9,10 ) . independent of adiposity and other known confounders between the presence and severity of osa and insulin resistance and glucose intolerance in nondiabetic adults ( 1114 ) . two studies that used the gold standard of in - laboratory psg to accurately quantify the severity of osa reported a robust association between increasing osa severity and increasing levels of hemoglobin a1c ( hba1c ) in patients with type 2 diabetes , after controlling for multiple potential confounders ( 9,15 ) . 
while the findings of these two studies suggested that the effective treatment of osa may be a nonpharmacologic strategy to improve glucose control , the results of the only randomized , placebo - controlled clinical trial examining the impact of continuous positive airway pressure ( cpap ) treatment on hba1c in patients with type 2 diabetes were surprisingly disappointing ( 16 ) . one potential reason for the failure of osa treatment to improve chronic glycemic control in patients with type 2 diabetes is insufficient cpap use . notably , the mean nightly cpap use in this clinical trial was 3.6 h. as most of rapid eye movement ( rem ) sleep occurs in the early morning hours before habitual awakening , one possibility is that with suboptimal adherence to cpap therapy , obstructive apneas and hypopneas during rem sleep were disproportionately untreated compared with events in non - rem ( nrem ) sleep . this may be relevant to glycemic control because it is now well established that compared with nrem sleep , rem sleep is associated with greater sympathetic activity in healthy subjects as well as in patients with osa ( 1719 ) . further , compared with events in nrem sleep , obstructive apneas and hypopneas during rem sleep last nearly 30 s longer and are associated with significantly larger oxygen desaturation ( 2022 ) . therefore , obstructive events during rem sleep , as compared with nrem sleep , may have a larger adverse effect on insulin release and action . this issue has major clinical implications for the duration of cpap use that is needed to reverse the negative consequences of osa on glycemic control in type 2 diabetes . we have therefore performed a detailed analysis comparing the contributions of nrem versus rem osa to glycemic control as assessed by levels of hba1c in a large cohort of adults with type 2 diabetes . 
we prospectively recruited subjects with established type 2 diabetes using an advertisement posted in the primary care and endocrinology clinics at the university of chicago , inviting participation in a research study on sleep and diabetes . eligible individuals had to meet the criteria for type 2 diabetes based on physician diagnosis using standard criteria ( 23 ) . in order to include individuals with newly diagnosed type 2 diabetes , we also recruited in the community using an advertisement inviting subjects at risk for type 2 diabetes based on age and adiposity to participate in a research study on sleep and metabolism . all participants without an established diagnosis of type 2 diabetes had to undergo a standard 75-g oral glucose tolerance test and meet the american diabetes association guidelines for the diagnosis of type 2 diabetes ( 23 ) . all participants were in stable condition and , when on pharmacological treatment , on a stable antidiabetic medication regimen for the preceding 3 months . exclusion criteria were unstable cardiopulmonary conditions , neurological disorders , psychiatric disease , shift work , chronic insomnia , or any prior or current treatment for osa ( upper airway surgery , cpap therapy , oral appliances , or supplemental oxygen ) . we previously reported the association of osa severity categories ( no , mild , moderate , and severe osa ) with chronic glycemic control in type 2 diabetes using 60 of the participants included in this analysis ( 9 ) . the study was approved by the university of chicago institutional review board , and all participants gave written informed consent . subjects were admitted to the clinical resource center or the sleep research laboratory of the sleep , health , and metabolism center at the university of chicago to undergo an overnight in - laboratory psg . 
self - reported ethnicity - based diabetes risk was categorized as low - risk category ( non - hispanic whites ) and high - risk category ( african americans , hispanics , and asians ) . the duration of type 2 diabetes and the number of medications were verified by questionnaires as well as review of the patients medical records . hba1c values ( defined as the proportion of hemoglobin that is glycosylated ) were obtained from the patient s chart if assessed during the previous three months ( n = 17,15% of subjects ) or measured on a single blood sample drawn on the morning after the psg ( n = 98 , 85% of the subjects ) . hba1c was measured by bio - rad variant classic boronate affinity - automated high - performance liquid chromatography ( bio - rad , hercules , ca ) . the intra - assay coefficient of variation was 0.51.0% , and the interassay coefficient of variation was 2.22.4% . bedtimes were from 11:00 p.m.12:00 a.m. until 7:00 a.m.8:00 a.m. each subject was recorded for a minimum of 7 h to determine the presence and severity of obstructive respiratory events across the entire night . psg ( neurofax eeg 1100 system , nihon kohden , foothill ranch , ca ) included recordings of six electroencephalographic channels , bilateral electro - oculograms , chin and tibialis electromyogram , electrocardiogram , airflow by nasal pressure transducer and oronasal thermocouples , chest and abdominal wall motion by piezo electrode belts , and oxygen saturation by finger pulse oximeter . all psgs were staged and scored according to the 2007 american academy of sleep medicine manual for the scoring of sleep and related events ( 24 ) . apneas were defined as total cessation of airflow for at least 10 s ( obstructive if respiratory effort was present and central if respiratory effort was absent ) . 
hypopneas were scored if the magnitude of the ventilation signal decreased by at least 50% of the baseline amplitude of the nasal pressure transducer for at least 10 s and were associated with either a 3% or greater drop in oxygen saturation as measured by finger pulse oximetry or an electroencephalographic microarousal ( 24 ) . the total ahi was defined as the number of obstructive apneas and obstructive hypopneas per hour of sleep . given the minimal presence of central apneas , we did not include these events in the calculation of ahi . the median central apnea index was 0.001 ( interquartile range of 0.0010.41 ) , and the highest central apnea index was 5 . a subject was considered to have mild osa if the ahi was 514 , moderate osa if the ahi was 1529 , and severe osa if the ahi was 30 . rem ahi was calculated as the number of apneas and hypopneas during rem sleep divided by total time in rem sleep in hours . nrem ahi was calculated by dividing the number of apneas and hypopneas during nrem sleep by total time in nrem sleep in hours . the oxygen desaturation index ( odi ) was defined as the total number of oxygen desaturations of at least 3% per total sleep time ( tst ) in hours . the microarousal index ( mai ) was calculated as the total number of microarousals per hour of sleep . differences between subjects with and without osa were tested using the student t test or mann whitney nonparametric test for continuous variables . categorical variables were reported as proportions and were compared using the square test or fisher s exact test . five multivariate linear models were successively fitted to examine associations between hba1c and measures of osa severity after controlling for multiple covariates . model 1 included demographic variables traditionally associated with glycemic control , namely , age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use . 
lastly , model 5 included all the covariates in model 1 plus nrem ahi and rem ahi . since there are individuals who have a significant number of apneas and hypopneas during rem sleep while having an overall ahi below 5 ( hence no osa based on current definitions ) , we included all 115 subjects ( with or without osa ) in the multivariate regression models in order to explore the entire spectrum of rem and nrem events ( ahi , odi , and mai ) . we formally ruled out any evidence of collinearity among the variables entered in the models using standard statistics , including tolerance and variance inflation factor ( spss statistics v20 , ibm , armonk , ny ) . values for hba1c , years of type 2 diabetes , and rem and nrem ahi were submitted to natural log ( ln ) transformation . in order to deal with zero values , the total ahi , rem ahi , and nrem ahi were log transformed using the formula ahi = log(ahi + 0.01 ) . only one outlier was identified ( low hba1c ) , and sensitivity analysis excluding this subject was performed confirming the association between hba1c and measures of rem osa severity . goodness of fit of the models was assessed using diagnostic plots . in order to estimate the effect size of increasing severity of osa on hba1c in a clinically useful manner , models were fitted to estimate the change in adjusted hba1c based on quartiles of rem and nrem ahi . the models included all the covariates in model 5 ( age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use ) and replaced lnrem ahi with rem ahi quartiles while keeping lnnrem ahi in the model . this process was repeated , and lnnrem ahi was replaced with nrem ahi quartiles while keeping lnrem ahi in the model . similar models were fitted for rem and nrem odi and mai quartiles . 
to simulate the impact of various durations of nocturnal cpap therapy on hba1c , we calculated the mean profiles of cumulative minutes of rem and nrem sleep over 8 h of total recording time from the 115 polysomnograms . we then estimated mean percentages of rem and nrem sleep left untreated after 4 , 6 , and 7 h of optimal cpap treatment eliminating all events . for each duration of cpap use , we entered the number of rem and nrem obstructive events left untreated in a regression model predicting hba1c after adjustment for age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use . all statistical analyses were performed using spss statistics v20 and verified using stata ( v10.1 , college station , tx ) . we prospectively recruited subjects with established type 2 diabetes using an advertisement posted in the primary care and endocrinology clinics at the university of chicago , inviting participation in a research study on sleep and diabetes . eligible individuals had to meet the criteria for type 2 diabetes based on physician diagnosis using standard criteria ( 23 ) . in order to include individuals with newly diagnosed type 2 diabetes , we also recruited in the community using an advertisement inviting subjects at risk for type 2 diabetes based on age and adiposity to participate in a research study on sleep and metabolism . all participants without an established diagnosis of type 2 diabetes had to undergo a standard 75-g oral glucose tolerance test and meet the american diabetes association guidelines for the diagnosis of type 2 diabetes ( 23 ) . all participants were in stable condition and , when on pharmacological treatment , on a stable antidiabetic medication regimen for the preceding 3 months . 
exclusion criteria were unstable cardiopulmonary conditions , neurological disorders , psychiatric disease , shift work , chronic insomnia , or any prior or current treatment for osa ( upper airway surgery , cpap therapy , oral appliances , or supplemental oxygen ) . we previously reported the association of osa severity categories ( no , mild , moderate , and severe osa ) with chronic glycemic control in type 2 diabetes using 60 of the participants included in this analysis ( 9 ) . the study was approved by the university of chicago institutional review board , and all participants gave written informed consent . subjects were admitted to the clinical resource center or the sleep research laboratory of the sleep , health , and metabolism center at the university of chicago to undergo an overnight in - laboratory psg . self - reported ethnicity - based diabetes risk was categorized as low - risk category ( non - hispanic whites ) and high - risk category ( african americans , hispanics , and asians ) . the duration of type 2 diabetes and the number of medications were verified by questionnaires as well as review of the patients medical records . hba1c values ( defined as the proportion of hemoglobin that is glycosylated ) were obtained from the patient s chart if assessed during the previous three months ( n = 17,15% of subjects ) or measured on a single blood sample drawn on the morning after the psg ( n = 98 , 85% of the subjects ) . hba1c was measured by bio - rad variant classic boronate affinity - automated high - performance liquid chromatography ( bio - rad , hercules , ca ) . the intra - assay coefficient of variation was 0.51.0% , and the interassay coefficient of variation was 2.22.4% . bedtimes were from 11:00 p.m.12:00 a.m. until 7:00 a.m.8:00 a.m. each subject was recorded for a minimum of 7 h to determine the presence and severity of obstructive respiratory events across the entire night . 
psg ( neurofax eeg 1100 system , nihon kohden , foothill ranch , ca ) included recordings of six electroencephalographic channels , bilateral electro - oculograms , chin and tibialis electromyogram , electrocardiogram , airflow by nasal pressure transducer and oronasal thermocouples , chest and abdominal wall motion by piezo electrode belts , and oxygen saturation by finger pulse oximeter . all psgs were staged and scored according to the 2007 american academy of sleep medicine manual for the scoring of sleep and related events ( 24 ) . apneas were defined as total cessation of airflow for at least 10 s ( obstructive if respiratory effort was present and central if respiratory effort was absent ) . hypopneas were scored if the magnitude of the ventilation signal decreased by at least 50% of the baseline amplitude of the nasal pressure transducer for at least 10 s and were associated with either a 3% or greater drop in oxygen saturation as measured by finger pulse oximetry or an electroencephalographic microarousal ( 24 ) . the total ahi was defined as the number of obstructive apneas and obstructive hypopneas per hour of sleep . given the minimal presence of central apneas , we did not include these events in the calculation of ahi . the median central apnea index was 0.001 ( interquartile range of 0.0010.41 ) , and the highest central apnea index was 5 . a subject was considered to have mild osa if the ahi was 514 , moderate osa if the ahi was 1529 , and severe osa if the ahi was 30 . rem ahi was calculated as the number of apneas and hypopneas during rem sleep divided by total time in rem sleep in hours . nrem ahi was calculated by dividing the number of apneas and hypopneas during nrem sleep by total time in nrem sleep in hours . the oxygen desaturation index ( odi ) was defined as the total number of oxygen desaturations of at least 3% per total sleep time ( tst ) in hours . 
the microarousal index ( mai ) differences between subjects with and without osa were tested using the student t test or mann whitney nonparametric test for continuous variables . categorical variables were reported as proportions and were compared using the square test or fisher s exact test . five multivariate linear models were successively fitted to examine associations between hba1c and measures of osa severity after controlling for multiple covariates . model 1 included demographic variables traditionally associated with glycemic control , namely , age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use . lastly , model 5 included all the covariates in model 1 plus nrem ahi and rem ahi . since there are individuals who have a significant number of apneas and hypopneas during rem sleep while having an overall ahi below 5 ( hence no osa based on current definitions ) , we included all 115 subjects ( with or without osa ) in the multivariate regression models in order to explore the entire spectrum of rem and nrem events ( ahi , odi , and mai ) . we formally ruled out any evidence of collinearity among the variables entered in the models using standard statistics , including tolerance and variance inflation factor . values for hba1c , years of type 2 diabetes , and rem and nrem ahi were submitted to natural log ( ln ) transformation . in order to deal with zero values , the total ahi , rem ahi , and nrem ahi were log transformed using the formula ahi = log(ahi + 0.01 ) . only one outlier was identified ( low hba1c ) , and sensitivity analysis excluding this subject was performed confirming the association between hba1c and measures of rem osa severity . goodness of fit of the models was assessed using diagnostic plots . 
in order to estimate the effect size of increasing severity of osa on hba1c in a clinically useful manner , models were fitted to estimate the change in adjusted hba1c based on quartiles of rem and nrem ahi . the models included all the covariates in model 5 ( age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use ) and replaced lnrem ahi with rem ahi quartiles while keeping lnnrem ahi in the model . this process was repeated , and lnnrem ahi was replaced with nrem ahi quartiles while keeping lnrem ahi in the model . similar models were fitted for rem and nrem odi and mai quartiles . to simulate the impact of various durations of nocturnal cpap therapy on hba1c , we calculated the mean profiles of cumulative minutes of rem and nrem sleep over 8 h of total recording time from the 115 polysomnograms . we then estimated mean percentages of rem and nrem sleep left untreated after 4 , 6 , and 7 h of optimal cpap treatment eliminating all events . for each duration of cpap use , we entered the number of rem and nrem obstructive events left untreated in a regression model predicting hba1c after adjustment for age , sex , ethnicity - based diabetes risk , bmi , years of type 2 diabetes , and insulin use . all statistical analyses were performed using spss statistics v20 and verified using stata ( v10.1 , college station , tx ) . therefore , the study was completed by 131 participants . those who obtained less than 4 h of tst during the psg were not included in the analysis ( n = 7 ) . participants were also excluded if the psg data could not be interpreted due to multiple artifacts in the airflow signal ( n = 8) . one patient showed severe oxygen desaturation not explained by apneas or hypopneas consistent with significant hypoventilation . of the 115 subjects included in the study , 56.5% were women , 58.3% were african american , 68.7% were obese , and the mean bmi was 34.5 kg / m . 
the median duration of type 2 diabetes was 4 years , and a quarter of the subjects were not on any antidiabetic medication . i.e. , neuropathy , nephropathy , retinopathy , coronary artery disease , or peripheral arterial disease ) . mild osa was present in 27% , moderate osa in 28.7% , and severe osa in 29.6% of the subjects . there were no significant differences in sex , race , bmi , years of type 2 diabetes , number of antidiabetic medications , insulin use , and hba1c level between subjects with and without osa , but participants without osa were on average 9 years younger than those with osa . the lack of statistically significant differences in bmi and hba1c may have been related to the small number of subjects without osa . there were no significant differences in total recording time and percentage of slow wave sleep between subjects with and without osa . however , subjects with osa had significantly less tst and percentage of rem sleep and significantly higher wake after sleep onset . within the participants with osa ( n = 98 ) , rem ahi , rem odi , and rem mai were all significantly higher than nrem ahi , nrem odi , and nrem mai . the odi was more than fourfold higher during rem than nrem sleep , but differences in mai were more modest . demographic and clinical features of 115 patients with type 2 diabetes table 2 describes the results of the five multivariate linear regression models predicting hba1c . model 3 shows that nrem ahi was not associated with hba1c ( p = 0.070 ) . in contrast , in model 4 , rem ahi was independently associated with hba1c ( p = 0.001 ) . in model 5 , rem ahi ( p = 0.008 ) remained a significant predictor of hba1c even after adjusting for nrem ahi ( p = 0.762 ) . in the final fully adjusted model 5 , the independent predictors of increased hba1c were rem ahi ( p = 0.008 ) , race risk ( p = 0.001 ) , years of type 2 diabetes ( p = 0.001 ) , and insulin use ( p < 0.001 ) . 
similar results were obtained when nrem ahi and rem ahi were replaced by the total number of events in nrem and rem sleep , respectively ( p = 0.023 for rem events and p = 0.355 for nrem events ) . multivariate linear regression models predicting natural log of hba1c in patients with type 2 diabetes in order to estimate the effect size of increasing levels of rem ahi and nrem ahi on hba1c , we performed multivariate linear regression models using quartiles of rem ahi and nrem ahi . 1 , after adjustment for age , sex , bmi , race risk , years of type 2 diabetes , insulin use , and lnnrem ahi , increasing quartiles of rem ahi were significantly associated with increasing levels of hba1c ( p = 0.044 for linear trend ) . the mean adjusted hba1c increased from 6.3% in subjects with rem ahi < 12.3 events per hour ( lowest quartile ) to 7.3% in subjects with rem ahi > 47 events per hour ( highest quartile ) . similarly , quartiles of rem odi and rem mai were significantly associated with increasing levels of hba1c . the mean adjusted hba1c increased from 6.5% in the lowest quartile of rem odi to 7.5% in the highest quartile ( p = 0.039 for linear trend ) . similarly , the mean adjusted hba1c increased from 7.6% in the lowest quartile of rem mai to 8.9% in the highest quartile ( p = 0.003 for linear trend ) . in contrast , increasing levels of nrem ahi , nrem odi , and nrem mai quartiles were not associated with hba1c ( fig . 1 ) . adjusted mean hba1c values for rem and nrem ahi , odi , and mai quartiles . for all the panels , multivariate linear regression models were fitted to estimate the mean natural ln hba1c adjusted for demographic variables traditionally associated with glycemic control such as age , sex , ethnicity - based diabetes risk , bmi , ln years of type 2 diabetes , and insulin use . in addition , panels are adjusted for ( a ) lnnrem ahi , ( b ) lnrem ahi , ( c ) lnnrem odi , ( d ) lnrem odi , ( e ) lnnrem mai , and ( f ) lnrem mai . 
age and bmi are centered at their means : 55 years old and 35 kg / m , respectively . the corresponding -coefficients for each quartile were then exponentiated to convert from ln hba1c to the standard values of hba1c . figure 2a illustrates the predominance of rem sleep in the later part of sleep . in our cohort , 3 and 4 h after lights off , on average only 25 and 40% of rem sleep had occurred , respectively . optimally titrated cpap use for 3 or 4 h would treat only 25 or 40% of rem sleep , respectively , and would leave most obstructive events during rem sleep untreated . figure 2b and c illustrate the simulated impact of 4 , 6 , and 7 h of cpap use in men and women with low and high race / ethnicity - based diabetes risk . this simulation clearly shows that the metabolic benefit of 4 h of cpap use , often considered as adequate cpap compliance , is modest , while a much more clinically significant effect can be obtained when treatment is extended to 6 h and beyond . cumulative minutes of rem and nrem sleep over 8 h of bedtime ( a ) and simulation of various hours of cpap use in men and women with type 2 diabetes based on race / ethnicity - based diabetes risk ( b and c ) . a : data are summarized as mean sd of cumulative rem and nrem sleep minutes from lights off to lights on in 115 subjects with type 2 diabetes . the mean duration of rem and nrem sleep in our cohort was 82 and 298 min , respectively . using cpap for 3 or 4 h from the time lights are turned off will cover only 25 or 40% of rem sleep , respectively , and will leave most obstructive events during rem sleep untreated . b and c : simulation of the impact of 4 , 6 , and 7 h of cpap use in four groups of subjects based on sex and race / ethnicity - related diabetes risk . with this simulation , 4 h of cpap use would treat 40% of rem sleep and would lead to a drop in adjusted hba1c of 0.230.28% . 
in contrast , 7 h of cpap therapy would treat 87% of rem sleep and lead to a decrease in adjusted hba1c between 0.87 and 1.1% . this study reveals that hba1c , a measure of chronic glycemic control in patients with type 2 diabetes , is adversely associated with obstructive apneas and hypopneas that occur in rem sleep ( rem ahi ) but not in nrem sleep ( nrem ahi ) . the independent association between rem ahi , rem odi , and rem mai and hba1c is robust and of clinical significance , with a difference of 1.0% hba1c between the lowest and highest quartiles of rem ahi as well as rem odi and of 1.3% hba1c between the lowest and highest quartiles of rem mai after adjusting for all the covariates . these effect sizes are comparable to what would be expected from widely used antidiabetic medications . the severity of osa in our cohort was greater in rem sleep than in nrem sleep , as evidenced by a higher ahi and a nearly fourfold higher odi . thus , despite the shorter duration of rem sleep , exposure to the adverse consequences of osa , particularly intermittent hypoxemia , was greater during rem than nrem sleep . surprisingly , in our diabetic participants with osa , nrem ahi only predicted approximately 25% of the variance of rem ahi . whether hyperglycemia plays a role in this relative independence of the severity of rem osa relative to nrem osa remains to be determined . multiple mechanistic pathways are likely to be involved in the link between rem osa and poorer glycemic control in subjects with type 2 diabetes . when compared with nrem sleep or quiet wakefulness , rem sleep is associated with increased sympathetic activation and reduced vagal tone in normal subjects and even more so in patients with osa ( 1719 ) . most endocrine organs releasing hormones involved in glucose regulation are sensitive to changes in sympathovagal balance . 
well - documented examples relevant to metabolic risk are pancreatic insulin secretion , hepatic glucose production , and adipocyte regulation of energy balance ( 2527 ) . in addition , peptidergic factors originating from the intestine ( glucagon - like peptide-1 and glucose - dependent insulinotropic polypeptide ) augment the insulin response induced by nutrients . the secretion of these incretin hormones is intimately linked to autonomous nervous system ( 2830 ) . however , it is important to point out that the impact of osa on sympathetic activation in patients with type 2 diabetes of long duration remains unclear and that long - standing hyperglycemia may lead to reduction in sympathetic activity . lastly , obstructive apneas and hypopneas during rem sleep lead to greater degrees of hypoxemia than in nrem sleep ( 21,22 ) . in the present cohort , rem odi was indeed much greater than nrem odi . intermittent hypoxemia has been shown to be toxic to -cell function in murine models of sleep apnea ( 31,32 ) . the findings from our analyses strongly suggest that rem - related obstructive respiratory events are of clinical significance for the severity of type 2 diabetes . two recent studies that performed continuous interstitial glucose monitoring simultaneously with psg directly support our hypothesis that rem - related osa may have adverse metabolic consequences ( 33,34 ) . one of these studies included 13 obese patients with type 2 diabetes with severe osa and compared them with 13 obese patients with type 2 diabetes without osa with similar demographic characteristics . although there was no difference in the mean diurnal glycemic level between the two groups , the mean glycemic level was 38% higher during rem sleep in those with osa ( 33 ) . they found that in the absence of osa , rem sleep leads to a larger decline in interstitial glucose concentration than nrem sleep . 
in contrast , osa during nrem sleep had no impact on interstitial glucose concentrations ( 34 ) . taken together , the evidence from studies assessing interstitial glucose levels supports our finding that obstructive events during rem sleep are adversely associated with glucose metabolism . while our participants did not meet any proposed definition of rem - related or rem - predominant osa ( 35,36 ) , our findings suggest that failure to recognize and treat osa in rem sleep may be of critical clinical significance for glycemic control in diabetic patients . in clinical practice , 4 h of nightly cpap use is considered adequate adherence to therapy ( 37 ) . indeed , a randomized controlled trial of cpap therapy in patients with type 2 diabetes reported an average use of 3.6 h per night ( 16 ) . the severity of residual osa was not estimated . the disappointing results of cpap trial in type 2 diabetes may reflect the failure to treat rem osa due to insufficient cpap use , leaving most obstructive events during rem sleep untreated . alternatively , there may be other factors beyond poor cpap adherence that led to a lack of improvement in glycemic control such as a poor reserve in -cell function . our analyses show that based on the distribution of cumulative rem sleep in our cohort , cpap therapy for the first 4 h after lights off would leave 60% of obstructive events during rem sleep untreated and would be associated with a decrease in the adjusted hba1c by only 0.250.28% . in contrast , 7 h of optimal cpap therapy would be associated with a decrease in the adjusted hba1c by 0.871.1% . first , we used hba1c , the most commonly used measure in clinical practice , to assess glycemic control . therefore , we can not ascertain whether the mediating pathways linking rem ahi to hba1c involve increased insulin resistance or impaired -cell function . moreover , we only measured hba1c at a single time point , which was not consistently on the same day as psg . 
however , treatment was stable for the preceding 3 months in all participants , and hba1c was measured on the morning after the psg in 98 out of 115 participants ( 85% ) . although hba1c reflects glycemic control 10 to 12 weeks before the assay , it mostly reflects glucose fluctuations during the last 6 weeks of the measurement . despite our efforts to ensure treatment stability in the prior 3 months , we can not exclude the possibility that fluctuations in adherence to medications may have influenced hba1c levels . our study did not assess associations between rem osa and glucose control in subjects with prediabetes or normal glucose tolerance . we also had a large proportion of african americans and subjects requiring little or no antidiabetic medications . therefore , it would be important for our findings to be replicated in larger and more diverse cohorts , including participants with more diabetes complications and/or longer disease duration as well as individuals with prediabetes or with normal glucose tolerance but at high risk for type 2 diabetes . also , we did not have a measure of habitual sleep duration , which may be important in evaluating chronic exposure to rem osa , and our only measure of adiposity was the bmi . lastly , the cross - sectional nature of the study does not address the direction of causality . indeed , only rigorously designed intervention studies will provide causal evidence between disordered breathing during rem sleep and glucose metabolism dysregulation . in summary , our findings support the notion that osa in rem sleep has a strong and clinically significant association with glycemic control in subjects with type 2 diabetes . since rem sleep is dominant during the latter part of the sleep period , rem - related osa may often remain untreated with 4 h of cpap use . 
our analyses suggest that to achieve clinically significant improvement in glycemic control in patients with type 2 diabetes , cpap use may need to be extended beyond 6 h per night . further research is needed to elucidate the mechanistic pathways linking osa during rem sleep and adverse metabolic outcomes .","<S> objectiveseverity of obstructive sleep apnea ( osa ) has been associated with poorer glycemic control in type 2 diabetes . </S> <S> it is not known whether obstructive events during rapid eye movement ( rem ) sleep have a different metabolic impact compared with those during non - rem ( nrem ) sleep . </S> <S> treatment of osa is often limited to the first half of the night , when nrem rather than rem sleep predominates . </S> <S> we aimed to quantify the impact of osa in rem versus nrem sleep on hemoglobin a1c ( hba1c ) in subjects with type 2 diabetes.research design and methodsall participants underwent polysomnography , and glycemic control was assessed by hba1c.resultsour analytic cohort included 115 subjects ( 65 women ; age 55.2 9.8 years ; bmi 34.5 7.5 kg / m2 ) . in a multivariate linear regression model , </S> <S> rem apnea </S> <S> hypopnea index ( ahi ) was independently associated with increasing levels of hba1c ( p = 0.008 ) . </S> <S> in contrast , nrem ahi was not associated with hba1c ( p = 0.762 ) . </S> <S> the mean adjusted hba1c increased from 6.3% in subjects in the lowest quartile of rem ahi to 7.3% in subjects in the highest quartile of rem ahi ( p = 0.044 for linear trend ) . </S> <S> our model predicts that 4 h of continuous positive airway pressure ( cpap ) use would leave 60% of rem sleep untreated and would be associated with a decrease in hba1c by approximately 0.25% . </S> <S> in contrast , 7 h of cpap use would cover more than 85% of rem sleep and would be associated with a decrease in hba1c by as much as 1%.conclusionsin type 2 diabetes , osa during rem sleep may influence long - term glycemic control . 
</S> <S> the metabolic benefits of cpap therapy may not be achieved with the typical adherence of 4 h per night . </S>"
1,"intoxication due to carbon monoxide ( co ) is one of the most common types of poisoning , and one of the most important toxicological global causes of morbidity and mortality ( 1 ) . a weak association between carboxyhemoglobin ( cohb ) level and patients clinical picture is documented ( 2 ) . co binds rapidly to hemoglobin with greater affinity than oxygen ( o2 ) and forms cohb , which leads to a decrease in the o2 carrying capacity of the blood resulting in tissue hypoxia . therefore , organs with high oxygen demand , such as heart , brain and lungs are most sensitive to hypoxia ( 3 - 6 ) . cardiac effects of cohb range from simple arrhythmias to myocardial infarction ( 7 - 9 ) . cardiac troponins ( ctn ) the release of ctn into the circulation occurs as a consequence of cardiomyocyte injury ( 10 ) . highly sensitive - ctn ( hs - ctnt ) assays are developed recently , enabling measurements of concentrations that are 100-fold lower than those of the ones previously measurable ( 11 ) . the current study aimed to assess whether or not myocardial damage occurs in patients with co poisoning . the study investigated the relationship between blood carboxyhemoglobin and hs - ctnt level with a highly sensitive assay in patients with acute co poisoning . the current retrospective study was conducted at the sevket yilmaz training and research hospital that is a state tertiary hospital with 1050 beds , located in the eastern bursa , turkey . the study used the data available in the hospital clinical data warehouse , a centralized data repository integrating information in several databases including the order entry database and the laboratory results database of the hospital . patients diagnosed with co poisoning were included in the study and their corresponding electronic charts were reviewed for data collection . 
prescription data were linked to detailed clinical information including patient demographics , diagnosis , clinical characteristics ( including past medical history , smoking status ) , co source ( charcoal or fire ) , vital signs at presentation , physical examination characteristics , cohb levels , treatment therapy , complications in erectile dysfunction ( ed ) , and laboratory data ; the latter included specimen collection date , time , and location ( for example : intensive care unit ) . two - hundred - seventeen cases admitted to the emergency medicine unit of the hospital in 2012 , ( january 2012 - january 2013 ) with the diagnosis of acute co intoxication seventy - six patients whose additional diagnoses indicated chronic ischaemic heart disease , heart failure , myocarditis , muscular dystrophy , polymyositis , implanted cardiac resynchronization device , chronic inflammatory disease , and chronic renal failure were excluded . the overall study population included 141 subjects ( 87 females ( 62% ) and 54 males ( 38% ) ; 70% of the poisonings occurred in the winter , 24% in spring and 6% in autumn . the study was approved by the sevket yilmaz research and education hospital ethics committee ( no . 2013/7 - 4 ) and was in compliance with the helsinki declaration ; informed consent was not assumed necessary because of the retrospective observational nature of the study and all steps were taken to ensure the anonymity of the data . blood samples were first collected ( on admission ) in the emergency department and four hours later in the intensive care unit ( icu ) . routine laboratory data ( cohb , creatine kinase - myocardial band ( ck - mb ) and high - sensitivity cardiac troponin t ( hs - tnt ) were recorded . cohb levels were measured by a blood gas analyzer ( omni s , roche diagnostics penzberg , germany ) supported by a co - oximetry panel . 
hs - ctnt and ck - mb levels were determined by an elecsys 2010 autoanalyzer ( roche diagnostics , penzberg , germany ) using commercial assays . according to the manufacturer of the hs - ctnt stat assay , ng / l , and the 99th percentile in healthy volunteers was 14 ng / l . for quality assurance purposes , the laboratory participates in an external quality assessment scheme run by labquality , helsinki , finland . at the time of the present study , the qc program reported average values as part of the treatment , all the patients inhaled high flow normobaric oxygen and were monitored and followed up in icu . treatment continued until clinical findings stabilized and serum cohb levels decreased to target levels of 5% . the patients intoxicated with co were divided into three groups depending on cohb levels : group i , mild cohb level < 15% ; group ii , cohb levels < 25% and > 15% ; group iii , severe acute co intoxication cohb levels > 25% . samples with hs - ctnt below the limit of blank ( i e , 3 ng / l ) were assigned a value of 1 ng / l . the comparisons between the medians of three groups were performed by the kruskal - wallis test and the post - hoc dunnett tests were used to examine the significance levels between groups . spearman rank analysis was used to assess associations between cohb , ck - mb and hs - tnt levels ( spss , chicago , il ) and p values of < 0.05 were considered significant . the patients intoxicated with co were divided into three groups depending on cohb levels : group i , mild cohb level < 15% ; group ii , cohb levels < 25% and > 15% ; group iii , severe acute co intoxication cohb levels > 25% . samples with hs - ctnt below the limit of blank ( i e , 3 ng / l ) were assigned a value of 1 ng / l . the comparisons between the medians of three groups were performed by the kruskal - wallis test and the post - hoc dunnett tests were used to examine the significance levels between groups . 
spearman rank analysis was used to assess associations between cohb , ck - mb and hs - tnt levels . ( spss , chicago , il ) and p values of < 0.05 were considered significant . one - hundred - forty of the patients were poisoned at home and only one of the patients was poisoned at work and they used coal or wood for heating . the predominating symptoms were nausea ( n = 89 ) , headache ( n = 86 ) , dizziness ( n = 47 ) and vomiting ( n = 28 ) . cohb levels ranged from 8 to 35 ( median 23 , mean 22.0 6.6% ) on admission . the patients intoxicated with co were divided into three groups depending on their cohb levels : group i , mild cohb level < 15% ( n = 29 ) ; group ii , cohb between 15% and 25% ( n = 67 ) ; group iii , severe acute co intoxication cohb levels > 25% ( n = 45 ) ( table 1 ) ( 12 ) . abbreviations : ck - mb , creatine kinase - myocardial band ; cohb , carboxyhemoglobin ; hs - tnt , high - sensitivity cardiac troponin t. values are expressed as median ( interquartile range ) or mean sd . median hs - ctnt increased with increasing cohb level ( kruskal - wallis p = 0.05 ; table 1 ) . when the post - hoc dunnett test was performed , hs - ctnt levels were not statistically different between the groups . ck - mb levels did not differ between the three groups ( kruskal - wallis ; p = 0.48 ) . cohb levels with hs - tnt values were weakly correlated ( r = 0.173 , p = 0.041 ) ; on the other hand , ck - mb levels were not correlated with those of the cohb ( r = 0.013 , p = 0.883 ) ( table 2 ) . abbreviations : ck - mb , creatine kinase - myocardial band ; hs - tnt , high - sensitivity cardiac troponin t. on admission , 5 of the 141 patients had elevated serum ck - mb levels and 20 had elevated serum hstnt levels ( > 14 ng / l ) , only three of the patients with cardiac markers were elevated on the follow - up period . 
the findings of the cohort current study of patients with cohb intoxication hs - ctnt levels were slightly higher in severe toxicated patients than in mild toxicated patients without significant correlation between cohb and hs - ctnt levels . cardiac troponins are components of the contractile apparatus of cardiomyocytes and are released during myocardial necrosis . serum troponin elevation is a specific and well established myocardial necrosis biomarker , and can detect extremely small amounts of myocardial necrosis ( < 1.0 g ) ( 12 , 13 ) . very low , but detectable amounts of hs - ctnt levels may reflect a normal biological process of myocyte turnover and it may also be associated with an increased cell turnover ( 14 , 15 ) . the proposed mechanisms of cardiac troponin release include apoptosis , cellular release of proteolytic degradation products , increased cell wall permeability , and formation and release of membranous blebs ( 11 ) . many acute diseases are associated with elevated ctn in the absence of acute ischemic heart disease that can occur for many reasons ( 16 ) . direct toxic effects of circulating cytokines and chemotherapies can cause severe myocardial toxicity as severe sepsis and septic shock ( 12 , 16 ) . recent data showed that one can detect the effects of some toxic chemotherapy by monitoring ctn ( 16 ) . low levels of co activate soluble guanylate cyclase which in turn exerts beneficial effects such as vasodilatation and inhibition of platelet aggregation ( 7 ) . in the current study , hs - ctnt levels were higher in patients with severe co toxicity compared to the ones with mild co intoxication . although ultramicroscopic changes are reported in cases of co toxicity , its relative effects need to be documented ( 17 ) . a direct toxic effect of cohb is discussed as a consequence of experimental studies on cytochrome oxidase ( 17 ) . co binds with cardiac myoglobin causes a rapid decrease in myocardial oxygen reserves ( 18 ) . 
when the energy source is blocked , then the function of the myoglobin is diminished ( 19 , 20 ) . myocardial fiber necrosis and other changes observed with electron microscopy are associated with impaired energy metabolism ( 22 , 23 ) . however , cardiac troponins are released without electron microscopic changes ( 24 , 25 ) . in this study ck - mb levels did not differ between the three groups and did not correlate with cohb . patients with detectable troponin , but no ck - mb , in the blood may exhibit microscopic zones of myocardial necrosis ( microinfarction ) ( 13 ) . ( 26 ) showed that in co poisoning , patients without known underlying significant coronary artery disease with cohb levels of up to 60% do not develop myocardial damage but they used only cardiac biomarkers ck - mb and ctnt . the current study did not find any significant elevation in ck - mb levels , but in contrast a significant increase in hs - ctnt levels was found , which might be because of the microscopic myocardial necrosis ( 26 ) . myocardial injury , documented with elevations in cardiac biomarkers , could be present in about one - third of the patients with serious co poisoning and it was associated with mortality ( 7 ) . the severity of myocardial injury depended on the duration and amount of co exposure ( 27 ) . moreover , the level of co in the tissues may have an equal or greater impact on the clinical status of the patient than does the blood level of co ( 28 ) . the primary limitation of the current small study group was the method of data collection . there was no information on the length of co exposure and the timing of the blood cohb levels in relation to the co exposures . further studies are necessary in this regard . another limitation was the retrospective nature of the study . the data were reviewed by one researcher who avoided the selection bias ; however , misclassification , may still exist , which can not be verified or validated . 
these results apply only to the elecsys hs - ctnt and ck - mb ( roche diagnostics ) assays , and may not be generalized to other high sensitivity assays by other manufacturers . finally , the study design was based on the data available in one hospital , and the obtained results may not necessarily be generalized . in conclusion , in patients without clear signs of myocardial infarction , even mild co poisoning is associated with quantifiable circulating levels of hs - ctnt when tnt is measured using a highly sensitive assay and cardiac complications should be considered in such patients . plasma levels of the hs - tnt and ck - mb assays were not correlated with the cohb levels . in conclusion , in patients without clear signs of myocardial infarction , even mild co poisoning is associated with quantifiable circulating levels of hs - ctnt when tnt is measured using a highly sensitive assay and cardiac complications should be considered in such patients . plasma levels of the hs - tnt and ck - mb assays were not correlated with the cohb levels .","<S> backgroundintoxication due to carbon monoxide ( co ) is one of the most common types of poisoning . </S> <S> cardiac effects of carboxyhemoglobin ( cohb ) range from simple arrhythmias to myocardial infarction.objectivesthe current study aimed to investigate the relationship between blood carboxyhemoglobin and high - sensitivity cardiac troponin t ( hs - ctnt ) level with a highly sensitive assay in patients with acute carbon monoxide poisoning.patients and methodsthis retrospective study was conducted on 141 ( 54 males and 87 females ) patients , with acute co intoxication , admitted to the sevket yilmaz research and education hospital emergency unit during a one - year period ( january 2012 - january 2013 ) . </S> <S> the patients were divided into three groups based on cohb levels : group i , mild cohb level < 15% ; group ii , cohb between 15% and 25% ; group iii , severe acute co intoxication cohb levels > 25% . 
</S> <S> cohb , hs - ctnt ( stat ) , creatine kinase ( ck ) and creatine kinase - myocardial band ( ck - mb ) levels were measured on admission.resultsthe mean age of the patients was 38 16 years . </S> <S> cohb levels ranged from 8 to 35 . </S> <S> hs - ctnt levels on inclusion in this study were slightly different between the groups ( p = 0.05 ) . </S> <S> cohb levels with hs - ctnt values were weakly correlated ( r = 0.173 , p = 0.041 ) ; on the other hand , ck - mb levels were not correlated with cohb ( r = 0.013 , p = 0.883).conclusionsin patients without clear signs of myocardial infarction , even mild co poisoning was associated with quantifiable circulating levels of hs - ctnt when tnt was measured using a highly sensitive assay in the current study patients . </S> <S> plasma levels of the hs - tnt and ck - mb assays were not correlated with the cohb levels in the current study patients . </S>"
2,"crystal engineering the \n rational design of crystalline molecular \n solids remains an important challenge for chemistry . crystal structure prediction is not yet feasible \n in all cases , and it is therefore useful \n to develop motifs which allow families of structures to be generated \n in a reliable fashion . there is special \n interest in motifs which lead to microporous ( or nanoporous ) crystals \n with voids on the scale of 0.52 nm , sufficient to \n accommodate molecular guests . such materials offer various functionalities , \n such as inclusion and storage of gases , and other guest molecules , the enhancement \n of optical properties of included guests , the use of pores as reaction vessels to promote the formation of \n desired products , and the separation of \n mixtures including enantiomers from racemates . at the same time crystallizing species will usually \n attempt to maximize contact with each other , thus minimizing any void \n space . to counter this trend is not straightforward and will often \n require the construction of specially shaped rigid components or units \n capable of strong and directional interspecies interactions . successful approaches to nanoporous crystal engineering may be \n divided into two categories . on the one hand ( pcps ) or metal organic \n frameworks ( mofs ) , formed by combining \n metal ions with rigid multivalent ligands . on the other are purely \n organic systems , which rely on noncovalent bonding to regulate crystal \n packing . the organic systems may be further \n divided into intrinsically and extrinsically porous molecular crystals . intrinsically porous crystals are based on \n molecules with predefined open spaces ( macrocycles , cages etc . ) , whereas extrinsic porosity \n results simply from crystal packing . 
of these three approaches ( hybrid , \n intrinsic / extrinsic organic ) , the latter is probably the most challenging \n as open frameworks must be maintained without the help of powerful \n directional coordination bonds or pre - existing cavities . some solutions \n have emerged through serendipity , such as the classic urea inclusion \n compounds . there are few motifs \n which generate families of readily accessible nanoporous crystals , allowing tuning of void dimensions and material \n properties . the work described in this paper is founded on a \n serendipitous \n discovery made a few years ago in the course of our program on anion - binding \n cholapods . these powerful \n receptors combine a rigid steroidal scaffold , derived from cholic \n acid 1 ( chart 1 ) with various \n combinations of h - bond donor groups . most are reluctant to crystallize \n but a small subset , represented initially by 24 , were found to form needles from methyl acetate - water or \n acetone - water mixtures . all three were subjected to x - ray crystallography , \n with interesting results . despite the \n significant differences between 24 , the external similarity of the crystals was reflected in the internal \n structures ; the three were isomorphous , with almost identical unit \n cell dimensions and packing arrangements for the invariant steroidal \n cores . the packing involved the formation of helices with hexagonal \n symmetry ( space group = p61 ) , surrounding \n solvent - filled channels . the arrangement is illustrated in figure 1 , using tris - urea 3 as an example . \n individual steroid molecules bind to a single molecule of co - crystallized \n water through 5 h - bonds ( figure 1a ) and stack \n to form columns ( figure 1b ) . the columns then \n pack in a hexagonal arrangement to generate the solvent - filled pores . 
\n the orientation of the steroids in the columns is such that the terminal \n groups ( methoxy and nhph ) face into the pores and largely determine \n the nature of the channel surface . effectively , the terminal groups can \n expand into the channel interior without affecting the packing of \n the columns which maintain the structure . the channels , moreover , \n are unusually wide . in the case of trifluoroacetamide 2 , the average diameter was found to be 16.4 . the average diameter \n for 3 was only slightly less at 15.7 , although \n the surfaces are more irregular ( figure 2 ) . \n there is thus substantial room , in principle , both for guest molecules \n and for terminal groups . preliminary experiments on 2 implied that guest exchange was possible , at least for certain solvents \n ( meoac , et2o , toluene ) . evacuation lead to partial degradation \n ( evidenced by crazing ) , but the powder x - ray diffraction ( xrd ) pattern \n remained largely unchanged . the steroid is solvated \n by a molecule of water which forms hydrogen bonds to all three urea \n groups . ( b ) molecules of 3 stack to form \n columns , running along the crystallographic c axis . \n representation as for ( a ) except that core steroidal atoms are colored \n blue and green in adjacent molecules , and water molecules are shown \n with thick bonds . one column of steroids is highlighted using the \n coloring from ( b ) , with the och3 and nhph groups now in \n spacefilling mode . ( d ) a single channel sliced in half along the c axis , viewed in spacefilling mode . terminal och3 and nhph groups retain their coloring , other atoms near or at the \n internal surface are shown as light blue . ( e ) 3d schematic representation \n of a channel , showing helical arrangement of methyl groups and aromatic \n rings ( spheres and hexagons , respectively ) . interior surfaces for trifluoroacetamide 2 and tris - n - phenylurea 3 viewed along the c - axis . 
the surfaces were calculated using a 1.4 probe . given the space available within \n the channels , it seemed likely \n that a wide range of analogues with a common bis-(n - phenylureido)steroidal core ( figure 3 ) would \n form crystals isostructural with 24 . variation should be feasible not only at the c3 substituent ( r in figure 3 ) but also at the c24 ester \n group ( r in figure 3 ) ( npsus ) would \n be isostructural they should be able to form solid solutions ( organic \n alloys ) , greatly enlarging the range of systems available . since our \n original publication we have confirmed both of these possibilities . \n we have described a series of three npsus with aromatic groups in \n r , and the interesting feature of water wires \n in the channels , and also a range of \n npsu - based organic alloys . herein we \n provide a more complete description of our work surveying the scope \n and properties of npsus , drawing on 25 examples which have been characterized \n by x - ray crystallography . we show how the dimensions and shapes of \n the channels can be tuned , and how their chemical nature can be altered \n by the introduction of functional groups ( including previously unreported \n alkene and aldehyde functionality ) . we also report , for the first \n time , that npsus can be porous in the strictest sense , stable to evacuation \n and capable of gas adsorption . moreover we show that they can adsorb \n a remarkable range of guests , including organic dyes with molecular \n weights up to 300 and even the c30 hydrocarbon \n squalene ( mw = 410 ) . the core bis-(n - phenylureido)steroidal \n unit maintains the p61 nanoporous structure , \n while groups r and r control the size and \n nature of the pore . the first are esters of 3,7,12-tris-(n - phenylureido)-5-cholanoic acid 6 and \n include 3 as well as the 14 variants 720 represented in chart 2 . 
ester groups \n r were chosen for variation in size ( and thus pore diameter ) \n and surface characteristics ( aliphatic vs aromatic vs fluorocarbon ) \n and also to showcase the potential for placing chromophores ( e.g. , \n azobenzenes ) , fluorophores ( e.g. , pyrenes ) , and reactive units ( e.g. , \n allyl groups ) in the channels . the second group are derivatives of \n methyl 3-amino-7,12-bis-(n - phenylureido)-5-cholanoate 5 , including trifluoroacetamide 2 , carbamate 4 , tris - ureas 21–26 , and \n amides 27–29 ( chart 3 ) . again the variable group ( r ) was used to change \n steric and surface properties and to introduce chromophores and functional \n groups . although 29 is included \n here for convenience , it does not adopt the npsu crystal packing ; \n for further details see text . amine 5 is accessible from cholic acid 1 via a multistep but \n well - established route in 40% overall yield . tris - urea 3 may be prepared from 5 by \n treatment with phenyl isocyanate or more directly from 1 via methyl 3,7,12-triaminocholanoate . esters 7–20 ( chart 2 ) are available from 3 via equilibration with lithium alkoxide or hydrolysis to acid 6 followed by o - alkylation or carbodiimide - induced esterification . \n the derivatives in chart 3 may be prepared \n from 5 by treatment with an aryl isocyanate ( giving 21–26 ) or an acylating agent ( giving 27–29 ) . the preparations of 9 , 10 , 12 , 14–19 , 23 , 26 , and 29 have \n been reported previously . procedures for the remaining compounds in charts 2 and 3 are given in the supporting information . the steroids in charts 2 and 3 were crystallized from methyl \n acetate or acetone , to which small amounts of water had been added , \n through slow evaporation of the organic solvent . 
in most cases other polar solvents , such as methanol \n or ethanol , or nonpolar mixtures , such as chloroform - hexane , yielded \n oils or amorphous solids . all the compounds could be analyzed by single \n crystal x - ray diffraction ( scxrd ) , and with the single exception of 29 ( see below ) , all formed crystals with the p61 npsu packing . the structures of 24 , 14 , 16 , and 18 have \n been reported in communications , the remainder \n are described for the first time herein . as expected \n these show only minor variations , the differences between the steroids \n being accommodated by changes to the shape , diameter , and surface \n characteristics of the pores . unsurprisingly , given the open nature \n of the pore region , disorder in terminal groups r and \n ( especially ) r was fairly common , being present in 11 \n of 25 structures . however , in most cases the groups concerned were \n divided between just two positions , so that a reasonable model of \n the crystal ( for estimating pore volume etc . ) this applied to 4 , 8 , 10 , 12 , 13 , and 24 ( disorder in r ) , and 22 ( disorder \n in r ) . in two cases , 15 and 19 , deleting one of the two possible positions did not give a viable \n structure . however , these crystals could be modeled successfully by \n assuming equal occupancy of both positions , on an alternating basis . \n after editing where relevant , the smoothed solvent accessible surfaces \n and resulting guest - accessible volumes were calculated using materials \n studio , employing a probe of radius 1.2 . these values are given \n in table 1 , while images of the surfaces are \n available as supporting information . these were estimated by repeating the calculation using probes of \n increasing size until the surface was no longer continuous . the resulting \n value is , effectively , the diameter of the largest sphere which can \n pass through the channel . 
images of selected structures viewed down \n the pores , with terminal groups shown in spacefilling mode , are shown \n in figure 4 ( compounds from chart 2 , varying ester group r ) and figure 5 ( compounds from chart 3 , \n varying c3 terminal group r ) . , obtained from the materials studio \n program employing a spherical probe of 1.2 radius . calculations of total solvent accessible surfaces give higher values \n but include small voids outside the channel region . estimated by calculating the smoothed \n solvent accessible surface using differing probe radii ( increments / decrements \n of 0.05 ) . the value given is the diameter of the largest probe \n for which the calculation yields a continuous surface . these values are slightly smaller \n than those reported in ref ( 17 ) , due to a change in the method of calculation . could be built \n using the assumption that r in neighboring molecules occupied \n alternating positions , and this model was used for the pore volume \n and diameter calculations . when the probe diameter is reduced \n to this value , voids are generated outside the channel region while \n a continuous pore surface has not yet appeared . mes disordered over two positions , \n one being removed before pore volume and diameter calculations . terminal groups \n r (= \n nhph ) and or are shown in space - filling mode , with r colored gold . terminal groups r and \n or (= ome ) are shown in space - filling mode , with or colored magenta . table 1 and figures 4 and 5 illustrate the wide variety \n of structural \n properties available via the npsu system . for example , starting at \n nearly 20% ( for 2 ) , the volume available in the pores \n can be tuned downward in small increments essentially to zero ( for 15 and 19 ) . indeed , by taking advantage of alloy \n formation , continuous variation should \n be possible with these compounds . 
unsurprisingly , pore volumes and \n diameters are generally determined by the size of the terminal groups , \n but more subtle effects are also in play . for example , in the series \n with varying or ( chart 2 , figure 4 ) , a 2-carbon spacer between the oxygen and an aromatic \n group tends to allow efficient packing of the aromatic surface against \n the side of the channels . thus , for pyrenyl derivative 16 , space remains down the center for hydrogen - bonded chains of water \n molecules . in contrast , a 1-carbon methylene \n spacer directs the aromatic group toward the center of the channel . \n in the case of pyrenyl derivative 15 , this results effectively \n in full occupation of the channel ; the calculated guest - accessible \n volume and minimum diameter are both close to zero . paradoxically , \n therefore , the larger terminal group ( in 16 ) leaves more \n space than the smaller group in 15 . the shape of \n the channel wall ( smooth vs corrugated ) is another \n feature which can be altered . as mentioned above , compounds for which \n r = ch2ch2ar ( e.g. , 14 , 16 , 18 ) tend to adopt structures in which \n the aromatic groups line the surfaces of the channels . the resulting pores are relatively smooth and \n cylindrical , as illustrated for 14 in figure 6 ( top ) . in other systems from chart 2 , the channel surface is presumably corrugated , but with random \n and/or flexible character due to disorder within the crystal . an example \n is provided by 13 , for which the naphthylmethyl group \n appears in two orientations , one roughly perpendicular and one more \n nearly parallel to the channel axis . well - defined corrugated pores \n may be accessed by placing extended substituents at r ( which \n is less prone to disorder ) . thus for both 24 ( r = azobenzene ) and 25 ( r = biphenyl ) the \n c3-substituent reaches well toward the c axis creating \n strongly asymmetric helical pores ( figure 6 , middle and bottom ) . 
this work also shows that the chemical \n nature of the pore walls \n can be subject to wide variation . the structures collected in figures 4 and 5 feature an alkenyl \n group c = c ( 8) , a helical strip of fluorocarbon \n surface ( 10 ) , an aldehyde ( 21 ) , a thioether \n ( 22 ) , a boc - protected amine ( 23 ) , and an \n iodobenzene ( 26 ) . as illustrated in figure 7 , all are positioned where they can interact with guest molecules \n and participate in reactions or noncovalent interactions . finally , \n crystallography of acetamide 29 showed that \n not every molecule defined by figure 3 adopts \n the p61 npsu structure . in this case a monoclinic ( p21 ) form obtained from methyl acetate / water \n was denoted 29 , and a tetragonal ( p432121 ) form which crystallized \n from acetone / water was denoted 29. the molecular \n units in the two forms are almost identical , and qualitatively different \n to those in the npsus ; in particular , the c3 substituent is positioned \n so that the nh group points inward , creating a binding site which \n accommodates two water molecules ( see figures \n s31 and s32 ) . in both crystals the packing is efficient , leaving \n no substantial voids ( see figures s57 and s58 ) . crystal structures of 14 , 24 , and 25 viewed perpendicular to the c axis . for \n the images on the left , the groups which dominate the channel surface \n ( or for 14 , r for 24 and 25 ) are highlighted in spacefilling mode . for the \n right - hand images , the smoothed solvent accessible surfaces have been \n added using materials studio , and the structures have then been sliced \n along the c - axis . space - filling representations of the channel regions in 8 , 10 , 21 , and 22 . the structures \n have been sliced along the c - axis to expose the pore \n interiors and are viewed roughly perpendicular to c and a ( see axes attached to 8) . 
conventional \n coloring is used for the distinctive groups in each structure ( or for 8 and 10 , r for 21 and 22 ) , the remaining atoms being shown as \n silver - blue . \n implies that the crystal is permeable , \n allowing exchange of small guest molecules , and that this process \n does not substantially affect the host framework . ideally , the crystals should also be able to survive the \n removal of all guest molecules without loss of structure and then \n show reversible gas adsorption to confirm porosity . as mentioned earlier , we previously demonstrated that trifluoroacetamide 2 satisfies , at least , the guest - exchange criterion . the results \n from evacuation were less clear - cut ; the powder xrd ( pxrd ) pattern \n remained essentially unchanged , but the crystals crazed and became \n opaque . most npsu crystals , especially those with 3-ureido substituents , \n showed no change in appearance on evacuation . nonetheless it was clearly \n desirable to establish the solvation state of a typical npsu , show \n that the solvent could be removed , and demonstrate that the resulting \n crystals were unchanged and capable of gas adsorption . we chose tris - n - phenylurea 3 for this study , as this compound \n is the most accessible npsu and has proved the most convenient for \n routine use . crystals of 3 were obtained as needles from acetone / water ( initial ratio \n 10:1 ) , after washing with acetone and air - drying . samples were then \n evacuated at room temperature and 100 c for 24 h. the three \n samples ( air - dried , rt evacuated , 100 c evacuated ) were then \n analyzed by h nmr in dmso , using a procedure which allowed \n the amount of background water to be measured and taken into account . the composition of the air - dried crystals was found to be 3:water : acetone = 1:3.8:0.2 . 
allowing for the single water \n molecule per steroid embedded in the channel wall , this implies that \n the pores are filled with 3 molecules of h2o per \n molecule of 3 , with a small amount of organic solvent \n also present . after evacuation at rt the composition was 3:water = 1:1 , implying that the channels are empty . after evacuation at 100 c \n for 24 h the ratio 3:water reduced slightly to 1:0.9 . \n this suggests some degradation , although microscopy and pxrd again \n showed no major changes . samples of 3 were also heated \n to 150 c and above , and in these cases clear signs of decomposition \n were observed both by microscopy ( loss of transparency ) and pxrd ( loss \n of diffraction peaks ) . having established that the pores could \n be evacuated without loss \n of crystallinity , we proceeded to confirm the permanent porosity of 3 using n2 gas adsorption measurements for a sample \n that had been heated under vacuum at 75 c for 9 h. surprisingly , \n the n2 adsorption predominately takes place at high relative \n pressures ( p / p > \n 0.7 ) , \n and there is significant hysteresis between the adsorption and desorption \n isotherms giving a type iv isotherm ( see e.g. , figure 8) . this hysteresis differs from \n that observed in mesoporous materials ( pore diameter > 20 ) \n that \n generally closes at lower relative pressures ( p / po 0.4 ) and which is related to pore \n evacuation involving capillary action . furthermore , the desorption \n isotherm falls below that of the adsorption isotherm at p / po 0.7 . desorption cycle is repeated \n and presumably reflect slow , nonequilibrium , kinetics of n2 adsorption . such slow kinetics is understandable if access to the \n pores is restricted to the relatively small number of openings located \n at the end of the long needle - shaped crystals , which are on average \n > 2 mm in length . we have previously shown that the pores in 3 are parallel to the long axis of the crystals . 
similar hysteresis was observed by tosi - pellenq \n et al . from the n2 isotherms of long ( 150 m ) microporous \n crystals of alpo4 - 5 , which \n also contain cylindrical channels ( 0.76 nm in diameter ) along the \n long axis of the crystals . in this case the fact that the desorption isotherm in figure 8 dips below the adsorption isotherm implies that \n evaporating nitrogen is lost more rapidly from the channels than gaseous \n n2 is readsorbed . this may relate to pressure differences \n between the interior and exterior of the crystals ; it is reasonable \n to suppose that when the crystals are compressed , inward gas transfer \n could be relatively slow , while internal pressure could expand the \n crystals and assist n2 efflux . the possibility that the \n effects are due to collapse of the crystal structure during n2 analysis was discounted by confirming that the structure \n remained unchanged , as shown by scxrd of a crystal extracted from \n the sample of 3 used for n2 analysis . the bet surface area calculated from the n2 adsorption \n isotherm is very low ( 29 m / g ) and probably represents \n only the external surface area of the crystals . however the pore volume \n of 0.17 ml / g calculated from the total n2 uptake ( 4.9 mmol / g ) \n is highly consistent with the guest - accessible volume ( 16.1% ) calculated \n from the crystal structure ( i.e. , 0.17 ml / g equates to 16% of the \n total volume given a crystal framework density of 0.941 g / ml ) . the \n pore volume obtained from n2 uptake is also consistent \n with the values calculated from the adsorption of liquid guests , as \n discussed in the following section . n2 adsorption ( ) and \n desorption ( ) isotherms \n for crystal 3 at 77 k. see text for discussion . air - dried crystals of 3 were placed \n in each , left for 1224 h , washed briefly with ether , and subjected \n to h nmr analysis . all of the substrates were adsorbed \n in significant amounts , as summarized in table 2 . 
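The consistency check quoted above for crystal 3 (total n2 uptake of 4.9 mmol / g, pore volume of 0.17 ml / g, about 16% of the crystal volume at a framework density of 0.941 g / ml) can be reproduced with a few lines of arithmetic. The sketch below is only illustrative: it assumes that the adsorbed nitrogen has the density of bulk liquid nitrogen at 77 k (about 0.808 g / ml, a textbook value that is not stated in the text), and the function names are hypothetical.

```python
# Assumed constants (not given in the text): bulk liquid N2 at 77 K.
N2_MOLAR_MASS_G_PER_MOL = 28.0134
LIQUID_N2_DENSITY_G_PER_ML = 0.808


def pore_volume_ml_per_g(uptake_mmol_per_g):
    # Convert total N2 uptake into the volume of liquid-like N2 per gram of host.
    mass_g = uptake_mmol_per_g * 1e-3 * N2_MOLAR_MASS_G_PER_MOL
    return mass_g / LIQUID_N2_DENSITY_G_PER_ML


def void_fraction(pore_ml_per_g, framework_density_g_per_ml):
    # Pore volume per unit crystal volume, using the framework density.
    return pore_ml_per_g * framework_density_g_per_ml


pore = pore_volume_ml_per_g(4.9)        # uptake quoted in the text, mmol/g
fraction = void_fraction(pore, 0.941)   # framework density quoted in the text
print(round(pore, 2), round(100 * fraction, 1))
```

Under these assumptions the uptake corresponds to roughly 0.17 ml / g and a void fraction close to 16%, in line with the guest - accessible volume of 16.1% derived from the crystal structure.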
aniline 30 formed a well - defined host guest \n 1:1 complex which could be characterized by x - ray crystallography . \n as shown in figure 9 , the aniline molecules \n form a helix within the channel , apparently stabilized by a close \n interguest chn contact ( dc hn = 2.68 ) . the anilines are also held in place by specific \n favorable interactions with the channel wall , including hydrogen bonds \n between amino nh and host ester carbonyl ( dn ho = 2.46 and n ho = 167.8 ) , nh interactions involving the second \n amino nh and a host phenyl group ( dn h = 2.84 ) , and ch interactions to the aniline \n -system . guest ratio , although in this case the guest could \n not be located crystallographically . a calculation of the volume of liquid absorbed \n per unit mass of host gave a value of 0.12 ml / g , consistent \n with the pore volume obtained by gas adsorption ( see above and table 2 ) . this represents a pore - filling efficiency of \n 70% using the pore volume calculated from the crystal structure , \n which is consistent with a strong affinity between the crystal and \n adsorbate . similar calculations based \n on the uptake of 3336 suggested \n that these were absorbed less efficiently . however the value for squalene 36 , at 65% of the maximum , is remarkable for such a large \n ( 30-carbon ) guest . x - ray crystal structure of 3 with adsorbed \n aniline , \n viewed along the c - axis ( top ) and a - axis ( bottom ) . the aniline is shown in space - filling mode . samples of air - dried crystalline 3 were place in aniline and then removed , washed with ether , \n and analyzed by h nmr after periods ranging from 2 to \n 180 min . the results showed that the crystals are filled to about \n half capacity very quickly ( within the first 2 min ) , but that subsequent \n adsorption is much slower . we were also interested to discover whether the aniline could \n be oxidized to polyaniline within the channels . 
indeed , treatment \n of the complex with peroxyammonium sulfate in 0.1 n hcl caused the \n crystals to turn dark violet ( after 4 h ) then green ( after 12 h ) . \n the diffuse - reflectance uv vis spectrum of the product showed \n adsorption maxima at 420 and 795 nm consistent with polyaniline formation \n ( see figure s75 ) . pxrd analysis showed that the npsu structure was retained , although \n the crystals were no longer suitable for single - crystal x - ray structure \n determination . another set of experiments involved the adsorption \n of larger guest \n molecules from solutions in diethyl ether . in these cases colored \n guests were used for ease of analysis and the potential for interesting \n or useful optical effects . , solutions of dyes 3743 in ether ( 1020 mm ) were \n added to crystals of 3 , and the mixtures left to stand \n for 3 days . after isolation and washing with ether , all crystals were \n visibly colored . in the case of 3741 the colors were strong enough to show clearly under a microscope \n ( see figure 10 ) . as shown in figure 10 , the colors appeared to permeate the crystals \n and were not localized at ends or edges . interestingly , the crystals \n containing nile red 41 were observed to be blue - purple \n ( figure 10e ) . this dye is strongly solvatochromic , \n its optical adsorption moving to longer wavelengths with increasing \n solvent polarity , and a blue or purple \n color suggests a highly polar environment . soaking the crystals in \n ether for 24 h resulted in loss of color , showing that the dye adsorption \n was reversible . a second npsu crystal , trifluoroacetamide 2 , was also investigated as host and was found to absorb azo - dyes 37 and 38 . the combinations of 3 with disperse red 1 ( 38 ) and azulene ( 43 ) were investigated further , to establish how much dye was included \n and how fast . in the case of azulene , only 1 mol % was absorbed , \n while equilibrium was reached within the first hour . 
in the case of 38 , the first 1 mol % was also absorbed quickly , but a further \n quantity ( nearly 1 mol % ) was taken up in a slower process over 24 \n h. crystals of 3 after exposure to ethereal solutions \n of ( a ) 37 , ( b ) 38 , ( c ) 39 , \n ( d ) 40 , and ( e ) 41 . despite the appearance of the crystals there was room for \n concern \n that the dyes might not be entering the channels but somehow associated \n with cracks or defects in the crystals . to test this possibility , \n we examined the colored crystals under a microscope using plane polarized \n light . if the dyes were occupying the channels , it seemed likely that \n some ( at least ) would show preferential alignments . if the transition \n dipole moments were to lie roughly along the channel axis ( the long \n axis of the crystal ) the crystals should be dichroic , i.e. , their \n colors should be dependent on their orientation with respect to the \n plane of polarization . figure 11 shows pairs \n of photomicrographs in which crystals of identical composition , but \n oriented at roughly 90 to each other , are illuminated with polarized \n light . each pair of images shows the same crystals , with the plane \n of polarization differing by 90. the crystals are clearly dichroic , \n changing from colored to almost colorless as the plane of polarization \n is rotated . the effect was observed for 237 , 238 , 337 , 338 , and 341 but not for 339 or 340 . the guests which lead to \n dichroism ( 37 , 38 , and 41 ) \n possess extended dipoles due to conjugation of an amino group with \n an electron acceptor . this feature should encourage the molecules \n to adopt a head - to - tail arrangement parallel to the channel axis . \n the images in figure 11 provide strong evidence \n that the dye molecules are indeed in the channels revealed by crystallography . it should be noted that this phenomenon of dye uptake by organic \n molecular crystals is rare and may be unprecedented . 
it is well - known \n that dyes may be adsorbed by inorganic crystals , such as zeolites , or by organic inorganic hybrids ( pcps / mofs ) . however , the inclusion of dyes in organic molecular \n crystals is normally achieved by cocrystallization , not by the interaction of substrates with \n macroscopically sized preformed crystals . this ability of npsus to \n adsorb such large guest molecules highlights their unusual combination \n of robust crystal structures with spacious accessible interiors . crystals \n of npsus with included dyes illuminated with polarized \n light . for each pair of images the plane of polarization is rotated \n through 90 between left and right . ( a ) 237 , ( b ) 238 , ( c ) 337 , ( d ) 341 . in \n principle , the npsu crystal packing represents a powerful tool \n for the design of functional materials . first , the structure needs \n to be generalizable , forming in ( at least ) most of the cases where \n it might be predicted . second , the crystals need to be truly porous \n so that the space within may be exploited . we have now examined the crystal structures of 26 \n molecules with the general structure represented in figure 3 , and of these only one ( acetamide 29 ) fails to adopt the p61 npsu arrangement . \n the range of npsus now includes examples with vanishingly small pores \n sizes , strongly corrugated pore surfaces , and several cases with potentially \n reactive functional groups ( ch2ch = ch2 in 8 , ch = o in 21 , sme in 22 , and nhboc in 23 ) . it is notable that neither \n the aldehyde nor nhboc groups , both of which are quite polar , disturbed \n the npsu packing . we have also shown that the pores can be evacuated \n without loss of integrity and that subsequent gas adsorption is possible \n ( although given the pressures involved and the low pore volume , applications \n in gas storage are unrealistic ) . 
more importantly , organic molecules \n are also absorbed , including the large rigid nile red 41 ( mw 318 ) , and the even larger but more \n flexible squalene 36 ( mw 411 ) . \n the ability to orient dye molecules suggests applications in display \n technology and nonlinear optics . although not all dyes showed this \n behavior , the tunability of the pores implies that the phenomenon \n should be extendable ( e.g. , by tailoring of channel diameter ) . the \n fact that small molecules can readily access the pores points to further \n applications in catalysis , sensing , and separations , especially given \n the chirality of the crystals and the ability to incorporate effector \n groups through alloy formation . we hope \n to explore these and other possibilities in future work .","<S> previous \n work has shown that certain steroidal bis-(n - phenyl)ureas , \n derived from cholic acid , form crystals in the p61 space group with unusually wide unidimensional \n pores . </S> <S> a key feature of the nanoporous steroidal urea ( npsu ) structure \n is that groups at either end of the steroid are directed into the \n channels and may in principle be altered without disturbing the crystal \n packing . </S> <S> herein we report an expanded study of this system , which \n increases the structural variety of npsus and also examines their \n inclusion properties . </S> <S> nineteen new npsu crystal structures are described , \n to add to the six which were previously reported . </S> <S> the materials show \n wide variations in channel size , shape , and chemical nature . </S> <S> minimum \n pore diameters vary from 0 up to 13.1 , while some of \n the interior surfaces are markedly corrugated . </S> <S> several variants possess \n functional groups positioned in the channels with potential to interact \n with guest molecules . </S> <S> inclusion studies were performed using a relatively \n accessible tris-(n - phenyl)urea . 
</S> <S> solvent removal was \n possible without crystal degradation , and gas adsorption could be \n demonstrated . </S> <S> organic molecules ranging from simple aromatics ( e.g. , \n aniline and chlorobenzene ) to the much larger squalene ( mw = 411 ) could be adsorbed from the liquid state , while \n several dyes were taken up from solutions in ether . </S> <S> some dyes gave \n dichroic complexes , implying alignment of the chromophores in the \n npsu channels . </S> <S> notably , these complexes were formed by direct adsorption \n rather than cocrystallization , emphasizing the unusually robust nature \n of these organic molecular hosts . </S>"
3,"giant cell tumor ( gct ) accounts for about 7 to 10% of all cases of primary spinal tumors . patients with gct are usually diagnosed in the third or fourth decade of life , and there is a slight female preponderance152325 ) . gct usually occurs in the metaepiphyseal ends of the long bone , and it rarely occurs in the spine2611 ) . savini et al.28 ) reported that only 2.9% of all gct incidences occur in the spine , and goldenberg et al.14 ) reported that spine involvement is 1.3% in a study of 218 cases of gct . a research conducted by mayo clinic reported incidence of up to 6.5% of gct in the spine27 ) . gct in the spine usually involves sacrum , but it also occurs in other parts of the spine . gct is classified as benign tumor histopathologically , but it had a locally aggressive tendency . and , if it is incompletely excised , it shows a high recurrence rate . there is a reason for a relatively worse prognosis in gct patients , compared to other benign primary spinal tumor patients7826 ) . many studies reported from 10 to 40% local recurrence rate after spinal gct treatment , depending on surgical protocols46141821 ) . in case of recurrence , excision is conducted when another operation is possible . if re - operation is considered difficult , radiation therapy ( rt ) is introduced . generally , however , due to the characteristics of spine , there is a possibility of neural or vascular injury , which will result in difficult wide en bloc excision10 ) . some authors reported that rt is not effective in local tumor control and has a risk of malignant transformation of gct15252728 ) . recent studies reported chemotherapy as effective in case of recurrence or when there are complications1591329 ) . the objective of this study is to obtain the results of treatment gct in the spine . in addition , we analyzed whether local recurrence is influenced by surgical protocols , or rt . the subjects of our study were 242 patients treated for gct from 2000 to 2012 . 
nineteen of these patients had gct involving the spine , an incidence rate of 7.9% . the median age at first diagnosis was 31 years ( range , 14 to 39 years ) . fourteen tumors were located in the sacrum , one in the cervical spine , one in the thoracic spine , and three in the lumbar spine . all the lesions were single and localized to the spinal level of origin except one , which originated in the sacrum and extended into the coccyx . the median follow - up period was 92 months ( range , 18 to 163 months ) . the main symptom of the patients with gct was pain , and the average visual analogue scale ( vas ) score was 7 ( range , 5 to 9 ) . only one of the 19 patients initially showed motor weakness , which was caused by spinal cord compression at the thoracic spine . for initial treatment , 6 out of 19 patients underwent gross total removal ( gtr ) , and 13 patients underwent subtotal removal ( str ) . adjuvant rt was performed in 12 cases , 2 in the gtr group and 10 in the str group . we analyzed the recurrence rate and recurrence free period ( rfp ) for gct in the spine after the treatment . in addition , we analyzed the difference in the recurrence rate according to the surgical protocol and rt . we did not consider the effect of the treatment given to patients with recurrence after the initial treatment . chi - square tests were used for categorical variables , and student 's t - tests were used for continuous and ordinal variables , as appropriate . a two - tailed p value < 0.05 was considered statistically significant . the data were compiled and analyzed with the software package spss , version 18.0 ( spss inc . , chicago , il , usa ) . during the follow - up period , 7 out of 19 patients had local recurrence , a rate of 36.8% . the average recurrence free period was 14 months ( range , 4 to 34 months ) for the patients with recurrence . the median recurrence free period of all patients was 84 months ( 95% ci , from 4 to 163 months ) ( fig . ) . 
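The comparison of recurrence between the two surgical groups (0 of 6 patients after gtr versus 7 of 13 after str) can be checked by hand. The sketch below assumes, which the text does not state, that an uncorrected Pearson chi - square test on the 2 x 2 table was used; the function name is hypothetical, and the p value is obtained from the closed form that holds for one degree of freedom.

```python
import math


def pearson_chi2_2x2(table):
    # table = [[a, b], [c, d]]; rows are groups, columns are outcomes.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: GTR, STR; columns: recurrence, no recurrence (counts from the text).
chi2, p = pearson_chi2_2x2([[0, 6], [7, 6]])
print(round(chi2, 2), round(p, 3))
```

Under that assumption the statistic is about 5.12 and the p value about 0.024, matching the reported figure. With expected counts this small, an exact test is often preferred; the sketch only mirrors the test named in the methods.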
the gtr group did not have any recurrences , while 7 out of 13 patients who received str had local recurrence ( table 2 ) , a statistically significant difference ( p=0.024 ) . this implies that gtr is the most important method of treatment for controlling the tumor . as for the local control effect of radiation therapy , the average recurrence free period of the group that underwent radiation therapy was 112 months ( 95% ci , from 74 to 150 months ) , which was longer than that of the group without radiation therapy , 65 months ( 95% ci , from 47 to 83 months ) ( fig . ) . because there was a statistically significant difference between the rt group and the non - rt group ( p=0.041 ) , we considered that radiation may be effective in local control of tumors . among the 7 patients with recurrence , four underwent re - operation , 2 underwent re - operation and rt , and 1 underwent radiosurgery . gtr was possible in only one out of 6 patients who underwent re - operation . all patients reported relieved pain after the treatment according to vas ( range , 0 to 5 ) . nine out of 19 patients were pain - free during the follow - up period . the patient with motor weakness showed incomplete recovery after surgery ; the improvement was not functional , and bladder dysfunction remained . neurological complications after surgery , including bladder or bowel dysfunction , occurred in 3 out of 19 patients . all the patients who experienced complications had gct in the sacrum . at the final follow - up , 18 out of 19 patients remained alive without recurrence , and 1 patient with pulmonary metastasis expired due to pulmonary complication . a 31-year - old female patient visited our hospital with pain in the buttock that had lasted for 4 months . magnetic resonance ( mr ) image showed a huge enhancing mass involving the sacrum ( fig . ) . the patient underwent str , and the pathologic examination revealed gct . computed tomography ( ct ) scan after the operation showed remnant tumor on the ventral side of the s2 spinal canal ( fig . ) . four months after the initial operation , mr image showed the progression of the tumor extending to the left side ( fig . ) . ninety - four months after re - operation , the tumor had not recurred ( fig . ) . a 33-year - old male patient complained of lower back pain and radiating pain into both legs for 4 months . sunnyvale , ca , usa ) radiosurgery was done to the remnant tumor located in front of the sacrum ( fig . ) . the patient 's pain was relieved and he has been stable for 68 months ( fig . ) . gct usually involves the ends of long bones around the knee , such as the femur ( 23% ) , tibia ( 21% ) , and fibula ( 5% ) , and a few others arise from the spine ( range , 3% to 7%)24 ) . it is reported that there is a slight female preponderance in spinal gct152325 ) . in our study , the incidence of spinal gct was 7.9% , and the male to female ratio was 1 to 0.9 . however , gct in the spine is not easy to remove totally because of limited accessibility and adjacent important neural and vascular structures10 ) . in case of incomplete excision , rt can be considered , but rt for gct is still controversial1516252728 ) . wide or marginal excision of the tumor or en bloc resection may yield a lower recurrence rate . liljenqvist et al.20 ) stated , regarding malignant tumors of the spine , that en bloc spondylectomy enables wide or marginal resection in most cases with acceptable morbidity . with respect to the degree of resection , a textbook reports an average recurrence rate of 50% ( from 35 to 70% ) after curettage alone and 10 to 15% after en bloc resection24 ) . campanacci et al.6 ) reported a recurrence rate of 27% after intralesional curettage versus rates of 8% and 0% after marginal or wide resection , respectively . fidler reported a successful result , with only one recurrence in 9 consecutive patients who underwent en bloc resection12 ) . in a larger series , 10 of 32 patients who had intralesional curettage recurred within 16 months postoperatively and none of 2 patients with en bloc resection recurred15 ) . regardless of the type of surgical resection , recurrence rates of 25 to 28% were reported in other series152728 ) . kim et al.16 ) reported a recurrence rate of 32.4% after curettage and 11.1% after en bloc resection ( overall 27.1% ) . 
in our study , gtr was defined as cases in which all the involved structures were removed completely . average recurrence rate of all patients was 36.8% , with 0% recurrence rate in gtr group , and 53.8% in str group . the surgical protocols showed a statistically significant difference ( p=0.024 ) , implying gross total resection as more desirable . we performed gtr rather than en bloc wide excision but there was no recurrence case during the median follow up period 92 months . in addition , 14 out of 19 cases located in the sacrum . en bloc resection of the sacrum is much more complicated than that of the thoracolumbar spine . most report about en bloc spondylectomy for gct mentioned above localized in the thoracolumbar spine . in case of the recurrence ( case 4 ) , the patient received additional surgery and rt . considering the morbidity of en bloc wide excision in spinal gct , grt is not worse than en bloc wide excision . in the early 2000s , str and rt was used to treat spinal gct in our hospital , which was the reason for high recurrence rate . later , str was considered not effective for local control of gct , which led to a wider use of gtr . however , recent studies insisted that rt is not only a useful adjuvant treatment modality after incomplete removal of tumors but also an effective therapy as a sole treatment of gct of the spine15252728 ) . hart et al.15 ) reported that 8 out of 36 patients received rt as an initial treatment before surgery , no case except one recurred . four patients had rt after recurrence , and only 1 patient experienced re - recurrence . kim et al.16 ) reported that regardless of surgical protocols , 29 out of 96 cases had received rt and only 3 cases ( 10.3% ) of recurrence occurred . in 67 cases without rt , 23 cases ( 34.3% ) of recurrence occurred , which is significant statistically ( p=0.031)16 ) . 
in our study , although recurrence rate was 42% after rt , average recurrence free period was 112 months for rt group , and 65 months for non - rt group , with statistical significance ( p=0.041 ) . even after recurrence , additional rt or radiosurgery after re - operation yielded good results . rt seems to be a good treatment modality for delaying recurrence and locally controlling the tumor . distant metastasis of gct is reported to be about 2 to 9%17 ) . in our study , only 1 out of 19 patients ( 5% ) had pulmonary metastasis . bertoni et al.3 ) mentioned that rt or chemotherapy may be useful to treat pulmonary metastasis of gct . other adjuvant therapies , such as a cryotherapy that had cure rate 92% in marcove 's first series and a preoperative embolization that was performed in five patients with result of no recurrence , were mentioned in these series . however a relatively large number of patients is not studied yet22 ) . lee et al.19 ) reported that bone cement injection offers an adjuvant strategy that may enhance the efficacy of treatment for gct when complete en bloc spondylectomy is difficult . balke et al.1 ) mentioned that most inoperable sacral gcts that had repeatedly recurred did not increase in size with no further recurrence was seen . although the role of bisphosphonates for treatment of gct is still unknown , the administration of bisphosphonates can be considered in complicated cases and metastasis . thomas et al.29 ) reported that 35 patients of gct were treated with denosumab , 30 cases were effective for tumor control after 25 weeks . in branstetter et al.5 ) study , 17 out of 20 patients were examined at various stages of treatment to distinguish clinical benefits from denosumab such as improved functional status or reduced pain . gct in the spine is difficult to resect completely due to the special structure of the spine and the invasive nature of the tumor . 
bloc wide excision is a well known a treatment of choice in order to manage gcts . however , en bloc wide excision of the gct in the spine is not easy without damaging neurovascular structures . the authors performed gross total removal rather than en bloc wide excision and obtained results . in case of str , rt was beneficial in delaying tumor recurrence .","<S> objectivethe treatment of giant cell tumor ( gct ) is mainly performed surgically . </S> <S> however , gct in spine seems difficult to treat because of the limited surgical accessibility and proximity . in this report </S> <S> , we analyzed the outcome of gct treatment in spine.methodsbetween 2000 and 2012 , 19 patients received treatment for gct in spine . </S> <S> median age at their first diagnosis was 31 years , 10 patients were male , and 9 female . </S> <S> fourteen tumors were located in the sacrum , 1 in cervical , 1 in thoracic and 3 in lumbar spine . as primary treatment , gross total removal ( gtr ) was done in 6 patients , and subtotal removal ( str ) in 13 patients . </S> <S> radiation therapy ( rt ) as an adjuvant therapy was performed in 2 cases in gtr group and 10 cases in str group.resultsduring the follow - up , 7 patients had local recurrence ( 36.8% ) . </S> <S> the average period until recurrence after primary treatment was 14 months . </S> <S> no recurrence was detected in gtr group . </S> <S> recurrence was noted in 7 out of 13 patients who underwent str . </S> <S> these differences were statistically significant ( p=0.024 ) . </S> <S> a median of recurrence free period ( rfp ) was 84 months . </S> <S> also average rfp of the rt group was 112 months , and non - rt group was 65 months . </S> <S> these differences were statistically significant ( p=0.041).conclusiontreatment of choice for gct in spine is a complete removal of tumor without neurological deficits . in case of incomplete removal , radiation therapy may be a useful adjuvant treatment modality . </S>"
4,"the cerebral abscess is a common central nervous system infection that can result from trauma , hematogenous spread , or spread from an adjacent infection such as otitis media or sinusitis . despite exhaustive searches , 15 to 30% of abscesses are termed cryptogenic when no source of infection is identified20 ) . a distant infection focus that can cause brain abscesses is a cardiac right to left shunt , which is related to patent foramen ovale15 ) , cyanotic cardiac disease15 ) or pulmonary arteriovenous malformation or fistula ( avf ) . pulmonary avf is a rare congenital vascular malformation involving direct communication between the pulmonary artery and vein without an intervening capillary bed . approximately 8095% of pulmonary avfs are associated with hereditary hemorrhagic telangiectasia ( hht ) , known as osler - weber - rendu syndrome5,8,9 ) . the clinical triads of pulmonary avf is cyanosis , exertional dyspnea , and digital clubbing ; however , 56% of one large series were asymptomatic8 ) . the most prominent central nervous system ( cns ) complications associated with pulmonary avfs are neurologic events , including transient ischemic attacks31 ) , recurrent stroke1,6,18,27 ) , brain abscesses , and seizures26 ) . . we will focus on the cryptogenic brain abscess as developed by patients with idiopathic pulmonary avf , which may be detected only if we consider it may cause the brain abscess . a 65-year - old woman was admitted with a 1-month history of headache and cognitive impairment that had become aggravated 10 days prior . she showed impairment in orientation and judgment , but there were no lateralization signs of motor paralysis or cranial nerve deficits . there was no history of diabetes mellitus , hypertension , lung disease or heart disease . her blood pressure was 110/70 mmhg , pulse rate was 74/min , body temperature was 36.5c , and her respiration rate was 20 breaths / min . 
erythrocyte sedimentation rate was 54 mm / h and c - reactive protein ( crp ) was 0.22 mg / l . arterial blood gas analysis revealed a ph of 7.425 , pco2 of 42.5 mmhg , po2 of 90.4 mmhg , and an hco3 of 22.2 mmol / l on room air . total cholesterol was 249 mg / dl and ldl - cholesterol was 172 mg / dl . brain computed tomography ( ct ) showed a mass with perilesional edema on the right frontal lobe . brain magnetic resonance ( mr ) imaging revealed a 43-cm ring enhanced mass in the right frontal lobe , which was associated with severe edema and midline shifting to the left side ( fig . a chest x ray showed nodular infiltrations on the right mid - lung zone and the left upper and lower lung zones . 2 ) and antibiotics were maintained for 8 weeks . her past history gave no indication of exertional dyspnea or episodes of hemoptysis , melena , hematemesis , epistaxis or hematuria that might suggest underlying hht . a 45-year - old woman presented with a 7-day history of a progressive left hemiparesis . on admission , blood tests showed a hb of 13.4 g / dl , an hct of 39.6% , a white blood cell ( wbc ) count of 6290 cells/l , and a crp elevated to 2.62 mg / l . diffusion mr imaging showed a diffusion - restricted ovoid mass on the right motor and sensory cortex measuring 1.50.9 cm that was surrounded by diffuse vasogenic perilesional edema ( fig . 3 ) . mr imaging and mr spectroscopy revealed an enhancing lesion involving the right motor and sensory cortex and increased lactate / lipid complex , which was compatible with a brain abscess . a chest x ray did not suggest underlying lung disease . however , a chest ct revealed a pulmonary avf in the right upper lung ( fig . the brain abscess progressed despite treatment with vancomycin and ceftriaxone , so it was removed via craniotomy and the pulmonary avf was embolized . a 65-year - old woman was admitted with a 1-month history of headache and cognitive impairment that had become aggravated 10 days prior . 
she showed impairment in orientation and judgment , but there were no lateralization signs of motor paralysis or cranial nerve deficits . there was no history of diabetes mellitus , hypertension , lung disease or heart disease . her blood pressure was 110/70 mmhg , pulse rate was 74/min , body temperature was 36.5c , and her respiration rate was 20 breaths / min . erythrocyte sedimentation rate was 54 mm / h and c - reactive protein ( crp ) was 0.22 mg / l . arterial blood gas analysis revealed a ph of 7.425 , pco2 of 42.5 mmhg , po2 of 90.4 mmhg , and an hco3 of 22.2 mmol / l on room air . total cholesterol was 249 mg / dl and ldl - cholesterol was 172 mg / dl . brain computed tomography ( ct ) showed a mass with perilesional edema on the right frontal lobe . brain magnetic resonance ( mr ) imaging revealed a 43-cm ring enhanced mass in the right frontal lobe , which was associated with severe edema and midline shifting to the left side ( fig . a chest x ray showed nodular infiltrations on the right mid - lung zone and the left upper and lower lung zones . 2 ) and antibiotics were maintained for 8 weeks . her past history gave no indication of exertional dyspnea or episodes of hemoptysis , melena , hematemesis , epistaxis or hematuria that might suggest underlying hht . a 45-year - old woman presented with a 7-day history of a progressive left hemiparesis . , blood tests showed a hb of 13.4 g / dl , an hct of 39.6% , a white blood cell ( wbc ) count of 6290 cells/l , and a crp elevated to 2.62 mg / l . diffusion mr imaging showed a diffusion - restricted ovoid mass on the right motor and sensory cortex measuring 1.50.9 cm that was surrounded by diffuse vasogenic perilesional edema ( fig . 3 ) . mr imaging and mr spectroscopy revealed an enhancing lesion involving the right motor and sensory cortex and increased lactate / lipid complex , which was compatible with a brain abscess . however , a chest ct revealed a pulmonary avf in the right upper lung ( fig . 
the brain abscess progressed despite treatment with vancomycin and ceftriaxone , so it was removed via craniotomy and the pulmonary avf was embolized . cryptogenic brain abscesses can occur due to rare diseases that are not addressed in routine clinical practice . congenital cyanotic heart disease3 ) , patent foramen ovale12,15 ) , thoracic infection19 ) or asymptomatic dental infections7 ) are commonly associated with cerebral abscesses . idiopathic asymptomatic pulmonary avfs are also a cause of brain abscess , especially recurrent brain abscess ( table 1 ) . the incidence of idiopathic pulmonary avf - related cns complications is between 19 and 59 % 11,25,29 ) . associated neurological events included migraine , transient ischemic attack , stroke , abscess , and seizure17,26,30 ) . the incidence of brain abscess in patients with a pulmonary avf is around 1 to 5%8 ) . the most likely mechanism for these neurological events is a paradoxical embolism across the pulmonary avf or across a coexisting cerebral arteriovenous malformation in patients with hht17 ) . the pulmonary capillary bed acts as a filter that removes small thrombi and bacteria as they enter the bloodstream , even during daily activities such as oral hygiene . in a pulmonary avf , the pulmonary capillary bed is bypassed , providing a direct right to left shunt that depends on the diameter of the feeding artery . as a consequence , patients may develop paradoxical emboli that present as a transient ischemic attack , stroke , or brain abscess5 ) . the fundamental defect that we found was a right - to - left shunt from the pulmonary artery to the pulmonary vein , and the degree of shunting determines the clinical effects . if shunting is minimal , cyanotic symptoms are usually absent . if the right - to - left shunt is greater than 20% of the systemic cardiac output , or if there is reduction of hemoglobin by more than 50 g / l , the patient will have obvious cyanosis , clubbing , and polycythemia17 ) . 
the characteristic findings of cyanosis , clubbing and an extra - cardiac murmur do not always accompany pulmonary avf , and diagnosis may be difficult . a review of the mayo clinic experience suggested a morbidity of 2633% and mortality of 816% in untreated patients with pulmonary avf26 ) . the international guidelines for the management of pulmonary avf in hht recommends that treatments should be applied to all adults with avfs and children with symptomatic pulmonary avfs . the decision to treat in asymptomatic children ( no dyspnea , no exercise intolerance , no growth delay , no cyanosis or clubbing , no previous complication ) should be made on a case - by - case basis . the selection of pulmonary avfs for embolization is based on feeding artery diameter , generally 3 mm or greater10 ) . a literature review ( table 1 ) indicates that idiopathic pulmonary avfs can cause brain abscess in patients as young as 18 years old . based on these reported series , 10 of 13 patients did not have pulmonary avf - related symptoms until the brain abscess developed . idiopathic pulmonary avfs are more frequently solitary , around 80% , compared with that of hht ( < 40%)30 ) . it means idiopathic pulmonary avfs are less likely to be associated with large right - to - left shunt , this might in part explain why patients with idiopathic pavms might have a lower frequency of cyanosis and polycythemia . also , previous hht series have shown an association between number of pulmonary avfs and cerebral abscess risk22 ) . this might explain why idiopathic pulmonary avfs are associated with a lower frequency of cerebral abscess than in hht . the organisms in the brain abscess were not consistent but most frequently isolated ones were streptococci genus . 6 cases out of 13 did not reveal an organism at all . if young adults without a premorbid history present with a brain abscess , pulmonary problems must be evaluated . 
this report highlights the need to consider pulmonary avf as an etiology of cerebral abscess when routine investigations fail to detect a source .","<S> brain abscess commonly occurs secondary to an adjacent infection ( mostly in the middle ear or paranasal sinuses ) or due to hematogenous spread from a distant infection or trauma . </S> <S> pulmonary arteriovenous fistulas ( avfs ) are abnormal direct communications between the pulmonary artery and vein . </S> <S> we present two cases of brain abscess associated with asymptomatic pulmonary avf . </S> <S> a 65-year - old woman was admitted with a headache and cognitive impairment that aggravated 10 days prior . </S> <S> an magnetic resonance ( mr ) imaging revealed a brain abscess with severe edema in the right frontal lobe . </S> <S> we performed a craniotomy and abscess removal . </S> <S> bacteriological culture proved negative . </S> <S> her chest computed tomography ( ct ) showed multiple avfs . </S> <S> therapeutic embolization of multiple pulmonary avfs was performed and antibiotics were administered for 8 weeks . a 45-year - old woman presented with a 7-day history of progressive left hemiparesis . </S> <S> she had no remarkable past medical history or family history . on admission , </S> <S> blood examination showed a white blood cell count of 6290 cells / ul and a high sensitive c - reactive protein of 2.62 mg / l . </S> <S> ct and mr imaging with mr spectroscopy revealed an enhancing lesion involving the right motor and sensory cortex with marked perilesional edema that suggested a brain abscess . </S> <S> a chest ct revealed a pulmonary avf in the right upper lung . </S> <S> the pulmonary avf was obliterated with embolization . </S> <S> there needs to consider pulmonary avf as an etiology of cerebral abscess when routine investigations fail to detect a source . </S>"


The metric is an instance of [`datasets.Metric`](https://huggingface.co/docs/datasets/package_reference/main_classes.html#datasets.Metric):

In [13]:
metric

Metric(name: "rouge", features: {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}, usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each predictions
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLSum"`: rougeLsum splits text using `"
"`.
        See details in https://github.com/huggingface/datasets/issues/617
    use_stemmer: Bool indicating whether Porter stemmer should be used to strip word suffixes.
    use_agregator: Return aggregates if this is set to True
Retu

You can call its `compute` method with your predictions and labels, which need to be lists of decoded strings:

In [14]:
fake_preds = ["hello there", "general kenobi"]
fake_labels = ["hello there", "general kenobi"]
metric.compute(predictions=fake_preds, references=fake_labels)

{'rouge1': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rouge2': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeL': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0)),
 'rougeLsum': AggregateScore(low=Score(precision=1.0, recall=1.0, fmeasure=1.0), mid=Score(precision=1.0, recall=1.0, fmeasure=1.0), high=Score(precision=1.0, recall=1.0, fmeasure=1.0))}
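
To make the numbers above less opaque, here is a minimal pure-Python sketch of how a ROUGE-1 precision, recall and F-measure can be computed from unigram overlap. The function name `rouge1_sketch` is hypothetical and the whitespace tokenization is a simplifying assumption: the metric loaded in this notebook tokenizes differently, can apply stemming and aggregates scores over the whole list of examples, so this is only an illustration of the idea and not a replacement for it.

```python
from collections import Counter


def rouge1_sketch(prediction: str, reference: str) -> dict:
    """Unigram-overlap scores for one prediction/reference pair (illustrative only)."""
    # Assumption: lowercase + whitespace split instead of the real metric's tokenizer.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Clipped overlap: each unigram counts at most as often as it appears in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(pred_tokens) if pred_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    denom = precision + recall
    fmeasure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "fmeasure": fmeasure}


# Identical strings give a perfect score, as in the cell above.
perfect = rouge1_sketch("hello there", "hello there")
# One shared unigram out of two predicted and three reference tokens.
partial = rouge1_sketch("general kenobi", "hello there general")
print(perfect)
print(partial)
```

In the partial case, precision is 1/2 and recall is 1/3, so the F-measure is 0.4.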

## Preprocessing the data

Before we can feed those texts to our model, we need to preprocess them. This is done by a 🤗 `Transformers` `Tokenizer` which will (as the name indicates) tokenize the inputs (including converting the tokens to their corresponding IDs in the pretrained vocabulary) and put it in a format the model expects, as well as generate the other inputs that the model requires.

To do all of this, we instantiate our tokenizer with the `AutoTokenizer.from_pretrained` method, which will ensure:

- we get a tokenizer that corresponds to the model architecture we want to use,
- we download the vocabulary used when pretraining this specific checkpoint.

That vocabulary will be cached, so it's not downloaded again the next time we run the cell.

In [15]:
from transformers import AutoTokenizer
    
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)

Downloading:   0%|          | 0.00/1.55k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/878k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]

Downloading:   0%|          | 0.00/1.29M [00:00<?, ?B/s]

By default, the call above will use one of the fast tokenizers (backed by Rust) from the 🤗 `Tokenizers` library.

You can directly call this tokenizer on one sentence or a pair of sentences:

In [16]:
tokenizer("Hello, this one sentence!")

{'input_ids': [0, 31414, 6, 42, 65, 3645, 328, 2], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1]}

Depending on the model you selected, you will see different keys in the dictionary returned by the cell above. They don't matter much for what we're doing here (just know they are required by the model we will instantiate later); you can learn more about them in [this tutorial](https://huggingface.co/transformers/preprocessing.html) if you're interested.

Instead of one sentence, we can pass along a list of sentences:

In [17]:
tokenizer(["Hello, this one sentence!", "This is another sentence."])

{'input_ids': [[0, 31414, 6, 42, 65, 3645, 328, 2], [0, 713, 16, 277, 3645, 4, 2]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]]}

To prepare the targets for our model, we need to tokenize them inside the `as_target_tokenizer` context manager. This will make sure the tokenizer uses the special tokens corresponding to the targets:

In [18]:
with tokenizer.as_target_tokenizer():
    print(tokenizer(["Hello, this one sentence!", "This is another sentence."]))

{'input_ids': [[0, 31414, 6, 42, 65, 3645, 328, 2], [0, 713, 16, 277, 3645, 4, 2]], 'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1]]}


If you are using one of the five T5 checkpoints, we have to prefix the inputs with "summarize:" (the model can also translate, and it needs the prefix to know which task it has to perform).

In [19]:
if model_checkpoint in ["t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b"]:
    prefix = "summarize: "
else:
    prefix = ""

We can then write the function that will preprocess our samples. We just feed them to the `tokenizer` with the argument `truncation=True`. This ensures that any input longer than what the selected model can handle is truncated to the maximum length accepted by the model. The padding will be dealt with later on (in a data collator), so that we pad examples to the longest length in the batch and not in the whole dataset.

The max input length of `facebook/bart-large-cnn` is 1024, so `max_input_length = 1024`.
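
As a rough illustration of what truncation does to a long input, the sketch below cuts a list of token IDs so that the final sequence, including a start and an end token, never exceeds `max_length`. The helper `truncate_ids` is hypothetical, and the special-token IDs `0` and `2` are taken from the tokenizer outputs shown earlier in this notebook; the real tokenizer handles this internally when `truncation=True` is passed.

```python
def truncate_ids(content_ids, max_length, bos_id=0, eos_id=2):
    """Keep at most `max_length` tokens, reserving two slots for the special tokens."""
    # Assumption: exactly one start token and one end token wrap the content.
    budget = max_length - 2
    return [bos_id] + content_ids[:budget] + [eos_id]


# A short input is left untouched apart from the special tokens.
short = truncate_ids([31414, 6, 42], max_length=8)
# A long input loses its tail so that the total length equals max_length.
long = truncate_ids(list(range(100, 120)), max_length=8)
print(short)
print(long)
```

Only the beginning of an over-long article is kept, which is worth remembering when the documents are much longer than the model's limit.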

In [20]:
max_input_length = 1024
max_target_length = 256

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["article"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Setup the tokenizer for targets
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples["abstract"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

This function works with one or several examples. In the case of several examples, the tokenizer will return a list of lists for each key:

In [21]:
preprocess_function(raw_datasets['train'][:2])

{'input_ids': [[0, 405, 11493, 11, 55, 87, 654, 207, 9, 1484, 8, 189, 1338, 1814, 207, 11, 1402, 3505, 9, 16640, 2156, 941, 11, 1484, 11793, 17930, 8, 73, 368, 13785, 5804, 4, 134, 41, 23249, 16, 6533, 25, 41, 15650, 17215, 672, 9, 23385, 43202, 36, 1368, 428, 4839, 36, 1368, 428, 28696, 316, 821, 1589, 385, 462, 4839, 8, 189, 16072, 25, 10, 898, 9, 5, 7482, 2199, 2156, 13162, 2156, 2129, 10894, 2156, 17930, 2156, 50, 13785, 5804, 479, 6104, 3218, 3608, 14, 7967, 8, 18327, 139, 111, 2174, 797, 71, 13785, 5804, 2156, 941, 11, 471, 8, 5397, 16640, 2156, 189, 28, 13969, 30, 41, 23249, 4, 1978, 41, 23249, 747, 41089, 1290, 5298, 215, 25, 16069, 2156, 8269, 2156, 8, 25599, 642, 22423, 2156, 8, 4634, 189, 33, 10, 2430, 1683, 15, 1318, 9, 301, 36, 2231, 1168, 4839, 8, 819, 2194, 11, 1484, 19, 1668, 479, 4634, 2156, 7, 1477, 2166, 13838, 2156, 2231, 1168, 2156, 8, 17618, 32444, 11, 1484, 19, 1668, 2156, 24, 74, 28, 5701, 7, 185, 10, 16300, 1548, 11, 9397, 9883, 54, 240, 1416, 13, 1668, 111, 30

To apply this function to all the article/abstract pairs in our dataset, we just use the `map` method of the `raw_datasets` object we created earlier. This will apply the function to all the elements of all the splits in `raw_datasets`, so our training, validation and testing data will be preprocessed in one single command.

In [22]:
tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

  0%|          | 0/8 [00:00<?, ?ba/s]

  0%|          | 0/2 [00:00<?, ?ba/s]

  0%|          | 0/2 [00:00<?, ?ba/s]

Even better, the results are automatically cached by the 🤗 `Datasets` library to avoid spending time on this step the next time you run your notebook. The 🤗 `Datasets` library is normally smart enough to detect when the function you pass to `map` has changed (and thus that the cached data must not be used). For instance, it will properly detect if you change the task in the first cell and rerun the notebook. 🤗 `Datasets` warns you when it uses cached files; you can pass `load_from_cache_file=False` in the call to `map` to ignore the cached files and force the preprocessing to be applied again.

Note that we passed `batched=True` to encode the texts by batches together. This is to leverage the full benefit of the fast tokenizer we loaded earlier, which will use multi-threading to treat the texts in a batch concurrently.
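
To show the shape of the data that a batched preprocessing function receives, here is a small pure-Python sketch. `batched_map`, `toy_tokenize` and `toy_preprocess` are hypothetical stand-ins written for this illustration; they are not part of 🤗 `Datasets` or 🤗 `Transformers`, and the toy "tokenizer" simply maps each word to its length. The point is only that the function gets a dictionary of lists (one list per column) and returns a dictionary of lists of the same length.

```python
def toy_tokenize(texts):
    # Hypothetical stand-in for the tokenizer: one "ID" per word (its length).
    return {"input_ids": [[len(word) for word in text.split()] for text in texts]}


def toy_preprocess(examples):
    # Same structure as `preprocess_function`: columns in, new columns out.
    model_inputs = toy_tokenize(examples["article"])
    model_inputs["labels"] = toy_tokenize(examples["abstract"])["input_ids"]
    return model_inputs


def batched_map(rows, function, batch_size=2):
    """Apply `function` to chunks of rows, passing each chunk as a dict of lists."""
    processed = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Turn a list of row dicts into a dict of column lists.
        batch = {key: [row[key] for row in chunk] for key in chunk[0]}
        result = function(batch)
        # Merge the new columns back into each row.
        for i, row in enumerate(chunk):
            new_row = dict(row)
            new_row.update({key: values[i] for key, values in result.items()})
            processed.append(new_row)
    return processed


rows = [
    {"article": "a bb ccc", "abstract": "a bb"},
    {"article": "dddd", "abstract": "d"},
    {"article": "ee ff", "abstract": "ee"},
]
processed = batched_map(rows, toy_preprocess)
print(processed[0])
```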

## Fine-tuning the model

Now that our data is ready, we can download the pretrained model and fine-tune it. Since our task is of the sequence-to-sequence kind, we use the `AutoModelForSeq2SeqLM` class. Like with the tokenizer, the `from_pretrained` method will download and cache the model for us.

In [23]:
from transformers import AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq, Seq2SeqTrainingArguments, Seq2SeqTrainer

model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)

Downloading:   0%|          | 0.00/1.51G [00:00<?, ?B/s]

Note that we don't get a warning like in our classification example. This means we used all the weights of the pretrained model and there is no randomly initialized head in this case.

To instantiate a `Seq2SeqTrainer`, we will need to define three more things. The most important is the [`Seq2SeqTrainingArguments`](https://huggingface.co/transformers/main_classes/trainer.html#transformers.Seq2SeqTrainingArguments), which is a class that contains all the attributes to customize the training. It requires one folder name, which will be used to save the checkpoints of the model, and all other arguments are optional:

In [24]:
batch_size = 2
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-pubmed",
    evaluation_strategy = "epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=5,
    predict_with_generate=True,
    fp16=True,
    push_to_hub=True,
    seed = 42,
)

Here we set the evaluation to be done at the end of each epoch, tweak the learning rate, use the `batch_size` defined at the top of the cell and customize the weight decay. Since the `Seq2SeqTrainer` will save the model regularly and our dataset is quite large, we tell it to make three saves maximum. Lastly, we use the `predict_with_generate` option (to properly generate summaries) and activate mixed precision training (to go a bit faster).

The `push_to_hub` argument sets everything up so we can push the model to the [Hub](https://huggingface.co/models) regularly during training. Remove it if you didn't follow the installation steps at the top of the notebook. If you want to save your model locally under a name that is different from the name of the repository it will be pushed to, or if you want to push your model under an organization and not your own namespace, use the `hub_model_id` argument to set the repo name (it needs to be the full name, including your namespace: for instance `"sgugger/t5-finetuned-xsum"` or `"huggingface/t5-finetuned-xsum"`).
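As a minimal sketch of the `hub_model_id` option described above, the snippet below builds the Hub-related keyword arguments as a plain dictionary. The helper name `hub_kwargs` is hypothetical and not part of `transformers`; the namespace and repository name are only the illustrative values used in the text.

```python
def hub_kwargs(namespace, repo_name, push=True):
    """Build Hub-related keyword arguments for Seq2SeqTrainingArguments.

    Hypothetical helper for illustration only; it is not part of transformers.
    """
    if not push:
        # Without pushing to the Hub, no repository name is needed.
        return {"push_to_hub": False}
    # hub_model_id must be the full repository name, including the namespace.
    return {"push_to_hub": True, "hub_model_id": f"{namespace}/{repo_name}"}


kwargs = hub_kwargs("huggingface", "t5-finetuned-xsum")
print(kwargs)
```

The resulting dictionary could then be unpacked next to the other arguments, for example `Seq2SeqTrainingArguments("some-local-folder", **kwargs)`, so the local folder name and the Hub repository name can differ.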

Then, we need a special kind of data collator, which will not only pad the inputs to the maximum length in the batch, but also the labels:

In [25]:
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
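To make the label padding more concrete, here is a small pure-Python sketch of the idea, not the collator's actual implementation. It assumes labels are right-padded with `-100` (the value the loss ignores) and uses made-up token ids; `pad_token_id=1` is likewise an assumed value for illustration.

```python
LABEL_PAD_ID = -100  # positions with this value are ignored by the loss


def pad_labels(label_batch, pad_value=LABEL_PAD_ID):
    """Right-pad every label sequence to the longest one in the batch."""
    max_len = max(len(labels) for labels in label_batch)
    return [labels + [pad_value] * (max_len - len(labels)) for labels in label_batch]


def restore_pad_token(padded_batch, pad_token_id, pad_value=LABEL_PAD_ID):
    """Swap the ignore value back to a real token id so the labels can be decoded."""
    return [
        [pad_token_id if token == pad_value else token for token in labels]
        for labels in padded_batch
    ]


batch = [[0, 713, 16, 2], [0, 250, 2]]  # made-up token ids
padded = pad_labels(batch)
print(padded)                                     # [[0, 713, 16, 2], [0, 250, 2, -100]]
print(restore_pad_token(padded, pad_token_id=1))  # [[0, 713, 16, 2], [0, 250, 2, 1]]
```

Because `-100` is not a valid token id, any code that decodes the labels back into text first has to replace it with the tokenizer's pad token, as `restore_pad_token` does here.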

The last thing to define for our `Seq2SeqTrainer` is how to compute the metrics from the predictions. We need to define a function for this, which will just use the `metric` we loaded earlier, and we have to do a bit of pre-processing to decode the predictions into texts:

In [26]:
import nltk
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    # Replace -100 in the labels as we can't decode them.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    
    # Rouge expects a newline after each sentence
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]
    
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    # Keep the mid F-measure of each ROUGE score, scaled to a percentage
    result = {key: value.mid.fmeasure * 100 for key, value in result.items()}
    
    # Add mean generated length
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)
    
    return {k: round(v, 4) for k, v in result.items()}
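To give an intuition for the numbers this function reports, the sketch below computes a simplified ROUGE-1 style F-measure by hand. It is an illustration only: it splits on whitespace and skips the stemming and aggregation that the loaded `metric` performs, so its values will not match the real metric exactly. The function name `unigram_f1` and the example sentences are made up.

```python
from collections import Counter


def unigram_f1(prediction, reference):
    """Simplified ROUGE-1 style F-measure on whitespace tokens, scaled to 0-100."""
    pred_counts = Counter(prediction.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: a word counts at most as often as it appears in both texts.
    overlap = sum((pred_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(ref_counts.values())
    return round(200 * precision * recall / (precision + recall), 4)


score = unigram_f1(
    "the drug reduced pain",
    "the drug reduced chronic pain in patients",
)
print(score)  # 72.7273
```

Here every predicted word appears in the reference (precision 1.0), but only four of the seven reference words are covered (recall 4/7), which gives an F-measure of 8/11, or about 72.7.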

Then we just need to pass all of this along with our datasets to the `Seq2SeqTrainer`:

In [27]:
trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)

Cloning https://huggingface.co/Kevincp560/bart-large-cnn-finetuned-pubmed into local empty directory.
Using amp half precision backend


We can now fine-tune our model by just calling the `train` method:

In [28]:
trainer.train()

The following columns in the training set  don't have a corresponding argument in `BartForConditionalGeneration.forward` and have been ignored: article, abstract.
***** Running training *****
  Num examples = 8000
  Num Epochs = 5
  Instantaneous batch size per device = 2
  Total train batch size (w. parallel, distributed & accumulation) = 2
  Gradient Accumulation steps = 1
  Total optimization steps = 20000


| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|
| 1 | 1.932 | 1.811004 | 38.1151 | 15.2255 | 23.4286 | 34.2521 | 141.8905 |
| 2 | 1.7001 | 1.779011 | 39.8217 | 16.3042 | 24.649 | 35.831 | 142.0 |
| 3 | 1.5 | 1.797123 | 40.6108 | 17.0446 | 25.1977 | 36.5556 | 141.9865 |
| 4 | 1.3316 | 1.81065 | 40.0466 | 16.4851 | 24.7094 | 36.0998 | 141.9335 |
| 5 | 1.1996 | 1.841579 | 40.4866 | 16.7472 | 24.9831 | 36.4002 | 142.0 |
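
As a quick way to read the table above, the following sketch picks the best epoch according to a few of its columns. The values are copied from the table; the variable names are illustrative, and which criterion to prefer is a judgment call that this notebook does not settle.

```python
# Validation results copied from the training table above.
results = [
    # (epoch, validation_loss, rouge1, rouge2)
    (1, 1.811004, 38.1151, 15.2255),
    (2, 1.779011, 39.8217, 16.3042),
    (3, 1.797123, 40.6108, 17.0446),
    (4, 1.81065, 40.0466, 16.4851),
    (5, 1.841579, 40.4866, 16.7472),
]

best_by_loss = min(results, key=lambda row: row[1])[0]
best_by_rouge1 = max(results, key=lambda row: row[2])[0]
best_by_rouge2 = max(results, key=lambda row: row[3])[0]
print(best_by_loss, best_by_rouge1, best_by_rouge2)  # 2 3 3
```

The validation loss is lowest after epoch 2 while the ROUGE scores peak after epoch 3. The training loss keeps falling through epoch 5, so the later epochs may mostly be fitting the training set.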


Saving model checkpoint to bart-large-cnn-finetuned-pubmed/checkpoint-500
Configuration saved in bart-large-cnn-finetuned-pubmed/checkpoint-500/config.json
Model weights saved in bart-large-cnn-finetuned-pubmed/checkpoint-500/pytorch_model.bin
tokenizer config file saved in bart-large-cnn-finetuned-pubmed/checkpoint-500/tokenizer_config.json
Special tokens file saved in bart-large-cnn-finetuned-pubmed/checkpoint-500/special_tokens_map.json
tokenizer config file saved in bart-large-cnn-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in bart-large-cnn-finetuned-pubmed/special_tokens_map.json
Saving model checkpoint to bart-large-cnn-finetuned-pubmed/checkpoint-1000
Configuration saved in bart-large-cnn-finetuned-pubmed/checkpoint-1000/config.json
Model weights saved in bart-large-cnn-finetuned-pubmed/checkpoint-1000/pytorch_model.bin
tokenizer config file saved in bart-large-cnn-finetuned-pubmed/checkpoint-1000/tokenizer_config.json
Special tokens file saved in bart-larg

TrainOutput(global_step=20000, training_loss=1.5518645141601564, metrics={'train_runtime': 28915.1413, 'train_samples_per_second': 1.383, 'train_steps_per_second': 0.692, 'total_flos': 8.65157255626752e+16, 'train_loss': 1.5518645141601564, 'epoch': 5.0})
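
The figures in the training log and in `TrainOutput` can be cross-checked with a little arithmetic. The sketch below is a sanity check only; it assumes a single device, no gradient accumulation (as the log reports), and that the 500-step checkpoint interval visible in the log holds for the whole run.

```python
import math

num_examples = 8000          # "Num examples" in the training log
batch_size = 2               # per-device batch size, one device, no accumulation
num_epochs = 5
train_runtime = 28915.1413   # seconds, from TrainOutput
save_every = 500             # assumed from the checkpoint-500 / checkpoint-1000 folders

steps_per_epoch = math.ceil(num_examples / batch_size)
total_steps = steps_per_epoch * num_epochs
steps_per_second = round(total_steps / train_runtime, 3)
samples_per_second = round(num_examples * num_epochs / train_runtime, 3)
hours = round(train_runtime / 3600, 2)
checkpoints_written = total_steps // save_every

print(total_steps)          # 20000
print(steps_per_second)     # 0.692
print(samples_per_second)   # 1.383
print(hours)                # 8.03
print(checkpoints_written)  # 40
```

The results match the reported 20000 optimization steps, 0.692 steps per second and 1.383 samples per second, and show that the run took roughly eight hours. With about 40 checkpoints written, `save_total_limit=3` is what keeps the disk usage bounded.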

You can now upload the result of the training to the Hub by executing this instruction:

In [29]:
trainer.push_to_hub()

Saving model checkpoint to bart-large-cnn-finetuned-pubmed
Configuration saved in bart-large-cnn-finetuned-pubmed/config.json
Model weights saved in bart-large-cnn-finetuned-pubmed/pytorch_model.bin
tokenizer config file saved in bart-large-cnn-finetuned-pubmed/tokenizer_config.json
Special tokens file saved in bart-large-cnn-finetuned-pubmed/special_tokens_map.json
Several commits (2) will be pushed upstream.
The progress bars may be unreliable.


Upload file pytorch_model.bin:   0%|          | 3.37k/1.51G [00:00<?, ?B/s]

Upload file runs/Feb28_10-34-18_0a745a4bc4c2/events.out.tfevents.1646044508.0a745a4bc4c2.82.0:  25%|##5       …

To https://huggingface.co/Kevincp560/bart-large-cnn-finetuned-pubmed
   b5eb817..fc211c7  main -> main

To https://huggingface.co/Kevincp560/bart-large-cnn-finetuned-pubmed
   fc211c7..28458c3  main -> main



'https://huggingface.co/Kevincp560/bart-large-cnn-finetuned-pubmed/commit/fc211c76a18a4439926266ad363e91630f947555'

You can now share this model with all your friends, family, and favorite pets: they can all load it with the identifier `"your-username/the-name-you-picked"`, for instance:

```python
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("sgugger/my-awesome-model")
```