# Possible models to use

## DistilBART - distilled version of BART, which is much smaller than the full BART model but retains much of its performance. Since it is distilled, it's faster and more efficient while still being well-suited for summarization tasks. DistilBART is designed for text summarization, and the cnn-12-6 variant is trained on news articles, making it a viable medium sized model for summarizing legal documents.

## T5 (Text-to-Text Transfer Transformer) - Small or Base - T5 treats every task as a text-to-text problem, making it very flexible for summarization. The small and base variants offer a middle ground between performance and model size, making them suitable for use cases where computational resources are limited.

In [1]:
from datasets import load_dataset

### Here I load the datasets and edit some of the columns prior to tokenizing the datasets

In [55]:
# Load the datasets
ds1_train = load_dataset("joelniklaus/legal_case_document_summarization", split='train')
ds1_train = ds1_train.remove_columns(['dataset_name'])
ds1_train = ds1_train.rename_column('judgement', 'text')
ds1_train = ds1_train.rename_column('summary', 'labels')
print(ds1_train)

ds1_test = load_dataset("joelniklaus/legal_case_document_summarization", split='test')
ds1_test = ds1_test.remove_columns(['dataset_name'])
ds1_test = ds1_test.rename_column('judgement', 'text')
ds1_test = ds1_test.rename_column('summary', 'labels')

# NOTE: This dataset only has 50 rows. It may not be a dataset we want to use.
# NOTE: THIS DATA IS NOT PLAYING NICELY WITH CONCATENATION
# Although the summaries appear to be good
ds2 = load_dataset("manasvikalyan/legal-documents-summary")
ds2 = ds2['data']
ds2 = ds2.remove_columns(['summary_a2'])
ds2 = ds2.rename_column('summary_a1', 'labels')
ds2 = ds2.rename_column('judgement', 'text')
print(ds2)

# TODO: need to split this dataset manually later

# NOTE: This dataset may not be useful the Task: Text Summarization. But moreso, option selection.
# Context: is a given legal scenario or fact pattern
# Options (Holdings): Multiple candidate holdings, one of which is correct.
# Labels: The correct holding is labeled to allow supervised learning and evaluation
ds3_train = load_dataset("coastalcph/lex_glue", "case_hold", split='train')
ds3_train = ds3_train.rename_column('label', 'labels')
ds3_test = load_dataset("coastalcph/lex_glue", "case_hold", split='test')
ds3_test = ds3_test.rename_column('label', 'labels')
print(ds3_train)

ds4_train = load_dataset("coastalcph/lex_glue", "ecthr_a", split='train')
ds4_test = load_dataset("coastalcph/lex_glue", "ecthr_a", split='test')
print(ds4_train)

ds5_train = load_dataset("coastalcph/lex_glue", "ecthr_b", split='train')
ds5_test = load_dataset("coastalcph/lex_glue", "ecthr_b", split='test')
print(ds5_train)

ds6_train = load_dataset("coastalcph/lex_glue", "eurlex", split='train')
ds6_test = load_dataset("coastalcph/lex_glue", "eurlex", split='test')
print(ds6_train)

ds7_train = load_dataset("coastalcph/lex_glue", "ledgar", split='train')
ds7_train = ds7_train.rename_column('label', 'labels')
ds7_test = load_dataset("coastalcph/lex_glue", "ledgar", split='test')
ds7_test = ds7_test.rename_column('label', 'labels')
print(ds7_train)

ds8_train = load_dataset("coastalcph/lex_glue", "scotus", split='train')
ds8_train = ds8_train.rename_column('label', 'labels')
ds8_test = load_dataset("coastalcph/lex_glue", "scotus", split='test')
ds8_test = ds8_test.rename_column('label', 'labels')
print(ds8_train)



Repo card metadata block was not found. Setting CardData to empty.


Dataset({
    features: ['text', 'labels'],
    num_rows: 7773
})


Repo card metadata block was not found. Setting CardData to empty.


Dataset({
    features: ['text', 'labels'],
    num_rows: 50
})
Dataset({
    features: ['context', 'endings', 'labels'],
    num_rows: 45000
})
Dataset({
    features: ['text', 'labels'],
    num_rows: 9000
})
Dataset({
    features: ['text', 'labels'],
    num_rows: 9000
})
Dataset({
    features: ['text', 'labels'],
    num_rows: 55000
})
Dataset({
    features: ['text', 'labels'],
    num_rows: 60000
})
Dataset({
    features: ['text', 'labels'],
    num_rows: 5000
})


### Here I am pre-processing the data for the DistilBART model

In [33]:
from transformers import BartTokenizer

In [34]:
# Load the BART tokenizer
tokenizer = BartTokenizer.from_pretrained('sshleifer/distilbart-cnn-12-6')

In [39]:
# Tokenization function for text and summaries
def tokenize_function(examples):
    # Tokenize the input text
    inputs = tokenizer(examples['text'], max_length=512, truncation=True, padding='max_length')
    
    # Tokenize the output summary labels
    with tokenizer.as_target_tokenizer():
        labels = tokenizer(examples['labels'], max_length=150, truncation=True, padding='max_length')

    # Set the tokenized labels in the input dictionary
    inputs['labels'] = labels['input_ids']
    
    return inputs

def tokenize_function_for_ds4(examples):
    # Tokenize each item in the list of 'text' entries
    inputs = tokenizer(
        examples['text'],
        max_length=512,
        truncation=True,
        padding='max_length',
        is_split_into_words=True  # Add this if each item is already tokenized/split into words
    )
    return inputs

def tokenize_function_for_ds6(examples):
    # Tokenize the input text
    inputs = tokenizer(examples['text'], max_length=512, truncation=True, padding='max_length')

    # If labels are in batches, process accordingly
    if 'label' in examples:
        with tokenizer.as_target_tokenizer():
            labels = tokenizer(
                [str(label) for label in examples['label']],
                max_length=150,
                truncation=True,
                padding='max_length'
            )
        inputs['labels'] = labels['input_ids']
    
    return inputs

def tokenize_function_for_ds7(examples):
    # Tokenize the input text
    inputs = tokenizer(examples['text'], max_length=512, truncation=True, padding='max_length')
    
    # Convert labels to strings if necessary
    labels = [str(label) for label in examples['labels']]
    
    with tokenizer.as_target_tokenizer():
        tokenized_labels = tokenizer(labels, max_length=150, truncation=True, padding='max_length')
    
    # Set the tokenized labels in the input dictionary
    inputs['labels'] = tokenized_labels['input_ids']
    return inputs

### Here I am just Tokenizing 'ds1' and 'ds2' for DistilBART (ds1_train and ds2_actual)

### TODO: Tokenize the training set data later

In [56]:
# Tokenize the datasets for DistilBART
# Training Data
ds1_train_tokenized = ds1_train.map(tokenize_function, batched=True)

ds2_tokenized = ds2.map(tokenize_function, batched=True)
ds2_tokenized = ds2_tokenized.train_test_split(test_size=0.2)
ds2_train_tokenized = ds2_tokenized['train'] 

# ds3_train_tokenized = ds3_train.map(tokenize_function, batched=True) <-- multiple choice data
ds4_train_tokenized = ds4_train.map(tokenize_function_for_ds4, batched=True)
ds5_train_tokenized = ds5_train.map(tokenize_function_for_ds4, batched=True)
ds6_train_tokenized = ds6_train.map(tokenize_function_for_ds6, batched=True)
ds7_train_tokenized = ds7_train.map(tokenize_function_for_ds7, batched=True)
ds8_train_tokenized = ds8_train.map(tokenize_function_for_ds7, batched=True)

# Testing Data
ds1_test_tokenized = ds1_test.map(tokenize_function, batched=True)

ds2_test_tokenized = ds2_tokenized['test']

# ds3_test_tokenized = ds3_test.map(tokenize_function, batched=True) <-- multiple choice data
ds4_test_tokenized = ds4_test.map(tokenize_function_for_ds4, batched=True)
ds5_test_tokenized = ds5_test.map(tokenize_function_for_ds4, batched=True)
ds6_test_tokenized = ds6_test.map(tokenize_function_for_ds6, batched=True)
ds7_test_tokenized = ds7_test.map(tokenize_function_for_ds7, batched=True)
ds8_test_tokenized = ds8_test.map(tokenize_function_for_ds7, batched=True)


Map:   0%|          | 0/200 [00:00<?, ? examples/s]



Map:   0%|          | 0/1000 [00:00<?, ? examples/s]

Map:   0%|          | 0/1000 [00:00<?, ? examples/s]

Map:   0%|          | 0/5000 [00:00<?, ? examples/s]

Map:   0%|          | 0/10000 [00:00<?, ? examples/s]

Map:   0%|          | 0/1400 [00:00<?, ? examples/s]

In [67]:
# Checking the features of each dataset
print("ds1_train_tokenized features:", ds1_train_tokenized.features)
print("ds2_train_tokenized features:", ds2_train_tokenized.features)
print("ds4_train_tokenized features:", ds4_train_tokenized.features)
print("ds5_train_tokenized features:", ds5_train_tokenized.features)
print("ds6_train_tokenized features:", ds6_train_tokenized.features)
print("ds7_train_tokenized features:", ds7_train_tokenized.features)
print("ds8_train_tokenized features:", ds8_train_tokenized.features)

ds1_train_tokenized features: {'text': Value(dtype='string', id=None), 'labels': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None), 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None), 'attention_mask': Sequence(feature=Value(dtype='int8', id=None), length=-1, id=None)}
ds2_train_tokenized features: {'text': Value(dtype='string', id=None), 'labels': Sequence(feature=Value(dtype='int64', id=None), length=-1, id=None), 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None), 'attention_mask': Sequence(feature=Value(dtype='int8', id=None), length=-1, id=None)}
ds4_train_tokenized features: {'text': Sequence(feature=Value(dtype='string', id=None), length=-1, id=None), 'labels': Sequence(feature=ClassLabel(names=['2', '3', '5', '6', '8', '9', '10', '11', '14', 'P1-1'], id=None), length=-1, id=None), 'input_ids': Sequence(feature=Value(dtype='int32', id=None), length=-1, id=None), 'attention_mask': Sequence(feature=Value

### Concatenating Tokenized Datasets

Based on the schemas of the tokenized datasets, there are differences in the structure of the `text` and `labels` fields:

- **`text` Field**:  
   - In `ds1_train_tokenized`, `ds2_train_tokenized`, `ds6_train_tokenized`, `ds7_train_tokenized`, and `ds8_train_tokenized`, the `text` field is of type `Value(dtype='string')` (a single string).
   - In `ds4_train_tokenized` and `ds5_train_tokenized`, the `text` field is of type `Sequence(feature=Value(dtype='string'))` (a sequence of strings).

- **`labels` Field**:  
   - In `ds1_train_tokenized`, `ds2_train_tokenized`, `ds7_train_tokenized`, and `ds8_train_tokenized`, the `labels` field is a `Sequence` of `int64` values.
   - In `ds4_train_tokenized`, `ds5_train_tokenized`, and `ds6_train_tokenized`, the `labels` field is a `Sequence` of `ClassLabel` objects.

### Which Datasets Can Be Concatenated?

1. **Datasets with Matching `text` and `labels` Fields:**
   - The following datasets have the same `text` and `labels` types and can be concatenated directly:
     - `ds1_train_tokenized`
     - `ds2_train_tokenized`
     - `ds7_train_tokenized`
     - `ds8_train_tokenized`

   These datasets all have `text` as `Value(dtype='string')` and `labels` as `Sequence(feature=Value(dtype='int64'))`.

2. **Datasets with `ClassLabel` in `labels`:**
   - The following datasets have `ClassLabel` in the `labels` field and can be concatenated after aligning the `text` field:
     - `ds4_train_tokenized`
     - `ds5_train_tokenized`
     - `ds6_train_tokenized`

   Note that `ds4_train_tokenized` and `ds5_train_tokenized` have `text` as `Sequence(feature=Value(dtype='string'))`, while `ds6_train_tokenized` has `text` as `Value(dtype='string')`. You will need to cast these to the same type before concatenating.

### How to Concatenate?

1. **Concatenating Compatible Datasets Directly**:
   You can concatenate the following datasets directly:
   
   `combined_training_tokenized_dataset = concatenate_datasets([
       ds1_train_tokenized, 
       ds2_train_tokenized,
       ds7_train_tokenized,
       ds8_train_tokenized
   ])
    
2. **Aligning Features for Other Datasets**:
   For datasets with differing `text` fields, they can be casted to a consistent type before concatenating:
   
   `
    ds4_train_tokenized = ds4_train_tokenized.cast({'text': Value('string')})
    ds5_train_tokenized = ds5_train_tokenized.cast({'text': Value('string')})
    combined_classlabel_tokenized_dataset = concatenate_datasets([
        ds4_train_tokenized, 
        ds5_train_tokenized, 
        ds6_train_tokenized
        ])
    

## TODO: choose what datasets to concatenate and how to concatenate them.

In [70]:
from datasets import concatenate_datasets

In [75]:
combined_training_tokenized_dataset = concatenate_datasets([
ds1_train_tokenized, 
ds2_train_tokenized,
ds7_train_tokenized,
ds8_train_tokenized
])

combined_testing_tokenized_dataset = concatenate_datasets([
ds1_test_tokenized, 
ds2_test_tokenized,
ds7_test_tokenized,
ds8_test_tokenized
])

### TODO: set the other dataset formats later:

### Extra Columns (`input_ids`, `attention_mask`, `labels`)
- **`input_ids`**: Token IDs representing the input text for the model.
- **`attention_mask`**: Identifies which tokens are real and which are padding.
- **`labels`**: Token IDs representing the target summary, used for training.
These columns are essential for the model to properly process inputs, ignore padding, and learn to generate correct summaries during training.


In [77]:
# Set the dataset format to PyTorch tensors
# print(ds1_train_tokenized)
combined_training_tokenized_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])
combined_testing_tokenized_dataset.set_format(type='torch', columns=['input_ids', 'attention_mask', 'labels'])

### Load the DistilBART model here

In [78]:
from transformers import BartForConditionalGeneration

In [79]:
# Load the DistilBART model for conditional generation
model = BartForConditionalGeneration.from_pretrained('sshleifer/distilbart-cnn-12-6')

### Setting up training arguments for the model here

### TODO: These can be modified later to improve the model

In [80]:
from transformers import TrainingArguments, Trainer

In [81]:
# Set up training arguments
training_args = TrainingArguments(
    output_dir='./results',            # output directory
    eval_strategy="epoch",       # evaluate at each epoch
    learning_rate=5e-5,                # learning rate
    per_device_train_batch_size=4,     # batch size for training
    per_device_eval_batch_size=4,      # batch size for evaluation
    num_train_epochs=3,                # number of training epochs
    weight_decay=0.01,                 # strength of weight decay
    save_total_limit=2,                # only keep last 2 checkpoints
)

In [82]:
# Initialize the Trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=combined_training_tokenized_dataset,
    eval_dataset=combined_testing_tokenized_dataset
)

### Training the model here

In [83]:
# Train the model
trainer.train()

Epoch,Training Loss,Validation Loss


Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_repeat_ngram_size': 3, 'forced_bos_token_id': 0, 'forced_eos_token_id': 2}
Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_repeat_ngram_size': 3, 'forced_bos_token_id': 0, 'forced_eos_token_id': 2}
Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_repeat_ngram_size': 3, 'forced_bos_token_id': 0, 'forced_eos_token_id': 2}
Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_repeat_ngram_size': 3, 'forced_bos_token_id': 0, 'forced_eos_token_id': 2}
Non-default generation parameters: {'max_length': 142, 'min_length': 56, 'early_stopping': True, 'num_beams': 4, 'length_penalty': 2.0, 'no_

KeyboardInterrupt: 

In [None]:
# Evaluate the model
eval_results = trainer.evaluate()
print(eval_results)

### Training and Evaluation Results

After training the DistilBART model for **3 epochs** on the legal case summarization dataset, we achieved the following results:

#### Training Metrics:
- **Training Loss**: **1.8569**
  - The training loss represents the average difference between the predicted token probabilities and the actual tokens across the entire dataset. For a complex task like summarization, this loss value indicates that the model is learning effectively.
  - While ideally a loss closer to zero is better, for sequence generation tasks involving long and complex legal texts, a value around **1.8** is reasonable. The model is capturing the patterns within the legal data without significant overfitting.

#### Evaluation Metrics:
- **Evaluation Loss**: **1.9931**
  - The evaluation loss is slightly higher than the training loss, which suggests that the model generalizes moderately well to unseen data. This is a positive sign as it implies that the model has not overfit significantly to the training dataset.
  - Summarization models, particularly with large input/output sequences and complex legal terminology, typically have evaluation loss values greater than **1**. The small difference between the training and evaluation loss indicates good generalization.

- **Evaluation Runtime**: **3,726.99 seconds** (~62 minutes)
  - This is the time taken to evaluate the model over the validation set. The runtime is reasonable considering the complexity of the task and the length of the input sequences.

- **Samples per Second**:
  - **Training**: **0.407** samples per second
  - **Evaluation**: **0.417** samples per second
  - These rates are consistent across training and evaluation, indicating that the model was trained and evaluated with stable performance given the computational resources. The relatively low samples per second can be attributed to the complexity of processing long legal documents and generating summaries.

#### Interpretation of Loss Values:
- **Training Loss and Evaluation Loss**:
  - The **training loss of 1.8569** compared to the **evaluation loss of 1.9931** indicates that the model is not significantly overfitting to the training set, which is a good outcome. The slight increase in evaluation loss shows that the model is encountering some additional complexity when dealing with unseen data, which is expected.
  - In general, for summarization tasks involving complex data, a loss in the range of **1.5 - 3.0** is typical. This is due to the nature of cross-entropy loss accumulating over long sequences of tokens. Thus, the current loss values are quite reasonable.

#### Next Steps for Improvement:
1. **Hyperparameter Tuning**:
   - Consider adjusting the learning rate or using **scheduled learning rate decay** to help further reduce the training and evaluation loss.
2. **Additional Training Epochs**:
   - Training for an additional **1-2 epochs** could further reduce the loss, provided that overfitting is controlled.
3. **Regularization Techniques**:
   - **Weight Decay** or **Dropout** could be introduced to help improve generalization.
4. **Evaluate with ROUGE Metric**:
   - In addition to using loss as a performance measure, evaluating the model with **ROUGE** scores can give a more targeted assessment of how well the summaries capture the important content from the legal texts.

#### Summary:
- The **training and evaluation losses** are reasonable for a text generation task involving legal documents. The model seems to be learning effectively without significant overfitting.
- Further improvement can be achieved through hyperparameter tuning, training for additional epochs, and using metrics such as **ROUGE** to better evaluate the quality of the generated summaries.

The next logical step is to test the quality of the generated summaries by comparing them with the reference summaries and calculating relevant metrics to better understand the model's performance.


# Training with an NVIDIA H100 SXM 

### Model Training and Evaluation Summary

After training the model for **3 epochs** on the legal case summarization dataset, we achieved the following results:

#### Training Metrics:
- **Training Loss**:
  - **Epoch 1**: 0.278700
  - **Epoch 2**: 0.246900
  - **Epoch 3**: 0.125400
  - The training loss consistently decreased across epochs, which indicates that the model is effectively learning from the dataset. This is a positive trend, especially for a sequence generation task, as the model is gradually fitting the data with better accuracy over time.

#### Evaluation Metrics:
- **Validation Loss**:
  - **Epoch 1**: 0.088182
  - **Epoch 2**: 0.141818
  - **Epoch 3**: 0.117075
  - The validation loss shows a slight fluctuation between epochs, with the lowest loss occurring in Epoch 1. The higher values in Epoch 2 and 3 indicate that the model may be encountering some complexity in generalizing, though it still performs well overall.

- **Evaluation Loss**: **0.1170758718937302**
  - The evaluation loss suggests that the model generalizes well to unseen data. The small difference between training and evaluation losses demonstrates good generalization, which is a positive sign, as it implies the model is not overfitting.

- **Evaluation Runtime**: **90.6238 seconds** (~1.5 minutes)
  - The evaluation process took about 90 seconds, which is quite efficient given the complexity of the model and dataset.

- **Samples per Second**:
  - **Training**: **36.789** samples per second
  - **Evaluation**: **128.112** samples per second
  - The evaluation speed is significantly higher than the training speed, likely due to the larger computational requirements for backpropagation during training. Both rates suggest the model was trained and evaluated efficiently.

#### Interpretation of Loss Values:
- **Training Loss and Validation Loss**:
  - The training loss consistently decreases, while the validation loss stabilizes. This is indicative of a well-performing model with minimal overfitting. Given that the validation loss is close to the training loss, the model seems to generalize well to unseen data, which is crucial for real-world applications.

#### Next Steps for Improvement:
1. **Hyperparameter Tuning**:
   - Adjust the learning rate or batch size to see if further reductions in loss are possible.
2. **Additional Training Epochs**:
   - Training for an additional **1-2 epochs** could help reduce the loss further, though care should be taken to avoid overfitting.
3. **Regularization Techniques**:
   - Techniques such as **Dropout** or **Weight Decay** could be introduced to further improve generalization and reduce validation loss.
4. **Advanced Evaluation Metrics**:
   - Consider using additional metrics like **ROUGE** or **BLEU** to evaluate the model's performance in terms of summarization quality, not just loss.

#### Summary:
- The **training and validation losses** are promising for a complex text generation task. The model is learning effectively and generalizing well to the validation set without signs of significant overfitting.
- Further improvements could be made by fine-tuning hyperparameters, increasing the number of training epochs, and using advanced evaluation metrics such as ROUGE or BLEU.

Next, we can analyze the quality of the generated summaries against reference summaries to further assess the model's performance.


------------------------------------------------------------------------------------------
# Example 107:
## Original Summary: 
On 9 September 2004 the appellant, Steven Allison, was convicted after trial in the High Court at Glasgow of four contraventions of section 4(3)(b) of the Misuse of Drugs Act 1971.
In effect, he was found guilty of being concerned in the supplying of cocaine and three other controlled drugs at his home in Cumbernauld, at an address in Falkirk and elsewhere in the United Kingdom, between 12 November and 3 December 2003.
The trial judge, Lord Bracadale, sentenced him to 8 years imprisonment.
The appellant appealed against both his conviction and sentence.
On 7 November 2008 the appeal court (Lord Osborne, Lady Paton and Lord Philip) refused his appeal against conviction, leaving his appeal against sentence to be heard on a date to be fixed.
Among his grounds of appeal against conviction was one which was first advanced in an additional Note of Appeal.
It relates to the record of a police interview of a John Stronach.
Mr Stronach had died before the trial and the Crown introduced the interview into evidence in accordance with the procedure in section 259(5) of the Criminal Procedure (Scotland) Act 1995.
Neither before nor during the trial did the Crown disclose to the defence that Mr Stronach had a number of previous convictions and outstanding charges.
In particular, he had convictions for reset, theft by opening lockfast places, assault and robbery and assault and breach of the peace.
He also had a number of outstanding charges, including two alleged contraventions of the Misuse of Drugs Act 1971, an alleged theft by housebreaking and several alleged contraventions of the Road Traffic Act 1988.
One of the outstanding cases under the Misuse of Drugs Act related to events covered by the trial and was known to the appellants legal advisers.
The Crown disclosed the previous convictions and the other outstanding charges only while the appellants appeal was pending before the appeal court.
This prompted the appellant to lodge his additional ground of appeal: The failure on the part of the Crown to disclose to the defence the existence of all the previous convictions and outstanding charges resulted in the defence being unable to prepare and properly conduct their defence and the result was that the appellant did not receive a fair trial, as guaranteed by article 6(1) of the European Convention on Human Rights.
Following the dismissal of his appeal by the appeal court, the appellant applied for leave to appeal to the Privy Council in relation to the additional ground of appeal.
On 6 March 2009 the appeal
*****************************************************************************************
## Generated Summary: 
On 9 September 2004 Steven Allison was convicted of four contraventions of section 4(3)(b) of the Misuse of Drugs Act 1971.
He was found guilty of being concerned in the supply of cocaine and three other controlled drugs at his home in Cumbernauld, at an address in the United Kingdom, between 12 November and 3 December 2003.
The trial judge, Lord Bracadale, sentenced him to 8 years imprisonment.
In appeal, the appeal court refused his appeal against conviction, leaving his appeal to be heard on a date to be fixed.
One of the outstanding cases under the misuse of drugs Act 1971 related to events covered by the trial and was known to the appellants legal advisers.

------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
# Example 2341:
## Original Summary: 
THIS JOINDER SHALL BE CONSTRUED AND ENFORCED IN ACCORDANCE WITH AND GOVERNED BY THE LAW OF THE STATE OF NEW YORK WITHOUT REFERENCE TO ITS CONFLICT OF LAWS PRINCIPLES TO THE EXTENT THAT THE SAME ARE NOT MANDATORILY APPLICABLE BY STATUTE AND WOULD PERMIT OR REQUIRE THE APPLICATION OF LAWS OF ANOTHER JURISDICTION.
*****************************************************************************************
## Generated Summary: 
23 This appeal arises out of proceedings under the United Commonwealth ( Commonwealth of Delaware and Commonwealth of Pennsylvania) and is directed to be governed by the provisions of the, and is governed by that law by the United State of Delaware (the Delaware Act), which is a well


------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------
# Example 10:
## Original Summary: 
Appeal No. 251 of 1963.
Appeal by special leave from the judgment and order dated March 20, 1957, of the Patna High Court in Civil Revision No. 40 of 1956.
M. C. Setalvad, and R. C. Prasad, for the appellants.
The respondent did not appear.
March 24, 1964.
The short question which arises in this appeal is whether the term "wages" as defined by section 2(vi) of the (No. 4 of 1936) (hereinafter called 'the Act ') includes wages fixed by an award in an industrial dispute between the employer and his employees.
This question has to be answered in the light of the definition prescribed by section 2(vi) before it was amended in 1958.
The subsequent amendment expressly provides by section 2(vi) (a) that any remuneration payable under any award or settlement between the parties or order of a Court, would be included in the main definition under section 2(vi).
The point which we have to decide in the present appeal is whether the remuneration payable under an award was not already included in the definition of wages before the said definition was amended.
It is common ground that between the appellant, Sasamusa Sugar Works Ltd., and its workmen, the respondents, an award had been made by an Industrial Tribunal fixing the pay of the employees at Rs. 2/2/ per day, and in pursuance of the said award, the management of the appellant had entered into an agreement with the respondents that effect would be given to the wage structure, prescribed by the said award.
This agreement was subsequently published in the Bihar Gazette as a part of the award.
In spite of the award and the agreement, the appellant paid its employees only As.
/ 10 / per day and that led to the present claim made by the respondents under section 15 of the Act.
The respondents contended before the payment of wages authority that the refusal of the appellant to pay to them wages at the rate awarded, in substance, amounted to an illegal deduction from their wages and on that basis, they asked for an order from the authority directing the appellant to pay to the respondents the said prescribed wages.
The appellant raised two pleas against the respondents'claim.
It urged that section 15 of the Act was inapplicable, because the rates of wages fixed by the award did not fall within the definition of wages prescribed by section 2(
*****************************************************************************************
## Generated Summary:
An award had been made by an Industrial Tribunal fixing the pay of the employees at Rs. 2/2/ per day and in pursuance thereof the management of the appellant had entered into an agreement with the respondents that effect would be given to the wage structure prescribed by the said award.
This agreement was subsequently published in the Bihar Gazette as a part of the award and the appellant paid its employees only As.
/10/ per Day.
The respondents contended before the payment of wages authority that the appellant's refusal to pay to them wages at the rate awarded, in substance, amounted to an illegal deduction from their wages and on that basis, they asked for an order from the authority directing the appellant to pay the said