In [1]:
!ls data

dst_vocab      mulwoz_process nlu_process
mulwoz         nlu            ontology


In [2]:
sys_template = {
    'en': """You are a helpful AI assistant tasked with generating key-value pairs from a dialogue context based on schema.
        ###Task: Slot Extraction aims to extract all slots and corresponding values mentioned in the given dialogue context.
        If the value of a slot is mentioned, then the substring is formatted as 'inform [slot] [value]'.
        If the value of a slot is not mentioned, then the substring is formatted as 'request slot [slot]'. 
        The output is a concatenation of all substrings of all slots.
        ### Schema:
        food: the cuisine of the restaurant you are looking for, such as "british".
        area: the location or area of the restaurant, including "centre",  "north", "west", "south", "east".
        price range: price budget for the restaurant, including "cheap", "moderate", and "expensive".
        request: the attribute of a restaurant you are looking for, including "address", "area", "food", "phone", "price range",  "postcode",  "name".
        ### Example:
        """.replace('\n', '').strip(),
    'de': """Sie sind ein hilfreicher KI-Assistent, der die Aufgabe hat, Schlüssel-Wert-Paare aus einem Dialogkontext basierend auf einem Schema zu generieren.
         ###Aufgabe: Slot-Extraktion zielt darauf ab, alle Slots und entsprechenden Werte zu extrahieren, die im gegebenen Dialogkontext erwähnt werden.
         Wenn der Wert eines Slots erwähnt wird, wird die Teilzeichenfolge als „inform [slot] [value]“ formatiert.
         Wenn der Wert eines Slots nicht erwähnt wird, wird die Teilzeichenfolge als „request slot [slot]“ formatiert.
         Die Ausgabe ist eine Verkettung aller Teilzeichenfolgen aller Slots.
         ### Schema:
         essen: die Küche des Restaurants, das Sie suchen, z. B. "britische".
         gegend: der Standort oder Bereich des Restaurants, einschließlich "zentrum", "norden", "westen", "süden", "osten".
         preisklasse: Preisbudget für das Restaurant, einschließlich "billig", "mäßig", "teuer".
         request: das Attribut eines Restaurants, nach dem Sie suchen, einschließlich "adresse", "gegend", "essen", "telefon", "preisklasse", "postleitzahl", "name".
         ### Beispiel:
         """.replace('\n', '').strip(), 
    'it': """Sei un utile assistente AI incaricato di generare coppie chiave-valore da un contesto di dialogo basato su uno schema.
         ###Compito: L'estrazione degli slot mira a estrarre tutti gli slot e i valori corrispondenti menzionati nel contesto del dialogo dato.
         Se viene menzionato il valore di uno slot, la sottostringa viene formattata come 'inform [slot] [value]'.
         Se il valore di uno slot non viene menzionato, la sottostringa viene formattata come "request slot [slot]".
         L'output è una concatenazione di tutte le sottostringhe di tutti gli slot.
         ### Schema:
         cibo: la cucina del ristorante che stai cercando, ad esempio quella "bistro".
         area: l'ubicazione o la zona del ristorante, compreso "centro", "nord", "ovest", "sud", "est".
         prezzo: budget di prezzo per il ristorante, incluso "economico", "moderato", "caro".
         request: l'attributo di un ristorante che stai cercando, tra cui "indirizzo", "area", "cibo", "telefono", "prezzo","codice postale", "nome".
         ### Esempio:
    """.replace('\n', '').strip(),
}


examples = {
    'en': {
        "input": "<|user|> i want to find a moderately priced restaurant in the west part of town . what is the address and the postcode ?", 
         "output": "request slot postcode, request slot address, inform price range moderate, inform area west"
    },
    'de': {
        "input": "<|user|> hallo , ich suchen ein restaurant mit fairen preisen . <|system_actions|> request slot essen <|user|> also , ich will in dem norden essen gehen , be geben 's da so ?", 
         "output": "inform preisklasse mäßig, inform gegend norden"
    },
    'it' : {
        "input": "<|user|> sto cercare un ristorante a prezzo modico . <|system_actions|> request slot cibo <|user|> cucina turco .",
        "output": "inform cibo turco,inform prezzo moderato"
    }
}

dst = {
    'en': sys_template['en'] + str(examples['en']),
    'de': sys_template['de'] + str(examples['de']),
    'it': sys_template['it'] + str(examples['it']),
}
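
The output format described in the templates ('inform [slot] [value]' and 'request slot [slot]', joined by commas) can be read back into structured acts. The following is a minimal sketch, not part of the original notebook: `parse_belief` and its `slots` parameter are hypothetical names, and it assumes that values never contain commas and that the slot names are known in advance.

```python
def parse_belief(belief, slots=('price range', 'food', 'area')):
    """Split a belief string into (act, slot, value) tuples.

    Parts that match neither pattern are silently dropped.
    """
    acts = []
    # Try longer slot names first so multi-word names are matched whole.
    ordered = sorted(slots, key=len, reverse=True)
    for part in belief.split(','):
        part = part.strip()
        if part.startswith('request slot '):
            acts.append(('request', part[len('request slot '):], None))
        elif part.startswith('inform '):
            rest = part[len('inform '):]
            for slot in ordered:
                if rest.startswith(slot + ' '):
                    acts.append(('inform', slot, rest[len(slot) + 1:]))
                    break
    return acts


example = ("request slot postcode, request slot address, "
           "inform price range moderate, inform area west")
acts = parse_belief(example)
print(acts)
```

Because each part is stripped after splitting on commas, the Italian example output, which has no space after the comma, parses the same way when called with `slots=('cibo', 'prezzo', 'area')`.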

In [3]:
dst['en']

'You are a helpful AI assistant tasked with generating key-value pairs from a dialogue context based on schema.        ###Task: Slot Extraction aims to extract all slots and corresponding values mentioned in the given dialogue context.        If the value of a slot is mentioned, then the substring is formatted as \'inform [slot] [value]\'.        If the value of a slot is not mentioned, then the substring is formatted as \'request slot [slot]\'.         The output is a concatenation of all substrings of all slots.        ### Schema:        food: the cuisine of the restaurant you are looking for, such as "british".        area: the location or area of the restaurant, including "centre",  "north", "west", "south", "east".        price range: price budget for the restaurant, including "cheap", "moderate", and "expensive".        request: the attribute of a restaurant you are looking for, including "address", "area", "food", "phone", "price range",  "postcode",  "name".        ### Example:
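
The long runs of spaces in the string above come from the template indentation: `.replace('\n', '')` removes the line breaks but keeps the leading spaces of every line. If a compact prompt is preferred, one option is to collapse all whitespace runs. This is a sketch only; `collapse_whitespace` is a hypothetical helper, and applying it would change the instruction text relative to the outputs shown in this notebook (it would also merge the double spaces inside the schema lists).

```python
import re


def collapse_whitespace(text):
    """Replace every run of whitespace (newlines, indentation) with one space."""
    return re.sub(r'\s+', ' ', text).strip()


raw = """Line one.
        ### Schema:
        food: the cuisine.
        """

# Dropping only the newlines keeps the indentation as a run of spaces.
newline_only = raw.replace('\n', '').strip()
collapsed = collapse_whitespace(raw)

print(repr(newline_only))
print(repr(collapsed))
```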

In [4]:
import json
import re

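def convert_txt_to_jsonl(instruction, f):
    """Convert a text file of tagged dialogue lines into instruction records."""
    data_list = []
    with open(f, 'r', encoding='utf-8') as fr:
        for line in fr:
            line = line.strip()
            # Dialogue history: text between <|context|> and <|endofcontext|>.
            context = line.split('<|endofcontext|>')[0].split('<|context|>')[-1].strip()
            # Target slot string: text between <|belief|> and <|endofbelief|>.
            belief_str = line.split('<|belief|>')[-1].split('<|endofbelief|>')[0].strip()
            # Skip turns whose belief state is empty.
            if belief_str:
                data_list.append({
                    "instruction": instruction,
                    "input": context,
                    "output": belief_str
                })
    # Write one JSON object per line so the file is valid JSON Lines.
    with open(f.replace('.txt', '.jsonl'), 'w', encoding='utf-8') as fw:
        for sample in data_list:
            fw.write(json.dumps(sample, ensure_ascii=False) + '\n')
    return data_list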
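
The extraction step can also be written with a single regular expression. The sketch below assumes each line has the shape `<|context|> ... <|endofcontext|> ... <|belief|> ... <|endofbelief|>`, which is inferred from the markers used in the function above; `parse_line` and the sample line are hypothetical. Unlike the split-based version, it returns `None` when a marker is missing instead of falling back to the whole line.

```python
import re

# Non-greedy groups capture the text between each pair of markers.
LINE_PATTERN = re.compile(
    r'<\|context\|>(?P<context>.*?)<\|endofcontext\|>'
    r'.*?<\|belief\|>(?P<belief>.*?)<\|endofbelief\|>'
)


def parse_line(line):
    """Return (context, belief) for one line, or None if the markers are absent."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    return match.group('context').strip(), match.group('belief').strip()


# Made-up sample line for illustration only.
sample = ('<|context|> <|user|> i want a cheap restaurant . <|endofcontext|> '
          '<|belief|> inform price range cheap <|endofbelief|>')
print(parse_line(sample))
```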
    

In [5]:
data_dir = 'data/mulwoz_process'

route_group = {
    0: '<en> <en>',
    1: '<de> <de>',
    2: '<it> <it>',
    3: '<en> <de>', # 3 <en> <en> <en_input>; 3 <de> <en> <en_input>; 3 <de> <de> <de_input> ; 3 <en> <de> <de_input> ;
    4: '<en> <it>', # 4 <en> <en> <en_input>; 4 <it> <en> <en_input>; 4 <it> <it> <it_input> ; 4 <en> <it> <it_input> ;
    5: '<de> <en>', # 5 <de> <de> <de_input>; 5 <en> <de> <de_input>; 5 <en> <en> <en_input> ; 5 <de> <en> <en_input> ;
    6: '<de> <it>', # 6 <de> <de> <de_input>; 6 <it> <de> <de_input>; 6 <it> <it> <it_input> ; 6 <de> <it> <it_input> ;
    7: '<it> <en>', # 7 <it> <it> <it_input>; 7 <en> <it> <it_input>; 7 <en> <en> <en_input> ; 7 <it> <en> <en_input> ;
    8: '<it> <de>'  # 8 <it> <it> <it_input>; 8 <de> <it> <it_input>; 8 <de> <de> <de_input> ; 8 <it> <de> <de_input> ;
}

# Route 0
dst_en_train_set = convert_txt_to_jsonl(f'<r0> <en> <en> {dst["en"]}', f'{data_dir}/beliefinput2delex_en_train.txt')
dst_en_val_set = convert_txt_to_jsonl(f'<r0> <en> <en> {dst["en"]}', f'{data_dir}/beliefinput2delex_en_val.txt')
dst_en_test_set = convert_txt_to_jsonl(f'<r0> <en> <en> {dst["en"]}', f'{data_dir}/beliefinput2delex_en_test.txt')
dst_en = {'train': dst_en_train_set, 'val': dst_en_val_set, 'test': dst_en_test_set}

dst_de_train_set = convert_txt_to_jsonl(f'<r1> <de> <de> {dst["de"]}', f'{data_dir}/beliefinput2delex_de_train.txt')
dst_de_val_set = convert_txt_to_jsonl(f'<r1> <de> <de> {dst["de"]}', f'{data_dir}/beliefinput2delex_de_val.txt')
dst_de_test_set = convert_txt_to_jsonl(f'<r1> <de> <de> {dst["de"]}', f'{data_dir}/beliefinput2delex_de_test.txt')
dst_de = {'train': dst_de_train_set, 'val':dst_de_val_set, 'test': dst_de_test_set}

dst_it_train_set = convert_txt_to_jsonl(f'<r2> <it> <it> {dst["it"]}', f'{data_dir}/beliefinput2delex_it_train.txt')
dst_it_val_set = convert_txt_to_jsonl(f'<r2> <it> <it> {dst["it"]}', f'{data_dir}/beliefinput2delex_it_val.txt')
dst_it_test_set = convert_txt_to_jsonl(f'<r2> <it> <it> {dst["it"]}', f'{data_dir}/beliefinput2delex_it_test.txt')
dst_it = {'train': dst_it_train_set, 'val': dst_it_val_set, 'test': dst_it_test_set}
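
The comments in `route_group` follow a regular pattern: a monolingual route has a single tag pair, while a mixed route `<a> <b>` expands to the four pairs `<a> <a>`, `<b> <a>`, `<b> <b>`, `<a> <b>`. The sketch below reproduces that expansion; `expand_route` and `route_pairs` are hypothetical names introduced for illustration.

```python
def expand_route(route_id, first, second):
    """Return the (route_id, tag_a, tag_b) variants listed in the route comments."""
    if first == second:
        return [(route_id, first, second)]
    return [
        (route_id, first, first),
        (route_id, second, first),
        (route_id, second, second),
        (route_id, first, second),
    ]


# Same pairs as route_group, stored as tuples instead of tag strings.
route_pairs = {
    0: ('en', 'en'), 1: ('de', 'de'), 2: ('it', 'it'),
    3: ('en', 'de'), 4: ('en', 'it'), 5: ('de', 'en'),
    6: ('de', 'it'), 7: ('it', 'en'), 8: ('it', 'de'),
}

variants = [v for rid, (a, b) in route_pairs.items() for v in expand_route(rid, a, b)]
print(len(variants))
print(variants[3:7])
```

Three monolingual routes plus six mixed routes with four pairs each give 27 variants in total.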

In [6]:
dst_en

{'train': [{'instruction': '<r0> <en> <en> You are a helpful AI assistant tasked with generating key-value pairs from a dialogue context based on schema.        ###Task: Slot Extraction aims to extract all slots and corresponding values mentioned in the given dialogue context.        If the value of a slot is mentioned, then the substring is formatted as \'inform [slot] [value]\'.        If the value of a slot is not mentioned, then the substring is formatted as \'request slot [slot]\'.         The output is a concatenation of all substrings of all slots.        ### Schema:        food: the cuisine of the restaurant you are looking for, such as "british".        area: the location or area of the restaurant, including "centre",  "north", "west", "south", "east".        price range: price budget for the restaurant, including "cheap", "moderate", and "expensive".        request: the attribute of a restaurant you are looking for, including "address", "area", "food", "phone", "price range",

In [7]:
# (route id, first language tag, second language tag). The second tag selects
# both the prompt template and the source file.
ROUTE_VARIANTS = [
    (0, 'en', 'en'), (1, 'de', 'de'), (2, 'it', 'it'),
    (3, 'en', 'en'), (3, 'de', 'en'), (3, 'de', 'de'), (3, 'en', 'de'),
    (4, 'en', 'en'), (4, 'it', 'en'), (4, 'it', 'it'), (4, 'en', 'it'),
    (5, 'de', 'de'), (5, 'en', 'de'), (5, 'en', 'en'), (5, 'de', 'en'),
    (6, 'de', 'de'), (6, 'it', 'de'), (6, 'it', 'it'), (6, 'de', 'it'),
    (7, 'it', 'it'), (7, 'en', 'it'), (7, 'en', 'en'), (7, 'it', 'en'),
    (8, 'it', 'it'), (8, 'de', 'it'), (8, 'de', 'de'), (8, 'it', 'de'),
]

def generate_split_data(splits=('train', 'val', 'test')):
    dst_dataset = {}
    for split in splits:
        dst_3_data_list = []
        # Append the records of every route variant in the order listed above.
        for route, first, second in ROUTE_VARIANTS:
            dst_3_data_list += convert_txt_to_jsonl(
                f'<r{route}> <{first}> <{second}> {dst[second]}',
                f'{data_dir}/beliefinput2delex_{second}_{split}.txt')
        dst_dataset[split] = dst_3_data_list

    return dst_dataset



In [8]:
dst_mix_en_de_it = generate_split_data()
print(dst_mix_en_de_it['train'][0])

{'instruction': '<r0> <en> <en> You are a helpful AI assistant tasked with generating key-value pairs from a dialogue context based on schema.        ###Task: Slot Extraction aims to extract all slots and corresponding values mentioned in the given dialogue context.        If the value of a slot is mentioned, then the substring is formatted as \'inform [slot] [value]\'.        If the value of a slot is not mentioned, then the substring is formatted as \'request slot [slot]\'.         The output is a concatenation of all substrings of all slots.        ### Schema:        food: the cuisine of the restaurant you are looking for, such as "british".        area: the location or area of the restaurant, including "centre",  "north", "west", "south", "east".        price range: price budget for the restaurant, including "cheap", "moderate", and "expensive".        request: the attribute of a restaurant you are looking for, including "address", "area", "food", "phone", "price range",  "postcode

In [9]:
def upload_to_hub(data, dataset_identifier='Jiahuan/dst_en'):
    from datasets import DatasetDict, Dataset
    from huggingface_hub import login
    import os

    # Read the Hugging Face API token from the HF_TOKEN environment variable;
    # set it outside the notebook and never hard-code it here.
    api_token = os.environ['HF_TOKEN']

    # Log in to the Hugging Face Hub
    login(token=api_token)

    dataset = DatasetDict({
        'train':Dataset.from_list(data['train']), 
        'val':Dataset.from_list(data['val']),
        'test':Dataset.from_list(data['test'])
    })
    dataset.push_to_hub(dataset_identifier)

    # Print some information about the dataset
    print(dataset)
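
Before pushing, it can help to confirm that every split is present and that each record carries the three fields the datasets are built with. The check below is a sketch using only the standard library; `validate_splits` and `REQUIRED_KEYS` are hypothetical names, not part of the notebook or of the `datasets` library.

```python
REQUIRED_KEYS = ('instruction', 'input', 'output')


def validate_splits(data, splits=('train', 'val', 'test')):
    """Return a list of readable problems; an empty list means nothing was found."""
    problems = []
    for split in splits:
        if split not in data:
            problems.append(f'missing split: {split}')
            continue
        for i, record in enumerate(data[split]):
            missing = [k for k in REQUIRED_KEYS if k not in record]
            if missing:
                problems.append(f'{split}[{i}] lacks {missing}')
    return problems


# Made-up records for illustration only.
good = {'instruction': 'x', 'input': 'y', 'output': 'z'}
bad = {'instruction': 'x', 'input': 'y'}
print(validate_splits({'train': [good], 'val': [bad]}))
```

A call such as `validate_splits(dst_en)` before `upload_to_hub(dst_en, ...)` would surface malformed records without touching the Hub.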

In [10]:
upload_to_hub(dst_en, dataset_identifier='Jiahuan/dst_en')
upload_to_hub(dst_de, dataset_identifier='Jiahuan/dst_de')
upload_to_hub(dst_it, dataset_identifier='Jiahuan/dst_it')

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /Users/jiahuanpei/.cache/huggingface/token
Login successful


Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 3/3 [00:00<00:00, 84.77ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  2.95it/s]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 1/1 [00:00<00:00, 114.26ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  2.91it/s]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 2/2 [00:00<00:00, 132.51ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  3.42it/s]


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 2535
    })
    val: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 830
    })
    test: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 1646
    })
})
Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /Users/jiahuanpei/.cache/huggingface/token
Login successful


Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 3/3 [00:00<00:00, 121.68ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.57s/it]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 1/1 [00:00<00:00, 117.69ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.21s/it]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 2/2 [00:00<00:00, 111.65ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.42s/it]


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 2535
    })
    val: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 830
    })
    test: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 1646
    })
})
Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /Users/jiahuanpei/.cache/huggingface/token
Login successful


Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 3/3 [00:00<00:00, 122.78ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  2.20it/s]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 1/1 [00:00<00:00, 117.62ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  3.94it/s]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 2/2 [00:00<00:00, 153.17ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  2.49it/s]


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 2535
    })
    val: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 830
    })
    test: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 1646
    })
})


In [11]:
upload_to_hub(dst_mix_en_de_it, dataset_identifier='Jiahuan/dst_mix_en_de_it')

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /Users/jiahuanpei/.cache/huggingface/token
Login successful


Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 69/69 [00:00<00:00, 292.06ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.56s/it]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 23/23 [00:00<00:00, 250.63ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.74s/it]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 45/45 [00:00<00:00, 234.13ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.39s/it]
README.md: 100%|██████████| 554/554 [00:00<00:00, 46.4kB/s]


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 68445
    })
    val: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 22410
    })
    test: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 44442
    })
})


In [12]:
dst_all = {
    'train': dst_en_train_set + dst_de_train_set + dst_it_train_set, 
    'val': dst_en_val_set + dst_de_val_set + dst_it_val_set, 
    'test': dst_en_test_set + dst_de_test_set + dst_it_test_set
}
upload_to_hub(dst_all, dataset_identifier='Jiahuan/dst_all_en_de_it')

Token will not been saved to git credential helper. Pass `add_to_git_credential=True` if you want to set the git credential as well.
Token is valid (permission: write).
Your token has been saved to /Users/jiahuanpei/.cache/huggingface/token
Login successful


Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 8/8 [00:00<00:00, 167.49ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.10s/it]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 3/3 [00:00<00:00, 48.55ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:00<00:00,  1.42it/s]
Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]
Creating parquet from Arrow format: 100%|██████████| 5/5 [00:00<00:00, 227.98ba/s]
Uploading the dataset shards: 100%|██████████| 1/1 [00:01<00:00,  1.40s/it]
README.md: 100%|██████████| 546/546 [00:00<00:00, 37.7kB/s]


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 7605
    })
    val: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 2490
    })
    test: Dataset({
        features: ['instruction', 'input', 'output'],
        num_rows: 4938
    })
})
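
The row counts reported above are consistent with the way the datasets are assembled. Each language contributes the same number of rows per split (2535 / 830 / 1646), the combined set concatenates the three languages, and the mixed set repeats the data once per route variant (3 monolingual routes plus 6 mixed routes with 4 tag pairs each). A short arithmetic check, with hypothetical variable names:

```python
# Per-language row counts as printed for the single-language datasets.
per_language = {'train': 2535, 'val': 830, 'test': 1646}

N_LANGUAGES = 3
N_ROUTE_VARIANTS = 3 + 6 * 4  # 27 instruction variants in generate_split_data

expected_all = {split: n * N_LANGUAGES for split, n in per_language.items()}
expected_mix = {split: n * N_ROUTE_VARIANTS for split, n in per_language.items()}

print(expected_all)
print(expected_mix)
```

Both results match the `num_rows` values printed for `Jiahuan/dst_all_en_de_it` and `Jiahuan/dst_mix_en_de_it`.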


In [None]:
from transformers import LlamaForCausalLM, LlamaTokenizer

# tokenizer = LlamaTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
model = LlamaForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")