<div style="width: 30%; float: right; margin: 10px; margin-right: 5%;">
    <img src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d3/FHNW_Logo.svg/2560px-FHNW_Logo.svg.png" width="500" style="float: left; filter: invert(50%);"/>
</div>

# GPT-2 Few-Shot Learning

In this notebook we build a chatbot for Swiss real estate recommendations using few-shot learning. <br>
We use the LLM [GPT-2](https://huggingface.co/gpt2) from OpenAI.



---
Prepared by Si Ben Tran, Yannic Lais, Rami Tarabishi in the autumn semester (HS) 2023.<br>
Bachelor of Science FHNW in Data Science.

## Introduction

### General Approach

- Named entity recognition on the prompt
- The entities are extracted for the database query
- The prompt is sent to the GPT-2 model together with the training examples and the database query

In [2]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import pandas as pd
import re


In [3]:
# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

#Additional Info when using cuda
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')

Using device: cuda

NVIDIA GeForce RTX 3090
Memory Usage:
Allocated: 0.0 GB
Cached:    0.0 GB


## Real Estate Data

In [38]:
# read parquet file
df = pd.read_parquet('data/immo_data_202208.parquet')

In [39]:
# filter for important columns
df = df[['Municipality', 'detailed_description', 'price_cleaned', 'type', 'rooms', 'url']].copy()
df['price'] = df['price_cleaned'].astype(float)
df['rooms'] = pd.to_numeric(df['rooms'], errors='coerce')
df = df.drop(columns=['price_cleaned'])
# drop rows with missing values
df = df.dropna()

In [6]:
df.head()

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|
| 0 | Biberstein | DescriptionLuxuriöse Attika-Wohnung direkt an ... | penthouse | 5.0 | https://www.immoscout24.ch//en/d/penthouse-buy... | 1150000.0 |
| 1 | Biberstein | DescriptionStilvolle Liegenschaft an ruhiger L... | terrace-house | 5.0 | https://www.immoscout24.ch//en/d/terrace-house... | 1420000.0 |
| 3 | Biberstein | DescriptionDieses äusserst grosszügige Minergi... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1430000.0 |
| 4 | Küttigen | DescriptionAus ehemals zwei Wohnungen wurde ei... | flat | 5.0 | https://www.immoscout24.ch//en/d/flat-buy-romb... | 995000.0 |
| 5 | Erlinsbach (AG) | DescriptionDer Blick in die Weite vermittelt R... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 2160000.0 |


## Loading the GPT-2 Model

In [7]:
# GPT-2 ships with transformers, so trust_remote_code is not needed;
# move the model once to the device selected above
model = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype="auto").to(device)
tokenizer = AutoTokenizer.from_pretrained("gpt2")

#### Single Input

In [8]:
inputs = tokenizer("Is a penguin a bird or a mamal?", return_tensors="pt").to(model.device)

# Generate outputs and decode
outputs = model.generate(**inputs, max_length=40)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(text)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Is a penguin a bird or a mamal?

A penguin is a bird that lives in the water. It is a very small bird. It is a very small bird.


GPT-2 appears to keep generating until the token limit is reached, regardless of whether it has already given a complete and correct answer.
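
One way to cope with this run-on behaviour is to trim the decoded text after generation. The following is a minimal sketch, not part of the original notebook: `trim_generation` is a hypothetical helper that assumes the decoded text starts with the prompt, keeps only the first non-empty line of the continuation, and stops at the first repeated sentence.

```python
import re


def trim_generation(generated: str, prompt: str) -> str:
    """Keep the first answer line and cut it off at the first repeated sentence."""
    # Drop the echoed prompt if the decoded text starts with it (assumption).
    continuation = generated[len(prompt):] if generated.startswith(prompt) else generated
    # First non-empty line of what the model added.
    first_line = next((ln.strip() for ln in continuation.splitlines() if ln.strip()), "")
    # Split after sentence-ending punctuation and stop once a sentence repeats.
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", first_line):
        if sentence in kept:
            break
        kept.append(sentence)
    return " ".join(kept)


prompt = "Is a penguin a bird or a mamal?"
generated = (
    "Is a penguin a bird or a mamal?\n\n"
    "A penguin is a bird that lives in the water. "
    "It is a very small bird. It is a very small bird."
)
print(trim_generation(generated, prompt))
```

Limiting the output length at generation time is another option: `generate` accepts `max_new_tokens`, which bounds only the newly generated tokens rather than prompt plus continuation.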

## Process Prompt

In [9]:
def filter_dataframe(df, max_price = None, min_price = None, arround_price = None, location_keyword = None, property_type = None, rooms = None):

    # Known property types in lower case, plus the generic 'house'
    type_list = [x.lower() for x in df['type'].unique().tolist()]
    type_list.append('house')

    # Start from the full frame so that each filter narrows the previous result
    filtered_df = df

    # Apply filters
    if arround_price:
        # "around" means within +/- 10 % of the given price
        filtered_df = filtered_df[filtered_df['price'] <= arround_price * 1.1]
        filtered_df = filtered_df[filtered_df['price'] >= arround_price * 0.9]
    if max_price:
        filtered_df = filtered_df[filtered_df['price'] <= max_price]
    if min_price:
        filtered_df = filtered_df[filtered_df['price'] >= min_price]
    if location_keyword:
        filtered_df = filtered_df[filtered_df['Municipality'].str.contains(location_keyword, case=False, na=False)]
    # property_type may be None, so check it before calling .lower()
    if property_type and property_type.lower() in type_list:
        filtered_df = filtered_df[filtered_df['type'].str.contains(property_type, case=False, na=False)]
    if rooms:
        filtered_df = filtered_df[filtered_df['rooms'] == rooms]

    # Return 5 random samples
    if len(filtered_df) >= 5:
        return filtered_df.sample(n=5, random_state = 42)
    else:
        return filtered_df

In [10]:
test = filter_dataframe(df, max_price = 1200000, location_keyword = 'Basel', property_type = 'flat')

In [11]:
test.head()

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|
| 2742 | Basel | DescriptionObjektbeschrieb An idealer Lage im... | flat | 5.0 | https://www.immoscout24.ch//en/d/flat-buy-base... | 495000.0 |
| 2738 | Basel | DescriptionDiese Wohnung im 3. Obergeschoss bi... | flat | 5.0 | https://www.immoscout24.ch//en/d/flat-buy-base... | 650000.0 |
| 2671 | Basel | DescriptionWollen Sie sich endlich Ihren Traum... | flat | 5.0 | https://www.immoscout24.ch//en/d/flat-buy-base... | 1130000.0 |
| 2737 | Basel | DescriptionIm Breite-Quartier wird diese heime... | flat | 2.0 | https://www.immoscout24.ch//en/d/flat-buy-base... | 470000.0 |
| 2733 | Basel | DescriptionBaujahr: 1961Renovation: 2022Wohnfl... | flat | 5.0 | https://www.immoscout24.ch//en/d/flat-buy-base... | 695000.0 |


In [12]:
def parse_and_filter(input_str, df):
    # Extract the parameter string between the first pair of parentheses
    match = re.search(r'\((.*?)\)', input_str)
    if match is None:
        return "No specific answer found."
    params_str = match.group(1)

    # Initialize parameters with default values
    params = {
        'df': df,
        'max_price': None,
        'min_price': None,
        'arround_price': None,
        'location_keyword': None,
        'property_type': None,
        'rooms': None
    }

    # Split the parameter string and iterate over each parameter
    for param in params_str.split(','):
        # Skip fragments that are not of the form key = value
        if '=' not in param:
            continue
        key, value = param.split('=', 1)
        key = key.strip()
        value = value.strip()

        # Ignore keys that filter_dataframe does not accept
        if key not in params or key == 'df':
            continue
        # Convert numeric values; float is needed for room counts such as 3.5
        if key in ['max_price', 'min_price', 'arround_price', 'rooms']:
            try:
                value = float(value)
            except ValueError:
                continue
        # Update the parameters dictionary
        params[key] = value

    # Call the filter_dataframe function with unpacked arguments
    return filter_dataframe(**params)
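
To see what the parsing step pulls out of a model answer, independent of the DataFrame, the following sketch applies the same regular expression and split logic to one of the few-shot answers and to an answer without parentheses. `extract_params` is a hypothetical stand-alone helper introduced only for illustration; it returns the raw key/value strings and does no type conversion.

```python
import re


def extract_params(answer):
    """Return the raw key/value strings found inside the first '(...)' of an answer."""
    match = re.search(r"\((.*?)\)", answer)
    if match is None:
        return None  # the model did not produce a parameter list
    pairs = {}
    for part in match.group(1).split(","):
        key, sep, value = part.partition("=")
        if sep:  # ignore fragments without '='
            pairs[key.strip()] = value.strip()
    return pairs


answer = (
    "Modern apartments in Basel under 900'000 CHF are available: "
    "(max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)"
)
print(extract_params(answer))
print(extract_params("No specific answer found."))
```

Note that `rooms` arrives as the string `'3.5'`, so any numeric conversion downstream has to accept non-integer values.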

In [13]:
# Few-shot examples:
few_shot_examples = [
    {
        "Question": "I am looking for an flat in Zurich under 1'000'000 CHF.", 
        "Answer": "Here are some options for apartments in Zurich under 1'000'000 CHF: (max_price = 1000000, location_keyword = Zürich, property_type = flat)"
    },
    {
        "Question": "Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?",
        "Answer": "Yes, there are terraced houses in Bern in the CHF 500,000 to 700,000 range: (max_price = 700000, min_price = 500000, location_keyword = Bern, property_type = terraced_house)"
    },
    {
        "Question": "I need a detached house in Lucerne with a garden for around CHF 1,200,000.",
        "Answer": "In Lucerne you can find detached houses with a garden for around CHF 1,200,000: (location_keyword = Bern, property_type = detached-house, arround_price = 1200000)"
    },
    {
        "Question": "Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?",
        "Answer": "Modern apartments in Basel under 900'000 CHF are available: (max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)"
    },
    {
        "Question": "I am looking for a large house in Lausanne, at least 5 rooms, up to 1'500'000 CHF.",
        "Answer": "Large houses in Lausanne with at least 5 rooms up to 1'500'000 CHF can be found here: (max_price = 1500000, location_keyword = Lausanne, property_type = house)"
    }
]

In [22]:
def get_model_response(query, model, tokenizer, few_shots = None, pre_query = None):
    # Build the prompt: few-shot examples or a pre-query come before the actual question

    if few_shots:
        prompt_text = "\n".join([f"Question: {ex['Question']}\nAnswer: {ex['Answer']}" for ex in few_shots])
        prompt_text += f"\nQuestion: {query}"
        prompt_text += "\nAnswer:"
    elif pre_query:
        prompt_text = pre_query
        prompt_text += f"\nQuestion: {query}"
        prompt_text += "\nAnswer:"
    else:
        prompt_text = f"Question: {query}\nAnswer:"

    print(prompt_text)

    # Encode and send to model
    inputs = tokenizer(prompt_text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_length=1024, num_return_sequences=1)

    # Decode the output
    full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Extracting the answer corresponding to the specific query
    response_parts = full_response.split("Answer:")
    for i, part in enumerate(response_parts[:-1]):
        if f"Question: {query}" in part:
            return response_parts[i + 1].split("\n")[0].strip()

    return "No specific answer found."
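
The answer-extraction step at the end of this function can be tried without a model. The sketch below isolates that logic in a hypothetical helper, `extract_answer`, and runs it on two hand-written strings (these are illustrative inputs, not real GPT-2 output): one where the model continues with an invented extra question, and one where the answer starts on a new line.

```python
def extract_answer(full_response: str, query: str) -> str:
    """Return the first line after the 'Answer:' that follows the given query."""
    parts = full_response.split("Answer:")
    for i, part in enumerate(parts[:-1]):
        if f"Question: {query}" in part:
            return parts[i + 1].split("\n")[0].strip()
    return "No specific answer found."


# Hand-written response: the text after the matching answer is ignored.
full_response = (
    "Question: I am looking for a flat in Zurich.\n"
    "Answer: (location_keyword = Zürich, property_type = flat)\n"
    "Question: Give me houses in Aarau.\n"
    "Answer: (location_keyword = Aarau, property_type = house)\n"
    "Question: Something the model made up.\n"
    "Answer: (location_keyword = Bern)"
)
print(extract_answer(full_response, "Give me houses in Aarau."))

# If the answer starts on a new line, only the empty first line is returned.
print(repr(extract_answer("Question: Q\nAnswer:\nYes, I do.", "Q")))
```

The second call shows a limitation of taking only the first line: an answer that begins after a line break is lost.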

#### With the same few-shot prompt as for Phi-2:

In [24]:
# Example usage
query = "Give me houses in aarau which cost less than 4'000'000 CHF."
response = get_model_response(query, model=model, tokenizer=tokenizer, few_shots=few_shot_examples)
print(response)

filtered_df = parse_and_filter(response, df)

filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: Here are some options for apartments in Zurich under 1'000'000 CHF: (max_price = 1000000, location_keyword = Zürich, property_type = flat)
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: Yes, there are terraced houses in Bern in the CHF 500,000 to 700,000 range: (max_price = 700000, min_price = 500000, location_keyword = Bern, property_type = terraced_house)
Question: I need a detached house in Lucerne with a garden for around CHF 1,200,000.
Answer: In Lucerne you can find detached houses with a garden for around CHF 1,200,000: (location_keyword = Bern, property_type = detached-house, arround_price = 1200000)
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: Modern apartments in Basel under 900'000 CHF are available: (max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)
Question: I am looking 

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|


GPT-2 returns an answer that is not well-formed: it has the right structure but the wrong properties (specifically the price, probably because of the `'` separators in the price).

#### With the same few-shot prompt as for Phi-2, using a different price format:

In [25]:
# Example usage
query = "Give me houses in aarau which cost less than 4000000 CHF."
response = get_model_response(query, model=model, tokenizer=tokenizer, few_shots=few_shot_examples)
print(response)

filtered_df = parse_and_filter(response, df)

filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: Here are some options for apartments in Zurich under 1'000'000 CHF: (max_price = 1000000, location_keyword = Zürich, property_type = flat)
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: Yes, there are terraced houses in Bern in the CHF 500,000 to 700,000 range: (max_price = 700000, min_price = 500000, location_keyword = Bern, property_type = terraced_house)
Question: I need a detached house in Lucerne with a garden for around CHF 1,200,000.
Answer: In Lucerne you can find detached houses with a garden for around CHF 1,200,000: (location_keyword = Bern, property_type = detached-house, arround_price = 1200000)
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: Modern apartments in Basel under 900'000 CHF are available: (max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)
Question: I am looking 

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|
| 13676 | Le Mont-sur-Lausanne | Description\n"Les Villas de Bellevue - Villa A... | Bifamiliar house | 6.5rm | https://www.homegate.ch/buy/3002039073 | 1850000.0 |
| 13815 | Belmont-sur-Lausanne | Description\n"Maison individuelle à rénover"\n... | Single house | 4.5rm | https://www.homegate.ch/buy/3001735442 | 1195000.0 |
| 10216 | Lausanne | DescriptionEN EXCLUSIVITE !A deux pas du Parc ... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1790000.0 |
| 13442 | Lausanne | Description\n"RARE A LA VENTE ! MAISON MITOYEN... | Bifamiliar house | 5rm | https://www.homegate.ch/buy/3001001892 | 2640000.0 |
| 10230 | Lausanne | DescriptionIdéalement située proche de la lign... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1950000.0 |


With the `'` separators removed from the price, GPT-2 handles the price better and returns the correct answer.
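
Since the price format in the query matters, the separators could be stripped automatically before the query is sent to the model. The following is a minimal sketch under that assumption; `normalize_prices` is a hypothetical helper, not part of the notebook, and it only treats `'` and `,` as thousands separators when they sit between a digit and a group of exactly three digits.

```python
import re


def normalize_prices(text: str) -> str:
    """Remove ' and , thousands separators inside numbers, e.g. 4'000'000 -> 4000000."""
    # A separator is removed only if a digit precedes it and exactly three digits follow.
    return re.sub(r"(?<=\d)[',](?=\d{3}(?!\d))", "", text)


query = "Give me houses in aarau which cost less than 4'000'000 CHF."
print(normalize_prices(query))
print(normalize_prices("Terraced houses in the CHF 500,000 to 700,000 range with 3.5 rooms"))
```

Decimal values such as `3.5` are left untouched because the pattern does not match a period.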

#### Without few-shot examples, but with an explanation of how the request is structured:

In [18]:
# Example usage
query = "Give me houses in aarau which cost less than 4000000 CHF."
pre_query = "You will be given a question about a real estate inquiry, in which there are values which you need to extract and answer back in the following format: <Answer> (max_price = <max_price>, min_price = <min_price>, around_price = <around_price>, location_keyword = <location_keyword>, property_type = <property_type>, rooms = <rooms>)."
response = get_model_response(query, model=model, tokenizer=tokenizer, pre_query=pre_query)
print(response)
filtered_df = parse_and_filter(response, df)

filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


You will be given a question about a real estate inquiry, in which there are values which you need to extract and answer back in the following format: <Answer> (max_price = <max_price>, min_price = <min_price>, around_price = <around_price>, location_keyword = <location_keyword>, property_type = <property_type>, rooms = <rooms>).
Question: Give me houses in aarau which cost less than 4000000 CHF.
Answer:
Give me houses in aarau which cost less than 4000000 CHF.


'No specific answer found.'

The model simply does not understand the task in the request: it returns an answer that either looks like the prompt but lacks the right format, or merely repeats the format that was asked for.

#### Few-shot with a German prompt:

In [26]:
# Example usage
query_german = "Zeige mir ein paar Wohnungen in Basel, die weniger als 1'200'000 CHF kosten."
response = get_model_response(query_german, model=model, tokenizer=tokenizer, few_shots=few_shot_examples)
print(response)

filtered_df = parse_and_filter(response, df)

filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: Here are some options for apartments in Zurich under 1'000'000 CHF: (max_price = 1000000, location_keyword = Zürich, property_type = flat)
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: Yes, there are terraced houses in Bern in the CHF 500,000 to 700,000 range: (max_price = 700000, min_price = 500000, location_keyword = Bern, property_type = terraced_house)
Question: I need a detached house in Lucerne with a garden for around CHF 1,200,000.
Answer: In Lucerne you can find detached houses with a garden for around CHF 1,200,000: (location_keyword = Bern, property_type = detached-house, arround_price = 1200000)
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: Modern apartments in Basel under 900'000 CHF are available: (max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)
Question: I am looking 

'No specific answer found.'

Since GPT-2 is not really a multilingual model, it does not understand the query and simply returns the prompt as its answer.

## Further Queries:

The following queries mirror the ones run with Phi-2.

### Random query without few-shot examples:

In [27]:
query_random = "Hello do you have bread?"

response = get_model_response(query_random, model=model, tokenizer=tokenizer)
filtered_df = parse_and_filter(response, df)

print(response.split(':')[0])
filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: Hello do you have bread?
Answer:
Yes, I do.


'No specific answer found.'

Without few-shot examples, GPT-2 understands the query and answers it correctly.

In [28]:
# With few-shot examples
query_random = "Hello do you have bread?"

response = get_model_response(query_random, model=model, tokenizer=tokenizer, few_shots=few_shot_examples)
filtered_df = parse_and_filter(response, df)

print(response.split(':')[0])
filtered_df

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: Here are some options for apartments in Zurich under 1'000'000 CHF: (max_price = 1000000, location_keyword = Zürich, property_type = flat)
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: Yes, there are terraced houses in Bern in the CHF 500,000 to 700,000 range: (max_price = 700000, min_price = 500000, location_keyword = Bern, property_type = terraced_house)
Question: I need a detached house in Lucerne with a garden for around CHF 1,200,000.
Answer: In Lucerne you can find detached houses with a garden for around CHF 1,200,000: (location_keyword = Bern, property_type = detached-house, arround_price = 1200000)
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: Modern apartments in Basel under 900'000 CHF are available: (max_price = 900000, location_keyword = Basel, property_type = flat, rooms = 3.5)
Question: I am looking 

'No specific answer found.'

GPT-2 understands the query and answers "correctly"; it even tries to adapt the query to the examples by adding a city.

### Extracting the Complete DataFrame Query Code

In [21]:
few_shot_examples_code = [
    {
        "Question": "I am looking for an flat in Zurich under 1'000'000 CHF.",
        "Answer": "df_test = df[(df['Municipality'].str.contains('Zürich')) & (df['price'] < 1000000) & (df['type'].str.contains('flat'))]"
    },
    {
        "Question": "Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?",
        "Answer": "df_test = df[(df['Municipality'].str.contains('Bern')) & (df['price'] >= 500000) & (df['price'] <= 700000) & (df['type'].str.contains('terraced_house'))]"
    },
    {
        "Question": "I need a house in Lucerne with a garden for around CHF 1,200,000.",
        "Answer": "df_test = df[(df['Municipality'].str.contains('Luzern')) & (df['price'] <= (1200000*1.1)) & (df['price'] >= (1200000*0.9)) & (df['type'].str.contains('house'))]"
    },
    {
        "Question": "Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?",
        "Answer": "df_test = df[(df['Municipality'].str.contains('Basel')) & (df['rooms'] == 3.5) & (df['price'] < 900000) & (df['type'].str.contains('apartment'))]"
    },
    {
        "Question": "I am looking for a large house in Lausanne, at least 5 rooms, up to 1'500'000 CHF.",
        "Answer": "df_test = df[(df['Municipality'].str.contains('Lausanne')) & (df['rooms'] >= 5) & (df['price'] <= 1500000) & (df['type'].str.contains('house'))]"
    }
]

In [41]:
query = "Show me flats in Basel which cost less than 2000000 CHF with 3 or more rooms."

response = get_model_response(query, model=model, tokenizer=tokenizer, few_shots=few_shot_examples_code)

print(response)

exec(response)
df_test.head()

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: df_test = df[(df['Municipality'].str.contains('Zürich')) & (df['price'] < 1000000) & (df['type'].str.contains('flat'))]
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: df_test = df[(df['Municipality'].str.contains('Bern')) & (df['price'] >= 500000) & (df['price'] <= 700000) & (df['type'].str.contains('terraced_house'))]
Question: I need a house in Lucerne with a garden for around CHF 1,200,000.
Answer: df_test = df[(df['Municipality'].str.contains('Luzern')) & (df['price'] <= (1200000*1.1)) & (df['price'] >= (1200000*0.9)) & (df['type'].str.contains('house'))]
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: df_test = df[(df['Municipality'].str.contains('Basel')) & (df['rooms'] == 3.5) & (df['price'] < 900000) & (df['type'].str.contains('apartment'))]
Question: I am looking for a large house in Lausanne, at least 5 room

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|
| 10216 | Lausanne | DescriptionEN EXCLUSIVITE !A deux pas du Parc ... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1790000.0 |
| 10218 | Lausanne | DescriptionSur les hauts de Lausanne, cette vi... | terrace-house | 5.0 | https://www.immoscout24.ch//en/d/terrace-house... | 1590000.0 |
| 10222 | Lausanne | DescriptionSitué dans un quartier calme et rés... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1790000.0 |
| 10226 | Lausanne | DescriptionIMMO 4G vous présente en exclusivit... | detached-house | 7.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1950000.0 |
| 10230 | Lausanne | DescriptionIdéalement située proche de la lign... | detached-house | 5.0 | https://www.immoscout24.ch//en/d/detached-hous... | 1950000.0 |


In some respects GPT-2 got it right, but it still struggles to generate the complete code correctly. It got the number of rooms and the 'less than' on the price right, but then it simply took the parameters from the last example and built the query from those.

In [42]:
query = "I need a house in Basel with a garden for around CHF 1200000."

response = get_model_response(query, model=model, tokenizer=tokenizer, few_shots=few_shot_examples_code)

print(response)

exec(response)
df_test.head()

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: df_test = df[(df['Municipality'].str.contains('Zürich')) & (df['price'] < 1000000) & (df['type'].str.contains('flat'))]
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: df_test = df[(df['Municipality'].str.contains('Bern')) & (df['price'] >= 500000) & (df['price'] <= 700000) & (df['type'].str.contains('terraced_house'))]
Question: I need a house in Lucerne with a garden for around CHF 1,200,000.
Answer: df_test = df[(df['Municipality'].str.contains('Luzern')) & (df['price'] <= (1200000*1.1)) & (df['price'] >= (1200000*0.9)) & (df['type'].str.contains('house'))]
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: df_test = df[(df['Municipality'].str.contains('Basel')) & (df['rooms'] == 3.5) & (df['price'] < 900000) & (df['type'].str.contains('apartment'))]
Question: I am looking for a large house in Lausanne, at least 5 room

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|


In [43]:
query = "I am looking for a house in Biberstein which costs about 1000000."

response = get_model_response(query, model=model, tokenizer=tokenizer, few_shots=few_shot_examples_code)

print(response)

exec(response)
df_test.head()

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Question: I am looking for an flat in Zurich under 1'000'000 CHF.
Answer: df_test = df[(df['Municipality'].str.contains('Zürich')) & (df['price'] < 1000000) & (df['type'].str.contains('flat'))]
Question: Are there terraced houses in Bern in the CHF 500,000 to 700,000 range?
Answer: df_test = df[(df['Municipality'].str.contains('Bern')) & (df['price'] >= 500000) & (df['price'] <= 700000) & (df['type'].str.contains('terraced_house'))]
Question: I need a house in Lucerne with a garden for around CHF 1,200,000.
Answer: df_test = df[(df['Municipality'].str.contains('Luzern')) & (df['price'] <= (1200000*1.1)) & (df['price'] >= (1200000*0.9)) & (df['type'].str.contains('house'))]
Question: Are there modern apartments with 3.5 rooms available in Basel for under CHF 900,000?
Answer: df_test = df[(df['Municipality'].str.contains('Basel')) & (df['rooms'] == 3.5) & (df['price'] < 900000) & (df['type'].str.contains('apartment'))]
Question: I am looking for a large house in Lausanne, at least 5 room

| | Municipality | detailed_description | type | rooms | url | price |
|---|---|---|---|---|---|---|


The other queries behave the same way. For the first, it found the correct "around" price format in the examples this time, but again took the parameters from that exact example instead of adapting them to the query. For the second, it found the correct municipality but used the price as the number of rooms.
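
Because the generated string is passed straight to `exec`, a malformed or unexpected response can raise an error or run arbitrary code. The sketch below shows one possible structural check before execution. It assumes that a valid response always has the form `df_test = df[...]`, as in the few-shot examples; `looks_like_filter_statement` is a hypothetical helper, and the check is not a sandbox.

```python
import ast


def looks_like_filter_statement(code: str) -> bool:
    """Check that `code` is exactly one statement of the form `df_test = df[...]`."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    # Exactly one statement, and it must be a plain assignment.
    if len(tree.body) != 1 or not isinstance(tree.body[0], ast.Assign):
        return False
    assign = tree.body[0]
    # Single target, and it must be the name df_test.
    if len(assign.targets) != 1:
        return False
    target = assign.targets[0]
    if not (isinstance(target, ast.Name) and target.id == "df_test"):
        return False
    # The right-hand side must index into the name df.
    value = assign.value
    if not (isinstance(value, ast.Subscript)
            and isinstance(value.value, ast.Name)
            and value.value.id == "df"):
        return False
    # No other variable names may be referenced inside the expression.
    names = {node.id for node in ast.walk(value) if isinstance(node, ast.Name)}
    return names == {"df"}


good = "df_test = df[(df['Municipality'].str.contains('Basel')) & (df['price'] < 2000000)]"
print(looks_like_filter_statement(good))
print(looks_like_filter_statement("No specific answer found."))
print(looks_like_filter_statement("df_test = __import__('os').system('ls')"))
```

A response would then only be executed if the check passes. Method calls on `df` are still allowed, so this narrows the shape of the statement rather than making `exec` safe.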

## Conclusion GPT-2

The experiments show that GPT-2 can handle strictly formatted prompts, but it still does not really return a conversational answer. <br>
Instead, the answer takes the form "`<prompt>` can be found here: (answer)", if it forms a proper answer at all. <br>

Even with an explanation of the request, GPT-2 does not understand the task at all and simply returns the prompt as its answer.