## GPT-2

In [26]:
import pandas as pd
from transformers import pipeline, set_seed

# Set the seed so that text generation is reproducible
set_seed(123)

data = {
    'business_id': ['123', '123', '456', '456', '456'],
    'text': [
        "Great service but the food was bland.",
        "Loved the ambiance and the dessert, but the main course was too salty.",
        "The waiter was rude.",
        "The steak was fantastic, but the room was too noisy.",
        "Amazing cocktails and friendly staff, but the price is too high."
    ]
}

df = pd.DataFrame(data)


In [39]:
# Initialize the GPT-2 model
generator = pipeline('text-generation', model='gpt2')

def generate_insights(reviews):
    combined_reviews = " ".join(reviews)
    prompt = (f"Here are some customer reviews on a restaurant deduct the restaurant's asset's.\n\n{combined_reviews}")
    
    summary = generator(prompt, max_length=250, num_return_sequences=1)
    return summary[0]['generated_text']

# Group the reviews by 'business_id'
grouped_reviews = df.groupby('business_id')['text'].apply(list)

# Generate insights for each group of reviews
insights_by_business = grouped_reviews.apply(generate_insights)

# Print the insights
for business_id, insights in insights_by_business.items():
    print(f"Business ID: {business_id}\nInsights:\n{insights}\n")


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Business ID: 123
Insights:
Here are some customer reviews on a restaurant deduct the restaurant's asset's.

Great service but the food was bland. Loved the ambiance and the dessert, but the main course was too salty. The only thing that really helped the mood was the customer service and they would also take your order in person.

I was trying the pizza for my birthday in the morning. My daughter couldn't tell me in any way that a 5 year old could make one that has no idea what it is. It's like when you put on your jacket and see what it is. A 9 year old kid didn't even know there was pizza in his hand at lunch. The waitress came back and said there was a 4 year old who was not there. It was one of the most frustrating experiences of my life.

The food was also terrible, most dishes included. I've been here for about ten years, but I had been here for six years and I wouldn't say what I enjoyed better than pizza. It was just flat out awful and the food was not good. It was very bad tas
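
The output above repeats the prompt before the continuation and stops mid-word, most likely because `max_length` in the `transformers` text-generation pipeline counts the prompt tokens together with the generated ones. The pipeline arguments `max_new_tokens` and `return_full_text=False` are the usual way to control this. The sketch below shows a model-free alternative, assuming a hypothetical helper named `strip_prompt` (not part of the original notebook) that drops the echoed prompt after generation. The strings are shortened stand-ins, not real model output.

```python
def strip_prompt(generated_text, prompt):
    """Return only the continuation when the output begins with the prompt."""
    if generated_text.startswith(prompt):
        # Drop the echoed prompt and any whitespace that follows it
        return generated_text[len(prompt):].lstrip()
    # Fall back to the full text if the prompt is not echoed verbatim
    return generated_text.strip()


# Shortened stand-ins for a prompt and the pipeline's 'generated_text' field
prompt = "Here are some customer reviews.\n\nGreat service but the food was bland."
generated = prompt + " The only thing that really helped the mood was the customer service."

continuation = strip_prompt(generated, prompt)
print(continuation)
```

Inside `generate_insights`, the same helper could be applied to `summary[0]['generated_text']` before returning it.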

In [40]:
def generate_insights(reviews):
    combined_reviews = " ".join(reviews[:5])  # Limit the number of reviews
    prompt = (f"Based on the following customer reviews, identify the key strengths and weaknesses of the restaurant:\n\n{combined_reviews}\n\nStrengths:\n- Weaknesses:")
    
    summary = generator(prompt, max_length=250, num_return_sequences=1)
    return summary[0]['generated_text']

# Group the reviews by 'business_id'
grouped_reviews = df.groupby('business_id')['text'].apply(list)

# Generate insights for each group of reviews
insights_by_business = grouped_reviews.apply(generate_insights)

# Print the insights
for business_id, insights in insights_by_business.items():
    print(f"Business ID: {business_id}\nInsights:\n{insights}\n")


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Business ID: 123
Insights:
Based on the following customer reviews, identify the key strengths and weaknesses of the restaurant:

Great service but the food was bland. Loved the ambiance and the dessert, but the main course was too salty.

Strengths:
- Weaknesses:

- We don't like to go out to our own place and then get our hands dirty on our food until we can't enjoy ourselves again.

- No side-tasting

- Slow serving method

Frequent customer feedback and suggestions to improve the menu for all of the customers.

All reviews are written in Korean language and are considered to be representative of the Korean-language company's work.

Read More

Business ID: 456
Insights:
Based on the following customer reviews, identify the key strengths and weaknesses of the restaurant:

The waiter was rude. The steak was fantastic, but the room was too noisy. Amazing cocktails and friendly staff, but the price is too high.

Strengths:
- Weaknesses:

- The staff of C-I-N-O are really nice - a little
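
The base `gpt2` checkpoint is not instruction-tuned: it continues the text it is given rather than following a request. In the prompt above, `Strengths:\n- Weaknesses:` also makes "Weaknesses:" read as the first bullet under "Strengths", which may explain the muddled lists. The sketch below is one possible way to lay out the prompt so that it ends on a single open cue. The helper name `build_prompt`, its wording, and the `max_reviews` parameter are assumptions for illustration, and the output quality with GPT-2 has not been checked.

```python
def build_prompt(reviews, max_reviews=5):
    """Format reviews as a bulleted list and end on an open 'Strengths:' cue."""
    # One bullet per review, capped at max_reviews to keep the prompt short
    bullet_lines = "\n".join(f"- {review.strip()}" for review in reviews[:max_reviews])
    return (
        "Customer reviews of a restaurant:\n"
        f"{bullet_lines}\n\n"
        "Strengths:\n-"
    )


reviews = [
    "The waiter was rude.",
    "The steak was fantastic, but the room was too noisy.",
]
prompt = build_prompt(reviews)
print(prompt)
```

A second call with a prompt ending in `Weaknesses:\n-` would ask for the other list separately instead of mixing both headings in one cue.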

In [37]:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-large")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-large")

# Initialize the text-generation pipeline
dialogue_generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

def generate_dialogue_insights(reviews):
    combined_reviews = " ".join(reviews[:5])  # Limit to 5 reviews for a concise analysis
    prompt = (
        f"Customer Reviews: {combined_reviews}\n"
        f"AI: Based on these reviews, the key weaknesses of the restaurant are :"
    )

    # Generate a response based on the reviews
    generated_response = dialogue_generator(prompt, max_length=200, num_return_sequences=1)
    return generated_response[0]['generated_text']




In [38]:

# Group the reviews by 'business_id' and generate insights
grouped_reviews = df.groupby('business_id')['text'].apply(list)
insights_by_business = grouped_reviews.apply(generate_dialogue_insights)

# Print the insights
for business_id, insights in insights_by_business.items():
    print(f"Business ID: {business_id}\nInsights:\n{insights}\n")


Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Business ID: 123
Insights:
Customer Reviews: Great service but the food was bland. Loved the ambiance and the dessert, but the main course was too salty.
AI: Based on these reviews, the key weaknesses of the restaurant are : 1. Saltiness 2. Saltiness 3. Saltiness 4. Saltiness 5. Saltiness 6. Saltiness 7. Saltiness 8. Saltiness 9. Saltiness

Business ID: 456
Insights:
Customer Reviews: The waiter was rude. The steak was fantastic, but the room was too noisy. Amazing cocktails and friendly staff, but the price is too high.
AI: Based on these reviews, the key weaknesses of the restaurant are : 1. The food is not great 2. The service is not great
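
The DialoGPT answer for business 123 repeats the same item nine times. Generation arguments such as `repetition_penalty` or `no_repeat_ngram_size` are commonly used to discourage this kind of loop, though their effect here has not been tested. As a model-free complement, the sketch below post-processes the text, assuming a hypothetical helper `unique_numbered_items` (not in the original notebook) that extracts `1. ... 2. ...` items and drops repeats.

```python
import re


def unique_numbered_items(text):
    """Extract '1. foo 2. bar' style items and drop case-insensitive repeats."""
    # Split on item markers such as "1. " or "12. ", keeping the text between them
    parts = re.split(r"\s*\b\d+\.\s+", text)
    seen = set()
    items = []
    for part in parts[1:]:  # parts[0] is whatever precedes the first marker
        item = part.strip()
        key = item.lower()
        if item and key not in seen:
            seen.add(key)
            items.append(item)
    return items


answer = (
    "AI: Based on these reviews, the key weaknesses of the restaurant are : "
    "1. Saltiness 2. Saltiness 3. Saltiness 4. Saltiness"
)
print(unique_numbered_items(answer))
```

Deduplication only tidies the list. It does not fix items that are wrong, such as "The food is not great" for business 456, whose reviews praise the steak and cocktails.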

