# abc Trip Planner (MVP)

Members
* Daniel Jiang (1009509)
* John Mo (1009513)
* Timothy Zheng (1009502)

## Notes

To run this program, you will need to create and populate a `.env` file with your OpenAI API key

`.env`

```bash
OPENAI_API_KEY = "your_key_here"
```

### TODO
* Set up OpenAI key
* Implement LLM
* RAG system
* (eval) compare to "ground truth"

### Innovative "ideas"
* Take note of the places LLM suggests, then make LLM system search online for the prices of places. Then output estimated cost of trip.
* Approach the topic as a "narrative journey": Add the history of places, culture, etc and make attractions "references" to the narrative, instead of the primary focus. Make the narrative the primary focus.
* Attach a packing list
* Implement weather forecast into trip plan

In [1]:
%pip uninstall pandas -y
%pip uninstall numpy -y
%pip uninstall pytz -y
%pip uninstall python-dateutil -y

Found existing installation: pandas 2.2.3
Uninstalling pandas-2.2.3:
  Successfully uninstalled pandas-2.2.3
Note: you may need to restart the kernel to use updated packages.
Found existing installation: numpy 1.26.4
Uninstalling numpy-1.26.4:
  Successfully uninstalled numpy-1.26.4
Note: you may need to restart the kernel to use updated packages.
Found existing installation: pytz 2024.2
Uninstalling pytz-2024.2:
  Successfully uninstalled pytz-2024.2
Note: you may need to restart the kernel to use updated packages.
Found existing installation: python-dateutil 2.9.0.post0
Uninstalling python-dateutil-2.9.0.post0:
  Successfully uninstalled python-dateutil-2.9.0.post0
Note: you may need to restart the kernel to use updated packages.


In [2]:
%pip install --upgrade pip --quiet
%pip install openai --quiet
%pip install python-dotenv --quiet
%pip install praw --quiet
%pip install pydantic --quiet
%pip install langchain-community --quiet
%pip install bs4 --quiet
%pip install pandas --quiet

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
bert-score 0.3.13 requires pandas>=1.0.1, which is not installed.
matplotlib 3.9.3 requires python-dateutil>=2.7, which is not installed.[0m[31m
[0mNote: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [3]:
from openai import OpenAI
from dotenv import load_dotenv
from IPython.display import display, Markdown, clear_output
import time

load_dotenv() # Requires .env file

True

In [4]:
client = OpenAI()

system_prompt = """You are a knowledgeable travel assistant. Provide a comprehensive and well-structured guide for the specified destination. Your response should be detailed, accurate, and formatted using markdown. Follow these guidelines:

1. Begin with a brief introduction about the destination.
2. Use level 2 headers (##) to separate main sections.
3. Use bolding (**) for subsections or important points within sections.
4. Include exclusively the following sections in exactly this order.
   - Best Time to Visit
   - Top Attractions
   - Local Culture and Etiquette
   - Getting Around
   - Food and Dining
   - Accommodation
   - Safety and Health
   - Budget Tips
   - Unique Experiences
   - Essential Phrases (if applicable)

5. Provide practical, up-to-date information and insider tips, highlighting any date-specific events if possible.
6. Use bullet points sparingly, preferring well-structured paragraphs.
7. Include any recent developments or changes affecting travel to the destination.
8. Conclude with a brief summary or final travel tip.

Aim for a comprehensive guide that's easy to read and informative for travelers."""

In [5]:
destination = "Chiang Mai"
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Provide key travel tips for {destination}"}
    ],
    stream=True
)

full_response = ""
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        content = chunk.choices[0].delta.content
        full_response += content
        
        # Clear the current output and display the updated markdown
        clear_output(wait=True)
        display(Markdown(full_response))
        
        # Add a small delay to make the streaming visible
        time.sleep(0.01)

# Travel Guide to Chiang Mai

Chiang Mai, located in the mountainous region of northern Thailand, is a city rich in history and culture. Known for its well-preserved Old City, stunning temples, and lush landscapes, it offers a more laid-back and traditional experience compared to the bustling capital of Bangkok. Whether you are exploring its ancient roots or partaking in modern adventures, Chiang Mai is a destination that gracefully blends the past with the present.

## Best Time to Visit

Chiang Mai experiences a tropical climate with three distinct seasons: the cool, the hot, and the rainy season. 

- **Cool Season (November to February):** This is the most popular time to visit Chiang Mai, with pleasantly cool temperatures, clear skies, and a range of festivals, including the famous Loy Krathong and Yi Peng Lantern Festivals usually in November.

- **Hot Season (March to May):** Expect temperature spikes during this period, which can be quite oppressive. However, it coincides with the Songkran Festival (Thai New Year) in mid-April, a vibrant and energetic event complete with city-wide water fights.

- **Rainy Season (June to October):** While the rain can be frequent, it transforms the region into a lush, green paradise. Travelers can take advantage of fewer crowds and lower prices, but should be prepared for occasional downpours.

## Top Attractions

Chiang Mai is a treasure trove of cultural and natural attractions.

- **Wat Phra That Doi Suthep:** This iconic temple, perched on a mountain overlooking the city, is a must-visit for its stunning architecture and panoramic views. Climbing the 309 steps to the temple is a revered pilgrimage for many.

- **Old City Temples:** Within the ancient moat and walls are quintessential temples such as Wat Chedi Luang and Wat Phra Singh, each offering unique histories and architectural splendor.

- **Chiang Mai Night Bazaar:** Perfect for shopping enthusiasts, this bustling market offers everything from traditional crafts to street food.

- **Elephant Nature Park:** A sanctuary and rescue center that focuses on the rehabilitation of elephants. It's a perfect spot for animal lovers wanting ethical interactions with these majestic creatures.

## Local Culture and Etiquette

Chiang Mai's culture is deeply rooted in Buddhism and local traditions.

- **Respect for Buddhism:** Always dress appropriately when visiting temples (shoulders and knees covered), and avoid touching statues or point at images of the Buddha with your feet.
  
- **Greeting Etiquette:** The traditional Thai greeting is a wai, where you place your palms together in a prayer-like position and bow slightly. 

- **Cultural Events:** Participate in or respectfully observe local festivals, offering a glimpse into the vibrant traditions of Chiang Mai, such as the colorful flower festival held in February.

## Getting Around

Chiang Mai is relatively easy to navigate, with several options for getting around:

- **Songthaews (Red Trucks):** An affordable and popular mode conveying shared rides that function like a bus/taxi hybrid.

- **Tuk-tuks:** A quick way around town, albeit pricier than songthaews. Negotiating fares upfront is advisable.

- **Bicycles and Motorbikes:** Renting bikes or scooters provides flexibility and is an ideal way to explore the city and its surroundings.

- **Public Transportation:** The city lacks an extensive public transport system, so visitors rely on the options above.

## Food and Dining

Chiang Mai is famous for its Northern Thai cuisine, which is distinct from the rest of Thailand.

- **Must-Try Dishes:** Enjoy Khao Soi, a spicy coconut curry soup, Sai Oua (Northern Thai sausage), and the ubiquitous Green Curry.

- **Dining Spots:** Visit Riverside on the Ping River for a quintessential dining experience or venture into local markets such as the Sunday Walking Street to taste street food delights.

## Accommodation

Chiang Mai offers a variety of accommodation options ranging from budget hostels to luxury hotels.

- **Old City:** Good for proximity to main attractions and a bustling atmosphere.

- **Nimmanhaemin Road:** Ideal for a more modern experience, with chic cafes, boutiques, and vibrant nightlife.

- **Outskirts of the City:** Consider staying at a homestay or resort for a more tranquil experience surrounded by nature.

## Safety and Health

Chiang Mai is generally a safe city, but precautions are always wise.

- **Traffic Caution:** Be vigilant while crossing streets and cautious of the chaotic nature of local driving, especially when renting motorbikes.

- **Health Tips:** Stay hydrated, use insect repellent, and consider vaccinations recommended by your health provider for Southeast Asia.

## Budget Tips

- **Affordable Travel:** Use songthaews for economical travel, and eat at local markets to enjoy authentic food at lower prices.

- **Bargain Shopping:** Negotiating prices, especially in markets, is commonplace and can help stretch your budget further.

## Unique Experiences

Chiang Mai offers a wealth of unique experiences beyond the usual sightseeing.

- **Cooking Classes:** Participating in a traditional Thai cooking class provides both a souvenir and a new skill to take home.

- **Trekking and Hill Tribe Visits:** Organized tours offer insights into the life of the region's ethnic minority groups and stunning natural scenery.

## Essential Phrases

While many locals speak English, knowing a few Thai phrases can enhance your experience.

- **Sawasdee (Khrap/Kha):** Hello (for men/women).
- **Khop Khun (Khrap/Kha):** Thank you (for men/women).
- **Mai Pen Rai:** No worries/It’s okay.
- **Tao Rai?:** How much?

In summary, Chiang Mai is a captivating blend of cultural depth and natural beauty. Planning your visit around the cooler months, immersing yourself in local traditions, and being open to new experiences can make your trip truly unforgettable. Don’t miss out on enjoying Chiang Mai’s rich culinary offerings and exploring its picturesque landscapes by foot or bike for an authentic view of the city.

***

## RAG Implementation

## TODO: Need to create system that dynamically finds reddit threads (maybe tripadvisor if time permits) given a location (refer to the diagram in slides to implement)
* Searches google API for places based on "things to do in <location> site:reddit.com" etc.

In [6]:
# code here
%pip install googlesearch-python

Defaulting to user installation because normal site-packages is not writeable
Note: you may need to restart the kernel to use updated packages.


In [7]:
from googlesearch import search

def fetchRedditLinks(dest):
    queries = [f"Things to do {dest}",
               f"{dest} itinerary",
               f"Hidden spots in {dest}",
               f"Must do {dest}",
               f"Places to stay {dest}"] # or site:reddit.com/r/travel
    
    links = set() # avoid duplicate links
    for query in queries:
        query += " site:reddit.com" # Restrict to reddit.com
        for link in search(query, num_results = 10):
            links.add(link)

    return links



In [8]:
links = fetchRedditLinks(destination)

if links:
    print(f"Reddit Threads about {destination}")
    for link in links:
        print(f"- [{link}]({link})")
else:
    print(f"No Reddit threads found for {destination}")
    
len(links)

Reddit Threads about Chiang Mai
- [https://www.reddit.com/r/chiangmai/comments/tv763h/where_to_stay_in_chiangmai_nimmanheim_or_east_old/](https://www.reddit.com/r/chiangmai/comments/tv763h/where_to_stay_in_chiangmai_nimmanheim_or_east_old/)
- [https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/](https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/)
- [https://www.reddit.com/r/chiangmai/comments/1bmosut/chiang_mai_trip_report_places_to_eat_activities/](https://www.reddit.com/r/chiangmai/comments/1bmosut/chiang_mai_trip_report_places_to_eat_activities/)
- [https://www.reddit.com/r/ThailandTourism/comments/1dvp91d/best_place_to_stay_in_chang_mai/](https://www.reddit.com/r/ThailandTourism/comments/1dvp91d/best_place_to_stay_in_chang_mai/)
- [https://www.reddit.com/r/chiangmai/comments/185wni7/what_to_do_in_chiang_mai/](https://www.reddit.com/r/chiangmai/comments/185wni7/what_to_do_in_chiang_mai/)
- [https://www.reddit.com/r/chi

37

### Testing
* Below is a test list of sources for Chiang Mai (still need to implement the system for fetching docs from Google)

In [9]:
sources = [
    'https://www.reddit.com/r/chiangmai/comments/1cb3fx4/yi_pengloi_krathong_2024/',
    'https://www.reddit.com/r/ThailandTourism/comments/1cpccho/yi_peng_festival_on_a_budget/',
    'https://www.reddit.com/r/ThailandTourism/comments/13b0gld/help_me_with_chiang_mai_itinerary/',
    'https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/',
    'https://www.reddit.com/r/chiangmai/comments/1er2mj4/things_to_do_in_chiang_mai/',
    'https://www.reddit.com/r/chiangmai/comments/1bmosut/chiang_mai_trip_report_places_to_eat_activities/',
    'https://www.reddit.com/r/chiangmai/comments/185wni7/what_to_do_in_chiang_mai/',
    
    # buggy tripadvisor
    'https://www.tripadvisor.com/ShowUserReviews-g293917-d8820428-r621075297-Yi_Peng_and_Loy_Krathong_Lantern_Festival-Chiang_Mai.html'
]

In [10]:
# New - use links from reddit
sources = links

In [11]:
import praw
import requests
from bs4 import BeautifulSoup, SoupStrainer
from urllib.parse import urlparse
import pandas as pd
import time
from typing import List, Dict
import os


### Reddit API Key
Create a Reddit API key by following the instructions [here](https://www.reddit.com/prefs/apps) (create a script)

add the following to your `.env` file
```bash
REDDIT_CLIENT_ID = "<your-reddit-client-id>"
REDDIT_CLIENT_SECRET = "<your-reddit-client-secret>" 
REDDIT_USER_AGENT = "python:<app-name>:v1.0 (by /u/<your-reddit-username>)"
```

In [12]:
# Scrape reddit using praw

# using praw to better seperate reddit data
reddit = praw.Reddit(
    client_id=os.environ.get("REDDIT_CLIENT_ID"),
    client_secret=os.environ.get("REDDIT_CLIENT_SECRET"),
    user_agent=os.environ.get("REDDIT_USER_AGENT"),
)

def scrape_reddit_post(url: str) -> Dict:
    """Scrape content from a Reddit post."""
    try:
        # Extract post ID from URL
        submission_id = url.split('comments/')[1].split('/')[0]
        submission = reddit.submission(id=submission_id)
        
        # Get post content
        post_data = {
            'title': submission.title,
            'text': submission.selftext,
            'comments': [],
            'url': url,
            'source': 'reddit'
        }
        
        # Get comments (limit to top-level comments)
        submission.comments.replace_more(limit=0)
        for comment in submission.comments:
            if not comment.stickied:  # Skip stickied comments
                post_data['comments'].append(comment.body)
        
        return post_data
    
    except Exception as e:
        print(f"Error scraping Reddit post {url}: {str(e)}")
        return None


In [13]:
from langchain_community.document_loaders import WebBaseLoader

# TODO scraping tripadvisor is too buggy keeps getting rate limited
def scrape_tripadvisor(url: str):
    loader = WebBaseLoader(
    web_paths=(url,),
    bs_kwargs=dict(
        parse_only=SoupStrainer(
            class_=("post-content", "post-title", "post-header")
            )
        ),
    )
    return loader.load()[0]

USER_AGENT environment variable not set, consider setting it to identify your requests.


In [14]:
# Combined function
def scrape_urls(urls: List[str]) -> pd.DataFrame:
    """Scrape content from a list of URLs."""
    scraped_data = []
    
    for url in urls:
        print(f"Scraping {url}")
        domain = urlparse(url).netloc
        
        # Add delay between requests
        time.sleep(1)
        
        if 'reddit.com' in domain:
            data = scrape_reddit_post(url)
        else:
            data = scrape_tripadvisor(url)
            
        if data:
            scraped_data.append(data)
    
    # Convert to DataFrame
    df = pd.DataFrame(scraped_data)
    return df

# non-reddit links keep getting rate limited
# display(scrape_urls(['https://www.tripadvisor.com/ShowUserReviews-g293917-d8820428-r621075297-Yi_Peng_and_Loy_Krathong_Lantern_Festival-Chiang_Mai.html']))

In [15]:
# Scrape reddit only
reddit_results = scrape_urls([s for s in sources if 'reddit' in s])

# Display first few rows
display(reddit_results.head())

Scraping https://www.reddit.com/r/chiangmai/comments/tv763h/where_to_stay_in_chiangmai_nimmanheim_or_east_old/
Scraping https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/
Scraping https://www.reddit.com/r/chiangmai/comments/1bmosut/chiang_mai_trip_report_places_to_eat_activities/
Scraping https://www.reddit.com/r/ThailandTourism/comments/1dvp91d/best_place_to_stay_in_chang_mai/
Scraping https://www.reddit.com/r/chiangmai/comments/185wni7/what_to_do_in_chiang_mai/
Scraping https://www.reddit.com/r/chiangmai/comments/16qipnb/hotel_recommendation_in_chiang_mai/
Scraping https://www.reddit.com/r/chiangmai/comments/1dxc8ef/where_should_i_stay_old_city_or_nimmanheim_and/
Scraping https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
Scraping https://www.reddit.com/r/ThailandTourism/comments/b66ie2/47_days_in_chiang_mai_i_need_itinerary_helpideas/
Scraping https://www.reddit.com/r/chiangmai/comments/1g3twnl/looking_

Unnamed: 0,title,text,comments,url,source
0,Where to stay in Chiangmai? Nimmanheim or East...,We love to eat street food and walk around. Wh...,[You are asking me where you should stay in Ch...,https://www.reddit.com/r/chiangmai/comments/tv...,reddit
1,Any hidden gems you’d recommend?,I’m in Chiang Mai over the weekend and looking...,[If you are looking for real Thai home cooking...,https://www.reddit.com/r/chiangmai/comments/10...,reddit
2,Chiang Mai Trip Report / Places to eat & Activ...,\n\nHello World!\n\nThis is my first thread...,[Thanks for taking the time to write this. It’...,https://www.reddit.com/r/chiangmai/comments/1b...,reddit
3,Best place to stay in Chang Mai,"I recently stayed at Astra sky river, it's wa...",[You stayed down Chang Klan Road and while it'...,https://www.reddit.com/r/ThailandTourism/comme...,reddit
4,What to do in Chiang Mai?,As the title says. My wife and I are a bit bor...,[Sticky Waterfalls. Mae Rim area has a lot of ...,https://www.reddit.com/r/chiangmai/comments/18...,reddit


### Encode Reddit data in single string for embedding

Target structure (Markdown):

* Markdown performance in GPT-4 is cheaper and more token-efficient
- https://arxiv.org/html/2411.10541v1
   - GPT-3.5-turbo prefers JSON, whereas GPT-4 favors Markdown
   - Encode in markdown since we are using GPT-4
- https://community.openai.com/t/markdown-is-15-more-token-efficient-than-json/841742

In [16]:
first_post = reddit_results.iloc[0]
first_post

title       Where to stay in Chiangmai? Nimmanheim or East...
text        We love to eat street food and walk around. Wh...
comments    [You are asking me where you should stay in Ch...
url         https://www.reddit.com/r/chiangmai/comments/tv...
source                                                 reddit
Name: 0, dtype: object

In [17]:
from dataclasses import dataclass
from typing import List
import textwrap

@dataclass
class RedditThread:
    title: str
    text: str
    comments: List[str]
    url: str
    source: str
    
    def __str__(self) -> str:
        # Format the main post
        output = [
            f"# {self.title}",
            "",
            f"## Original Post",
            "",
            textwrap.fill(self.text, width=80),
            "",
            "### Comments",
            ""
        ]
        
        # Format comments with indentation to show thread structure
        for comment in self.comments:
            # Calculate comment depth based on leading spaces
            depth = len(comment) - len(comment.lstrip())
            # Remove leading spaces and wrap text
            clean_comment = comment.lstrip()
            # Add markdown list marker with proper indentation
            indent = "  " * (depth // 2)  # Adjust indentation based on depth
            wrapped_comment = textwrap.fill(
                clean_comment,
                width=80,
                initial_indent=f"{indent}- ",
                subsequent_indent=f"{indent}  "
            )
            output.append(wrapped_comment)
        
        output += [
            "",
            "#### URL",
            self.url
        ]
        
        return "\n".join(output)
    
    def get_comments_as_str(self):
        output = ["### Comments"]
        # Format comments with indentation to show thread structure
        for comment in self.comments:
            # Calculate comment depth based on leading spaces
            depth = len(comment) - len(comment.lstrip())
            # Remove leading spaces and wrap text
            clean_comment = comment.lstrip()
            # Add markdown list marker with proper indentation
            indent = "  " * (depth // 2)  # Adjust indentation based on depth
            wrapped_comment = textwrap.fill(
                clean_comment,
                width=80,
                initial_indent=f"{indent}- ",
                subsequent_indent=f"{indent}  "
            )
            output.append(wrapped_comment)
        return "\n".join(output)

In [18]:
reddit_threads = [RedditThread(
    title=row['title'],
    text=row['text'],
    comments=row['comments'],
    url=row['url'],
    source=row['source']
) for index, row in reddit_results.iterrows()]

In [19]:
str(reddit_threads[0])

"# Where to stay in Chiangmai? Nimmanheim or East old town?\n\n## Original Post\n\nWe love to eat street food and walk around. Where should we stay in Chiang Mai\nand what to do? We plan on staying 1-2 weeks. Right now we’re looking at\nnimmaheim and east of old town; but really have no idea. Please suggest\n\n### Comments\n\n- You are asking me where you should stay in Chiang  Mai ?  You should stay in\n  the old city .\n- Honestly I'd say east Old City. I'm staying in Nimman now and have to say I\n  don't really get the appeal. There are some nice little side streets but\n  theres also very busy main arteries that cut through the area and aside from\n  that it just kinda seems a bit boring. Basically it doesn't really seem to be\n  neighbourhoody and quiet enough to be quaint but its also too quiet to be\n  lively if that makes sense.  Granted I've only been here a week so need to get\n  out and explore more, and obviously it's likely far quieter than it would be\n  pre-pandemic. But

In [20]:
display(Markdown(str(reddit_threads[0])))

# Where to stay in Chiangmai? Nimmanheim or East old town?

## Original Post

We love to eat street food and walk around. Where should we stay in Chiang Mai
and what to do? We plan on staying 1-2 weeks. Right now we’re looking at
nimmaheim and east of old town; but really have no idea. Please suggest

### Comments

- You are asking me where you should stay in Chiang  Mai ?  You should stay in
  the old city .
- Honestly I'd say east Old City. I'm staying in Nimman now and have to say I
  don't really get the appeal. There are some nice little side streets but
  theres also very busy main arteries that cut through the area and aside from
  that it just kinda seems a bit boring. Basically it doesn't really seem to be
  neighbourhoody and quiet enough to be quaint but its also too quiet to be
  lively if that makes sense.  Granted I've only been here a week so need to get
  out and explore more, and obviously it's likely far quieter than it would be
  pre-pandemic. But I would say for right now Old City, seems to have better
  streetfood options, lots of little local restaurants, more nightlife, more
  walkable, more to see and do in general.
- We love to eat street food and walk around. Where should we stay and what to
  do?   Right now we’re looking at nimmanheim, tha pae gate and santitham.
  Please suggest we really have no idea
- Santitham would be my vote since it’s so inexpensive and has lots of street
  food.  “Walkable” would suggest Old City, but the truth is, Chiang Mai is not
  a small city. Old City has lovely walks, but you’ll most likely have to rent a
  bike or take Grab taxis out of the area to see the other sights CM has to
  offer.
- Highly suggest Enantamer Poshtel. Pappy has great recs for anything you could
  possibly want. If you feel comfortable, rent a scooter to see the whole city
  in a couple days.

#### URL
https://www.reddit.com/r/chiangmai/comments/tv763h/where_to_stay_in_chiangmai_nimmanheim_or_east_old/

### Add reddit data into vector store

In [21]:
%pip install --quiet sentence-transformers
%pip install --quiet langchain
%pip install --quiet langchain_openai

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


In [22]:
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-4o"
)

In [23]:
reddit_thread_strings = [str(t) for t in reddit_threads] # not being used anymore

In [24]:
import torch

def get_device():
    if torch.backends.mps.is_available():
        if torch.backends.mps.is_built():
            return "mps"
    return "cpu"

get_device()

'mps'

In [None]:
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import Chroma
from langchain.schema import Document
import numpy as np
from typing import List, Optional
import re


def create_reddit_vectorstore(
    reddit_threads: List[RedditThread], 
    persist_directory: str = "./chroma_db",
) -> Chroma:
# Initialize embeddings model
    embeddings = HuggingFaceEmbeddings(
        model_name=" ", # allows embedding the entire reddit thread, which is probably more useful
        model_kwargs={'device': get_device()}
    )
    
    # Convert threads to Documents with metadata
    documents = []
    for i, thread in enumerate(reddit_threads):
        metadata = {
            "thread_id": i,
            "title": thread.title,
            "content": thread.text,
            "comments": thread.get_comments_as_str(),
            "url": thread.url,
            "type": "reddit_thread",
        }
        
        doc = Document(
            page_content=str(thread),
            metadata=metadata
        )
        documents.append(doc)
    
    vectorstore = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory=persist_directory
    )
    
    vectorstore.persist()
    return vectorstore

# Create vectorstore
vectorstore = create_reddit_vectorstore(reddit_threads)

retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={
        "k": 3,
        "filter": None,
    }
)

  embeddings = HuggingFaceEmbeddings(
  from .autonotebook import tqdm as notebook_tqdm
  vectorstore.persist()


In [26]:
from langchain_core.prompts import ChatPromptTemplate

rag_system_prompt = (
    "You're a helpful AI assistant. Given a user question "
    "and some Reddit reddit threads formatted in markdown format composing of user questions and comments, answer the user "
    "question using the replies in the comments based on the topic of the thread. If none of the threads answer the question, "
    "just say you don't know."
    "\n\nHere are the Reddit Threads in full: "
    "{context}"
)

rag_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", rag_system_prompt),
        ("human", "{input}"),
    ]
)

In [27]:
def format_reddit_threads(threads):
    enumerated_reddit_threads = [
        f"Source ID: {i}\nReddit Thread: {thread}"
        for i, thread in enumerate(threads)
    ]

    return "\n\n" + "\n\n".join(enumerated_reddit_threads)


In [28]:
from pydantic import BaseModel, Field
from langchain_core.runnables import RunnablePassthrough


class Citation(BaseModel):
    source_id: int = Field(
        ...,
        description="The integer ID of a SPECIFIC source which justifies the answer.",
    )
    quote: str = Field(
        ...,
        description="The VERBATIM quote from the specified source that justifies the answer.",
    )


class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(
        ...,
        description="The answer to the user question, which is based only on the given sources.",
    )
    citations: List[Citation] = Field(
        ..., description="Citations from the given sources that justify the answer."
    )

rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_reddit_threads(x["context"])))
    | rag_prompt
    | llm.with_structured_output(QuotedAnswer)
)


retrieve_docs = (lambda x: x["input"]) | retriever

chain = RunnablePassthrough.assign(context=retrieve_docs).assign(
    answer=rag_chain_from_docs
)

In [29]:
result = chain.invoke({"input": f"What is a must-eat food in {destination}?"})

In [30]:
result["answer"]

QuotedAnswer(answer='A must-eat food in Chiang Mai is Khao Soi. It is highly recommended by multiple sources for its delicious taste.', citations=[Citation(source_id=1, quote='2. Khao soi is the best! Street food is a really great way to try a little bit of everything.'), Citation(source_id=1, quote='Definitely eat Khao Soi, as far as cooking classes go, I’ve used “Cooking @home Chiang Mai” on two occasions and it was both fun and delicious.'), Citation(source_id=1, quote=">2) What are some of the must have food? (I have heard that Khao soi is amazing)  If you've never had it you'll like Khao Soi but if you get a chance visit a Lanna (northern Thai) restaurant as there's so much more to Lanna food. Two recommended dishes - Gaeng Hung Lay and Nam Prik Ong but there are plenty of others and it's completely different to central Thai food, the Thai food you get in restaurants overseas.")])

In [31]:
result["answer"].answer

'A must-eat food in Chiang Mai is Khao Soi. It is highly recommended by multiple sources for its delicious taste.'

In [32]:
import json
def format_citations_to_json(citations: List[Citation], retrieved_docs) -> str:
    # Convert citations to list of dictionaries
    citations_dict = [citation.model_dump() for citation in citations]
    
    # Add URLs using source_id
    for citation in citations_dict:
        citation["url"] = retrieved_docs[citation["source_id"]].metadata["url"]
    
    
    # Convert to JSON with proper formatting
    json_output = json.dumps(citations_dict, indent=2)
    return json_output

In [33]:
print(format_citations_to_json(result["answer"].citations, result['context']))

[
  {
    "source_id": 1,
    "quote": "2. Khao soi is the best! Street food is a really great way to try a little bit of everything.",
    "url": "https://www.reddit.com/r/ThailandTourism/comments/d8b7pl/what_are_some_must_things_to_do_at_chiang_mai/"
  },
  {
    "source_id": 1,
    "quote": "Definitely eat Khao Soi, as far as cooking classes go, I\u2019ve used \u201cCooking @home Chiang Mai\u201d on two occasions and it was both fun and delicious.",
    "url": "https://www.reddit.com/r/ThailandTourism/comments/d8b7pl/what_are_some_must_things_to_do_at_chiang_mai/"
  },
  {
    "source_id": 1,
    "quote": ">2) What are some of the must have food? (I have heard that Khao soi is amazing)  If you've never had it you'll like Khao Soi but if you get a chance visit a Lanna (northern Thai) restaurant as there's so much more to Lanna food. Two recommended dishes - Gaeng Hung Lay and Nam Prik Ong but there are plenty of others and it's completely different to central Thai food, the Thai food

In [34]:
first_result = result['context'][0]
print(first_result.page_content)

print("Metadata")

import json

print(json.dumps(
    first_result.metadata,
    sort_keys=False,
    indent=4,
    separators=(',', ': ')
))

# Things to do in Chiang Mai

## Original Post

Hey guys! I'm an Irish Tourist and am in Chiang Mai for a few days Does anyone
have any recommendations for food or drinks?  I love a good bar, and I love good
food. Apart from that please feel free to mention any other cool stuff!
(Picture: Mae Salong, Chiang Rai)

### Comments

- Here’s a helpful [Chiang Mai Itinerary](https://travelhiatus.com/the-
  perfect-4-days-in-chiang-mai-itinerary/) guide
- https://www.tripadvisor.com/Tourism-g293917-Chiang_Mai-Vacations.html
- Go to santitham and never leave
- Skugga estate, the queen's botanical garden and Elephant Nature Park and
  sunday night market were our highlights.
- Well the UN Irish pub for a pint of Guinness and there's plenty on that road
  to keep you entertained.

#### URL
https://www.reddit.com/r/chiangmai/comments/1er2mj4/things_to_do_in_chiang_mai/
Metadata
{
    "comments": "### Comments\n- Here\u2019s a helpful [Chiang Mai Itinerary](https://travelhiatus.com/the-\n  perfect-

In [35]:
# Here's all the context
result['context']

[Document(metadata={'comments': "### Comments\n- Here’s a helpful [Chiang Mai Itinerary](https://travelhiatus.com/the-\n  perfect-4-days-in-chiang-mai-itinerary/) guide\n- https://www.tripadvisor.com/Tourism-g293917-Chiang_Mai-Vacations.html\n- Go to santitham and never leave\n- Skugga estate, the queen's botanical garden and Elephant Nature Park and\n  sunday night market were our highlights.\n- Well the UN Irish pub for a pint of Guinness and there's plenty on that road\n  to keep you entertained.", 'content': "Hey guys! I'm an Irish Tourist and am in Chiang Mai for a few days\nDoes anyone have any recommendations for food or drinks? \nI love a good bar, and I love good food.\nApart from that please feel free to mention any other cool stuff! \n(Picture: Mae Salong, Chiang Rai)\n", 'thread_id': 29, 'title': 'Things to do in Chiang Mai', 'type': 'reddit_thread', 'url': 'https://www.reddit.com/r/chiangmai/comments/1er2mj4/things_to_do_in_chiang_mai/'}, page_content="# Things to do in Ch

### Generate list of cited claims to add to the enriched RAG answer
* Small change in architecture: our program will output a list of cited claims, which is then passed into the LLM to generate the final answer.

In [36]:
rag_queries = [
    # Core travel information
    f"What are the most underrated or secret spots in {destination} that tourists usually miss?",
    f"What's the current situation with tourists in {destination} as of 2024?",
    f"What are common tourist scams to avoid in {destination} right now?",
    
    # Local insights
    f"Locals of {destination}, what tips would you give to tourists in 2024?",
    f"What are the best neighborhoods to stay in {destination} for different budgets?",
    f"What's the best way to get around {destination} nowadays?",
    
    # Food and culture
    f"What are must-try local foods in {destination} and where to find them?",
    f"What cultural faux pas should tourists avoid in {destination}?",
    f"Which local festivals or events are worth planning a trip around in {destination}?",
    
    # Practical advice
    f"How much does everything cost in {destination} in 2024?",
    f"What's the best time to visit {destination} and why?",
    f"What should I absolutely pack for {destination} that most tourists forget?",
    
    # Safety and logistics
    f"How safe is {destination} for solo travelers in 2024?",
    f"What areas should tourists avoid in {destination}?",
    f"What's the current transportation situation in {destination}?"
]

In [37]:
from typing import List, Dict, Tuple
import re

def generate_qa_pairs(destination: str, chain, queries: List[str]) -> List[Tuple[str, str]]:
    qa_pairs = []
    
    for query in queries:
        
        # Get relevant documents from the retriever
        result = chain.invoke({"input": query})
        
        if not result:
            print(f"Unable to get result for query {query}, skipping")
            continue
        
        retrieved_docs = result['context']
        
        # Generate answer using a template
        answer = result['answer'].answer
        citations = format_citations_to_json(result['answer'].citations, retrieved_docs)
        
        
        
        answer = f"Based on recent Reddit discussions:\n{answer}\nCitations:\n{citations}"
            
        qa_pairs.append((query, answer, citations, retrieved_docs)) # also store retireved docs for benchmarking
    
    return qa_pairs

### Generate Answer

In [38]:
qa_pairs = generate_qa_pairs(destination=destination, chain=chain, queries=rag_queries)
reddit_insights = "\n\n".join([
        f"{i+1}.\nQ: {q}\nA: {a}" for i, (q, a, citations, retrieve_docs) in enumerate(qa_pairs)
    ])

In [39]:
print(reddit_insights)

1.
Q: What are the most underrated or secret spots in Chiang Mai that tourists usually miss?
A: Based on recent Reddit discussions:
1. **MAIIAM Contemporary Art Museum**: This gallery is not commonly recommended but is highly praised for its contemporary art installations tackling social issues. It also has a cool souvenir shop and a café with excellent desserts. 
   
2. **Wat Mae Kaet Noi (Hell Temple)**: A unique and less-visited temple in San Sai, known for its unusual and thought-provoking statues.

3. **Sticky Waterfalls (Namtok Bua Thong)**: Located in the Namtok Bua Thong-Nam Phu Chet Si National Park, these waterfalls allow visitors to walk straight up due to the sticky limestone rocks. 

4. **Twenty Mar Coffee Shop**: A hidden gem with an exhibition space out back, offering a unique and quiet café experience.

5. **Erotic Garden**: An unusual garden on the way to Lamphun, offering a different kind of art garden experience.

6. **Doorbell Ice Cream**: A unique ice cream spot th

In [40]:
rag_system_prompt = f"""You are a knowledgeable travel assistant. Provide a comprehensive and well-structured guide for the specified destination. Your response should be detailed, accurate, and formatted using markdown.

The following information in the form of question-answer pairs with citations has been verified from recent Reddit discussions and should be incorporated into your response: 

{reddit_insights}

Now that is the end of the reddit insights. 

Make sure you follow these guidelines:

1. Begin with a brief introduction about the destination.
2. Use level 2 headers (##) to separate main sections.
3. Use bolding (**) for subsections or important points within sections.
4. Include exclusively the following sections in exactly this order:
   - Best Time to Visit
   - Top Attractions
   - Local Culture and Etiquette
   - Getting Around
   - Food and Dining
   - Accommodation
   - Safety and Health
   - Budget Tips
   - Unique Experiences
   - Essential Phrases (if applicable)

5. Provide practical, up-to-date information and insider tips.
6. Use bullet points sparingly, preferring well-structured paragraphs.
7. Cite every factual claim using numbers in brackets (e.g., "The temple is beautiful[1]").
8. At the end of your response, include a "Citations" section with all referenced quotes in this exact format:
   [1] "exact quote from source" - source_url
   [2] "exact quote from source" - source_url

9. Every citation must include both the exact quote and its source URL.
10. Do not include any information from answers that express uncertainty or lack of knowledge.
11. Make sure every citation in your text corresponds to a quote in the Citations section.
12. You must cite starting with [1]. Citations MUST be sequential, and should not skip numbers.

Aim for a comprehensive guide that's easy to read and informative for travelers.
"""

In [41]:
# print(rag_system_prompt)

### Add Citations to Answer

In [42]:
destination = "Chiang Mai"
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": rag_system_prompt},
        {"role": "user", "content": f"Provide key travel tips for {destination}"}
    ],
    stream=True
)

full_response = ""
for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        content = chunk.choices[0].delta.content
        full_response += content
        
        # Clear the current output and display the updated markdown
        clear_output(wait=True)
        display(Markdown(full_response))
        
        # Add a small delay to make the streaming visible
        time.sleep(0.01)

# Chiang Mai Travel Guide

## Introduction
Chiang Mai, often dubbed the "Rose of the North," is a culturally rich and historic city in Thailand, renowned for its ancient temples, vibrant festivals, and beautiful landscapes. Whether you're seeking adventure in the surrounding mountains or exploring the vibrant local culture, Chiang Mai offers a diverse range of experiences for every traveler.

## Best Time to Visit
Chiang Mai is best visited during the cooler months from November to February, when the weather is pleasant for outdoor activities and exploring the city's attractions[1]. This period is ideal as it avoids the hot summer months and the monsoon rains, offering a comfortable environment for tourists.

## Top Attractions
**MAIIAM Contemporary Art Museum** offers a unique insight into contemporary Thai art, with many installations addressing social issues[1]. 

**Wat Mae Kaet Noi (Hell Temple)** in San Sai is another exceptional site, known for its thought-provoking statues[1].

**Sticky Waterfalls (Namtok Bua Thong)** provide a unique natural experience where visitors can walk up the waterfall due to the sticky nature of the limestone rocks[1].

For a quieter, scenic outing, **Mae Kuang Dam** offers beautiful vistas and a peaceful atmosphere[1].

## Local Culture and Etiquette
When it comes to local customs, visiting **Elephant Nature Park** instead of elephant camps is advised, as it is one of the few ethical sanctuaries focusing on the welfare of the animals[1]. It is important to show respect for the local culture by trying authentic northern Thai dishes and dressing modestly when visiting temples.

## Getting Around
Transportation apps like **Grab**, **Bolt**, **Indriver**, or **Maxim** are recommended for traveling around Chiang Mai, offering affordable rates compared to Tuktuks, which are often more of a tourist attraction with higher fares[1]. The city is also walkable, especially around the Old City and Nimman area.

## Food and Dining
**Khao Soi**, a northern Thai curry noodle dish, is a must-try and highly recommended at restaurants like KHAO-SO-I[1]. Try local delicacies like **Nam Prik Ong**, **Guang Hung Lay**, and **Sai Ua** to experience the rich flavors of Lanna cuisine[1]. More adventurous palates can sample an all-you-can-eat BBQ pork, popular among locals and students[1].

## Accommodation
For modern vibes, the **Nimman Area** boasts chic cafes and boutique hotels, offering a slightly upscale experience. The **Old City** is ideal for those on a moderate budget, rich in history and culture with proximity to traditional markets and iconic temples. **Near Thapae Gate** offers budget-friendly options close to the bustling night markets, suitable for those wanting to stay in the heart of the action[1].

## Safety and Health
Chiang Mai is considered very safe for solo travelers, regularly ranking as the safest city in Southeast Asia[1]. However, tourists should avoid karaoke bars and be mindful of traffic, especially when crossing roads, as drivers do not typically slow down for pedestrians[1]. Pack a rain poncho for unexpected afternoon storms to avoid getting wet while exploring[1].

## Budget Tips
To make your visit affordable, stay near local markets like **Warorot Market**, where you can find items cheaper than in the tourist-oriented night markets[1]. Avoid Tuktuks for longer journeys and make use of ridesharing apps for economical transportation[1].

## Unique Experiences
Visit the quirky **Erotic Garden** or the hidden **Twenty Mar Coffee Shop** with its exhibition space for unique and off-the-beaten-path experiences[1]. Take the opportunity to explore lesser-known cafes like **So Cool Café**, which features creative decor and fish tanks made from old TVs[1].

## Citations
[1] "MAIIAM gallery is something I never see recommended. I stumbled upon it by accident and absolutely loved it. It's contemporary art, many installations tackle social issues and carry some kind of moral statements. There's also a pretty cool souvenir shop and a nice café with bomb desserts." - https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/
[2] "If you want to see something different - Wat Me Kaet Noi in San Sai, the so-called Temple of Hell." - https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/
[3] "Sticky waterfall that you can walk straight up is not far from Chiang Mai" - https://www.reddit.com/r/chiangmai/comments/1fmqsk1/best_things_to_do_in_chiang_mai_not_in_your_guide/
[4] "Mae Kuang Dam. This place is pretty awesome" - https://www.reddit.com/r/chiangmai/comments/101znbe/any_hidden_gems_youd_recommend/
[5] "Elephant Nature Park. That's the one. That's the only one you should visit. Virtually all the rest of them are privately owned, for-profit places with a long history of abuse. Please don't support the other places." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
[6] "Don't use Tuktuks for transportation, use Grab, Bold, Indriver or Maxim. If you want to just try a Tuktuk for fun go for it, but it's mostly a tourist thing and prices are much higher that other transport." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
[7] "I recommend Khao soi At KHAO-SO-I. The food is really good." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
[8] "Try lots of different things. Nam prik ong, guang hung lay, sai ua and more." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
[9] "Tamarind Village is worth looking at. Ideally located in the Old City. Very good service and lovely rooms." - https://www.reddit.com/r/chiangmai/comments/16qipnb/hotel_recommendation_in_chiang_mai/
[10] "Chiang Mai regularly ranks as the safest city in Southeast Asia ahead of Singapore - as long as you avoid karaoke bars, you will be fine. For great music, Boy Blues Bar in the Night Bazaar." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/
[11] "Avoid - Afternoon storms. This is the time to sit at a bar/cafe and watch the world splash by." - https://www.reddit.com/r/chiangmai/comments/1evxoyu/4_days_trip_in_chiang_mai_any_suggestions_for/

### Present With Rag / Without RAG relevance (the evaluation step)

In [43]:
%pip install bert-score --quiet

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Note: you may need to restart the kernel to use updated packages.


In [44]:
from bert_score import score, BERTScorer
def evaluate_response_citations_similarity(
    llm_response: str,
    citations: List[str],
    model_type: str = "roberta-large",
    language: str = "en",
    batch_size: int = 32
):
    # Repeat the LLM response for each citation
    scorer = BERTScorer(model_type='bert-base-uncased')
    precision = np.array([])
    recall = np.array([])
    f1 = np.array([])
    for citation in citations:
        P, R, F1 = scorer.score([llm_response], [citation])
        print(P,R,F1)
        precision = np.append(precision, P)
        recall = np.append(recall, R)
        f1 = np.append(f1, F1)
    
    # Calculate average scores
    avg_precision = np.mean(precision)
    avg_recall = np.mean(recall)
    avg_f1 = np.mean(f1)
    
    # Prepare results
    results = {
        'avg_precision': avg_precision,
        'avg_recall': avg_recall,
        'avg_f1': avg_f1,
        'individual_scores': list(zip(precision, recall, f1))
    }
    
    return results

import re

def parse_markdown_sections(markdown_text):
    """
    Parses a Markdown document into sections by titles (denoted by ##).
    Extracts relevant text and citations for each section.
    Omits the '## Citations' section, but extracts its content separately.
    """
    # Regex patterns
    section_pattern = re.compile(r'(##\s*(.+))')
    citation_pattern = re.compile(r'\[(\d+)\]')
    citation_content_pattern = re.compile(r'\[(\d+)\]\s*"([^"]+)"\s*-\s*\[source_url\]\(([^)]+)\)')

    sections = {}
    citations = {}

    current_section = None

    for line in markdown_text.splitlines():
        # Check for a new section
        section_match = section_pattern.match(line)
        if section_match:
            section_title = section_match.group(2).strip()
            
            # Skip the Citations section but track its content
            if section_title.lower() == "citations":
                current_section = "citations"
                continue
            
            current_section = section_title
            sections[current_section] = {"content": "", "citations": set()}
            continue

        # Extract citations and their contents
        if current_section == "citations":
            citation_match = citation_content_pattern.match(line)
            if citation_match:
                citation_number = citation_match.group(1)
                citation_text = citation_match.group(2)
                citation_url = citation_match.group(3)
                citations[citation_number] = {
                    "text": citation_text,
                    "url": citation_url
                }
            continue

        # Append content if inside a section
        if current_section:
            sections[current_section]["content"] += line.strip() + " "

            # Extract citation numbers in the line
            found_citations = citation_pattern.findall(line)
            sections[current_section]["citations"].update(found_citations)

    # Clean up spaces
    for section in sections:
        sections[section]["content"] = sections[section]["content"].strip()
        sections[section]["citations"] = sorted(list(sections[section]["citations"]))

    # Sort citations by number
    sorted_citations = dict(sorted(citations.items(), key=lambda x: int(x[0])))

    return sections, sorted_citations

# Parse the answer
parsed_parts, sorted_citations = parse_markdown_sections(full_response)


In [45]:
for part in parsed_parts:
    cur_obj = parsed_parts[part]
    citation_list = []
    for citation in cur_obj["citations"]:
        if citation not in sorted_citations:
            continue
        citation_list.append(sorted_citations[citation]["text"])
    
    if len(citation_list) != 0: 
        print(cur_obj["content"])
        print(citation_list)
        results = evaluate_response_citations_similarity(cur_obj["content"], citation_list)

        print(f"Average Precision: {results['avg_precision']:.4f}")
        print(f"Average Recall: {results['avg_recall']:.4f}")
        print(f"Average F1: {results['avg_f1']:.4f}")

        print("\nIndividual Scores:")
        for i, (p, r, f) in enumerate(results['individual_scores']):
            print(f"Citation {i+1}: Precision = {p:.4f}, Recall = {r:.4f}, F1 = {f:.4f}")

scorer = BERTScorer(model_type='bert-base-uncased')
concatenated_citation = ""
for query, answer, citations, retrieved_docs in qa_pairs:
    json_obj = json.loads(citations)
    for citation in json_obj:
        concatenated_citation = concatenated_citation + citation["quote"]
    
P, R, F1 = scorer.score([full_response], [concatenated_citation])
print(P,R,F1)

tensor([0.5432]) tensor([0.5557]) tensor([0.5494])
