# 1) Librarian

Librarian is a data-centric conversational model built on retrieval-augmented generation (RAG).

It can take on archetypes (specialists), each backed by its own set of databases, and converse with the user to retrieve information from them.

Because the design is centred on the data model, the user can query a variety of databases, and the Librarian can retrieve and cross-reference information across them.

In [1]:
from RAG.librarian import Librarian 

librarian = Librarian(librarian_LLM_model = "GEMINI")

# SELECT SPECIALIST DATABASE
librarian.select_specialist(specialist = "traveller", specialist_LLM_model = "GEMINI")

# Ask librarian to get acquainted with the specialist database
librarian.Traveller.load_data_model(reembed = False,
                                    embed_id = 0,
                                    data_model_keys = {"TEST - CLIENT":"CLIENT ID",
                                                        "TEST - CLIENT REQUEST":"CLIENT ID",
                                                        "TEST - FLIGHTS":"FLIGHT ID",
                                                        "TEST - ACCOMODATIONS":"ACCOMODATION ID",
                                                        "TEST - ACTIVITIES":"ACTIVITY ID",
                                                        "TEST - SERVICES":"SERVICE ID",
                                                        },
                                    reembed_table = {"TEST - CLIENT":True,
                                                    "TEST - CLIENT REQUEST":True,
                                                    "TEST - FLIGHTS":True,
                                                    "TEST - ACCOMODATIONS":True,
                                                    "TEST - ACTIVITIES":True,
                                                    "TEST - SERVICES":True,
                                                    }

                                    )

# Data quality of the data model still to be checked (all irrelevant data dropped: invoice numbers, other IDs, empty cells)
# TODO: break services down into 3 buckets so that they fit into a day, half a day, etc.
# e.g. tours: 3h cultural tour at Arab Street, 4h foodie tour, 6h hiking tour, 3h shopping tour
# - HOW TO CLASSIFY THESE BUCKETS? Best way? Need to look at existing travel packages (and how they are classified)
# - have to check with the SME on the data model: edit/modify buckets based on SmartWorld's specialised needs
# Prompt engineering: include a free-and-easy day if there's a gap in the schedule, or reserve 1 day for RR, etc.

# RAG ACCURACY: 
# - how to check if the RAG is accurate? 
# - how to check if the RAG is hallucinating? 
# - how to check if the RAG is giving irrelevant data?

#TODO: services -> activity (attractions, shopping, dining)
#
# New services table
# Services: drivers are charged at a daily price (7 days = 7 x daily price); also tour guides, translators, chefs, bodyguards, VIP escorts


#TODO: costing has to be accurate, and also has to be based on the client's budget

#TODO: itinerary has to be based on the client's preferences, and also has to be based on the client's historical preferences

#TODO: flawed or hallucinated responses have to be stored and used as bad examples for future responses
# e.g. this package is hallucinating discount vouchers at Orchard Road: this is a bad example and should be flagged as such

#TODO: streamline library (clear irrelevant backend code: ADAM_llama_index_RAG, crypto)

loading specialist: TRAVELLER ...
TRAVELLER embedding: TEST - CLIENT - LOADED
TRAVELLER embedding: TEST - CLIENT REQUEST - LOADED
TRAVELLER embedding: TEST - FLIGHTS - LOADED
TRAVELLER embedding: TEST - ACCOMODATIONS - LOADED
TRAVELLER embedding: TEST - ACTIVITIES - LOADED
TRAVELLER embedding: TEST - SERVICES - LOADED
TRAVELLER loaded
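
The costing notes in the cell above (a driver is charged per day, so 7 days cost 7 x the daily price, and the total has to respect the client's budget) can be sketched as follows. This is only an illustration: `service_cost` and `check_budget` are hypothetical helpers that are not part of `Librarian`, and the prices are made-up numbers, not inventory data.

```python
def service_cost(daily_price, days, quantity=1):
    """Cost of a per-day service (driver, bodyguard, ...) over a whole trip."""
    if daily_price < 0 or days < 0 or quantity < 0:
        raise ValueError("daily_price, days and quantity must be non-negative")
    return daily_price * days * quantity


def check_budget(line_items, budget):
    """Sum the line items and report whether the package fits the budget."""
    total = sum(line_items.values())
    return {
        "total": total,
        "budget": budget,
        "within_budget": total <= budget,
        "remaining": budget - total,
    }


# Made-up prices, for illustration only
line_items = {
    "driver": service_cost(daily_price=100, days=7),
    "bodyguards": service_cost(daily_price=150, days=7, quantity=2),
}
report = check_budget(line_items, budget=2000)
print(report)
```

Computing the total outside the LLM and passing the result into the prompt keeps the arithmetic deterministic, instead of asking the model to add prices up itself.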


## _*Generate*_ : travel package from customer/agent prompt  + inventory

In [3]:
# Ask Traveller to generate a travel package
default_query = "Shaik from Kuala Lumpur wants to go Singapore for 5 days for 2 pax. Budget: $2000. Make one day specifically for Little India. Make leftover days for OTHER casual activities."

initial_query = input(f"Enter your initial query (or press Enter to use the default):\n{default_query}\n")

# Check if the user entered anything
if not initial_query:
  initial_query = default_query

print(f"Using initial query: {initial_query}")


convo_package = librarian.Traveller.III_generate_travel_package(initial_query = initial_query,
                                                                 topN = 6, 
                                                                 model_name = "gemini-pro",
                                                                 )

Using initial query: Shaik from Kuala Lumpur wants to go Singapore for 5 days for 2 pax. Budget: $2000. Make one day specifically for Little India. Make leftover days for OTHER casual activities.
Token size of the prompt for cl100k_base ~ 3351
**Summary**

Embark on an unforgettable 5-day adventure to the vibrant city of Singapore, where you'll immerse yourself in the captivating Little India district and indulge in tantalizing local flavors. Stroll through bustling markets, savor authentic street food, and uncover the rich cultural heritage of this vibrant enclave. Beyond Little India, explore iconic landmarks, experience breathtaking panoramic views, and dive into the diverse culinary scene that Singapore is renowned for.

**Journey Highlights**

* Explore the vibrant Little India district, a kaleidoscope of colors, aromas, and bustling markets
* Savor authentic Indian cuisine and street food on a guided foodie tour
* Visit the iconic Merlion and explore the lush Botanic Gardens
* As

In [5]:
followup_query = input("Enter a follow-up query to refine the package (or press Enter to skip):\n")

# Only call the model again when a follow-up query was actually entered
if followup_query:
    convo_package = librarian.Traveller.III_generate_travel_package(initial_query = "",
                                                                    followup_query = followup_query,
                                                                    topN = 6,
                                                                    model_name = "gemini-pro",
                                                                    )

Token size of the prompt for cl100k_base ~ 3576
**Refined Travel Package**

**Summary**

Embark on an unforgettable 5-day adventure to the vibrant city of Singapore, where you'll immerse yourself in the captivating Little India district and indulge in tantalizing local flavors. Stroll through bustling markets, savor authentic street food, and uncover the rich cultural heritage of this vibrant enclave. Beyond Little India, explore iconic landmarks, experience breathtaking panoramic views, and dive into the diverse culinary scene that Singapore is renowned for.

**Journey Highlights**

* Explore the vibrant Little India district, a kaleidoscope of colors, aromas, and bustling markets
* Savor authentic Indian cuisine and street food on a guided foodie tour
* Visit the iconic Merlion and explore the lush Botanic Gardens
* Ascend to the top of Marina Bay Sands for breathtaking city views
* Indulge in world-class shopping and dining experiences
* Your journey takes you to: Kuala Lumpur - Sin

### i) Librarian's Traveller CAPABILITIES

#### a) _Retrieval_

In [4]:
prompt = "shaik"
client_recommendations = librarian.Traveller.I_recommend_client(content = prompt,
                                                        topN=3,
                                                        # task_type = "retrieval_document",
                                                        task_type = "retrieval_query",
                                                        )
client_recommendations

[{'Passage': 'CLIENT ID: C139, Client Name: DATO SRI SHAIK AQMAL BIN SHAIK ALAUDDIN, Description: TRAVEL INSURANCE & ACOMMODATION 11 - 18 FEB 2024, Date: 08 Feb 24, Price: 22000.0, Remarks: nan, PREPARED BY: ANIES',
  'Similarity Score': 0.5798584591719098},
 {'Passage': 'CLIENT ID: C018, Client Name: ONE&ONLY DESARU (OODC), Description: Commission to Smart World, Date: 2024-01-03 00:00:00, Price: 2790.6, Remarks: nan, PREPARED BY: EKIN',
  'Similarity Score': 0.5793927541501963},
 {'Passage': "CLIENT ID: C001, Client Name: DATO' SRI ISMAIL SABRI BIN YAAKOB , Description: SAUDI Trip from 24-30 Dec 2023 For Umrah. Day 1-2: Arrive in Madinah, visit the Prophet's Mosque and historical sites. Day 3-4: Makkah for Tawaf. Day 5: Cave of Hira, Great Mosque. Day 6: Depart, Date: 2023-10-11 00:00:00, Price: 149579.0, Remarks: UMRAH TRIP 24 - 30 DEC 2023, PREPARED BY: HANA",
  'Similarity Score': 0.5754893795861964}]

In [5]:
prompt = "Who is the most hardcore client?"
client_recommendations = librarian.Traveller.I_recommend_client_request(content = prompt,
                                                        topN=1,
                                                        # task_type = "retrieval_document",
                                                        task_type = "retrieval_query",
                                                        ) 
client_recommendations

[{'Passage': 'CLIENT ID: C139, Date: 2014-03-24 00:00:00, Location: kuala lumpur, Prompt: dunno, just plan something exciting and cultural with calm vibe around an important AI event of mine on december 24. So i need to be mentally sharp before, and not shagged, Duration: 1 month, Period: 2014-12-12 to 2015-01-12, Budget: 40000, Pax: 2, Client Quirks: doesnt like humid weather and will always prefer somewhere cold, Special Requests: Needs 2 bodyguards at all times',
  'Similarity Score': 0.6404644571132381}]

In [None]:
# Libraries (assuming you have necessary libraries installed)
import numpy as np

# Define weights (you can adjust these based on your needs)
PROMPT_WEIGHT = 0.8  # Higher weight for user's current interest
PROFILE_WEIGHT = 0.2  # Lower weight for user profile

def weighted_search(prompt_embedding, user_profile_embedding, recommendations):
  """
  Combines prompt and user profile embedding similarity scores for weighted search.

  Args:
      prompt_embedding: Embedding representing the user's prompt.
      user_profile_embedding: Embedding representing the user's travel profile.
      recommendations: List of recommendations (flights, accommodations, services).

  Returns:
      A list of recommendations sorted by their weighted similarity score.
  """
  # Calculate similarity scores between prompt embedding and each recommendation
  prompt_similarities = [np.dot(prompt_embedding, item['embedding']) for item in recommendations]

  # Calculate similarity scores between user profile embedding and each recommendation
  profile_similarities = [np.dot(user_profile_embedding, item['embedding']) for item in recommendations]

  # Combine similarity scores with weights
  weighted_scores = [PROMPT_WEIGHT * prompt_sim + PROFILE_WEIGHT * profile_sim 
                     for prompt_sim, profile_sim in zip(prompt_similarities, profile_similarities)]

  # Sort recommendations based on weighted scores (descending order)
  sorted_recommendations = sorted(zip(recommendations, weighted_scores), key=lambda x: x[1], reverse=True)

  return [item[0] for item in sorted_recommendations]

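# Example usage. NOTE: generate_prompt_embedding, generate_user_profile_embedding,
# get_recommendations and user_id are placeholders that are not defined in this notebook;
# supply your own implementations (and recommendations with an 'embedding' key) before running.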
prompt_embedding = generate_prompt_embedding("I want to go to Singapore")
user_profile_embedding = generate_user_profile_embedding(user_id)
recommendations = get_recommendations("Singapore")  # Flights, accommodations, services for Singapore

personalized_recommendations = weighted_search(prompt_embedding, user_profile_embedding, recommendations)

# Use the personalized_recommendations list for further processing or display
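
A self-contained variant of the weighting idea above, as a rough sketch. It assumes toy 2-dimensional embeddings invented for the example, and it swaps the raw dot product for cosine similarity so that vector length does not influence the ranking. `weighted_rank` and the item names are hypothetical.

```python
import math

PROMPT_WEIGHT = 0.8   # weight of the user's current request
PROFILE_WEIGHT = 0.2  # weight of the user's historical profile


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def weighted_rank(prompt_vec, profile_vec, items):
    """Return items sorted by the weighted sum of prompt and profile similarity."""
    scored = []
    for item in items:
        score = (PROMPT_WEIGHT * cosine(prompt_vec, item["embedding"])
                 + PROFILE_WEIGHT * cosine(profile_vec, item["embedding"]))
        scored.append({**item, "score": score})
    return sorted(scored, key=lambda it: it["score"], reverse=True)


# Toy data: the prompt points along the first axis, the profile along the second
toy_items = [
    {"name": "city hotel", "embedding": [1.0, 0.0]},
    {"name": "beach resort", "embedding": [0.0, 1.0]},
    {"name": "mixed", "embedding": [1.0, 1.0]},
]
ranked = weighted_rank([1.0, 0.0], [0.0, 1.0], toy_items)
print([(r["name"], round(r["score"], 3)) for r in ranked])
```

With these weights the item that matches only the prompt ranks first, the item that partly matches both comes second, and the item that matches only the profile comes last.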


#### b) _Recommend_ : travel logistics (II)

In [4]:
content = f"What flights from Singapore to Kuala Lumpur?"

flight_recommendations = librarian.Traveller.I_recommend_flights(content,
                                                                  topN = 5, 
                                                                #  task_type="retrieval_document")
                                                                 task_type = "retrieval_query")
flight_recommendations

[{'Passage': 'FLIGHT ID: 12, Origin: Singapore, Destination: Kuala Lumpur, Price: from $82 ',
  'Similarity Score': 0.7052937218642239},
 {'Passage': 'FLIGHT ID: 14, Origin: Singapore, Destination: Kuala Lumpur, Price: from $70 ',
  'Similarity Score': 0.7012338995802883},
 {'Passage': 'FLIGHT ID: 13, Origin: Singapore, Destination: Kuala Lumpur, Price: from $153 ',
  'Similarity Score': 0.7012112802307542},
 {'Passage': 'FLIGHT ID: 7, Origin: Kuala Lumpur, Destination: Singapore, Price: from $83',
  'Similarity Score': 0.6982578586274387},
 {'Passage': 'FLIGHT ID: 8, Origin: Kuala Lumpur, Destination: Singapore, Price: from $101',
  'Similarity Score': 0.6953344564516624}]
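
Note that the results above include Kuala Lumpur to Singapore flights even though the query asked for Singapore to Kuala Lumpur: the similarity scores of both directions are almost identical. One possible remedy is a structured post-filter on the `Origin` and `Destination` fields of each passage. The sketch below assumes passages keep the `KEY: value, KEY: value` layout shown in the output; `parse_passage` and `filter_by_route` are hypothetical helpers, not part of `Librarian`.

```python
def parse_passage(passage):
    """Split 'KEY: value, KEY: value' into a dict.

    A comma-separated part without ': ' is treated as a continuation of the
    previous value (e.g. 'Destination: Michigan, Singapore'). Values that
    themselves contain ': ' would be mis-split by this simple approach.
    """
    fields = {}
    last_key = None
    for part in passage.split(", "):
        if ": " in part:
            key, value = part.split(": ", 1)
            last_key = key.strip()
            fields[last_key] = value.strip()
        elif last_key is not None:
            fields[last_key] += ", " + part.strip()
    return fields


def filter_by_route(results, origin, destination):
    """Keep only results whose Origin and Destination match exactly (case-insensitive)."""
    kept = []
    for result in results:
        fields = parse_passage(result["Passage"])
        if (fields.get("Origin", "").lower() == origin.lower()
                and fields.get("Destination", "").lower() == destination.lower()):
            kept.append(result)
    return kept


# Two rows copied from the output above
results = [
    {"Passage": "FLIGHT ID: 12, Origin: Singapore, Destination: Kuala Lumpur, Price: from $82 ",
     "Similarity Score": 0.7052937218642239},
    {"Passage": "FLIGHT ID: 7, Origin: Kuala Lumpur, Destination: Singapore, Price: from $83",
     "Similarity Score": 0.6982578586274387},
]
kept = filter_by_route(results, origin="Singapore", destination="Kuala Lumpur")
print(kept)
```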

In [5]:
content = f"origin Kuala Lumpur, destination Singapore"

flight_recommendations = librarian.Traveller.I_recommend_flights(content,
                                                                 topN = 10, 
                                                                 task_type="retrieval_query")
                                                                #  task_type = "semantic_similarity")
flight_recommendations

[{'Passage': 'FLIGHT ID: 12, Origin: Singapore, Destination: Kuala Lumpur, Price: from $82 ',
  'Similarity Score': 0.7622100129085936},
 {'Passage': 'FLIGHT ID: 8, Origin: Kuala Lumpur, Destination: Singapore, Price: from $101',
  'Similarity Score': 0.7621598208444731},
 {'Passage': 'FLIGHT ID: 9, Origin: Kuala Lumpur, Destination: Singapore, Price: from $89',
  'Similarity Score': 0.7549945336128664},
 {'Passage': 'FLIGHT ID: 10, Origin: Kuala Lumpur, Destination: Michigan, Singapore, Price: from $1200',
  'Similarity Score': 0.7543854603831508},
 {'Passage': 'FLIGHT ID: 7, Origin: Kuala Lumpur, Destination: Singapore, Price: from $83',
  'Similarity Score': 0.7534204782143469},
 {'Passage': 'FLIGHT ID: 14, Origin: Singapore, Destination: Kuala Lumpur, Price: from $70 ',
  'Similarity Score': 0.7520427625441319},
 {'Passage': 'FLIGHT ID: 13, Origin: Singapore, Destination: Kuala Lumpur, Price: from $153 ',
  'Similarity Score': 0.7500689235748557},
 {'Passage': 'FLIGHT ID: 23, Origi

In [7]:
initial_query = "The client Shaik needs to go Singapore with VIP driver for 6 days, with a focus on chinatown for 1 day. then everything else is free and easy"
prompt = f"""You are a prompt engineer for a travel company. Given the following unstructured request from a client: {initial_query}
                        Segment the query into:
                        1) Client
                        2) Client Request
                        3) Flights
                        4) Accomodations
                        5) Activities
                        6) Services, 
                        So for example, if a client request is "The client Shaik needs to go Singapore with VIP driver for 5 days, with a focus on arab/malay street",
                        you would segment it into and return:
                        'client: Shaik
                        client_request: The client Shaik needs to go Singapore with VIP driver for 5 days, with a focus on arab/malay street
                        flights: Singapore airport
                        accomodations: near arab/malay street, Singapore
                        activities: segmented_query:arab/malay street tour
                        services: segmented_query:VIP driver for 5 days'
                        ALL segments must be returned.
                        If any of the segments are not present, then please indicate that the segment is not present and provide fillers. 
                        """
response = librarian.Traveller.model_specialist.prompt(prompt)
response.text

'Client: Shaik\nClient Request: The client Shaik needs to go Singapore with VIP driver for 6 days, with a focus on chinatown for 1 day. then everything else is free and easy\nFlights: Singapore airport\nAccommodations: Singapore\nActivities: Chinatown tour\nServices: VIP driver for 6 days'

In [8]:
# extract the segments
segments = response.text.split("\n")

segments

['Client: Shaik',
 'Client Request: The client Shaik needs to go Singapore with VIP driver for 6 days, with a focus on chinatown for 1 day. then everything else is free and easy',
 'Flights: Singapore airport',
 'Accommodations: Singapore',
 'Activities: Chinatown tour',
 'Services: VIP driver for 6 days']
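
A list of strings is awkward to route to the individual tables, so the segments could be turned into a dictionary keyed by segment name. The sketch below is one way to do that under the assumption that the model keeps returning one `Key: value` pair per line; `parse_segments` is a hypothetical helper, and missing segments are filled with `None`.

```python
EXPECTED_SEGMENTS = ("client", "client_request", "flights",
                     "accommodations", "activities", "services")


def parse_segments(text):
    """Parse 'Key: value' lines into a dict with one entry per expected segment."""
    parsed = {name: None for name in EXPECTED_SEGMENTS}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        key = key.strip().lower().replace(" ", "_")
        if key == "accomodations":  # tolerate the spelling used in the prompt
            key = "accommodations"
        if key in parsed:
            parsed[key] = value.strip()
    return parsed


# The response text shown above
response_text = ("Client: Shaik\n"
                 "Client Request: The client Shaik needs to go Singapore with VIP driver for 6 days, "
                 "with a focus on chinatown for 1 day. then everything else is free and easy\n"
                 "Flights: Singapore airport\n"
                 "Accommodations: Singapore\n"
                 "Activities: Chinatown tour\n"
                 "Services: VIP driver for 6 days")
segments_by_name = parse_segments(response_text)
print(segments_by_name["services"])
```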

In [10]:
# Ask librarian a question regarding one of the specialist databases WITH a chatbot wrapper
# input by agent 
# FUTURE: email thread, entire convo packages - given relevant buckets, eg; dest, duration , budget

# TODO LATER: need relevance score to cut out irrelevant data
travel_proposal = "i need to go Singapore with VIP driver for 5 days, with a focus on arab/malay street"
convo_package = librarian.Traveller.II_recommend_travel_logistics(travel_proposal = travel_proposal,
                                                                   topN=3,
                                                                   chatbot = False, 
                                                                   chatbot_model_name = "gemini-pro")

# Highlight: these outputs are from YOUR preferred buckets (your own inventory) but using gemini/openai


('flights:\n'
 "[{'Passage': 'FLIGHT ID: 10, Origin: Kuala Lumpur, Destination: Michigan, "
 "Singapore, Price: from $1200', 'Similarity Score': 0.5314661234074356}, "
 "{'Passage': 'FLIGHT ID: 7, Origin: Kuala Lumpur, Destination: Singapore, "
 "Price: from $83', 'Similarity Score': 0.5302942126398107}, {'Passage': "
 "'FLIGHT ID: 15, Origin: Singapore, Destination: Michigan, Singapore, Price: "
 "from $1300', 'Similarity Score': 0.5281868848282231}]")
('accomodations:\n'
 "[{'Passage': 'ACCOMODATION ID: 7, Name: Raffles hotel, Type: Hotel, "
 'Location: Singapore, Price per night: 200, Description: An iconic **5-star '
 'luxury beachfront resort**, exudes timeless elegance. After an extensive '
 'restoration, it retains its legendary charm, service, and heritage. Explore '
 'newly opened bars, restaurants, and courtyards. The **Long Bar**, home of '
 'the famous Singapore Sling, returns fully restored. Located in the heart of '
 'the city, close to financial districts, cultural sight
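
For the "relevance score to cut out irrelevant data" TODO in the cell above, one simple option is to drop results that fall below an absolute score or too far behind the best hit. This is only a sketch: `filter_by_relevance` is a hypothetical helper, and both thresholds are arbitrary illustration values that would need calibrating against real queries.

```python
def filter_by_relevance(results, min_score=0.5, max_gap=0.002):
    """Keep results scoring at least min_score and within max_gap of the best score."""
    if not results:
        return []
    best = max(r["Similarity Score"] for r in results)
    return [r for r in results
            if r["Similarity Score"] >= min_score
            and best - r["Similarity Score"] <= max_gap]


# The three flight rows from the output above
flights = [
    {"Passage": "FLIGHT ID: 10, Origin: Kuala Lumpur, Destination: Michigan, Singapore, Price: from $1200",
     "Similarity Score": 0.5314661234074356},
    {"Passage": "FLIGHT ID: 7, Origin: Kuala Lumpur, Destination: Singapore, Price: from $83",
     "Similarity Score": 0.5302942126398107},
    {"Passage": "FLIGHT ID: 15, Origin: Singapore, Destination: Michigan, Singapore, Price: from $1300",
     "Similarity Score": 0.5281868848282231},
]
relevant = filter_by_relevance(flights)
print(len(relevant))
```

The scores here sit within about 0.003 of each other, so a score cut-off alone cannot separate good from bad matches: the top-ranked "Michigan, Singapore" row survives the filter. A score threshold probably needs to be combined with structured checks on fields such as origin and destination.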

#### c) _*Generate*_ : travel proposal (itinerary) from customer request

In [6]:
# Ask librarian a question regarding one of the specialist databases
travel_proposal_request = "i need to go Singapore for a week, like in this image. Please plan 3 day trip for this country (with 1 day specifically for the location in the image), then a free and easy day where you can fill it up with as much activities as possible from the "
convo_package = librarian.Traveller.II_generate_travel_proposal(input_prompt = travel_proposal_request,
                                                                model_name = "gemini-pro",
                                                                image_path = "./database/travel/sometiktokss.jpg")


(' The image is of the Sultan Mosque in Singapore. It is a beautiful mosque '
 'located in the Kampong Glam district. The mosque is open to visitors from '
 'all faiths and is a popular tourist destination.\n'
 '\n'
 'Day 1:\n'
 '\n'
 '* Arrive in Singapore and check into your hotel.\n'
 '* Take a walk around the Kampong Glam district and visit the Sultan Mosque.\n'
 '* Have dinner at a traditional Malay restaurant.\n'
 '\n'
 'Day 2:\n'
 '\n'
 '* Visit the Gardens by the Bay.\n'
 '* Take a boat ride on the Singapore River.\n'
 '* Have dinner at a seafood restaurant.\n'
 '\n'
 'Day 3:\n'
 '\n'
 '* Visit the Universal Studios Singapore theme park.\n'
 '* Go shopping at the Orchard Road shopping district.\n'
 '* Have dinner at a rooftop restaurant with views of the city.\n'
 '\n'
 'Free and easy day:\n'
 '\n'
 '* Visit the Singapore Zoo.\n'
 '* Go to the Singapore Botanic Gardens.\n'
 '* Take a walk through the Chinatown district.\n'
 '* Have dinner at a Peranakan restaurant.')


In [None]:
#TODO: flawed or hallucinated responses have to be stored and used as bad examples for future responses
# e.g. this package is hallucinating discount vouchers at Orchard Road: this is a bad example and should be flagged as such
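
A minimal sketch of how flagged responses might be stored and replayed as negative examples in later prompts. It assumes a JSON Lines file is good enough for storage; `BadExampleStore`, its methods and the file name are hypothetical and not part of `Librarian`.

```python
import json
import tempfile
from pathlib import Path


class BadExampleStore:
    """Append-only JSON Lines log of responses flagged as flawed or hallucinated."""

    def __init__(self, path):
        self.path = Path(path)

    def flag(self, query, response, reason):
        """Record one flagged response together with the reason it was rejected."""
        record = {"query": query, "response": response, "reason": reason}
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def load(self):
        """Return all flagged records, oldest first."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def as_prompt_block(self, limit=3):
        """Format the most recent reasons as text that can be appended to a prompt."""
        examples = self.load()[-limit:]
        if not examples:
            return ""
        lines = ["Avoid repeating these flagged mistakes:"]
        lines.extend(f"- {ex['reason']}" for ex in examples)
        return "\n".join(lines)


# Temporary directory so the example leaves no files behind in the project
store = BadExampleStore(Path(tempfile.mkdtemp()) / "bad_examples.jsonl")
store.flag(query="5 days in Singapore",
           response="(flagged package text goes here)",
           reason="Invented discount vouchers at Orchard Road")
print(store.as_prompt_block())
```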

In [None]:
# # SELECT SPECIALIST DATABASE
# librarian.select_specialist(specialist = "muslim", specialist_LLM_model = "GEMINI")
# librarian.Muslim.load_data_model(reembed = False,
#                                     data_model_keys = {"Quran":"SURAH ID",
#                                                         "Hadith":"HADITH ID",
#                                                         "Tafsir":"TAFSIR ID",
#                                                         }
#                                     )

Key Notes:
at 35:34
Text splitting has to be done by relevance -> embeddings_search -> topN -> chain back to LLM -> talk to AI (+relevance)
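
A toy sketch of that flow (split -> embed -> topN -> prompt for the LLM). Assumptions: a bag-of-words `Counter` stands in for a real embedding model, the chunks are made-up one-line passages, and every function name here is hypothetical.

```python
import math
from collections import Counter


def split_text(text, max_words=12):
    """Naive splitter: fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Stand-in embedding: lower-cased word counts."""
    return Counter(w.strip(".,?!").lower() for w in text.split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_n(query, chunks, n=2):
    """Return the n (score, chunk) pairs most similar to the query."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks),
                    key=lambda pair: pair[0], reverse=True)
    return scored[:n]


def build_prompt(query, retrieved):
    """Chain the retrieved chunks (with their relevance) back into the LLM prompt."""
    context = "\n".join(f"[{score:.2f}] {chunk}" for score, chunk in retrieved)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


chunks = ["Sultan Mosque is in Kampong Glam",
          "Flights from Kuala Lumpur to Singapore",
          "Raffles hotel is in Singapore"]
query = "flights to Singapore"
retrieved = top_n(query, chunks, n=2)
prompt = build_prompt(query, retrieved)
print(prompt)
```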

at 41:30
Okay nice, he is explaining RAG nicely:
my thoughts at that timestamp:
TECH LEVELS:
1) image + text  -> text 
    - No context: LLM only
    - eg; LLM("Hey AI, give me information on X")
2) image + text -> text  
    - Basic Contextualist: Context sample + LLM
    - eg; LLM("Hey AI, give me information on X, given (Context sample)")
3) image + text -> text 
    - Big Contextualist: Context Population + Recommendation engine + LLM
    - eg; RAG("Hey AI, give me information on X", Librarian(topN, X, Context Population))

at 44:31: yes, nice explanation of dimensionality reduction


at 46:55: have to classify all available LLMs by
matching capabilities, (based on Baijun's picture for example, CLAUDE has best reasoning --> does this imply better model embeddings?)
COST (token, contextualising cost)


at 51:38:
matching parameters (similarity)
cosine is the most popular, but could other techniques have a better edge on certain data?? (to investigate)
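
A small sketch for that investigation, comparing three common measures on made-up 2-D vectors. For unit-normalised embeddings, cosine, dot product and Euclidean distance all produce the same ranking; they only disagree when vector lengths differ, which is what the toy vectors below are chosen to show.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 0.0]
short_match = [0.5, 0.0]    # same direction as the query, but short
long_offaxis = [3.0, 3.0]   # 45 degrees off the query, but long

for name, vec in [("short_match", short_match), ("long_offaxis", long_offaxis)]:
    print(name,
          "cosine=%.3f" % cosine(query, vec),
          "dot=%.3f" % dot(query, vec),
          "euclidean=%.3f" % euclidean(query, vec))
```

Cosine and Euclidean distance prefer `short_match`, while the raw dot product prefers `long_offaxis` purely because of its length.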



at 53:50:
VECTOR STORE
db for embeddings/vectors
to store all documents, auto-generate embeddings, store automatically
!!! Vector stores are optimised to do similarity search really WELL
sure anot, what optimisation? should we build everything in-house to not be dependent? all this just funnels us into paying for pinecone. but yea gotta check cost-benefit here
i've already made everything up till 56 mins from scratch, we already have it.
just maybe vector stores. must compare pricing and efficacy of available vector stores
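
For the in-house option, a bare-bones in-memory vector store could look like the sketch below. It is a brute-force scan over every stored vector, with hypothetical names throughout; dedicated vector stores typically add approximate nearest-neighbour indexes so that search stays fast on large collections, which is the main thing a cost-benefit comparison would need to weigh.

```python
import math


class InMemoryVectorStore:
    """Stores unit-normalised vectors and returns the top_n closest by cosine similarity."""

    def __init__(self):
        self._rows = []  # list of (doc_id, unit_vector, text)

    @staticmethod
    def _normalise(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        if norm == 0:
            raise ValueError("cannot normalise a zero vector")
        return [x / norm for x in vector]

    def add(self, doc_id, vector, text):
        self._rows.append((doc_id, self._normalise(vector), text))

    def search(self, query_vector, top_n=3):
        """Brute-force search: score every row, then sort by score descending."""
        q = self._normalise(query_vector)
        scored = [(sum(a * b for a, b in zip(q, vec)), doc_id, text)
                  for doc_id, vec, text in self._rows]
        scored.sort(key=lambda row: row[0], reverse=True)
        return scored[:top_n]


# Toy 2-D vectors, invented for the example
store = InMemoryVectorStore()
store.add("F7", [1.0, 0.0], "flight passage")
store.add("A7", [0.0, 1.0], "accommodation passage")
hits = store.search([0.9, 0.1], top_n=1)
print(hits)
```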


at 59:04:
VECTOR STORE features
vector_store.as_retriever
connects directly to vector_store to get topN
langchain is a wrapper for LLMs. but gemini already has this feature built in.
scope: we aggregate all these LLM services, but remain portable and microservices-oriented, as any one of them could be bought over by corps (no new updates), etc

61:09 onwards is dry pinecone stuff
but yea, it could have features we must replicate/generalise

at 1:01:33
LangChain codebase/structure highlights
prompts themselves are CLASSES, which are modularly attached to RETRIEVER_CLASS
whereas mine is functions only. Need persistence!!

at 1:04:20:
LangChain codebase highlights
vector_stores are wrapped in highly functional classes
whereas mine is all objects


at 1:06:46:
LangChain UX and ease of "chaining":
(context + prompt + model + parser ) + prompt
so its an infinite way to arrange this.
this is basically modelling conversations with a librarian, and then to self, then writing to paper, then retalk again to librarian, then write again.
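
The "(context + prompt + model + parser) + prompt" idea can be sketched with plain function composition. This is not LangChain's API: `chain` and every step below are hypothetical, and `fake_model` merely echoes the last prompt line in place of a real LLM call.

```python
from functools import reduce


def chain(*steps):
    """Compose steps left to right: the output of one step feeds the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def add_context(query):
    # A real retriever would fetch this; hard-coded here for the sketch
    return {"query": query, "context": "Sultan Mosque is in Kampong Glam."}


def to_prompt(state):
    return f"Context: {state['context']}\nQuestion: {state['query']}"


def fake_model(prompt):
    # Stand-in for an LLM call: echoes the last line of the prompt
    return "ANSWER: " + prompt.splitlines()[-1]


def parse(text):
    return text.split("ANSWER: ", 1)[-1].strip()


# (context + prompt + model + parser)
pipeline = chain(add_context, to_prompt, fake_model, parse)
result = pipeline("Where is Sultan Mosque?")

# ... + prompt: a finished chain is itself a step in a longer chain
followup = chain(pipeline, lambda previous: f"Refine this answer: {previous}")
print(followup("Where is Sultan Mosque?"))
```

Because a chain is just another callable, chains nest freely, which is what makes the number of possible arrangements effectively unbounded.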

at 1:09:30
!!! We need to have an AI DATA MODEL map
where we map out the AI conversations with respect to mapping to different specialist databases, human prompts, - its a conversation web modelling


at 1:12:12
wrapup:
nice video on RAG, will have to use whisper for youtube
FUTURE: can even consider other data inputs
 images (we have)
sound + video frames
UI: how? screenshot? video download? links?
AI DATA MODEL:
Process flow of conversation with Librarian and User-
prompt engineering to meet expected outputs (QA)
How to model conversations effectively (with constraints on COST/SPEED OF QUERY)
optimisation? graph theory? or even use a Graph convolutional network on top of all the web to optimise conversation path and prompt engineering?
this is athena i think and all those other company internal AI for boosting efficiency


Post video: 
IDEA:
lol if a RAG can do exactly my Youtube Keynotes taking here with analyses,
analysts will be hyperturbocharged liao
but yea depends on specialist database. so ultimately needs an SME of SME, or a specialist of the specialist database to curate this database. but yea can go even higher dimensions, reach all fields, to reach true GAI
Which is what LLM+search engines (bing copilot) aspire to do? try to be the new google across all domains, with references,
but not just for information querying, also for user experience, like Samsung AI
which is making user interface more seamless with AI augmentations.

so UI UX side, we should focus on what samsung is doing:
these are the AI funnels:

"You
galaxy ai features
Copilot
Certainly! Here are some of the features of Galaxy AI:

Circle to Search: This feature allows you to draw a circle around any object in an image, and Galaxy AI will search for relevant information related to that object.
Browsing Assist: Galaxy AI can assist with web browsing by summarizing articles, providing relevant links, and answering questions based on web content.
Transcript Assist: When you upload an audio or video file, Galaxy AI can transcribe the spoken content and provide a text-based summary.
Photo Assist: Upload an image, and Galaxy AI can describe its contents, identify objects, and provide context.
Generative Edit: Galaxy AI can help generate creative content, such as poems, stories, code, and more.
Call Assist/Live Translate: During phone calls, Galaxy AI can provide real-time translation or assist with conversation summaries.
Chat Assist: Engage in text-based conversations with Galaxy AI, which can respond contextually and provide helpful information.
Note Assist: Galaxy AI can summarize notes, generate outlines, and assist with organizing information."

then formulate all of the thinking (and whatever lacking) behind my chats above into a RAG - then we have an AI business analyst LOL




In [None]:
keynotes_log = """
Key Notes:
at 35:34
text Splitting has to be done by relevance ->embeddings_search --> topN -> chain back to LLM --> talk to AI (+relevance)

at 41:30
Okay nice, he is explanining RAG nicely :
my thoughts at that timestamp:
TECH LEVELS:
1) image + text  -> text 
    - No context: LLM only
    - eg; LLM("Hey AI, give me information on X")
2) image + text -> text  
    - Basic Contextualist: Context sample + LLM
    - eg; LLM("Hey AI, give me information on X, given (Context sample)")
3) image + text -> text 
    - Big Contextualist: Context Population + Recommendation engine + LLM
    - eg; RAG("Hey AI, give me information on X", Librarian(topN, X, Context Population))

at44:31, yes nice explanation of dimensionality reduction


at 46:55: have to classify all avialable LLMs by
matching capabilities, (based on Baijun's picture for example, CLAUDE has best reasoning --> does this imply better model embeddings?)
COST (token, contextualising cost)


at 51:38:
macthing parameters (similiarity)
cosine most popular, but could there be other techiques could have better edge at certain data?? (to investigate)



at 53:50:
VECTOR STORE
db for embeddings/vectors
to store all documents, auto gen emebddings, store auto
!!! Vector stores are optimised to do similiarity really WELL
sure anot, what optimisation, should we built everything inhouse to not be dependant. all this just funnels to get to pay for pinecone only. but yea gotta check cost-benefit here
i've already make everything up till 56mins from scratch already, we already have.
just maybe vector stores. must compare pricing and efficacy of available vector stores


at 59:04:
VECTOR STORE features
vector_store.as_retriever
connects directly to vector_store to get topN
langchain is a wrapper for LLMs. but gemini already has this feature built in.
scope: we aggregate all this LLM services. but remain portable and Microservices oriented as any one of them cld be bought over by corps (no new updates), etc

61:09 onwards is dry pinecone stuff
but yea could have features must replicate/generalise

at 1:01:33
LangChain codebase/structure highlights
prompts themselves are CLASSES, which are modularly attached to RETRIEVER_CLASS
whereas mine is functions only. Need persistence!!

at1:04:20:
langchain codebase highlights
vector_stores are wrapped in highly functionable classes
whereas mine is all objects


at 1:06:46:
Langchain UX experience and ease of "chaining":
(context + prompt + model + parser ) + prompt
so its an infinite way to arrange this.
this is basically modelling conversations with a librarian, and then to self, then writing to paper, then retalk again to librarian, then write again.

at1:09:30
!!! We need to have an AI DATA MODEL map
where we map out the AI conversations with respect to mapping to different specialist databases, human prompts, - its a conversation web modelling


at 1:12:12
wrapup:
nice video on RAG, will have to use whisper for youtube
FUTURE: can even consider other data inputs
 images (we have)
sound + video frames
UI: how? screenshot? video download? links?
AI DATA MODEL:
Process flow of conversation with Librarian and User-
prompt engineering to meet expected outputs (QA)
How to model conversations effectively (with constraints on COST/SPEED OF QUERY)
optimisation? graph theory? or even use a Graph convolutional network on top of all the web to optimise conversation path and prompt engineering?
this is athena i think and all those other company internal AI for boosting efficiency


Post video: 
IDEA:
lol if a RAG can do exactly my Youtube Keynotes taking here with analyses,
analysts will be hyperturbocharged liao
but yea depends on specialist database. so ultimately needs an SME of SME, or a specialist of the specialist database to curate this database. but yea can go even higher dimensions, reach all fields, to reach true GAI
Which is what LLM+search engines (bing copilot) aspire to do? try to be the new google across all domains, with references,
but not for just for information querying but for user experience, like Samsung AI
which is making user interface more seamless with AI augmentations.

so UI UX side, we should focus on what samsung is doing:
these are the AI funnels:

"You
galaxy ai features
Copilot
Certainly! Here are some of the features of Galaxy AI:

Circle to Search: This feature allows you to draw a circle around any object in an image, and Galaxy AI will search for relevant information related to that object.
Browsing Assist: Galaxy AI can assist with web browsing by summarizing articles, providing relevant links, and answering questions based on web content.
Transcript Assist: When you upload an audio or video file, Galaxy AI can transcribe the spoken content and provide a text-based summary.
Photo Assist: Upload an image, and Galaxy AI can describe its contents, identify objects, and provide context.
Generative Edit: Galaxy AI can help generate creative content, such as poems, stories, code, and more.
Call Assist/Live Translate: During phone calls, Galaxy AI can provide real-time translation or assist with conversation summaries.
Chat Assist: Engage in text-based conversations with Galaxy AI, which can respond contextually and provide helpful information.
Note Assist: Galaxy AI can summarize notes, generate outlines, and assist with organizing information."

then formulate all of the thinking (and whatever lacking) behind my chats above into a RAG - then we have an AI business analyst LOL



"""

# KnowledgeBase - Financial Data Time Series
LLM image analysis --> analyse_travel_destination --> Concierge


## a) PROMPT input

In [None]:
date = "2024-03-04"
prompt_0 = f"""
What is the current price of Bitcoin for today ({date})? What is the latest financial news, outlook on the crypto market, 
and also contrast with the traditional stock market and money market.
"""
from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

response = g_handler.prompt(prompt_0)
print(response.text)

## b) Image input: Multimodal LLM image analysis

In [None]:
# image_path = "./database/quant/btcusdt_1d_IST.png"
image_path = "./database/quant/buddy_chinese.png"

prompt_1 = f"""
            You are an expert analyst of AI-generated images in an AI ethics and misuse department.
            You are given images and asked to analyse whether they are AI generated or not,
            and to give your reasoning.
            """
prompt_2 = f"""If the image is AI generated, provide a confidence score from 0 to 100,
            and recreate the prompt that generated the image. 
            If the image is not AI generated, then craft a single paragraph, descriptive prompt that could result in a similar image.
"""
# prompt_2v = f"""If the image is AI generated, provide a confidence score from 0 to 100,
# "The expert AI generated image analyst response to the prior prompt ({prompt_1}) is: 
# <trader's RESPONSE>".
# "My own thoughts on his response is: 
# <vibe_checker's RESPONSE>".
# "Confidence: <YOUR_SCORE>".
# """
from IPython.display import Image, display
display(Image(filename=image_path))
from pprint import pprint
from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

g_response = g_handler.prompt_image(model_name = "gemini-pro-vision",
                                  image_path = image_path,
                                  prompt_1 = prompt_1,
                                  prompt_2 = prompt_2,
                                  generation_config = {
                                                        "temperature": 0.9,
                                                        "top_p": 0.95,
                                                        "top_k": 40,
                                                        "max_output_tokens": 1024,
                                                        },
                                block_threshold = "BLOCK_NONE"
                                        )
# print this in a nicer format and not stretched beyond the screen
pprint(g_response.text)
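
Since `prompt_2` asks for a confidence score from 0 to 100, that score can be pulled out of the free-text reply for later use. The sketch below is only illustrative: `extract_confidence` is a hypothetical helper, and it assumes the model writes the number shortly after the word "confidence", which the prompt above does not guarantee.

```python
import re
from typing import Optional


def extract_confidence(response_text: str) -> Optional[int]:
    """Return the first 0-100 score that follows the word 'confidence', or None."""
    # Assumed convention (not enforced by the prompt): "Confidence: 85"
    # or "confidence score of 85". Up to 20 non-digit characters may sit in between.
    match = re.search(
        r"confidence(?:\s+score)?\D{0,20}?(\d{1,3})(?!\d)",
        response_text,
        re.IGNORECASE,
    )
    if match is None:
        return None
    score = int(match.group(1))
    # Reject values outside the range requested in the prompt.
    return score if 0 <= score <= 100 else None


sample_reply = "The image looks AI generated. Confidence: 85. Likely prompt: a gray tabby cat."
print(extract_confidence(sample_reply))
```

If the reply carries no recognisable score the helper returns `None`, so the caller can decide whether to re-prompt.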

In [None]:

Generate an image with the following description: 
 "The photograph is of a gray tabby cat being held by a person. The cat's eyes "
 'are partially closed and it looks relaxed. The photograph is taken from a '
 'close-up angle and the background is out of focus.\n'
 '\n'
 'To take a similar photograph, you would need a camera and a cat. You would '
 'need to get close to the cat and take the photograph when it is relaxed. You '
 'could also use a telephoto lens to get a closer shot.')

COMMENTS

The code snippet you provided attempts to utilize the Gemini-Pro-Vision model for image analysis in the context of a financial time series chart. However, there are several limitations and concerns to consider:

**Limitations:**

1. **Model capabilities:** While Gemini-Pro-Vision is trained on a massive dataset of text and code, it isn't specifically designed for analyzing financial charts. Its ability to interpret the image and provide the specific descriptions, diagnoses, predictions, and prescriptions you require might be limited.

2. **Data complexity:** Financial charts contain complex information encoded in lines, colors, and patterns. Capturing this information and translating it into accurate financial insights requires specialized training data and models.

**Concerns:**

1. **Over-reliance on LLM:** The code fully relies on the LLM's interpretation of the image, leaving no room for human expertise or domain knowledge. This can lead to inaccurate or misleading financial insights.

2. **Ethical considerations:** Using an LLM for financial predictions without proper validation and transparency can be risky. It's essential to be clear about the limitations of the system and emphasize that its output should not be solely relied upon for investment decisions.

3. **Potential misuse:** This approach might encourage users to interpret the LLM's output as financial advice, even though it might lack the necessary accuracy and context.

**Alternative Approach:**

Instead of relying solely on an LLM, consider a more **hybrid approach**:

1. **Human Analysis:** Involve a human financial analyst to analyze the chart and provide their expertise. This can help ensure the accuracy and reliability of the initial assessment.

2. **LLM Support:** Use the LLM to **complement the human analysis** by:
    * Summarizing relevant financial news or historical data related to the instrument.
    * Providing alternative perspectives or highlighting potential risks.
    * Generating different creative writing formats (e.g., a news article about the market situation) based on the analyst's insights.

This approach leverages the strengths of both humans and LLMs, potentially leading to more robust and reliable insights while mitigating ethical concerns.

It's crucial to remember that LLMs are still under development, and their capabilities in specialized domains like finance are evolving. While they can be valuable tools, exercising caution and combining them with human expertise remains essential, especially when dealing with financial decision-making.

## c) Inventory Recommendation Engine


i) Embed database

ii) embed user_travel_LLM_response above

iii) AI recommendation engine


### i) Embed -> Recommend Inventory

In [None]:
db_path = "./database/travel/Data Model for Travel.xlsx"
sheet_name = "Day Trip"

from settings import GEMINI_API_KEY
from llm_handler import GHandler
import importlib
importlib.reload(GHandler)
from llm_handler.GHandler import GHandler

import pandas as pd

day_trips_df = pd.read_excel(db_path, sheet_name=sheet_name)
# Build one text field per row; this is the text that gets embedded
day_trips_df["Text"] = day_trips_df.apply(lambda row: f"Activity: {row['Activity']}, Location: {row['Location']}, Category: {row['Category']}, Price: {row['Price']}", axis=1)


g_handler = GHandler(GEMINI_API_KEY)
df_embedded = g_handler.embed_df(day_trips_df,
                                 title = "Activity", 
                                 text = "Text",
                                 model="models/embedding-001")
daytrip_recommendation = g_handler.find_best_passage(g_response.text, df_embedded)
print(daytrip_recommendation)
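
`GHandler.find_best_passage` belongs to this project and its implementation is not shown in this notebook. As a rough sketch of the usual idea, assuming it ranks rows by similarity between the query embedding and each row embedding: the toy vectors, the passages, and the `best_passage` helper below are all made up for illustration, and cosine similarity is an assumption about the scoring.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_passage(query_vec: Sequence[float],
                 passages: List[Tuple[str, Sequence[float]]]) -> str:
    """Return the text of the passage whose embedding is closest to the query."""
    return max(passages, key=lambda p: cosine(query_vec, p[1]))[0]


# Toy 3-dimensional "embeddings"; real embedding vectors are much longer.
toy_passages = [
    ("Activity: Snorkelling, Location: Nusa Penida", [0.9, 0.1, 0.0]),
    ("Activity: Temple tour, Location: Ubud", [0.1, 0.9, 0.0]),
]
toy_query = [0.8, 0.2, 0.0]
print(best_passage(toy_query, toy_passages))
```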

## d) Cascade Recommendation Engine

Given the user's primary match, the recommendation engine will look up secondary and tertiary recommendations


In [None]:
# Get embeddings for the hotels and flights
flights_hotels_df = pd.read_excel(r'./database/travel/Data Model for Travel.xlsx', sheet_name="Hotels")
# Drop rows where Type is NaN
flights_hotels_df = flights_hotels_df.dropna(subset=["Type"])
flights_hotels_df["Text"] = flights_hotels_df.apply(lambda row: f"Name: {row['Name']}, Type: {row['Type']}, Location: {row['Location']}, Price: {row['Price']}, Description: {row['Description']}", axis=1)


g_handler = GHandler(GEMINI_API_KEY)
flights_hotels_df = g_handler.embed_df(flights_hotels_df,
                                 title = "Name", 
                                 text = "Text",
                                 model="models/embedding-001")


In [None]:
Convo_2 = f"""I have an inventory for a {daytrip_recommendation}. I can recommend you a more luxurious hotel based on the location of the activity."""
hotel_recommendation = g_handler.find_best_passage(Convo_2, flights_hotels_df)
print(hotel_recommendation)

In [None]:
Convo_3 = f"""I have an inventory for a {daytrip_recommendation}. I can recommend you the longest flight based on the location of the activity."""
flight_recommendation = g_handler.find_best_passage(Convo_3, flights_hotels_df)
print(flight_recommendation)
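
Both cascade lookups above search the same table, so a hotel query can in principle return a flight row and vice versa. One possible guard, sketched under assumptions: filter by the `Type` column before ranking. The `Type` values ("Hotel", "Flight"), the row names, and the `best_of_type` helper are hypothetical, and the dot product stands in for whatever similarity the handler actually uses.

```python
from typing import Dict, List, Optional, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product, used here as a stand-in similarity score."""
    return sum(x * y for x, y in zip(a, b))


def best_of_type(query_vec: Sequence[float],
                 rows: List[Dict],
                 wanted_type: str) -> Optional[Dict]:
    """Rank only the rows of the wanted type; return None when there are none."""
    candidates = [row for row in rows if row["Type"] == wanted_type]
    if not candidates:
        return None
    return max(candidates, key=lambda row: dot(query_vec, row["Embedding"]))


# Toy inventory with made-up names and 2-dimensional embeddings.
toy_rows = [
    {"Name": "Cliffside Villa", "Type": "Hotel", "Embedding": [0.9, 0.1]},
    {"Name": "City Hostel", "Type": "Hotel", "Embedding": [0.2, 0.8]},
    {"Name": "Morning Flight", "Type": "Flight", "Embedding": [1.0, 0.0]},
]
toy_query = [1.0, 0.0]

# Without the filter the flight row would score highest for this query.
print(best_of_type(toy_query, toy_rows, "Hotel")["Name"])
print(best_of_type(toy_query, toy_rows, "Flight")["Name"])
```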

In [None]:
flights_hotels_df

In [None]:
# Wrapper that inserts
# daytrip + hotel + flight into a 3-day, 2-night itinerary:
# first day is filled with casual eating and sightseeing, 2nd day is the daytrip focus, and 3rd day is the return flight.
# The LLM has to be smart enough to infer the duration of the daytrip (hiking Mt Rinjani takes 2 days, so it should be 2 days 1 night)

# Eg: Himalayas
# Has to account for weather, seasons, and other factors that might affect the trip duration, or postpone the trip until conditions are better
# TESTCASE: NO inventory for customer request
# TESTCASE: transport types (ferries, etc)

# Look at all factor buckets that involve inventory
# Buckets to quantify the inventory:
# main factor is affiliation, then services (driver, villa-cook, guide, translator), then location, then price, then type, then rating, then availability, then duration, then weather, then season, then other factors
# If not affiliated then it shouldn't be recommended
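
The bucket idea in the notes above (affiliation as a hard filter, then the remaining factors in priority order) could be sketched as follows. This is a hypothetical illustration only: `InventoryItem`, its fields, and the three-factor sort key are assumptions, and only a few of the listed buckets are modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InventoryItem:
    name: str
    affiliated: bool      # hard requirement: unaffiliated items are never recommended
    location_match: bool  # True when the item is near the requested activity
    price: float
    rating: float


def rank_inventory(items: List[InventoryItem]) -> List[InventoryItem]:
    """Drop unaffiliated items, then order by location, then price, then rating."""
    eligible = [item for item in items if item.affiliated]
    # False sorts before True, so location matches come first;
    # cheaper comes first; a higher rating breaks ties.
    return sorted(eligible, key=lambda i: (not i.location_match, i.price, -i.rating))


toy_items = [
    InventoryItem("Partner Villa", True, True, 300.0, 4.5),
    InventoryItem("Partner Hostel", True, False, 50.0, 4.0),
    InventoryItem("Unaffiliated Resort", False, True, 200.0, 4.9),
    InventoryItem("Partner Lodge", True, True, 250.0, 4.2),
]
ranked_names = [item.name for item in rank_inventory(toy_items)]
print(ranked_names)

# "No inventory for customer request" test case: nothing eligible gives an empty list.
print(rank_inventory([toy_items[2]]))
```

An empty result is the signal for the "no inventory" test case, so the caller can say so instead of recommending an unaffiliated item.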




In [None]:
# top n recommendations (pro consumer vs pro business)

# need to see if they have an existing monetisation strategy for their travel packages
# recommendations should be tailored to their monetisation strategy
# (eg: if they have a partnership with a hotel, then recommend that hotel)


# main objective: recommendation engine 
# warning: no buckets for existing monetisation strategy
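
One way to picture the "pro consumer vs pro business" trade-off for top-n recommendations is a single weight that boosts partner inventory. This is a hypothetical sketch: the `similarity` and `partner` fields, the candidate names, and the `business_weight` knob are assumptions, not part of the existing data model.

```python
from typing import Dict, List


def top_n(candidates: List[Dict], n: int = 3, business_weight: float = 0.0) -> List[str]:
    """Return the names of the n best candidates.

    business_weight = 0.0 ranks purely on match quality (pro consumer);
    larger values push partner inventory up the list (pro business).
    """
    def score(candidate: Dict) -> float:
        boost = business_weight if candidate["partner"] else 0.0
        return candidate["similarity"] + boost

    ranked = sorted(candidates, key=score, reverse=True)
    return [c["name"] for c in ranked[:n]]


toy_candidates = [
    {"name": "Hotel A", "similarity": 0.80, "partner": False},
    {"name": "Hotel B", "similarity": 0.75, "partner": True},
    {"name": "Hotel C", "similarity": 0.60, "partner": True},
]
print(top_n(toy_candidates, n=2))                       # pro consumer
print(top_n(toy_candidates, n=2, business_weight=0.1))  # pro business
```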

In [None]:
daytrip_recommendation

In [None]:
hotel_recommendation

In [None]:
flight_recommendation

In [None]:
UX_prompt = f"""
Generate a travel package for the given trip_recommendation:
{daytrip_recommendation}

This trip_recommendation comes with the following hotel recommendation:
{hotel_recommendation}

This trip_recommendation comes with the following flight recommendation:
{flight_recommendation}

The package should include the following sections:

"Summary"
An introduction and summary of the trip in one paragraph.
The summary should describe in vivid detail the main attractions, activities, and experiences that the travelers can enjoy in the trip_recommendation. 

"Journey Highlights"
A list of the main features and most exciting aspects of the package.
The highlights must end with a bold line: "Your journey takes you to: x - y - z"

"Itinerary & Map" 
This section is an itinerary list that shows the day-by-day plan of the trip, accompanied by a map.

The itinerary should include the name, location, and description of each place or activity that the travelers will visit or do each day. 
The itinerary should also indicate the approximate duration and transportation mode for each item.

A highlights and inclusions section that lists the main features and benefits of the package. 
The section should mention what is included in the price, such as flights, accommodation, meals, guides, entrance fees, etc. 
The section should also mention any special offers or discounts that are available for the package.
A dates and pricing section that shows the available dates and prices for the package. 
The section should indicate the departure and return dates, the number of travelers, the total cost, and the payment options for the package. 
The section should also provide a link or contact information for booking or inquiring about the package.

All information derived here should be based on the recommendations from the previous steps and MUST not be fabricated.

"""

In [None]:
from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

response_UX = g_handler.prompt(UX_prompt)
print(response_UX.text)
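
Because `UX_prompt` names the sections the package must contain, the reply can be checked for them before it is shown to a user. A minimal sketch, assuming the model repeats the section titles verbatim; `missing_sections` and the sample reply are hypothetical.

```python
from typing import List, Sequence

# Section titles quoted in UX_prompt.
REQUIRED_SECTIONS = ("Summary", "Journey Highlights", "Itinerary & Map")


def missing_sections(package_text: str,
                     required: Sequence[str] = REQUIRED_SECTIONS) -> List[str]:
    """Return the required section titles that do not appear in the text."""
    lowered = package_text.lower()
    return [title for title in required if title.lower() not in lowered]


# Made-up reply that forgets the itinerary section.
sample_package = """**Summary**
A three-day escape built around a day trip.

**Journey Highlights**
- Cliff viewpoints
"""
print(missing_sections(sample_package))
```

A non-empty result could trigger a re-prompt rather than passing an incomplete package on.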

In [None]:
# Format mimic from image of travel website package page

In [None]:
image_path = "./database/travel/lux_example.jpg"
prompt_1 = f"""Reproduce the format of this travel package given a different destination: 
Generate a travel package for the given trip_recommendation:
{daytrip_recommendation}

This trip_recommendation comes with the following hotel recommendation:
{hotel_recommendation}

This trip_recommendation comes with the following flight recommendation:
{flight_recommendation}

"""

from IPython.display import Image, display
# display(Image(filename=image_path))

from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

g_response = g_handler.prompt_image(model_name = "gemini-pro-vision",
                                  image_path = image_path,
                                  prompt_1 = prompt_1,
                                  prompt_2 = None)
print(g_response.text)

## e) Full Stack

In [None]:
image_path = "./database/travel/me-at-kelingking-beach-nusa-penida-bali-inonesia-laugh-traveleat.jpg"
prompt_1 = "Tell me the location where this photo is taken from?"
prompt_2 = "Based on the response, recommend a full day trip travel itinerary"
from IPython.display import Image, display
print("(a) Image Analyses")
display(Image(filename=image_path))


import pandas as pd
from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

g_response = g_handler.prompt_image(model_name = "gemini-pro-vision",
                                  image_path = image_path,
                                  prompt_1 = prompt_1,
                                  prompt_2 = prompt_2)
# print(g_response.text)
print("(b) Image Analyses Complete")

db_path = "./database/travel/Data Model for Travel.xlsx"
sheet_name = "Day Trip"

def get_day_trip_recommendation(g_response, db_path, sheet_name):
    day_trips_df = pd.read_excel(db_path, sheet_name=sheet_name)
    # Build one text field per row; this is the text that gets embedded
    day_trips_df["Text"] = day_trips_df.apply(lambda row: f"Activity: {row['Activity']}, Location: {row['Location']}, Category: {row['Category']}, Price: {row['Price']}", axis=1)


    g_handler = GHandler(GEMINI_API_KEY)
    df_embedded = g_handler.embed_df(day_trips_df,
                                    title = "Activity", 
                                    text = "Text",
                                    model="models/embedding-001")
    daytrip_recommendation = g_handler.find_best_passage(g_response.text, df_embedded)
    return daytrip_recommendation

In [None]:
# Run the day-trip lookup on the image analysis above, using the helper defined in the previous cell
daytrip_recommendation = get_day_trip_recommendation(g_response, db_path, sheet_name)
print(daytrip_recommendation)

# Talk to your Fitra AI 

So imagine you are walking home from work, on your phone.
Then you see a website with some inspiring duas, and you want to analyse and
understand their meaning,
and find contextually similar duas or information.
eg:
- "Rabbana atina fid-dunya hasanatan wa fil 'akhirati hasanatan waqina 'adhaban-nar"
- "Our Lord, give us in this world [that which is] good and in the Hereafter [that which is] good and protect us from the punishment of the Fire."
- then you want to find similar duas, or background information about this dua, such as its origins and hadith chains





TECH LEVELS:
1) image + text  -> text 
    - No context: LLM only
    - eg; LLM("Hey AI, give me information on X")
2) image + text -> text  
    - Basic Contextualist: Context sample + LLM
    - eg; LLM("Hey AI, give me information on X, given (Context sample)")
3) image + text -> text 
    - Big Contextualist: Context Population + Recommendation engine + LLM
    - eg; RAG("Hey AI, give me information on X", Librarian(topN, X, Context Population))
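
The three tech levels differ only in how much context reaches the LLM. The sketch below illustrates that with plain prompt builders. Every name in it is hypothetical, and a word-overlap score stands in for the embedding-based Librarian lookup so the example runs without an API key.

```python
from typing import Callable, List


def tier1_prompt(question: str) -> str:
    """Level 1: no context, the question goes to the LLM as-is."""
    return question


def tier2_prompt(question: str, context_sample: str) -> str:
    """Level 2: a hand-picked context sample is appended to the question."""
    return f"{question}\n\nContext:\n{context_sample}"


def word_overlap(question: str, passage: str) -> int:
    """Toy relevance score: number of shared lowercase words."""
    return len(set(question.lower().split()) & set(passage.lower().split()))


def tier3_prompt(question: str,
                 population: List[str],
                 top_n: int,
                 score: Callable[[str, str], int] = word_overlap) -> str:
    """Level 3: retrieve the top-N passages from the whole population, then prompt."""
    ranked = sorted(population, key=lambda p: score(question, p), reverse=True)
    context = "\n".join(f"- {passage}" for passage in ranked[:top_n])
    return tier2_prompt(question, context)


toy_population = [
    "Rabbana atina fid-dunya hasanatan",
    "Hadith on fasting in Ramadan",
    "Dua before sleeping",
]
rag_prompt = tier3_prompt("meaning of rabbana atina dua", toy_population, top_n=2)
print(rag_prompt)
```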

In [None]:
from specialists.muslim import Muslim 
muslim = Muslim()
muslim.load_data_model(reembed = True)

In [None]:
text = muslim.tables["Sunan al Tirmidhi"].iloc[1,1]
text

In [None]:
test= muslim.model_specialist.embed_text(title= "hadith", text= text, model="models/embedding-001")

In [None]:
import pandas as pd

from llm_handler.GHandler import GHandler
from settings import GEMINI_API_KEY

g_handler = GHandler(GEMINI_API_KEY)

db_path = "./database/muslim/Sunan al Tirmidhi.csv"
df = pd.read_csv(db_path)
df.iloc[0]

#  translate df.iloc[0] from arabic to english
response = g_handler.prompt(f"Please translate the following from arabic to english: It is from a hadith: {df.iloc[0,0]}")

In [None]:
from pprint import pprint
pprint(response.text)

In [None]:
df

## Tech tier I: Basic Contextualist

In [None]:
# image_path = "./database/quant/btcusdt_1d_IST.png"
image_path = "./database/Fitra AI/dua_1.png"

prompt_1 = f"""
            You are a curator and specialist of all knowledge regarding Islam,
            working in an AI ethics and misuse department.
            You are given images and asked to analyse them for their Islamic content and also their level of AI influence.
            So look firstly for Islamic data authenticity, then, if there is any AI-generated content,
            especially in meanings or translations, give your reasoning.
            """
prompt_2 = f"""
            If the image is not AI generated and is authentic Islamic content, 
            then suggest a list of relevant topics to the Islamic content in this image,
            such as its origins, hadith sources, Quran sources, and other relevant Islamic topics.
"""
# prompt_2v = f"""If the image is AI generated, provide a confidence score from 0 to 100,
# "The expert AI generated image analyst response to the prior prompt ({prompt_1}) is: 
# <trader's RESPONSE>".
# "My own thoughts on his response is: 
# <vibe_checker's RESPONSE>".
# "Confidence: <YOUR_SCORE>".
# """
from IPython.display import Image, display
display(Image(filename=image_path))
from pprint import pprint
from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY,
                     generation_config = {"temperature": 0.9,
                                      "top_p": 0.95,
                                      "top_k": 40,
                                      "max_output_tokens": 1024,
                                      },
                     block_threshold="BLOCK_NONE",
                    )

g_response = g_handler.prompt_image(model_name = "gemini-pro-vision",
                                  image_path = image_path,
                                  prompt_1 = prompt_1,
                                  prompt_2 = prompt_2,)
# print this in a nicer format and not stretched beyond the screen
pprint(g_response.text)

## Tech tier II: Small Contextualist


i) Embed database

ii) embed user_travel_LLM_response above

iii) AI recommendation engine


### i) Embed -> Recommend Inventory

In [None]:
db_path = "./database/travel/SWTT_ Master Database.xlsx"
sheet_name = "CUSTOMERS"

from settings import GEMINI_API_KEY
from llm_handler import GHandler
import importlib
importlib.reload(GHandler)
from llm_handler.GHandler import GHandler

import pandas as pd
tabs = pd.ExcelFile(db_path).sheet_names 
tabs = [sheet for sheet in tabs if "TEST" in sheet]
print(tabs)
# df = pd.read_excel(db_path, sheet_name=sheet_name)
# # need to embed the day_trips_df
# day_trips_df["Text"] = day_trips_df.apply(lambda row: f"Activity: {row['Activity']}, Location: {row['Location']}, Category: {row['Category']} Price: {row['Price']}", axis=1)


# g_handler = GHandler(GEMINI_API_KEY)
# df_embedded = g_handler.embed_df(day_trips_df,
#                                  title = "Activity", 
#                                  text = "Text",
#                                  model="models/embedding-001")
# daytrip_recommendation = g_handler.find_best_passage(g_response.text, df_embedded)
# print(daytrip_recommendation)


# Field populator

In [None]:
columns = """

																																																																										ACCOMMODATION ARRANGMENTS																	NUMBER OF PAX								TRAVEL INSURANCE									VISA ARRANGEMENTS									BESPOKE VIP SERVICES							ENTIRE TRIP GRAND TOTAL (MYR)
ITEM	CLIENTELE													GENERAL BIODATA									CLIENT DIRECT BOOKING							PIC/BOOKER															TRIP							PURCHASE TYPE							FLIGHT ARRANGEMENTS															B2B PLATFORM	DESTINATION	GROUP	PROPERTY TYPE	HOTEL BRAND	STAR RATING	TRAVEL DATE	PERIOD OF STAY (NUMBER OF NIGHTS)	ROOM CATEGORY	ROOM CONFIGURATION (BED-TYPE)	ROOM UNITS	ROOM OCCUPANCY (NUMBER OF GUESTS)	ROOM RATE WITH BREAKFAST/NIGHT (MYR)		ROOM-ONLY RATE /NIGHT (MYR)		GRAND TOTAL	"ADULT 
(12 YEARS OLD AND ABOVE)"	"CHILD
(INFANT - 11 YEARS OLD)"		DOMESTIC HELPER		STAFF	BODYGUARD	POLICE ESCORT	INSURANCE COMPANY	INSURANCE TYPE	INSURANCE PLAN	TRIP PLAN	POLICY NUMBER	PERIOD OF COVERAGE	COST	SELLING	COUNTRIES WE COVER	VISA TYPE	VISA CATEGORY	APPLICANT NAME	REFERENCE NUMBER	PERIOD OF STAY	INSURED BY	COST PER PERSON	MARK UP RATE (MYR)	SELLING RATE PER PERSON	MEET & GREET	AIRPORT DUTY	BUGGY ASSISTANCE	LUGGAGE WRAP SERVICE	SPECIAL CARE ASSISTANCE	VIP LOUNGE	SMART WORLD ALLOWANCE TO STAFF	
	CLIENT ID	ORGANIZATION	SEGMENT	REMARK	PREFIX	TITLE	FULL NAME	DESIGNATION | OCCUPATION	RELATIONSHIP	CLIENT ID	COMPANY	BUSINESS ADDRESS	REGION	BIRTH DATE	PASSPORT NUMBER	IDENTIFICATION NUMBER	MARITAL STATUS	GENDER	AGE	RACE	NATIONALITY 	RELIGION	CLIENT TYPE	"CLIENT SINCE
(YEAR)"	EMAIL ADDRESS	COUNTRY CODE	PHONE	COUNTRY CODE	MOBILE	PREFIX	TITLE	BOOKER NAME	DESIGNATION/OCCUPATION	RELATIONSHIP	CLIENT ID	DEPARTMENT	COMPANY	BUSINESS ADDRESS	REGION	EMAIL ADDRESS	COUNTRY CODE	PHONE	COUNTRY CODE	MOBILE	CLIENT SPENDING POWER	PURPOSE OF TRAVEL	TRAVEL TYPE	OCCASION	FAMILY COMPOSITION	"DATE OF TRAVEL
(DD.MM.YY - DD.MM.YY)"	LENGTH OF TRAVEL (DAY)	FLIGHT	ACCOMMODATION	TRANSPORTATION	VIP SERVICE	VISA	TRAVEL INSURANCE	TOUR PACKAGE	TRIP NAME	AIRLINE NAME	CATEGORY	DESTINATION	DOMESTIC FLIGHT	INTERNATIONAL FLIGHT	ONE WAY TRIP	RETURN TICKET	DEPARTURE DATE	DEPARTURE TIME	ARRIVAL DATE	ARRIVAL TIME	COST	MARK UP RATE	SELLING													COST ROOM RATE PER NIGHT WITH BREAKFAST (MYR)	SELLING ROOM RATE PER NIGHT WITH BREAKFAST (MYR)	COST ROOM-ONLY RATE PER NIGHT (MYR)	SELLING ROOM -ONLY RATE PER NIGHT (MYR)					NATIONALITY	PAX												CLUSTER | DESTINATION																	
"""

In [None]:
image_path = "./database/travel/travel_itinerary.jpg"
prompt_1 = "From this picture of a travel itinerary"
prompt_2 = "Based on the response, recommend a full day trip travel itinerary"
from IPython.display import Image, display
display(Image(filename=image_path))

from settings import GEMINI_API_KEY
from llm_handler.GHandler import GHandler

g_handler = GHandler(GEMINI_API_KEY)

g_response = g_handler.prompt_image(model_name = "gemini-pro-vision",
                                  image_path = image_path,
                                  prompt_1 = prompt_1,
                                  prompt_2 = prompt_2)
print(g_response.text)
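
The `columns` string in this section was pasted from a spreadsheet, so it is tab-separated, with blank cells and a few quoted cells that span two lines. Before any fields can be populated, it needs to become a clean list of column names. A minimal sketch using only the standard library: `parse_header_row` is a hypothetical helper, and the sample is a shortened excerpt rather than the full header.

```python
import csv
import io
from typing import List


def parse_header_row(raw: str) -> List[str]:
    """Split tab-separated header text into clean, non-empty column names."""
    # newline="" leaves line endings untouched so the csv module can
    # handle quoted cells that contain a line break.
    reader = csv.reader(io.StringIO(raw, newline=""), delimiter="\t")
    names: List[str] = []
    for row in reader:
        for cell in row:
            cleaned = " ".join(cell.split())  # collapse line breaks and repeated spaces
            if cleaned:                       # skip the blank spacer cells
                names.append(cleaned)
    return names


# Shortened excerpt in the same shape as the pasted header.
sample_header = 'CLIENT ID\tORGANIZATION\t"CLIENT SINCE\n(YEAR)"\tEMAIL ADDRESS\t\tPHONE'
print(parse_header_row(sample_header))
```

The merged group headings in the first pasted rows (for example TRAVEL INSURANCE) would need separate handling, which this sketch does not attempt.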


Process Flow
0) customer request 
-  

1) Flight booking
- SABRE 
- This must be determined first, 


2) Hotel: B2B platforms: 
- WithinEarth
- TBO
- RateHawk 
- booking.com 


Tour package
- DMC partner lag (24-48 hours) 

DMC partners need to be filtered.

Proposal: 0.5 days, finalise; another half of the day, Hanis will compile
- check flight first, ticketing, then hotel. 
STANDARD: 24-48 hours after the proposal to get the invoice. 

Proposals usually take 24 hours.

Invoice: confirmed with the client, then a draft of the invoice goes to finance and the client. 
- Depends on payment terms; if they have add-on services, then the invoice will be sent after the trip. 
- Mostly high-profile bosses, but no penalty. Penalty 1.2% ---> 15% pay upfront, another 10% credit term, 5% deposit, 75% the rest after 



mostly the same for umrah. 