In [1]:
!nvidia-smi

Fri Aug 25 20:38:05 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   74C    P8    12W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
!pip install -Uqqq pip --progress-bar off
!pip install -qqq langchain==0.0.228 --progress-bar off
!pip install -qqq chromadb==0.3.26 --progress-bar off
!pip install -qqq sentence-transformers==2.2.2 --progress-bar off
!pip install -qqq auto-gptq==0.2.2 --progress-bar off
!pip install -qqq einops==0.6.1 --progress-bar off
!pip install -qqq unstructured==0.8.0 --progress-bar off
!pip install -qqq transformers==4.30.2 --progress-bar off
!pip install -qqq torch==2.0.1 --progress-bar off

In [3]:
from pathlib import Path

import torch
from auto_gptq import AutoGPTQForCausalLM
from langchain.chains import ConversationalRetrievalChain
from langchain.chains.question_answering import load_qa_chain
from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.llms import HuggingFacePipeline
from langchain.memory import ConversationBufferMemory
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma
from transformers import AutoTokenizer, GenerationConfig, TextStreamer, pipeline
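Because the install cell pins exact versions, it can help to confirm that the running environment actually matches them. The following is a minimal sketch using only the standard library; `PINNED` and `find_mismatches` are hypothetical names introduced here, not part of the notebook.

```python
from importlib import metadata

# Pins copied from the install cell of this notebook.
PINNED = {
    "langchain": "0.0.228",
    "chromadb": "0.3.26",
    "sentence-transformers": "2.2.2",
    "auto-gptq": "0.2.2",
    "einops": "0.6.1",
    "unstructured": "0.8.0",
    "transformers": "4.30.2",
    "torch": "2.0.1",
}


def find_mismatches(pins):
    """Return {package: (expected, installed)} for every pin that is not met."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # the package is not installed at all
        # Ignore local build suffixes such as "+cu118" when comparing.
        base = installed.split("+")[0] if installed else None
        if base != expected:
            mismatches[name] = (expected, installed)
    return mismatches


for name, (expected, installed) in find_mismatches(PINNED).items():
    print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is present at the expected version.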

In [4]:
questions_dir = Path("skyscanner")
questions_dir.mkdir(exist_ok=True, parents=True)

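def write_file(question, answer, file_path):
    """Write one question/answer pair to a text file inside questions_dir."""
    text = f"""
Q: {question}
A: {answer}
""".strip()
    # The FAQ text contains curly quotes, so set the encoding explicitly.
    with (questions_dir / file_path).open("w", encoding="utf-8") as text_file:
        text_file.write(text)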


In [5]:
write_file(
    question="How do I search for flights on Skyscanner?",
    answer="""
We know you're looking for the best prices and maximum flexibility to choose the right flight.

To start your search, click here to head back to Skyscanner's search pages. Enter the departure country, city or airport in the From text field, the destination country, city or airport in the To text field and your travel dates in the Depart and Return text fields, and click Search. A list of flights will appear.

When you've seen a suitable flight for your needs, you can click on Select, and we'll show you a list of airlines and travel agencies so you can book directly with them.

Smart search filters, such as number of stops and departure time, help you find the perfect flight. Plus, we've got many tips and tricks to help you save more. Please visit our Travel Blog page to find out more.

If you're looking for inspiration for your next trip, why not try our everywhere feature, which helps you see endless destinations, at every price, from thousands of travel sites in one place?
""".strip(),
    file_path="question_1.txt"
)

In [6]:
write_file(
    question="What are mash-ups?",
    answer="""
These are routes where you fly with different airlines, because it’s cheaper than booking with just one. For example:

If you wanted to fly London to New York, we might find it’s cheaper to fly out with British Airways and back with Virgin Atlantic, rather than buy a return ticket with one airline. This is called a ‘sum-of-one-way’ mash-up. Just in case you’re interested.

Another kind of mash-up is what we call a ‘self-transfer’ or a ‘non-protected transfer’. For example:

If you wanted to fly London to Sydney, we might find it’s cheaper to fly London to Dubai with Emirates, and then Dubai to Sydney with Qantas, rather than booking the whole route with one airline.

Pretty simple, right?

However, what’s really important to bear in mind is that mash-ups are NOT codeshares. A codeshare is when the airlines have an alliance. If anything goes wrong with the route — a delay, say, or a strike — those airlines will help you out at no extra cost. But mash-ups DO NOT involve an airline alliance. So if something goes wrong with a mash-up, it could cost you more money.
""".strip(),
    file_path="question_2.txt"
)

In [7]:
write_file(
    question="Why have I been blocked from accessing the Skyscanner website?",
    answer="""
Skyscanner’s websites are scraped by bots many millions of times a day which has a detrimental effect on the service we’re able to provide. To prevent this, we use a bot blocking solution which checks to ensure you’re using the website in a normal manner.

Occasionally, this may mean that a genuine user may be wrongly flagged as a bot. This can be for a number of potential reasons, including, but not limited to:

You’re using a VPN which we have had to block due to excessive bot traffic in the past
You’re using our website at super speed which manages to beat our rate limits
You have a plug-in on your browser which could be interfering with how our website interacts with you as a user
You’re using an automated browser
If you've been blocked during normal use, please send us your IP address (this website may help: http://www.whatismyip.com/), the website you’re accessing (e.g. www.skyscanner.net) and the date/time this happened, via the Contact Us button below and we’ll look into it as quickly as possible.
""".strip(),
    file_path="question_3.txt"
)

In [8]:
write_file(
    question="Where is my booking confirmation?",
    answer="""
You should get a booking confirmation by email from the company you bought your travel from. This can sometimes go into your spam/junk mail folder so it's always worth checking there.

If you still can't find it, it's worth getting in touch with the company you bought from to find out what's going on.

To find out who you need to contact, check the company name next to the charge on your bank account.
""".strip(),
    file_path="question_4.txt"
)

In [9]:
write_file(
    question="How do I change or cancel my booking?",
    answer="""
For all questions about changes, cancellations, and refunds – as well as all other questions about bookings – you'll need to contact the company you bought travel from. They'll have all the info about your booking and can advise you.

You'll find 1000s of travel agencies, airlines, hotels and car hire companies that you can buy from through our site and app. When you buy from one of these travel partners, they will take your payment (you'll see their name on your bank or credit card statement), contact you to confirm your booking, and provide any help or support you might need.

If you bought from one of these partners, you'll need to contact them as they have all the info about your booking. We unfortunately don't have any access to bookings you made with them.
""".strip(),
    file_path="question_5.txt"
)

In [10]:
write_file(
    question="I booked the wrong dates / times",
    answer="""
If you have found that you have booked the wrong dates or times, please contact the airline or travel agent that you booked your flight with as they will be able to help you change your flights to the intended dates or times.

The search box below can help you find the contact details for the travel provider you booked with.

You can search flexible or specific dates on Skyscanner to find your preferred flight, and when you select a flight on Skyscanner you are transferred to the website where you will make and pay for your booking. Once you are redirected to the airline or travel agent website, you might be required to select dates again, depending on the website. In all cases, you will be shown the flight details of your selection and you are required before confirming payment to state that you have checked all details and agreed to the terms and conditions. We strongly recommend that you always check this information carefully, as travel information can be subject to change.
""".strip(),
    file_path="question_6.txt",
)

In [11]:
write_file(
    question="I entered the wrong email address",
    answer="""
If you made a mistake with your email address when booking, you'll need to get in touch with the company you bought travel from as they'll be able to help.

There are 1000s of travel agencies, airlines, hotels and car hire companies that you can buy from through our site and app. We call them "partners". When you buy from them, they will take your payment (you'll see their name on your bank or credit card statement), contact you to confirm your booking, and provide any help or support you might need.
""".strip(),
    file_path="question_7.txt",
)

In [12]:
write_file(
    question="Luggage",
    answer="""
Depending on the flight provider, the rules, conditions and prices for luggage (including sports equipment) do vary.
It’s always a good idea to check with the airline or travel agent directly (and you should be shown the options when you make your booking).
""".strip(),
    file_path="question_8.txt",
)

In [13]:
write_file(
    question="Changes, cancellations, and refunds",
    answer="""
For any questions about changes, cancellations, and refunds – as well as all other questions about bookings – you'll need to contact the company you bought travel from. They'll have all the info about your booking and can advise you.

You'll find 1000s of travel agencies, airlines, hotels and car hire companies that you can buy from through our site and app. When you buy from one of these travel partners, they will take your payment (you'll see their name on your bank or credit card statement), contact you to confirm your booking, and provide any help or support you might need.
""".strip(),
    file_path="question_9.txt",
)

In [14]:
write_file(
    question="Why does the price sometimes change when I am redirected to a flight provider?",
    answer="""
Flight prices and availability change constantly, so we make sure the data is updated regularly to reflect this. When you redirect to a travel provider’s site, the price is updated again so you can be sure that you will always see the best price available from the airline or travel agent at time of booking.

We make every effort to ensure the information you see on Skyscanner is accurate and up to date, but very occasionally there can be reasons why a price change has not updated accurately on the site. If you see a price difference between Skyscanner and a travel provider, please contact us with all the flight details (from, to, dates, departure times, airline and travel agent if applicable) and we will investigate further.
""".strip(),
    file_path="question_10.txt",
)

In [15]:
write_file(
    question="Does Skyscanner charge commission?",
    answer="""
Skyscanner does the hard work for you, and it's free to search for your travel options on our site. We search over 1,200 travel providers to find you the best deal for your flight, hotel or car hire.

See a price you like? When we refer you to book, they pay us, a small fee. And that's all there is to it!
""".strip(),
    file_path="question_11.txt",
)

In [16]:
write_file(
    question="Are my details safe?",
    answer="""
Safeguarding privacy is embedded in our culture and we use a combination of industry-standard methods to protect data.  We’ve strong organisation-wide security programmes, using a range of technical, organisational and administrative security measures and best-practice techniques, depending on the type of data being processed.
""".strip(),
    file_path="question_12.txt",
)
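The twelve `write_file` calls above all follow the same pattern, so they could also be driven from a list. The sketch below shows one way to do that; `write_faq_files` is a hypothetical helper, and the two entries in `sample_faq` are shortened placeholders rather than the full answers used in this notebook.

```python
import tempfile
from pathlib import Path


def write_faq_files(faq, target_dir):
    """Write each (question, answer) pair to question_<n>.txt and return the paths."""
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, (question, answer) in enumerate(faq, start=1):
        text = f"Q: {question}\nA: {answer.strip()}"
        path = target_dir / f"question_{index}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Shortened placeholder entries, for illustration only.
sample_faq = [
    ("Luggage", "Rules and prices for luggage vary by flight provider."),
    ("Does Skyscanner charge commission?", "It's free to search on our site."),
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_faq_files(sample_faq, tmp)
    print([p.name for p in written])
    print(written[0].read_text(encoding="utf-8"))
```

Keeping the pairs in one list makes it easier to add or reorder questions without renumbering file names by hand.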

# **Model**

In [17]:
DEVICE = "cuda:0" if torch.cuda.is_available() else "cpu"

In [18]:
DEVICE

'cuda:0'

In [19]:
# !git --version

In [20]:
# !git lfs install

In [21]:
# !git clone https://huggingface.co/TheBloke/Nous-Hermes-13B-GPTQ

In [22]:
model_name_or_path = "TheBloke/Nous-Hermes-13B-GPTQ"
model_basename = "model"

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    model_basename=model_basename,
    use_safetensors=True,
    trust_remote_code=True,
    device=DEVICE)

generation_config = GenerationConfig.from_pretrained(model_name_or_path)



In [23]:
question = (
    "What is esg?"
)
prompt = f"""
### Instruction: {question}
### Response:
""".strip()

In [24]:
print(prompt)

### Instruction: What is esg?
### Response:


In [25]:
%%time
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(DEVICE)
with torch.inference_mode():
    # generate() decodes greedily unless do_sample=True is passed, so temperature has no effect here.
    output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)

CPU times: user 7.85 s, sys: 529 ms, total: 8.38 s
Wall time: 11.1 s


In [26]:
print(tokenizer.decode(output[0]))

<s> ### Instruction: What is esg?
### Response:ESG stands for Environmental, Social, and Governance. It is a framework used by investors to assess the sustainability and ethical impact of investment opportunities. The three pillars of ESG focus on the environmental impact of a company's operations, its impact on society through its practices and products, and its governance practices, including transparency and accountability. ESG investing is becoming increasingly popular among investors who want to align their investments with their values and promote sustainable and responsible business practices.</s>
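The decoded output above still contains the prompt and the `<s>`/`</s>` markers. A small pair of string helpers, sketched below, builds the instruction prompt and pulls out just the response text. `build_prompt` and `extract_response` are hypothetical names; in the notebook itself, passing `skip_special_tokens=True` to `tokenizer.decode` is another way to drop the special tokens.

```python
RESPONSE_MARKER = "### Response:"
SPECIAL_TOKENS = ("<s>", "</s>")


def build_prompt(question: str) -> str:
    """Format a question in the instruction/response layout used above."""
    return f"### Instruction: {question}\n{RESPONSE_MARKER}"


def extract_response(decoded: str) -> str:
    """Return the text after the first response marker, without special tokens."""
    head, marker, tail = decoded.partition(RESPONSE_MARKER)
    # If the marker is missing, fall back to cleaning the whole string.
    text = tail if marker else head
    for token in SPECIAL_TOKENS:
        text = text.replace(token, "")
    return text.strip()


decoded = "<s> " + build_prompt("What is esg?") + "ESG stands for Environmental, Social, and Governance.</s>"
print(extract_response(decoded))
```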


In [27]:
generation_config

GenerationConfig {
  "_from_model_config": true,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "pad_token_id": 0,
  "transformers_version": "4.30.2"
}

In [28]:
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True, use_multiprocessing=False)

In [29]:
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=2048,
    temperature=0,
    top_p=0.95,
    repetition_penalty=1.15,
    generation_config=generation_config,
    streamer=streamer,
    batch_size=1
)

Xformers is not installed correctly. If you want to use memory_efficient_attention to accelerate training use the following command to install Xformers
pip install xformers.
The model 'LlamaGPTQForCausalLM' is not supported for text-generation. Supported models are ['BartForCausalLM', 'BertLMHeadModel', 'BertGenerationDecoder', 'BigBirdForCausalLM', 'BigBirdPegasusForCausalLM', 'BioGptForCausalLM', 'BlenderbotForCausalLM', 'BlenderbotSmallForCausalLM', 'BloomForCausalLM', 'CamembertForCausalLM', 'CodeGenForCausalLM', 'CpmAntForCausalLM', 'CTRLLMHeadModel', 'Data2VecTextForCausalLM', 'ElectraForCausalLM', 'ErnieForCausalLM', 'GitForCausalLM', 'GPT2LMHeadModel', 'GPT2LMHeadModel', 'GPTBigCodeForCausalLM', 'GPTNeoForCausalLM', 'GPTNeoXForCausalLM', 'GPTNeoXJapaneseForCausalLM', 'GPTJForCausalLM', 'LlamaForCausalLM', 'MarianForCausalLM', 'MBartForCausalLM', 'MegaForCausalLM', 'MegatronBertForCausalLM', 'MvpForCausalLM', 'OpenLlamaForCausalLM', 'OpenAIGPTLMHeadModel', 'OPTForCausalLM', 'Peg

In [30]:
llm = HuggingFacePipeline(pipeline=pipe)

In [31]:
response = llm(prompt)

ESG stands for Environmental, Social and Governance. It refers to the three main factors that investors consider when evaluating a company's sustainability and potential future success. ESG factors include environmental impact, social responsibility, and good corporate governance practices. Investors who prioritize ESG criteria in their investment decisions aim to support companies with strong sustainable practices while avoiding those with negative impacts on society or the environment.



# **Embed Documents**

In [32]:
embeddings = HuggingFaceEmbeddings(
    model_name="embaas/sentence-transformers-multilingual-e5-base",
    model_kwargs={"device": DEVICE},
)

In [33]:
loader = DirectoryLoader("./skyscanner/", glob="**/*.txt")
documents = loader.load()
len(documents)

12

In [34]:
text_splitter = CharacterTextSplitter(chunk_size=512, chunk_overlap=0)
texts = text_splitter.split_documents(documents)

In [35]:
texts[4]

Document(page_content="Q: I entered the wrong email address A: If you made a mistake with your email address when booking, you'll need to get in touch with the company you bought travel from as they'll be able to help.", metadata={'source': 'skyscanner/question_7.txt'})
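To give an intuition for what the splitter does with `chunk_size=512`, here is a simplified, standalone sketch of separator-based chunking: split on blank lines, then greedily merge pieces until the next one would push the chunk past the limit. This is an approximation for illustration, not LangChain's implementation; `split_text` is a hypothetical function.

```python
def split_text(text, chunk_size=512, separator="\n\n"):
    """Split on the separator, then merge pieces greedily up to chunk_size characters."""
    pieces = [p.strip() for p in text.split(separator) if p.strip()]
    chunks = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if not current or len(candidate) <= chunk_size:
            # Either the chunk is empty (an oversized piece is kept whole)
            # or the merged text still fits.
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


sample = "a" * 300 + "\n\n" + "b" * 300 + "\n\n" + "c" * 100
for chunk in split_text(sample):
    print(len(chunk))
```

With `chunk_overlap=0`, as in this notebook, neighbouring chunks share no text, which is why one answer can end up spread over several independent documents.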

In [36]:
db = Chroma.from_documents(texts, embeddings)

In [37]:
db.similarity_search("flight search")

[Document(page_content="Q: How do I search for flights on Skyscanner? A: We know you're looking for the best prices and maximum flexibility to choose the right flight.\n\nTo start your search, click here to head back to Skyscanner's search pages. Enter the departure country, city or airport in the From text field, the destination country, city or airport in the To text field and your travel dates in the Depart and Return text fields, and click Search. A list of flights will appear.", metadata={'source': 'skyscanner/question_1.txt'}),
 Document(page_content="When you've seen a suitable flight for your needs, you can click on Select, and we'll show you a list of airlines and travel agencies so you can book directly with them.\n\nSmart search filters, such as number of stops and departure time, help you find the perfect flight. Plus, we've got many tips and tricks to help you save more. Please visit our Travel Blog page to find out more.", metadata={'source': 'skyscanner/question_1.txt'})
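Conceptually, `similarity_search` embeds the query and returns the stored chunks whose vectors are closest to it. The toy sketch below ranks three made-up 3-dimensional vectors by cosine similarity. The vectors, the names, and the choice of cosine similarity are assumptions for illustration; the metric and index that Chroma actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up vectors standing in for real sentence embeddings.
toy_vectors = {
    "question_1": [0.9, 0.1, 0.0],
    "question_4": [0.1, 0.9, 0.0],
    "question_8": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.2, 0.0], toy_vectors, k=2))
```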

# **Conversational Chain**

In [38]:
template = """
### Instruction: You're a travelling support agent that is talking to a customer. Use only the chat history and the following information
{context}
to answer in a helpful manner to the question. If you don't know the answer - say that you don't know.
Keep your replies short, compassionate and informative.
{chat_history}
### Input: {question}
### Response:
""".strip()

In [39]:
prompt = PromptTemplate(
    input_variables=["context", "question", "chat_history"], template=template
)
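With `chain_type="stuff"`, the retrieved chunks are simply concatenated and dropped into the `{context}` slot of the template. The sketch below mimics that with plain `str.format`, using a shortened copy of the template; `stuff_prompt` is a hypothetical helper, and the blank-line separator is assumed from the verbose chain output in this notebook.

```python
# Shortened version of the notebook's template, for illustration only.
TEMPLATE = """
### Instruction: Use only the chat history and the following information
{context}
to answer the question.
{chat_history}
### Input: {question}
### Response:
""".strip()


def stuff_prompt(chunks, question, chat_history=""):
    """Join all retrieved chunks and fill the three template variables."""
    context = "\n\n".join(chunks)
    return TEMPLATE.format(
        context=context, chat_history=chat_history, question=question
    )


print(stuff_prompt(["chunk one", "chunk two"], "How to search flights?"))
```

Because every chunk goes into a single prompt, the combined length of the retrieved text, the history, and the question has to fit within the model's context window.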

In [40]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    human_prefix="### Input",
    ai_prefix="### Response",
    output_key="answer",
    return_messages=True
)

In [41]:
chain = ConversationalRetrievalChain.from_llm(
    llm,
    chain_type="stuff",
    retriever=db.as_retriever(),
    memory=memory,
    combine_docs_chain_kwargs={"prompt": prompt},
    return_source_documents=True,
    verbose=True,
)
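On follow-up turns, this chain first asks the model to rewrite the new question as a standalone one, and only then runs retrieval and answering. The sketch below rebuilds that rewriting prompt as a plain string. The wording is reconstructed from the verbose log printed for the second question in this notebook, not copied from the library source, and the function names are hypothetical.

```python
CONDENSE_TEMPLATE = (
    "Given the following conversation and a follow up question, rephrase the "
    "follow up question to be a standalone question, in its original language.\n\n"
    "Chat History:\n{chat_history}\n"
    "Follow Up Input: {question}\n"
    "Standalone question:"
)


def format_history(turns):
    """Render (human, assistant) pairs the way the verbose log shows them."""
    return "".join(f"\nHuman: {human}\nAssistant: {ai}" for human, ai in turns)


def condense_prompt(turns, follow_up):
    """Build the prompt that asks the model for a standalone question."""
    return CONDENSE_TEMPLATE.format(
        chat_history=format_history(turns), question=follow_up
    )


turns = [("How to search flights?", "Enter your route and dates, then click Search.")]
print(condense_prompt(turns, "I can't find any confirmation. Where is it?"))
```

The extra model call costs time on every follow-up turn; the plain QA chain with memory used later in this notebook skips this step.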

In [42]:
question = "How to search flights?"
answer = chain(question)



> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
### Instruction: You're a travelling support agent that is talking to a customer. Use only the chat history and the following information
Q: How do I search for flights on Skyscanner? A: We know you're looking for the best prices and maximum flexibility to choose the right flight.

To start your search, click here to head back to Skyscanner's search pages. Enter the departure country, city or airport in the From text field, the destination country, city or airport in the To text field and your travel dates in the Depart and Return text fields, and click Search. A list of flights will appear.

When you've seen a suitable flight for your needs, you can click on Select, and we'll show you a list of airlines and travel agencies so you can book directly with them.

Smart search filters, such as number of stops and departure time, help you find the perfect flight. Plus, we've got many

In [43]:
answer.keys()

dict_keys(['question', 'chat_history', 'answer', 'source_documents'])

In [44]:
answer["source_documents"]

[Document(page_content="Q: How do I search for flights on Skyscanner? A: We know you're looking for the best prices and maximum flexibility to choose the right flight.\n\nTo start your search, click here to head back to Skyscanner's search pages. Enter the departure country, city or airport in the From text field, the destination country, city or airport in the To text field and your travel dates in the Depart and Return text fields, and click Search. A list of flights will appear.", metadata={'source': 'skyscanner/question_1.txt'}),
 Document(page_content="When you've seen a suitable flight for your needs, you can click on Select, and we'll show you a list of airlines and travel agencies so you can book directly with them.\n\nSmart search filters, such as number of stops and departure time, help you find the perfect flight. Plus, we've got many tips and tricks to help you save more. Please visit our Travel Blog page to find out more.", metadata={'source': 'skyscanner/question_1.txt'})

In [45]:
question = "I bought flight tickets, but I can't find any confirmation. Where is it?"
answer = chain(question)



> Entering new  chain...
Prompt after formatting:
Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:

Human: How to search flights?
Assistant: To search for flights on Skyscanner, follow these steps:
1. Click "Search" at the top of the page.
2. Enter your departure location, arrival location, and travel dates in the respective fields.
3. Click "Search".
4. Browse through the available options and select one that suits your needs.
5. Review the details and price before proceeding to book.
Follow Up Input: I bought flight tickets, but I can't find any confirmation. Where is it?
Standalone question:
Can you help me locate my flight ticket confirmation?

> Finished chain.


> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
### Instruction: You're a travelling support agent that is talking to 

# **QA Chain With Memory**

In [46]:
memory = ConversationBufferMemory(
    memory_key="chat_history",
    human_prefix="### Input",
    ai_prefix="### Response",
    input_key="question",
    output_key="output_text",
    return_messages=False,
)

chain = load_qa_chain(
    llm, chain_type="stuff", prompt=prompt, memory=memory, verbose=True
)
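With `return_messages=False`, the memory hands the chain a single string in which each turn is labelled with the configured prefixes. The class below is a simplified stand-in that shows the resulting layout. `BufferMemory` is a hypothetical name and this is not LangChain's implementation.

```python
class BufferMemory:
    """Keep every turn and render the history as one prefixed string."""

    def __init__(self, human_prefix="### Input", ai_prefix="### Response"):
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.turns = []

    def save(self, user_text, ai_text):
        """Record one completed exchange."""
        self.turns.append((user_text, ai_text))

    def render(self):
        """Return the full history, one labelled line per message."""
        lines = []
        for user_text, ai_text in self.turns:
            lines.append(f"{self.human_prefix}: {user_text}")
            lines.append(f"{self.ai_prefix}: {ai_text}")
        return "\n".join(lines)


memory_sketch = BufferMemory()
memory_sketch.save("How flight search works?", "Enter your route and dates, then click Search.")
print(memory_sketch.render())
```

Reusing the `### Input` and `### Response` prefixes keeps the history in the same format as the prompt itself. Since the buffer is never trimmed, long conversations will eventually exceed the model's context window.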

In [47]:
question = "How flight search works?"
docs = db.similarity_search(question)
answer = chain.run({"input_documents": docs, "question": question})




> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
### Instruction: You're a travelling support agent that is talking to a customer. Use only the chat history and the following information
Q: How do I search for flights on Skyscanner? A: We know you're looking for the best prices and maximum flexibility to choose the right flight.

To start your search, click here to head back to Skyscanner's search pages. Enter the departure country, city or airport in the From text field, the destination country, city or airport in the To text field and your travel dates in the Depart and Return text fields, and click Search. A list of flights will appear.

When you've seen a suitable flight for your needs, you can click on Select, and we'll show you a list of airlines and travel agencies so you can book directly with them.

Smart search filters, such as number of stops and departure time, help you find the perfect flight. Plus, we've got many

In [48]:
question = "I entered wrong email address during my flight booking. What should I do?"
docs = db.similarity_search(question)
answer = chain.run({"input_documents": docs, "question": question})



> Entering new  chain...


> Entering new  chain...
Prompt after formatting:
### Instruction: You're a travelling support agent that is talking to a customer. Use only the chat history and the following information
Q: I entered the wrong email address A: If you made a mistake with your email address when booking, you'll need to get in touch with the company you bought travel from as they'll be able to help.

Q: I booked the wrong dates / times A: If you have found that you have booked the wrong dates or times, please contact the airline or travel agent that you booked your flight with as they will be able to help you change your flights to the intended dates or times.

The search box below can help you find the contact details for the travel provider you booked with.

Q: Where is my booking confirmation? A: You should get a booking confirmation by email from the company you bought your travel from. This can sometimes go into your spam/junk mail folder so

In [49]:
print(answer.strip())

Contact the company you booked your flight with as they will be able to assist you in changing your email address.


# **Customer Support Chatbot**

In [50]:
DEFAULT_TEMPLATE = """
### Instruction: You're a travelling support agent that is talking to a customer. Use only the chat history and the following information
{context}
to answer in a helpful manner to the question. If you don't know the answer - say that you don't know.
Keep your replies short, compassionate and informative.
{chat_history}
### Input: {question}
### Response:
""".strip()


class Chatbot:
    def __init__(
        self,
        text_pipeline: HuggingFacePipeline,
        embeddings: HuggingFaceEmbeddings,
        documents_dir: Path,
        prompt_template: str = DEFAULT_TEMPLATE,
        verbose: bool = False,
    ):
        prompt = PromptTemplate(
            input_variables=["context", "question", "chat_history"],
            template=prompt_template,
        )
        self.chain = self._create_chain(text_pipeline, prompt, verbose)
        self.db = self._embed_data(documents_dir, embeddings)

    def _create_chain(
        self,
        text_pipeline: HuggingFacePipeline,
        prompt: PromptTemplate,
        verbose: bool = False,
    ):
        memory = ConversationBufferMemory(
            memory_key="chat_history",
            human_prefix="### Input",
            ai_prefix="### Response",
            input_key="question",
            output_key="output_text",
            return_messages=False,
        )

        return load_qa_chain(
            text_pipeline,
            chain_type="stuff",
            prompt=prompt,
            memory=memory,
            verbose=verbose,
        )

    def _embed_data(
        self, documents_dir: Path, embeddings: HuggingFaceEmbeddings
    ) -> Chroma:
        loader = DirectoryLoader(str(documents_dir), glob="**/*.txt")
        documents = loader.load()
        text_splitter = CharacterTextSplitter(chunk_size=512, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)
        return Chroma.from_documents(texts, embeddings)

    def __call__(self, user_input: str) -> str:
        docs = self.db.similarity_search(user_input)
        return self.chain.run({"input_documents": docs, "question": user_input})



In [51]:
chatbot = Chatbot(llm, embeddings, "./skyscanner/")

In [58]:
import warnings

warnings.filterwarnings("ignore", category=UserWarning)

print("""Hi, I am Sky, the Skyscanner customer support chatbot.
How can I help you?""")
while True:
    user_input = input("You: ")
    if user_input.lower() in ["bye", "goodbye"]:
        break
    # The TextStreamer attached to the pipeline prints the reply as it is generated.
    answer = chatbot(user_input)
    print()

Hi, I am Sky, the Skyscanner customer support chatbot.
How can I help you?
You: I am new to this site. How can I search for a fligh?
Welcome to our website! To search for a flight, simply enter your desired destination and dates in the search bar above. From there, you can choose your preferred departure times, stopovers or non-stop options, and airline carriers. Once you've found a suitable flight, just click 'Select' and you'll be directed to the booking page where you can complete your purchase with the chosen partner. Don't hesitate to let me know if you need further assistance along the way.

You: I previously booked a flight but I forgot from which travel site I booked. How can i know that?
It seems like you might have booked through one of our partners. Could you provide me with more information such as the date of your trip or the airport you are departing from? With that information, I could look up your booking and tell you which partner you used to make the reservation.

You:
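The interactive loop above depends on `input()`, which makes it awkward to test. One option, sketched here, is to move the loop into a function that takes any iterable of user messages and any callable bot. `run_chat` and `echo_bot` are hypothetical names, and the echo bot merely stands in for the `Chatbot` instance.

```python
def run_chat(bot, user_inputs, stop_words=("bye", "goodbye")):
    """Feed each input to the bot until a stop word appears; return the transcript."""
    transcript = []
    for user_input in user_inputs:
        if user_input.strip().lower() in stop_words:
            break
        transcript.append((user_input, bot(user_input)))
    return transcript


def echo_bot(text):
    """Stand-in for the real chatbot: any callable taking and returning a string."""
    return f"You asked: {text}"


for question, reply in run_chat(echo_bot, ["How can I search for a flight?", "bye"]):
    print(f"You: {question}")
    print(reply)
```

For live use, the same function could be fed from a generator that wraps `input("You: ")`.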