In [1]:
%%bash
nvidia-smi

Mon Feb 20 22:26:00 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.47.03    Driver Version: 510.47.03    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   54C    P0    27W /  70W |      0MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
%%capture
%%bash
pip install --upgrade pip
pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab,faiss]

pip install -qU openai



In [3]:
import logging
logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)

In [4]:
##################
# Document Store #
##################
# FAISS is a library for efficient similarity search and clustering of dense vectors.
# Under the hood, FAISSDocumentStore uses a SQL database (SQLite in-memory by default)
# to store the document text and other metadata. The vector embeddings of the text are
# indexed in a FAISS index that is later queried when searching for answers.
# The default flavour of FAISSDocumentStore is "Flat", but it can also be set to "HNSW"
# for faster search at the expense of some accuracy: set the faiss_index_factory_str
# argument in the constructor. For more info on which index suits your use case:
# https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index
from haystack.document_stores import FAISSDocumentStore
#document_store = FAISSDocumentStore(embedding_dim=128, faiss_index_factory_str="Flat")
document_store = FAISSDocumentStore(embedding_dim=768, faiss_index_factory_str="Flat", return_embedding=True)


INFO:haystack.telemetry:Haystack sends anonymous usage data to understand the actual usage and steer dev efforts towards features that are most meaningful to users. You can opt-out at anytime by calling disable_telemetry() or by manually setting the environment variable  HAYSTACK_TELEMETRY_ENABLED as described for different operating systems on the documentation page. More information at https://docs.haystack.deepset.ai/docs/telemetry
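
To make the "Flat" choice above concrete, here is a minimal pure-Python sketch of what an exhaustive (flat) index does at query time: score every stored vector against the query and keep the best matches. It assumes inner-product similarity; `flat_search` and `toy_vectors` are hypothetical names for illustration and are not part of FAISS or Haystack.

```python
from typing import List, Tuple


def flat_search(
    query: List[float], vectors: List[List[float]], top_k: int = 2
) -> List[Tuple[int, float]]:
    """Score every stored vector against the query and return the top_k (index, score) pairs."""
    scores = [
        (idx, sum(q * v for q, v in zip(query, vec)))  # inner product
        for idx, vec in enumerate(vectors)
    ]
    # Highest score first; Python's sort is stable, so ties keep insertion order.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy 2-dimensional "embeddings"; the real store above uses 768 dimensions.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(flat_search([1.0, 0.2], toy_vectors, top_k=2))
```

Because every vector is scored, the result is exact but the cost grows linearly with the number of documents, which is the trade-off an approximate index such as HNSW is meant to relax.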


In [5]:
########################################################
## Create a document store based on GOT text extracts ##
########################################################
from google.colab.output import eval_js
eval_js('google.colab.output.setIframeHeight("100")')

from haystack.nodes import PreProcessor
pre_processor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=400,
    split_respect_sentence_boundary=True,
    split_overlap=0)

from haystack.utils import convert_files_to_docs, fetch_archive_from_http, clean_wiki_text
# Let's first get some files that we want to use
doc_dir = "data/GOT"
s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt12.zip"
fetch_archive_from_http(url=s3_url, output_dir=doc_dir)



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
INFO:haystack.utils.import_utils:Fetching from https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt12.zip to 'data/GOT'


True
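
The PreProcessor settings above (split_by="word", split_length=400, split_overlap=0) can be pictured with a simplified sketch. The function below is a hypothetical stand-in, not Haystack's implementation: it splits purely on whitespace and ignores sentence boundaries, which the real split_respect_sentence_boundary=True option does take into account.

```python
from typing import List


def split_by_words(text: str, split_length: int = 400, split_overlap: int = 0) -> List[str]:
    """Split text into chunks of at most split_length words, sharing split_overlap words."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")
    words = text.split()
    step = split_length - split_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        # Stop once the current chunk has reached the end of the text.
        if start + split_length >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in split_by_words(sample, split_length=4, split_overlap=1):
    print(chunk)
```

A non-zero overlap repeats the tail of one chunk at the head of the next, which can help when an answer straddles a chunk boundary, at the cost of storing more passages.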

In [None]:
eval_js('google.colab.output.setIframeHeight("100")')
# Convert files to dicts
docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)
docs = pre_processor.process(docs)

# Now, let's write the dicts containing documents to our DB.
document_store.write_documents(docs)

In [17]:
#eval_js('google.colab.output.setIframeHeight("100")')
# Convert files to dicts
docs = convert_files_to_docs(dir_path = "ling_txt", clean_func=clean_wiki_text, split_paragraphs=True)
docs = pre_processor.process(docs)

# Now, let's write the dicts containing documents to our DB.
document_store.write_documents(docs)

INFO:haystack.utils.preprocessing:Converting ling_txt/test2 Application Logging Standard Ling.txt


Preprocessing:   0%|          | 0/1 [00:00<?, ?docs/s]

Writing Documents:   0%|          | 0/5 [00:00<?, ?it/s]

In [18]:
# here the model encodes each document into a vector
# every doc in the doc store is updated with its embedding vector
# the model is downloaded and stored locally
eval_js('google.colab.output.setIframeHeight("100")')

from haystack.nodes import DensePassageRetriever
retriever = DensePassageRetriever(
    document_store=document_store,
    #query_embedding_model="vblagoje/dpr-question_encoder-single-lfqa-wiki",
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    #passage_embedding_model="vblagoje/dpr-ctx_encoder-single-lfqa-wiki",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base",
    use_gpu=True,
    embed_title=True,
)

document_store.update_embeddings(retriever)

INFO:haystack.modeling.utils:Using devices: CUDA:0 - Number of GPUs: 1
INFO:haystack.modeling.model.language_model:Auto-detected model language: english
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 
The class this function is called from is 'DPRContextEncoderTokenizerFast'.
INFO:haystack.modeling.model.language_model:Auto-detected model language: english
INFO:haystack.document_stores.faiss:Updating embeddings for 2657 docs...


Updating Embedding:   0%|          | 0/2657 [00:00<?, ? docs/s]

Create embeddings:   0%|          | 0/2672 [00:00<?, ? Docs/s]

In [8]:
## Reader/Generator
# the generator uses the context it is given to generate an answer
# the model will be downloaded locally

from haystack.nodes import Seq2SeqGenerator, RAGenerator
#generator = Seq2SeqGenerator(model_name_or_path="yjernite/bart_eli5")
generator = Seq2SeqGenerator(model_name_or_path="vblagoje/bart_lfqa", max_length=600)


# Initialize RAG Generator
ra_generator = RAGenerator(
    model_name_or_path="facebook/rag-token-nq",
    use_gpu=True,
    top_k=1,
    max_length=600,
    min_length=2,
    embed_title=True,
    num_beams=2,
)




INFO:haystack.modeling.utils:Using devices: CUDA:0 - Number of GPUs: 1



INFO:haystack.modeling.utils:Using devices: CUDA:0 - Number of GPUs: 1



The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'RagTokenizer'. 
The class this function is called from is 'DPRQuestionEncoderTokenizer'.
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'RagTokenizer'. 
The class this function is called from is 'DPRQuestionEncoderTokenizerFast'.



The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'RagTokenizer'. 
The class this function is called from is 'BartTokenizer'.
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'RagTokenizer'. 
The class this function is called from is 'BartTokenizerFast'.


Downloading (…)"pytorch_model.bin";:   0%|          | 0.00/2.06G [00:00<?, ?B/s]

Some weights of the model checkpoint at facebook/rag-token-nq were not used when initializing RagTokenForGeneration: ['rag.question_encoder.question_encoder.bert_model.pooler.dense.bias', 'rag.question_encoder.question_encoder.bert_model.pooler.dense.weight']
- This IS expected if you are initializing RagTokenForGeneration from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RagTokenForGeneration from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RagTokenForGeneration were not initialized from the model checkpoint at facebook/rag-token-nq and are newly initialized: ['rag.generator.lm_head.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictio

In [9]:
# the pipeline links the retriever (our database)
# and the generator (our chatbot) together
from haystack.pipelines import GenerativeQAPipeline
pipe = GenerativeQAPipeline(generator, retriever)

ra_pipe = GenerativeQAPipeline(ra_generator, retriever)



In [10]:
## QnA time ##

def quesetion_ans(question_text, pipe_line):
  res = pipe_line.run(query=question_text, params={"Retriever": {"top_k": 6}})
  answers = res["answers"]
  answer = answers[0]

  print(answer.answer)               # generated answer text
  # the next four fields print None in the runs below
  print(answer.score)
  print(answer.context)
  print(answer.offsets_in_context)
  print(answer.offsets_in_document)
  #print(answer.meta)
  print(answer.meta['doc_scores'])   # retrieval scores of the passages passed to the generator
  print(answer.meta['doc_metas'])    # metadata (file name, split id, vector id) of those passages

  return answers
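
In the runs below, score, context and both offset fields print as None. A small helper that reports only the populated fields keeps the output readable. This is a sketch: `describe_answer` is a hypothetical helper, and a SimpleNamespace stands in for the answer object so the snippet runs without Haystack.

```python
from types import SimpleNamespace
from typing import Any, List


def describe_answer(answer: Any) -> List[str]:
    """Return printable lines for an answer, skipping fields that are None or missing."""
    lines = [f"answer: {answer.answer}"]
    for field in ("score", "context", "offsets_in_context", "offsets_in_document"):
        value = getattr(answer, field, None)
        if value is not None:
            lines.append(f"{field}: {value}")
    return lines


# Stand-in object with the same attribute names that the function above prints.
generated = SimpleNamespace(
    answer="firefox",
    score=None,
    context=None,
    offsets_in_context=None,
    offsets_in_document=None,
)
print("\n".join(describe_answer(generated)))
```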


In [13]:
question_text = "what are the anz application logging standard, list in dot point"
ns = quesetion_ans(question_text, pipe)

The ANZ application logging standard is a set of standards that govern how data is stored in electronic storage devices. The standard is based on the idea that data should be stored in a format that is easy to read and easy to manipulate. For example, if you want to store data on a hard drive, you need to be able to read it, write it, and manipulate it in a way that can be easily read and manipulated. The ANZ standard is designed to make it easy for data to be stored and manipulated in a manner that makes it easy to understand what data is being stored.
None
None
None
None
[0.6599020481483923, 0.6590974522316406, 0.6590627895815472, 0.6582378594671305, 0.6581166601685717, 0.6573036236613612]
[{'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '286'}, {'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '922'}, {'name': '191_Gendry.txt', '_split_id': 0, 'vector_id': '1859'}, {'name': '12_Fire.txt', '_split_id': 1, 'vector_id': '335'}, {'name': '332_Sansa_Stark.txt', '_split_id': 0, 'v

In [12]:
question_text = "what are the steps i need to do to follow anz application logging standard, list in dot point"

ns = quesetion_ans(question_text, ra_pipe)

 firefox
None
None
None
None
[0.6640947623199716, 0.6626659421239466, 0.66222517486918, 0.658597782383978, 0.6576700650322412, 0.6572178278267227]
[{'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '922'}, {'name': '12_Fire.txt', '_split_id': 1, 'vector_id': '335'}, {'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '286'}, {'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '1123'}, {'name': '378_A_Game_of_Thrones__board_game_.txt', '_split_id': 0, 'vector_id': '2421'}, {'name': '12_Fire.txt', '_split_id': 0, 'vector_id': '1210'}]
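
The doc_scores printed for the first question sit within a very narrow band, and the retrieved file names are Game of Thrones articles rather than the logging standard. One quick diagnostic, sketched here with a hypothetical `score_spread` helper, is to measure how far apart the best and worst retrieved scores are; a tiny spread suggests the retriever is not separating relevant from irrelevant passages for this query.

```python
from typing import Sequence


def score_spread(doc_scores: Sequence[float]) -> float:
    """Difference between the highest and lowest retrieval score."""
    if not doc_scores:
        raise ValueError("doc_scores is empty")
    return max(doc_scores) - min(doc_scores)


# Scores copied from the first question's output above.
scores = [
    0.6599020481483923,
    0.6590974522316406,
    0.6590627895815472,
    0.6582378594671305,
    0.6581166601685717,
    0.6573036236613612,
]
spread = score_spread(scores)
print(f"best={max(scores):.4f} worst={min(scores):.4f} spread={spread:.4f}")
```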




In [None]:
#####################################################
## inserting new knowledge to the knowledge store ##
#####################################################

def insert_doc(doc_content):
  dicts = [
      {
          'content': f"{doc_content}",
          'meta': {'name': f"{doc_content[0:100]}"}
      }
  ]

  document_store.write_documents(dicts)
  document_store.update_embeddings(retriever, update_existing_embeddings=False)
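
The dictionary built inside insert_doc can be factored into a small pure function, which makes it easy to check before anything is written to the store. This is a sketch of one possible variant, not the notebook's method: `build_doc_dicts` is a hypothetical name, and unlike the original it strips surrounding whitespace and collapses line breaks in the name so that a multi-line document still gets a one-line title.

```python
from typing import Dict, List


def build_doc_dicts(doc_content: str, name_length: int = 100) -> List[Dict]:
    """Build the list-of-dicts payload with 'content' and a short 'meta.name'."""
    text = doc_content.strip()
    if not text:
        raise ValueError("doc_content is empty")
    # Collapse all whitespace runs so the name fits on one line, then truncate.
    name = " ".join(text.split())[:name_length]
    return [{"content": text, "meta": {"name": name}}]


print(build_doc_dicts("Louis Liu is a capability area lead at ANZ bank"))
```

The returned list has the same shape as the `dicts` variable above, so it could be passed to document_store.write_documents in its place.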

In [None]:
new_doc = "Louis Liu is a capability area lead at ANZ bank"

In [None]:
insert_doc(new_doc)

Writing Documents:   0%|          | 0/1 [00:00<?, ?it/s]

INFO:haystack.document_stores.faiss:Updating embeddings for 2358 docs...


Updating Embedding:   0%|          | 0/2358 [00:00<?, ? docs/s]

In [None]:
# use OpenAI (GPT-3)

import os
import openai
# read the API key from the environment instead of hard-coding it in the notebook
openai.api_key = os.environ["OPENAI_API_KEY"]

In [None]:
def openai_quesetion_ans(prompt):
    # query text-davinci-003
    res = openai.Completion.create(
        engine='text-davinci-003',
        prompt=prompt,
        temperature=0,
        max_tokens=2000,
        top_p=1,
        frequency_penalty=0.5,
        presence_penalty=0,
        stop=None
    )
    return res['choices'][0]['text'].strip()

In [None]:
def create_context_prompt(query, context):
  limit = 3750
  # build our prompt with the retrieved contexts included
  prompt_start = (
      "Answer the question based on the context below.\n\n"+
      "Context:\n ''' "
  )
  prompt_end = (
      f"''' \n\nQuestion: {query}\nAnswer:"
  )
  contexts = context
  prompt_context = prompt_start + "\n\n---\n\n" + contexts + prompt_end
  #print(prompt_context)
  return prompt_context
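
The limit variable above is assigned but never applied. A minimal sketch of how such a limit could be enforced follows, assuming the limit counts characters of context and that the contexts arrive as a list of passages; `build_prompt` is a hypothetical alternative, not the function used in the cells below.

```python
from typing import List


def build_prompt(query: str, contexts: List[str], limit: int = 3750) -> str:
    """Build a context prompt, adding passages in order until the character limit is hit."""
    prompt_start = "Answer the question based on the context below.\n\nContext:\n"
    prompt_end = f"\n\nQuestion: {query}\nAnswer:"
    separator = "\n\n---\n\n"

    kept: List[str] = []
    used = 0
    for passage in contexts:
        # A separator is only needed between passages, not before the first one.
        cost = len(passage) + (len(separator) if kept else 0)
        if used + cost > limit:
            break
        kept.append(passage)
        used += cost
    return prompt_start + separator.join(kept) + prompt_end


print(build_prompt("who is Louis Liu", ["passage one", "passage two"], limit=20))
```

Passages are kept in retrieval order, so the highest-ranked ones survive when the limit is tight.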

In [None]:
# doc search function
from haystack.utils import print_documents
from haystack.pipelines import DocumentSearchPipeline
p_retrieval = DocumentSearchPipeline(retriever)
res = p_retrieval.run(query="who is louis liu", params={"Retriever": {"top_k": 2}})
print_documents(res, max_text_len=512)


Query: who is louis liu

{   'content': '\n'
               '==Early life==\n'
               'Benioff was born David Friedman in New York City, to a Jewish '
               'family who emigrated from Austria, Romania, Germany, Poland '
               'and Russia. He is the son of Barbara (Benioff) and Stephen '
               'Friedman, who is a former head of Goldman Sachs. He is a '
               'distant cousin of Salesforce founder Marc Benioff. As an '
               "adult, he uses the last name Benioff, his mother's maiden "
               'name, to avoid confusion with other writers named David '
               'Friedman. He is the youngest of three children (Suzy, '
               'Caroline, and David) and grew up ...',
    'name': '33_David_Benioff.txt'}

{   'content': '\n'
               '==Television adaptation==\n'
               'Jason Momoa plays the role of Drogo in the television series.\n'
               'Khal Drogo is played by Jason Momoa in the television '
   

In [None]:
def context_search(question_text):
  # retrieve the top 3 passages and concatenate their text into one context string
  p_retrieval = DocumentSearchPipeline(retriever)
  res = p_retrieval.run(query=question_text, params={"Retriever": {"top_k": 3}})
  #print_documents(res, max_text_len=512, print_meta=True)
  # every passage is prefixed with a blank line
  return "".join("\n\n" + doc.content for doc in res['documents'])

In [None]:
question_text = "who is Louis Liu"



In [None]:
full_context = context_search(question_text)
context_question = create_context_prompt(question_text, full_context)
ns = openai_quesetion_ans(context_question)
print(ns)


• Ensure that every log line produced contains a timestamp with a time zone offset from UTC.
• Ensure that the application logs a 128 bit traceID associated with a particular request flow.
• Ensure that the application can receive and propagate a traceID from request headers. 
• Ensure log files are produced according to a defined schema. 
• Ensure the application can provide logs to a central log aggregator in a consistent, plain-text format. 
• Ensure that each log entry records the log level that generated that particular line (see Appendix A for suggested outputs at each level). 
• Ensure that all application logs are able to provide their application name and/or their executable. 
• Ensure that application logs do not contain sensitive data. The information produced in the log must be treated the same as the data within the system. If data is masked or treated as confidential internally, it should be treated similarly inside of logs as well. Additionally, any data which could be u

In [None]:
ns = openai_quesetion_ans(question_text)
print(ns)

Louis Liu is a Chinese-American entrepreneur and investor. He is the founder and CEO of the venture capital firm, Lightspeed Venture Partners, which has invested in companies such as Snapchat, Affirm, and AppDynamics.


In [None]:
doc_dir = '/content/data/'
# Convert files to dicts
docs = convert_files_to_docs(dir_path=doc_dir, clean_func=clean_wiki_text, split_paragraphs=True)

from haystack.nodes import PreProcessor
pre_processor = PreProcessor(
    clean_empty_lines=True,
    clean_whitespace=True,
    clean_header_footer=True,
    split_by="word",
    split_length=400,
    split_respect_sentence_boundary=True,
    split_overlap=0)



INFO:haystack.utils.preprocessing:Converting /content/data/test2 Application Logging Standard Ling.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/424_Night_King.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/86_Game_of_Thrones__season_4_.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/10_Beyond_the_Wall__Game_of_Thrones_.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/51_Iron_Throne__A_Song_of_Ice_and_Fire_.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/408_The_Last_of_the_Starks.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/369_Samwell_Tarly.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/334_Rickon_Stark.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/118_Dark_Wings__Dark_Words.txt
INFO:haystack.utils.preprocessing:Converting /content/data/GOT/444_Cripples__Bastards__and_Broken_Things.txt
INFO:haystack.utils.preprocessing:Convert

In [None]:
docs = pre_processor.process(docs)

Preprocessing:   0%|          | 0/2498 [00:00<?, ?docs/s]



In [None]:
docs = pre_processor.process(docs)
# Now, let's write the dicts containing documents to our DB.
document_store.write_documents(docs)
document_store.update_embeddings(retriever)

Preprocessing:   0%|          | 0/2797 [00:00<?, ?docs/s]



Writing Documents:   0%|          | 0/2797 [00:00<?, ?it/s]

INFO:haystack.document_stores.faiss:Updating embeddings for 3224 docs...


Updating Embedding:   0%|          | 0/3224 [00:00<?, ? docs/s]

Create embeddings:   0%|          | 0/3232 [00:00<?, ? Docs/s]

In [None]:
question_text = "what are the minimum, mandatory requirements in application logging standard, list in dot point"
full_context = context_search(question_text)
context_question = create_context_prompt(question_text, full_context)
ns = openai_quesetion_ans(context_question)
print(ns)

* Ensure that every log line produced contains a timestamp with a time zone offset from UTC.
* Ensure that the application logs a 128 bit traceID associated with a particular request flow.
* Ensure that the application can receive and propagate a traceID from request headers.
* Distributed Tracing Pattern Headers
* Ensure that log files are produced according to a defined schema. 
* Ensure the application can provide logs to a central log aggregator in a consistent, plain-text format.
* Ensure that each log entry records the log level that generated that particular line (see Appendix A for suggested outputs at each level).
* Ensure that all application logs are able to provide their application name and/or their executable. 
* Ensure that application logs do not contain sensitive data.


In [None]:
full_context



In [None]:
full_txt = """

Guides home 
Guide 1: Get started with Confluence 
A brief overview of Confluence 
Set up your site and spaces 
Create and collaborate on content 
Navigate Confluence 
Confluence best practices 
Create compelling contentStay organizedIncrease collaboration and engagement
Guide 2: Extend the functionality of Confluence 
Confluence integrations 
Using Confluence and Jira software together 
Using Confluence and Jira Service Management Together 
Confluence Use Cases 
Guide 3: Best practices for getting the most out of Confluence 
Guide 4: Knowledge management using Confluence 
Confluence best practices 
Teams love Confluence because it's flexible - it's easy to customize for any organization. To 
help you navigate the many powerful ways of using Confluence, we've curated a collection 
of best practices.
Before you dive in, sign up for your free Confluence Cloud site and read the following 
sections: set up your site and spaces, create content in Confluence, and navigate Confluence.
 
Create compelling content 
You work hard, and you want the content you create to reflect your efforts. These resources 
will help you create beautiful, compelling content that does just that.
Cloud
Create beautiful pages in Confluence 
In just under 4 mins, this video demonstrates how to create a page in Confluence from scratch 
or using best practice templates. The techniques you learn here will help you create beautiful 
Confluence pages that stand out from the crowd.
Watch video 
Cloud
Format your page 
Learn how to use the Confluence Cloud editor to apply formatting to text, change the page 
layout, and add tables, media, and links to your content.
Read tutorial 
Cloud
Start with a page template 
Confluence can be used for all kinds of work - meeting notes, project plans, onboarding 
guides, and beyond. Page templates kick-start your next project with industry expertise 
automatically infused.
Browse templates 
cloud / data center / server
Build your own custom template 
Page templates help you keep content consistent across your team or org. In this blog post, 
you'll learn how you can use the features of Confluence to make better templates.
Read blog 
 
Stay organized 
As your team grows and more people start using Confluence to share information, keeping 
things organized can become more of a challenge. Use these resources to create and maintain 
an effective organizational strategy for your team's content.
cloud / data center / server
Build an information architecture that scales 
Learn how to categorize your content with an information architecture based on your current 
and future organizational needs.
Read blog 
cloud / data center / server
Keep your site organized like Marie Kondo 
Follow these four steps to become the Marie Kondo of Confluence. This blog post will show 
you how to find a place for every page on your site, keep it organized, and get rid of content 
that no longer sparks joy.
Read blog 
cloud
Eliminate clutter by archiving old pages 
Learn how archiving old pages can help you keep your page tree neat, make it easier for 
people to find what they need, and give them confidence that the information they find is 
accurate and up to date.
Read article 
Cloud / data center / server
Excel in Confluence content management 
Learn how to guide end users to build their own easy-to-use, well-formatted content. This 
training will cover templates, page and space structure, and content management best 
practices.
Start training 
 
Increase collaboration and engagement 
Celebrate your wins, share in one another's challenges, and unite as a team behind shared 
goals. When your team feels they have a voice and a sense of ownership in your company's 
success, you'll be blown away by their passion, commitment, and productivity. These 
resources will help you increase employee engagement and foster better collaboration among 
team members.
cloud / data center / server
Create transparency at work 
Learn why transparency impacts organizational effectiveness and employee happiness. 
Leverage these strategies to make your culture more open, transparent, and collaborative.
Read blog 
cloud / data center / server
Increase employee engagement 
Learn what employee engagement is, why it's important, and how you can keep employees 
engaged and productive.
Read article 
cloud / data center / server
Use internal blogging to foster an open culture 
Learn how internal blogging can promote knowledge sharing, increase productivity, build 
employee bonds, and open up your workplace culture.
Read blog 
cloud
How to measure engagement with analytics 
Follow step-by-step instructions to analyze your Confluence instance, including number of 
views, most active readers and contributors, most active users, most popular spaces, and 
common searches.
Read tutorial 
Navigate Confluence 
Learn how to navigate Confluence Cloud so you can find the information you need quickly.
Read Part 4 
CONFLUENCE APPS AND INTEGRATIONS 
Learn how to integrate Confluence with your favorite business tools.
"""

In [None]:
insert_doc(full_txt)

Writing Documents:   0%|          | 0/1 [00:00<?, ?it/s]

INFO:haystack.document_stores.faiss:Updating embeddings for 2360 docs...


Updating Embedding:   0%|          | 0/2360 [00:00<?, ? docs/s]

In [None]:
question_text = "what can confluence do"

In [None]:
full_context = context_search(question_text)
context_question = create_context_prompt(question_text, full_context)
ns = openai_quesetion_ans(context_question)
print(ns)

Confluence is a collaboration software that can be used to create, share, and organize documents and other content. It can be used to create wikis, blogs, and other types of content.


In [None]:
full_context

'\n===Writing===\n"Two Swords" was written by executive producers David Benioff and D. B. Weiss, based on George R. R. Martin\'s original work from his novel \'\'A Clash of Kings\'\', chapters Sansa II and Sansa VIII, and \'\'A Storm of Swords\'\', chapters Jaime VII, Tyrion V, Tyrion IV, Jon VI, Jon IX, Daenerys V, and Arya XIII. Arya\'s revenge against Lommy\'s murderer using the same method of killing was depicted in \'\'The Winds of Winter\'\' chapter Mercy.The \'\'\'Dothraki language\'\'\' is a constructed fictional language in George R. R. Martin\'s fantasy novel series \'\'A Song of Ice and Fire\'\' and its television adaptation \'\'Game of Thrones\'\'. It is spoken by the Dothraki, a nomadic people in the series\'s fictional world. The language was developed for the TV series by the linguist David J. Peterson, working off the Dothraki words and phrases in Martin\'s novels.\n, the language comprised 3163 words, not all of which have been made public. In 2012, 146 newborn girls i