# Demo of the unofficial Python SDK for [Vectara](https://vectara.com)'s RAG platform

For questions, ask forrest@vectara.com 

In [1]:
import vectara

In [13]:
# Get some test data 
!mkdir testdoc 
!wget https://www.cs.jhu.edu/~jason/papers/mei+al.icml20.pdf -O testdoc/neural_datalog_through_time.pdf 
!wget https://docs.vectara.com/assets/files/vectara_employee_handbook-4524365135dc70a59977373c37601ad1.pdf -O testdoc/vectara.pdf
!wget https://raw.githubusercontent.com/TexteaInc/funix-doc/main/Reference.md -O testdoc/funix.md
# !wget https://raw.githubusercontent.com/tangxyw/RecSysPapers/main/Calibration/Posterior%20Probability%20Matters%20-%20Doubly-Adaptive%20Calibration%20for%20Neural%20Predictions%20in%20Online%20Advertising.pdf -O testdoc/Calibration.pdf


mkdir: cannot create directory ‘testdoc’: File exists
--2023-11-21 18:37:52--  https://www.cs.jhu.edu/~jason/papers/mei+al.icml20.pdf
Resolving www.cs.jhu.edu (www.cs.jhu.edu)... 128.220.13.64
Connecting to www.cs.jhu.edu (www.cs.jhu.edu)|128.220.13.64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2087657 (2.0M) [application/pdf]
Saving to: ‘testdoc/neural_datalog_through_time.pdf’


2023-11-21 18:37:53 (3.89 MB/s) - ‘testdoc/neural_datalog_through_time.pdf’ saved [2087657/2087657]

--2023-11-21 18:37:53--  https://docs.vectara.com/assets/files/vectara_employee_handbook-4524365135dc70a59977373c37601ad1.pdf
Resolving docs.vectara.com (docs.vectara.com)... 2600:1f18:16e:df01::64, 2600:1f18:2489:8200::c8, 54.161.234.33, ...
Connecting to docs.vectara.com (docs.vectara.com)|2600:1f18:16e:df01::64|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 53575 (52K) [application/pdf]
Saving to: ‘testdoc/vectara.pdf’


2023-11-21 18:37:53 (386 KB/s

# Create a client object 

By default, the constructor will look for the following environment variables:
* VECTARA_CUSTOMER_ID
* VECTARA_CLIENT_ID
* VECTARA_CLIENT_SECRET

Alternatively, pass them to the constructor explicitly, as in the example below.

In [2]:
from keys import customer_id, client_id, client_secret
# keys.py is as follows
# customer_id = 'your customer id'
# client_id = 'your client id'
# client_secret = 'your client secret'

client = vectara.vectara(customer_id, client_id, client_secret)
# If the arguments are omitted, the constructor falls back to the environment variables listed above.

Bearer/JWT token generated. It will expire in 30 minutes. To-regenerate, please call acquire_jwt_token(). 
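
If you prefer the environment-variable route, it can help to check that all three variables are set before constructing the client. The following is a minimal sketch: `load_vectara_credentials` is a hypothetical helper and not part of the SDK; only the three variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ("VECTARA_CUSTOMER_ID", "VECTARA_CLIENT_ID", "VECTARA_CLIENT_SECRET")


def load_vectara_credentials(env=None):
    """Return (customer_id, client_id, client_secret) read from the environment.

    Hypothetical helper, not part of the SDK. Raises KeyError naming every
    missing variable, so a misconfiguration is reported in one go.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)
```

The returned tuple is ordered to match the positional constructor call shown above, so `vectara.vectara(*load_vectara_credentials())` should be equivalent to passing the three values by hand.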


# Create a corpus

In [4]:
corpus_id = client.create_corpus("test_corpus") 

# Reset Corpus (when needed)

In [14]:
# corpus_id = 9  # set the corpus ID manually here if needed
client.reset_corpus(corpus_id)

Resetting corpus 9 successful. 


# Add files to a corpus

Use the `upload()` method to upload a single file, a list of files, or a folder to a corpus. It detects the type of source it is given and dispatches to one of the three methods below:
* `upload_file()`: upload a single file
* `upload_files()`: upload a list of files
* `upload_folder()`: upload all files in a folder

If you already know the type of source, you can also call any of the three methods above directly.
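
To make the dispatch concrete, here is a rough sketch of how a source could be mapped to one of the three methods. It is an illustration only: `pick_upload_method` is a hypothetical function, and the rules it applies (a list or tuple means several files, an existing directory means a folder, anything else is treated as a single file) are assumptions, not the SDK's actual logic.

```python
import os


def pick_upload_method(source):
    """Return the name of the method that upload() would plausibly delegate to.

    Assumed rules (not confirmed by the SDK):
    - a list or tuple of paths -> upload_files
    - a path to an existing directory -> upload_folder
    - any other path -> upload_file
    """
    if isinstance(source, (list, tuple)):
        return "upload_files"
    if os.path.isdir(source):
        return "upload_folder"
    return "upload_file"
```

Under these assumptions, `pick_upload_method('./testdoc')` selects `upload_folder` when the folder exists, which matches the "Uploading files from folder" message printed by the cell below.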

In [15]:
corpus_id = 9 # manually set corpus_id if needed. 
client.upload(corpus_id, './testdoc', verbose=True)

UPOload switch: ./testdoc
Uploading files from folder: ./testdoc


Uploading...:   0%|          | 0/3 [00:00<?, ?it/s, ./testdoc/neural_datalog_through_time.pdf]

Uploading invidiv8id
Uploading..../testdoc/neural_datalog_through_time.pdf 

Uploading...:  33%|███▎      | 1/3 [00:09<00:19,  9.54s/it, ./testdoc/funix.md]                       

Success. 
Uploading invidiv8id
Uploading..../testdoc/funix.md 

Uploading...:  67%|██████▋   | 2/3 [00:11<00:05,  5.18s/it, ./testdoc/vectara.pdf]

Success. 
Uploading invidiv8id
Uploading..../testdoc/vectara.pdf 

Uploading...: 100%|██████████| 3/3 [00:14<00:00,  4.69s/it, ./testdoc/vectara.pdf]

Success. 





# Query a corpus and pretty-print the results

## Example query 1

In [16]:
answer = client.query(corpus_id, "What should I do to rearrange objects?")
_ = vectara.post_process_query_result(answer, jupyter_display=True)

Query successful. 


### Here is the answer
To rearrange objects, you can follow these steps: 
1. Initialize the embeddings of the objects to 0 and recompute them in parallel for a certain number of iterations [1].
2. Change the order and orientation of the objects using the "direction" attribute in a Funix decorator [3].
3. Consider the topological order of the objects and wait until the upstream nodes have "converged" before working on a specific component [5].
4. Embeddings of entities and relations that reflect selected past events can also aid in rearranging objects [4]. 

Please note that these steps are based on the search results provided and may not cover all possible approaches.

### References:
    
1. From document **neural_datalog_through_time.pdf** (matchness=0.65684634):
  _...This method recomputes all embeddings in parallel,
and repeats this for some number of iterations...._

2. From document **neural_datalog_through_time.pdf** (matchness=0.6553048):
  _...Within each strongly connected component C, ini-
tialize the embeddings to 0 and then recompute them in
parallel for |C| iterations...._

3. From document **funix.md** (matchness=0.65107906):
  _...You can change their order and orientation using the "direction" attribute in a Funix decorator...._

4. From document **neural_datalog_through_time.pdf** (matchness=0.6380951):
  _...® Embeddings of entities and relations
that reﬂect selected past events (§2.4 and §2.6)...._

5. From document **neural_datalog_through_time.pdf** (matchness=0.6360733):
  _...In the general case, visiting the com-
ponents in topologically sorted order means that we wait to
work on component C until its strictly upstream nodes have
“converged,” so that the limited iterations on C make use of
the best available embeddings of the upstream nodes...._


## Example query 2

In [17]:
answer = client.query(corpus_id, "Can I bring friends to the office?")
_ = vectara.post_process_query_result(answer, jupyter_display=True)

Query successful. 


### Here is the answer
Yes, you can bring friends to the office. Some companies even have special events like "Furry Friend Fridays" [1] where you can introduce your pets via video conference. However, it is important to understand any specific policies or peculiarities regarding bringing friends to the office [2]. Additionally, there may be team-building activities or interactions involving exotic pets to enhance communication and inject fun into the workplace [3]. These interactions can add a colorful flair to team meetings [4]. Overall, while the sentiment of bringing friends to the office is appreciated, it is advisable to be aware of any rules or guidelines set by your company [2].

### References:
    
1. From document **vectara.pdf** (matchness=0.6796313):
  _...monthly "Furry Friend Fridays" where you can introduce your pets via video conference...._

2. From document **vectara.pdf** (matchness=0.63029546):
  _...However, before you bring in your pet parrot or rescue raven, there are some
peculiarities to this policy that you must understand...._

3. From document **vectara.pdf** (matchness=0.62800384):
  _...The Team-building Adventures: From velociraptor training simulations to bear dance-offs, our
exotic pet interactions are designed to build teamwork, enhance communication, and inject fun
into the workplace...._

4. From document **vectara.pdf** (matchness=0.6266819):
  _...Plus, they add a colorful ﬂair to team meetings...._

5. From document **vectara.pdf** (matchness=0.6225399):
  _...We appreciate the sentiment, and we're sure your furry
friends are delightful...._


## Example query 3

In [18]:
answer = client.query(corpus_id, "How to set the frequency?")
_ = vectara.post_process_query_result(answer, jupyter_display=True)

Query successful. 


### Here is the answer
The returned results did not contain sufficient information to be summarized into a useful answer for your query. Please try a different search or restate your query differently.

### References:
    
1. From document **neural_datalog_through_time.pdf** (matchness=0.684738):
  _...We take λh(t) to be the
(Poisson) intensity of h at time t:  that is, it models the
limit as dt → 0+ of the expected rate of h on the interval
[t, t + dt) (i.e., the expected number of occurrences of h
divided by dt)...._

2. From document **neural_datalog_through_time.pdf** (matchness=0.6802496):
  _...In the continuous-time case, we evaluate (8) at
                rm
time s to obtain [h]<-   ∈ R7Dh (so Wr needs to have more
                                                  rm
rows), and accordingly obtain 7 vectors in (0, 1)Dh,..._

3. From document **neural_datalog_through_time.pdf** (matchness=0.67772794):
  _...We then set (f ; i; z) =def
σ([h]<-  )...._

4. From document **funix.md** (matchness=0.6768694):
  _...There will be a radio box on the front end for the user to switch between the two display options at any time, and the JSON Viewer will be used by default...._

5. From document **funix.md** (matchness=0.6706363):
  _...If sessions are not properly maintained, you can use two Funix functions to manually set and get a session-level global variable...._
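
The reference layout shown in the outputs above (a numbered list with document name, score, and snippet) can be reproduced with a few lines of string formatting. This is only a sketch: the structure of the object returned by `client.query()` is not shown here, so `render_references` is a hypothetical function that takes plain `(document, score, snippet)` tuples as an assumed input shape.

```python
def render_references(references):
    """Format (document, score, snippet) tuples as a numbered Markdown list.

    Hypothetical helper; the tuple layout is an assumption and does not
    reflect the SDK's real return type.
    """
    lines = ["### References:", ""]
    for index, (document, score, snippet) in enumerate(references, start=1):
        lines.append(f"{index}. From document **{document}** (matchness={score}):")
        lines.append(f"  _...{snippet}..._")
        lines.append("")  # blank line between entries
    return "\n".join(lines)
```

In a notebook, the resulting string could be passed to `IPython.display.Markdown` to render it, which is presumably similar to what `jupyter_display=True` does.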
