# Demo

## Instantiate database

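Create a database connection as shown below. Called with no arguments, `DBConnection()` uses an embedded Weaviate instance, as the output confirms.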

In [1]:
import distyll
db = distyll.DBConnection()

embedded weaviate is already listening on port 6666


## YouTube video example

### Add data

Let's add data from a YouTube video.

In [2]:
youtube_url = "https://youtu.be/sNw40lEhaIQ"

In [3]:
db.add_from_youtube(youtube_url)

52

### Query data

Now we can query it, first with `query_summary` and then with `query_chunks`, which also takes a `search_query`.

In [4]:
response = db.query_summary(
    prompt="In bullet points, tell me what this material describes",
    object_path=youtube_url
)

In [5]:
print(response.generated_text)

- The material is part of a series discussing contextual representations, specifically focusing on the GPT transformer-based architecture.
- It covers topics such as autoregressive loss function, token representation, hidden representation, language modeling, transformer architecture, and masking.
- The concept of self-attention and its role in allowing the model to look back at previous positions in a sequence when making predictions is explained.
- The training process for a GPT-style model using "teacher forcing" is discussed, highlighting its significance.
- The material briefly touches on the process of sequence generation and different strategies for sampling tokens.
- It mentions that there are different versions of GPT and alternative models available.
- The information provided is based on the text and may vary or become outdated over time.


In [6]:
response = db.query_chunks(
    prompt="In bullet points, tell me what this material describes",
    search_query="open source models",
    object_path=youtube_url,
)

In [7]:
print(response.generated_text)

- The material describes different models in the open source side of language processing.
- It mentions the Bloom model with 176 billion parameters.
- It discusses the GPT-3 paper and intermediate model sizes.
- It explains the process of language modeling and token prediction.
- It mentions smaller models in the GPT mode.
- It describes the structure and parameters of the GPT models.
- It discusses the use of the model's predictions in training.
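
The two calls above differ in one argument: `query_chunks` takes a `search_query` in addition to the `prompt`. Judging only from the parameter names and the outputs, `search_query` seems to narrow down which stored chunks reach the language model before the prompt is applied. The toy sketch below illustrates that retrieve-then-prompt idea with plain word overlap. It is not distyll's implementation (which presumably relies on Weaviate's search), and `tokenize`, `top_chunks`, and the sample chunks are hypothetical.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_chunks(chunks: list[str], search_query: str, limit: int = 2) -> list[str]:
    """Return up to `limit` chunks that share the most words with the query.

    Hypothetical stand-in for a real search: chunks with no overlap are dropped.
    """
    query_words = tokenize(search_query)
    scored = [(len(query_words & tokenize(chunk)), chunk) for chunk in chunks]
    # Sort by overlap, highest first; sorted() is stable, so ties keep input order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [chunk for _, chunk in ranked[:limit]]


# Made-up chunks, for illustration only.
chunks = [
    "Teacher forcing feeds the gold token at each training step.",
    "On the open source side there are several large models.",
    "Masking prevents attention to future positions.",
]
selected = top_chunks(chunks, "open source models")
```

Only the selected chunks would then be handed to the model together with the prompt, which is consistent with the narrower answer from `query_chunks` compared with `query_summary`.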


## arXiv example

In [9]:
# pdf_url = 'https://arxiv.org/pdf/1706.03762'
pdf_url = 'https://arxiv.org/pdf/2305.15334'
db.add_pdf(pdf_url)

243
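
When loading several sources, a small dispatcher can route each URL to the matching ingestion method. This is a minimal sketch that assumes only the two methods used in this demo, `add_from_youtube` and `add_pdf`. The `add_source` helper and the host list are hypothetical, and treating every non-YouTube URL as a PDF is a simplification.

```python
from urllib.parse import urlparse

# Hosts treated as YouTube links (an illustrative, non-exhaustive list).
YOUTUBE_HOSTS = {"youtu.be", "youtube.com", "www.youtube.com"}


def add_source(db, url: str):
    """Send `url` to db.add_from_youtube or db.add_pdf, based on its host.

    Returns whatever the underlying method returns.
    Assumption: anything that is not a YouTube link is a PDF URL.
    """
    host = urlparse(url).netloc.lower()
    if host in YOUTUBE_HOSTS:
        return db.add_from_youtube(url)
    return db.add_pdf(url)
```

With the connection from this notebook, usage would look like `add_source(db, "https://arxiv.org/pdf/2305.15334")`.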

In [10]:
response = db.query_summary(
    prompt="In bullet points, tell me what this material describes",
    object_path=pdf_url
)
print(response.generated_text)

- The development of a model called Gorilla
- Comparison of Gorilla's performance to GPT-4
- Gorilla's adaptability to document changes and ability to mitigate hallucination issues
- Introduction of the APIBench dataset for evaluating LLMs' accuracy in using APIs
- Integration of a retrieval system with Gorilla to improve LLMs' accuracy in using tools and updated documentation
- Focus on enhancing the effectiveness and adaptability of LLMs in using APIs
- Training with constraints
- Different retrieval techniques
- Impact of using optimal retrievers
- Suggestion of using a better retriever for finetuning, but zero-shot finetuning as an alternative when a good retriever is not available
- Mention of program synthesis and neural networks in program synthesis
- Application of language models in various tasks.


In [11]:
response = db.query_chunks(
    prompt="how does gorilla work?",
    search_query="gorilla algorithm",
    object_path=pdf_url,
)
print(response.generated_text)

Gorilla is a model that generates reliable API calls to machine learning (ML) models without hallucination. It has been designed to adapt to test-time API usage changes and can satisfy constraints while picking APIs. Gorilla's performance surpasses the state-of-the-art Language Model (LLM) GPT-4 in three massive datasets that were collected. 

Gorilla is a retrieve-aware finetuned LLaMA-7B model specifically for API calls. It can be combined with a document retriever to adapt to test-time document changes, allowing for flexible user updates or version changes. The model has been trained to understand and reason about constraints.

The integration of a retrieval system with Gorilla demonstrates the potential for LLMs to use tools more accurately, keep up with frequently updated documentation, and increase the reliability and applicability of their outputs.

More information about Gorilla can be found in the source: https://arxiv.org/pdf/2305.15334


In [12]:
response = db.query_chunks(
    prompt="how does gorilla reduce hallucination in LLMs?",
    search_query="gorilla algorithm",
    object_path=pdf_url,
)
print(response.generated_text)

Gorilla reduces hallucination in LLMs (Large Language Models) by generating reliable API calls without hallucination. It surpasses the state-of-the-art LLM (GPT-4) in three massive datasets. Gorilla also demonstrates an impressive capability to adapt to test-time API usage changes and can satisfy constraints while picking APIs. Additionally, when combined with a document retriever, Gorilla shows a strong capability to adapt to test-time document changes, enabling flexible user updates or version changes. The model is retrieve-aware and finetuned specifically for API calls. Gorilla's performance is evaluated and compared with other models, and it is shown to outperform them in various configurations. The training of Gorilla enables the model to adapt to changes in API documentation, and it also demonstrates the ability to understand and reason about constraints. The successful integration of the retrieval system with Gorilla increases the reliability and applicability of its outputs.


## Notes

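Optionally, you can connect to a specific Weaviate instance by passing your own client to `DBConnection`. The cell below is commented out; to use it, uncomment it and set the environment variables it reads.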

In [13]:
# import weaviate
# import os
# client = weaviate.Client(
#     url=os.environ['JP_WCS_URL'],
#     auth_client_secret=weaviate.AuthApiKey(os.environ['JP_WCS_ADMIN_KEY']),
#     additional_headers={
#         'X-OpenAI-Api-Key': os.environ['OPENAI_APIKEY']
#     }
# )
#
# import distyll
# db = distyll.DBConnection(client=client)
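
The commented-out cell reads three environment variables and would raise a `KeyError` on the first one that is missing. A minimal sketch of a pre-flight check, assuming the same variable names as above; the `missing_env_vars` helper is hypothetical and not part of distyll or the Weaviate client.

```python
import os

# Variable names taken from the cell above.
REQUIRED_VARS = ("JP_WCS_URL", "JP_WCS_ADMIN_KEY", "OPENAI_APIKEY")


def missing_env_vars(names=REQUIRED_VARS, env=None):
    """Return the names that are unset or empty in `env` (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Set these variables before creating the client:", ", ".join(missing))
```

Running this before building the client reports every missing variable at once instead of failing on the first lookup.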