Context Retrieval - scratch PR#3

Open
aidando73 wants to merge 13 commits into main from aidand-444-context-retrieval

Conversation

@aidando73
Owner

No description provided.

Comment thread session_turn.py
Comment thread faiss_index.py
Owner Author


self.index = faiss.IndexFlatL2(dimension)

Owner Author

@aidando73 aidando73 Nov 30, 2024


index=await FaissIndex.create(
    ALL_MINILM_L6_V2_DIMENSION, self.kvstore, bank.identifier
),

ALL_MINILM_L6_V2_DIMENSION = 384
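faiss.IndexFlatL2 is an exhaustive scan over squared L2 distance between the query embedding and every stored vector. As a numpy-only sketch of the same computation (flat_l2_search is illustrative, not part of the FAISS API):

```python
import numpy as np

ALL_MINILM_L6_V2_DIMENSION = 384  # all-MiniLM-L6-v2 produces 384-dim embeddings


def flat_l2_search(stored: np.ndarray, query: np.ndarray, k: int):
    """Exhaustive k-nearest-neighbor search over squared L2 distance,
    mirroring what faiss.IndexFlatL2 does internally."""
    dists = ((stored - query) ** 2).sum(axis=1)  # squared L2 to every stored vector
    order = np.argsort(dists)[:k]                # positions of the k smallest
    return dists[order], order


rng = np.random.default_rng(0)
stored = rng.random((10, ALL_MINILM_L6_V2_DIMENSION)).astype(np.float32)
query = stored[3] + 0.001  # a near-duplicate of row 3

distances, indices = flat_l2_search(stored, query, k=5)
print(indices[0])  # 3, the nearest stored vector
```

Distances come back in ascending order, matching the "Distances to 5 nearest neighbors" dumps below.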

Comment thread inspect_faiss.py
if idx >= 0:  # Valid index
    chunk = json.loads(chunk_by_index[str(idx)])
    print(f"\nIndex {idx}:")
    print(f"Content: {chunk['content']}")
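The loop above can be sketched end-to-end as follows; chunk_by_index (a mapping from FAISS position to serialized chunk) and the chunk fields are assumptions inferred from the snippet, not the actual persisted format:

```python
import json

# Hypothetical mapping persisted alongside the index: FAISS position -> chunk JSON
chunk_by_index = {
    "0": json.dumps({"content": "Model Release Date: July 23, 2024.", "document_id": "llama_3.1.md"}),
    "1": json.dumps({"content": "Llama 3.2 Community License", "document_id": "llama_3.2.md"}),
}


def print_chunks(neighbor_indices):
    """Resolve FAISS neighbor positions back to their source chunks.
    FAISS pads missing neighbors with -1, hence the idx >= 0 guard."""
    for idx in neighbor_indices:
        if idx >= 0:  # valid index
            chunk = json.loads(chunk_by_index[str(idx)])
            print(f"\nIndex {idx}:")
            print(f"Content: {chunk['content']}")
            print(f"Document: {chunk['document_id']}")


print_chunks([1, 0, -1])
```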
Owner Author


"Llama 3.2 3B Instruct"

Corresponding document IDs:
Index 8 -> Document: llama_3.1.md
Index 97 -> Document: llama_3.2.md
Index 216 -> Document: llama_3.2_vision.md
Index 38 -> Document: llama_3.1.md
Index 164 -> Document: llama_3.2.md

Document chunks:

Index 8:
Content: _1/LICENSE](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go [here](https://

Index 97:
Content: .com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go [here](https

Index 216:
Content:  is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without insertion unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that power innovation and progress.

Testing: But Llama 3.2 is a new

Index 38:
Content: </strong>
   </td>
   <td><strong>Llama 3.1 8B Instruct</strong>
   </td>
   <td><strong>Llama 3 70B Instruct</strong>
   </td>
   <td><strong>Llama 3.1 70B Instruct</strong>
   </td>
   <td><strong>Llama 3.1 405B Instruct</strong>
   </td>
  </tr

Index 164:
Content:  meant to serve everyone, and to work for a wide range of use cases. It is thus designed to be accessible to people across many different backgrounds, experiences and perspectives. Llama 3.2 addresses users and their needs as they are, without insertion unnecessary judgment or normativity, while reflecting the understanding that even content that may appear problematic in some cases can serve valuable purposes in others. It respects the dignity and autonomy of all users, especially in terms of the values of free thought and expression that

Owner Author


"Instructions on how to provide feedback or comments"

Search results:
Indices of 5 nearest neighbors: [ 89 167 197 163  97]
Distances to 5 nearest neighbors: [1.306552  1.3460835 1.4211811 1.4505348 1.4800235]

Corresponding document IDs:
Index 89 -> Document: llama_3.1.md
Index 167 -> Document: llama_3.2.md
Index 197 -> Document: llama_3.2_vision.md
Index 163 -> Document: llama_3.2.md
Index 97 -> Document: llama_3.2.md

Document chunks:

Index 89:
Content:  [Responsible Use Guide](https://llama.meta.com/responsible-use-guide), [Trust and Safety](https://llama.meta.com/trust-and-safety/) solutions, and other [resources](https://llama.meta.com/docs/get-started/) to learn more about responsible development.


Index 167:
Content:  [resources](https://llama.meta.com/docs/get-started/) to learn more about responsible development.


Index 197:
Content:  following the best practices outlined in our Responsible Use Guide, you can refer to the [Responsible Use Guide](https://llama.meta.com/responsible-use-guide/) to learn more.

#### Llama 3.2 Instruct

Objective: Our main objectives for conducting safety fine-tuning are to provide the research community with a valuable resource for studying the robustness of safety fine-tuning, as well as to offer developers a readily available, safe, and powerful model for various applications to reduce the

Index 163:
Content: : Finally, we put in place a set of resources including an [output reporting mechanism](https://developers.facebook.com/llama_output_feedback) and [bug bounty program](https://www.facebook.com/whitehat) to continuously improve the Llama technology with the help of the community.

## Ethical Considerations and Limitations

Values: The core values of Llama 3.2 are openness, inclusivity and helpfulness. It is meant to serve everyone, and to work for a

Index 97:
Content: .com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go [here](https

Owner Author

@aidando73 aidando73 Nov 30, 2024


"What are some small Llama models I can run on small devices like my phone?"

Search results:
Indices of 5 nearest neighbors: [175   8  97 152  17]
Distances to 5 nearest neighbors: [1.0272269 1.0473894 1.050993  1.0618453 1.0983447]

Corresponding document IDs:
Index 175 -> Document: llama_3.2_vision.md
Index 8 -> Document: llama_3.1.md
Index 97 -> Document: llama_3.2.md
Index 152 -> Document: llama_3.2.md
Index 17 -> Document: llama_3.1.md

Document chunks:

Index 175:
Content:  the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use L

Index 8:
Content: _1/LICENSE](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go [here](https://

Index 97:
Content: .com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go [here](https

Index 152:
Content: B and 3B models are expected to be deployed in highly constrained environments, such as mobile devices. LLM Systems using smaller models will have a different alignment profile and safety/helpfulness tradeoff than more complex, larger systems. Developers should ensure the safety of their system meets the requirements of their use case. We recommend using lighter system safeguards for such use cases, like Llama Guard 3-1B or its mobile-optimized version.

### Evaluations

Scaled Evaluations: We built dedicated

Index 17:
Content:   </tr>
  <tr>
   <td>Llama 3.1 405B
   </td>
   <td>30.84M
   </td>
   <td>700
   </td>
   <td>8,930
   </td>
   <td>0
   </td>
  </tr>
  <tr>
   <td>Total
   </td>
   <td>39.3M
   <td>

Owner Author


"What about Llama 3.1 model, what is the release date for it?"

Search results:
Indices of 5 nearest neighbors: [  7   8  97 217 175]
Distances to 5 nearest neighbors: [0.5512124  0.6114483  0.6586282  0.6921291  0.72823375]

Corresponding document IDs:
Index 7 -> Document: llama_3.1.md
Index 8 -> Document: llama_3.1.md
Index 97 -> Document: llama_3.2.md
Index 217 -> Document: llama_3.2_vision.md
Index 175 -> Document: llama_3.2_vision.md

Document chunks:

Index 7:
Content:  for improved inference scalability.

Model Release Date: July 23, 2024.

Status: This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

License: A custom commercial license, the Llama 3.1 Community License, is available at: [https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE](https://github.com/meta-

Index 8:
Content: _1/LICENSE](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go [here](https://

Index 97:
Content: .com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go [here](https

Index 217:
Content:  But Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.

Index 175:
Content:  the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use L

Owner Author


"What is the name of the llama model released on October 24, 2024?"

Search results:
Indices of 5 nearest neighbors: [  7  97   8 175 217]
Distances to 5 nearest neighbors: [0.62626755 0.7692168  0.77628446 0.7984237  0.860503  ]

Corresponding document IDs:
Index 7 -> Document: llama_3.1.md
Index 97 -> Document: llama_3.2.md
Index 8 -> Document: llama_3.1.md
Index 175 -> Document: llama_3.2_vision.md
Index 217 -> Document: llama_3.2_vision.md

Document chunks:

Index 7:
Content:  for improved inference scalability.

Model Release Date: July 23, 2024.

Status: This is a static model trained on an offline dataset. Future versions of the tuned models will be released as we improve model safety with community feedback.

License: A custom commercial license, the Llama 3.1 Community License, is available at: [https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE](https://github.com/meta-

Index 97:
Content: .com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.2 in applications, please go [here](https

Index 8:
Content: _1/LICENSE](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/LICENSE)

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use Llama 3.1 in applications, please go [here](https://

Index 175:
Content:  the [Llama 3.2 Community License](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (a custom, commercial license agreement).

Feedback: Instructions on how to provide feedback or comments on the model can be found in the Llama Models [README](https://github.com/meta-llama/llama-models/blob/main/README.md). For more technical information about generation parameters and recipes for how to use L

Index 217:
Content:  But Llama 3.2 is a new technology, and like any new technology, there are risks associated with its use. Testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, Llama 3.2’s potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate, biased or other objectionable responses to user prompts. Therefore, before deploying any applications of Llama 3.

@aidando73
Owner Author

aidando73 commented Nov 30, 2024

What do we actually search on?

if isinstance(query, list):
    query_str = " ".join([_process(c) for c in query])
else:
    query_str = _process(query)
model = get_embedding_model(self.bank.embedding_model)
query_vector = model.encode([query_str])[0].astype(np.float32)

query = await generate_rag_query(
    memory.query_generator_config, messages, inference_api=self.inference_api
)

query:  You are a helpful assistant that can answer questions based on provided documents. Return your answer short and concise, less than 50 words. When was the Llama 3.2 family of models released? What about Llama 3.1 family of models, what is the release date for it?

It joins all of the message contents, including the system prompt, into a single query string.

"""
if config.type == MemoryQueryGenerator.default.value:
query = await default_rag_query_generator(config, messages, **kwargs)
print("query: ", query)
Owner Author

@aidando73 aidando73 Nov 30, 2024


For messages:

user_prompts = [
    "What is the name of the llama model released on October 24, 2024?",
    "What about Llama 3.1 model, what is the release date for it?",
]

This returns:

query: You are a helpful assistant that can answer questions based on provided documents. Return your answer short and concise, less than 50 words. What is the name of the llama model released on October 24, 2024? What about Llama 3.1 model, what is the release date for it?

