RAG Copilot Extension LP #1611

JoeStech · 2025-02-13T19:33:30Z

Before submitting a pull request for a new Learning Path, please review Create a Learning Path

[ X] I have reviewed Create a Learning Path

Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information.

[ X] I have checked my contribution for confidential information

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.

Co-authored-by: Joe Stech <JoeStech@users.noreply.github.com>

Initial Vector and GitHub writeup

…ot-extension-lp

…ment and configuration sections

annietllnd

Some comments from me!

annietllnd · 2025-02-17T12:52:33Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/_index.md

+    - A GitHub account.
+    - A Linux-based computer with Python installed.
+
+author_primary: Avin Zarlez, Joe Stech


Should be author and a JSON like list. This is currently breaking the site's building process. I try to run hugo before submitting a PR, just to check things aren't breaking

Ah this was different on Joe's fork, looks like it had been updated. Fixed.

annietllnd · 2025-02-17T12:56:16Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/1-rag.md

+
+## What is a RAG system?
+
+RAG stands for "Retrieval Augmented Generation". It describes an AI framework that combines information retrieval with text generation to improve the quality and accuracy of AI-generated content.


IMO the abbreviation should be presented in the introduction at its first mention

Added to introduction page

annietllnd · 2025-02-17T12:58:13Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/1-rag.md

+
+1. Retrieval: The system searches a knowledge base, usually using some combination of vector and/or text search.
+2. Augmentation: The retrieved information is then provided as context to a generative AI model to provide additional context for the user's query.
+3. The AI model uses both thye retrieved knowledge and its internal understanding to generate a more useful response to the user.


😱

Good catch, fixed

annietllnd · 2025-02-17T13:16:30Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/2-vector.md

+
+Then for any given vector (like the embedding of a question asked by a user) we can query our vector database to find embedded data that is most similar. 
+
+For example, for our use case let's say we want to know which Arm learning path is most relevant to a question a user asks.


"For example, for our use case" seems double, I'd use one of them

Changed wording

annietllnd · 2025-02-17T13:19:17Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/2-vector.md

+
+By copying the FAISS database into every deployment, we achieve a scalable, high-performance solution that can handle a large number of requests efficiently.
+
+## Collecting Data into Chunks


I'd put the git clone here, trying to keep commands closer to where they are used

annietllnd · 2025-02-17T14:11:25Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/2-vector.md

+
+Copy the generated `bin` and `json` files to the root directory of your Flask application.
+
+THey should be in the `vectorstore/chunks` folder. Since you are likely still in the `vectorstore` folder, run this command to copy:


annietllnd · 2025-02-17T14:12:39Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/7-testing.md

+layout: learningpathall
+---
+
+## Test it out


Change this to something less generic, it looks a lot like the title of this section

annietllnd · 2025-02-17T14:13:27Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/_index.md

+    - Configure a GitHub Copilot Extension for your RAG application.
+
+prerequisites:
+    - The "[Build a GitHub Copilot Extension in Python](../gh-copilot-simple/)" Learning Path.


IMO the LP titles renders more nicely without the quotes

Removed quotes

annietllnd · 2025-02-17T14:14:31Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/7-testing.md

+
+Another possibility is adding another copilot invocation to rephrase the previous conversation prior to your main copilot invocation. This yields more robust results, if users reference previous elements of the conversation in their question.
+
+You can precisely tailor your RAG extension to your use case, to make your extension as useful as possible.


I'd add something here to wrap things up

Added simple Conclusion

annietllnd · 2025-02-17T15:56:58Z

content/learning-paths/servers-and-cloud-computing/copilot-extension/_index.md

+    - conda
+    - AWS CDK
+operatingsystems:
+    - Linux, MacOS


Should be a JSON like list. It doesn't break anything this way but the mapping doesn't seem to work this way

annietllnd · 2025-02-19T08:40:58Z

Thanks Avin! 💪

pareenaverma · 2025-02-19T15:34:12Z

@madeline-underwood merging into main for your editorial review

JoeStech and others added 30 commits January 28, 2025 10:12

boilerplate for LP

f8c9312

todo file

2752db9

Init flow list

cd4a7d8

Co-authored-by: Joe Stech <JoeStech@users.noreply.github.com>

Tasks by person

80283c3

Init structure

c395e57

Init vector writeup

29dbbff

For commit signing

6339954

Iteration

96c1094

Trailing comma

61475cf

Github init

a04ec03

GitHub Steps and documentatiopn

786226e

More GitHub documentation

8f0d5b4

Wording tweak

0a3a1b3

Vector

6ec7c72

Tweaked wording

23984b7

Spelling

bde5cfe

add changes to todo

a9b7ceb

merge in main

b7a3959

fix merge conflict

38a6f53

Merge branch 'copilot-extension-lp' into avinz/copilot-extension/vector

f96a529

Changed wording

5e7a174

semantic

3d6314e

FAISS

0f9656d

Merge remote-tracking branch 'upstream/main' into copilot-extension-lp

7660479

Chunking

4d16c65

Add todo

75d4399

AI instructions

d8a07aa

Remove dev container piece from this PR

c1b6c83

No comma

e34704c

Whitespace

5477833

AvinZarlez and others added 13 commits February 10, 2025 13:43

Additional Images

b81b936

rag flask doc

ef8e40b

Change step order

ac5bc15

More description

6fb353e

Changed numbers

e6f1d31

Merge pull request #1 from JoeStech/avinz/copilot-extension/vector

eaaf9aa

Initial Vector and GitHub writeup

vector search functions

aa57290

Merge remote-tracking branch 'origin/copilot-extension-lp' into copil…

1730866

…ot-extension-lp

Merge branch 'copilot-extension-lp' into rag-flask-and-reqs

9c93f89

Added Avin Zarlez author name

634cf8c

added a couple new sections at beginning and end, modified the deploy…

a68db5f

…ment and configuration sections

Fixing merge conflict

0e79129

Merge remote-tracking branch 'armdeveco/main' into copilot-extension-lp

a88587c

annietllnd reviewed Feb 17, 2025

View reviewed changes

AvinZarlez added 8 commits February 18, 2025 08:52

Two authors

19d850f

Add Rag explainer to first page

31b2d7f

Removed thye

b5eebf3

Vector changes

fa21ac0

testing title

76154a1

Removed quotes

67b616f

Added simple conclusion

d6d583a

List

e05f297

pareenaverma merged commit fa3fba4 into ArmDeveloperEcosystem:main Feb 19, 2025
1 check passed

chrismoroney added ACM Arm Cloud Migration publish labels Oct 1, 2025

chrismoroney added this to Arm Learning Paths Roadmap Oct 1, 2025

chrismoroney moved this to Done in Arm Learning Paths Roadmap Oct 1, 2025

pareenaverma moved this from Done to Maintenance in Arm Learning Paths Roadmap Oct 22, 2025


		## What is a RAG system?

		RAG stands for "Retrieval Augmented Generation". It describes an AI framework that combines information retrieval with text generation to improve the quality and accuracy of AI-generated content.


		Then for any given vector (like the embedding of a question asked by a user) we can query our vector database to find embedded data that is most similar.

		For example, for our use case let's say we want to know which Arm learning path is most relevant to a question a user asks.


		By copying the FAISS database into every deployment, we achieve a scalable, high-performance solution that can handle a large number of requests efficiently.

		## Collecting Data into Chunks


		Copy the generated `bin` and `json` files to the root directory of your Flask application.

		THey should be in the `vectorstore/chunks` folder. Since you are likely still in the `vectorstore` folder, run this command to copy:


		Another possibility is adding another copilot invocation to rephrase the previous conversation prior to your main copilot invocation. This yields more robust results, if users reference previous elements of the conversation in their question.

		You can precisely tailor your RAG extension to your use case, to make your extension as useful as possible. No newline at end of file

RAG Copilot Extension LP #1611

RAG Copilot Extension LP #1611

Uh oh!

Conversation

JoeStech commented Feb 13, 2025

Uh oh!

annietllnd left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

annietllnd commented Feb 19, 2025

Uh oh!

pareenaverma commented Feb 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants