1 change: 0 additions & 1 deletion .vscode/settings.json
Original file line number Diff line number Diff line change
Expand Up @@ -20,5 +20,4 @@
"scheme": "file"
}
],
"asciidoc.antora.enableAntoraSupport": true,
}
40 changes: 40 additions & 0 deletions asciidoc/courses/genai-workshop/course.adoc
@@ -0,0 +1,40 @@
= Gen-AI - Hands-on Workshop
:status: active
:duration: 2 hours
:caption: GenAI Beyond Chat with RAG, Knowledge Graphs and Python
:usecase: blank-sandbox
:key-points: A comma, separated, list of learnings
:repository: neo4j-graphacademy/genai-workshop

== Course Description

In this GenAI and Neo4j workshop, you will learn how Neo4j can support your GenAI projects.

You will:

* Use vector indexes and embeddings in Neo4j to perform similarity and keyword search
* Use Python and LangChain to integrate with Neo4j and OpenAI
* Learn about Large Language Models (LLMs), hallucination and integrating knowledge graphs
* Explore Retrieval Augmented Generation (RAG) and its role in grounding LLM-generated content

After completing this workshop, you will be able to explain the terms LLM, RAG, grounding, and knowledge graphs. You will also have the knowledge and skills to create simple LLM-based applications using Neo4j and Python.

=== Prerequisites

Before taking this course, you should have:

* A basic understanding of Graph Databases and Neo4j
* Knowledge of Python and the ability to read simple programs

While not essential, we recommend completing the GraphAcademy link:/courses/neo4j-fundamentals/[Neo4j Fundamentals^] course.

=== Duration

{duration}

== What you need

To complete the practical tasks within this workshop, you will need:

* Access to gitpod.io (you will need a GitHub, GitLab, or Bitbucket account) or a local Python environment
* An OpenAI billing account and API key
@@ -0,0 +1,83 @@
= Getting Started
:order: 1
:type: lesson
:lab: {repository-link}
:disable-cache: true

We have created a link:https://github.com/neo4j-graphacademy/genai-workshop[repository^] for this workshop.
It contains the starter code and resources you need.

A blank Neo4j Sandbox instance has also been created for you to use during this course.

You can open a Neo4j Browser window throughout this course by clicking the link:#[Toggle Sandbox,role=classroom-sandbox-toggle] button in the bottom right-hand corner of the screen.

== Get the code

You can use Gitpod as an online IDE and workspace for this workshop.
It will automatically clone the workshop repository and set up your environment.

lab::Open `Gitpod workspace`[]

[NOTE]
You will need to log in with a GitHub, GitLab, or Bitbucket account.

Alternatively, you can clone the repository and set up the environment yourself.

[%collapsible]
.Develop on your local machine
====
You will need link:https://python.org[Python] installed and the ability to install packages using `pip`.

You may want to set up a virtual environment using link:https://docs.python.org/3/library/venv.html[`venv`^] or link:https://virtualenv.pypa.io/en/latest/[`virtualenv`^] to keep your dependencies separate from other projects.

Clone the link:https://github.com/neo4j-graphacademy/genai-workshop[github.com/neo4j-graphacademy/genai-workshop] repository:

[source,bash]
----
git clone https://github.com/neo4j-graphacademy/genai-workshop
----

Install the required packages using `pip`:

[source,bash]
----
cd genai-workshop
pip install -r requirements.txt
----
====

== Set up the environment

Create a copy of the `.env.example` file and name it `.env`.
Fill in the required values.

[source]
.Create a .env file
----
include::{repository-raw}/main/.env.example[]
----

Add your OpenAI API key (`OPENAI_API_KEY`), which you can get from link:https://platform.openai.com[platform.openai.com].

Update the Neo4j sandbox connection details:

NEO4J_URI:: [copy]#bolt://{sandbox_ip}:{sandbox_boltPort}#
NEO4J_USERNAME:: [copy]#{sandbox_username}#
NEO4J_PASSWORD:: [copy]#{sandbox_password}#

== Test your setup

You can test your setup by running `test_environment.py` - this will attempt to connect to the Neo4j sandbox and the OpenAI API.

You will see an `OK` message if you have set up your environment correctly. If any tests fail, check the contents of the `.env` file.
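
If a test fails, a quick first check is whether every expected variable is present in `.env`. The following is a minimal sketch of such a check using only the Python standard library. It is not the contents of `test_environment.py`, and the helper names `parse_env`, `missing_keys`, and `check_env_file` are hypothetical. It only confirms that the variables named in this lesson are set, and does not attempt to connect to Neo4j or OpenAI.

```python
from pathlib import Path

# Variable names taken from this lesson's .env instructions.
REQUIRED_KEYS = ("OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD")


def parse_env(text):
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


def check_env_file(path=".env"):
    """Return the missing keys for the .env file at `path`."""
    env_path = Path(path)
    if not env_path.exists():
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
```

Calling `check_env_file()` from the repository root returns an empty list when all four values are filled in.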

== Continue

When you are ready, you can move on to the next task.

read::Success - let's get started![]

[.summary]
== Summary

You have set up your environment and are ready to start the workshop.
@@ -0,0 +1,44 @@
= Expand the Graph
:order: 10
:type: challenge
:optional: true
:sandbox: true

In this *optional* challenge, you can extend the graph with additional data.

== All Courses

Currently, the graph contains data from a single course, `llm-fundamentals`. You can download the link:https://data.neo4j.com/llm-vectors-unstructured/courses.zip[lesson files for all the courses^] to extend it.

. Download the content for all the courses - link:https://data.neo4j.com/llm-vectors-unstructured/courses.zip[data.neo4j.com/llm-vectors-unstructured/courses.zip^]
. Update the graph with the new data
. Explore the graph and find the connections between the courses
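
The download step could be sketched as follows, using only the Python standard library. The function names `download_courses` and `extract_lessons`, the `courses.zip` and `data` paths, and the assumption that the lessons are stored as `.adoc` files are illustrative and not part of the workshop repository. Updating the graph with the extracted files is left to you.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from this lesson.
COURSES_URL = "https://data.neo4j.com/llm-vectors-unstructured/courses.zip"


def download_courses(url=COURSES_URL, zip_path="courses.zip"):
    """Download the archive to `zip_path` and return the path."""
    urllib.request.urlretrieve(url, zip_path)
    return Path(zip_path)


def extract_lessons(zip_path, dest="data"):
    """Extract the archive and return the sorted .adoc files inside it.

    Assumes the lesson files use the .adoc extension.
    """
    dest_dir = Path(dest)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return sorted(dest_dir.rglob("*.adoc"))


# Example usage (requires network access):
# lessons = extract_lessons(download_courses())
# print(len(lessons), "lesson files found")
```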

== Additional metadata

While the course content is unstructured, it contains metadata you can extract and include in the graph.

Examples include:

* The course title is the first level 1 heading in the file - `= Course Title`
* Level 2 headings denote section titles - `== Section Title`
* The lessons include parameters in the format `:parameter: value` at the top of the file, such as:
** `:type:` - the type of lesson (e.g. `lesson`, `challenge`, `quiz`)
** `:order:` - the order of the lesson in the module
** `:optional:` - whether the lesson is optional

Explore the course content and see what other data you can extract and include in the graph.
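
As a starting point, the metadata described above can be pulled out with a few string checks. This is a minimal sketch, not the workshop's solution: the function name `extract_metadata` and the shape of the returned dictionary are assumptions, and it does not handle headings or attributes that appear inside code blocks.

```python
import re

# Matches header attributes in the form ":parameter: value".
ATTRIBUTE = re.compile(r"^:([\w-]+):\s*(.*)$")


def extract_metadata(text):
    """Extract the title, header attributes, and section titles from AsciiDoc text."""
    title = None
    attributes = {}
    sections = []
    for line in text.splitlines():
        if line.startswith("== "):
            # Level 2 heading: a section title.
            sections.append(line[3:].strip())
        elif line.startswith("= ") and title is None:
            # First level 1 heading: the title.
            title = line[2:].strip()
        elif title is not None and not sections:
            # Only read attributes between the title and the first section.
            match = ATTRIBUTE.match(line)
            if match:
                attributes[match.group(1)] = match.group(2).strip()
    return {"title": title, "attributes": attributes, "sections": sections}
```

The returned values could then be stored as properties on the nodes you create for each lesson.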

When you are ready to move on, click Continue.

== Continue

When you are ready, you can move on to the next task.

read::Move on[]

[.summary]
== Summary

In this optional challenge, you extended the graph with additional data.

@@ -0,0 +1,20 @@
= Next steps
:order: 11
:type: lesson

Congratulations on completing this workshop!

You have:

* Used vector indexes to search for similar data
* Created embeddings and vector indexes
* Built a graph of unstructured data using Python and LangChain

You can learn more about Neo4j at link:https://graphacademy.neo4j.com[Neo4j GraphAcademy^].

read::Finished[]

[.summary]
== Summary

Congratulations on completing this workshop!
@@ -0,0 +1,72 @@
= Semantic Search, Vectors, and Embeddings
:order: 2
:type: lesson

Machine learning and natural language processing (NLP) often use vectors and embeddings to represent and understand data.

== Semantic Search

Semantic search aims to understand the intent and contextual meaning of a search phrase, rather than focusing on individual keywords.

Traditional keyword search often depends on exact-match keywords or proximity-based algorithms that find similar words.

For example, if you input "apple" in a traditional search, you might predominantly get results about the fruit.

However, in a semantic search, the engine tries to gauge the context: Are you searching about the fruit, the tech company, or something else?
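
To make the limitation of keyword matching concrete, here is a minimal sketch of an exact-match keyword search. The documents and the `keyword_search` function are made up for illustration. A search for "apple" returns both the company and the fruit with no sense of which one you meant, and misses a relevant document that never uses the word.

```python
# Made-up documents, for illustration only.
DOCUMENTS = [
    "Apple released a new phone",
    "An apple a day keeps the doctor away",
    "Bananas are rich in potassium",
    "The iPhone maker reported record revenue",
]


def keyword_search(query, docs):
    """Return the documents that contain the query as an exact word."""
    term = query.lower()
    return [doc for doc in docs if term in doc.lower().split()]


for doc in keyword_search("apple", DOCUMENTS):
    print(doc)
```

The fourth document is about the tech company, but a keyword search cannot find it because the word "apple" does not appear in the text.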

image:images/Apple-tech-or-fruit.png[An apple in the middle with tech icons on the left and food on the right,width=700,align=center]


== What are Vectors?

A vector is simply a list of numbers.
For example, the vector `[1, 2, 3]` is a list of three numbers and could represent a point in three-dimensional space.

image:images/3d-vector.svg["A diagram showing a 3D representation of the x, y, z coordinates 1,1,1 and 1,2,3"]

You can use vectors to represent many different types of data, including text, images, and audio.

Vectors with hundreds or thousands of dimensions are common in machine learning and natural language processing (NLP).

== What are Embeddings?

When referring to vectors in the context of machine learning and NLP, the term "embedding" is typically used.
An embedding is a vector that represents the data in a useful way for a specific task.

Each dimension in a vector can represent a particular semantic aspect of the word or phrase.
When multiple dimensions are combined, they can convey the overall meaning of the word or phrase.

For example, the word "apple" might be represented by an embedding with the following dimensions:

* fruit
* technology
* color
* taste
* shape
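
The idea of combining named dimensions into a single vector can be sketched as follows. The scores are made up for illustration. In practice, the dimensions of a model-generated embedding are rarely this easy to label.

```python
# Hypothetical scores for the word "apple"; the values are made up.
apple = {
    "fruit": 0.9,
    "technology": 0.6,
    "color": 0.4,
    "taste": 0.7,
    "shape": 0.3,
}

# The embedding is the list of scores in a fixed dimension order.
dimensions = list(apple)
embedding = [apple[name] for name in dimensions]

print(dimensions)
print(embedding)
```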

You can create embeddings in various ways, but one of the most common methods is to use a **large language model**.

For example, the embedding for the word "apple" is `0.0077788467, -0.02306925, -0.007360777, -0.027743412, -0.0045747845, 0.01289164, -0.021863015, -0.008587573, 0.01892967, -0.029854324, -0.0027962727, 0.020108491, -0.004530236, 0.009129008,` ... and so on.

== How are vectors used in semantic search?

You can use the _distance_ or _angle_ between vectors to gauge the semantic similarity between words or phrases.

image::images/vector-distance.svg[A 3 dimensional chart illustrating the distance between vectors. The vectors are for the words "apple" and "fruit",width=700,align=center]

Words with similar meanings or contexts will have vectors that are close together, while unrelated words will be farther apart.

This principle is employed in semantic search to find contextually relevant results for a user's query.
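
The following sketch shows how distance and angle can be computed and used to rank words by similarity. The three-dimensional embeddings are made up for illustration, and the `most_similar` helper is hypothetical. Real embeddings have many more dimensions, and in this workshop the comparison is handled by a Neo4j vector index rather than hand-written code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 3-dimensional embeddings, for illustration only.
EMBEDDINGS = {
    "apple": [0.9, 0.8, 0.1],
    "fruit": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def most_similar(word, embeddings=EMBEDDINGS):
    """Rank the other words by cosine similarity to `word`, best first."""
    query = embeddings[word]
    others = [w for w in embeddings if w != word]
    return sorted(
        others,
        key=lambda w: cosine_similarity(query, embeddings[w]),
        reverse=True,
    )


print(most_similar("apple"))
```

With these toy values, "fruit" ranks ahead of "car" for the query "apple" because its vector points in nearly the same direction.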

== Continue

When you are ready, you can move on to the next task.

read::Move on[]

[.summary]
== Summary

You learned about semantic search, vectors, and embeddings.

Next, you will use a Neo4j vector index to find similar data.