
datastaxdevs/demo-generativeai-with-java


Demo of Generative AI with Java

Gitpod ready-to-code License Apache2 Discord

📋 Table of contents

Week 1

Week 2

WEEK 1

✅ 1 - Create your DataStax Astra account

ℹ️ An account creation tutorial is available in Awesome Astra.

Go to https://astra.datastax.com


✅ 2 - Create an Astra Token

ℹ️ A token creation tutorial is available in Awesome Astra.

  • Locate Settings (#1) in the menu on the left, then Token Management (#2)

  • Select the role Organization Administrator before clicking [Generate Token]

The Token is in fact three separate strings: a Client ID, a Client Secret and the token proper. You will need some of these strings to access the database, depending on the type of access you plan. Although the Client ID, strictly speaking, is not a secret, you should regard this whole object as a secret and make sure not to share it inadvertently (e.g. committing it to a Git repository) as it grants access to your databases.

{
  "ClientId": "ROkiiDZdvPOvHRSgoZtyAapp",
  "ClientSecret": "fakedfaked",
  "Token":"AstraCS:fake"
}

✅ 3 - Copy the token value to your clipboard

You can also leave the window open and copy the value later.

✅ 4 - Open Gitpod

↗️ Right-click and select Open in a new tab...

Open in Gitpod

✅ 5 - Set up the CLI with your token

In Gitpod, in a terminal window:

  • Login
astra login --token AstraCS:fake
  • Validate your setup
astra org

Output

gitpod /workspace/workshop-beam (main) $ astra org
+----------------+-----------------------------------------+
| Attribute      | Value                                   |
+----------------+-----------------------------------------+
| Name           | cedrick.lunven@datastax.com             |
| id             | f9460f14-9879-4ebe-83f2-48d3f3dce13c    |
+----------------+-----------------------------------------+

✅ 6 - Create the destination database and a keyspace

ℹ️ Note that the Vector Search capability is enabled with the --vector flag.

  • Create the database demo-genai and wait for it to become active
astra db create demo-genai -k genai --vector --if-not-exists

💻 Output

[INFO]  Database 'demo-genai' does not exist. Creating database 'demo-genai' with keyspace 'genai'
[INFO]  Enabling vector search for database demo-genai
[INFO]  Database 'demo-genai' and keyspace 'genai' are being created.
[INFO]  Database 'demo-genai' has status 'PENDING' waiting to be 'ACTIVE' ...
[INFO]  Database 'demo-genai' has status 'ACTIVE' (took 112341 millis)
[OK]    Database 'demo-genai' is ready.
  • List databases
astra db list

💻 Output

+--------------------------+--------------------------------------+-----------+-------+---+-----------+
| Name                     | id                                   | Regions   | Cloud | V | Status    |
+--------------------------+--------------------------------------+-----------+-------+---+-----------+
| demo-genai               | 9e54ff00-57e2-47ed-8699-f94d5dd11b6f | us-east1  | gcp   | ■ | ACTIVE    |
+--------------------------+--------------------------------------+-----------+-------+---+-----------+
  • Describe your db
astra db describe demo-genai

💻 Output

+------------------+-----------------------------------------+
| Attribute        | Value                                   |
+------------------+-----------------------------------------+
| Name             | demo-genai                              |
| id               | 9e54ff00-57e2-47ed-8699-f94d5dd11b6f    |
| Status           | ACTIVE                                  |
| Cloud            | GCP                                     |
| Regions          | us-east1                                |
| Default Keyspace | genai                                   |
| Creation Time    | 2023-09-12T08:55:36Z                    |
|                  |                                         |
| Keyspaces        | [0] genai                               |
|                  |                                         |
|                  |                                         |
| Regions          | [0] us-east1                            |
|                  |                                         |
+------------------+-----------------------------------------+

✅ 7 - Set up env variables

  • Create .env file with variables
astra db create-dotenv demo-genai 
  • Display the file
cat .env
  • Load env variables
set -a
source .env
set +a
env | grep ASTRA
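
For Java code that cannot rely on the shell having sourced the file, the same values can be read straight from .env. The sketch below is only an illustration: it assumes the file contains simple KEY=value or KEY="value" lines, and the class DotEnvSketch and its parse method are hypothetical names, not part of the Astra CLI or SDK.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Astra CLI or SDK.
final class DotEnvSketch {

    // Parses simple KEY=value / KEY="value" lines; skips blank lines and '#' comments.
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> vars = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int eq = line.indexOf('=');
            if (eq <= 0) continue; // not a KEY=value line
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            // Strip one pair of surrounding double quotes, if present.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            vars.put(key, value);
        }
        return vars;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> vars = parse(Files.readAllLines(Path.of(".env")));
        // Print only the variable names so that secrets never reach the console.
        vars.keySet().stream()
                .filter(name -> name.startsWith("ASTRA"))
                .forEach(System.out::println);
    }
}
```

Printing names rather than values is a deliberate choice here, since the file holds the token.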

✅ 8 - Register to OpenAI

  • In your profile, go to View API KEYS, create a new key and copy the value to your clipboard. You have a free trial for a month or so.

export OPENAI_API_KEY=<key>

✅ 9 - Set up the project

The commands below validate that Java, Maven and Lombok are working as expected and that you can connect.

Note: To create the project, I simply went with the Astra SDK archetype, as follows:

mvn archetype:generate \
-DarchetypeGroupId=com.datastax.astra \
-DarchetypeArtifactId=spring-boot-3x-archetype \
-DarchetypeVersion=0.6.9 \
-DinteractiveMode=false \
-DgroupId=com.datastax.demo \
-DartifactId=genai-demo \
-Dversion=1.0-SNAPSHOT

and added the vector dependency:

<dependency>
  <groupId>com.datastax.astra</groupId>
  <artifactId>astra-sdk-vector</artifactId>
  <version>${astra-sdk-starter.version}</version>
</dependency>
  • Run connection test:
mvn test -Dtest=ConnectionTest#shouldBeConnectedTest
  • Run OpenAI Test:

mvn test -Dtest=OpenAiTest#shouldTestOpenAICreateEmbeddings

✅ 10 - Vector Search

  • Ingest data

mvn test -Dtest=GenerativeAITest#shouldIngestDocuments
  • Open cqlsh (in a new terminal)
astra db cqlsh demo-genai -k genai
select row_id, metadata_s, blob_text, vector from philosophers;
  • Similarity Search
mvn test -Dtest=GenerativeAITest#shouldSimilaritySearchQuotes
  • Similarity Search + MetaData (by Author)
mvn test -Dtest=GenerativeAITest#shouldSimilaritySearchQuotesFilteredByAuthor
  • Similarity Search + MetaData (by Tags)
mvn test -Dtest=GenerativeAITest#shouldSimilaritySearchQuotesFilteredByTags
  • Similarity Search with a threshold
mvn test -Dtest=GenerativeAITest#shouldSimilaritySearchQuotesWithThreshold
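
To make the four searches above concrete, here is a minimal in-memory sketch of what they compute: rank quotes by cosine similarity to a query vector, optionally keep a single author, and drop results below a threshold. It is an illustration only. The names (SimilaritySketch, Quote, search) and the toy two-dimensional vectors are hypothetical, Java 17 is assumed, and the real tests delegate this work to the database instead of scanning a list.

```java
import java.util.Comparator;
import java.util.List;

final class SimilaritySketch {

    // Hypothetical record: a quote, its author (metadata) and its embedding.
    record Quote(String text, String author, double[] vector) {}

    // Cosine similarity of two vectors of equal length.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // author == null means "no metadata filter".
    static List<Quote> search(List<Quote> quotes, double[] query,
                              String author, double threshold, int limit) {
        return quotes.stream()
                .filter(q -> author == null || q.author().equals(author))
                .filter(q -> cosine(q.vector(), query) >= threshold)
                // Most similar first.
                .sorted(Comparator.comparingDouble((Quote q) -> -cosine(q.vector(), query)))
                .limit(limit)
                .toList();
    }

    public static void main(String[] args) {
        List<Quote> quotes = List.of(
                new Quote("quote A", "plato", new double[] {1.0, 0.0}),
                new Quote("quote B", "aristotle", new double[] {0.6, 0.8}),
                new Quote("quote C", "plato", new double[] {0.0, 1.0}));
        double[] query = {1.0, 0.0};
        search(quotes, query, null, 0.5, 3).forEach(q -> System.out.println(q.text()));
    }
}
```

Passing an author narrows the candidates before ranking, which mirrors the "Similarity Search + MetaData" variants; raising the threshold mirrors the last variant.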

✅ 11 - RAG: Retrieval-Augmented Generation

The Full Monty.....

mvn test -Dtest=GenerativeAITest#shouldGenerateQuotesWithRag
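
Retrieval-augmented generation boils down to three steps: retrieve the most similar quotes, paste them into the prompt as context, then ask the model. A minimal sketch of the prompt-assembly step follows. The class name, method name and prompt wording are hypothetical and not taken from GenerativeAITest, and the call to the model itself is left out.

```java
import java.util.List;

final class RagPromptSketch {

    // Builds a prompt that grounds the model's answer in the retrieved quotes.
    static String buildPrompt(String question, List<String> retrievedQuotes) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the quotes below.\n\n");
        sb.append("Quotes:\n");
        for (int i = 0; i < retrievedQuotes.size(); i++) {
            sb.append(i + 1).append(". ").append(retrievedQuotes.get(i)).append('\n');
        }
        sb.append("\nQuestion: ").append(question).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder strings stand in for the results of a similarity search.
        System.out.println(buildPrompt("What is wisdom?",
                List.of("first retrieved quote", "second retrieved quote")));
    }
}
```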

WEEK 2

✅ 12 - Set up the project

  • Check the list of running databases
astra db list
  • Resume the DB if needed (or create a new one)
astra db resume langchain4j
astra db create langchain4j --if-not-exists
  • Make sure you set up the env variables ($ASTRA_DB_APPLICATION_TOKEN)
astra db create-dotenv langchain4j
set -a
source .env
set +a
env | grep ASTRA

Go to application.yaml and check that the values are correct for your database:

astra:
  database:
    name: langchain4j
    keyspace: langchain4j
    table: langchain4j

✅ 13 - Ingest Document

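@Test
@DisplayName("02. Should Ingest a document")
@EnabledIfEnvironmentVariable(named = "ASTRA_DB_APPLICATION_TOKEN", matches = "Astra.*")
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = "sk.*")
void should_Ingest_Document() {
  // path, embeddingModel and embeddingStore are not declared here;
  // they are assumed to be set up elsewhere in the test class.

  // Load the text file as a single document
  Document document = FileSystemDocumentLoader.loadDocument(path, DocumentType.TXT);

  // Split into segments of at most 100 tokens, with up to 10 tokens of overlap
  DocumentSplitter splitter = DocumentSplitters
        .recursive(100, 10, new OpenAiTokenizer(GPT_3_5_TURBO));

  // Embed each segment and write it to the embedding store
  EmbeddingStoreIngestor.builder()
     .documentSplitter(splitter)
     .embeddingModel(embeddingModel)
     .embeddingStore(embeddingStore)
     .build()
     .ingest(document);
}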
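
The recursive(100, 10, ...) call asks for segments of at most 100 tokens that overlap by up to 10 tokens. The sketch below illustrates the overlap idea with a much simpler technique than the recursive splitter: a fixed-size sliding window over whitespace-separated words rather than OpenAI tokens. ChunkingSketch and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkingSketch {

    // Splits text into windows of at most maxSize words;
    // consecutive windows share `overlap` words.
    static List<String> split(String text, int maxSize, int overlap) {
        if (overlap < 0 || overlap >= maxSize) {
            throw new IllegalArgumentException("overlap must be in [0, maxSize)");
        }
        String[] words = text.trim().split("\\s+");
        List<String> chunks = new ArrayList<>();
        if (words.length == 1 && words[0].isEmpty()) {
            return chunks; // blank input yields no chunks
        }
        int step = maxSize - overlap;
        for (int start = 0; start < words.length; start += step) {
            int end = Math.min(start + maxSize, words.length);
            chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
            if (end == words.length) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        split("a b c d e f g", 4, 1).forEach(System.out::println);
    }
}
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring segments, at the cost of storing a few words twice.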

✅ 14 - Chat Completion

@Test
@DisplayName("03. Should Chat Completion")
@EnabledIfEnvironmentVariable(named = "ASTRA_DB_APPLICATION_TOKEN", matches = "Astra.*")
@EnabledIfEnvironmentVariable(named = "OPENAI_API_KEY", matches = "sk.*")
void should_chat_completion() {
  // ... check the code in the class
}
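
Both Week 2 tests are skipped unless the two environment variables match the patterns in their @EnabledIfEnvironmentVariable annotations. The sketch below reproduces that check in plain Java, which can help explain why a test was skipped. It assumes the regular expression must match the entire value, which is how JUnit 5 is understood to behave; EnvGateSketch and enabled are hypothetical names.

```java
import java.util.Map;

final class EnvGateSketch {

    // True when the variable is set and its whole value matches the regex.
    static boolean enabled(Map<String, String> env, String name, String regex) {
        String value = env.get(name);
        return value != null && value.matches(regex);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        boolean ready = enabled(env, "ASTRA_DB_APPLICATION_TOKEN", "Astra.*")
                && enabled(env, "OPENAI_API_KEY", "sk.*");
        System.out.println(ready ? "Week 2 tests would run" : "Week 2 tests would be skipped");
    }
}
```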

About

Shows how to build a generative AI project with Java
