## Install dependencies

In [None]:
!pip install "nucliadb-sdk<=2.42.1"
!pip install -U sentence-transformers
!pip install datasets

## Set up NucliaDB

- Run the **NucliaDB** Docker image:
```bash
docker run -it \
       -e LOG=INFO \
       -p 8080:8080 \
       -p 8060:8060 \
       -p 8040:8040 \
       -v nucliadb-standalone:/data \
       nuclia/nucliadb:latest
```
- Or install it with pip and run it locally:

```bash
pip install nucliadb
nucliadb
```

## Check everything's up and running

In [199]:
import requests
response = requests.get("http://0.0.0.0:8080")
assert response.status_code == 200, "Oops, it seems something is not properly installed"
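
A freshly started NucliaDB container may need a few seconds before it answers, so a single request can fail even when the installation is fine. The sketch below polls until a probe succeeds. The helper names `http_ok` and `wait_until_ready` are hypothetical (they are not part of `nucliadb-sdk`), and only the standard library is used.

```python
import time
import urllib.error
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, HTTP error status, etc.
        return False


def wait_until_ready(probe, attempts=10, delay=1.0):
    """Call `probe()` until it returns True or `attempts` runs out.

    Returns the number of tries it took, or None if it never succeeded.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # wait before the next try
    return None


# Possible usage against the address used in this notebook:
# assert wait_until_ready(lambda: http_ok("http://0.0.0.0:8080")) is not None
```

Passing the probe as a callable keeps the retry loop independent of how readiness is checked.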

## Load our data

Load the prompt dataset from the Hugging Face Hub and explore it.

In [185]:
from datasets import load_dataset

dataset = load_dataset("fka/awesome-chatgpt-prompts")

Using custom data configuration fka--awesome-chatgpt-prompts-1d1bd2430c633570
Found cached dataset csv (/Users/ciniesta/.cache/huggingface/datasets/fka___csv/fka--awesome-chatgpt-prompts-1d1bd2430c633570/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317)



In [186]:
dataset

DatasetDict({
    train: Dataset({
        features: ['act', 'prompt'],
        num_rows: 145
    })
})

In [187]:
dataset["train"].shuffle(seed=42)[0:4]

Loading cached shuffled indices for dataset at /Users/ciniesta/.cache/huggingface/datasets/fka___csv/fka--awesome-chatgpt-prompts-1d1bd2430c633570/0.0.0/6b34fb8fcf56f7c8ba51dc895bfa2bfbe43546f190a60fcf74bb5e8afdcc2317/cache-08423e2f5d75493a.arrow


{'act': ['Relationship Coach',
  'Aphorism Book',
  'JavaScript Console',
  'New Language Creator'],
 'prompt': ['I want you to act as a relationship coach. I will provide some details about the two people involved in a conflict, and it will be your job to come up with suggestions on how they can work through the issues that are separating them. This could include advice on communication techniques or different strategies for improving their understanding of one another\'s perspectives. My first request is "I need help solving conflicts between my spouse and myself."',
  'I want you to act as an aphorism book. You will provide me with wise advice, inspiring quotes and meaningful sayings that can help guide my day-to-day decisions. Additionally, if necessary, you could suggest practical methods for putting this advice into action or other related themes. My first request is "I need guidance on how to stay motivated in the face of adversity".',
  'I want you to act as a javascript consol

## Load the model to generate embeddings

In [188]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/msmarco-MiniLM-L6-cos-v5')


## Upload our data to NucliaDB


In [191]:
from nucliadb_sdk import KnowledgeBox, get_or_create

In [192]:
my_kb = get_or_create("my_prompts")

In [193]:
for row in dataset["train"]:
    my_kb.upload(
        text=row["prompt"],
        vectors={"ms-marco-vectors": model.encode([row["prompt"]])[0]},
    )

Vectorset is not created, we will create it for you
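
The loop above encodes one prompt per call. `SentenceTransformer.encode` also accepts a list of sentences, so a possible variation is to encode the prompts in chunks and upload each text with its vector. The helper `upload_in_batches` below is a hypothetical sketch, not part of either library; `encode` and `upload` are passed in as callables so the function itself has no dependencies.

```python
def upload_in_batches(texts, encode, upload, batch_size=32):
    """Encode `texts` in chunks and hand each (text, vector) pair to `upload`.

    `encode` takes a list of strings and returns one vector per string.
    `upload` takes a single text and its vector.
    Returns the number of uploaded items.
    """
    uploaded = 0
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        vectors = encode(batch)  # one encode call per chunk
        for text, vector in zip(batch, vectors):
            upload(text, vector)
            uploaded += 1
    return uploaded


# Possible usage with the objects defined in this notebook:
# prompts = [row["prompt"] for row in dataset["train"]]
# upload_in_batches(
#     prompts,
#     model.encode,
#     lambda text, vec: my_kb.upload(text=text, vectors={"ms-marco-vectors": vec}),
# )
```

For 145 prompts the difference is small, but chunked encoding tends to matter more as the dataset grows.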


## Enjoy our semantic search!! 
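
To build some intuition for the `min_score` parameter used in the queries below, here is a toy, in-memory version of a vector search. It is only an illustration and not how NucliaDB implements search internally. It assumes scores are cosine similarities, which is what the `cos-v5` model is tuned for; the 2-dimensional vectors and the names `cosine_similarity`, `search`, and `toy_index` are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def search(query, indexed, min_score=0.0):
    """Return (text, score) pairs with score >= min_score, best first."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in indexed]
    hits = [(text, score) for text, score in scored if score >= min_score]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical 2-d "embeddings"; real ones have hundreds of dimensions.
toy_index = [
    ("code prompt", [1.0, 0.0]),
    ("poetry prompt", [0.0, 1.0]),
    ("mixed prompt", [1.0, 1.0]),
]

results = search([1.0, 0.0], toy_index, min_score=0.25)
for text, score in results:
    print(f"{text}: {score:.3f}")
```

With this cutoff the orthogonal "poetry prompt" (score 0) is dropped, while the other two entries are returned in descending score order. Raising `min_score` trades recall for precision in the same way.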


In [194]:
query = model.encode(["something coding related"])[0]
results = my_kb.search(vector=query, vectorset="ms-marco-vectors", min_score=0.25)

for result in results:
    print(f"Prompt: {result.text}")
    print(f"Score: {result.score}")
    print("------")

Prompt: I want you to act as a Developer Relations consultant. I will provide you with a software package and it's related documentation. Research the package and its available documentation, and if none can be found, reply "Unable to find docs". Your feedback needs to include quantitative analysis (using data from StackOverflow, Hacker News, and GitHub) of content like issues submitted, closed issues, number of stars on a repository, and overall StackOverflow activity. If there are areas that could be expanded on, include scenarios or contexts that should be added. Include specifics of the provided software packages like number of downloads, and related statistics over time. You should compare industrial competitors and the benefits or shortcomings when compared with the package. Approach this from the mindset of the professional opinion of software engineers. Review technical blogs and websites (such as TechCrunch.com or Crunchbase.com) and if data isn't available, reply "No data ava

In [195]:
query = model.encode(["prompts that have something to do with art and emotions"])[0]
results = my_kb.search(vector=query, vectorset="ms-marco-vectors", min_score=0.35)

for result in results:
    print(f"Prompt: {result.text}")
    print(f"Score: {result.score}")
    print("------")

Prompt: I want you to act as a poet. You will create poems that evoke emotions and have the power to stir people’s soul. Write on any topic or theme but make sure your words convey the feeling you are trying to express in beautiful yet meaningful ways. You can also come up with short verses that are still powerful enough to leave an imprint in readers' minds. My first request is "I need a poem about love."
Score: 0.4699018597602844
------
Prompt: I want you to act as a composer. I will provide the lyrics to a song and you will create music for it. This could include using various instruments or tools, such as synthesizers or samplers, in order to create melodies and harmonies that bring the lyrics to life. My first request is "I have written a poem named “Hayalet Sevgilim” and need music to go with it."
Score: 0.4238387942314148
------
Prompt: I want you to act as an aphorism book. You will provide me with wise advice, inspiring quotes and meaningful sayings that can help guide my day-

In [196]:
query = model.encode(["something useful for people learning a new language"])[0]
results = my_kb.search(vector=query, vectorset="ms-marco-vectors", min_score=0.35)

for result in results:
    print(f"Prompt: {result.text}")
    print(f"Score: {result.score}")
    print("------")

Prompt: I want you to act as an AI writing tutor. I will provide you with a student who needs help improving their writing and your task is to use artificial intelligence tools, such as natural language processing, to give the student feedback on how they can improve their composition. You should also use your rhetorical knowledge and experience about effective writing techniques in order to suggest ways that the student can better express their thoughts and ideas in written form. My first request is "I need somebody to help me edit my master's thesis."
Score: 0.4891594350337982
------
Prompt: I want you to act as a fill in the blank worksheets generator for students learning English as a second language. Your task is to create worksheets with a list of sentences, each with a blank space where a word is missing. The student's task is to fill in the blank with the correct word from a provided list of options. The sentences should be grammatically correct and appropriate for students at 