## Weaviate workshop

<a target="_blank" href="https://colab.research.google.com/github/weaviate-tutorials/intro-workshop/blob/main/workshop.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

### Goals:

#### What you will see:


- Creating a vector database with Weaviate,
- Adding data to the database, and
- Interacting with the data, including searching and using LLMs with your data in Weaviate.

#### What you will learn today:

- What Weaviate is,
- How it stores the data (based on its "meaning"), and
- What you can do with Weaviate, such as semantic search and using LLMs to transform data.

Install the Weaviate Python client, for environments that don't yet have it.

In [1]:
# !pip install -U weaviate-client

## Preparation: Get the data

We'll use a subset of the Jeopardy! quiz library:
> https://www.kaggle.com/datasets/tunguz/200000-jeopardy-questions

Pre-processed version:
> https://raw.githubusercontent.com/databyjp/wv_demo_uploader/main/weaviate_datasets/data/jeopardy_1k.json


Load (or download) the data, and preview it

In [2]:
import requests
import json

# Download the data (fail early on an HTTP error instead of parsing an error page)
response = requests.get('https://raw.githubusercontent.com/databyjp/wv_demo_uploader/main/weaviate_datasets/data/jeopardy_1k.json')
response.raise_for_status()
raw_data = response.text

# Parse the JSON and preview it
data = json.loads(raw_data)
print(type(data), len(data))
print(json.dumps(data[0], indent=2))

<class 'list'> 1000
{
  "Air Date": "2006-11-08",
  "Round": "Double Jeopardy!",
  "Value": 800,
  "Category": "AMERICAN HISTORY",
  "Question": "Abraham Lincoln died across the street from this theatre on April 15, 1865",
  "Answer": "Ford's Theatre (the Ford Theatre accepted)"
}
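Before importing, it can help to confirm that every record carries the fields used later in this workshop. The sketch below is a minimal check and not part of the original notebook; `check_records` is a hypothetical helper name, and the required keys are assumed from the preview above.

```python
import json

REQUIRED_KEYS = ("Question", "Answer")  # fields used later in this workshop


def check_records(raw_text, required=REQUIRED_KEYS):
    """Parse a JSON array of records and return those that have all required keys."""
    records = json.loads(raw_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return [
        r for r in records
        if isinstance(r, dict) and all(k in r for k in required)
    ]


# Tiny inline sample in the same shape as the preview above
sample = json.dumps([
    {"Question": "Any pigment on the wall so faded you can barely see it", "Answer": "faint paint"},
    {"Question": "A record with no answer"},
])
valid = check_records(sample)
print(len(valid))
```

With the real download, you could pass `raw_data` instead of `sample` and compare the length of the result against `len(data)`.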


## Step 1: Create a Weaviate instance (database)

Embedded Weaviate, which we use below, is a quick way to create a Weaviate database.

You can also use:
- A free sandbox with Weaviate Cloud Services
- Open-source Weaviate directly, available cross-platform with Docker

In [3]:
import weaviate
from weaviate import EmbeddedOptions
import os

client = weaviate.Client(
    embedded_options=EmbeddedOptions(),
    additional_headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"]  # Replace this with your actual key
    }
)

Started /Users/jphwang/.cache/weaviate-embedded: process ID 27874


{"action":"startup","default_vectorizer_module":"none","level":"info","msg":"the default vectorizer modules is set to \"none\", as a result all new schema classes without an explicit vectorizer setting, will use this vectorizer","time":"2023-08-18T13:52:19+01:00"}
{"action":"startup","auto_schema_enabled":true,"level":"info","msg":"auto schema enabled setting is set to \"true\"","time":"2023-08-18T13:52:19+01:00"}
{"action":"lsm_recover_from_active_wal_success","class":"KnowledgeBlock","index":"knowledgeblock","level":"info","msg":"successfully recovered from write-ahead-log","path":"/Users/jphwang/.local/share/weaviate/knowledgeblock_1q1b9BLntORd_lsm/objects/segment-1692319942344574000.wal","shard":"1q1b9BLntORd","time":"2023-08-18T13:52:19+01:00"}
{"action":"lsm_recover_from_active_wal_success","class":"KnowledgeBlock","index":"knowledgeblock","level":"info","msg":"successfully recovered from write-ahead-log","path":"/Users/jphwang/.local/share/weaviate/knowledgeblock_1q1b9BLntORd_ls
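The client cell above reads the OpenAI key with `os.environ["OPENAI_APIKEY"]`, which raises a bare `KeyError` if the variable is not set. A minimal sketch of a friendlier lookup follows; `require_env` is a hypothetical helper, not part of the Weaviate client.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage sketch (assumes the key is exported in your shell):
# headers = {"X-OpenAI-Api-Key": require_env("OPENAI_APIKEY")}
```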

Create a helper function as we'll be dealing with JSON responses a lot

In [4]:
def jprint(data_in):
    print(json.dumps(data_in, indent=2))

Retrieve Weaviate instance information to check our configuration.

In [5]:
jprint(client.get_meta())

{
  "hostname": "http://127.0.0.1:6666",
  "modules": {
    "generative-openai": {
      "documentationHref": "https://beta.openai.com/docs/api-reference/completions",
      "name": "Generative Search - OpenAI"
    },
    "qna-openai": {
      "documentationHref": "https://beta.openai.com/docs/api-reference/completions",
      "name": "OpenAI Question & Answering Module"
    },
    "ref2vec-centroid": {},
    "text2vec-cohere": {
      "documentationHref": "https://docs.cohere.ai/embedding-wiki/",
      "name": "Cohere Module"
    },
    "text2vec-huggingface": {
      "documentationHref": "https://huggingface.co/docs/api-inference/detailed_parameters#feature-extraction-task",
      "name": "Hugging Face Module"
    },
    "text2vec-openai": {
      "documentationHref": "https://beta.openai.com/docs/guides/embeddings/what-are-embeddings",
      "name": "OpenAI Module"
    }
  },
  "version": "1.19.12"
}
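The class we create next relies on the `text2vec-openai` and `generative-openai` modules, so it is worth checking that both appear in this metadata. The following is a small sketch over the dictionary returned by `client.get_meta()`; `missing_modules` is a hypothetical helper name.

```python
def missing_modules(meta, required):
    """Return the required module names that are absent from a get_meta() result."""
    enabled = meta.get("modules", {})
    return [name for name in required if name not in enabled]


# Trimmed-down copy of the metadata printed above
meta = {
    "modules": {
        "generative-openai": {},
        "text2vec-openai": {},
    },
    "version": "1.19.12",
}
print(missing_modules(meta, ["text2vec-openai", "generative-openai"]))
```

An empty list means every required module is enabled.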


## Step 2: Add data to Weaviate

### Add class definition

The equivalent of a SQL "table", or a NoSQL "collection", is called a "class" in Weaviate.

In case a demo class named "Question" already exists from an earlier run, let's delete it first.

In [6]:
if client.schema.exists("Question"):
    client.schema.delete_class("Question")

And create a new class definition here.
We'll set up a class called "Question" with:
- A "vectorizer" -> which will convert data to vectors, which represent meaning,
- A "generative" module -> which will allow us to use LLMs with our data, and
- Properties to save our quiz data (which are like SQL columns).
    - Just the question and answer for now

In [7]:
class_definition = {
    "class": "Question",
    "vectorizer": "text2vec-openai",
    "vectorIndexConfig": {
        "distance": "cosine",
    },
    "moduleConfig": {
        "generative-openai": {}
    },
    "properties": [
        {
            "name": "question",
            "dataType": ["text"]
        },
        {
            "name": "answer",
            "dataType": ["text"]
        },
    ],
}

client.schema.create_class(class_definition)

{"action":"hnsw_vector_cache_prefill","count":1000,"index_id":"question_09d6siOqOdZE","level":"info","limit":1000000000000,"msg":"prefilled vector cache","time":"2023-08-18T13:52:19+01:00","took":33958}


> Tip: You can get example class definitions in our documentation:
> - https://weaviate.io/developers/weaviate/manage-data/classes#example-class-configurations

Was our class created successfully? Let's take a look

In [8]:
jprint(client.schema.get("Question"))

{
  "class": "Question",
  "invertedIndexConfig": {
    "bm25": {
      "b": 0.75,
      "k1": 1.2
    },
    "cleanupIntervalSeconds": 60,
    "stopwords": {
      "additions": null,
      "preset": "en",
      "removals": null
    }
  },
  "moduleConfig": {
    "generative-openai": {},
    "text2vec-openai": {
      "model": "ada",
      "modelVersion": "002",
      "type": "text",
      "vectorizeClassName": true
    }
  },
  "properties": [
    {
      "dataType": [
        "text"
      ],
      "indexFilterable": true,
      "indexSearchable": true,
      "moduleConfig": {
        "text2vec-openai": {
          "skip": false,
          "vectorizePropertyName": false
        }
      },
      "name": "question",
      "tokenization": "word"
    },
    {
      "dataType": [
        "text"
      ],
      "indexFilterable": true,
      "indexSearchable": true,
      "moduleConfig": {
        "text2vec-openai": {
          "skip": false,
          "vectorizePropertyName": false
        

### Add data

We'll add actual objects (the equivalent of SQL rows) to our class.

First, let's build objects to add - and take a look at a couple.

In [9]:
for o in data[:2]:
    obj_body = {
        "question": o["Question"],
        "answer": o["Answer"],
    }
    print(obj_body)

{'question': 'Abraham Lincoln died across the street from this theatre on April 15, 1865', 'answer': "Ford's Theatre (the Ford Theatre accepted)"}
{'question': 'Any pigment on the wall so faded you can barely see it', 'answer': 'faint paint'}


> If it all looks fine - let's add objects:
> - https://weaviate.io/developers/weaviate/manage-data/import

In [10]:
with client.batch() as batch:
    for o in data:
        obj_body = {
            "question": o["Question"],
            "answer": o["Answer"],
        }
        batch.add_data_object(
            data_object=obj_body,
            class_name="Question"
        )
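Re-running only the import cell would add every object a second time, because each object receives a new ID when none is supplied. One common way around this is to derive a deterministic ID from the object's content. The sketch below uses only the standard library's `uuid.uuid5`; `object_uuid` is a hypothetical helper, and passing its result as the `uuid` argument of `batch.add_data_object` is an assumption to check against your client version.

```python
import json
import uuid


def object_uuid(class_name, properties):
    """Derive a stable UUID from the class name and the object's properties."""
    # sort_keys makes the serialization independent of dict ordering
    payload = class_name + json.dumps(properties, sort_keys=True)
    return str(uuid.uuid5(uuid.NAMESPACE_DNS, payload))


obj = {"question": "Any pigment on the wall so faded you can barely see it", "answer": "faint paint"}
print(object_uuid("Question", obj))

# Usage sketch inside the batch loop (assumption: `uuid` is accepted here):
# batch.add_data_object(
#     data_object=obj_body,
#     class_name="Question",
#     uuid=object_uuid("Question", obj_body),
# )
```

Because the ID depends only on the content, importing the same object again targets the same ID instead of creating a duplicate.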

#### Confirm data load

Do we have data? 

Let's get an object count

In [11]:
jprint(client.query.aggregate("Question").with_meta_count().do())

{
  "data": {
    "Aggregate": {
      "Question": [
        {
          "meta": {
            "count": 1000
          }
        }
      ]
    }
  }
}


Does the data look right?

Let's grab a few objects from Weaviate!

In [12]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_limit(2)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "answer": "equal and opposite",
          "question": "Newton's Third Law of Motion is usually quoted as \"For every action there is\" this 3-word type of \"reaction\""
        },
        {
          "answer": "Jackson Pollock",
          "question": "He poured & splattered paint onto the canvas to make his \"Autumn Rhythm:  Number 30, 1950\""
        }
      ]
    }
  }
}
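Every `Get` query result shares the same nesting: `response["data"]["Get"][class_name]`. A small helper can unwrap it and surface failures. This is a sketch, not part of the client; `get_objects` is a hypothetical name, and the top-level `errors` key is assumed from the usual GraphQL response convention.

```python
def get_objects(response, class_name):
    """Return the list of objects for class_name from a Get query response."""
    if "errors" in response:
        raise RuntimeError(f"Query failed: {response['errors']}")
    return response["data"]["Get"][class_name]


# Response in the same shape as the output above (question text omitted)
sample_response = {
    "data": {
        "Get": {
            "Question": [
                {"answer": "equal and opposite"},
                {"answer": "Jackson Pollock"},
            ]
        }
    }
}
for obj in get_objects(sample_response, "Question"):
    print(obj["answer"])
```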


Let's pause for a second - because we've done a lot!

#### What did we just do?

Here is a conceptual diagram

![img](https://github.com/weaviate-tutorials/intro-workshop/blob/main/images/object_import_process_full.png?raw=1)

## Step 3: Work with the data

Let's try a few more involved queries

### Filtering (similar to WHERE filter in SQL)

Let's find objects that meet a particular condition.

In [13]:
where_filter = {
    "path": ["question"],
    "operator": "Like",
    "valueText": "*history*"
}

response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_where(where_filter)
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "answer": "the Field Museum",
          "question": "What was once the Chicago Natural History Museum is now called this, after its founder"
        },
        {
          "answer": "the draft",
          "question": "You're in the Army now--in 1940 FDR instituted the first peacetime one of these in U.S. history"
        }
      ]
    }
  }
}


We can also use multiple filters

In [14]:
where_filter = {
    "operator": "Or",
    "operands": [
        {
            "path": ["question"],
            "operator": "Like",
            "valueText": "*history*"            
        },
        {
            "path": ["answer"],
            "operator": "Like",
            "valueText": "*history*"            
        },        
    ]
}

response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_where(where_filter)
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "answer": "the Field Museum",
          "question": "What was once the Chicago Natural History Museum is now called this, after its founder"
        },
        {
          "answer": "\"A Brief History Of Time In A Bottle\"",
          "question": "Stephen Hawking's 1988 bio of the universe that was a No. 1 hit for Jim Croce"
        }
      ]
    }
  }
}
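Nested filter dictionaries get verbose quickly. One option is to build them with small functions, as in the sketch below. The names `like` and `any_of` are hypothetical helpers, not part of the Weaviate client; they only assemble the same dictionary structure used in the cell above.

```python
def like(prop, pattern):
    """Build a `Like` filter on a single text property."""
    return {"path": [prop], "operator": "Like", "valueText": pattern}


def any_of(*operands):
    """Combine filters so that an object matching any one of them passes."""
    return {"operator": "Or", "operands": list(operands)}


where_filter = any_of(
    like("question", "*history*"),
    like("answer", "*history*"),
)
print(where_filter)
```

The result can be passed to `.with_where(where_filter)` exactly as before. Using `"And"` as the operator instead would require every condition to match.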


But this does not rank the results in any meaningful way.

For that, we need a keyword search (as opposed to a keyword *filter*).

### Keyword search

Unlike a keyword filter, a keyword search (BM25) ranks the matching results by relevance, based largely on how often the keyword appears.

In [15]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_bm25("history")
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "answer": "\"A Brief History Of Time In A Bottle\"",
          "question": "Stephen Hawking's 1988 bio of the universe that was a No. 1 hit for Jim Croce"
        },
        {
          "answer": "Oil",
          "question": "The Drake Well Museum in Titusville, Penn. is dedicated to the history of this industry"
        },
        {
          "answer": "the Field Museum",
          "question": "What was once the Chicago Natural History Museum is now called this, after its founder"
        }
      ]
    }
  }
}
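To give a feel for how a BM25 ranking arises, here is a minimal sketch of the textbook BM25 formula in plain Python. It is an illustration only: Weaviate's own implementation, tokenization, and handling of multiple properties may differ. The defaults `k1=1.2` and `b=0.75` mirror the `invertedIndexConfig` values shown in the class schema earlier, and the toy documents are shortened from this dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with the textbook BM25 formula."""
    docs = [tokenize(d) for d in documents]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            # Rarer terms get a higher inverse document frequency
            n_containing = sum(1 for d in docs if term in d)
            idf = math.log((n_docs - n_containing + 0.5) / (n_containing + 0.5) + 1)
            freq = tf[term]
            # Longer-than-average documents are penalized via b
            norm = freq + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * freq * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "A Brief History Of Time In A Bottle",
    "The Drake Well Museum is dedicated to the history of this industry",
    "He poured and splattered paint onto the canvas",
]
for text, score in zip(docs, bm25_scores("history", docs)):
    print(f"{score:.3f}  {text}")
```

Both of the first two documents contain "history" once, but the shorter one scores higher because of length normalization; the third scores zero.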


### Semantic search

A semantic search, on the other hand, finds objects based on similarity of meaning rather than exact keywords.

In [16]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_near_text({"concepts": ["history"]})
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "answer": "The Rijksmuseum",
          "question": "This Dutch national art museum had its origins in one founded by Louis Bonaparte in 1808"
        },
        {
          "answer": "Shinto",
          "question": "Compiled in 712, the Kojiki, \"Records of Ancient Matters\", is one of this religion's oldest texts"
        }
      ]
    }
  }
}


#### How does this work?

- Under the hood, this uses a vector search. It looks for objects which are the most similar to a text input.
- We can inspect the similarity along with the results.

In [17]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_near_text({"concepts": ["history"]})
    .with_additional("distance")
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "_additional": {
            "distance": 0.19906706
          },
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "_additional": {
            "distance": 0.20576918
          },
          "answer": "The Rijksmuseum",
          "question": "This Dutch national art museum had its origins in one founded by Louis Bonaparte in 1808"
        },
        {
          "_additional": {
            "distance": 0.20847362
          },
          "answer": "Shinto",
          "question": "Compiled in 712, the Kojiki, \"Records of Ancient Matters\", is one of this religion's oldest texts"
        }
      ]
    }
  }
}
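The `distance` values above are cosine distances, as set by `"distance": "cosine"` in the class definition: one minus the cosine similarity of the query vector and the object vector, so smaller means more similar. The sketch below computes this by hand and ranks a few made-up 3-dimensional vectors by brute force. It is for illustration only; Weaviate uses an approximate HNSW index rather than comparing against every vector, and real embeddings have far more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)


def nearest(query_vector, vectors, limit=3):
    """Rank stored vectors by distance to the query, closest first."""
    ranked = sorted(
        (cosine_distance(query_vector, v), name) for name, v in vectors.items()
    )
    return ranked[:limit]


# Made-up 3-dimensional vectors, for illustration only
toy_vectors = {
    "museum": [0.9, 0.1, 0.0],
    "paint": [0.1, 0.9, 0.1],
    "physics": [0.0, 0.2, 0.9],
}
for distance, name in nearest([1.0, 0.0, 0.0], toy_vectors, limit=2):
    print(f"{name}: {distance:.3f}")
```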


This is where "vectors" come in. 

Each object in Weaviate includes a vector - like so:

In [18]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_additional("vector")
    .with_limit(1)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "_additional": {
            "vector": [
              -0.005387872,
              0.008904468,
              -0.017452504,
              0.0011313576,
              -0.01096669,
              -0.002116323,
              -0.0040671593,
              -0.0146264965,
              -0.0038603004,
              -0.015173877,
              0.011895963,
              0.016485043,
              -0.0016485042,
              -0.009216348,
              0.012494261,
              0.0017328389,
              0.011564989,
              0.004601809,
              0.00590661,
              -0.002918298,
              0.0069313557,
              0.018572723,
              -0.007892452,
              -0.029278453,
              -0.010603892,
              0.015186606,
              0.01498293,
              -0.0354142,
              0.0038380234,
              -0.010864852,
              0.015072038,
              0.00046940998,
       

These vector representations come from deep learning models similar to those that power LLMs. They capture meaning, and are called vector "embeddings".

### Generative search

A generative search transforms your data at retrieval time. 

In [19]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_near_text({"concepts": ["history"]})
    .with_generate(single_prompt="Write a tweet about {question} as an interesting factoid.")
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "\"Did you know? \ud83d\ude8c A fascinating piece of history lies in Hibbing, Minn.! \ud83c\udfde\ufe0f The local museum takes you back to 1914, where a bus company was founded using Hupmobiles! \ud83d\ude8d Explore the rich heritage and evolution of transportation at this hidden gem. \ud83c\udf1f #Hibbing #Museum #TransportationHistory\""
            }
          },
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "\"\ud83c\udfa8 Did you know? The Dutch national art museum, which we now know as the Rijksmuseum, traces its roots back to 1808 when Louis Bonaparte established it. Talk

You can see here ⬆️ that each object has been transformed into a tweet by the LLM based on our prompt.
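Conceptually, a `single_prompt` is a template: for each retrieved object, the `{question}` placeholder is replaced with that object's property value before the prompt is sent to the LLM. The sketch below imitates that substitution in plain Python. It is an approximation for illustration; the real substitution happens inside Weaviate's generative module, and `fill_prompt` is a hypothetical name.

```python
import re


def fill_prompt(template, properties):
    """Replace each {name} placeholder with the matching property value."""
    def substitute(match):
        # Raises KeyError if the template names a property the object lacks
        return str(properties[match.group(1)])
    return re.sub(r"\{(\w+)\}", substitute, template)


obj = {
    "question": "Any pigment on the wall so faded you can barely see it",
    "answer": "faint paint",
}
print(fill_prompt("Translate {question} into French.", obj))
```

This also shows why the property named in the prompt should be one of the properties requested in `.get(...)`.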

You can ask LLMs to perform all sorts of tasks

In [20]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_near_text({"concepts": ["history"]})
    .with_generate(single_prompt="Translate {question} into French.")
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "Un mus\u00e9e \u00e0 Hibbing, dans le Minnesota, retrace l'histoire de cette compagnie de bus fond\u00e9e en 1914 en utilisant des Hupmobiles."
            }
          },
          "answer": "Greyhound",
          "question": "A Hibbing, Minn. museum traces the history of this bus company founded there in 1914 using Hupmobiles"
        },
        {
          "_additional": {
            "generate": {
              "error": null,
              "singleResult": "Ce mus\u00e9e national d'art n\u00e9erlandais trouve ses origines dans celui fond\u00e9 par Louis Bonaparte en 1808."
            }
          },
          "answer": "The Rijksmuseum",
          "question": "This Dutch national art museum had its origins in one founded by Louis Bonaparte in 1808"
        },
        {
          "_additional": {
            "generate

The LLM is multi-lingual!

You can also send groups of results to the LLM with Weaviate.

In [21]:
response = (
    client.query
    .get("Question", ["question", "answer"])
    .with_near_text({"concepts": ["history"]})
    .with_generate(grouped_task="Write a poem about these facts")
    .with_limit(3)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "Question": [
        {
          "_additional": {
            "generate": {
              "error": null,
              "groupedResult": "In the land of Hibbing, where history resides,\nA museum stands tall, where knowledge abides.\nTracing the footsteps of a company grand,\nFounded in 1914, by a visionary hand.\n\nGreyhound, the name that echoes through time,\nA bus company born, with a purpose sublime.\nUsing Hupmobiles, they embarked on a quest,\nConnecting people, from east to the west.\n\nIn Hibbing, the stories of old come alive,\nAs the wheels of progress continue to drive.\nThrough exhibits and artifacts, we can explore,\nThe legacy of Greyhound, forevermore.\n\nAcross the ocean, in a land far away,\nThe Rijksmuseum stands, where art holds sway.\nA Dutch national treasure, with a rich history,\nBorn from the vision of Louis Bonaparte's decree.\n\nIn 1808, the seeds were sown,\nA museum of art, where beauty was shown.\nThrough the ages, it grew a

The output for a grouped task is contained in the first response object. 

So let's take a closer look at that one :) 

In [22]:
print(response["data"]["Get"]["Question"][0]["_additional"]["generate"]["groupedResult"])

In the land of Hibbing, where history resides,
A museum stands tall, where knowledge abides.
Tracing the footsteps of a company grand,
Founded in 1914, by a visionary hand.

Greyhound, the name that echoes through time,
A bus company born, with a purpose sublime.
Using Hupmobiles, they embarked on a quest,
Connecting people, from east to the west.

In Hibbing, the stories of old come alive,
As the wheels of progress continue to drive.
Through exhibits and artifacts, we can explore,
The legacy of Greyhound, forevermore.

Across the ocean, in a land far away,
The Rijksmuseum stands, where art holds sway.
A Dutch national treasure, with a rich history,
Born from the vision of Louis Bonaparte's decree.

In 1808, the seeds were sown,
A museum of art, where beauty was shown.
Through the ages, it grew and evolved,
Preserving masterpieces, for all to behold.

From Rembrandt's strokes to Van Gogh's flair,
The Rijksmuseum's halls, a haven so rare.
A sanctuary of culture, where passions ignite,
G
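The grouped result sits at a deeply nested path, and the `generate` block also carries an `error` field. A small helper can handle both. This is a sketch over the response dictionary only; `grouped_result` is a hypothetical name and not part of the Weaviate client.

```python
def grouped_result(response, class_name):
    """Return the grouped-task output from a generative query response."""
    # The grouped output is attached to the first object only
    first = response["data"]["Get"][class_name][0]
    generate = first["_additional"]["generate"]
    if generate.get("error"):
        raise RuntimeError(f"Generation failed: {generate['error']}")
    return generate["groupedResult"]


# Response in the same shape as the grouped-task output above (text shortened)
sample_response = {
    "data": {"Get": {"Question": [
        {"_additional": {"generate": {
            "error": None,
            "groupedResult": "In the land of Hibbing, where history resides,",
        }}},
    ]}}
}
print(grouped_result(sample_response, "Question"))
```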

Look how far we've come in a short time, and we can do much more than that!

Here's something I prepared earlier.

## What more can we do with Weaviate?

Here is a demo instance that you can connect to and try out. 

As with many production clusters, this instance has a read-only API key set up that you can use.

In [23]:
api_headers = {
    "X-OpenAI-Api-Key": os.environ["OPENAI_APIKEY"],
}

# Instantiate the client with the auth config
client = weaviate.Client(
    url="https://edu-demo.weaviate.network",
    auth_client_secret=weaviate.AuthApiKey(
        api_key="learn-weaviate"
    ),
    additional_headers=api_headers
)

This instance is populated with the first two chapters of the "Pro Git" book.

In [24]:
response = (
    client.query
    .get("GitBookChunk", ["chunk", "chunk_index", "chapter_title"])
    .with_limit(2)
    .do()
)

jprint(response)

{
  "data": {
    "Get": {
      "GitBookChunk": [
        {
          "chapter_title": "01-introduction",
          "chunk": "== Distributed Version Control Systems\n\n(((version control,distributed)))\nThis is where Distributed Version Control Systems (DVCSs) step in.\nIn a DVCS (such as Git, Mercurial, Bazaar or Darcs), clients don't just check out the latest snapshot of the files; rather, they fully mirror the repository, including its full history.\nThus, if any server dies, and these systems were collaborating via that server, any of the client repositories can be copied back up to the server to restore it.\nEvery clone is really a full backup of all the data.\n\n.Distributed version control diagram\nimage::images/distributed.png[Distributed version control diagram]\n\nFurthermore, many of these systems deal pretty well with having several remote repositories they can work with, so you can collaborate with different groups of people in different ways simultaneously within the sam

Using Weaviate, we can talk to this book!

Let's see what the book says about ways of undoing commits.

In [25]:
response = (
    client.query
    .get("GitBookChunk", ["chunk", "chunk_index", "chapter_title"])
    .with_near_text({"concepts": ["undo a git commit"]})
    .with_generate(grouped_task="key concepts contained here in bullet points")
    .with_limit(3)
    .do()
)

Take a look at the results as we've done before

In [26]:
print(response["data"]["Get"]["GitBookChunk"][0]["_additional"]["generate"]["groupedResult"])

- The concept of undoing changes in Git
- The use of the `git commit --amend` command to redo a commit
- The use of the `git reset` command to undo changes in the working directory


And the information that this is based on:

In [27]:
for o in response["data"]["Get"]["GitBookChunk"]:
    print(f"========== Chunk: {o['chunk_index']} ==========")
    print(o["chunk"])

===

[[_undoing]]= Undoing Things

At any stage, you may want to undo something.
Here, we'll review a few basic tools for undoing changes that you've made.
Be careful, because you can't always undo some of these undos.
This is one of the few areas in Git where you may lose some work if you do it wrong.

One of the common undos takes place when you commit too early and possibly forget to add some files, or you mess up your commit message.
If you want to redo that commit, make the additional changes you forgot, stage them, and commit again using the `--amend` option:

[source,console]
----
$ git commit --amend
----

This command takes your staging area and uses it for the commit.
If you've made no changes since your last commit (for instance, you run this command immediately after your previous commit), then your snapshot will look exactly the same, and all you'll change is your commit message.

The same commit-message editor fires up, but it already contains the message of your previous

You can do strange and wonderful things - like this:

In [28]:
response = (
    client.query
    .get("GitBookChunk", ["chunk", "chunk_index", "chapter_title"])
    .with_near_text({"concepts": ["history of git"]})
    .with_generate(grouped_task="explain these results in a short children's story, with emojis.")
    .with_limit(3)
    .do()
)

In [29]:
print(response["data"]["Get"]["GitBookChunk"][0]["_additional"]["generate"]["groupedResult"])

Once upon a time, there was a little penguin named Linux 🐧. Linux had a big project called the Linux kernel, which was a special kind of software. But Linux needed a way to keep track of all the changes and updates to the software.

At first, Linux used patches and archived files to share the changes with others. But then, in 2002, a new tool called BitKeeper came along. BitKeeper helped Linux and its friends work together on the project. 🤝

But, as time went on, there was trouble in paradise. The relationship between Linux and BitKeeper broke down in 2005, and BitKeeper took away its free status. 😔

But Linux and its creator, Linus Torvalds, didn't give up! They decided to create their own tool called Git. 🚀

Git had some special goals: it wanted to be fast ⚡, simple, and able to handle lots of different changes happening at the same time. It also wanted to be able to work on big projects like the Linux kernel. 🌟

Since its birth in 2005, Git has grown and become even better. It's sup

In [30]:
for o in response["data"]["Get"]["GitBookChunk"]:
    print(f"========== Chunk: {o['chunk_index']} ==========")
    print(o["chunk"])

=== A Short History of Git

As with many great things in life, Git began with a bit of creative destruction and fiery controversy.

The Linux kernel is an open source software project of fairly large scope.(((Linux)))
During the early years of the Linux kernel maintenance (1991–2002), changes to the software were passed around as patches and archived files.
In 2002, the Linux kernel project began using a proprietary DVCS called BitKeeper.(((BitKeeper)))

In 2005, the relationship between the community that developed the Linux kernel and the commercial company that developed BitKeeper broke down, and the tool's free-of-charge status was revoked.
This prompted the Linux development community (and in particular Linus Torvalds, the creator of Linux) to develop their own tool based on some of the lessons they learned while using BitKeeper.(((Linus Torvalds)))
Some of the goals of the new system were as follows:

* Speed
* Simple design
* Strong support for non-linear development (thousands 

And a lot more. 

Weaviate makes it easy for you to work with your data and these AI models, at scale. As a vector database, Weaviate is built to handle data stores with tens or hundreds of millions of objects!