# Astra DB

>[Astra DB](https://astra.datastax.com) is a NoSQL, row-oriented, highly scalable and highly available database.

To run this notebook you need a DataStax Astra DB instance running in the cloud (you can get one for free at [datastax.com](https://astra.datastax.com)).

You should ensure you have `astrapy` installed:

In [None]:
!pip install "astrapy>=0.5.3"

### Import needed packages

In [None]:
import getpass
import openai

from llama_index import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
)
from llama_index.vector_stores import AstraDBVectorStore

### Please provide database connection parameters and secrets:

In [None]:
api_endpoint = input(
    "\nPlease enter your Database API Endpoint URL (e.g. 'https://...apps.astra.datastax.com'):"
)

token = getpass.getpass(
    "\nPlease enter your 'Database Administrator' Token (e.g. 'AstraCS:...'):"
)

OPENAI_API_KEY = getpass.getpass("OpenAI API Key:")
openai.api_key = OPENAI_API_KEY
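
For unattended runs, the interactive prompts above can be backed by environment variables. The following is a minimal sketch, assuming the variable names `ASTRA_DB_API_ENDPOINT` and `ASTRA_DB_APPLICATION_TOKEN` (chosen for illustration; no library requires them). `read_setting` is a hypothetical helper, not part of `astrapy` or LlamaIndex:

```python
import getpass
import os


def read_setting(env_name, prompt, secret=False, environ=None):
    """Return the value of env_name if it is set, otherwise prompt for it.

    The variable names passed in are the caller's choice (an assumption of
    this sketch), not names mandated by any library.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_name, "").strip()
    if value:
        return value
    # Fall back to an interactive prompt; hide the input for secrets.
    ask = getpass.getpass if secret else input
    return ask(prompt).strip()


# Possible usage in this notebook (commented out so nothing prompts on import):
# api_endpoint = read_setting("ASTRA_DB_API_ENDPOINT", "Database API endpoint: ")
# token = read_setting("ASTRA_DB_APPLICATION_TOKEN", "Token: ", secret=True)
```

Passing `environ` explicitly makes the helper easy to exercise without touching the real environment.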

### Load some example data:

In [None]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2023-11-03 06:48:55--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2023-11-03 06:48:55 (2.90 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



### Read the data:

In [None]:
# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
print(f"Total documents: {len(documents)}")
print(f"First document, id: {documents[0].doc_id}")
print(f"First document, hash: {documents[0].hash}")
print(
    "First document, text"
    f" ({len(documents[0].text)} characters):\n{'='*20}\n{documents[0].text[:360]} ..."
)

Total documents: 1
First document, id: b88ff080-c0b4-423b-87e9-218196974a7c
First document, hash: 319a86a522673c7b5040379a0795c86c9b39dd758ad262dafdf83f86095298ae
First document, text (75014 characters):
====================


What I Worked On

February 2021

Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined ma ...


### Create the Astra DB Vector Store object:

In [None]:
astra_db_store = AstraDBVectorStore(
    token=token,
    api_endpoint=api_endpoint,
    collection_name="astra_v_table",
    embedding_dimension=1536,
)
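
The `embedding_dimension` has to match the length of the vectors produced by the embedding model. 1536 is the size of OpenAI's `text-embedding-ada-002` vectors, which this notebook presumably relies on through the LlamaIndex default. A minimal sketch of a guard follows; `check_dimension` is a hypothetical helper and the all-zero vector merely stands in for a real embedding:

```python
EXPECTED_DIMENSION = 1536  # the value passed as embedding_dimension above


def check_dimension(vector, expected=EXPECTED_DIMENSION):
    """Raise ValueError if the vector length differs from the collection's dimension."""
    actual = len(vector)
    if actual != expected:
        raise ValueError(
            f"embedding has {actual} dimensions, collection expects {expected}"
        )
    return actual


# Stand-in for a real embedding; in practice it would come from the embedding model.
fake_embedding = [0.0] * 1536
check_dimension(fake_embedding)
```

Running such a check on one sample embedding before indexing can surface a model or dimension mismatch early.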

### Build the Index from the Documents:

In [None]:
storage_context = StorageContext.from_defaults(vector_store=astra_db_store)

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)
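
`from_documents` does more than copy the file into the database: the document is split into smaller chunks (nodes), each chunk is embedded, and the resulting vectors are written to the collection. The splitting step can be pictured with the simplified character-based chunker below. It is an illustration only, since LlamaIndex's own splitter works with tokens and sentence boundaries; `chunk_text`, its sizes, and the sample string are made up for the example:

```python
def chunk_text(text, chunk_size=200, overlap=20):
    """Split text into overlapping fixed-size character windows."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "x" * 500
pieces = chunk_text(sample)
print(len(pieces), [len(p) for p in pieces])
```

The overlap keeps a little shared context between neighbouring chunks, so a sentence cut at a boundary still appears whole in at least one of them.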

### Query using the index:

In [None]:
query_engine = index.as_query_engine()
response = query_engine.query("Why did the author choose to work on AI?")
print(response.response)

The author chose to work on AI because they believed that AI had the potential to achieve higher levels of intelligence, as demonstrated by their fascination with the SHRDLU program. They were initially drawn to AI because it seemed like a promising field that could bridge the gap between natural language understanding and computer programs. However, as they delved deeper into AI during their graduate studies, they realized that the existing approaches to AI, which involved translating natural language into formal representations, were fundamentally flawed and could not lead to true understanding. Despite this realization, the author still found value in Lisp, the programming language associated with AI, and decided to focus on it, eventually writing a book about Lisp hacking.
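
Behind `query_engine.query`, the question is embedded, the vector store returns the stored chunks whose vectors are closest to it, and those chunks are handed to the LLM as context. The retrieval step can be sketched in plain Python. Cosine similarity is used here as a common choice of metric, and the three-dimensional vectors and chunk labels are invented for illustration; the real search runs inside Astra DB:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """Return the k (label, score) pairs most similar to query_vector."""
    scored = [
        (label, cosine_similarity(query_vector, vec)) for label, vec in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "collection": labels and vectors are invented for this sketch.
stored_chunks = {
    "chunk about AI": [0.9, 0.1, 0.0],
    "chunk about painting": [0.0, 0.2, 0.9],
    "chunk about Lisp": [0.7, 0.6, 0.1],
}
question_vector = [1.0, 0.2, 0.0]
for label, score in top_k(question_vector, stored_chunks):
    print(f"{score:.3f}  {label}")
```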


In [None]:
query_engine = index.as_query_engine()
response = query_engine.query(
    "Why did the author choose to work on AI? Answer in a single short sentence."
)
print(response.response)

The author chose to work on AI because they believed it was the path to achieving intelligence.
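
To see which chunks an answer was based on, the response object exposes the retrieved nodes as `response.source_nodes`, each carrying a similarity `score` and the underlying `node`. The helper below is a sketch that relies only on those attributes and on `node.get_content()`. `summarize_sources` is a hypothetical name, and the stand-in objects exist just so the snippet runs without a database:

```python
from types import SimpleNamespace


def summarize_sources(response, max_chars=60):
    """Return one 'score  text-preview' line per source node of a response."""
    lines = []
    for item in response.source_nodes:
        text = item.node.get_content().strip().replace("\n", " ")
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"{score}  {text[:max_chars]}")
    return lines


# Stand-in objects mimicking the attributes used above (not real LlamaIndex classes).
fake_node = SimpleNamespace(get_content=lambda: "What I Worked On\n\nFebruary 2021")
fake_response = SimpleNamespace(
    source_nodes=[SimpleNamespace(node=fake_node, score=0.9123)]
)
print("\n".join(summarize_sources(fake_response)))
```

In the notebook itself, `summarize_sources(response)` would be called on the object returned by `query_engine.query`.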
