In [140]:
import os
import re
from openai import OpenAI
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

The task here was to build a basic retrieval-augmented generation (RAG) tool in as little time as possible. To avoid running any model locally, I used the OpenAI API for both the LLM and the embeddings. It took only a couple of hours and should demonstrate the basic principles of RAG.

Note: only data from Hegel's work "The Phenomenology of Spirit" was used here! So don't complain if it's not the best Hegelian philosophy! 

Three approaches are attempted below. The first uses whole chapters/aphorisms from the source text as chunks for vectorization. The second uses individual sentences. The third, which appears to be the most effective, also uses sentence chunks, but once it finds the k best matches, it pads each chunk passed to the LLM with the sentences before and after the selected sentence to provide additional context.

Cosine similarity was used as the similarity measure between the vectorized query and the database of chunk vectors. A higher value means a closer match.
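As a small illustration of the measure itself (not the notebook's code, which relies on scikit-learn), here is a pure-Python sketch. The helper name `cosine` and the toy vectors are assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

query = [1.0, 0.0]
same_direction = [3.0, 0.0]  # longer vector, same direction
orthogonal = [0.0, 2.0]
opposite = [-1.0, 0.0]

print(cosine(query, same_direction))  # 1.0
print(cosine(query, orthogonal))      # 0.0
print(cosine(query, opposite))        # -1.0
```

The score depends only on direction, not on vector length, so the longer vector pointing the same way still scores 1.0.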

The LLM used for the test is gpt-3.5-turbo. The embedding model used to vectorize the chunks is text-embedding-ada-002.

# Importing the raw text data from a markdown file taken from a pdf parsed by llama

In [144]:
# Import raw text data 

with open("hegelpos.md", "r", encoding="utf8") as ff:
    data = ff.read()

In [146]:
# Sample of raw text data

data[:1000]

'\n        Preface\n\n¶1. It is customary to begin a work by explaining in a preface the aim that the author\nset himself in the work, his reasons for writing it, and the relationship in which he\nbelieves it to stand to other earlier or contemporary treatments of the same subject.\nIn the case of a philosophical work, however, such an explanation seems not only\nsuperfluous but, in view of the nature of the Thing,1                          even inappropriate and\nmisleading. For the sort of statement that might properly be made about philosophy\nin a preface—say, a historical            report   of the main direction and standpoint, of the\ngeneral content and results, a string of desultory assertions and assurances about the\ntrue—cannot be accepted as the way and manner in which to expound philosophical\ntruth. Also, philosophy moves essentially in the element2                            of universality that\nembraces the particular within itself, and this creates the impression, mo

In [148]:
type(data)

str

# Cleaning the text data a bit to make it easier for the LLM to parse

In [151]:
cleandata = data.strip()

In [153]:
cleandata = cleandata.replace("\n", " ")

In [155]:
pattern = '[\uF000-\uF999]'

cleandata = re.sub(pattern, " ", cleandata)

In [157]:
pattern = r'\s+'  # raw string, so \s is not treated as an invalid string escape

cleandata = re.sub(pattern, " ", cleandata)
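To see what the four cleaning steps do together, the sketch below applies them to a short made-up string. The sample text is an assumption; the operations are the same ones used in the cells above.

```python
import re

# Made-up sample: hard-wrapped lines, extra spaces, and one stray
# character (U+F0B7) that falls inside the range the notebook strips.
sample = "\n   Preface\n\n¶1. It is customary\nto begin   a work \uf0b7 by explaining\n"

cleaned = sample.strip()                           # drop leading/trailing whitespace
cleaned = cleaned.replace("\n", " ")               # join hard-wrapped lines
cleaned = re.sub("[\uF000-\uF999]", " ", cleaned)  # blank out the stray code points
cleaned = re.sub(r"\s+", " ", cleaned)             # collapse runs of whitespace

print(cleaned)
# Preface ¶1. It is customary to begin a work by explaining
```

The ¶ markers survive the cleaning, which matters because Approach 1 splits on them.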

# APPROACH 1

### Creating chunks using the existing aphorisms

In [161]:
splitdata = cleandata.split("¶")

In [163]:
# Example of chapter/aphorism 

splitdata[1]

'1. It is customary to begin a work by explaining in a preface the aim that the author set himself in the work, his reasons for writing it, and the relationship in which he believes it to stand to other earlier or contemporary treatments of the same subject. In the case of a philosophical work, however, such an explanation seems not only superfluous but, in view of the nature of the Thing,1 even inappropriate and misleading. For the sort of statement that might properly be made about philosophy in a preface—say, a historical report of the main direction and standpoint, of the general content and results, a string of desultory assertions and assurances about the true—cannot be accepted as the way and manner in which to expound philosophical truth. Also, philosophy moves essentially in the element2 of universality that embraces the particular within itself, and this creates the impression, more here than in the case of other sciences, that the Thing itself, in all its essentials, is expr

# Creating API client from OpenAI

In [166]:
client = OpenAI(api_key="<ADD YOUR OWN API KEY TO RUN CODE>")

In [167]:
resp = client.chat.completions.create(
 model="gpt-3.5-turbo",
 messages=[{"role": "user", "content": "Hello world"}]
)

In [169]:
resp

ChatCompletion(id='chatcmpl-A9YfJF5MAeb9NGwufn7SriBXUMcHo', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello! How can I assist you today?', role='assistant', function_call=None, tool_calls=None, refusal=None))], created=1726841569, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=9, prompt_tokens=9, total_tokens=18, completion_tokens_details={'reasoning_tokens': 0}))

In [170]:
resp.choices[0].message.content

'Hello! How can I assist you today?'

# Function to return text response from client query

In [173]:
def answerme(client, question):

    resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": question}])

    return resp.choices[0].message.content

In [177]:
print(answerme(client, "What is the port of Aysén famous for?"))

The port of Aysén, located in the Aysén Region of Chile, is famous for being the main maritime gateway to the town of Puerto Aysén and the surrounding area. It is known for its picturesque landscapes, including the nearby Marble Caves, a popular tourist attraction. The port also serves as an important hub for fishing, agriculture, and tourism industries in the region.


# Function to embed a chunk of text

In [179]:
def get_embedding(client, text, model="text-embedding-ada-002"):
   text = text.replace("\n", " ")
   return client.embeddings.create(input = [text], model=model).data[0].embedding

# Vectorizing the text

In [181]:
#This only needs to be run once to obtain the vectors. 

#vectors = []

#for _ in range(0, len(splitdata)):

    #try:
        #vectors.append(get_embedding(client, splitdata[_]))
    #except:
        #vectors.append(np.zeros(1536))

In [182]:
#df = pd.DataFrame(vectors)

In [183]:
#df.to_csv("vectors.csv")
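The loop above makes one request per chunk. Since the embeddings endpoint accepts a list of inputs, one possible speed-up is to send chunks in batches. The sketch below shows only the batching logic: `embed_in_batches`, `fake_embed_batch`, and the batch size are hypothetical names and values, and the fake embedder stands in for the API so the example runs offline.

```python
def embed_in_batches(texts, embed_batch, batch_size=100):
    """Embed texts in groups, calling embed_batch once per group.

    embed_batch is a hypothetical callable: list[str] -> list[list[float]],
    returning one vector per input, in the same order.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        # Same newline clean-up that get_embedding applies to a single text
        batch = [t.replace("\n", " ") for t in texts[start:start + batch_size]]
        vectors.extend(embed_batch(batch))
    return vectors

calls = []

def fake_embed_batch(batch):
    # Stand-in for the API: records the batch size and returns
    # [character count, word count] as a toy two-dimensional "embedding".
    calls.append(len(batch))
    return [[float(len(t)), float(len(t.split()))] for t in batch]

chunks = ["one two", "three", "four five six", "seven", "eight nine"]
vectors = embed_in_batches(chunks, fake_embed_batch, batch_size=2)

print(len(vectors))  # 5
print(calls)         # [2, 2, 1]
```

With the real client, embed_batch could plausibly wrap client.embeddings.create(input=batch, model="text-embedding-ada-002") and return [d.embedding for d in resp.data]. That wiring is untested here, and any per-request input limits would need checking against the API documentation.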

# Importing existing vector frame

In [185]:
df = pd.read_csv("vectors.csv", index_col=0)

In [186]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1526,1527,1528,1529,1530,1531,1532,1533,1534,1535
0,0.014789,-0.010883,-0.037749,-0.007581,0.006367,0.023629,0.005017,-0.008140,-0.020314,0.003286,...,0.002692,-0.011487,0.005162,-0.015997,-0.016832,-0.007137,0.018040,-0.021830,0.018541,-0.013138
1,0.016903,0.012458,0.012169,-0.010006,-0.014281,0.013520,-0.031656,-0.008517,-0.006711,-0.005488,...,0.004806,-0.005062,0.011960,-0.008845,-0.038869,-0.032863,0.015618,-0.004383,0.009566,-0.050068
2,0.012877,0.003183,0.008545,-0.001652,-0.015958,0.017289,-0.017595,-0.013489,-0.009450,-0.012411,...,0.009403,-0.014853,0.015093,-0.013975,-0.035110,-0.025461,0.014521,0.003266,-0.000340,-0.039130
3,-0.008276,0.001609,0.025049,-0.009256,-0.014511,0.017089,-0.016095,-0.022203,-0.030660,-0.030365,...,0.001323,0.003054,0.014807,-0.015746,-0.032190,-0.010471,-0.002618,-0.006940,-0.003950,-0.043386
4,0.013413,-0.000642,0.026282,-0.017366,-0.002777,0.016197,-0.018176,0.006843,-0.009852,-0.012204,...,0.017047,0.004710,0.008610,-0.025777,-0.030294,-0.017858,0.032447,-0.007208,-0.016037,-0.038665
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2718,-0.011102,0.007972,0.017217,-0.009396,-0.017637,0.006784,-0.001374,-0.012690,-0.012834,-0.026036,...,0.017965,-0.010059,0.012867,-0.033069,-0.030943,0.008149,0.008950,-0.005922,-0.006791,-0.029841
2719,-0.012151,0.002775,0.023420,0.001810,-0.008421,0.025766,-0.013826,-0.032601,-0.023718,-0.010978,...,0.035368,-0.016097,0.024627,-0.042718,-0.002151,0.033632,0.003940,-0.005831,-0.006591,-0.035368
2720,0.010984,-0.017450,0.026957,-0.004615,-0.008260,0.027577,-0.003366,-0.029478,-0.020322,0.009750,...,0.021199,-0.013660,0.002067,-0.025055,-0.025891,0.001400,-0.001155,-0.020403,-0.016883,-0.017733
2721,0.007435,-0.009077,-0.009428,0.002120,-0.025064,-0.004552,-0.009012,-0.023025,-0.007493,-0.018233,...,0.006617,-0.010324,0.009376,-0.031920,-0.016947,0.023155,-0.001584,0.005912,0.006214,-0.010363


In [187]:
# Check the indexing still matches

len(splitdata)

2723

# Function to return top k matches using cosine similarity

Note that the query must be vectorized with the same model as the database. Each time a query is made, its raw text is embedded through a call to the OpenAI embeddings API (via get_embedding).

In [198]:
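def find_top_k_matches(query, vecdf, textdata, k=5):
    # Note: uses the module-level client; textdata is accepted but not used here

    # Embed the query with the same model used for the stored chunk vectors
    query_embedding = np.array(get_embedding(client, query, model="text-embedding-ada-002"))
    
    embeddings = vecdf.to_numpy()
    
    # Similarity between the query (1 x d) and every chunk vector (N x d)
    similarities = cosine_similarity(query_embedding.reshape(1,-1), embeddings).flatten()

    # Indices of the k highest similarities, best match first
    top_k_indices = similarities.argsort()[-k:][::-1]

    return top_k_indices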
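The ranking step can be tried without any API call. This sketch reproduces the idea behind similarities.argsort()[-k:][::-1] in plain Python on made-up scores. `top_k_indices` is a hypothetical helper, and the order of tied scores may differ from NumPy's.

```python
def top_k_indices(similarities, k=5):
    """Return the positions of the k largest scores, best match first."""
    order = sorted(range(len(similarities)),
                   key=lambda i: similarities[i],
                   reverse=True)
    return order[:k]

# Made-up similarity scores for five chunks
sims = [0.12, 0.87, 0.45, 0.91, 0.30]

print(top_k_indices(sims, k=3))  # [3, 1, 2]
```

The returned values are positions in the chunk list, which is why the vector frame and the text list must stay in the same order.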


In [200]:
topk = find_top_k_matches("What is the spirit in Hegel's hierarchy?", df, textdata=splitdata, k=5)

# Example of top k matches found

Here we see that our search is working. When we ask about the role of "spirit" in Hegel's philosophy, we return only chunks that discuss the topic of spirit. 

In [203]:
for _ in topk:

    print("\n", splitdata[_], "\n")


 767. 1. Spirit, as conceived by Christianity, involves three stages that correspond not only to the three persons of the Trinity, but to the three parts of Hegel’s system. The pure substance, or God the Father, is thinking or logic: cf. Hegel, SL, p.29: ‘It can therefore be said that this content is the exposition of God as he is in his eternal essence before the creation of nature and of a finite spirit.’ Logic descends into the singularity of Christ or into nature, the second part of the system. This involves representation especially, not only in religion, but also in Hegel’s system, since the description of nature requires determinate empirical concepts, such as that of a plant, not only pure thoughts, such as those of Being or of substance. Christ is other than God and nature is other than pure thought. Finally, in the holy spirit and in Hegel’s account of mind or spirit, spirit returns from otherness and representation into self- consciousness. Each of these stages is a sort of

# Function to run the RAG model

In [207]:
def rag(client, textdata, vecdf, query, k):

    ind = find_top_k_matches(query, vecdf, textdata, k)

    info = ". ".join([f"\nInfo chunk {_+1}:\n\n" + textdata[ind[_]] + "\n" for _ in range(len(ind))])

    combquery = f"The question is: {query}, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':\n {info}"

    print("What the LLM sees:\n")
    print(combquery + "\n\n")

    print("Resulting answer:\n") 
    return answerme(client, combquery)
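To make the prompt layout easier to inspect, the sketch below rebuilds the info string on toy chunks, with no API call. `build_info` and `build_prompt` are hypothetical helpers that mirror the string handling in rag, and the chunk texts are made up.

```python
def build_info(chunks):
    """Label each chunk and join the pieces with '. ', as rag does."""
    return ". ".join(
        f"\nInfo chunk {i + 1}:\n\n" + chunk + "\n"
        for i, chunk in enumerate(chunks)
    )

def build_prompt(query, chunks):
    """Combine the question, the instruction, and the retrieved chunks."""
    info = build_info(chunks)
    return (
        f"The question is: {query}, please make use of the following "
        "retrieved text written by Hegel when answering the question. "
        "NOTE: if you cannot find suitable information below, "
        f"return 'I don't know':\n {info}"
    )

toy_chunks = ["First retrieved passage.", "Second retrieved passage."]
print(build_prompt("What is spirit?", toy_chunks))
```

Because each piece ends with a newline and the separator is ". ", a line containing only ". " appears between chunks in what the LLM sees.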

# Comparison of RAG output vs default output

In [210]:
print(rag(client, textdata=splitdata, vecdf=df, query="What is spirit in Hegel's hierarchy?", k=6))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

767. 1. Spirit, as conceived by Christianity, involves three stages that correspond not only to the three persons of the Trinity, but to the three parts of Hegel’s system. The pure substance, or God the Father, is thinking or logic: cf. Hegel, SL, p.29: ‘It can therefore be said that this content is the exposition of God as he is in his eternal essence before the creation of nature and of a finite spirit.’ Logic descends into the singularity of Christ or into nature, the second part of the system. This involves representation especially, not only in religion, but also in Hegel’s system, since the description of nature requires determinate empirical concepts, such as that of a plant, not only pure thoughts, such as those of Being

In [211]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy?"))

Answer from the basic LLM:

In Hegel's hierarchy, spirit is the highest level of development of consciousness, following nature and consciousness. Spirit represents the culmination of self-awareness and self-determination, where individuals and societies are able to recognize and fulfill their own intrinsic potential. Spirit is characterized by the ability to reflect upon and shape one's own reality, embodying rationality, freedom, and morality. It is through the development of spirit that individuals and societies can achieve true self-actualization and realize their place within the larger framework of history and reality.


In [212]:
print(rag(client, textdata=splitdata, vecdf=df, query="What is the the abstract universality of Being according to Hegel?", k=6))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

51). In Hegel’s system it corresponds to the transition from Logic to the philosophy of nature, showing how the logical ‘Idea’ necessarily burgeons into a world of space, time, matter, etc. and eventually into the human mind itself. The ‘becoming of Being-there as Being-there’ is associated with (among other things) this process, the transition from logic to nature, while the ‘becoming of the essence’ is presented in the Logic itself, the extraction of the logical essence of the world and the passage to higher and higher logical categories. The becoming of the ‘substance’, i.e. of the logical ‘essence’, passes over into ‘externality’, i.e. nature, where it is ‘for another’, i.e. known to us. The bec

In [213]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the the abstract universality of Being according to Hegel?"))

Answer from the basic LLM:

The abstract universality of Being according to Hegel refers to the idea that all things in existence share a common essence or nature. In Hegel's philosophy, Being is the most basic category of existence, representing pure existence without any specific properties or characteristics. This abstract universality of Being implies that everything in the universe ultimately derives from this common source and is interconnected in a larger system of relationships. Hegel believed that by understanding this universal essence of Being, one could gain insights into the nature of reality and the interconnectedness of all things.


## Asking a question that is not discussed in the source text (i.e. no information in database)

Here we demonstrate a huge advantage of the RAG approach over the basic LLM. Hegel does not explicitly discuss feminism in The Phenomenology of Spirit. Given that the text was first published in 1807, this is unsurprising. While the baseline LLM will do its best to formulate an answer anyway, the RAG model can be instructed not to answer unless the retrieved info chunks explicitly mention the topic.

In [216]:
print(rag(client, textdata=splitdata, vecdf=df, query="What did Hegel say about feminimism?", k=6))

What the LLM sees:

The question is: What did Hegel say about feminimism?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

, Hegel uses the neuter gender (das), but the ‘shaped unchangeable’ is masculine, both here and in 
. 
Info chunk 2:

76. Hegel’s primary response to this problem is to take on board both assertions, albeit in a modified form. This is how he treats seemingly competing philoso- phies: cf. 
. 
Info chunk 3:

245, 360; Hegel (1895) II, pp. 148–65; and PS 
. 
Info chunk 4:

: ‘Note Hegel’s incorporation of Evil into the Absolute’. Whatever else Hegel may have believed about the absolute, he did not believe that it was purely good to the exclusion of evil. Evil, and falsity, must be taken on board. 
. 
Info chunk 5:

459. 1. Hegel refers back to 
. 
Info chunk 6:

213–44 for Hegel’s later use of the word. 



Resulting answer:

I don

In [217]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about feminimism?"))

Answer from the basic LLM:

Hegel did not explicitly discuss feminism in his works, as the term was not widely used during his time. However, his views on women can be inferred from his broader philosophical views on gender and the family.

Hegel believed in the concept of the family as a fundamental unit of society, with distinct roles for men and women within that structure. He viewed women as playing a more emotional and nurturing role in the family, while men played a more intellectual and active role in the public sphere.

Some modern feminists criticize Hegel for his views on gender roles, arguing that they are outdated and reinforce traditional stereotypes of women as inferior to men. However, others argue that Hegel's philosophy can be interpreted in a more progressive light, emphasizing the importance of mutual respect and recognition between men and women in all spheres of society.


# APPROACH 2

### This time we chunk the source text into sentences rather than aphorisms. Because the number of chunks is far higher, many more vector representations are needed.

In [219]:
pattern = "¶"

cleandata2 = re.sub(pattern, " ", cleandata)

### Sentence splitting function taken from: https://stackoverflow.com/questions/4576077/how-can-i-split-a-text-into-sentences

In [221]:

alphabets= "([A-Za-z])"
prefixes = "(Mr|St|Mrs|Ms|Dr)[.]"
suffixes = "(Inc|Ltd|Jr|Sr|Co)"
starters = r"(Mr|Mrs|Ms|Dr|Prof|Capt|Cpt|Lt|He\s|She\s|It\s|They\s|Their\s|Our\s|We\s|But\s|However\s|That\s|This\s|Wherever)"
acronyms = "([A-Z][.][A-Z][.](?:[A-Z][.])?)"
websites = "[.](com|net|org|io|gov|edu|me)"
digits = "([0-9])"
multiple_dots = r'\.{2,}'

def split_into_sentences(text: str) -> list[str]:
    """
    Split the text into sentences.

    If the text contains substrings "<prd>" or "<stop>", they would lead 
    to incorrect splitting because they are used as markers for splitting.

    :param text: text to be split into sentences
    :type text: str

    :return: list of sentences
    :rtype: list[str]
    """
    text = " " + text + "  "
    text = text.replace("\n"," ")
    text = re.sub(prefixes,"\\1<prd>",text)
    text = re.sub(websites,"<prd>\\1",text)
    text = re.sub(digits + "[.]" + digits,"\\1<prd>\\2",text)
    text = re.sub(multiple_dots, lambda match: "<prd>" * len(match.group(0)) + "<stop>", text)
    if "Ph.D" in text: text = text.replace("Ph.D.","Ph<prd>D<prd>")
    text = re.sub(r"\s" + alphabets + "[.] "," \\1<prd> ",text)
    text = re.sub(acronyms+" "+starters,"\\1<stop> \\2",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>\\3<prd>",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>",text)
    text = re.sub(" "+suffixes+"[.] "+starters," \\1<stop> \\2",text)
    text = re.sub(" "+suffixes+"[.]"," \\1<prd>",text)
    text = re.sub(" " + alphabets + "[.]"," \\1<prd>",text)
    if "”" in text: text = text.replace(".”","”.")
    if "\"" in text: text = text.replace(".\"","\".")
    if "!" in text: text = text.replace("!\"","\"!")
    if "?" in text: text = text.replace("?\"","\"?")
    text = text.replace(".",".<stop>")
    text = text.replace("?","?<stop>")
    text = text.replace("!","!<stop>")
    text = text.replace("<prd>",".")
    sentences = text.split("<stop>")
    sentences = [s.strip() for s in sentences]
    if sentences and not sentences[-1]: sentences = sentences[:-1]
    return sentences

# Creating chunks using sentence boundaries

In [224]:
cleandata2 = split_into_sentences(cleandata2)

In [225]:
splitdata2 = [_ for _ in cleandata2 if len(_) > 20]
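The length filter above drops very short fragments. The sketch below shows its effect on a made-up string, using a deliberately naive splitter (`naive_split`, a hypothetical stand-in and not the function above) so the example stays short.

```python
import re

def naive_split(text):
    """Split after '.', '!' or '?' followed by whitespace (no abbreviation handling)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

text = ("See p. 45. Cf. Hegel (1895). Spirit must become conscious of this "
        "immediate ethical life, thereby undermining it.")

sentences = naive_split(text)
kept = [s for s in sentences if len(s) > 20]  # same threshold as the notebook

print(sentences)
print(kept)
```

The naive splitter breaks "See p. 45." and "Cf. Hegel (1895)." into fragments, and the 20-character threshold removes them, leaving only the full sentence. Short citation debris of this kind carries little meaning on its own, so dropping it keeps noise out of the index.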

In [226]:
# Example of sentence chunk

splitdata2[1]

'In the case of a philosophical work, however, such an explanation seems not only superfluous but, in view of the nature of the Thing,1 even inappropriate and misleading.'

In [251]:
len(splitdata2)

11833

# Vectorizing the text

In [255]:
#This only needs to be run once to obtain the vectors. Run time for this one is considerable (>1 hr)

#vectors2 = []

#for _ in range(0, len(splitdata2)):

    #try:
        #vectors2.append(get_embedding(client, splitdata2[_]))
    #except:
        #vectors2.append(np.zeros(1536))

In [256]:
#df2 = pd.DataFrame(vectors2)

In [257]:
#df2.to_csv("vectors2.csv")

# Importing existing vector frame

In [263]:
df2 = pd.read_csv("vectors2.csv", index_col=0)

In [264]:
df2

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1526,1527,1528,1529,1530,1531,1532,1533,1534,1535
0,0.018040,-0.009153,-0.009210,-0.003637,-0.013296,0.012897,-0.020545,0.010336,-0.002603,0.015484,...,0.015522,-0.016711,0.010260,-0.004624,-0.029071,-0.027401,0.018824,-0.006341,0.017888,-0.025605
1,0.011153,-0.003395,-0.000075,-0.017732,-0.024523,-0.001448,-0.048917,-0.016579,-0.014529,-0.018591,...,0.002793,-0.010404,0.028136,-0.018373,-0.028725,-0.024074,0.014196,0.003732,0.004949,-0.034132
2,0.000723,0.018356,-0.002905,-0.015889,-0.012609,0.006137,-0.027160,-0.001853,0.004087,0.013973,...,-0.001609,-0.008837,0.028655,-0.014105,-0.036082,-0.046709,0.010857,-0.005032,0.016388,-0.043062
3,0.000954,0.006750,0.012694,-0.032007,-0.024352,0.012543,-0.035441,-0.019332,-0.015138,-0.016514,...,0.010780,-0.016960,0.019110,-0.014640,-0.038167,-0.011095,0.014090,-0.010263,-0.005715,-0.050592
4,-0.001529,0.027288,0.037298,-0.008323,-0.009614,0.007757,-0.032846,-0.005535,-0.002529,-0.012847,...,0.008541,0.006414,0.031924,-0.020703,-0.020032,-0.035612,0.013539,-0.003222,-0.008462,-0.030502
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11828,0.008419,-0.009955,0.024746,-0.028968,-0.012700,0.014118,-0.011399,-0.013314,-0.011510,-0.004843,...,0.029491,-0.025033,0.000802,-0.015621,-0.016458,0.012281,0.021648,-0.016327,-0.010347,-0.010412
11829,-0.006728,-0.001566,0.025972,-0.023427,-0.023012,-0.009872,0.003211,-0.011955,-0.008365,-0.026964,...,0.002866,0.012156,-0.001374,-0.019422,-0.013770,-0.023454,0.011727,0.013201,-0.003764,-0.021713
11830,0.022691,-0.026305,-0.015821,0.008549,-0.019767,0.006286,-0.025679,-0.018733,0.002104,-0.027990,...,0.020469,-0.018464,0.000081,-0.021950,-0.031847,0.040990,0.000468,0.001482,-0.020763,-0.016817
11831,-0.004274,-0.016105,0.015706,-0.021507,-0.018613,0.006576,-0.009332,-0.006377,0.003701,-0.016619,...,0.009397,0.000847,0.001629,-0.022048,-0.023694,0.017288,0.001234,0.004509,-0.016439,0.008715


In [267]:
# Check the indexing still matches up

len(splitdata2)

11833

# Comparison of RAG output vs default output

In [270]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What is spirit in Hegel's hierarchy?", k=15))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

Hegel summarizes the whole development of spirit.
. 
Info chunk 2:

III, Hegel explicitly distinguishes three stages of spirit: ‘subjective spirit’ (roughly, the individual mind), ‘objective spirit’ (the collective social life of a people), and ‘absolute spirit’ (art, religion, and philosophy).
. 
Info chunk 3:

Spirit, as conceived by Christianity, involves three stages that correspond not only to the three persons of the Trinity, but to the three parts of Hegel’s system.
. 
Info chunk 4:

Here Hegel speaks of ‘the spiritual’ (das Geistige) as well as of ‘spirit’ (Geist), perhaps because he is discussing the emergence and development of spirit, which begins not from fully fledged spirit, but the spiritual aspect of ordinary lif

In [271]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy?"))

Answer from the basic LLM:

In Hegel's philosophical system, spirit is one of the three main categories of existence, alongside nature and mind. Spirit represents the highest form of human consciousness and self-awareness, characterized by the capacity for self-reflection, reason, and moral autonomy. Hegel believed that spirit develops over time through history and culture, progressing through various stages of development towards self-realization and self-determination. Spirit is also seen as the ultimate realization of the Absolute Idea or ultimate truth.


In [272]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What is the the abstract universality of Being according to Hegel?", k=15))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

The first sentence seems to refer to the first type of universality, but Hegel’s primary concern is the second.
. 
Info chunk 2:

Hegel uses ‘universal’ liberally: the object of perception is universal in that it has many properties.
. 
Info chunk 3:

Hegel tends to equate the structure of reality (the order of Being) with the structure of our thoughts about reality (the order of knowing), as if abstract concepts were objective forces, not only driving us on from one concept to the next, but also at work causally within the world.
. 
Info chunk 4:

Hegel is now trying to show that its generation of Being-there and the subsequent dissolution of that Being-there and its replacement by another is a dee

In [273]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the the abstract universality of Being according to Hegel?"))

Answer from the basic LLM:

According to Hegel, the abstract universality of Being refers to the concept that all things in existence share a common underlying essence or reality. This means that there is a fundamental unity that connects all beings and phenomena in the world, and that this unity is the basis for understanding the nature of existence itself. Hegel argues that by recognizing this abstract universality of Being, we can gain insight into the interconnectedness of all things and the underlying principles that govern the universe. This concept is central to Hegel's philosophy of dialectical idealism, which emphasizes the importance of recognizing the unity and interconnectedness of all phenomena in order to gain a deeper understanding of reality.


In [274]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What did Hegel say about blossoms and buds?", k=15))

What the LLM sees:

The question is: What did Hegel say about blossoms and buds?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

Hegel may have this episode in mind.
. 
Info chunk 2:

But in  2 Hegel compares the succession of philosophies to the growth of a plant, which is quite different from a glass or --- an inkwell.
. 
Info chunk 3:

The bud disappears when the blossom bursts forth, and one could say that the bud is refuted by the blossom; similarly, when the fruit appears, the blossom is declared to be a false Being-there2 of the plant, and the fruit replaces the blossom as the truth of the plant.
. 
Info chunk 4:

Nevertheless, it is useful to consider a concept in terms of Hegel’s botanical analogy.
. 
Info chunk 5:

Hegel attempts to draw parallels between phases of spirit and earlier shapes of consciousness.
. 
Info chunk 6:

(The beautif

In [275]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about blossoms and buds?"))

Answer from the basic LLM:

Hegel famously said, "The bud disappears when the blossom breaks through, and we might say that the former is refuted by the latter; in the same way when the fruit comes, the blossom may be explained to be a false form of the plant's existence, for the fruit appears as its true nature in place of the blossom. The ceaseless activity of their own inherent nature makes these stages moments of an organic unity, where they not merely do not contradict one another, but where one is as necessary as the other; and constitutes thereby the life of the whole." Hegel believed that the progression from bud to blossom to fruit was a necessary and natural development in the life of a plant, with each stage representing a different phase of its existence.


# Approach 3 

### This time we chunk by sentence again, but also take n sentences to the left and right of each of the k selected sentences for context, as a synthesis of Approach 1 and Approach 2

In [278]:
def ragpad(client, textdata, vecdf, query, k, n):

    ind = find_top_k_matches(query, vecdf, textdata, k)

    chunks = []
    
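    for ki in ind:
        # Take n sentences on each side of the match, clipped to the text bounds.
        # The end is ki + n + 1 because slice ends are exclusive; ki + n would
        # give only n - 1 sentences after the match.
        start = max(ki - n, 0)
        end = min(ki + n + 1, len(textdata))
        chunk = " ".join(textdata[start:end])
        chunks.append(chunk)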

    info = " ".join([f"\nInfo chunk {_+1}:\n\n" + chunks[_] + "\n" for _ in range(len(chunks))])

    combquery = f"The question is: {query}, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':\n {info}"

    print("What the LLM sees:\n")
    print(combquery + "\n\n")

    print("Resulting answer:\n") 
    return answerme(client, combquery)
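The padding idea can be checked on a toy list without the API. This is a minimal sketch assuming a symmetric window of n sentences on each side of the match; `padded_window` and the placeholder sentences are hypothetical.

```python
def padded_window(sentences, hit, n):
    """Join the matched sentence with up to n neighbours on each side."""
    start = max(hit - n, 0)                 # clip at the start of the text
    end = min(hit + n + 1, len(sentences))  # +1 because slice ends are exclusive
    return " ".join(sentences[start:end])

sentences = [f"S{i}." for i in range(10)]

print(padded_window(sentences, hit=5, n=2))  # S3. S4. S5. S6. S7.
print(padded_window(sentences, hit=0, n=2))  # S0. S1. S2.
print(padded_window(sentences, hit=9, n=2))  # S7. S8. S9.
```

Matches near the start or end of the text simply get a shorter window. Note that windows around neighbouring matches can overlap, so the same sentence may reach the LLM more than once.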

In [279]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What is spirit in Hegel's hierarchy?", k=8, n=4))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

Spirit must become conscious of this immediate ethical life, thereby undermining it. Passing in this way from shape to shape, spirit advances to self- knowledge. These shapes are shapes of a world, located in time and space. The previous shapes, even if they presupposed a background culture, were primarily shapes of individ- ual consciousness. Hegel summarizes the whole development of spirit. It begins with the tightly knit ethical life of the Greek city-state. This is rent apart by its self-knowledge, leading to the abstract ‘right’ of the Roman Empire, which sets the Self in opposition to the substance. Division persists in the history of Europe down to the French revolution: there are two rea

In [280]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy? Please answer comprehensively"))

Answer from the basic LLM:

In Hegel's hierarchy of reality, spirit occupies the highest level of development. Hegel conceives of spirit as the culmination of the dialectical progression of consciousness and self-consciousness, where the individual realizes their own subjective freedom and self-determination.

Spirit represents the highest form of self-awareness and rationality. It is the collective consciousness of a society or culture, encompassing the shared values, beliefs, and norms that shape human behavior and institutions. Spirit is dynamic and historical, evolving over time through the interaction of individuals and social structures.

Hegel identifies three main forms of spirit: subjective spirit, objective spirit, and absolute spirit. Subjective spirit refers to individual consciousness and the inner life of the mind, including thoughts, feelings, and desires. Objective spirit pertains to the external manifestation of spirit in social institutions, laws, customs, and practic

In [281]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What is the the abstract universality of Being according to Hegel?", k=8, n=4))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

On the peculiarity of ‘I’, see further Chisholm (1981). The I (or the word ‘I’) is ‘universal’ in two distinct ways. First, the I is not inseparably attached to any particular state or object: one and the same I can see a house, a tree, or neither. Secondly, everyone is an I: the word ‘I’ is applicable by everyone to him- or herself. The first sentence seems to refer to the first type of universality, but Hegel’s primary concern is the second. The final sentence refers obliquely to Wilhelm Traugott Krug’s challenge to the new idealists to deduce the pen he is writing with: see Di Giovanni and Harris (1985), pp. The challenge would be illegitimate if it were impossib

In [283]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the the abstract universality of Being according to Hegel? Please answer comprehensively"))

Answer from the basic LLM:

In Hegel's philosophy, the abstract universality of Being refers to the most basic and fundamental category of existence. Being, in its most general sense, is the idea of pure existence without any specific qualities or characteristics. It is the starting point for Hegel's dialectical method, which proceeds through a series of stages or moments that develop and unfold from this initial point.

Hegel argues that Being is ultimately indeterminate and empty of content, as it lacks any specific characteristics or defining features. It is a universal concept that encompasses all possibilities of existence, but is itself devoid of any particularity. This abstract universality of Being serves as the foundation for Hegel's exploration of the nature of reality and the development of his system of philosophy.

Through the dialectical process, Hegel shows how Being transitions into non-Being, and how these two moments then come together to form the concept of Becoming.
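
The `ragpad` calls in this section take `k` (number of best matches) and `n` (sentences of padding on each side). The sketch below shows one way that padded retrieval could work, assuming cosine similarity over sentence vectors as described at the top of the notebook. The function `top_k_padded`, the toy vectors and the toy sentences are all made up for illustration; real embeddings would come from the embedding model, and the actual `ragpad` implementation may differ.

```python
import numpy as np


def top_k_padded(query_vec, chunk_vecs, sentences, k=2, n=1):
    """Hypothetical helper: return k padded context windows, best match first."""
    q = np.asarray(query_vec, dtype=float)
    m = np.asarray(chunk_vecs, dtype=float)
    # Cosine similarity between the query and every sentence vector.
    sims = m @ q / (np.linalg.norm(m, axis=1) * np.linalg.norm(q))
    # Indices of the k most similar sentences, highest similarity first.
    best = np.argsort(sims)[::-1][:k]
    windows = []
    for i in best:
        # Pad with up to n sentences before and after, clipped to the text bounds.
        lo, hi = max(0, i - n), min(len(sentences), i + n + 1)
        windows.append(" ".join(sentences[lo:hi]))
    return windows


# Toy data: five "sentences" with two-dimensional stand-in vectors.
sentences = ["s0", "s1", "s2", "s3", "s4"]
vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
windows = top_k_padded([1.0, 0.0], vecs, sentences, k=2, n=1)
print(windows)
```

With overlapping matches, neighbouring windows can repeat sentences; merging overlapping index ranges before joining would avoid sending duplicate text to the LLM.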

In [284]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What did Hegel say about blossoms and buds?", k=8, n=4))

What the LLM sees:

The question is: What did Hegel say about blossoms and buds?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

On the one hand, it regards the other self-consciousness as an appendage of itself. On the other hand, its pleasure depends on the other’s perceived independence. Once it achieves its aim of union with the other, it is no longer just itself, but merged together with the other, hence no longer a mere singleton but a universal. (Goethe’s Faust seduces Margaret, who then has a child; she drowns it and is condemned to death. Hegel may have this episode in mind. One possibility is that the singleton must pass from one hapless victim to the next, becoming a universal seducer. Alternatively, or additionally, Hegel may have in mind a marriage, in which the victim of seduction later becomes a wife

In [285]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about blossoms and buds? Please answer comprehensively"))

Answer from the basic LLM:

Hegel did not explicitly discuss blossoms and buds in his philosophical works. However, his dialectical method can provide insight into the concept of blooms and buds. 

In Hegel's dialectical method, reality is seen as a process of continual development and change, where everything is in a constant state of becoming. This process follows a triadic pattern of thesis, antithesis, and synthesis. 

In the context of blossoms and buds, we can think of buds as the initial stage of development, representing the thesis. Buds contain the potential for growth and transformation, but they have not yet fully developed. As the bud grows and matures, it undergoes a process of negation or opposition, represented by the antithesis. This may be in the form of challenges, obstacles, or conflicts that the bud must overcome in order to fully develop. 

Finally, the bud reaches a state of full bloom, where it has integrated the opposing forces and reached a state of harmony and