In [1]:
import os
import re
from openai import OpenAI
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

The task here was to build a basic retrieval-augmented generation (RAG) tool in as little time as possible. To avoid running a model on my side, I called the OpenAI API for both the LLM and the embeddings. It took only a couple of hours and, I hope, demonstrates the principles of RAG.

Note: only data from the Phenomenology of Spirit was used here! So don't complain if it's not the best Hegelian! 

# Importing the raw text data from a markdown file taken from a pdf parsed by llama

In [4]:
with open("hegelpos.md", "r", encoding="utf8") as ff:
    data = ff.read()

In [5]:
data[:1000]

'\n        Preface\n\n¶1. It is customary to begin a work by explaining in a preface the aim that the author\nset himself in the work, his reasons for writing it, and the relationship in which he\nbelieves it to stand to other earlier or contemporary treatments of the same subject.\nIn the case of a philosophical work, however, such an explanation seems not only\nsuperfluous but, in view of the nature of the Thing,1                          even inappropriate and\nmisleading. For the sort of statement that might properly be made about philosophy\nin a preface—say, a historical            report   of the main direction and standpoint, of the\ngeneral content and results, a string of desultory assertions and assurances about the\ntrue—cannot be accepted as the way and manner in which to expound philosophical\ntruth. Also, philosophy moves essentially in the element2                            of universality that\nembraces the particular within itself, and this creates the impression, mo

In [6]:
type(data)

str

# Cleaning the text data a bit to make it easier for the LLM to parse

In [8]:
cleandata = data.strip()

In [9]:
cleandata = cleandata.replace("\n", " ")

In [10]:
pattern = '[\uF000-\uF999]'

cleandata = re.sub(pattern, " ", cleandata)

In [11]:
pattern = r'\s+'

cleandata = re.sub(pattern, " ", cleandata)
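The cleaning steps above can be folded into a single helper so they always run in the same order. This is a minimal sketch rather than the code that was actually run: the name `clean_text` is my own, and the character range is copied unchanged from the cell above.

```python
import re

# Range copied from the cell above: private-use code points left by the PDF parser.
PRIVATE_USE = re.compile("[\uF000-\uF999]")
WHITESPACE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip the text, blank out private-use glyphs, and collapse whitespace."""
    text = raw.strip().replace("\n", " ")
    text = PRIVATE_USE.sub(" ", text)
    return WHITESPACE.sub(" ", text)


print(clean_text("\n   Preface\n\n¶1. It is   customary\uf0b7 to begin"))
```

Keeping the steps in one function makes it easy to re-clean the text if the source file is re-parsed.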

# APPROACH 1

### Creating chunks using the existing aphorisms

In [14]:
splitdata = cleandata.split("¶")

In [15]:
splitdata[708]

'708. This simple shape has thus obliterated in itself the unrest of endless singularization—the singularization of the nature-element, which acts necessarily only as universal essence, but in its Being-there and movement acts contingently, as well as the singularization of the people which, dispersed into the particular masses of activity and into individual points of self-consciousness, has a Being-there of mani- fold sense and activity—and condensed it into tranquil individuality. This individu- ality is therefore confronted by the moment of unrest, it—the essence—is confronted by self-consciousness, for which, as the birthplace of that individuality, has nothing left for itself except to be pure activity. What belongs to the substance, the artist imparted entirely to his product, but to himself as a determinate individuality he gave no actuality in his product; he could confer perfection on his product only by discarding his particularity, by disembodying himself and ascending to t

# Creating API client from OpenAI

In [17]:
client = OpenAI(api_key="ADD YOUR OWN API KEY HERE IF YOU WISH TO RUN THE CODE")

In [18]:
resp = client.chat.completions.create(
 model="gpt-3.5-turbo",
 messages=[{"role": "user", "content": "Hello world"}]
)

In [19]:
resp

ChatCompletion(id='chatcmpl-A6OpU8S8Jcy4rWclyDClpxpc5wvEn', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='Hello! How can I assist you today?', role='assistant', function_call=None, tool_calls=None, refusal=None))], created=1726088776, model='gpt-3.5-turbo-0125', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=9, prompt_tokens=9, total_tokens=18))

In [20]:
resp.choices[0].message.content

'Hello! How can I assist you today?'

# Function to return text response from client query

In [22]:
def answerme(client, question):

    resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=[{"role": "user", "content": question}])

    return resp.choices[0].message.content

In [23]:
print(answerme(client, "What is the port of Aysén famous for?"))

The port of Aysén is famous for its stunning natural beauty and is a popular destination for tourists seeking outdoor activities such as fishing, hiking, and kayaking. It is also known for its seafood, particularly its salmon production. Additionally, Aysén serves as an important transportation hub for the region, connecting the city of Coyhaique with other towns and cities in southern Chile.


# Function to embed a chunk of text

In [25]:
def get_embedding(client, text, model="text-embedding-ada-002"):
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

# Vectorizing the text

In [27]:
#This only needs to be run once to obtain the vectors. 

#vectors = []

#for _ in range(0, len(splitdata)):

    #try:
        #vectors.append(get_embedding(client, splitdata[_]))
    #except:
        #vectors.append(np.zeros(1536))

In [28]:
#df = pd.DataFrame(vectors)

In [29]:
#df.to_csv("vectors.csv")
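The loop above makes one API call per chunk. Since the embeddings endpoint accepts a list of inputs, the chunks could instead be sent in batches. The sketch below only shows the batching logic: `batched`, `embed_all`, and the `embed_batch` callable are hypothetical names, and `fake_embed` is a stand-in so the example runs without an API key. Wiring `embed_batch` to `client.embeddings.create(input=batch, model=...)` is left as an assumption about how you would connect it.

```python
from typing import Callable, Iterator


def batched(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(
    texts: list[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    size: int = 100,
) -> list[list[float]]:
    """Embed every text, calling `embed_batch` once per batch."""
    vectors: list[list[float]] = []
    for batch in batched(texts, size):
        vectors.extend(embed_batch(batch))
    return vectors


# Stand-in embedder for illustration only: maps each text to [len(text)].
def fake_embed(batch: list[str]) -> list[list[float]]:
    return [[float(len(text))] for text in batch]


print(embed_all(["a", "bb", "ccc"], fake_embed, size=2))
```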

# Importing existing vector frame

In [31]:
df = pd.read_csv("vectors.csv", index_col=0)

In [32]:
df

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1526,1527,1528,1529,1530,1531,1532,1533,1534,1535
0,0.014789,-0.010883,-0.037749,-0.007581,0.006367,0.023629,0.005017,-0.008140,-0.020314,0.003286,...,0.002692,-0.011487,0.005162,-0.015997,-0.016832,-0.007137,0.018040,-0.021830,0.018541,-0.013138
1,0.016903,0.012458,0.012169,-0.010006,-0.014281,0.013520,-0.031656,-0.008517,-0.006711,-0.005488,...,0.004806,-0.005062,0.011960,-0.008845,-0.038869,-0.032863,0.015618,-0.004383,0.009566,-0.050068
2,0.012877,0.003183,0.008545,-0.001652,-0.015958,0.017289,-0.017595,-0.013489,-0.009450,-0.012411,...,0.009403,-0.014853,0.015093,-0.013975,-0.035110,-0.025461,0.014521,0.003266,-0.000340,-0.039130
3,-0.008276,0.001609,0.025049,-0.009256,-0.014511,0.017089,-0.016095,-0.022203,-0.030660,-0.030365,...,0.001323,0.003054,0.014807,-0.015746,-0.032190,-0.010471,-0.002618,-0.006940,-0.003950,-0.043386
4,0.013413,-0.000642,0.026282,-0.017366,-0.002777,0.016197,-0.018176,0.006843,-0.009852,-0.012204,...,0.017047,0.004710,0.008610,-0.025777,-0.030294,-0.017858,0.032447,-0.007208,-0.016037,-0.038665
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2718,-0.011102,0.007972,0.017217,-0.009396,-0.017637,0.006784,-0.001374,-0.012690,-0.012834,-0.026036,...,0.017965,-0.010059,0.012867,-0.033069,-0.030943,0.008149,0.008950,-0.005922,-0.006791,-0.029841
2719,-0.012151,0.002775,0.023420,0.001810,-0.008421,0.025766,-0.013826,-0.032601,-0.023718,-0.010978,...,0.035368,-0.016097,0.024627,-0.042718,-0.002151,0.033632,0.003940,-0.005831,-0.006591,-0.035368
2720,0.010984,-0.017450,0.026957,-0.004615,-0.008260,0.027577,-0.003366,-0.029478,-0.020322,0.009750,...,0.021199,-0.013660,0.002067,-0.025055,-0.025891,0.001400,-0.001155,-0.020403,-0.016883,-0.017733
2721,0.007435,-0.009077,-0.009428,0.002120,-0.025064,-0.004552,-0.009012,-0.023025,-0.007493,-0.018233,...,0.006617,-0.010324,0.009376,-0.031920,-0.016947,0.023155,-0.001584,0.005912,0.006214,-0.010363


In [33]:
# Check the indexing still matches up

len(splitdata)

2723
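The row positions returned by the search are used directly as indices into the chunk list, so the two need exactly the same length. In the printout above the last row label shown is 2721, while `len(splitdata)` is 2723, so this is worth checking programmatically rather than by eye. A minimal sketch, with `check_alignment` as a hypothetical helper and toy data standing in for `splitdata` and `df`:

```python
import pandas as pd


def check_alignment(chunks: list[str], vecdf: pd.DataFrame) -> bool:
    """Return True when there is exactly one vector row per text chunk."""
    return len(chunks) == len(vecdf)


# Toy data standing in for splitdata and df.
toy_chunks = ["first chunk", "second chunk", "third chunk"]
toy_vectors = pd.DataFrame([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])

print(check_alignment(toy_chunks, toy_vectors))           # same length
print(check_alignment(toy_chunks, toy_vectors.iloc[:2]))  # one row short
```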

# Function to return top k matches using cosine similarity

In [35]:
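def find_top_k_matches(query, vecdf, textdata, k=5):
    # Note: `client` is read from the notebook's global scope, and `textdata`
    # is accepted but not used here.
    query_embedding = np.array(get_embedding(client, query, model="text-embedding-ada-002"))

    # One row per chunk, one column per embedding dimension.
    embeddings = vecdf.to_numpy()

    similarities = cosine_similarity(query_embedding.reshape(1, -1), embeddings).flatten()

    # argsort is ascending, so take the last k and reverse to get best-first order.
    top_k_indices = similarities.argsort()[-k:][::-1]

    return top_k_indices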
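To make the ranking step concrete without calling the API, here is a NumPy-only sketch of the same idea on toy vectors. It is not the function used in this notebook: `top_k_cosine` is a hypothetical name, and it computes cosine similarity by hand instead of using scikit-learn.

```python
import numpy as np


def top_k_cosine(query_vec: np.ndarray, matrix: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the indices of the k rows most similar to query_vec, best first."""
    # Normalise so that a dot product equals cosine similarity.
    query_unit = query_vec / np.linalg.norm(query_vec)
    row_norms = np.linalg.norm(matrix, axis=1)
    # Guard against all-zero rows, such as placeholders for failed embeddings.
    row_norms[row_norms == 0] = 1.0
    similarities = (matrix / row_norms[:, None]) @ query_unit
    return np.argsort(similarities)[::-1][:k]


toy_matrix = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
print(top_k_cosine(np.array([1.0, 0.1]), toy_matrix, k=2))
```

The all-zero row gets a similarity of 0, so a failed embedding never outranks a chunk that points in roughly the same direction as the query.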


In [36]:
topk = find_top_k_matches("What is the spirit in Hegel's hierarchy?", df, textdata=splitdata, k=5)

# Example of top k matches found

In [38]:
for _ in topk:

    print(splitdata[_])

767. 1. Spirit, as conceived by Christianity, involves three stages that correspond not only to the three persons of the Trinity, but to the three parts of Hegel’s system. The pure substance, or God the Father, is thinking or logic: cf. Hegel, SL, p.29: ‘It can therefore be said that this content is the exposition of God as he is in his eternal essence before the creation of nature and of a finite spirit.’ Logic descends into the singularity of Christ or into nature, the second part of the system. This involves representation especially, not only in religion, but also in Hegel’s system, since the description of nature requires determinate empirical concepts, such as that of a plant, not only pure thoughts, such as those of Being or of substance. Christ is other than God and nature is other than pure thought. Finally, in the holy spirit and in Hegel’s account of mind or spirit, spirit returns from otherness and representation into self- consciousness. Each of these stages is a sort of c

# Function to run the RAG model

In [40]:
def rag(client, textdata, vecdf, query, k):

    ind = find_top_k_matches(query, vecdf, textdata, k)

    info = ". ".join([f"\nInfo chunk {_+1}:\n\n" + textdata[ind[_]] + "\n" for _ in range(len(ind))])

    combquery = f"The question is: {query}, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':\n {info}"

    print("What the LLM sees:\n")
    print(combquery + "\n\n")

    print("Resulting answer:\n") 
    return answerme(client, combquery)
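Prompt assembly can be separated from the API call, which makes it possible to inspect exactly what the LLM will see without spending any tokens. A minimal sketch under that assumption: `build_prompt` is a hypothetical helper, and its wording is a shortened variant of the prompt above, not the one that produced the outputs in this notebook.

```python
def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the question and numbered context chunks into one prompt."""
    blocks = [f"Info chunk {i}:\n\n{chunk}" for i, chunk in enumerate(chunks, start=1)]
    context = "\n\n".join(blocks)
    return (
        f"The question is: {query}\n"
        "Please make use of the following retrieved text when answering. "
        "NOTE: if you cannot find suitable information below, return 'I don't know'.\n\n"
        f"{context}"
    )


print(build_prompt(
    "What is spirit?",
    ["Spirit incorporates these shapes.", "This eases the introduction of spirit."],
))
```

Joining the blocks with blank lines instead of `". "` avoids stray full stops between chunks in the prompt.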

# Comparison of RAG output vs default output

In [42]:
print(rag(client, textdata=splitdata, vecdf=df, query="What is spirit in Hegel's hierarchy?", k=6))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

767. 1. Spirit, as conceived by Christianity, involves three stages that correspond not only to the three persons of the Trinity, but to the three parts of Hegel’s system. The pure substance, or God the Father, is thinking or logic: cf. Hegel, SL, p.29: ‘It can therefore be said that this content is the exposition of God as he is in his eternal essence before the creation of nature and of a finite spirit.’ Logic descends into the singularity of Christ or into nature, the second part of the system. This involves representation especially, not only in religion, but also in Hegel’s system, since the description of nature requires determinate empirical concepts, such as that of a plant, not only pure thoughts, such as those of Being

In [43]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy?"))

Answer from the basic LLM:

In Hegel's philosophy, spirit (Geist) is the highest and most complex level in the hierarchy of development. Spirit is the culmination of the dialectical process of history, in which the contradictions and conflicts of the preceding stages (nature and human consciousness) are reconciled and synthesized at a higher level of self-awareness and freedom.

Spirit is characterized by self-consciousness, rationality, and freedom, and is manifested in various forms such as art, religion, and philosophy. It is through spirit that individuals and societies come to understand themselves as part of a larger whole and strive towards self-realization and ethical action.

Overall, spirit represents the highest stage of human development according to Hegel, where individuals and societies achieve self-awareness, moral autonomy, and unity with the absolute (God or the universal spirit).


In [44]:
print(rag(client, textdata=splitdata, vecdf=df, query="What is the the abstract universality of Being according to Hegel?", k=6))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

51). In Hegel’s system it corresponds to the transition from Logic to the philosophy of nature, showing how the logical ‘Idea’ necessarily burgeons into a world of space, time, matter, etc. and eventually into the human mind itself. The ‘becoming of Being-there as Being-there’ is associated with (among other things) this process, the transition from logic to nature, while the ‘becoming of the essence’ is presented in the Logic itself, the extraction of the logical essence of the world and the passage to higher and higher logical categories. The becoming of the ‘substance’, i.e. of the logical ‘essence’, passes over into ‘externality’, i.e. nature, where it is ‘for another’, i.e. known to us. The bec

In [45]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the the abstract universality of Being according to Hegel?"))

Answer from the basic LLM:

According to Hegel, the abstract universality of Being refers to the fundamental characteristic of existence that transcends individual particularities and differences. In other words, it is the concept of Being that is universal and applies to all things in existence, regardless of their specific attributes or qualities. This concept is central to Hegel's philosophy of dialectical idealism, which posits that reality is ultimately grounded in a universal, abstract principle that encompasses all individual beings. Hegel believed that by understanding and engaging with this abstract universality of Being, individuals could gain insight into the interconnectedness of all things and the underlying unity of the universe.


In [46]:
print(rag(client, textdata=splitdata, vecdf=df, query="What did Hegel say about feminimism?", k=6))

What the LLM sees:

The question is: What did Hegel say about feminimism?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

, Hegel uses the neuter gender (das), but the ‘shaped unchangeable’ is masculine, both here and in 
. 
Info chunk 2:

76. Hegel’s primary response to this problem is to take on board both assertions, albeit in a modified form. This is how he treats seemingly competing philoso- phies: cf. 
. 
Info chunk 3:

245, 360; Hegel (1895) II, pp. 148–65; and PS 
. 
Info chunk 4:

: ‘Note Hegel’s incorporation of Evil into the Absolute’. Whatever else Hegel may have believed about the absolute, he did not believe that it was purely good to the exclusion of evil. Evil, and falsity, must be taken on board. 
. 
Info chunk 5:

459. 1. Hegel refers back to 
. 
Info chunk 6:

213–44 for Hegel’s later use of the word. 



Resulting answer:

I don

In [47]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about feminimism?"))

Answer from the basic LLM:

Hegel did not explicitly discuss feminism in his works as it was not a prominent social or political movement during his time. However, his philosophy of dialectical idealism, which emphasizes the development of self-consciousness and individual freedom, can be interpreted as laying the groundwork for feminist ideas. In Hegel's view, individuals should strive for self-actualization and self-determination, which are central goals of feminism. Additionally, his concept of recognition and the importance of mutual respect and equality between individuals can be seen as aligning with feminist principles. Nonetheless, Hegel's ideas are complex and open to interpretation, and it is important to consider his writings in their historical context when examining his views on feminism.


# APPROACH 2

### This time we chunk the source text into sentences rather than aphorisms. Because there are far more sentences than aphorisms, we need many more embedding vectors.

In [49]:
pattern = "¶"

cleandata2 = re.sub(pattern, " ", cleandata)

### Sentence splitting function taken from: https://stackoverflow.com/questions/4576077/how-can-i-split-a-text-into-sentences

In [51]:

alphabets = r"([A-Za-z])"
prefixes = r"(Mr|St|Mrs|Ms|Dr)[.]"
suffixes = r"(Inc|Ltd|Jr|Sr|Co)"
starters = r"(Mr|Mrs|Ms|Dr|Prof|Capt|Cpt|Lt|He\s|She\s|It\s|They\s|Their\s|Our\s|We\s|But\s|However\s|That\s|This\s|Wherever)"
acronyms = r"([A-Z][.][A-Z][.](?:[A-Z][.])?)"
websites = r"[.](com|net|org|io|gov|edu|me)"
digits = r"([0-9])"
multiple_dots = r'\.{2,}'

def split_into_sentences(text: str) -> list[str]:
    """
    Split the text into sentences.

    If the text contains substrings "<prd>" or "<stop>", they would lead 
    to incorrect splitting because they are used as markers for splitting.

    :param text: text to be split into sentences
    :type text: str

    :return: list of sentences
    :rtype: list[str]
    """
    text = " " + text + "  "
    text = text.replace("\n"," ")
    text = re.sub(prefixes,"\\1<prd>",text)
    text = re.sub(websites,"<prd>\\1",text)
    text = re.sub(digits + "[.]" + digits,"\\1<prd>\\2",text)
    text = re.sub(multiple_dots, lambda match: "<prd>" * len(match.group(0)) + "<stop>", text)
    if "Ph.D" in text: text = text.replace("Ph.D.","Ph<prd>D<prd>")
    text = re.sub(r"\s" + alphabets + "[.] "," \\1<prd> ",text)
    text = re.sub(acronyms+" "+starters,"\\1<stop> \\2",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>\\3<prd>",text)
    text = re.sub(alphabets + "[.]" + alphabets + "[.]","\\1<prd>\\2<prd>",text)
    text = re.sub(" "+suffixes+"[.] "+starters," \\1<stop> \\2",text)
    text = re.sub(" "+suffixes+"[.]"," \\1<prd>",text)
    text = re.sub(" " + alphabets + "[.]"," \\1<prd>",text)
    if "”" in text: text = text.replace(".”","”.")
    if "\"" in text: text = text.replace(".\"","\".")
    if "!" in text: text = text.replace("!\"","\"!")
    if "?" in text: text = text.replace("?\"","\"?")
    text = text.replace(".",".<stop>")
    text = text.replace("?","?<stop>")
    text = text.replace("!","!<stop>")
    text = text.replace("<prd>",".")
    sentences = text.split("<stop>")
    sentences = [s.strip() for s in sentences]
    if sentences and not sentences[-1]: sentences = sentences[:-1]
    return sentences

# Creating chunks from individual sentences

In [53]:
cleandata2 = split_into_sentences(cleandata2)

In [54]:
splitdata2 = [_ for _ in cleandata2 if len(_) > 20]

In [55]:
splitdata2

['It is customary to begin a work by explaining in a preface the aim that the author set himself in the work, his reasons for writing it, and the relationship in which he believes it to stand to other earlier or contemporary treatments of the same subject.',
 'In the case of a philosophical work, however, such an explanation seems not only superfluous but, in view of the nature of the Thing,1 even inappropriate and misleading.',
 'For the sort of statement that might properly be made about philosophy in a preface—say, a historical report of the main direction and standpoint, of the general content and results, a string of desultory assertions and assurances about the true—cannot be accepted as the way and manner in which to expound philosophical truth.',
 'Also, philosophy moves essentially in the element2 of universality that embraces the particular within itself, and this creates the impression, more here than in the case of other sciences, that the Thing itself, in all its essential

# Vectorizing the text

In [57]:
#This only needs to be run once to obtain the vectors. 

#vectors2 = []

#for _ in range(0, len(splitdata2)):

    #try:
        #vectors2.append(get_embedding(client, splitdata2[_]))
    #except:
        #vectors2.append(np.zeros(1536))

In [58]:
#df2 = pd.DataFrame(vectors2)

In [59]:
#df2.to_csv("vectors2.csv")

# Importing existing vector frame

In [61]:
df2 = pd.read_csv("vectors2.csv", index_col=0)

In [62]:
df2

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,...,1526,1527,1528,1529,1530,1531,1532,1533,1534,1535
0,0.018040,-0.009153,-0.009210,-0.003637,-0.013296,0.012897,-0.020545,0.010336,-0.002603,0.015484,...,0.015522,-0.016711,0.010260,-0.004624,-0.029071,-0.027401,0.018824,-0.006341,0.017888,-0.025605
1,0.008922,-0.002289,0.008695,-0.017286,-0.024967,-0.000258,-0.045463,-0.024369,-0.015908,-0.010567,...,0.000769,-0.011437,0.018573,-0.018144,-0.031557,-0.017195,0.015232,-0.002611,0.008078,-0.033714
2,0.000723,0.018356,-0.002905,-0.015889,-0.012609,0.006137,-0.027160,-0.001853,0.004087,0.013973,...,-0.001609,-0.008837,0.028655,-0.014105,-0.036082,-0.046709,0.010857,-0.005032,0.016388,-0.043062
3,0.004204,0.004378,0.016330,-0.032240,-0.020872,0.003905,-0.034734,-0.018903,-0.020163,-0.012996,...,0.010771,-0.016658,0.016921,-0.010718,-0.037884,-0.008460,0.015188,-0.013757,0.000290,-0.048228
4,-0.002144,0.027829,0.037220,-0.007724,-0.009867,0.007248,-0.028570,-0.002568,-0.005813,-0.011375,...,0.004348,0.004894,0.031189,-0.024126,-0.017896,-0.034654,0.015422,-0.004944,-0.008710,-0.027115
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
11809,0.008419,-0.009955,0.024746,-0.028968,-0.012700,0.014118,-0.011399,-0.013314,-0.011510,-0.004843,...,0.029491,-0.025033,0.000802,-0.015621,-0.016458,0.012281,0.021648,-0.016327,-0.010347,-0.010412
11810,-0.006728,-0.001566,0.025972,-0.023427,-0.023012,-0.009872,0.003211,-0.011955,-0.008365,-0.026964,...,0.002866,0.012156,-0.001374,-0.019422,-0.013770,-0.023454,0.011727,0.013201,-0.003764,-0.021713
11811,0.022691,-0.026305,-0.015821,0.008549,-0.019767,0.006286,-0.025679,-0.018733,0.002104,-0.027990,...,0.020469,-0.018464,0.000081,-0.021950,-0.031847,0.040990,0.000468,0.001482,-0.020763,-0.016817
11812,-0.005004,-0.013928,0.015388,-0.019822,-0.016745,0.008796,-0.008307,-0.008424,0.004881,-0.018270,...,0.009624,-0.000889,0.000335,-0.021935,-0.025469,0.013693,0.002893,0.004202,-0.023030,0.005024


In [63]:
# Check the indexing still matches up

len(splitdata2)

11833

# Comparison of RAG output vs default output

In [65]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What is spirit in Hegel's hierarchy?", k=15))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

Moreover, while each of the previous shapes left its predecessors behind, spirit incorporates these shapes as aspects of itself.
. 
Info chunk 2:

I,  112 Addition associates mirrors with ‘reflection’ rather than speculation.
. 
Info chunk 3:

The sensory presence of Christ, distinct from other Selves, was as much a hindrance as a help: cf.
. 
Info chunk 4:

This eases the introduction of spirit.
. 
Info chunk 5:

--- c. The state of right  477.
. 
Info chunk 6:

The content of this representation is true, but the representational form is defective.
. 
Info chunk 7:

They are rather the stages of spirit’s development, while knowledge and objectivity are the ‘opposition’ that inheres in each stage.
. 
Info chunk 8:

It is this th

In [66]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy?"))

Answer from the basic LLM:

In Hegel's philosophy, spirit is the highest level of his hierarchy of reality. Spirit represents the highest form of self-consciousness and the realization of freedom and rationality. It is the realm where individuals are able to achieve complete self-awareness, self-determination, and moral consciousness.

According to Hegel, spirit is manifested in various forms, including art, religion, and philosophy. It is through the development of spirit that individuals are able to actualize their full potential and become truly self-fulfilled beings. Spirit also plays a crucial role in the development of society, as it is through the collective spirit of individuals that cultural and societal progress is made.

Overall, spirit in Hegel's hierarchy represents the pinnacle of human development and self-realization, where individuals are able to transcend their limitations and achieve a higher state of consciousness and freedom.


In [67]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What is the the abstract universality of Being according to Hegel?", k=15))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

However, ‘I’ is different from ‘this’, ‘here’, and ‘now’.
. 
Info chunk 2:

His arguments are not exclusively linguistic: cf.
. 
Info chunk 3:

But to what extent --- does Hegel regard the logical order as mirrored in the historical development of humanity?
. 
Info chunk 4:

This explains the aversion to prefaces expressed in   1f.
. 
Info chunk 5:

It plays with its mask to show that it is a Self, just like the actor and the spectator.
. 
Info chunk 6:

The absolute is more complex, but the principle is the same.
. 
Info chunk 7:

In this universality the reality of the ethical spirit is lost and, empty of content, the spirits of the national individuals are gathered into a single pantheon, not a p

In [68]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the the abstract universality of Being according to Hegel?"))

Answer from the basic LLM:

The abstract universality of Being, according to Hegel, refers to the fundamental concept of existence that underlies all phenomena and characteristics of the world. It is the idea that everything in existence is connected and part of a larger whole, and that this interconnectedness provides the basis for understanding reality and existence. Hegel argued that this abstract universality of Being is essential for understanding the nature of reality and for developing a comprehensive philosophical system that can explain the world in its entirety.


In [69]:
print(rag(client, textdata=splitdata2, vecdf=df2, query="What did Hegel say about blossoms and buds?", k=15))

What the LLM sees:

The question is: What did Hegel say about blossoms and buds?, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 1:

The reference to ‘theory’ as a grey shadow alludes to Mephistopheles’s words: ‘My worthy friend, grey are all theories/And green alone life’s golden tree’.
. 
Info chunk 2:

Spirit involves two estrangements, that of the essence into self-consciousness and that of self-consciousness into the essence.
. 
Info chunk 3:

The bud disappears when the blossom bursts forth, and one could say that the bud is refuted by the blossom; similarly, when the fruit appears, the blossom is declared to be a false Being-there2 of the plant, and the fruit replaces the blossom as the truth of the plant.
. 
Info chunk 4:

Nevertheless, it is useful to consider a concept in terms of Hegel’s botanical analogy.
. 
Info chunk 5:

--- c. The state 

In [70]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about blossoms and buds?"))

Answer from the basic LLM:

Hegel did not directly talk about blossoms and buds, as it was not a central theme in his philosophical works. However, he did use the metaphor of blossoming or unfolding to describe the development of ideas and concepts in his dialectical method. Hegel believed that ideas and concepts evolve and develop over time, much like a bud blossoming into a flower. This process of development and unfolding is central to his dialectical approach to philosophy, where contradictions and conflicts lead to the resolution of opposing ideas, resulting in a higher level of understanding.


# APPROACH 3

### This time we'll again retrieve at the sentence level, but we'll also include the n sentences on either side of each of the k selected sentences for context, as a synthesis of approach 1 and approach 2

In [188]:
def ragpad(client, textdata, vecdf, query, k, n):

    ind = find_top_k_matches(query, vecdf, textdata, k)

    chunks = []
    
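    for ki in ind:
        # Note: range() excludes its end, so this window covers n sentences
        # before the match but only n-1 after it.
        padinds = []
        for i in range(ki-n,ki+n):
            padinds.append(i)
        # Drop indices that fall outside the list.
        padinds = [i for i in padinds if i >= 0 and i < len(textdata)]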
        chunk = ". ".join([textdata[i] for i in padinds])
        chunks.append(chunk)

    info = ". ".join([f"\nInfo chunk {_}:\n\n" + chunks[_] + "\n" for _ in range(len(chunks))])

    combquery = f"The question is: {query}, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':\n {info}"

    print("What the LLM sees:\n")
    print(combquery + "\n\n")

    print("Resulting answer:\n") 
    return answerme(client, combquery)
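Because `range()` excludes its end, `range(ki-n, ki+n)` stops one sentence short on the right. If a symmetric window is wanted, the index arithmetic could look like the sketch below. `padded_window` is a hypothetical helper, not what produced the outputs in this notebook.

```python
def padded_window(center: int, n: int, length: int) -> list[int]:
    """Indices of `center` plus up to n neighbours on each side, clipped to the list."""
    start = max(0, center - n)
    stop = min(length, center + n + 1)  # +1 because range() excludes its end
    return list(range(start, stop))


print(padded_window(10, 4, 100))  # match in the middle of the text
print(padded_window(1, 4, 100))   # match near the start, clipped on the left
print(padded_window(98, 4, 100))  # match near the end, clipped on the right
```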

In [190]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What is spirit in Hegel's hierarchy?", k=8, n=4))

What the LLM sees:

The question is: What is spirit in Hegel's hierarchy?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 0:

It is never sublated, and it has been at work all along in various guises, seeking to actualize itself, as the title of PS indicates.. The isolation and analyses of the previous shapes were conducted by spirit, focusing on particular aspects of itself in order to discover and develop its true nature.. These aspects are only abstractions from spirit, whereas spirit itself is ‘existence’, a word that has the flavour of ‘stepping-forth’, in accordance with its Latin etymology.. This may be why (as Hyppolite, II, p.16, n.6 suggests) the dialectical development of spirit corresponds to a coherent temporal and historical development, which is, in the preceding shapes, dimly discernible in the developm

In [186]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is spirit in Hegel's hierarchy? Please answer comprehensively"))

Answer from the basic LLM:

Spirit, in Hegel's philosophy, is the highest level of development in his hierarchy of absolute spirit. It represents the culmination of the development of human consciousness and self-awareness. In Hegel's view, spirit is the ultimate reality, the absolute substance that underlies and gives meaning to all of existence.

Spirit is characterized by self-consciousness, rationality, and freedom. It is the realm of human culture, art, religion, and morality. According to Hegel, spirit represents the highest stage of human development, where individuals are able to recognize and overcome their own limitations and achieve true self-realization.

In Hegel's philosophy, spirit is understood as a dynamic process of self-development and self-actualization. It is not a fixed or static entity, but rather a constantly evolving and fluid concept. Spirit is both individual and universal, encompassing both the particularity of the individual consciousness and the universali

In [192]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What is the the abstract universality of Being according to Hegel?", k=8, n=4))

What the LLM sees:

The question is: What is the the abstract universality of Being according to Hegel?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 0:

The knowing I is now the essential, active party.. This reversal acknowledges all the work done by the I in its attempt to pick out the object, and provides another way of picking it out, viz.. by reference to the I itself.. Then Hegel notes that ‘I’ is itself an indexical, applicable to everyone, and so cannot be used to refer to anyone in particular.. However, ‘I’ is different from ‘this’, ‘here’, and ‘now’.. Once we know who is using the word ‘this’, we still may not know what is referred to.. But once we know who is using the word ‘I’, we know who is referred to.. On the peculiarity of ‘I’, see further Chisholm (1981).
. 
Info chunk 1:

So sensory certainty cann
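
The printed prompt above has a simple shape: the question, an instruction to rely on the retrieved text, an "I don't know" fallback, and then the numbered chunks. Below is a minimal sketch of how a prompt of that shape could be assembled. The function name `build_rag_prompt` and the placeholder passages are hypothetical; this is not necessarily how `ragpad` builds its prompt, only an illustration that reproduces the layout seen in the output.

```python
def build_rag_prompt(query, chunks):
    """Assemble a RAG prompt from a question and a list of retrieved chunks.

    Hypothetical helper: it mirrors the layout printed under
    "What the LLM sees", not the notebook's actual implementation.
    """
    header = (
        f"The question is: {query}, please answer it comprehensively, "
        "please make use of the following retrieved text written by Hegel "
        "when answering the question. NOTE: if you cannot find suitable "
        "information below, return 'I don't know':\n"
    )
    # Number each chunk so the individual passages stay distinguishable.
    body = "".join(
        f"\nInfo chunk {i}:\n\n{chunk}\n" for i, chunk in enumerate(chunks)
    )
    return header + body


# Placeholder passages stand in for real retrieved text.
prompt = build_rag_prompt(
    "What did Hegel say about blossoms and buds?",
    ["first retrieved passage", "second retrieved passage"],
)
print(prompt)
```

Keeping the fallback instruction in the header is what lets the model decline when the retrieved chunks do not cover the question.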

In [196]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What is the abstract universality of Being according to Hegel? Please answer comprehensively"))

Answer from the basic LLM:

In Hegel's philosophy, the abstract universality of Being is a concept that deals with the fundamental nature of existence itself. Hegel argues that all things that exist share a common essence, which he refers to as Being. This essence is universal and abstract, meaning that it is not tied to any particular thing or individual.

According to Hegel, the abstract universality of Being is the starting point for understanding the nature of reality. It is the most basic concept that underlies all other concepts and categories in his system of thought. Hegel argues that everything that exists can be understood in terms of Being, and that it is the foundation for all other aspects of reality.

Furthermore, Hegel suggests that the abstract universality of Being is dynamic and evolving. It is not a static concept, but rather a process that is constantly changing and developing. This concept of becoming is central to Hegel's philosophy, as he believes that reality is

In [198]:
print(ragpad(client, textdata=splitdata2, vecdf=df2, query="What did Hegel say about blossoms and buds?", k=8, n=4))

What the LLM sees:

The question is: What did Hegel say about blossoms and buds?, please answer it comprehensively, please make use of the following retrieved text written by Hegel when answering the question. NOTE: if you cannot find suitable information below, return 'I don't know':
 
Info chunk 0:

), where an attempt is made to close the gap between self-consciousness and external actuality, and especially to its first subsection, ‘The spiritual animal kingdom’ (  397ff.. ), where the individual sets out to express himself in an unresisting actuality.. a. Pleasure and necessity  360.. A free quotation from Goethe’s Faust, Part I of 1790, which inspires this whole section.. The reference to ‘theory’ as a grey shadow alludes to Mephistopheles’s words: ‘My worthy friend, grey are all theories/And green alone life’s golden tree’.. Self-consciousness has, like Faust himself, abandoned observing reason’s theoretical approach to Being and now exploits it for its own pleasure.. As  364 rev
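
Chunk 0 above is a commentary note about Faust, not the bud-and-blossom passage, which is a reminder that similarity search returns the nearest chunks whether or not they are on topic. Below is a minimal, dependency-free sketch of ranking chunks by cosine similarity. The notebook imports `cosine_similarity` from scikit-learn, but the body of `ragpad` is not shown here, so the names `cosine` and `top_k` and the toy two-dimensional vectors are assumptions for illustration, not the notebook's code or real embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def top_k(query_vec, chunk_vecs, k):
    """Return the indices of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    # Highest similarity first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy vectors standing in for chunk embeddings (assumed values).
toy_chunks = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0], [-1.0, 0.0]]
toy_query = [0.9, 0.1]

best = top_k(toy_query, toy_chunks, k=2)
print(best)
```

The ranking always returns k indices, even when none of the chunks is a good match, so an irrelevant passage can still end up in the prompt.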

In [200]:
print("Answer from the basic LLM:\n")

print(answerme(client, "What did Hegel say about blossoms and buds? Please answer comprehensively"))

Answer from the basic LLM:

Hegel, in his famous work "Phenomenology of Spirit", discusses the concept of blossoms and buds in the context of his dialectical philosophy. He uses the metaphor of blossoms and buds to illustrate the process of development and transformation in the natural world, as well as in human consciousness.

According to Hegel, the relationship between the bud and the blossom is one of potentiality and actuality. The bud represents the potential for growth and transformation, while the blossom represents the actualization of that potential. Hegel argues that the blossom is the culmination of the bud's development, the fulfillment of its inherent potentiality.

Moreover, Hegel suggests that the blossom does not simply replace the bud, but rather transcends and includes it. In the process of blossoming, the bud is transformed and elevated to a higher level of existence. This idea reflects Hegel's dialectical method, which emphasizes the interconnectedness and movement