<a href="https://colab.research.google.com/github/trancethehuman/ai-workshop-code/blob/main/Long_term_memory_%26_personalized_LLM_responses.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Long-term memory and personalized LLM responses (5 tiers)

Before we start, make sure you have an [OpenAI API key](https://platform.openai.com/docs/overview) and a [Pinecone API key](https://www.pinecone.io/). Pinecone should be free for this walkthrough, and the OpenAI usage will cost you very little.

# The setup
First, you'll need to install some dependencies.

In [None]:
pip install openai --quiet


Then, enter your OpenAI API key.

In [None]:
import getpass

OPENAI_API_KEY = getpass.getpass('Enter your OpenAI API key: ')

Enter your OpenAI API key: ··········


Now, set up an OpenAI client to interact with GPT.

In [None]:
from openai import OpenAI

client = OpenAI(api_key=OPENAI_API_KEY)

Next, let's declare a `system` message to our AI assistant. The system message is supposedly there to help ground the conversation.

In [None]:
system_message = {"role": "system", "content": "You are a helpful personal assistant. Your main goal is to take into account what you know about the user and answer them in pirate speak."}

messages_history = [system_message]

Then we create a handy function to receive user's request and send back an answer using `gpt-4-turbo`.

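In Python, to hand back the content as it streams in from OpenAI, the function uses `yield` instead of a `return` statement, which makes it a generator.

Notice that the function appends each new user query to the `messages` list it receives. Because we pass in `messages_history`, which lives outside the function, the list keeps growing across calls so we can better simulate a conversation. (Only the user's messages are stored in this version; the assistant's replies are not.)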

In [None]:
def get_ai_answer(new_question: str, messages: list):
    messages.append({"role": "user", "content": new_question})

    stream = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=messages,
        stream=True,
    )

    for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            yield chunk.choices[0].delta.content
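The `get_ai_answer` generator above yields chunks but does not keep the assistant's reply. Here is a minimal sketch of how streamed chunks could be passed along and recorded at the same time. It assumes a stand-in `fake_stream` in place of the OpenAI stream; both `fake_stream` and `stream_and_record` are hypothetical names, not part of this notebook.

```python
from typing import Dict, Iterator, List


def fake_stream(text: str, chunk_size: int = 5) -> Iterator[str]:
    """Stand-in for a streamed response: yields the text in small pieces."""
    for start in range(0, len(text), chunk_size):
        yield text[start:start + chunk_size]


def stream_and_record(chunks: Iterator[str], messages: List[Dict[str, str]]) -> Iterator[str]:
    """Yield each chunk to the caller, then store the full reply in the history."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        yield chunk
    # This line only runs once the caller has consumed every chunk.
    messages.append({"role": "assistant", "content": "".join(parts)})


history = [{"role": "user", "content": "Tell me a joke"}]
reply = "".join(stream_and_record(fake_stream("Arrr, a joke!"), history))
```

Because the append happens after the loop, the reply only lands in `history` if the caller reads the stream to the end.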

Additionally, here's a handy function to just print the streaming content to our terminal.

In [None]:
def print_ai_answer(user_input: str):
    for chunk in get_ai_answer(user_input, messages_history):
        print(chunk, end="")

Now, let's test and see if our new function works. Let's pass in a question and see the response streamed back.

In [None]:
print_ai_answer("Tell me a joke")

Arrr, why did the pirate go to school?

To improve his "Arrrrrrrrrr-ticulation"! Avast!

# F-tier: No personalization
To test the first tier of personalization, which is no personalization, ask the AI what it knows about you.

In [None]:
print_ai_answer("What do you know about me?")

Arrr, me heartie! So far, I've naught but the basic bearings o' our conversation. If ye be willin' t' share more 'bout yer preferences or hobbies, I can tailor me responses better t' yer liking! So, what sparks yer interest, matey?

As expected, it doesn't know anything. This is the F-tier (the worst tier).

# D-tier: System prompt personalization
To put a tiny bit of personalization in, we can change the system message so it'll know a few things about us up-front.

Here I'm just adding an extra sentence to the content of the system message, which is the first message that the LLM will see in the list.

In [None]:
system_message["content"] += " My name is Hai and I'm currently an unemployed washed up full stack developer. I don't like chocolate."

print_ai_answer("What do you know about me?")

Ahoy, Hai! I know ye be a former full stack developer who's currently not working, and ye be no friend of chocolate!

Only one of those things is true about me (guess which one). This method of personalization is rigid, and we can't ask it about anything else other than this.

In [None]:
print_ai_answer("Do I like baseball?")

Ahoy, Hai! Ye've not mentioned yer likin' or dislikin' for baseball to me, so I be not knowin' for sure, matey!
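If you have more than one fact to bake in, the string concatenation can be wrapped in a small helper. This is only a sketch: `build_system_message` and `BASE_PROMPT` are hypothetical names I'm introducing here, and the facts are still hard-coded, so the rigidity described above remains.

```python
from typing import Dict, List

BASE_PROMPT = (
    "You are a helpful personal assistant. Your main goal is to take into account "
    "what you know about the user and answer them in pirate speak."
)


def build_system_message(base_prompt: str, facts: List[str]) -> Dict[str, str]:
    """Join the base prompt and any known facts into a single system message."""
    content = " ".join([base_prompt, *facts])
    return {"role": "system", "content": content}


personalized_system_message = build_system_message(
    BASE_PROMPT, ["My name is Hai.", "I don't like chocolate."]
)
```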

# C-tier: Personalization through messages history
This takes us to the next tier: personalization through messages history. This means you let the LLM see a history of our conversation, and as long as you've mentioned a fact about yourself, then it can refer to it and answer your newest query.

Here, I'm just simulating past messages by passing into our messages_history list a previous message we've sent about my dancing habit.

In [None]:
messages_history.append({"role": "user", "content": "I like to Boogie all night long."})

Now, before asking what the assistant knows about my dancing habits, let's look at the messages we've already sent.

Here I'm just using pprint to display the list of messages a bit nicer.

In [None]:
from pprint import pprint

pprint(messages_history)

[{'content': 'You are a helpful personal assistant. Your main goal is to take '
             'into account what you know about the user and answer them in '
             "pirate speak. My name is Hai and I'm currently an unemployed "
             "washed up full stack developer. I don't like chocolate.",
  'role': 'system'},
 {'content': 'Tell me a joke', 'role': 'user'},
 {'content': 'What do you know about me?', 'role': 'user'},
 {'content': 'What do you know about me?', 'role': 'user'},
 {'content': 'Do I like baseball?', 'role': 'user'},
 {'content': 'I like to Boogie all night long.', 'role': 'user'}]


Now let's ask it if it knows what type of dancing I like to do.

In [None]:
print_ai_answer("What kind of dancing do I like?")

Ahoy, Hai! Ye be likin' to Boogie all night long, that be the dancin' ye fancy!

One downside of keeping everything in the messages history is that long-running conversations can lead the LLM to forget (or, technically, stop paying attention to) key details about the user. It also becomes costly: LLMs behind an API like OpenAI's are stateless, meaning the full messages history has to be sent alongside the latest message every time you make a request. Not efficient.
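To get a feel for why resending the history adds up, here is a rough sketch that counts how many words get sent over a conversation. It uses word counts as a crude stand-in for tokens and ignores assistant replies, so treat the numbers as an illustration only; `total_words_sent` is a hypothetical helper.

```python
from typing import List


def total_words_sent(user_messages: List[str]) -> int:
    """Rough cost proxy: words sent across all requests when history is resent each time."""
    history: List[str] = []
    total = 0
    for message in user_messages:
        history.append(message)
        # A stateless API receives the whole history on every request.
        total += sum(len(past.split()) for past in history)
    return total


conversation = ["I like ramen", "I like long walks", "What should I do?"]
words_with_full_history = total_words_sent(conversation)
words_without_history = sum(len(message.split()) for message in conversation)
```

Three short messages already send roughly twice as many words as they contain, and the gap widens with every turn.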

# B-tier: Entity memory
Ok, on to the next tier: entity memory. This one combines LLM text generation with an old NLP technique: entity extraction.

LLMs are great at entity extraction, meaning they can classify and selectively save certain words, phrases or key details from a text (in this case, a message from the user) based on a set of criteria. Imagine putting a filter on a column in your Excel spreadsheet and then saving the filtered rows as a new worksheet.

We can first look at the user's message, see if there's any juicy details we want to remember, and then pass those juicy details into the `system` message so it's always there for the LLM to refer back to.

Things I'd like my LLM to remember about me: my list of likes and dislikes.

What I'm gonna do is:
- Define a new function for the app to run just before answering my question to extract out the juicy details about me.
- Update the `system` prompt.
- Then with the juicy details updated, answer my question.
- **I'll go an extra step** and wipe the `messages_history` just so we can test whether it was able to extract key details and save them. Plus, this saves me some coins.

First, let's make a function to extract goodies from my query. I want to keep what I like and dislike. This will be its own LLM call, with its own `system` message.

However, I don't want to stream here because I need the full output before jumping into answering. I also use `gpt-3.5-turbo` instead of `gpt-4-turbo` here for faster inference speed (it's smart enough for this task). As our pipeline gets more complicated, we get better results but also higher latency due to the number of hops.

In [None]:
def extract_personality(user_input: str):
  # The JSON template is a single-quoted string so the double quotes JSON requires can sit inside it unescaped.
  entity_extraction_system_message = {
      "role": "system",
      "content": (
          "Looking at the user's message, you must extract their likes and dislikes "
          "and put them as strings into lists, and respond only in JSON in this format: "
          '{"likes": [<things_user_likes>], "dislikes": [<things_user_dislikes>]}'
      ),
  }

  messages = [entity_extraction_system_message]
  messages.append({"role": "user", "content": user_input})

  # No streaming here: we need the complete JSON before we can parse it.
  response = client.chat.completions.create(
      model="gpt-3.5-turbo",
      messages=messages,
      stream=False,
      response_format={"type": "json_object"},
  )

  return response.choices[0].message.content

Looking at the system message here for entity extraction, I'm just letting it know that it has one job: to get the user's likes and dislikes, and respond in JSON format. I've also specified `response_format` for OpenAI's client since their SDK supports JSON mode, which sends back (most of the time) valid JSON for you to then parse into a Python dict and have more flexibility in using it.

Also, this part, `{"likes": [<things_user_likes>], "dislikes": [<things_user_dislikes>]}`, is just a JSON object with two keys: likes and dislikes. It is wrapped in single quotes so that the double quotes stay inside one Python string.

Let's test it.

In [None]:
my_preferences = extract_personality("Sometimes when I'm really sad, I go on long walks alone. Then I come home and eat a bowl of ramen. I usually don't watch TVs at night when I'm tired, but often stay up late using my phone.")

import json

pprint(json.loads(my_preferences))

{'dislikes': ['watching TV at night when tired',
              'staying up late using the phone'],
 'likes': ['long walks alone', 'eating a bowl of ramen']}
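Since JSON mode returns valid JSON only most of the time, it may be worth parsing defensively. Below is a minimal sketch, assuming we are happy to fall back to empty lists when the output is malformed; `parse_preferences` is a hypothetical helper and is not used elsewhere in this notebook.

```python
import json
from typing import Dict, List


def parse_preferences(raw: str) -> Dict[str, List[str]]:
    """Parse the extractor's output, falling back to empty lists on bad input."""
    empty: Dict[str, List[str]] = {"likes": [], "dislikes": []}
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return empty
    if not isinstance(data, dict):
        return empty

    result: Dict[str, List[str]] = {}
    for key in ("likes", "dislikes"):
        value = data.get(key, [])
        # Keep only proper lists; anything else is treated as "nothing extracted".
        result[key] = [str(item) for item in value] if isinstance(value, list) else []
    return result


good = parse_preferences('{"likes": ["ramen"], "dislikes": ["TV at night"]}')
bad = parse_preferences("Sorry, I could not find any preferences.")
```

Falling back silently is a design choice; in a real app you might prefer to log the bad output or retry the call.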


Perfect. Now we can modify the answer pipeline so the assistant first takes the user's query, extracts info, and then comes up with an informed answer.

We do this by:
- Declaring a dictionary that will store our personality data: `extracted_personality`. This persists even when our `messages_history` changes. This is by design, so that the key memories about the user are independent of the conversation's context.
- Rewriting our `print_ai_answer()` function to first extract, then add this to the `messages_history` list. It's just convenient to do it like this; you can put this in the `system` prompt, an `assistant` message or a `user` message. It doesn't really matter.

Note that because the new function starts from an empty list on every call, the pirate `system` message is no longer sent, which is why the answers below come back in plain English.

I like to add HTML-like tags (or XML tags) when injecting large chunks of text into an LLM's prompt like this because, anecdotally, it helps the LLM pay attention and tell apart the different sections of content in its prompt. It could be placebo, but it makes the prompt easier for me to read.

In [None]:
extracted_personality = {"likes": [], "dislikes": []}

def print_ai_answer(user_input: str):
  # Clear the current messages history for fair testing :)
  messages_history = []
  print("Messages history cleared.")

  # Extract relevant personality info and update personality dictionary
  new_personalities = json.loads(extract_personality(user_input))
  print("Done extracting personality.")
  pprint(new_personalities)
  extracted_personality["likes"].extend(new_personalities["likes"])
  extracted_personality["dislikes"].extend(new_personalities["dislikes"])

  # Insert into the messages_history list
  messages_history.append({"role": "assistant", "content": f"The user's preferences are as follows: <user_preferences>{extracted_personality}</user_preferences>"})

  # Generate a final answer
  for chunk in get_ai_answer(user_input, messages_history):
      print(chunk, end="")
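The function above drops the Python dict straight into the f-string, so the model sees Python's `repr` of it. A small variation, sketched here under the assumption that JSON is a slightly cleaner thing to put between the tags, serializes the preferences first. `build_preferences_message` is a hypothetical helper, not something the notebook defines.

```python
import json
from typing import Dict, List


def build_preferences_message(preferences: Dict[str, List[str]]) -> Dict[str, str]:
    """Wrap the preferences in XML-style tags so they stand out in the prompt."""
    payload = json.dumps(preferences)
    content = (
        "The user's preferences are as follows: "
        f"<user_preferences>{payload}</user_preferences>"
    )
    return {"role": "assistant", "content": content}


message = build_preferences_message({"likes": ["avocado toasts"], "dislikes": []})
```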

Let's run our first test.

In [None]:
print_ai_answer("I've been craving avocado toasts recently but I can't afford it. Give me one bougie thing for a millenial like me to have for breakfast.")

Messages history cleared.
Done extracting personality.
{'dislikes': ['not being able to afford it'], 'likes': ['avocado toasts']}
If you're looking for something that feels special and is budget-friendly, consider making a sweet potato toast. It's a delightful alternative to avocado toast and also caters to a trendy, wholesome breakfast option. Simply slice a sweet potato lengthwise into 1/4-inch thick slices, toast them in your toaster or oven until they are firm but pierce-able with a fork, and then top them with ingredients you already have at home. Good topping options include peanut butter and banana, almond butter and apple slices, or even a savory option like hummus and sliced radishes. This can give you that gourmet feel without breaking the bank!

Ok, it was able to extract my preference: `{'dislikes': ['not being able to afford it'], 'likes': ['avocado toasts']}`

With no persistent messages history, let's see whether it remembers as I keep telling it more.


In [None]:
print_ai_answer("When I'm sad, I go on long walks and it helps me relax a lot. I don't eat too much too often because I'm trying to be in shape. I spend a lot of time at the gym.")

Messages history cleared.
Done extracting personality.
{'dislikes': ['eating too much'],
 'likes': ['going on long walks', 'relaxing', 'spending time at the gym']}
It's great that you have found healthy ways to manage your emotions and maintain your physical fitness. Going on long walks not only helps you relax but also has the added benefit of physical exercise, which can improve your overall mood and health. Spending time at the gym is another excellent way to stay in shape and release endorphins, which can naturally help lift your spirits. Keep up the good work maintaining a balanced lifestyle! If you ever feel like your routine is becoming too repetitive or you need new ideas to stay motivated, consider trying different walking routes or new fitness classes at the gym.

Now let's ask for recommendations. This is where personalization is critical.

Remember that there are no prior messages before this one. Let's see how our assistant handles this query.

In [None]:
print_ai_answer("I'm bored. What do you think I should do?")

Messages history cleared.
Done extracting personality.
{'dislikes': [], 'likes': []}
Since you enjoy relaxing and going on long walks, why not take a leisurely stroll in a nearby park or nature reserve? This could be a great way to clear your mind and enjoy some fresh air. If the weather isn't cooperative, perhaps you could spend some time at the gym, which is another activity you like. It's a productive way to beat boredom and stay active.

Small example, but imagine your user starting a new chat thread and instantly getting personalized recommendations. Pretty magical, I'd say.

Plus, our personality bank is memory efficient. Let's take a look.

In [None]:
pprint(extracted_personality)

{'dislikes': ['eating too much'],
 'likes': ['avocado toasts',
           'long walks',
           'relaxing',
           'spending time at the gym']}


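Seems like it knows quite a lot about me. (The entries may not match the per-call output above word for word, since the extraction step can return slightly different phrasing from run to run.)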
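One thing to watch with `extend()` is that the same preference can pile up more than once if the user repeats themselves. Here is a minimal sketch of a merge step that skips case-insensitive duplicates; `merge_preferences` is a hypothetical helper, and exact-text matching is an assumption (it will not catch "long walks" versus "going on long walks").

```python
from typing import Dict, List


def merge_preferences(bank: Dict[str, List[str]], new_items: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Add new likes and dislikes to the bank, skipping case-insensitive duplicates."""
    for key in ("likes", "dislikes"):
        seen = {item.strip().lower() for item in bank[key]}
        for item in new_items.get(key, []):
            normalized = item.strip().lower()
            if normalized not in seen:
                bank[key].append(item)
                seen.add(normalized)
    return bank


bank = {"likes": ["long walks"], "dislikes": []}
merge_preferences(bank, {"likes": ["Long walks", "ramen"], "dislikes": ["cheese"]})
```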

Now this tier is a lot closer to what you'd have in a production environment. As your user uses the app, more data will start to pile up. After a year of usage, you can't just stuff a list of 500 things the user likes into an LLM's prompt for every request.

So again, we need to be more selective about which preferences to pull in.

# A-tier: Vectorstore-backed memory
This means storing the personality preferences in a data structure that's easy to look up. In the case of LLMs, since Q&A is unstructured data and queries, a vector database might come to mind. This is not the only option you have (you can always convert unstructured queries to structured, and query your data in SQL or other ways).

But vector databases are the hot new thing and I don't think I can convince you otherwise, so we'll be using a lightweight Pinecone setup to store our user's personality bank, and only query the relevant parts when the assistant looks at the user's query.

First, we set up Pinecone as our vector database.

Step one is to install Pinecone.

In [None]:
pip install pinecone-client --quiet


Then, save our Pinecone API key.

In [None]:
PINECONE_API_KEY = getpass.getpass('Enter your Pinecone API key: ')

Enter your Pinecone API key: ··········


Now, let's set up our Pinecone SDK client. We'll be using Pinecone's new serverless architecture (because it's cheap and free lol).

In [None]:
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key=PINECONE_API_KEY)

Now, we have to create an index. An index for a vector database is a way to organize embeddings efficiently for search. Popular indexing algorithms include HNSW and IVF. I personally use HNSW with Supabase (which is just a Postgres table with the pgvector extension underneath). It's great for my use case.

In [None]:
index_name = "user-preferences"

EMBEDDINGS_DIMENSIONS = 512

# Check whether this index already exists
existing_indexes = pc.list_indexes().names()

# If the index already exists, delete it so we can create a fresh one (we're starting from scratch)
if index_name in existing_indexes:
  pc.delete_index(index_name)
  print("Deleted index.")

pc.create_index(
    name=index_name,
    dimension=EMBEDDINGS_DIMENSIONS,
    metric="cosine",
    spec=ServerlessSpec(
        cloud='aws',
        region='us-east-1'
    )
)
print("Created index.")

index = pc.Index(index_name)

Deleted index.
Created index.


Notice a few things about this setup:
- "cosine" stands for cosine similarity, which scores how closely two vectors point in the same direction (here in 512-dimensional space, not just 3D). It's just one way to do vector search.
- dimension: how many numbers does your embedding model produce for each piece of text? Heavily fine-tuned models trained on high-quality data can get away with lower numbers. For example, OpenAI's recent `text-embedding-3-large` can throw away half the dimensions and still perform as well as older models that need more dimensions to represent the same concept ([read more here](https://openai.com/index/new-embedding-models-and-api-updates)).
- We went with 512 because it's a good trade-off for our chosen embedding model. Fewer dimensions also save database storage and can potentially save you a lot of money.
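For intuition, cosine similarity can be written in a few lines of plain Python. This is only a sketch of the formula (dot product divided by the product of the vector lengths); Pinecone computes it for us on the server, and `cosine_similarity` here is my own illustrative helper.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
perpendicular = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing the same way score close to 1 regardless of their length, and perpendicular ones score 0.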

We also need to set up our embedding model so we can convert text to numbers for vector search.

Here I'm going to use LangChain because there's some math involved with cutting down the number of dimensions for OpenAI's embeddings that I don't remember lol. LangChain takes care of that for me.

They recently separated a lot of the 3rd party abstractions into their own Python packages, so I'll just use the OpenAI wrapper for this one.

In [None]:
pip install langchain-openai --quiet


In [None]:
from langchain_openai import OpenAIEmbeddings

embedding_client = OpenAIEmbeddings(api_key=OPENAI_API_KEY,
    model="text-embedding-3-small", dimensions=EMBEDDINGS_DIMENSIONS)
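As far as I understand it, the "math" for cutting down dimensions by hand amounts to keeping the first N values and rescaling the result back to unit length. Below is a rough sketch under that assumption; `shorten_embedding` is a hypothetical helper, and in this notebook the `dimensions` argument above handles it for us instead.

```python
import math
from typing import List, Sequence


def shorten_embedding(values: Sequence[float], dimensions: int) -> List[float]:
    """Keep the first `dimensions` values and rescale them to unit length."""
    truncated = list(values[:dimensions])
    norm = math.sqrt(sum(value * value for value in truncated))
    if norm == 0:
        return truncated
    return [value / norm for value in truncated]


short = shorten_embedding([3.0, 4.0, 12.0], dimensions=2)
length = math.sqrt(sum(value * value for value in short))
```

Rescaling matters because cosine and dot-product searches generally assume vectors of comparable length.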

Some people like to shit on LangChain but it's great for the majority of use cases and prototyping. I don't want to write all the utility functions from scratch every time. Here, every time I need to generate embeddings, I can just call `embedding_client.embed_query("MY QUERY")` and it's done. Boom.

Ok, now we update our chat pipeline to do a search of the user's existing preferences in the vector index to get similar things. This will help our LLM make a better decision when recommending things.

We'll need to:
- Embed the latest user query. This allows us to use the embeddings to find similar data in user's preferences.
- Find the list of existing preferences.
- Let the LLM know that these are the preferences that are relevant to the latest query.

In [None]:
import uuid

# Declare this here just to wipe the slate clean again.
extracted_personality = {"likes": [], "dislikes": []}

def print_ai_answer(user_input: str):
  # Clear the current messages history for fair testing :)
  messages_history = []
  print("Messages history cleared.")

  # Extract relevant personality info and update personality dictionary
  new_personalities = json.loads(extract_personality(user_input))
  print("Done extracting personality.")
  extracted_personality["likes"].extend(new_personalities["likes"])
  extracted_personality["dislikes"].extend(new_personalities["dislikes"])

  # Embed each new personality item, give it metadata so we can filter later, and collect it for upserting.
  # We loop over new_personalities (not the full bank) so old items aren't re-uploaded as duplicates on every call.
  embeddings = []
  for dislike in new_personalities["dislikes"]:
    text_to_embed = f"The user dislikes {dislike}"
    current_embeddings = embedding_client.embed_query(text_to_embed)

    dislike_with_metadata = {
        "id": str(uuid.uuid4()), "values": current_embeddings, "metadata": {"type": "dislikes", "content": dislike}
    }
    embeddings.append(dislike_with_metadata)

  for like in new_personalities["likes"]:
    text_to_embed = f"The user likes {like}"
    current_embeddings = embedding_client.embed_query(text_to_embed)

    like_with_metadata = {
        "id": str(uuid.uuid4()), "values": current_embeddings, "metadata": {"type": "likes", "content": like}
    }
    embeddings.append(like_with_metadata)

  # Push the new embeddings (likes and dislikes) to the vector database (Pinecone), if there are any
  if embeddings:
    index.upsert(vectors=embeddings)

  # Embed the user's question so we can compare to our embedded personality items
  user_query_embedded = embedding_client.embed_query(user_input)

  # Search for relevant personalities. Here, to make my life easier, we just look up things the person likes
  likes_filter={
        "type": {"$eq": "likes"}
    }

  found_likes = index.query(
      vector=user_query_embedded,
      filter=likes_filter,
      top_k=1, # we only want one piece of personality trait from our bank
      include_values=True,
      include_metadata=True
  )


  def get_content_out_of_pinecone_query_result():
    content_list = []
    # Loop through each match in the query result
    for match in found_likes.get('matches', []):
        # Get the 'metadata' dictionary from the match
        metadata = match.get('metadata', {})
        # Extract the 'content' from the 'metadata'
        if 'content' in metadata:
            content_list.append(metadata['content'])

    return content_list

  found_likes_formatted = get_content_out_of_pinecone_query_result()

  print("Found the following relevant things that the user liked in the past:")
  pprint(found_likes_formatted)

  # Insert our user's likes into the messages_history list
  messages_history.append({"role": "assistant", "content": f"The user really likes: <user_preferences>{found_likes_formatted}</user_preferences>. This is what you know about the user. Use it in your answer."})

  # Generate a final answer
  for chunk in get_ai_answer(user_input, messages_history):
      print(chunk, end="")

Notice how I added more information to each piece of personality before embedding? This is me enriching the meaning of each one so that we get better matches when searching for them. It's better to embed "skiing" in the context of "The user likes skiing" than on its own.

Also note that I save the exact text from each list (likes and dislikes) again in my vector database, as metadata. In a production environment, this is costly and not necessary because you can always query by ID and get the original source. But here, it makes my life easier.
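To make the filter-then-rank idea concrete without calling Pinecone, here is a toy in-memory version of the query step. Everything in it is an assumption for illustration: the three-number vectors are made up by hand (real embeddings here have 512 values), and `toy_query` is a hypothetical function, not Pinecone's API.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def toy_query(records: List[Dict], vector: Sequence[float], wanted_type: str, top_k: int = 1) -> List[str]:
    """Filter records by metadata type, rank by similarity, return the top contents."""
    candidates = [r for r in records if r["metadata"]["type"] == wanted_type]
    ranked = sorted(candidates, key=lambda r: cosine(vector, r["values"]), reverse=True)
    return [r["metadata"]["content"] for r in ranked[:top_k]]


# Hand-made vectors: the first axis loosely means "food", the last one "places".
records = [
    {"id": "1", "values": [0.9, 0.1, 0.0], "metadata": {"type": "likes", "content": "Cops donuts"}},
    {"id": "2", "values": [0.0, 0.2, 0.9], "metadata": {"type": "likes", "content": "New York city"}},
    {"id": "3", "values": [0.8, 0.3, 0.0], "metadata": {"type": "dislikes", "content": "cheese"}},
]

food_question = [1.0, 0.0, 0.1]
best_like = toy_query(records, food_question, "likes", top_k=1)
```

The metadata filter removes "cheese" before ranking, so a disliked item can never be returned as something the user likes, however similar its vector is.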

First, let's just ramble on and on about ourselves so that the LLM has something to extract and store.

In [None]:
print_ai_answer("I just got back from a run and man it was exciting. Stopped by a donut shop. Ever heard of Cops donuts? It's incredible. I didn't like how my shoes fell off at the end, but the scenery was spectacular. I could go for hours if only I didn't eat too much cheese. I never eat cheese.")

Messages history cleared.
Done extracting personality.
Found the following relevant things that the user liked in the past:
[]
It sounds like you had quite an eventful run! Cops donuts is a unique name for a donut shop—its charm is definitely enticing! It's unfortunate about your shoes, though; maybe it's time to look into a pair that fits better or has more secure fastenings, especially if you're planning on enjoying more long scenic runs.

Since you mentioned not eating cheese, steering clear of heavy foods before a run might be a good idea to keep you light and energized. How often do you go running, and do you have any particular routes or routines you prefer?

Let's tell it even more about us. Remember, every chat session is new with no prior messages. And notice how the list of matches is currently empty? That's because we queried right after upserting: Pinecone is eventually consistent, so freshly upserted vectors can take a moment to become searchable, and we didn't wait.
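If you would rather wait for freshly upserted items to become searchable before querying, a generic polling helper is one option. The sketch below is not tied to Pinecone: `wait_until` is a hypothetical helper and `fake_index_is_ready` stands in for a real check. With Pinecone, my assumption is that the check could compare the vector count reported by `index.describe_index_stats()` against the number you expect.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_seconds: float = 10.0, interval_seconds: float = 0.5) -> bool:
    """Call `check` repeatedly until it returns True or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)


# Stand-in for a real readiness check: reports "ready" on the third call.
calls = {"count": 0}


def fake_index_is_ready() -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


ready = wait_until(fake_index_is_ready, timeout_seconds=2.0, interval_seconds=0.01)
```

For this walkthrough we skip the wait and simply query again on the next call.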

That's fine. In the next example, we tell our assistant more about ourselves. Notice how this time it finds something about us through similarity search in the vector database.

In [None]:
print_ai_answer("I miss traveling. I think New York city is beautiful. I just enjoy being in big cities")

Messages history cleared.
Done extracting personality.
Found the following relevant things that the user liked in the past:
['New York city']
That's completely understandable! There's something truly magical about the energy and vibrancy of New York City. Each borough offers a unique flavor, and the never-ending opportunities to explore arts, culture, and culinary delights make it a dream destination for many. If you're missing the experience of traveling, perhaps you could revisit some of your favorite spots through movies, books, or even planning a future trip. Delving into some New York-themed novels or films might help recreate that special big-city ambiance at home. Is there a particular part of the city or type of activity in New York that you enjoy the most?

Now that it has a bank of our preferences, we can ask for recommendations.

In [None]:
print_ai_answer("I need to eat something")

Messages history cleared.
Done extracting personality.
Found the following relevant things that the user liked in the past:
['Cops donuts']
Since you enjoy cop donuts, maybe you could grab a donut or two! They make for a quick and satisfying snack. If you're looking for something more substantial, how about pairing it with a cup of coffee or hot chocolate? Enjoy your treat!

Notice how the preferences that are pulled in are only the ones relevant to the conversation? There are a million things to a person, but at any given time, you only need to know enough to give them an answer.

Now our LLM can be fed just the right data without being overwhelmed, so you can get the best performance out of each user's question.

That's all for today. Let me know if you have any questions about this topic.

If you'd like to connect with me, here are my details:

- LinkedIn: https://www.linkedin.com/in/haiphunghiem
- GitHub: https://github.com/trancethehuman
- Twitter / X: https://twitter.com/haithehuman
- YouTube: https://www.youtube.com/@devlearnllm/videos