Topic modeling with Ollama (generating human-readable topic labels)

We first test with an LDA topic model,  
then with BERTopic and CTM.

Tutorial: https://python.langchain.com/docs/integrations/llms/ollama

In [1]:
from pathlib import Path

In [2]:
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain_community.llms import Ollama

Initialize the Ollama object

LangChain API: https://api.python.langchain.com/en/latest/llms/langchain_community.llms.ollama.Ollama.html#  
Ollama API: https://github.com/jmorganca/ollama/blob/main/docs/api.md

The two APIs are essentially identical: Ollama's parameters are passed as keyword arguments when constructing the Ollama() object.
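As a rough illustration of that correspondence, the sketch below builds the JSON body that Ollama's REST endpoint POST /api/generate accepts, using only the standard library. The helper name `build_generate_payload` and the option values are made up for illustration; `temperature` and `num_ctx` are options listed in the Ollama API docs linked above.

```python
import json


def build_generate_payload(model, prompt, **options):
    """Build a request body for Ollama's POST /api/generate endpoint.

    Model-level settings (temperature, num_ctx, ...) go under "options".
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    if options:
        payload["options"] = dict(options)
    return payload


payload = build_generate_payload(
    "llama2", "Tell me about history of AI", temperature=0.2, num_ctx=2048
)
print(json.dumps(payload, indent=2))
```

With the LangChain wrapper, the same settings would be passed as keyword arguments, for example `Ollama(model="llama2", temperature=0.2, num_ctx=2048)`.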

In [3]:
llm = Ollama(model="llama2")        # assumes the Ollama server is listening on the default port 11434

In [4]:
llm.invoke("Tell me about history of AI")

'\nArtificial intelligence (AI) has a rich and varied history that spans several decades. Here is a brief overview:\n\n1. 1950s-1960s: The Dartmouth Conference and the Birth of AI: The field of AI was founded in 1956 at a conference held at Dartmouth College in Hanover, New Hampshire. Attendees included computer scientists, mathematicians, and cognitive scientists who were interested in exploring the possibilities of creating machines that could simulate human intelligence.\n2. 1950s-1960s: The First AI Programs: In the late 1950s and early 1960s, researchers developed the first AI programs, including the Logical Theorist, which was able to reason and solve problems using logical deduction, and the ELIZA chatbot, which could mimic a conversation with a psychotherapist.\n3. 1970s: Rule-Based Expert Systems: In the 1970s, AI researchers developed rule-based expert systems, which used a set of rules to reason and make decisions. These systems were widely used in industries such as banking

In [5]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("Tell me about history of {topic}")
chain = prompt | llm

In [6]:
prompt

ChatPromptTemplate(input_variables=['topic'], messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['topic'], template='Tell me about history of {topic}'))])

In [7]:
chain.invoke({"topic":"Blockchain"})

'\nSure, I\'d be happy to explain the history of blockchain!\n\nBlockchain technology has its roots in the early 1990s, when a group of researchers at the University of California, Berkeley were exploring ways to create a digital ledger that could record transactions in a decentralized and secure manner. This work was done in response to the limitations of traditional centralized databases, which were vulnerable to tampering and hacking.\n\nThe first blockchain-like system was developed in 1995 by a group of researchers at the University of California, Berkeley. They created a system called "Cryptocurrencies," which was the first decentralized digital currency. However, it was not until 2008 that blockchain technology truly gained mainstream attention with the launch of Bitcoin, the first decentralized cryptocurrency.\n\nBitcoin was created by an anonymous individual or group using the pseudonym Satoshi Nakamoto. The Bitcoin protocol is built on top of a blockchain, which allows for pe
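To make the `prompt | llm` composition less magical, here is a toy sketch of how a pipe operator can chain two steps. This is not LangChain's implementation: the `Step` class, `fill_template`, and `fake_llm` are hypothetical stand-ins, and no model is called.

```python
class Step:
    """A tiny stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` yields a new step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical stand-ins for the prompt template and the model.
fill_template = Step(lambda d: "Tell me about history of {topic}".format(**d))
fake_llm = Step(lambda text: f"[model reply to: {text}]")

toy_chain = fill_template | fake_llm
result = toy_chain.invoke({"topic": "Blockchain"})
print(result)
```

The point is only that invoking the chain first fills the template with the input dict and then hands the resulting text to the next step.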

How to interact with LangChain

Quick Start  
https://python.langchain.com/docs/modules/model_io/quick_start

Writing prompt with ChatPromptTemplate()  
https://python.langchain.com/docs/modules/model_io/prompts/quick_start

https://python.langchain.com/docs/modules/model_io/prompts/composition

The topic keywords and top reviews used below are taken from the notebook ctm_demo_gridsearch.ipynb.

In [8]:
# system_message = "You are a moderator of an online gaming forum that allows players to post reviews about different games."
system_message = "You are a player of the game who is reading the reviews about the game."
human_template = \
"Create a name for a topic given the topic's keywords and some most representative reviews of the topic. The name of the game is 'Terraria'. The top keywords of the topic is: \'\'\'{topic_keywords}\'\'\'. The most representative reviews of the topic are: \'\'\'{topic_reviews}\'\'\'. Output a description less than 5 words for the topic. Do not output other text."

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_message),
    ("human", human_template)
])

topic_top_keywords = [
    'play',
    'hour',
    'buy',
    'time',
    'pc',
    'xbox',
    'first',
    'love',
    'life',
    'would'
]

topic_top_reviews = [
    'Own this on xbox? Pshh, you do not know what you are missing out on. I played the heck of terraria on xbox 360 and I thought I was done playing this game. The PC version is way better and more smooth then xbox 360. 10/10 would play again. One last thing, do not buy the 4-pack unless you 100% know that the friends you bought it for would play this game. I gave it away to 3 of my friends. One ended up playing a good bit of time. the second played for 10 minutes, and the third ended up removing me after I gifted it away. #fail',
    "Well, let me tell you about this game. My son LOVES it. I got it for him around the winter sale as a gift. He has been playing it non-stop for a long time. He has been bugging me to get it and play it to, but it just didn't look entertaining. BOY, was I wrong. I just bought it during the last summer sale (a few weeks ago) and been playing it ALOT. It is worth it and a GREAT break from daily life. ENJOY! ",
    'I love this game, i play it everyday and every night when i get a chance. I was first shown this game on the XboX with my brother playing it at his house, i seen it and first thought that was the dumbest thing ive ever A side scrolling minecraft? So he begged me to try it and after the first 10 mins i was HOOKED! Hooked i tell you! i couldnt take my eyes off the game for a second. I played the game straight for 29 hours none stop except to eat and use the bathroom and occasional xbox freezes from overuse. Man this game is awesome. this game is alot funner if you have friends that play with you. Besides the mutiplayer part i been playing well over 300 hours and still going strong. I HIGHLY suggest this game to ANYONE that loves to play minecraft. ive already gotten 20 of my other friends to buy and play this game and they all love it. Sadly most of their computers broke and i no longer have friends to play mutiplayer with anymore. I need friends to play with shoot me a game invite. PLAY PLAY PLAY you will not be dissapointed! *sorry for bad spelling*',
    ":Heres A True Story: I remember when this game was in stores and everyone was playing it, buying it and loving this game until, 'Minecraft' came out and I never heard anyone talking about Terraria again :,( 3 Years later (2014) I bought this game for Ps3 and I loved it and playing it for ages made me love it even more but the Ps3 Version was glitched. 5 Months later (Still 2014) I bought Terraria for the Pc and I played it more than my Ps3 Version and WOW IT'S AMAZING!!!! 1st January - May (2015) Oh, wow Terraria is being updated in June, time to start playing again! 50 Years later (2060) I lived a good life (I died)",
    'I first got Terraria ios for my kindle 2 years ago and i loved i started playing it again and realized how limited it is so i bought it on steam and i knew what to do right off the bat and i had a blast playing it can wait to get back to it and play some more Terraria! Buy This Game Love it! out of 5',
    # 'purchases game on xbox plays for days with his friends is complete and total noob stops splaying because friends start hacking/cheating months Gets new compute gets steam account buys terraria gets friend to buy it plays for days with friends friends start cheating stops playing months later comes out plays for days with friends surprisingly friends dont cheat and we all have a jolly good time! oh yeah and its a good game too, not a noob anymore , i beat the moonlred like 20 no-noones proud of me? :( ',
    # (spam review left out of the prompt: the phrase 'DONT LET ME GET INTO MY ZONE' repeated several hundred times)
    # 'This game is People that berely go on the computer play for 10-70 hours Normal People that go on computers play 70-150 hours People that go on the computer frequently plays 150-500 hours Totally addicted players play for 500-2500 hours People that think they are in terraria 2500- ∞ What will happen to us in a few ',
    # "Didn't want to buy this game at first, but my friend Sara told me to buy as it was on sale as my outher friends bought this game too. At first 4 of us played for the first 2 days and after that i played alone as all of them don't want to play. 11/10 And btw fu*k you Jon, waste my money! Awesome game btw :D",
    # (spam review left out of the prompt: a pasted excerpt of the Bee Movie script, beginning "According to all known laws of aviation, there is no way a bee should be able to fly." and cut off mid-sentence)
]

chain = chat_prompt | llm 


In [9]:
chain.invoke({
    "topic_keywords": topic_top_keywords,
    "topic_reviews": topic_top_reviews
})

'"Terraria love affair"'
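In the call above, the keyword and review lists are passed to the template as Python lists, so they presumably end up in the prompt as their list representation (brackets and quotes included). A possible alternative, sketched below, is to turn them into plain strings first and cap the length of each review. The helper `format_topic_inputs` and the `max_review_chars` limit are assumptions for illustration, not part of the original workflow.

```python
def format_topic_inputs(keywords, reviews, max_review_chars=500):
    """Turn keyword and review lists into plain strings for the prompt.

    Each review is cut to max_review_chars so that one very long review
    cannot dominate the context window.
    """
    keyword_text = ", ".join(keywords)
    trimmed = [review.strip()[:max_review_chars] for review in reviews]
    review_text = "\n".join(f"- {review}" for review in trimmed)
    return {"topic_keywords": keyword_text, "topic_reviews": review_text}


inputs = format_topic_inputs(
    ["play", "hour", "buy"],
    ["I love this game. " * 100, "  Great on PC.  "],
    max_review_chars=40,
)
print(inputs["topic_keywords"])
print(inputs["topic_reviews"])
```

The returned dict has the same keys as the template variables, so it could be passed straight to the chain: `chain.invoke(format_topic_inputs(topic_top_keywords, topic_top_reviews))`.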

In [10]:
topic_top_keywords  = [
    'content',
    'update',
    'game',
    'hour',
    'new',
    'one',
    'time',
    'developer',
    'play',
    'still'
]

topic_top_reviews = [
"LEGIT THE BEST GAME EVER BETTER THAN MINECRAFT, SO MANY THINGS TO DO!!!!!!! BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!BEST GAME EVER!!!",
"Just realized I've had this game for years and never reviewed it. Which is just horrible of me. Of all the games in my steam library this deserves a review. I've enjoyed this game a lot. I've also enjoyed reading the games development storie. The dev stopped development on this game years ago. Was 'encouraged' by his girlfriend to update it. That encouragement must have been awesome. He released not only one more major update for it, but nominated a team of developers to continue work on it for long after that. They are still updating it. Just this month (November of 2016) they released another update that in many games would have been a entire paid DLC but in this game is just par for the course of another free update. After this game was released it has seen many changes and updates. Adding content. Not nerfiing content, but adding more content. Even takign the time to fix bugs a long the way. They game is great. The amount of content I have gotten for free since I have bought this game is amazing though. Most AAA developers might have released a patch for it after release. Maybe a few pieces of paid DLC. The developers on this have released multiple patches a year for years after it was released. Most those patches equivalent to a paid DLC from a AAA title. Not sayign this is the only way to dev a game. Just saying it is a damn nice one to see. Thanks.",
"Simply one of the best indie exploration-sandbox game of all time? I'm writing this review after the HUGE update that brought so much to the game than, after hundreds of hours already spent on this game, I still felt I was up for another nth playthrough, this time in goddam painful 'Expert mode' (but I'm enjoying it!) Seriously, which developer cherish more his baby than Terraria's one? The game is out since 2011, and yet 4 years after, you just happen to get a ton of new content for free for the thord time, for a game that has already been bought by so many people here. There is no profit involved here, just one man's dedication for his game and his community. If all developers could be as much Terraria is simply one of the best gaming experience I ever had. - It's retro stylish, without being ugly nor botched. - Musics are great and stick to the head as you could expect on old videogames. - There are a lot of things to do to really beat the game. - Exploring is both entertaining and rewarding. - Classic combat style and yet so much diversified thanks to a ton of different weapons. - I never felt like I was grinding for something. - You can build so much things, with so much - It's just totally amazing to watch all the crazy/beautiful things the community already made with this game. - Multiplayer is such a wonderful experience here, and it's even better since they made it easier to organize. What else could we ask for? I would buy this again.",
"If these side scroller retro crafting type games appeal to you at all, this is one of the best. Absolutely recommended. Tons of content for the price. Probably the best game I've ever bought for this price! And they just keep adding more content!! Edit/update: Every time I scroll past this game in my Steam library, it makes me smile. I can't wait for a sequel/expansion/major update. What a game!",
"Terraria is a truly amazing game experience, over the years countless updates have been added to the game enhancing it above and beyond what was already a great game. Terraria is a great game if you enjoy action adventure, with the potential to also build your hearts desire (in a 2D space). If you enjoy these kinds of games and have not played Terraria I highly recommend it. Great value for price and still to this day the developer over at Redigit have continued to put out countless updates adding new content for free.",
# "A super fun and addictive Sandbox Crafting 2D Side Scrolling Retro RPG. The amount of content and work put in to this little indie game is astounding. Literally as many hours or more of gameplay as The Witcher 3 right up until endgame, not to mention the insane replayability all thanks to the major free content updates the devs provide us every few months. One of the best price/content ratio games on steam right now, i got it for around $4-5 CND when it went on sale one time and it's one of the best choices i made on steam. Trust me, this game is well worth your hard earned money if you are a fan of the genre. Pick it up, you won't be disappointed.",
# "Terraria is currently the game I've invested the longest amount of time into. When I first booted up the game in one of its early versions, I was one of the players that easily classified Terraria as 'Minecraft, minus one Of course, back then I would not have imagined the plethora of free updates and content Terraria would become throughout its development cycle. Re-Logic, Terraria's developers, have given the gaming community one of the most content-rich, sandbox-exploration games. Even after they began other projects, Re-Logic continues to surprise Terraria communities with new content years after Terraria's release. At the time of writing this review, the game version of Terraria has reached If there was ever an announcement for yet another massive update in the form of I wouldn't be surprised, but excited to start from scratch to enjoy the new additions.",
# "Fantastic game, every time I play it the developer has added new stuff (all for free, not microtransactions). I'm completely amazed by the amount of support the developer has given this game. One of the best games I've ever played, it's a 2D sidescrolling open world adventure game. It has hours on hours of content for a fantastically low price",
# "I've played through this game multiple times with multiple groups of varying size. It's fun even when playing alone, but with a group of people this is pretty much the best game involving crafting that you can pick up. It has tons of content, the combat system is much better than most games of this type, it runs on pretty much anything, it's regularly on sale, the soundtrack is great, and it still gets gigantic free content updates years after release. I own it on Steamand DRM-free on GoG and the Humble Store and I've bought multiple copies for friends. over the years. If you've spent more than 10$ on computer games in your life and don't have this one in your library, you've made a mistake. The only reason not to get this game if you have the chance is if you really despise the genre or have nobody to play it with. The developers deserve praise perhaps more than any other developer for their continued free support and the many community suggestions they've implemented. TL;DR: This game is absolutely amazing, and well worth the asking price. If you don't pick it up full-price, get it on one of the many sales. A blast to play with a group of people.",
# "Much more than a 2D Minecraft, Terraria blends one of the deepest crafting systems of all time, a charming graphical style, Metroidvania elements and epic boss fights together into a game that's more than the sum of its parts. And best of all, the game's only continued to improve, greatly deepening in new features, graphics, weapons, bosses, and even adding entirely new ways to play the game, years after the original release. Almost anyone else would sell the content they give out for free as expansion packs. Not this team. Terraria continues to grow, with the update approaching. While much smaller in scope than the expansion pack-sized it shows the continued dedication that exists to improving an already incredible game. Must buy."
]

In [11]:
chain.invoke({
    "topic_keywords": topic_top_keywords,
    "topic_reviews": topic_top_reviews
})

'Best game ever!'

In [12]:
human_template2 = \
"Create a name for a topic given the topic's keywords. The name of the game is 'Terraria'. The top keywords of the topic is: \'\'\'{topic_keywords}\'\'\'. Output a description less than 5 words for the topic. Do not output other text."

chat_prompt2 = ChatPromptTemplate.from_messages([
    ("system", system_message),
    ("human", human_template2)
])

chain2 = chat_prompt2 | llm

In [13]:
chain2.invoke({
    "topic_keywords": topic_top_keywords,
    # "topic_reviews": topic_top_reviews
})

'\nNew Terraria content in the works!'
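
The raw label above starts with a newline and runs to six words, although the prompt asks for fewer than five. A minimal post-processing sketch, assuming we only want to trim the output and flag over-long labels; `clean_topic_label` and `max_words` are hypothetical names, not part of LangChain or Ollama:

```python
def clean_topic_label(raw_label: str, max_words: int = 4) -> tuple[str, bool]:
    """Trim an LLM-generated label and report whether it respects the word limit."""
    # Collapse every run of whitespace (including the leading newline) into one space.
    label = " ".join(raw_label.split())
    # Drop quotes the model may wrap around its answer.
    label = label.strip("'\"")
    within_limit = len(label.split()) <= max_words
    return label, within_limit


label, ok = clean_topic_label("\nNew Terraria content in the works!")
print(label, ok)
```

A label that fails the check could be regenerated or truncated; which of the two is better depends on how the labels are used downstream.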

---

RAG with Ollama

Ref: https://python.langchain.com/docs/integrations/vectorstores/chroma

In [14]:
# import
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader, DirectoryLoader
from langchain_community.embeddings.sentence_transformer import (
    SentenceTransformerEmbeddings,
)
from langchain_community.vectorstores import Chroma

# load the document and split it into chunks
doc_path = Path("cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt")

loader = TextLoader(str(doc_path))
documents = loader.load()

# split it into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# load it into Chroma
db = Chroma.from_documents(docs, embedding_function)

query = "Generate a summary of the document."
docs = db.similarity_search(query)

print(docs[0].page_content)

  from .autonotebook import tqdm as notebook_tqdm


It took me a second to come around to Phantom Liberty's set-up: rescue the president of the New United States after Space Force One (lol) gets shot down over Dogtown, a newly added district of Night City that looks like the Thunderdome by way of Blade Runner: 2049's haunting, irradiated Las Vegas.

CD Projekt is great at teeing up a story that seems like it'd be derivative or Spike TV-edgy, then absolutely curving you with unexpected depth and nuance, and that's no different here. President Myers comes off as a down to earth, "one of the guys" former soldier, like Harrison Ford from Air Force One, but there's a serpentlike ambition to her that comes out as the story goes on.
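
What `similarity_search` does can be sketched in plain Python: embed the query, score every chunk by cosine similarity, and return the k closest chunks. This is a toy illustration only. The bag-of-words `embed` function is a stand-in for the all-MiniLM-L6-v2 sentence embeddings, and none of these helper names come from Chroma or LangChain:

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a sentence-transformer model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, chunks: list[str], k: int = 4) -> list[str]:
    """Return the k chunks most similar to the query, best match first."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "rescue the president of the New United States",
    "the soundtrack is great",
    "Dogtown is a newly added district of Night City",
]
top = similarity_search("president of the united states", chunks, k=1)
print(top)
```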


In [15]:
# make a chain

from langchain.chains import RetrievalQA

chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=db.as_retriever(),
    return_source_documents=True
)

In [16]:
## Cite sources
def process_llm_response(llm_response):
    print(llm_response['result'])
    print('\n\nSources:')
    for source in llm_response["source_documents"]:
        print(source.metadata['source'])

In [17]:
# full example
query = "Generate a summary of the document."
llm_response = chain(query)
process_llm_response(llm_response)

print('\n\n')
print(llm_response)

  warn_deprecated(


The summary of the document is as follows:

The author has positive feelings towards Phantom Liberty, an expansion for Cyberpunk 2077. They appreciate how the story sets up a scenario that seems derivative but then surprise with unexpected depth and nuance. The author enjoys the gig additions, particularly one where they help out a pair of hapless detectives in Dogtown. The main players in the expansion have conflicting motivations, leading to a difficult choice at the end. The author notes that the path they chose resulted in a gut-wrenching ending, and they are torn between their desire for a better ending and enjoying the darker outcome.


Sources:
cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt
cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt
cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt
cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt



{'query': 'Generate a summary of the document.', 'result': 'T
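
The same source path is printed four times above because all four retrieved chunks come from one file. A small sketch of a de-duplicating variant, assuming each source document exposes `metadata['source']` as in `process_llm_response`; `unique_sources` is a hypothetical helper and the `SimpleNamespace` objects only stand in for LangChain `Document` objects:

```python
from types import SimpleNamespace


def unique_sources(llm_response: dict) -> list[str]:
    """Return each source path once, in first-seen order."""
    sources = (doc.metadata["source"] for doc in llm_response["source_documents"])
    # dict.fromkeys keeps insertion order and drops repeats.
    return list(dict.fromkeys(sources))


# Stand-in for a RetrievalQA response dict.
fake_response = {
    "result": "summary text",
    "source_documents": [
        SimpleNamespace(metadata={"source": "a.txt"}),
        SimpleNamespace(metadata={"source": "a.txt"}),
        SimpleNamespace(metadata={"source": "b.txt"}),
    ],
}
print(unique_sources(fake_response))
```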

In [18]:
# comparison: without retriever, only llm
from langchain_core.prompts import ChatPromptTemplate

human_template = \
'''Generate a summary of the document. \'\'\'{document}\'\'\'
'''

chat_prompt = ChatPromptTemplate.from_messages([
    ("human", human_template)
])
chain = chat_prompt | llm

# load the document and get the content as string
doc_path = Path("cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt")
with open(doc_path, 'r') as f:
    document = f.read()

chain.invoke({
    "document": document
})

'In this review, the reviewer expresses their enthusiasm for Cyberpunk 2077\'s Phantom Liberty DLC, praising its engaging storyline, immersive gameplay, and high-quality writing and worldbuilding. They also note that the DLC has a distinct look from the rest of Night City, but feels a little disappointing in terms of interactivity and exploration opportunities compared to other immersive sim games. Overall, the reviewer is pleased with Phantom Liberty and considers it a satisfying conclusion to the Cyberpunk 2077 world created by CD Projekt.\n\nThe review highlights several key points:\n\n1. Engaging storyline: The reviewer enjoys the DLC\'s cinematic storytelling and drawn-out set pieces, describing them as "draw-dropping moments."\n2. Immersive gameplay: They praise the DLC for its interactive and immersive gameplay elements, such as the ability to infiltrate the youth sports academy of the future or help out a pair of hapless detectives.\n3. High-quality writing and worldbuilding: T

Conclusion: the result produced by RAG is more elaborate: it cites a character of the game and points out that it took a while to get into the story. Both results praise the quality of the game.

---

Build a persistent dir as a database

https://www.gettingstarted.ai/tutorial-chroma-db-best-vector-database-for-langchain-store-embeddings/

video tut: https://youtu.be/3yPBVii7Ct0

In [19]:
file_dir_path = Path('cyberpunk_2077_phantom_liberty')

embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")           # TODO: change to ollama LLM embeddings due to context length 

loader = DirectoryLoader(str(file_dir_path), glob="./*.txt", loader_cls=TextLoader)
docs = loader.load()

# split it into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
docs = text_splitter.split_documents(docs)    # split the directory's docs, not `documents` from the single-file cell above

# chroma_db = Chroma.from_documents(
#     documents=docs, 
#     embedding=embedding_function, 
#     persist_directory="data", 
#     collection_name="lc_chroma_demo"
# )

In [21]:
persist_directory = 'cyberpunk_2077_phantom_liberty_db'

In [None]:
# create the open-source embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# create the vector database
vector_db = Chroma.from_documents(
    documents=docs,
    embedding=embedding_function,
    persist_directory=persist_directory
)

vector_db.persist()
del vector_db

In [22]:
# load the vector database
vector_db = Chroma(
    persist_directory=persist_directory,
    embedding_function=embedding_function)

Make a retriever, then make a chain.

In [27]:
# test the retriever

prompt = 'What are the drawbacks of the game?'

# retriever
retriever = vector_db.as_retriever()

docs = retriever.get_relevant_documents(prompt)

print(len(docs))
print('\n\n')
for doc in docs:
    print(doc.page_content)
    print(doc.metadata['source'])
    print('\n\n')

4



Doggone it
It makes sense that Dogtown's physically siloed off from the rest of Night City, but I found myself wishing it was either better-integrated into the main game, or even more of a hostile, alien, alternate zone. The "save the President" plot actually resolves itself pretty quickly in favor of a deeper conspiracy, and you're subsequently free to come and go between Dogtown and Pacifica as you please, but the DLC really doesn't have you doing much in Night City proper aside from running quick errands like delivering procedurally spawned cars for its new, literal grand theft auto minigame.

Night City just isn't tactile or inviting the way a good immersive sim or even the Elder Scrolls games manage to be.
cyberpunk_2077_phantom_liberty/cyberpunk_2077_phantom_liberty_01.txt



Phantom Liberty is an extra-refined bite of Cyberpunk 2077—an expansion pack's expansion pack. It doesn't reinvent the game as a whole, but it's a fantastic final outing for V and Night City, as well as

In [28]:
# make a chain

# create the chain to answer questions 
chain = RetrievalQA.from_chain_type(llm=llm, 
                                    chain_type="stuff", 
                                    retriever=retriever, 
                                    return_source_documents=True)

# full example
llm_response = chain(prompt)
llm_response
# process_llm_response(llm_response)

{'query': 'What are the drawbacks of the game?',
 'result': "Based on the context provided, the main drawbacks of Cyberpunk 2077 are:\n\n1. Lack of tactile and inviting gameplay: The game's world, Night City, is not as immersive or interactive as some other games in the genre, such as immersive sims or the Elder Scrolls series.\n2. Limited exploration opportunities: With the 2.0 update, the game's loot and progression systems have been improved, but the game still lacks a sense of urgency or motivation to explore its world beyond explicit mission objectives.\n3. Uninteractive world: The game's visual feast of Night City is often marred by the fact that the world is not as interactive or immersive as it could be, with limited opportunities for players to snoop around and discover new things.",
 'source_documents': [Document(page_content='Doggone it\nIt makes sense that Dogtown\'s physically siloed off from the rest of Night City, but I found myself wishing it was either better-integrate

In [29]:
print(chain.combine_documents_chain)

llm_chain=LLMChain(prompt=PromptTemplate(input_variables=['context', 'question'], template="Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.\n\n{context}\n\nQuestion: {question}\nHelpful Answer:"), llm=Ollama()) document_variable_name='context'


In [30]:
# see the chain template
print(chain.combine_documents_chain.llm_chain.prompt.template)

Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:
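
How the "stuff" chain fills this template can be sketched with plain string formatting: the retrieved chunks are concatenated into `{context}` and the user's query goes into `{question}`. This is a simplified illustration, not LangChain's implementation; `build_stuff_prompt` is a hypothetical name, and joining chunks with a blank line is an assumption about the separator:

```python
STUFF_TEMPLATE = (
    "Use the following pieces of context to answer the question at the end. "
    "If you don't know the answer, just say that you don't know, "
    "don't try to make up an answer.\n\n"
    "{context}\n\n"
    "Question: {question}\n"
    "Helpful Answer:"
)


def build_stuff_prompt(chunks: list[str], question: str) -> str:
    """Stuff all retrieved chunks into a single prompt string."""
    # Assumption: chunks are separated by one blank line.
    context = "\n\n".join(chunks)
    return STUFF_TEMPLATE.format(context=context, question=question)


stuffed = build_stuff_prompt(
    ["Dogtown is siloed off from Night City.", "Night City isn't tactile."],
    "What are the drawbacks of the game?",
)
print(stuffed)
```

Because every chunk lands in one prompt, the total length grows with the number and size of retrieved chunks, which is why chunk_size and the retriever's k matter for the context window.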


---

Prompt engineering on Retrieval QA

https://github.com/langchain-ai/langchain/discussions/3115

The original prompt is from: https://github.com/langchain-ai/langchain/blob/0bc397957b72fcd9896d1cf2bceae1d6a06e7889/libs/langchain/langchain/chains/question_answering/stuff_prompt.py#L20

```python
prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.

{context}

Question: {question}
Helpful Answer:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

system_template = """Use the following pieces of context to answer the user's question. 
If you don't know the answer, just say that you don't know, don't try to make up an answer.
----------------
{context}"""
messages = [
    SystemMessagePromptTemplate.from_template(system_template),
    HumanMessagePromptTemplate.from_template("{question}"),
]
CHAT_PROMPT = ChatPromptTemplate.from_messages(messages)


PROMPT_SELECTOR = ConditionalPromptSelector(
    default_prompt=PROMPT, conditionals=[(is_chat_model, CHAT_PROMPT)]
)
```

The code is also quoted in this Q&A: https://github.com/langchain-ai/langchain/discussions/12256

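The code above is wrapped in the global constant PROMPT_SELECTOR.  
It is consulted when the chain is built rather than on each call: from_chain_type() calls load_qa_chain(), which calls _load_stuff_chain(). (In Python, obj() maps to __call__(); _call() is the method LangChain runs from there.)  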

The class/call-stack is like:  
BaseRetrievalQA(Chain)  
+-- load_qa_chain()  
| +-- _load_stuff_chain()

In the first line of the func [Link](https://github.com/langchain-ai/langchain/blob/0bc397957b72fcd9896d1cf2bceae1d6a06e7889/libs/langchain/langchain/chains/question_answering/__init__.py#L63)
```python

def _load_stuff_chain(
    llm: BaseLanguageModel,
    prompt: Optional[BasePromptTemplate] = None,
    document_variable_name: str = "context",
    verbose: Optional[bool] = None,
    callback_manager: Optional[BaseCallbackManager] = None,
    callbacks: Callbacks = None,
    **kwargs: Any,
) -> StuffDocumentsChain:
    _prompt = prompt or stuff_prompt.PROMPT_SELECTOR.get_prompt(llm)
    llm_chain = LLMChain(
        llm=llm,
        prompt=_prompt,
        verbose=verbose,
        callback_manager=callback_manager,
        callbacks=callbacks,
    )
    # TODO: document prompt
    return StuffDocumentsChain(
        llm_chain=llm_chain,
        document_variable_name=document_variable_name,
        verbose=verbose,
        callback_manager=callback_manager,
        callbacks=callbacks,
        **kwargs,
    )
```

The first line is where the prompt template is loaded.

The function then returns a chain, which we can call directly.

To do prompt engineering, we replace the default PromptTemplate() with our own.
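
The fallback logic in that first line (`prompt or PROMPT_SELECTOR.get_prompt(llm)`) can be mimicked with a toy selector. This is a plain-Python sketch under simplifying assumptions, not LangChain's classes: `ToyPromptSelector`, `resolve_prompt`, and the `is_chat` attribute are all hypothetical, and prompts are plain strings here:

```python
from types import SimpleNamespace
from typing import Optional

DEFAULT_PROMPT = "{context}\n\nQuestion: {question}\nHelpful Answer:"
CHAT_PROMPT = "[system] {context}\n[human] {question}"


class ToyPromptSelector:
    """Pick the first prompt whose condition matches the model, else the default."""

    def __init__(self, default_prompt, conditionals):
        self.default_prompt = default_prompt
        self.conditionals = conditionals

    def get_prompt(self, llm) -> str:
        for condition, prompt in self.conditionals:
            if condition(llm):
                return prompt
        return self.default_prompt


def is_chat_model(llm) -> bool:
    # Assumption: the toy model objects carry an `is_chat` flag.
    return getattr(llm, "is_chat", False)


SELECTOR = ToyPromptSelector(DEFAULT_PROMPT, [(is_chat_model, CHAT_PROMPT)])


def resolve_prompt(llm, prompt: Optional[str] = None) -> str:
    """A custom prompt wins; otherwise the selector decides."""
    return prompt or SELECTOR.get_prompt(llm)


plain_llm = SimpleNamespace(is_chat=False)
chat_llm = SimpleNamespace(is_chat=True)
print(resolve_prompt(plain_llm))
print(resolve_prompt(chat_llm))
print(resolve_prompt(plain_llm, prompt="My own template: {question}"))
```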

In [31]:
from langchain.chains import RetrievalQAWithSourcesChain
from langchain_core.prompts import PromptTemplate

# system_template = \
# '''You are a reviewer of the game. Use the following pieces of context to answer any question about the game.
# If you don't know the answer, just say 'NA'. Do NOT try to make up an answer.
# ---
# {context}'''

# prompt: let's say we write a summary of the game along some predefined aspects
# Gameplay, Graphics, Sound, Performance, Bug, Suggestion, Price, Overall

# TODO: fine-tune the prompt to use the theory I stated below
prompt_template = \
'''You are reading reviews of a game to understand the characteristics of the game. Use the following pieces of context to answer user's question. 

{summaries}

Question: {question}

If you don't know the answer, just output a json with all values in the json as 'NA'. Do NOT try to make up an answer.
Only output the JSON. Do NOT output other text.'''

my_question = \
'''Extract the following aspects of the game from the reviews. Output a json with each of the aspects as key, and the extracted information as the value. The format of the json is {"ASPECT":"INFORMATION"}. The aspects are: [Gameplay, Graphics, Sound, Performance, Bug, Suggestion, Price, Overall]
'''

retriever = vector_db.as_retriever()

chain =  RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    chain_type_kwargs={
        "prompt": PromptTemplate(
            template=prompt_template,
            input_variables=["summaries", "question"],
        )
    },
    return_source_documents=True,
)

The 'return_source_documents' flag is used in the _call() func [Github](https://github.com/langchain-ai/langchain/blob/0bc397957b72fcd9896d1cf2bceae1d6a06e7889/libs/langchain/langchain/chains/retrieval_qa/base.py#L119)

To get the relevant docs from the Chroma vector DB, _call() calls _get_docs(),  
which RetrievalQA implements by calling retriever.get_relevant_documents() [Github](https://github.com/langchain-ai/langchain/blob/0bc397957b72fcd9896d1cf2bceae1d6a06e7889/libs/langchain/langchain/chains/retrieval_qa/base.py#L214)

Hence, to check which documents will be returned for our prompt, we can pass the _question_ variable to the retriever and inspect what ends up in the 'summaries' field of the main prompt.

In LangChain, a vector database keeps the prompt within the LLM's context window: it searches for the chunks most relevant to the task, inserts only those into the prompt, and the LLM answers from them.  
As a bonus, the chain returns the retrieved chunks, which show which parts of the documents were included in the prompt.
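
The context-window argument can be made concrete with a rough budget check. This is only a back-of-the-envelope sketch: the 4-characters-per-token ratio, the 4096-token window, and the 512 tokens reserved for the answer are all assumptions, and the strings below are synthetic stand-ins for the real review text:

```python
def rough_token_count(text: str) -> int:
    # Assumption: about 4 characters per token for English text.
    return len(text) // 4


def fits_context(prompt: str, context_window: int = 4096, reserve_for_answer: int = 512) -> bool:
    """Check whether the prompt plus room for the answer fits the assumed window."""
    return rough_token_count(prompt) + reserve_for_answer <= context_window


full_text = "x" * 40_000          # stand-in for all reviews concatenated
retrieved = ["y" * 1000] * 4      # four chunks at chunk_size=1000

print(fits_context(full_text))
print(fits_context("\n\n".join(retrieved)))
```

For an exact count, the model's own tokenizer would be needed; the point here is only the order of magnitude between stuffing everything and stuffing four chunks.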

In [32]:
response = chain.invoke({
    'question': my_question,
    # 'summaries': docs
})

response

{'question': 'Extract the following aspects of the game from the reviews. Output a json with each of the aspects as key, and the extracted information as the value. The format of the json is {"ASPECT":"INFORMATION"}. The aspects are: [Gameplay, Graphics, Sound, Performance, Bug, Suggestion, Price, Overall]\n',
 'answer': '{\n"Gameplay": "CD Projekt makes a case here for more cinematic, bounded RPG design. The main players in Phantom Liberty all think they\'re doing the right thing and that they don\'t have any other options, all while expecting you to back them up, and that eventually shakes out into one of the toughest choices I\'ve had to make in an RPG: you have to betray someone, and both ending paths feature their own triumphs and gut punches.",\n"Graphics": "CD Projekt really is in the same league as Naughty Dog or Sony Santa Monica when it comes to delivering draw-dropping moments, but sets itself apart with the RPG choice, consequence, and interactivity I crave.",\n"Sound": "Ma

In [33]:
print(response['answer'])

{
"Gameplay": "CD Projekt makes a case here for more cinematic, bounded RPG design. The main players in Phantom Liberty all think they're doing the right thing and that they don't have any other options, all while expecting you to back them up, and that eventually shakes out into one of the toughest choices I've had to make in an RPG: you have to betray someone, and both ending paths feature their own triumphs and gut punches.",
"Graphics": "CD Projekt really is in the same league as Naughty Dog or Sony Santa Monica when it comes to delivering draw-dropping moments, but sets itself apart with the RPG choice, consequence, and interactivity I crave.",
"Sound": "Many are simple but enjoyable 'go here and kill everyone as fits your playstyle' deals, but some of them felt more like full-on side stories with voice acting, a twist, and maybe even a gameplay curveball.",
"Performance": "I always enjoyed Cyberpunk's smaller side quests or 'gigs,' too. Many are simple but enjoyable 'go here and 

In [34]:
for doc in response['source_documents']:
    print(doc)

page_content='CD Projekt makes a case here for more cinematic, bounded RPG design.\n\nThe main players in Phantom Liberty all think they\'re doing the right thing and that they don\'t have any other options, all while expecting you to back them up, and that eventually shakes out into one of the toughest choices I\'ve had to make in an RPG: you have to betray someone, and both ending paths feature their own triumphs and gut punches.\n\nThe path I chose was way more gut punches than triumphs, so I think I picked the "bad" ending. From chatting with my coworkers, though, the other option sounds at least bittersweet, and my RPG perfectionist drive to get the best ending possible is conflicting with just how good that bad ending was. The Darkest Timeline had me feeling like my stomach had a lead weight in it from the point of no return to credits rolling, and by the end, the bastards got away with everything. Forget it V, it\'s Dogtown.' metadata={'source': 'cyberpunk_2077_phantom_liberty/c

Without RAG (no Chroma retrieval)

In [35]:
from langchain_core.prompts import PromptTemplate

prompt_norag = PromptTemplate(
    template=prompt_template,
    input_variables=["summaries", "question"],
)

chain_norag = prompt_norag | llm

# need to get all text from the txt files under folder file_dir_path
# use glob to get all the file paths
file_dir_path = Path('cyberpunk_2077_phantom_liberty')
file_paths = file_dir_path.glob('*.txt')
file_paths = list(file_paths)
# get the text and store in a list
docs = []
for file_path in file_paths:
    with open(file_path, 'r') as f:
        docs.append(f.read())



result_norag = chain_norag.invoke({
    'question': my_question,
    'summaries': '\n'.join(docs)
})

result_norag

'{\n"Gameplay": "CD Projekt makes a case here for more cinematic, bounded RPG design.",\n"Graphics": "The visual feast of Night City has always clashed with how uninteractive its world is.",\n"Sound": "Idris Elba is a real treat as Solomon Reed",\n"Performance": "Now that 2.0 has fixed Cyberpunk\'s loot and progression woes, its biggest remaining issue to my eye is that there\'s no real call to explore or poke around its world outside explicit mission objectives.",\n"Bug": "NA",\n"Suggestion": "I think Dogtown\'s smaller, more manageable canvas could have been an opportunity to create something like that within Cyberpunk 2077.",\n"Price": "NA",\n"Overall": "CD Projekt can hang with the big dogs when it comes to cinematic storytelling, with a quality of writing and world building that I prefer to the likes of Sony\'s vaunted first party lineup. I\'ve been eager to see what CD Projekt would do with a Cyberpunk 2077 expansion ever since first beating the game at the end of 2020, and Phant

In [36]:
print(result_norag)

{
"Gameplay": "CD Projekt makes a case here for more cinematic, bounded RPG design.",
"Graphics": "The visual feast of Night City has always clashed with how uninteractive its world is.",
"Sound": "Idris Elba is a real treat as Solomon Reed",
"Performance": "Now that 2.0 has fixed Cyberpunk's loot and progression woes, its biggest remaining issue to my eye is that there's no real call to explore or poke around its world outside explicit mission objectives.",
"Bug": "NA",
"Suggestion": "I think Dogtown's smaller, more manageable canvas could have been an opportunity to create something like that within Cyberpunk 2077.",
"Price": "NA",
"Overall": "CD Projekt can hang with the big dogs when it comes to cinematic storytelling, with a quality of writing and world building that I prefer to the likes of Sony's vaunted first party lineup. I've been eager to see what CD Projekt would do with a Cyberpunk 2077 expansion ever since first beating the game at the end of 2020, and Phantom Liberty is 
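
Since both chains are asked to return only JSON, the answer string can be parsed and normalised before further use. A minimal sketch, assuming the eight aspects from `my_question` and the 'NA' convention from the prompt; `parse_aspects` is a hypothetical helper:

```python
import json

ASPECTS = ["Gameplay", "Graphics", "Sound", "Performance",
           "Bug", "Suggestion", "Price", "Overall"]


def parse_aspects(raw_answer: str) -> dict:
    """Parse the model's JSON answer; use 'NA' for missing aspects or invalid JSON."""
    try:
        parsed = json.loads(raw_answer)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    # Keep only the expected aspects, in a fixed order.
    return {aspect: str(parsed.get(aspect, "NA")) for aspect in ASPECTS}


good = parse_aspects('{"Gameplay": "cinematic, bounded RPG design", "Bug": "NA"}')
bad = parse_aspects("Sorry, I cannot answer that.")
print(good)
print(bad)
```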

Future: host the ChromaDB in Docker instead of using an in-memory DB

https://rito.hashnode.dev/installing-chroma-db-locally-querying-personal-data

https://docs.trychroma.com/api-reference

Deploying in Docker seems simple, as long as the container has enough disk space to store the db file.  
Just prepare a Linux env with Python (and miniforge) and the chroma package installed. Then the server can be started from the command line :D