# Scify is back: Version A (using the LangChain Anthropic package)

### Who is Scify?
An LLM chatbot that is expected to:
1) find books on [Project Gutenberg](https://www.gutenberg.org/)
2) find the plain-text (txt) URL for each book


### How was Scify?
 * In [my previous experiment](https://github.com/dujm/book-reader/blob/master/notebooks/03-chatbot.ipynb), I used `llama2 7B` (Ollama ID `78e26419b446`).
 * Scify did a good job of finding the relevant books, but failed to find the relevant txt URLs.

<br>

### How is Scify different now? 
 * In this experiment, I used Anthropic's Claude-3 model (`claude-3-opus-20240229`).

#### Result
 * Claude 3 significantly improved Scify's performance (i.e. no hallucination).
 * Scify found not only the relevant books, but also the correct txt URLs.
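
One way to sanity-check the "correct txt URLs" claim is to test each returned link against the URL shapes that appear in this notebook. The sketch below was not part of the original experiment: `looks_like_gutenberg_txt` is a hypothetical helper name, and the two patterns are inferred only from the example URL in the prompt and the URLs in the printed result. A pattern match does not prove that the file exists; that would still need an HTTP request.

```python
import re

# Two URL shapes seen in this notebook (an assumption, not an exhaustive list):
#   https://www.gutenberg.org/files/<id>/<id>-0.txt
#   https://www.gutenberg.org/cache/epub/<id>/pg<id>.txt
_TXT_PATTERNS = [
    re.compile(r"https://www\.gutenberg\.org/files/(\d+)/(\d+)-0\.txt"),
    re.compile(r"https://www\.gutenberg\.org/cache/epub/(\d+)/pg(\d+)\.txt"),
]


def looks_like_gutenberg_txt(url: str) -> bool:
    """Return True if the URL matches a known shape and both book ids agree."""
    for pattern in _TXT_PATTERNS:
        match = pattern.fullmatch(url)
        if match and match.group(1) == match.group(2):
            return True
    return False


print(looks_like_gutenberg_txt("https://www.gutenberg.org/files/36/36-0.txt"))
print(looks_like_gutenberg_txt("https://www.gutenberg.org/files/36/35-0.txt"))
```

The second call uses mismatched ids on purpose, which is the kind of near-miss a hallucinated link might contain.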



# Settings

### Packages

In [1]:
#%pip install langchain-anthropic

In [2]:
import os

# parse Python literals from strings (ast.literal_eval)
import ast

# pretty print
from pprint import pprint

# langchain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# langchain_anthropic
from langchain_anthropic import ChatAnthropic


# load the .env file that stores the API keys (see the `.env_example` file for details)
import dotenv
dotenv.load_dotenv()

# assign loaded API keys to variables
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
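
`os.getenv` returns `None` when the variable is missing, so a forgotten `.env` entry only surfaces later, when the model is called. A minimal sketch of a stricter lookup follows; `require_env` is a hypothetical helper, not something the notebook or `dotenv` provides.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Intended use in this notebook (commented out so the sketch runs without a key):
# ANTHROPIC_API_KEY = require_env("ANTHROPIC_API_KEY")
```

Failing early keeps the error message close to its cause.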

### Variables

In [3]:
#----------------------#
# model related
#----------------------#
# model
llm_model_id = 'claude-3-opus-20240229'


#----------------------#
# query related
#----------------------#
# topic 
topic = "Science Fiction"

# number of results to return
n = 5

# Build an LLM Chatbot

### Template

In [4]:
# template
template = """
You are a friendly chatbot assistant that responds in a conversational
manner to users questions. Keep the answers short, unless specifically
asked by the user to elaborate on something.

Question: {question}

Answer:"""
prompt = PromptTemplate(template=template, input_variables=["question"])
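
To preview what the model will receive, the placeholder can be filled by hand. The sketch below uses plain `str.format` on the same template text, on the assumption that `PromptTemplate` is left in its default f-string style so the two produce the same string; the sample question is made up for illustration.

```python
# Same template text as in the cell above.
template = """
You are a friendly chatbot assistant that responds in a conversational
manner to users questions. Keep the answers short, unless specifically
asked by the user to elaborate on something.

Question: {question}

Answer:"""

# Fill the {question} placeholder with an illustrative question.
rendered = template.format(question="Which books are popular?")
print(rendered)
```

Printing the rendered prompt makes it easy to spot stray whitespace or an unfilled placeholder before spending an API call.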


### Invite an LLM

In [5]:
llm = ChatAnthropic(model=llm_model_id)

### Create a LangChain workflow

In [6]:
llm_chain = LLMChain(prompt=prompt, llm=llm)

# Start Chatting

In [7]:
question = f"""
     Please go to 'https://www.gutenberg.org/', search for  {topic}
     list the most popular {n} books and their matched Plain Text UTF-8 links on 'https://www.gutenberg.org/'
     Respond with the book title and links.
     For example, the most popular Book is
     ['The Time Machine': 'https://www.gutenberg.org/cache/epub/35/pg35.txt']
     Return a pretty list of python readable form. 
"""
# pass the inputs as a dict keyed by the template variable
response = llm_chain.invoke({"question": question})


In [8]:
# extract the generated text from the response dict
response_text = response['text']


### Result

In [9]:
# keep only the n lines that hold the book entries (skips the first 3 lines of the response)
results = "".join(response_text.splitlines()[3:n+3]).lstrip()

# parse the string as a Python literal (here, a tuple of [title, url] lists)
result_display = ast.literal_eval(results)

# pretty print
pprint(result_display)


(['The War of the Worlds', 'https://www.gutenberg.org/files/36/36-0.txt'],
 ['The Time Machine', 'https://www.gutenberg.org/files/35/35-0.txt'],
 ['Flatland: A Romance of Many Dimensions',
  'https://www.gutenberg.org/files/201/201-0.txt'],
 ['The Poison Belt', 'https://www.gutenberg.org/files/126/126-0.txt'],
 ['A Princess of Mars', 'https://www.gutenberg.org/files/62/62-0.txt'])
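
The fixed slice `[3:n+3]` works only while the model puts exactly three lines before the entries. A more tolerant approach, sketched below, scans the whole response for `['title', 'url']` literals wherever they appear. `extract_pairs` and the `sample` text are hypothetical and only imitate the shape of the result above; titles that contain square brackets would be missed by this simple pattern.

```python
import ast
import re


def extract_pairs(text: str) -> list[tuple[str, str]]:
    """Collect every ['title', 'url'] literal found anywhere in the text."""
    pairs = []
    # Match bracketed spans that contain no nested brackets.
    for candidate in re.findall(r"\[[^\[\]]*\]", text):
        try:
            value = ast.literal_eval(candidate)
        except (ValueError, SyntaxError):
            continue  # not a valid Python literal, skip it
        if (isinstance(value, list) and len(value) == 2
                and all(isinstance(item, str) for item in value)):
            pairs.append((value[0], value[1]))
    return pairs


# Hypothetical response text, shaped like the result shown above.
sample = """Here are the results:

[
    ['The War of the Worlds', 'https://www.gutenberg.org/files/36/36-0.txt'],
    ['The Time Machine', 'https://www.gutenberg.org/files/35/35-0.txt'],
]
"""
print(extract_pairs(sample))
```

Because each entry is parsed on its own, one malformed line no longer breaks the whole result.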
