<a href="https://colab.research.google.com/github/Santiago-R/aprende/blob/main/lm-hackers.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# [A hacker's guide to Language Models](https://colab.research.google.com/github/fastai/lm-hackers/blob/main/lm-hackers.ipynb#scrollTo=0b017bfc-5be0-4e41-9fa1-9f685c3b0de5)


## Initial restart

In [None]:
import os
import importlib.util  # find_spec lives in the importlib.util submodule
a_spec = importlib.util.find_spec('accelerate')
b_spec = importlib.util.find_spec('bitsandbytes')
found = a_spec is not None and b_spec is not None

if found: print('Dependencies installed in previous run ✔️')
else:
    !pip install accelerate -qq
    !pip install bitsandbytes -qq
    !pip install auto-gptq -qq
    !pip install optimum -qq

    print('\nDependency installation requires restart --> killing runtime 💀')

Dependencies installed in previous run ✔️


In [None]:
if not found:
    os.kill(os.getpid(), 9)  # Kill runtime if required

## Setup

In [1]:
from google.colab import drive
drive.mount('/content/drive')  # , force_remount=True)

Mounted at /content/drive


In [2]:
# import tokenize
# from io import BytesIO

In [3]:
# Load API keys as environment variables (from Drive's api_keys.env)
!pip install python-dotenv -qq
from dotenv import load_dotenv
load_dotenv(dotenv_path='/content/drive/MyDrive/LLM/api_keys.env')

True

## OpenAI API with system prompt

In [6]:
!pip install openai -qq
import openai
client = openai.OpenAI()
ChatCompletion = client.chat.completions
Completion = client.completions

#### Sys prompt, doctor's assistant (careful!)

In [97]:
from IPython.display import display, Markdown  # , HTML

In [109]:
class ChatSys:
    def __init__(self, p_sys, model="gpt-4o"):
        self.messages = [{"role": "system", "content": p_sys}]
        self.model = model
    def add(self, question):
        self.messages.append({"role": "user", "content": question})
        answer = ChatCompletion.create(
            model=self.model,
            messages=self.messages
        ).choices[0].message.content
        self.messages.append({"role": "assistant", "content": answer})
        return answer
    def back(self, n=1):
        self.messages = self.messages[:-n*2]
    def display(self):
        md_list = [f'###{m["role"].upper()}\n{m["content"]}' for m in self.messages]
        display(Markdown('\n***\n'.join(md_list)))

In [110]:
p_sys = """You are an expert doctor's assistant focused on enhancing individuals' overall wellness and physical performance beyond just treating illness.
You will aid doctors in finding possible causes and treatment options for patiens' issues.

Input: Summary of a patient's issue
Output: A list of underlying causes, each with a connection to symptoms and its potential solution, for the doctor to evaluate.
"""

In [111]:
# qa = gen_qa_sys(p_sys)
chat = ChatSys(p_sys, model="gpt-4o")

In [112]:
p_user = """
- 35 yo male, no history of illness.
- Patient reports letargy in the morning, even after getting enough sleep, more often than not.
- Muscle weakness in the first minutes to hours of the day, most noticeably lack of grip strength.
- Hydration and coffee often help, but patient feels like they are not sufficient to get him to a normal state.
"""

answer = chat.add(p_user)
print(answer)

Certainly, here is a list of underlying causes along with their connections to the symptoms and potential solutions for the doctor to evaluate:

1. **Sleep Disorders (e.g., Sleep Apnea, Restless Leg Syndrome)**
   - **Connection to Symptoms:** Sleep disorders can cause poor-quality sleep, leading to morning lethargy and disrupted muscle function.
   - **Potential Solution:** Conduct a sleep study to diagnose any sleep disorders. Use CPAP for sleep apnea, or medication and lifestyle changes for other disorders.

2. **Electrolyte Imbalance**
   - **Connection to Symptoms:** Imbalances in electrolytes like sodium, potassium, and magnesium can result in muscle weakness and morning lethargy.
   - **Potential Solution:** Perform blood tests to check electrolyte levels. Recommend dietary changes or supplements to correct any imbalances.

3. **Hypoadrenia (Adrenal Fatigue)**
   - **Connection to Symptoms:** Poor adrenal function can lead to insufficient cortisol levels in the morning, causing 

In [113]:
print(chat.add('How about magnesium defficiency?'))

Certainly, magnesium deficiency is a plausible underlying cause worth consideration. Here are the details specific to magnesium deficiency:

### **Magnesium Deficiency**

- **Connection to Symptoms:**
  - **Morning Lethargy:** Magnesium is crucial for energy production and its deficiency can lead to fatigue and a feeling of sluggishness, particularly noticeable in the morning.
  - **Muscle Weakness:** Magnesium plays an essential role in muscle contraction and relaxation. Low levels can result in muscle weakness, cramps, and reduced grip strength, especially noticeable after waking.

- **Potential Solution:**
  - **Diagnostic Tests:** Conduct serum magnesium tests to confirm deficiency. Sometimes intracellular magnesium levels might need to be tested if serum levels are inconclusive.
  - **Dietary Adjustments:** Recommend an increase in magnesium-rich foods such as leafy green vegetables (spinach, kale), nuts and seeds (almonds, pumpkin seeds), whole grains (brown rice, quinoa), and le

In [114]:
chat.display()

###SYSTEM
You are an expert doctor's assistant focused on enhancing individuals' overall wellness and physical performance beyond just treating illness.
You will aid doctors in finding possible causes and treatment options for patiens' issues.

Input: Summary of a patient's issue
Output: A list of underlying causes, each with a connection to symptoms and its potential solution, for the doctor to evaluate.

***
###USER

- 35 yo male, no history of illness.
- Patient reports letargy in the morning, even after getting enough sleep, more often than not.
- Muscle weakness in the first minutes to hours of the day, most noticeably lack of grip strength.
- Hydration and coffee often help, but patient feels like they are not sufficient to get him to a normal state. 

***
###ASSISTANT
Certainly, here is a list of underlying causes along with their connections to the symptoms and potential solutions for the doctor to evaluate:

1. **Sleep Disorders (e.g., Sleep Apnea, Restless Leg Syndrome)**
   - **Connection to Symptoms:** Sleep disorders can cause poor-quality sleep, leading to morning lethargy and disrupted muscle function.
   - **Potential Solution:** Conduct a sleep study to diagnose any sleep disorders. Use CPAP for sleep apnea, or medication and lifestyle changes for other disorders.

2. **Electrolyte Imbalance**
   - **Connection to Symptoms:** Imbalances in electrolytes like sodium, potassium, and magnesium can result in muscle weakness and morning lethargy.
   - **Potential Solution:** Perform blood tests to check electrolyte levels. Recommend dietary changes or supplements to correct any imbalances.

3. **Hypoadrenia (Adrenal Fatigue)**
   - **Connection to Symptoms:** Poor adrenal function can lead to insufficient cortisol levels in the morning, causing fatigue and muscle weakness.
   - **Potential Solution:** Conduct a cortisol level test. Treatment may include lifestyle changes, stress management, and possibly adrenal supplements.

4. **Dietary Deficiencies (e.g., Vitamin D, B12, Iron)**
   - **Connection to Symptoms:** Deficiencies in key vitamins and minerals can result in fatigue and muscle weakness.
   - **Potential Solution:** Perform laboratory tests to check for deficiencies. Recommend appropriate supplementation or dietary adjustments.

5. **Thyroid Dysfunction (Hypothyroidism)**
   - **Connection to Symptoms:** An underactive thyroid gland can cause generalized fatigue and muscle weakness.
   - **Potential Solution:** Evaluate thyroid function through blood tests (TSH, Free T4). Treat with thyroid hormone replacement therapy if necessary.

6. **Chronic Fatigue Syndrome (CFS)**
   - **Connection to Symptoms:** CFS can be characterized by severe, ongoing fatigue that is not alleviated by rest, along with muscle weakness.
   - **Potential Solution:** Diagnosis is often one of exclusion. Treatment focuses on managing symptoms through lifestyle changes, cognitive behavioral therapy, and sometimes medications.

7. **Blood Sugar Imbalances (Hypoglycemia)**
   - **Connection to Symptoms:** Low blood sugar levels in the morning could cause both lethargy and muscle weakness.
   - **Potential Solution:** Check fasting blood glucose levels. Recommend dietary changes to stabilize blood sugar, possibly including more frequent, balanced meals.

8. **Dehydration**
   - **Connection to Symptoms:** Even slight dehydration can cause fatigue and muscle weakness.
   - **Potential Solution:** Review hydration habits; encourage increased water intake and monitor hydration status.

9. **Overtraining or Muscle Overuse**
   - **Connection to Symptoms:** Excessive physical activity, especially without adequate recovery, can lead to muscle fatigue and weakness.
   - **Potential Solution:** Evaluate exercise routines. Recommend a balanced program with adequate rest and recovery periods.

10. **Depression or Anxiety**
    - **Connection to Symptoms:** Mental health conditions like depression and anxiety can manifest as physical symptoms like fatigue and muscle weakness.
    - **Potential Solution:** Conduct a mental health evaluation. Consider counseling, therapy, or medication as appropriate.

The doctor may want to prioritize certain tests or inquiries based on the patient's history, overall lifestyle, and specific symptom presentation.
***
###USER
How about magnesium defficiency?
***
###ASSISTANT
Certainly, magnesium deficiency is a plausible underlying cause worth consideration. Here are the details specific to magnesium deficiency:

### **Magnesium Deficiency**

- **Connection to Symptoms:**
  - **Morning Lethargy:** Magnesium is crucial for energy production and its deficiency can lead to fatigue and a feeling of sluggishness, particularly noticeable in the morning.
  - **Muscle Weakness:** Magnesium plays an essential role in muscle contraction and relaxation. Low levels can result in muscle weakness, cramps, and reduced grip strength, especially noticeable after waking.

- **Potential Solution:**
  - **Diagnostic Tests:** Conduct serum magnesium tests to confirm deficiency. Sometimes intracellular magnesium levels might need to be tested if serum levels are inconclusive.
  - **Dietary Adjustments:** Recommend an increase in magnesium-rich foods such as leafy green vegetables (spinach, kale), nuts and seeds (almonds, pumpkin seeds), whole grains (brown rice, quinoa), and legumes (black beans, chickpeas).
  - **Supplements:** Consider magnesium supplements (e.g., magnesium citrate or magnesium glycinate) if dietary adjustments are insufficient. Ensure the patient understands appropriate dosing to avoid potential side effects, such as diarrhea.
  - **Lifestyle Adjustments:** Advise on reducing intake of substances that can deplete magnesium levels, such as caffeine, alcohol, and high levels of dietary calcium. Encourage stress-reducing activities as chronic stress can lower magnesium levels.
  - **Monitoring:** Schedule follow-ups to reassess magnesium levels and adjust the plan as necessary based on symptom improvement and test results.

Considering magnesium's role in overall health, especially in muscle function and energy metabolism, addressing and correcting a deficiency can significantly alleviate the patient’s symptoms.

#### Sys prompt

In [None]:
aussie_sys = "You are an Aussie LLM that uses Aussie slang and analogies whenever possible."
question = "What is money?"

c = ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": aussie_sys},
              {"role": "user", "content": question}])

- [Model options](https://platform.openai.com/docs/models)

In [None]:
def response(c):
    "Return the generated text from a chat or plain completion object."
    if isinstance(c, openai.types.chat.chat_completion.ChatCompletion):
        return c.choices[0].message.content
    elif isinstance(c, openai.types.completion.Completion):
        return c.choices[0].text
    else: raise TypeError(f"input should be of type <Completion> or <ChatCompletion>, but it is of type {type(c)}")

In [None]:
response(c)

"Oh, let me explain, cobber! Money is like kangaroos because just like roos, it hops around from one place to another, helping you buy stuff and trade goods. It's the medium of exchange that lets you get your hands on snags at the local barbie, a cold one at the pub, or even a fancy dunny for your backyard. Without money, life would be like a dingo without a bone, mate. It's what keeps our economy bouncing like a cheeky kangaroo!"

#### Usage

In [None]:
print(c.usage)

CompletionUsage(completion_tokens=122, prompt_tokens=31, total_tokens=153)


In [None]:
0.002 / 1000 * 150  # GPT 3.5

0.0003

In [None]:
0.03 / 1000 * 150  # GPT 4

0.0045
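The two back-of-the-envelope calculations above can be wrapped in a small helper. This is only a sketch: `PRICE_PER_1K` and `estimate_cost` are hypothetical names, and the rates are the flat per-1K-token figures used in the cells above, which may be out of date and ignore the difference between prompt and completion pricing.

```python
# Flat per-1K-token rates copied from the cells above; treat them as
# illustrative, since real prices change and differ for input vs. output.
PRICE_PER_1K = {"gpt-3.5-turbo": 0.002, "gpt-4": 0.03}

def estimate_cost(total_tokens, model="gpt-3.5-turbo"):
    "Rough USD cost of a call that used `total_tokens` tokens."
    return PRICE_PER_1K[model] / 1000 * total_tokens

# Using the total_tokens=153 reported by c.usage earlier
print(f"{estimate_cost(153):.6f}")
print(f"{estimate_cost(153, 'gpt-4'):.6f}")
```

In practice one would pass `c.usage.total_tokens` instead of a literal.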

#### askgpt

In [None]:
c = ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "system", "content": aussie_sys},
              {"role": "user", "content": "What is money?"},
              {"role": "assistant", "content": "Well, mate, money is like kangaroos actually."},
              {"role": "user", "content": "Really? In what way?"}])

In [None]:
type(c)

openai.types.chat.chat_completion.ChatCompletion

In [None]:
response(c)

"Oh, let me explain, cobber! Money is like kangaroos because just like roos, it hops around from one place to another, helping you buy stuff and trade goods. It's the medium of exchange that lets you get your hands on snags at the local barbie, a cold one at the pub, or even a fancy dunny for your backyard. Without money, life would be like a dingo without a bone, mate. It's what keeps our economy bouncing like a cheeky kangaroo!"

In [None]:
def askgpt(user, system=None, model="gpt-3.5-turbo", **kwargs):
    msgs = []
    if system: msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": user})
    return ChatCompletion.create(model=model, messages=msgs, **kwargs)

In [None]:
response(askgpt('What is the meaning of life?', system=aussie_sys))

"Ah, the meaning of life, mate. Now that's a big question, isn't it? Just like trying to catch a wave, it can feel like you're paddling out without really knowing where you're headed. But let me give it a crack, alright?\n\nThe meaning of life is a bit like a barbie, a sausage sizzle where everyone gathers 'round for a good feed. It's about finding your purpose, what brings you true happiness and fulfillment. Some folks reckon it's all about chasing success, racking up achievements and making heaps of money. But to others, it's about having meaningful connections with loved ones, making a positive impact on others, or maybe even discovering something deeper, like exploring the great outdoors or indulging in a passion that sets your soul on fire.\n\nYou see, life, it's a journey, a bit like a road trip along the Great Ocean Road. It's about experiencing the highs and lows, the twists and turns, and savoring every moment along the way. It's about learning, growing, and embracing all the 

- [Limits](https://platform.openai.com/docs/guides/rate-limits/what-are-the-rate-limits-for-our-api)

Created by Bing:

In [None]:
import time

def call_api(prompt, model="gpt-3.5-turbo"):
    msgs = [{"role": "user", "content": prompt}]
    try: return ChatCompletion.create(model=model, messages=msgs)
    except openai.RateLimitError as e:
        # Honor the server's retry-after header if present, else wait 60 seconds
        retry_after = int(e.response.headers.get("retry-after", 60))
        print(f"Rate limit exceeded, waiting for {retry_after} seconds...")
        time.sleep(retry_after)
        return call_api(prompt, model=model)
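`call_api` retries by calling itself, with no limit on the number of attempts. A capped loop with exponential backoff is a common alternative. The sketch below is independent of the OpenAI client: `TransientError`, `with_retries` and `flaky` are hypothetical names, and `TransientError` merely stands in for whatever rate-limit exception the client raises.

```python
import time

class TransientError(Exception):
    "Hypothetical stand-in for a rate-limit exception raised by an API client."

def with_retries(fn, max_tries=4, base_delay=1.0, sleep=time.sleep):
    "Call `fn()`; on TransientError wait base_delay * 2**attempt, then retry."
    for attempt in range(max_tries):
        try:
            return fn()
        except TransientError:
            if attempt == max_tries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt)

# Demo with a fake call that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return "ok"

waits = []
result = with_retries(flaky, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)
```

Passing `sleep` as a parameter keeps the demo instant; a real call would leave the default `time.sleep`.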

In [None]:
response(call_api("What's the world's funniest joke? Has there ever been any scientific analysis?"))

"Humor is subjective and what one person finds funny, another may not. Therefore, it is difficult to determine the world's funniest joke as it varies from individual to individual and culture to culture.\n\nHowever, there have been scientific studies conducted to analyze comedy and humor, generally focusing on factors like surprise, incongruity, and superiority theory. These studies aim to understand how and why people find certain jokes funny.\n\nAn example of a scientific analysis of humor is the study conducted by Dr. Richard Wiseman in 2002. He organized the LaughLab project, where people from around the world submitted jokes and rated the funniness of others' jokes. While the study provided interesting insights into different cultural perspectives on humor, it did not conclusively determine the world's funniest joke.\n\nUltimately, humor is a complex and multifaceted concept, and determining a single joke as the funniest is not feasible due to individual preferences and cultural d

In [None]:
c = Completion.create(
    prompt="Australian Jeremy Howard is ",
    model="gpt-3.5-turbo-instruct", echo=True)

In [None]:
response(c)

"Australian Jeremy Howard is 5/3 favourite to win next week's BBBOTS by Skybet.\n\nThe"

In [None]:
story = '''I wake up this morning from a strange dream of cold '''

In [None]:
c = Completion.create(
    prompt=story,
    model="gpt-3.5-turbo-instruct", echo=False, max_tokens=30)

In [None]:
print('...', response(c))

...  winter night , walking through a snow-covered forest. The sky was dark and the moon was hidden behind thick clouds, casting an eerie glow on the


## OpenAI API with code interpreter

In [None]:
from pydantic import create_model
import inspect, json
from inspect import Parameter

#### Example with sums function

In [None]:
def sums(a:int, b:int=1):
    "Adds a + b"
    return a + b

In [None]:
def schema(f):
    kw = {n:(o.annotation, ... if o.default==Parameter.empty else o.default)
          for n,o in inspect.signature(f).parameters.items()}
    s = create_model(f'Input for `{f.__name__}`', **kw).schema()
    return dict(name=f.__name__, description=f.__doc__, parameters=s)

In [None]:
schema(sums)

{'name': 'sums',
 'description': 'Adds a + b',
 'parameters': {'properties': {'a': {'title': 'A', 'type': 'integer'},
   'b': {'default': 1, 'title': 'B', 'type': 'integer'}},
  'required': ['a'],
  'title': 'Input for `sums`',
  'type': 'object'}}
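To see what `schema` is doing, here is a rough pydantic-free sketch that builds a similar dictionary straight from the function signature. `TYPE_NAMES` and `simple_schema` are hypothetical names; the mapping covers only a few primitive annotations, and the result omits the `title` fields that pydantic adds.

```python
import inspect

# Assumed mapping from a few Python annotations to JSON Schema type names.
TYPE_NAMES = {int: "integer", float: "number", str: "string", bool: "boolean"}

def simple_schema(f):
    "Build a function-calling style schema for `f` from its signature."
    props, required = {}, []
    for name, p in inspect.signature(f).parameters.items():
        prop = {"type": TYPE_NAMES[p.annotation]}
        if p.default is inspect.Parameter.empty:
            required.append(name)  # no default: the argument is mandatory
        else:
            prop["default"] = p.default
        props[name] = prop
    params = {"type": "object", "properties": props, "required": required}
    return {"name": f.__name__, "description": f.__doc__, "parameters": params}

def sums(a: int, b: int = 1):
    "Adds a + b"
    return a + b

print(simple_schema(sums))
```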

In [None]:
c = askgpt("Use the `sums` function to solve this: What is 6+3?",
           system = "You must use the `sums` function instead of adding yourself.",
           functions=[schema(sums)])

In [None]:
c

ChatCompletion(id='chatcmpl-8pdGB8YlRQq2XU7XviPWykf4mG2oc', choices=[Choice(finish_reason='function_call', index=0, logprobs=None, message=ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{\n  "a": 6,\n  "b": 3\n}', name='sums'), tool_calls=None))], created=1707316335, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=22, prompt_tokens=83, total_tokens=105))

In [None]:
m = c.choices[0].message
m

ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{\n  "a": 6,\n  "b": 3\n}', name='sums'), tool_calls=None)

In [None]:
k = m.function_call.arguments
print(k)

{
  "a": 6,
  "b": 3
}


In [None]:
funcs_ok = {'sums', 'python'}

In [None]:
c.choices[0].message.function_call.arguments

'import math\nmath.factorial(12)'

In [None]:
def call_func(c):
    fc = c.choices[0].message.function_call
    if fc.name not in funcs_ok: return print(f'Not allowed: {fc.name}')
    f = globals()[fc.name]
    try:
        return f(**json.loads(fc.arguments))
    except ValueError:
        return f(fc.arguments)

In [None]:
call_func(c)

479001600
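`call_func` checks the name against `funcs_ok` and then looks the function up in `globals()`. One possible variation keeps the allow-list and the lookup in a single dictionary. This is a sketch only: `REGISTRY` and `dispatch` are hypothetical names, and the name and arguments are passed as plain strings rather than taken from a real completion object.

```python
import json

def sums(a: int, b: int = 1):
    "Adds a + b"
    return a + b

# Only functions listed here can be invoked on the model's behalf.
REGISTRY = {"sums": sums}

def dispatch(name, arguments):
    "Run an allow-listed function with JSON-encoded keyword arguments."
    if name not in REGISTRY:
        raise PermissionError(f"Not allowed: {name}")
    return REGISTRY[name](**json.loads(arguments))

print(dispatch("sums", '{"a": 6, "b": 3}'))  # 9
```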

In [None]:
import ast

def run(code):
    tree = ast.parse(code)
    last_node = tree.body[-1] if tree.body else None

    # If the last node is an expression, modify the AST to capture the result
    if isinstance(last_node, ast.Expr):
        tgts = [ast.Name(id='_result', ctx=ast.Store())]
        assign = ast.Assign(targets=tgts, value=last_node.value)
        tree.body[-1] = ast.fix_missing_locations(assign)

    ns = {}
    exec(compile(tree, filename='<ast>', mode='exec'), ns)
    return ns.get('_result', None)

In [None]:
run("""
a=1
b=2
a+b
""")

3
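The same result can be obtained without rewriting the last node into an assignment: split off the final expression, `exec` everything before it, and `eval` the expression in the same namespace. The sketch below shows that variant; `run_last_expr` is a hypothetical name, not part of the notebook.

```python
import ast

def run_last_expr(code):
    "Execute `code`; return the value of its final expression, or None."
    tree = ast.parse(code)
    ns = {}
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Detach the trailing expression so it can be evaluated separately
        last = ast.Expression(body=tree.body.pop().value)
        exec(compile(tree, "<ast>", "exec"), ns)
        return eval(compile(last, "<ast>", "eval"), ns)
    exec(compile(tree, "<ast>", "exec"), ns)
    return None  # code ended with a statement, so there is no value to return

print(run_last_expr("a=1\nb=2\na+b"))  # 3
```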

In [None]:
def python(code:str, safe_mode:bool=False):
    "Return result of executing `code` using python. If execution not permitted, returns `#FAIL#`"
    if safe_mode:
        go = input(f'Proceed with execution?\n```\n{code}\n```\n')
        if go.lower()!='y': return '#FAIL#'
    return run(code)

In [None]:
c = askgpt("What is 12 factorial?",
           system = "Use python for any required computations.",
           functions=[schema(python)])

In [None]:
def code_response(c, repl=True):
    txt_out = response(c)
    if txt_out is None: txt_out = ''
    msg = c.choices[0].message
    # The attribute exists but is None when the model answered in plain text
    if getattr(msg, 'function_call', None) is None:
        print(txt_out)
        return  # No code output
    code = msg.function_call.arguments
    if code[0] == '{': code = json.loads(code)['code']
    txt_out += f'\n==========\n{code}\n==========\n>>> '
    result = run(code)
    print(txt_out)
    return result

In [None]:
call_func(c)

479001600

In [None]:
code_response(c)


import math

factorial = math.factorial(12)
factorial
>>> 


479001600

#### Using Python's exec & eval

In [None]:
exec_schema = {
    'name': 'exec',
    'description': 'Execute the given source Python code',
    'parameters': {
        'title': 'Input for `exec`',
        'type': 'object',
        'properties': {'source': {'title': 'S', 'type': 'string'}},
        'required': ['source']}}

In [None]:
def code_response(c, repl=True):
    txt_out = response(c)
    if txt_out is None: txt_out = ''
    msg = c.choices[0].message
    if getattr(msg, 'function_call', None) is None:
        print(txt_out)
        return  # No code output
    code = msg.function_call.arguments
    if code[0] == '{': code = json.loads(code)['source']
    code = code.strip()  # A trailing newline would leave an empty last line
    txt_out += f'\n==========\n{code}\n==========\n>>> '
    ns = {}  # One shared namespace, so names defined by exec are visible to eval
    if repl:
        # Run all lines but the last, then evaluate the last line as an expression
        *body, footer = code.split('\n')
        exec('\n'.join(body), ns)
        result = eval(footer, ns)
        txt_out += str(result)
    else:
        exec(code, ns)
        result = None
    print(txt_out)
    return result

In [None]:
c = askgpt("What is 12 factorial?",
           system = "Use python for any required computations.",
           functions=[exec_schema])

In [None]:
factorial_result = code_response(c, repl=True)

The factorial of a number n, denoted as n!, is the product of all positive integers less than or equal to n. 

The factorial of 12, written as 12!, is:

12! = 12 * 11 * 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1

To calculate this, we can use Python.
import math

factorial = math.factorial(12)
factorial
>>> 479001600


In [None]:
factorial_result

479001600

In [None]:
def ask_python(s):
    c = askgpt(s,
           system = "Use python for any required computations.",
           functions=[exec_schema])
    result = code_response(c, repl=True)
    return result

In [None]:
prime_question = "What is the largest prime less than 434?"
prime_result = ask_python(prime_question)


import sympy
sympy.prevprime(434)
>>> 433


###### Using result in later calls

In [None]:
c = ChatCompletion.create(
    model="gpt-3.5-turbo",
    functions=[exec_schema],
    messages=[{"role": "user", "content": prime_question},
              {"role": "function", "name": "exec", "content": str(prime_result)}])

In [None]:
print(response(c))

The largest prime number less than 434 is 433.


###### And we didn't break its basic use!

In [None]:
c = askgpt("What is the capital of France?",
           system = "Use python for any required computations.",
           functions=[exec_schema])

In [None]:
code_response(c)

The capital of France is Paris.


## Prediction from current execution!

Comments are not saved in `history_manager`, so each cell's description has to be a string literal at the start of the cell.

In [None]:
def get_history(): return get_ipython().history_manager.input_hist_raw
def clear_history(): get_ipython().history_manager.input_hist_raw = []

In [None]:
def complete_cell():
    _func_call = '\n' + complete_cell.__name__ + '()'
    raw = get_history()
    nb = '\n\n\n#! CELL\n\n'.join(raw).split(_func_call)[0]
    c = Completion.create(prompt=nb,
                      model="gpt-3.5-turbo-instruct",
                      echo=False,
                      best_of=1,
                      stop= "#! CELL")
    return response(c).strip()

In [None]:
clear_history()

In [None]:
'''Characters'''
'abcdefghijklmnopqrstuvwxyz'

'abcdefghijklmnopqrstuvwxyz'

In [None]:
'''Chinese characters'''
print(complete_cell())

中国字
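The prompt-building step of `complete_cell` can be examined on its own, without calling the API. In this sketch `SEP` and `build_prompt` are hypothetical names, and the `cells` list stands in for `input_hist_raw`. Note that the cut marker starts with a newline, so it only matches when the call begins a line of its own.

```python
SEP = "\n\n\n#! CELL\n\n"  # same separator string that complete_cell uses

def build_prompt(cells, call="\ncomplete_cell()"):
    "Join cell sources and drop everything from the completion call onward."
    return SEP.join(cells).split(call)[0]

# Stand-in for the notebook history: two cells, the second one asks for a completion.
cells = [
    "'''Characters'''\n'abcdefghijklmnopqrstuvwxyz'",
    "'''Chinese characters'''\ncomplete_cell()",
]
print(build_prompt(cells))
```

The model then sees every earlier cell plus the docstring of the current one, and the `#! CELL` stop sequence keeps it from writing further cells.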


## Open source models

In [None]:
!pip install transformers -qq
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

- [HF leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
- [fasteval](https://fasteval.github.io/FastEval/)

In [None]:
mn = "meta-llama/Llama-2-7b-hf"

In [None]:
tokr = AutoTokenizer.from_pretrained(mn)
prompt = "Jeremy Howard is a "
toks = tokr(prompt, return_tensors="pt")

In [None]:
toks

In [None]:
tokr.batch_decode(toks['input_ids'])

In [None]:
# model = AutoModelForCausalLM.from_pretrained(mn, device_map=0, load_in_8bit=True)

In [None]:
# %%time
# res = model.generate(**toks.to("cuda"), max_new_tokens=15).to('cpu')
# res

In [None]:
# tokr.batch_decode(res)

In [None]:
# model = AutoModelForCausalLM.from_pretrained(mn, device_map=0, torch_dtype=torch.bfloat16)

In [None]:
# %%time
# res = model.generate(**toks.to("cuda"), max_new_tokens=15).to('cpu')
# res

In [None]:
model = AutoModelForCausalLM.from_pretrained('TheBloke/Llama-2-7b-Chat-GPTQ', device_map=0, torch_dtype=torch.float16)

In [None]:
%%time
res = model.generate(**toks.to("cuda"), max_new_tokens=15).to('cpu')
res

In [None]:
# mn = 'TheBloke/Llama-2-13B-GPTQ'
# model = AutoModelForCausalLM.from_pretrained(mn, device_map=0, torch_dtype=torch.float16)

In [None]:
# %%time
# res = model.generate(**toks.to("cuda"), max_new_tokens=15).to('cpu')
# res

In [None]:
def gen(p, maxlen=15, sample=True):
    toks = tokr(p, return_tensors="pt")
    res = model.generate(**toks.to("cuda"), max_new_tokens=maxlen, do_sample=sample).to('cpu')
    return tokr.batch_decode(res)

In [None]:
gen(prompt, 50)

#### [StableBeluga-7B](https://huggingface.co/stabilityai/StableBeluga-7B)

In [None]:
del model

In [None]:
mn = "stabilityai/StableBeluga-7B"
model = AutoModelForCausalLM.from_pretrained(mn, device_map=0, torch_dtype=torch.bfloat16)

In [None]:
sb_sys = "You are Stable Beluga, an AI that follows instructions extremely well. Help as much as you can.\n\n"

In [None]:
def mk_prompt(user, syst=sb_sys): return f"### System:\n{syst}\n### User:\n{user}\n\n### Assistant:\n"

In [None]:
ques = "Who is Jeremy Howard?"

In [None]:
gen(mk_prompt(ques), 150)

#### [OpenOrca/Platypus 2](https://huggingface.co/Open-Orca/OpenOrca-Platypus2-13B)

In [None]:
mn = 'TheBloke/OpenOrca-Platypus2-13B-GPTQ'
model = AutoModelForCausalLM.from_pretrained(mn, device_map=0, torch_dtype=torch.float16)

In [None]:
def mk_oo_prompt(user): return f"### Instruction: {user}\n\n### Response:\n"

In [None]:
gen(mk_oo_prompt(ques), 150)

#### Retrieval augmentation

In [None]:
import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
!pip install wikipedia-api
from wikipediaapi import Wikipedia

In [None]:
wiki = Wikipedia('JeremyHowardBot/0.0', 'en')
jh_page = wiki.page('Jeremy_Howard_(entrepreneur)').text
jh_page = jh_page.split('\nReferences\n')[0]

In [None]:
print(jh_page[:500])

In [None]:
len(jh_page.split())

In [None]:
def mk_prompt_context(question, context):
    return f"""Answer the question with the help of the provided context.\n\n## Context\n\n{context}\n\n## Question\n\n{question}\n\n## Answer\n"""

In [None]:
res = gen(mk_prompt_context(ques, jh_page), 300)

In [None]:
print(res[0].split('## Answer\n')[1])

In [None]:
!pip install sentence-transformers -qq
from sentence_transformers import SentenceTransformer

In [None]:
emb_model = SentenceTransformer("BAAI/bge-small-en-v1.5", device=0)

In [None]:
jh = jh_page.split('\n\n')[0]
print(jh)

In [None]:
tb_page = wiki.page('Tony_Blair').text.split('\nReferences\n')[0]

In [None]:
tb = tb_page.split('\n\n')[0]
print(tb[:380])

In [None]:
q_emb,jh_emb,tb_emb = emb_model.encode([ques,jh,tb], convert_to_tensor=True)

In [None]:
tb_emb.shape

In [None]:
import torch.nn.functional as F

In [None]:
F.cosine_similarity(q_emb, jh_emb, dim=0)

In [None]:
F.cosine_similarity(q_emb, tb_emb, dim=0)
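For two 1-D vectors, cosine similarity is the dot product divided by the product of the norms, and retrieval amounts to picking the document whose embedding scores highest against the question. Here is a pure-Python sketch of that idea. `cosine` and `best_match` are hypothetical names, and the 3-d vectors are made-up stand-ins, not real embeddings.

```python
import math

def cosine(u, v):
    "Cosine similarity of two equal-length vectors."
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def best_match(query, docs):
    "Return the key of the document vector most similar to `query`."
    return max(docs, key=lambda k: cosine(query, docs[k]))

# Toy vectors standing in for q_emb, jh_emb and tb_emb; the values are invented.
q = [1.0, 0.0, 1.0]
docs = {"jh": [0.9, 0.1, 0.8], "tb": [0.0, 1.0, 0.1]}
print(best_match(q, docs))  # jh
```

The text of the winning document would then be passed as `context` to `mk_prompt_context`.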

### Private GPTs

- [Sooo many](https://github.com/h2oai/h2ogpt/blob/main/docs/README_LangChain.md#what-is-h2ogpts-langchain-integration-like)

## Fine tuning

In [None]:
import datasets

[knowrohit07/know_sql](https://huggingface.co/datasets/knowrohit07/know_sql)

In [None]:
ds = datasets.load_dataset('knowrohit07/know_sql', revision='f33425d13f9e8aab1b46fa945326e9356d6d5726')

In [None]:
ds

In [None]:
trn = ds['train']
trn[3]

`accelerate launch -m axolotl.cli.train sql.yml`

In [None]:
tst = dict(**trn[3])
tst['question'] = 'Get the count of competition hosts by theme.'
tst

In [None]:
fmt = """SYSTEM: Use the following contextual information to concisely answer the question.

USER: {}
===
{}
ASSISTANT:"""

In [None]:
def sql_prompt(d): return fmt.format(d["context"], d["question"])

In [None]:
print(sql_prompt(tst))

In [None]:
import torch
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

In [None]:
ax_model = '/home/jhoward/git/ext/axolotl/qlora-out'

In [None]:
tokr = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-hf')

In [None]:
model = AutoModelForCausalLM.from_pretrained('meta-llama/Llama-2-7b-hf',
                                             torch_dtype=torch.bfloat16, device_map=0)

# TODO: Get config to Drive
model = PeftModel.from_pretrained(model, ax_model)
model = model.merge_and_unload()
model.save_pretrained('sql-model')

In [None]:
toks = tokr(sql_prompt(tst), return_tensors="pt")

In [None]:
res = model.generate(**toks.to("cuda"), max_new_tokens=250).to('cpu')

In [None]:
print(tokr.batch_decode(res)[0])

## [llama.cpp](https://github.com/abetlen/llama-cpp-python)

[TheBloke/Llama-2-7b-Chat-GGUF](https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF)

In [None]:
!pip install llama_cpp_python -qq
from llama_cpp import Llama

In [None]:
!wget https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q4_K_M.gguf

In [None]:
llm = Llama(model_path="/content/llama-2-7b-chat.Q4_K_M.gguf")

In [None]:
output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)

In [None]:
print(output['choices'])