Merge pull request #11 from ericmjl:ollama
Ollama
ericmjl committed Oct 29, 2023
2 parents e13800d + d10e7e1 commit 1f090b8
Showing 9 changed files with 230 additions and 31 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/code-style.yaml
@@ -9,4 +9,6 @@ jobs:
  steps:
    - uses: actions/checkout@v2
    - uses: actions/setup-python@v2
+     with:
+       python-version: 3.11
    - uses: pre-commit/action@v2.0.0
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -6,7 +6,7 @@ repos:
- id: end-of-file-fixer
- id: trailing-whitespace
- repo: https://github.com/psf/black
-  rev: 23.10.0
+  rev: 23.10.1
hooks:
- id: black
- repo: https://github.com/kynan/nbstripout
@@ -26,7 +26,7 @@ repos:
- "--config=pyproject.toml"
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
-  rev: v0.1.1
+  rev: v0.1.3
hooks:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
74 changes: 64 additions & 10 deletions docs/index.md
@@ -15,23 +15,28 @@ To install LLaMaBot:
pip install llamabot
```

-## How to use
+## Get access to LLMs

+### Option 1: Using local models with Ollama

-### Obtain an OpenAI API key
+LlamaBot supports using local models through Ollama.
+To do so, head over to the [Ollama website](https://ollama.ai) and install Ollama.
+Then follow the instructions below.
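For example, after installing Ollama you can pull a model locally before pointing LlamaBot at it (the model name here is just an illustration; any model from the Ollama library works):

```shell
# Download a model from the Ollama library (one-time, several GB).
ollama pull llama2:13b

# Optional: sanity-check the model from the command line.
ollama run llama2:13b "Say hello in one sentence."
```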

-Obtain an OpenAI API key and set it as the environment variable `OPENAI_API_KEY`.
-(Here's a [reference][envvar] on what an environment variable is, if you're not sure.)
+### Option 2: Use the OpenAI API

-[envvar]: https://ericmjl.github.io/essays-on-data-science/software-skills/environment-variables/
+Obtain an OpenAI API key, then configure LlamaBot to use the API key by running:

-We recommend setting the environment variable in a `.env` file
-in the root of your project repository.
-From there, `llamabot` will automagically load the environment variable for you.
+```bash
+llamabot configure
+```

+## How to use

-### Simple Bot
+### SimpleBot

The simplest use case of LLaMaBot
-is to create a simple bot that keeps no record of chat history.
+is to create a `SimpleBot` that keeps no record of chat history.
This is effectively the same as a _stateless function_
that you program with natural language instructions rather than code.
This is useful for prompt experimentation,
@@ -53,6 +58,55 @@ For example:
feynman("Enzyme function annotation is a fundamental challenge, and numerous computational tools have been developed. However, most of these tools cannot accurately predict functional annotations, such as enzyme commission (EC) number, for less-studied proteins or those with previously uncharacterized functions or multiple activities. We present a machine learning algorithm named CLEAN (contrastive learning–enabled enzyme annotation) to assign EC numbers to enzymes with better accuracy, reliability, and sensitivity compared with the state-of-the-art tool BLASTp. The contrastive learning framework empowers CLEAN to confidently (i) annotate understudied enzymes, (ii) correct mislabeled enzymes, and (iii) identify promiscuous enzymes with two or more EC numbers—functions that we demonstrate by systematic in silico and in vitro experiments. We anticipate that this tool will be widely used for predicting the functions of uncharacterized enzymes, thereby advancing many fields, such as genomics, synthetic biology, and biocatalysis.")
```

This will return something that looks like:

```text
Alright, let's break this down.
Enzymes are like little biological machines that help speed up chemical reactions in our
bodies. Each enzyme has a specific job, or function, and we use something called an
Enzyme Commission (EC) number to categorize these functions.
Now, the problem is that we don't always know what function an enzyme has, especially if
it's a less-studied or new enzyme. This is where computational tools come in. They try
to predict the function of these enzymes, but they often struggle to do so accurately.
So, the folks here have developed a new tool called CLEAN, which stands for contrastive
learning–enabled enzyme annotation. This tool uses a machine learning algorithm, which
is a type of artificial intelligence that learns from data to make predictions or
decisions.
CLEAN uses a method called contrastive learning. Imagine you have a bunch of pictures of
cats and dogs, and you want to teach a machine to tell the difference. You'd show it
pairs of pictures, some of the same animal (two cats or two dogs) and some of different
animals (a cat and a dog). The machine would learn to tell the difference by contrasting
the features of the two pictures. That's the basic idea behind contrastive learning.
CLEAN uses this method to predict the EC numbers of enzymes more accurately than
previous tools. It can confidently annotate understudied enzymes, correct mislabeled
enzymes, and even identify enzymes that have more than one function.
The creators of CLEAN have tested it with both computer simulations and lab experiments,
and they believe it will be a valuable tool for predicting the functions of unknown
enzymes. This could have big implications for fields like genomics, synthetic biology,
and biocatalysis, which all rely on understanding how enzymes work.
```
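The stateless-function idea behind `SimpleBot` can be sketched without any LLM dependency: each call sends only the system prompt plus the new message, never any prior history. The `make_simple_bot` helper and stub completion function below are illustrative, not llamabot's real API:

```python
def make_simple_bot(system_prompt, complete):
    """Return a stateless bot: no chat history is kept between calls."""

    def bot(human_message):
        # Every call rebuilds the message list from scratch -- statelessness.
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": human_message},
        ]
        return complete(messages)

    return bot


# Stub completion function standing in for a real LLM call (illustrative).
def echo_complete(messages):
    return f"[{messages[0]['content']}] {messages[-1]['content']}"


feynman = make_simple_bot("You are Richard Feynman.", echo_complete)
print(feynman("Explain enzymes simply."))
```

Because `bot` closes over only the system prompt, two calls with the same message always produce the same request — the "stateless function programmed with natural language" described above.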

LlamaBot defaults to using the OpenAI API for convenience.
However, if you'd like to use an Ollama local model instead:

```python
from llamabot import SimpleBot
bot = SimpleBot(
    "You are Richard Feynman. You will be given a difficult concept, and your task is to explain it back.",
    model_name="llama2:13b"
)
```

Simply specify the `model_name` keyword argument
and provide a model name from the [Ollama library of models](https://ollama.ai/library).
(The same can be done for the `ChatBot` and `QueryBot` classes below!)

### Chat Bot

To experiment with a Chat Bot in the Jupyter notebook,
22 changes: 18 additions & 4 deletions llamabot/bot/model_dispatcher.py
@@ -12,6 +12,7 @@
from langchain.callbacks.base import BaseCallbackManager
from time import sleep
from loguru import logger
+from functools import partial

# get this list from: https://ollama.ai/library
ollama_model_keywords = [
@@ -56,19 +57,32 @@ def create_model(
This is necessary to validate b/c LangChain doesn't do the validation for us.
Example usage:
```python
-# use the vicuna model
-model = create_model(model_name="vicuna")
+# use the llama2 model
+model = create_model("llama2")
+# use codellama with a temperature of 0.5
+model = create_model("codellama:13b", temperature=0.5)
```
:param model_name: The name of the model to use.
:param temperature: The model temperature to use.
:param streaming: (LangChain config) Whether to stream the output to stdout.
:param verbose: (LangChain config) Whether to print debug messages.
:return: The model.
"""
-    ModelClass = ChatOpenAI
+    # We use a `partial` here to ensure that we have the correct way of specifying
+    # a model name between ChatOpenAI and ChatOllama.
+    ModelClass = partial(ChatOpenAI, model_name=model_name)
     if model_name.split(":")[0] in ollama_model_keywords:
-        ModelClass = ChatOllama
         launch_ollama(model_name, verbose=verbose)
+        ModelClass = partial(ChatOllama, model=model_name)

     return ModelClass(
-        model_name=model_name,
         temperature=temperature,
         streaming=streaming,
         verbose=verbose,
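The `partial`-based dispatch in `create_model` can be sketched in isolation. Below, stub classes stand in for LangChain's `ChatOpenAI` and `ChatOllama`, and the keyword list is abbreviated — all names here are illustrative, not the library's real classes:

```python
from functools import partial


# Stub stand-ins for LangChain's ChatOpenAI / ChatOllama (illustrative only).
class StubChatOpenAI:
    def __init__(self, model_name, temperature=0.0):
        self.model_name = model_name
        self.temperature = temperature


class StubChatOllama:
    def __init__(self, model, temperature=0.0):
        self.model = model
        self.temperature = temperature


# Abbreviated; the real list comes from https://ollama.ai/library
ollama_model_keywords = ["llama2", "codellama", "vicuna"]


def create_model(model_name, temperature=0.0):
    # Bind the model name via `partial` because the two classes spell the
    # keyword differently: `model_name=` for OpenAI, `model=` for Ollama.
    ModelClass = partial(StubChatOpenAI, model_name=model_name)
    # "llama2:13b" -> "llama2": the text before ":" identifies the model family.
    if model_name.split(":")[0] in ollama_model_keywords:
        ModelClass = partial(StubChatOllama, model=model_name)
    return ModelClass(temperature=temperature)


print(type(create_model("gpt-4")).__name__)       # -> StubChatOpenAI
print(type(create_model("llama2:13b")).__name__)  # -> StubChatOllama
```

Binding the name into the class with `partial` lets the final `ModelClass(...)` call pass only the shared keyword arguments, which is exactly why the explicit `model_name=model_name` line disappears from the `return` statement in the diff.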
1 change: 0 additions & 1 deletion llamabot/bot/querybot.py
@@ -139,7 +139,6 @@ def __init__(
self.response_tokens = response_tokens
self.history_tokens = history_tokens

-    # @validate_call
def __call__(
self,
query: str,
1 change: 1 addition & 0 deletions llamabot/doc_processor.py
@@ -14,6 +14,7 @@
".xlsx": "PandasExcelReader",
".md": "MarkdownReader",
".ipynb": "IPYNBReader",
+    ".html": "UnstructuredReader",
}
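The extension table above lends itself to a simple suffix-based dispatch. Here is a minimal sketch of the pattern — the reader names mirror the table, but the `reader_for` helper is hypothetical, not part of llamabot:

```python
from pathlib import Path

# Mirrors the extension-to-reader table in doc_processor.py (abbreviated).
EXTENSION_LOADER_MAPPING = {
    ".xlsx": "PandasExcelReader",
    ".md": "MarkdownReader",
    ".ipynb": "IPYNBReader",
    ".html": "UnstructuredReader",
}


def reader_for(path):
    """Pick a reader name by file suffix (hypothetical helper, for illustration)."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_LOADER_MAPPING[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix!r}")


print(reader_for("notes.html"))  # -> UnstructuredReader
```

A dict keyed on `Path.suffix` keeps adding a new file type — like the `.html` entry in this commit — to a one-line change.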


12 changes: 6 additions & 6 deletions scratch_notebooks/blogging_assistant.ipynb
@@ -7,7 +7,7 @@
"outputs": [],
"source": [
"%load_ext autoreload\n",
-"%autoreload 2"
+"%autoreload 2\n"
]
},
{
@@ -51,7 +51,7 @@
"with open(here() / \"data/blog_text.txt\", \"r+\") as f:\n",
" blog_text = f.read()\n",
"\n",
-"response = bot(blog_tagger_and_summarizer(blog_text))"
+"response = bot(blog_tagger_and_summarizer(blog_text))\n"
]
},
{
@@ -63,7 +63,7 @@
"from llamabot.prompt_library.output_formatter import coerce_dict\n",
"\n",
"output = coerce_dict(response.content)\n",
-"output"
+"output\n"
]
},
{
@@ -75,7 +75,7 @@
"import json\n",
"\n",
"answers = json.loads(response.content)\n",
-"answers[\"summary\"]"
+"answers[\"summary\"]\n"
]
},
{
@@ -85,7 +85,7 @@
"outputs": [],
"source": [
"for tag in answers[\"tags\"]:\n",
-" print(tag)"
+" print(tag)\n"
]
}
],
@@ -105,7 +105,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.11.4"
+"version": "3.11.5"
}
},
"nbformat": 4,
12 changes: 6 additions & 6 deletions scratch_notebooks/diffbot.ipynb
@@ -7,7 +7,7 @@
"outputs": [],
"source": [
"from llamabot import SimpleBot\n",
-"from llamabot.prompt_library.diffbot import diffbot, get_github_diff"
+"from llamabot.prompt_library.diffbot import diffbot, get_github_diff\n"
]
},
{
@@ -19,7 +19,7 @@
"url = \"https://github.com/pyjanitor-devs/pyjanitor/pull/1262\"\n",
"\n",
"\n",
-"# print(get_github_diff(url))"
+"# print(get_github_diff(url))\n"
]
},
{
@@ -38,7 +38,7 @@
"\n",
"diff = get_github_diff(\"https://github.com/pyjanitor-devs/pyjanitor/pull/1262\")\n",
"\n",
-"diffbot(describe_advantages(diff))"
+"diffbot(describe_advantages(diff))\n"
]
},
{
@@ -47,7 +47,7 @@
"metadata": {},
"outputs": [],
"source": [
-"diffbot(suggest_improvements(diff))"
+"diffbot(suggest_improvements(diff))\n"
]
},
{
@@ -105,7 +105,7 @@
"metadata": {},
"outputs": [],
"source": [
-"asdfasdfadsf"
+"asdfasdfadsf\n"
]
}
],
@@ -125,7 +125,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.9.16"
+"version": "3.11.5"
}
},
"nbformat": 4,
