Merge pull request #17 from ijwfly/feature/dalle-3
DALL-E 3 support with API usage calculation
Function calling refactoring
Readme update for easier installation
settings.py comments clarification (thanks to @yaroslavyaroslav for the idea and contribution)
ijwfly committed Dec 12, 2023
2 parents 2161852 + 68833d7 commit f509c80
Showing 18 changed files with 406 additions and 146 deletions.
39 changes: 28 additions & 11 deletions README.md
@@ -3,26 +3,33 @@
This GitHub repository contains the implementation of a telegram bot, designed to facilitate seamless interaction with GPT-3.5 and GPT-4, state-of-the-art language models by OpenAI.

🔥 **GPT-4 Turbo + Vision preview support (gpt-4-1106-preview + gpt-4-vision-preview)**
🔥 **DALL-E 3 Image generation support**

🔑 **Key Features**

1. **Dynamic Dialog Management**: The bot automatically manages the context of the conversation, eliminating the need for the user to manually reset the context using the /reset command. You still can reset dialog manually if needed.
2. **Automatic Context Summarization**: In case the context size exceeds the model's maximum limit, the bot automatically summarizes the context to ensure the continuity of the conversation.
3. **Functions Support**: You can embed functions within the bot. This allows the GPT to invoke these functions when needed, based on the context. The description of the function and its parameters are extracted from the function's docstring. See the `app/context/function_manager.py` file for more details.
4. **Sub-dialogue Mechanism**: "Chat Thread Isolation" feature, where if a message is replied to within the bot, only the corresponding message chain is considered as context. This adds an extra level of context control for the users.
5. **Voice Recognition**: The bot is capable of transcribing voice messages, allowing users to use speech as context or prompt for ChatGPT.
6. **API Usage Tracking**: The bot includes a function that tracks and provides information about the current month's usage of the OpenAI API. This allows users to monitor and manage their API usage costs.
7. **Model Support**: The bot supports both gpt-3.5-turbo and gpt-4 models with the capability to switch between them on-the-fly.
8. **Context Window Size Customization**: The bot provides a feature to customize the maximum context window size. This allows users to set the context size for gpt-3.5-turbo and gpt-4 models individually, enabling more granular control over usage costs. This feature is particularly useful for managing API usage and optimizing the balance between cost and performance.
9. **Access Control**: The bot includes a feature for access control. Each user is assigned a role (stranger, basic, advanced, admin), and depending on the role, they gain access to the bot. Role management is carried out through a messaging mechanism, with inline buttons sent to the admin for role changes.
1. **Model Support**: gpt-3.5-turbo, gpt-4, gpt-4-1106-preview, gpt-4-vision-preview.
2. **Image Generation**: You can ask the bot to generate images with the DALL-E 3 model, just like in the official ChatGPT app.
3. **Dynamic Dialog Management**: The bot automatically manages the context of the conversation, eliminating the need for the user to manually reset it with the /reset command. You can still reset the dialog manually if needed.
4. **Automatic Context Summarization**: If the context size exceeds the model's maximum limit, the bot automatically summarizes the context to ensure the continuity of the conversation.
5. **Function Calling Support**: You can embed functions within the bot, allowing GPT to invoke them when needed based on the context. See the `app/context/function_manager.py` file for more details, and the sketch after this list for an example.
6. **Sub-dialogue Mechanism**: When you reply to a message, the bot only looks at that specific conversation thread, making it easier to manage multiple discussions at once.
7. **Voice Recognition**: The bot is capable of transcribing voice messages, allowing users to use speech as context or prompt for ChatGPT.
8. **API Usage Tracking**: The bot includes a function that tracks and provides information about the usage of the OpenAI API. This allows users to monitor and manage their API usage costs.
9. **Context Window Size Customization**: You can set the maximum context window size for each model in the `app/context/context_manager.py` file. When the context size exceeds this limit, the bot automatically summarizes the context.
10. **Access Control**: The bot includes a feature for access control. Each user is assigned a role (stranger, basic, advanced, admin), and depending on the role, they gain access to the bot. Role management is carried out through a messaging mechanism, with inline buttons sent to the admin for role changes.
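
As an illustration of feature 5, here is a minimal sketch of a custom function built on the `OpenAIFunction` base class that this commit adds in `app/functions/base.py`. The `CurrentDate` class and its parameter are hypothetical, not part of the repository.

```python
import datetime
from typing import Optional

import pydantic

from app.functions.base import OpenAIFunction, OpenAIFunctionParams


class CurrentDateParams(OpenAIFunctionParams):
    # Parameter schema exposed to GPT; the description ends up in the function spec.
    date_format: str = pydantic.Field('%Y-%m-%d %H:%M', description='strftime format for the result')


class CurrentDate(OpenAIFunction):
    PARAMS_SCHEMA = CurrentDateParams

    async def run(self, params: CurrentDateParams) -> Optional[str]:
        # The returned string is passed back to GPT as the function result;
        # returning None would mean "nothing to add to the context".
        return datetime.datetime.now(datetime.timezone.utc).strftime(params.date_format)

    @classmethod
    def get_description(cls) -> str:
        return 'Returns the current UTC date and time.'
```

To expose such a class to the model, it would presumably be appended to the list in `FunctionManager.get_static_functions()` (or gated behind a user setting in `get_conditional_functions()`), the same way this commit registers `QueryWolframAlpha` and `GenerateImageDalle3`.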

🔧 **Installation**

To get this bot up and running, follow these steps:

1. Set the `TELEGRAM_BOT_TOKEN` and `OPENAI_TOKEN` variables in the `settings.py` file.
2. Set the `IMAGE_PROXY_URL` to your server IP / hostname in the `settings.py` file.
3. Run `docker-compose up -d` in the root directory of the project.
3. (optional) Set the `USER_ROLE_MANAGER_CHAT_ID` variable in the `settings.py` file to your telegram id. This is required for access control.
4. (optional) Set the `ENABLE_USER_ROLE_MANAGER_CHAT` variable in the `settings.py` file to `True`. This is required for access control.
5. (optional) Set the `USER_ROLE_*` variables in the `settings.py` file to desired roles.
6. Run `docker-compose up -d` in the root directory of the project.

If you've completed the optional steps, you will receive a management message with your Telegram ID and info when you send your first message to the bot. You can use this message to set your role to admin.
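
For orientation, here is a hedged sketch of the `settings.py` values referenced above. The placeholder values are illustrative, and the `USER_ROLE_*` names shown are only the ones that appear elsewhere in this commit; the actual file may represent roles differently (e.g. via a `UserRole` enum).

```python
# Sketch of the settings referenced in the installation steps above.
# All values are placeholders; adjust them to your own deployment.

TELEGRAM_BOT_TOKEN = '123456:ABC-your-telegram-bot-token'
OPENAI_TOKEN = 'sk-your-openai-api-key'

# Must be reachable from the public internet so OpenAI can fetch images
# for gpt-4-vision-preview requests.
IMAGE_PROXY_URL = 'http://your-server-hostname'

# Optional: access control (steps 3-5).
ENABLE_USER_ROLE_MANAGER_CHAT = True
USER_ROLE_MANAGER_CHAT_ID = 123456789  # your telegram id

# Minimum role required for individual features
# (roles: stranger, basic, advanced, admin).
USER_ROLE_CHOOSE_MODEL = 'advanced'
USER_ROLE_IMAGE_GENERATION = 'advanced'
USER_ROLE_STREAMING_ANSWERS = 'basic'
```

With `ENABLE_USER_ROLE_MANAGER_CHAT` enabled, role-change requests arrive in the chat identified by `USER_ROLE_MANAGER_CHAT_ID` as inline buttons, as described in the Access Control feature above.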

🤖 **Commands**
```
@@ -35,4 +42,14 @@ To get this bot up and running, follow these steps:
/gpt4vision - set model to gpt-4-vision-preview
/usage_all - show usage for all users
```
These commands will provide additional interaction control for the bot users.
These commands provide additional interaction control for bot users. Most settings are also available in the settings menu; the commands are just shortcuts for them.


⚠️ **Troubleshooting**

If you have any issues with the bot, please create an issue in this repository. I will try to help you as soon as possible.

Here are some typical issues and solutions:
- ```Error code: 400 - {'error': {'message': 'Invalid image.', 'type': 'invalid_request_error' ...}}``` - This error usually occurs when OpenAI cannot access the image. Make sure the `IMAGE_PROXY_URL` variable is set correctly to your server IP / hostname. You can also debug the setup by looking at the `chatgpttg.message` table in Postgres: it contains the message with the image URL, which you can try to open in your browser to check that it works.
- ```Error code: 400 - {'error': {'message': 'Invalid content type. image_url is only supported by certain models.', 'type': 'invalid_request_error' ...}}``` - This error usually occurs when there is an image in your context but the current model doesn't support vision. Switch the model to gpt-4-vision-preview or reset your context with the /reset command.
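
If you suspect the image proxy is the problem, a quick check like the sketch below can confirm whether an image is reachable from outside your network; the URL is a placeholder, so copy a real one from the `chatgpttg.message` table and run the script from a machine other than the server itself.

```python
# Minimal reachability check for an image served via IMAGE_PROXY_URL.
# The URL below is a placeholder; copy a real image URL from the
# chatgpttg.message table in Postgres.
import urllib.request

image_url = 'http://your-server-hostname/path-to-image'  # placeholder

try:
    with urllib.request.urlopen(image_url, timeout=10) as response:
        print(f'HTTP {response.status}, Content-Type: {response.headers.get("Content-Type")}')
except Exception as exc:
    # If this fails from outside your network, OpenAI will most likely fail too.
    print(f'Image is not reachable: {exc}')
```
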
13 changes: 10 additions & 3 deletions app/bot/message_processor.py
@@ -83,16 +83,23 @@ async def handle_gpt_response(self, chat_gpt_manager, context_manager, response_
if response_dialog_message.function_call:
function_name = response_dialog_message.function_call.name
function_args = response_dialog_message.function_call.arguments
function_response_raw = await function_storage.run_function(function_name, function_args)
function_class = function_storage.get_function_class(function_name)
function = function_class(self.user, self.db, context_manager, self.message)
function_response_raw = await function.run_str_args(function_args)

function_response = DialogUtils.prepare_function_response(function_name, function_response_raw)
function_response_message_id = -1
if self.user.function_call_verbose:
with suppress(BadRequest):
# TODO: split function call message if it's too long
function_response_text = f'Function call: {function_name}({function_args})\n\n{function_response_raw}'
function_response_text = f'Function call: {function_name}({function_args})\n\nResponse: {function_response_raw}'
function_response_tg_message = await send_telegram_message(self.message, function_response_text)
function_response_message_id = function_response_tg_message.message_id

if function_response_raw is None:
# None means there is no need to pass response to GPT or add it to context
return

function_response = DialogUtils.prepare_function_response(function_name, function_response_raw)
context_dialog_messages = await context_manager.add_message(function_response, function_response_message_id)
response_generator = await chat_gpt_manager.send_user_message(self.user, context_dialog_messages, is_cancelled)

4 changes: 2 additions & 2 deletions app/bot/scheduled_tasks.py
@@ -4,7 +4,7 @@
import logging

import settings
from app.bot.utils import get_completion_usage_response_all_users
from app.bot.utils import get_usage_response_all_users

FAIL_LIMIT = 5
WAIT_BETWEEN_RETRIES = 5
@@ -66,7 +66,7 @@ async def get_monthly_usage():

previous_month = datetime.datetime.now(settings.POSTGRES_TIMEZONE).replace(day=1) - datetime.timedelta(days=1)
previous_month = previous_month.date()
result = await get_completion_usage_response_all_users(db, previous_month)
result = await get_usage_response_all_users(db, previous_month)
await bot.send_message(
settings.USER_ROLE_MANAGER_CHAT_ID, result
)
4 changes: 3 additions & 1 deletion app/bot/settings_menu.py
@@ -97,8 +97,9 @@ def __init__(self, bot: Bot, dispatcher: Dispatcher, db: DB):
'current_model': VisibleOptionsSetting('current_model', GPT_MODELS_OPTIONS),
'current_model_preview': VisibleOptionsSetting('current_model', GPT_MODELS_OPTIONS_PREVIEW),
'gpt_mode': ChoiceSetting('GPT mode', 'gpt_mode', list(settings.gpt_mode.keys())),
'voice_as_prompt': OnOffSetting('Voice as prompt', 'voice_as_prompt'),
'use_functions': OnOffSetting('Use functions', 'use_functions'),
'image_generation': OnOffSetting('Image generation', 'image_generation'),
'voice_as_prompt': OnOffSetting('Voice as prompt', 'voice_as_prompt'),
'function_call_verbose': OnOffSetting('Verbose function calls', 'function_call_verbose'),
'streaming_answers': OnOffSetting('Streaming answers', 'streaming_answers'),
# 'auto_summarize': OnOffSetting('Auto summarize', 'auto_summarize'),
@@ -107,6 +108,7 @@ def __init__(self, bot: Bot, dispatcher: Dispatcher, db: DB):
self.minimum_required_roles = {
'current_model': settings.USER_ROLE_CHOOSE_MODEL,
'current_model_preview': settings.USER_ROLE_CHOOSE_MODEL,
'image_generation': settings.USER_ROLE_IMAGE_GENERATION,
'streaming_answers': settings.USER_ROLE_STREAMING_ANSWERS,
}
self.dispatcher.register_callback_query_handler(self.process_callback, lambda c: SETTINGS_PREFIX in c.data)
16 changes: 13 additions & 3 deletions app/bot/telegram_bot.py
@@ -12,9 +12,10 @@
from app.bot.settings_menu import Settings
from app.bot.user_middleware import UserMiddleware
from app.bot.user_role_manager import UserRoleManager
from app.bot.utils import (get_hide_button, get_completion_usage_response_all_users, TypingWorker)
from app.bot.utils import (get_hide_button, get_usage_response_all_users, TypingWorker)
from app.bot.utils import send_telegram_message
from app.openai_helpers.utils import calculate_completion_usage_price, calculate_whisper_usage_price, OpenAIAsync
from app.openai_helpers.utils import (calculate_completion_usage_price, calculate_whisper_usage_price, OpenAIAsync,
calculate_image_generation_usage_price)
from app.storage.db import DBFactory, User
from app.storage.user_role import check_access_conditions, UserRole
from app.openai_helpers.chatgpt import GptModel
@@ -113,6 +114,15 @@ async def get_usage(self, message: types.Message, user: User):
completion_usages = await self.db.get_user_current_month_completion_usage(user.id)
result = []
total = whisper_price

image_generation_usage = await self.db.get_user_current_month_image_generation_usage(user.id)
for usage in image_generation_usage:
price = calculate_image_generation_usage_price(
usage['model'], usage['resolution'], usage['usage_count']
)
total += price
result.append(f'*{usage["model"]}:* {usage["usage_count"]} images, {usage["resolution"]} resolution, ${price}')

for usage in completion_usages:
price = calculate_completion_usage_price(usage.prompt_tokens, usage.completion_tokens, usage.model)
total += price
@@ -138,7 +148,7 @@ async def get_usage_all_users(self, message: types.Message, user: User):
month = datetime.datetime.now(settings.POSTGRES_TIMEZONE) + relativedelta(months=month_offset)
month = month.date()

result = await get_completion_usage_response_all_users(self.db, month)
result = await get_usage_response_all_users(self.db, month)
await send_telegram_message(
message, result, reply_markup=get_hide_button()
)
19 changes: 17 additions & 2 deletions app/bot/utils.py
@@ -8,7 +8,8 @@
from aiogram import types
from aiogram.utils.exceptions import CantParseEntities

from app.openai_helpers.utils import calculate_completion_usage_price, calculate_whisper_usage_price
from app.openai_helpers.utils import (calculate_completion_usage_price, calculate_whisper_usage_price,
calculate_image_generation_usage_price)

TYPING_TIMEOUT = 180
TYPING_DELAY = 2
@@ -145,6 +146,15 @@ async def edit_telegram_message(message: types.Message, text: str, message_id, p
return await message.bot.edit_message_text(text, chat_id, message_id)


async def send_photo(message: types.Message, photo_bytes, caption=None, reply_markup=None):
if message.reply_to_message is None:
send_message = message.answer_photo
else:
send_message = message.reply_photo

return await send_message(photo_bytes, caption=caption, reply_markup=reply_markup)


def merge_dicts(dict_1, dict_2):
"""
This function merge two dicts containing strings using plus operator on each key
@@ -161,16 +171,21 @@ def merge_dicts(dict_1, dict_2):
return result


async def get_completion_usage_response_all_users(db, month_date: date = None) -> str:
async def get_usage_response_all_users(db, month_date: date = None) -> str:
completion_usages = await db.get_all_users_completion_usage(month_date)
whisper_usages = await db.get_all_users_whisper_usage(month_date)
image_generation_usages = await db.get_all_users_image_generation_usage(month_date)
result = []
for name, user_completion_usages in completion_usages.items():
user_usage_price = 0
for usage in user_completion_usages:
user_usage_price += calculate_completion_usage_price(
usage.prompt_tokens, usage.completion_tokens, usage.model
)
for usage in image_generation_usages.get(name, []):
user_usage_price += calculate_image_generation_usage_price(
usage['model'], usage['resolution'], usage['usage_count']
)
user_whisper_usage = whisper_usages.get(name, 0)
user_usage_price += calculate_whisper_usage_price(user_whisper_usage)
result.append((name, user_usage_price))
17 changes: 15 additions & 2 deletions app/context/function_manager.py
@@ -1,9 +1,12 @@
from typing import Optional

import settings
from app.functions.wolframalpha import query_wolframalpha
from app.functions.dalle_3 import GenerateImageDalle3
from app.functions.wolframalpha import QueryWolframAlpha
from app.openai_helpers.function_storage import FunctionStorage
from app.storage.db import DB, User
from app.storage.user_role import check_access_conditions
from settings import USER_ROLE_IMAGE_GENERATION


class FunctionManager:
@@ -17,7 +20,15 @@ def get_static_functions():
functions = []

if settings.ENABLE_WOLFRAMALPHA:
functions.append(query_wolframalpha)
functions.append(QueryWolframAlpha)

return functions

def get_conditional_functions(self):
functions = []

if self.user.image_generation and check_access_conditions(USER_ROLE_IMAGE_GENERATION, self.user.role):
functions.append(GenerateImageDalle3)

return functions

@@ -26,6 +37,8 @@ async def process_functions(self) -> Optional[FunctionStorage]:
return None

functions = self.get_static_functions()
functions += self.get_conditional_functions()

if not functions:
return None

59 changes: 59 additions & 0 deletions app/functions/base.py
@@ -0,0 +1,59 @@
from typing import Optional

import pydantic
from aiogram.types import Message
from abc import ABC, abstractmethod


class OpenAIFunctionParams(pydantic.BaseModel):
pass


class OpenAIFunction(ABC):
PARAMS_SCHEMA = OpenAIFunctionParams

def __init__(self, user, db, context_manager, message: Message):
self.user = user
self.db = db
self.context_manager = context_manager
self.message = message

@abstractmethod
async def run(self, params: OpenAIFunctionParams) -> Optional[str]:
pass

async def run_dict_args(self, params: dict):
try:
params = self.PARAMS_SCHEMA(**params)
except Exception as e:
return f"Parsing error: {e}"
return await self.run(params)

async def run_str_args(self, params: str):
try:
params = self.PARAMS_SCHEMA.parse_raw(params)
except Exception as e:
return f"Parsing error: {e}"
return await self.run(params)

@classmethod
@abstractmethod
def get_description(cls) -> str:
pass

@classmethod
def get_name(cls) -> str:
return cls.__name__

@classmethod
def get_params_schema(cls) -> dict:
params_schema = cls.PARAMS_SCHEMA.schema()
return params_schema

@classmethod
def get_system_prompt_addition(cls) -> Optional[str]:
"""
Returns text to add to system prompt when this function is added to context. You can use this to add
additional instructions about how to use this function.
"""
return None
