Commit

simplify prompt (langchain-ai#13)
Simplify to always try to start with "objects" task, to avoid using image name as a valid tag
Start with OCR if it is a pdf
Tweak for photo editing
dashesy committed Mar 30, 2023
1 parent 6d6e537 commit db57ac6
Showing 2 changed files with 14 additions and 8 deletions.
8 changes: 6 additions & 2 deletions langchain/agents/assistant/base.py
@@ -130,7 +130,10 @@ def _extract_tool_and_input(self, llm_output: str, tries=0) -> Optional[Tuple[st
 elif "brand" in sub_cmd:
     action = "Bing Search"
 elif "objects" in sub_cmd:
-    action = "Image Understanding"
+    if action_input.lower().endswith(".pdf"):
+        action = "OCR Understanding"
+    else:
+        action = "Image Understanding"
 if not action_input:
     if not action:
         if cmd.endswith("?"):
@@ -143,7 +146,8 @@ def _extract_tool_and_input(self, llm_output: str, tries=0) -> Optional[Tuple[st
 assert action_input
 if not action and is_face:
     action = "Celebrity Understanding"
-if not action and " text" in sub_cmd:
+# TODO: separate llm to decide the task
+if not action and (" is written" in sub_cmd or " text" in sub_cmd or sub_cmd.endswith(" say?")):
     action = "OCR Understanding"
 if not action:
     if tries < 4:
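
The two hunks in base.py change how a sub-command is mapped to a tool: an "objects" request on a PDF now goes to OCR, and the OCR fallback matches more phrasings. A minimal standalone sketch of that routing follows, assuming the branches behave as they read in the diff. The function name `route_image_task` and its signature are hypothetical, and the real method has earlier branches and retry logic that are not shown in the diff and are omitted here.

```python
from typing import Optional


def route_image_task(sub_cmd: str, action_input: str, is_face: bool = False) -> Optional[str]:
    """Hypothetical helper that mirrors the post-commit branches of the diff."""
    action = None
    # The real code reaches "brand" through an elif chain whose earlier
    # branches are outside the diff; this sketch starts the chain here.
    if "brand" in sub_cmd:
        action = "Bing Search"
    elif "objects" in sub_cmd:
        # Added in this commit: a PDF input is sent to OCR instead.
        if action_input.lower().endswith(".pdf"):
            action = "OCR Understanding"
        else:
            action = "Image Understanding"
    if not action and is_face:
        action = "Celebrity Understanding"
    # Widened in this commit: " is written" and a trailing " say?" now
    # trigger OCR in addition to " text".
    if not action and (
        " is written" in sub_cmd
        or " text" in sub_cmd
        or sub_cmd.endswith(" say?")
    ):
        action = "OCR Understanding"
    return action
```

For instance, `route_image_task("what objects are in this", "https://example.com/scan.PDF")` takes the new PDF branch because the extension check is case-insensitive.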
14 changes: 8 additions & 6 deletions langchain/agents/assistant/prompt.py
@@ -5,15 +5,16 @@
Any time there is an image in our conversation that you want to know about objects description, texts, OCR (optical character recognition), people, celebrities inside of the image you could ask Assistant by addressing him.
These are the tasks that Assistant can handle for an image: photo editing, celebrities, business card, receipt, objects, OCR, Bing
If the task does not fit any of the above, make sure the question has the word objects in it.
For example to ask about an image without any description, make sure the question has the word objects in it.
Ask Assistant about the objects in the image.
Then if there is text in the image, ask Assistant to do OCR
For example to ask about an image that could be a business card, make sure the question has the word business card in it.
For example to ask about an image that could be a receipt, make sure the question has the word receipt in it.
For example other image types that may have text (sign, label, plan, invoice, money), and require OCR.
For example if there is a person's face in the image find if there are celebrities in the image.
Other image types that may have text (sign, label, plan, invoice, money), and require OCR.
* Ask to do OCR if pdf
<|im_end|>
Gather your thoughts and observations in a list then if needed ask Assistant a task it can handle.
Keep tasks Assistant can handle in mind.
Gather your thoughts and observations in a short list then if needed ask Assistant a task it can handle.
Finally summerize the information and answer the question.
For example:
<|im_start|>Human
@@ -39,7 +40,8 @@
<|im_start|>Human
Move the logo in this image to the right
<|im_sep|>{ai_prefix}
1. This is a photo editing task
1. The image should be edited
2. This is a photo editing task
Assistant, Move the logo in this business card image to the right https://i.ibb.co/tsQ0Myn/00.jpg
EXAMPLE END
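
The prompt example ends with a line of the form "Assistant, <task> <image URL>". As an illustration only, the sketch below splits such a line into the task text and the trailing URL. The helper `split_request` and the regular expression are assumptions made for this example; the diff does not show how `_extract_tool_and_input` actually parses the model output.

```python
import re
from typing import Optional, Tuple

# Assumption: a request looks like "Assistant, <command> <url>", with an
# optional http(s) URL as the last token on the line.
_REQUEST = re.compile(r"^Assistant,\s*(?P<cmd>.*?)\s*(?P<url>https?://\S+)?\s*$")


def split_request(line: str) -> Optional[Tuple[str, str]]:
    """Return (command, url) for an 'Assistant, ...' line, or None otherwise."""
    match = _REQUEST.match(line.strip())
    if match is None:
        return None
    # A missing URL is reported as an empty string.
    return match.group("cmd"), match.group("url") or ""
```

Applied to the example line in the prompt, this yields the editing instruction as the command and `https://i.ibb.co/tsQ0Myn/00.jpg` as the input.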