Jerry Peng edited this page Jul 20, 2023 · 24 revisions

A list of available tools


| Tool | Description | Requires API key? |
| --- | --- | --- |
| arxiv_search | Search for the 3 most relevant research papers on arxiv.org | No |
| calculator | Solve arithmetic calculations and return a number | No |
| code_interpreter | Execute Python code | No |
| write_file | Write a string to a file on the local hard disk | No |
| read_file | Read a string from an existing file on the local hard disk | No |
| google_search | Search for relevant results on google.com | No |
| text_to_speech | Convert text into speech | No |
| text_to_image | Generate images based on input prompts | No |
| text_to_video | Generate videos based on input prompts | No |
| visual_question_answer | Answer a question based on a given image | No |
| image_caption | Generate a caption for a given image | No |
| speech_to_text | Transcribe speech audio into text | No |
| image_to_text | Generate a Stable Diffusion prompt that matches the input image | No |
| search_doc | Search for the most relevant text chunk in a document | No |
| shell | Execute bash commands and return the output | No |
| summarize | Summarize a long text | No |
| get_today_weather | Get the current weather information for a given location | Yes |
| get_future_weather | Get the weather information for the upcoming days for a given location | Yes |
| web_page | Get web content from a given URL | No |
| wikipedia | Search for relevant results on Wikipedia | No |
| wolfram_alpha | A WolframAlpha engine for solving algebraic equations | Yes |

arxiv_search

Description: Search engine from arxiv.org. It returns several relevant paper titles, authors, and short summaries. The input should be a search query.

Location: arxiv_search.ArxivSearch

Example: ArxivSearch()._run("Attention for transformer")

Published: 2022-09-30
Title: Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
Authors: Chendong Zhao, Jianzong Wang, Wen qi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Summary: The Transformer architecture model, based on self-attention and multi-head
attention, has achieved remarkable success in offline end-to-end Automatic
Speech Recognition (ASR). However, self-attention and multi-head attention
cannot be easily applied for streaming or online ASR. For self-attention in
Transformer ASR, the softmax normalization function-based attention mechanism
makes it impossible to highlight important speech information. For multi-head
attention in Transformer ASR, it is not easy to model monotonic alignments in
different heads. To overcome these two limits, we integrate sparse attention
and monotonic attention into Transformer-based ASR. The sparse mechanism
introduces a learned sparsity scheme to enable each self-attention structure to
fit the corresponding head better. The monotonic attention deploys
regularization to prune redundant heads for the multi-head attention structure.
The experiments show that our method can effectively improve the attention
mechanism on widely used benchmarks of speech recognition.

Published: 2021-12-28
Title: Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Authors: Sitong Wu, Tianyi Wu, Haoru Tan, Guodong Guo
Summary: Recently, Transformers have shown promising performance in various vision
tasks. To reduce the quadratic computation complexity caused by the global
self-attention, various methods constrain the range of attention within a local
region to improve its efficiency. Consequently, their receptive fields in a
single attention layer are not large enough, resulting in insufficient context
modeling. To address this issue, we propose a Pale-Shaped self-Attention
(PS-Attention), which performs self-attention within a pale-shaped region.
Compared to the global self-a

calculator

Description: A calculator that can compute arithmetic expressions. Useful when you need to perform math calculations. The input should be a mathematical expression.

Location: calculator.Calculator

Example: Calculator()._run("sqrt(1+1)*pi")

4.442882938158366
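To illustrate how an expression like the one above could be evaluated, here is a minimal sketch that is independent of the Calculator tool's actual implementation, which may work differently. The `safe_eval` helper and its whitelists are hypothetical names introduced for illustration. It walks the Python AST and accepts only numbers, basic operators, and a few `math` names instead of calling `eval()`.

```python
import ast
import math
import operator

# Hypothetical whitelists; the real Calculator tool may support a different set.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_NAMES = {"pi": math.pi, "e": math.e}
_FUNCS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}


def safe_eval(expression: str) -> float:
    """Evaluate an arithmetic expression without calling eval()."""

    def visit(node: ast.AST) -> float:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](visit(node.left), visit(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](visit(node.operand))
        if isinstance(node, ast.Name) and node.id in _NAMES:
            return _NAMES[node.id]
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in _FUNCS
            and not node.keywords
        ):
            return _FUNCS[node.func.id](*(visit(arg) for arg in node.args))
        # Anything else (attribute access, imports, unknown names) is rejected.
        raise ValueError(f"Unsupported expression element: {ast.dump(node)}")

    return visit(ast.parse(expression, mode="eval"))


print(safe_eval("sqrt(1+1)*pi"))
```

Restricting evaluation to a whitelist keeps an agent-supplied string from reaching arbitrary Python, which matters when the input comes from a language model.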

code_interpreter

Description: Execute Python code. The input is a string containing Python code.

Location: code_interpreter.PythonCodeInterpreter

Example: PythonCodeInterpreter()._run("import os\nprint(\"helloworld\")")

helloworld
Code executed successfully.
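The behaviour shown above (run a code string, return what it printed, then a status line) can be sketched as follows. This is an assumption about the general technique, not the PythonCodeInterpreter source. `run_python` is a hypothetical helper, and the status message is copied from the example output.

```python
import contextlib
import io
import traceback


def run_python(code: str) -> str:
    """Execute a code string and return captured stdout plus a status line."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # exec() runs arbitrary code; only use it on trusted input.
            exec(code, {"__name__": "__sketch__"})
    except Exception:
        # Return whatever was printed before the failure, then the traceback.
        return buffer.getvalue() + traceback.format_exc()
    return buffer.getvalue() + "Code executed successfully."


print(run_python('import os\nprint("helloworld")'))
```

Returning the traceback as text rather than raising lets a calling agent read the error and retry with corrected code.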

write_file

Description: Write a string to a file on the local hard disk. The inputs are file_path and a text string.

Location: file_operation.WriteFile

Example: WriteFile()._run("hello_world.text", "hello_world")

File written successfully to hello_world.text.

read_file

Description: Read a file from the local hard disk. The input is file_path.

Location: file_operation.ReadFile

Example: ReadFile()._run("hello_world.text")

hello_world
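The write and read round trip shown for write_file and read_file can be approximated with the standard library alone. The sketch below assumes UTF-8 text files. The function names `write_file` and `read_file` are hypothetical stand-ins, not the WriteFile and ReadFile classes themselves.

```python
import tempfile
from pathlib import Path


def write_file(file_path: str, text: str) -> str:
    """Write text to file_path and return a confirmation message."""
    Path(file_path).write_text(text, encoding="utf-8")
    return f"File written successfully to {file_path}."


def read_file(file_path: str) -> str:
    """Return the full text content of file_path."""
    return Path(file_path).read_text(encoding="utf-8")


# Usage: round-trip a string through a temporary directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    target = str(Path(tmp_dir) / "hello_world.text")
    print(write_file(target, "hello_world"))
    print(read_file(target))
```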

google_search

Description: Search results from Google. The input is a search query.

Location: google_search.GoogleSearch

Example: GoogleSearch()._run("What is mMTC?")

SearchResult(url=https://blog.antenova.com/what-is-mmtc-in-5g-how-does-it-work, title=What is mMTC in 5G? How does it work?, description=Nov 24, 2021 — Massive Machine-Type Communications (mMTC) is one of three core 5G service areas. It has been created specifically to enable a huge volume ...)

SearchResult(url=https://www.gigabyte.com/Solutions/mmtc, title=[5G-mMTC]Smart City Solution, description=mMTC (Massive Machine-Type Communications) is a new service category of 5G that can support extremely high connection density of online devices. Network ...)

SearchResult(url=https://en.wikipedia.org/wiki/MMTC_Ltd, title=MMTC Ltd, description=MMTC Ltd., Metals and Minerals Trading Corporation of India, is one of the two highest earners of foreign exchange for India and India's largest public ...)

SearchResult(url=https://inseego.com/resources/5g-glossary/what-is-mmtc/, title=What is mMTC: Massive Machine Type Communications?, description=mMTC enables ultra-low latency connections, improved network performance, and low energy usage for a wide variety of IoT use cases. mMTC is already being rolled ...)

text_to_speech

Description: Convert input text into speech audio

Location: gradio.TTS

Example: TTS()._run("Please surprise me and speak in whatever voice you enjoy. Vielen Dank und Gesundheit!")

the audio file saved into: /var/folders/xv/d1zbl5b50p58ttlb7bh655340000gn/T/tmpl2_vozq_bklfm9yd.wav

text_to_image

Description: Generate an image based on the input prompt

Location: gradio.TextToImage

Example: TextToImage()._run("an asian student wearing a black t-shirt")

the image file saved into: ~/Downloads/prompt/Gentopia-AI/Gentopia/gentopia/tools/c45fc4a8-af17-4d04-8d56-3e6ea6eea3d9/tmp71ynw3ir.jpg

text_to_video

Description: Generate a video based on the input prompt

Location: gradio.TextToVideo

Example: TextToVideo()._run("an asian student wearing a black t-shirt")

the video file saved into: /var/folders/xv/d1zbl5b50p58ttlb7bh655340000gn/T/tmphsmt6somlfgw_496.mp4

visual_question_answer

Description: Answer a question based on a given image

Location: gradio.VisualQA

Example: VisualQA()._run("tools/image.jpg", "what does the image contain ?")


image_caption

Description: Generate a caption for a given image

Location: gradio.ImageCaption

speech_to_text

Description: Transcribe speech audio into a text transcript.

Location: gradio.AudioToText

image_to_text

Description: Generate a Stable Diffusion prompt that matches the input image

Location: gradio.ImageToPrompt

search_doc

Description: Search for the most relevant text chunk in a document

Location: search_doc.SearchDoc
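The idea of finding the most relevant text chunk can be illustrated with a deliberately simple sketch. It is not the SearchDoc implementation, which may well rely on embeddings. Here relevance is plain word overlap between the query and each chunk, and `split_into_chunks`, `tokenize`, and `most_relevant_chunk` are hypothetical names.

```python
import re
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 40) -> list[str]:
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def most_relevant_chunk(document: str, query: str, chunk_size: int = 40) -> str:
    """Return the chunk sharing the most tokens with the query ('' if empty)."""
    chunks = split_into_chunks(document, chunk_size)
    if not chunks:
        return ""
    query_tokens = tokenize(query)

    def score(chunk: str) -> int:
        chunk_tokens = tokenize(chunk)
        return sum(min(count, chunk_tokens[token]) for token, count in query_tokens.items())

    return max(chunks, key=score)


doc = "alpha beta gamma delta the weather in paris is sunny"
print(most_relevant_chunk(doc, "weather paris", chunk_size=4))
```

Word overlap is cheap and dependency-free, but it misses synonyms and paraphrases. An embedding-based score could replace `score` without changing the surrounding structure.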

shell

Description: Execute bash commands and return the output.

Location: bash.RunShell
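Running a command and returning its output can be sketched with `subprocess`. This is an assumption about the general approach rather than the RunShell source. `run_shell` is a hypothetical helper, and it uses the system default shell via `shell=True`. A bash-specific variant could pass `["bash", "-c", command]` instead.

```python
import subprocess


def run_shell(command: str, timeout: float = 30.0) -> str:
    """Run a shell command and return its combined stdout and stderr."""
    completed = subprocess.run(
        command,
        shell=True,           # hands the string to the default system shell
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # raises subprocess.TimeoutExpired if exceeded
    )
    return completed.stdout + completed.stderr


print(run_shell("echo hello"))
```

Because the command string is executed as-is, a tool like this should only be exposed to agents in an environment where arbitrary commands are acceptable.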

summarize

Description: Summarize a long text into a short one

Location: Not available

get_today_weather

Description: Get the current weather information for a given location

Location: weather.GetTodayWeather

get_future_weather

Description: Get the weather information in the upcoming days for a given location

Location: weather.GetFutureWeather

web_page

Description: Get web content from a given URL

Location: web_page.WebPage

wikipedia

Description: Search relevant results from Wikipedia

Location: wikipedia.Wikipedia

wolfram_alpha

Description: A WolframAlpha engine for solving algebraic equations

Location: wolfram_alpha.WolframAlpha
