Tools
The following tools are available:
| Tool | Description | Requires API key? |
|---|---|---|
| arxiv_search | Search for the 3 most relevant research papers on arxiv.org | No |
| calculator | Solve arithmetic calculations and return a number | No |
| code_interpreter | Execute Python code | No |
| write_file | Write a string to a file on the local hard disk | No |
| read_file | Read a string from an existing file on the local hard disk | No |
| google_search | Search for relevant results on google.com | No |
| text_to_speech | Convert text into speech | No |
| text_to_image | Generate images based on input prompts | No |
| text_to_video | Generate videos based on input prompts | No |
| visual_question_answer | Answer a question based on a given image | No |
| image_caption | Generate a caption for a given image | No |
| speech_to_text | Transcribe speech audio into text | No |
| image_to_text | Generate a prompt for StableDiffusion that matches the input image | No |
| search_doc | Search for the most relevant text chunk in a document | No |
| shell | Execute bash commands and return the output | No |
| summarize | Summarize a long text | No |
| get_today_weather | Get the current weather information for a given location | Yes |
| get_future_weather | Get the weather information in the upcoming days for a given location | Yes |
| web_page | Get web content from a given url | No |
| wikipedia | Search for relevant results on Wikipedia | No |
| wolfram_alpha | A WolframAlpha engine for solving algebraic equations | Yes |
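Three tools in the table need an API key. A minimal sketch of a pre-flight check is shown here. The environment variable names and the `missing_keys` helper are hypothetical and not taken from the project, so substitute whatever names your configuration actually uses.

```python
import os

# Tools that the table marks as requiring an API key.
# The environment variable names are hypothetical placeholders.
REQUIRES_API_KEY = {
    "get_today_weather": "WEATHER_API_KEY",
    "get_future_weather": "WEATHER_API_KEY",
    "wolfram_alpha": "WOLFRAM_ALPHA_APPID",
}


def missing_keys(tool_names, environ=None):
    """Return the tools from tool_names whose key variable is unset or empty."""
    environ = os.environ if environ is None else environ
    return [
        name
        for name in tool_names
        if name in REQUIRES_API_KEY and not environ.get(REQUIRES_API_KEY[name])
    ]
```

Calling `missing_keys(["calculator", "wolfram_alpha"])` before building an agent would report which of the selected tools cannot run yet.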
Description: Search engine from arxiv.org. It returns several relevant paper titles, authors, and short summaries. The input should be a search query.
Location: arxiv_search.ArxivSearch
Example: ArxivSearch()._run("Attention for transformer")
Published: 2022-09-30
Title: Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition
Authors: Chendong Zhao, Jianzong Wang, Wen qi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Summary: The Transformer architecture model, based on self-attention and multi-head
attention, has achieved remarkable success in offline end-to-end Automatic
Speech Recognition (ASR). However, self-attention and multi-head attention
cannot be easily applied for streaming or online ASR. For self-attention in
Transformer ASR, the softmax normalization function-based attention mechanism
makes it impossible to highlight important speech information. For multi-head
attention in Transformer ASR, it is not easy to model monotonic alignments in
different heads. To overcome these two limits, we integrate sparse attention
and monotonic attention into Transformer-based ASR. The sparse mechanism
introduces a learned sparsity scheme to enable each self-attention structure to
fit the corresponding head better. The monotonic attention deploys
regularization to prune redundant heads for the multi-head attention structure.
The experiments show that our method can effectively improve the attention
mechanism on widely used benchmarks of speech recognition.
Published: 2021-12-28
Title: Pale Transformer: A General Vision Transformer Backbone with Pale-Shaped Attention
Authors: Sitong Wu, Tianyi Wu, Haoru Tan, Guodong Guo
Summary: Recently, Transformers have shown promising performance in various vision
tasks. To reduce the quadratic computation complexity caused by the global
self-attention, various methods constrain the range of attention within a local
region to improve its efficiency. Consequently, their receptive fields in a
single attention layer are not large enough, resulting in insufficient context
modeling. To address this issue, we propose a Pale-Shaped self-Attention
(PS-Attention), which performs self-attention within a pale-shaped region.
Compared to the global self-a
Attention for transformer
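The search output above is plain text with `Published:`, `Title:`, `Authors:`, and `Summary:` fields. If you need structured records, a small parser along these lines may help. It is a sketch that assumes the layout shown in the example and is not part of the tool itself; `parse_papers` is a hypothetical helper name.

```python
import re

# Matches the field labels seen in the example output.
FIELD = re.compile(r"^(Published|Title|Authors|Summary): (.*)$")


def parse_papers(text):
    """Split the plain-text search output into one dict per paper."""
    papers = []
    current = None
    last_key = None
    for line in text.splitlines():
        match = FIELD.match(line)
        if match:
            key, value = match.groups()
            if key == "Published":
                # A "Published" line starts a new record.
                current = {}
                papers.append(current)
            if current is None:
                continue  # ignore fields that appear before the first record
            last_key = key.lower()
            current[last_key] = value
        elif current is not None and last_key is not None:
            # Wrapped lines (for example a long summary) extend the last field.
            current[last_key] += " " + line.strip()
    return papers
```

A wrapped summary line that happens to start with one of the field labels would be misread as a new field, so treat the result as best effort.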
Description: A calculator that can compute arithmetic expressions. Useful when you need to perform math calculations. The input should be a mathematical expression.
Location: calculator.Calculator
Example: Calculator()._run("sqrt(1+1)*pi")
4.442882938158366
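To illustrate how an expression such as `sqrt(1+1)*pi` can be evaluated without calling `eval` on raw input, here is a minimal sketch built on Python's `ast` module. It is not the implementation behind `Calculator`; the `evaluate` function and the set of supported names are assumptions chosen for the example.

```python
import ast
import math
import operator

BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}
UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}
NAMES = {"pi": math.pi, "e": math.e}
FUNCTIONS = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "log": math.log}


def evaluate(expression):
    """Evaluate an arithmetic expression, rejecting anything not whitelisted."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](walk(node.operand))
        if isinstance(node, ast.Name) and node.id in NAMES:
            return NAMES[node.id]
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FUNCTIONS
            and not node.keywords
        ):
            return FUNCTIONS[node.func.id](*[walk(arg) for arg in node.args])
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

    return walk(ast.parse(expression, mode="eval"))
```

`evaluate("sqrt(1+1)*pi")` yields the same value as the example above up to floating-point rounding, while input such as `__import__('os')` raises `ValueError`.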
Description: Execute Python code. The input is a string containing Python code.
Location: code_interpreter.PythonCodeInterpreter
Example: PythonCodeInterpreter()._run("import os\nprint(\"helloworld\")")
···
helloworld
Code executed successfully.
···
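The behaviour shown above (run a code string, return what it printed plus a status line) can be approximated with `exec` and `contextlib.redirect_stdout`. This is a sketch under that assumption, not the tool's actual source; `run_python` is a hypothetical name.

```python
import contextlib
import io
import traceback


def run_python(code):
    """Execute a string of Python code and return its captured stdout."""
    buffer = io.StringIO()
    try:
        with contextlib.redirect_stdout(buffer):
            # A fresh namespace keeps the snippet away from this module's globals.
            exec(code, {"__name__": "__main__"})
    except Exception:
        # Return whatever was printed before the failure plus the traceback.
        return buffer.getvalue() + traceback.format_exc()
    return buffer.getvalue() + "Code executed successfully."
```

`exec` runs the code with the full privileges of the host process, so a tool like this should only receive input you trust or be run inside a sandbox.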
Description: Write a string to a file on the local hard disk. The inputs are file_path and a text string.
Location: file_operation.WriteFile
Example: WriteFile()._run("hello_world.text", "hello_world")
File written successfully to hello_world.text.
Description: Read a file from the local hard disk. The input is file_path.
Location: file_operation.ReadFile
Example: ReadFile()._run("hello_world.text")
hello_world
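The write and read tools form a simple round trip. The sketch below reproduces that round trip with `pathlib`; the function names `write_file`, `read_file`, and `demo` are illustrative and are not the classes listed under Location.

```python
import tempfile
from pathlib import Path


def write_file(file_path, text):
    """Write text to file_path and return a status message."""
    Path(file_path).write_text(text, encoding="utf-8")
    return f"File written successfully to {file_path}."


def read_file(file_path):
    """Return the full contents of file_path as a string."""
    return Path(file_path).read_text(encoding="utf-8")


def demo():
    """Write then read a file inside a temporary directory."""
    with tempfile.TemporaryDirectory() as folder:
        path = Path(folder) / "hello_world.text"
        message = write_file(path, "hello_world")
        return message, read_file(path)
```

Using a temporary directory in `demo` keeps the example from leaving files behind; the real tools write to whatever path you pass in.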
Description: Search results from Google. The input is a search query.
Location: google_search.GoogleSearch
Example: GoogleSearch()._run("What is mMTC?")
SearchResult(url=https://blog.antenova.com/what-is-mmtc-in-5g-how-does-it-work, title=What is mMTC in 5G? How does it work?, description=Nov 24, 2021 — Massive Machine-Type Communications (mMTC) is one of three core 5G service areas. It has been created specifically to enable a huge volume ...)
SearchResult(url=https://www.gigabyte.com/Solutions/mmtc, title=[5G-mMTC]Smart City Solution, description=mMTC (Massive Machine-Type Communications) is a new service category of 5G that can support extremely high connection density of online devices. Network ...)
SearchResult(url=https://en.wikipedia.org/wiki/MMTC_Ltd, title=MMTC Ltd, description=MMTC Ltd., Metals and Minerals Trading Corporation of India, is one of the two highest earners of foreign exchange for India and India's largest public ...)
SearchResult(url=https://inseego.com/resources/5g-glossary/what-is-mmtc/, title=What is mMTC: Massive Machine Type Communications?, description=mMTC enables ultra-low latency connections, improved network performance, and low energy usage for a wide variety of IoT use cases. mMTC is already being rolled ...)
Description: Convert input text into speech audio
Location: gradio.TTS
Example: TTS()._run("Please surprise me and speak in whatever voice you enjoy. Vielen Dank und Gesundheit!")
the audio file saved into: /var/folders/xv/d1zbl5b50p58ttlb7bh655340000gn/T/tmpl2_vozq_bklfm9yd.wav
Description: Generate an image based on the input prompt
Location: gradio.TextToImage
Example: TextToImage()._run("an asian student wearing a black t-shirt")
the image file saved into: ~/Downloads/prompt/Gentopia-AI/Gentopia/gentopia/tools/c45fc4a8-af17-4d04-8d56-3e6ea6eea3d9/tmp71ynw3ir.jpg
Description: Generate a video based on the input prompt
Location: gradio.TextToVideo
Example: TextToVideo()._run("an asian student wearing a black t-shirt")
the video file saved into: /var/folders/xv/d1zbl5b50p58ttlb7bh655340000gn/T/tmphsmt6somlfgw_496.mp4
Description: Answer a question based on a given image
Location: gradio.VisualQA
Example: VisualQA()._run("tools/image.jpg", "what does the image contain ?")
Description: Generate a caption for a given image
Location: gradio.ImageCaption
Description: Transcribe speech audio into a text transcript.
Location: gradio.AudioToText
Description: Generate a prompt for StableDiffusion that matches the input image
Location: gradio.ImageToPrompt
Description: Search for the most relevant text chunk in a document
Location: search_doc.SearchDoc
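As a rough illustration of "most relevant text chunk", the sketch below splits a document into fixed-size word chunks and scores each one by term overlap with the query. The real tool may well use a different retrieval method (for example embeddings); the scoring rule, chunk size, and function names here are assumptions.

```python
import re
from collections import Counter


def split_chunks(text, chunk_size=50):
    """Split text into consecutive chunks of at most chunk_size words."""
    words = text.split()
    return [
        " ".join(words[i:i + chunk_size])
        for i in range(0, len(words), chunk_size)
    ]


def tokenize(text):
    """Lowercase the text and keep alphanumeric tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def most_relevant_chunk(document, query, chunk_size=50):
    """Return the chunk sharing the most query terms (first one wins ties)."""
    query_terms = Counter(tokenize(query))
    best_chunk, best_score = "", -1
    for chunk in split_chunks(document, chunk_size):
        counts = Counter(tokenize(chunk))
        score = sum(min(counts[term], n) for term, n in query_terms.items())
        if score > best_score:
            best_chunk, best_score = chunk, score
    return best_chunk
```

Term overlap ignores synonyms and word order, which is why production retrieval usually relies on embeddings or BM25 instead.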
Description: Execute bash commands and return the output.
Location: bash.RunShell
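Running a command and returning its output can be sketched with `subprocess.run`. This is an illustration of the idea rather than the code behind `RunShell`; `run_shell` and its `timeout` parameter are hypothetical.

```python
import subprocess


def run_shell(command, timeout=30):
    """Run a shell command and return its combined stdout and stderr."""
    completed = subprocess.run(
        command,
        shell=True,           # let the system shell parse the command string
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        timeout=timeout,      # raise TimeoutExpired if the command hangs
    )
    return completed.stdout + completed.stderr
```

Because `shell=True` passes the string to the system shell unchanged, only run commands from a trusted source.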
Description: Summarize a long text into a short one
Location: Not available
Description: Get the current weather information for a given location
Location: weather.GetTodayWeather
Description: Get the weather information in the upcoming days for a given location
Location: weather.GetFutureWeather
Description: Get web content from a given url
Location: web_page.WebPage
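Getting web content usually means fetching HTML and reducing it to readable text. The sketch below covers only the second step, using the standard library's `html.parser`, and leaves out the download itself (which could use `urllib.request.urlopen`). It is an assumption about how such a tool might work, not the code behind `WebPage`.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)
```

Real pages often need more care (character encodings, navigation menus, content loaded by JavaScript), so treat this as a starting point.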
Description: Search for relevant results on Wikipedia
Location: wikipedia.Wikipedia
Description: A WolframAlpha engine for solving algebraic equations
Location: wolfram_alpha.WolframAlpha