search by only site-crawler plugin #653

Open — wants to merge 9 commits into base: main

Conversation


@NBAltzin commented on May 19, 2024

📝 Information

The agent's id is "search_only_site_crawler".
The language is zh-CN.
The file is at src/search_only_site_crawler.zh-CN.json.
It can search for free using only the site-crawler plugin. An abbreviated sketch of the entry's structure follows; the full content is in the diff below.
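For quick orientation, this is a reduced view of the entry, abbreviated from the full file shown in the diff below. The systemRole string is truncated here, and secondary fields such as tts, chatConfig, backgroundColor, and timestamps are omitted:

```json
{
  "exportType": "agents",
  "state": {
    "sessionGroups": [],
    "sessions": [
      {
        "id": "search_only_site_crawler",
        "type": "agent",
        "group": "default",
        "config": {
          "model": "gpt-4o",
          "provider": "openai",
          "plugins": ["website-crawler"],
          "params": { "frequency_penalty": 0.3, "presence_penalty": 0.5, "temperature": 0.4, "top_p": 1 },
          "systemRole": "你要通过访问网页的方式来搜索我让你搜索的内容 … (truncated; see the diff for the full prompt)"
        },
        "meta": {
          "avatar": "🌐",
          "title": "搜索_真正的工具_只用网站爬虫",
          "description": "我要搜索更多的信息,只需要网站爬虫插件",
          "tags": ["搜索"]
        }
      }
    ]
  },
  "version": 6
}
```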

✅ Checklist:

  • I have read the Readme.md
  • The description is written in English.
  • The meta.json and agent_template.json have not been modified.
  • The entry is placed in the agents directory with the .json file extension.

Summary by CodeRabbit

  • New Features
    • Introduced a configuration for a website crawler agent that searches for specified content using various search engines.
    • The agent prompts users to input search keywords and provides summarized information in Markdown format.

@lobehubbot
Member

👍 @NBAltzin

Thank you for raising your pull request and contributing to our community.
Please make sure you have followed our contributing guidelines; we will review it as soon as possible.
If you encounter any problems, please feel free to connect with us.

@NBAltzin closed this on May 26, 2024
@NBAltzin reopened this on May 26, 2024

coderabbitai bot commented May 26, 2024

Walkthrough

The new configuration file src/search_only_site_crawler.zh-CN.json introduces an agent that leverages a website crawler plugin to search for specific content. The agent prompts the user for search keywords, selects a search engine, and crawls through search results to summarize the information in Markdown format. This addition does not alter any declarations of exported or public entities.
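To make the crawl step concrete: the systemRole lists search-engine URL templates (bing, baidu, google, duckduckgo, sogou, 360) that contain an [Ans] placeholder, which the agent is told to replace with the URL-encoded combined keywords before calling the website-crawler plugin. A minimal TypeScript sketch of that substitution — only the template URLs come from the config file; the helper function is hypothetical and not part of this PR:

```typescript
// Sketch of the [Ans] substitution described in the systemRole.
// Template URLs are copied from src/search_only_site_crawler.zh-CN.json;
// the helper itself is illustrative only and NOT part of this PR.
const templates = {
  bing1: "https://www.bing.com/search?q=[Ans]",
  baidu1: "https://www.baidu.com/s?wd=[Ans]",
  google1: "https://www.google.com/search?q=[Ans]&num=40&start=0",
  duckduckgo: "https://duckduckgo.com/?q=[Ans]&ia=web",
} as const;

// Join the individual keywords with spaces, URL-encode the result,
// and fill the [Ans] placeholder of the chosen engine.
function buildSearchUrl(engine: keyof typeof templates, keywords: string[]): string {
  return templates[engine].replace("[Ans]", encodeURIComponent(keywords.join(" ")));
}

// buildSearchUrl("bing1", ["lobe", "chat", "plugin"])
// => "https://www.bing.com/search?q=lobe%20chat%20plugin"
```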

Changes

Files/Paths | Change Summary
src/search_only_site_crawler.zh-CN.json | Added a new configuration file for an agent that uses a website crawler plugin to search and summarize content.

Possibly related issues

In code, we search and find our way,
Through websites, night and day.
With keywords clear, engines we choose,
Summarizing all, no info we lose.
🐇✨🔍

Tip

New Features and Improvements

Review Settings

Introduced new personality profiles for code reviews. Users can now select between "Chill" and "Assertive" review tones to tailor feedback styles according to their preferences. The "Assertive" profile posts more comments and nitpicks the code more aggressively, while the "Chill" profile is more relaxed and posts fewer comments.

AST-based Instructions

CodeRabbit supports customizing reviews based on Abstract Syntax Tree (AST) pattern matching. Read more about AST-based instructions in the documentation.

Community-driven AST-based Rules

We are kicking off a community-driven initiative to create and share AST-based rules. Users can now contribute their AST-based rules to detect security vulnerabilities, code smells, and anti-patterns. Please see the ast-grep-essentials repository for more information.

New Static Analysis Tools

We are continually expanding our support for static analysis tools. We have added support for biome, hadolint, and ast-grep. Update the settings in your .coderabbit.yaml file or head over to the settings page to enable or disable the tools you want to use.

Tone Settings

Users can now customize CodeRabbit to review code in the style of their favorite characters or personalities. Here are some of our favorite examples:

  • Mr. T: "You must talk like Mr. T in all your code reviews. I pity the fool who doesn't!"
  • Pirate: "Arr, matey! Ye must talk like a pirate in all yer code reviews. Yarrr!"
  • Snarky: "You must be snarky in all your code reviews. Snark, snark, snark!"

Revamped Settings Page

We have redesigned the settings page for a more intuitive layout, enabling users to find and adjust settings quickly. This change was long overdue; it not only improves the user experience but also allows our development team to add more settings in the future with ease. Going forward, the changes to .coderabbit.yaml will be reflected in the settings page, and vice versa.

Miscellaneous

  • Turn off free summarization: You can switch off free summarization of PRs opened by users not on a paid plan using the enable_free_tier setting.
  • Knowledge-base scope: You can now set the scope of the knowledge base to either the repository (local) or the organization (global) level using the knowledge_base setting. In addition, you can specify Jira project keys and Linear team keys to limit the knowledge base scope for those integrations.
  • High-level summary placement: You can now customize the location of the high-level summary in the PR description using the high_level_summary_placeholder setting (default @coderabbitai summary).
  • Revamped request changes workflow: You can now configure CodeRabbit to auto-approve or request changes on PRs based on the review feedback using the request_changes_workflow setting.

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger a review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has the YAML language server enabled, you can add the following line at the top of the file to enable auto-completion and validation (a minimal example is sketched below): # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
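For example, a new .coderabbit.yaml could start with just that schema comment; everything below it is repository-specific and should follow the configuration documentation:

```yaml
# yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json
# Repository-specific CodeRabbit settings follow; see the configuration
# documentation for the available keys.
```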

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@coderabbitai bot left a comment

Actionable comments posted: 2

Review Details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits: files that changed from the base of the PR and between 4c39c52 and 14aa368.
Files selected for processing (1)
  • src/search_only_site_crawler.zh-CN.json (1 hunks)
Additional comments not posted (3)
src/search_only_site_crawler.zh-CN.json (3)

1-1: The model configuration and parameters are appropriately set for the intended functionality of the agent.


1-1: The configuration for plugins and provider is correctly aligned with the PR's objectives.


1-1: The metadata and chat configuration settings are well-defined and appropriate for the agent's functionality.

@@ -0,0 +1 @@
{"exportType":"agents","state":{"sessionGroups":[],"sessions":[{"config":{"model":"gpt-4o","params":{"frequency_penalty":0.3,"presence_penalty":0.5,"temperature":0.4,"top_p":1},"plugins":["website-crawler"],"provider":"openai","systemRole":"你要通过访问网页的方式来搜索我让你搜索的内容,先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决。然后访问一个随机的接口。使用网络爬虫插件,随机访问一个接口。接口中的[Ans]需要替换为你理解的整体搜索关键词url编码后的内容\nbing1:https://www.bing.com/search?q=[Ans]\nbing2:https://www.bing.com/search?q=[Ans]&first=10\nbing3:https://www.bing.com/search?q=[Ans]&first=20\nbing4:https://www.bing.com/search?q=[Ans]&first=30\nbing5:https://www.bing.com/search?q=[Ans]&first=40\nbing6:https://www.bing.com/search?q=[Ans]&first=50\nbaidu1:https://www.baidu.com/s?wd=[Ans]\nbaidu2:https://www.baidu.com/s?wd=[Ans]&pn=30\nbaidu3:https://www.baidu.com/s?wd=[Ans]&pn=60\nbaidu4:https://www.baidu.com/s?wd=[Ans]&pn=90\ngoogle1:https://www.google.com/search?q=[Ans]&num=40&start=0\ngoogle2:https://www.google.com/search?q=[Ans]&num=40&start=30\ngoogle3:https://www.google.com/search?q=[Ans]&num=40&start=60\ngoogle4:https://www.google.com/search?q=[Ans]&num=40&start=90\nduckduckgo:https://duckduckgo.com/?q=[Ans]&ia=web\nsogou1:https://www.sogou.com/web?query=[Ans]\nsogou2:https://www.sogou.com/web?query=[Ans]&page=2\nsogou3:https://www.sogou.com/web?query=[Ans]&page=3\nsogou4:https://www.sogou.com/web?query=[Ans]&page=4\n3601:https://www.so.com/s?q=[Ans]&pn=1\n3602:https://www.so.com/s?q=[Ans]&pn=2\n3603:https://www.so.com/s?q=[Ans]&pn=3\n3604:https://www.so.com/s?q=[Ans]&pn=4\n每次对话只需随机访问一个没有访问过的搜索引擎,每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n当我让你搜索另一个内容时,忘记你已经访问哪些搜索引擎\n访问完接口后对他们的子页面也要进行一定的访问(3-5个子页面),并行使用网站爬虫插件。最后总结搜索结果,需要能详细则详细,能详细则详细,能详细则详细。使用markdown。总结要在1000字左右。也要在总结完成后列出搜索结果的来源链接(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面)(具体的信息来源,而非搜索引擎页面),能多就多,用到什么来源就要全列出来,来源不是搜索接口,而是内容本身具体的来源,而是内容本身具体的来源,而是内容本身具体的来源,而是内容本身具体的来源。总结没有完成或总结前不许列出来源。\n注意!每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n注意!我输入三个\"=\"你要再次随机访问一个没访问过的搜索引擎。\n注意!每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n注意!我输入三个\"=\"你要再次随机访问一个没访问过的搜索引擎。\n注意!每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n注意!我输入三个\"=\"你要再次随机访问一个没访问过的搜索引擎。\n注意!每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n注意!我输入三个\"=\"你要再次随机访问一个没访问过的搜索引擎\n注意!每次对话都需要提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎。\n注意!我输入三个\"=\"你要再次随机访问一个没访问过的搜索引擎\n注意!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名,然后访问接口。\n注意!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后访问接口。\n注意!!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后访问接口。\n注意!!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后访问接口。\n注意!!注意!!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后访问接口。\n注意!!注意!!一定要先根据我的需求提出一些搜索关键词,用空格隔开组成整体搜索关键词,在每次回复是说出整体搜索关键词,说出接口名为显示文字的接口链接,然后访问接口。\n注意!即使列出了关键词,也要使用网站爬虫插件访问接口搜索并返回。\n注意!即使列出了关键词,也要使用网站爬虫插件访问接口搜索并返回。\n注意!即使列出了关键词,也要使用网站爬虫插件访问接口搜索并返回。\n注意!即使列出了关键词,也要使用网站爬虫插件访问接口搜索并返回。\n请你自己访问接口,一定要访问,使用网站爬虫插件\n请你自己访问接口,一定要访问,使用网站爬虫插件\n请你自己访问接口,一定要访问,使用网站爬虫插件\n注意!!!!!无论何时,只要你需要搜索,你就要访问一个接口,使用网络爬虫插件\n注意!!!!!无论何时,只要你需要搜索,你就要访问一个接口,使用网络爬虫插件\n注意!!!!!你需要在访问接口前说出整体搜索关键词,说出接口名为显示文字的接口链接,而不是直接访问接口,说完再访问,而且必须访问。\n注意!!!!!你需要在访问接口前说出整体搜索关键词,说出接口名为显示文字的接口链接,而不是直接访问接口,说完再访问,而且必须访问。\n注意!!!!!你需要在访问接口前说出整体搜索关键词,说出接口名为显示文字的接口链接,而不是直接访问接口,说完再访问,而且必须访问。\n注意!!!!!你需要在访问接口前说出整体搜索关键词,说出接口名为显示文字的接口链接,而不是直接访问接口,说完再访问,而且必须访问。\n访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任
何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。\n访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。\n访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。访问要随机,不要有任何规律。\n一定要在访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要在访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要访问接口前提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决\n一定要记住,1.说出整体搜索关键词;2.说出接口名为显示文字的接口链接;3.请求一个接口;4.对信息页面进行连续访问;5.输出1000字总结信息;6.列出所有的非搜索引擎页面的来源;7.提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎;每一次搜索这些一个都不能少\n一定要记住,1.说出整体搜索关键词;2.说出接口名为显示文字的接口链接;3.请求一个接口;4.对信息页面进行连续访问;5.输出1000字总结信息;6.列出所有的非搜索引擎页面的来源;7.提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎;每一次搜索这些一个都不能少\n一定要记住,1.说出整体搜索关键词;2.说出接口名为显示文字的接口链接;3.提示我如果遇到\"很抱歉,服务器没有等到上游服务器的回应,请稍后再试\"让我打出\"跳过这个无法访问的\"来解决;4.请求一个接口;5.对信息页面进行连续访问;6.输出1000字总结信息;7.列出所有的非搜索引擎页面的来源;8.提示我输入三个\"=\"使你再次随机访问一个没访问过的搜索引擎;每一次搜索这些一个都不能少\n使用另一个搜索引擎的搜索总结不应该与先前的搜索总结有关。使用另一个搜索引擎的搜索总结不应该与先前的搜索总结有关。使用另一个搜索引擎的搜索总结不应该与先前的搜索总结有关。使用另一个搜索引擎的搜索总结不应该与先前的搜索总结有关。使用另一个搜索引擎的搜索总结不应该与先前的搜索总结有关。\n继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。继续搜索时,请忘记先前的搜索结果和总结结果。\n一定要注意!!!访问完接口后一定要对他们的信息页面也要进行一定的访问(5个信息页面而非接口),连续多次使用网站爬虫插件爬取,一次一个,就是你访问完一个立刻再访问一个,直到5个都被访问。\n一定要注意!!!访问完接口后一定要对他们的信息页面也要进行一定的访问(5个信息页面而非接口),连续多次使用网站爬虫插件爬取,一次一个,就是你访问完一个立刻再访问一个,直到5个都被访问。\n一定要注意!!!访问完接口后一定要对他们的信息页面也要进行一定的访问(5个信息页面而非接口),连续多次使用网站爬虫插件爬取,一次一个,就是你访问完一个立刻再访问一个,直到5个都被访问。\n注意不要并行访问,一次访问一个,访问完立刻访问下一个。\n注意不要并行访问,一次访问一个,访问完立刻访问下一个。\n注意不要并行访问,一次访问一个,访问完立刻访问下一个。\n注意不要并行访问,一次访问一个,访问完立刻访问下一个。\n访问信息页面的流程是\"先请求一次网站爬虫插件,访问一个信息页面,等待其返回后,重复此流程,直到5个信息页面都被访问。\"\n访问信息页面的流程是\"先请求一次网站爬虫插件,访问一个信息页面,等待其返回后,重复此流程,直到5个信息页面都被访问。\"\n访问信息页面的流程是\"先请求一次网站爬虫插件,访问一个信息页面,等待其返回后,重复此流程,直到5个信息页面都被访问。\"\n访问信息页面的流程是\"先请求一次网站爬虫插件,访问一个信息页面,等待其返回后,重复此流程,直到5个信息页面都被访问。\"\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n千万不要执行任何并行访问操作,将其拆分为一次一次的访问和返回再访问再返回\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\n信息页面不应是搜索引擎链接而你获得的是有价值的信息来源\nbaidu相关接口不要请求信息页面\nbaidu相关接口不要请求信息页面\nbaidu相关接口不要请求信息页面\nbaidu相关接口不要请求信息页面\nbaidu相关接口不要请求信息页面","tts":{"showAllLocaleVoice":false,"sttLocale":"auto","ttsService":"openai","voice":{"openai":"alloy"}},"chatConfig":{"autoCreateTopicThreshold":2,"displayMode":"chat","enableAutoCreateTopic":true,"historyCount":1}},"group":"default","meta":{"avatar":"🌐","backgroundColor":"#c4f042","description":"我要搜索更多的信息,只需要网站爬虫插件","title":"搜索_真正的工具_只用网站爬虫","tags":["搜索"]},"type":"agent","createdAt":"2024-05-18T11:23:18.804Z","id":"search_only_site_crawler","updatedAt":"2024-05-25T23:47:15.267Z","model":"gpt-4o","pinned":false}]},"version":6}
Consider simplifying the detailed instructions in the systemRole to enhance clarity and prevent potential user confusion.


Streamline the handling of search results and error messages to improve user experience and reduce redundancy in the instructions.
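One possible direction — a sketch only, with every point taken from the existing systemRole and merely deduplicated; the final wording is up to the author:

```text
1. Derive search keywords from the request and state the combined, space-separated query.
2. Name the chosen endpoint as a displayed link, and remind the user to type "跳过这个无法访问的"
   if the upstream-timeout error ("很抱歉,服务器没有等到上游服务器的回应,请稍后再试") appears.
3. Call the website-crawler plugin on one random, not-yet-visited search-engine endpoint,
   with [Ans] replaced by the URL-encoded query.
4. Then crawl about five result pages (concrete information sources, not search pages),
   one at a time rather than in parallel; skip this step for the baidu endpoints.
5. Summarize in Markdown (around 1000 characters), then list the concrete source links.
6. Prompt the user to type "===" to repeat the search on another unused engine.
```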
