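# OSS - Subscription Agent

# 1. What is a subscription agent

When an event we care about occurs, the Agent fetches the relevant information, processes it, and sends the result to us through a notification channel such as email, WeChat, or Discord. We call this kind of Agent a subscription agent.

Here the Agent's Role is to serve you as a "news subscriber". Its work falls into two groups:
1. Actions that collect information from external sources and summarize what was collected;
2. Once those work, extra features built around the Agent: running on a schedule and sending the result to a notification channel.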

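# 2. Prerequisites
First, give our subscriber a name: "牛睿得" (News Reader).

You should be familiar with the following:
- Developing an ordinary agent: see the earlier tutorials
- Web crawling
  - Basic concepts:
    - HTML: the basic structure, tags, and common elements of HTML
    - CSS: the role CSS plays in an HTML page
    - Tools: locating page elements with the browser's developer tools
  - Python tools
    - aiohttp: basic usage of the aiohttp library, including how to send network requests with it
    - beautifulsoup: the BeautifulSoup HTML parsing library and how to extract information from HTML with it
- The MetaGPT subscription module
  - The concepts of Role, Trigger, and Callback
  - A Trigger is an asynchronous generator; see https://peps.python.org/pep-0525/. Knowing how to implement one is enough.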
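As a refresher on asynchronous generators (PEP 525), here is a minimal sketch. The names `ticker` and `collect` are made up for illustration and are not part of MetaGPT; a real trigger would wait for a schedule or an external event rather than a short fixed delay.

```python
import asyncio


async def ticker(count: int, delay: float = 0.01):
    """Hypothetical async generator: yield `count` events, pausing between them."""
    for i in range(count):
        await asyncio.sleep(delay)  # stand-in for "wait until the next event"
        yield f"event-{i}"


async def collect(count: int) -> list:
    # `async for` pulls items from the generator one at a time.
    return [event async for event in ticker(count)]


if __name__ == "__main__":
    print(asyncio.run(collect(3)))
```

The consumer never polls: it simply suspends inside `async for` until the generator yields the next event, which is the behaviour a Trigger relies on.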


Add a proxy to `key.yaml` so that aiohttp can reach the network through it:
```yaml
#### Global proxy for aiohttp
GLOBAL_PROXY: http://127.0.0.1:7890
```

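# 3. Implementing the OSS subscription agent
A subscription agent has three main elements: Role, Trigger, and Callback, i.e. the agent itself, the trigger, and the data callback. Let's break the work down:
1. Implement an OSSWatcher Role: OSS stands for open source software. The OSS agent follows and analyzes popular open source projects and pushes relevant information to us. We need to decide which web page it gets its information from.
2. Trigger: the condition under which the OSSWatcher role runs, either on a schedule or when a website is updated.
3. Callback: handles the information the OSSWatcher role produces; for example, we can send it to WeChat or Discord.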
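How the three elements fit together can be sketched in plain asyncio, without MetaGPT. Everything here (`demo_trigger`, `demo_role`, `run_subscription`, `main`) is a hypothetical stand-in: the real Role would crawl a page and call an LLM, and the real Callback would post to a chat channel.

```python
import asyncio
from typing import AsyncIterator


async def demo_trigger(urls: list) -> AsyncIterator[str]:
    """Hypothetical trigger: emits each URL once instead of waiting for a schedule."""
    for url in urls:
        await asyncio.sleep(0)
        yield url


async def demo_role(url: str) -> str:
    """Hypothetical role: a real one would crawl the page and summarise it."""
    return f"report for {url}"


async def run_subscription(trigger, role, callback) -> None:
    # Each trigger event runs the role once; the result is handed to the callback.
    async for event in trigger:
        await callback(await role(event))


async def main() -> list:
    delivered = []

    async def demo_callback(report: str) -> None:
        delivered.append(report)  # a real callback would send to WeChat/Discord

    await run_subscription(demo_trigger(["https://github.com/trending"]), demo_role, demo_callback)
    return delivered


if __name__ == "__main__":
    print(asyncio.run(main()))
```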

## 3.1 OSSWatcher Role
Taking GitHub Trending as an example:
- URL: https://github.com/trending
- Filters:
  - spoken language: en/zh/...
  - language: html/javascript/python/go/java/...
  - since: daily/weekly/monthly
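The filters above end up in the page URL. The sketch below assumes the layout the trending page appears to use (programming language as a path segment, `since` and `spoken_language_code` as query parameters); `build_trending_url` is a hypothetical helper, so check the result in a browser before relying on it.

```python
from urllib.parse import urlencode


def build_trending_url(language: str = "", since: str = "daily", spoken_language: str = "") -> str:
    """Build a GitHub Trending URL from the three filters (assumed URL layout)."""
    base = "https://github.com/trending"
    if language:
        base = f"{base}/{language}"  # e.g. /trending/python
    params = {"since": since}
    if spoken_language:
        params["spoken_language_code"] = spoken_language
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_trending_url("python", "weekly", "zh"))
```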

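## 3.2 Crawling GitHub Trending
We start with the crawling step. This tutorial crawls today's trending repositories without filtering by spoken language or programming language; if you have specific needs, crawl the page with the filters applied instead. Open https://github.com/trending, inspect the page, find the HTML element that holds the content we need (`<div data-hpc="">`), and save it locally as `github-trending-raw.html`. The code cells below apply the same workflow to article listing pages on nature.com, using a locally saved `nature.html`.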

### 3.2.1 Slimming down the HTML

In [None]:
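# Slimming code; test first with the HTML saved locally
from bs4 import BeautifulSoup

with open("nature.html", encoding="utf-8") as f:
    html = f.read()

soup = BeautifulSoup(html, "html.parser")

# Drop every attribute except "class", which the crawler uses as a selector
for tag in soup.find_all(True):
    for name in list(tag.attrs):
        if tag[name] and name not in ["class"]:
            del tag[name]

# Remove media elements that carry no text
for tag in soup.find_all(["svg", "img", "video", "audio"]):
    tag.decompose()

with open("nature-slim.html", "w", encoding="utf-8") as f:
    f.write(str(soup))

# Later cells refer to the parsed tree as `souphtml`
souphtml = soup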

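### 3.2.2 Building the crawler code
Ask an LLM to write the crawler. Prompt:

You are a crawler engineer proficient in Python. Use aiohttp to fetch the web page, then use BeautifulSoup to parse the following fields from the list: article title, summary, url, doi, publication date, and journal.
The target HTML has the following structure: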
```html
{slimmed HTML}
```

In [None]:
import asyncio
import aiohttp
from bs4 import BeautifulSoup

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def scrape_articles(url):
    async with aiohttp.ClientSession() as session:
        #html_content = await fetch(session, url)
        #soup = BeautifulSoup(html_content, 'html.parser')
        soup = souphtml

        articles_list = soup.find_all('li', class_='app-article-list-row__item')

        articles = []

        for article_item in articles_list:
            title_element = article_item.find('h3', class_='c-card__title')
            title = title_element.text.strip() if title_element else None

            summary_element = article_item.find('div', class_='c-card__summary')
            summary = summary_element.p.text.strip() if summary_element and summary_element.p else None

            url_element = title_element.find('a') if title_element else None
            url = url_element['href'].strip() if url_element and url_element.has_attr('href') else None

            # Since the provided HTML does not contain a DOI, url is used in its place.
            # In real-world applications, make sure to locate and extract the actual DOI.
            doi = url

            publish_date_element = article_item.find('time')
            publish_date = publish_date_element.text.strip() if publish_date_element else None

            article_type_element = article_item.find('span', class_='c-meta__type')
            article_type = article_type_element.text.strip() if article_type_element else None

            # The provided HTML does not contain a publish magazine field.
            # In real-world applications, locate and extract it if it exists.
            publish_magazine = None

            article_data = {
                'article_type': article_type,
                'title': title,
                'summary': summary,
                'url': url,
                'doi': doi,
                'publish_date': publish_date,
                'publish_magazine': publish_magazine
            }

            articles.append(article_data)

        return articles

async def main():
    target_url = 'your-target-url'  # Replace with the actual URL you want to scrape
    articles = await scrape_articles(target_url)
    for article in articles:
        print(article)

# Run the main function in the asyncio event loop
# asyncio.run(main())

# The URL argument is ignored here because scrape_articles reads the local `souphtml` tree
articles = await scrape_articles("pseudo-html")


In [None]:
articles[0]

### 3.2.3 Building an Action from the generated code

In [10]:
from datetime import datetime, timedelta

import aiohttp
from bs4 import BeautifulSoup
from metagpt.actions.action import Action
from metagpt.config import CONFIG

async def fetch(session, url,proxy_url):
    async with session.get(url,proxy=proxy_url) as response:
        return await response.text()

class CrawlNatureArticles(Action):
    async def run(self, url: str = "https://www.nature.com/nature/reviews-and-analysis", timeframe: str = "daily",journal_name: str="None"):
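        # Start of today (midnight), so that articles published today are included
        today = datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)
        if timeframe == 'weekly':
            # Set start_date to 7 days before today
            start_date = today - timedelta(days=7)
        else:
            # 'daily' and any unrecognised value fall back to today only
            start_date = today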
            
        
        async with aiohttp.ClientSession() as session:
            html_content = await fetch(session, url, "http://127.0.0.1:7897")
            soup = BeautifulSoup(html_content, 'html.parser')
            #soup = souphtml
            articles_list = soup.find_all('li', class_='app-article-list-row__item')

            articles = []

            for article_item in articles_list:
                title_element = article_item.find('h3', class_='c-card__title')
                title = title_element.text.strip() if title_element else None

                summary_element = article_item.find('div', class_='c-card__summary')
                summary = summary_element.p.text.strip() if summary_element and summary_element.p else None

                url_element = title_element.find('a') if title_element else None
                url = url_element['href'].strip() if url_element and url_element.has_attr('href') else None

                # Since the provided HTML does not contain a DOI, url is used in its place.
                # In real-world applications, make sure to locate and extract the actual DOI.
                #doi = url

                publish_date_element = article_item.find('time')
                publish_date = publish_date_element.text.strip() if publish_date_element else None

                article_type_element = article_item.find('span', class_='c-meta__type')
                article_type = article_type_element.text.strip() if article_type_element else None

                # The provided HTML does not contain a publish magazine field.
                # In real-world applications, locate and extract it if it exists.
                publish_magazine = journal_name

                article_data = {
                    'article_type': article_type,
                    'title': title,
                    'summary': summary,
                    'url': f"https://www.nature.com{url}",
                    'publish_date': publish_date,
                    'publish_magazine': publish_magazine
                }
                check_date = False
                try:
                    article_date = datetime.strptime(publish_date, '%d %b %Y')
                    check_date = start_date <= article_date
                except (TypeError, ValueError):
                    # A missing or unparsable date is treated as not matching
                    check_date = False

                # Skip news items; article_type may be None when the tag is absent
                if check_date and "News" not in (article_type or ""):
                    articles.append(article_data)

            return articles  
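The date filter inside the Action can be checked on its own. The sketch below isolates it as a hypothetical helper `is_recent`, assuming dates in the `17 Jan 2024` format seen in the crawled output and treating `daily` as "published today" and `weekly` as "published within the last 7 days".

```python
from datetime import datetime, timedelta
from typing import Optional


def is_recent(publish_date: Optional[str], timeframe: str, today: datetime) -> bool:
    """Return True if `publish_date` (e.g. '17 Jan 2024') falls inside the window."""
    days_back = {"daily": 0, "weekly": 7}.get(timeframe)
    if publish_date is None or days_back is None:
        return False
    # Compare against midnight so an article published today still counts.
    start = today.replace(hour=0, minute=0, second=0, microsecond=0) - timedelta(days=days_back)
    try:
        return datetime.strptime(publish_date, "%d %b %Y") >= start
    except ValueError:
        return False


if __name__ == "__main__":
    now = datetime(2024, 1, 22, 21, 13)
    print(is_recent("17 Jan 2024", "weekly", now))
```

Passing `today` in as a parameter keeps the function deterministic, which makes the window easy to test without waiting for real dates to roll over.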
        



In [12]:
testAct = CrawlNatureArticles()

resp_na = await testAct.run('https://www.nature.com/nature/reviews-and-analysis','weekly',"Nature")
resp_nbt = await testAct.run('https://www.nature.com/nbt/research-articles','weekly',"Nature biotechnology")
resp_nc = await testAct.run('https://www.nature.com/ncomms/research-articles','weekly',"Nature Communications")
resp_nmicro = await testAct.run('https://www.nature.com/nmicrobiol/research-articles','weekly',"Nature Microbiology")
resp_nmethod = await testAct.run('https://www.nature.com/nmeth/research-articles','weekly',"Nature Methods")


2024-01-22 21:13:28.020 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 21:13:28.022 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


In [None]:
print(resp_na)
print(resp_nbt)

In [13]:
import json
resps = resp_na + resp_nbt + resp_nc + resp_nmicro + resp_nmethod
resps_json = json.dumps(resps,indent=4)
# Extract only "summary" and "title" from each article
extracted_info = [{'title': article['title'], 'summary': article['summary']} for article in resps]

print(resps_json)


[
    {
        "article_type": "Research Briefing",
        "title": "Predator die-off reshapes ecosystems in expected and unexpected ways",
        "summary": "Mass-mortality events of predators are becoming more common, but their precise effects on food webs remain unclear. Experimentally induced predator die-offs led both to reduced predation and to fertilization from the bottom up. Together, these effects stabilized food webs.",
        "url": "https://www.nature.com/articles/d41586-023-04117-9",
        "publish_date": "17 Jan 2024",
        "publish_magazine": "Nature"
    },
    {
        "article_type": "Research Briefing",
        "title": "Greenland\u2019s glaciers are retreating everywhere and all at once",
        "summary": "A comprehensive analysis of satellite data finds that the Greenland ice sheet has lost more ice in the past four decades than previously thought. Moreover, the glaciers that are the most sensitive to seasonal temperature swings will probably retreat t

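### 3.2.4 Summarizing the content
1. The overall trend of today's list, e.g. which programming languages are popular, which projects are the hottest, and which areas they concentrate in
2. A categorization of the repositories on the list
3. Which repositories are worth following further, and why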

In [46]:
from typing import Any
from metagpt.actions.action import Action

TRENDING_ANALYSIS_PROMPT = """# 任务要求
你是一位文献调研员. 你被要求从如下提供的「文章内容汇总」中,基于各大杂志擅长领域和其近期发表的文献,筛选与"宏基因组学、环境微生物、测序技术、生物信息学"有关的内容,汇总生成一篇报告,向用户提供其中的亮点和个性化推荐. 

内容风格请参考以下大纲:
# 「本周顶刊」 标题 (取一个生动的标题、突出亮点)
## 热点领域：汇总热点研究！探索研究热点领域，并发现吸引学者注意的关键领域。从**到**，以前所未有的方式见证顶级研究。
## 列表亮点：聚焦文献标题, 为用户提供独特且引人注目的内容。


报告内容请严格按照以下格式生成:

# 「本周顶刊」 T2T组装新算法取得重大进展
## 热点领域
1. 代谢通路新突破
    - **magazine** | published date
      [title1](url)
      summary ...
    - **magazine** | published date
      [title1](url)
      summary ...
...
## 热点文章
1. **magazine** | published date
      [title1](url)
      摘要: summary ...
      点评: 提供推荐此项目的具体原因
2. **magazine** | published date
      [title1](url)
      摘要: summary ...
      点评: 提供推荐此项目的具体原因
3. ...

# 总结:
概括主要内容,形成结论.

资料来源: 
 - [list the main domains from provided urls]
 
小编: *请给出你的大模型名称(版本号)*
主编: *CIAO*


附:
「文章内容汇总」:
{articles}
"""

class AnalysisOSSTrending(Action):

    async def run(
        self,
        articles: Any
    ):
        return await self._aask(TRENDING_ANALYSIS_PROMPT.format(articles=articles))

In [47]:
testAnalysis = AnalysisOSSTrending()

resp2 = await testAnalysis.run(resps_json)


2024-01-22 21:41:04.132 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 21:41:04.133 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


# 「本周顶刊」 探索微生物宇宙：宏基因组学与环境微生物的最新发现
## 热点领域
1. 微生物宏基因组学的新视角
    - **Nature Microbiology** | 17 Jan 2024
      [High-throughput transcriptomics of 409 bacteria–drug pairs reveals drivers of gut microbiota perturbation](https://www.nature.com/articles/s41564-023-01581-x)
      高通量转录组学揭示了药物与微生物相互作用的机制，为理解药物如何扰动肠道微生物群提供了新的视角。

2. 环境微生物的多样性与功能
    - **Nature Microbiology** | 16 Jan 2024
      [Methane-dependent complete denitrification by a single Methylomirabilis bacterium](https://www.nature.com/articles/s41564-023-01578-6)
      一种单细胞的甲烷氧化亚硝酸盐还原菌展示了其在生物反应器中的全面脱氮能力，这一过程之前被认为需要共生作用。

## 热点文章
1. **Nature Microbiology** | 17 Jan 2024
   [Double-stranded RNA sequencing reveals distinct riboviruses associated with thermoacidophilic bacteria from hot springs in Japan](https://www.nature.com/articles/s41564-023-01579-5)
   摘要: 通过双链RNA测序技术，发现与日本温泉中的嗜热嗜酸细菌相关的独特核糖病毒，这些病毒形成了RNA病毒分支中的不同分支。
   点评: 这项研究为理解极端环境中微生物与病毒的相互作用提供了新的视角，对环境微生物学和宏基因组学的研究者具有高度吸引力。

2. **Nature Microbiology** | 17 Jan 2024
   [Gut 

2024-01-22 21:42:04.547 | INFO     | metagpt.utils.cost_manager:update_cost:48 - Total running cost: $0.402 | Max budget: $10.000 | Current cost: $0.101, prompt_tokens: 6015, completion_tokens: 1350


IAO*


In [45]:
resp2

'```\n# 「本周顶刊」 探索微生物宏基因组学的新视野\n## 热点领域\n1. 微生物宏基因组学的创新应用\n    - **Nature Microbiology** | 17 Jan 2024\n      [High-throughput transcriptomics of 409 bacteria–drug pairs reveals drivers of gut microbiota perturbation](https://www.nature.com/articles/s41564-023-01581-x)\n      High-throughput bacterial transcriptomics provides mechanistic insights into how various drugs interact with and impact the gut microbiota, leading to shifts in microbial populations.\n    - **Nature Microbiology** | 16 Jan 2024\n      [Methane-dependent complete denitrification by a single Methylomirabilis bacterium](https://www.nature.com/articles/s41564-023-01578-6)\n      A groundbreaking discovery of a single bacterium capable of methane oxidation coupled to nitrate reduction, challenging the previously held belief that this process required syntrophic interactions.\n\n2. 环境微生物学的新发现\n    - **Nature Microbiology** | 17 Jan 2024\n      [Gut commensal Christensenella minuta modulates host metabolism via acylated 

In [16]:
from metagpt.llm import LLM
from metagpt.provider.base_llm import BaseLLM
from metagpt.config import CONFIG, LLMProviderEnum

testAnalysis = AnalysisOSSTrending()
testAnalysis.llm = LLM(LLMProviderEnum.ZHIPUAI)

resp3 = await testAnalysis.run(resps_json)

2024-01-22 21:19:48.688 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 21:19:48.692 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


# 「本周顶刊」 宏基因组学、环境微生物、测序技术、生物信息学领域研究亮点

## 热点领域

1. **环境微生物**：微生物在环境中的角色越来越受到重视，近期研究发现，一些微生物在环境变化中起到了关键作用，如温室气体排放、污染物降解等。

2. **测序技术**：测序技术的发展使得对微生物群落的研究更加深入，近期的一些研究利用测序技术揭示了新的微生物种类和功能。

3. **生物信息学**：生物信息学在微生物研究中的应用越来越广泛，一些研究利用生物信息学方法预测微生物的功能，为实验研究提供了方向。

## 热点文章

1. **Nature Microbiology** | 18 Jan 2024
   [Yersinia entomophaga Tc toxin is released by T10SS-dependent lysis of specialized cell subpopulations](https://www.nature.com/articles/s41564-023-01571-z)
   摘要: Yersinia entomophaga通过一种特殊的细胞亚群释放Tc毒素，这一过程依赖于T10SS。
   点评: 这项研究揭示了Yersinia entomophaga的一种新的毒素释放机制，对于理解这种病原菌的致病机制具有重要意义。

2. **Nature Microbiology** | 17 Jan 2024
   [Gut commensal Christensenella minuta modulates host metabolism via acylated secondary bile acids](https://www.nature.com/articles/s41564-023-01570-0)
   摘要: 研究发现肠道共生菌Christensenella minuta可以通过产生一种新的次级胆汁酸来调节宿主代谢。
   点评: 这项研究揭示了Christensenella minuta的一种新的代谢调节机制，对于理解肠道微生物与宿主代谢的关系具有重要意义。

3. **Nature Communications** | 22 Jan 2024
   [Using big sequencing data to i

2024-01-22 21:20:24.373 | INFO     | metagpt.utils.cost_manager:update_cost:48 - Total running cost: $0.202 | Max budget: $10.000 | Current cost: $0.008, prompt_tokens: 5945, completion_tokens: 556


biology
- Nature Communications

主编: CIAO


In [44]:
print(resp3)

# 「本周顶刊」 宏基因组学、环境微生物、测序技术、生物信息学领域研究亮点

## 热点领域

1. **环境微生物**：微生物在环境中的角色越来越受到重视，近期研究发现，一些微生物在环境变化中起到了关键作用，如温室气体排放、污染物降解等。

2. **测序技术**：测序技术的发展使得对微生物群落的研究更加深入，近期的一些研究利用测序技术揭示了新的微生物种类和功能。

3. **生物信息学**：生物信息学在微生物研究中的应用越来越广泛，一些研究利用生物信息学方法预测微生物的功能，为实验研究提供了方向。

## 热点文章

1. **Nature Microbiology** | 18 Jan 2024
   [Yersinia entomophaga Tc toxin is released by T10SS-dependent lysis of specialized cell subpopulations](https://www.nature.com/articles/s41564-023-01571-z)
   摘要: Yersinia entomophaga通过一种特殊的细胞亚群释放Tc毒素，这一过程依赖于T10SS。
   点评: 这项研究揭示了Yersinia entomophaga的一种新的毒素释放机制，对于理解这种病原菌的致病机制具有重要意义。

2. **Nature Microbiology** | 17 Jan 2024
   [Gut commensal Christensenella minuta modulates host metabolism via acylated secondary bile acids](https://www.nature.com/articles/s41564-023-01570-0)
   摘要: 研究发现肠道共生菌Christensenella minuta可以通过产生一种新的次级胆汁酸来调节宿主代谢。
   点评: 这项研究揭示了Christensenella minuta的一种新的代谢调节机制，对于理解肠道微生物与宿主代谢的关系具有重要意义。

3. **Nature Communications** | 22 Jan 2024
   [Using big sequencing data to i

In [19]:
from metagpt.config import CONFIG
repo = CONFIG.get("HEXO_LOCAL_DIR")
print(repo) 

#repo = '$HOME/repo/hexo-blog'

/Users/ciao/repo/hexo-blog


In [39]:
from metagpt.logs import logger
from metagpt.config import CONFIG
from datetime import date
import subprocess

class pushOSS_to_hexo(Action):

    name: str = "pushOSS_to_hexo"

    async def run(self, article: str):
        # Local path of the Hexo blog repository, read from the MetaGPT config
        repo = CONFIG._get("HEXO_LOCAL_DIR")

        # Today's date, formatted like 2024-01-22
        today = date.today().strftime("%Y-%m-%d")
        # Run the shell command: hexo new post 'Top Paper weekly'
        command = f"cd {repo} && hexo new post 'Top Paper weekly'"
        logger.info(f"bash: {command}")
        subprocess.run(command, shell=True, cwd=repo)

        # Append {article} to source/_posts/<today>-Top-Paper-weekly.md
        # (assumes the blog's new_post_name setting produces date-prefixed file names)
        with open(f"{repo}/source/_posts/{today}-Top-Paper-weekly.md", "a", encoding="utf-8") as f:
            f.write(article)
        logger.info("hexo blog written!")
        
        # Automatically deploy to remote (NOT RECOMMENDED)
        #subprocess.run('hexo d', shell=True, cwd=repo)
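The report returned by the LLM in `resp2` starts with a code fence, so writing it to the post unchanged would render the whole report as a code block. A minimal sketch of a clean-up step that could run before `f.write(article)`; `strip_code_fence` is a hypothetical helper, not part of MetaGPT or Hexo.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def strip_code_fence(text: str) -> str:
    """Remove one wrapping code fence (with an optional language tag) if present."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]  # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence line, if the output has one
    return "\n".join(lines).strip()


if __name__ == "__main__":
    wrapped = FENCE + "\n# Title\nbody\n" + FENCE
    print(strip_code_fence(wrapped))
```

Text without a wrapping fence is returned unchanged apart from surrounding whitespace, so the helper is safe to apply to every report.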

        


In [42]:
testPush = pushOSS_to_hexo()

resp4 = await testPush.run(resp2)

print(resp2)

2024-01-22 21:39:02.768 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 21:39:02.769 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI
2024-01-22 21:39:02.786 | INFO     | __main__:run:17 - bash: cd /Users/ciao/repo/hexo-blog && hexo new post 'Top Paper weekly'


INFO  Validating config


2024-01-22 21:39:03.626 | INFO     | __main__:run:23 - hexo blog written!


INFO  Created: ~/repo/hexo-blog/source/_posts/2024-01-22-Top-Paper-weekly-1.md
```
# 「本周顶刊」 探索微生物宏基因组学的新视野
## 热点领域
1. 微生物宏基因组学的创新应用
    - **Nature Microbiology** | 17 Jan 2024
      [High-throughput transcriptomics of 409 bacteria–drug pairs reveals drivers of gut microbiota perturbation](https://www.nature.com/articles/s41564-023-01581-x)
      High-throughput bacterial transcriptomics provides mechanistic insights into how various drugs interact with and impact the gut microbiota, leading to shifts in microbial populations.
    - **Nature Microbiology** | 16 Jan 2024
      [Methane-dependent complete denitrification by a single Methylomirabilis bacterium](https://www.nature.com/articles/s41564-023-01578-6)
      A groundbreaking discovery of a single bacterium capable of methane oxidation coupled to nitrate reduction, challenging the previously held belief that this process required syntrophic interactions.

2. 环境微生物学的新发现
    - **Nature Microbiology** | 17 Jan 202

## 3.3 Implementing the OSSWatcher Role
In MetaGPT, the Actions created above can be saved under the corresponding directory, e.g. `metagpt/actions/oss_nature_recent.py`.
Then create the Role file `metagpt/roles/oss_watcher.py`.

In [None]:
from pydantic import ConfigDict, Field, model_validator
from metagpt.llm import LLM
from metagpt.provider.base_llm import BaseLLM

md = Field(default_factory=LLM, exclude=True)
print(LLM)

In [1]:
from metagpt.schema import Message
from bigpt.actions.oss_nature_recent import CrawlNatureArticles, AnalysisOSSTrending
from metagpt.roles.role import Role, RoleReactMode
from pydantic import BaseModel, Field
from metagpt.logs import logger

class OssWatcher(Role):
    name: str ="Codey"
    profile: str ="OssWatcher"
    #goal: str="Generate an insightful GitHub Trending analysis report."
    #constraints: str="Only analyze based on the provided GitHub Trending data."
    def __init__(self,**kwargs):
        super().__init__(**kwargs)
        self._init_actions([CrawlNatureArticles, AnalysisOSSTrending])
        #self._init_actions([SimpleWriteCode, SimpleRunCode])
        #self._set_react_mode(react_mode="by_order")
        self._set_react_mode(react_mode=RoleReactMode.BY_ORDER.value)

    async def _act(self) -> Message:
        logger.info(f"{self._setting}: ready to {self.rc.todo}")
        # Actions are chosen in order under the hood:
        # todo will first be CrawlNatureArticles(), then AnalysisOSSTrending()
        todo = self.rc.todo

        msg = self.get_memories(k=1)[0] # find the most k recent messages
        result = await todo.run(msg.content)

        msg = Message(content=str(result), role=self.profile, cause_by=type(todo))
        self.rc.memory.add(msg)
        return msg

2024-01-22 13:54:46.366 | INFO     | metagpt.const:get_metagpt_package_root:32 - Package root set to /Users/ciao/repo/MetaGPT
2024-01-22 13:54:46.490 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 13:54:46.491 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


In [2]:
from metagpt.logs import logger
from metagpt.config import CONFIG, LLMProviderEnum
from metagpt.llm import LLM

# test
#model_config = ConfigDict(arbitrary_types_allowed=True, exclude=["llm"])

#from metagpt.roles.oss_watcher import OssWatcher
#setWatcher = OssWatcher()
setWatcher = OssWatcher()
#setWatcher.llm = LLM(LLMProviderEnum.ZHIPUAI)
print(setWatcher.llm)
resp = await setWatcher.run("demo")
#resp = asyncio.run(setWatcher.run())

2024-01-22 13:54:49.877 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 13:54:49.878 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


2024-01-22 13:54:49.904 | INFO     | __main__:_act:20 - Codey(OssWatcher): ready to CrawlNatureArticles
2024-01-22 13:54:49.904 | INFO     | bigpt.actions.oss_nature_recent:run:85 - [CrawlNatureArticles]: ready to fetch from nature


<metagpt.provider.openai_api.OpenAILLM object at 0x137f10070>


2024-01-22 13:54:56.488 | INFO     | bigpt.actions.oss_nature_recent:run:87 - [CrawlNatureArticles]: ready to fetch from Nature biotechnology
2024-01-22 13:55:02.643 | INFO     | bigpt.actions.oss_nature_recent:run:89 - [CrawlNatureArticles]: ready to fetch from Nature Communications
2024-01-22 13:55:08.525 | INFO     | bigpt.actions.oss_nature_recent:run:91 - [CrawlNatureArticles]: ready to fetch from Nature Microbiology
2024-01-22 13:55:14.438 | INFO     | bigpt.actions.oss_nature_recent:run:93 - [CrawlNatureArticles]: ready to fetch from Nature Methods
2024-01-22 13:55:21.475 | INFO     | bigpt.actions.oss_nature_recent:run:95 - [CrawlNatureArticles]: all fetched
2024-01-22 13:55:21.476 | INFO     | __main__:_act:20 - Codey(OssWatcher): ready to AnalysisOSSTrending


```
# 「本周顶刊」 宏基因组学与环境微生物的新时代
## 热点领域
1. 微生物组学的新发现
    - [Nature Microbiology]|[Yersinia entomophaga Tc toxin is released by T10SS-dependent lysis of specialized cell subpopulations](https://www.nature.com/articles/s41564-023-01571-z)：这项研究揭示了Yersinia entomophaga细菌如何通过特殊的细胞亚群释放Tc毒素，这对理解细菌如何与宿主互动具有重要意义。
    - [Nature Microbiology]|[Double-stranded RNA sequencing reveals distinct riboviruses associated with thermoacidophilic bacteria from hot springs in Japan](https://www.nature.com/articles/s41564-023-01579-5)：通过双链RNA测序技术，研究人员发现了与日本温泉中的嗜热酸性细菌相关的独特的RNA病毒，这有助于我们理解极端环境中的病毒多样性。
    - [Nature Microbiology]|[High-throughput transcriptomics of 409 bacteria–drug pairs reveals drivers of gut microbiota perturbation](https://www.nature.com/articles/s41564-023-01581-x)：通过高通量转录组学分析409对细菌-药物相互作用，揭示了药物如何影响肠道微生物群的组成，这对于理解和治疗肠道相关疾病至关重要。

2. 测序技术与生物信息学的进展
    - [Nature Methods]|[Massively parallel single-cell sequencing of diverse microbial populations](https://www.nature.com/articles/s41592-023-02157-7)：

2024-01-22 13:56:57.394 | INFO     | metagpt.utils.cost_manager:update_cost:48 - Total running cost: $0.092 | Max budget: $10.000 | Current cost: $0.092, prompt_tokens: 5327, completion_tokens: 1283


In [None]:
resp

## Trigger implementation


In [None]:
# A simple trigger
import asyncio
import time

from datetime import datetime, timedelta
from metagpt.schema import Message
from pydantic import BaseModel, Field


class OssInfo(BaseModel):
    url: str
    timestamp: float = Field(default_factory=time.time)


async def oss_trigger(hour: int, minute: int, second: int = 0, url: str = "https://github.com/trending"):
    while True:
        now = datetime.now()
        next_time = datetime(now.year, now.month, now.day, hour, minute, second)
        if next_time < now:
            next_time = next_time + timedelta(1)
        wait = next_time - now
        print(wait.total_seconds())
        await asyncio.sleep(wait.total_seconds())
        yield Message(content=url, instruct_content=OssInfo(url=url))
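The waiting time computed inside `oss_trigger` can be checked separately from the sleep. The sketch below pulls that arithmetic into a hypothetical helper `seconds_until` that takes `now` as a parameter, mirroring the rule used above: if today's target time has already passed, wait for tomorrow's.

```python
from datetime import datetime, timedelta


def seconds_until(now: datetime, hour: int, minute: int, second: int = 0) -> float:
    """Seconds from `now` until the next occurrence of hour:minute:second."""
    next_time = datetime(now.year, now.month, now.day, hour, minute, second)
    if next_time < now:
        # Today's slot is already over, so schedule for the same time tomorrow.
        next_time = next_time + timedelta(days=1)
    return (next_time - now).total_seconds()


if __name__ == "__main__":
    print(seconds_until(datetime(2024, 1, 22, 9, 0, 0), 10, 0))
```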

In [1]:
import metagpt
from bigpt.actions.oss_nature_recent import CrawlNatureArticles

testAct = CrawlNatureArticles()

await testAct.run()

2024-01-22 17:37:33.589 | INFO     | metagpt.const:get_metagpt_package_root:32 - Package root set to /Users/ciao/repo/MetaGPT


2024-01-22 17:37:33.763 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 17:37:33.764 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI
2024-01-22 17:37:34.532 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 17:37:34.536 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI
2024-01-22 17:37:34.567 | INFO     | bigpt.actions.oss_nature_recent:run:86 - [CrawlNatureArticles]: ready to fetch from nature
2024-01-22 17:37:40.383 | INFO     | bigpt.actions.oss_nature_recent:run:88 - [CrawlNatureArticles]: ready to fetch from Nature biotechnology
2024-01-22 17:37:46.254 | INFO     | bigpt.actions.oss_nature_recent:run:90 - [CrawlNatureArticles]: ready to fetch from Nature Communications
2024-01-22 17:37:53.009 | INFO     | bigpt.actions.oss_nature_recent:run:92 - [Cr

'[\n    {\n        "article_type": "Research Briefing",\n        "title": "Predator die-off reshapes ecosystems in expected and unexpected ways",\n        "summary": "Mass-mortality events of predators are becoming more common, but their precise effects on food webs remain unclear. Experimentally induced predator die-offs led both to reduced predation and to fertilization from the bottom up. Together, these effects stabilized food webs.",\n        "url": "https://www.nature.com/articles/d41586-023-04117-9",\n        "publish_date": "17 Jan 2024",\n        "publish_magazine": "Nature"\n    },\n    {\n        "article_type": "Research Briefing",\n        "title": "Greenland\\u2019s glaciers are retreating everywhere and all at once",\n        "summary": "A comprehensive analysis of satellite data finds that the Greenland ice sheet has lost more ice in the past four decades than previously thought. Moreover, the glaciers that are the most sensitive to seasonal temperature swings will prob

In [14]:
import metagpt

from bigpt.roles.oss_academic import OssWatcher


testRole = OssWatcher()

# Does not work: nothing runs when run() is called without a message
await testRole.run()

2024-01-22 21:15:06.168 | INFO     | metagpt.config:get_default_llm_provider_enum:124 - LLMProviderEnum.OPENAI Model: gpt-4-1106-preview
2024-01-22 21:15:06.170 | INFO     | metagpt.config:get_default_llm_provider_enum:126 - API: LLMProviderEnum.OPENAI


In [15]:
# You have to say something, otherwise nothing is executed
await testRole.run("Hi")

2024-01-22 21:15:14.117 | INFO     | bigpt.roles.oss_academic:_act:21 - Codey(OssWatcher): ready to CrawlNatureArticles
2024-01-22 21:15:14.118 | INFO     | bigpt.actions.oss_nature_recent:run:86 - [CrawlNatureArticles]: ready to fetch from nature
2024-01-22 21:15:25.790 | INFO     | bigpt.actions.oss_nature_recent:run:88 - [CrawlNatureArticles]: ready to fetch from Nature biotechnology
2024-01-22 21:15:31.730 | INFO     | bigpt.actions.oss_nature_recent:run:90 - [CrawlNatureArticles]: ready to fetch from Nature Communications
2024-01-22 21:15:37.784 | INFO     | bigpt.actions.oss_nature_recent:run:92 - [CrawlNatureArticles]: ready to fetch from Nature Microbiology
2024-01-22 21:15:44.272 | INFO     | bigpt.actions.oss_nature_recent:run:94 - [CrawlNatureArticles]: ready to fetch from Nature Methods
2024-01-22 21:15:50.556 | INFO     | bigpt.actions.oss_nature_recent:run:96 - [CrawlNatureArticles]: all fetched
2024-01-22 21:15:50.556 | INFO     | bigpt.roles.oss_academic:_act:21 - Codey

# 「本周顶刊」 探索微观世界：宏基因组学与环境微生物的新篇章
## 热点领域
1. 微生物群落的新发现
    - **Nature Microbiology** | 17 Jan 2024
      [Double-stranded RNA sequencing reveals distinct riboviruses associated with thermoacidophilic bacteria from hot springs in Japan](https://www.nature.com/articles/s41564-023-01579-5)
      [在日本温泉中发现与嗜热酸性细菌相关的独特双链RNA病毒。]
    - **Nature Microbiology** | 17 Jan 2024
      [High-throughput transcriptomics of 409 bacteria–drug pairs reveals drivers of gut microbiota perturbation](https://www.nature.com/articles/s41564-023-01581-x)
      [通过对409对细菌-药物组合的高通量转录组学分析，揭示了肠道微生物群落扰动的驱动因素。]
2. 宏基因组学的进展
    - **Nature Methods** | 17 Jan 2024
      [Massively parallel single-cell sequencing of diverse microbial populations](https://www.nature.com/articles/s41592-023-02157-7)
      [DoTA-seq利用微流控滴系统隔离和裂解多样化的微生物，并扩增目标遗传位点，实现微生物群落的高通量单细胞测序。]

## 热点文章
1. **Nature Microbiology** | 17 Jan 2024
   [Gut commensal Christensenella minuta modulates host metabolism via acylated secondary bile acids](https://www.

2024-01-22 21:17:33.243 | INFO     | metagpt.utils.cost_manager:update_cost:48 - Total running cost: $0.194 | Max budget: $10.000 | Current cost: $0.096, prompt_tokens: 6255, completion_tokens: 1111
2024-01-22 21:17:33.246 | INFO     | bigpt.roles.oss_academic:_act:21 - Codey(OssWatcher): ready to pushOSS_to_hexo
/bin/sh: line 0: cd: {repo}: No such file or directory
2024-01-22 21:17:33.270 | INFO     | bigpt.actions.oss_nature_recent:run:176 - hexo blog written!


OssWatcher: None

In [7]:
from datetime import date, datetime, timedelta
timeframe = "weekly"
today = date.today()
start_date = None
if timeframe == 'daily':
    start_date = today
elif timeframe == 'weekly':
    # Set start_date to 7 days before today
    start_date = today - timedelta(days=7)

In [None]:
# A trigger managed by aiocron

import time
from aiocron import crontab
from typing import Optional
from pytz import BaseTzInfo
from pydantic import BaseModel, Field
from metagpt.schema import Message


class OssInfo(BaseModel):
    url: str
    timestamp: float = Field(default_factory=time.time)


class GithubTrendingCronTrigger():

    def __init__(self, spec: str, tz: Optional[BaseTzInfo] = None, url: str = "https://github.com/trending") -> None:
        self.crontab = crontab(spec, tz=tz)
        self.url = url

    def __aiter__(self):
        return self

    async def __anext__(self):
        await self.crontab.next()
        return Message(content=self.url, instruct_content=OssInfo(url=self.url))


In [None]:
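import pytz

# Create a GithubTrendingCronTrigger instance that fires every day at 10:00 AM UTC.
# The timezone is passed explicitly so the schedule does not depend on the machine's local time.
cron_trigger = GithubTrendingCronTrigger("0 10 * * *", tz=pytz.utc)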

Note that these tasks stop running once the process exits, so the process hosting the trigger has to be kept running.

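## 3.4 Callback design

Discord is an overseas service, and WeChat has too many restrictions and a poor experience, so I am trying to see whether a bot in the DoDo community can implement the Callback:
https://open.imdodo.com/go/introduction/deployment.html#_2-1-注册-登录开放平台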
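Whichever channel ends up being used, chat platforms usually cap the length of a single message, so a long report may have to be sent in pieces. Below is a minimal sketch of such a splitter. `split_message` is a hypothetical helper, and the default `limit` of 2000 is an assumption based on Discord's message-length cap; DoDo's limit would need to be checked in its own documentation.

```python
def split_message(text: str, limit: int = 2000) -> list:
    """Split `text` into chunks of at most `limit` characters, preferring line breaks."""
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # A single over-long line is cut hard so that no chunk exceeds the limit.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when adding this line would overflow the current one.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for part in split_message("line one\nline two\nline three\n", limit=12):
        print(repr(part))
```

Joining the chunks reproduces the original text exactly, so a callback can send them one after another without losing content.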

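I. Verify a DoDo developer account
1. Log in to the DoDo open platform: https://doker.imdodo.com

2. Choose individual or enterprise verification and fill in the reason for your application carefully.
3. DoDo staff complete the review within 2 working days.
4. Once verified, log in to the open platform again to create a bot.

II. Create a bot
1. For the usage mode, choose the quick-start bot option.

2. Upload an avatar and enter the bot's nickname and description.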