# Automated Analysis of Scientific Papers with Retrieval-Augmented Generation

*by Nosov Ivan*

This notebook performs automated analysis of scientific papers, with a focus on topics related to data and information storage technologies (for example, for small satellite systems). The solution is built on retrieval-augmented generation (RAG), which makes it possible to extract, index, and synthesize information from a large collection of documents.

Mounting Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Installing packages

In [None]:
!pip install llama-index openai llama-index-agent-coa llama-parse llama-index-packs-agents-coa llama-index-postprocessor-cohere-rerank


Importing libraries

In [None]:
import os
import pickle
import json
from tqdm.notebook import tqdm
from openai import BadRequestError
from google.colab import userdata

from llama_index.agent.openai import OpenAIAgent
from llama_index.core import (
    load_index_from_storage,
    StorageContext,
    VectorStoreIndex,
    KeywordTableIndex,
    SummaryIndex,
    SimpleDirectoryReader,
    Settings,
    get_response_synthesizer,
    DocumentSummaryIndex,
)
from llama_index.core.node_parser import (
    SentenceSplitter,
    SemanticSplitterNodeParser,
)
from llama_index.core.schema import TextNode
from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.openai import OpenAI
from llama_index.core.objects import (
    ObjectIndex,
    ObjectRetriever,
)
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.schema import QueryBundle
from llama_index.core.agent import ReActAgent
from llama_index.core.llms import ChatMessage
from llama_index.core.tools import BaseTool, FunctionTool
from llama_index.core import PromptTemplate
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.indices.list.retrievers import SummaryIndexLLMRetriever

Reading the documents from the folder

In [None]:
reader = SimpleDirectoryReader(input_dir="/content/drive/MyDrive/space_data/space_pdfs")
documents = reader.load_data()

Printing the metadata of a document to check that it was read correctly

In [None]:
documents[0].get_metadata_str()

Configuring the embedding model and the LLM

In [None]:
openai_api_token = userdata.get('OPENAI_API_TOKEN')

if not openai_api_token:
    raise ValueError("OPENAI_API_TOKEN was not found in the Colab secrets")


Settings.embed_model = OpenAIEmbedding(
    api_key=openai_api_token,
    model="text-embedding-3-small",
)

Settings.llm = OpenAI(api_key=openai_api_token,
                      model="gpt-4o-mini", temperature=0.3, max_retries=10000, timeout=100000)



Enabling logging

In [None]:
import logging
import sys
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))

A splitter divides the documents into nodes. By default, documents go through a semantic splitter, which places chunk boundaries based on the semantic similarity of adjacent pieces of text (measured on their embeddings). Pieces of text that are too large for the embedding model (the limit is 8192 tokens for text-embedding-3-small) are instead processed with a "safe" splitter that cuts the text into fixed-size chunks.

More on this problem in the llama index library: https://github.com/run-llama/llama_index/issues/12270
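
The percentile-based breakpoint idea behind semantic splitting can be illustrated with a small, dependency-free sketch. It is not the llama index implementation: the function names and the toy distances are assumptions made for illustration, and the real splitter derives the distances from embeddings of sentence groups.

```python
def percentile(values, pct):
    """Percentile with linear interpolation between the two closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def semantic_chunks(sentences, distances, threshold_pct=75):
    """Start a new chunk wherever the distance between neighbouring
    sentences exceeds the given percentile of all distances.

    distances[i] is the (hypothetical) embedding distance between
    sentences[i] and sentences[i + 1].
    """
    if not distances:
        return [" ".join(sentences)]
    cutoff = percentile(distances, threshold_pct)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance > cutoff:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks


# Toy data: only the jump between "A2" and "B1" exceeds the 75th percentile
chunks = semantic_chunks(["A1", "A2", "B1", "B2", "C1"], [0.1, 0.9, 0.2, 0.8])
print(chunks)
```

A lower percentile threshold produces more, smaller chunks; a higher one merges more text into each chunk, which is how oversized chunks can arise.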

In [None]:
unsafe_splitter = SemanticSplitterNodeParser(
    buffer_size=2,
    breakpoint_percentile_threshold=75,
    embed_model=Settings.embed_model,
    show_progress=True,
    include_metadata=True,
)

safe_splitter = SentenceSplitter(
    chunk_size=256,
    chunk_overlap=32,
    include_metadata=True,
)

all_nodes = []
# Per-document node lists; the inspection cells below index into this
documents_nodes = []

documents_count = len(documents)

for i, document in enumerate(documents):
    print(f"Processing document {i + 1} of {documents_count}.")
    nodes = []
    try:
        nodes = unsafe_splitter.get_nodes_from_documents([document])
    except BadRequestError:
        print('Parsing error: OpenAI bad request. Falling back to the safe splitter.')
        nodes = safe_splitter.get_nodes_from_documents([document])

    documents_nodes.append(nodes)
    all_nodes.extend(nodes)


Printing a node (test)

In [None]:
print(nodes[0].get_content())

Scanning The Issue342 Proceedings of the IEEE | Vol. 106, No. 3, March 2018Director of NASA's Ames Research Center at Moffett Field, CA, USA, until 
his retirement on March 31, 2015. He has held several positions in the U.S. 
Air Force and was Research Professor of Astronomy at the University of 
Arizona. 


Printing the serialized content of a node

In [None]:
doc_node = documents_nodes[0][0]
node_content = doc_node.to_json()

print(node_content)

print(node_content.__class__)

{"id_": "cdb89805-e3d4-442c-93c5-11fae8e14fab", "embedding": null, "metadata": {"page_label": "48", "file_name": "file_1.pdf", "file_path": "/content/drive/MyDrive/space_pdfs/file_1.pdf", "file_type": "application/pdf", "file_size": 129805, "creation_date": "2024-08-20", "last_modified_date": "2024-08-17"}, "excluded_embed_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "excluded_llm_metadata_keys": ["file_name", "file_type", "file_size", "creation_date", "last_modified_date", "last_accessed_date"], "relationships": {"1": {"node_id": "d7303472-b5fd-4bae-9bc2-79c4d063cd53", "node_type": "4", "metadata": {"page_label": "48", "file_name": "file_1.pdf", "file_path": "/content/drive/MyDrive/space_pdfs/file_1.pdf", "file_type": "application/pdf", "file_size": 129805, "creation_date": "2024-08-20", "last_modified_date": "2024-08-17"}, "hash": "cb3df171ab5196f0cbd1ab0d5d8c8500e8aac109a42be1661d973f4fa4483d16", "class_name": 

Building a flat list of nodes from all documents

In [None]:
all_nodes = []
for document_nodes in documents_nodes:
  for node in document_nodes:
    all_nodes.append(node)

In [None]:
print(len(all_nodes))

61451


Saving the created nodes to disk

In [None]:
all_nodes_json = []

for node in all_nodes:
  all_nodes_json.append(node.to_dict())

In [None]:
file_path = '/content/drive/MyDrive/space_data/space_nodes_json_semantic_splitter.json'

with open(file_path, 'w', encoding='utf-8') as json_file:
    json.dump(all_nodes_json, json_file, indent=4, ensure_ascii=False)

print(f"File saved successfully to: {file_path}")

File saved successfully to: /content/drive/MyDrive/space_nodes_json_semantic_splitter.json


Loading the saved nodes

In [None]:
file_path = '/content/drive/MyDrive/space_data/space_nodes_json_semantic_splitter.json'

with open(file_path, 'r', encoding='utf-8') as json_file:
    all_nodes_dict = json.load(json_file)

all_nodes = [TextNode.from_dict(node_dict) for node_dict in all_nodes_dict]

print(f"Loaded {len(all_nodes)} nodes.")

Loaded 61451 nodes.


Building a dictionary that maps each file name to the list of its nodes

In [None]:
# Collect the distinct source file names from the node metadata
unique_file_names = list(set(node.metadata['file_name'] for node in all_nodes))

nodes_by_pdf_files = {file_name: [] for file_name in unique_file_names}

# Group the nodes by the file they came from
for node in all_nodes:
    file_name = node.metadata['file_name']
    nodes_by_pdf_files[file_name].append(node)



A function that builds an agent for each document, together with its vector index and summary index, and persists them to disk.


For each PDF document, a vector index and a summary index are built (in the summary index the nodes are stored as a sequence, and all of them are used to synthesize an answer).

Each index gets its own query engine and retriever; for the vector index, the 20 nodes most relevant to the query are retrieved. The query engines are used both inside the subquestion tool and by the top-level agent.
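
What "the most relevant nodes" means for a vector index can be sketched in plain Python: score every node embedding against the query embedding and keep the best `k`. The two-dimensional vectors and node names below are toy values chosen for illustration; the notebook itself relies on `VectorIndexRetriever` with `similarity_top_k=20` and OpenAI embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, node_vecs, k):
    """Return (node_id, score) pairs for the k nodes closest to the query."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in node_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings
toy_nodes = {"flash": [1.0, 0.0], "mram": [0.8, 0.6], "orbit": [0.0, 1.0]}
best = top_k([1.0, 0.1], toy_nodes, k=2)
print([node_id for node_id, _ in best])
```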

The subquestion tool breaks the input query into a series of questions.
The top-level agent follows the ReAct model (thought, action, answer). It receives a query about the document and has access to the vector index search tool, the summarization-based answer synthesis tool, and the subquestion tool.
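
The ReAct control flow can be sketched without any LLM: a loop alternates between a model step and a tool call until the model emits an answer or the iteration budget runs out. Everything here is a hypothetical stand-in (the scripted `scripted_llm`, the `lookup` tool, the parsing regex); the real `ReActAgent` uses its own output parser and prompt.

```python
import json
import re

# Matches the "Action / Action Input" part of a model step
STEP_PATTERN = re.compile(
    r"Action: (?P<tool>\w+)\nAction Input: (?P<args>\{.*\})", re.DOTALL
)


def run_react(llm_step, tools, question, max_iterations=5):
    """Alternate model steps and tool calls until the model emits 'Answer:'."""
    transcript = f"Question: {question}\n"
    for _ in range(max_iterations):
        output = llm_step(transcript)
        transcript += output + "\n"
        if "Answer:" in output:
            return output.split("Answer:", 1)[1].strip()
        match = STEP_PATTERN.search(output)
        if match is None:
            break  # the model produced neither an action nor an answer
        observation = tools[match["tool"]](**json.loads(match["args"]))
        transcript += f"Observation: {observation}\n"
    return "Sorry, I cannot answer your query."


tool_calls = []


def lookup(input):
    """Toy tool that records its input and returns a canned observation."""
    tool_calls.append(input)
    return f"toy result for {input}"


def scripted_llm(transcript):
    """Stand-in for the LLM: call the tool once, then answer."""
    if "Observation:" not in transcript:
        return 'Thought: I need a tool.\nAction: lookup\nAction Input: {"input": "threshold"}'
    return "Thought: I can answer now.\nAnswer: done after one tool call"


answer = run_react(scripted_llm, {"lookup": lookup}, "toy question")
print(answer)
```

The `max_iterations` guard plays the same role as `max_iterations=30` in the agent built below: it bounds the loop when the model never reaches an answer.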

In addition, a short description (summary), the title, and the authors are extracted from each document and added to the metadata of the document agent.
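
Because the summary and the citation string each cost an LLM call, the function below caches them on disk and reloads them on later runs. A minimal sketch of that load-or-build pattern follows; `load_or_build` and `expensive_summary` are hypothetical helpers written for this illustration, not part of the notebook.

```python
import os
import pickle
import tempfile


def load_or_build(path, build):
    """Return the object cached at `path`, building and saving it on first use."""
    if os.path.exists(path):
        with open(path, "rb") as fh:
            return pickle.load(fh)
    value = build()
    # Make sure the target folder exists before writing the cache file
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as fh:
        pickle.dump(value, fh)
    return value


build_calls = []


def expensive_summary():
    """Stand-in for an LLM call; counts how often it is invoked."""
    build_calls.append(1)
    return "toy summary"


cache_path = os.path.join(tempfile.mkdtemp(), "summaries", "file_1_si.pkl")
first = load_or_build(cache_path, expensive_summary)
second = load_or_build(cache_path, expensive_summary)  # served from disk
print(first, len(build_calls))
```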

In [None]:
async def build_agent_for_document(file_name_with_extension, document_nodes):
    file_name = os.path.splitext(file_name_with_extension)[0].replace("(", "_").replace(")", "_").replace(" ", "")

    # Paths for persisting the indexes and the other per-document data
    vi_output_path = f'/content/drive/MyDrive/space_data/vector_index/{file_name}_vi'
    sii_output_path = f'/content/drive/MyDrive/space_data/summary_index_index/{file_name}_si.pkl'
    si_output_path = f'/content/drive/MyDrive/space_data/summary_index/{file_name}_si.pkl'
    at_output_path = f'/content/drive/MyDrive/space_data/titles_authors/{file_name}_si.pkl'
    kti_output_path = f'/content/drive/MyDrive/space_data/keyword_index/{file_name}_kti'

    if not os.path.exists(vi_output_path):
        vector_index = VectorStoreIndex(document_nodes, show_progress=True)
        vector_index.storage_context.persist(persist_dir=vi_output_path)
    else:
        vector_index = load_index_from_storage(StorageContext.from_defaults(persist_dir=vi_output_path))

    if not os.path.exists(sii_output_path):
        summary_index = SummaryIndex(document_nodes, show_progress=True)
        summary_index.storage_context.persist(persist_dir=sii_output_path)
    else:
        summary_index = load_index_from_storage(StorageContext.from_defaults(persist_dir=sii_output_path))

    QA_PROMPT_TMPL = (
        "Context information is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Given the context information and not prior knowledge, "
        "answer the question. If the answer is not in the context, inform "
        "the user that you can't answer the question - DO NOT MAKE UP AN ANSWER.\n"
        "IF THE PROVIDED CONTEXT DOESN'T DIRECTLY RELATE TO THE QUERY AT ALL, THEN ANSWER “THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.”\n"
        "Do not generalize or summarize the answer - try to return as much detail as possible based on the context provided.\n"
        "Do not invent or fabricate information; base your responses solely on the context. Be as detailed as you can. "
        "Prioritize numerical data, specific examples, and any related studies or standards mentioned in the context.\n"
        "Use scientific speech style and language. Explain technical terms and aspects in detail, and always strive to provide specific examples "
        "from the context to better illustrate the point. "
        "If information was mentioned in previous parts of the context or previous queries, be sure to use it to substantiate and clarify the current response. "
        "The answer to the query should be complete and exhaustive. Expand on all details and concepts described.\n"
        "You should write in the style of a solid extended text for a research paper preferably not in a list format. Use lists only where they are really needed.\n"
        "Each paragraph should maximize the topic and thought you are describing.\n"
        "You don't have a limit on the length of your answer, so you don't have to restrain or shorten yourself or your answer.\n"
        "Question: {query_str}\n"
        "Answer: "
    )
    QA_PROMPT = PromptTemplate(QA_PROMPT_TMPL)

    retriever = VectorIndexRetriever(
        index=vector_index,
        similarity_top_k=20,
    )
    response_synthesizer = get_response_synthesizer(
        text_qa_template=QA_PROMPT,
    )

    vector_query_engine = RetrieverQueryEngine(
        retriever=retriever,
        response_synthesizer=response_synthesizer,
    )

    SUMMARY_PROMPT_TMPL = (
        "Context information from multiple sources is below.\n"
        "---------------------\n"
        "{context_str}\n"
        "---------------------\n"
        "Given the information from multiple sources and not prior knowledge, "
        "answer the query. Be as detailed as you can. "
        "IF THE PROVIDED CONTEXT DOESN'T DIRECTLY RELATE TO THE QUERY AT ALL, THEN ANSWER “THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.” "
        "Do not generalize or summarize the answer - try to return as much detail as possible based on the context provided. "
        "Do not invent or fabricate information; base your responses solely on the context. "
        "Prioritize numerical data, specific examples, and any related studies or standards mentioned in the context. "
        "Use scientific speech style and language. Explain technical terms and aspects in detail, and always strive to provide specific examples "
        "from the context to better illustrate the point. "
        "If information was mentioned in previous parts of the context or previous queries, be sure to use it to substantiate and clarify the current response. "
        "The answer to the query should be complete and exhaustive. Expand on all details and concepts described.\n"
        "You should write in the style of a solid extended text for a research paper preferably not in a list format. Use lists only where they are really needed.\n"
        "Each paragraph should maximize the topic and thought you are describing.\n"
        "You don't have a limit on the length of your answer, so you don't have to restrain or shorten yourself or your answer.\n"
        "Query: {query_str}\n"
        "Answer: "
    )
    # The synthesizer expects a prompt template object rather than a raw string
    SUMMARY_PROMPT = PromptTemplate(SUMMARY_PROMPT_TMPL)

    summary_retriever = SummaryIndexLLMRetriever(
        summary_index,
    )
    # Note: summary_template appears to be applied only in the "tree_summarize"
    # response mode; with the default mode this prompt may have no effect.
    summary_response_synthesizer = get_response_synthesizer(
        summary_template=SUMMARY_PROMPT,
    )

    summary_query_engine = RetrieverQueryEngine(
        retriever=summary_retriever,
        response_synthesizer=summary_response_synthesizer,
    )
    if not os.path.exists(si_output_path):
        summary = str(
            await summary_query_engine.aquery(
                "Extract a concise 1-2 line summary of this document"
            )
        )
        # Create the target folder if needed and close the file after writing
        os.makedirs(os.path.dirname(si_output_path), exist_ok=True)
        with open(si_output_path, 'wb') as f:
            pickle.dump(summary, f)
    else:
        with open(si_output_path, 'rb') as f:
            summary = pickle.load(f)

    if not os.path.exists(at_output_path):
        title_and_authors = str(
            await summary_query_engine.aquery(
                "Extract the author and title of the material from the provided document. Provide the information in a format suitable for citation"
            )
        )
        os.makedirs(os.path.dirname(at_output_path), exist_ok=True)
        with open(at_output_path, 'wb') as f:
            pickle.dump(title_and_authors, f)
    else:
        with open(at_output_path, 'rb') as f:
            title_and_authors = pickle.load(f)

    vector_query_engine_tool = QueryEngineTool(
        query_engine=vector_query_engine,
        metadata=ToolMetadata(
            name=f"vector_tool_{file_name}",
            description=f"Useful for questions related to specific facts. Preferred tool for answering specific questions about the research topic and aspects. Use for exact simple questions.",
        ),
    )
    summary_query_engine_tool = QueryEngineTool(
        query_engine=summary_query_engine,
        metadata=ToolMetadata(
            name=f"summary_tool_{file_name}",
            description=f"Useful for summarization questions, or questions about the whole document. Preferred tool for answering questions that relate to a document rather than aspects of the research topic.",
        ),
    )

    subquestion_query_engine = SubQuestionQueryEngine.from_defaults(
        query_engine_tools=[vector_query_engine_tool, summary_query_engine_tool],
        response_synthesizer=response_synthesizer,
    )

    subquestion_query_engine_tool = QueryEngineTool(
        query_engine=subquestion_query_engine,
        metadata=ToolMetadata(
            name=f"subquestion_tool_{file_name}",
            description=f"A tool that answers complex questions by breaking them down into simple components. The most preferred tool to use. Use this tool initially.",
        ),
    )

    query_engine_tools = [vector_query_engine_tool, summary_query_engine_tool, subquestion_query_engine_tool]

    react_system_header_str = """\
You are a highly capable and specialized agent designed to excel in a wide range of tasks, from answering complex questions
to providing detailed summaries and performing in-depth analyses. Your primary mission is to extract precise, relevant,
and complete information from the provided PDF file to fully and accurately address the user’s query.

## Task Focus and Strategy
Your success hinges on your ability to thoroughly explore all available resources and strategies. Before concluding that
you cannot answer a query, you MUST exhaust all possible avenues. This includes systematically using the tools at your
disposal to gather, verify, and refine information. If you encounter any challenges or incomplete data, do not hesitate
to break the question down into smaller, more manageable parts, and tackle each sub-question one at a time.

### Motivation to Excel
Your goal is to deliver the most accurate and comprehensive response possible. Take pride in your ability to think
critically and use the tools creatively to uncover all relevant information. Remember, every tool at your disposal is
there to assist you in providing a complete answer—explore them fully and don’t settle for anything less than the best
possible response.

**If at first the information is unclear or incomplete, use the tools multiple times, cross-check data, and consider
different approaches.** Your persistence is key to success. However, if you determine that you have gathered sufficient
information to provide a detailed and comprehensive answer, prioritize concluding the task rather than continuing with
additional iterations.

### Prioritization and Detail
Prioritize numerical data, specific examples, and any related studies or standards mentioned in the context. Technical
aspects should be described in detail, including all possible contexts and variations. When explaining technical terms
and aspects, always strive to provide specific examples from the context to better illustrate the point. If the context
mentions research, standards, or technical documents, be sure to reference them and include key findings or
recommendations if relevant to the response. If information was mentioned in previous parts of the context or previous
queries, be sure to use it to substantiate and clarify the current response.

## Tools
You have access to a wide array of powerful tools. These tools are designed to help you in different ways—whether it’s
summarizing complex sections, locating specific details, or analyzing data. You are RESPONSIBLE for choosing the right
tool, or combination of tools, to fully answer the user’s question. If one tool does not provide enough information, try
another, or consider how different tools might work together to enhance the outcome.
IF WHEN YOU ACCESS THE TOOLS SEVERAL TIMES YOU FAIL TO GET INFORMATION OR THE CONTEXT PROVIDED DOES NOT CONTAIN INFORMATION, THEN STOP ALL ATTEMPTS TO USE THE TOOLS AND ACTIONS AND PROVIDE A FINAL ANSWER.

YOU CAN ONLY USE EACH TOOL TWO TIMES MAXIMUM.
You have access to the following tools:
{tool_desc}
YOU CAN ONLY USE EACH TOOL TWO TIMES MAXIMUM.
## Output Format
To answer the question, please use the following format:

```
Thought: I need to use a tool to help me answer the question. My strategy is to focus on [specific details, numerical data, concrete examples, etc.].
Action: tool name (one of {tool_names})
Action Input: [the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})]
```

Please ALWAYS start with a Thought that explains your reasoning and your strategy for using the tools. If you find that
a single tool is insufficient, state your Thought, and then try another tool or a different approach.

Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.

If this format is used, the user will respond in the following format:

```
Observation: [tool response]
```

You should continue using the above format, iterating through different tools and approaches, until you are confident
that you have gathered enough comprehensive and accurate information to fully answer the question. This includes
actively searching for details, breaking down the user’s query into key components or keywords, and conducting several
iterations of the search process. **However, if you determine that you have gathered sufficient information to provide
a detailed and comprehensive answer, prioritize concluding the task rather than continuing with additional iterations.**
Avoid excessive iteration if it does not contribute new or valuable insights.

IF WHEN YOU ACCESS THE TOOLS SEVERAL TIMES YOU FAIL TO GET INFORMATION OR THE CONTEXT PROVIDED DOES NOT CONTAIN INFORMATION, THEN STOP ALL ATTEMPTS TO USE THE TOOLS AND PROVIDE A FINAL ANSWER.
### Detailed Search and Iterative Approach
As you proceed, focus on identifying and extracting every relevant detail. Break down the user's query into key
components or keywords, and address each one systematically. If necessary, conduct additional search iterations to
ensure that no critical information is missed. You are expected to refine and enhance your search multiple times, using
different tools or combinations of tools, until you achieve a thorough and well-supported answer. Prioritize numerical
data, specific examples, and related studies or standards mentioned in the context.

Expand on all details and concepts described.

Only when you are certain that you have explored all possible avenues, gathered all relevant details, and conducted
sufficient iterations of searching and refining, should you respond using one of these formats:

```
Thought: After thoroughly using all available tools, breaking down the query into key components, and confirming the
completeness of the gathered information, I can now answer without using any more tools.
Answer: [your answer here, with references to specific data, examples, or research where applicable]
```

If, after exhausting all tools and approaches, the information remains insufficient to fully address the query, then and
only then, should you use the following format:

```
Thought: I have used all available tools extensively, broken down the query into key components, but the information is still insufficient to fully answer the query.
Answer: Sorry, I cannot answer your query.
```

## Accuracy, Completeness, and Innovation
Your ultimate goal is to deliver a response that is not only accurate but also comprehensive and fully responsive to the
query. This may require thinking outside the box—using tools in new ways, or breaking the query down into smaller parts
to address each component thoroughly. Your creativity, persistence, and thoroughness are your greatest assets. Be as
detailed as you can. Provide all possible technical details you got. Prioritize specific data, examples, and related
studies or standards. Do not generalize or summarize the answer - try to return as much detail as possible based on the
context provided.

You should write in the style of a solid extended text for a research paper preferably not in a list format. Use lists only where they are really needed.
Each paragraph should maximize the topic and thought you are describing.
You don't have a limit on the length of your answer, so you don't have to restrain or shorten yourself or your answer.

### Breaking Down the Query
If you find the query too broad or complex, consider breaking it down into smaller, more specific questions. Address each
sub-question one at a time, using the tools to build a complete picture. This methodical approach ensures that no detail
is overlooked, and that your final answer is as robust and detailed as possible.

## Motivation to Explore and Persevere
You are not just an agent; you are a problem-solver. Every tool, every strategy is a step towards the best possible
answer. Be relentless in your pursuit of the truth—explore, question, and refine until you’ve uncovered every piece of
relevant information. Your determination to provide a full and accurate response is what sets you apart.

**However, recognize when a response is complete, and avoid unnecessary iterations.** Once sufficient information has been
gathered to provide a comprehensive and accurate answer, focus on concluding the task efficiently.

IF WHEN YOU ACCESS THE TOOLS SEVERAL TIMES YOU FAIL TO GET INFORMATION OR THE CONTEXT PROVIDED DOES NOT CONTAIN INFORMATION, THEN STOP ALL ATTEMPTS TO USE THE TOOLS AND PROVIDE A FINAL ANSWER.
"""

    react_system_prompt = PromptTemplate(react_system_header_str)
    agent = ReActAgent.from_tools(query_engine_tools, llm=Settings.llm, verbose=True, max_iterations=30)
    agent.update_prompts({"agent_worker:system_prompt": react_system_prompt})
    agent.reset()

    return agent, summary, title_and_authors


Building the agents for every document

In [None]:
async def build_agents(nodes_by_pdf_files):
  agents_dict = {}
  extra_info_dict = {}

  for file_name, nodes in nodes_by_pdf_files.items():
    agent, summary, title_and_authors = await build_agent_for_document(file_name, nodes)
    agents_dict[file_name] = agent
    extra_info_dict[file_name] = {'summary': summary, 'nodes': nodes, 'title_and_authors': title_and_authors}

  return agents_dict, extra_info_dict

In [None]:
agents_dict, extra_info_dict = await build_agents(nodes_by_pdf_files)

Printing the paper titles and authors obtained with the summary query engine

In [None]:
for item, value in extra_info_dict.items():
  print(f'{item} -> {value["title_and_authors"]}')

Test: building an agent for a single paper and asking questions that cannot be answered from the LLM's general knowledge and require information from the paper

In [None]:
agents_dict = {}
extra_info_dict = {}
file_name = 'file_276.pdf'

agent, summary, title_and_authors = await build_agent_for_document(file_name, nodes_by_pdf_files[file_name])
agents_dict[file_name] = agent
extra_info_dict[file_name] = {'summary': summary, 'nodes': nodes_by_pdf_files[file_name], 'title_and_authors': title_and_authors}



In [None]:
result = agent.query("""What is the voltage drop threshold mentioned in the article that must be reached to safely shut down the system and protect data during a power-down event, and how does this threshold relate to the required capacitance and current for the power supply circuit?""")

print(result)

> Running step b30d8b39-f53f-4863-ac8f-e26d9c26289a. Step input: What is the voltage drop threshold mentioned in the article that must be reached to safely shut down the system and protect data during a power-down event, and how does this threshold relate to the required capacitance and current for the power supply circuit?
Thought: I need to extract specific details regarding the voltage drop threshold mentioned in the article, as well as its relationship to capacitance and current for the power supply circuit. This requires a focused search on these particular aspects.
Action: vector_tool_file_11
Action Input: {'input': 'voltage drop threshold for safe shutdown, capacitance and current relation in power supply circuit'}
Observation: In the context of power monitoring through a power protection circuit, the voltage drop threshold for safe shutdown is critical for ensuring system reliability and data integrity during power failure events. The specific voltag

In [None]:
result = agent.query("""What are the current trends in data storage technologies for small satellite systems (SSS)? Please provide an overview of both traditional and next-generation memory technologies.""")

print(result)

> Running step 67e58f96-0242-48ac-ae72-d583146a0b7e. Step input: What are the current trends in data storage technologies for small satellite systems (SSS)? Please provide an overview of both traditional and next-generation memory technologies.
Thought: To provide a comprehensive overview of current trends in data storage technologies for small satellite systems (SSS), I will break down the query into two main components: traditional memory technologies and next-generation memory technologies. This will allow me to gather detailed information on both categories and highlight the trends and advancements in each. I will start with the subquestion tool to explore the traditional memory technologies used in SSS.
Action: subquestion_tool_file_37
Action Input: {'input': 'What are the traditional memory technologies used in small satellite systems?'}
Generated 4 sub questions.
[vector_tool_file_37] Q: What are the key traditional memory technologies us

Creating a tool from each document's agent. The metadata consists of the paper title, the authors, and the summary obtained with the summary query engine

In [None]:
all_tools = []

for file_name, agent in agents_dict.items():
  file_name_no_extension = os.path.splitext(file_name)[0].replace("(", "_").replace(")", "_").replace(" ", "")
  summary = extra_info_dict[file_name]['summary']
  title_and_authors = extra_info_dict[file_name]["title_and_authors"]
  doc_tool = QueryEngineTool(
      query_engine=agent,
      metadata=ToolMetadata(
          name=f'tool_{file_name_no_extension}',
          description=f'------\nTool\nDocument name and authors: {title_and_authors}\nSummary:\n{summary}\n'
      )
  )
  all_tools.append(doc_tool)
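
The file-name clean-up used for the tool names appears both here and in `build_agent_for_document`. One possible way to keep the two in sync is a small shared helper; `sanitize_tool_name` is a hypothetical name introduced for this sketch, and the replacements mirror the ones in the cells above.

```python
import os


def sanitize_tool_name(file_name_with_extension):
    """Drop the extension and replace characters that are awkward in tool names."""
    stem = os.path.splitext(file_name_with_extension)[0]
    return stem.replace("(", "_").replace(")", "_").replace(" ", "")


# A duplicate-style file name maps to the kind of name seen in the tool list
print(sanitize_tool_name("file_91(1).pdf"))
```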

In [None]:
for tool in all_tools:
  print(tool.metadata.name)

tool_file_211
tool_file_33
tool_file_70
tool_file_287
tool_file_301
tool_file_11
tool_file_189
tool_file_209
tool_file_57
tool_file_255
tool_file_180
tool_file_215
tool_file_59
tool_file_188
tool_file_289
tool_file_134
tool_file_273
tool_file_267
tool_file_60
tool_file_303
tool_file_235
tool_file_77
tool_file_239
tool_file_243
tool_file_31
tool_file_67
tool_file_91_1_
tool_file_153
tool_file_135
tool_file_3
tool_file_291
tool_file_221
tool_file_187
tool_file_56
tool_file_179
tool_file_79
tool_file_238
tool_file_250
tool_file_227
tool_file_131
tool_file_69
tool_file_133
tool_file_105
tool_file_183
tool_file_182
tool_file_246
tool_file_269
tool_file_299
tool_file_147
tool_file_156
tool_file_45
tool_file_92
tool_file_8
tool_file_290
tool_file_72
tool_file_274
tool_file_150
tool_file_237
tool_file_132
tool_file_249
tool_file_108
tool_file_248
tool_file_198
tool_file_257
tool_file_84_1_
tool_file_225
tool_file_241
tool_file_242
tool_file_282
tool_file_130
tool_file_305
tool_file_194
tool_fi

Creating a custom retriever that selects the papers most relevant to the query

In [None]:
obj_index = ObjectIndex.from_objects(
    all_tools,
    index_cls=VectorStoreIndex,
)
vector_node_retriever = obj_index.as_node_retriever(
    similarity_top_k=4,
)

class CustomObjectRetriever(ObjectRetriever):
    def __init__(
        self,
        retriever,
        object_node_mapping,
        node_postprocessors=None,
        llm=None,
    ):
        self._retriever = retriever
        self._object_node_mapping = object_node_mapping
        self._llm = llm or Settings.llm
        self._node_postprocessors = node_postprocessors or []

    def retrieve(self, query_bundle):
        if isinstance(query_bundle, str):
            query_bundle = QueryBundle(query_str=query_bundle)

        nodes = self._retriever.retrieve(query_bundle)
        for processor in self._node_postprocessors:
            nodes = processor.postprocess_nodes(
                nodes, query_bundle=query_bundle
            )
        print(f'nodes num:{len(nodes)}')
        tools = [self._object_node_mapping.from_node(n.node) for n in nodes]
        print(f'tools num:{len(tools)}')

        return tools
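
The flow inside `retrieve` (fetch candidate nodes, run them through optional postprocessors, map the survivors back to tools) can be illustrated without llama index. The classes and the score cutoff below are hypothetical stand-ins, not library APIs.

```python
from dataclasses import dataclass


@dataclass
class ScoredNode:
    """Toy stand-in for a retrieved node with a similarity score."""
    node_id: str
    score: float


def score_cutoff(min_score):
    """Build a postprocessor that drops nodes scoring below min_score."""
    def postprocess(nodes):
        return [node for node in nodes if node.score >= min_score]
    return postprocess


def retrieve_objects(candidates, postprocessors, node_to_object):
    """Apply each postprocessor in order, then map nodes to their objects."""
    nodes = list(candidates)
    for postprocess in postprocessors:
        nodes = postprocess(nodes)
    return [node_to_object[node.node_id] for node in nodes]


candidates = [ScoredNode("a", 0.9), ScoredNode("b", 0.4)]
mapping = {"a": "tool_a", "b": "tool_b"}
selected = retrieve_objects(candidates, [score_cutoff(0.5)], mapping)
print(selected)
```

With an empty postprocessor list, as in the test cell that follows, every retrieved node is mapped to its tool unchanged.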

Testing the paper retriever

In [None]:
custom_obj_retriever = CustomObjectRetriever(
    vector_node_retriever,
    obj_index.object_node_mapping,
)

tmps = custom_obj_retriever.retrieve("What are the current trends in data storage technologies for small satellite systems (SSS)? Please provide an overview of both traditional and next-generation memory technologies.")

for tool in tmps:
    print(tool.metadata)

nodes num:4
tools num:4
ToolMetadata(description='------\nTool\nDocument name and authors: Koets, Michael. "A Satellite-Based Architecture for High-Throughput Storage." Flash Memory Summit, Southwest Research Institute®, August 8, 2018.\nSummary:\nThis document discusses the development and architecture of high-throughput solid-state recorders for space applications, focusing on the technical innovations, design considerations, and management strategies necessary for efficient data storage and retrieval in challenging environments.\n', name='tool_file_67', fn_schema=<class 'llama_index.core.tools.types.DefaultToolFnSchema'>, return_direct=False)
ToolMetadata(description='------\nTool\nDocument name and authors: Katti, R. R. "Space Data Storage Systems and Technologies." IEEE Transactions on Magnetics, vol. 30, no. 6, Nov. 1994, pp. 4194-4199.\nSummary:\nThe document discusses the evolution and selection of data storage technologies for space missions, comparing magnetic tape and solid-
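
The tool descriptions printed above follow a fixed layout: a `Document name and authors:` line followed by a `Summary:` block. A minimal sketch of pulling the citation and the summary out of such a description is shown below. `parse_tool_description` is a hypothetical helper, and it assumes every description keeps the layout seen in this output.

```python
import re
from typing import Dict


def parse_tool_description(description: str) -> Dict[str, str]:
    """Split a tool description into its citation line and its summary text."""
    # The citation sits on a single line, so '.' (no DOTALL) stops at the newline.
    citation = re.search(r"Document name and authors:\s*(.*)", description)
    # The summary runs to the end of the description, so DOTALL is needed.
    summary = re.search(r"Summary:\s*(.*)", description, flags=re.DOTALL)
    return {
        "citation": citation.group(1).strip() if citation else "",
        "summary": summary.group(1).strip() if summary else "",
    }


# Description text copied from the first ToolMetadata printed above.
sample = (
    '------\nTool\nDocument name and authors: Koets, Michael. '
    '"A Satellite-Based Architecture for High-Throughput Storage." '
    'Flash Memory Summit, Southwest Research Institute®, August 8, 2018.\n'
    'Summary:\nThis document discusses the development and architecture of '
    'high-throughput solid-state recorders for space applications, focusing on '
    'the technical innovations, design considerations, and management strategies '
    'necessary for efficient data storage and retrieval in challenging environments.\n'
)
parsed = parse_tool_description(sample)
print(parsed["citation"])
```

With the retriever above, the same helper could be applied to `tool.metadata.description` for each retrieved tool.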

Creating the top-level agent that selects articles and generates the final answer. The ReAct approach is used here as well.

In [None]:
react_system_header_str = """\
You are an advanced agent designed to perform comprehensive meta-analyses based on multiple documents. Your primary mission is to gather, synthesize, and accurately present information from various sources to fully and precisely address the user's query.

## Tools
You have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand. This may require breaking the task into subtasks and using different tools to complete each subtask. You MUST use all the provided tools one by one, get the answers and then synthesize the final answer.

You have access to the following tools:
{tool_desc}

## Output Format
To answer the question, please use the following format.

```
Thought: I need to use a tool to help me answer the question. My strategy is to focus on [specific details, numerical data, concrete examples, etc.].
Action: tool name (one of {tool_names})
Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
```

Please ALWAYS start with a Thought that explains your reasoning and your strategy for using the tools. If you find that a single tool is insufficient, state your Thought, and then try another tool or a different approach.

Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.

If this format is used, the user will respond in the following format:

```
Observation: tool response
```

You should keep repeating the above format for each tool provided, one by one, until you have gathered all available information. After gathering responses from all tools, proceed to synthesize the information.

### Synthesis of Information
Once all the necessary information has been gathered:

- **Cite Each Source Alongside Relevant Information**: As you integrate findings from each document, mention the title and authors of the document as well as the tool used, directly next to the information you are presenting. This ensures that each piece of data or detail is clearly attributed to its source.
- **Integrate Data**: Combine findings from all documents into a well-structured, comprehensive, and accurate response.
- **Provide Detailed Analysis**: Expand on all details and concepts described, ensuring that all relevant technical aspects are fully explored.
- **Avoid Generalization**: Do not generalize or simplify the information; instead, provide an in-depth analysis based on the context provided by the tools.

You should write in the style of a solid extended text for a research paper preferably not in a list format. Use lists only where they are really needed.
Each paragraph should maximize the topic and thought you are describing.
You don't have a limit on the length of your answer, so you don't have to restrain or shorten yourself or your answer.
EACH PARAGRAPH SHOULD CORRESPOND TO ONE INSTRUMENT (SOURCE). AT THE END OF EACH PARAGRAPH, CITE THE SOURCE AND ITS AUTHORS IN BRACKETS.

### Final Response
Only when you have thoroughly used all available tools, gathered and synthesized the information, and confirmed the completeness of the gathered information, should you respond using one of these formats:

```
Thought: I can answer without using any more tools.
Answer: [your detailed and comprehensive answer here, with references to specific data, examples, or research where applicable, citing each source with the title and authors]
```

If, after using all tools, the information remains insufficient to fully address the query, then and only then, should you use the following format:

```
Thought: I cannot answer the question with the provided tools.
Answer: Sorry, I cannot answer your query.
```
EACH PARAGRAPH SHOULD CORRESPOND TO ONE INSTRUMENT (SOURCE). AT THE END OF EACH PARAGRAPH, CITE THE SOURCE AND ITS AUTHORS IN BRACKETS.
## Current Conversation
Below is the current conversation consisting of interleaving human and assistant messages.
"""

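react_system_prompt = PromptTemplate(react_system_header_str)
# Tools are fetched per query through the custom retriever rather than passed as a fixed list.
agent = ReActAgent.from_tools(
    tool_retriever=custom_obj_retriever,
    llm=Settings.llm,
    verbose=True,
    max_iterations=100,
)
# Replace the default ReAct system prompt with the meta-analysis prompt defined above.
agent.update_prompts({"agent_worker:system_prompt": react_system_prompt})
agent.reset()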
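
The prompt above asks the model to emit `Thought`, `Action`, and `Action Input` lines, with the action input given as valid JSON. The sketch below illustrates how one such step could be parsed and why single-quoted input is rejected. `parse_react_step` is a hypothetical helper for illustration only; it is not the parser that llama_index uses internally.

```python
import json
import re
from typing import Tuple

# One step: free-form thought, a tool name on its own line, then a JSON object.
STEP_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<action>[^\n]+)\s*"
    r"Action Input:\s*(?P<input>\{.*\})",
    re.DOTALL,
)


def parse_react_step(text: str) -> Tuple[str, str, dict]:
    """Return (thought, tool name, kwargs) for one Thought/Action/Action Input step."""
    match = STEP_PATTERN.search(text)
    if match is None:
        raise ValueError("text does not follow the Thought/Action/Action Input format")
    # json.loads raises json.JSONDecodeError on single-quoted keys or values.
    action_input = json.loads(match.group("input"))
    return match.group("thought"), match.group("action").strip(), action_input


step = (
    "Thought: I need to use a tool to help me answer the question.\n"
    "Action: tool_file_67\n"
    'Action Input: {"input": "solid-state recorders for space"}'
)
thought, action, kwargs = parse_react_step(step)
print(action, kwargs)

# The single-quoted variant that the prompt forbids is not valid JSON.
bad_step = "Thought: t\nAction: tool_file_67\nAction Input: {'input': 'x'}"
try:
    parse_react_step(bad_step)
    rejected = False
except json.JSONDecodeError:
    rejected = True
print(rejected)
```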


Manual testing of the system with a set of questions. The list of articles used (the "bibliography") can be obtained from the response metadata.
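
As a minimal sketch of reading that metadata, the helper below collects the names of the tools an agent response drew on. It assumes the response object exposes a `sources` list whose items carry a `tool_name` attribute, which is how llama_index agent responses are expected to look; this should be checked against the installed version. `used_tool_names` and the `SimpleNamespace` stand-in are hypothetical and exist only to make the example self-contained.

```python
from types import SimpleNamespace
from typing import Any, List


def used_tool_names(response: Any) -> List[str]:
    """Return unique tool names from response.sources, in first-use order."""
    seen: List[str] = []
    # Tolerate a missing or empty `sources` attribute.
    for source in getattr(response, "sources", None) or []:
        name = getattr(source, "tool_name", None)
        if name and name not in seen:
            seen.append(name)
    return seen


# Stand-in object mimicking the assumed response shape.
fake_response = SimpleNamespace(
    sources=[
        SimpleNamespace(tool_name="tool_file_276"),
        SimpleNamespace(tool_name="tool_file_67"),
        SimpleNamespace(tool_name="tool_file_276"),
    ]
)
print(used_tool_names(fake_response))
```

The returned names can then be matched against the tool metadata descriptions, which contain each document's title and authors.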

In [None]:
response = agent.query('What are the current trends in data storage technologies for small satellite systems (SSS)? Please provide an overview of both traditional and next-generation memory technologies.')
print(response)

> Running step b9db25a9-860c-42e3-998b-a678e69712e0. Step input: What are the current trends in data storage technologies for small satellite systems (SSS)? Please provide an overview of both traditional and next-generation memory technologies.
nodes num:16
tools num:16
Thought: To provide a comprehensive overview of current trends in data storage technologies for small satellite systems (SSS), I will gather information from multiple sources discussing both traditional and next-generation memory technologies. This will include insights on the evolution of these technologies, their performance, reliability, and specific applications in space environments.
Action: tool_file_276
Action Input: {'input': 'current trends in data storage technologies for small satellite systems'}
> Running step 04421298-d8f0-4c85-bd6c-916f0af79fd3. Step input: current trends in data storage technologies for small satellite systems
Thought: To provide a comprehensive overview

In [None]:
response2 = agent.query('Analyze the key requirements for data storage systems in small satellite systems, focusing on reliability, energy efficiency, radiation resistance, size and weight constraints, access speed, and operating temperature range. When reaching out to tools, ask them not to get hung up on finding information and to provide a final answer if it is not possible to get comprehensive information from context or too many attempts to get information have occurred.')
print(response2)

> Running step 221a233a-3e45-484e-95f1-0cbb3b00c9f6. Step input: Analyze the key requirements for data storage systems in small satellite systems, focusing on reliability, energy efficiency, radiation resistance, size and weight constraints, access speed, and operating temperature range. When reaching out to tools, ask them not to get hung up on finding information and to provide a final answer if it is not possible to get comprehensive information from context or too many attempts to get information have occurred.
nodes num:16
tools num:16
Thought: To analyze the key requirements for data storage systems in small satellite systems, I will gather information from various documents that discuss aspects such as reliability, energy efficiency, radiation resistance, size and weight constraints, access speed, and operating temperature range. I will use multiple tools to ensure a comprehensive understanding of these requirements. If I encounter challenges in obtaining sufficie

In [None]:
response3 = agent.query('Summarize recent research on radiation-resistant memory technologies suitable for use in space, specifically within small satellite systems.')
print(response3)

> Running step f9fd90e4-0219-43cc-aac8-7208fecd1086. Step input: Summarize recent research on radiation-resistant memory technologies suitable for use in space, specifically within small satellite systems.
nodes num:16
tools num:16
Thought: I need to gather recent research findings on radiation-resistant memory technologies that are particularly suitable for small satellite systems. My strategy will involve using multiple tools to collect comprehensive insights on various memory technologies, their radiation resilience, and their applicability in space environments.
Action: tool_file_159
Action Input: {'input': 'Research on radiation-resistant memory technologies suitable for small satellite systems.'}
> Running step cd4edaa9-8f8f-43fd-880b-811d163e97a1. Step input: Research on radiation-resistant memory technologies suitable for small satellite systems.
Thought: To provide a comprehensive overview of radiation-resistant memory technologies suitable fo

[summary_tool_file_288] Q: What methodologies were used in the study 'Advanced memories to overcome flash memory weaknesses: a radiation viewpoint reliability study'?

[vector_tool_file_288] Q: What specific facts can be provided about the findings of the study 'Advanced memories to overcome flash memory weaknesses: a radiation viewpoint reliability study'?
[vector_tool_file_288] Q: What specific facts can be provided about the methodologies used in the study 'Advanced memories to overcome flash memory weaknesses: a radiation viewpoint reliability study'?
[vector_tool_file_216] A: The Earth's atmosphere is subject to various types of radiation, primarily stemming from cosmic and solar sources. The interaction between Galactic Cosmic Rays (GCRs) and solar particles with atmospheric particles leads to the generation of secondary radiation. This secondary radiation includes a diverse array of particles such as neutrons, protons, electrons, heavy ions, pions, and muons. The presence of these particles is a result of nuclear reactions that occur when cosmic rays and solar particles

In [None]:
response4 = agent.query('What are the advantages and disadvantages of using flash memory, SSDs, and HDDs in small satellite systems? Compare these with emerging memory technologies like MRAM, RRAM, and FRAM.')
print(response4)

> Running step df30116c-ee43-450b-81f6-53246ac57900. Step input: What are the advantages and disadvantages of using flash memory, SSDs, and HDDs in small satellite systems? Compare these with emerging memory technologies like MRAM, RRAM, and FRAM.
nodes num:16
tools num:16
Thought: To provide a comprehensive comparison of the advantages and disadvantages of flash memory, SSDs, and HDDs in small satellite systems against emerging memory technologies like MRAM, RRAM, and FRAM, I will gather detailed information from multiple sources. This will include insights on performance, reliability, power consumption, radiation tolerance, and other relevant factors that affect their suitability for space applications.
Action: tool_file_195
Action Input: {'input': 'Discuss the advantages and disadvantages of flash memory in space applications.'}
> Running step de9308e9-b849-4499-bd5a-9be2afc6a36b. Step input: Discuss the advantages and disadvantages of flash memory in space applic

In [None]:
response5 = agent.query('Explore the challenges associated with miniaturization and integration of data storage systems in small satellite systems. What are the current solutions and future prospects?')
print(response5)

> Running step 324205f6-1c34-4b83-8284-563e2c181d37. Step input: Explore the challenges associated with miniaturization and integration of data storage systems in small satellite systems. What are the current solutions and future prospects?
nodes num:16
tools num:16
Thought: To address the challenges associated with miniaturization and integration of data storage systems in small satellite systems, I will gather insights from multiple sources that discuss the technical difficulties, current solutions, and future prospects in this area. I will start by examining documents that specifically focus on data storage technologies and their application in small satellites.
Action: tool_file_103
Action Input: {'input': 'Explore the challenges associated with miniaturization and integration of data storage systems in small satellite systems.'}
> Running step 8388180e-88cc-4cbc-9582-6e40aac96ae5. Step input: Explore the challenges associated with miniaturization and integration

[vector_tool_file_150] Q: What are the technical limitations of miniaturized data storage systems when used in small satellites?
[vector_tool_file_150] Q: What are the potential impacts of these operational challenges on the performance of small satellite systems?
[summary_tool_file_150] Q: Are there any case studies or examples of small satellite systems that have successfully integrated miniaturized data storage systems?
[vector_tool_file_150] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[summary_tool_file_150] A: The provided information does not include specific case studies or examples of small satellite systems that have successfully integrated miniaturized data storage systems. It primarily discusses the development and implications of structural energy storage technology for small satellites, particularly in rel

[vector_tool_file_173] Q: What are the trade-offs between power consumption and data storage capacity in small satellite systems?
[summary_tool_file_173] Q: What recommendations are made in the literature regarding optimizing power consumption in data storage for small satellites?
[vector_tool_file_173] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_173] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[summary_tool_file_173] A: The literature does not specifically address recommendations for optimizing power consumption in data storage for small satellites. The focus is primarily on the advantages of chip-on-board technology for miniaturization, weight reduction, and cost savings in space electronics, rather than on power consumption strategies for data storage sys

[summary_tool_file_245] Q: What are the key findings regarding the performance and reliability of data storage systems in small satellite systems under environmental stress?
[vector_tool_file_245] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[summary_tool_file_245] A: The provided information does not include specific findings regarding the performance and reliability of data storage systems in small satellite systems under environmental stress. It primarily focuses on design features, access ports, and power systems related to CubeSats, without addressing data storage performance or reliability under such conditions.
[vector_tool_file_245] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_245] A: The provided context does not contain any information regarding the sp

In [None]:
response6 = agent.query('Provide an overview of the most commonly used file systems in space applications, particularly for small satellites. How do these file systems handle reliability and efficiency under extreme conditions?')
print(response6)

> Running step 99cf19b8-b29f-4025-baa7-4e6fbc856c58. Step input: Provide an overview of the most commonly used file systems in space applications, particularly for small satellites. How do these file systems handle reliability and efficiency under extreme conditions?
nodes num:16
tools num:16
Thought: To provide a comprehensive overview of the commonly used file systems in space applications, particularly for small satellites, I will gather information on specific file systems, their reliability, and efficiency under extreme conditions from the available documents. This will include examining fault-tolerant systems and their design considerations for space environments.
Action: tool_file_85
Action Input: {'input': 'Overview of FTRFS and its reliability and efficiency in space applications.'}
> Running step 79a36fe8-ed3b-4a1a-955e-cc112932f26d. Step input: Overview of FTRFS and its reliability and efficiency in space applications.
Thought: To provide a

In [None]:
response7 = agent.query('What are the best practices for organizing and protecting data in space environments, including the use of RAID configurations and other data redundancy techniques in small satellite systems?')
print(response7)

> Running step 597115dd-1de6-49da-aa35-396e5c624450. Step input: What are the best practices for organizing and protecting data in space environments, including the use of RAID configurations and other data redundancy techniques in small satellite systems?
nodes num:16
tools num:16
Thought: To answer the question about best practices for organizing and protecting data in space environments, I will gather information on data storage technologies, RAID configurations, redundancy techniques, and specific implementations in small satellite systems from various sources. My strategy is to focus on technical details, methodologies, and examples from the literature.
Action: tool_file_171
Action Input: {'input': 'best practices for organizing and protecting data in space environments, including RAID configurations and redundancy techniques'}
> Running step 4cbdff9b-c894-4c1b-b227-764b5c7861ca. Step input: best practices for organizing and protecting data in space environments

In [None]:
response8 = agent.query('Compare the energy consumption and performance of various data storage technologies used in small satellites. How do these factors influence the choice of technology?')
print(response8)

> Running step a3004bf8-abf2-45f6-a79a-3229afdf8489. Step input: Compare the energy consumption and performance of various data storage technologies used in small satellites. How do these factors influence the choice of technology?
nodes num:16
tools num:16
Thought: To provide a comprehensive comparison of energy consumption and performance of various data storage technologies used in small satellites, I will gather detailed information from multiple sources that discuss different storage technologies, their energy efficiency, and performance metrics. This will help in understanding how these factors influence the choice of technology in small satellite applications.
Action: tool_file_276
Action Input: {'input': 'Discuss the evolution and selection of data storage technologies for space missions, comparing magnetic tape and solid-state memory systems in terms of reliability, performance, and environmental compatibility.'}
> Running step f39bff70-93b8-4a5f-ab06-2c1f29

In [None]:
response9 = agent.query('Examine the impact of extreme temperatures on the performance of data storage systems in space. Which technologies offer the best thermal stability?')
print(response9)

> Running step 9e3398ca-b045-4597-9eb9-fbcedc915850. Step input: Examine the impact of extreme temperatures on the performance of data storage systems in space. Which technologies offer the best thermal stability?
nodes num:16
tools num:16
Thought: To address the impact of extreme temperatures on data storage systems in space and identify technologies that provide the best thermal stability, I will gather information from multiple sources that discuss thermal management, data storage technologies, and their performance under extreme conditions. This will involve examining documents that focus on thermal management technologies, the effects of temperature on various storage systems, and advancements in materials and designs that enhance thermal stability.
Action: tool_file_14
Action Input: {'input': 'impact of extreme temperatures on data storage systems in space'}
> Running step 594d9184-dbdc-404e-800f-33b6159c8237. Step input: impact of extreme temperatures on data

[vector_tool_file_14] Q: What research has been conducted on the effects of extreme temperatures on flash memory in space applications?
[vector_tool_file_14] Q: How do different data storage technologies compare in terms of their performance under extreme temperature conditions in space?
[vector_tool_file_14] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_14] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_14] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_14] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_14] A: The provided context does not contain any information regarding research

In [None]:
response10 = agent.query('Discuss the future developments and expected breakthroughs in data storage for small satellite systems. What new technologies or approaches are on the horizon?')
print(response10)

> Running step 45baaa6d-e760-4258-8589-4c9c38345f61. Step input: Discuss the future developments and expected breakthroughs in data storage for small satellite systems. What new technologies or approaches are on the horizon?
nodes num:16
tools num:16
Thought: To comprehensively address the future developments and expected breakthroughs in data storage for small satellite systems, I will gather insights from multiple sources that discuss emerging technologies, innovative approaches, and the current state of research in this area. This will include examining advancements in storage technologies, fault tolerance, and the integration of new materials and systems.
Action: tool_file_225
Action Input: {'input': 'Discuss the future developments and expected breakthroughs in data storage for small satellite systems.'}
> Running step c2acfef8-d249-4c76-b33d-09ac98c8aa6d. Step input: Discuss the future developments and expected breakthroughs in data storage for small satellite

In [None]:
response11 = agent.query('What are the specific radiation hardening techniques used for flash memory and MRAM in small satellite systems? Please provide detailed information on their effectiveness and any quantitative data on error rates before and after radiation hardening')
print(response11)

> Running step 085b1e6b-11dd-49e7-bc34-732e4519dbd8. Step input: What are the specific radiation hardening techniques used for flash memory and MRAM in small satellite systems? Please provide detailed information on their effectiveness and any quantitative data on error rates before and after radiation hardening
nodes num:4
tools num:4
Thought: To provide a comprehensive answer regarding radiation hardening techniques for flash memory and MRAM in small satellite systems, I will need to gather detailed information from the available documents. I will focus on identifying specific techniques used for radiation hardening, their effectiveness, and any quantitative data related to error rates before and after these techniques are applied.
Action: tool_file_254
Action Input: {'input': 'What are the specific radiation hardening techniques used for flash memory and MRAM in small satellite systems? Please provide detailed information on their effectiveness and any quantitative da

In [None]:
response12 = agent.query('Can you provide exact temperature range specifications for RRAM, MRAM, and FeRAM when used in space? Include data on how these technologies perform across these temperature ranges')
print(response12)

> Running step 353a5d92-c7e2-41a3-98b0-86a8765b0e4a. Step input: Can you provide exact temperature range specifications for RRAM, MRAM, and FeRAM when used in space? Include data on how these technologies perform across these temperature ranges
nodes num:4
tools num:4
Thought: To provide a comprehensive answer regarding the temperature range specifications for RRAM, MRAM, and FeRAM in space applications, I will need to gather specific data from the available documents. I will start by investigating the specifications and performance of MRAM, as it is one of the technologies mentioned. Then, I will look into FeRAM and RRAM to complete the analysis.
Action: tool_file_62
Action Input: {'input': 'What are the temperature range specifications for MRAM in space applications, and how does it perform across these ranges?'}
> Running step cb8f99db-796f-4220-8b01-eed04a1452d0. Step input: What are the temperature range specifications for MRAM in space applications, and how doe

[vector_tool_file_211] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[vector_tool_file_211] A: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
[summary_tool_file_211] A: The provided information does not mention any specific studies or documents that summarize the temperature range specifications for RRAM in space applications. It primarily focuses on MRAM technology, its reliability, and various vendors, without addressing RRAM or its specifications.
Observation: THE PROVIDED CONTEXT DOESN'T CONTAIN ANY INFORMATION. I CANNOT ANSWER YOUR REQUEST.
> Running step 84ad6349-9115-4570-a083-66e4080d38dc. Step input: None
Thought: Since the initial attempt to gather information about the temperature range specifications for RRAM in space applications did not yield any results, I will now focus on the performance

In [None]:
response13 = agent.query('What are the energy consumption metrics (in milliwatts or watts) for MRAM, RRAM, and FeRAM under typical operating conditions in small satellite systems? How do these compare to traditional flash memory?')
print(response13)

> Running step 68f00cd0-9e3e-4992-ac4b-0d9f1384ae82. Step input: What are the energy consumption metrics (in milliwatts or watts) for MRAM, RRAM, and FeRAM under typical operating conditions in small satellite systems? How do these compare to traditional flash memory?
nodes num:4
tools num:4
Thought: To provide a comprehensive answer regarding the energy consumption metrics for MRAM, RRAM, and FeRAM in small satellite systems, I will first gather specific data on MRAM and FeRAM from the available documents. Since RRAM is not specifically mentioned, I will focus on the other two types of memory and compare them to traditional flash memory based on the information available.
Action: tool_file_62
Action Input: {'input': 'energy consumption metrics for MRAM under typical operating conditions in small satellite systems'}
> Running step 679da25e-f800-4d07-86ea-b3ddebba0cdf. Step input: energy consumption metrics for MRAM under typical operating conditions in small satellit

In [None]:
response14 = agent.query('Could you provide quantitative reliability metrics for MRAM and RRAM in terms of mean time between failures (MTBF) and error rates in a radiation-rich environment?')
print(response14)

> Running step 6355650d-ca97-486c-9f44-29ec83cdd706. Step input: Could you provide quantitative reliability metrics for MRAM and RRAM in terms of mean time between failures (MTBF) and error rates in a radiation-rich environment?
nodes num:4
tools num:4
Thought: To provide quantitative reliability metrics for MRAM and RRAM, I will need to gather specific data related to mean time between failures (MTBF) and error rates in radiation-rich environments from the available documents. I will start with the document discussing MRAM and its radiation effects, then move on to the documents related to RRAM to find relevant metrics.
Action: tool_file_254
Action Input: {'input': 'quantitative reliability metrics for MRAM in radiation-rich environments, focusing on MTBF and error rates'}
> Running step 8a90f4bb-0a43-4420-bb3d-2041ef5024a2. Step input: quantitative reliability metrics for MRAM in radiation-rich environments, focusing on MTBF and error rates
Thought:

In [None]:
response15 = agent.query('What is the maximum storage density (in GB or TB per unit) achievable with current MRAM and RRAM technologies? How do their lifespans compare to those of traditional SSDs in space applications?')
print(response15)

> Running step 68082974-f42e-40c1-b604-f9aca7c6b308. Step input: What is the maximum storage density (in GB or TB per unit) achievable with current MRAM and RRAM technologies? How do their lifespans compare to those of traditional SSDs in space applications?
nodes num:4
tools num:4
Thought: To answer the question about the maximum storage density achievable with current MRAM and RRAM technologies and to compare their lifespans to traditional SSDs in space applications, I will first gather detailed information on the storage densities of MRAM and RRAM from the relevant documents. Then, I will look into the lifespan comparisons of these technologies with traditional SSDs, particularly in the context of space applications.
Action: tool_file_211
Action Input: {'input': 'What is the maximum storage density of MRAM technology?'}
> Running step d2b78ccb-c5e4-44c1-b59e-1ecc34ccb0b1. Step input: What is the maximum storage density of MRAM technology?
Thought: T

In [None]:
response16 = agent.query('Can you provide data on the efficiency (in percentage or degrees) of thermal management systems like TASCs and TARCs in stabilizing temperatures of storage devices in space?')
print(response16)

> Running step fbab75e8-c9bd-46cd-97c8-269ca7d7f0c7. Step input: Can you provide data on the efficiency (in percentage or degrees) of thermal management systems like TASCs and TARCs in stabilizing temperatures of storage devices in space?
nodes num:4
tools num:4
Thought: To provide detailed data on the efficiency of thermal management systems like temperature-adaptive solar and radiative coatings (TASCs and TARCs) in stabilizing temperatures of storage devices in space, I will first gather information from the relevant documents that discuss these technologies and their performance metrics. I will focus on the specific efficiencies mentioned in the studies.
Action: tool_file_212
Action Input: {'input': 'What is the efficiency of TASCs and TARCs in stabilizing temperatures of storage devices in space?'}
> Running step 2518dfc7-6955-4276-a788-b3f958c6135f. Step input: What is the efficiency of TASCs and TARCs in stabilizing temperatures of storage devices in space?

In [None]:
response17 = agent.query('How does the Fault-Tolerant Radiation-Robust Filesystem (FTRFS) compare in terms of read/write speed and error correction performance with other commonly used file systems in small satellite applications?')
print(response17)

> Running step 19962bb2-7e34-4557-9833-e88ef56c1a4b. Step input: How does the Fault-Tolerant Radiation-Robust Filesystem (FTRFS) compare in terms of read/write speed and error correction performance with other commonly used file systems in small satellite applications?
nodes num:4
tools num:4
Thought: To answer this question, I need to gather detailed information on the performance metrics of the Fault-Tolerant Radiation-Robust Filesystem (FTRFS), particularly focusing on its read/write speed and error correction capabilities. I will start by examining the FTRFS document to understand its design and performance characteristics.
Action: tool_file_85
Action Input: {'input': 'FTRFS performance metrics, read/write speed, error correction capabilities'}
> Running step d162e4db-c347-49c8-a840-683b34951ddd. Step input: FTRFS performance metrics, read/write speed, error correction capabilities
Thought: I need to gather detailed information about FTRFS, specifi

In [None]:
response18 = agent.query('What specific RAID configurations are most commonly used in small satellites, and how effective are they in preserving data integrity under space conditions? Provide any relevant performance benchmarks?')
print(response18)

> Running step 32090604-94a6-49d2-b730-65f7368fdd3b. Step input: What specific RAID configurations are most commonly used in small satellites, and how effective are they in preserving data integrity under space conditions? Provide any relevant performance benchmarks?
nodes num:4
tools num:4
Thought: To address the query regarding RAID configurations used in small satellites and their effectiveness in preserving data integrity under space conditions, I will first gather information on RAID configurations and their reliability in space environments from the available documents. This will include insights on specific RAID models, their performance benchmarks, and their effectiveness in mitigating data loss due to space-related challenges.
Action: tool_file_171
Action Input: {'input': 'What specific RAID configurations are commonly used in small satellites, and how effective are they in preserving data integrity under space conditions?'}
> Running step 47c0dd74-61d2-4c48

In [None]:
response19 = agent.query('Are there any upcoming storage technologies or methods that are currently in experimental stages but show promise for small satellite applications? Please include any preliminary data or performance metrics?')
print(response19)

> Running step 0ff37786-15bc-40db-84e4-62e58e3b7e72. Step input: Are there any upcoming storage technologies or methods that are currently in experimental stages but show promise for small satellite applications? Please include any preliminary data or performance metrics?
nodes num:4
tools num:4
Thought: To address the query about upcoming storage technologies or methods for small satellite applications that are in experimental stages, I will utilize the provided tools to gather information on innovative approaches, their benefits, and any preliminary performance metrics. I will focus on documents that discuss energy storage technology, data storage solutions, and the overall impact of miniaturization on satellite systems.
Action: tool_file_150
Action Input: {'input': 'Are there any upcoming storage technologies or methods that are currently in experimental stages but show promise for small satellite applications?'}
> Running step 0d995bc3-ba1b-43d3-a538-bb792f3915cf

In [None]:
response20 = agent.query('What are the cost implications of using advanced memory technologies like MRAM or RRAM versus traditional flash memory in small satellites? Include any available cost estimates for implementation and maintenance?')
print(response20)

> Running step c692a9f3-3146-4c25-9f43-aeb9069b466f. Step input: What are the cost implications of using advanced memory technologies like MRAM or RRAM versus traditional flash memory in small satellites? Include any available cost estimates for implementation and maintenance?
nodes num:4
tools num:4
Thought: To address the cost implications of using advanced memory technologies like MRAM (Magnetoresistive Random Access Memory) and RRAM (Resistive Random Access Memory) compared to traditional flash memory in small satellites, I will gather detailed information from the provided documents. I will focus on cost estimates, implementation, and maintenance aspects related to these memory technologies in space applications.
Action: tool_file_211
Action Input: {'input': 'cost implications of MRAM technology in small satellites, including implementation and maintenance estimates'}
> Running step b784a518-7baf-4447-848a-5f0d4f7d8c89. Step input: cost implications of MRAM tech
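
The cells above call `agent.query` once per question and keep each answer in a separate variable. A loop that collects all answers in one dictionary and survives a failed request is a possible alternative. The sketch below assumes only that the agent exposes a `query` method, as used throughout this notebook. `run_queries` and `EchoAgent` are hypothetical names, and `EchoAgent` is a stand-in so the example runs without an API key.

```python
import json


def run_queries(agent, questions):
    """Query `agent` with each question; return {question: answer or error text}."""
    results = {}
    for question in questions:
        try:
            results[question] = str(agent.query(question))
        except Exception as exc:  # assumption: narrow this, e.g. to openai.BadRequestError
            results[question] = f"ERROR: {exc}"
    return results


class EchoAgent:
    """Hypothetical stand-in for the real agent, used only for this demonstration."""

    def query(self, question):
        if not question:
            raise ValueError("empty question")
        return f"answer to: {question}"


results = run_queries(EchoAgent(), ["What RAID levels are used?", ""])
# Serialising to JSON makes the collected answers easy to store next to the index.
print(json.dumps(results, ensure_ascii=False, indent=2))
```

With the real agent, the list passed to `run_queries` would hold the question strings from the cells above, and one failing question would no longer interrupt the remaining ones.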