In [1]:
def read_txt_file(file_path):
    """
    Read the specified text file and return its contents as a string.
    
    :param file_path: Path to the text file
    :return: File contents as a single string
    """
    # Use the with statement to open the file, ensuring it is properly closed afterwards
    with open(file_path, 'r', encoding='utf-8') as file:
        # Use the read() method to read all contents of the file into a string variable
        data = file.read()
    # print(type(data))
    # At this point, the 'data' variable contains all the contents of the file.
    # Note: If the file has multiple lines, 'data' will also include newline characters '\n'.
    return data


In [3]:
def write_txt_file(file_path, data):
    """
    Write the given string to a text file at the specified path.
    
    :param file_path: Path to the text file
    :param data: String to be written
    """
    with open(file_path, 'w', encoding='utf-8') as file:
        file.write(data)


In [5]:
def deepseek_qa(txtfile, txtdir):
    file_path = os.path.join(txtdir, txtfile)  # portable path join instead of a hard-coded backslash
    txt = read_txt_file(file_path)
    print(f"{txtfile} is running")
    
    reasoning_content = ""  # accumulated reasoning text (not used below)
    answer_content = ""     # accumulated response text
    is_answering = False    # marks the switch from reasoning to answering (not used below)

    # create chat completion request
    stream = client.chat.completions.create(
        model="deepseek-v3",  # using deepseek-v3 as an example, can be changed as needed
        messages=[
            {"role": "user", "content": f"""Please carefully read the provided paper and complete the following tasks:

Scientific Problem and Answer:

Extract the core scientific problem addressed in the paper and present it in the form of a question.

Provide a detailed and comprehensive answer to the question based on the paper's content, including methods, experiments, results, and conclusions.

Use <|begin_of_question|> and <|end_of_question|> to mark the start and end of the question.

Use <|begin_of_answer|> and <|end_of_answer|> to mark the start and end of the answer.

The answer should be written from an objective perspective, avoiding any reference to "this paper" or "the authors."

Thought Chain for Solving the Problem:

Reconstruct the thought chain used to solve the scientific problem from an objective perspective or first-person perspective.

Use <|begin_of_thought|> and <|end_of_thought|> to mark the start and end of the thought chain.

The thought chain should include the following detailed elements:

Problem Identification: Clearly state the problem or gap in the field that motivated the research.

Why It Matters: Explain why solving this problem is important and what impact it could have on the field or real-world applications.

Hypothesis Formation: Describe the hypothesis or key idea proposed to address the problem, and explain the reasoning behind it.

Method Design: Explain the methodology or approach developed to test the hypothesis, including any novel techniques or tools. Clearly articulate why this method was chosen and how it addresses the problem.

Experimental Setup: Detail the experimental design, including datasets, metrics, and baseline comparisons. Explain why these choices were made and how they align with the research goals.

Data Analysis: Describe how the data was analyzed, including any challenges encountered and how they were addressed. Explain why specific analysis techniques were used and how they help validate the hypothesis.

Results and Interpretation: Summarize the key results and their implications for the hypothesis and the broader field. Explain why these results are significant and how they contribute to solving the problem.

Limitations and Future Work: Discuss the limitations of the approach and propose potential future directions for improvement or extension. Explain why these limitations exist and how future work could address them.

Avoid any reference to "this paper" or "the authors."

Ensure the logic is clear and easy to understand, even for readers without deep expertise in the field.

Thought Chain for Deriving the Research Idea:

Reconstruct the thought chain that led to the formation of the research idea, from an objective perspective or first-person perspective.

Use <|begin_of_idea_thought|> and <|end_of_idea_thought|> to mark the start and end of this thought chain.

The thought chain should include the following detailed elements:

Research Background: Describe the broader context of the research area and why it is important.

Current State of the Field: Summarize the existing approaches and their limitations.

Problem Discovery: Explain how the specific problem addressed in the paper was identified, including any observations or gaps in the literature.

Idea Formation: Describe the process of developing the core idea or hypothesis, including any inspiration, analogies, or prior work that influenced the thinking.

Validation of the Idea: Explain how the idea was initially validated or tested (e.g., through preliminary experiments, theoretical analysis, or literature review).

Refinement of the Idea: Discuss how the idea evolved over time, including any adjustments or iterations based on feedback or new insights.

Avoid any reference to "this paper" or "the authors."

Ensure the logic is clear and easy to understand, even for readers without deep expertise in the field.

Output Format:

<|begin_of_question|>
[Present the core scientific problem in the form of a question]
<|end_of_question|>

<|begin_of_answer|>
[Provide a detailed and comprehensive answer, including methods, experiments, results, and conclusions, written from an objective perspective]
<|end_of_answer|>

<|begin_of_thought|>
[Describe the thought process in the first person or from an objective perspective, including all detailed elements: problem identification, why it matters, hypothesis formation, method design, experimental setup, data analysis, results and interpretation, limitations, and future work. Ensure the logic is clear and easy to understand.]
<|end_of_thought|>

<|begin_of_idea_thought|>
[Describe the thought process in the first person or from an objective perspective, including all detailed elements: research background, current state of the field, problem discovery, idea formation, validation of the idea, and refinement of the idea. Ensure the logic is clear and easy to understand.]
<|end_of_idea_thought|>

Example:

<|begin_of_question|>
How can the generalization ability of deep learning models on small-sample datasets be improved without increasing computational complexity?
<|end_of_question|>

<|begin_of_answer|>
A meta-learning-based adaptive weight adjustment method has been proposed to address this challenge. This method involves designing a lightweight meta-network that dynamically adjusts the weights of the main network based on the features of the input data. The core idea is to use meta-learning to simulate the model's performance across different tasks, thereby enhancing its generalization ability on small-sample data. Experiments conducted on several small-sample datasets, such as Mini-ImageNet and CIFAR-FS, demonstrated that this approach significantly improves model performance without substantially increasing computational complexity. For instance, on the Mini-ImageNet dataset, the model's accuracy improved by approximately 8%. However, the method's sensitivity to hyperparameters was identified as a limitation, suggesting a need for further optimization, such as exploring more efficient meta-network architectures.
<|end_of_answer|>

<|begin_of_thought|>
The problem of poor generalization in deep learning models on small-sample datasets was identified as a significant challenge in the field. Existing models often overfit due to limited data availability, leading to suboptimal performance in real-world applications. Solving this problem is crucial because many practical scenarios, such as medical diagnosis or rare event prediction, involve limited data. Improving generalization in such settings could enable more reliable and accurate AI systems.

To address this issue, a hypothesis was formed: dynamically adjusting model parameters based on input data features could improve adaptability and generalization without requiring additional computational resources. This idea was motivated by the observation that traditional models use fixed parameters, which may not be optimal for diverse small-sample tasks. By allowing the model to adapt its parameters dynamically, it could better capture the unique characteristics of each task.

To test this hypothesis, a lightweight meta-network was designed. This meta-network operates alongside the main model, analyzing input data features and dynamically adjusting the main model's weights. The design prioritized efficiency to ensure that the computational overhead remained minimal. The meta-network was trained using a meta-learning framework, which allowed it to simulate performance across diverse tasks and datasets. This approach was chosen because meta-learning has shown promise in enabling models to generalize across tasks, making it a natural fit for small-sample problems.

Experiments were conducted on multiple small-sample datasets, including Mini-ImageNet and CIFAR-FS. These datasets were selected because they are widely used benchmarks for small-sample learning, allowing for fair comparisons with existing methods. The experimental setup included comparisons with baseline models to evaluate performance improvements. Key metrics such as accuracy, training time, and computational cost were measured. These metrics were chosen because they directly reflect the goals of improving generalization without increasing computational complexity.

During data analysis, it was observed that the method's performance was highly dependent on the choice of hyperparameters. This sensitivity was addressed through extensive hyperparameter tuning, but it remains a limitation of the approach. Additionally, the method's effectiveness varied across different types of small-sample datasets, suggesting that further customization may be needed for specific applications. These challenges were analyzed to understand their root causes and identify potential solutions.

The results showed that the proposed method significantly improved model performance, particularly in data-scarce scenarios. For example, on the Mini-ImageNet dataset, accuracy improved by approximately 8%, while computational costs remained comparable to baseline models. These results are significant because they demonstrate that dynamic weight adjustment can effectively enhance generalization without sacrificing efficiency. This finding has broad implications for fields where data is scarce, such as healthcare or environmental monitoring.

However, the method's sensitivity to hyperparameters and dataset variability highlights the need for further refinement. Future work could explore more robust meta-network architectures, automated hyperparameter optimization techniques, and applications to a broader range of tasks, such as medical imaging or natural language processing. These directions are important because they address the current limitations and could further improve the method's practicality and effectiveness.
<|end_of_thought|>

<|begin_of_idea_thought|>
The research idea emerged from a broader interest in improving the practicality of deep learning models in real-world scenarios, where data is often limited. The field of small-sample learning has gained attention due to its relevance in applications like medical imaging, where collecting large datasets is expensive or ethically challenging. However, existing methods often struggle with overfitting and fail to generalize well to new tasks.

A review of the current state of the field revealed that most approaches focus on either data augmentation or complex model architectures, which often come with high computational costs. While these methods can improve performance, they are not always feasible in resource-constrained settings. This gap highlighted the need for a more efficient solution that could enhance generalization without increasing computational complexity.

The specific problem of poor generalization in small-sample datasets was identified through experiments with existing models. It became clear that fixed model parameters, which work well in large-scale settings, are not suitable for small-sample tasks where data variability is high. This observation led to the hypothesis that dynamic parameter adjustment could be a key to improving generalization.

The core idea of using meta-learning for dynamic weight adjustment was inspired by prior work in few-shot learning, where meta-learning has been successful in enabling models to adapt quickly to new tasks. The analogy was drawn that a similar approach could be applied to small-sample learning, but with a focus on efficiency to avoid excessive computational overhead.

To validate this idea, preliminary experiments were conducted using simple meta-network prototypes. These experiments showed promising results, indicating that dynamic weight adjustment could indeed improve generalization. However, they also revealed challenges, such as the meta-network's sensitivity to hyperparameters, which needed to be addressed.

Over time, the idea was refined through iterative experimentation and feedback from the research community. The meta-network architecture was optimized for efficiency, and new training techniques were introduced to stabilize performance. These refinements were crucial in transforming the initial idea into a practical and effective solution.
<|end_of_idea_thought|>

Here is the article:
{txt}"""}
        ],
        stream=True,
        # Uncomment the following to include token usage in the final chunk
        # stream_options={
        #     "include_usage": True
        # }
    )

    for chunk in stream:
        # Handle usage information
        if not getattr(chunk, 'choices', None):
            print("\n" + "=" * 20 + "Token Usage" + "=" * 20 + "\n")
            print(chunk.usage)
            continue

        delta = chunk.choices[0].delta

        # Handle response content
        if getattr(delta, 'content', None):
            # print(delta.content, end='', flush=True)
            answer_content += delta.content

    # If you need to print the full content, uncomment the following

    # print("=" * 20 + "Complete Response" + "=" * 20 + "\n")
    # print(answer_content)

    # Note: this overwrites the input file with the model's response
    write_txt_file(file_path, answer_content)
    return f"{txtfile} is done"
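
The prompt above asks the model to wrap each part of its reply in `<|begin_of_X|>` / `<|end_of_X|>` markers. A minimal sketch of how a saved response could be split back into those parts, assuming the model actually follows the requested format; `extract_tagged_sections` is a hypothetical helper and not part of the original notebook:

```python
import re

# Section names taken from the markers requested in the prompt.
SECTION_NAMES = ("question", "answer", "thought", "idea_thought")


def extract_tagged_sections(response_text):
    """
    Hypothetical helper: pull the text between each <|begin_of_X|> and
    <|end_of_X|> marker pair out of a model response.

    :param response_text: Full response string returned by the model
    :return: dict mapping section name to its stripped text, or None if
             the marker pair was not found
    """
    sections = {}
    for name in SECTION_NAMES:
        pattern = (
            re.escape(f"<|begin_of_{name}|>")
            + r"(.*?)"
            + re.escape(f"<|end_of_{name}|>")
        )
        # DOTALL lets '.' span the newlines inside a section.
        match = re.search(pattern, response_text, flags=re.DOTALL)
        sections[name] = match.group(1).strip() if match else None
    return sections


demo = (
    "<|begin_of_question|>\nWhy?\n<|end_of_question|>\n"
    "<|begin_of_answer|>\nBecause.\n<|end_of_answer|>"
)
parts = extract_tagged_sections(demo)
print(parts["question"])  # Why?
print(parts["thought"])   # None
```

Returning `None` for a missing section, rather than raising, makes it easy to flag incomplete responses for a second pass.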


In [11]:
def main(txtdir):
    futures = []
    with ThreadPoolExecutor(max_workers=10) as executor:
        for txtfile in os.listdir(txtdir):
            if txtfile.endswith('.txt'):  # ensure only text files are processed
                future = executor.submit(deepseek_qa, txtfile, txtdir)
                futures.append(future)
        
        # collect results
        for future in as_completed(futures):
            try:
                result = future.result()
                print(result)
            except Exception as e:
                print(f"An exception occurred: {e}")


In [13]:
import os
from openai import OpenAI
from concurrent.futures import ThreadPoolExecutor, as_completed

client = OpenAI(
    # If the environment variable is not configured, replace the following line with your Bailian API key, e.g. api_key="sk-xxx"
    api_key=os.getenv("DASHSCOPE_API_KEY"),  # How to get an API key: https://help.aliyun.com/zh/model-studio/developer-reference/get-api-key
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
)

txtdir = "F:\\Working\\ModelDistillation\\pdfoutput"
main(txtdir)


10.1002-adma.202408416.pdf.txt is running
10.1002-advs.202203889.pdf.txt is running
10.1002-adma.202412708.pdf.txt is running
10.1002-advs.202304424.pdf.txt is running
10.1002-aenm.202303281.pdf.txt is running
10.1002-advs.202409290.pdf.txt is running
10.1002-anie.202218076.pdf.txt is running
10.1002-jssc.202201057.pdf.txt is running
10.1002-smll.202301130.pdf.txt is running
10.1002-marc.202300730.pdf.txt is running
An exception occurred: Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param': None, 'message': '<400> InternalError.Algo.InvalidParameter: Range of input length should be [1, 57344]', 'type': 'invalid_request_error'}, 'id': 'chatcmpl-0969c701-4549-9aca-906d-f0374d40f9ad', 'request_id': '0969c701-4549-9aca-906d-f0374d40f9ad'}
10.1002-smll.202402783.pdf.txt is running
An exception occurred: Error code: 400 - {'error': {'code': 'invalid_parameter_error', 'param': None, 'message': '<400> InternalError.Algo.InvalidParameter: Range of input length should be [1, 57344]', 'type': 'invalid_re
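
The errors above report that a request was rejected because its input length fell outside the allowed range `[1, 57344]`, so some extracted papers are either empty or too long for the model. A rough sketch of a pre-check that could run on the paper text before it is placed in the prompt, under stated assumptions: the limit is presumably counted in tokens, and since no tokenizer is used here the helper falls back to a character budget. `fit_to_input_limit` and the value of `MAX_INPUT_CHARS` are hypothetical and would need tuning, especially because the instructions in the prompt also count toward the limit:

```python
# Assumption: a conservative character budget standing in for the service's
# 57344 input limit, which is presumably measured in tokens. Tune as needed.
MAX_INPUT_CHARS = 50000


def fit_to_input_limit(text, max_chars=MAX_INPUT_CHARS):
    """
    Hypothetical helper: reject empty input and trim overly long input.

    :param text: Paper text to be embedded in the prompt
    :param max_chars: Maximum number of characters to keep
    :return: The text, cut to at most max_chars characters
    :raises ValueError: If the text is empty or whitespace only
    """
    if not text.strip():
        raise ValueError("input text is empty")
    if len(text) <= max_chars:
        return text
    # Keep the beginning of the paper and drop the tail.
    return text[:max_chars]
```

Cutting from the end is the simplest choice; a more careful variant might drop the reference list first or split the paper into several requests.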