# Notebook for Paper Comparison Flow via OpenAI
In this example, we will show you how to compare two papers using OpenAI's models via `uniflow`'s `TransformComparisonOpenAIFlow` (see also [OpenAIJsonModelFlow](https://github.com/CambioML/uniflow/blob/main/uniflow/flow/model_flow.py#L125)).


### Before running the code

You will need the `uniflow` conda environment to run this notebook. You can set up the environment by following the installation instructions: https://github.com/CambioML/uniflow/tree/main#installation.

Next, you will need a valid [OpenAI API key](https://platform.openai.com/api-keys) to run the code. Once you have the key, set it as the environment variable `OPENAI_API_KEY` within a `.env` file in the root directory of this repository. For more details, see these [instructions](https://github.com/CambioML/uniflow/tree/main#api-keys).
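As a quick sanity check, a sketch like the one below can confirm that the key is visible to the Python process before any API call is made. The helper name `require_env` is hypothetical and is not part of `uniflow`; it only inspects an environment mapping and raises a readable error when the variable is missing or empty.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise a clear error.

    `require_env` is a hypothetical helper for illustration, not a uniflow API.
    """
    # Default to the real process environment when no mapping is supplied.
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to the .env file in the repository root."
        )
    return value
```

One way to use it is to call `require_env("OPENAI_API_KEY")` right after `load_dotenv()`, so that a missing key fails early rather than partway through a run.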

In this example, we'll be using two papers in markdown format from `example/transform/data/raw_input/`.

### Update system path

In [2]:
%reload_ext autoreload
%autoreload 2

import sys

sys.path.append(".")
sys.path.append("..")
sys.path.append("../..")

### Import dependencies

In [3]:
from dotenv import load_dotenv

from uniflow.flow.flow_factory import FlowFactory
from uniflow.flow.client import TransformClient
from uniflow.flow.config import TransformOpenAIConfig
from uniflow.op.model.model_config import OpenAIModelConfig
from uniflow.op.prompt import Context

load_dotenv()




True

In [4]:
FlowFactory.list()

{'extract': ['ExtractHTMLFlow',
  'ExtractImageFlow',
  'ExtractIpynbFlow',
  'ExtractMarkdownFlow',
  'ExtractPDFFlow',
  'ExtractTxtFlow',
  'ExtractGmailFlow'],
 'transform': ['TransformAzureOpenAIFlow',
  'TransformComparisonGoogleFlow',
  'TransformComparisonOpenAIFlow',
  'TransformCopyFlow',
  'TransformGoogleFlow',
  'TransformGoogleMultiModalModelFlow',
  'TransformHuggingFaceFlow',
  'TransformLMQGFlow',
  'TransformOpenAIFlow'],
 'rater': ['RaterFlow']}

### Prepare the input data
The two papers have already been preprocessed into markdown format.

In [5]:
with open(r"data/raw_input/paper_1_raw.md", 'r') as file:
    paper_1_content = file.read()

with open(r"data/raw_input/paper_2_raw.md", 'r') as file:
    paper_2_content = file.read()

raw_context_input = [
    paper_1_content,
    paper_2_content,
]
raw_context_input

['# The Impact of Artificial Intelligence on Healthcare: A Review\n\n## Abstract\nArtificial intelligence (AI) has emerged as a transformative technology in healthcare, offering opportunities to improve patient outcomes, streamline processes, and enhance decision-making. This paper provides an overview of the current state of AI in healthcare, explores its applications, benefits, challenges, and future prospects.\n\n## Introduction\nIn recent years, artificial intelligence has gained significant traction across various industries, and healthcare is no exception. With advancements in machine learning, natural language processing, and robotics, AI has the potential to revolutionize healthcare delivery, diagnosis, treatment, and management. This paper aims to delve into the role of AI in healthcare, highlighting its implications, challenges, and future directions.\n\n## Background\nThe integration of AI into healthcare systems has been facilitated by the exponential growth of data, couple
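The two `open` calls above can also be folded into a small loop. The following is a minimal sketch using only the standard library; `read_papers` is a hypothetical helper name, and the explicit `utf-8` encoding is an assumption about how the markdown files were saved.

```python
from pathlib import Path
from typing import Iterable, List


def read_papers(paths: Iterable[str], encoding: str = "utf-8") -> List[str]:
    """Read each file fully and return the contents in the same order.

    `read_papers` is a hypothetical helper for illustration, not a uniflow API.
    """
    return [Path(p).read_text(encoding=encoding) for p in paths]
```

With this helper, `raw_context_input` could be built as `read_papers(["data/raw_input/paper_1_raw.md", "data/raw_input/paper_2_raw.md"])`.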

### Use LLM to compare two papers

In this example, we will use the default LLM of [OpenAIModelConfig](https://github.com/CambioML/uniflow/blob/main/uniflow/model/config.py#L17) to compare the two papers.

In [24]:
input_data = [[Context(context=raw_context_input[0]), Context(context=raw_context_input[1])]]
config = TransformOpenAIConfig(
    flow_name="TransformComparisonOpenAIFlow",
    model_config=OpenAIModelConfig(),
)
client = TransformClient(config)
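Note the shape of `input_data`: it is a list whose single element is a pair of contexts. If more than two papers were available, one possible way to prepare the pairs is sketched below. This assumes the flow treats each inner two-element list as one comparison, which is consistent with the single-item progress bar in this notebook but has not been verified against the `uniflow` source. `pairwise_inputs` is a hypothetical helper, and it works on plain strings; in the notebook each string would still be wrapped in `Context(context=...)`.

```python
from itertools import combinations
from typing import List, Sequence


def pairwise_inputs(papers: Sequence[str]) -> List[List[str]]:
    """Return every unordered pair of papers as a two-element list.

    Hypothetical helper for illustration, not a uniflow API.
    """
    return [[a, b] for a, b in combinations(papers, 2)]


pairs = pairwise_inputs(["paper A", "paper B", "paper C"])
print(len(pairs))  # 3 pairs: (A, B), (A, C), (B, C)
```

Keep in mind that the number of pairs grows quadratically with the number of papers, and each pair corresponds to at least one model call.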

Now we call the `run` method on the `client` object to run the comparison on the input data prepared above.

In [49]:
output = client.run(input_data)

100%|██████████| 1/1 [01:29<00:00, 89.62s/it]


### Process the output

Let's take a look at the generated output. The raw output needs a little postprocessing first.

In [50]:
comparison = output[0]['output']

for contexts_list in comparison:
    for context in contexts_list:
        print(context.context[0])

Both paper A and paper B focus on the impact of renewable energy adoption on global carbon emissions. Both papers emphasize the importance of transitioning to renewable energy sources to address the threats posed by climate change. 

Paper A outlines its methodology of using data from various countries over the past two decades and utilizing statistical models to assess the effectiveness of renewable energy in reducing carbon footprints worldwide. Paper B also highlights a comprehensive approach to the analysis, focusing on a diverse set of countries and using statistical models to provide a rigorous assessment of the impact of renewable energy adoption on global carbon emissions.

The findings of both papers suggest that adopting renewable energy is a viable strategy for significantly reducing global carbon emissions. Both papers emphasize the practical implications of their research, highlighting the need for supportive policies to facilitate the transition to renewable energy source
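The nested loop above can be wrapped in a small function that returns the comparison text instead of printing it. This is only a sketch under the shape observed in this notebook: each result is a dict with an `"output"` key holding a list of lists of objects whose `.context` attribute is a list of strings. `collect_comparisons` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for the real output objects, not actual `uniflow` output.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_comparisons(results: Iterable[dict]) -> List[str]:
    """Flatten the nested result structure into a list of comparison strings.

    Assumes the shape seen in this notebook; not a uniflow API.
    """
    texts: List[str] = []
    for result in results:
        # A result without an "output" key contributes nothing.
        for contexts_list in result.get("output", []):
            for context in contexts_list:
                # Keep every string in `.context`, not only the first one.
                texts.extend(context.context)
    return texts


# Stand-in data mimicking the assumed shape; not real uniflow output.
fake_output = [{"output": [[SimpleNamespace(context=["Papers A and B agree."])]]}]
print(collect_comparisons(fake_output))  # ['Papers A and B agree.']
```

Unlike the cell above, which prints only `context.context[0]` for the first result, this version walks every result and keeps every string, so it may return more text than the original loop prints.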

## End of the notebook

Check out more Uniflow use cases in the [example folder](https://github.com/CambioML/uniflow/tree/main/example/model#examples)!

<a href="https://www.cambioml.com/" title="Title">
    <img src="../image/cambioml_logo_large.png" style="height: 100px; display: block; margin-left: auto; margin-right: auto;"/>
</a>