# Paper Generator
This notebook demonstrates the paper generation pipeline we propose. We use LangChain to interact with the OpenAI API.

## Prerequisites


In [37]:
%pip install langchain langchain-core langchain-community langchain_openai ipywidgets



Note: you may need to restart the kernel to use updated packages.


In [38]:
import ipywidgets as widgets
from IPython.display import display, Markdown


OPENAI_API_KEY = input("Enter your OpenAI API key: ")


In [39]:


NOUGAT_URL = input("Enter Nougat URL: ")


In [40]:
if not OPENAI_API_KEY:
    raise ValueError("Please provide OpenAI API Key")
if not NOUGAT_URL:
    raise ValueError("Please provide Nougat URL")
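
Values typed into `input()` are echoed into the cell output and saved with the notebook. Below is a minimal sketch of an alternative, assuming the key may already be exported as an environment variable. The helper name `read_secret` and its parameters are hypothetical and not part of the original notebook.

```python
import os
from getpass import getpass


def read_secret(name, prompt, env=None, ask=getpass):
    """Return a secret from the environment, prompting only if it is missing."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    if not value:
        # getpass hides the typed characters, unlike input()
        value = ask(prompt).strip()
    if not value:
        raise ValueError(f"Please provide {name}")
    return value
```

It could be used as `OPENAI_API_KEY = read_secret("OPENAI_API_KEY", "Enter your OpenAI API key: ")`, which also makes the separate validation cell unnecessary for that value.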

## File Upload


In [65]:
uploader_base = widgets.FileUpload(
    accept='.pdf',  # Accepted file extension e.g. '.txt', '.pdf', 'image/*', 'image/*,.pdf'
    multiple=False,  # True to accept multiple files upload else False
    description="Upload Base"
)
display(uploader_base)

uploader_similar = widgets.FileUpload(
    accept='.pdf',  # Accepted file extension e.g. '.txt', '.pdf', 'image/*', 'image/*,.pdf'
    multiple=True,  # True to accept multiple files upload else False
    description="Upload Similar"
)
display(uploader_similar)



FileUpload(value=(), accept='.pdf', description='Upload Base')

FileUpload(value=(), accept='.pdf', description='Upload Similar', multiple=True)

## Nougat 

In [68]:
from requests import post

# Convert the base paper to Markdown via the Nougat service.
file = uploader_base.value[0]
display(f'NOUGAT_URL: {NOUGAT_URL}')
display(f'Uploaded file: {file.name}')
headers = {
    "Accept": "application/json",
}
response = post(NOUGAT_URL + "/predict",
                files={"file": file.content}, headers=headers)
# raise_for_status() raises on any 4xx/5xx status, so no extra check is needed
response.raise_for_status()

response = response.json()


# Convert each similar paper as well. A separate variable keeps `response`
# (the parsed base paper) from being overwritten inside the loop.
similar_papers = []
sim_files = uploader_similar.value
for sim_file in sim_files:
    sim_response = post(NOUGAT_URL + "/predict",
                        files={"file": sim_file.content}, headers=headers)
    sim_response.raise_for_status()
    similar_papers.append(sim_response.json())


'NOUGAT_URL: http://137.226.232.15:8503'

'Uploaded file: sulayman_corona.pdf'
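
The base paper and the similar papers go through the same request, so the call can be factored into one helper. This is a sketch only: the name `pdf_to_markdown`, the injectable `post` argument, and the 600-second timeout are assumptions made for illustration, and the shape of the JSON returned by the Nougat service is not specified in this notebook.

```python
def pdf_to_markdown(nougat_url, pdf_bytes, post=None, timeout=600):
    """POST one PDF to the Nougat /predict endpoint and return the parsed JSON body."""
    if post is None:
        # Imported lazily so the helper can be exercised with a stub `post`.
        from requests import post
    response = post(
        nougat_url.rstrip("/") + "/predict",
        files={"file": bytes(pdf_bytes)},  # bytes() also accepts a memoryview
        headers={"Accept": "application/json"},
        timeout=timeout,  # assumed value; without it the request can wait indefinitely
    )
    response.raise_for_status()
    return response.json()
```

With this helper, the similar papers could be collected as `[pdf_to_markdown(NOUGAT_URL, f.content) for f in uploader_similar.value]`.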

In [73]:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(api_key=OPENAI_API_KEY, model="gpt-4-turbo")

## Table of Contents Generation

In [74]:
topic = "Research in time of COVID-19"
context = "\n\n\n\n".join(similar_papers)
base_chapter = response
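
Joining several full papers can produce a prompt that is larger than the model accepts. The following is a rough sketch of one way to cap the context, assuming each entry in `similar_papers` is a plain Markdown string. The function `build_context` and the `max_chars` default are hypothetical, and a character count is only a crude stand-in for a token count.

```python
def build_context(papers, max_chars=200_000, separator="\n\n\n\n"):
    """Join papers, giving each an equal share of a total character budget."""
    if not papers:
        return ""
    # Reserve room for the separators, then split the rest evenly.
    budget = max(0, max_chars - len(separator) * (len(papers) - 1))
    share = budget // len(papers)
    return separator.join(paper[:share] for paper in papers)
```

Truncating each paper to an equal share keeps every source represented, at the cost of cutting long papers mid-text.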

In [78]:
from langchain_core.prompts import PromptTemplate
toc_template = """
You are a scientific researcher, an expert in crafting high-quality scientific documents.
You're trained across a wide range of scientific disciplines, enabling you to provide
specialized assistance across various topics.


Context: {context}


Base: {base_chapter}

Write the Table of Contents for a paper with the following topic: {topic}.
You can use the base as a starting point.
Use the context for information about the topic.


Only output markdown formatted text. Only output the generated table of contents, no additional information.
"""

toc_prompt = PromptTemplate.from_template(toc_template)

toc_chain = toc_prompt | llm

In [79]:
toc_response = toc_chain.invoke(input={"topic": topic, "context": context, "base_chapter": base_chapter})
display(Markdown(toc_response.content))

# Table of Contents

1. **Introduction**
   - Overview of COVID-19 pandemic and its global impact
   - Objectives of the paper

2. **Impact of COVID-19 on Research Practices**
   - Shifts in Research Priorities
   - Changes in Research Funding and Resources

3. **Methodological Adjustments in Research during COVID-19**
   - Adoption of Remote and Digital Research Methods
   - Challenges and Innovations in Data Collection

4. **Collaboration Patterns in Research during the Pandemic**
   - Increase in International Collaborations
   - Evolution of Interdisciplinary Research Teams
   - Role of Digital Platforms in Facilitating Collaborations

5. **Publication Trends during COVID-19**
   - Preprint and Peer-Review Processes
   - Open Access and Research Dissemination
   - Analysis of Publication Delays

6. **Sector-Specific Impacts**
   - Biomedical and Health Research
   - Social Sciences and Humanities
   - Engineering and Technology

7. **Gender and Demographic Considerations**
   - Research Output by Gender
   - Impact on Early-Career Researchers

8. **Policy Implications and Recommendations**
   - Recommendations for Research Funding Bodies
   - Policy Changes in Research Institutions
   - Future Preparedness for Global Emergencies

9. **Conclusion**
   - Summary of Key Findings
   - Future Research Directions

10. **References**

11. **Appendices**
   - Additional Data and Methodological Details
   - Case Studies and Interviews
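
If later steps need the chapter titles individually, the generated Markdown can be parsed back into a list. This sketch assumes the model keeps the format shown above, with top-level entries written as `N. **Title**`. Model output can vary between runs, so the pattern may need adjusting. The name `parse_chapters` is hypothetical.

```python
import re

# Matches top-level entries such as "1. **Introduction**"
CHAPTER_PATTERN = re.compile(r"^(\d+)\.\s+\*\*(.+?)\*\*\s*$")


def parse_chapters(toc_markdown):
    """Return (number, title) pairs for the top-level numbered entries."""
    chapters = []
    for line in toc_markdown.splitlines():
        match = CHAPTER_PATTERN.match(line)
        if match:
            chapters.append((int(match.group(1)), match.group(2)))
    return chapters
```

Calling `parse_chapters(toc_response.content)` would then yield one pair per chapter, skipping the indented bullet points.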