# Ethical Review Automation for Clinical Research Proposals

This notebook automates the ethical review process of clinical research proposals for the Brazilian SUS health system. It guides users through uploading research documents, leveraging AI to assess key ethical dimensions and parameters, and extracting structured results including identified weaknesses and compliance recommendations.

The results are cleaned for compatibility and exported to a PDF report, so that researchers and ethics committees can review, share, and act on the ethical considerations in their projects.


## Automated Ethical Review of Research Proposals

This section enables you to upload one or more research proposal documents (RTF, DOCX, TXT) for automated ethical review based on Brazil's SUS health system regulations.

Using OpenAI's GPT-4o model, each proposal is analyzed across **7 core ethical dimensions** and **41 detailed parameters** relevant to CNS Resolution 466/2012 and SUS CEP/CONEP guidelines.

The AI returns:

- A structured JSON summary with ethical evaluations for each dimension and parameter.
- A brief summary of **key ethical weaknesses** detected in the project.
- **Recommendations** to achieve full ethical compliance.

The code then parses the AI responses, compiles the results into a pandas DataFrame, and saves them as a CSV file (`ethical_review_results.csv`). This CSV is automatically downloaded for further analysis or reporting.

### Key features:
- Bulk upload of multiple proposals.
- Clear JSON output to facilitate structured data handling.
- Extraction of human-readable summaries of weaknesses and recommendations.
- Progress tracking via tqdm progress bar.
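
The split between the JSON block and the two free-text sections can be sketched as follows. This is a minimal illustration, not the notebook's own parser: `split_response` and the sample reply are hypothetical, and the headings are assumed to appear exactly as in the prompt's response format. It uses `json.JSONDecoder().raw_decode`, which reports where the JSON object ends, so a stray `}` in the free text does not affect the slice.

```python
import json

# Headings as they appear in the prompt's response format (assumed unchanged)
WEAK_KEY = "Key Ethical Weaknesses for Each Project:"
REC_KEY = "Recommendations to Reach Full Ethical Compliance (CNS Res. 466/12):"

def split_response(ai_text):
    """Return (data, weaknesses, recommendations) from one model reply."""
    start = ai_text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    # raw_decode parses a single JSON value starting at `start` and
    # returns the index where it ended
    data, end = json.JSONDecoder().raw_decode(ai_text, start)
    tail = ai_text[end:]
    weaknesses, recommendations = "", ""
    if WEAK_KEY in tail:
        after_weak = tail.split(WEAK_KEY, 1)[1]
        # partition leaves recommendations empty if the heading is missing
        weaknesses, _, recommendations = after_weak.partition(REC_KEY)
    return data, weaknesses.strip(), recommendations.strip()

# Hypothetical reply, for illustration only
sample = (
    '{"review_outcome": "Revise", "informed_consent": "Partial"}\n\n'
    "Key Ethical Weaknesses for Each Project:\n"
    "- No withdrawal clause {section 4}\n\n"
    "Recommendations to Reach Full Ethical Compliance (CNS Res. 466/12):\n"
    "- Add a withdrawal clause"
)
data, weak, rec = split_response(sample)
print(data["review_outcome"], "|", weak, "|", rec)
```

Slicing from the first `{` to the last `}` would fail on this sample, because the weaknesses text itself contains braces.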


In [None]:
# Install third-party packages before importing them
!pip install unidecode
import json
import pandas as pd
from google.colab import files
from tqdm import tqdm
from openai import OpenAI
import seaborn as sns
import matplotlib.pyplot as plt
import unidecode
from fpdf import FPDF

# Initialize OpenAI client (replace YOUR_API_KEY securely)
client = OpenAI(api_key="YOUR_API_KEY")

# Upload research proposals (the prompt is printed before the dialog opens)
print("Upload your research project files (RTF, DOCX, TXT)...")
uploaded = files.upload()  # Opens a file dialog; multiple files can be selected

# Prompt builder with 7 dimensions + 41 parameters + weaknesses + recommendations
def build_prompt(proposal_text):
    return f"""
You are an expert in ethical review of clinical research proposals for the SUS health system in Brazil. Analyze the proposal text below and respond with a JSON object including:

- 7 core ethical dimensions with values such as Yes/No/High/Medium/Low/Partial/Full/None/Adopt/Revise/Reject, etc.
- 41 specific ethical parameters relevant to SUS CEP/CONEP.

Additionally, provide two textual fields outside the JSON:

⚠️ Key Ethical Weaknesses for Each Project:
✅ Recommendations to Reach Full Ethical Compliance (CNS Res. 466/12):

Proposal Text:
{proposal_text}

Respond EXACTLY in this format:

{{
  "ethical_clarity": "...",
  "feasibility": "...",
  "policy_alignment": "...",
  "community_participation": "...",
  "review_outcome": "...",
  "data_protection": "...",
  "informed_consent": "...",
  "subject_benefits": "...",
  "research_integrity": "...",
  "risk_benefit_ratio": "...",
  "safety_concerns": "...",
  "adverse_event_reporting": "...",
  "literature_review": "...",
  "training_for_research_staff": "...",
  "project_budget": "...",
  "ethics_training": "...",
  "patient_privacy": "...",
  "telehealth_best_practices": "...",
  "implementation_strategy": "...",
  "evaluation_methods": "...",
  "community_participation_detail": "...",
  "scientific_validity": "...",
  "social_value": "...",
  "study_design": "...",
  "recruitment_process": "...",
  "inclusion_exclusion_criteria": "...",
  "participant_information": "...",
  "language_clarity": "...",
  "voluntariness": "...",
  "withdrawal_rights": "...",
  "assent_for_minors": "...",
  "third_party_consent": "...",
  "vulnerable_population_justification": "...",
  "fair_risk_distribution": "...",
  "benefit_access": "...",
  "compensation_policy": "...",
  "adverse_event_handling": "...",
  "safety_monitoring": "...",
  "personal_data_handling": "...",
  "data_storage_security": "...",
  "confidentiality_guarantee": "...",
  "anonymization_method": "...",
  "genetic_data_use": "...",
  "sample_storage": "...",
  "future_data_use": "...",
  "study_duration": "...",
  "staff_qualification": "...",
  "infrastructure_capacity": "...",
  "budget_transparency": "...",
  "funding_source_independence": "...",
  "conflict_of_interest_disclosure": "...",
  "local_regulatory_compliance": "...",
  "CNS_resolution_466_compliance": "...",
  "ANVISA_requirements": "...",
  "international_guidelines_alignment": "...",
  "community_benefit_plan": "...",
  "post_trial_access": "...",
  "result_dissemination_plan": "...",
  "informed_consent_training": "...",
  "ethical_monitoring_plan": "..."
}}

Key Ethical Weaknesses for Each Project:
[Your brief bullet points or text here]

Recommendations to Reach Full Ethical Compliance (CNS Res. 466/12):
[Your brief bullet points or text here]
"""

# Function to send the proposal to AI for review
def review_proposal(proposal_text):
    try:
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are an expert in research ethics and proposal review."},
                {"role": "user", "content": build_prompt(proposal_text)}
            ],
            temperature=0.2,
        )
        return response.choices[0].message.content
    except Exception as e:
        return f"Error during review: {e}"

# Parse AI response to get JSON + weaknesses + recommendations
def parse_response(ai_text):
    try:
        # Extract JSON block (from first { to last })
        json_start = ai_text.find('{')
        json_end = ai_text.rfind('}') + 1
        json_str = ai_text[json_start:json_end]
        data = json.loads(json_str)

        # Extract the weaknesses and recommendations text after JSON
        after_json = ai_text[json_end:].strip()

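        # Match the headings without their emoji prefixes: the response format
        # in the prompt omits them, and this still matches replies that keep them
        weak_key = "Key Ethical Weaknesses for Each Project:"
        rec_key = "Recommendations to Reach Full Ethical Compliance (CNS Res. 466/12):"

        weaknesses = ""
        recommendations = ""

        if weak_key in after_json and rec_key in after_json:
            weaknesses = after_json.split(weak_key)[1].split(rec_key)[0].strip()
            # Drop an emoji that may precede the recommendations heading
            weaknesses = weaknesses.rstrip("✅").strip()
            recommendations = after_json.split(rec_key)[1].strip()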

        data["ethical_weaknesses"] = weaknesses
        data["recommendations"] = recommendations

        return data
    except Exception as e:
        print("Parsing error:", e)
        print("AI Response was:", ai_text)
        return None

# Process all uploaded files
all_results = []
for filename, content_bytes in tqdm(uploaded.items()):
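    # Note: UTF-8 decoding only yields readable text for plain TXT files;
    # DOCX is a zip archive and RTF keeps its control words
    content = content_bytes.decode("utf-8", errors="ignore")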
    print(f"Reviewing file: {filename}")
    ai_output = review_proposal(content)
    parsed = parse_response(ai_output)
    if parsed:
        parsed["filename"] = filename
        all_results.append(parsed)
    else:
        print(f"Failed to parse AI output for {filename}")

# Save all results as DataFrame and CSV
df = pd.DataFrame(all_results)
df.to_csv("ethical_review_results.csv", index=False)
print("CSV file saved as ethical_review_results.csv")

# Download CSV for local use
files.download("ethical_review_results.csv")
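
Because a DOCX file is a zip archive rather than plain text, decoding its bytes as UTF-8 does not recover the proposal text. A minimal sketch of one way to pull the text out using only the standard library is shown below. `extract_text` is a hypothetical helper, not part of the notebook, and it assumes the usual layout where the body lives in `word/document.xml`; RTF is still passed through as raw text.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def extract_text(filename, content_bytes):
    """Best-effort plain text from uploaded bytes, chosen by file extension."""
    if filename.lower().endswith(".docx"):
        with zipfile.ZipFile(io.BytesIO(content_bytes)) as archive:
            root = ET.fromstring(archive.read("word/document.xml"))
        paragraphs = []
        for para in root.iter(W_NS + "p"):
            # A paragraph's text is spread over one or more <w:t> runs
            runs = [node.text or "" for node in para.iter(W_NS + "t")]
            paragraphs.append("".join(runs))
        return "\n".join(paragraphs)
    # TXT, and RTF (which will still contain its control words)
    return content_bytes.decode("utf-8", errors="ignore")

# Build a tiny in-memory DOCX to exercise the helper (illustrative content)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr(
        "word/document.xml",
        '<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">'
        "<w:body><w:p><w:r><w:t>Study aims</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Consent </w:t></w:r><w:r><w:t>process</w:t></w:r></w:p>"
        "</w:body></w:document>",
    )
demo_text = extract_text("demo.docx", buf.getvalue())
print(demo_text)
```

In the processing loop, a call such as `extract_text(filename, content_bytes)` could stand in for the direct `decode` call.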


## Cleaning and Normalizing the Ethical Review Results CSV

This code loads the previously generated `ethical_review_results.csv` file and performs text normalization to ensure compatibility with PDF generation and other processing steps.

**Key actions performed:**

- Reads the CSV file into a pandas DataFrame.
- Defines a function `clean_text` that:
  - Converts accented characters to their unaccented ASCII equivalents (e.g., “é” → “e”) using the `unidecode` library.
  - Removes emojis and any non-printable or special characters that could cause encoding issues.
- Applies this cleaning function to all text columns in the DataFrame.
- Saves the cleaned DataFrame to a new CSV file `ethical_review_results_clean.csv`.
- Automatically downloads the cleaned CSV for further use.

This step helps avoid encoding errors when generating reports with limited font support or other tools that do not handle special characters well.
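
To illustrate the effect of this normalization, here is a minimal standard-library sketch. It is an approximation rather than the notebook's method: `to_ascii` is a hypothetical helper that uses `unicodedata` decomposition instead of `unidecode`, so characters without a decomposition (for example "ß") are dropped rather than transliterated.

```python
import unicodedata

def to_ascii(text):
    """Strip accents and drop anything outside printable ASCII."""
    # NFKD splits an accented letter into its base letter plus combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    # Keep printable ASCII and basic whitespace; marks and emojis fall away
    return "".join(c for c in decomposed if 32 <= ord(c) <= 126 or c in "\n\r\t")

cleaned = to_ascii("Avaliação ética ✅ concluída")
print(cleaned)
```

The emoji is removed but the spaces around it remain, so the result contains a double space at that position.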


In [18]:
# Load your original CSV
df = pd.read_csv('ethical_review_results.csv')

# Define a function to normalize text to ASCII (remove accents etc.)
def clean_text(text):
    if pd.isna(text):
        return ""
    # Convert to string and remove accents, emojis, and special chars
    text = str(text)
    text = unidecode.unidecode(text)  # Converts accented letters to plain letters
    # Optionally, remove any remaining non-printable characters:
    text = ''.join(c for c in text if 32 <= ord(c) <= 126 or c in '\n\r\t')
    return text

# Clean all text columns
for col in df.columns:
    if df[col].dtype == 'object':
        df[col] = df[col].apply(clean_text)

# Save cleaned CSV
df.to_csv('ethical_review_results_clean.csv', index=False)
print("Clean CSV saved as 'ethical_review_results_clean.csv'")

# Download CSV for local use
files.download("ethical_review_results_clean.csv")

Clean CSV saved as 'ethical_review_results_clean.csv'



## Generating a PDF Report from Cleaned Ethical Review Results

This section guides you through uploading the cleaned CSV file (`ethical_review_results_clean.csv`) and generating a well-formatted PDF report summarizing the ethical review findings for each research project.

**Steps in the code:**

1. **File Upload**  
   Prompts you to upload the cleaned CSV file containing the ethical review results.

2. **PDF Class Definition**  
   Defines a custom `PDF` class based on `FPDF` that formats the report with:  
   - A header showing the report title.  
   - Section titles (chapters) with background fill color.  
   - Body text with readable font and spacing.

3. **PDF Report Generation**  
   Iterates through each project in the DataFrame:  
   - Adds a new page per project.  
   - Prints the project title using `multi_cell` to handle long titles.  
   - Includes key ethical evaluation parameters and status.  
   - Adds sections for “Key Ethical Weaknesses” and “Recommendations to Reach Full Ethical Compliance”.

4. **Saving and Downloading the PDF**  
   Saves the completed PDF report to a file and triggers an automatic download for local access.

This approach ensures that each project’s review results are clearly and cleanly presented on its own page for easy reading and sharing.
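
The per-project summary in step 3 can be sketched as a small formatting helper. This is only an illustration: `SUMMARY_FIELDS` and `format_summary` are hypothetical names, and the keys are taken from the JSON template in the review prompt. It works with a plain dict or a DataFrame row, since both provide `.get`.

```python
# Label / key pairs; keys come from the JSON template in the review prompt
SUMMARY_FIELDS = [
    ("Ethical Review Outcome", "review_outcome"),
    ("Ethical Clarity", "ethical_clarity"),
    ("Informed Consent", "informed_consent"),
    ("Risk-Benefit Ratio", "risk_benefit_ratio"),
    ("Data Protection", "data_protection"),
    ("Feasibility", "feasibility"),
]

def format_summary(row):
    """Build the summary text for one project, using N/A for missing values."""
    lines = []
    for label, key in SUMMARY_FIELDS:
        value = row.get(key)
        # NaN is the only value not equal to itself, so value != value catches it
        if value is None or value != value or str(value).strip() == "":
            value = "N/A"
        lines.append(f"{label}: {value}")
    return "\n".join(lines)

# Hypothetical row, for illustration only
summary = format_summary({"review_outcome": "Revise", "feasibility": "High"})
print(summary)
```

Keeping the labels and keys in one list makes it easier to notice when a report field refers to a column that the review step never produces.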


In [20]:
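# Step 1: Upload the cleaned CSV file and load it into a DataFrame
import io

print("Please upload your 'ethical_review_results_clean.csv' file:")
uploaded = files.upload()
csv_file_path = list(uploaded.keys())[0]
# Read every column as text; empty cells become "" so the PDF code never receives NaN
df = pd.read_csv(io.BytesIO(uploaded[csv_file_path]), dtype=str).fillna("")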


# Step 2: Define PDF class
class PDF(FPDF):
    def header(self):
        self.set_font("Helvetica", "B", 14)
        self.cell(0, 10, "Ethical Review Report (CNS Resolution 466/2012)", ln=1, align="C")
        self.ln(5)

    def chapter_title(self, title):
        self.set_font("Helvetica", "B", 12)
        self.set_fill_color(220, 235, 255)
        self.cell(0, 10, title, ln=1, fill=True)
        self.ln(2)

    def chapter_body(self, text):
        self.set_font("Helvetica", "", 11)
        self.multi_cell(0, 8, text)
        self.ln()

# Step 3: Generate PDF report without emojis in headings
pdf = PDF()
pdf.set_auto_page_break(auto=True, margin=15)

for idx, row in df.iterrows():
    pdf.add_page()
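    # The review step stores only the file name, so fall back to it when no title column exists
    project_title = row.get('project_title', row.get('filename', 'Unknown Project'))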

    # Wrap project title with multi_cell for long titles
    pdf.set_font("Helvetica", "B", 12)
    pdf.set_fill_color(220, 235, 255)
    pdf.multi_cell(0, 10, f"Project: {project_title}", fill=True)
    pdf.ln(2)

    pdf.chapter_body(f"""
Ethical Review Outcome: {row.get('review_outcome', 'N/A')}
Ethical Clarity: {row.get('ethical_clarity', 'N/A')}
Informed Consent: {row.get('informed_consent', 'N/A')}
Protection of Vulnerable Groups: {row.get('vulnerable_population_justification', 'N/A')}
Risk-Benefit Ratio: {row.get('risk_benefit_ratio', 'N/A')}
Data Protection and Privacy: {row.get('data_protection', 'N/A')}
Feasibility: {row.get('feasibility', 'N/A')}
""")

    pdf.chapter_title("Key Ethical Weaknesses")
    weaknesses = row.get("ethical_weaknesses", "Not provided.")
    pdf.chapter_body(weaknesses)

    pdf.chapter_title("Recommendations to Reach Full Ethical Compliance")
    recommendations = row.get("recommendations", "Not provided.")
    pdf.chapter_body(recommendations)

# Step 4: Save and download PDF
pdf_path = "Ethical_Review_Report.pdf"
pdf.output(pdf_path)
print(f"PDF report generated: {pdf_path}")

files.download(pdf_path)


Please upload your original 'ethical_review_results.csv' file:


Saving ethical_review_results_clean.csv to ethical_review_results_clean (6).csv
PDF report generated: Ethical_Review_Report.pdf

