# Side Effect Analyzer Example

This notebook demonstrates how to use the `SideEffectAnalyzer` to process drug reviews, score side effects for each drug, select the top K comments per side effect, and save the results to CSV files.

## Step 1: Import Libraries and Initialize Analyzer
We will import the required modules and initialize the `SideEffectAnalyzer` with two lists:
- Initial keywords: `["nausea", "dizziness", "headache", "stomach pain"]`.
- Official side effects: `["fatigue", "dry mouth", "anxiety"]`.

In [None]:
# Import required modules
from src.side_effect.apply import SideEffectAnalyzer
import pandas as pd

# Define initial keywords and official side effects
initial_keywords = ["nausea", "dizziness", "headache", "stomach pain"]
side_effects_official = ["fatigue", "dry mouth", "anxiety"]

# Initialize the SideEffectAnalyzer
analyzer = SideEffectAnalyzer(initial_keywords, side_effects_official)

## Step 2: Load and Process Dataset
We will now load the dataset (`simulants_reviews.csv`) and process it using the `SideEffectAnalyzer`.

The output will include:
- Processed comment dictionary.
- Side effect scores for each drug.
- Top K comments for each side effect.
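
As a rough guide to what these three outputs look like, the sketch below builds hand-written stand-ins with the shapes that the later cells in this notebook rely on: `side_effect_scores` as a dict of per-drug score dicts, and `top_k_comments` as a list of dicts with `drug`, `side_effect`, `score`, and `comment` keys. The drug names, scores, and comments are invented for illustration, and the layout of `comment_dict` is an assumption, since the notebook only ever passes it to `pd.DataFrame`. The helper `strongest_side_effect` is hypothetical and not part of the analyzer.

```python
# Hypothetical stand-ins for the values returned by process_file().
# All names and numbers below are invented for illustration only.

# Assumed layout: one list per column, so pd.DataFrame(comment_dict) would work.
comment_dict = {
    "drug": ["DrugA", "DrugB"],
    "comment": ["Felt dizzy after the first dose.", "Mild headache, nothing else."],
}

# drug -> {side effect -> relevance score}
side_effect_scores = {
    "DrugA": {"dizziness": 0.82, "headache": 0.10},
    "DrugB": {"dizziness": 0.05, "headache": 0.64},
}

# One dict per selected comment
top_k_comments = [
    {"drug": "DrugA", "side_effect": "dizziness", "score": 0.82,
     "comment": "Felt dizzy after the first dose."},
    {"drug": "DrugB", "side_effect": "headache", "score": 0.64,
     "comment": "Mild headache, nothing else."},
]


def strongest_side_effect(scores):
    """Return, for each drug, the side effect with the highest score."""
    return {drug: max(per_effect, key=per_effect.get)
            for drug, per_effect in scores.items()}


print(strongest_side_effect(side_effect_scores))
```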

In [None]:
# Load the dataset
file_path = "simulants_reviews.csv"

# Process the file using the analyzer
results = analyzer.process_file(file_path)

# Extract results
comment_dict, side_effect_scores, top_k_comments = results

# Display side effect scores for drugs
side_effect_scores

## Step 3: Display Expanded Keywords
The analyzer expands the initial keywords using both WordNet and the official side effects list.

Here are the expanded keywords for each initial keyword.
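
The printing loop in the next cell implies that `expand_keywords` returns a dict mapping each initial keyword to a list of single-entry dicts. The sketch below uses a hand-written value of that shape and flattens it into `(initial keyword, expanded term, value)` rows. The expanded terms are invented, treating the value as a similarity score is an assumption, and `flatten_expansions` is a hypothetical helper rather than part of the analyzer.

```python
# Hypothetical example of the structure the display loop expects:
# initial keyword -> list of {expanded term: value} dicts.
expanded_keywords = {
    "nausea": [{"sickness": 0.91}, {"queasiness": 0.87}],
    "headache": [{"migraine": 0.78}],
}


def flatten_expansions(expanded):
    """Turn the nested structure into flat (keyword, term, value) rows."""
    rows = []
    for initial_kw, expansions in expanded.items():
        for exp in expansions:
            for term, value in exp.items():
                rows.append((initial_kw, term, value))
    return rows


rows = flatten_expansions(expanded_keywords)
for row in rows:
    print(row)
```

Flat rows like these are easier to sort or load into a table than the nested form.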

In [None]:
# Expand keywords using the analyzer's keyword expander
expanded_keywords = analyzer.keyword_expander.expand_keywords(initial_keywords)

# Display the expanded keywords
for initial_kw, expansions in expanded_keywords.items():
    print(f"Initial Keyword: {initial_kw}")
    for exp in expansions:
        # Each expansion is a dict; print its first (term, value) pair
        term, value = next(iter(exp.items()))
        print(f"  - {term}: {value}")
    print("\n")

## Step 4: Display Top K Comments
We will now display the top K comments for each side effect. The comments are printed in the order returned by `process_file`, which is expected to be sorted by relevance score.
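
For readers curious how a top K selection per side effect can work, here is a minimal sketch using `heapq.nlargest` on hand-written records. It is not the analyzer's implementation: the helper `top_k_per_side_effect`, the sample records, and the choice of `k` are all hypothetical.

```python
import heapq
from collections import defaultdict


def top_k_per_side_effect(records, k=2):
    """Group records by side effect and keep the k highest-scoring per group."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["side_effect"]].append(rec)

    selected = []
    for side_effect in sorted(grouped):
        # nlargest returns the k best records in descending score order
        selected.extend(
            heapq.nlargest(k, grouped[side_effect], key=lambda r: r["score"])
        )
    return selected


# Invented sample records with the same keys the display loop reads
records = [
    {"drug": "DrugA", "side_effect": "nausea", "score": 0.4, "comment": "c1"},
    {"drug": "DrugB", "side_effect": "nausea", "score": 0.9, "comment": "c2"},
    {"drug": "DrugC", "side_effect": "nausea", "score": 0.7, "comment": "c3"},
    {"drug": "DrugA", "side_effect": "headache", "score": 0.5, "comment": "c4"},
]

top = top_k_per_side_effect(records, k=2)
for rec in top:
    print(rec["side_effect"], rec["drug"], rec["score"])
```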

In [None]:
# Display top K comments related to each side effect
print("Top K Comments")
for comment in top_k_comments:
    print(f"Drug: {comment['drug']}, Side Effect: {comment['side_effect']}, Score: {comment['score']}")
    print(f"Comment: {comment['comment']}\n")

## Step 5: Save Results to CSV
Finally, we will save the analysis results to CSV files:
- `updated_comments.csv`: Contains the processed comments with side effects.
- `side_effect_scores.csv`: Contains one row per drug, with a `Drug Name` column followed by one relevance-score column per side effect.
- `top_k_comments.csv`: Contains the top K comments for each side effect.
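
To make the layout of `side_effect_scores.csv` concrete, the sketch below writes a hand-written score dict to an in-memory buffer with the standard `csv` module, using the same one-row-per-drug shape as the pandas code in the next cell. The drugs and scores are invented, `scores_to_csv_text` is a hypothetical helper, and the standard library is used here only so the snippet runs without pandas.

```python
import csv
import io

# Invented scores: drug -> {side effect -> relevance score}
side_effect_scores = {
    "DrugA": {"nausea": 0.82, "headache": 0.10},
    "DrugB": {"nausea": 0.05, "headache": 0.64},
}


def scores_to_csv_text(scores):
    """Render one CSV row per drug, with a column per side effect."""
    rows = [{"Drug Name": drug, **per_effect} for drug, per_effect in scores.items()]

    # Collect column names in first-seen order
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    # restval fills in an empty cell when a drug lacks a given side effect
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = scores_to_csv_text(side_effect_scores)
print(csv_text)
```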

In [None]:
# Save the results to CSV files
pd.DataFrame(comment_dict).to_csv("updated_comments.csv", index=False)
pd.DataFrame([
    {"Drug Name": drug, **scores} for drug, scores in side_effect_scores.items()
]).to_csv("side_effect_scores.csv", index=False)
pd.DataFrame(top_k_comments).to_csv("top_k_comments.csv", index=False)

print("Results saved to CSV files:")
print("- updated_comments.csv")
print("- side_effect_scores.csv")
print("- top_k_comments.csv")