# 🎓 Academic Research Assistant Demo

**For Non-Technical Academic Researchers**

This notebook demonstrates how AI-Notebooks can help academic researchers analyze data, generate insights, and create research reports without requiring programming knowledge.

## 🎯 What You'll Learn
- Load and explore research data with AI assistance
- Get automated data quality reports (like SweetViz)
- Receive AI-guided data cleaning recommendations
- Perform statistical analysis with plain English explanations
- Generate academic research reports automatically

## 👥 Perfect For
- Graduate students conducting thesis research
- Social scientists analyzing survey data
- Medical researchers working with clinical data
- Education researchers studying learning outcomes
- Any researcher who wants AI assistance with data analysis

---

## 🚀 Setup

First, let's set up the academic research assistant:

In [None]:
# Setup Academic Research Assistant
import sys
import os
from pathlib import Path

# Add the project root to Python path
project_root = Path.cwd().parent if Path.cwd().name == 'notebooks' else Path.cwd()
sys.path.insert(0, str(project_root))

# Load environment variables for AI models
from dotenv import load_dotenv
load_dotenv(project_root / '.env')

# Load the academic research assistant
%load_ext ai_assistant.research.academic_magic

print("üéì Academic Research Assistant is ready!")
print("Use %research_help to see all available commands.")

## 📚 Available Research Commands

Let's see what research commands are available:

In [None]:
%research_help

## 📊 Example 1: Survey Research Analysis

Let's create some sample survey data to demonstrate the research workflow:

In [None]:
# Create sample survey data (simulating your CSV file)
import pandas as pd
import numpy as np

# Set random seed for reproducibility
np.random.seed(42)

# Generate sample survey data
n_participants = 200

# Demographics
ages = np.random.normal(25, 5, n_participants).astype(int)
ages = np.clip(ages, 18, 65)  # Realistic age range

genders = np.random.choice(['Male', 'Female', 'Other'], n_participants, p=[0.45, 0.50, 0.05])
education = np.random.choice(['High School', 'Bachelor', 'Master', 'PhD'], n_participants, p=[0.2, 0.4, 0.3, 0.1])

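# Research variables: continuous scores clipped to a 1-7 range,
# standing in for Likert-type items (real Likert responses would be integers)
# Let's simulate a study on student satisfaction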
teaching_quality = np.random.normal(5.2, 1.2, n_participants)
teaching_quality = np.clip(teaching_quality, 1, 7)

course_content = np.random.normal(5.0, 1.1, n_participants)
course_content = np.clip(course_content, 1, 7)

support_services = np.random.normal(4.8, 1.3, n_participants)
support_services = np.clip(support_services, 1, 7)

# Overall satisfaction (influenced by other factors + some noise)
overall_satisfaction = (0.4 * teaching_quality + 0.3 * course_content + 0.2 * support_services + 
                       np.random.normal(0, 0.5, n_participants))
overall_satisfaction = np.clip(overall_satisfaction, 1, 7)

# Add some missing values to make it realistic
missing_indices = np.random.choice(n_participants, size=int(0.05 * n_participants), replace=False)
support_services[missing_indices] = np.nan

# Create DataFrame
survey_data = pd.DataFrame({
    'participant_id': range(1, n_participants + 1),
    'age': ages,
    'gender': genders,
    'education_level': education,
    'teaching_quality': teaching_quality,
    'course_content': course_content,
    'support_services': support_services,
    'overall_satisfaction': overall_satisfaction
})

# Save to CSV (simulating your uploaded file)
survey_data.to_csv('sample_survey_data.csv', index=False)

print("üìÑ Sample survey data created: sample_survey_data.csv")
print(f"üìä Dataset: {survey_data.shape[0]} participants, {survey_data.shape[1]} variables")
survey_data.head()

## 🔍 Step 1: Load Research Data

Now let's load our research data with AI assistance. The AI will automatically analyze the data quality and provide initial insights:

In [None]:
# Load research data with AI analysis
%research_load sample_survey_data.csv --research_question "What factors most strongly predict student satisfaction in online courses?"

## 🧹 Step 2: Data Cleaning with AI Guidance

Let's get AI recommendations for cleaning our data:

In [None]:
# Get AI-guided data cleaning recommendations
%research_clean --interactive
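
For readers curious about what a cleaning step typically involves, here is a minimal plain-pandas sketch. It is independent of `%research_clean`, whose internals are not shown in this notebook. The helper name `summarize_missing` and the tiny `demo` table are hypothetical, and median imputation is only one possible choice.

```python
import numpy as np
import pandas as pd


def summarize_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column."""
    counts = df.isna().sum()
    return pd.DataFrame({
        "n_missing": counts,
        "pct_missing": 100 * counts / len(df),
    })


# Tiny hypothetical dataset standing in for the survey file.
demo = pd.DataFrame({
    "teaching_quality": [5.0, 6.0, 4.0, 7.0],
    "support_services": [4.0, np.nan, 5.0, 3.0],
})

missing_report = summarize_missing(demo)
print(missing_report)

# One simple option: fill gaps with each column's median.
# Whether imputation is appropriate depends on why the values are missing.
demo_imputed = demo.fillna(demo.median())
```

Listwise deletion (`demo.dropna()`) is the other common default. Either choice should be reported in the methods section.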

## 📊 Step 3: Exploratory Data Analysis

Now let's perform comprehensive exploratory data analysis with automated reports:

In [None]:
# Comprehensive EDA with AI insights
%research_eda
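
As a rough idea of what exploratory analysis covers, the sketch below computes descriptive statistics, a correlation matrix, and group means with plain pandas. It does not reproduce the output of `%research_eda`. The `demo` table is a hypothetical stand-in for the survey data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the survey data.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "gender": rng.choice(["Male", "Female"], n),
    "teaching_quality": rng.normal(5.2, 1.2, n),
    "course_content": rng.normal(5.0, 1.1, n),
})

# Keep only numeric columns for summary statistics and correlations.
numeric = demo.select_dtypes(include="number")
summary = numeric.describe()      # count, mean, std, min, quartiles, max
correlations = numeric.corr()     # Pearson correlations by default

# Compare a numeric variable across the levels of a categorical one.
group_means = demo.groupby("gender")["teaching_quality"].mean()

print(summary)
print(correlations)
print(group_means)
```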

## 📈 Step 4: Statistical Analysis

Let's get AI recommendations for appropriate statistical tests:

In [None]:
# AI-guided statistical analysis
%research_stats --variables "teaching_quality,course_content,support_services,overall_satisfaction"
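
Because the sample data were generated with known weights (0.4, 0.3, and 0.2), a multiple regression should approximately recover them. The sketch below illustrates this with NumPy's least-squares solver. It is an assumption about one reasonable analysis, not a description of what `%research_stats` runs. It uses a larger sample than the notebook and skips the clipping step so the estimates are stable.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # larger than the notebook's 200 so the estimates are stable

# Same generating model as the sample survey, without clipping.
teaching = rng.normal(5.2, 1.2, n)
content = rng.normal(5.0, 1.1, n)
support = rng.normal(4.8, 1.3, n)
satisfaction = (0.4 * teaching + 0.3 * content + 0.2 * support
                + rng.normal(0, 0.5, n))

# Design matrix with an intercept column, then ordinary least squares.
X = np.column_stack([np.ones(n), teaching, content, support])
coef, *_ = np.linalg.lstsq(X, satisfaction, rcond=None)
intercept, b_teaching, b_content, b_support = coef

# R-squared: share of variance in satisfaction explained by the model.
predicted = X @ coef
ss_res = np.sum((satisfaction - predicted) ** 2)
ss_tot = np.sum((satisfaction - satisfaction.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"teaching={b_teaching:.2f}, content={b_content:.2f}, "
      f"support={b_support:.2f}, R^2={r_squared:.2f}")
```

With real survey data the true weights are unknown, so the coefficients should be read alongside their uncertainty and the usual regression assumptions.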

## 📝 Step 5: Generate Research Report

Finally, let's generate an academic research report section:

In [None]:
# Generate academic research report
%research_report

## 🔬 Example 2: Experimental Research Analysis

Let's create another example with experimental data:

In [None]:
# Create sample experimental data
np.random.seed(123)

# Simulate a treatment vs control study
n_per_group = 50

# Control group
control_scores = np.random.normal(75, 12, n_per_group)
control_group = pd.DataFrame({
    'participant_id': range(1, n_per_group + 1),
    'group': 'Control',
    'pre_test_score': np.random.normal(70, 10, n_per_group),
    'post_test_score': control_scores,
    'engagement_score': np.random.normal(6.2, 1.5, n_per_group),
    'satisfaction': np.random.normal(5.8, 1.2, n_per_group)
})

# Treatment group (with effect)
treatment_scores = np.random.normal(82, 11, n_per_group)  # Higher mean
treatment_group = pd.DataFrame({
    'participant_id': range(n_per_group + 1, 2 * n_per_group + 1),
    'group': 'Treatment',
    'pre_test_score': np.random.normal(71, 9, n_per_group),
    'post_test_score': treatment_scores,
    'engagement_score': np.random.normal(7.1, 1.3, n_per_group),  # Higher engagement
    'satisfaction': np.random.normal(6.5, 1.1, n_per_group)  # Higher satisfaction
})

# Combine groups
experimental_data = pd.concat([control_group, treatment_group], ignore_index=True)

# Add some demographic variables
experimental_data['age'] = np.random.normal(22, 3, len(experimental_data)).astype(int)
experimental_data['gender'] = np.random.choice(['Male', 'Female'], len(experimental_data))

# Save experimental data
experimental_data.to_csv('experimental_study_data.csv', index=False)

print("üß™ Experimental study data created: experimental_study_data.csv")
experimental_data.head()

In [None]:
# Analyze experimental data
%research_load experimental_study_data.csv --research_question "Does the new teaching method improve student learning outcomes compared to traditional methods?"

In [None]:
# EDA for experimental data
%research_eda

In [None]:
# Statistical analysis for experimental design
%research_stats --test_type "t-test" --variables "group,post_test_score,engagement_score"
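
To make the comparison behind a two-group t-test concrete, here is a minimal NumPy sketch of Welch's t statistic, the variant that does not assume equal variances. Which variant `%research_stats` uses is not stated in this notebook. The function name `welch_t` is hypothetical, and the p-value, which would come from a t distribution with the returned degrees of freedom, is left out.

```python
import numpy as np


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each group mean.
    se2_a = a.var(ddof=1) / len(a)
    se2_b = b.var(ddof=1) / len(b)
    t_stat = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t_stat, df


# Simulated groups using the same means and spreads as the sample study.
rng = np.random.default_rng(123)
control = rng.normal(75, 12, 50)
treatment = rng.normal(82, 11, 50)

t_stat, df = welch_t(treatment, control)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A positive statistic here means the treatment group's mean post-test score is higher than the control group's.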

## 🎯 Key Benefits for Academic Researchers

### ✅ What This System Provides:

1. **No Coding Required** - Just use simple magic commands
2. **AI Guidance** - Get expert recommendations at every step
3. **Automated Reports** - Generate publication-ready analysis
4. **Statistical Expertise** - AI explains complex concepts in plain English
5. **Quality Assurance** - Built-in checks for research best practices

### 🔬 Research Workflow Supported:

1. **Data Upload** → Load CSV files with automatic quality assessment
2. **Data Cleaning** → AI-guided cleaning with research best practices
3. **Exploration** → Automated EDA with SweetViz integration
4. **Analysis** → Statistical test recommendations and execution
5. **Reporting** → Academic-style results sections

### 📊 Supported Research Types:

- **Survey Research** - Likert scales, demographics, correlations
- **Experimental Studies** - Treatment vs control, pre/post designs
- **Observational Studies** - Cross-sectional and longitudinal data
- **Mixed Methods** - Quantitative analysis with qualitative insights

---

## 🚀 Next Steps

1. **Upload Your Own Data** - Replace the sample CSV with your research data
2. **Customize Research Questions** - Modify the research questions to match your study
3. **Explore Advanced Features** - Try different statistical tests and visualizations
4. **Generate Reports** - Create publication-ready results sections
5. **Collaborate** - Share notebooks with supervisors and colleagues

## 💡 Tips for Academic Researchers

- **Start with Clear Research Questions** - The AI provides better guidance with specific questions
- **Review AI Recommendations** - Always validate AI suggestions with domain expertise
- **Document Your Process** - Keep notes on decisions made during analysis
- **Check Assumptions** - Verify statistical test assumptions before interpreting results
- **Consider Effect Sizes** - Look beyond p-values to practical significance
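
As an illustration of the effect-size tip, the sketch below computes Cohen's d, the difference between two group means expressed in pooled standard deviation units. The function name `cohens_d` is hypothetical and not part of the research assistant. The two small samples are made up for the example.

```python
import numpy as np


def cohens_d(a, b):
    """Return Cohen's d for two independent samples, using the pooled SD."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = len(a), len(b)
    pooled_var = (
        (n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)
    ) / (n_a + n_b - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)


# Made-up scores for two small groups.
d = cohens_d([2, 4, 6, 8, 10], [1, 2, 3, 4, 5])
print(f"Cohen's d = {d:.2f}")
```

By Cohen's conventional benchmarks, values near 0.2, 0.5, and 0.8 are often described as small, medium, and large, though what counts as practically meaningful depends on the field.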

---

**🎓 Ready to revolutionize your research workflow with AI assistance!**