# Data Analysis Template
**Author:** [Your Name]  
**Date:** [Date]  
**Purpose:** [Analysis Purpose]  
**Team Role:** [analyst/developer/researcher]  

## Analysis Overview
Brief description of the analysis objectives and methodology.

## Data Sources
- Database tables used
- API endpoints accessed
- File sources

## Key Findings
Summary of main insights and conclusions.

## 1. Environment Setup and Imports

In [None]:
# Standard imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
# Silence all warnings to keep cell output readable; comment this out when debugging
warnings.filterwarnings('ignore')

# Database connection (psycopg2 is the default driver SQLAlchemy uses for postgresql:// URLs)
import psycopg2
from sqlalchemy import create_engine
import os

# Project-specific imports
import sys
sys.path.append('/app/src')
from soccer_intelligence.utils.config import Config
from soccer_intelligence.utils.database import DatabaseManager

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 100)
plt.style.use('seaborn-v0_8')
sns.set_palette('husl')

print("✅ Environment setup complete")

## 2. Database Connection and Data Loading

In [None]:
# Database connection, chosen by team role.
# Copy the line for your role into the connection test below.

# Analysts and researchers (read-only access)
# engine = create_engine('postgresql://analyst_user:analyst_secure_pass@postgres:5432/soccer_intelligence')

# Developers (full access)
# engine = create_engine('postgresql://soccerapp:soccerpass123@postgres:5432/soccer_intelligence')

# Test connection
try:
    # Replace with your role-appropriate connection
    engine = create_engine('postgresql://soccerapp:soccerpass123@postgres:5432/soccer_intelligence')
    
    # Test query
    test_query = "SELECT COUNT(*) as total_teams FROM teams"
    result = pd.read_sql(test_query, engine)
    print(f"✅ Database connected successfully. Total teams: {result['total_teams'].iloc[0]}")
    
except Exception as e:
    print(f"❌ Database connection failed: {e}")
    print("Please check your database credentials and connection.")
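
The connection strings in the cell above embed passwords directly in the notebook. One possible alternative, shown here as a minimal sketch, is to assemble the URL from environment variables. The helper name `build_db_url` and the variable names (`ANALYST_DB_USER`, `ANALYST_DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`) are hypothetical and not part of this project; adjust them to whatever your deployment actually defines.

```python
import os
from urllib.parse import quote_plus


def build_db_url(role: str = "analyst") -> str:
    """Build a PostgreSQL URL from environment variables.

    All variable names here are illustrative assumptions, e.g.
    ANALYST_DB_USER / ANALYST_DB_PASSWORD for role="analyst".
    """
    prefix = role.upper()
    # Raises KeyError if the credentials for this role are not set
    user = os.environ[f"{prefix}_DB_USER"]
    password = os.environ[f"{prefix}_DB_PASSWORD"]
    # Defaults mirror the host, port, and database used elsewhere in this template
    host = os.environ.get("DB_HOST", "postgres")
    port = os.environ.get("DB_PORT", "5432")
    name = os.environ.get("DB_NAME", "soccer_intelligence")
    # quote_plus escapes characters such as '@' or '/' that would break the URL
    return f"postgresql://{quote_plus(user)}:{quote_plus(password)}@{host}:{port}/{name}"
```

With the variables set, the engine could then be created with `create_engine(build_db_url("analyst"))`, which keeps credentials out of the saved notebook.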

## 3. Data Exploration

In [None]:
# Load and explore your data here
# Example: load a 10-row sample of player statistics
# (only players with more than 90 minutes played)

query = """
SELECT 
    p.player_name,
    t.team_name,
    ps.season_year,
    ps.goals,
    ps.assists,
    ps.minutes_played
FROM player_statistics ps
JOIN players p ON ps.player_id = p.player_id
JOIN teams t ON ps.team_id = t.team_id
WHERE ps.minutes_played > 90
LIMIT 10
"""

df = pd.read_sql(query, engine)
print("Sample data:")
display(df.head())
print(f"\nDataset shape: {df.shape}")
print(f"Data types:\n{df.dtypes}")
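
Beyond shape and dtypes, exploration usually benefits from a quick data-quality pass. The sketch below is one way to do that with standard pandas calls; the helper name `summarize_quality` and the small `sample` frame are made up for illustration and stand in for the `df` returned by the query above.

```python
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, missing-value count and share, and distinct-value count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "missing": df.isna().sum(),
        "missing_pct": (df.isna().mean() * 100).round(1),
        "unique": df.nunique(),  # NaN is not counted as a distinct value
    })


# Tiny made-up frame; replace with the DataFrame loaded from the database
sample = pd.DataFrame({
    "player_name": ["A", "B", "C", "C"],
    "goals": [3, None, 1, 1],
    "assists": [2, 0, 4, 4],
})

report = summarize_quality(sample)
duplicate_rows = int(sample.duplicated().sum())  # fully identical rows after the first
print(report)
print(f"Duplicate rows: {duplicate_rows}")
```

Observations from a report like this are natural candidates for the data quality notes at the end of the template.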

## 4. Analysis and Visualization

In [None]:
# Your analysis code here
# Example: scatter plot of goals vs assists next to a histogram of goals

if not df.empty:
    plt.figure(figsize=(12, 6))
    
    plt.subplot(1, 2, 1)
    plt.scatter(df['goals'], df['assists'], alpha=0.7)
    plt.xlabel('Goals')
    plt.ylabel('Assists')
    plt.title('Goals vs Assists')
    
    plt.subplot(1, 2, 2)
    df['goals'].hist(bins=20, alpha=0.7)
    plt.xlabel('Goals')
    plt.ylabel('Frequency')
    plt.title('Distribution of Goals')
    
    plt.tight_layout()
    plt.show()
else:
    print("No data available for visualization")
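
Raw goal and assist totals favor players with more playing time, so a derived rate can be a useful next step. The following is a minimal sketch of per-90-minute normalization, assuming the `goals`, `assists`, and `minutes_played` columns from the example query; the helper name `add_per90` and the sample values are hypothetical.

```python
import pandas as pd


def add_per90(df: pd.DataFrame, columns=("goals", "assists")) -> pd.DataFrame:
    """Return a copy of df with '<col>_per90' columns added.

    Assumes minutes_played is greater than zero, as in the example
    query, which filters on minutes_played > 90.
    """
    out = df.copy()
    for col in columns:
        out[f"{col}_per90"] = out[col] * 90 / out["minutes_played"]
    return out


# Made-up values for illustration only
sample = pd.DataFrame({
    "player_name": ["A", "B"],
    "goals": [5, 2],
    "assists": [1, 4],
    "minutes_played": [900, 1800],
})
rates = add_per90(sample)
print(rates[["player_name", "goals_per90", "assists_per90"]])
```

The original frame is left unchanged, so the raw totals remain available for the plots above.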

## 5. Results and Conclusions

### Key Findings
- [Finding 1]
- [Finding 2]
- [Finding 3]

### Recommendations
- [Recommendation 1]
- [Recommendation 2]

### Next Steps
- [Next step 1]
- [Next step 2]

### Notes for Team
- [Important notes for team members]
- [Data quality observations]
- [Methodology considerations]