# One Piece Card Game Analysis

This notebook demonstrates how to analyze the One Piece card data that has been scraped and stored in the database.

In [None]:
import os
import sys
import sqlite3
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Add the parent directory to the path so we can import our modules
sys.path.append('..')
from database.db_manager import DatabaseManager

## Load the Card Data

First, let's load the card data from the database.

In [None]:
# Initialize the database manager
db_path = '../data/one_piece_cards.db'
db_manager = DatabaseManager(db_path)

# Get all cards
cards = db_manager.get_all_cards()

# Convert to DataFrame
df = pd.DataFrame(cards)

# Display basic information
print(f"Total cards: {len(df)}")
df.head()
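Before running the later cells, it can help to confirm that the DataFrame actually has the columns they use. The sketch below is a minimal check under assumptions: the column list is taken from the names referenced in this notebook, not from the real database schema, and `missing_columns` and the two-row sample are hypothetical stand-ins for the output of `db_manager.get_all_cards()`.

```python
import pandas as pd

# Columns the later cells rely on. This list is an assumption based on the
# names used in this notebook, not on the database schema itself.
EXPECTED_COLUMNS = [
    "color", "card_type", "card_set", "cost",
    "power", "character_type", "effect",
]


def missing_columns(frame, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the frame."""
    return [name for name in expected if name not in frame.columns]


# Hypothetical sample standing in for the real card data
sample = pd.DataFrame([
    {"color": "Red", "card_type": "Character", "cost": "3", "power": "4000"},
    {"color": "Blue", "card_type": "Event", "cost": "1", "power": None},
])

print(missing_columns(sample))
```

On the real data, `missing_columns(df)` should return an empty list if the scraper populated every field used below.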

## Basic Statistics

Let's look at some basic statistics about the cards.

In [None]:
# Count cards by color
color_counts = df['color'].value_counts()
print("Cards by Color:")
print(color_counts)

# Count cards by type
type_counts = df['card_type'].value_counts()
print("\nCards by Type:")
print(type_counts)

# Count cards by set
set_counts = df['card_set'].value_counts()
print("\nCards by Set (top 10):")
print(set_counts.head(10))
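The separate counts above can also be combined into one table. As a minimal sketch, `pd.crosstab` counts cards for every color and type pair. The sample rows are made up for illustration; on the real data you would pass `df['color']` and `df['card_type']` instead.

```python
import pandas as pd

# Made-up sample rows; replace with the real df columns
sample = pd.DataFrame({
    "color": ["Red", "Red", "Blue", "Blue", "Green"],
    "card_type": ["Character", "Event", "Character", "Character", "Leader"],
})

# Rows are colors, columns are card types, cells are card counts
color_by_type = pd.crosstab(sample["color"], sample["card_type"])
print(color_by_type)
```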

## Visualizations

Let's create some visualizations to better understand the card distribution.

In [None]:
# Set the style
sns.set_theme(style="whitegrid")  # set_theme is the current name; sns.set is a legacy alias
plt.figure(figsize=(12, 6))

# Plot cards by color
ax = sns.barplot(x=color_counts.index, y=color_counts.values)
plt.title('Number of Cards by Color', fontsize=16)
plt.xlabel('Color', fontsize=12)
plt.ylabel('Number of Cards', fontsize=12)
plt.xticks(rotation=45)

# Add count labels on top of bars
for i, v in enumerate(color_counts.values):
    ax.text(i, v + 5, str(v), ha='center')

plt.tight_layout()
plt.show()

In [None]:
# Plot cards by type
plt.figure(figsize=(10, 6))
ax = sns.barplot(x=type_counts.index, y=type_counts.values)
plt.title('Number of Cards by Type', fontsize=16)
plt.xlabel('Card Type', fontsize=12)
plt.ylabel('Number of Cards', fontsize=12)

# Add count labels on top of bars
for i, v in enumerate(type_counts.values):
    ax.text(i, v + 5, str(v), ha='center')

plt.tight_layout()
plt.show()

## Cost Distribution

Let's analyze the cost distribution of cards.

In [None]:
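# Filter out cards with no cost (e.g., Leader cards, which have no cost)
# .copy() avoids pandas' SettingWithCopyWarning on the assignment below
df_with_cost = df[df['cost'].notna()].copy()

# Convert cost to numeric; non-numeric placeholders such as '-' become NaN and are dropped
df_with_cost['cost'] = pd.to_numeric(df_with_cost['cost'], errors='coerce')
df_with_cost = df_with_cost.dropna(subset=['cost'])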

# Plot cost distribution
plt.figure(figsize=(12, 6))
sns.histplot(data=df_with_cost, x='cost', hue='color', multiple='stack', discrete=True)
plt.title('Card Cost Distribution by Color', fontsize=16)
plt.xlabel('Cost', fontsize=12)
plt.ylabel('Number of Cards', fontsize=12)
plt.tight_layout()
plt.show()
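The histogram shows the shape of the distribution; a small summary table gives the numbers behind it. The following is a sketch on made-up sample data, assuming `cost` may arrive as strings with `'-'` as a placeholder for a missing value.

```python
import pandas as pd

# Made-up sample; '-' stands in for a missing cost
sample = pd.DataFrame({
    "color": ["Red", "Red", "Blue", "Blue", "Blue"],
    "cost": ["1", "3", "2", "-", "4"],
})

# Coerce to numbers so that placeholders become NaN, then drop them
sample["cost"] = pd.to_numeric(sample["cost"], errors="coerce")
with_cost = sample.dropna(subset=["cost"])

# Count, mean, and median cost for each color
cost_summary = with_cost.groupby("color")["cost"].agg(["count", "mean", "median"])
print(cost_summary)
```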

## Power Analysis

Let's analyze how power relates to cost for Character and Leader cards.

In [None]:
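# Filter Character and Leader cards that have a power value
df_characters = df[df['card_type'].isin(['Character', 'Leader'])]
df_characters = df_characters[df_characters['power'].notna() & (df_characters['power'] != '-')].copy()

# Convert power to numeric (strip non-numeric characters; empty results become NaN)
power_digits = df_characters['power'].str.replace(r'\D', '', regex=True)
df_characters['power_numeric'] = pd.to_numeric(power_digits, errors='coerce')
df_characters = df_characters.dropna(subset=['power_numeric'])

# Cost must be numeric for the x-axis; non-numeric values become NaN
df_characters['cost'] = pd.to_numeric(df_characters['cost'], errors='coerce')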

# Plot power vs cost
plt.figure(figsize=(12, 8))
sns.scatterplot(data=df_characters, x='cost', y='power_numeric', hue='color', size='power_numeric', sizes=(20, 200), alpha=0.7)
plt.title('Character Power vs Cost by Color', fontsize=16)
plt.xlabel('Cost', fontsize=12)
plt.ylabel('Power', fontsize=12)
plt.grid(True, linestyle='--', alpha=0.7)
plt.tight_layout()
plt.show()
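To put numbers on the trend in the scatter plot, one option is the average power at each cost plus a correlation coefficient. This is a sketch on made-up values; the `cost` and `power_numeric` column names mirror the ones used above and are assumed to be numeric already.

```python
import pandas as pd

# Made-up sample with numeric cost and power
sample = pd.DataFrame({
    "cost": [1, 1, 2, 2, 5],
    "power_numeric": [2000, 3000, 3000, 4000, 6000],
})

# Number of cards and mean power at each cost
power_by_cost = sample.groupby("cost")["power_numeric"].agg(["count", "mean"])
print(power_by_cost)

# Pearson correlation between cost and power
correlation = sample["cost"].corr(sample["power_numeric"])
print(f"Correlation between cost and power: {correlation:.2f}")
```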

## Character Type Analysis

Let's analyze the distribution of character types.

In [None]:
# Extract character types (some cards have multiple types separated by '/')
all_types = []
for types in df['character_type'].dropna():
    if isinstance(types, str):
        all_types.extend([t.strip() for t in types.split('/')])

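# Count occurrences of each type (a separate name keeps the earlier card-type counts intact)
character_type_counts = pd.Series(all_types).value_counts()

# Plot top 15 character types
top_character_types = character_type_counts.head(15)
plt.figure(figsize=(14, 8))
sns.barplot(x=top_character_types.index, y=top_character_types.values)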
plt.title('Top 15 Character Types', fontsize=16)
plt.xlabel('Character Type', fontsize=12)
plt.ylabel('Number of Cards', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
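The splitting loop above can also be written with pandas string methods. The sketch below uses `str.split` and `explode` on a made-up Series. It assumes, as the loop does, that multiple types are separated by `'/'`.

```python
import pandas as pd

# Made-up sample; None stands in for a card without a character type
types = pd.Series([
    "Straw Hat Crew/Supernovas",
    "Navy",
    None,
    "Straw Hat Crew",
    " Navy / Seven Warlords of the Sea",
])

# Split on '/', give each type its own row, trim whitespace, then count
exploded_counts = (
    types.dropna()
    .str.split("/")
    .explode()
    .str.strip()
    .value_counts()
)
print(exploded_counts)
```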

## Card Effect Analysis

Let's analyze the card effects to find common keywords.

In [None]:
# Extract effects
effects = df['effect'].dropna().tolist()

# Common keywords to look for
keywords = [
    'Blocker', 'Rush', 'Double Attack', 'Banish', 'Counter', 'Draw', 
    'DON!!', 'Trigger', 'On Play', 'When Attacking', 'K.O.'
]

# Count the cards whose effect text contains each keyword (case-sensitive substring match)
keyword_counts = {}
for keyword in keywords:
    count = sum(1 for effect in effects if keyword in effect)
    keyword_counts[keyword] = count

# Convert to Series and sort
keyword_series = pd.Series(keyword_counts).sort_values(ascending=False)

# Plot keyword occurrences
plt.figure(figsize=(12, 6))
sns.barplot(x=keyword_series.index, y=keyword_series.values)
plt.title('Common Keywords in Card Effects', fontsize=16)
plt.xlabel('Keyword', fontsize=12)
plt.ylabel('Number of Cards Containing Keyword', fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
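The substring test above is case-sensitive, so a keyword such as `'Draw'` does not match "draw" in the middle of a sentence. A case-insensitive variant can be sketched with `Series.str.contains`. The helper name `count_cards_with_keyword` and the sample effect texts are made up for illustration. `re.escape` keeps characters such as the dots in `'K.O.'` from being read as regex wildcards.

```python
import re

import pandas as pd


def count_cards_with_keyword(effects, keywords):
    """Count effect texts that contain each keyword, ignoring case."""
    text = effects.dropna().astype(str)
    counts = {
        keyword: int(text.str.contains(re.escape(keyword), case=False, regex=True).sum())
        for keyword in keywords
    }
    return pd.Series(counts).sort_values(ascending=False)


# Made-up effect texts for illustration only
sample_effects = pd.Series([
    "[Blocker]",
    "[On Play] Draw 1 card.",
    "[Rush]",
    "[On Play] K.O. one opposing Character. Then, draw 1 card.",
    None,
])

keyword_hits = count_cards_with_keyword(
    sample_effects, ["Blocker", "Rush", "Draw", "On Play", "K.O."]
)
print(keyword_hits)
```

As with the loop above, each card counts at most once per keyword, so the result is the number of cards mentioning a keyword, not the total number of mentions.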

## Conclusion

This notebook has demonstrated some basic analysis of the One Piece card data. There are many more analyses that could be performed, such as:

- Analyzing the relationship between card attributes and effects
- Identifying powerful card combinations
- Tracking card trends across different sets
- Building deck recommendations based on card synergies

Feel free to extend this analysis with your own ideas!