# EDA Integration - Part 1: NumPy and Pandas Combined

This notebook covers using NumPy and Pandas together for data analysis.

**Topics covered:**
- Converting between NumPy and Pandas
- Using NumPy functions on DataFrames
- Vectorized operations
- Custom aggregations

**Problems:** 20, plus a setup step in Problem 0 (Easy: 1-7, Medium: 8-14, Hard: 15-20)

In [None]:
# ============================================
# SETUP - Run this cell first!
# ============================================
import sys
sys.path.insert(0, '..')
from utils.checks import eda_01_numpy_pandas_combo as verify

print("Verification module loaded! Now import the libraries you need.")

---
## Problem 0: Import Required Libraries
**Difficulty:** Easy

### Concept
For combining NumPy and Pandas operations, you need both libraries imported.

### Syntax
```python
import numpy as np
import pandas as pd
```

### Task
Import NumPy as `np` and Pandas as `pd`.

### Expected Properties
- `np` should be the numpy module
- `pd` should be the pandas module

In [None]:
# Your solution:


In [None]:
# Verification
verify.p0(globals())

---
## Problem 1: DataFrame to NumPy Array
**Difficulty:** Easy

### Concept
Pandas DataFrames convert easily to NumPy arrays and back. Converting to an array is useful when you need NumPy's mathematical functions or when you pass data to a library that expects NumPy arrays.

### Syntax
```python
# Convert DataFrame to NumPy array
array = df.to_numpy()     # Preferred method
# OR
array = df.values         # Older method, still works
```

### Example
```python
>>> df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> arr = df.to_numpy()
>>> arr
array([[1, 3],
       [2, 4]])
>>> type(arr)
<class 'numpy.ndarray'>
```
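
Because a NumPy array holds a single dtype, `.to_numpy()` has to pick one type that fits every column. The sketch below illustrates this with a made-up frame (`mixed` and its columns are hypothetical and not part of the exercise):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one integer column and one float column
mixed = pd.DataFrame({'count': [1, 2, 3], 'price': [0.5, 1.5, 2.5]})

# Integers are upcast so both columns can share one dtype
arr_mixed = mixed.to_numpy()
print(arr_mixed.dtype)

# Adding a string column forces the generic object dtype
mixed['label'] = ['a', 'b', 'c']
arr_obj = mixed.to_numpy()
print(arr_obj.dtype)
```

Object arrays lose most of NumPy's speed advantage, so it is usually worth selecting only the numeric columns before converting.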

### Task
Convert the provided DataFrame to a NumPy array using `.to_numpy()` or `.values`. Store the result in `arr`.

### Expected Properties
- `arr` should be a NumPy ndarray
- Shape should be (3, 2) - 3 rows, 2 columns

In [None]:
# Your solution:
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
arr = None

In [None]:
# Verification
verify.p1(arr)

---
## Problem 2: NumPy Array to DataFrame
**Difficulty:** Easy

### Concept
Converting NumPy arrays to DataFrames is useful when you want to add labels (column names) to your data or use Pandas' powerful data manipulation features.

### Syntax
```python
# Create DataFrame from array with column names
df = pd.DataFrame(array, columns=['col1', 'col2'])
```

### Example
```python
>>> arr = np.array([[1, 2], [3, 4]])
>>> df = pd.DataFrame(arr, columns=['A', 'B'])
>>> df
   A  B
0  1  2
1  3  4
```

### Task
Convert the provided NumPy array to a DataFrame with column names 'X' and 'Y'. Store the result in `df`.

### Expected Properties
- `df` should be a pandas DataFrame
- Should have columns named 'X' and 'Y'
- Shape should be (3, 2)

In [None]:
# Your solution:
arr = np.array([[1, 2], [3, 4], [5, 6]])
df = None

In [None]:
# Verification
verify.p2(df)

---
## Problem 3: Apply NumPy Function to Column
**Difficulty:** Easy

### Concept
NumPy functions can be directly applied to pandas Series and DataFrame columns. This is efficient because pandas is built on top of NumPy and uses the same optimized operations.

### Syntax
```python
# Apply NumPy function to Series/column
result = np.function_name(df['column'])
```

### Example
```python
>>> df = pd.DataFrame({'values': [1, 4, 9]})
>>> np.sqrt(df['values'])
0    1.0
1    2.0
2    3.0
Name: values, dtype: float64
```
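
A NumPy ufunc applied to a Series returns a Series with the original index intact, so results stay aligned with their labels. A small sketch, using an invented labelled Series `s`:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with string labels instead of the default 0..n-1 index
s = pd.Series([1.0, 10.0, 100.0], index=['a', 'b', 'c'])

# np.log10 works element-wise and keeps the labels
logged = np.log10(s)
print(type(logged).__name__)
print(list(logged.index))
```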

### Task
Apply `np.sqrt()` to the 'values' column of the DataFrame. Store the result in `sqrt_vals`.

### Expected Properties
- `sqrt_vals` should be a pandas Series
- Should have 5 elements
- First element should be 1.0, last should be 5.0

In [None]:
# Your solution:
df = pd.DataFrame({'values': [1, 4, 9, 16, 25]})
sqrt_vals = None

In [None]:
# Verification
verify.p3(sqrt_vals)

---
## Problem 4: Use np.where with DataFrame
**Difficulty:** Easy

### Concept
`np.where()` is a vectorized conditional that is typically much faster than a Python loop or `.apply()` with a lambda. It is the NumPy equivalent of "if-else" for arrays.

### Syntax
```python
# np.where(condition, value_if_true, value_if_false)
df['new_col'] = np.where(df['col'] > threshold, 'High', 'Low')
```

### Example
```python
>>> df = pd.DataFrame({'score': [45, 75, 60]})
>>> df['pass'] = np.where(df['score'] >= 60, 'Pass', 'Fail')
>>> df
   score  pass
0     45  Fail
1     75  Pass
2     60  Pass
```

### Task
Create a new column 'label' in the DataFrame. If 'value' > 50, label as 'High', else 'Low'. Use `np.where()`.

### Expected Properties
- DataFrame should have a 'label' column
- 'label' column should contain 'High' and 'Low' strings
- First element should be 'Low', second should be 'High'

In [None]:
# Your solution:
df = pd.DataFrame({'value': [30, 60, 40, 80, 20]})
# Add 'label' column using np.where


In [None]:
# Verification
verify.p4(df)

---
## Problem 5: Series to NumPy Array
**Difficulty:** Easy

### Concept
A pandas Series can be converted to a NumPy array, similar to DataFrames. This is useful when you need to pass data to functions that expect arrays or perform NumPy-specific operations.

### Syntax
```python
# Convert Series to array
array = series.to_numpy()   # Preferred
# OR
array = series.values       # Older method
```

### Example
```python
>>> s = pd.Series([10, 20, 30])
>>> arr = s.to_numpy()
>>> arr
array([10, 20, 30])
```

### Task
Convert the provided pandas Series to a NumPy array. Store the result in `arr`.

### Expected Properties
- `arr` should be a NumPy ndarray
- Should have 5 elements
- First element should be 10, last should be 50

In [None]:
# Your solution:
s = pd.Series([10, 20, 30, 40, 50])
arr = None

In [None]:
# Verification
verify.p5(arr)

---
## Problem 6: Apply NumPy Broadcasting
**Difficulty:** Easy

### Concept
Broadcasting lets NumPy apply an operation between arrays of different but compatible shapes. Pandas follows the same idea: when you add a scalar to a DataFrame, the scalar is applied to every element automatically.

### Syntax
```python
# Add scalar to all elements
result = df + scalar

# Multiply all elements
result = df * scalar
```

### Example
```python
>>> df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
>>> df + 10
    A   B
0  11  13
1  12  14
```
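
Broadcasting is not limited to scalars. As a rough illustration (the frame `df_demo` and the array `scale` are made up for this sketch), a 1-D array whose length matches the number of columns is applied across every row:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 3 rows and 2 columns
df_demo = pd.DataFrame({'A': [1, 2, 3], 'B': [10, 20, 30]})

# One factor per column: A is multiplied by 100, B by 1
scale = np.array([100, 1])
scaled = df_demo * scale

# Subtracting the column means centers each column around 0
centered = df_demo - df_demo.mean()

print(scaled)
print(centered)
```

`df_demo.mean()` returns a Series indexed by column name, so pandas aligns it by label rather than by position.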

### Task
Add 10 to all values in the DataFrame. Store the result in `df_added`.

### Expected Properties
- `df_added` should be a DataFrame
- Should have same shape as original (3, 2)
- Sum of column 'A' should be 36 (11+12+13)
- Sum of column 'B' should be 45 (14+15+16)

In [None]:
# Your solution:
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df_added = None

In [None]:
# Verification
verify.p6(df_added)

---
## Problem 7: Use np.clip on DataFrame
**Difficulty:** Easy

### Concept
`np.clip()` limits values in an array to a specified range. Values below the minimum are set to the minimum, and values above the maximum are set to the maximum. This is useful for outlier handling and data normalization.

### Syntax
```python
# Clip values to range [min_val, max_val]
result = np.clip(array, min_val, max_val)
```

### Example
```python
>>> arr = np.array([5, 15, 25, 35])
>>> np.clip(arr, 10, 30)
array([10, 15, 25, 30])  # 5->10, 35->30
```

### Task
Clip the 'values' column to be between 20 and 80 (inclusive). Store the result in `df_clipped` as a DataFrame.

### Expected Properties
- `df_clipped` should be a DataFrame
- Minimum value in 'values' column should be 20
- Maximum value in 'values' column should be 80

In [None]:
# Your solution:
df = pd.DataFrame({'values': [10, 30, 50, 70, 90, 100]})
df_clipped = None

In [None]:
# Verification
verify.p7(df_clipped)

---
## Problem 8: Calculate Z-Scores Using NumPy
**Difficulty:** Medium

### Concept
Z-scores (standard scores) indicate how many standard deviations a value is from the mean. They're used to identify outliers and normalize data. A z-score of 0 means the value equals the mean.

### Syntax
```python
# Z-score formula: (x - mean) / std
z_scores = (data - np.mean(data)) / np.std(data)
```

### Example
```python
>>> values = np.array([10, 20, 30, 40, 50])
>>> z = (values - np.mean(values)) / np.std(values)
>>> np.round(z, 2)
array([-1.41, -0.71,  0.  ,  0.71,  1.41])
```
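
One thing to watch when mixing the two libraries: `np.std()` defaults to the population standard deviation (`ddof=0`), while the pandas `.std()` method defaults to the sample standard deviation (`ddof=1`). The sketch below compares them on the same numbers; the variable names are illustrative only:

```python
import numpy as np
import pandas as pd

values = pd.Series([10, 20, 30, 40, 50])

pop_std = np.std(values.to_numpy())   # ddof=0: divides by n
sample_std = values.std()             # ddof=1: divides by n - 1

z_pop = (values - values.mean()) / pop_std
z_sample = (values - values.mean()) / sample_std

print(pop_std, sample_std)
print(z_pop.round(2).tolist())
print(z_sample.round(2).tolist())
```

Either convention can be valid; what matters is using the same one consistently, for example by passing `ddof` explicitly.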

### Task
Calculate z-scores for the 'values' column using the formula (x - mean) / std. Store the result as a pandas Series in `z_scores`.

### Expected Properties
- `z_scores` should be a pandas Series
- Should have 5 elements
- Middle value (index 2) should have z-score close to 0

In [None]:
# Your solution:
df = pd.DataFrame({'values': [10, 20, 30, 40, 50]})
z_scores = None

In [None]:
# Verification
verify.p8(z_scores)

---
## Problem 9: Apply Custom NumPy Function
**Difficulty:** Medium

### Concept
Min-Max normalization scales values to a 0-1 range. This is useful for machine learning algorithms that work better with normalized data. The `apply()` method can apply functions to DataFrame columns.

### Syntax
```python
# Min-Max normalization: (x - min) / (max - min)
def normalize(x):
    return (x - np.min(x)) / (np.max(x) - np.min(x))

df_normalized = df.apply(normalize)
```

### Example
```python
>>> df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})
>>> normalized = (df['A'] - df['A'].min()) / (df['A'].max() - df['A'].min())
>>> normalized
0    0.00
1    0.25
2    0.50
3    0.75
4    1.00
Name: A, dtype: float64
```
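
The formula divides by `max - min`, which is zero for a column whose values are all the same. A possible guard is sketched below; `safe_normalize` is a hypothetical helper, and returning zeros for a constant column is just one reasonable convention:

```python
import numpy as np
import pandas as pd

def safe_normalize(x):
    """Min-max scale a column; a constant column maps to all zeros."""
    span = np.max(x) - np.min(x)
    if span == 0:
        # Avoid dividing by zero: every value is identical
        return x * 0.0
    return (x - np.min(x)) / span

# Hypothetical frame with one varying and one constant column
demo = pd.DataFrame({'varies': [2, 4, 6], 'constant': [7, 7, 7]})
result = demo.apply(safe_normalize)
print(result)
```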

### Task
Apply the provided `normalize_col` function to the DataFrame using `.apply()`. Store the result in `df_result`.

### Expected Properties
- `df_result` should be a DataFrame
- Each column's minimum should be 0.0
- Each column's maximum should be 1.0

In [None]:
# Your solution:
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [10, 20, 30, 40, 50]})

def normalize_col(x):
    return (x - np.min(x)) / (np.max(x) - np.min(x))

df_result = None

In [None]:
# Verification
verify.p9(df_result)

---
## Problem 10: NumPy Random Data in DataFrame
**Difficulty:** Medium

### Concept
NumPy's random module can generate data from various probability distributions. This is useful for creating test data, simulations, and understanding statistical properties.

### Syntax
```python
# Normal distribution (mean=0, std=1)
np.random.normal(loc=0, scale=1, size=100)

# Uniform distribution [0, 1)
np.random.uniform(low=0, high=1, size=100)

# Create DataFrame from multiple arrays
df = pd.DataFrame({'col1': array1, 'col2': array2})
```

### Example
```python
>>> np.random.seed(42)
>>> df = pd.DataFrame({
...     'normal': np.random.normal(0, 1, 10),
...     'uniform': np.random.uniform(0, 1, 10)
... })
```
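
`np.random.seed()` sets a global state shared by all code in the session. NumPy also offers the `Generator` API through `np.random.default_rng()`, which keeps the seed local to one object. A minimal sketch of the same idea with that API (`rng` and `df_rng` are names chosen for illustration):

```python
import numpy as np
import pandas as pd

# A local, seeded generator instead of the global np.random state
rng = np.random.default_rng(seed=42)

df_rng = pd.DataFrame({
    'normal': rng.normal(loc=0, scale=1, size=10),
    'uniform': rng.uniform(low=0, high=1, size=10),
})
print(df_rng.shape)
```

The numbers drawn this way differ from those produced by `np.random.seed(42)` followed by `np.random.normal(...)`, because the two APIs use different underlying generators.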

### Task
Create a DataFrame with two columns:
- 'normal': 100 samples from standard normal distribution
- 'uniform': 100 samples from uniform distribution [0, 1)

Store the result in `df_random`.

### Expected Properties
- `df_random` should be a DataFrame with shape (100, 2)
- Should have columns 'normal' and 'uniform'
- Uniform column values should be between 0 and 1

In [None]:
# Your solution:
np.random.seed(42)
df_random = None

In [None]:
# Verification
verify.p10(df_random)

---
## Problem 11: Correlation Matrix
**Difficulty:** Medium

### Concept
A correlation matrix shows the correlation coefficients between all pairs of variables. Values range from -1 (perfect negative correlation) to +1 (perfect positive correlation); a value of 0 means no linear relationship.

### Syntax
```python
# Calculate correlation matrix
corr_matrix = df.corr()
```

### Example
```python
>>> df = pd.DataFrame({
...     'A': [1, 2, 3, 4],
...     'B': [2, 4, 6, 8]  # Perfectly correlated with A
... })
>>> df.corr()
     A    B
A  1.0  1.0
B  1.0  1.0
```
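
The same Pearson coefficients can be computed on the NumPy side with `np.corrcoef()`. The sketch below checks that both routes agree on a small invented frame (`df_c` is hypothetical):

```python
import numpy as np
import pandas as pd

df_c = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': [2, 4, 6, 8],
    'C': [4, 1, 3, 2],
})

corr_pd = df_c.corr()

# rowvar=False tells NumPy that each column (not each row) is a variable
corr_np = np.corrcoef(df_c.to_numpy(), rowvar=False)

print(np.allclose(corr_pd.to_numpy(), corr_np))
```

The pandas version keeps the column labels, which is usually more convenient for reading the result.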

### Task
Calculate the correlation matrix for the provided DataFrame. Store the result in `corr_matrix`.

### Expected Properties
- `corr_matrix` should be a DataFrame
- Correlation between A and B should be 1.0 (perfect positive)
- Correlation between A and C should be -1.0 (perfect negative)

In [None]:
# Your solution:
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': [2, 4, 6, 8, 10],  # Perfectly correlated with A
    'C': [5, 4, 3, 2, 1]    # Negatively correlated with A
})

corr_matrix = None

In [None]:
# Verification
verify.p11(corr_matrix)

---
## Problem 12: Vectorized String Operations
**Difficulty:** Medium

### Concept
Pandas provides vectorized string operations through the `.str` accessor. They are more concise than explicit Python loops and propagate missing values automatically, which makes string cleaning on large datasets convenient.

### Syntax
```python
# Common string operations
df['col'].str.lower()      # Convert to lowercase
df['col'].str.upper()      # Convert to uppercase
df['col'].str.strip()      # Remove whitespace
df['col'].str.title()      # Title case

# Chain operations
df['clean'] = df['col'].str.strip().str.title()
```

### Example
```python
>>> df = pd.DataFrame({'name': ['  alice  ', 'BOB']})
>>> df['clean'] = df['name'].str.strip().str.title()
>>> df
        name  clean
0    alice    Alice
1        BOB    Bob
```

### Task
Clean the 'name' column by:
1. Stripping whitespace
2. Converting to title case

Store the result in a new column `df['name_clean']`.

### Expected Properties
- DataFrame should have a 'name_clean' column
- All names should be in title case (first letter uppercase)
- No leading or trailing whitespace

In [None]:
# Your solution:
df = pd.DataFrame({'name': ['  Alice  ', 'BOB', '  charlie', 'DAVID  ']})
# Add 'name_clean' column


In [None]:
# Verification
verify.p12(df)

---
## Problem 13: Use np.select for Multiple Conditions
**Difficulty:** Medium

### Concept
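`np.select()` applies a list of conditions and assigns the corresponding choice to each element. It works like a vectorized "if-elif-else": conditions are checked in order and the first one that matches wins, which is why the overlapping conditions in the syntax below still produce distinct categories. This makes it more flexible than `np.where()`, which handles a single condition.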

### Syntax
```python
conditions = [
    df['col'] < 10,
    df['col'] < 20,
    df['col'] >= 20
]
choices = ['Low', 'Medium', 'High']
df['category'] = np.select(conditions, choices)
```

### Example
```python
>>> df = pd.DataFrame({'temp': [5, 15, 25]})
>>> conditions = [df['temp'] < 10, df['temp'] < 20, df['temp'] >= 20]
>>> choices = ['Cold', 'Cool', 'Warm']
>>> df['category'] = np.select(conditions, choices)
>>> df
   temp category
0     5     Cold
1    15     Cool
2    25     Warm
```
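
Elements that match none of the conditions receive the `default` argument, which is `0` unless you set it. With string choices, recent NumPy releases may reject that integer default with a dtype error, so passing a string `default` explicitly is the safer habit. A sketch with a missing value (the frame `temps` is invented for illustration):

```python
import numpy as np
import pandas as pd

# NaN compares False against everything, so it matches no condition
temps = pd.DataFrame({'temp': [5.0, 15.0, np.nan]})

conditions = [temps['temp'] < 10, temps['temp'] >= 10]
choices = ['Cold', 'Warm']

# default= supplies the "else" branch
temps['category'] = np.select(conditions, choices, default='Unknown')
print(temps)
```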

### Task
Create a 'grade' column based on 'score' using `np.select()`:
- <60: 'F'
- 60-69: 'D'
- 70-79: 'C'
- 80-89: 'B'
- >=90: 'A'

The conditions and choices are provided.

### Expected Properties
- DataFrame should have a 'grade' column
- First score (45) should get 'F'
- Last score (95) should get 'A'

In [None]:
# Your solution:
df = pd.DataFrame({'score': [45, 65, 75, 85, 95]})

conditions = [
    df['score'] < 60,
    df['score'] < 70,
    df['score'] < 80,
    df['score'] < 90,
    df['score'] >= 90
]
choices = ['F', 'D', 'C', 'B', 'A']

df['grade'] = None  # Use np.select()

In [None]:
# Verification
verify.p13(df)

---
## Problem 14: Rolling Window Calculations
**Difficulty:** Medium

### Concept
A rolling window computes a statistic over a sliding window of fixed size. A rolling mean (moving average) smooths out short-term fluctuations and highlights longer-term trends, which makes it a staple of time series analysis and signal processing.

### Syntax
```python
# Calculate rolling mean with window size n
rolling_mean = df['col'].rolling(window=n).mean()
```

### Example
```python
>>> df = pd.DataFrame({'values': [1, 2, 3, 4, 5]})
>>> df['rolling_mean'] = df['values'].rolling(window=3).mean()
>>> df
   values  rolling_mean
0       1           NaN
1       2           NaN
2       3           2.0  # mean(1,2,3)
3       4           3.0  # mean(2,3,4)
4       5           4.0  # mean(3,4,5)
```
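
The leading `NaN` values appear because the first rows do not yet have a full window. As a sketch of two related options, `min_periods` relaxes that requirement, and `np.convolve()` gives the NumPy counterpart of the full-window means (variable names here are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5])

# Default: a result only once 3 values are available
strict = s.rolling(window=3).mean()

# min_periods=1: average whatever is available so far
partial = s.rolling(window=3, min_periods=1).mean()

# NumPy counterpart: mode='valid' keeps only full windows
np_means = np.convolve(s.to_numpy(), np.ones(3) / 3, mode='valid')

print(strict.tolist())
print(partial.tolist())
print(np_means)
```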

### Task
Calculate a 3-period rolling mean for the 'values' column. Store the result in `rolling_mean`.

### Expected Properties
- `rolling_mean` should be a pandas Series
- Should have 10 elements
- Third element (index 2) should be 2.0 (mean of 1, 2, 3)

In [None]:
# Your solution:
df = pd.DataFrame({'values': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
rolling_mean = None

In [None]:
# Verification
verify.p14(rolling_mean)

---
## Problem 15: Find Outliers with IQR Method
**Difficulty:** Hard

### Concept
The Interquartile Range (IQR) method identifies outliers as values that fall below Q1 - 1.5×IQR or above Q3 + 1.5×IQR, where Q1 is the 25th percentile, Q3 is the 75th percentile, and IQR = Q3 - Q1.

### Syntax
```python
Q1 = df['col'].quantile(0.25)
Q3 = df['col'].quantile(0.75)
IQR = Q3 - Q1

# Outliers are outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
outliers = (df['col'] < Q1 - 1.5*IQR) | (df['col'] > Q3 + 1.5*IQR)
```

### Example
```python
>>> values = np.array([1, 2, 3, 4, 5, 100])  # 100 is an outlier
>>> Q1, Q3 = np.percentile(values, [25, 75])
>>> IQR = Q3 - Q1
>>> outliers = (values < Q1 - 1.5*IQR) | (values > Q3 + 1.5*IQR)
>>> outliers
array([False, False, False, False, False,  True])
```

### Task
Find outliers in the 'values' column using the IQR method. Store a boolean mask in `is_outlier` (True for outliers).

### Expected Properties
- `is_outlier` should be a pandas Series of booleans
- Should identify exactly 1 outlier
- The value 100 should be identified as an outlier

In [None]:
# Your solution:
df = pd.DataFrame({'values': [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]})

Q1 = df['values'].quantile(0.25)
Q3 = df['values'].quantile(0.75)
IQR = Q3 - Q1

is_outlier = None  # Create boolean mask

In [None]:
# Verification
verify.p15(is_outlier)

---
## Problem 16: Bin Data with pd.cut
**Difficulty:** Hard

### Concept
Binning (discretization) converts continuous data into categorical bins. This is useful for creating age groups, price ranges, or any categorical groupings from numerical data.

### Syntax
```python
# pd.cut creates bins from continuous data
df['category'] = pd.cut(df['col'], 
                        bins=[0, 18, 65, 100], 
                        labels=['Young', 'Adult', 'Senior'])
```

### Example
```python
>>> df = pd.DataFrame({'age': [5, 25, 70]})
>>> df['group'] = pd.cut(df['age'], 
...                      bins=[0, 18, 65, 100], 
...                      labels=['Young', 'Adult', 'Senior'])
>>> df
   age   group
0    5   Young
1   25   Adult
2   70  Senior
```
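
By default `pd.cut()` builds intervals that are open on the left and closed on the right, so a value equal to a bin edge falls into the lower bin and a value equal to the first edge falls into no bin at all. The sketch below shows the effect on edge values and how `right=False` flips it (the Series `ages` is made up for this illustration):

```python
import pandas as pd

ages = pd.Series([0, 18, 19, 65])
bins = [0, 18, 65, 100]
labels = ['Young', 'Adult', 'Senior']

# Default: (0, 18], (18, 65], (65, 100]  -> 0 is outside every bin
default_cut = pd.cut(ages, bins=bins, labels=labels)

# right=False: [0, 18), [18, 65), [65, 100)
left_closed = pd.cut(ages, bins=bins, labels=labels, right=False)

print(default_cut.tolist())
print(left_closed.tolist())
```

Passing `include_lowest=True` is another option when only the first edge needs to be included.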

### Task
Create age groups using `pd.cut()` with the provided bins and labels. Store the result in `df['age_group']`.

Bins (right edge inclusive, the `pd.cut()` default): (0, 18] Child, (18, 35] Young Adult, (35, 55] Adult, (55, 100] Senior

### Expected Properties
- DataFrame should have an 'age_group' column
- First age (5) should be in 'Child' category
- Last age (75) should be in 'Senior' category

In [None]:
# Your solution:
df = pd.DataFrame({'age': [5, 15, 25, 35, 45, 55, 65, 75]})
bins = [0, 18, 35, 55, 100]
labels = ['Child', 'Young Adult', 'Adult', 'Senior']

df['age_group'] = None  # Use pd.cut()

In [None]:
# Verification
verify.p16(df)

---
## Problem 17: Cumulative Operations
**Difficulty:** Hard

### Concept
Cumulative operations calculate running totals or running maximums. These are useful for tracking cumulative progress, running balances, or maximum values seen so far.

### Syntax
```python
# Cumulative sum - running total
df['cumsum'] = df['col'].cumsum()

# Cumulative max - highest value so far
df['cummax'] = df['col'].cummax()

# Also available: cummin(), cumprod()
```

### Example
```python
>>> df = pd.DataFrame({'values': [1, 3, 2]})
>>> df['cumsum'] = df['values'].cumsum()
>>> df['cummax'] = df['values'].cummax()
>>> df
   values  cumsum  cummax
0       1       1       1
1       3       4       3
2       2       6       3
```
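
These pandas methods have direct NumPy counterparts, which is handy when the data is already an array. A brief sketch comparing the two on the example values (variable names are illustrative):

```python
import numpy as np
import pandas as pd

vals = np.array([1, 3, 2])

# NumPy: running total and running maximum
running_total = np.cumsum(vals)
running_max = np.maximum.accumulate(vals)

# Pandas: the same results as Series
s = pd.Series(vals)
print((s.cumsum().to_numpy() == running_total).all())
print((s.cummax().to_numpy() == running_max).all())
```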

### Task
Calculate cumulative sum and cumulative maximum for the 'values' column. Store them in `df['cumsum']` and `df['cummax']`.

### Expected Properties
- DataFrame should have 'cumsum' and 'cummax' columns
- Last cumsum value should be 28 (sum of all)
- Last cummax value should be 10 (maximum value)

In [None]:
# Your solution:
df = pd.DataFrame({'values': [5, 3, 8, 2, 10]})
df['cumsum'] = None
df['cummax'] = None

In [None]:
# Verification
verify.p17(df)

---
## Problem 18: Rank Data
**Difficulty:** Hard

### Concept
Ranking assigns a position to each value in a dataset, which is useful for leaderboards, performance comparisons, and statistical analysis. Depending on the sort direction, rank 1 goes to either the lowest or the highest value.

### Syntax
```python
# Rank values (default: lowest value gets rank 1)
df['rank'] = df['col'].rank()

# Rank with highest value getting rank 1
df['rank'] = df['col'].rank(ascending=False)
```

### Example
```python
>>> df = pd.DataFrame({'score': [85, 90, 78]})
>>> df['rank'] = df['score'].rank(ascending=False)
>>> df
   score  rank
0     85   2.0
1     90   1.0  # Highest score gets rank 1
2     78   3.0
```
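
Ranks are floats because tied values share the average of the positions they occupy by default. The `method` parameter changes how ties are handled, as this sketch with an invented `scores` Series shows:

```python
import pandas as pd

# Two scores are tied at 90
scores = pd.Series([90, 85, 90, 70])

avg_rank = scores.rank(ascending=False)                    # ties share the mean rank
min_rank = scores.rank(ascending=False, method='min')      # ties take the lowest rank
dense_rank = scores.rank(ascending=False, method='dense')  # no gaps after ties

print(avg_rank.tolist())
print(min_rank.tolist())
print(dense_rank.tolist())
```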

### Task
Rank the scores from highest to lowest (highest score gets rank 1). Store the result in `df['rank']`.

### Expected Properties
- DataFrame should have a 'rank' column
- The score 92 should have rank 1.0
- All ranks should be positive numbers

In [None]:
# Your solution:
df = pd.DataFrame({'score': [85, 90, 78, 92, 88]})
df['rank'] = None  # Use rank() with ascending=False

In [None]:
# Verification
verify.p18(df)

---
## Problem 19: Percentage Change
**Difficulty:** Hard

### Concept
Percentage change calculates the relative change between consecutive values. This is essential for analyzing growth rates, returns, and trends in time series data.

### Syntax
```python
# Calculate percentage change between consecutive rows
df['pct_change'] = df['col'].pct_change()
```

### Example
```python
>>> df = pd.DataFrame({'price': [100, 110, 105]})
>>> df['pct_change'] = df['price'].pct_change()
>>> df
   price  pct_change
0    100         NaN  # No previous value
1    110    0.100000  # 10% increase
2    105   -0.045455  # (105 - 110) / 110, about a 4.5% decrease
```
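
Each change is measured relative to the previous value, not the first one. The following sketch rebuilds the calculation by hand with `.shift()` to make that explicit (the names `manual` and `builtin` are illustrative):

```python
import numpy as np
import pandas as pd

price = pd.Series([100, 110, 105])

# current / previous - 1, where shift(1) moves every value down one row
manual = price / price.shift(1) - 1
builtin = price.pct_change()

print(np.allclose(manual, builtin, equal_nan=True))
```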

### Task
Calculate percentage change for the 'price' column. Store the result in `df['pct_change']`.

### Expected Properties
- DataFrame should have a 'pct_change' column
- Second value (index 1) should be approximately 0.10 (10% increase)
- First value will be NaN (no previous value to compare)

In [None]:
# Your solution:
df = pd.DataFrame({'price': [100, 110, 105, 120, 115]})
df['pct_change'] = None

In [None]:
# Verification
verify.p19(df)

---
## Problem 20: Complex Data Transformation
**Difficulty:** Hard

### Concept
Group-wise calculations allow you to compute statistics within groups and then broadcast those results back to the original DataFrame. This is useful for calculating percentages within categories, standardizing within groups, etc.

### Syntax
```python
# Calculate percentage of group total
df['pct_of_group'] = df['value'] / df.groupby('category')['value'].transform('sum')
```

### Example
```python
>>> df = pd.DataFrame({
...     'category': ['A', 'A', 'B'],
...     'sales': [100, 200, 150]
... })
>>> df['pct'] = df['sales'] / df.groupby('category')['sales'].transform('sum')
>>> df
  category  sales       pct
0        A    100  0.333333  # 100/(100+200)
1        A    200  0.666667  # 200/(100+200)
2        B    150  1.000000  # 150/150
```
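
The key is `.transform()`, which returns one value per original row, whereas a plain aggregation returns one value per group. A short sketch of the difference on the example data (the frame name `sales_df` is chosen for illustration):

```python
import pandas as pd

sales_df = pd.DataFrame({
    'category': ['A', 'A', 'B'],
    'sales': [100, 200, 150],
})

# Aggregation: one row per group, indexed by category
totals = sales_df.groupby('category')['sales'].sum()

# Transform: group totals repeated for each row, aligned with the original index
aligned = sales_df.groupby('category')['sales'].transform('sum')

print(totals.to_dict())
print(aligned.tolist())
```

Because `aligned` has the same index as `sales_df`, it can be used directly in element-wise arithmetic with the original column.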

### Task
For each row, calculate that row's share of the total sales for its product, expressed as a fraction between 0 and 1. For example, if product A has sales [100, 150, 180], the shares are 100/430, 150/430, and 180/430.

Store the result in `df['pct_of_product_total']`.

### Expected Properties
- DataFrame should have a 'pct_of_product_total' column
- For each product, the sum of percentages should equal 1.0
- All values should be between 0 and 1

In [None]:
# Your solution:
df = pd.DataFrame({
    'product': ['A', 'B', 'A', 'B', 'A'],
    'sales': [100, 200, 150, 250, 180]
})

df['pct_of_product_total'] = None

In [None]:
# Verification
verify.p20(df)

---
## Summary

Run this cell to see your overall progress on this notebook.

In [None]:
verify.summary()