# Summarize_numeric

This tutorial walks through the `summarease.summarize_numeric` module, which generates summary statistics and visualizations for the numeric variables in a dataset so that you can inspect their distributions and the relationships between them.

## Getting started


The `summarease.summarize_numeric` module offers the following core functionalities:


1. **Summarizing Numeric Variables:**
   - Outputs summary statistics for each numeric column (mean, standard deviation, min, max, etc.).
   - Useful for quickly understanding the distribution and spread of numeric features.

2. **Visualizing Numeric Relationships:**
   - Generates density plots for each numeric column to visualize their distributions.
   - Creates a correlation heatmap to explore the relationships between multiple numeric variables.
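
To make the idea of a density plot concrete, here is a minimal sketch of a Gaussian kernel density estimate written with the standard library only. It is not the module's implementation: the Gaussian kernel, the rule-of-thumb bandwidth, and the helper name `gaussian_kde` are assumptions chosen for illustration, and the values are `feature1` from the example dataset used later in this tutorial.

```python
import math
import statistics

# Values of `feature1` from the tutorial's example dataset.
feature1 = [1.2, 2.3, 3.1, 4.8, 5.5, 6.7, 8.9, 10.1]


def gaussian_kde(values, grid, bandwidth):
    """Hypothetical helper: evaluate a Gaussian KDE of `values` on `grid`."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


# Rule-of-thumb bandwidth (an assumption; plotting libraries may choose differently).
bandwidth = 1.06 * statistics.stdev(feature1) * len(feature1) ** (-1 / 5)

# Evaluate the density on an evenly spaced grid that extends past the data range.
lower = min(feature1) - 4 * bandwidth
upper = max(feature1) + 4 * bandwidth
steps = 200
step = (upper - lower) / steps
grid = [lower + i * step for i in range(steps + 1)]
density = gaussian_kde(feature1, grid, bandwidth)

# Trapezoidal area under the curve; a valid density integrates to roughly 1.
area = sum((density[i] + density[i + 1]) / 2 * step for i in range(steps))
print(round(area, 3))
```

A density plot draws `density` against `grid` as a smooth curve, so peaks mark the value ranges where observations cluster.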

## Necessary libraries


To use the `summarease.summarize_numeric` module, make sure `pandas`, `altair`, and `summarease` are installed, then import them:

```python
import pandas as pd
import altair as alt
from summarease.summarize_numeric import (
    summarize_numeric,
    plot_numeric_density,
    plot_correlation_heatmap,
)
```

## Example dataset


We'll use the following dataset to demonstrate the module's functionality:

### Dataset: Numeric Data

```python
numeric_data = pd.DataFrame({
    'feature1': [1.2, 2.3, 3.1, 4.8, 5.5, 6.7, 8.9, 10.1],
    'feature2': [3.2, 4.5, 5.1, 6.0, 7.8, 8.3, 9.1, 10.7],
    'feature3': [1.1, 2.2, 3.1, 4.0, 5.4, 6.6, 7.7, 8.5]
})
```

## Example usage

### `summarize_numeric`

This function calculates summary statistics or generates visualizations for numeric columns. It takes two parameters:


- `dataset`: The input dataset containing numeric variables. It must be a pandas `DataFrame`.
- `summarize_by`: Specifies the format for summarizing the numeric variables. It can be either `"table"` (default) to generate a summary table, or `"plot"` to generate visualizations like density plots and a correlation heatmap.
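
The examples in this tutorial read the function's result by string key: `"numeric_describe"` in table mode, and `"numeric_plot"` and `"corr_plot"` in plot mode. The sketch below records that mapping in a small lookup. The helper `expected_result_keys` is hypothetical and not part of `summarease`; the `ValueError` it raises is the helper's own behavior, not a statement about how the library handles other values.

```python
# Result keys as they appear in this tutorial's examples, per `summarize_by` value.
RESULT_KEYS = {
    "table": ("numeric_describe",),
    "plot": ("numeric_plot", "corr_plot"),
}


def expected_result_keys(summarize_by="table"):
    """Hypothetical helper: return the result keys this tutorial uses for a mode."""
    if summarize_by not in RESULT_KEYS:
        raise ValueError(
            f"summarize_by must be one of {sorted(RESULT_KEYS)}, got {summarize_by!r}"
        )
    return RESULT_KEYS[summarize_by]


print(expected_result_keys("table"))
print(expected_result_keys("plot"))
```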

#### Example 1: Summarizing with a Table

```python
# Summarize numeric variables in the dataset by generating summary statistics
summary_table = summarize_numeric(dataset=numeric_data, summarize_by="table")
summary_table["numeric_describe"]
```

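| Statistic | feature1 | feature2 | feature3 |
|-----------|----------|----------|----------|
| count     | 8.0      | 8.0      | 8.0      |
| mean      | 5.325    | 6.8375   | 4.825    |
| std       | 3.137219 | 2.550035 | 2.663912 |
| min       | 1.2      | 3.2      | 1.1      |
| 25%       | 2.9      | 4.95     | 2.875    |
| 50%       | 5.15     | 6.9      | 4.7      |
| 75%       | 7.25     | 8.5      | 6.875    |
| max       | 10.1     | 10.7     | 8.5      |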
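
The rows of this table have the same layout as the output of pandas `DataFrame.describe()`. Assuming the usual conventions behind that layout (sample standard deviation and linearly interpolated quartiles), the figures can be cross-checked with the standard library alone. The `describe` helper below is a sketch written for this check, not part of `summarease`.

```python
import statistics

# Column values copied from the tutorial's example dataset.
columns = {
    "feature1": [1.2, 2.3, 3.1, 4.8, 5.5, 6.7, 8.9, 10.1],
    "feature2": [3.2, 4.5, 5.1, 6.0, 7.8, 8.3, 9.1, 10.7],
    "feature3": [1.1, 2.2, 3.1, 4.0, 5.4, 6.6, 7.7, 8.5],
}


def describe(values):
    """Hypothetical helper: summary statistics for one numeric column."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }


summary = {name: describe(values) for name, values in columns.items()}
for name, stats in summary.items():
    print(name, {key: round(value, 6) for key, value in stats.items()})
```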


#### Example 2: Visualizing Numeric Relationships

```python
# Visualize the distribution of numeric variables and their correlations
numeric_plots = summarize_numeric(dataset=numeric_data, summarize_by="plot")
numeric_plots["numeric_plot"].show()  # Display density plots
numeric_plots["corr_plot"].show()     # Display correlation heatmap
```
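
A correlation heatmap colors one cell per pair of columns according to their correlation coefficient. As a rough sketch of the numbers behind such a plot, the block below computes a pairwise correlation matrix with the standard library. It assumes Pearson correlation, which this tutorial does not confirm as the module's method, and the `pearson` helper is written here for illustration only.

```python
import math

# Column values copied from the tutorial's example dataset.
columns = {
    "feature1": [1.2, 2.3, 3.1, 4.8, 5.5, 6.7, 8.9, 10.1],
    "feature2": [3.2, 4.5, 5.1, 6.0, 7.8, 8.3, 9.1, 10.7],
    "feature3": [1.1, 2.2, 3.1, 4.0, 5.4, 6.6, 7.7, 8.5],
}


def pearson(xs, ys):
    """Hypothetical helper: Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# One entry per (row, column) pair, which is what each heatmap cell would encode.
corr_matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in columns}
    for a in columns
}

for name, row in corr_matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

Because all three example features increase together, every off-diagonal coefficient is strongly positive, so the heatmap for this dataset shows little contrast between cells.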

In conclusion, the `summarease.summarize_numeric` module provides an easy way to explore your dataset's numeric features: a single call returns either a table of summary statistics or a set of plots showing each column's distribution and the correlations between columns.

## Final notes

If you get an error or something goes wrong while using the function, please submit an issue in the GitHub repo and it will be addressed as soon as possible. Thanks for your time!