## 1. What is a Chart

A **chart** is a structured visual depiction of numerical or categorical information that turns rows and columns into shapes, lines, and areas, enabling rapid recognition of patterns, contrasts, and trends that are otherwise hidden in raw numbers. Charts use elements such as bars, points, lines, boxes, and color to translate data into visual signals.

**Example:** A bar chart showing total sales for each product category.

**Real-world use case:** An e-commerce team uses a bar chart to compare category sales (Electronics, Fashion, Home Appliances) to decide where to increase advertising spend.

## 2. What is Data Visualization

**Data visualization** is the practice of converting datasets into visual representations—charts, graphs, maps, and dashboards—to communicate insights more effectively. It bridges analysis and action by making statistical results and trends accessible to stakeholders who may not work with raw data directly.

**Example:** Converting a CSV of monthly revenue into a line chart to reveal seasonality.

**Real-world use case:** A marketing analyst builds Power BI dashboards to monitor website traffic and conversions across campaigns.

## 3. Role of Charts in Data Visualization

Charts are the essential elements of data visualization. They:

- Reduce complexity by summarizing large datasets into clear visuals.
- Emphasize patterns, relationships, and important deviations.
- Provide evidence for data-driven decisions.
- Help non-technical audiences grasp quantitative messages quickly.

**Example:** A project manager prefers a Gantt chart to visualize timelines instead of reading dense text schedules.

## 4. Data Types Supported by Charts

Charts work with several kinds of data. The chosen chart type should match the data type to preserve meaning and avoid misinterpretation.

| Data Type | Definition | Example | Typical Charts |
|---|---|---|---|
| Numerical (Quantitative) | Measurable numeric values | Age, Salary, Sales | Line, Histogram, Scatter, Box Plot |
| Categorical (Nominal) | Unordered labels | Country, Gender, Product Type | Bar, Pie, Treemap |
| Ordinal | Categories with order | Ratings (Low, Medium, High) | Bar, Column, Heatmap |
| Time-series (Temporal) | Values indexed by time | Month, Date, Year | Line, Area, Timeline |
| Boolean | True/False flags | Exceeded Limit (Yes/No) | Bar, Pie |

Understanding the data type guides the chart selection and layout decisions.
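
One way to apply the table in code is to infer a column's data type from its pandas dtype and look up candidate charts. The sketch below is a rough heuristic rather than a definitive rule: `CHARTS_BY_KIND`, `classify_column`, and the `sample` frame are hypothetical names with made-up values, and real columns (for example numeric IDs, or dates stored as strings) may need manual overrides.

```python
import pandas as pd

# Hypothetical lookup; the chart lists mirror the last column of the table above.
CHARTS_BY_KIND = {
    "boolean": ["Bar", "Pie"],
    "temporal": ["Line", "Area", "Timeline"],
    "numerical": ["Line", "Histogram", "Scatter", "Box Plot"],
    "ordinal": ["Bar", "Column", "Heatmap"],
    "categorical": ["Bar", "Pie", "Treemap"],
}


def classify_column(series: pd.Series) -> str:
    """Guess the data type of a column from its pandas dtype."""
    # Booleans are checked first because pandas also treats them as numeric.
    if pd.api.types.is_bool_dtype(series):
        return "boolean"
    if pd.api.types.is_datetime64_any_dtype(series):
        return "temporal"
    if pd.api.types.is_numeric_dtype(series):
        return "numerical"
    if isinstance(series.dtype, pd.CategoricalDtype) and series.cat.ordered:
        return "ordinal"
    return "categorical"


# Made-up sample values, for illustration only.
sample = pd.DataFrame({
    "Age": [8, 12, 15],
    "Gender": ["Male", "Female", "Female"],
    "Exceeded_Recommended_Limit": [True, False, True],
    "Month": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
    "Rating": pd.Categorical(
        ["Low", "High", "Medium"],
        categories=["Low", "Medium", "High"],
        ordered=True,
    ),
})

kinds = {name: classify_column(sample[name]) for name in sample.columns}
for name, kind in kinds.items():
    print(f"{name}: {kind} -> {', '.join(CHARTS_BY_KIND[kind])}")
```

The order of the checks matters: a boolean column would otherwise be reported as numerical.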

## 5. Types of Charts — Detailed Description, Examples, and Use Cases

Below is an extended catalogue of common chart types, with notes on when each is appropriate, its advantages and disadvantages, and example situations.

---

### Bar Chart
**What it shows:** Comparison across categories using horizontal bars.

**When to use:** Comparing discrete categories, especially when labels are long or there are many categories, since both fit better in a horizontal layout.

**Advantages:** Readable, straightforward to compare values.

**Disadvantages:** Can become cluttered with too many categories.

**Example use:** Comparing monthly revenue by product category.

---

### Column Chart
**What it shows:** Vertical bars for category comparison (often for time periods).

**When to use:** Comparing values across a small number of time buckets (months, quarters).

**Advantages:** Natural for time-ordered categories.

**Disadvantages:** Becomes crowded and hard to label when there are many columns.

**Example use:** Month-over-month sales figures.

---

### Grouped and Stacked Bars
**What they show:** Grouped bars place multiple series side-by-side; stacked bars show part-to-whole within each category.

**When to use:** Grouped: compare multiple series directly. Stacked: highlight composition and totals.

**Advantages:** Compact multi-series display.

**Disadvantages:** Stacked charts make it hard to compare individual series across categories.
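
The counts behind a grouped or stacked bar chart can be prepared with a cross-tabulation. This is a minimal sketch, assuming pandas and a made-up `toy` frame: the column names follow the dataset in Section 7, but the values are invented for illustration.

```python
import pandas as pd

# Invented rows, for illustration only.
toy = pd.DataFrame({
    "Age_Group": ["Pre-teens", "Pre-teens", "Teenagers", "Teenagers", "Teenagers"],
    "Primary_Device": ["TV", "Smartphone", "Smartphone", "Smartphone", "Laptop"],
})

# One row per category, one column per series: the layout grouped bars need.
counts = pd.crosstab(toy["Age_Group"], toy["Primary_Device"])

# Dividing each row by its total gives the shares for a 100% stacked bar.
shares = counts.div(counts.sum(axis=1), axis=0)

print(counts)
print(shares.round(2))
```

Plotting `counts` side by side supports direct series comparison, while `shares` emphasizes composition at the cost of hiding the absolute totals.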

---

### Line Chart
**What it shows:** Trends and continuous changes over an ordered axis (usually time).

**When to use:** Time-series trends, moving averages, and forecasts.

**Advantages:** Shows direction and slope effectively.

**Disadvantages:** Too many series make it hard to read.

**Example use:** Daily active users over a year.
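
A moving average is often drawn on top of the raw line to make the trend easier to see. The sketch below assumes pandas and uses an invented `daily` series; the 3-day window is an arbitrary choice for illustration.

```python
import pandas as pd

# Invented daily values, for illustration only.
daily = pd.Series(
    [100, 120, 90, 110, 130, 150, 140],
    index=pd.date_range("2024-01-01", periods=7, freq="D"),
)

# Each point is the mean of the current day and the two days before it.
# The first two entries are NaN because a full window is not yet available.
smoothed = daily.rolling(window=3).mean()

print(pd.DataFrame({"daily": daily, "3-day mean": smoothed}))
```

A wider window gives a smoother line but reacts more slowly to real changes.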

---

### Area Chart
**What it shows:** Like a line chart but with the area under the line filled; good for magnitude and cumulative visuals.

**When to use:** Emphasizing volume over time or stacked contributions.

**Advantages:** Visual weight helps show totals.

**Disadvantages:** Overlap hides details in multi-series scenarios.

---

### Pie Chart and Donut Chart
**What they show:** Relative proportions of parts to a whole.

**When to use:** When categories are few (2–6) and relative share is primary.

**Advantages:** Immediately signals part-to-whole relationships.

**Disadvantages:** Poor at showing precise comparisons, especially with many slices.
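
When there are more categories than a pie chart can show clearly, one common workaround is to keep the largest slices and fold the rest into an "Other" slice. The helper below is a hypothetical sketch (`top_shares` and the category counts are invented), using the six-slice limit suggested above as its default.

```python
import pandas as pd


def top_shares(counts: pd.Series, max_slices: int = 6) -> pd.Series:
    """Return shares of the whole, folding small categories into 'Other'."""
    ordered = counts.sort_values(ascending=False)
    if len(ordered) > max_slices:
        # Keep the largest (max_slices - 1) categories and sum the remainder.
        head = ordered.iloc[: max_slices - 1]
        other = pd.Series({"Other": ordered.iloc[max_slices - 1:].sum()})
        ordered = pd.concat([head, other])
    return ordered / ordered.sum()


# Invented counts for seven categories, for illustration only.
category_counts = pd.Series(
    {"A": 40, "B": 25, "C": 15, "D": 10, "E": 5, "F": 3, "G": 2}
)
shares = top_shares(category_counts)
print(shares)
```

If the "Other" slice ends up among the largest, a bar chart is usually the better choice.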

---

### Histogram
**What it shows:** Distribution of a numeric variable across contiguous bins.

**When to use:** Understanding skewness, modality, and range of continuous variables.

**Advantages:** Summarizes distribution compactly.

**Disadvantages:** Sensitive to bin size; different bins can suggest different interpretations.
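
The effect of bin size can be checked numerically before drawing anything. This sketch assumes NumPy and an invented `values` array with two separate clusters; the bin counts of 2 and 7 are arbitrary choices for illustration.

```python
import numpy as np

# Invented values forming two clusters, for illustration only.
values = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 6.0, 6.5, 7.0, 7.5, 8.0])

# Two wide bins: the clusters look like one evenly spread distribution.
coarse_counts, coarse_edges = np.histogram(values, bins=2)

# Seven narrower bins: empty bins appear in the gap between the clusters.
fine_counts, fine_edges = np.histogram(values, bins=7)

print("coarse:", coarse_counts, coarse_edges)
print("fine:  ", fine_counts, fine_edges)
```

Trying a few bin counts like this is a cheap way to see whether a feature of the histogram belongs to the data or to the binning.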

---

### Box Plot (Box-and-Whisker)
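**What it shows:** Five-number summary (minimum, Q1, median, Q3, maximum), with outliers drawn as separate points. In the common Tukey convention the whiskers stop at the most extreme values within 1.5 × IQR of the box, so the whisker ends are not always the true minimum and maximum.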

**When to use:** Comparing central tendency and spread across groups.

**Advantages:** Compactly shows variability and extremes.

**Disadvantages:** Not intuitive for all audiences without explanation.
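
The numbers a box plot draws can be computed directly, which helps when explaining the chart to a new audience. The sketch below assumes NumPy, an invented `hours` array, and the common 1.5 × IQR rule for flagging outliers; plotting libraries may use slightly different quartile conventions.

```python
import numpy as np

# Invented screen-time values in hours, for illustration only.
hours = np.array([1.0, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 12.0])

# Quartiles using NumPy's default linear interpolation.
q1, median, q3 = np.percentile(hours, [25, 50, 75])
iqr = q3 - q1

# Points outside these fences are drawn individually as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = hours[(hours < lower_fence) | (hours > upper_fence)]

# Whiskers end at the most extreme values still inside the fences.
inside = hours[(hours >= lower_fence) & (hours <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"whiskers: {whisker_low} to {whisker_high}, outliers: {outliers}")
```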

---

### Scatter Plot
**What it shows:** Relationship between two continuous variables; each point is an observation.

**When to use:** Detecting correlation, clusters, and outliers.

**Advantages:** Reveals relationships and heteroscedasticity.

**Disadvantages:** Hard to read when many points overlap (overplotting).

---

### Bubble Chart
**What it shows:** Scatter plot extended with point size encoding a third variable (and color for a fourth).

**When to use:** Multivariate comparisons where circle area meaningfully encodes a quantity.

**Advantages:** Packs more information than a scatter plot.

**Disadvantages:** Perception of area is non-linear and can mislead.
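
If the circle's area is meant to encode the quantity, the radius has to grow with the square root of the value; scaling the radius linearly would exaggerate large values. A minimal sketch, where `bubble_radius` and its `max_radius` default are hypothetical names and numbers chosen for illustration:

```python
import math


def bubble_radius(value: float, max_value: float, max_radius: float = 20.0) -> float:
    """Radius such that circle area is proportional to value."""
    # Area grows with radius squared, so take the square root of the ratio.
    return max_radius * math.sqrt(value / max_value)


# Invented values, for illustration only.
for value in (25, 50, 100):
    radius = bubble_radius(value, max_value=100)
    area = math.pi * radius ** 2
    print(f"value={value:>3}  radius={radius:6.2f}  area={area:8.1f}")
```

Here a value four times larger gets a radius only twice as large, which keeps the areas in the right proportion.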

---

### Heatmap
**What it shows:** Intensity values in a matrix using a color scale.

**When to use:** Correlation matrices, activity by hour and day, or any tabular magnitude visualization.

**Advantages:** Compact representation of dense information.

**Disadvantages:** Color choices and perceptual issues (colorblindness) matter greatly.
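
A heatmap needs its data as a matrix, so long-format records usually have to be pivoted first. The sketch below assumes pandas and an invented `events` table of sessions by day and hour.

```python
import pandas as pd

# Invented activity records, for illustration only.
events = pd.DataFrame({
    "day": ["Mon", "Mon", "Tue", "Tue", "Tue"],
    "hour": [9, 9, 9, 14, 14],
    "sessions": [3, 2, 4, 1, 5],
})

# Rows become days, columns become hours, and each cell holds the summed
# sessions. Combinations with no records are filled with 0.
matrix = events.pivot_table(
    index="day", columns="hour", values="sessions", aggfunc="sum", fill_value=0
)

print(matrix)
```

Whether an empty cell should be shown as zero or as missing depends on the data; `fill_value=0` is an assumption that suits counts.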

---

### Treemap
**What it shows:** Hierarchical, rectangular partitioning sized by value.

**When to use:** Visualizing composition across nested categories (e.g., revenue by region and product).

**Advantages:** Space-efficient hierarchy display.

**Disadvantages:** Small rectangles are hard to compare or label.

---

### Violin Plot
**What it shows:** Distribution and density of a numeric variable, combining a box plot and density estimate.

**When to use:** To compare distributions with shape detail beyond quartiles.

**Advantages:** Shows multimodality and distribution shape.

**Disadvantages:** Requires explanation to non-technical audiences.

---

### Waterfall Chart
**What it shows:** Additive/subtractive sequence leading to a final total, often used in financial breakdowns.

**When to use:** Profit and loss breakdowns, explaining step-by-step changes.

**Advantages:** Makes cumulative impacts clear.

**Disadvantages:** Can be verbose with many steps.
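
Each bar in a waterfall chart starts where the previous one ended, so the data preparation is a running total. A minimal sketch using only the standard library, with invented `steps` for a simplified profit breakdown:

```python
from itertools import accumulate

# Invented amounts, for illustration only.
steps = [("Revenue", 100), ("COGS", -40), ("Opex", -25), ("Other income", 5)]

# Running total after each step; the last entry is the final total.
running = list(accumulate(delta for _, delta in steps))

# Each floating bar starts at the total reached by the previous step.
starts = [0] + running[:-1]

for (label, delta), start, end in zip(steps, starts, running):
    print(f"{label:<13} change={delta:>4}  bar from {start:>3} to {end:>3}")
print("Final total:", running[-1])
```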

---

### Funnel Chart
**What it shows:** Sequential stages with drop-offs encoded by width.

**When to use:** Conversion rates across pipeline stages.

**Advantages:** Intuitive for processes.

**Disadvantages:** Not suitable for precise numeric comparison.
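
Because bar width alone does not support precise comparison, funnel charts are often labelled with conversion rates. The sketch below computes them with plain Python from invented `stages` counts.

```python
# Invented stage counts, for illustration only.
stages = [
    ("Visited", 1000),
    ("Added to cart", 400),
    ("Checked out", 200),
    ("Purchased", 150),
]

# Conversion of each stage relative to the stage immediately before it.
step_rates = [
    (name, count / prev_count)
    for (_, prev_count), (name, count) in zip(stages, stages[1:])
]

# Conversion of the final stage relative to the first.
overall_rate = stages[-1][1] / stages[0][1]

for name, rate in step_rates:
    print(f"{name}: {rate:.0%} of previous stage")
print(f"Overall: {overall_rate:.0%}")
```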

---

### Radar (Spider) Chart
**What it shows:** Multiple quantitative variables on radial axes for profile comparison.

**When to use:** Skill or feature comparisons across multiple dimensions.

**Advantages:** Good at showing shape differences across entities.

**Disadvantages:** Overlapping series and scale issues can confuse interpretation.

---

### Gantt Chart
**What it shows:** Task schedules and durations on a timeline.

**When to use:** Project planning and tracking.

**Advantages:** Clear schedule visualization.

**Disadvantages:** Difficult to manage for very large projects in simple charting tools.

---

### Gauge / KPI Dial
**What it shows:** A single metric against a target, often with colored ranges (bad/ok/good).

**When to use:** Dashboards focusing on single KPI status.

**Advantages:** Immediate status indication.

**Disadvantages:** Limited analytical depth.

---

(End of chart catalogue)

## 6. Summary

- Charts are the primary instruments in data visualization; they transform numbers into comprehensible visual formats.
- The right chart aligns with the data type and the analytical question being asked.
- Design choices (labels, axes, color, and scale) influence how accurately a chart communicates.
- Interactive tools allow exploration, but static charts must be crisp, labelled, and purposeful.

---

## 7. Dataset (Placeholder) and Column Summary

Below is a placeholder description of the dataset columns. Replace the file path with your actual dataset path and use the code cells to load and inspect the data.

**Columns (example):**
- Age (numeric)
- Age_Group (categorical)
- Gender (categorical)
- Avg_Daily_Screen_Time_hr (numeric)
- awareness (categorical)
- Primary_Device (categorical)
- Device_Category (categorical)
- Screen_Size (categorical)
- Exceeded_Recommended_Limit (boolean)
- Educational_to_Recreational_Ratio (numeric)
- Health_Impacts (categorical)
- Health_Impact_Category (categorical)
- Urban_or_Rural (categorical)

Use the code cells below to load and examine the dataset. Rendered plot output is intentionally omitted; the cells are placeholders to run once your dataset is in place.

In [None]:

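import pandas as pd

# The /content/ prefix suggests a Colab session; adjust the path for your environment.
file_path = '/content/Indian_Kids_Screen_Time.csv'
df = pd.read_csv(file_path)
df.head(10)  # preview the first 10 rows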

In [None]:

# df.info() prints its report itself and returns None, so it is not wrapped in print()
df.info()
print(df.describe(include='all'))
print(df.columns)


In [None]:
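# Univariate analysis
# Distribution of a numeric column (renders a histogram in a notebook)
df['Age'].hist()

# Frequency of each category; print() is needed because a notebook cell
# only displays its last expression
print(df['Age_Group'].value_counts())

# Basic summary statistics for the numeric columns
age_stats = df['Age'].describe()
screen_time_stats = df['Avg_Daily_Screen_Time_hr'].describe()
print(age_stats)
print(screen_time_stats)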

In [None]:
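# Bivariate analysis
# Relationship between two numeric columns
df.plot.scatter(x='Age', y='Avg_Daily_Screen_Time_hr')

# Median screen time per age group (printed so it is not silently discarded)
print(df.groupby('Age_Group')['Avg_Daily_Screen_Time_hr'].median())

# Pairwise Pearson correlations between the numeric columns
corr = df[['Age','Avg_Daily_Screen_Time_hr','Educational_to_Recreational_Ratio']].corr()
print(corr)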

In [None]:
# Multivariate analysis
from sklearn.preprocessing import StandardScaler

# Summary table: median screen time for each Age_Group and Gender combination
pivot = pd.pivot_table(df, values='Avg_Daily_Screen_Time_hr', index=['Age_Group','Gender'], aggfunc='median')
print(pivot)

# Preparation for clustering: drop rows with missing values, then
# standardize each feature to zero mean and unit variance
features = df[['Avg_Daily_Screen_Time_hr','Educational_to_Recreational_Ratio']].dropna()
scaler = StandardScaler()
scaled = scaler.fit_transform(features)
print(scaled[:5])  # first five standardized rows

In [None]:
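# Exporting and saving results
# Example: save the DataFrame to CSV (apply any cleaning steps before this point)
cleaned_path = '/path/to/cleaned_dataset.csv'
df.to_csv(cleaned_path, index=False)

# Example: save summary statistics to Excel
# (pandas needs an Excel writer engine such as openpyxl installed)
summary = df.describe()
summary.to_excel('/path/to/summary.xlsx')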