##### _Data Visualization with Python_
---

# Treemap

A treemap displays hierarchical (tree-structured) data as a set of nested rectangles. Each branch of the tree is given a rectangle, which is then tiled with smaller rectangles representing sub-branches. The leaf nodes' rectangles have an area proportional to a specified dimension of the data (typically a numerical value). Color can also be used to represent another dimension of the data, either categorical or numerical.
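The proportionality rule above can be checked with a little arithmetic. The sketch below is only an illustration: the two-level hierarchy, its values, and the 800 x 600 pixel canvas are hypothetical and do not come from any plotting library.

```python
# Hypothetical two-level hierarchy: branch -> {leaf: value}
tree = {
    "A": {"a1": 30, "a2": 10},
    "B": {"b1": 40, "b2": 20},
}

# Assumed plot size in square pixels
canvas_area = 800 * 600


def leaf_areas(tree, canvas_area):
    """Return {leaf: area}, with each area proportional to the leaf's value."""
    total = sum(v for leaves in tree.values() for v in leaves.values())
    return {
        leaf: value * canvas_area / total
        for leaves in tree.values()
        for leaf, value in leaves.items()
    }


areas = leaf_areas(tree, canvas_area)

# A branch rectangle is tiled by its leaves, so its area is the sum of theirs
branch_areas = {
    branch: sum(areas[leaf] for leaf in leaves)
    for branch, leaves in tree.items()
}

print(areas)
print(branch_areas)
```

Because every leaf gets its share of the canvas, the leaf areas add up to the full canvas area, and each branch's area equals its share of the total value.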

### Suitable Variable Types

*   **Hierarchical Data:** Treemaps are fundamentally designed for data that has a hierarchical or nested structure.
*   **Numerical (Ratio):**  The *size* of each rectangle is determined by a numerical variable (ratio scale).
*   **Categorical (Optional):**  Categories are used to define the hierarchy (the levels of nesting). Color can also represent a categorical variable.
*   **Numerical for Color (Optional):** Color can be mapped to a continuous numerical variable.
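These requirements can be checked before plotting. The helper below is a minimal sketch, not part of pandas or Plotly: the function name `check_treemap_input` and the example frames are hypothetical, and the negative value is there only to trigger the check.

```python
import pandas as pd


def check_treemap_input(df, path, values):
    """Return a list of problems that make df unsuitable for a treemap."""
    problems = []
    if not pd.api.types.is_numeric_dtype(df[values]):
        problems.append(f"'{values}' is not numeric")
    elif (df[values] < 0).any():
        # Area cannot be negative, so the size variable must be non-negative
        problems.append(f"'{values}' contains negative numbers")
    if df[path].isna().any().any():
        problems.append("hierarchy columns contain missing labels")
    return problems


# Hypothetical example frames
good = pd.DataFrame({
    "Category": ["Electronics", "Electronics", "Furniture"],
    "Subcategory": ["Laptops", "Tablets", "Chairs"],
    "Sales": [120000, 30000, 50000],
})
bad = good.assign(Sales=[120000, 30000, -50])

print(check_treemap_input(good, ["Category", "Subcategory"], "Sales"))
print(check_treemap_input(bad, ["Category", "Subcategory"], "Sales"))
```

An empty list means the frame passed all three checks; any other result lists what to fix before calling the plotting function.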

### Use Cases

1.  **Visualizing Hierarchical Data:** The primary use is to show the structure and proportions of hierarchical data. Examples include:
    *   **File System:**  Showing the sizes of files and directories within a computer's file system.
    *   **Organizational Structure:**  Visualizing the hierarchy of departments and employees within a company, with rectangle size representing, for example, budget allocation or number of employees.
    *   **Budget Allocation:**  Showing how a budget is divided among different categories and subcategories.
    *   **Stock Portfolios:**  Representing the allocation of investments across different asset classes and individual stocks.
2.  **Showing Part-to-Whole Relationships:** Like pie charts, treemaps show how a whole is divided into its parts. However, treemaps can handle *multiple levels* of this part-to-whole relationship.
3.  **Comparing Proportions:** The relative sizes of the rectangles make it easy to compare the magnitudes of different categories and subcategories.
4.  **Using Space Efficiently:** A treemap fills the entire plotting area with data, so it can show many items in a compact space.
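The multi-level part-to-whole idea can be made concrete with pandas. The sketch below reuses the sales figures from the dummy dataset built later in this notebook; the column names `share_of_category` and `share_of_total` are assumptions introduced for this illustration.

```python
import pandas as pd

# Same figures as the dummy sales dataset used later in this notebook
sales = pd.DataFrame({
    "Category": ["Electronics"] * 3 + ["Furniture"] * 3 + ["Clothing"] * 3,
    "Subcategory": ["Laptops", "Smartphones", "Tablets",
                    "Chairs", "Tables", "Sofas",
                    "Men", "Women", "Kids"],
    "Sales": [120000, 80000, 30000,
              50000, 40000, 20000,
              70000, 90000, 40000],
})

# Level 2: each subcategory as a fraction of its parent category
category_total = sales.groupby("Category")["Sales"].transform("sum")
sales["share_of_category"] = sales["Sales"] / category_total

# Each subcategory as a fraction of the whole
sales["share_of_total"] = sales["Sales"] / sales["Sales"].sum()

# Level 1: each category as a fraction of the whole
category_share = sales.groupby("Category")["Sales"].sum() / sales["Sales"].sum()

print(sales.round(3))
print(category_share.round(3))
```

A pie chart could show only one of these levels at a time, whereas a treemap shows both: the category rectangles follow `category_share`, and the rectangles inside them follow `share_of_category`.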

### Potential Pitfalls

1.  **Difficulty with Small Values:** Very small values can be difficult to see, especially if they are nested deep within the hierarchy.
2.  **Aspect Ratio Issues:** The algorithm used to create the rectangles (the "squarified" treemap algorithm is common) tries to keep the rectangles as close to squares as possible. However, with many levels of nesting, some rectangles can become very thin and elongated, making them hard to compare.
3.  **Ordering of Rectangles:**  Within a level of the hierarchy, the order of the rectangles is often not meaningful (unless specifically sorted). This can make it difficult to find specific items.
4.  **Color Perception:** If color represents a numerical variable, choose the color scale carefully (as with heatmaps). If it represents a categorical variable, use clearly distinct colors.
5.  **Clutter (Too Many Levels/Categories):** If the hierarchy is too deep or has too many categories, the treemap can become cluttered and difficult to interpret. Consider aggregating categories or limiting the depth of the displayed hierarchy.
6.  **Not Ideal for Showing Trends Over Time:**  Treemaps are static representations of data at a single point in time.  They are not suitable for showing changes over time.
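7.  **Imprecise Area Comparisons:** Readers judge areas less accurately than lengths or positions, so small differences between rectangles, especially rectangles with different shapes, are hard to read off a treemap.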
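One way to act on the advice about small values and clutter is to merge tiny leaves into a single "Other" rectangle per parent before plotting. The sketch below is one possible approach, not a Plotly feature: the helper `lump_small_leaves`, the 1% threshold, and the small accessory rows (`Cables`, `Chargers`, `Cases`, `Coasters`) are hypothetical and were added only so that there is a long tail to collapse.

```python
import pandas as pd


def lump_small_leaves(df, parent, leaf, values, min_share=0.01, other_label="Other"):
    """Merge leaves below min_share of the grand total into one leaf per parent."""
    out = df.copy()
    small = out[values] / out[values].sum() < min_share
    out.loc[small, leaf] = other_label
    # Rows that now share a (parent, leaf) pair are summed into one row
    return out.groupby([parent, leaf], as_index=False, sort=False)[values].sum()


raw = pd.DataFrame({
    "Category": ["Electronics"] * 6 + ["Furniture"] * 4,
    "Subcategory": ["Laptops", "Smartphones", "Tablets",
                    "Cables", "Chargers", "Cases",       # hypothetical small items
                    "Chairs", "Tables", "Sofas",
                    "Coasters"],                          # hypothetical small item
    "Sales": [120000, 80000, 30000,
              1500, 900, 600,
              50000, 40000, 20000,
              400],
})

lumped = lump_small_leaves(raw, "Category", "Subcategory", "Sales")
print(lumped)
```

The total is unchanged, so the category rectangles keep their sizes; only the slivers inside them are combined. The trade-off is that the individual small items are no longer visible, so the threshold should be chosen with the audience in mind.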

### How to Create Treemaps?

#### Creating a Dummy Dataset:

```python
import pandas as pd
import plotly.express as px

# Sales data: one row per subcategory, labeled with its parent category
data = {
    'Category': ['Electronics', 'Electronics', 'Electronics',
                 'Furniture', 'Furniture', 'Furniture',
                 'Clothing', 'Clothing', 'Clothing'],
    'Subcategory': ['Laptops', 'Smartphones', 'Tablets',
                    'Chairs', 'Tables', 'Sofas',
                    'Men', 'Women', 'Kids'],
    'Sales': [120000, 80000, 30000,
              50000, 40000, 20000,
              70000, 90000, 40000]
}
df = pd.DataFrame(data)

# Preview the first five rows
df.head()
```

|   | Category | Subcategory | Sales |
|---|---|---|---|
| 0 | Electronics | Laptops | 120000 |
| 1 | Electronics | Smartphones | 80000 |
| 2 | Electronics | Tablets | 30000 |
| 3 | Furniture | Chairs | 50000 |
| 4 | Furniture | Tables | 40000 |


#### Treemaps with Plotly:

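```python
# Creating the treemap: `path` lists the hierarchy levels from the
# outermost to the innermost, and `values` sets the rectangle sizes
fig = px.treemap(
    df,
    path=['Category', 'Subcategory'],
    values='Sales',
    title='Sales Data Treemap'
)
fig.show()
```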