# Task
Generate Python code to create treemap and sunburst visualizations for the dataset provided in the file '/content/academic Stress level - maintainance 1.csv'.

## Load the data

### Subtask:
Load the dataset from '/content/academic Stress level - maintainance 1.csv' into a pandas DataFrame.


**Reasoning**:
Load the data into a pandas DataFrame as instructed.



In [None]:
import pandas as pd

df = pd.read_csv('/content/academic Stress level - maintainance 1.csv')

## Inspect the data

### Subtask:
Display the first few rows and the columns of the DataFrame to understand its structure and content.


The notebook follows this overall workflow:

1.  **Load the dataset:** Read the data from the CSV file located at `/content/academic Stress level - maintainance 1.csv` into a pandas DataFrame.
2.  **Inspect the data:**
    *   Display the first few rows of the DataFrame to get a glimpse of the data structure and content.
    *   Display information about the DataFrame, including column names, non-null counts, and data types, to identify missing values and understand data types.
3.  **Handle missing values:**
    *   Fill any missing values in the 'Study Environment' column with the string 'Unknown'.
    *   Display the DataFrame information again to confirm that the missing values in 'Study Environment' have been handled.
4.  **Generate Treemap Visualization:**
    *   Import the `plotly.express` library.
    *   Create a treemap visualization using the DataFrame.
    *   Define the hierarchical path for the treemap using the columns 'Your Academic Stage', 'Study Environment', and 'What coping strategy you use as a student?'.
    *   Specify 'Rate your academic stress index ' as the values to determine the size of the rectangles in the treemap.
    *   Set a title for the treemap.
    *   Update the layout margins for better visualization.
    *   Display the treemap.
5.  **Generate Sunburst Visualization:**
    *   Import the `plotly.express` library (if not already imported).
    *   Create a sunburst visualization using the DataFrame.
    *   Define the hierarchical path for the sunburst chart using the columns 'Your Academic Stage', 'Study Environment', and 'What coping strategy you use as a student?'.
    *   Specify 'Rate your academic stress index ' as the values to determine the size of the segments in the sunburst chart.
    *   Set a title for the sunburst chart.
    *   Display the sunburst chart.
6.  **Summarize Findings:**
    *   Provide a summary of the key findings from the data loading, inspection, handling missing values, and visualization steps.
    *   Offer insights or suggest potential next steps for further analysis based on the visualizations and data characteristics.

**Reasoning**:
Display the first few rows and the column information of the DataFrame to understand its structure and content.



In [None]:
display(df.head())
display(df.info())

Unnamed: 0,Timestamp,Your Academic Stage,Peer pressure,Academic pressure from your home,Study Environment,What coping strategy you use as a student?,"Do you have any bad habits like smoking, drinking on a daily basis?",What would you rate the academic competition in your student life,Rate your academic stress index
0,24/07/2025 22:05:39,undergraduate,4,5,Noisy,Analyze the situation and handle it with intel...,No,3,5
1,24/07/2025 22:05:52,undergraduate,3,4,Peaceful,Analyze the situation and handle it with intel...,No,3,3
2,24/07/2025 22:06:39,undergraduate,1,1,Peaceful,"Social support (friends, family)",No,2,4
3,24/07/2025 22:06:45,undergraduate,3,2,Peaceful,Analyze the situation and handle it with intel...,No,4,3
4,24/07/2025 22:08:06,undergraduate,3,3,Peaceful,Analyze the situation and handle it with intel...,No,4,5


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 140 entries, 0 to 139
Data columns (total 9 columns):
 #   Column                                                               Non-Null Count  Dtype 
---  ------                                                               --------------  ----- 
 0   Timestamp                                                            140 non-null    object
 1   Your Academic Stage                                                  140 non-null    object
 2   Peer pressure                                                        140 non-null    int64 
 3   Academic pressure from your home                                     140 non-null    int64 
 4   Study Environment                                                    139 non-null    object
 5   What coping strategy you use as a student?                           140 non-null    object
 6   Do you have any bad habits like smoking, drinking on a daily basis?  140 non-null    object
 7   What would you rat

None

Comparing the two `df.info()` outputs confirms that the missing value in the 'Study Environment' column was filled. Before the fill, `df.info()` reported 139 non-null entries for this column; after it, the count is 140. The `fillna('Unknown')` step therefore addressed the only missing value in that column.
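The before/after comparison can also be made explicit by counting nulls directly instead of reading them off `df.info()`. The sketch below uses a small made-up DataFrame (the `toy` frame and its values are hypothetical, not rows from the survey) that reuses the real column name.

```python
import pandas as pd

# Hypothetical stand-in for the survey data: one missing 'Study Environment'.
toy = pd.DataFrame({
    'Study Environment': ['Noisy', 'Peaceful', None, 'Peaceful'],
    'Rate your academic stress index ': [5, 3, 4, 3],
})

# Count nulls before the fill.
missing_before = int(toy['Study Environment'].isna().sum())

# Assign the filled column back to the frame.
toy['Study Environment'] = toy['Study Environment'].fillna('Unknown')

# Count nulls again to confirm the fill took effect.
missing_after = int(toy['Study Environment'].isna().sum())
print(f"missing before: {missing_before}, after: {missing_after}")
```

On the real data the same two counts would be expected to match the 139 and 140 non-null figures reported by `df.info()`.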



## Generate treemap visualization

### Subtask:
Create a treemap visualization to represent hierarchical data in the dataset.


**Reasoning**:
Import the necessary library and create the treemap visualization using the specified columns.



In [None]:
import plotly.express as px

# Fill missing values in 'Study Environment' before creating the treemap.
# Assign the result back to the column: calling fillna(..., inplace=True) on
# df[col] acts on an intermediate object and will stop working in pandas 3.0.
df['Study Environment'] = df['Study Environment'].fillna('Unknown')

fig = px.treemap(
    df,
    path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'],
    values='Rate your academic stress index ',
    title='Treemap of Academic Stress Index by Stage, Environment, and Coping Strategy',
)
fig.update_layout(margin=dict(t=50, l=25, r=25, b=25))
fig.show()


## Generate sunburst visualization

### Subtask:
Create a sunburst visualization to show hierarchical data in a radial layout.

**Reasoning**:
Generate a sunburst chart using plotly.express with the specified hierarchical path and values.

In [None]:
import plotly.express as px

fig = px.sunburst(df, path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'], values='Rate your academic stress index ',
                  title='Sunburst Chart of Academic Stress Index by Stage, Environment, and Coping Strategy')
fig.show()

## Summary:

### Data Analysis Key Findings

* The dataset was successfully loaded into a pandas DataFrame.
* The DataFrame contains columns with `object` and `int64` data types.
* Missing values were identified in the 'Study Environment' column and successfully filled with 'Unknown' to enable the treemap and sunburst visualizations.
* The treemap visualization was successfully generated showing the hierarchical distribution of academic stress index across academic stage, study environment, and coping strategy.
* The sunburst visualization was successfully generated, providing a radial representation of the same hierarchical data.

### Insights or Next Steps

* The visualizations provide a clear hierarchical view of academic stress index across different academic stages, study environments, and coping strategies. Further analysis could involve calculating and visualizing the average stress index for each combination of these categories to identify specific areas of high or low stress.
* Explore other columns in the dataset, such as 'Peer pressure', 'Academic pressure from your home', and 'What would you rate the academic competition in your student life', to understand their correlation with the 'Rate your academic stress index'. This could lead to identifying additional factors influencing academic stress.
* Consider performing statistical tests to determine if the differences in stress index across different categories are statistically significant.
* Investigate the distribution of academic stress index within each leaf node of the treemap and sunburst charts to understand the variability in stress levels within specific groups.
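The first and last suggestions above (average stress per combination, and variability within each leaf) could be sketched with a single `groupby`. The rows below are hypothetical and only illustrate the shape of the computation; the column names are the ones used in the plotting calls.

```python
import pandas as pd

path = ['Your Academic Stage', 'Study Environment',
        'What coping strategy you use as a student?']
value = 'Rate your academic stress index '  # trailing space, as in the plotting calls

# Hypothetical rows for illustration; not taken from the survey.
toy = pd.DataFrame({
    path[0]: ['undergraduate'] * 4,
    path[1]: ['Noisy', 'Noisy', 'Peaceful', 'Peaceful'],
    path[2]: ['Social support (friends, family)'] * 4,
    value: [5, 4, 3, 2],
})

# Mean, spread, and group size for every leaf of the hierarchy.
stats = (
    toy.groupby(path)[value]
       .agg(['mean', 'std', 'count'])
       .reset_index()
)
print(stats)
```

Reporting `count` next to `mean` is a deliberate choice here: a summed rating grows with the number of respondents in a group, so the mean and the group size are easier to interpret separately.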

**Reasoning**:
The error indicates that the 'Study Environment' column, which is part of the hierarchy path for the treemap, contains null values. To fix this, I will fill the missing values in that column with a placeholder value.



In [None]:
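# Assign the filled column back; inplace=True on df[col] will stop working in pandas 3.0.
df['Study Environment'] = df['Study Environment'].fillna('Unknown')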
fig = px.treemap(df, path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'], values='Rate your academic stress index ')
fig.show()

## Generate sunburst visualization

### Subtask:
Create a sunburst visualization to show hierarchical data in a radial layout.


**Reasoning**:
Generate a sunburst chart using plotly.express with the specified hierarchical path and values.



In [None]:
import plotly.express as px

fig = px.sunburst(df, path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'], values='Rate your academic stress index ')
fig.show()

## Summary:

### Data Analysis Key Findings

*   The dataset was successfully loaded into a pandas DataFrame.
*   The DataFrame contains columns with `object` and `int64` data types.
*   Missing values were identified in the 'Study Environment' column.
*   The treemap visualization was successfully generated after filling missing values in 'Study Environment' with 'Unknown'.
*   The sunburst visualization was successfully generated using the specified hierarchical path and values.

### Insights or Next Steps

*   The visualizations provide a hierarchical view of academic stress index across different academic stages, study environments, and coping strategies. Further analysis could explore the average stress index for each combination of these categories.
*   Investigate other columns in the dataset to identify additional factors that might influence academic stress levels and incorporate them into future visualizations or analyses.
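As a starting point for looking at other columns, the numeric ratings can be correlated with the stress index. The sketch below uses only the five rows visible in the `df.head()` output, so the resulting numbers are not representative of the full 140-row dataset; the variable names `numeric_cols`, `target`, and `sample` are illustrative.

```python
import pandas as pd

numeric_cols = [
    'Peer pressure',
    'Academic pressure from your home',
    'What would you rate the academic competition in your student life',
]
target = 'Rate your academic stress index '  # trailing space, as in the plotting calls

# The five rows shown by df.head(); far too few for real conclusions.
sample = pd.DataFrame({
    numeric_cols[0]: [4, 3, 1, 3, 3],
    numeric_cols[1]: [5, 4, 1, 2, 3],
    numeric_cols[2]: [3, 3, 2, 4, 4],
    target: [5, 3, 4, 3, 5],
})

# Pearson correlation of each rating with the stress index.
corr_with_target = sample[numeric_cols + [target]].corr()[target].drop(target)
print(corr_with_target.sort_values(ascending=False))
```

On the full DataFrame the same expression would apply with `df` in place of `sample`. Because the ratings are ordinal, a rank-based correlation may be the more appropriate choice.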


# Task
Load the dataset from "/content/academic Stress level - maintainance 1.csv", preprocess it, and then generate treemap and sunburst visualizations.

## Handle missing values

### Subtask:
Address any missing values in the dataset as identified during the inspection phase.


**Reasoning**:
I will fill the missing values in the 'Study Environment' column with 'Unknown' as identified in the previous inspection and instructed in the current subtask.



In [None]:
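# Assign the filled column back; inplace=True on df[col] will stop working in pandas 3.0.
df['Study Environment'] = df['Study Environment'].fillna('Unknown')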
display(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 140 entries, 0 to 139
Data columns (total 9 columns):
 #   Column                                                               Non-Null Count  Dtype 
---  ------                                                               --------------  ----- 
 0   Timestamp                                                            140 non-null    object
 1   Your Academic Stage                                                  140 non-null    object
 2   Peer pressure                                                        140 non-null    int64 
 3   Academic pressure from your home                                     140 non-null    int64 
 4   Study Environment                                                    140 non-null    object
 5   What coping strategy you use as a student?                           140 non-null    object
 6   Do you have any bad habits like smoking, drinking on a daily basis?  140 non-null    object
 7   What would you rat

None
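Calling `fillna(..., inplace=True)` on a column selection such as `df[col]` acts on an intermediate object, and pandas warns that this pattern will stop updating the original frame in pandas 3.0. Two equivalent alternatives are sketched below on a hypothetical frame (the `toy` data is made up for illustration).

```python
import pandas as pd

# Hypothetical two-column frame; values are illustrative only.
toy = pd.DataFrame({
    'Study Environment': ['Noisy', None, 'Peaceful'],
    'Peer pressure': [4, 3, 1],
})

# Option 1: assign the filled column back to the frame.
by_assignment = toy.copy()
by_assignment['Study Environment'] = (
    by_assignment['Study Environment'].fillna('Unknown')
)

# Option 2: call fillna on the frame with a {column: value} mapping.
by_mapping = toy.fillna({'Study Environment': 'Unknown'})

# Both options fill the gap and leave the other column untouched.
print(by_assignment.equals(by_mapping))
```

The mapping form is convenient when several columns need different placeholder values, since they can all go into one dictionary.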

## Summary:

### Data Analysis Key Findings

* Missing values in the 'Study Environment' column were successfully filled with 'Unknown'.

### Insights or Next Steps

* The next step is to proceed with the visualization phase now that the data has been preprocessed.


# Task
Organize the code into sections for data processing and visualization, add titles and headings to the plots, and explain and fix any errors in the code.

## Load and inspect data

### Subtask:
Load the dataset from '/content/academic Stress level - maintainance 1.csv' into a pandas DataFrame and display the first few rows along with the column information.


**Reasoning**:
Load the data into a pandas DataFrame and display the first few rows and column information as instructed.



In [None]:
import pandas as pd

df = pd.read_csv('/content/academic Stress level - maintainance 1.csv')
display(df.head())
display(df.info())

Unnamed: 0,Timestamp,Your Academic Stage,Peer pressure,Academic pressure from your home,Study Environment,What coping strategy you use as a student?,"Do you have any bad habits like smoking, drinking on a daily basis?",What would you rate the academic competition in your student life,Rate your academic stress index
0,24/07/2025 22:05:39,undergraduate,4,5,Noisy,Analyze the situation and handle it with intel...,No,3,5
1,24/07/2025 22:05:52,undergraduate,3,4,Peaceful,Analyze the situation and handle it with intel...,No,3,3
2,24/07/2025 22:06:39,undergraduate,1,1,Peaceful,"Social support (friends, family)",No,2,4
3,24/07/2025 22:06:45,undergraduate,3,2,Peaceful,Analyze the situation and handle it with intel...,No,4,3
4,24/07/2025 22:08:06,undergraduate,3,3,Peaceful,Analyze the situation and handle it with intel...,No,4,5


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 140 entries, 0 to 139
Data columns (total 9 columns):
 #   Column                                                               Non-Null Count  Dtype 
---  ------                                                               --------------  ----- 
 0   Timestamp                                                            140 non-null    object
 1   Your Academic Stage                                                  140 non-null    object
 2   Peer pressure                                                        140 non-null    int64 
 3   Academic pressure from your home                                     140 non-null    int64 
 4   Study Environment                                                    139 non-null    object
 5   What coping strategy you use as a student?                           140 non-null    object
 6   Do you have any bad habits like smoking, drinking on a daily basis?  140 non-null    object
 7   What would you rat

None

## Handle missing values

### Subtask:
Address any missing values in the dataset as identified during the inspection phase.


**Reasoning**:
Fill the missing values in the 'Study Environment' column with 'Unknown' and display the DataFrame info to confirm.



In [None]:
# Assign the filled column back; inplace=True on df[col] will stop working in pandas 3.0.
df['Study Environment'] = df['Study Environment'].fillna('Unknown')
display(df.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 140 entries, 0 to 139
Data columns (total 9 columns):
 #   Column                                                               Non-Null Count  Dtype 
---  ------                                                               --------------  ----- 
 0   Timestamp                                                            140 non-null    object
 1   Your Academic Stage                                                  140 non-null    object
 2   Peer pressure                                                        140 non-null    int64 
 3   Academic pressure from your home                                     140 non-null    int64 
 4   Study Environment                                                    140 non-null    object
 5   What coping strategy you use as a student?                           140 non-null    object
 6   Do you have any bad habits like smoking, drinking on a daily basis?  140 non-null    object
 7   What would you rat

None

## Generate treemap visualization

### Subtask:
Create a treemap visualization to represent hierarchical data in the dataset.


**Reasoning**:
Import the necessary library and create the treemap visualization using the specified columns and title.



In [None]:
import plotly.express as px

fig = px.treemap(df, path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'], values='Rate your academic stress index ',
                 title='Treemap of Academic Stress Index by Stage, Environment, and Coping Strategy')
fig.update_layout(margin = dict(t=40, l=25, r=25, b=25))
fig.show()

Based on the treemap visualization, here are some inferences:

* The treemap shows a hierarchical breakdown of the 'Rate your academic stress index' by 'Your Academic Stage', 'Study Environment', and 'What coping strategy you use as a student?'.
* You can see the distribution of the academic stress index across different categories. For example, you can observe which academic stages, study environments, and coping strategies are associated with higher or lower stress index values.
* The size of each rectangle in the treemap represents the sum of the 'Rate your academic stress index' for that particular category or combination of categories. Larger rectangles indicate a higher aggregate stress index.
* By hovering over or clicking on the rectangles, you can get more detailed information about the specific categories and their corresponding stress index values.
To provide more specific inferences, I would need to analyze the visual output of the treemap and
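To see the numbers behind the rectangle sizes, the aggregation described above can be reproduced with a `groupby` over the path columns. This is a sketch on made-up rows: the `toy` frame, the 'post-graduate (hypothetical)' stage, and 'Strategy B (hypothetical)' are placeholders. It assumes, as stated above, that each rectangle's size is the sum of the `values` column over its group.

```python
import pandas as pd

path = ['Your Academic Stage', 'Study Environment',
        'What coping strategy you use as a student?']
value = 'Rate your academic stress index '  # trailing space, as in the plotting calls

# Hypothetical rows for illustration; not taken from the survey.
toy = pd.DataFrame({
    path[0]: ['undergraduate', 'undergraduate', 'undergraduate',
              'post-graduate (hypothetical)'],
    path[1]: ['Noisy', 'Peaceful', 'Peaceful', 'Peaceful'],
    path[2]: ['Social support (friends, family)',
              'Social support (friends, family)',
              'Social support (friends, family)',
              'Strategy B (hypothetical)'],
    value: [5, 3, 4, 2],
})

# Leaf rectangles: one sum per full path.
leaf_sizes = toy.groupby(path)[value].sum()

# Top-level rectangles: one sum per academic stage.
stage_sizes = toy.groupby(path[0])[value].sum()

print(leaf_sizes)
print(stage_sizes)
```

Because the sizes are sums of ratings, a large rectangle can reflect many respondents rather than high individual stress, which is worth keeping in mind when reading the chart.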

## Generate sunburst visualization

### Subtask:
Create a sunburst visualization to show hierarchical data in a radial layout.


**Reasoning**:
Generate a sunburst chart using plotly.express with the specified hierarchical path and values and set the title.



In [None]:
fig = px.sunburst(df, path=['Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?'], values='Rate your academic stress index ',
                  title='Sunburst Chart of Academic Stress Index by Stage, Environment, and Coping Strategy')
fig.show()

**Sunburst Visualization Inference:**

The sunburst chart visually breaks down the 'Rate your academic stress index' hierarchically by 'Your Academic Stage', 'Study Environment', and 'What coping strategy you use as a student?'. The size of each segment corresponds to the sum of the stress index for that specific group. This allows for identifying combinations of academic stage, study environment, and coping strategy that correlate with higher or lower aggregate stress levels. Examining segment proportions helps understand how stress is distributed across these factors, revealing key areas for further investigation into academic stress drivers.

Based on the sunburst visualization, here are some inferences:

The sunburst chart provides a radial view of the hierarchical data, starting from the center with 'Your Academic Stage', then branching out to 'Study Environment', and finally to 'What coping strategy you use as a student?'.

The size of each segment in the sunburst chart represents the sum of the 'Rate your academic stress index' for that particular category or combination of categories. Larger segments indicate a higher aggregate stress index.

By examining the different segments, you can observe how the academic stress index is distributed across various academic stages, study environments, and coping strategies. For instance, you might be able to identify which combinations of these factors are associated with higher or lower stress levels.

Hovering over or clicking on the segments in the interactive chart will provide more detailed information about the specific categories and their corresponding stress index values.

To provide more specific inferences, I would need to analyze the visual output of the sunburst chart, but generally, it allows you to:

* Understand the proportion of the total academic stress index contributed by each academic stage.
* See how the stress index is distributed across different study environments within each academic stage.
* Observe the distribution of the stress index among different coping strategies within each study environment and academic stage.

This visualization helps in understanding the complex relationships between these factors and academic stress.
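The proportions that the sunburst rings convey can also be computed directly. The sketch below, on hypothetical rows (the `toy` frame and the 'post-graduate (hypothetical)' stage are made up), derives each stage's share of the total and each environment's share within its stage.

```python
import pandas as pd

stage = 'Your Academic Stage'
env = 'Study Environment'
value = 'Rate your academic stress index '  # trailing space, as in the plotting calls

# Hypothetical rows for illustration; not taken from the survey.
toy = pd.DataFrame({
    stage: ['undergraduate', 'undergraduate', 'undergraduate',
            'post-graduate (hypothetical)'],
    env: ['Noisy', 'Peaceful', 'Peaceful', 'Peaceful'],
    value: [5, 3, 4, 4],
})

# Inner ring: each stage's share of the overall summed stress index.
stage_totals = toy.groupby(stage)[value].sum()
stage_share = stage_totals / stage_totals.sum()

# Second ring: each environment's share within its parent stage.
env_totals = toy.groupby([stage, env])[value].sum()
env_share_within_stage = env_totals / env_totals.groupby(level=0).transform('sum')

print(stage_share)
print(env_share_within_stage)
```

The same pattern extends one level further by adding the coping-strategy column to the `groupby` keys and normalising within the first two levels.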

## Summarize findings

### Subtask:
Summarize the key findings from the data analysis and visualizations, and outline potential next steps for further investigation.


## Summary:

### Data Analysis Key Findings

* The dataset contains 140 entries and 9 columns, including 'Your Academic Stage', 'Study Environment', 'What coping strategy you use as a student?', and 'Rate your academic stress index '.
* There was one missing value in the 'Study Environment' column, which was successfully imputed with 'Unknown'.
* Treemap and sunburst visualizations were successfully generated, showing the distribution of the academic stress index across academic stage, study environment, and coping strategies.
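The stress-index column name is written with a trailing space in the plotting calls ('Rate your academic stress index '), which is easy to mistype. A small check like the one below can list such names and, optionally, strip them. The `toy` frame is a hypothetical stand-in for the real DataFrame.

```python
import pandas as pd

# Hypothetical one-row frame that mimics the padded column name.
toy = pd.DataFrame({
    'Study Environment': ['Noisy'],
    'Rate your academic stress index ': [5],
})

# Column names that carry leading or trailing whitespace.
padded = [name for name in toy.columns if name != name.strip()]
print(padded)

# Optional clean-up: strip whitespace from every column name.
cleaned = toy.rename(columns=str.strip)
print(list(cleaned.columns))
```

If the names are stripped this way, the `values=` argument in the treemap and sunburst calls would need to use the stripped name as well.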

### Insights or Next Steps

* The visualizations provide a hierarchical view of how the summed academic stress index is distributed across academic stage, study environment, and coping strategy.
* Future analysis could explore correlations between the academic stress index and the other rating columns in the dataset, such as 'Peer pressure' and 'Academic pressure from your home', to understand potential factors contributing to stress levels.
