<a href="https://colab.research.google.com/github/swamysrinivasarao-ops/ssgn/blob/main/ssgn2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Task
Analyze the sensor data from "/content/sensor_data_practice.csv" to calculate average temperatures per sensor, identify and count anomalous readings (outside 18°C-28°C), determine the sensor location with the most anomalies, visualize the anomaly distribution, and provide a comprehensive summary report of the key findings.

## Load and Preview Data

### Subtask:
Load the sensor data from the 'sensor_data_practice.csv' file into a pandas DataFrame and display the first few rows, its information, and descriptive statistics to understand its structure and content.


**Reasoning**:
The subtask requires loading the data, displaying its first few rows, its information, and descriptive statistics. I will use a single code block to import pandas, read the CSV, and then apply the `.head()`, `.info()`, and `.describe()` methods to the DataFrame.



In [1]:
import pandas as pd

df = pd.read_csv('/content/sensor_data_practice.csv')

print("First 5 rows of the DataFrame:")
print(df.head())

print("\nDataFrame Information:")
df.info()

print("\nDescriptive Statistics:")
print(df.describe())

First 5 rows of the DataFrame:
  Sensor_ID  Reading_Number  Temperature_C Location            Timestamp
0        S1               1      22.671458    Lab-A  2025-01-01 00:00:00
1        S2               2      22.513979    Lab-B  2025-01-01 00:15:00
2        S3               3      21.807690    Lab-C  2025-01-01 00:30:00
3        S4               4      24.917800    Lab-D  2025-01-01 00:45:00
4        S5               5      23.678449    Lab-E  2025-01-01 01:00:00

DataFrame Information:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 5 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Sensor_ID       100 non-null    object 
 1   Reading_Number  100 non-null    int64  
 2   Temperature_C   100 non-null    float64
 3   Location        100 non-null    object 
 4   Timestamp       100 non-null    object 
dtypes: float64(1), int64(1), object(3)
memory usage: 4.0+ KB

Descriptive Statistics:
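The `DataFrame Information` output above lists `Timestamp` with dtype `object`, meaning the values are stored as strings. A minimal sketch of converting such a column to datetimes follows; the two-row `sample` frame is a hypothetical stand-in for the real file and only mirrors the timestamp format shown in the preview.

```python
import pandas as pd

# Hypothetical two-row sample mirroring the Timestamp format in the preview.
sample = pd.DataFrame({
    "Sensor_ID": ["S1", "S2"],
    "Timestamp": ["2025-01-01 00:00:00", "2025-01-01 00:15:00"],
})

# Convert the string column to datetime64 so time-based operations work.
sample["Timestamp"] = pd.to_datetime(sample["Timestamp"])

print(sample.dtypes)

# Gap between the two consecutive readings.
interval = sample["Timestamp"].diff().iloc[1]
print(interval)
```

On the notebook's data, the equivalent call would presumably be `df['Timestamp'] = pd.to_datetime(df['Timestamp'])`.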
  

## Calculate Average Temperature per Sensor

### Subtask:
Calculate the average temperature for each unique sensor based on the 'Temperature_C' column.


**Reasoning**:
The subtask requires calculating the average temperature for each unique sensor. This can be achieved by grouping the DataFrame by 'Sensor_ID' and then computing the mean of the 'Temperature_C' column. The result will be stored in a new variable.



In [2]:
average_temperatures_per_sensor = df.groupby('Sensor_ID')['Temperature_C'].mean()
print("Average Temperature per Sensor:")
print(average_temperatures_per_sensor)

Average Temperature per Sensor:
Sensor_ID
S1    23.103596
S2    23.411689
S3    23.098647
S4    23.330205
S5    23.037844
Name: Temperature_C, dtype: float64
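The mean alone can hide extreme values, so it may help to view the minimum, maximum, and reading count next to it. The sketch below uses the same `groupby` pattern with `.agg()`; the `sample` frame and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical readings, not taken from sensor_data_practice.csv.
sample = pd.DataFrame({
    "Sensor_ID": ["S1", "S1", "S2", "S2"],
    "Temperature_C": [22.0, 24.0, 21.0, 25.0],
})

# One row per sensor, one column per statistic.
summary = sample.groupby("Sensor_ID")["Temperature_C"].agg(
    ["mean", "min", "max", "count"]
)
print(summary)
```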


## Identify Anomalous Readings

### Subtask:
Identify all readings where the 'Temperature_C' is outside the safe range (less than 18°C or greater than 28°C).


**Reasoning**:
The subtask requires identifying anomalous temperature readings by filtering the 'df' DataFrame based on the given temperature range (less than 18°C or greater than 28°C) and storing them in a new DataFrame named 'anomalous_readings'. Then, I will print the head of the new DataFrame.



In [3]:
anomalous_readings = df[(df['Temperature_C'] < 18) | (df['Temperature_C'] > 28)]

print("Anomalous Readings:")
print(anomalous_readings.head())

Anomalous Readings:
Empty DataFrame
Columns: [Sensor_ID, Reading_Number, Temperature_C, Location, Timestamp]
Index: []
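The same filter can be written with `Series.between`, which makes the treatment of the boundary values explicit: 18°C and 28°C themselves count as safe. This is a sketch on a hypothetical `sample` frame. One difference worth noting: negating `between` also flags missing values as anomalies, whereas the comparison-based filter above silently skips them. The dataset here has no missing values, so both approaches should agree.

```python
import pandas as pd

LOW, HIGH = 18.0, 28.0  # safe range from the task statement

# Hypothetical values chosen to sit on and around the boundaries.
sample = pd.DataFrame({"Temperature_C": [17.9, 18.0, 23.5, 28.0, 28.1]})

# between() is inclusive on both ends by default.
in_range = sample["Temperature_C"].between(LOW, HIGH)

# Anomalies are the rows that are NOT in range.
anomalies = sample[~in_range]
print(anomalies)
```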


### Subtask:
Count the total number of anomalous readings and display the count.

**Reasoning**:
The subtask requires counting the total number of anomalous readings. This can be achieved by getting the number of rows in the `anomalous_readings` DataFrame using `len()` and then printing this count.



In [4]:
num_anomalies = len(anomalous_readings)
print(f"Total number of anomalous readings: {num_anomalies}")

Total number of anomalous readings: 0
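A count of zero can be cross-checked by looking at the extremes directly: if the minimum is at least 18 and the maximum is at most 28, no reading can be anomalous. The sketch below illustrates the check on a hypothetical `sample` frame whose values are rounded from the preview rows.

```python
import pandas as pd

# Hypothetical subset; values rounded from the first preview rows.
sample = pd.DataFrame({"Temperature_C": [22.7, 22.5, 21.8, 24.9, 23.7]})

t_min = sample["Temperature_C"].min()
t_max = sample["Temperature_C"].max()

# True only when every reading lies inside the safe range.
no_anomalies = bool(t_min >= 18 and t_max <= 28)
print(f"min={t_min}, max={t_max}, all within range: {no_anomalies}")
```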


### Determine Sensor Location with Most Anomalies

**Note**: As no anomalous readings were identified (total count: 0), it is not possible to determine a sensor location with the most anomalies. We will proceed to summarize the findings based on the data available.

## Provide a Comprehensive Summary Report of Key Findings

### Subtask:
Based on the analysis performed so far, provide a comprehensive summary report of the key findings, including average temperatures per sensor and the outcome of the anomaly detection.

### Summary Report of Key Findings

Based on the analysis of the sensor data:

1.  **Average Temperatures per Sensor:**
    The average temperatures for each sensor are as follows:
    - S1: 23.10°C
    - S2: 23.41°C
    - S3: 23.10°C
    - S4: 23.33°C
    - S5: 23.04°C

2.  **Anomaly Detection:**
    - No anomalous temperature readings were identified in the dataset. All recorded temperatures fell within the defined safe range of 18°C to 28°C.
    - The total count of anomalous readings is 0.
    - Consequently, it was not possible to determine a sensor location with the most anomalies as none were found.

## Count Anomalies per Sensor

### Subtask:
Count the number of anomalous readings for each sensor ID.


**Reasoning**:
The subtask requires counting anomalous readings per sensor ID. This can be achieved by grouping the 'anomalous_readings' DataFrame by 'Sensor_ID' and then using `.size()` or `.count()` to get the number of anomalies for each sensor. I will then store the result in 'anomalies_per_sensor' and print it.



In [5]:
anomalies_per_sensor = anomalous_readings.groupby('Sensor_ID').size()

print("Number of anomalous readings per sensor:")
print(anomalies_per_sensor)

Number of anomalous readings per sensor:
Series([], dtype: int64)
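An empty Series gives no per-sensor view at all. If an explicit zero for each sensor is preferred, the grouped counts can be reindexed against the full list of sensor IDs. The helper name `counts_per_sensor` and the sample frames below are hypothetical, introduced only for this sketch.

```python
import pandas as pd

def counts_per_sensor(frame, low=18.0, high=28.0):
    """Return anomaly counts for every sensor, including sensors with none."""
    anomalous = frame[(frame["Temperature_C"] < low) | (frame["Temperature_C"] > high)]
    all_sensors = sorted(frame["Sensor_ID"].unique())
    # reindex adds the sensors missing from the grouped result, filled with 0.
    return anomalous.groupby("Sensor_ID").size().reindex(all_sensors, fill_value=0)

# Hypothetical frame with no anomalies, like the dataset analyzed here.
clean = pd.DataFrame({
    "Sensor_ID": ["S1", "S2", "S3"],
    "Temperature_C": [22.0, 23.5, 23.0],
})
clean_counts = counts_per_sensor(clean)
print(clean_counts)

# Hypothetical frame with one out-of-range reading on S2.
mixed = clean.assign(Temperature_C=[22.0, 29.5, 23.0])
mixed_counts = counts_per_sensor(mixed)
print(mixed_counts)
```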


## Identify Location with Most Anomalies

### Subtask:
Determine which sensor location has the highest total number of anomalies.


### Determine Location with Most Anomalies

**Note**: As previously identified, no anomalous readings were found in the dataset (the `anomalous_readings` DataFrame is empty, and `anomalies_per_sensor` is an empty Series). Therefore, it is not possible to determine a sensor location with the highest number of anomalies, as there are no anomalies to count.

This finding will be incorporated into the final summary report.
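For datasets that do contain anomalies, the lookup could be written so that it also handles the empty case cleanly. The following is a sketch under that assumption; the function name `location_with_most_anomalies` and the sample frames are hypothetical.

```python
import pandas as pd

def location_with_most_anomalies(anomalous):
    """Return the location with the most anomalous rows, or None if there are none."""
    counts = anomalous["Location"].value_counts()
    if counts.empty:
        return None  # nothing to rank
    return counts.idxmax()

# Empty case, matching the result of this analysis.
empty = pd.DataFrame(columns=["Sensor_ID", "Temperature_C", "Location"])
print(location_with_most_anomalies(empty))

# Hypothetical non-empty case: Lab-B has two anomalies, Lab-A has one.
some = pd.DataFrame({
    "Sensor_ID": ["S1", "S2", "S2"],
    "Temperature_C": [29.1, 17.2, 30.4],
    "Location": ["Lab-A", "Lab-B", "Lab-B"],
})
print(location_with_most_anomalies(some))
```

Returning `None` rather than raising keeps the summary step simple: the caller can check for `None` and report that no location could be determined.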

## Visualize Anomaly Distribution

### Subtask:
Generate a bar chart showing the count of anomalies per sensor, with appropriate labels and a legend.


**Reasoning**:
Since no anomalous readings were found in the previous steps, a bar chart for anomaly distribution cannot be generated. I will add a code block to check the number of anomalies and print a message if none are found, as per the instructions.



In [6]:
import matplotlib.pyplot as plt

if num_anomalies == 0:
    print("No anomalous readings were identified, so a bar chart for anomaly distribution cannot be generated.")
else:
    plt.figure(figsize=(10, 6))
    anomalies_per_sensor.plot(kind='bar', color='skyblue')
    plt.title('Distribution of Anomalous Readings per Sensor')
    plt.xlabel('Sensor ID')
    plt.ylabel('Number of Anomalies')
    plt.xticks(rotation=45)
    plt.grid(axis='y', linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.show()


No anomalous readings were identified, so a bar chart for anomaly distribution cannot be generated.


## Generate Summary Report

### Subtask:
Create a summary report that includes the average temperature per sensor, the total count of anomalies, the list of sensors with anomalies, and the location with the most anomalies.


### Summary Report of Key Findings

Based on the analysis of the sensor data:

1.  **Average Temperatures per Sensor:**
    The average temperatures for each sensor are as follows:
    - S1: 23.10°C
    - S2: 23.41°C
    - S3: 23.10°C
    - S4: 23.33°C
    - S5: 23.04°C

2.  **Anomaly Detection:**
    - The total number of anomalous readings identified is **0**.
    - No anomalous temperature readings were detected for any sensor. All recorded temperatures fell within the defined safe range of 18°C to 28°C.
    - Consequently, it was not possible to determine a sensor location with the most anomalies as no anomalies were found.

## Final Task

### Subtask:
Provide a comprehensive summary of the sensor data analysis, including key findings regarding temperature averages, anomalies, and problem locations.


## Summary:

### Data Analysis Key Findings

*   The dataset, `sensor_data_practice.csv`, contains 100 entries across 5 columns (`Sensor_ID`, `Reading_Number`, `Temperature_C`, `Location`, `Timestamp`) with no missing values.
*   The average temperatures calculated for each sensor are:
    *   S1: 23.10°C
    *   S2: 23.41°C
    *   S3: 23.10°C
    *   S4: 23.33°C
    *   S5: 23.04°C
*   **No anomalous temperature readings were identified** in the dataset. All 100 recorded temperatures fell within the defined safe range of 18°C to 28°C.
*   Consequently, the total count of anomalous readings is 0.
*   Due to the absence of anomalies, it was not possible to determine a sensor location with the most anomalies or to generate a visualization of anomaly distribution.

### Insights or Next Steps

*   All 100 analyzed readings fall within the specified temperature range (18°C-28°C), so the data show no temperature deviations for any sensor during the sampled period.
*   Continue monitoring sensor data for future anomalies and consider implementing real-time alerting for temperature breaches if the system's operational parameters change or if a narrower "normal" range is needed.
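The alerting idea in the last point could start as a simple per-reading threshold check. The sketch below is illustrative only: the function name `check_reading`, its parameters, and the message format are assumptions, and a real deployment would need a delivery mechanism (email, dashboard, and so on) that this notebook does not cover.

```python
def check_reading(sensor_id, temperature_c, low=18.0, high=28.0):
    """Return an alert message for an out-of-range reading, or None if it is safe."""
    if temperature_c < low:
        return f"ALERT {sensor_id}: {temperature_c:.1f}°C is below {low:.1f}°C"
    if temperature_c > high:
        return f"ALERT {sensor_id}: {temperature_c:.1f}°C is above {high:.1f}°C"
    return None  # boundary values count as safe

# Hypothetical incoming readings.
for sensor_id, temp in [("S1", 22.7), ("S2", 28.4), ("S3", 17.5)]:
    message = check_reading(sensor_id, temp)
    if message is not None:
        print(message)
```

Passing narrower `low` and `high` values would cover the case where a tighter "normal" range is needed.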
