
Creating the Dataset:
The synthetic data is generated with uniformly distributed random values for Hunger_Index (20 to 50), Food_Production_Index (80 to 120), and Undernourishment (10 to 30 percent). No random seed is set, so the values differ on every run.
The data covers three fictional countries over five years (2018-2022), for 15 rows in total.

Loading and Inspecting the Data:
The dataset is inspected directly from the in-memory DataFrame (first rows, summary statistics, and a missing-value check). A CSV copy is also saved for later use.

Filtering and Visualization:
The data is filtered for Country A and visualized as a line plot of the Hunger Index over the years.

In [None]:
import pandas as pd
import numpy as np

# Define the data
data = {
    'Country': ['Country A', 'Country B', 'Country C'] * 5,
    'Year': [2018, 2018, 2018, 2019, 2019, 2019, 2020, 2020, 2020, 2021, 2021, 2021, 2022, 2022, 2022],
    'Hunger_Index': np.random.uniform(20, 50, 15),  # Random hunger index values between 20 and 50
    'Food_Production_Index': np.random.uniform(80, 120, 15),  # Random food production index values
    'Undernourishment': np.random.uniform(10, 30, 15)  # Random percentage of undernourished population
}

# Convert to DataFrame
df = pd.DataFrame(data)

# Display the dataset
print(df)

# Save the dataset as a CSV file (optional, for later use)
df.to_csv('synthetic_food_security_data.csv', index=False)
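
Because the cell above draws fresh random values on each run, any numbers quoted later cannot be reproduced exactly. A minimal sketch of a seeded variant follows. The function name `make_synthetic_data` and the seed value 42 are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd


def make_synthetic_data(seed=42):
    """Build the 15-row synthetic dataset with a fixed random seed.

    Hypothetical helper: the name and default seed are assumptions.
    """
    rng = np.random.default_rng(seed)  # seeded generator, so results repeat
    countries = ['Country A', 'Country B', 'Country C']
    years = [2018, 2019, 2020, 2021, 2022]
    n_rows = len(countries) * len(years)
    return pd.DataFrame({
        'Country': countries * len(years),         # A, B, C, A, B, C, ...
        'Year': np.repeat(years, len(countries)),  # 2018 x3, 2019 x3, ...
        'Hunger_Index': rng.uniform(20, 50, n_rows),
        'Food_Production_Index': rng.uniform(80, 120, n_rows),
        'Undernourishment': rng.uniform(10, 30, n_rows),
    })


seeded_df = make_synthetic_data(seed=42)
print(seeded_df.head())
```

Calling the function twice with the same seed returns identical frames, while a different seed gives a different dataset with the same layout.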


In [None]:
# Display the first few rows of the dataset
print("Dataset successfully created and loaded:")
print(df.head())

# Example Analysis: Display summary statistics
print("\nSummary Statistics:")
print(df.describe())

# Check for missing values
print("\nMissing Values in Each Column:")
print(df.isnull().sum())
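
The inspection above works on the in-memory DataFrame. If the saved CSV is used in a later session, it can be read back with `pd.read_csv` and checked the same way. The sketch below is self-contained, so it uses a small stand-in frame named `sample` and a temporary directory; both are assumptions made for illustration only.

```python
import os
import tempfile

import pandas as pd

# Small stand-in frame so the example runs on its own; in the notebook, use df.
sample = pd.DataFrame({
    'Country': ['Country A', 'Country B', 'Country C'],
    'Year': [2018, 2018, 2018],
    'Hunger_Index': [31.5, 42.25, 27.0],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, 'synthetic_food_security_data.csv')
    sample.to_csv(csv_path, index=False)  # same call as in the notebook
    loaded = pd.read_csv(csv_path)        # load the file back from disk

# Inspect the loaded copy: column types and missing values per column
print(loaded.dtypes)
print(loaded.isnull().sum())
```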


In [None]:
import matplotlib.pyplot as plt

# Filter the data for 'Country A'
filtered_data = df[df['Country'] == 'Country A']

# Display the filtered data
print("\nFiltered Data for Country A:")
print(filtered_data)

# Visualize the Hunger Index over the years for 'Country A'

plt.figure(figsize=(10, 6))
plt.plot(filtered_data['Year'], filtered_data['Hunger_Index'], marker='o')
plt.xticks(filtered_data['Year'])  # Show whole years only on the x-axis
plt.title('Hunger Index over Years for Country A')
plt.xlabel('Year')
plt.ylabel('Hunger Index')
plt.grid(True)
plt.show()


1. Comparing Hunger Index Across Countries

The Hunger Index for all three countries can be compared on a single plot, with one line per country:

In [None]:
import matplotlib.pyplot as plt

# Plot Hunger Index for all countries over the years
plt.figure(figsize=(12, 8))

for country in df['Country'].unique():
    country_data = df[df['Country'] == country]
    plt.plot(country_data['Year'], country_data['Hunger_Index'], marker='o', label=country)

plt.title('Hunger Index Over the Years for All Countries')
plt.xlabel('Year')
plt.ylabel('Hunger Index')
plt.xticks(sorted(df['Year'].unique()))  # Show whole years only on the x-axis
plt.grid(True)
plt.legend()
plt.show()
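
As an alternative to looping over countries, the long-format table can be reshaped so that each country becomes a column. The sketch below uses `DataFrame.pivot` on a small stand-in frame; the names `sample` and `wide` and the values in it are illustrative assumptions, not data from the notebook.

```python
import pandas as pd

# Stand-in data with the same layout as df (long format: one row per country-year)
sample = pd.DataFrame({
    'Country': ['Country A', 'Country B', 'Country C'] * 2,
    'Year': [2018, 2018, 2018, 2019, 2019, 2019],
    'Hunger_Index': [30.0, 40.0, 25.0, 32.0, 38.0, 27.0],
})

# Reshape to wide format: one row per year, one column per country
wide = sample.pivot(index='Year', columns='Country', values='Hunger_Index')
print(wide)
```

With matplotlib installed, `wide.plot(marker='o')` would then draw one line per country in a single call.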


In [None]:
# Plot Food Production Index for all countries over the years
plt.figure(figsize=(12, 8))

for country in df['Country'].unique():
    country_data = df[df['Country'] == country]
    plt.plot(country_data['Year'], country_data['Food_Production_Index'], marker='s', label=country)

plt.title('Food Production Index Over the Years for All Countries')
plt.xlabel('Year')
plt.ylabel('Food Production Index')
plt.xticks(sorted(df['Year'].unique()))  # Show whole years only on the x-axis
plt.grid(True)
plt.legend()
plt.show()


In [None]:
import seaborn as sns

# Calculate the correlation matrix for numerical columns only
# (Year is numeric, so it is included alongside the three indicators)
numerical_columns = df.select_dtypes(include=[np.number])
correlation_matrix = numerical_columns.corr()

# Display the correlation matrix
print("Correlation Matrix:")
print(correlation_matrix)

# Visualize the correlation matrix using a heatmap

plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
plt.title('Correlation Matrix Heatmap')
plt.show()



Interpretation of the Correlation Matrix:
The values below come from one run of the notebook. Because no random seed is set and each indicator is drawn independently, the exact numbers change on every run, and with only 15 rows, correlations of this size are expected from chance alone.

Year:
Correlation with Hunger_Index (0.097): Very low, indicating essentially no linear relationship between the year and the hunger index in this synthetic dataset.
Correlation with Food_Production_Index (-0.227): A weak negative correlation. The food production index is slightly lower in later years, but the relationship is not strong.
Correlation with Undernourishment (0.205): A weak positive correlation. Undernourishment is slightly higher in later years, but again the relationship is weak.

Hunger_Index:
Correlation with Food_Production_Index (0.234): A weak positive correlation. Higher food production is loosely associated with a higher hunger index in this run. In real data this would be counterintuitive and might point to unequal distribution or inefficiencies between production and consumption; here it reflects the random generation.
Correlation with Undernourishment (-0.339): A negative correlation. As the hunger index increases, undernourishment tends to decrease, which is not typical in real-world scenarios. This is the largest magnitude among the values listed here, but it is still weak and is an artifact of the random data rather than evidence of a real relationship.

Food_Production_Index:
Correlation with Undernourishment (0.233): A weak positive correlation. Higher food production is slightly associated with higher undernourishment. In real data this could imply that production increases do not necessarily improve nutritional outcomes, for example because of distribution issues; in this dataset it is noise.

Undernourishment:
The correlations with other variables have already been discussed above.

Visualizing the Correlations:
The heatmap provides a visual summary of these relationships. In real data, you would typically expect a negative correlation between food production and hunger, and a positive one between hunger and undernourishment. Since this dataset is synthetic and random, the relationships do not follow real-world patterns.
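
To see how large a correlation can arise purely by chance in a dataset of this size, the simulation below repeatedly draws two independent uniform columns of 15 values and records their Pearson correlation. It is a rough sketch: the seed, the number of trials, and all variable names are assumptions chosen for illustration, and the threshold 0.339 is simply the largest magnitude reported above.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
n_obs = 15          # same number of rows as the synthetic dataset
n_trials = 20_000   # number of simulated pairs of columns (assumed value)

# Draw independent uniform columns, mirroring how the dataset was generated
x = rng.uniform(20, 50, size=(n_trials, n_obs))
y = rng.uniform(10, 30, size=(n_trials, n_obs))

# Pearson correlation for each simulated pair, computed row by row
x_centered = x - x.mean(axis=1, keepdims=True)
y_centered = y - y.mean(axis=1, keepdims=True)
r = (x_centered * y_centered).sum(axis=1) / np.sqrt(
    (x_centered ** 2).sum(axis=1) * (y_centered ** 2).sum(axis=1)
)

# Share of trials whose correlation is at least as extreme as -0.339
share_large = np.mean(np.abs(r) >= 0.339)
print(f"Mean r: {r.mean():.3f}")
print(f"Share of trials with |r| >= 0.339: {share_large:.2f}")
```

If a sizeable share of trials reaches that magnitude, the correlations in the matrix cannot be distinguished from noise, which is consistent with how the data was generated.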