<a href="https://colab.research.google.com/github/EtyangRichard/Karamoja-Project/blob/main/Richard__Etyang_Karamoja_Project_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Project Overview
This project involves creating an interactive visualization tool that provides insights into sorghum and maize crop yields in Karamoja, Uganda, based on satellite imagery. The tool will serve as a prototype for a more comprehensive food security monitoring system intended for use by NGOs operating in the region.

#Background
Karamoja, located in northeastern Uganda, is renowned as the most food-insecure region in the country. The primary contributors to this food insecurity are recurrent droughts and frequent pest and disease outbreaks, which significantly diminish crop productivity. Despite the efforts of various non-governmental organizations (NGOs) that offer technical support and farm inputs to improve yields, there is a substantial lack of comprehensive, real-time data on the overall agricultural conditions in the region. These NGOs often rely on localized information sources to direct their interventions.

This project aims to provide a foundational visualization tool that will aid in more informed decision-making and resource allocation for improving food security in Karamoja.

#Objectives

To find out which district had the highest total maize and sorghum production.

To find out how the crop area in hectares affects the total production of maize and sorghum in each district.

To find out the relationship between maize production and sorghum production in each district.

To find out the correlation between the population and the total maize and sorghum production of each district.

#Research Questions
Which district in Karamoja recorded the highest total production of maize and sorghum during the 2017 crop season?

How does the area of land dedicated to maize and sorghum (measured in hectares) influence the total production of these crops in each district?

What is the relationship between maize production and sorghum production across different districts in Karamoja?

What is the correlation between the population size of each district and the total production of maize and sorghum within those districts?
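
As a rough illustration of how the first and third questions could be approached in pandas, the sketch below uses only the five district rows that appear in the `df.head()` output of this notebook, so it covers five of the seven districts and is not a final answer. The `Total_Prod` column and the variable names are assumptions introduced here for illustration; "total production" is assumed to mean sorghum plus maize production.

```python
import pandas as pd

# Five of the seven district rows, copied from the df.head() output in this notebook.
districts = pd.DataFrame({
    "NAME": ["ABIM", "AMUDAT", "KAABONG", "KOTIDO", "MOROTO"],
    "S_Prod_Tot": [1471506, 609552, 5731830, 16631904, 606944],
    "M_Prod_Tot": [1922567, 3545558, 6987723, 2010575, 422468],
})

# Assumption: "total production" is sorghum production plus maize production.
districts["Total_Prod"] = districts["S_Prod_Tot"] + districts["M_Prod_Tot"]

# Question 1: rank the districts by total production, largest first.
ranked = districts.sort_values("Total_Prod", ascending=False).reset_index(drop=True)
top_district = ranked.loc[0, "NAME"]

# Question 3: Pearson correlation between maize and sorghum production.
maize_sorghum_r = districts["M_Prod_Tot"].corr(districts["S_Prod_Tot"])

print(ranked[["NAME", "Total_Prod"]])
print("Top district in this sample:", top_district)
print("Maize vs sorghum correlation:", round(maize_sorghum_r, 3))
```

With only a handful of districts, a correlation coefficient is a rough descriptive summary rather than strong statistical evidence.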



#The Data
I will use the two data sets below. The final visualizations are generated in Tableau from one of them, the cleaned ```Uganda_Karamoja_Subcounty_Crop_Yield_Population.csv``` data.

Uganda_Karamoja_District_Crop_Yield_Population.csv

Uganda_Karamoja_Subcounty_Crop_Yield_Population.csv


In [2]:
#Importing Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns



In [3]:
#Loading Data sets
df = pd.read_csv('/content/Uganda_Karamoja_District_Crop_Yield_Population.csv')    # district level
df1 = pd.read_csv('/content/Uganda_Karamoja_Subcounty_Crop_Yield_Population.csv')  # sub-county level

#Exploring the Data sets

In [4]:
#Displaying the first five rows of the two data sets
print(df.head())
print(df1.head())


#Displaying the last five rows of the two data sets
print(df.tail())
print(df1.tail())

#Displaying the shape (rows, columns) of the two data sets
print(df.shape)
print(df1.shape)


   OBJECTID     NAME     POP        Area  S_Yield_Ha  M_Yield_Ha  \
0        92     ABIM   90385  2771977106         449        1040   
1        96   AMUDAT  101790  1643582836         205        1297   
2        20  KAABONG  627057  7373606003         279         945   
3        85   KOTIDO  243157  3641539808         331        1148   
4         5   MOROTO  127811  3570160948         128         355   

   Crop_Area_Ha     S_Area_Ha    M_Area_Ha  S_Prod_Tot  M_Prod_Tot  
0   5470.068394   3277.295971  1848.621855     1471506     1922567  
1   5765.443719   2973.423860  2733.661014      609552     3545558  
2  28121.672530  20544.194960  7394.416334     5731830     6987723  
3  53032.649450  50247.443900  1751.372284    16631904     2010575  
4   5954.814048   4741.748776  1190.050606      606944      422468  
   OBJECTID       SUBCOUNTY_NAME DISTRICT_NAME    POP        Area Karamoja  \
0       263              KACHERI        KOTIDO  17244  1067176155        Y   
1       264          

In [5]:
#summary statistics
print(df.describe())
print(df1.describe())

#summary of the dataframe
print(df.info())
print(df1.info())


        OBJECTID            POP          Area  S_Yield_Ha   M_Yield_Ha  \
count   7.000000       7.000000  7.000000e+00    7.000000     7.000000   
mean   61.714286  214943.571429  3.960853e+09  269.285714   986.142857   
std    36.481567  188604.280916  1.781860e+09  119.243049   321.566700   
min     5.000000   90385.000000  1.643583e+09  128.000000   355.000000   
25%    37.000000  114800.500000  3.171069e+09  171.000000   899.500000   
50%    80.000000  146780.000000  3.641540e+09  279.000000  1040.000000   
75%    88.500000  205391.000000  4.362553e+09  343.500000  1206.000000   
max    96.000000  627057.000000  7.373606e+09  449.000000  1297.000000   

       Crop_Area_Ha     S_Area_Ha    M_Area_Ha    S_Prod_Tot    M_Prod_Tot  
count      7.000000      7.000000     7.000000  7.000000e+00  7.000000e+00  
mean   21094.520379  16737.636651  3983.947082  4.873098e+06  4.085632e+06  
std    17363.854165  16625.963460  2678.911441  5.743724e+06  2.877188e+06  
min     5470.068394   297

In [6]:
print(df.columns)
print(df1.columns)


Index(['OBJECTID', 'NAME', 'POP', 'Area', 'S_Yield_Ha', 'M_Yield_Ha',
       'Crop_Area_Ha', 'S_Area_Ha', 'M_Area_Ha', 'S_Prod_Tot', 'M_Prod_Tot'],
      dtype='object')
Index(['OBJECTID', 'SUBCOUNTY_NAME', 'DISTRICT_NAME', 'POP', 'Area',
       'Karamoja', 'S_Yield_Ha', 'M_Yield_Ha', 'Crop_Area_Ha', 'S_Area_Ha',
       'M_Area_Ha', 'S_Prod_Tot', 'M_Prod_Tot'],
      dtype='object')


#DATA CLEANING


#Handling Missing Values

In [7]:
#Identifying Missing Values
print(df.isnull().sum())
print(df1.isnull().sum())
#Replacing missing values with 0
df.fillna(0, inplace=True)
df1.fillna(0, inplace=True)


OBJECTID        0
NAME            0
POP             0
Area            0
S_Yield_Ha      0
M_Yield_Ha      0
Crop_Area_Ha    0
S_Area_Ha       0
M_Area_Ha       0
S_Prod_Tot      0
M_Prod_Tot      0
dtype: int64
OBJECTID          0
SUBCOUNTY_NAME    0
DISTRICT_NAME     0
POP               0
Area              0
Karamoja          0
S_Yield_Ha        0
M_Yield_Ha        0
Crop_Area_Ha      0
S_Area_Ha         0
M_Area_Ha         0
S_Prod_Tot        0
M_Prod_Tot        0
dtype: int64
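
Neither data set has missing values, so `fillna(0)` changes nothing here. If values were missing, though, filling a yield with 0 would record it as zero production and pull averages down. The sketch below illustrates this with an artificial example: the `toy` frame and the deliberately blanked AMUDAT yield are made up for illustration and are not part of the real data.

```python
import numpy as np
import pandas as pd

# Artificial example: the AMUDAT sorghum yield is deliberately set to missing.
toy = pd.DataFrame({
    "NAME": ["ABIM", "AMUDAT", "KAABONG"],
    "S_Yield_Ha": [449.0, np.nan, 279.0],
})

# Filling with 0 treats the missing yield as a true zero and lowers the mean.
mean_zero_filled = toy["S_Yield_Ha"].fillna(0).mean()

# pandas skips NaN by default, so the mean uses only the observed values.
mean_skipping_nan = toy["S_Yield_Ha"].mean()

print("Mean after fillna(0):", mean_zero_filled)
print("Mean ignoring NaN:   ", mean_skipping_nan)
```

Whether to fill, drop, or keep missing values depends on the analysis; this only shows why 0 is not a neutral choice for yield columns.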


#Handling Duplicates.
Since there are no missing values, the next step is to check for duplicates and remove any that are present.

In [8]:
#Identify and remove Duplicates.
print(df.duplicated().sum())
print(df1.duplicated().sum())
#Removing duplicated rows
df.drop_duplicates(inplace=True)
df1.drop_duplicates(inplace=True)



0
0
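
`duplicated()` with no arguments only flags rows that match in every column. A district entered twice with a different `OBJECTID` would not be caught, so it can be worth checking a key column as well. The sketch below is a minimal illustration: the third row (with the hypothetical `OBJECTID` 97) is an artificial duplicate added for the example and does not exist in the real data.

```python
import pandas as pd

# Two real rows from the district data plus one artificial duplicate of AMUDAT.
sample = pd.DataFrame({
    "OBJECTID": [92, 96, 97],  # 97 is a hypothetical ID for the artificial row
    "NAME": ["ABIM", "AMUDAT", "AMUDAT"],
    "POP": [90385, 101790, 101790],
})

# Exact-row check: no row matches another in every column.
exact_duplicates = sample.duplicated().sum()

# Key-based check: the same district name appears more than once.
name_duplicates = sample.duplicated(subset=["NAME"]).sum()

# Keep the first occurrence of each district name.
deduplicated = sample.drop_duplicates(subset=["NAME"], keep="first")

print(exact_duplicates, name_duplicates, len(deduplicated))
```

For the sub-county data, a combination such as `DISTRICT_NAME` and `SUBCOUNTY_NAME` may be a more suitable key, assuming sub-county names can repeat across districts.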


#Saving the Cleaned Data

In [60]:
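# Save the cleaned district-level DataFrame to a new CSV file.
# The sub-county DataFrame (df1) is saved separately as 'final_data.csv' in the next cell;
# writing it here as well would overwrite the district data in the same file.
df.to_csv('cleaned_data.csv', index=False)
print("\nCleaned data has been saved to 'cleaned_data.csv'.")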




Cleaned data has been saved to 'cleaned_data.csv'.


In [61]:
#Saving dataset 2 (df1) to a new CSV file
df1.to_csv('final_data.csv', index=False)
print("\nfinal data has been saved to 'final_data.csv'.")


final data has been saved to 'final_data.csv'.


#Conclusions
Karamoja is regarded as the most famine-prone region in Uganda because of its low crop yields, which are caused by frequent droughts, diseases, and pest attacks. Together, these factors reduce food production and therefore increase food insecurity in the region.

Although many NGOs offer technical support and farm inputs, a major drawback remains: limited overall visibility of the agricultural situation in Karamoja. Because of overreliance on localized information, strategies and programs to support the region are not well prioritized or targeted. This highlights the need for a proper food security monitoring tool that gives a broader and more accurate picture of agriculture in the region.

Potential for Future Development: This first mockup sets the foundation for a more elaborate Food Security Monitoring tool. Feedback from this initial phase will be important in designing a tool that captures the full requirements of efficient food security management in Karamoja.

#Recommendations
The Input Should Be Directly Proportional to the Output: The volume of investment in sorghum and maize production should be assessed against the output generated by each district. NGOs can focus on investing in maize inputs in Kotido and sorghum inputs in Kaabong and Nakipiripirit.

Invest in buying land: The data show that the larger the crop area, the higher the total production of maize and sorghum.
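
One way to check the area and production relationship is a per-crop Pearson correlation. The sketch below uses only the five district rows visible in the `df.head()` output, so it is a partial check under that assumption; the variable names are introduced here for illustration. It also tests whether total production in these rows equals yield per hectare multiplied by crop area.

```python
import numpy as np
import pandas as pd

# Five of the seven district rows, copied from the df.head() output in this notebook.
districts = pd.DataFrame({
    "NAME": ["ABIM", "AMUDAT", "KAABONG", "KOTIDO", "MOROTO"],
    "S_Yield_Ha": [449, 205, 279, 331, 128],
    "M_Yield_Ha": [1040, 1297, 945, 1148, 355],
    "S_Area_Ha": [3277.295971, 2973.423860, 20544.194960, 50247.443900, 4741.748776],
    "M_Area_Ha": [1848.621855, 2733.661014, 7394.416334, 1751.372284, 1190.050606],
    "S_Prod_Tot": [1471506, 609552, 5731830, 16631904, 606944],
    "M_Prod_Tot": [1922567, 3545558, 6987723, 2010575, 422468],
})

# Pearson correlation between crop area and total production, per crop.
sorghum_r = districts["S_Area_Ha"].corr(districts["S_Prod_Tot"])
maize_r = districts["M_Area_Ha"].corr(districts["M_Prod_Tot"])

# Check whether production matches yield x area, allowing for rounding.
sorghum_matches = np.allclose(
    districts["S_Yield_Ha"] * districts["S_Area_Ha"], districts["S_Prod_Tot"], atol=1
)
maize_matches = np.allclose(
    districts["M_Yield_Ha"] * districts["M_Area_Ha"], districts["M_Prod_Tot"], atol=1
)

print("Sorghum area vs production r:", round(sorghum_r, 3))
print("Maize area vs production r:  ", round(maize_r, 3))
print("Production equals yield x area:", sorghum_matches, maize_matches)
```

In these rows production equals yield times area, so part of the link between area and production holds by construction. Yield per hectare is the other factor, which suggests that raising yields on existing land is worth weighing alongside buying more land.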

Create employment opportunities on farms for the large population in Karamoja to increase production: The data suggest that districts with larger populations have higher production. However, Kotido District has the highest production even though it has a relatively low population. This suggests that, if its population were larger, its production would be higher still.
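
The population claim can be examined in a similar way. The sketch below again uses only the five district rows visible in the `df.head()` output, so it is a partial check. The `Total_Prod` and `Prod_Per_Person` columns are assumptions introduced here for illustration, with total production taken as sorghum plus maize.

```python
import pandas as pd

# Five of the seven district rows, copied from the df.head() output in this notebook.
districts = pd.DataFrame({
    "NAME": ["ABIM", "AMUDAT", "KAABONG", "KOTIDO", "MOROTO"],
    "POP": [90385, 101790, 627057, 243157, 127811],
    "S_Prod_Tot": [1471506, 609552, 5731830, 16631904, 606944],
    "M_Prod_Tot": [1922567, 3545558, 6987723, 2010575, 422468],
})

# Assumption: total production is sorghum plus maize production.
districts["Total_Prod"] = districts["S_Prod_Tot"] + districts["M_Prod_Tot"]

# Pearson correlation between district population and total production.
pop_prod_r = districts["POP"].corr(districts["Total_Prod"])

# Production per person separates district size from productivity.
districts["Prod_Per_Person"] = districts["Total_Prod"] / districts["POP"]
top_per_person = districts.loc[districts["Prod_Per_Person"].idxmax(), "NAME"]

print("Population vs total production r:", round(pop_prod_r, 3))
print(districts[["NAME", "POP", "Total_Prod", "Prod_Per_Person"]])
print("Highest production per person in this sample:", top_per_person)
```

A positive correlation alone does not show that a larger population causes higher production, so the per-person figure is a useful companion when comparing districts of very different sizes.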