MACHINE LEARNING BASED LIVESTOCK HEALTH MONITORING AND MANAGEMENT SYSTEM

This project monitors and manages livestock using datasets that cover a range of livestock diseases, which are used to train our machine learning models.
This makes it possible to predict and detect livestock diseases early, before they become difficult to deal with.

Author: Ahmed Aisha

Dept: Computer Science

Matric: 20/47cs/01303

In [7]:
# importing all the modules and packages that are used for building the models
from sklearn.model_selection import train_test_split  # used to split our datasets into training and test sets
from sklearn.linear_model import LogisticRegression   # a binary classifier that is used to determine if the animal is healthy or not
from sklearn.naive_bayes import MultinomialNB         # used to predict the type of disease that is affecting the livestock
# the functions below are used to convert all categorical data into numerical data
from sklearn.preprocessing import OneHotEncoder
from sklearn.pipeline import make_pipeline
from sklearn.compose import make_column_transformer

import pandas as pd
import math

In [8]:
# first we load both datasets and inspect them before preprocessing
# preprocessing later converts all categorical data into its encoded format using
# the OneHotEncoder class
diseaseDataset = pd.read_csv('../../dataset/animal_disease_dataset.csv')
animalHealthConditionDataset = pd.read_csv('../../dataset/data.csv')
  
print(diseaseDataset.head(10).to_string())  # showing the output of the disease dataset
print()
print(animalHealthConditionDataset.tail(10).to_string())  # showing the output of the animal health dataset which is used to determine if the livestock is healthy or not
print()
print("animal health dataset size {}".format(animalHealthConditionDataset.shape))
print("animal disease dataset size {}".format(diseaseDataset.shape))

    Animal  Age  Temperature           Symptom 1           Symptom 2           Symptom 3         Disease
0      cow    3        103.1          depression      painless lumps    loss of appetite       pneumonia
1  buffalo   13        104.5      painless lumps    loss of appetite          depression     lumpy virus
2    sheep    1        100.5          depression      painless lumps    loss of appetite     lumpy virus
3      cow   14        100.3    loss of appetite    swelling in limb     crackling sound        blackleg
4    sheep    2        103.6      painless lumps    loss of appetite          depression       pneumonia
5     goat   10        101.2    loss of appetite    blisters on gums  difficulty walking  foot and mouth
6    sheep    6        103.3    loss of appetite          depression      painless lumps     lumpy virus
7     goat    6        101.7  difficulty walking  blisters on tongue    loss of appetite  foot and mouth
8  buffalo    9        102.5          depression      p

Preprocessing both datasets to make them suitable for training our machine learning models

In [9]:
def make_clean(dataFrame: pd.DataFrame) -> pd.DataFrame:
    # drop every row that contains at least one null value
    return dataFrame.dropna(axis='index')

# cleaning the two datasets using the above function
diseaseDataset = make_clean(diseaseDataset)
animalHealthConditionDataset = make_clean(animalHealthConditionDataset)

# splitting our datasets into training and testing sets, which are used to train and evaluate our models
diseaseTrain, diseaseTest, diseaseTrainLabel, diseaseTestLabel = train_test_split(
    diseaseDataset.loc[:, ['Animal', 'Age', 'Temperature', 'Symptom 1', 'Symptom 2', 'Symptom 3', ]],
    diseaseDataset.loc[:, ['Disease']], train_size=.75, random_state=5
)

# converting Yes/No to 1/0 before passing the labels to train_test_split
healthLabel = animalHealthConditionDataset['Dangerous'].replace({"Yes": 1, "No": 0})
healthX = animalHealthConditionDataset.iloc[:, :6]

healthTrain, healthTest, healthLabelTrain, healthLabelTest = train_test_split(
    healthX, healthLabel, train_size=.75, random_state=5
)
# print(healthLabel)
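
As a minimal sketch of what the cleaning and splitting steps above do, the following uses a small made-up frame (the `toy` data is hypothetical and is not taken from the project's CSV files). It drops incomplete rows and then holds out 25% of the remaining rows for testing.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the real project reads its data from CSV files.
toy = pd.DataFrame({
    "Animal": ["cow", "goat", "sheep", None, "cow", "goat", "sheep", "buffalo", "cow"],
    "Age": [3, 10, 1, 4, 14, 6, 2, 13, 5],
    "Disease": ["pneumonia", "foot and mouth", "lumpy virus", "blackleg",
                "blackleg", "foot and mouth", "pneumonia", "lumpy virus", "pneumonia"],
})

# dropna(axis="index") removes every row that has at least one missing value.
clean = toy.dropna(axis="index")

# train_size=0.75 keeps 75% of the rows for training;
# random_state makes the shuffle repeatable between runs.
X_train, X_test, y_train, y_test = train_test_split(
    clean[["Animal", "Age"]], clean["Disease"], train_size=0.75, random_state=5
)

print(len(toy), len(clean), len(X_train), len(X_test))
```

Because the rows are shuffled before splitting, a small or unbalanced dataset can end up with a disease that appears only in the test portion. Passing `stratify=` to `train_test_split` is one possible way to keep class proportions similar in both parts.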


After the datasets have been split, the next step is to convert all categorical data within them to numerical data.
This allows the machine learning models to make predictions and classifications without errors.

In [10]:
animalHealthColumnTransformer = make_column_transformer(
    (OneHotEncoder(), ['AnimalName', 'symptoms1', 'symptoms2', 'symptoms3', 'symptoms4', 'symptoms5']),
    remainder='passthrough'
)
animalDiseaseColumnTransformer = make_column_transformer(
    (OneHotEncoder(), ['Animal', 'Symptom 1', 'Symptom 2', 'Symptom 3']), remainder="passthrough"
)
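
To illustrate what these transformers produce, here is a minimal sketch on a made-up frame (the `toy` and `unseen` data are hypothetical). Each categorical column is expanded into one 0/1 column per distinct value, while `remainder='passthrough'` leaves the numeric column unchanged. The sketch also passes two options that the cell above does not use. `handle_unknown='ignore'` encodes a category that was absent during fitting as all zeros instead of raising an error, which may matter when the test rows contain values the training rows never had. `sparse_threshold=0` is used here only to force a dense array that is easy to print.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data that reuses two of the disease dataset's column names.
toy = pd.DataFrame({
    "Animal": ["cow", "goat", "cow"],
    "Symptom 1": ["depression", "loss of appetite", "depression"],
    "Age": [3, 10, 14],
})

transformer = make_column_transformer(
    # Assumption: handle_unknown="ignore" is not set in the original cell.
    (OneHotEncoder(handle_unknown="ignore"), ["Animal", "Symptom 1"]),
    remainder="passthrough",
    sparse_threshold=0,  # always return a dense NumPy array
)

encoded = transformer.fit_transform(toy)
# 2 animal categories + 2 symptom categories + 1 passthrough column = 5 columns
print(encoded.shape)

# "sheep" was never seen during fitting, so both Animal columns stay at zero.
unseen = pd.DataFrame({"Animal": ["sheep"], "Symptom 1": ["depression"], "Age": [2]})
row = transformer.transform(unseen)
print(row)
```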

Now that the datasets have been preprocessed, they can be used to train the machine learning models.
The machine learning models used are:

NAIVE BAYES ALGORITHM

LOGISTIC REGRESSION MODEL

In [11]:
# selecting the model that will be used for classifying an animal as healthy or not and
# the model that will be used for predicting farm animal diseases

logreg = LogisticRegression(solver="lbfgs")
naiveClassifier = MultinomialNB(alpha=0.1)

# creating pipelines that pass the data through the encoder and then into the corresponding
# machine learning algorithm
animalHealthPipeline = make_pipeline(animalHealthColumnTransformer, logreg)
animalDiseasePipeline = make_pipeline(animalDiseaseColumnTransformer, naiveClassifier)

# fitting the training datasets to our machine learning pipelines
animalDiseasePipeline.fit(diseaseTrain, diseaseTrainLabel)
animalHealthPipeline.fit(healthTrain, healthLabelTrain)


  y = column_or_1d(y, warn=True)
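
The `column_or_1d` line above appears to be the tail of scikit-learn's `DataConversionWarning`, which is issued when the labels are passed as a single-column table instead of a 1-D array. Here the likely source is `diseaseTrainLabel`, because it was selected with `.loc[:, ['Disease']]` and is therefore a one-column DataFrame. A minimal sketch of the difference, using a hypothetical toy frame and a hypothetical helper named `fit_warnings`:

```python
import warnings

import pandas as pd
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data; only the column names mirror the disease dataset.
toy = pd.DataFrame({
    "Age": [1, 2, 9, 10],
    "Temperature": [100, 101, 104, 105],
    "Disease": ["blackleg", "blackleg", "pneumonia", "pneumonia"],
})
X = toy[["Age", "Temperature"]]

y_frame = toy.loc[:, ["Disease"]]  # one-column DataFrame, shape (4, 1)
y_series = toy["Disease"]          # 1-D Series, shape (4,)


def fit_warnings(y):
    """Fit a classifier on (X, y) and return the names of the warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        MultinomialNB().fit(X, y)
    return [w.category.__name__ for w in caught]


print(fit_warnings(y_frame))                 # the 2-D label triggers the warning
print(fit_warnings(y_series))                # the 1-D label does not
print(fit_warnings(y_frame.values.ravel()))  # flattening also avoids it
```

The fitted model is the same either way, since scikit-learn flattens the labels itself. Selecting the label as `diseaseDataset['Disease']` or calling `.values.ravel()` only silences the warning.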


CHECKING THE MODEL ACCURACY LEVEL USING THE TRAINING DATASET AND THE TEST DATASET
CODE EVALUATION

In [12]:
print("Training Accuracy for Animal Disease model is {}%".format(round(animalDiseasePipeline.score(diseaseTrain, diseaseTrainLabel) * 100)))
print("Testing Accuracy for Animal Disease model is {}%".format(round(animalDiseasePipeline.score(diseaseTest, diseaseTestLabel) * 100)))

print()
print("Training Accuracy for Animal Health Condition model is {}%".format(round(animalHealthPipeline.score(healthTrain, healthLabelTrain) * 100)))
# print("Testing Accuracy for Animal Health Condition model is {}%".format(round(animalHealthPipeline.score(healthTest, healthLabelTest) * 100)))

Training Accuracy for Animal Disease model is 84%
Testing Accuracy for Animal Disease model is 83%

Training Accuracy for Animal Health Condition model is 99%
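
For a classifier pipeline, `score` returns plain accuracy: the fraction of rows whose predicted label equals the true label. The minimal sketch below shows that equivalence on hypothetical toy data (the symptom values are made up; only the column names follow the health dataset). It also prints a per-class `classification_report`, which can reveal weak classes that a single accuracy figure hides. Neither `accuracy_score` nor `classification_report` is used in the original cells; they are shown as an assumption about what further evaluation might look like.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data with made-up symptom values.
toy = pd.DataFrame({
    "AnimalName": ["cow", "goat", "cow", "sheep", "goat", "sheep"],
    "symptoms1": ["fever", "cough", "fever", "cough", "fever", "cough"],
    "Dangerous": [1, 0, 1, 0, 1, 0],
})
X, y = toy[["AnimalName", "symptoms1"]], toy["Dangerous"]

pipeline = make_pipeline(
    make_column_transformer((OneHotEncoder(), ["AnimalName", "symptoms1"])),
    LogisticRegression(solver="lbfgs"),
)
pipeline.fit(X, y)

predictions = pipeline.predict(X)
# Accuracy computed by hand: share of predictions that match the true labels.
manual_accuracy = (predictions == y).mean()

print(pipeline.score(X, y), accuracy_score(y, predictions), manual_accuracy)
print(classification_report(y, predictions, zero_division=0))
```

Scoring on the training rows, as in this sketch, measures how well the model fits data it has already seen. The held-out test rows give the more reliable estimate of how it will behave on new animals.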


TESTING AND MAKING PREDICTIONS USING THE TRAINED MODEL


In [13]:
import pickle