Skip to content

This project estimates the % of body fat in a humans body based on parameters such ass weight, hip size, chest size and more.

Notifications You must be signed in to change notification settings

lawalsegun2025/body_fat_estimator

Repository files navigation

Body Fat Estimator

Table of Content

Demo

body_fat_estimator.mp4

Overview

This project model estimates the percentage of body fat in a human's body based on parameters such as weight, height, hip size and many more.

Motivation

Your body fat percentage is a value that tells you how much of your body weight is made up of fat. In terms of your overall health, your body fat percentage can be one of the most useful numbers available to you, more than how much you weigh and even more than your Body Mass Index (BMI).

Problem Solving Steps

  1. Load the dataset into a pandas Data Frame
  2. Perform Exploratory Data Analysis on the data
  3. Select the appropriate Machine Learning algorithmand fine tune hyperparameters
  4. Save model in a pickle file
  5. Integrate model with the User Interface using flask web framework
  6. Deploy the webapp in a cloud platform

About Dataset

Dataset can be downloaded here.

Context

Lists estimates of the percentage of body fat determined by underwater weighing and various body circumference measurements for 252 men. Educational use of the dataset

This data set can be used to illustrate multiple regression techniques. Accurate measurement of body fat is inconvenient/costly and it is desirable to have easy methods of estimating body fat that are not inconvenient/costly. Content

The variables listed below, from left to right, are:

  1. Density determined from underwater weighing
  2. Percent body fat from Siri's (1956) equation
  3. Age (years)
  4. Weight (lbs)
  5. Height (inches)
  6. Neck circumference (cm)
  7. Chest circumference (cm)
  8. Abdomen 2 circumference (cm)
  9. Hip circumference (cm)
  10. Thigh circumference (cm)
  11. Knee circumference (cm)
  12. Ankle circumference (cm)
  13. Biceps (extended) circumference (cm)
  14. Forearm circumference (cm)
  15. Wrist circumference (cm)

(Measurement standards are apparently those listed in Benhke and Wilmore (1974), pp. 45-48 where, for instance, the abdomen 2 circumference is measured "laterally, at the level of the iliac crests, and anteriorly, at the umbilicus".)

These data are used to produce the predictive equations for lean body weight given in the abstract "Generalized body composition prediction equation for men using simple measurement techniques", K.W. Penrose, A.G. Nelson, A.G. Fisher, FACSM, Human Performance Research Center, Brigham Young University, Provo, Utah 84602 as listed in Medicine and Science in Sports and Exercise, vol. 17, no. 2, April 1985, p. 189. (The predictive equation were obtained from the first 143 of the 252 cases that are listed below).

More details

A variety of popular health books suggest that the readers assess their health, at least in part, by estimating their percentage of body fat. In Bailey (1994), for instance, the reader can estimate body fat from tables using their age and various skin-fold measurements obtained by using a caliper. Other texts give predictive equations for body fat using body circumference measurements (e.g. abdominal circumference) and/or skin-fold measurements. See, for instance, Behnke and Wilmore (1974), pp. 66-67; Wilmore (1976), p. 247; or Katch and McArdle (1977), pp. 120-132).

The percentage of body fat for an individual can be estimated once body density has been determined. Folks (e.g. Siri (1956)) assume that the body consists of two components - lean body tissue and fat tissue. Letting:

D = Body Density (gm/cm^3)
A = proportion of lean body tissue
B = proportion of fat tissue (A+B=1)
a = density of lean body tissue (gm/cm^3)
b = density of fat tissue (gm/cm^3)

we have:

D = 1/[(A/a) + (B/b)]

solving for B we find:

B = (1/D)*[ab/(a-b)] - [b/(a-b)].

Using the estimates a=1.10 gm/cm^3 and b=0.90 gm/cm^3 (see Katch and McArdle (1977), p. 111 or Wilmore (1976), p. 123) we come up with "Siri's equation":

Percentage of Body Fat (i.e. 100*B) = 495/D - 450.

Volume, and hence body density, can be accurately measured a variety of ways. The technique of underwater weighing "computes body volume as the difference between body weight measured in air and weight measured during water submersion. In other words, body volume is equal to the loss of weight in water with the appropriate temperature correction for the water's density" (Katch and McArdle (1977), p. 113). Using this technique,

Body Density = WA/[(WA-WW)/c.f. - LV]

where:

WA = Weight in air (kg)
WW = Weight in water (kg)
c.f. = Water correction factor (=1 at 39.2 deg F as one-gram of water occupies exactly one cm^3 at this temperature, =.997 at 76-78 deg F)
LV = Residual Lung Volume (liters)

(Katch and McArdle (1977), p. 115). Other methods of determining body volume are given in Behnke and Wilmore (1974), p. 22 ff. Source

The data were generously supplied by Dr. A. Garth Fisher who gave permission to freely distribute the data and use for non-commercial purposes.

Roger W. Johnson Department of Mathematics & Computer Science South Dakota School of Mines & Technology 501 East St. Joseph Street Rapid City, SD 57701

email address: rwjohnso@silver.sdsmt.edu web address: http://silver.sdsmt.edu/~rwjohnso References

Bailey, Covert (1994). Smart Exercise: Burning Fat, Getting Fit, Houghton-Mifflin Co., Boston, pp. 179-186.

Behnke, A.R. and Wilmore, J.H. (1974). Evaluation and Regulation of Body Build and Composition, Prentice-Hall, Englewood Cliffs, N.J.

Siri, W.E. (1956), "Gross composition of the body", in Advances in Biological and Medical Physics, vol. IV, edited by J.H. Lawrence and C.A. Tobias, Academic Press, Inc., New York.

Katch, Frank and McArdle, William (1977). Nutrition, Weight Control, and Exercise, Houghton Mifflin Co., Boston.

Wilmore, Jack (1976). Athletic Training and Physical Fitness: Physiological Principles of the Conditioning Process, Allyn and Bacon, Inc., Boston.

Data Cleaning Techniques

The dataset used was already cleaned.

Exploratory Data Analysis

The following two functions was used to visualize KDE Distribution Plots and Distribution analysis.

KDE Distribution Plots

# define a distribution function
def plot_displots(col):
    
    plt.figure(figsize=(10, 7))
    sns.kdeplot(df["BodyFat"], color="magenta", 
                label="Bodyfat")
    sns.kdeplot(df[col], color="red", 
                label=col)
    plt.legend();
    plt.show()
    
cols =list(df.columns)
for i in cols:
    print(f"Distribution plots for {i} feature is shown below")
    plot_displots(i);
    print("."*100);

distribution analysis

# function that plots the distribution
def draw_plots(df, col):
    
    plt.figure(figsize=(20, 7))
    plt.subplot(1, 3, 1)
    plt.hist(df[col], color="magenta")
    
    plt.subplot(1, 3, 2)
    stats.probplot(df[col], dist="norm", plot=plt)
    
    plt.subplot(1, 3, 3)
    sns.boxplot(df[col], color="magenta")
    
    plt.show()
    
cols = list(df.columns)
for i in range(len(cols)):
    
    print(f"Distribution plots for the feature {cols[i]} are shown below ")
    
    draw_plots(df, cols[i])
    
    print("."*100)

Feature Selection

Two feature selection techniques were used;

  • ExtraTree Regressor
  • Mutual Information Regressor

Variance Inflation Factor was also used to measure how much the behaviour (variance) of an independent variable is influenced or inflated by its interaction/correlation with the other independent variables.

After carrying out the feature selection techniques we selected 5 features out of the initial 15 features. The features selected includes;

  1. Density
  2. Abdomen
  3. Chest
  4. Weight and
  5. Hip

Model Building

Two models were built;

  • DecisionTree Regressor and
  • RandomForestRegressor

The decision tree was too large so we decided to prune the tree by explorint the cost_complexity_pruning_path. We were able to get an optimal ccp_alpha value of 1.

The graph below shows the accuracy vs alpha value of both the testing and training data.

Model Performance

DecisionTree Regressor model had an accurcy of 92.7% while the RandomForest Regressor Model had an accuracy of 96.2%.

After attempting to improve the model performance by Hyperparameter tunning using RandomizedSerachCV we obsered that there was a drop in the performance of the tuned model, so we decided to go with the initial model parameters.

About

This project estimates the % of body fat in a humans body based on parameters such ass weight, hip size, chest size and more.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages