We chose to explore the ObesityDataSet, a dataset of survey responses describing individuals' eating habits and physical condition. The questions asked in the survey are listed below:

1. What is your gender? (Male/Female)
2. What is your age?
3. What is your height (in meters)?
4. What is your weight (in kilograms)?
5. Has a family member suffered or suffers from overweight? (Yes/No)
6. Do you eat high caloric food frequently? (Yes/No)
7. Do you usually eat vegetables in your meals? (Never/Sometimes/Always)
8. How many main meals do you have daily? (1-2, 3, 3+)
9. Do you eat any food between meals? (No/Sometimes/Frequently/Always)
10. Do you smoke? (Yes/No)
11. How much water do you drink daily (in liters)? (<1L, 1-2L, >2L)
12. How often do you have physical activity (days)? (0, 1-2, 2-4, 4-5)
13. How much time do you use technological devices such as cell phone, videogames, television, computer, and others (hours)? (0-2, 3-5, >5)
14. How often do you drink alcohol? (Do not drink/Sometimes/Frequently/Always)
15. Which transportation do you usually use? (Automobile/Motorbike/Bike/Public Transportation/Walking)

*From Palechor & Manotas (2019)*

Palechor, F. M., & Manotas, A. de. (2019). Dataset for estimation of obesity levels based on eating habits and physical condition in individuals from Colombia, Peru and Mexico. Data in Brief, 25, 104344. https://doi.org/10.1016/j.dib.2019.104344

In [1]:
library(tidyverse)
library(repr)
library(tidymodels)

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
✔ ggplot2 3.4.2     ✔ purrr   1.0.1
✔ tibble  3.2.1     ✔ dplyr   1.1.1
✔ tidyr   1.3.0     ✔ stringr 1.5.0
✔ readr   2.1.3     ✔ forcats 0.5.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
── Attaching packages ────────────────────────────────────── tidymodels 1.0.0 ──

✔ broom        1.0.2     ✔ rsample      1.1.1
✔ dials        1.1.0     ✔ tune         1.0.1
✔ infer        1.0.4     ✔ workflows    1.1.2

In [2]:
obesity <- read_csv("data/obesity.csv")

Rows: 2111 Columns: 17
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (9): Gender, family_history_with_overweight, FAVC, CAEC, SMOKE, SCC, CAL...
dbl (8): Age, Height, Weight, FCVC, NCP, CH2O, FAF, TUE

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
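
The message above suggests specifying the column types. As a minimal sketch of what that could look like, the block below reads a small inline CSV with an explicit `col_types` specification, so that categorical columns arrive as factors and no message is printed. The three rows in `csv_text` and the name `sample_rows` are hypothetical stand-ins for illustration; they are not taken from `data/obesity.csv`.

```r
library(readr)

# Hypothetical three-row excerpt standing in for the real file
csv_text <- "Gender,Age,CAEC\nFemale,21,Sometimes\nMale,23,Frequently\nMale,27,Sometimes\n"

# I() tells read_csv() to treat the string as literal data, not a file path
sample_rows <- read_csv(
  I(csv_text),
  col_types = cols(
    Gender = col_factor(),  # levels are taken in order of appearance
    Age    = col_double(),
    CAEC   = col_factor()
  )
)

sample_rows
```

The same `col_types` argument could presumably be passed when reading the full file, which would make a later `as_factor()` step unnecessary for those columns.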


In [6]:
# Convert two of the categorical columns to factors
obesity <- obesity |>
    mutate(CAEC = as_factor(CAEC),
           Gender = as_factor(Gender))

# Replace the abbreviated survey column names with descriptive ones
obesity_names <- obesity |>
    rename(obesity_level = NObeyesdad,
           gender = Gender,
           age = Age,
           height = Height,
           weight = Weight,
           high_caloric_freq = FAVC,
           eat_veg_w_meal = FCVC,
           main_meals_daily = NCP,
           food_btw_meals = CAEC,
           smoker = SMOKE,
           water = CH2O,
           monitor_calories = SCC,
           physical_freq = FAF,
           screen_time = TUE,
           alcohol = CALC,
           transportation_mode = MTRANS)

# Preview the first rows of the renamed data
head(obesity_names, 10)

gender,age,height,weight,family_history_with_overweight,high_caloric_freq,eat_veg_w_meal,main_meals_daily,food_btw_meals,smoker,water,monitor_calories,physical_freq,screen_time,alcohol,transportation_mode,obesity_level
<fct>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<dbl>,<dbl>,<fct>,<chr>,<dbl>,<chr>,<dbl>,<dbl>,<chr>,<chr>,<chr>
Female,21,1.62,64.0,yes,no,2,3,Sometimes,no,2,no,0,1,no,Public_Transportation,Normal_Weight
Female,21,1.52,56.0,yes,no,3,3,Sometimes,yes,3,yes,3,0,Sometimes,Public_Transportation,Normal_Weight
Male,23,1.80,77.0,yes,no,2,3,Sometimes,no,2,no,2,1,Frequently,Public_Transportation,Normal_Weight
Male,27,1.80,87.0,no,no,3,3,Sometimes,no,2,no,2,0,Frequently,Walking,Overweight_Level_I
Male,22,1.78,89.8,no,no,2,1,Sometimes,no,2,no,0,0,Sometimes,Public_Transportation,Overweight_Level_II
Male,29,1.62,53.0,no,yes,2,3,Sometimes,no,2,no,0,0,Sometimes,Automobile,Normal_Weight
Female,23,1.50,55.0,yes,yes,3,3,Sometimes,no,2,no,1,0,Sometimes,Motorbike,Normal_Weight
Male,22,1.64,53.0,no,no,2,3,Sometimes,no,2,no,3,0,Sometimes,Public_Transportation,Normal_Weight
Male,24,1.78,64.0,yes,yes,3,3,Sometimes,no,2,no,1,1,Frequently,Public_Transportation,Normal_Weight
Male,22,1.72,68.0,yes,yes,2,3,Sometimes,no,2,no,1,1,no,Public_Transportation,Normal_Weight
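
In the preview above, several categorical columns (for example `smoker` and `alcohol`) are still stored as `<chr>`. If they are needed as factors later, one possible approach is to convert every character column in a single step with `across()`. This is only a sketch: the tibble `toy` and the result `toy_factors` are hypothetical and use a handful of made-up rows with the same column names.

```r
library(dplyr)
library(forcats)

# Hypothetical rows that mimic a few columns of the renamed data
toy <- tibble(
  gender  = factor(c("Female", "Male")),
  age     = c(21, 23),
  smoker  = c("no", "yes"),
  alcohol = c("Sometimes", "Frequently")
)

# Convert every character column to a factor; other columns are left untouched
toy_factors <- toy |>
  mutate(across(where(is.character), as_factor))

toy_factors
```

`as_factor()` keeps the levels in order of first appearance, so an ordered scale such as the alcohol frequency would still need its levels set explicitly if the order matters.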
