# Diet classification of participants

Study participants filled in REDCap survey forms, which differ for children and their caretakers. Both forms contain multiple-choice questions that specify the diet a participant adheres to.

This code contains every interpretation of survey answers used to group participants into diet categories.


## Form for Children
The same row of check boxes was presented for both the home diet and the daycare diet:


- No special diet or avoidance diet
- Lactose-free or low-lactose diet
- Gluten-free diet (avoiding wheat, rye and barley)
- Diet that does not include red meat
- Vegan diet (contains no animal-derived products)
- Vegetarian diet that includes one or more of the following animal products: fish, eggs and/or dairy products
- Dietary restrictions for religious reasons
- Other diet

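Selected boxes are encoded as `1`s and unchecked boxes as `0`s. The code below concatenates the eight check boxes in the order listed above, which gives one eight-character string of `0`s and `1`s for the home diet and one for the daycare diet. Every combination found in the data is mapped to a diet category according to the research group's interpretation.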
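
To make the encoding concrete, the following minimal sketch decodes a tick string back into the options it represents. The `OPTIONS` list and the `decode_ticks` helper are hypothetical names introduced for illustration only, and the short English labels stand in for the check boxes listed above.

```python
# Hypothetical helper, not part of the original pipeline.
# Short labels for the eight check boxes, in the order they appear on the form.
OPTIONS = [
    'no special diet',
    'lactose-free or low-lactose',
    'gluten-free',
    'no red meat',
    'vegan',
    'vegetarian',
    'religious restrictions',
    'other diet',
]


def decode_ticks(ticks):
    """Return the labels of the boxes ticked in an eight-character 0/1 string."""
    if len(ticks) != len(OPTIONS) or set(ticks) - {'0', '1'}:
        raise ValueError(f'expected {len(OPTIONS)} characters of 0/1, got {ticks!r}')
    return [label for label, tick in zip(OPTIONS, ticks) if tick == '1']


print(decode_ticks('01000100'))  # ['lactose-free or low-lactose', 'vegetarian']
```

Reading the keys of `diet_id` this way shows, for example, that `'10000001'` means "no special diet" and "other diet" were both ticked.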

In [None]:
diet_id = {
    '00001000': 0,
    '00000100': 1,
    '00001100': 1,  # contradictory answer: vegan and vegetarian both ticked
    '10000100': 1,  # contradictory answer: no special diet and vegetarian both ticked
    '01000100': 2,
    '00010100': 3,
    '01000000': 4,
    '10000001': 5,
    '10000000': 6,
}
diet_name = {    
    0: 'vegan',
    1: 'vegetarian',
    2: 'vegetarian lactose-free',
    3: 'vegetarian no red meat', 
    4: 'mixed diet lactose-free',
    5: 'mixed diet other diet', 
    6: 'mixed diet',
}
diet_main = {    
    0: 'vegan',
    1: 'vegetarian',
    2: 'vegetarian',
    3: 'vegetarian', 
    4: 'mixed_diet',
    5: 'mixed_diet', 
    6: 'mixed_diet',
}
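
The ids appear to be ordered from the most restrictive main category (0, vegan) to the least restrictive (4 to 6, mixed diet), so the larger of two ids always belongs to the less restrictive main category. The sketch below illustrates that rule with a hypothetical `main_diet` helper. It assumes the ordering is intentional and repeats the `diet_main` table so that it runs on its own.

```python
# Copy of the diet_main table above so the sketch is self-contained.
diet_main = {
    0: 'vegan',
    1: 'vegetarian', 2: 'vegetarian', 3: 'vegetarian',
    4: 'mixed_diet', 5: 'mixed_diet', 6: 'mixed_diet',
}


def main_diet(dc_id, home_id):
    """Hypothetical helper: main category taken from the larger of two diet ids."""
    return diet_main[max(dc_id, home_id)]


print(main_diet(0, 1))  # vegan at daycare, vegetarian at home -> vegetarian
print(main_diet(2, 6))  # vegetarian lactose-free and mixed diet -> mixed_diet
```

Under this rule a child counts as vegan only when both the daycare and the home diet are vegan.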

In [None]:
import pandas as pd

# Collate REDCap answers into one diet category per person
df = pd.read_csv('../data/main/redcap_child_diet.csv')
# Drop rows whose survey form was never completed
df.drop(df[df.mira2_lapsen_taustatieto_ja_ruoankyttkysely_timestamp == '[not completed]'].index, inplace=True)
# Use the same person identifier column name as the other tables
df.rename(columns={'id_child': 'id_person'}, inplace=True)

In [None]:
# Concatenate the eight 0/1 check-box columns into one tick string per row
dc_cols = [f'diet_dc___{i}' for i in range(1, 9)]
df['diet_dc_ticks'] = df[dc_cols].astype(str).agg(''.join, axis=1)

home_cols = [f'diet_home___{i}' for i in range(1, 9)]
df['diet_home_ticks'] = df[home_cols].astype(str).agg(''.join, axis=1)

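# Map tick strings to diet ids and names; combinations missing from diet_id become NaN
df['diet_dc_id'] = df.diet_dc_ticks.map(diet_id)
df['diet_dc'] = df.diet_dc_id.map(diet_name)

df['diet_home_id'] = df.diet_home_ticks.map(diet_id)
df['diet_home'] = df.diet_home_id.map(diet_name)

# Main category comes from the larger of the two ids (NaN is skipped),
# i.e. the less restrictive of the daycare and home diets
df['diet_main'] = df[['diet_dc_id', 'diet_home_id']].max(axis=1).map(diet_main)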

usecols = [
    'id_person',
    'diet_dc_ticks',
    'diet_dc_id',
    'diet_dc',
    'diet_home_ticks',
    'diet_home_id',
    'diet_home',
    'diet_main',
]

diet_children = df[usecols]
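
`Series.map` returns `NaN` for any tick string that is missing from `diet_id`, so a new combination of answers would go unclassified without warning. A minimal sketch of a check is shown below. The `unmapped_ticks` helper and the example values are hypothetical, and only a subset of `diet_id` is repeated to keep the sketch self-contained. The helper accepts any iterable of strings, so a column such as `df.diet_home_ticks` could be passed in directly.

```python
# Subset of the diet_id table, enough for the illustration.
diet_id = {'00001000': 0, '00000100': 1, '10000000': 6}


def unmapped_ticks(ticks, mapping):
    """Hypothetical helper: distinct tick strings that have no entry in the mapping."""
    return sorted({t for t in ticks if t not in mapping})


# Made-up answers; '00100000' (gluten-free only) is not in the table.
example_ticks = ['10000000', '00100000', '00001000', '00100000']
print(unmapped_ticks(example_ticks, diet_id))  # ['00100000']
```

An empty result means every combination in the data has an agreed interpretation; anything else would need a decision from the research group before the table is extended.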

In [None]:
diet_children.to_csv('../data/main/diet_class.csv', index=False)