*NOTE: Sex-specific variables, women at the end of the dofile *Important: More sex-specific categories at the end of the do file *Arrays have been considered, but not the instances *****************RENAMING VARIABLES******************************* *******MRI variables rename n_22434_2_0 Abdominal_fat_ratio rename n_22436_2_0 Liver_PDFF_AMRA rename n_22435_2_0 Muscle_fat_infiltration rename n_22432_2_0 Total_abd_adi_tissue_ind rename n_22433_2_0 Weight_to_muscle_ratio rename n_22410_2_0 Total_trunk_fat_vol rename n_22415_2_0 Total_adipose_tissue_vol rename n_22416_2_0 Total_lean_tissue_vol rename n_22408_2_0 Abd_sub_adi_tis_vol_ASAT rename n_22407_2_0 Visceral_adipose_tis_vol_VAT rename n_22409_2_0 Total_thigh_muscle_vol rename n_22405_2_0 Ant_thigh_leanmuscle_vol_L rename n_22403_2_0 Ant_thigh_leanmuscle_vol_R rename n_22406_2_0 Post_thigh_leanmuscle_vol_L rename n_22404_2_0 Post_thigh_leanmuscle_vol_R rename n_22412_2_0 Thigh_error_indicator_left rename n_22413_2_0 Thigh_error_indicator_right rename n_22411_2_0 VAT_ASAT_error_indicator rename n_22414_2_0 Image_quality_indicator rename n_24352_2_0 FR_liver_PDFF_mean renvars s_20204_*, subst(s_20204_ Liver_Im_T1_) renvars s_20254_*, subst(s_20254_ Liver_im_IDEAL) renvars s_20203_*, subst(s_20203_ Liver_im_gradient_echo) renvars n_22402_*, subst(n_22402_ Proton_den_fat_fract) renvars n_22401_*, subst(n_22401_ Liver_inflamm_factor_LIF) renvars n_22417_*, subst(n_22417_ Liver_iron_corre_T1_ct1) renvars n_22400_*, subst(n_22400_ Liver_iron_Fe) *******DXA variables rename n_23244_2_0 and_bone_mass rename n_23245_2_0 and_fat_mass rename n_23246_2_0 and_lean_mass rename n_23247_2_0 and_tissue_fat_per rename n_23248_2_0 and_total_mass rename n_23249_2_0 Arm_fat_mass_left rename n_23253_2_0 Arm_fat_mass_right rename n_23250_2_0 Arm_lean_mass_left rename n_23254_2_0 Arm_lean_mass_right rename n_23251_2_0 Arm_tissue_fat_per_left rename n_23255_2_0 Arm_tissue_fat_per_right rename n_23252_2_0 Arm_total_mass_left rename 
n_23256_2_0 Arm_total_mass_right rename n_23257_2_0 Arms_fat_mass rename n_23258_2_0 Arms_lean_mass rename n_23259_2_0 Arms_tissue_fat_per rename n_23260_2_0 Arms_total_mass rename n_23261_2_0 gyn_bone_mass rename n_23262_2_0 gyn_fat_mass rename n_23263_2_0 gyn_lean_mass rename n_23264_2_0 gyn_tissue_fat_per rename n_23265_2_0 gyn_total_mass rename n_23266_2_0 Leg_fat_mass_left rename n_23270_2_0 Leg_fat_mass_right rename n_23267_2_0 Leg_lean_mass_left rename n_23271_2_0 Leg_lean_mass_right rename n_23268_2_0 Leg_tissue_fat_per_left rename n_23272_2_0 Leg_tissue_fat_per_right rename n_23269_2_0 Leg_total_mass_left rename n_23273_2_0 Leg_total_mass_right rename n_23274_2_0 Legs_fat_mass rename n_23275_2_0 Legs_lean_mass rename n_23276_2_0 Legs_tissue_fat_per rename n_23277_2_0 Legs_total_mass rename n_23278_2_0 Total_fat_mass rename n_23279_2_0 Total_fatfree_mass rename n_23280_2_0 Total_lean_mass rename n_23281_2_0 Total_tissue_fat_per rename n_23282_2_0 Total_tissue_mass rename n_23283_2_0 Total_mass rename n_23284_2_0 Trunk_fat_mass rename n_23285_2_0 Trunk_lean_mass rename n_23286_2_0 Trunk_tissue_fat_per rename n_23287_2_0 Trunk_total_mass rename n_23288_2_0 VAT_mass rename n_23289_2_0 VAT_vol ************************************* ** Men specific variables***** ************************************** rename n_2365_0 screen_psa renvars n_20002*, subst(n_20002 non_cancer_illness) */ ************************************* ** Women specific variables***** ************************************** *Renaming female factors (reproductive variables and hormone use) rename n_2714_0_0 age_menarche rename n_3581_0_0 age_menopause rename n_2734_0_0 live_births rename n_2784_0_0 oca_use rename n_3591_0_0 hysterectomy rename n_2814_0_0 hrt_use rename n_3546_0_0 hrt_agestopped rename n_2834_0_0 bi_oophorectomy rename n_2724_0_0 menopause rename n_2694_0 screen_cervix rename n_2674_0 screen_breast *Renaming basic demographic and recruitment variables rename n_eid eID rename 
n_31_0_0 sex // must be included
rename n_21003_0_0 age_recruitment // must be included
rename n_189_0_0 townsend_dep //?
rename n_21000_0_0 ethnicity //?
renvars n_6138*, subst(n_6138 qualifications)
renvars n_6141*, subst(n_6141 marital)
rename n_20118_0_0 urban //?
rename n_54_0_0 demogr_asses_centre //?
capture rename ts_53_0_0 recruit_date
rename n_20116_0_0 smoking_status
rename n_3456_0_0 current_smoking // possibly relevant for smoking
rename n_1558_0_0 alcohol //?
rename n_34_0_0 birth_year
rename n_52_0_0 birth_month
renvars n_6142*, subst(n_6142 employment) //?
*Renaming specific alcohol variables
rename n_1588_0_0 beer_wk
rename n_4429_0_0 beer_mo
rename n_1568_0_0 redwine_wk
rename n_4407_0_0 redwine_mo
rename n_1578_0_0 whitewine_wk
rename n_4418_0_0 whitewine_mo
rename n_1608_0_0 port_wk
rename n_4451_0_0 port_mo
rename n_1598_0_0 spirits_wk
rename n_4440_0_0 spirits_mo
rename n_5364_0_0 otheralc_wk
rename n_4462_0_0 otheralc_mo
*Renaming anthropometric variables
renvars n_21001_*, subst(n_21001_ bmi2_)
rename bmi2_0_0 bmi_2
renvars n_23104_*, subst(n_23104_ bmi1_)
rename bmi1_0_0 bmi_1
renvars n_48_*, subst(n_48_ waist_)
rename waist_0_0 waist
renvars n_49_*, subst(n_49_ hip_)
rename hip_0_0 hip
rename n_3148_0_0 heelbmd_auto
rename n_78_0_0 heelbmd_tscore_auto
*rename n_3084_0_0 heelbmd_manual
rename n_77_0_0 heelbmd_tscore_manual
renvars n_23099_*, subst(n_23099_ bodyfat_pc_) // we have this included as well
rename bodyfat_pc_0_0 bodyfat_pc
rename n_23105_0_0 bmr
rename n_23127_0_0 trunkfat_pc
rename n_46_0_0 handgrip_left
rename n_47_0_0 handgrip_right
rename n_50_0_0 height
rename n_21002_0_0 weight
rename n_23100_0_0 bodyfatmass
rename n_23101_0_0 bodyfatfreemass
rename n_23102_0_0 bodywatermass
rename n_23106_0_0 bodyimpedance
rename n_23128_0_0 trunkfatmass
rename n_23129_0_0 trunkfatfreemass
rename n_23130_0_0 trunkpredmass
*Renaming touch-screen physical activity variables ******************** should I use these here to estimate MET, or does that come further down?
rename n_806_0_0 standing_job
rename n_816_0_0 physical_job
rename n_991_0_0 sports_f
rename n_1001_0_0 sports_d
rename n_924_0_0 walking_pace
rename n_943_0_0 stair_climb
rename n_971_0_0 walk_pleasure_f
rename n_981_0_0 walk_pleasure_d
rename n_3637_0_0 other_exercise_f
rename n_3647_0_0 other_exercise_d
rename n_2624_0_0 heavy_DIY_f
rename n_2634_0_0 heavy_DIY_d
rename n_1011_0_0 light_DIY_f
rename n_1021_0_0 light_DIY_d
*Renaming complete blood count variables
rename n_30000_0_0 wbcc
rename n_30120_0_0 lymphocytes
rename n_30130_0_0 monocytes
rename n_30140_0_0 neutrophils
rename n_30150_0_0 eosinophils
rename n_30160_0_0 basophils
rename n_30010_0_0 rbcc
rename n_30020_0_0 haem
rename n_30030_0_0 hct
rename n_30040_0_0 mcv
rename n_30050_0_0 mch
rename n_30060_0_0 mchc
rename n_30070_0_0 rcdw
rename n_30080_0_0 plateletc
rename n_30100_0_0 mplateletv
rename n_30170_0_0 nuclrbc
rename n_30250_0_0 reticuc
rename n_30260_0_0 mreticuv
*Renaming diet change and other health history questions
rename n_1538_0_0 change_diet
rename n_2188_0_0 long_illness
rename n_2296_0_0 falls_lastyr
rename n_2306_0_0 weightchange_1yr // I have this included
rename n_2443_0_0 diabetes_sr
rename n_2453_0_0 cancer_sr
rename n_2463_0_0 fracbone_5yr
rename n_1548_0_0 diet_var
rename n_10912_0_0 pilot_diet_var
rename n_1239_0_0 current_smoker
rename n_1249_0_0 former_smoker
rename n_2897_0_0 quitsmoke_age
rename n_1647_0_0 birthcountry
renvars n_20107*, subst(n_20107 fh_father)
renvars n_20110*, subst(n_20110 fh_mother)
renvars n_20111*, subst(n_20111 fh_sibling)
renvars n_6154*, subst(n_6154 medication)
*Renaming touch-screen meat/fish variables
rename n_1329_0_0 oilyfish_ts
rename n_1339_0_0 nonoilyfish_ts
rename n_1349_0_0 processedmeat_ts
rename n_1359_0_0 poultry_ts
rename n_1369_0_0 beef_ts
rename n_1379_0_0 lamb_ts
rename n_1389_0_0 pork_ts
*Renaming touch-screen other dietary variables
rename n_1418_0_0 milktype_ts
rename n_1408_0_0 cheese_ts
rename n_1428_0_0 spreadtype_ts
rename 
n_2654_0_0 nonbutterspreadtype_ts rename n_1438_0_0 bread_ts rename n_1448_0_0 breadtype_ts rename n_1458_0_0 cereal_ts rename n_1468_0_0 cerealtype_ts rename n_1478_0_0 saltadded_ts rename n_1309_0_0 fruit_ts rename n_1319_0_0 driedfruit_ts rename n_1299_0_0 rawveg_ts rename n_1289_0_0 cookveg_ts rename n_1488_0_0 tea_ts rename n_1498_0_0 coffee_ts rename n_1508_0_0 coffeetype_ts rename n_1518_0_0 temphotdrinks_ts rename n_1528_0_0 water_ts *Renaming touchscreen dietary supplement variable renvars n_6155*, subst(n_6155 vitsupplements) renvars n_6179*, subst(n_6179 minsupplements) **# DIETARY VARIABLES****************************************************** *Renaming 24-hr WebQ variables renvars n_20081*, subst(n_20081 WebQ_hourcompleted) renvars n_100010*, subst(n_100010 WebQ_portionsize) renvars n_100020*, subst(n_100020 WebQ_typicaldiet) *renvars n_20085*, subst(n_20085 WebQ_reasonatypical) renvars n_20086*, subst(n_20086 WebQ_specialdiet) *Fruit renvars n_104410*, subst(n_104410 WebQ_stewedfruit) renvars n_104420*, subst(n_104420 WebQ_prunes) renvars n_104430*, subst(n_104430 WebQ_dried) renvars n_104440*, subst(n_104440 WebQ_mixedfruit) renvars n_104450*, subst(n_104450 WebQ_apple) renvars n_104460*, subst(n_104460 WebQ_banana) renvars n_104470*, subst(n_104470 WebQ_berry) renvars n_104480*, subst(n_104480 WebQ_cherry) renvars n_104490*, subst(n_104490 WebQ_grapefruit) renvars n_104500*, subst(n_104500 WebQ_grapes) renvars n_104510*, subst(n_104510 WebQ_mango) renvars n_104520*, subst(n_104520 WebQ_melon) renvars n_104530*, subst(n_104530 WebQ_orange) renvars n_104540*, subst(n_104540 WebQ_satsuma) renvars n_104550*, subst(n_104550 WebQ_peach) renvars n_104560*, subst(n_104560 WebQ_pear) renvars n_104570*, subst(n_104570 WebQ_pineapple) renvars n_104580*, subst(n_104580 WebQ_plum) renvars n_104590*, subst(n_104590 WebQ_otherfruit) *Vegetables renvars n_104000*, subst(n_104000 WebQ_bakedbeans) /// renvars n_104010*, subst(n_104010 WebQ_pulses) /// renvars 
n_104020*, subst(n_104020 WebQ_friedpotatoes) renvars n_104030*, subst(n_104030 WebQ_boiledpotatoes) renvars n_104050*, subst(n_104050 WebQ_mashedpotatoes) renvars n_104060*, subst(n_104060 WebQ_mixedveg) renvars n_104070*, subst(n_104070 WebQ_vegpieces) renvars n_104080*, subst(n_104080 WebQ_coleslaw) renvars n_104090*, subst(n_104090 WebQ_salad) renvars n_104100*, subst(n_104100 WebQ_avo) renvars n_104110*, subst(n_104110 WebQ_broadbeans) /// renvars n_104120*, subst(n_104120 WebQ_greenbeans) renvars n_104130*, subst(n_104130 WebQ_beetroot) renvars n_104140*, subst(n_104140 WebQ_broccoli) renvars n_104150*, subst(n_104150 WebQ_squash) renvars n_104160*, subst(n_104160 WebQ_cabbage) renvars n_104170*, subst(n_104170 WebQ_carrots) renvars n_104180*, subst(n_104180 WebQ_cauliflower) renvars n_104190*, subst(n_104190 WebQ_celery) renvars n_104200*, subst(n_104200 WebQ_courgette) renvars n_104210*, subst(n_104210 WebQ_cucumber) renvars n_104220*, subst(n_104220 WebQ_garlic) renvars n_104230*, subst(n_104230 WebQ_leeks) renvars n_104240*, subst(n_104240 WebQ_lettuce) renvars n_104250*, subst(n_104250 WebQ_mushrooms) renvars n_104260*, subst(n_104260 WebQ_onion) renvars n_104270*, subst(n_104270 WebQ_parsnip) renvars n_104280*, subst(n_104280 WebQ_peas) renvars n_104290*, subst(n_104290 WebQ_peppers) renvars n_104300*, subst(n_104300 WebQ_spinach) renvars n_104310*, subst(n_104310 WebQ_sprouts) renvars n_104320*, subst(n_104320 WebQ_corn) renvars n_104330*, subst(n_104330 WebQ_sweetpotato) renvars n_104340*, subst(n_104340 WebQ_tomatofresh) renvars n_104350*, subst(n_104350 WebQ_tomatotinned) renvars n_104360*, subst(n_104360 WebQ_turnip) renvars n_104370*, subst(n_104370 WebQ_watercress) renvars n_104380*, subst(n_104380 WebQ_otherveg) *Meat renvars n_103010*, subst(n_103010 WebQ_sausage) renvars n_103020*, subst(n_103020 WebQ_beef) renvars n_103030*, subst(n_103030 WebQ_pork) renvars n_103040*, subst(n_103040 WebQ_lamb) renvars n_103050*, subst(n_103050 
WebQ_crumbedchicken) renvars n_103060*, subst(n_103060 WebQ_chicken) renvars n_103070*, subst(n_103070 WebQ_bacon) renvars n_103080*, subst(n_103080 WebQ_ham) renvars n_103090*, subst(n_103090 WebQ_liver) renvars n_103100*, subst(n_103100 WebQ_othermeat) *Fish renvars n_103150*, subst(n_103150 WebQ_tinnedtuna) renvars n_103160*, subst(n_103160 WebQ_oilyfish) renvars n_103170*, subst(n_103170 WebQ_breadedfish) renvars n_103180*, subst(n_103180 WebQ_batteredfish) renvars n_103190*, subst(n_103190 WebQ_whitefish) renvars n_103200*, subst(n_103200 WebQ_prawns) renvars n_103210*, subst(n_103210 WebQ_lobster) renvars n_103220*, subst(n_103220 WebQ_shellfish) renvars n_103230*, subst(n_103230 WebQ_otherfish) *Cheese renvars n_102810*, subst(n_102810 WebQ_cheese_hard_lof) renvars n_102820*, subst(n_102820 WebQ_cheese_hard) renvars n_102830*, subst(n_102830 WebQ_cheese_soft) renvars n_102840*, subst(n_102840 WebQ_cheese_blue) renvars n_102850*, subst(n_102850 WebQ_cheese_spread_lof) renvars n_102860*, subst(n_102860 WebQ_cheese_spread) renvars n_102870*, subst(n_102870 WebQ_cheese_cottage) renvars n_102880*, subst(n_102880 WebQ_cheese_feta) renvars n_102890*, subst(n_102890 WebQ_cheese_mozzarell) renvars n_102900*, subst(n_102900 WebQ_cheese_goat) renvars n_102910*, subst(n_102910 WebQ_cheese_other) renvars n_102000*, subst(n_102000 WebQ_pizza) renvars n_102220*, subst(n_102220 WebQ_cheesecake) *Milk renvars n_100520*, subst(n_100520 WebQ_milkglasses) renvars n_100550*, subst(n_100550 WebQ_hotchoc) renvars n_100540*, subst(n_100540 WebQ_lowcalhotchoc) renvars n_100460*, subst(n_100460 WebQ_tea_black_milk) renvars n_100480*, subst(n_100480 WebQ_tea_rooibos_milk) renvars n_100260*, subst(n_100260 WebQ_coffee_instant_milk) renvars n_100280*, subst(n_100280 WebQ_coffee_filter_milk) renvars n_100320*, subst(n_100320 WebQ_coffee_espresso_milk) renvars n_100350*, subst(n_100350 WebQ_coffee_other_milk) renvars n_20105*, subst(n_20105 WebQ_porridge_liquid) renvars n_100890*, 
subst(n_100890 WebQ_cereal_milk) renvars n_102140*, subst(n_102140 WebQ_milkpud) renvars n_102150*, subst(n_102150 WebQ_othermilkpud) *Yoghurt renvars n_102090*, subst(n_102090 WebQ_yogurt) renvars n_100530*, subst(n_100530 WebQ_yogurtdrink) renvars n_100230*, subst(n_100230 WebQ_yogurtsmoothie) *Ice-cream renvars n_102120*, subst(n_102120 WebQ_icecream) *Beverages renvars n_100150*, subst(n_100150 WebQ_water) renvars n_100390*, subst(n_100390 WebQ_tea_consumed) renvars n_100400*, subst(n_100400 WebQ_tea_black) renvars n_100410*, subst(n_100410 WebQ_tea_rooibos) renvars n_100420*, subst(n_100420 WebQ_tea_green) renvars n_100430*, subst(n_100430 WebQ_tea_herbal) renvars n_100440*, subst(n_100440 WebQ_tea_other) *Coffee renvars n_100240*, subst(n_100240 WebQ_coffee_consumed) renvars n_100360*, subst(n_100360 WebQ_decaf_coffee) renvars n_100250*, subst(n_100250 WebQ_coffee_instant) renvars n_100270*, subst(n_100270 WebQ_coffee_filter) renvars n_100290*, subst(n_100290 WebQ_coffee_cappuccino) renvars n_100300*, subst(n_100300 WebQ_coffee_latte) renvars n_100310*, subst(n_100310 WebQ_coffee_espresso) renvars n_100330*, subst(n_100330 WebQ_coffee_other) *Bread renvars n_100950*, subst(n_100950 WebQ_bread_sliced) *Pasta/rice renvars n_102720*, subst(n_102720 WebQ_wmeal_pasta) renvars n_102740*, subst(n_102740 WebQ_brown_rice) renvars n_102780*, subst(n_102780 WebQ_other_cookgrains) *Cereal renvars n_100760*, subst(n_100760 WebQ_cereal_consumed) renvars n_100770*, subst(n_100770 WebQ_cereal_porridge) renvars n_100800*, subst(n_100800 WebQ_cereal_muesli) renvars n_100810*, subst(n_100810 WebQ_cereal_oatcrunch) renvars n_100820*, subst(n_100820 WebQ_cereal_sweet) renvars n_100830*, subst(n_100830 WebQ_cereal_plain) renvars n_100840*, subst(n_100840 WebQ_cereal_bran) renvars n_100850*, subst(n_100850 WebQ_cereal_wwheat) renvars n_100860*, subst(n_100860 WebQ_cereal_other) renvars n_100880*, subst(n_100880 WebQ_driedfruit_cereal) *Soup renvars n_102540*, subst(n_102540 
WebQ_soup_canned) renvars n_102620*, subst(n_102620 WebQ_soup_homemade) *Nutrients renvars n_100001*, subst(n_100001 WebQ_foodweight) renvars n_100002*, subst(n_100002 WebQ_energy) renvars n_100003*, subst(n_100003 WebQ_protein) renvars n_100004*, subst(n_100004 WebQ_fat) renvars n_100005*, subst(n_100005 WebQ_cho) renvars n_100006*, subst(n_100006 WebQ_safa) renvars n_100007*, subst(n_100007 WebQ_pufa) renvars n_100008*, subst(n_100008 WebQ_sugar) renvars n_100009*, subst(n_100009 WebQ_fibre) renvars n_100011*, subst(n_100011 WebQ_iron) renvars n_100012*, subst(n_100012 WebQ_vitB6) renvars n_100013*, subst(n_100013 WebQ_vitB12) renvars n_100014*, subst(n_100014 WebQ_folate) renvars n_100015*, subst(n_100015 WebQ_vitC) renvars n_100016*, subst(n_100016 WebQ_potassium) renvars n_100017*, subst(n_100017 WebQ_magnesium) renvars n_100018*, subst(n_100018 WebQ_retinol) renvars n_100019*, subst(n_100019 WebQ_carotene) renvars n_100021*, subst(n_100021 WebQ_vitD) renvars n_100022*, subst(n_100022 WebQ_alcohol) renvars n_100023*, subst(n_100023 WebQ_starch) renvars n_100024*, subst(n_100024 WebQ_calcium) renvars n_100025*, subst(n_100025 WebQ_vitE) *Renaming screening variables *rename n_2345_0 screen_bowel *Renaming cancer variables rename n_40009_0_0 ca_total_reported renvars ts_40005_* , subst(ts_40005_ ca_Dx_date_) renvars s_40013_* , subst(s_40013_ ca_icd9_) renvars s_40006_* , subst(s_40006_ ca_icd10_) renvars ts_40000_* , subst(ts_40000_ death_date_) *To rename the repeat variables *Renaming repeat assessment date rename ts_53_1_0 repeat_recruit_date *Renaming anthropometric variables *rename n_23099_1_0 repeat_bodyfat_pc rename n_23105_1_0 repeat_bmr rename n_23127_1_0 repeat_trunkfat_pc rename n_46_1_0 repeat_handgrip_left rename n_47_1_0 repeat_handgrip_right rename n_50_1_0 repeat_height rename n_21002_1_0 repeat_weight rename n_23100_1_0 repeat_bodyfatmass rename n_23101_1_0 repeat_bodyfatfreemass rename n_23102_1_0 repeat_bodywatermass rename n_23106_1_0 
repeat_bodyimpedance rename n_23128_1_0 repeat_trunkfatmass rename n_23129_1_0 repeat_trunkfatfreemass rename n_23130_1_0 repeat_trunkpredmass *imaging visit *rename n_23099_2_0 imag_bodyfat_pc *rename n_23105_2_0 imag_bmr *rename n_23127_2_0 imag_trunkfat_pc rename n_46_2_0 imag_handgrip_left rename n_47_2_0 imag_handgrip_right rename n_50_2_0 imag_height rename n_21002_2_0 imag_weight *rename n_23100_2_0 imag_bodyfatmass *rename n_23101_2_0 imag_bodyfatfreemass *rename n_23102_2_0 imag_bodywatermass *rename n_23106_2_0 imag_bodyimpedance *rename n_23128_2_0 imag_trunkfatmass *rename n_23129_2_0 imag_trunkfatfreemass *rename n_23130_2_0 imag_trunkpredmass *Renaming touch-screen meat/fish variables rename n_1329_1_0 repeat_oilyfish_ts rename n_1339_1_0 repeat_nonoilyfish_ts rename n_1349_1_0 repeat_processedmeat_ts rename n_1359_1_0 repeat_poultry_ts rename n_1369_1_0 repeat_beef_ts rename n_1379_1_0 repeat_lamb_ts rename n_1389_1_0 repeat_pork_ts *Renaming touch-screen other dietary variables rename n_1418_1_0 repeat_milktype_ts rename n_1408_1_0 repeat_cheese_ts rename n_1428_1_0 repeat_spreadtype_ts rename n_2654_1_0 repeat_nonbutterspreadtype_ts rename n_1438_1_0 repeat_bread_ts rename n_1448_1_0 repeat_breadtype_ts rename n_1458_1_0 repeat_cereal_ts rename n_1468_1_0 repeat_cerealtype_ts rename n_1478_1_0 repeat_saltadded_ts rename n_1309_1_0 repeat_fruit_ts rename n_1319_1_0 repeat_driedfruit_ts rename n_1299_1_0 repeat_rawveg_ts rename n_1289_1_0 repeat_cookveg_ts rename n_1488_1_0 repeat_tea_ts rename n_1498_1_0 repeat_coffee_ts rename n_1508_1_0 repeat_coffeetype_ts rename n_1518_1_0 repeat_temphotdrinks_ts rename n_1528_1_0 repeat_water_ts rename n_1538_1_0 repeat_change_diet rename n_1548_1_0 repeat_diet_var *Renaming touchscreen physical activity repeat variables rename n_806_1_0 repeat_standing_job rename n_816_1_0 repeat_physical_job rename n_991_1_0 repeat_sports_f rename n_1001_1_0 repeat_sports_d rename n_924_1_0 repeat_walking_pace rename 
n_943_1_0 repeat_stair_climb
rename n_971_1_0 repeat_walk_pleasure_f
rename n_981_1_0 repeat_walk_pleasure_d
rename n_3637_1_0 repeat_other_exercise_f
rename n_3647_1_0 repeat_other_exercise_d
rename n_2624_1_0 repeat_heavy_DIY_f
rename n_2634_1_0 repeat_heavy_DIY_d
rename n_1011_1_0 repeat_light_DIY_f
rename n_1021_1_0 repeat_light_DIY_d
*Renaming diet change and other health history questions
rename n_2188_1_0 repeat_long_illness
rename n_2296_1_0 repeat_falls_lastyr
rename n_2306_1_0 repeat_weightchange_1yr
rename n_2443_1_0 repeat_diabetes_sr
rename n_2453_1_0 repeat_cancer_sr
rename n_2463_1_0 repeat_fracbone_5yr
rename n_1239_1_0 repeat_current_smoker
rename n_1249_1_0 repeat_former_smoker
rename n_2897_1_0 repeat_quitsmoke_age
rename n_1647_1_0 repeat_birthcountry
*Create a date of birth variable, assigning everyone the 15th as the day of birth.
gen dob=mdy(birth_month,15,birth_year)
format dob %d
*Check
list birth_month birth_year dob in 1/5
*log using "covariates.log", replace
***************************************************************************
***SOCIODEMOGRAPHIC CHARACTERISTICS************************************************************************************************
***************************************************************************
***AGE
*Generating an age category variable: <45, then 5-year bands up to 65 and over.
* No array, 3 instances, UK Biobank code 21003
egen ageG = cut(age_recruitment), at(30, 45, 50, 55, 60, 65, 150) icodes
la var ageG "6 cat of age"
recode ageG 0=1 1=2 2=3 3=4 4=5 5=6
label define ageGL 1 "<45" 2 "45-" 3 "50-" 4 "55-" 5 "60-" 6 "65-"
label values ageG ageGL
tab ageG, nolab
tabstat age_recruitment, by(ageG) s(min max)
misstable sum ageG
***ETHNICITY
*Like in PSA paper: White, mixed background, Black, Asian, Other, unknown (None of the above, PNA or missing)
* No array, 3 instances, UK Biobank code 21000
gen ethnicity5 = .
la var ethnicity5 "ethnicity 5 cat"
replace ethnicity5 = 1 if ethnicity==1 | ethnicity==1001 | ethnicity==1002 | ethnicity==1003 // White: British, Irish, Any other white background
replace ethnicity5 = 2 if ethnicity==2 | ethnicity==2001 | ethnicity==2002 | ethnicity==2003 | ethnicity==2004 // Mixed: White and Black Caribbean, White and Black African, White and Asian, Any other mixed background
replace ethnicity5 = 3 if ethnicity==3 | ethnicity==5 | ethnicity==3001 | ethnicity==3002 | ethnicity==3003 | ethnicity==3004 // Asian or Asian British, Chinese, Indian, Pakistani, Bangladeshi, Any other Asian background
replace ethnicity5 = 4 if ethnicity==4 | ethnicity==4001 | ethnicity==4002 | ethnicity==4003 // Black or Black British, Caribbean, African, Any other Black background
replace ethnicity5 = 5 if ethnicity==6 // Other ethnic group
replace ethnicity5 = 9 if ethnicity==-1 | ethnicity==-3 | ethnicity==. // Do not know, prefer not to answer, or missing
label define ethnicity5L 1 "White" 2 "Mixed Race" 3 "Asian or Asian British" 4 "Black or Black British" 5 "Other" 9 "Missing/Unknown/PNTS"
label values ethnicity5 ethnicity5L
misstable sum ethnicity5
tab ethnicity ethnicity5
***TOWNSEND INDEX
* No array, no instances, UK Biobank code 189
misstable sum townsend_dep
xtile townsendG = townsend_dep, nq(5)
tab townsendG
replace townsendG=9 if townsend_dep==.
label define townsendGL 1 "Most affluent" 5 "Most deprived" 9 "Missing/Unknown/PNTS"
label values townsendG townsendGL
tabstat townsend_dep, by(townsendG) s(min max)
*Dichotomize for het analysis, median
sum townsend_dep, det
return list
gen townsendG2=(townsend_dep>=r(p50))
replace townsendG2=9 if townsendG==9
tabstat townsend_dep, by(townsendG2) s(min max n)
recode townsendG2 0=1 1=2
tab townsendG2
***EDUCATION - going to do categories of qualifications
*no qualifications, CSE/O-Level/GCSE or equivalent, AS/A-Level or equivalent, Higher education or other professional qualification, or equivalent
*Categories: O level/GCSE/CSE = 1; A levels = 2; Prof Q/NVQ/HND/HNC or degree = 3; None of the above, PNA and unknown = 9
tab1 qualifications*
*A total of 6 arrays, 3 instances, UK Biobank code 6138
gen qualifications3 = .
la var qualifications3 "qualifications 3 cat"
*Recoding PNA (-3) and none of the above (-7) to 9
forvalues i=0/5 {
    replace qualifications3 = 9 if qualifications_0_`i' == -3 | qualifications_0_`i' == -7
}
*Start at the lowest ranking category and recode over the top
*Recoding O level/GCSE (3) and CSE (4) to 1
forvalues i=0/5 {
    replace qualifications3 = 1 if qualifications_0_`i'==3 | qualifications_0_`i'==4
}
*Recoding A levels (2) to 2
forvalues i=0/5 {
    replace qualifications3 = 2 if qualifications_0_`i'==2
}
*Recoding NVQ/HND/HNC (5), other professional qualification (6) or university degree (1) to 3
forvalues i=0/5 {
    replace qualifications3 = 3 if qualifications_0_`i'==5 | qualifications_0_`i'==6 | qualifications_0_`i' == 1
}
label define qualificationsL 1 "O level/GCSE/CSE" 2 "A levels" 3 "Prof Q/NVQ/HND/HNC/Degree or other professional qualification" 9 "Missing/Unknown/PNTS"
label values qualifications3 qualificationsL
misstable sum qualifications3
recode qualifications3 (.=9)
tab qualifications3
tab2 qualifications3 qualifications*, firstonly
sum age_recruitment if qualifications_0_0==5 | qualifications_0_0==6 | qualifications_0_0== 1 | qualifications_0_1==5 | ///
    qualifications_0_1==6 | qualifications_0_1== 1 | qualifications_0_2==5 | qualifications_0_2==6 | qualifications_0_2== 1 | ///
    qualifications_0_3==5 | qualifications_0_3==6 | qualifications_0_3== 1 | qualifications_0_4==5 | qualifications_0_4==6 | ///
    qualifications_0_4== 1 | qualifications_0_5==5 | qualifications_0_5==6 | qualifications_0_5== 1
* We do have 133,576
*Dichotomize for het analysis, O level/CSE and A levels vs higher
gen qualifications3_2=qualifications3
recode qualifications3_2 2=1 3=2
tab qualifications3 qualifications3_2
***MARITAL STATUS
*A total of 5 arrays, 3 instances, UK Biobank code 6141
tab1 marital*
gen marital=.
la var marital "marital status"
*Default: not living with a partner (this includes participants who listed no household members)
replace marital = 1
*Prefer not to answer (-3)
forvalues i=0/4 {
    replace marital = 9 if marital_0_`i'==-3
}
*Husband, wife or partner (1) reported in any array position
forvalues i=0/4 {
    replace marital = 2 if marital_0_`i'==1
}
label define maritalL 1 "not living with partner" 2 "living with partner" 9 "Missing/Unknown/PNTS"
label values marital maritalL
misstable sum marital
tab marital
tab2 marital marital*, firstonly
***EMPLOYMENT
*7 arrays, 3 instances, UK Biobank code 6142
tab1 employment*
gen unemployment=.
la var unemployment "unemployment status"
*Prefer not to answer (-3) or none of the above (-7)
forvalues i=0/6 {
    replace unemployment = 9 if employment_0_`i'==-3 | employment_0_`i'==-7
}
*Any status other than paid work (codes 2 to 7)
forvalues i=0/6 {
    replace unemployment = 2 if inrange(employment_0_`i', 2, 7)
}
*Paid employment or self-employment (1) in any array position takes precedence
forvalues i=0/6 {
    replace unemployment = 1 if employment_0_`i'==1
}
label define unemploymentL 1 "Paid/self-employment" 2 "Not in paid/self-employment" 9 "Missing/Unknown/PNTS"
label values unemployment unemploymentL
misstable sum unemployment
tab unemployment, nolab
tab2 unemployment employment_0_*, firstonly
***************************************************************************
***ANTHROPOMETRY***********************************************************************************************************************
***************************************************************************
*HEIGHT
* Generating sex-specific height category variable.
* No array, 3 instances, UK Biobank code 50
misstable sum height
egen heightG = cut(height), at(60, 170, 175, 180, 185, 220) icodes
la var heightG "5 cat of height in men"
tab heightG
recode heightG 0=1 1=2 2=3 3=4 4=5
tab heightG
label define heightGL 1 "<170" 2 "170-" 3 "175-" 4 "180-" 5 "185-" 9 "Missing", replace
label values heightG heightGL
misstable sum heightG
recode heightG .=9
tab heightG
tabstat height, by(heightG) s(n min max)
***BMI
*There are two BMI variables and two weight variables, one from the category body size measures (n~499,504) (bmi_2)
*and one from the category impedance measures (n~492,539) (bmi_1).
*For heel bone mineral density and bmd t-score there are automated variables (n~279,000) and also manual measurements on approx 42,000 participants.
*We decided 08/11/2013 (prostate group) to use the bio impedance measured BMI preferentially, and if missing use the body size measure
gen bmi = bmi_1
la var bmi "bmi impedance or body size measure"
replace bmi = bmi_2 if bmi_1==.
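*Optional check, added as a sketch and not part of the original pipeline: bmi should equal the
*impedance value whenever that is available, and the body size value otherwise.
*float() is used on the assumption that gen stored bmi as a float while bmi_1 and bmi_2 may be doubles.
assert bmi == float(bmi_1) if !missing(bmi_1)
assert bmi == float(bmi_2) if missing(bmi_1)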
gen bmi_1_0=bmi1_1_0
replace bmi_1_0 = bmi2_1_0 if bmi1_1_0==.
gen bmi_2_0=bmi1_2_0
replace bmi_2_0 = bmi2_2_0 if bmi1_2_0==.
gen bmi_3_0=bmi1_3_0
replace bmi_3_0 = bmi2_3_0 if bmi1_3_0==.
*Check
sum bmi*
sum bmi_2 if bmi_1==.
*Generating a BMI category variable
egen bmiG = cut(bmi), at(5, 25, 30, 35, 100) icodes
la var bmiG "4 cat of bmi"
replace bmiG=9 if bmi==.
tab bmiG
recode bmiG 0=1 1=2 2=3 3=4
label define bmiGL 1 "<25" 2 "25-" 3 "30-" 4 "35-" 9 "Missing", replace
label values bmiG bmiGL
tab bmiG
tabstat bmi, by(bmiG) s(n min max)
*Generating BMI categories using WHO cut-offs
egen bmiwho = cut(bmi), at(5, 18.5, 25, 30, 35, 40, 100) icodes
la var bmiwho "bmi cat by who"
recode bmiwho 0=1 1=2 2=3 3=4 4=5 5=6
label define bmiwhoL 1 "<18.5" 2 "18.5-24.99" 3 "25.00-29.99" 4 "30.00-34.99" 5 "35.00-39.99" 6 ">=40.00" 9 "Missing"
label values bmiwho bmiwhoL
tab bmiwho
recode bmiwho .=9
tabstat bmi, by(bmiwho) s(n min max)
*Generating smaller groups of BMI
egen bmiGm = cut(bmi), at(5, 20, 22.5, 25, 27.5, 30, 32.5, 35, 37.5, 40, 100) icodes
la var bmiGm "bmi modified smaller groups"
recode bmiGm 0=1 1=2 2=3 3=4 4=5 5=6 6=7 7=8 8=9 9=10
label define bmiGmL 1 "<20" 2 "20-22.5" 3 "22.5-25" 4 "25.0-27.5" 5 "27.5-30.00" 6 "30.00-32.5" 7 "32.5-35" 8 "35-37.5" 9 "37.5-40" 10 "40+" 11 "Missing", replace
label values bmiGm bmiGmL
tab bmiGm
*Missing goes to 11, because 9 is the 37.5-40 group in this variable
recode bmiGm .=11
tabstat bmi if sex==0, by(bmiGm) s(n min max)
tabstat bmi if sex==1, by(bmiGm) s(n min max)
*BODY FAT
*Fifths of body fat percentage
xtile bodyfat_pcG = bodyfat_pc, nq(5)
tab bodyfat_pcG
misstable sum bodyfat_pcG
recode bodyfat_pcG .=9
label define bodyfat_pcGL 9 "Missing"
label values bodyfat_pcG bodyfat_pcGL
tabstat bodyfat_pc, by(bodyfat_pcG) s(min max n)
xtile bodyfat_pc5M = bodyfat_pc if sex==1, nq(5)
recode bodyfat_pc5M .=9 if sex==1
xtile bodyfat_pc5W = bodyfat_pc if sex==0, nq(5)
recode bodyfat_pc5W .=9 if sex==0
gen bodyfat_g =.
replace bodyfat_g =bodyfat_pc5M if sex==1
replace bodyfat_g =bodyfat_pc5W if sex==0
***WAIST
* No array, 3 instances, UK Biobank code 48
xtile waistG = waist, nq(5)
tab waistG
misstable sum waistG
recode waistG .=9
label define waistGL 9 "Missing"
label values waistG waistGL
tabstat waist, by(waistG) s(min max n)
*WHO cut-offs
recode waist (0/93.9999999=1) (94/101.99999999=2) (102/max=3), gen(waistwho)
misstable sum waistwho
recode waistwho .=9
label define waistwhoL 9 "Missing"
label values waistwho waistwhoL
tabstat waist, by(waistwho) s(n min max)
***WHR
gen whr=(waist/hip)
sum whr
misstable sum whr
*imaging
gen whr_2_0=(waist_2_0/hip_2_0)
sum whr_2_0
misstable sum whr_2_0
gen whr_3_0=(waist_3_0/hip_3_0)
sum whr_3_0
misstable sum whr_3_0
***************************************************************************
***LIFESTYLE***********************************************************************************************************************
***************************************************************************
***SMOKING
*Generating smoking categories: never, previous, current
*Current smokers are kept as a single group here; a split into current <15 and current 15 or more per day was considered
*Participants answering PNA or DNK for smoking status or for current cigarettes per day are set to 9 (Missing/Unknown/PNTS)
*No arrays, 3 instances, UK Biobank code 20116 for smoking_status and 3456 for current_smoking
gen smokingG = smoking_status // 0 Never,1 Previous,2 Current
la var smokingG "smoking cat 3 cat"
tab1 smoking_status current_smoking, nolab
replace smokingG = 3 if current_smoking>=1 & current_smoking !=.
replace smokingG = 9 if smoking_status == -3 | current_smoking==-3 | current_smoking==-1 | smoking_status == .
recode smokingG 0=1 1=2 2=3 label define smokingGL 1 "never" 2 "previous" 3 "current" 9 "Missing/Unknown/PNTS" label values smokingG smokingGL tab smokingG, nolab misstable sum smokingG tabstat current_smoking, by (smokingG) s(min max n) tab smoking_status smokingG *Dichotomize for het analysis, smokers vs no and formers gen smokingG2=smokingG recode smokingG2 2=1 3=2 tab smokingG smokingG2 ***ALCOHOL *Alcohol - code adapted from Georgina *Grams/day *Change values to missing for those who answers "do not know" or "prefer not to say" *No arrays for alcohol, beer...., 3 instances, UK Biobank code 1558 for alcohol and 1588 for beer for example foreach var of varlist beer_wk beer_mo redwine_wk redwine_mo whitewine_wk whitewine_mo port_wk port_mo spirits_wk spirits_mo otheralc_wk otheralc_mo { recode `var' (-3 -1 = .) } list beer_wk redwine_wk whitewine_wk port_wk spirits_wk in 1/30 *Individual Drink Intake (Per Week & Per Month) in grams *20 grams per pint of beer, 10 grams for each other drink per glass gen beer_wk_g=. replace beer_wk_g=20*beer_wk gen beer_mo_g=. replace beer_mo_g=20*beer_mo foreach var of varlist redwine_wk redwine_mo whitewine_wk whitewine_mo port_wk port_mo spirits_wk spirits_mo otheralc_wk otheralc_mo { gen `var'_g = 10*`var' } list beer_wk beer_wk_g redwine_wk redwine_wk_g in 1/30 *Total drink variable only missing if all missings *week to days egen totaldrink_wk=rowtotal(beer_wk_g redwine_wk_g whitewine_wk_g port_wk_g spirits_wk_g otheralc_wk_g), missing replace totaldrink_wk=totaldrink_wk/7 list totaldrink_wk if beer_wk_g==. & redwine_wk_g==. & whitewine_wk_g==. & port_wk_g==. & spirits_wk_g==. & otheralc_wk_g==. in 1/100 *month to days egen totaldrink_mo=rowtotal(beer_mo_g redwine_mo_g whitewine_mo_g port_mo_g spirits_mo_g otheralc_mo_g), missing replace totaldrink_mo=totaldrink_mo/30.4375 list totaldrink_mo if beer_mo_g==. & redwine_mo_g==. & whitewine_mo_g==. & port_mo_g==. & spirits_mo_g==. & otheralc_mo_g==. 
in 1/100 *tab totaldrink_mo, miss gen drinkoverall=totaldrink_wk la var drinkoverall "g/d var to create alcohol cat" replace drinkoverall=totaldrink_mo if (drinkoverall==. & totaldrink_mo!=.) misstable sum drinkoverall *Calculate medians for individuals with full information, in the top four most frequent alcohol intake groups * then give the median of that group if drinkoverall==. but they have answered the question about overall alcohol intake *alcohol from 1 to 4: Daily or almost daily, Three or four times a week, Once or twice a week, One to three times a month forvalues i=1/4 { summ drinkoverall if alcohol==`i', detail gen median_`i'=r(p50) tab median_`i' } forvalues i=1/4 { replace drinkoverall=median_`i' if (drinkoverall==. & alcohol==`i') } *Categorical Variables gen alcoholG=. replace alcoholG=1 if (drinkoverall!=0 & drinkoverall<1 | alcohol==5) // alcohol==5 special occasions // ref group <1g/d replace alcoholG=2 if (drinkoverall>=1 & drinkoverall<10) replace alcoholG=3 if (drinkoverall>=10 & drinkoverall<20) replace alcoholG=4 if (drinkoverall>=20 & drinkoverall!=.) 
replace alcoholG=5 if alcohol==6 // non drinkers recode alcoholG .=9 label define alcoholGL 1 "<1g/d" 2 "1-10g/d" 3 "10-20g/d" 4 "20+g/d" 5 "none drinkers" 9 "Missing/Unknown/PNTS" label values alcoholG alcoholGL tab alcoholG tab alcoholG alcohol, miss rename alcohol alcoholbaseline // I want to have alcohol (final with all included) and alcoholG for het test code rename drinkoverall alcohol tab alcoholG, miss ***PHYSICAL ACTIVITY *****adapted from Kathryn **No arrays, 3 instances, UK Biobank code 864, 874, 884, 894, 904, 914 * -1 represents "Do not know", -2 represents "Unable to walk" (only in 864), -3 represents "Prefer not to answer" gen walks_days = n_864_0_0 gen walks_duration = n_874_0_0 gen MPA_days = n_884_0_0 gen MPA_duration = n_894_0_0 gen VPA_days = n_904_0_0 gen VPA_duration = n_914_0_0 tab1 walks_days walks_duration MPA_days MPA_duration VPA_days VPA_duration list VPA_* MPA_* walks_* in 1/5 *NOTE: People who don't have an answer for both variables (days/week and duration) will be coded as missing. I.e. people who said they did vigorous activity 3 days a week but don't know for how long are missing for the variable VPA_perwk *missing if negative value (-1 or -3 in original variable) foreach var of varlist walks_days walks_duration MPA_days MPA_duration VPA_days VPA_duration { replace `var'=. if `var'==-1 | `var'==-3 } list VPA_* MPA_* walks_* in 1/5 *Generating PA minutes per week gen VPA_perwk = VPA_days*VPA_duration if VPA_days>=0 & VPA_duration>=0 replace VPA_perwk = 0 if VPA_days==0 gen MPA_perwk = MPA_days*MPA_duration if MPA_days>=0 & MPA_duration>=0 replace MPA_perwk = 0 if MPA_days==0 gen walks_perwk = walks_days*walks_duration if walks_days>=0 & walks_duration>=0 replace walks_perwk = 0 if walks_days==0 | walks_days==-2 *Check list VPA_* MPA_* walks_* in 1/5 *IPAQ recommends "Only values of 10 or more minutes of activity should be included in the calculation of... *summary scores". 
They assume no health benefits for less than 10 minutes."Responses of less than 10 minutes... *[and their associated days] should be re-coded to zero." sum VPA_days VPA_duration VPA_perwk if VPA_duration<10 & VPA_perwk>0 sum MPA_days MPA_duration MPA_perwk if MPA_duration<10 & MPA_perwk>0 sum walks_days walks_duration walks_perwk if walks_duration<10 & walks_perwk>0 *Recoding to zero recode VPA_duration (1/9=0), gen (VPA_duration_minrecode) recode MPA_duration (1/9=0), gen (MPA_duration_minrecode) recode walks_duration (1/9=0), gen (walks_duration_minrecode) gen VPA_days_minrecode = VPA_days replace VPA_days_minrecode = 0 if VPA_duration_minrecode==0 gen MPA_days_minrecode = MPA_days replace MPA_days_minrecode = 0 if MPA_duration_minrecode==0 gen walks_days_minrecode = walks_days replace walks_days_minrecode = 0 if walks_duration_minrecode==0 list VPA_days* VPA_duration* in 1/5 list MPA_days* MPA_duration* in 1/5 list walks_days* walks_duration* in 1/5 *IPAQ recommend truncating each of the three activities at 180 minutes (3 hours) per day. This means that participants can't do more than 21 hours (7*3) (1260) each week, of each activity. 
*Again it doesn't say exactly how to do this, but I think we need to take weekly totals of all three activities and see who is over 1260 mins for each activity (equal to 180/day)
*To truncate each of the three activities at 180 minutes per day (1260 minutes per week):
gen VPA_perwk_clean = VPA_days_minrecode*VPA_duration_minrecode if VPA_days_minrecode>=0 & VPA_duration_minrecode>=0
replace VPA_perwk_clean = 0 if VPA_days_minrecode==0
gen MPA_perwk_clean = MPA_days_minrecode*MPA_duration_minrecode if MPA_days_minrecode>=0 & MPA_duration_minrecode>=0
replace MPA_perwk_clean = 0 if MPA_days_minrecode==0
gen walks_perwk_clean = walks_days_minrecode*walks_duration_minrecode if walks_days_minrecode>=0 & walks_duration_minrecode>=0
replace walks_perwk_clean = 0 if walks_days_minrecode==0 | walks_days==-2
*Check
list VPA_* in 1/5
list MPA_* in 1/5
list walks_* in 1/5

*Truncating at 1260 mins per week for each activity
recode VPA_perwk_clean (1260/max=1260)
recode MPA_perwk_clean (1260/max=1260)
recode walks_perwk_clean (1260/max=1260)
*Check
sum VPA_perwk VPA_perwk_clean
sum MPA_perwk MPA_perwk_clean
sum walks_perwk walks_perwk_clean

*Generating excess METS for each component
gen VPA_METS_perwk = VPA_perwk_clean*7.0
gen MPA_METS_perwk = MPA_perwk_clean*3.0
gen walks_METS_perwk = walks_perwk_clean*2.3
*Check
list VPA_days_minrecode VPA_duration_minrecode VPA_perwk_clean VPA_METS_perwk in 1/5
list MPA_days_minrecode MPA_duration_minrecode MPA_perwk_clean MPA_METS_perwk in 1/5
list walks_days_minrecode walks_duration_minrecode walks_perwk_clean walks_METS_perwk in 1/5

*Generating excess MET-minutes per week; egen rowtotal with the missing option returns missing only if all components are missing
egen METmins_perwk=rowtotal(VPA_METS_perwk MPA_METS_perwk walks_METS_perwk), missing
*Check
list METmins_perwk VPA_METS_perwk MPA_METS_perwk walks_METS_perwk in 1/5

*Generating excess MET hours per week
gen METhrs_perwk = METmins_perwk/60
*Check
list METmins_perwk METhrs_perwk in 1/5

*Categories of excess MET-hrs per week
egen METhrs_perwk_cat = cut(METhrs_perwk), at(0, 5, 10, 15, 25, 35, 50, 75, 100, 350) icodes
recode METhrs_perwk_cat .=9
label define METhrs_perwk_catL 0 "<5" 1 "5-9.9" 2 "10-14.9" 3 "15-24.9" 4 "25-34.9" 5 "35-49.9" 6 "50-74.9" 7 "75-99.9" 8 ">=100" 9 "Missing/Unknown/PNTS"
label values METhrs_perwk_cat METhrs_perwk_catL
tabstat METhrs_perwk, by(METhrs_perwk_cat) s(min max n)

*Generate 'low', 'moderate' and 'high' levels of physical activity. This is based on IPAQ advice (but simplified).
egen METhrs_perwkG = cut(METhrs_perwk), at(0, 10, 50, 350) icodes
recode METhrs_perwkG .=9
tab METhrs_perwkG
recode METhrs_perwkG 0=1 1=2 2=3
label define METhrs_perwkGL 1 "low" 2 "moderate" 3 "high" 9 "Missing/Unknown/PNTS"
label values METhrs_perwkG METhrs_perwkGL
tabstat METhrs_perwk, by(METhrs_perwkG) s(min max n)
tab METhrs_perwkG

***************************************************************************
***HEALTH HISTORY***********************************************************************************************************************
***************************************************************************

***VASECTOMY
* code 1218 vasectomy, 1589 reversal of vasectomy / vasectomy reversal
* A total of 32 arrays, no instances, UK Biobank code 20004
gen vasectomy=1
forvalues i=0/31 {
replace vasectomy =2 if n_20004_0_`i'==1218 | n_20004_0_`i'==1589
}
tab vasectomy
label define vasectomyL 1 "No/Missing/Unknown/PNTS" 2 "Yes"
label values vasectomy vasectomyL
tab vasectomy

***HYPERTENSION
*Two measures were made for blood pressure.
* Two arrays, 3 instances, codes: SBP 4080 and 93. DBP 4079 and 94
*Also there are manual blood pressure measures on approx 43,000 participants.
list n_4080_0_0 n_4080_0_1 n_93_0_0 n_93_0_1 in 1/50 list n_4079_0_0 n_4079_0_1 n_94_0_0 n_94_0_1 in 1/50 *People only have 2 sbp and 2 dbp measures (can be any combination of automated and manual measures) *Sarah Lewington (via Naomi Allen) said to use an average of blood pressure measurements (11/11/2013) *Generating average of two measures egen sbp_ave2 = rowmean (n_4080_0_0 n_4080_0_1 n_93_0_0 n_93_0_1) egen dbp_ave2 = rowmean (n_4079_0_0 n_4079_0_1 n_94_0_0 n_94_0_1) misstable sum sbp_ave2 misstable sum dbp_ave2 *Check list sbp_ave2 n_4080* n_93_0_* in 1/20 list dbp_ave2 n_4079* n_94_0_* in 1/20 gen hypertension=1 replace hypertension=2 if sbp_ave2>=140 | dbp_ave2>=90 replace hypertension=9 if sbp_ave2==. | dbp_ave2==. tabstat sbp_ave2 dbp_ave2, by (hypertension) s(min max n) tab hypertension label define hypertensionL 1 "No" 2 "Yes" 9 "Missing" label values hypertension hypertensionL tab hypertension ***Diabetes *No arrays, 3 instances, UK Biobank code 2443 tab diabetes_sr gen diabetes=diabetes_sr recode diabetes -3=9 -1=9 .=9 0=1 1=2 label variable diabetes "Diabetes diagnosed by doctor" label define diabetesL 1 "No History" 2 "Diabetes" 9 "Missing/Unknown/PNTS" label values diabetes diabetesL misstable sum diabetes tab diabetes *************************************************************************** ** Men (pca) specific variables***** **************************************************************************** *PSA * 3-leveled PSA test variable * No array, 3 instances, UK Biobank code 2365 tab screen_psa, nolab gen psa_3 = screen_psa replace psa_3 = 9 if screen_psa==-3 | screen_psa==-1 recode psa_3 .=9 0=1 1=2 la var psa_3 "Ever Had PSA Test, 3 levels, Self Report" label define psa_3L 1 "No" 2 "Yes" 9 "Missing/Unknown/PNTS" , replace label values psa_3 psa_3L tab psa_3 screen_psa *GENITOURINARY MORBIDITIES ** Related Genitourinary Morbidities such as prostatitis, benign ph, or enlarged prostate **NOTE: check how many non_cancer_illness* we have in the 
database **non_cancer_illness has 29 arrays, 3 instances, UK Biobank code 20002 gen prostatitis = 1 la var prostatitis "prostatitis" gen bph = 1 la var bph "benign prostatic hypertrophy" gen enl_pros = 1 la var enl_pros "enlarged prostate" forvalues i=0/28 { replace prostatitis = 2 if non_cancer_illness_0_`i'==1517 replace bph = 2 if non_cancer_illness_0_`i'==1516 replace enl_pros=2 if non_cancer_illness_0_`i'==1396 } tab1 prostatitis bph enl_pros label define prostateL 1 "No/Missing/Unknown/PNTS" 2 "Yes", replace foreach var of varlist prostatitis bph enl_pros { label values `var' prostateL } tab bph enl_pros *join bph and enl_pros, as it is the same thing replace enl_pros=2 if bph==2 tab enl_pros *prostate biopsy 41200, M622 Open biopsy of lesion of prostate * PCa Family history *** If ppt has answered 'YES' to group II illnesses (which includes PrCa), but has not checked PrCa => definite NO *** Answering YES to any group II illness individually will be taken to indicate that the ppt said YES to group II collectively *** Genuine missings will be defined only as those who were ineligible to answer (i.e. 
adoptees) *** PNTS / Do no know, and other missings where the question was asked but not answered will be coded as a single 'unknown' category * Father with Pca **fh_father has 10 arrays, 3 instances, UK Biobank code 20107 * NOTE: check number of fh_father* * Group II illnesses are Parkinson's (11), Severe Depression (12), Lung (3), Bowel (4) and Prostate (13) Cancers *group II illnesses gen father_g2 = 9 /* Default: PNTS, Missing or Unknown */ la var father_g2 "Father has any Group II Illness" forvalues i=0/9 { replace father_g2 = 1 /* NO to all of the above, Group II */ if fh_father_0_`i'==-27 } forvalues i=0/9 { replace father_g2 = 2 /* ANY group II illness recorded by name in the array */ if fh_father_0_`i'==13 | fh_father_0_`i'==12 | /// fh_father_0_`i'==11 | fh_father_0_`i'==4 | fh_father_0_`i'==3 } replace father_g2 = 9 if n_1767_0_0==1 // if adopted give PNTs, missing or unknown label define father_g2L 1 "No" 2 "Any Group II Illness Recorded" 9 "Missing/Unknown/PNTS", replace label values father_g2 father_g2L tab father_g2 tab2 father_g2 fh_father*, firstonly gen father_pca = 9 la var father_pca "Father's History of PrCa" forvalues i=0/9 { replace father_pca = 1 if (father_g2==1 & fh_father_0_`i'!=13) | fh_father_0_`i'==-27 } forvalues i=0/9 { replace father_pca = 2 if fh_father_0_`i'==13 } tab father_pca label define father_pcaL 9 "Missing/Unknown/PNTS" 1 "No" 2 "Yes", replace label values father_pca father_pcaL tab father_pca tab2 father_pca fh_father*, firstonly * Brother with PCa * fh_sibling has 12 arrays, 3 instances, UK Biobank code 20111 gen brother_g2 = 9 /* Default: PNTS, Missing or Unknown */ la var brother_g2 "Brother has any Group II Illness" forvalues i=0/11 { replace brother_g2 = 1 /* NO to all of the above, Group II */ if fh_sibling_0_`i'==-27 } forvalues i=0/11 { replace brother_g2 = 2 /* ANY group II illness recorded by name in the array */ if fh_sibling_0_`i'==13 | fh_sibling_0_`i'==12 | /// fh_sibling_0_`i'==11 | fh_sibling_0_`i'==4 | 
fh_sibling_0_`i'==3
}
replace brother_g2 = 9 if n_1767_0_0==1 // if adopted give PNTS, missing or unknown
label define brother_g2L 1 "No" 2 "Any Group II Illness Recorded" 9 "Missing/Unknown/PNTS", replace
label values brother_g2 brother_g2L
tab brother_g2
tab2 brother_g2 fh_sibling*, firstonly

gen brother_pca = 9
la var brother_pca "Brother's History of PrCa"
forvalues i=0/11 {
replace brother_pca = 1 if (brother_g2==1 & fh_sibling_0_`i'!=13) | fh_sibling_0_`i'==-27
}
forvalues i=0/11 {
replace brother_pca = 2 if fh_sibling_0_`i'==13
}
tab brother_pca
label define brother_pcaL 1 "No" 2 "Yes" 9 "Missing/Unknown/PNTS", replace
label values brother_pca brother_pcaL
tab brother_pca, nolab
tab2 brother_pca fh_sibling*, firstonly

* Overall Summary Variable of PrCa in Genetic First Degree Relative
gen famhist_pcad = 9 /* Eligible to answer, but no answer recorded */
replace famhist_pcad = 1 if father_pca==1 & brother_pca==1 // no pca in father AND no pca in brother
replace famhist_pcad = 2 if father_pca==2 | brother_pca==2 // pca in father OR brother
replace famhist_pcad = 3 if father_pca==2 & brother_pca==2 // pca in father AND brother
tab famhist_pcad
label variable famhist_pcad "PrCa in Genetic First Degree Relatives detailed"
label define famhist_pcadL 1 "No History" 2 "Father OR Brother" 3 "Father AND Brother" 9 "Missing/Unknown/PNTS", replace
label values famhist_pcad famhist_pcadL
tab famhist_pcad
misstable sum famhist_pcad

* Father OR Brother
* Use the full variable name famhist_pcad rather than relying on variable abbreviation
gen famhist_pca1 = famhist_pcad
recode famhist_pca1 3=2
label variable famhist_pca1 "PrCa in Genetic First Degree Relative"
label define famhist_pca1L 1 "No History" 2 "PrCa in First Degree Relative" 9 "Missing/Unknown/PNTS", replace
label values famhist_pca1 famhist_pca1L
tab famhist_pca1

*added after BJC revision (01/08/17): no, brother, father, brother and father
gen famhist_pcadet = famhist_pcad
tab famhist_pcadet
recode famhist_pcadet 3=4
replace famhist_pcadet=2 if brother_pca==2
replace famhist_pcadet=3 if father_pca==2
replace famhist_pcadet=4 if father_pca==2 & brother_pca==2 label variable famhist_pcadet "Det PrCa in Genetic First Degree Relative" label define famhist_pcadetL 1 "No History" 2 "Brother" 3 "Father" 4 "Father AND Brother" 9 "Missing/Unknown/PNTS", replace label values famhist_pcadet famhist_pcadetL tab famhist_pcadet *************************************************************************** ***SEXUAL HISTORY*********************************************************************************************************************** *************************************************************************** *Heterosexual ///Age first had sexual intercourse/// // Take care with the lower category; it has implausible values *No arrays, 3 instances, UK Biobank code 2139 rename n_2139_0_0 agesex gen agesexG = cond(agesex >= 0 & agesex <= 16, 1, cond(agesex > 16 & agesex <= 20, 2, /// cond(agesex > 20 & agesex <= 25, 3, cond(agesex > 25 & agesex !=., 4, cond(agesex == -2, 0, 9))))) label variable agesexG "Age at first sexual intercourse" tab agesexG recode agesexG 0=5 tab agesexG * Change ref to 16-20 y recode agesexG 1=2 2=1 label define agesexGL 1 "16-20 years" 2 "Less than 16 years" 3 "20-25 years" 4 ">= 25 years" 5 "Never had sex" 9 "Missing/Unknown/PNTS" label values agesexG agesexGL misstable sum agesexG tabstat agesex, by (agesexG) s(min max n) *Dichotomize for het analysis, non vs some gen agesexG2=agesexG recode agesexG2 2=1 3=1 4=1 5=2 tab agesexG agesexG2 *Number of Children Fathered *No arrays, 3 instances, UK Biobank code 2405 rename n_2405_0_0 nchildren gen nchildrenG=nchildren replace nchildrenG = 3 if nchildren >= 3 tab nchildrenG recode nchildrenG -1=9 -3=9 .=9 0=1 1=2 2=3 3=4 *change ref to 2 children recode nchildrenG 1=4 3=1 4=3 replace nchildrenG= 5 if agesexG==5 label define nchildrenGL 1 "2" 2 "1" 3 "3+" 4 "0" 5 "never had sex" 9 "Missing/Unknown/PNTS", replace label values nchildrenG nchildrenGL tab nchildrenG *Dichotomize for het analysis, non vs some 
gen nchildrenG2=nchildrenG
recode nchildrenG2 2=1 3=1 4=2 5=2
tab nchildrenG nchildrenG2

****Lifetime Number of Sexual Partners, hetero // Some values in top group for the continuous variable seem questionable (e.g. 17.000)
*No arrays, 3 instances, UK Biobank code 2149
rename n_2149_0_0 sexualpartners
tab sexualpartners, nolab
gen sexpartnersG = sexualpartners
recode sexpartnersG 1=1 2/5=2 6/max=3
replace sexpartnersG=4 if agesexG==5 // never
misstable sum sexpartnersG
recode sexpartnersG .=9 -3=9 -1=9
label var sexpartnersG "Lifetime number of sexual partners"
label define sexpartnersGL 1 "One sexual partner" 2 "Between 2 and 5 sexual partners" 3 "6 or more sexual partners" 4 "never" 9 "Missing/Unknown/PNTS"
label values sexpartnersG sexpartnersGL
tabstat sexualpartners, by(sexpartnersG) s(min max n)
tab sexpartnersG, nolab

*Dichotomize for het analysis, median
sum sexualpartners, det
return list
gen sexpartnersG2=(sexualpartners>=r(p50))
replace sexpartnersG2=9 if sexpartnersG==9
tabstat sexualpartners, by(sexpartnersG2) s(min max n)
recode sexpartnersG2 0=1 1=2
tabstat sexualpartners, by(sexpartnersG2) s(min max n)

*Homosexual: ever had same-sex intercourse (defined as oral/anal intercourse)
*No arrays, 3 instances, UK Biobank code 2159
rename n_2159_0_0 samesex
tab samesex
gen samesexG=samesex
recode samesexG -3=9 .=9 0=1 1=2
tab samesexG
label define samesexGL 1 "No" 2 "Yes" 9 "Missing/Unknown/PNTS", replace
label values samesexG samesexGL
label variable samesexG "Ever had same-sex intercourse (defined as oral/anal intercourse)"
tab samesexG

***Lifetime Number of Sexual Partners, homosex // Some values in top group for the continuous variable seem questionable (e.g. 17.000)
*No arrays, 3 instances, UK Biobank code 3669
rename n_3669_0_0 samesexpartners
tab samesexpartners
gen samesexpartnersG = samesexpartners
recode samesexpartnersG 1=1 2/5=2 6/max=3
replace samesexpartnersG=0 if agesexG==5 // never had sex (category 5 of agesexG)
misstable sum samesexpartnersG
recode samesexpartnersG .=0 if samesexG==1
tab samesexpartnersG
recode samesexpartnersG .=9 -3=9 -1=9 0=4 // I don't want values for never had sex
tab samesexpartnersG
label var samesexpartnersG "Lifetime number of same-sex sexual partners"
label define samesexpartnersGL 1 "One sexual partner" 2 "2-5 sexual partners" 3 "6+ sexual partners" 4 "never" 9 "Missing/Unknown/PNTS", replace
label values samesexpartnersG samesexpartnersGL
tabstat samesexpartners, by(samesexpartnersG) s(min max n)

*Dichotomize for het analysis, median
sum samesexpartners, det
return list
gen samesexpartnersG2=(samesexpartners>=r(p50))
replace samesexpartnersG2=9 if samesexpartnersG==9
tabstat samesexpartners, by(samesexpartnersG2) s(min max n)
recode samesexpartnersG2 0=1 1=2
tabstat samesexpartners, by(samesexpartnersG2) s(min max n)

***************************************************************************
***EARLY LIFE FACTORS***********************************************************************************************************************
***************************************************************************

*facialhair
*No arrays, 3 instances, UK Biobank code 2375
gen facialhair= n_2375_0_0
tab facialhair
tab facialhair, nolab
misstable sum facialhair
recode facialhair (-1 = 9) (-3 = 9) (.=9)
recode facialhair (1=2) (2=1) // to make about average the reference
tab facialhair
label define facialhairL 1 "About average" 2 "Younger than average" 3 "Older than average" 9 "Missing/Unknown/PNTS"
label values facialhair facialhairL
tab facialhair
*Dichotomize for het analysis
gen facialhair2=facialhair
recode facialhair2 2=1 3=2
tab facialhair facialhair2

*voicebroke
*No arrays, 3 instances, UK Biobank code 2385
gen
voicebroke=n_2385_0_0 tab voicebroke misstable sum voicebroke recode voicebroke (-1 = 9) (-3 = 9) (.=9) recode voicebroke (1=2) (2=1) // to make about average the reference tab voicebroke label define voicebrokeL 1 "About average" 2 "Younger than average" 3 "Older than average" 9 "Missing/Unknown/PNTS" label values voicebroke voicebrokeL tab voicebroke *Dichotomize for het analysis gen voicebroke2=voicebroke recode voicebroke2 2=1 3=2 tab voicebroke voicebroke2 *baldpattern *No arrays, 3 instances, UK Biobank code 2395 gen baldpattern=n_2395_0_0 tab baldpattern misstable sum baldpattern recode baldpattern (-1 = 9) (-3 = 9) (.=9) label define baldpatternL 1 "Pattern 1" 2 "Pattern 2" 3 "Pattern 3" 4 "Pattern 4" 9 "Missing/Unknown/PNTS" label values baldpattern baldpatternL tab baldpattern *Dichotomize for het analysis gen baldpattern2=baldpattern recode baldpattern2 2=1 3=2 4=2 tab baldpattern baldpattern2 *bodysize10 *No arrays, 3 instances, UK Biobank code 1687 gen bodysize10=n_1687_0_0 label variable bodysize10 "bodysize at 10y" tab bodysize10 misstable sum bodysize10 recode bodysize10 (-1 = 9) (-3 = 9) (.=9) recode bodysize10 (1=3) (3=1) // to make about average the reference tab bodysize10 label define bodysize10L 1 "About average" 2 "Plumper" 3 "Thinner" 9 "Missing/Unknown/PNTS" label values bodysize10 bodysize10L tab bodysize10 *Dichotomize for het analysis gen bodysize10_2=bodysize10 recode bodysize10_2 3=1 tab bodysize10 bodysize10_2 *height10 *No arrays, 3 instances, UK Biobank code 1697 gen height10= n_1697_0_0 label variable height10 "height at 10y" tab height10 tab height10, nolab misstable sum height10 recode height10 (-1 = 9) (-3 = 9) (.=9) recode height10 (1=3) (3=1) // to make about average the reference tab height10 label define height10L 1 "About average" 2 "Taller" 3 "Shorter" 9 "Missing/Unknown/PNTS", replace label values height10 height10L tab height10 *Dichotomize for het analysis gen height10_2=height10 recode height10_2 3=1 tab height10 
height10_2 *Create new variable for hair colour *No arrays, 3 instances, UK Biobank code 1747 gen haircolour=n_1747_0_0 label variable haircolour "Natural hair colour before greying in whites" tab haircolour misstable sum haircolour recode haircolour (-1 = 9) (-3 = 9) (.=9) replace haircolour=6 if ethnicity5!=1 *change ref to light brown recode haircolour 1=3 3=1 label define haircolour 1 "Light Brown" 2 "Red" 3 "Blonde" 4 "Dark Brown" 5 "Black" 6 "Other" 9 "Missing/Unknown/PNTS" label values haircolour haircolour tab haircolour *Dichotomize for het analysis gen haircolour2=haircolour recode haircolour2 3=1 4=1 5=1 6=1 tab haircolour2 haircolour **REGION gen region = . label variable region "Region of Recruitment" replace region = 1 /* LONDON */ if demogr_asses_centre==11012 | demogr_asses_centre==11018 | demogr_asses_centre==11019 | demogr_asses_centre==11020 replace region = 2 /* WALES */ if demogr_asses_centre==11023 | demogr_asses_centre==11022 | demogr_asses_centre==11003 replace region = 3 /* NORTH-WEST */ if demogr_asses_centre==10003 | demogr_asses_centre==11001 | demogr_asses_centre==11016 | demogr_asses_centre==11008 replace region = 4 /* NORTH-EAST */ if demogr_asses_centre==11009 | demogr_asses_centre==11017 replace region = 5 /* YORKSHIRE & HUMBER */ if demogr_asses_centre==11010 | demogr_asses_centre==11014 replace region = 6 /* WEST MIDLANDS */ if demogr_asses_centre==11006 | demogr_asses_centre==11021 replace region = 7 /* EAST MIDLANDS */ if demogr_asses_centre==11013 | demogr_asses_centre==11015 replace region = 8 /* SOUTH-EAST */ if demogr_asses_centre==11002 | demogr_asses_centre==11007 replace region = 9 /* SOUTH-WEST */ if demogr_asses_centre==11011 replace region = 10 /* SCOTLAND */ if demogr_asses_centre==11004 | demogr_asses_centre==11005 label define region 1 "London" 2 "Wales" 3 "North-West England" 4 "North-Eastern England" 5 "Yorkshire & the Humber" /// 6 "West Midlands" 7 "East Midlands" 8 "South-East England" 9 "South-West England" 10 
"Scotland", replace label values region region tab region *center tab region demogr_asses_centre **HRT use* recode hrt_use (-3/-1=9) recode hrt_use .=9 if sex==0 recode hrt_use 1=2 if hrt_agestopped==-11 recode hrt_use 1=2 if hrt_agestopped==age_recruitment tab hrt_agestopped age_recruitment label def hrt_useL 0"No" 1"Former" 2"Current" 9"Missing", replace label val hrt_use hrt_useL codebook hrt_use if sex==0 **Oral contraceptive use recode oca_use (-3/-1=9) *changing current users* recode oca_use 1=2 if n_2804_0_0==-11 label def oca_useL 0"No" 1"Former" 2"Current" 9"Missing", replace label val oca_use oca_useL recode oca_use .=9 if sex==0 codebook oca_use if sex==0 ** Ethnic group separating asians* gen ethnicity6 =ethnicity5 recode ethnicity6 5=6 4=5 recode ethnicity6 3=4 if ethnicity==5 recode ethnicity6 3=4 if ethnicity==3 recode ethnicity6 3=4 if ethnicity==3004 label define ethnicity6L 1 "White" 2 "Mixed Race" 3 "Indian/Pakistani/Bangh" 4"Chinese, Asian or other Asian" 5 "Black or Black British" 6 "Other" 9 "Missing/Unknown/PNTS" label values ethnicity6 ethnicity6L tab ethnicity ethnicity6 *========= Menopause =====================* ** Post-menopausal women rename n_3720_0_0 menstrutoday rename n_3700_0_0 lastmenstru recode menopause 0=1 if age_recruitment>=55 recode menopause 2=1 if age_recruitment>=55 recode menopause 3=1 if age_recruitment>=55 recode menopause .=1 if age_recruitment>=55 & sex==0 recode menopause 0=1 if bi_oophorectomy==1 recode menopause 2=1 if bi_oophorectomy==1 recode menopause 3=1 if bi_oophorectomy==1 recode menopause .=1 if bi_oophorectomy==1 & sex==0 *Pre-menopausal recode menopause 3=0 if age_recruitment<50 & hrt_use==0 & bi_oophorectomy==0 recode menopause 2=0 if age_recruitment<50 & hrt_use==0 & bi_oophorectomy==0 recode menopause .=0 if age_recruitment<50 & hrt_use==0 & bi_oophorectomy==0 recode menopause 3=0 if menstrutoday==1 & age_recruitment<50 recode menopause 2=0 if menstrutoday==1 & age_recruitment<50 recode menopause .=0 
if menstrutoday==1 & age_recruitment<50 & sex==0 *Unknown category recode menopause 3=2 recode menopause 0=2 if hrt_use !=0 | bi_oophorectomy!=0 | hysterectomy!=0 | age_recruitment>=50 recode lastmenstru (-3/-1=.) egen menstrucycle = cut (lastmenstru), at (0,6,11,15,19,25,370) icodes label define menstrucycleL 0 "0-5 Early Follicular" 1 "6-10 Late Follicular" 2 " 11-14 Mid Cycle" 3 "15-18 Early luteal" 4 " 19-24 Mid Luteal" 5 "25+ Late Luteal" 9"Missing", replace label val menstrucycle menstrucycleL recode menstrucycle 1=0 2=0 3=0 4=0 5=0 9=0 if menstrutoday==1 recode menstrucycle 0=. 1=. 2=. 3=. 4=. 5=. 9=. if menopause!=0
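
*Optional sanity checks (a minimal sketch, not part of the original derivations).
*Assumption: the variables created above exist under exactly these names and
*follow the codings defined in this do-file. Each assert stops the do-file with
*an error if a derived variable holds a value outside its expected set.
assert inlist(hypertension, 1, 2, 9)
assert inlist(METhrs_perwkG, 1, 2, 3, 9)
assert inlist(famhist_pcadet, 1, 2, 3, 4, 9)
*Cycle phase should be set only for pre-menopausal women (menopause==0)
assert menstrucycle==. if menopause!=0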