# Nice to Meat You : An Analysis on Meat Consumption

## Introduction

Meat consumption is a highly controversial topic in nowadays society in terms of [environment](https://www.theguardian.com/environment/2018/oct/10/huge-reduction-in-meat-eating-essential-to-avoid-climate-breakdown), [health](https://gamechangersmovie.com/), and [ethical](http://traslosmuros.com/en/slaughterhouse-documentary/) reasons. This situation introduces different ways of consuming meat and divides people according to their behaviors such as vegan, vegetarian, occasional meat consumer, and people consuming large amounts of meats.

In this project, we are particularly interested in the factors related to consuming meat and how the behavior of households varies according to their meat consumption. We are going to focus on demographic information such as economic status, family composition, age, and their relation with consuming meat. Moreover, we are going to study external effects that are related to meat consumption such as seasonal changes, campaigns, and discounts. Finally, we are going to categorize households according to their meat consumption and perform a behavior analysis per category.

To do so, we are going to use the Dunnhumby dataset. This dataset contains shopping information collected over two years from a group of 2,500 households as well as their information for each household. Moreover, we are planning to expand our analysis with an additional dataset about nutritional information.


## Factors that influence meat consumption

### Loading the data

In [1]:
%matplotlib inline
import pandas as pd
import numpy as np
import re
from matplotlib.ticker import MaxNLocator
import matplotlib.pyplot as plt
from requests import get
from bs4 import BeautifulSoup

The first step is to select the meat products in the products dataset. The products are classified at three different categories levels. The broadest category is 'DEPARTMENT'.

In [39]:
products = pd.read_csv('dunnhumby/product.csv', sep = ',')
products['DEPARTMENT'].unique()

array(['GROCERY', 'MISC. TRANS.', 'PASTRY', 'DRUG GM', 'MEAT-PCKGD',
       'SEAFOOD-PCKGD', 'PRODUCE', 'NUTRITION', 'DELI', 'COSMETICS',
       'MEAT', 'FLORAL', 'TRAVEL & LEISUR', 'SEAFOOD', 'MISC SALES TRAN',
       'SALAD BAR', 'KIOSK-GAS', 'ELECT &PLUMBING', 'GRO BAKERY',
       'GM MERCH EXP', 'FROZEN GROCERY', 'COUP/STR & MFG', 'SPIRITS',
       'GARDEN CENTER', 'TOYS', 'CHARITABLE CONT', 'RESTAURANT', 'RX',
       'PROD-WHS SALES', 'MEAT-WHSE', 'DAIRY DELI', 'CHEF SHOPPE', 'HBC',
       'DELI/SNACK BAR', 'PORK', 'AUTOMOTIVE', 'VIDEO RENTAL', ' ',
       'CNTRL/STORE SUP', 'HOUSEWARES', 'POSTAL CENTER', 'PHOTO', 'VIDEO',
       'PHARMACY SUPPLY'], dtype=object)

In [40]:
# Select the meat departments
meat_dept = ['MEAT-PCKGD','MEAT','MEAT-WHSE','PORK']
meat_products = products[products['DEPARTMENT'].isin(meat_dept)].copy()

The 'DELI' department also contains meat but not exclusively. It is therefore necessary to select the 'COMMODITY_DESC' amongst the 'DELI' department containing exclusively meat.

In [41]:
deli_products = products[products['DEPARTMENT'] == 'DELI']
deli_products['COMMODITY_DESC'].unique()

array(['SALADS/DIPS', 'SANDWICHES', 'DELI MEATS', 'CHEESES',
       'CHICKEN/POULTRY', 'SNACKS', 'PREPARED FOOD', 'COFFEE SHOP',
       'DELI SPECIALTIES (RETAIL PK)', 'PARTY TRAYS', 'SERVICE BEVERAGE',
       'SUSHI', 'DELI SUPPLIES'], dtype=object)

In [106]:
# Select the meat commodities amongst the 'DELI' department
meat_commodity = ['DELI MEATS','CHICKEN/POULTRY']
meat_products.append(products[products['COMMODITY_DESC'].isin(meat_commodity)])
meat_products['SUB_COMMODITY_DESC'].unique()

array(['FRESH', 'FRZN BREADED PREPARED CHICK', 'BREAST - BONELESS(IQF)',
       'BOLOGNA', 'LINKS - COOKED', 'LUNCH COMBO', 'BETTER FOR YOU',
       'LINKS - RAW', 'CORN DOGS', 'KOSHER/SPECIALTY', 'CHICKEN WINGS',
       'HAM', 'CHICKEN BREAST BONE IN', 'SELECT BEEF', 'LOAVE',
       'PEPPERONI/SALAMI', 'SAUERKRAUT', 'MISCELLANEOUS', 'ENTREES',
       'PRIMAL', 'POULTRY', 'SMOKED/COOKED',
       'SMOKED/COOKED - BETTER FOR YOU', 'PREMIUM - BEEF',
       'GROUND TURKEY', 'PICKLES', 'BREAST - BONE-IN (IQF)', 'PREMIUM',
       'STUFFED/MIXED BEEF', 'PATTIES - RAW', 'ROLLS - PORK',
       'VARIETY PACK', 'PREMIUM - MEAT', 'SALADS', 'CHICKEN-FULLY COOKED',
       'HAMS-DRY CURED/COUNTRY', 'FRZN BURGERS/BBQ/MEATBALL',
       'ROLLS - FLAVORED/OTHER', 'ECONOMY', 'PRE-COOKED',
       'PORK-FULLY COOKED', 'CHICKEN DRUMS', 'ECONOMY - MEAT',
       'CHOICE BEEF', 'EXTERNAL', 'DRY', 'BUTTS', 'NATURAL BEEF',
       'WHOLE HENS (UNDER 15LBS)', 'SOUP/STEW', 'LEAN', 'CUBED MEATS',
       'CHICKEN BREA

With these steps, we have selected all the products containing only meat. A second step allows to select all the products containing meat.

In [105]:
# products containing meat selected from their 'COMMODITY_DESC' and 'SUB_COMMODITY_DESC' sections
cont_meat_commodity = ['MEAT - SHELF STABLE','FROZEN CHICKEN','FRZN MEAT/MEAT DINNERS']
cont_meat_sub_commodity = ['DELI TRAY:MEAT AND CHEESE','MEAT ADDED']
cont_meat = products[products['COMMODITY_DESC'].isin(cont_meat_commodity)]
cont_meat.append(products[products['SUB_COMMODITY_DESC'].isin(cont_meat_sub_commodity)])
# remove vegetarian meat options 
cont_meat = cont_meat[cont_meat['SUB_COMMODITY_DESC'].isin(['FRZN MEAT ALTERNATIVES' ,'FROZEN MEAT (VEGETARIAN)']) == 0].copy()
cont_meat['SUB_COMMODITY_DESC'].unique()

array(['FRZN SS PREMIUM ENTREES/DNRS/T', 'FRZN MULTI SERVE ENTREES ALL',
       'MICROWAVABLE CUPS', 'CHILI: CANNED', 'BEEF STEW',
       'HOT DOG CHILI SAUCE', 'PASTA: CANNED',
       'FRZN BREADED PREPARED CHICK', 'FRZN SS PREMIUM ENTREES/DNRS/N',
       'SANDWICH SAUCE', 'VIENNA SAUSAGE',
       'SS ECONOMY ENTREES/DINNERS ALL', 'CHUNK MEATS - ALL',
       'LUNCHEON MEAT', 'FROZEN PASTA', 'CHICKEN-FULLY COOKED',
       'TAMALES (STOCKED N/CANNED MEAT', 'SNACKS/APPETIZERS', 'MICROWAVE',
       'HASH: CANNED', 'POULTRY - STEW W/DUMPLINGS/ A',
       'BEEF/PORK - DRIED SLICED W/GRA', 'POTTED MEATS AND SPREADS',
       'FROZEN ENTREES', 'CORN BEEF', 'MISC CND MEATS',
       'FRZN REGIONAL/OTHER', 'PIZZA/PREMIUM',
       'FRZN BURGERS/BBQ/MEATBALL', 'PORK-FULLY COOKED', 'KITES',
       'GRASS/SHRED'], dtype=object)

In [98]:
products.groupby('DEPARTMENT').count()['PRODUCT_ID'].sort_values(ascending = False)

DEPARTMENT
GROCERY            39021
DRUG GM            31529
PRODUCE             3118
COSMETICS           3011
NUTRITION           2914
MEAT                2544
MEAT-PCKGD          2427
DELI                2354
PASTRY              2149
FLORAL               938
SEAFOOD-PCKGD        563
MISC. TRANS.         490
SPIRITS              377
SEAFOOD              369
GARDEN CENTER        128
RESTAURANT           102
MISC SALES TRAN       88
SALAD BAR             48
COUP/STR & MFG        39
TRAVEL & LEISUR       28
FROZEN GROCERY        23
KIOSK-GAS             16
                      15
CHEF SHOPPE           14
RX                     9
CNTRL/STORE SUP        4
POSTAL CENTER          3
DAIRY DELI             3
TOYS                   3
VIDEO RENTAL           3
GM MERCH EXP           3
PHOTO                  2
DELI/SNACK BAR         2
PROD-WHS SALES         2
GRO BAKERY             2
CHARITABLE CONT        2
AUTOMOTIVE             2
VIDEO                  2
PORK                   1
ELECT &PLUMBIN

In [107]:
meat_products.to_csv('data/meat_products.csv')
cont_meat.to_csv('data/cont_meat.csv')