# kNN "Veganization"
Replace an ingredient with its k nearest _vegan_ neighbors in the USDA ingredient nutrition adjacency matrix. The idea is that neighbors in this matrix have similar nutritional values, so they should make plausible substitutes. Non-vegan ingredients are masked out before ranking, so only vegan replacements are suggested.
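
To make the idea concrete, here is a minimal sketch on a made-up 4-ingredient matrix. The names `toy_names`, `toy_adj`, `toy_vegan` and `toy_vegan_neighbors` and all of the weights are hypothetical and are not taken from the USDA data; the sketch assumes, as the notebook's ranking does, that a larger weight means a closer neighbor. Unlike the notebook's function, it also drops any zero-weight candidates from the result.

```python
import numpy as np

# Hypothetical toy data (not from the USDA matrix): higher weight = more similar.
toy_names = np.array(["beef", "tofu", "chicken", "lentils"])
toy_adj = np.array([
    [0.0, 0.6, 0.9, 0.4],
    [0.6, 0.0, 0.5, 0.7],
    [0.9, 0.5, 0.0, 0.3],
    [0.4, 0.7, 0.3, 0.0],
])
toy_vegan = np.array([False, True, False, True])

def toy_vegan_neighbors(adj, idx, k, vegan):
    """Indices of up to k vegan neighbors of row idx, highest weight first."""
    weights = adj[idx].astype(float)   # astype returns a copy, so adj is untouched
    weights[~vegan] = 0.0              # mask out non-vegan candidates
    order = np.argsort(-weights)[:k]   # descending by weight
    return order[weights[order] > 0]   # drop candidates with zero weight

# Chicken has the highest raw weight for beef, but it is masked out.
print(toy_names[toy_vegan_neighbors(toy_adj, 0, 2, toy_vegan)])
```

In this toy case the two suggestions for beef are tofu and lentils, in that order.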

In [1]:
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [6]:
nutrition_adj_mat = np.load('../data/Adjacency_Matrix_5.npy')
combined_adj_mat = np.load('../data/Adjacency_Matrix_9.npy')
ingredient_list = np.load('../data/Ingredient_List_USDA_Nutrition_Info.npy')
vegan_ingredient_mask = np.load('../data/Vegan_Ingredient_Mask_USDA_Nutrition_Info.npy')

In [7]:
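def suggestKBestVeganReplacements(nutrition_adj_mat, ingr_number, k, nonvegan_mask):
    """Return the indices of the k nearest vegan neighbors of an ingredient, ranked by adjacency weight."""
    # Copy the row: a plain slice is a view, so masking it would zero the caller's matrix in place.
    adj_values = nutrition_adj_mat[ingr_number, :].copy()
    adj_values[nonvegan_mask] = 0  # exclude non-vegan ingredients from the ranking
    if np.all(adj_values == 0):
        return []  # no vegan ingredient has a nonzero weight
    # Sort by descending weight and keep the indices of the top k.
    return (-adj_values).argsort()[:k]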
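
One NumPy detail matters for a function that masks a row like this: basic slicing of a row returns a view, so writing zeros into the slice also writes them into the matrix it came from, unless the row is copied first. A small sketch with a hypothetical array `demo` (not one of the notebook's matrices):

```python
import numpy as np

demo = np.array([[1.0, 2.0], [3.0, 4.0]])  # hypothetical weights

row_view = demo[0, :]          # basic slicing: a view onto demo
row_view[1] = 0.0
print(demo[0, 1])              # the original matrix was modified

row_copy = demo[1, :].copy()   # explicit copy: independent of demo
row_copy[1] = 0.0
print(demo[1, 1])              # the original matrix is unchanged
```

Without a copy, every call would permanently erase the non-vegan weights of the queried row in the loaded adjacency matrix.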

In [8]:
# Invert the vegan mask so that True marks the ingredients to exclude
nonvegan_mask = [not i for i in vegan_ingredient_mask]

In [9]:
# Index 25 is beef
ingr_number = 25
k = 5
maxvals2 = suggestKBestVeganReplacements(nutrition_adj_mat, ingr_number, k, nonvegan_mask)
print("Original Ingredient: {0}".format(ingredient_list[ingr_number]))
if len(maxvals2) == 0:
    print("No replacements found!")
else:
    print("{0} best replacements: {1}".format(k, ingredient_list[maxvals2]))

Original Ingredient: beef, rib eye steak, boneless, lip off, separable lean and fat, trimmed to 0" fat, all grades, cooked, grilled
5 best replacements: ['edamame, frozen, prepared'
 'tofu, raw, regular, prepared with calcium sulfate'
 'beans, kidney, all types, mature seeds, raw'
 "leavening agents, yeast, baker's, active dry" 'mushrooms, white, raw']


Now try the same lookup with the combined recipe/nutrition adjacency matrix.

In [31]:
# Index 65 is soft goat cheese
ingr_number = 65
k = 5
maxvals2 = suggestKBestVeganReplacements(combined_adj_mat, ingr_number, k, nonvegan_mask)
print("Original Ingredient: {0}".format(ingredient_list[ingr_number]))
if len(maxvals2) == 0:
    print("No replacements found!")
else:
    print("{0} best replacements: {1}".format(k, ingredient_list[maxvals2]))

Original Ingredient: cheese, goat, soft type
5 best replacements: ['seeds, sesame seeds, whole, dried'
 'seeds, sesame butter, tahini, from roasted and toasted kernels (most common type)'
 'nuts, almonds' 'seeds, hemp seed, hulled'
 'seeds, pumpkin and squash seeds, whole, roasted, without salt']
