This workbook implements content-based filtering for a cat recommendation system, working from the raw data all the way to model creation and initial results.

# Table of Contents

* [Load in Data and Segment Features from Context data](#segment)
* [Pre-process feature data](#pre-process)
    - [Recheck sweetviz for distinct values and types](#sweet)
    - [Make Pipeline](#pp_pipeline)
* [Run Modeling Pipeline](#run_pipeline)
* [Conclusion and Next Steps](#conclusion)

# Load in Data and Segment Features from Context data<a id='segment'></a>

First, we load all of our adoptable cats.

In [1]:
import pandas as pd
cats_DF = pd.read_csv("../data/raw/version0_5/Adoptable_cats_20221125.csv",header=0,index_col=0)
cats_DF.shape


(49600, 50)

In [2]:
pd.set_option('display.max_columns', 500)
cats_DF.sample(3)

Unnamed: 0,id,organization_id,url,type,species,age,gender,size,coat,tags,name,description,organization_animal_id,photos,primary_photo_cropped,videos,status,status_changed_at,published_at,distance,breeds.primary,breeds.secondary,breeds.mixed,breeds.unknown,colors.primary,colors.secondary,colors.tertiary,attributes.spayed_neutered,attributes.house_trained,attributes.declawed,attributes.special_needs,attributes.shots_current,environment.children,environment.dogs,environment.cats,contact.email,contact.phone,contact.address.address1,contact.address.address2,contact.address.city,contact.address.state,contact.address.postcode,contact.address.country,animal_id,animal_type,organization_id.1,primary_photo_cropped.small,primary_photo_cropped.medium,primary_photo_cropped.large,primary_photo_cropped.full
6913,58957689,IL252,https://www.petfinder.com/cat/katara-58957689/...,Cat,Cat,Baby,Female,Small,Short,"['Friendly', 'Affectionate', 'Playful', 'Funny...",Katara,Katara is an adorable 4 month old tabby on the...,,[{'small': 'https://dl5zpyw5k3jeb.cloudfront.n...,,[],adoptable,2022-11-24T16:34:09+0000,2022-11-24T16:34:08+0000,,Domestic Short Hair,,True,False,Tabby (Brown / Chocolate),Tabby (Buff / Tan / Fawn),,True,True,False,False,True,True,,True,catnapvet@yahoo.com,,1101 Beach Avenue,,LaGrange Park,IL,60526,US,58957689,cat,il252,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...
46296,58694664,NV234,https://www.petfinder.com/cat/minina-58694664/...,Cat,Cat,Adult,Female,Medium,,[],Minina,,C2022145,[{'small': 'https://dl5zpyw5k3jeb.cloudfront.n...,,[],adoptable,2022-10-29T01:12:44+0000,2022-10-29T01:12:44+0000,,Domestic Short Hair,,False,False,,,,True,True,False,False,True,,,True,info@rtcatcafe.org,(702) 629-6351,4155 N Rancho Dr,#105,Las Vegas,NV,89130,US,58694664,cat,nv234,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...
606,58978934,GA788,https://www.petfinder.com/cat/frankie-58978934...,Cat,Cat,Baby,Male,Medium,,[],Frankie,Frankie is a soft little love bug. He was the ...,RF22-580,[{'small': 'https://dl5zpyw5k3jeb.cloudfront.n...,,[],adoptable,2022-11-27T22:10:37+0000,2022-11-27T22:10:35+0000,,Domestic Short Hair,,False,False,,,,False,True,False,False,False,True,,True,kittykonnection12@gmail.com,,,,Evans,GA,30809,US,58978934,cat,ga788,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...,https://dl5zpyw5k3jeb.cloudfront.net/photos/pe...


In [3]:
cats_DF.columns

Index(['id', 'organization_id', 'url', 'type', 'species', 'age', 'gender',
       'size', 'coat', 'tags', 'name', 'description', 'organization_animal_id',
       'photos', 'primary_photo_cropped', 'videos', 'status',
       'status_changed_at', 'published_at', 'distance', 'breeds.primary',
       'breeds.secondary', 'breeds.mixed', 'breeds.unknown', 'colors.primary',
       'colors.secondary', 'colors.tertiary', 'attributes.spayed_neutered',
       'attributes.house_trained', 'attributes.declawed',
       'attributes.special_needs', 'attributes.shots_current',
       'environment.children', 'environment.dogs', 'environment.cats',
       'contact.email', 'contact.phone', 'contact.address.address1',
       'contact.address.address2', 'contact.address.city',
       'contact.address.state', 'contact.address.postcode',
       'contact.address.country', 'animal_id', 'animal_type',
       'organization_id.1', 'primary_photo_cropped.small',
       'primary_photo_cropped.medium', 'primary_photo

Drop animals with no pictures, since pictures are key to our 'Tinder-like' app experience.

In [4]:
cats_DF = cats_DF.dropna(subset=['primary_photo_cropped.full'])# drop rows with 0 pictures
cats_DF.shape # matches na count via sweet viz for cats

(46805, 50)

Next we separate the dataframe into features to model over and context data that can be shown to the user for any matches. 'id' will be our shared key between the two tables.

Of note, the 'distance' field and 'primary_photo_cropped.full' field will be useful data for future model enhancements. For the model baseline, we will simply use textual data and assume a 0 distance for all pets.

In [5]:
contextCols = ['id','organization_id','url','type','tags','name','description','organization_animal_id',
              'photos','primary_photo_cropped','videos','status','status_changed_at','published_at',
              'distance','contact.email', 'contact.phone', 'contact.address.address1',
               'contact.address.address2', 'contact.address.city','contact.address.state', 
               'contact.address.postcode','contact.address.country', 'animal_id', 'animal_type',
               'organization_id.1', 'primary_photo_cropped.small','primary_photo_cropped.medium',
               'primary_photo_cropped.large','primary_photo_cropped.full']
featureCols = ['id','age','gender','size','coat','breeds.primary', 'breeds.secondary','breeds.mixed',
              'breeds.unknown','colors.primary','colors.secondary','colors.tertiary',
              'attributes.spayed_neutered','attributes.house_trained','attributes.declawed',
              'attributes.special_needs','attributes.shots_current','environment.children',
              'environment.dogs','environment.cats','type','contact.address.postcode']
cats_DF_features = cats_DF[featureCols]
cats_DF_context = cats_DF[contextCols]
cats_DF_features.shape

(46805, 22)
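As a minimal sketch of how the shared 'id' key ties the two tables together, the example below splits a toy dataframe the same way and then looks up display data for a list of matched ids. The toy rows, the `matched_ids` list, and the example URLs are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Toy stand-in for cats_DF; all values are invented for illustration.
toy = pd.DataFrame({
    "id": [101, 102, 103],
    "age": ["Baby", "Adult", "Young"],
    "gender": ["Female", "Male", "Female"],
    "name": ["Ash", "Birch", "Cedar"],
    "url": ["https://example.org/101", "https://example.org/102", "https://example.org/103"],
})

toy_features = toy[["id", "age", "gender"]]  # columns to model over
toy_context = toy[["id", "name", "url"]]     # columns to show the user

# Suppose the model matched these ids (hypothetical); fetch their display data by key.
matched_ids = [103, 101]
matches = toy_context.set_index("id").loc[matched_ids].reset_index()
print(matches)
```

Because both tables carry 'id', the feature table can be reshaped or encoded freely without losing the link back to names, photos, and URLs.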

Let's sanity check our missing values now that we just have cats and remove any columns with too many missing values.

In [6]:
valueCounts = cats_DF_features.set_index('type').isna().groupby(level=0).sum()/cats_DF_features.shape[0] # level=0 refers to our index, which we made 'type'


In [7]:
pd.set_option('display.max_columns', 500)
valueCounts 

type,id,age,gender,size,coat,breeds.primary,breeds.secondary,breeds.mixed,breeds.unknown,colors.primary,colors.secondary,colors.tertiary,attributes.spayed_neutered,attributes.house_trained,attributes.declawed,attributes.special_needs,attributes.shots_current,environment.children,environment.dogs,environment.cats,contact.address.postcode
Cat,0.0,0.0,0.0,0.0,0.616857,0.0,0.899925,0.0,0.0,0.393548,0.746523,0.916035,0.0,0.0,0.0,0.0,0.0,0.737015,0.829954,0.588612,2.1e-05


In [8]:
valueCounts = cats_DF_context.set_index('type').isna().groupby(level=0).sum()/cats_DF_context.shape[0] # level=0 refers to our index, which we made 'type'


In [9]:
pd.set_option('display.max_columns', 500)
valueCounts 

type,id,organization_id,url,tags,name,description,organization_animal_id,photos,primary_photo_cropped,videos,status,status_changed_at,published_at,distance,contact.email,contact.phone,contact.address.address1,contact.address.address2,contact.address.city,contact.address.state,contact.address.postcode,contact.address.country,animal_id,animal_type,organization_id.1,primary_photo_cropped.small,primary_photo_cropped.medium,primary_photo_cropped.large,primary_photo_cropped.full
Cat,0.0,0.0,0.0,0.0,2.1e-05,0.262365,0.313022,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.051405,0.193804,0.371499,0.923534,0.0,0.0,2.1e-05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


After a quick NA check, we will remove 'coat', 'breeds.secondary', 'colors.secondary', 'colors.tertiary', 'environment.children', 'environment.dogs' and 'environment.cats', each of which is missing in more than half of the rows. The column 'colors.primary' is also missing many values (about 39%), but it is kept for now because it helps distinguish one cat from another. 'type' is dropped as well, since every remaining row is a cat. Additionally, we keep 'contact.address.postcode' as an initial attempt to match nearby cats together.
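The column selection can also be done programmatically. The sketch below is one possible approach, assuming a cutoff of 0.5 on the missing fraction; that threshold and the helper name `columns_to_drop` are assumptions for illustration, not something the notebook defines. On the fractions reported above, a 0.5 cutoff would pick out the same seven columns while leaving 'colors.primary' in place.

```python
import numpy as np
import pandas as pd

MISSING_THRESHOLD = 0.5  # assumed cutoff, not taken from the notebook

def columns_to_drop(df, threshold=MISSING_THRESHOLD, keep=()):
    """Return columns whose fraction of missing values exceeds the threshold."""
    missing_fraction = df.isna().mean()  # per-column share of NaN values
    too_sparse = missing_fraction[missing_fraction > threshold].index
    return [col for col in too_sparse if col not in keep]

# Toy data with invented values.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "coat": ["Short", np.nan, np.nan, np.nan],               # 75% missing
    "colors.primary": ["Black", np.nan, "White", "Torbie"],  # 25% missing
    "age": ["Baby", "Adult", "Young", "Baby"],               # 0% missing
})

drop_cols = columns_to_drop(toy)
toy_reduced = toy.drop(columns=drop_cols)
print(drop_cols)
```

The `keep` argument is there for columns that should survive regardless of sparsity, for example a column judged too useful to lose.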

In [10]:
featureCols = ['id','age','gender','size','breeds.primary','breeds.mixed','breeds.unknown',
               'colors.primary','attributes.spayed_neutered','attributes.house_trained',
               'attributes.declawed','attributes.special_needs','attributes.shots_current',
               'contact.address.postcode']
cats_DF_features = cats_DF[featureCols]
cats_DF_context = cats_DF[contextCols]
cats_DF_features.shape

(46805, 14)

In [11]:
cats_DF_features.dtypes

id                             int64
age                           object
gender                        object
size                          object
breeds.primary                object
breeds.mixed                    bool
breeds.unknown                  bool
colors.primary                object
attributes.spayed_neutered      bool
attributes.house_trained        bool
attributes.declawed             bool
attributes.special_needs        bool
attributes.shots_current        bool
contact.address.postcode      object
dtype: object

# Pre-process feature data<a id='pre-process'></a>

## Recheck sweetviz for distinct values and types<a id='sweet'></a>

First, let's re-examine our dataframe for distinct values.

In [12]:
cats_DF_features.head(3)

Unnamed: 0,id,age,gender,size,breeds.primary,breeds.mixed,breeds.unknown,colors.primary,attributes.spayed_neutered,attributes.house_trained,attributes.declawed,attributes.special_needs,attributes.shots_current,contact.address.postcode
1,58980784,Baby,Male,Medium,Tuxedo,False,False,Black & White / Tuxedo,True,True,False,False,True,37343
13,58980778,Baby,Male,Medium,Domestic Short Hair,False,False,Black,True,True,False,False,True,92057
14,58980506,Young,Female,Medium,Domestic Short Hair,False,False,Torbie,True,True,False,False,True,50126


In [13]:
cats_DF_features['colors.primary'].values[0]

'Black & White / Tuxedo'

In [14]:
# make a special version without postcode so sweetviz can handle it, since postcode has both numbers and letters
featureCols = ['id','age','gender','size','breeds.primary','breeds.mixed','breeds.unknown',
               'colors.primary','attributes.spayed_neutered','attributes.house_trained',
               'attributes.declawed','attributes.special_needs','attributes.shots_current']
cats_DF_features_test = cats_DF[featureCols]
cats_DF_context_test = cats_DF[contextCols]
cats_DF_features_test.shape

(46805, 13)

In [15]:
import sweetviz as sv

cat_data_report = sv.analyze(cats_DF_features_test)
cat_data_report.show_html() #save to html document


Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.


## Make Pipeline <a id='pp_pipeline'></a>

In [16]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.preprocessing import OneHotEncoder

In [17]:
def remove_columns_with_1_distinct(df):
    drop_col = [e for e in df.columns if df[e].nunique()==1]
    df_return = df.drop(drop_col,axis=1)
    return df_return


In [18]:
def drop_duplicates(df):
    df_return = df.drop_duplicates()
    return df_return
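Since `FunctionTransformer` is already imported, one way to chain these two helpers is to wrap each in a `FunctionTransformer` and place them in a small `Pipeline`. This is only a sketch on toy data; the step names and the `cleaning` variable are illustrative choices, and the helpers are repeated here so the block runs on its own.

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer

def remove_columns_with_1_distinct(df):
    drop_col = [e for e in df.columns if df[e].nunique() == 1]
    return df.drop(drop_col, axis=1)

def drop_duplicates(df):
    return df.drop_duplicates()

# With the default validate=False, the DataFrame is handed to each function as-is.
cleaning = Pipeline(steps=[
    ("dedupe", FunctionTransformer(drop_duplicates)),
    ("drop_constant", FunctionTransformer(remove_columns_with_1_distinct)),
])

# Toy data with invented values: one duplicated row and one constant column.
toy = pd.DataFrame({
    "id": [1, 2, 2],
    "age": ["Baby", "Adult", "Adult"],
    "type": ["Cat", "Cat", "Cat"],
})
cleaned = cleaning.fit_transform(toy)
print(cleaned)
```

Note that a transformer which drops rows only works cleanly when no target `y` is passed alongside, which is the case in this workbook.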


In [19]:
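from sklearn.metrics.pairwise import linear_kernel 

def cosine_similarities(df_1,df_2):
    # df_1: one-hot encoded feature matrix; df_2: dataframe with an 'id' column and a 0..n-1 index
    # linear_kernel is a plain dot product; on one-hot rows it counts the matching features
    cs_simil = linear_kernel(df_1,df_1)
    results = {}
    ds = df_2 # needs id column
    for idx, row in ds.iterrows():
        similar_indices = cs_simil[idx].argsort()[:-100:-1] # 99 highest-scoring rows, best first
        similar_items = [(cs_simil[idx][i], ds['id'][i]) for i in similar_indices] 
        results[row['id']] = similar_items[1:] # skip the first entry, assumed to be the cat itself
    return results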

#cosineSimilarity = FunctionTransformer(cosine_similarities)
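The function above is named for cosine similarity but calls `linear_kernel`, which is an unnormalized dot product. On one-hot encoded rows, where every row has the same number of ones, the two differ only by a constant factor. The toy example below illustrates this with three invented cats and three features; the data is made up and only meant to show the relationship.

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel
from sklearn.preprocessing import OneHotEncoder

# Three toy cats described by three categorical features (illustrative values).
toy = np.array([
    ["Baby", "Male", "Medium"],
    ["Baby", "Male", "Small"],
    ["Adult", "Female", "Large"],
])
encoded = OneHotEncoder().fit_transform(toy)

dot_scores = linear_kernel(encoded, encoded)      # count of matching features
cos_scores = cosine_similarity(encoded, encoded)  # the same counts divided by 3

print(dot_scores)
print(cos_scores)
```

With 12 categorical features, the dot product therefore ranges from 0 to 12, and the ranking of neighbours is the same under either measure.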

In [20]:
def item(id,df):  
    ds = df
    colsGrab = ['id']
    return ds.loc[ds['id'] == id][colsGrab].values[0] # looks up the row for this id in the context dataframe and returns its id

def url(id,df):  
    ds = df
    colsGrab = ['url']
    return ds.loc[ds['id'] == id][colsGrab].values[0] # returns the listing url for this id

def picture(id,df):  
    ds = df
    colsGrab = ['primary_photo_cropped.full']
    return ds.loc[ds['id'] == id][colsGrab].values[0] # returns the full-size primary photo url for this id

def recommend(item_id, num,df,reccs):
    print("Recommending " + str(num) + " cats similar to " + str(item(item_id,df)) + "... " 
          + picture(item_id,df) + " - " + url(item_id,df))   
    print("-------")    
    recs = reccs[item_id][:num]   
    for rec in recs: 
        print("Recommended: " + str(item(rec[1],df)) + " (score:" +      str(rec[0]) + ") " 
              + picture(rec[1],df) + " - " + url(rec[1],df))
    
def score(reccs, num):
    print("Finding average reccomendation score for top 5 reccomendations per example")
    results = []
    for key in reccs.keys():
        subRecs = reccs[key][:num]
        for r in subRecs:
            results.append(r[0])
    averageRecc = sum(results) / len(results)
    print("There are "+ str(len(results)) + 'results with a sum of' + str(sum(results)) + 'and and average of: ' 
          + str(averageRecc) )
    return averageRecc

In [21]:
categorical_features = ['age','gender','size','breeds.primary','breeds.mixed',
                        'colors.primary','attributes.spayed_neutered','attributes.house_trained',
                        'attributes.declawed','attributes.special_needs','attributes.shots_current',
                        'contact.address.postcode']

categorical_transformer = OneHotEncoder()

In [22]:
# Not used currently but kept for future when distance is more properly implemented
numerical_features = ['id']
numeric_transformer = 'passthrough' # change nothing; ColumnTransformer accepts 'passthrough' but not a bare lambda

In [23]:
from sklearn.compose import ColumnTransformer

preprocessor = ColumnTransformer(
    transformers = [
        #("num", numeric_transformer,numerical_features), 
        ("cat", categorical_transformer, categorical_features),
    ])

In [24]:
from sklearn.base import BaseEstimator

class ContentBasedRecommendor(BaseEstimator):
    def __init__(self):
        pass # constructor not needed for anything yet

    def fit(self,X,y=None):
        #print(X.shape)
        #self.X = X
        #self.y = y
        return cosine_similarities(X,y) 
    
    #def transform(self):
        #pass

    def predict(self,X,num,context_df,reccs):
        item_id = X['id'].values[0]
        return recommend(item_id, num,context_df,reccs) # pass the extracted id, not the dataframe
    
    #def score(self: ContentBasedRecommendor, item_id,num,df_context,reccs):
        

In [25]:
model = Pipeline(
    steps=[("preprocessor", preprocessor),
           ("model", ContentBasedRecommendor())
          ])
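One thing to keep in mind with this setup is that `Pipeline.fit` returns the pipeline itself, so whatever the final step's `fit` returns is discarded. A possible alternative, sketched below under that constraint, is an estimator that stores the similarity matrix on itself in `fit` and exposes a separate lookup method. The class name `CatalogRecommender`, its parameters, and the `recommend` method are hypothetical and are not the workbook's implementation; the encoder is placed inside the estimator purely to keep the sketch short.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator
from sklearn.metrics.pairwise import linear_kernel
from sklearn.preprocessing import OneHotEncoder

class CatalogRecommender(BaseEstimator):  # hypothetical name
    def __init__(self, feature_cols, id_col="id"):
        self.feature_cols = feature_cols
        self.id_col = id_col

    def fit(self, X, y=None):
        # Encode the catalog and keep the pairwise scores on the estimator.
        self.encoder_ = OneHotEncoder().fit(X[self.feature_cols])
        self.ids_ = X[self.id_col].to_numpy()
        encoded = self.encoder_.transform(X[self.feature_cols])
        self.scores_ = linear_kernel(encoded, encoded)
        return self  # scikit-learn convention: fit returns the estimator

    def recommend(self, item_id, num=5):
        pos = int(np.flatnonzero(self.ids_ == item_id)[0])
        order = np.argsort(-self.scores_[pos], kind="stable")
        order = order[order != pos][:num]  # exclude the cat itself by position
        return [(float(self.scores_[pos, i]), int(self.ids_[i])) for i in order]

# Toy catalog with invented values.
toy = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "age": ["Baby", "Baby", "Adult", "Adult"],
    "gender": ["Male", "Male", "Male", "Female"],
})
recommender = CatalogRecommender(feature_cols=["age", "gender"]).fit(toy)
recs = recommender.recommend(item_id=1, num=2)
print(recs)
```

Excluding the query cat by position, rather than dropping the first sorted entry, avoids accidentally removing a different cat that ties for the top score.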

# Run Modeling Pipeline <a id='run_pipeline'></a>

In [26]:
import numpy as np
#target = 'todo' # will be rankings once we have them
#X, y = cats_DF_features.drop(columns=target), cats_DF_features[target]
X = cats_DF_features
X = drop_duplicates(X)
X = remove_columns_with_1_distinct(X)
X = X.replace(np.nan,'Not Available')
X["contact.address.postcode"]= X["contact.address.postcode"].astype(str)
X.dtypes

id                             int64
age                           object
gender                        object
size                          object
breeds.primary                object
breeds.mixed                    bool
colors.primary                object
attributes.spayed_neutered      bool
attributes.house_trained        bool
attributes.declawed             bool
attributes.special_needs        bool
attributes.shots_current        bool
contact.address.postcode      object
dtype: object

In [27]:
from sklearn.model_selection import train_test_split
import numpy as np
# split data
x, x_test = train_test_split(X,test_size=0.1,train_size=0.9, random_state=13)
x_train, x_dev = train_test_split(x,test_size = 0.1,train_size =0.9, random_state=13)

# given the way the model works so far, the x_dev and x_test are not used. 
# If you aren't in the catalog you can't be scored so for now,  just using x_train for initial model results
# Once we get user rankings, we can move the model to something that can use these additional sets.

In [28]:
x_train = x_train.reset_index(drop=True) # index reset required so model fitting can match keys
x_train.shape

(37835, 13)

In [29]:
x_dev.shape

(4204, 13)

In [30]:
x_test.shape

(4671, 13)
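As a rough sketch of how the held-out sets might eventually be used, the example below scores a cat that is not in the catalog against the catalog rows. It assumes the encoder is fit on the catalog only and uses `handle_unknown="ignore"` so that categories never seen in training (such as a new postcode) encode as all zeros rather than raising an error. The toy frames and variable names are illustrative, not part of the workbook.

```python
import pandas as pd
from sklearn.metrics.pairwise import linear_kernel
from sklearn.preprocessing import OneHotEncoder

feature_cols = ["age", "gender", "contact.address.postcode"]

# Toy catalog and one held-out cat; all values are invented.
catalog = pd.DataFrame({
    "id": [1, 2, 3],
    "age": ["Baby", "Adult", "Baby"],
    "gender": ["Male", "Female", "Female"],
    "contact.address.postcode": ["37343", "92057", "50126"],
})
held_out = pd.DataFrame({
    "id": [9],
    "age": ["Baby"],
    "gender": ["Male"],
    "contact.address.postcode": ["99999"],  # postcode not present in the catalog
})

# Fit on the catalog only; unseen categories are ignored at transform time.
encoder = OneHotEncoder(handle_unknown="ignore").fit(catalog[feature_cols])
catalog_enc = encoder.transform(catalog[feature_cols])
held_out_enc = encoder.transform(held_out[feature_cols])

scores = linear_kernel(held_out_enc, catalog_enc)  # shape (n_held_out, n_catalog)
best = scores[0].argsort()[::-1][:2]               # two highest-scoring catalog rows
nearest_ids = catalog["id"].to_numpy()[best]
print(nearest_ids, scores[0][best])
```

This only gives similarity scores, not a quality measure; evaluating recommendations on held-out cats would still need user rankings, as noted in the cell above.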

In [31]:
x_train.sample(3)

Unnamed: 0,id,age,gender,size,breeds.primary,breeds.mixed,colors.primary,attributes.spayed_neutered,attributes.house_trained,attributes.declawed,attributes.special_needs,attributes.shots_current,contact.address.postcode
21051,58919333,Baby,Male,Small,Domestic Short Hair,False,White,True,True,False,False,True,33138
25536,58943883,Baby,Female,Medium,Domestic Short Hair,True,Black,True,True,False,False,True,85249
10268,58796660,Young,Female,Small,Domestic Short Hair,False,Not Available,True,False,False,False,False,77053


In [32]:

categorical_features_test = ['age','gender','size','breeds.primary','breeds.mixed',
                        'colors.primary','attributes.spayed_neutered','attributes.house_trained',
                        'attributes.declawed','attributes.special_needs','attributes.shots_current',
                        'contact.address.postcode']
#x_train = x_train.replace(np.nan,'Not Available')
x_train = x_train.reset_index(drop=True) # required so keys work properly
xtrain_cat = x_train[categorical_features_test]
#xtrain_cat = xtrain_cat.replace(np.nan,'Not Available') 
ohe = OneHotEncoder().fit(xtrain_cat) # One Hot Encoding WAAAY better
x_train_test = ohe.transform(xtrain_cat) # don't need to add id columns because same columns preserved
#type(x_train_test)
#x_train_test.shape
test =cosine_similarities(x_train_test,x_train)


In [33]:
xtrain_cat.shape

(37835, 12)

In [34]:
x_train_test.shape

(37835, 3892)

In [35]:
type(x_train_test)
x_train_test.shape
x_train_test.todense()[1]

matrix([[0., 0., 0., ..., 0., 0., 0.]])

In [36]:
pd.options.display.max_colwidth = 100
recommend(item_id=58806733, num=5,df=cats_DF_context,reccs=test)

['Recommending 5 cats similar to [58806733]... https://dl5zpyw5k3jeb.cloudfront.net/photos/pets/58806733/1/?bust=1668025870 - https://www.petfinder.com/cat/palomino-58806733/nm/las-cruces/cats-meow-adoption-center-nm198/?referrer_id=c2f7479c-c7e8-422b-bfb4-7c0b8aed0e55']
-------
['Recommended: [58700306] (score:10.0) https://dl5zpyw5k3jeb.cloudfront.net/photos/pets/58700306/1/?bust=1667073140 - https://www.petfinder.com/cat/eli-58700306/in/miller-beach/humane-society-northwest-indiana-in191/?referrer_id=c2f7479c-c7e8-422b-bfb4-7c0b8aed0e55']
['Recommended: [58929837] (score:10.0) https://dl5zpyw5k3jeb.cloudfront.net/photos/pets/58929837/1/?bust=1669064723 - https://www.petfinder.com/cat/ron-burgundy-58929837/nc/hickory/fur-babies-rescue-nc1180/?referrer_id=c2f7479c-c7e8-422b-bfb4-7c0b8aed0e55']
['Recommended: [58740201] (score:10.0) https://dl5zpyw5k3jeb.cloudfront.net/photos/pets/58740201/1/?bust=1669443071 - https://www.petfinder.com/cat/luke-58740201/tx/pleasanton/atascosa-animal-al

In [37]:
# Gather average score of top 5 recommendations for training set, with a max score of 12!
score(reccs=test, num=5)

Finding average reccomendation score for top 5 reccomendations per example
There are 189175results with a sum of2074297.0and and average of: 10.964963657988635


10.964963657988635
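To put the average on a 0 to 1 scale, it can be divided by the maximum score of 12, which is the number of one-hot encoded categorical features. The short calculation below only reuses the figures printed above; the variable names are illustrative.

```python
n_cats = 37835           # rows in x_train
top_n = 5                # recommendations scored per cat
total_score = 2074297.0  # sum reported by score()
max_score = 12           # number of categorical features compared

n_results = n_cats * top_n
average = total_score / n_results
normalized = average / max_score  # share of features matched on average

print(n_results, round(average, 4), round(normalized, 4))
```

A normalized value close to 1 means that, on average, the top recommendations share nearly all 12 feature values with the chosen cat.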

**Below is in-progress (IP) code for a pipeline. It is still too buggy to use for an ML baseline.**

In [38]:
#categorical_features_test = ['age','gender','size','breeds.primary','breeds.mixed',
#                        'colors.primary','attributes.spayed_neutered','attributes.house_trained',
#                        'attributes.declawed','attributes.special_needs','attributes.shots_current',
#                        'contact.address.postcode']
#xtrain_cat = x_train[categorical_features_test]
#xtrain_cat.shape


In [39]:
#model = model.fit(X= x_train, y=x_train)
#savedMod = model.fit(X= xtrain_cat, y=x_train)

In [40]:
#item_id=58761493
#array_id = pd.DataFrame([item_id],columns=['id'])
#array_id

In [41]:
#type(savedMod)

In [42]:
#model.predict(X=array_id, num=5,context_df=cats_DF_context,reccs=savedMod)

# Conclusion and Next Steps <a id='conclusion'></a>

**Conclusion of ML Baseline as of 12/6/22**: 
- The average score of the top 5 recommendations per cat in the training set is 10.96. The highest possible score is 12.  
- The result above uses a simple content-based filtering recommendation model without user preferences, since they are not yet available. Instead, it compares items against each other: you liked this ketchup, so here are 10 other similar types of ketchup. 
- Because of how the simple content-based filtering model is built, the dev and test sets cannot be used, so the training set was used to get an initial idea of the results. 
- The cats data version 0.5 features need more ways to delineate one cat from another, but based on visual scans and the average recommendation score, the simple cat CBF model generally excels at returning cats similar to the one you said you wanted.
- In instances where there is more ambiguity (i.e., a chosen cat with fewer defined details), it still finds very similar cats, but it sometimes also includes very similar cats of a different breed. This might not be a bad thing.

**Next Steps**:

- Incorporate distance more effectively
- Can we use the description field for cats at all? 
- Collaborative filtering once user preferences are collected
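
On the description question, one option worth exploring is TF-IDF over the free text, scored with the same `linear_kernel` call used for the categorical features. The sketch below is an assumption about how that might look, not a tested part of the baseline; the toy descriptions are invented, and missing descriptions (about 26% of cats in the NA check) are filled with empty strings.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Toy context rows with invented descriptions; one is missing.
toy_context = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "description": [
        "Playful tabby kitten who loves other cats",
        "Shy senior cat who prefers a quiet home",
        None,
        "Playful kitten, great with kids",
    ],
})

text = toy_context["description"].fillna("")  # empty text encodes as an all-zero row
tfidf = TfidfVectorizer(stop_words="english").fit_transform(text)

# TfidfVectorizer L2-normalizes rows by default, so the dot product is cosine similarity.
text_scores = linear_kernel(tfidf, tfidf)
print(text_scores.round(2))
```

Cats without a description score 0 against everything, so a text score like this would likely need to be combined with the categorical score rather than replace it.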