## The Perfect Fit
**A container problem solved.**

The Container Store (containerstore.com) has a large selection of containers, but its site does not provide a way to search by container size.  The ScrapeContainers class scrapes all the current storage data from the Container Store website, storing the name, dimensions, price, image, and url for each piece.  The Container class has functions to reorganize that data.  You can then search the containers by size, with the category and price shown for each match.


In [1]:
import scrapy
import json
from pprint import pprint
import subprocess
from fractions import Fraction
from collections import Counter 
import datetime
import sys
import os


In [2]:

class ScrapeContainers:
    def __init__(self):
        subprocess.run(["rm",'container.json'])
        subprocess.run(["scrapy", "crawl", 'container', "-o", 'container.json'])
        print('Container Store storage section data updated on',datetime.date.today())
    
#     def update_json(self, scrapy_name='container', json_file_name='container.json'):
#         subprocess.run(["rm",json_file_name])
#         subprocess.run(["scrapy", "crawl", scrapy_name, "-o", json_file_name])
#         print('updated on',datetime.date.today())




In [67]:
class Container:
    
    

    def __init__(self):
        """
        Loads container.json and stores it as the instance attribute 'data'.
        
        'data' is a list of dictionaries.  
            Each dictionary contains the 'title' (name) of the container, its 'url', 
            a list of the 'dimensions' available, a list of the 'price'(s) in the same order as dimension,
            and a link to the url of the 'image'.
                (Do not have the image working yet, price not working for some of the items 
                    - the ones that have only 2 dimensions(?))
        
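        Note:
            No exception is raised if the file is missing; a message is printed
            to stderr and 'data' is left unset.
        """    

        if os.path.exists('container.json'):
            with open('container.json', 'r') as f:
                self.data = json.load(f)
        else:
            # the file name must be passed as a string; the bare name container.json raised a NameError
            print("No such file '{}'".format('container.json'), file=sys.stderr)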
            
    def create_category(self):
        """
        Stores the category ('Decorative Bins and Baskets', 'Plastic Bins and Baskets', etc)
            for each item (dictionary) in the list 'data'.
            'category' becomes a new key for each dictionary in 'data'
        Also stores a Counter() called 'c' that counts the number of items in 'data' that are in each category: 
            so 'c' is a dictionary with keys the categories, and value the number of titles that have this category.
        
        Raises:
            Should we raise something if the category isn't found?  
                Not a problem yet because of the way containerstore is organized.
            
        """         
        for d in self.data:
            category = d['url'].split('/')[5]
            d['category']=category
        self.c = Counter()
        for d in self.data:
            self.c[d['category']] += 1
    
    
    def create_new_dimensions(self):
        """
        Runs through the 'dimensions' of each dictionary in 'data'.  
            Stores a new key 'new dimensions' that records the dimensions as a list of floats (in inches)
        
        Raises:
            Should we raise something if the dimensions aren't found?  Right now leaves as empty list.
            
        """         

        
        # loop through the container links
        for j in range (0,len(self.data)):
            self.data[j]['new dimensions']=[]
            # many container links have containers available in several different sizes, so we roam through each one
            for k in range(0,len(self.data[j]['dimensions'])):
                dim_no_whitespace = self.data[j]['dimensions'][k].strip()
                if (dim_no_whitespace != ""):
                    mydim = self.data[j]['dimensions'][k].split('x')
                    newdim = []
                    # each dimension of the container needs to be converted to a float and added to 'new dimensions'
                    for i in range (0,len(mydim)):
                        mysplit = mydim[i].split('sq.')[0]
                        if '-' in mysplit:
                            this_split = mysplit.split('-')
                            num =0
                            for p in this_split:
                                num += float(Fraction(p.split('"')[0]))
                        else:
                            num = float(Fraction(mysplit.split('"')[0]))
                        # if the entry notes that it is square or a diameter, then the dimension needs to be inserted in two spots            
                        double=int(('sq.' in mydim[i]) or ('diam.' in mydim[i]))
                        for _ in range(0,double+1):
                            newdim.append(num)
                    self.data[j]['new dimensions'].append(newdim)

    def organize_new_dimensions(self):
        
        """
        Creates a new variable 'dimensions' that is a list of lists of dictionaries.  
            dictionaries with 3 dimensions found are all together in the third spot of the list. 
                (I was doing this for the 'perfect fit' function.  I wanted to look through the dictionaries 
                with 2 dimensions in a different way than those with 3 dimensions, and 1 or 2 dimensions only 
                in emergency.  I am concerned that storing everything may be unnecessary and take up too much 
                storage if the data set is really big.)
            
        """         
        
        self.dimensions = [[],[],[],[]] #stores zero, one, two, three dimensions
        
        for j in range(0,len(self.data)):
            if((self.data[j]['new dimensions'])==[]):
                # no sizes were parsed, so point at the first entry (k is not defined yet at this point)
                self.dimensions[0].append([self.data[j],0]) 
            num_of_containers  = len(self.data[j]['new dimensions'])
            for k in range(0,num_of_containers):
                num_dim = len(self.data[j]['new dimensions'][k])
                
                
                #save the container and the spot that has these particular dimensions
                self.dimensions[num_dim].append([self.data[j],k]) 

                

        

    #  The the_perfect_fit_ranges function is given 6 values, 2 for each dimension (a lower and upper limit).  
    #  It returns the possible containers.
    
    
    def the_perfect_fit_ranges(self,ranges):
        
        """
        Searches for containers that fit in a given range of dimensions, using the 'dimensions' variable, 
            created in 'organize_new_dimensions' function (a list of lists of dictionaries containing 
            container information).
        Based on whether 0,1,2, or 3 dimensions are given, a different strategy is used.
        
        Args:
            ranges: a list of 6 floats, two for each dimension, providing the lower and upper bound.
                for example to find a container in the range 6-8.25" x 9-12" x 5-7", the variable would be
                ranges = [6.,8.25,9.,12.,5.,7.]
        Returns:
            possibles: a list of 4 lists of dictionaries of containers that might work.  
                The 4 lists correspond to containers with 0,1,2, or 3 dimensions found.
        """     
        
        if len(ranges)==6:
            
            possibles = [[],[],[],[]]

            possibles[0]=self.dimensions[0]
            
            for d in self.dimensions[1]:
                if ((ranges[0]<d[0]['new dimensions'][d[1]][0]<ranges[1]) 
                    or (ranges[2]<d[0]['new dimensions'][d[1]][0]<ranges[3]) 
                    or (ranges[4]<d[0]['new dimensions'][d[1]][0]<ranges[5])):

                    possibles[1].append(d)

            for d in self.dimensions[2]:
                if ( (ranges[0] < d[0]['new dimensions'][d[1]][0] < ranges[1])  and (ranges[2] < d[0]['new dimensions'][d[1]][1] < ranges[3]) ):

                    possibles[2].append(d)

            for d in self.dimensions[3]:
                if ((ranges[0]<d[0]['new dimensions'][d[1]][0]<ranges[1]) and (ranges[2]<d[0]['new dimensions'][d[1]][1]<ranges[3]) 
                    and (ranges[4]<d[0]['new dimensions'][d[1]][2]<ranges[5])):

                    possibles[3].append(d)


            return possibles
    
    
    def the_perfect_fit_values(self,values,diffmax=2.):
        
        """
        Searches for containers that are closest to the given 3 dimensions, using the 'dimensions' variable, 
            created in 'organize_new_dimensions' function (a list of lists of dictionaries containing 
            container information).
        Based on whether 0,1,2, or 3 dimensions are given, a different strategy is used.
        
        Args:
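            values: a list of 3 floats, one for each dimension
            diffmax: the largest allowed difference (in inches) between a value and the 
                matching container dimension; the search widens in 0.5" steps up to this limit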
            
        Returns:
            possibles: a list of 4 lists of dictionaries of containers that might work.  
                The 4 lists correspond to containers with 0,1,2, or 3 dimensions found.
        """     
        
        if len(values)==3:
            
            possibles = [[],[],[],[]]

            possibles[0]=self.dimensions[0]
            
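            diff = .5
            found = set()  # ids of the [container, index] pairs already added
            
            # widen the tolerance in 0.5" steps, so the closest matches are listed first
            while diff<=diffmax:
                for d in self.dimensions[3]:
                    dims = d[0]['new dimensions'][d[1]]
                    # skip pairs matched at a tighter tolerance, so nothing is listed twice
                    if ((id(d) not in found) and (abs(values[0]-dims[0])<diff) and (abs(values[1]-dims[1])<diff) 
                        and (abs(values[2]-dims[2])<diff)):

                        found.add(id(d))
                        possibles[3].append(d)
                diff+=.5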


            return possibles
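
The widening-tolerance search in `the_perfect_fit_values` can be tried outside the class. The following is only a minimal sketch, assuming the dimensions are already plain lists of floats; the name `closest_matches` and the second candidate (`[7.25, 10.5, 6.25]`) are made up for illustration, while the other two sizes come from the outputs in this notebook.

```python
def closest_matches(candidates, target, step=0.5, max_diff=2.0):
    """Return (tolerance, dims) pairs, tightest tolerance first.

    Hypothetical helper: each candidate is reported once, at the smallest
    tolerance for which every dimension is within that distance of target.
    """
    matches = []
    seen = set()  # indexes of candidates that were already matched
    tolerance = step
    while tolerance <= max_diff:
        for index, dims in enumerate(candidates):
            if index in seen:
                continue
            if all(abs(t - d) < tolerance for t, d in zip(target, dims)):
                seen.add(index)
                matches.append((tolerance, dims))
        tolerance += step
    return matches


candidates = [
    [7.0, 10.625, 5.375],    # Mondrian Storage Boxes with Lids
    [7.25, 10.5, 6.25],      # made-up size, for illustration only
    [33.75, 17.5, 20.625],   # Mercer Entryway Storage Bench
]
print(closest_matches(candidates, [7.0, 10.5, 6.0], max_diff=1.0))
# [(0.5, [7.25, 10.5, 6.25]), (1.0, [7.0, 10.625, 5.375])]
```

Reporting the tolerance with each match makes it easy to see which containers are the closest fit.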
    
        
        

In [68]:
# ScrapeContainers()


In [69]:
my_c=Container()


my_c.create_category()

my_c.create_new_dimensions()
my_c.organize_new_dimensions()

print('We found',len(my_c.data), 'total container titles.',
      ' Each container title may be available in several different sizes.')    
print('Considering all the size options, we find that there are', 
      len(my_c.dimensions[0])+len(my_c.dimensions[1])+len(my_c.dimensions[2])+len(my_c.dimensions[3]), 
      'unique containers. ')#, len(my_c.dimensions[3]), 'of those with all three dimensions found.')
print('The container titles are divided up into',len(my_c.c.most_common()),'categories.',
      'The spread of the',len(my_c.data),'container titles into the',len(my_c.c.most_common()),
      'categories is as follows:')
print(my_c.c.most_common())




We found 660 total container titles.  Each container title may be available in several different sizes.
Considering all the size options, we find that there are 1242 unique containers. 
The container titles are divided up into 11 categories. The spread of the 660 container titles into the 11 categories is as follows:
[('decorative-bins-baskets', 210), ('stacking-storage', 112), ('storage-drawers', 72), ('storage-bags-totes', 55), ('plastic-bins-baskets', 53), ('modular-storage', 46), ('serving-trays', 41), ('garage-storage-boxes', 23), ('storage-benches-seats', 22), ('smart-store', 16), ('trunks', 10)]
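
The category counts above come from the product urls. As a rough, standalone sketch of that step (not the code in `create_category`, which splits the raw url on `/`), the category slug can be taken from the url path and tallied with a `Counter`. The helper name `category_from_url` is hypothetical, and the sketch assumes every url follows the `/s/storage/<category>/...` layout seen in this notebook.

```python
from collections import Counter
from urllib.parse import urlparse


def category_from_url(url):
    """Return the path segment that follows '/s/storage/' (assumed layout)."""
    segments = urlparse(url).path.strip('/').split('/')
    return segments[2]


urls = [
    'https://www.containerstore.com/s/storage/storage-benches-seats/rustic-driftwood-mercer-entryway-storage-bench/12d?productId=11003771',
    'https://www.containerstore.com/s/storage/decorative-bins-baskets/mondrian-storage-boxes-with-lids/12d?productId=10036979',
    'https://www.containerstore.com/s/storage/decorative-bins-baskets/rectangular-hogla-storage-bin-with-lid/12d?productId=11003202',
]

counts = Counter(category_from_url(u) for u in urls)
print(counts.most_common())
# [('decorative-bins-baskets', 2), ('storage-benches-seats', 1)]
```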


In [70]:
my_c.data[0]

{'category': 'storage-benches-seats',
 'dimensions': ['33-3/4" x 17-1/2" x 20-5/8" h'],
 'image': ['https://images.containerstore.com//catalogimages/319676/SH_16_Mercer_Bench_V1_R062816_CMYK.jpg'],
 'new dimensions': [[33.75, 17.5, 20.625]],
 'price': ['249'],
 'title': 'Rustic Driftwood Mercer Entryway Storage Bench',
 'url': 'https://www.containerstore.com/s/storage/storage-benches-seats/rustic-driftwood-mercer-entryway-storage-bench/12d?productId=11003771'}
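
The 'new dimensions' entry above is the float version of the 'dimensions' string. Here is a simplified sketch of that conversion, assuming the pieces are separated by `' x '` and written as whole numbers plus optional fractions; unlike `create_new_dimensions`, it does not handle the 'sq.' and 'diam.' markers. The function names are illustrative only.

```python
from fractions import Fraction


def parse_dimension(text):
    """Convert one measurement such as '33-3/4"' to inches as a float."""
    number = text.split('"')[0].strip()   # drop the inch mark and any trailing label
    # '33-3/4' means 33 + 3/4, so add up the parts around the hyphen
    return float(sum(Fraction(part) for part in number.split('-')))


def parse_dimensions(text):
    """Convert a string like '7" x 11" x 5-1/4" h' to a list of floats."""
    return [parse_dimension(piece) for piece in text.split(' x ')]


print(parse_dimensions('33-3/4" x 17-1/2" x 20-5/8" h'))
# [33.75, 17.5, 20.625]
```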

The container that I am looking for should be in the range of 6-8.25" x 9-12" x 5-7"
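
As a small illustration of what that range means, the check can be written for a single list of parsed dimensions. This is a sketch only: `fits` is a hypothetical helper, it uses the same flat `[low, high, low, high, low, high]` layout and strict inequalities as `the_perfect_fit_ranges`, and it compares the dimensions in the order they are listed.

```python
def fits(dims, ranges):
    """True if every dimension lies strictly between its lower and upper bound."""
    bounds = zip(ranges[0::2], ranges[1::2])   # (low, high) pairs, one per dimension
    return all(low < d < high for d, (low, high) in zip(dims, bounds))


ranges = [6., 8.25, 9., 12., 5., 7.]           # 6-8.25" x 9-12" x 5-7"

print(fits([7.0, 10.625, 5.375], ranges))      # Mondrian box -> True
print(fits([33.75, 17.5, 20.625], ranges))     # Mercer bench -> False
```

Because the inequalities are strict, a container that measures exactly 6" in the first dimension is not counted as a fit.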

In [71]:
values=[7.,10.5,6.]
pf = my_c.the_perfect_fit_values(values,1.)
possibles = pf[3]
print('There are',len(possibles), 'containers that will fit your space:', values)
print('')
for c in possibles:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
    print(c[0]['dimensions'][c[1]])

    print('$' +str(c[0]['price'][c[1]]))

print('\n')
if (len(pf[2])>0):
    print('There are',len(pf[2]), 'containers that might fit your space, that have only 2 dimensions given:')
for c in pf[2]:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
    print(c[0]['dimensions'][c[1]])
    
    if (len(c[0]['price'])>c[1]):
        print('$' +str(c[0]['price'][c[1]]))


print('\n')
if(len(pf[0])>0):
    print('There are',len(pf[0]), 'containers without any dimensions given.  Feel free to see if any of these may work for your needs:')
for c in pf[0]:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
#     print(c[0]['dimensions'][c[1]])
    
    if (len(c[0]['price'])>c[1]):
        print('$' +str(c[0]['price'][c[1]].strip()))

There are 7 containers that will fit your space: [7.0, 10.5, 6.0]


Mondrian Storage Boxes with Lids , 'decorative-bins-baskets'
https://www.containerstore.com/s/storage/decorative-bins-baskets/mondrian-storage-boxes-with-lids/12d?productId=10036979
7" x 10-5/8" x 5-3/8" h
$6.99

Rectangular Hogla Storage Bin with Lid , 'decorative-bins-baskets'
https://www.containerstore.com/s/storage/decorative-bins-baskets/rectangular-hogla-storage-bin-with-lid/12d?productId=11003202
7" x 11" x 5-1/4" h
$10.99

White Handled Storage Baskets , 'plastic-bins-baskets'
https://www.containerstore.com/s/storage/plastic-bins-baskets/white-handled-storage-baskets/12d?productId=10036528
6-1/4" x 11" x 5-1/8" h
$3.99

Clear Handled Storage Baskets , 'plastic-bins-baskets'
https://www.containerstore.com/s/storage/plastic-bins-baskets/clear-handled-storage-baskets/12d?productId=10022155
6-1/4" x 11" x 5-1/8" h
$3.99

Mondrian Storage Boxes with Lids , 'plastic-bins-baskets'
https://www.containerstore.com/s/stor

In [72]:
pf = my_c.the_perfect_fit_ranges([6.,8.25,9.,12.,5.,7.])
possibles = pf[3]
print('There are',len(possibles), 'containers that will fit your space:')
print('')
for c in possibles:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
    print(c[0]['dimensions'][c[1]])

    print('$' +str(c[0]['price'][c[1]]))

print('\n')
if (len(pf[2])>0):
    print('There are',len(pf[2]), 'containers that might fit your space, that have only 2 dimensions given:')
for c in pf[2]:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
    print(c[0]['dimensions'][c[1]])
    
    if (len(c[0]['price'])>c[1]):
        print('$' +str(c[0]['price'][c[1]]))


print('\n')
if(len(pf[0])>0):
    print('There are',len(pf[0]), 'containers without any dimensions given.  Feel free to see if any of these may work for your needs:')
for c in pf[0]:
    print('')
    print(c[0]['title'], ", '"+str(c[0]['category'])+ "'")
    print(c[0]['url'])
#     print(c[0]['dimensions'][c[1]])
    
    if (len(c[0]['price'])>c[1]):
        print('$' +str(c[0]['price'][c[1]].strip()))


There are 8 containers that will fit your space:


Mondrian Storage Boxes with Lids , 'decorative-bins-baskets'
https://www.containerstore.com/s/storage/decorative-bins-baskets/mondrian-storage-boxes-with-lids/12d?productId=10036979
7" x 10-5/8" x 5-3/8" h
$6.99

Rectangular Hogla Storage Bin with Lid , 'decorative-bins-baskets'
https://www.containerstore.com/s/storage/decorative-bins-baskets/rectangular-hogla-storage-bin-with-lid/12d?productId=11003202
7" x 11" x 5-1/4" h
$10.99

White Handled Storage Baskets , 'plastic-bins-baskets'
https://www.containerstore.com/s/storage/plastic-bins-baskets/white-handled-storage-baskets/12d?productId=10036528
6-1/4" x 11" x 5-1/8" h
$3.99

Clear Handled Storage Baskets , 'plastic-bins-baskets'
https://www.containerstore.com/s/storage/plastic-bins-baskets/clear-handled-storage-baskets/12d?productId=10022155
6-1/4" x 11" x 5-1/8" h
$3.99

Linus Storage Binz , 'plastic-bins-baskets'
https://www.containerstore.com/s/storage/plastic-bins-baskets/linus-