# ETL Lab

### Introduction

In this lab, we ask you to use the techniques learned in this section to work with an API of your choosing.  Just as important as getting to the correct code is developing the proper procedure for getting there.  As in the preceding lessons, we will follow these procedures:

1. Red, green, refactor
2. Move mess into an object
3. Make small methods by: 
    A. Commenting code
    B. Translating comments into methods
    
Along the way, we will arrive at our pattern of a *Client*, *Adapter*, and *Target*.

### Step 1.  Just get the data

The first step is to go from red to green.  That is, the code starts off with nothing working and our task is simply to get it working.  In this case, this means the following: 

1. Call an API of your choosing
2. Return a list of dictionaries and store it in a variable named `entities`

In [1]:
#importing necessary libraries
import os
import requests
import json

#api-endpoint
# read the key from the environment instead of hardcoding it
# (the variable name YELP_API_KEY is an assumption; use whatever name you exported)
api_key = os.environ['YELP_API_KEY']
headers = {'Authorization': 'Bearer %s' % api_key}

#recommended restaurants in New York City from yelp.com
url='https://api.yelp.com/v3/businesses/search'
params={'term':'restaurant', 'location':'New York City'}
response = requests.get(url, params = params, headers = headers)
entities = response.json()['businesses']
entities

[{'id': 'lMy1BYJ5HX8TccXTBwdRNg',
  'alias': 'covenhoven-brooklyn',
  'name': 'Covenhoven',
  'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/OZZ2U4MyoiBUrjQMZgqcRg/o.jpg',
  'is_closed': False,
  'url': 'https://www.yelp.com/biz/covenhoven-brooklyn?adjust_creative=PdG5NYNj9KqEVMaMf45xhA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=PdG5NYNj9KqEVMaMf45xhA',
  'review_count': 116,
  'categories': [{'alias': 'bars', 'title': 'Bars'},
   {'alias': 'salad', 'title': 'Salad'},
   {'alias': 'soup', 'title': 'Soup'}],
  'rating': 4.5,
  'coordinates': {'latitude': 40.675246, 'longitude': -73.960244},
  'transactions': [],
  'price': '$$',
  'location': {'address1': '730 Classon Ave',
   'address2': None,
   'address3': '',
   'city': 'Brooklyn',
   'zip_code': '11238',
   'country': 'US',
   'state': 'NY',
   'display_address': ['730 Classon Ave', 'Brooklyn, NY 11238']},
  'phone': '+17184839950',
  'display_phone': '(718) 483-9950',
  'distance': 4417.8365326838

In [2]:
type(entities)
#list

list

In [3]:
type(entities[0])
# dict

dict

In [4]:
len(entities)

20

In [5]:
entities[0]

{'id': 'lMy1BYJ5HX8TccXTBwdRNg',
 'alias': 'covenhoven-brooklyn',
 'name': 'Covenhoven',
 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/OZZ2U4MyoiBUrjQMZgqcRg/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/covenhoven-brooklyn?adjust_creative=PdG5NYNj9KqEVMaMf45xhA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=PdG5NYNj9KqEVMaMf45xhA',
 'review_count': 116,
 'categories': [{'alias': 'bars', 'title': 'Bars'},
  {'alias': 'salad', 'title': 'Salad'},
  {'alias': 'soup', 'title': 'Soup'}],
 'rating': 4.5,
 'coordinates': {'latitude': 40.675246, 'longitude': -73.960244},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '730 Classon Ave',
  'address2': None,
  'address3': '',
  'city': 'Brooklyn',
  'zip_code': '11238',
  'country': 'US',
  'state': 'NY',
  'display_address': ['730 Classon Ave', 'Brooklyn, NY 11238']},
 'phone': '+17184839950',
 'display_phone': '(718) 483-9950',
 'distance': 4417.8365326838175}

In [6]:
entities[0].keys()

dict_keys(['id', 'alias', 'name', 'image_url', 'is_closed', 'url', 'review_count', 'categories', 'rating', 'coordinates', 'transactions', 'price', 'location', 'phone', 'display_phone', 'distance'])
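
Before moving on, it can help to guard the extraction step against a response that has no `businesses` key, for example when the request is rejected. The sketch below is a minimal, hypothetical helper, `extract_entities`, that works on an already-parsed payload. The two sample payloads are made up for illustration and are not real API output.

```python
def extract_entities(payload, key='businesses'):
    """Return the list of dictionaries stored under `key`, or raise a clear error."""
    if key not in payload:
        raise ValueError('expected key %r, got keys: %s' % (key, sorted(payload)))
    found = payload[key]
    if not all(isinstance(entity, dict) for entity in found):
        raise TypeError('expected every entity to be a dict')
    return found

# made-up payloads, shaped like the response used in this lab
good_payload = {'businesses': [{'name': 'Covenhoven', 'rating': 4.5}], 'total': 1}
bad_payload = {'error': {'description': 'illustrative failure'}}

sample_entities = extract_entities(good_payload)
print(len(sample_entities), type(sample_entities[0]).__name__)
```

Calling `extract_entities(bad_payload)` raises a `ValueError` that lists the keys actually present, which is usually easier to read than a bare `KeyError`.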

### Step 2. Change the dictionaries into objects

The next step is to change the dictionaries received from the API into objects.  We can break this down into a couple of steps.

1. Create the *target class*.  This is the class the dictionaries will be transformed into.  To do this, choose no more than five attributes to store in each instance.

In [7]:
class RestaurantTarget:
    def __init__(self, name, display_phone, review_count, rating):
        self._name = name
        self._display_phone = display_phone
        self._review_count = review_count
        self._rating = rating           

In [8]:
target_instance = RestaurantTarget(entities[0]['name'], entities[0]['display_phone'], 
                           entities[0]['review_count'], entities[0]['rating'])

Check your work by assigning an instance to the variable `target_instance`.

In [9]:
0 < len(target_instance.__dict__.keys()) <= 5
# True

True

In [10]:
target_instance.__dict__.keys()

dict_keys(['_name', '_display_phone', '_review_count', '_rating'])

In [11]:
target_instance.__dict__

{'_name': 'Covenhoven',
 '_display_phone': '(718) 483-9950',
 '_review_count': 116,
 '_rating': 4.5}

1. Reject some of the data

We don't want to pass all of our data into our class.  So create a smaller dictionary of just the attributes we need.

In [12]:
selected_attributes = {'name':entities[0]['name'], 'display_phone':entities[0]['display_phone'],
                       'review_count':entities[0]['review_count'], 'rating':entities[0]['rating']}

In [13]:
type(selected_attributes)
# dict

dict

In [14]:
selected_attributes.keys()

dict_keys(['name', 'display_phone', 'review_count', 'rating'])

In [15]:
target_instance.__dict__.keys()

dict_keys(['_name', '_display_phone', '_review_count', '_rating'])

In [16]:
len(selected_attributes.keys()) == len(target_instance.__dict__.keys())
# True

True
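
Writing out each key by hand works, but the same selection can also be expressed with a dictionary comprehension. A minimal sketch, using a made-up `sample_entity` in place of `entities[0]` and a hypothetical `WANTED_KEYS` tuple:

```python
# hypothetical stand-in for entities[0], trimmed to a few keys
sample_entity = {
    'id': 'lMy1BYJ5HX8TccXTBwdRNg',
    'name': 'Covenhoven',
    'display_phone': '(718) 483-9950',
    'review_count': 116,
    'rating': 4.5,
    'price': '$$',
}

# the attributes we decided to keep
WANTED_KEYS = ('name', 'display_phone', 'review_count', 'rating')

# keep only the wanted keys, in the order listed above
selected = {key: sample_entity[key] for key in WANTED_KEYS}
print(selected)
```

Keeping the key names in one tuple means that adding or dropping an attribute later is a one-line change.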

2. Coerce dictionaries into objects

A. To start, coerce just one dictionary into an object.

In [17]:
first_object = RestaurantTarget(selected_attributes['name'], selected_attributes['display_phone'],
                     selected_attributes['review_count'], selected_attributes['rating'])
# change the above line to reference your target class

In [18]:
first_object.__dict__

{'_name': 'Covenhoven',
 '_display_phone': '(718) 483-9950',
 '_review_count': 116,
 '_rating': 4.5}
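
Because the keys of `selected_attributes` match the parameter names of `__init__`, the dictionary could also be unpacked with `**` instead of indexing each key. The sketch below repeats the target class so that it runs on its own; the sample values are copied from the output above, and `unpacked_object` is a hypothetical name.

```python
class RestaurantTarget:
    def __init__(self, name, display_phone, review_count, rating):
        self._name = name
        self._display_phone = display_phone
        self._review_count = review_count
        self._rating = rating

selected_attributes = {'name': 'Covenhoven', 'display_phone': '(718) 483-9950',
                       'review_count': 116, 'rating': 4.5}

# ** passes each key/value pair as a keyword argument: name='Covenhoven', ...
unpacked_object = RestaurantTarget(**selected_attributes)
print(unpacked_object.__dict__)
```

This only works while the dictionary keys and the parameter names stay identical; an extra or misspelled key raises a `TypeError`.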

In [19]:
list(first_object.__dict__.values())

['Covenhoven', '(718) 483-9950', 116, 4.5]

In [20]:
list(first_object.__dict__.values()) == list(entities[0].values())
# True

False

- The code above returns `False`, not the `True` shown in the cell's comment, so I inspected `entities[0].values()` below. `first_object` contains only the selected attributes, while `entities[0]` still contains all of the attributes, so the two lists of values cannot be equal.

In [21]:
entities[0].values()

dict_values(['lMy1BYJ5HX8TccXTBwdRNg', 'covenhoven-brooklyn', 'Covenhoven', 'https://s3-media2.fl.yelpcdn.com/bphoto/OZZ2U4MyoiBUrjQMZgqcRg/o.jpg', False, 'https://www.yelp.com/biz/covenhoven-brooklyn?adjust_creative=PdG5NYNj9KqEVMaMf45xhA&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=PdG5NYNj9KqEVMaMf45xhA', 116, [{'alias': 'bars', 'title': 'Bars'}, {'alias': 'salad', 'title': 'Salad'}, {'alias': 'soup', 'title': 'Soup'}], 4.5, {'latitude': 40.675246, 'longitude': -73.960244}, [], '$$', {'address1': '730 Classon Ave', 'address2': None, 'address3': '', 'city': 'Brooklyn', 'zip_code': '11238', 'country': 'US', 'state': 'NY', 'display_address': ['730 Classon Ave', 'Brooklyn, NY 11238']}, '+17184839950', '(718) 483-9950', 4417.8365326838175])
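
One way to make this check meaningful is to compare the object's values against only the selected keys of the API dictionary. A small sketch with sample values taken from the output above; `full_entity` is a trimmed, hypothetical stand-in for `entities[0]`:

```python
# what list(first_object.__dict__.values()) returned above
object_values = ['Covenhoven', '(718) 483-9950', 116, 4.5]

# trimmed stand-in for entities[0]; the real dictionary has more keys
full_entity = {
    'id': 'lMy1BYJ5HX8TccXTBwdRNg',
    'name': 'Covenhoven',
    'review_count': 116,
    'rating': 4.5,
    'display_phone': '(718) 483-9950',
}
selected_keys = ('name', 'display_phone', 'review_count', 'rating')

# comparing against every value fails: extra keys and a different order
compared_to_all = object_values == list(full_entity.values())
# comparing against the selected keys, in constructor order, succeeds
compared_to_selected = object_values == [full_entity[key] for key in selected_keys]
print(compared_to_all, compared_to_selected)
```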

B. Now that you have solved for one, solve for all.  Coerce all of the dictionaries into objects.  Assign the list of objects to a variable `targets`.

In [22]:
targets = [RestaurantTarget(rest_attr['name'], rest_attr['display_phone'], rest_attr['review_count'], rest_attr['rating']) 
                           for rest_attr in entities]

In [23]:
targets

[<__main__.RestaurantTarget at 0x1d97ca59e10>,
 <__main__.RestaurantTarget at 0x1d97ca59e80>,
 <__main__.RestaurantTarget at 0x1d97ca59eb8>,
 <__main__.RestaurantTarget at 0x1d97ca59ef0>,
 <__main__.RestaurantTarget at 0x1d97ca59f28>,
 <__main__.RestaurantTarget at 0x1d97ca59f60>,
 <__main__.RestaurantTarget at 0x1d97ca59f98>,
 <__main__.RestaurantTarget at 0x1d97ca59fd0>,
 <__main__.RestaurantTarget at 0x1d97ca5a048>,
 <__main__.RestaurantTarget at 0x1d97ca5a080>,
 <__main__.RestaurantTarget at 0x1d97ca5a0b8>,
 <__main__.RestaurantTarget at 0x1d97ca5a0f0>,
 <__main__.RestaurantTarget at 0x1d97ca5a128>,
 <__main__.RestaurantTarget at 0x1d97ca5a160>,
 <__main__.RestaurantTarget at 0x1d97ca5a198>,
 <__main__.RestaurantTarget at 0x1d97ca5a1d0>,
 <__main__.RestaurantTarget at 0x1d97ca5a208>,
 <__main__.RestaurantTarget at 0x1d97ca5a240>,
 <__main__.RestaurantTarget at 0x1d97ca5a278>,
 <__main__.RestaurantTarget at 0x1d97ca5a2b0>]
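
The default output above only shows memory addresses. A common option is to give the class a `__repr__` method so that lists of objects print their contents. A minimal sketch, assuming we are free to add a method to the target class (the lab does not require it); which attributes to show is an arbitrary choice here:

```python
class RestaurantTarget:
    def __init__(self, name, display_phone, review_count, rating):
        self._name = name
        self._display_phone = display_phone
        self._review_count = review_count
        self._rating = rating

    # used by print() on a list and by the notebook when echoing a value
    def __repr__(self):
        return 'RestaurantTarget(name=%r, rating=%r)' % (self._name, self._rating)

sample_targets = [RestaurantTarget('Covenhoven', '(718) 483-9950', 116, 4.5)]
print(sample_targets)
# prints [RestaurantTarget(name='Covenhoven', rating=4.5)]
```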

In [24]:
type(targets)

list

In [25]:
type(targets[0])

__main__.RestaurantTarget

In [26]:
targets[0].__dict__

{'_name': 'Covenhoven',
 '_display_phone': '(718) 483-9950',
 '_review_count': 116,
 '_rating': 4.5}

In [27]:
len(targets) == len(entities)
# True

True

### Step 3. Move the remaining code into an object

At this point, we have successfully transformed a list of dictionaries from an API into a list of objects.  But we need to keep cleaning up our code.  To do this, look at the code that sits outside of a class and move it into a class, in a method named `run`.

In [28]:
class RestaurantAdapter:
    def __init__(self, restaurant_dicts):
        self._restaurant_dicts = restaurant_dicts
        
    def run(self):
        self._targets = []
        for restaurant_dict in self._restaurant_dicts:
            selected_attr = {'name':restaurant_dict['name'], 
                                  'display_phone':restaurant_dict['display_phone'],
                                  'review_count':restaurant_dict['review_count'], 
                                  'rating':restaurant_dict['rating']}
            target = RestaurantTarget(selected_attr['name'], selected_attr['display_phone'],
                                selected_attr['review_count'],selected_attr['rating'])
            self._targets.append(target)
        return self._targets

Let's make sure that this works.

In [29]:
adapter = RestaurantAdapter(entities)
# change the above line to reference your adapter
results = adapter.run()
results

[<__main__.RestaurantTarget at 0x1d97ca66ac8>,
 <__main__.RestaurantTarget at 0x1d97ca66b70>,
 <__main__.RestaurantTarget at 0x1d97ca66be0>,
 <__main__.RestaurantTarget at 0x1d97ca66c18>,
 <__main__.RestaurantTarget at 0x1d97ca66ba8>,
 <__main__.RestaurantTarget at 0x1d97ca66c50>,
 <__main__.RestaurantTarget at 0x1d97ca66cc0>,
 <__main__.RestaurantTarget at 0x1d97ca66cf8>,
 <__main__.RestaurantTarget at 0x1d97ca66c88>,
 <__main__.RestaurantTarget at 0x1d97ca66d30>,
 <__main__.RestaurantTarget at 0x1d97ca66da0>,
 <__main__.RestaurantTarget at 0x1d97ca66dd8>,
 <__main__.RestaurantTarget at 0x1d97ca66d68>,
 <__main__.RestaurantTarget at 0x1d97ca66a58>,
 <__main__.RestaurantTarget at 0x1d97ca669e8>,
 <__main__.RestaurantTarget at 0x1d97ca66e10>,
 <__main__.RestaurantTarget at 0x1d97ca66e48>,
 <__main__.RestaurantTarget at 0x1d97ca66f28>,
 <__main__.RestaurantTarget at 0x1d97ca66e80>,
 <__main__.RestaurantTarget at 0x1d97ca66eb8>]

In [30]:
results[0].__dict__

{'_name': 'Covenhoven',
 '_display_phone': '(718) 483-9950',
 '_review_count': 116,
 '_rating': 4.5}

In [31]:
len(results) == len(targets)
# True

True

### Step 4. Make the methods smaller

Next, break the `run` method in the adapter class into smaller methods.  Do this by first writing comments in the code, and then moving the commented code into separate methods.  Please leave the comments in your code.  Each method should be no longer than five lines, and may contain at most one `if else` statement or one loop.  Having both an `if else` and a loop in the same method is too complicated, so don't do it.
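
To illustrate the comment-then-extract procedure on something unrelated to the lab answer, here is a hedged sketch with two hypothetical classes. `NameListBefore` has commented steps inside one method; `NameListAfter` turns each comment into a method of its own. The input names are made up.

```python
class NameListBefore:
    def __init__(self, names):
        self._names = names

    def run(self):
        # strip whitespace from each name
        stripped = [name.strip() for name in self._names]
        # sort the names alphabetically
        return sorted(stripped)


class NameListAfter:
    def __init__(self, names):
        self._names = names

    def run(self):
        return self.sort_names(self.strip_names(self._names))

    # strip whitespace from each name
    def strip_names(self, names):
        return [name.strip() for name in names]

    # sort the names alphabetically
    def sort_names(self, names):
        return sorted(names)


raw_names = [' Covenhoven ', 'Another Place']
print(NameListBefore(raw_names).run() == NameListAfter(raw_names).run())
```

The behavior is unchanged; the only difference is that each step now has a name and can be checked on its own.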

In [32]:
class RestaurantAdapter:
    def __init__(self, restaurant_dicts):
        self._restaurant_dicts = restaurant_dicts
        
    # select data with the attributes we need    
    def select_data(self, restaurant_dict):
        selected_attr = ('name', 'display_phone', 'review_count', 'rating')
        return dict((k, restaurant_dict[k]) for k in selected_attr)
    
    # create restaurant    
    def create_restaurant(self, restaurant_dict):
        restaurant = RestaurantTarget(restaurant_dict['name'], restaurant_dict['display_phone'],
                                     restaurant_dict['review_count'], restaurant_dict['rating'])   
        return restaurant   
    
    # turn restaurants data into objects        
    def restaurants_data_to_objects(self, restaurant_dicts):
        targets = []
        for restaurant_dict in restaurant_dicts:
            selected_data = self.select_data(restaurant_dict)
            restaurant = self.create_restaurant(selected_data)
            targets.append(restaurant)
        return targets

    # run the adapter on the dictionaries passed to __init__
    def run(self):
        return self.restaurants_data_to_objects(self._restaurant_dicts)

In [33]:
targets

[<__main__.RestaurantTarget at 0x1d97ca59e10>,
 <__main__.RestaurantTarget at 0x1d97ca59e80>,
 <__main__.RestaurantTarget at 0x1d97ca59eb8>,
 <__main__.RestaurantTarget at 0x1d97ca59ef0>,
 <__main__.RestaurantTarget at 0x1d97ca59f28>,
 <__main__.RestaurantTarget at 0x1d97ca59f60>,
 <__main__.RestaurantTarget at 0x1d97ca59f98>,
 <__main__.RestaurantTarget at 0x1d97ca59fd0>,
 <__main__.RestaurantTarget at 0x1d97ca5a048>,
 <__main__.RestaurantTarget at 0x1d97ca5a080>,
 <__main__.RestaurantTarget at 0x1d97ca5a0b8>,
 <__main__.RestaurantTarget at 0x1d97ca5a0f0>,
 <__main__.RestaurantTarget at 0x1d97ca5a128>,
 <__main__.RestaurantTarget at 0x1d97ca5a160>,
 <__main__.RestaurantTarget at 0x1d97ca5a198>,
 <__main__.RestaurantTarget at 0x1d97ca5a1d0>,
 <__main__.RestaurantTarget at 0x1d97ca5a208>,
 <__main__.RestaurantTarget at 0x1d97ca5a240>,
 <__main__.RestaurantTarget at 0x1d97ca5a278>,
 <__main__.RestaurantTarget at 0x1d97ca5a2b0>]

In [34]:
targets[0].__dict__

{'_name': 'Covenhoven',
 '_display_phone': '(718) 483-9950',
 '_review_count': 116,
 '_rating': 4.5}
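
Note that displaying `targets` re-shows the list built in Step 2, so it does not exercise the refactored class. Comparing two lists of objects with `==` will not help either, because objects without an `__eq__` method are compared by identity. One possible check is to compare the attribute dictionaries instead. A sketch with a hypothetical helper, `same_attributes`, and made-up sample objects; the target class is repeated so the block runs on its own:

```python
class RestaurantTarget:
    def __init__(self, name, display_phone, review_count, rating):
        self._name = name
        self._display_phone = display_phone
        self._review_count = review_count
        self._rating = rating

def same_attributes(first_objects, second_objects):
    """True when both lists hold objects with equal attribute dictionaries, in order."""
    return [vars(obj) for obj in first_objects] == [vars(obj) for obj in second_objects]

# two separately built lists with the same contents
before_refactor = [RestaurantTarget('Covenhoven', '(718) 483-9950', 116, 4.5)]
after_refactor = [RestaurantTarget('Covenhoven', '(718) 483-9950', 116, 4.5)]

print(before_refactor == after_refactor)                 # identity comparison
print(same_attributes(before_refactor, after_refactor))  # attribute comparison
```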

### Step 5. Create the client class

Next, move the calls to the API into their own separate class.  This way we can keep calling the API in the same way, but later decide to coerce the data into different types of objects than we did above.

In [35]:
import os
import requests
class RestaurantClient:
    def run(self):
        # read the key from the environment instead of hardcoding it
        # (the variable name YELP_API_KEY is an assumption; use whatever name you exported)
        api_key = os.environ['YELP_API_KEY']
        headers = {'Authorization': 'Bearer %s' % api_key}
        url='https://api.yelp.com/v3/businesses/search'
        params={'term':'restaurant', 'location':'New York City'}
        return requests.get(url, params = params, headers = headers).json()['businesses']

Place the updated Adapter class below.  Check that it still works as it did before.

In [36]:
class RestaurantAdapter:
    def run(self):
        self._request_api = RestaurantClient()
        self._restaurants_data = self._request_api.run()
        self._restaurants = self.restaurants_data_to_objects(self._restaurants_data)
        return self._restaurants
    
    def select_data(self, restaurant_dict):
        selected_attr = ('name', 'display_phone', 'review_count', 'rating')
        return dict((k, restaurant_dict[k]) for k in selected_attr)
        
    def create_restaurant(self, restaurant_dict):
        restaurant = RestaurantTarget(restaurant_dict['name'], restaurant_dict['display_phone'], restaurant_dict['review_count'], restaurant_dict['rating'])   
        return restaurant   
                          
    def restaurants_data_to_objects(self, restaurant_dicts):       
        targets = []
        for restaurant_dict in restaurant_dicts:
            selected_data = self.select_data(restaurant_dict)
            restaurant = self.create_restaurant(selected_data)
            targets.append(restaurant)
        return targets

In [37]:
refactored_adapter = RestaurantAdapter()

len(refactored_adapter.run()) == len(adapter.run())

True
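
The adapter above creates its own `RestaurantClient`, so every check makes a real API call. One possible variation, not required by the lab, is to pass the client in, which allows a stand-in client to be used offline. `FakeClient` and `InjectedAdapter` are hypothetical names, the canned dictionaries are made up, and the target class is repeated so the block runs on its own.

```python
class RestaurantTarget:
    def __init__(self, name, display_phone, review_count, rating):
        self._name = name
        self._display_phone = display_phone
        self._review_count = review_count
        self._rating = rating


class FakeClient:
    """Stand-in for a client: returns canned dictionaries instead of calling an API."""
    def run(self):
        return [
            {'name': 'Covenhoven', 'display_phone': '(718) 483-9950',
             'review_count': 116, 'rating': 4.5, 'price': '$$'},
            {'name': 'Sample Diner', 'display_phone': '(000) 000-0000',
             'review_count': 3, 'rating': 4.0, 'price': '$'},
        ]


class InjectedAdapter:
    KEYS = ('name', 'display_phone', 'review_count', 'rating')

    # any object with a run() method that returns a list of dicts will do
    def __init__(self, client):
        self._client = client

    def run(self):
        return [self.create_restaurant(data) for data in self._client.run()]

    # keep only the wanted keys, then build the target
    def create_restaurant(self, restaurant_dict):
        selected = {key: restaurant_dict[key] for key in self.KEYS}
        return RestaurantTarget(**selected)


restaurants = InjectedAdapter(FakeClient()).run()
print(len(restaurants), restaurants[0].__dict__)
```

Swapping `FakeClient()` for a real client instance would leave the adapter code untouched, which is the separation the Client, Adapter, and Target pattern is aiming for.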

### Summary

Great job!  Hopefully, you saw how building working code first and then refactoring it in small steps eventually gets us to clean code.