# DATA MAPPING: COMMON FAILURES

1. What's bad mapping code
1. Case Study: The evolution of a data mapper with Giphy image data
 1. Learning the source data
 1. Use case 1: Filter out unused data
 1. Use Case 2: Nested records
 1. Use Case 3: Flatten data, add and transform fields
1. Conclusion: What's wrong with DIY mapping?

# 1. What's _bad_ mapping code

It works, and you get the idea. But it has problems: 
 * You need to **manipulate data in your head**
 * No clear pattern to apply when adding or changing it
 * Too slow to understand
 * Not explicit 

When you have to squint and stare, you're doing it wrong.

### Example: Bad mapping code

I wrote this mapping code. It's not maintainable.

In [1]:
def giphy_mapper(dct):
    new_dict = {
        "bitly_gif_url": dct.get("bitly_gif_url"), 
        "giphy_id": dct.get("id"), 
        "import_datetime": dct.get("import_datetime"), 
        "rating": dct.get("rating"), 
        "slug": dct.get("slug"), 
        "source": dct.get("source"), 
        "title": dct.get("title"), 
        "image_type": dct.get("type"),  
        "giphy_url": dct.get("url"), 
        "username": dct.get("username"),
    }
    
    images = 'downsized', 'original', 'preview'
    new_dict["images"] = {}
    for image in images:
        new_dict["images"][image] = dct["images"].get(image)
    return new_dict

def giphy_mapper_extras(mapped_data):
    new_images = {}
    # NOTE: filter_image_fields is not defined in this cell
    for dct_label, image_dct in mapped_data['images'].items():
        new_images[dct_label] = filter_image_fields(image_dct)
    flat_dict = {}
    for image_name, image_data in new_images.items():
        for image_field, val in image_data.items():
            flat_dict[image_name+"_"+image_field] = val
    mapped_data.update(flat_dict)
    del mapped_data["images"]
    return mapped_data

def transform_data(data):
    data['original_size'] = "%s MB" % (int(data['original_size']) // 1000)
    data['downsized_size'] = "%s MB" % (int(data['downsized_size']) // 1000)
    data['height_diff'] = int(data["original_height"]) - int(data["preview_height"])
    return data

##### What makes this so bad?

Stare at the code again. Why is it hard to read? Because you are **manipulating data in your head.** When you have to squint and stare to follow the data transformations, you're doing it wrong.
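To make "manipulating data in your head" concrete, here is a minimal sketch of just the flattening step, run on a cut-down record. The `mapped` dict is a trimmed stand-in built from values in the sample response, and `flatten_images` is a hypothetical helper written for this illustration. It leaves out `filter_image_fields` and returns a copy instead of mutating its input, so it is not the code above.

```python
# Trimmed stand-in for one mapped record (hypothetical, values from the sample response)
mapped = {
    "giphy_id": "Ov5NiLVXT8JEc",
    "images": {
        "original": {"height": "281", "size": "1802952"},
        "preview": {"height": "128"},
    },
}

def flatten_images(record):
    """Return a copy of record with record["images"] flattened into "<image>_<field>" keys."""
    # Copy every top-level field except the nested "images" dict
    flat = {key: value for key, value in record.items() if key != "images"}
    # Promote each nested image field to a prefixed top-level key
    for image_name, image_data in record["images"].items():
        for field, value in image_data.items():
            flat[image_name + "_" + field] = value
    return flat

flat = flatten_images(mapped)
print(sorted(flat))
# ['giphy_id', 'original_height', 'original_size', 'preview_height']
```

Even on this tiny record, you have to hold the nested shape in your head to predict that `original_height` and `preview_height` exist while `preview_size` does not.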

This _used_ to be simple, but it was modified quickly to meet additional requirements.

Let's examine how simple code devolves.

# 2. Case Study: The evolution of a data mapper.


The following scenario shows how a simple data mapping problem can grow complex and ultimately produce unmaintainable code.

#### Scenario 
Our client wants to analyze Giphy images (about cats).

Parts:

1. Pre-work: Setting up data & utilities (ignore this)
2. Structure of the API response.

#### Ignore this, scroll down

_Setting up data & utilities (ignore this)_

In [2]:
# Demo data
import json
giphy_json_response_sample = '{"pagination": {"count": 25, "total_count": 194085, "offset": 0}, "meta": {"status": 200, "msg": "OK", "response_id": "5a8e4877344d53506749c515"}, "data": [{"username": "", "rating": "g", "embed_url": "https://giphy.com/embed/Ov5NiLVXT8JEc", "is_indexable": 1, "url": "https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc", "images": {"fixed_height_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "356", "height": "200"}, "fixed_width_small": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "56", "width": "100", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "39892", "mp4_size": "12567", "size": "88908"}, "fixed_width_small_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "100", "height": "56"}, "preview_webp": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-preview.webp?cid=e1bb72ff5a8e4877344d53506749c515", "width": "215", "size": "49358", "height": "121"}, "fixed_height": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "356", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "265590", "mp4_size": "74643", "size": "863917"}, "fixed_height_small_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "178", "height": "100"}, "480w_still": {"url": "https://media2.giphy.com/media/Ov5NiLVXT8JEc/480w_s.jpg?cid=e1bb72ff5a8e4877344d53506749c515", "width": 
"480", "height": "270"}, "downsized_medium": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1802952", "height": "281"}, "preview": {"width": "230", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "37266", "height": "128"}, "preview_gif": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-preview.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "123", "size": "48262", "height": "69"}, "fixed_height_small": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "100", "width": "178", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/100.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "89192", "mp4_size": "26323", "size": "240747"}, "fixed_width": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "112", "width": "200", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "100960", "mp4_size": "28771", "size": "279318"}, "fixed_width_downsampled": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "112", "width": "200", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "28020", "size": "80804"}, "original_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "height": "281"}, "fixed_height_downsampled": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": 
"200", "width": "356", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "73220", "size": "242229"}, "downsized_small": {"width": "500", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-downsized-small.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "149771", "height": "280"}, "original_mp4": {"width": "480", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "121585", "height": "268"}, "downsized_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "height": "281"}, "looping": {"mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-loop.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "968433"}, "downsized_large": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1802952", "height": "281"}, "fixed_width_still": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/200w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "200", "height": "112"}, "downsized": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1802952", "height": "281"}, "original": {"url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "281", "width": "500", "mp4": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "538448", "mp4_size": "121585", "frames": "22", "size": "1802952"}}, "title": "star wars fighting GIF", "trending_datetime": "2015-10-19 21:26:46", "source_post_url": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles", "content_url": "", "slug": 
"cats-light-sabers-Ov5NiLVXT8JEc", "source": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles", "source_tld": "hobolunchbox.tumblr.com", "is_sticker": 0, "bitly_gif_url": "https://gph.is/1B5sZnz", "type": "gif", "id": "Ov5NiLVXT8JEc", "import_datetime": "2014-08-30 20:50:33", "bitly_url": "https://gph.is/1B5sZnz"}, {"username": "", "rating": "g", "embed_url": "https://giphy.com/embed/W3QKEujo8vztC", "is_indexable": 1, "url": "https://giphy.com/gifs/cats-blanket-W3QKEujo8vztC", "images": {"fixed_height_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "209", "height": "200"}, "fixed_width_small": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/100w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/100w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "96", "width": "100", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/100w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "57228", "mp4_size": "4252", "size": "61192"}, "fixed_width_small_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/100w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "100", "height": "96"}, "preview_webp": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy-preview.webp?cid=e1bb72ff5a8e4877344d53506749c515", "width": "182", "size": "48944", "height": "174"}, "fixed_height": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/200.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "209", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/200.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "172382", "mp4_size": "10066", "size": "226325"}, "fixed_height_small_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/100_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
"width": "104", "height": "100"}, "480w_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/480w_s.jpg?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "size": "28976", "height": "460"}, "downsized_medium": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1527014", "height": "479"}, "preview": {"width": "500", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "43744", "height": "478"}, "preview_gif": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy-preview.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "151", "size": "46828", "height": "145"}, "fixed_height_small": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/100.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/100.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "100", "width": "104", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/100.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "59936", "mp4_size": "4432", "size": "62587"}, "fixed_width": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/200w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "192", "width": "200", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/200w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "159622", "mp4_size": "9367", "size": "210484"}, "fixed_width_downsampled": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200w_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "192", "width": "200", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/200w_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "57268", "size": "105239"}, "original_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
"width": "500", "height": "479"}, "fixed_height_downsampled": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "209", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/200_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "61854", "size": "112228"}, "downsized_small": {"width": "500", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy-downsized-small.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "43744", "height": "478"}, "original_mp4": {"width": "480", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "34105", "height": "458"}, "downsized_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "height": "479"}, "looping": {"mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy-loop.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "1089862"}, "downsized_large": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1527014", "height": "479"}, "fixed_width_still": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/200w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "200", "height": "192"}, "downsized": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1527014", "height": "479"}, "original": {"url": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "479", "width": "500", "mp4": "https://media2.giphy.com/media/W3QKEujo8vztC/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "727898", "mp4_size": "34105", "frames": "17", "size": "1527014"}}, "title": "cold cat GIF", "trending_datetime": 
"2017-11-26 21:15:02", "source_post_url": "https://www.reddit.com/r/gifs/comments/3v1amj/two_little_cats_under_blanket/", "content_url": "", "slug": "cats-blanket-W3QKEujo8vztC", "source": "https://www.reddit.com/r/gifs/comments/3v1amj/two_little_cats_under_blanket/", "source_tld": "www.reddit.com", "is_sticker": 0, "bitly_gif_url": "https://gph.is/1Pst7YA", "type": "gif", "id": "W3QKEujo8vztC", "import_datetime": "2015-12-01 19:25:51", "bitly_url": "https://gph.is/1Pst7YA"}, {"username": "", "rating": "g", "embed_url": "https://giphy.com/embed/aC45M5Q4D07Pq", "is_indexable": 0, "url": "https://giphy.com/gifs/cat-funny-animation-aC45M5Q4D07Pq", "images": {"fixed_height_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "267", "height": "200"}, "fixed_width_small": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "75", "width": "100", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "49416", "mp4_size": "28230", "size": "158984"}, "fixed_width_small_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "100", "height": "75"}, "preview_webp": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-preview.webp?cid=e1bb72ff5a8e4877344d53506749c515", "width": "183", "size": "49910", "height": "137"}, "fixed_height": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "267", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "238328", "mp4_size": "14331", "size": 
"262492"}, "fixed_height_small_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "133", "height": "100"}, "480w_still": {"url": "https://media1.giphy.com/media/aC45M5Q4D07Pq/480w_s.jpg?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "height": "360"}, "downsized_medium": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "size": "1014141", "height": "360"}, "preview": {"width": "430", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "44084", "height": "322"}, "preview_gif": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-preview.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "117", "size": "49856", "height": "88"}, "fixed_height_small": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "100", "width": "133", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/100.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "77792", "mp4_size": "31479", "size": "262492"}, "fixed_width": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "150", "width": "200", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "157304", "mp4_size": "18755", "size": "158984"}, "fixed_width_downsampled": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "150", "width": "200", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "37820", "size": 
"135919"}, "original_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "height": "360"}, "fixed_height_downsampled": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "267", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "57292", "size": "217595"}, "downsized_small": {"width": "480", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-downsized-small.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "61768", "height": "360"}, "original_mp4": {"width": "480", "mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "62457", "height": "360"}, "downsized_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-downsized_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "size": "66043", "height": "360"}, "looping": {"mp4": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-loop.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "3717877"}, "downsized_large": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "size": "1014141", "height": "360"}, "fixed_width_still": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/200w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "200", "height": "150"}, "downsized": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy-downsized.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "size": "1014141", "height": "360"}, "original": {"url": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "360", "width": "480", "mp4": 
"https://media2.giphy.com/media/aC45M5Q4D07Pq/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "519212", "mp4_size": "62457", "frames": "25", "size": "1014141"}}, "title": "cat massaging GIF", "trending_datetime": "1970-01-01 00:00:00", "source_post_url": "https://gifbinge.tumblr.com/post/56881793567/que-gustito", "content_url": "", "slug": "cat-funny-animation-aC45M5Q4D07Pq", "source": "https://gifbinge.tumblr.com/post/56881793567/que-gustito", "source_tld": "gifbinge.tumblr.com", "is_sticker": 0, "bitly_gif_url": "https://gph.is/11sgz9i", "type": "gif", "id": "aC45M5Q4D07Pq", "import_datetime": "2013-07-30 12:10:58", "bitly_url": "https://gph.is/11sgz9i"}, {"username": "producthunt", "rating": "g", "embed_url": "https://giphy.com/embed/3o72EX5QZ9N9d51dqo", "images": {"fixed_height_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "384", "height": "200"}, "fixed_width_small": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "52", "width": "100", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "53824", "mp4_size": "10996", "size": "78913"}, "fixed_width_small_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "100", "height": "52"}, "preview_webp": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-preview.webp?cid=e1bb72ff5a8e4877344d53506749c515", "width": "146", "size": "48646", "height": "76"}, "fixed_height": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": 
"384", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "328312", "mp4_size": "53724", "size": "848855"}, "fixed_height_small_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "192", "height": "100"}, "480w_still": {"url": "https://media4.giphy.com/media/3o72EX5QZ9N9d51dqo/480w_s.jpg?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "height": "250"}, "downsized_medium": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "680", "size": "2218338", "height": "354"}, "preview": {"width": "330", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "43887", "height": "170"}, "preview_gif": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-preview.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "125", "size": "48211", "height": "65"}, "fixed_height_small": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "100", "width": "192", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/100.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "134630", "mp4_size": "25192", "size": "255091"}, "fixed_width": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "104", "width": "200", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "143644", "mp4_size": "26044", "size": "267677"}, "fixed_width_downsampled": {"url": 
"https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "104", "width": "200", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "45544", "size": "93282"}, "original_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "680", "height": "354"}, "fixed_height_downsampled": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "384", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "104026", "size": "284924"}, "downsized_small": {"width": "680", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-downsized-small.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "134785", "height": "354"}, "original_mp4": {"width": "480", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "70326", "height": "248"}, "downsized_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-tumblr_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "height": "260"}, "looping": {"mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-loop.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "805212"}, "downsized_large": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "680", "size": "2218338", "height": "354"}, "fixed_width_still": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/200w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "200", "height": "104"}, "downsized": {"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy-tumblr.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "1325799", "height": "260"}, "original": 
{"url": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "354", "width": "680", "mp4": "https://media3.giphy.com/media/3o72EX5QZ9N9d51dqo/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "675100", "mp4_size": "70326", "frames": "19", "size": "2218338"}}, "is_indexable": 1, "url": "https://giphy.com/gifs/producthunt-cats-music-streaming-3o72EX5QZ9N9d51dqo", "user": {"username": "producthunt", "display_name": "Product Hunt", "banner_url": "https://media.giphy.com/headers/producthunt/upKxmTbSoKwU.jpg", "twitter": "@ProductHunt", "avatar_url": "https://media.giphy.com/avatars/producthunt/DgeHhyfC27o9.jpg", "is_verified": true, "profile_url": "https://giphy.com/producthunt/"}, "title": "cats dj GIF by Product Hunt", "trending_datetime": "2016-08-23 13:00:01", "source_post_url": "https://www.producthunt.com/topics/music-streaming", "content_url": "", "slug": "producthunt-cats-music-streaming-3o72EX5QZ9N9d51dqo", "source": "https://www.producthunt.com/topics/music-streaming", "source_tld": "www.producthunt.com", "is_sticker": 0, "bitly_gif_url": "https://gph.is/29jEXUA", "type": "gif", "id": "3o72EX5QZ9N9d51dqo", "import_datetime": "2016-06-30 23:58:11", "bitly_url": "https://gph.is/29jEXUA"}, {"username": "", "rating": "g", "embed_url": "https://giphy.com/embed/aEXP6scfSSwQo", "is_indexable": 1, "url": "https://giphy.com/gifs/aEXP6scfSSwQo", "images": {"fixed_height_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "267", "height": "200"}, "fixed_width_small": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/100w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/100w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "75", "width": "100", "mp4": 
"https://media0.giphy.com/media/aEXP6scfSSwQo/100w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "25738", "mp4_size": "3734", "size": "58973"}, "fixed_width_small_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/100w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "100", "height": "75"}, "preview_webp": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-preview.webp?cid=e1bb72ff5a8e4877344d53506749c515", "width": "281", "size": "42806", "height": "211"}, "fixed_height": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/200.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "267", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/200.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "111042", "mp4_size": "10926", "size": "364674"}, "fixed_height_small_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/100_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "133", "height": "100"}, "480w_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/480w_s.jpg?cid=e1bb72ff5a8e4877344d53506749c515", "width": "480", "height": "360"}, "downsized_medium": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "809099", "height": "375"}, "preview": {"width": "500", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "36943", "height": "374"}, "preview_gif": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-preview.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "167", "size": "48281", "height": "125"}, "fixed_height_small": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/100.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/100.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "100", 
"width": "133", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/100.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "37194", "mp4_size": "4844", "size": "96352"}, "fixed_width": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "150", "width": "200", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "69996", "mp4_size": "7625", "size": "209546"}, "fixed_width_downsampled": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "150", "width": "200", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "37888", "size": "116344"}, "original_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "height": "375"}, "fixed_height_downsampled": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200_d.gif?cid=e1bb72ff5a8e4877344d53506749c515", "height": "200", "width": "267", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/200_d.webp?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "60488", "size": "202283"}, "downsized_small": {"width": "500", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-downsized-small.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "36943", "height": "374"}, "original_mp4": {"width": "480", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "27294", "height": "360"}, "downsized_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-downsized_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "44484", "height": "375"}, "looping": {"mp4": 
"https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-loop.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "mp4_size": "1297124"}, "downsized_large": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "809099", "height": "375"}, "fixed_width_still": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/200w_s.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "200", "height": "150"}, "downsized": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy-downsized.gif?cid=e1bb72ff5a8e4877344d53506749c515", "width": "500", "size": "809099", "height": "375"}, "original": {"url": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", "webp": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515", "height": "375", "width": "500", "mp4": "https://media0.giphy.com/media/aEXP6scfSSwQo/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515", "webp_size": "462834", "mp4_size": "27294", "frames": "11", "size": "809099"}}, "title": "best friends neck kiss GIF", "trending_datetime": "2017-02-17 02:30:06", "source_post_url": "https://imgur.com/gallery/4eJVw0b", "content_url": "", "slug": "aEXP6scfSSwQo", "source": "https://imgur.com/gallery/4eJVw0b", "source_tld": "imgur.com", "is_sticker": 0, "bitly_gif_url": "https://gph.is/299HKBX", "type": "gif", "id": "aEXP6scfSSwQo", "import_datetime": "2016-07-02 10:55:23", "bitly_url": "https://gph.is/299HKBX"}]}'
giphy_dataset_uri = 'http://api.giphy.com/v1/gifs/search?q=cats&api_key=dc6zaTOxFJmzC&limit=5'
def get_giphy_json():
    import requests
    response = requests.get(giphy_dataset_uri)
    if not response.ok:
        response.raise_for_status()
    return response.json()

def load_giphy_data(reset=False):
    if reset:
        # get_giphy_json() already returns parsed JSON (a dict), so no json.loads here
        return get_giphy_json()
    else:
        return json.loads(giphy_json_response_sample)

Load the data

In [3]:
giphydata = load_giphy_data()

We'll focus on just one object for now.

In [4]:
sample_data = giphydata['data'][0]

## A. Learning the source data

###### Response data (sketch)

<pre>
pagination
  ...
meta
  ...
data
  [
      record,
      record,
      ...,
  ]
</pre>

###### One of the records in the data collection 

<pre>
field
field
...
images
  kind1
      field,
      field,
      ...
  kind2
      field,
      field,
      ...
  kind3
      field,
      field,
      ...
</pre>
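
A sketch like the one above can also be generated rather than drawn by hand. The helper below is a minimal illustration, not part of the original notebook: `outline` and the trimmed `record` are hypothetical names and data, and it is written for Python 3. It only walks nested dicts, so a list such as the top-level `data` collection would need to be indexed first.

```python
def outline(obj, indent=0, max_depth=2):
    """Return one line per dict key, indented by nesting depth."""
    lines = []
    if isinstance(obj, dict) and indent <= max_depth:
        for key in sorted(obj):
            lines.append("  " * indent + str(key))
            # Recurse into the value; non-dict values contribute no lines.
            lines.extend(outline(obj[key], indent + 1, max_depth))
    return lines


# Hypothetical, heavily trimmed stand-in for one record.
record = {
    "id": "abc123",
    "title": "example GIF",
    "images": {
        "original": {"url": "...", "width": "500", "height": "375"},
        "preview": {"width": "500", "height": "374"},
    },
}

print("\n".join(outline(record)))
```

Running it on a real record (for example `sample_data`) should give a quick view of which image kinds exist and which fields each one carries.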

## B. Use case 1: Filter out unused data

We want objects in the collection to have fewer fields.

##### Create data like this:

<pre>
{
     "bitly_gif_url": "...", 
     "giphy_id": "...", 
     "import_datetime": "...", 
     "rating": "...", 
     "slug": "...", 
     "source": "...", 
     "title": "...", 
     "image_type": "...", 
     "giphy_url": "...", 
     "username": "..."
}
</pre>

#####  ... from source data like this

<pre>
{
  "bitly_gif_url": "https://gph.is/1B5sZnz", 
  "bitly_url": "https://gph.is/1B5sZnz", 
  "content_url": "", 
  "embed_url": "https://giphy.com/embed/Ov5NiLVXT8JEc", 
  "id": "Ov5NiLVXT8JEc", 
  "images": {
    "480w_still": {
      "height": "270", 
      "url": "https://media2.giphy.com/media/Ov5NiLVXT8JEc/480w_s.jpg?cid=e1bb7...
      "width": "480"
    }, 
    "downsized": {
      "height": "281", 
      "size": "1802952", 
      "url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72...
      "width": "500"
    }, 
    "downsized_large": {
      "height": "281", 
       ...
  }, 
  "import_datetime": "2014-08-30 20:50:33", 
  "is_indexable": 1, 
  "is_sticker": 0, 
  "rating": "g", 
  "slug": "cats-light-sabers-Ov5NiLVXT8JEc", 
  "source": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-stro...
  "source_post_url": "https://hobolunchbox.tumblr.com/post/96197585095/the-forc...
  "source_tld": "hobolunchbox.tumblr.com", 
  "title": "star wars fighting GIF", 
  "trending_datetime": "2015-10-19 21:26:46", 
  "type": "gif", 
  "url": "https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc", 
  "username": ""
}
</pre>

#### Use Case 1, Solution 1: A Mapper Function

In [5]:
def giphy_mapper(dct):
    new_dict = {
        "bitly_gif_url": dct.get("bitly_gif_url"),
        "giphy_id": dct.get("id"),  # Target field changed because "id" is a Python built-in name
        "import_datetime": dct.get("import_datetime"),
        "rating": dct.get("rating"),
        "slug": dct.get("slug"),
        "source": dct.get("source"),
        "title": dct.get("title"),
        "image_type": dct.get("type"),  # Target field changed because "type" is a Python built-in name
        "giphy_url": dct.get("url"), # We just want to be clear it's giphy's url, not just anybody's url
        "username": dct.get("username"),
    }
    return new_dict

In [6]:
giphy_mapper(sample_data)

{'bitly_gif_url': u'https://gph.is/1B5sZnz',
 'giphy_id': u'Ov5NiLVXT8JEc',
 'giphy_url': u'https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc',
 'image_type': u'gif',
 'import_datetime': u'2014-08-30 20:50:33',
 'rating': u'g',
 'slug': u'cats-light-sabers-Ov5NiLVXT8JEc',
 'source': u'https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles',
 'title': u'star wars fighting GIF',
 'username': u''}

#### Use Case 1, Solution 2: A Key Map

In [7]:
def key_mapper(dct, keymap):
    return {v:dct.get(k) for k,v in keymap.iteritems() if k in dct}

In [8]:
key_map_source_to_target_fields = {
    "bitly_gif_url": "bitly_gif_url",
    "id": "giphy_id",  # Target field changed because "id" is a Python built-in name
    "import_datetime": "import_datetime",
    "rating": "rating",
    "slug": "slug",
    "source": "source",
    "title": "title",
    "type": "image_type",  # Target field changed because "type" is a Python built-in name
    "url": "giphy_url", # We just want to be clear it's giphy's url, not just anybody's url
    "username": "username",
}
 
key_mapper(sample_data, key_map_source_to_target_fields)

{'bitly_gif_url': u'https://gph.is/1B5sZnz',
 'giphy_id': u'Ov5NiLVXT8JEc',
 'giphy_url': u'https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc',
 'image_type': u'gif',
 'import_datetime': u'2014-08-30 20:50:33',
 'rating': u'g',
 'slug': u'cats-light-sabers-Ov5NiLVXT8JEc',
 'source': u'https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles',
 'title': u'star wars fighting GIF',
 'username': u''}

#### Use Case 1, Debrief: What's wrong with basic mapping?

Basic mapping is not a problem. But you might outgrow the home-grown mapper when your needs change.
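
One behavioural difference between the two solutions is easy to miss: the mapper function always emits every target key (with `None` for anything missing), while the key map skips source keys that are absent. The sketch below shows this on a hypothetical two-field record. It uses Python 3's `.items()` where the notebook uses `iteritems()`, and `mapper_function` is a cut-down stand-in for `giphy_mapper`, not the original.

```python
def mapper_function(dct):
    # Solution 1 style: every target key is always present.
    return {"giphy_id": dct.get("id"), "username": dct.get("username")}


def key_mapper(dct, keymap):
    # Solution 2 style: source keys missing from dct are skipped entirely.
    return {target: dct[source] for source, target in keymap.items() if source in dct}


keymap = {"id": "giphy_id", "username": "username"}
record = {"id": "abc123"}  # hypothetical record with no "username" field

print(mapper_function(record))
print(key_mapper(record, keymap))
```

Neither behaviour is wrong, but downstream code that expects a fixed set of columns will care which one it gets.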

## C. Use Case 2: Nested records

Returning to the object, we decide we really want the image data, currently nested in a collection.

There are _23 image records_: fixed_height_still, fixed_width_small, fixed_width_small_still, preview_webp, ... original, downsized, fixed_width_still

But we want _only three_: *downsized, original, preview*

##### Create data like this:

<pre>
{
     "bitly_gif_url": "...", 
     "giphy_id": "...", 
     "import_datetime": "...", 
     "rating": "...", 
     "slug": "...", 
     "images": {
            "downsized": {
                "height": "...", 
                "size": "...", 
                "url": "...", 
                "width": "..."
            },
            "original": {
                "height": "...", 
                "size": "...", 
                "url": "...", 
                "width": "..."
            }, 
            "preview": {
                "height": "...", 
                "width": "..."
            }
     },
     "source": "...", 
     "title": "...", 
     "image_type": "...", 
     "giphy_url": "...", 
     "username": "..."
}
</pre>

#### Use Case 2, Solution 1: Just add some code

(This doesn't work out.)

In [9]:
def giphy_mapper(dct):
    new_dict = {
        "bitly_gif_url": dct.get("bitly_gif_url"),
        "giphy_id": dct.get("id"),  # Target field changed because "id" is a Python built-in name
        "import_datetime": dct.get("import_datetime"),
        "rating": dct.get("rating"),
        "slug": dct.get("slug"),
        "source": dct.get("source"),
        "title": dct.get("title"),
        "image_type": dct.get("type"),  # Target field changed because "type" is a Python built-in name
        "giphy_url": dct.get("url"), # We just want to be clear it's giphy's url, not just anybody's url
        "username": dct.get("username"),
    }
    new_dict["images"] = {
        "downsized": dct["images"].get('downsized'),
        "original": dct["images"].get('original'),
        "preview": dct["images"].get('preview'),
    }
    
    return new_dict

In [10]:
giphy_mapper(sample_data)

{'bitly_gif_url': u'https://gph.is/1B5sZnz',
 'giphy_id': u'Ov5NiLVXT8JEc',
 'giphy_url': u'https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc',
 'image_type': u'gif',
 'images': {'downsized': {u'height': u'281',
   u'size': u'1802952',
   u'url': u'https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515',
   u'width': u'500'},
  'original': {u'frames': u'22',
   u'height': u'281',
   u'mp4': u'https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.mp4?cid=e1bb72ff5a8e4877344d53506749c515',
   u'mp4_size': u'121585',
   u'size': u'1802952',
   u'url': u'https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515',
   u'webp': u'https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.webp?cid=e1bb72ff5a8e4877344d53506749c515',
   u'webp_size': u'538448',
   u'width': u'500'},
  'preview': {u'height': u'128',
   u'mp4': u'https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy-preview.mp4?cid=e1bb72ff5a8e4877344d53506749c515',
   

Darn. That's too much. It has more data than I want. Forgot about that.

#### Use Case 2, Solution 2

(This technically works, but it's falling apart.)

In [11]:
#+++++++++++ 
# NEW 
#+++++++++++ 
# I'll just go add stuff and re-use the old mapper

def filter_image_fields(image_dct, keep=("height", "size", "url", "width",)):
    new_dict = {}
    for field in image_dct:
        if field in keep:
            new_dict[field] = image_dct[field]
    return new_dict

# And I'll call this "extras" cause it's the extra stuff I didn't do before. (lolz)
def giphy_mapper_extras(mapped_data):
    new_images = {}
    for dct_label, image_dct in mapped_data['images'].iteritems():
        new_images[dct_label] = filter_image_fields(image_dct)
    mapped_data['images'] = new_images
    return mapped_data

mapped_data = giphy_mapper_extras(giphy_mapper(sample_data))

# Take that. 
print json.dumps(mapped_data, indent=2, sort_keys=True)

{
  "bitly_gif_url": "https://gph.is/1B5sZnz", 
  "giphy_id": "Ov5NiLVXT8JEc", 
  "giphy_url": "https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc", 
  "image_type": "gif", 
  "images": {
    "downsized": {
      "height": "281", 
      "size": "1802952", 
      "url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
      "width": "500"
    }, 
    "original": {
      "height": "281", 
      "size": "1802952", 
      "url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
      "width": "500"
    }, 
    "preview": {
      "height": "128", 
      "width": "230"
    }
  }, 
  "import_datetime": "2014-08-30 20:50:33", 
  "rating": "g", 
  "slug": "cats-light-sabers-Ov5NiLVXT8JEc", 
  "source": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles", 
  "title": "star wars fighting GIF", 
  "username": ""
}


#### Use Case 2: Debrief
This is the fast way. Just stick stuff in your code to meet the need. But code has to be maintained. 

##### Problems
* you're bolting more and more onto the mapping code
* your mapping code turns into spaghetti
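
One concrete way bolted-on code can bite: the image selection uses `.get(kind)`, which returns `None` for a missing kind, and the field filter then tries to iterate over that `None`. The sketch below reproduces the pattern on a hypothetical record that lacks some image kinds. Whether real records ever omit them is an assumption; `select_images` is an illustrative stand-in for the loop inside `giphy_mapper`, and the code is Python 3.

```python
def filter_image_fields(image_dct, keep=("height", "size", "url", "width")):
    # Same logic as the notebook's filter: iterate the dict, keep wanted fields.
    return {field: image_dct[field] for field in image_dct if field in keep}


def select_images(record, kinds=("downsized", "original", "preview")):
    # Mirrors dct["images"].get(kind): a missing kind silently becomes None.
    return {kind: record["images"].get(kind) for kind in kinds}


# Hypothetical record with only one of the three wanted image kinds.
record = {"images": {"downsized": {"url": "...", "width": "500", "mp4": "..."}}}
selected = select_images(record)

failed = False
try:
    {kind: filter_image_fields(img) for kind, img in selected.items()}
except TypeError:
    # Iterating over None raises TypeError, far from where the None was created.
    failed = True

# One possible guard: treat a missing image kind as an empty dict.
safe = {kind: filter_image_fields(img or {}) for kind, img in selected.items()}

print(failed)
print(safe)
```

The guard works, but it is one more special case layered onto the mapper, which is exactly the trend listed above.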

## D. Use Case 3: Flatten data, add and transform fields

Now your map folds

##### We're asked to make some changes:

* _"Can you also make the data not nested? It needs to show in a table (import to Excel, scroll on the screen, etc.)."_
* Also... _can you make the size in MB not bits?_
* _height and width like "640x480"_
* _calculate the difference in height between the original and the preview. We need that for analytics. ("No you don't." "Yes we do!")_
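
Before touching the mapper, it helps to work the three requested transformations by hand on the sample record's values (size `1802952`, original `500` x `281`, preview `230` x `128`). The helpers below are a minimal Python 3 sketch. `bytes_to_mb` and `height_diff` are hypothetical names, and the conversion assumes the source `size` is a byte count and that 1 MB means 1,000,000 bytes, which is what a target value of "1.8 MB" implies.

```python
def bytes_to_mb(size_bytes):
    # Assumption: decimal megabytes (1 MB = 1,000,000 bytes).
    return "%.1f MB" % (int(size_bytes) / 1000000.0)


def make_dimensions(width, height, fmt="%s x %s"):
    return fmt % (width, height)


def height_diff(original_height, preview_height):
    # Source values are strings, so convert before subtracting.
    return int(original_height) - int(preview_height)


print(bytes_to_mb("1802952"))         # 1.8 MB
print(make_dimensions("500", "281"))  # 500 x 281
print(height_diff("281", "128"))      # 153
```

Note that the divisor matters: dividing the same size by 1000 gives 1802, which is a kilobyte figure, not megabytes.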

##### Create data like this:
<pre>
{
    "bitly_gif_url": "https://gph.is/1B5sZnz", 
    "giphy_id": "Ov5NiLVXT8JEc", 
    "giphy_url": "https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc", 
    "image_type": "gif", 
    "downsized_height": "281", 
    "downsized_size": "1.8 MB", 
    "downsized_url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
    "downsized_width": "500", 
    "downsized_dimensions": "500 x 281", 
    "original_dimensions": "500 x 281", 
    "original_height": "281", 
    "original_size": "1.8 MB", 
    "original_url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
    "original_width": "500", 
    "preview_height": "128", 
    "preview_width": "230", 
    "preview_dimensions": "230 x 128", 
    "height_diff": 153, 
    "import_datetime": "2014-08-30 20:50:33", 
    "rating": "g", 
    "slug": "cats-light-sabers-Ov5NiLVXT8JEc", 
    "source": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-mr-pickles", 
    "title": "star wars fighting GIF", 
    "username": ""
}
</pre>

#### Use Case 3: Solution

(Not good!)

In [12]:
# Now you adapt it again

def giphy_mapper(dct):
    #+++++++++++ 
    # OLD 
    #+++++++++++ 
    new_dict = {
        "bitly_gif_url": dct.get("bitly_gif_url"),
        "giphy_id": dct.get("id"),  # Target field changed because "id" is a Python built-in name
        "import_datetime": dct.get("import_datetime"),
        "rating": dct.get("rating"),
        "slug": dct.get("slug"),
        "source": dct.get("source"),
        "title": dct.get("title"),
        "image_type": dct.get("type"),  # Target field changed because "type" is a Python built-in name
        "giphy_url": dct.get("url"), # We just want to be clear it's giphy's url, not just anybody's url
        "username": dct.get("username"),
    }
    
    #+++++++++++ 
    # NEW 
    #+++++++++++ 
    # DRY up a few things here. Being a smart programmer, you know.
    images = 'downsized', 'original', 'preview'
    new_dict["images"] = {}
    for image in images:
        new_dict["images"][image] = dct["images"].get(image)
    return new_dict

def giphy_mapper_extras(mapped_data):
    #+++++++++++ 
    # OLD 
    #+++++++++++ 
    new_images = {}
    for dct_label, image_dct in mapped_data['images'].iteritems():
        new_images[dct_label] = filter_image_fields(image_dct)
    #+++++++++++ 
    # NEW 
    #+++++++++++ 
    flat_dict = {}
    # I'll flatten the nested image dicts here: images[name][field] -> "<name>_<field>"
    for image_name, image_data in new_images.iteritems():
        for image_field, val in image_data.iteritems():
            flat_dict[image_name+"_"+image_field] = val
    mapped_data.update(flat_dict)
    del mapped_data["images"]
    return mapped_data

#+++++++++++ 
# NEW 
#+++++++++++ 
# I'll transform with something like this
def make_dimensions(width,height, fmt='%s x %s'):
    return fmt % (width, height,)

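def transform_data(data):
    # NOTE: the source sizes look like byte counts, so dividing by 1000 gives KB, not MB.
    # The target format above ("1.8 MB") would need int(...) / 1000000.0 instead.
    data['original_size'] = "%s MB" % (int(data['original_size'])/1000 )
    data['downsized_size'] = "%s MB" % (int(data['downsized_size'])/1000)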
    data['height_diff'] = int(data["original_height"]) - int(data["preview_height"])
    data['preview_dimensions'] = make_dimensions(data['preview_width'],data['preview_height'])
    data['original_dimensions'] = make_dimensions(data['original_width'],data['original_height'])
    data['downsized_dimensions'] = make_dimensions(data['downsized_width'],data['downsized_height'])
    return data

In [13]:
# Now we're done.
transformed_data = transform_data(giphy_mapper_extras(giphy_mapper(sample_data)))
print json.dumps(transformed_data, sort_keys=True, indent=3)

{
   "bitly_gif_url": "https://gph.is/1B5sZnz", 
   "downsized_dimensions": "500 x 281", 
   "downsized_height": "281", 
   "downsized_size": "1802 MB", 
   "downsized_url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
   "downsized_width": "500", 
   "giphy_id": "Ov5NiLVXT8JEc", 
   "giphy_url": "https://giphy.com/gifs/cats-light-sabers-Ov5NiLVXT8JEc", 
   "height_diff": 153, 
   "image_type": "gif", 
   "import_datetime": "2014-08-30 20:50:33", 
   "original_dimensions": "500 x 281", 
   "original_height": "281", 
   "original_size": "1802 MB", 
   "original_url": "https://media1.giphy.com/media/Ov5NiLVXT8JEc/giphy.gif?cid=e1bb72ff5a8e4877344d53506749c515", 
   "original_width": "500", 
   "preview_dimensions": "230 x 128", 
   "preview_height": "128", 
   "preview_width": "230", 
   "rating": "g", 
   "slug": "cats-light-sabers-Ov5NiLVXT8JEc", 
   "source": "https://hobolunchbox.tumblr.com/post/96197585095/the-force-is-strong-with-m

# Conclusion: What's wrong with DIY mapping?

* You need to **manipulate data in your head**
* No clear pattern to apply when adding or changing it
* Too slow to understand
* Not explicit 

**When you are squinting and staring to follow the data transformations, you're doing it wrong.**

### There is a better way.

Solution in another notebook...