#### TITLE
List Comprehensions And Lambda Functions

#### OBJECTIVE
* Creating list comprehensions to replace loops with a single line of code.
* Creating single-use functions called lambda functions.

#### DATASET
The data set for this mission, **hn_2014.json**, was downloaded from the Hacker News API. It is a different data set from the CSV we used in the previous two missions, and it contains data about stories posted to Hacker News in 2014.

There are keys representing the title, URL, points, number of comments, and date, as well as some others that are less familiar to us. 

| **Column**      | **Definition** |
| :---------- | :--------- |
| **author**  | The username of the person who submitted the story.|
| **createdAt**     | The date and time at which the story was created.|
| **createdAtI** |An integer value representing the date and time at which the story was created.|
| **numComments** | The number of comments that were made on the story.|
| **objectId**  |The unique identifier from Hacker News for the story.|
| **points**     |The number of points the story acquired, calculated as the total number of upvotes minus the total number of downvotes.|
| **storyText** |The text of the story (if the story contains text).|
| **tags**  | A list of tags associated with the story.|
| **title**     | The title of the story.|
| **url** |The URL that the story links to (if the story has a URL).|


#### The JSON Format
The data set we'll use in this mission is in a format called JavaScript Object Notation (JSON). As the name indicates, JSON originated from the JavaScript language, but has now become a language-independent format.

From a Python perspective, JSON can be thought of as a collection of Python objects nested inside each other.
![title](./img/Json_1.png)

The JSON above is a list, where each element in the list is a dictionary. Each of the dictionaries has the same keys, and one of the values of each dictionary is itself a list.

The Python json [module](https://docs.python.org/3.7/library/json.html#module-json) contains a number of functions to make working with JSON objects easier. We can use the json.loads() function to convert JSON data contained in a string to the equivalent set of Python objects:

#### String To JSON Obj

In [7]:
json_string = """
[
  {
    "name": "Sabine",
    "age": 36,
    "favorite_foods": ["Pumpkin", "Oatmeal"]
  },
  {
    "name": "Zoe",
    "age": 40,
    "favorite_foods": ["Chicken", "Pizza", "Chocolate"]
  },
  {
    "name": "Heidi",
    "age": 40,
    "favorite_foods": ["Caesar Salad"]
  }
]
"""

import json
json_obj = json.loads(json_string)
print(type(json_obj), '\n')
print(json_obj)

<class 'list'> 

[{'name': 'Sabine', 'age': 36, 'favorite_foods': ['Pumpkin', 'Oatmeal']}, {'name': 'Zoe', 'age': 40, 'favorite_foods': ['Chicken', 'Pizza', 'Chocolate']}, {'name': 'Heidi', 'age': 40, 'favorite_foods': ['Caesar Salad']}]


We can observe a few things:

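* The formatting from our original string is gone. This is because print() displays lists and dictionaries using Python's own compact representation, not the layout of the source string.
* The order of the keys may differ from the original string on older Python versions. Dictionaries are guaranteed to preserve insertion order only from Python 3.7 onward (CPython 3.6 already did so as an implementation detail). In the output above, the original order is preserved.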
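Because the parsed result is made of ordinary lists and dictionaries, individual values can be reached by chaining indexes and keys. The sketch below uses a shortened copy of the string above; the variable names are illustrative only.

```python
import json

# A shortened version of the JSON string used above.
json_string = """
[
  {"name": "Sabine", "age": 36, "favorite_foods": ["Pumpkin", "Oatmeal"]},
  {"name": "Zoe", "age": 40, "favorite_foods": ["Chicken", "Pizza", "Chocolate"]}
]
"""

people = json.loads(json_string)

# Index into the list, then look up a key in the dictionary.
first_name = people[0]["name"]

# Values can themselves be lists, so indexing can be chained further.
zoe_second_food = people[1]["favorite_foods"][1]

print(first_name)
print(zoe_second_food)
```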

In [3]:
world_cup_str = """
[
    {
        "team_1": "France",
        "team_2": "Croatia",
        "game_type": "Final",
        "score" : [4, 2]
    },
    {
        "team_1": "Belgium",
        "team_2": "England",
        "game_type": "3rd/4th Playoff",
        "score" : [2, 0]
    }
]
"""
import json
world_cup_obj = json.loads(world_cup_str)
world_cup_obj

[{'team_1': 'France',
  'team_2': 'Croatia',
  'game_type': 'Final',
  'score': [4, 2]},
 {'team_1': 'Belgium',
  'team_2': 'England',
  'game_type': '3rd/4th Playoff',
  'score': [2, 0]}]

#### JSON Obj To String
Let's create a function that prints a JSON object with formatting that makes it easier to read.
The function will use the **json.dumps()** function **("dump string")**, which does the opposite of the **json.loads()** function: it takes a Python object and returns a JSON-formatted string version of it. The json.dumps() function accepts arguments that specify formatting for the string, which we'll use to make the output easier to read.

In [6]:
def jprint(obj):
    # create a formatted string of the Python JSON object
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)

# hn is the list of stories loaded from hn_2014.json
# (see "Reading a JSON File" below)
first_story = hn[0]
jprint(first_story)

{
    "author": "dragongraphics",
    "createdAt": "2014-05-29T08:07:50Z",
    "createdAtI": 1401350870,
    "numComments": 0,
    "objectId": "7815238",
    "points": 2,
    "storyText": "",
    "tags": [
        "story",
        "author_dragongraphics",
        "story_7815238"
    ],
    "title": "Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability",
    "url": "http://ashleynolan.co.uk/blog/are-we-getting-too-sassy"
}
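As a quick check that json.dumps() and json.loads() undo each other, the following sketch converts a small dictionary to a string and back. The record is made up for illustration and is not taken from the data set.

```python
import json

# Illustrative record; not taken from the data set.
record = {"title": "Example story", "points": 2, "tags": ["story", "example"]}

# Python object -> formatted JSON string
text = json.dumps(record, sort_keys=True, indent=4)

# JSON string -> Python object
restored = json.loads(text)

print(type(text))
print(restored == record)
```

With sort_keys=True the keys are written in alphabetical order, which is why jprint() output always lists them the same way regardless of their order in the dictionary.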


#### Reading a JSON File
One of the places where the JSON format is commonly used is in the results returned by an application programming interface (API). APIs are interfaces that can be used to send and receive data between different computer systems.
To read a JSON file, we use the json.load() function. Note that the function is json.load() without an "s" at the end. The json.loads() function is used for loading JSON data from a string ("loads" is short for "load string"), whereas the json.load() function is used to load from a file object.

In [5]:
import json

# Use a context manager so the file is closed once it has been read
with open("./datasets/hn_2014.json") as file:
    hn = json.load(file)

print(type(hn), '\n')
print(len(hn), '\n')
print(type(hn[0]), '\n')
print(hn[0].keys())

<class 'list'> 

35806 

<class 'dict'> 

dict_keys(['author', 'numComments', 'points', 'url', 'storyText', 'createdAt', 'tags', 'createdAtI', 'title', 'objectId'])
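The difference between json.load() and json.loads() can be sketched without a file on disk. Here io.StringIO stands in for an open file (an assumption made only to keep the example self-contained), and the sample data is invented.

```python
import io
import json

raw = '[{"title": "Example story", "points": 2}]'

# json.loads() parses a string directly.
from_string = json.loads(raw)

# json.load() reads from any object with a .read() method, such as an open file.
# io.StringIO is used here as an in-memory stand-in for a file on disk.
fake_file = io.StringIO(raw)
from_file = json.load(fake_file)

print(from_string == from_file)
```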


#### Deleting Dictionary Keys
You may notice that the createdAt and createdAtI keys both hold the date and time at which the story was created, in two different formats. Because the format of createdAt is much easier to understand, let's do some data cleaning by deleting the createdAtI key from every dictionary.

To delete a key from a dictionary, we can use the [del](https://docs.python.org/3.7/reference/simple_stmts.html#del) statement.
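A minimal sketch of del on a small, made-up dictionary is shown below. It also shows what happens when the key is missing, and dict.pop() with a default as one alternative that does not raise an error.

```python
story = {"title": "Example story", "points": 2, "createdAtI": 1401350870}

# Remove one key (and its value) in place.
del story["createdAtI"]
print(story)

# Deleting a key that does not exist raises KeyError.
try:
    del story["createdAtI"]
except KeyError as err:
    print("KeyError:", err)

# dict.pop() with a default removes the key if present and never raises.
removed = story.pop("createdAtI", None)
print(removed)
```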

In [11]:
def del_key(dict_, key):
    # create a copy so we don't
    # modify the original dict
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

first_story = hn[0]
jprint(first_story) 
print('Deleting [createdAtI]')

first_story = del_key(first_story, 'createdAtI')
jprint(first_story)

{
    "author": "dragongraphics",
    "createdAt": "2014-05-29T08:07:50Z",
    "createdAtI": 1401350870,
    "numComments": 0,
    "objectId": "7815238",
    "points": 2,
    "storyText": "",
    "tags": [
        "story",
        "author_dragongraphics",
        "story_7815238"
    ],
    "title": "Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability",
    "url": "http://ashleynolan.co.uk/blog/are-we-getting-too-sassy"
}
Deleting [createdAtI]
{
    "author": "dragongraphics",
    "createdAt": "2014-05-29T08:07:50Z",
    "numComments": 0,
    "objectId": "7815238",
    "points": 2,
    "storyText": "",
    "tags": [
        "story",
        "author_dragongraphics",
        "story_7815238"
    ],
    "title": "Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability",
    "url": "http://ashleynolan.co.uk/blog/are-we-getting-too-sassy"
}
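One caveat worth knowing: dict.copy() makes a shallow copy. The top-level keys of the copy are independent, but nested objects such as the tags list are shared with the original. The sketch below illustrates this with a made-up story and shows copy.deepcopy() as one way to get a fully independent copy.

```python
import copy

def del_key(dict_, key):
    # Same helper as above: copy, delete, return.
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

story = {"title": "Example story", "createdAtI": 1401350870, "tags": ["story"]}
cleaned = del_key(story, "createdAtI")

# Top-level keys are independent: the original still has createdAtI.
print("createdAtI" in story, "createdAtI" in cleaned)

# Nested objects are shared: both dictionaries point at the same tags list.
cleaned["tags"].append("edited")
print(story["tags"])

# copy.deepcopy() duplicates nested objects too.
independent = copy.deepcopy(story)
independent["tags"].append("only in the deep copy")
print(story["tags"])
```

For this mission the shallow copy is enough, because we only delete a top-level key and never modify the nested lists.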


Next, let's remove the createdAtI key from every story in our Hacker News data set.

In [12]:
def del_key(dict_, key):
    # create a copy so we don't
    # modify the original dict
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

hn_clean = []
for h in hn:
    modified_story = del_key(h, 'createdAtI')
    hn_clean.append(modified_story)
    
hn_clean

[{'author': 'dragongraphics',
  'numComments': 0,
  'points': 2,
  'url': 'http://ashleynolan.co.uk/blog/are-we-getting-too-sassy',
  'storyText': '',
  'createdAt': '2014-05-29T08:07:50Z',
  'tags': ['story', 'author_dragongraphics', 'story_7815238'],
  'title': 'Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability',
  'objectId': '7815238'},
 {'author': 'jcr',
  'numComments': 0,
  'points': 1,
  'url': 'http://spectrum.ieee.org/automaton/robotics/home-robots/telemba-telepresence-robot',
  'storyText': '',
  'createdAt': '2014-05-29T08:05:58Z',
  'tags': ['story', 'author_jcr', 'story_7815234'],
  'title': 'Telemba Turns Your Old Roomba and Tablet Into a Telepresence Robot',
  'objectId': '7815234'},
 {'author': 'callum85',
  'numComments': 0,
  'points': 1,
  'url': 'http://online.wsj.com/articles/apple-to-buy-beats-1401308971',
  'storyText': '',
  'createdAt': '2014-05-29T08:05:06Z',
  'tags': ['story', 'author_callum85', 'story_7815230'],
  'title': 'Apple

#### Writing List Comprehensions
The task we just performed is an extremely common one. We:
* Iterated over values in a list.
* Performed a transformation on those values.
* Assigned the result to a new list.

Python includes a special syntax shortcut for tasks that meet these criteria: list comprehensions. A list comprehension provides a concise way of creating lists in a single line of code.
![title](./img/ListTrns_1.png)

**Example 1: Add 1 to each item in a list of integers**

In [13]:
ints = [1, 2, 3, 4]

plus_one = []
for i in ints:
    plus_one.append(i + 1)

print(plus_one)

[2, 3, 4, 5]


To transform this structure into a list comprehension, we do the following within brackets:

* Start with the code that transforms each item.
* Continue with our for statement (without a colon).

We can then assign the list comprehension to a variable name. The image below shows how we convert the manual loop version to a list comprehension.
![title](./img/ListTrns_2.png)
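Written out as code, the two steps applied to Example 1 look like this. The loop version is repeated so the two results can be compared directly.

```python
ints = [1, 2, 3, 4]

# Loop version from Example 1.
plus_one_loop = []
for i in ints:
    plus_one_loop.append(i + 1)

# List comprehension: the transformation (i + 1) comes first,
# followed by the for statement without a colon.
plus_one = [i + 1 for i in ints]

print(plus_one)
print(plus_one == plus_one_loop)
```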

**Example 2: Multiply each item in the list by 10**

In [14]:
times_ten = []
for i in ints:
    times_ten.append(i * 10)

print(times_ten)

[10, 20, 30, 40]


In [17]:
times_ten = [i * 10 for i in ints]
times_ten

[10, 20, 30, 40]

The **"transformation"** step of our list comprehension can be anything, including a function or method. In the example below, we are applying a function to a list of floats to round them to integers.

**Example 3: Applying a function to a list of floats**

In [18]:
floats = [2.1, 8.7, 4.2, 8.9]

rounded = []
for f in floats:
    rounded.append(round(f))

print(rounded)

[2, 9, 4, 9]


In [19]:
rounded = [round(f) for f in floats]
rounded

[2, 9, 4, 9]

**Example 4: Apply a method to each string in a list to capitalize it**

In [20]:
letters = ['a', 'b', 'c', 'd']

caps = []
for l in letters:
    caps.append(l.upper())
    
caps

['A', 'B', 'C', 'D']

In [22]:
caps = [l.upper() for l in letters]
caps

['A', 'B', 'C', 'D']

A list comprehension can be used where we:
* Iterate over values in a list.
* Perform a transformation on those values.
* Assign the result to a new list.

To transform a loop into a list comprehension, in brackets we:
* Start with the code that transforms each item.
* Continue with our for statement (without a colon).

![title](./img/ListTrns_5.png)

In [28]:
hn_clean = [del_key(d, 'createdAtI') for d in hn]
hn_clean[0:2]

[{'author': 'dragongraphics',
  'numComments': 0,
  'points': 2,
  'url': 'http://ashleynolan.co.uk/blog/are-we-getting-too-sassy',
  'storyText': '',
  'createdAt': '2014-05-29T08:07:50Z',
  'tags': ['story', 'author_dragongraphics', 'story_7815238'],
  'title': 'Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability',
  'objectId': '7815238'},
 {'author': 'jcr',
  'numComments': 0,
  'points': 1,
  'url': 'http://spectrum.ieee.org/automaton/robotics/home-robots/telemba-telepresence-robot',
  'storyText': '',
  'createdAt': '2014-05-29T08:05:58Z',
  'tags': ['story', 'author_jcr', 'story_7815234'],
  'title': 'Telemba Turns Your Old Roomba and Tablet Into a Telepresence Robot',
  'objectId': '7815234'}]

#### Using List Comprehensions to Transform and Create Lists
List comprehensions can be used for many different things. Three common applications are:
1. Transforming a list
2. Creating a new list
3. Reducing a list

The first application, **transforming a list**, is the category that all the examples you've seen so far fit under. You are taking an existing list, applying a transformation to every value, and assigning it to a variable.

The second application, creating a new list, is useful for creating test data or data that is based on a set of numbers.
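As a minimal sketch of creating a new list, a comprehension can iterate over range() instead of an existing list. The names and the shape of the test records below are assumptions chosen for illustration.

```python
# Build a list of squares from a range of numbers.
squares = [n ** 2 for n in range(1, 6)]
print(squares)

# Build a few placeholder records, e.g. as test data for a function
# that expects a list of story dictionaries.
test_stories = [{"objectId": str(n), "points": 0} for n in range(3)]
print(test_stories)
```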

In [25]:
# Create a zero-filled DataFrame with generated column labels
import numpy as np
import pandas as pd

cols = ['col_{}'.format(i) for i in range(1, 5)]
data = np.zeros((4, 4))

df = pd.DataFrame(data, columns=cols)
print(df)

   col_1  col_2  col_3  col_4
0    0.0    0.0    0.0    0.0
1    0.0    0.0    0.0    0.0
2    0.0    0.0    0.0    0.0
3    0.0    0.0    0.0    0.0


In [27]:
urls = [d['url'] for d in hn_clean]
urls[0:5]

['http://ashleynolan.co.uk/blog/are-we-getting-too-sassy',
 'http://spectrum.ieee.org/automaton/robotics/home-robots/telemba-telepresence-robot',
 'http://online.wsj.com/articles/apple-to-buy-beats-1401308971',
 'http://alexsblog.org/2014/05/29/dont-wait-for-inspiration/',
 'http://techcrunch.com/2014/05/28/hackerone-get-9m-in-series-a-funding-to-build-bug-tracking-bounty-programs/']

#### Using List Comprehensions to Reduce a List
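The last common application of list comprehensions is reducing a list. Let's say we had a list of integers and we wanted to remove any integers that were smaller than 50. We do this by adding an if condition after the for statement; only the items for which the condition is true are kept.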

In [None]:
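# Example values (assumed for illustration); the ints list defined
# earlier only contains values below 50, which would give an empty list
ints = [25, 14, 13, 84, 43, 6, 77, 56]

big_ints = [i for i in ints if i >= 50]
print(big_ints)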
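The same pattern applies to lists of dictionaries like our stories. The sketch below uses a small made-up list rather than the real data set, and shows that a condition can be combined with a transformation in one comprehension.

```python
# Made-up stories with the same numComments key as the data set.
stories = [
    {"title": "A", "numComments": 0},
    {"title": "B", "numComments": 12},
    {"title": "C", "numComments": 3},
]

# Keep only the stories that received at least one comment.
commented = [s for s in stories if s["numComments"] > 0]
print(len(commented))

# Filter and transform together: titles of the commented stories.
commented_titles = [s["title"] for s in stories if s["numComments"] > 0]
print(commented_titles)
```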