# 1. The JSON Format

In [1]:
import json

world_cup_str = """
[
    {
        "team_1": "France",
        "team_2": "Croatia",
        "game_type": "Final",
        "score" : [4, 2]
    },
    {
        "team_1": "Belgium",
        "team_2": "England",
        "game_type": "3rd/4th Playoff",
        "score" : [2, 0]
    }
]
"""

world_cup_obj=json.loads(world_cup_str)
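
As a quick check on what `json.loads()` returns, the sketch below parses a shortened copy of the string above and inspects the result. The names `sample_str` and `sample_obj` are introduced here for illustration only.

```python
import json

# Shortened, illustrative copy of the World Cup string above
sample_str = """
[
    {"team_1": "France", "team_2": "Croatia", "game_type": "Final", "score": [4, 2]}
]
"""

sample_obj = json.loads(sample_str)

# JSON arrays become Python lists and JSON objects become dicts
print(type(sample_obj))        # <class 'list'>
print(type(sample_obj[0]))     # <class 'dict'>
print(sample_obj[0]["score"])  # [4, 2]
```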

# 2. Reading a JSON file

In [2]:
# Use a context manager so the file is closed after reading
with open("hn_2014.json") as file:
    hn = json.load(file)
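
`json.load()` reads from an open file object, whereas `json.loads()` parses a string. The sketch below shows the round trip on a small, made-up file written to a temporary directory, so it runs without `hn_2014.json`. The record contents and the file name `example.json` are assumptions for illustration.

```python
import json
import os
import tempfile

# Hypothetical records standing in for the real dataset
records = [{"title": "Example story", "points": 3}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.json")

    # Write the records out as JSON
    with open(path, "w") as f:
        json.dump(records, f)

    # Read them back; the with-block closes the file automatically
    with open(path) as f:
        loaded = json.load(f)

print(loaded == records)  # True
```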

# 3. Deleting Dictionary Keys

In [3]:
def del_key(dict_, key):
    # create a copy so we don't
    # modify the original dict
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

hn_clean=[]

print("Before Deleting Keys:",hn[:2], sep="\n")

for story in hn:
    new_dict = del_key(story, "createdAtI")
    hn_clean.append(new_dict)
    
print("After Deleting Keys:",hn_clean[:2], sep="\n")

Before Deleting Keys:
[{'author': 'dragongraphics', 'numComments': 0, 'points': 2, 'url': 'http://ashleynolan.co.uk/blog/are-we-getting-too-sassy', 'storyText': '', 'createdAt': '2014-05-29T08:07:50Z', 'tags': ['story', 'author_dragongraphics', 'story_7815238'], 'createdAtI': 1401350870, 'title': 'Are we getting too Sassy? Weighing up micro-optimisation vs. maintainability', 'objectId': '7815238'}, {'author': 'jcr', 'numComments': 0, 'points': 1, 'url': 'http://spectrum.ieee.org/automaton/robotics/home-robots/telemba-telepresence-robot', 'storyText': '', 'createdAt': '2014-05-29T08:05:58Z', 'tags': ['story', 'author_jcr', 'story_7815234'], 'createdAtI': 1401350758, 'title': 'Telemba Turns Your Old Roomba and Tablet Into a Telepresence Robot', 'objectId': '7815234'}]
After Deleting Keys:
[{'author': 'dragongraphics', 'numComments': 0, 'points': 2, 'url': 'http://ashleynolan.co.uk/blog/are-we-getting-too-sassy', 'storyText': '', 'createdAt': '2014-05-29T08:07:50Z', 'tags': ['story', 'aut
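
To see why `del_key()` copies the dictionary first, here is a small sketch using a made-up story (the `story` dictionary is an assumption, not a row from the dataset). It also shows that `dict.copy()` is a shallow copy, so nested objects such as the `tags` list are still shared between the original and the copy.

```python
def del_key(dict_, key):
    # Copy first so the caller's dictionary is left untouched
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

# Hypothetical story used only for this demonstration
story = {"title": "Example", "createdAtI": 1401350870, "tags": ["story"]}
cleaned = del_key(story, "createdAtI")

print("createdAtI" in story)    # True: the original still has the key
print("createdAtI" in cleaned)  # False: only the copy was modified

# copy() is shallow: the nested list is the same object in both dicts
print(cleaned["tags"] is story["tags"])  # True
```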

# 4. Writing List Comprehensions

![](https://s3.amazonaws.com/dq-content/355/loop_components_hn.svg)

In [4]:
# LOOP VERSION
#
# hn_clean = []
#
# for d in hn:
#     new_d = del_key(d, 'createdAtI')
#     hn_clean.append(new_d)

hn_clean = [del_key(story, "createdAtI") for story in hn]
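
The loop and the comprehension above should produce the same list. The sketch below checks this on a tiny stand-in dataset; `sample` is made up, and `del_key()` is repeated so the block runs on its own.

```python
def del_key(dict_, key):
    modified_dict = dict_.copy()
    del modified_dict[key]
    return modified_dict

# Hypothetical stand-in for the hn list
sample = [
    {"title": "First", "createdAtI": 1},
    {"title": "Second", "createdAtI": 2},
]

# Loop version: create an empty list, then append inside the loop
loop_result = []
for story in sample:
    loop_result.append(del_key(story, "createdAtI"))

# Comprehension version: the transformation comes first, then the iteration
comp_result = [del_key(story, "createdAtI") for story in sample]

print(loop_result == comp_result)  # True
print(comp_result)  # [{'title': 'First'}, {'title': 'Second'}]
```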

# 5. Using List Comprehensions to Transform and Create Lists

In [5]:
urls=[data["url"] for data in hn_clean]

print(urls)



# 6. Using List Comprehensions to Reduce a List

In [6]:
thousand_points=[data for data in hn_clean if data["points"]>1000]

num_thousand_points=len(thousand_points)

print(num_thousand_points)

8
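
Adding an `if` clause at the end of a comprehension keeps only the items that pass the test. The sketch below uses made-up data (`sample` is an assumption) and also shows one possible way to count matches without building the intermediate list.

```python
# Hypothetical stand-in for hn_clean
sample = [
    {"title": "First", "points": 1500},
    {"title": "Second", "points": 40},
    {"title": "Third", "points": 1001},
]

# Keep only the stories with more than 1000 points
high_scoring = [story for story in sample if story["points"] > 1000]
print(len(high_scoring))  # 2

# Alternative: count matches directly with a generator expression
num_high = sum(1 for story in sample if story["points"] > 1000)
print(num_high)  # 2
```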


# 7. Passing Functions as Arguments

In [7]:
def keyfunction(story):
    # Avoid naming the parameter `dict`, which shadows the built-in type
    return story["numComments"]

most_comments=max(hn_clean,key=keyfunction)

print(most_comments)

{'author': 'platz', 'numComments': 1208, 'points': 889, 'url': 'https://blog.mozilla.org/blog/2014/04/03/brendan-eich-steps-down-as-mozilla-ceo/', 'storyText': None, 'createdAt': '2014-04-03T19:02:53Z', 'tags': ['story', 'author_platz', 'story_7525198'], 'title': 'Brendan Eich Steps Down as Mozilla CEO', 'objectId': '7525198'}
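
The `key` argument works the same way for `max()`, `min()`, and `sorted()`: the function is passed by name, without parentheses, and is called once per item to decide what to compare. A minimal sketch on made-up data (the `sample` list and the name `get_num_comments` are assumptions):

```python
# Hypothetical stand-in for hn_clean
sample = [
    {"title": "First", "numComments": 12},
    {"title": "Second", "numComments": 340},
    {"title": "Third", "numComments": 0},
]

def get_num_comments(story):
    return story["numComments"]

# Pass the function itself (no parentheses), not the result of calling it
most = max(sample, key=get_num_comments)
least = min(sample, key=get_num_comments)

print(most["title"])   # Second
print(least["title"])  # Third
```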


# 8. Lambda Functions

![](https://s3.amazonaws.com/dq-content/355/lambda_1_components.svg)

![](https://s3.amazonaws.com/dq-content/355/lambda_3_comparison.svg)

In [8]:
# def multiply(a, b):
#    return a * b

multiply=lambda a,b: a*b
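
A short sketch comparing the two definitions. The name `multiply_def` is introduced here only so both versions can exist side by side.

```python
# Regular function definition
def multiply_def(a, b):
    return a * b

# Equivalent lambda assigned to a name
multiply = lambda a, b: a * b

print(multiply_def(3, 4))  # 12
print(multiply(3, 4))      # 12

# One visible difference: a lambda has no name of its own
print(multiply_def.__name__)  # multiply_def
print(multiply.__name__)      # <lambda>
```

Assigning a lambda to a name, as above, is mainly useful for demonstration; lambdas are usually written inline where a function argument is expected.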

# 9. Using Lambda Functions to Analyze JSON Data

![](https://s3.amazonaws.com/dq-content/355/lambda_example_1.svg)

![](https://s3.amazonaws.com/dq-content/355/lambda_example_2.svg)

![](https://s3.amazonaws.com/dq-content/355/lambda_example_3.svg)

In [9]:
hn_sorted_points=sorted(hn_clean, key=lambda data: data["points"], reverse=True)

top_5_titles=[data["title"] for data in hn_sorted_points[:5]]

print(top_5_titles)

['2048', 'Today is The Day We Fight Back', 'Wozniak: “Actually, the movie was largely a lie about me”', 'Microsoft Open Sources C# Compiler', 'Elon Musk: To the People of New Jersey']
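
The same sort-then-slice pattern on a small made-up list (`sample` is an assumption). As an aside, `operator.itemgetter` from the standard library is one alternative to the lambda when the key is a plain dictionary lookup.

```python
from operator import itemgetter

# Hypothetical stand-in for hn_clean
sample = [
    {"title": "First", "points": 40},
    {"title": "Second", "points": 1500},
    {"title": "Third", "points": 900},
]

# Sort from most to fewest points using a lambda as the key
by_lambda = sorted(sample, key=lambda story: story["points"], reverse=True)

# Equivalent key built with itemgetter
by_itemgetter = sorted(sample, key=itemgetter("points"), reverse=True)

top_2_titles = [story["title"] for story in by_lambda[:2]]

print(top_2_titles)                # ['Second', 'Third']
print(by_lambda == by_itemgetter)  # True
```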


# 10. Reading JSON files into pandas

In [10]:
import pandas as pd

hn_df=pd.DataFrame(hn_clean)

print(hn_df[:5])

           author  numComments  points  \
0  dragongraphics            0       2   
1             jcr            0       1   
2        callum85            0       1   
3          d3v3r0            0       1   
4      timmipetit            0       1   

                                                 url storyText  \
0  http://ashleynolan.co.uk/blog/are-we-getting-t...             
1  http://spectrum.ieee.org/automaton/robotics/ho...             
2  http://online.wsj.com/articles/apple-to-buy-be...             
3  http://alexsblog.org/2014/05/29/dont-wait-for-...             
4  http://techcrunch.com/2014/05/28/hackerone-get...             

              createdAt                                           tags  \
0  2014-05-29T08:07:50Z  [story, author_dragongraphics, story_7815238]   
1  2014-05-29T08:05:58Z             [story, author_jcr, story_7815234]   
2  2014-05-29T08:05:06Z        [story, author_callum85, story_7815230]   
3  2014-05-29T08:00:08Z          [story, author_d3v3r0

# 11. Exploring Tags Using the Apply Function

In [11]:
tags = hn_df['tags']

four_tags=tags[tags.apply(len)==4]

print(four_tags)

43       [story, author_alamgir_mand, story_7813869, sh...
86         [story, author_cweagans, story_7812404, ask_hn]
104      [story, author_nightstrike789, story_7812099, ...
107      [story, author_ISeemToBeAVerb, story_7812048, ...
109         [story, author_Swizec, story_7812018, show_hn]
                               ...                        
35747      [story, author_rpm4321, story_6994970, show_hn]
35759            [story, author_ct, story_6994828, ask_hn]
35778    [story, author_ChrisNorstrom, story_6994370, a...
35787    [story, author_benjamincburns, story_6994163, ...
35792      [story, author_randall, story_6993981, show_hn]
Name: tags, Length: 2347, dtype: object


# 12. Extracting Tags Using Apply with a Lambda Function

In [12]:
# def extract_tag(l):
#     return l[-1] if len(l) == 4 else None

cleaned_tags=tags.apply(lambda l: l[-1] if len(l)==4 else None)

hn_df["tags"]=cleaned_tags

print(hn_df["tags"].head(100))

0     None
1     None
2     None
3     None
4     None
      ... 
95    None
96    None
97    None
98    None
99    None
Name: tags, Length: 100, dtype: object
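
Because most stories have only three tags, the first 100 rows above are all empty. The sketch below runs the same `apply()` steps on a tiny made-up series (the tag values are assumptions) so that both outcomes are visible. It assumes pandas is installed.

```python
import pandas as pd

# Hypothetical tags column: one three-item list and one four-item list
tags = pd.Series([
    ["story", "author_a", "story_1"],
    ["story", "author_b", "story_2", "ask_hn"],
])

# apply(len) calls len() on each list and returns a series of lengths
lengths = tags.apply(len)
print(lengths.tolist())  # [3, 4]

# Keep the last tag only when the list has exactly four items
cleaned_tags = tags.apply(lambda l: l[-1] if len(l) == 4 else None)
print(cleaned_tags)
```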


# 13. Next Steps

Congratulations, you've reached the end of the lesson! Let's quickly recap the techniques we learned:

* How to read and work with JSON data.

* How to use list comprehensions to extract specific values from JSON objects.

* Some of the theory behind passing functions as arguments.

* How to create single-use lambda functions.

* How to use lambda functions in pandas to extract tags from Hacker News stories.

A lot of these techniques allow us to take code that was three to four lines long and write it in a single line of code. This is a really neat trick, and it can be tempting to start trying to write your code in as few lines as possible.

While this can be fun, keep in mind that you should always balance brevity with readability. When you write code, one of your highest priorities should be to make it readable. The importance of making your code accessible to others shouldn't be underestimated; the person reading your code might be a colleague you're collaborating with, a potential employer looking at your portfolio, or yourself in six months, when you've forgotten why you wrote it the way you did.

In some cases, employing the techniques you learned in this lesson will make your code more readable, but using them in more complex scenarios can have the opposite effect. Try to keep this in mind as you continue to work through lessons and when you use these techniques outside Dataquest.
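
As one illustration of that trade-off, the sketch below writes the same filter twice on made-up data. The `stories` list, the thresholds, and the helper name `is_popular` are all assumptions chosen for the example.

```python
# Hypothetical data, including a missing comment count
stories = [
    {"title": "A", "points": 1200, "numComments": 300},
    {"title": "B", "points": 15, "numComments": 2},
    {"title": "C", "points": 1500, "numComments": None},
]

# Dense version: correct, but the reader has to unpack several ideas at once
dense = sorted([s["title"] for s in stories if s["points"] > 1000 and (s["numComments"] or 0) > 100])

# More readable version: the condition gets a name and its own lines
def is_popular(story):
    comments = story["numComments"] or 0  # treat a missing count as zero
    return story["points"] > 1000 and comments > 100

readable = sorted(story["title"] for story in stories if is_popular(story))

print(dense == readable)  # True
print(readable)           # ['A']
```

Both versions give the same result; the second is longer, but the intent of the filter can be read from the function name.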

In the final lesson of the course, we'll learn techniques to fill missing values in data.

