# Working with Nested Data

### Introduction

Believe it or not, we are now ready to work with data from the web.

Do you feel ready?  Good.

## Getting Our Data

When we get data in the real world, it often comes in the same format: a list of dictionaries.  Good thing for us, we know lists and we know dictionaries.

Let's see this by way of example.  Below is the web address of a website that allows programmers to find jobs.

"https://jobs.github.com/positions.json?description=python&location=new+york"

If we go there, we'll see something like the following:

<img src="https://storage.cloud.google.com/curriculum-assets/curriculum-assets.nosync/mom-files/github-jobs.png" width="50%">

> When we go to the URL above and get back information, we are accessing an API.  APIs are useful because they allow programmers to access information gathered by other companies.  For example, above we are getting a list of job openings from GitHub's API.  If you'd like to see different kinds of data we can get from APIs, [click here](https://github.com/public-apis/public-apis).

Ok, so that mix of green and blue text above may look like a mess, but really it is just a list of dictionaries.

```python
[
    {'title': 'Senior Developer', 'company': 'Postmates'},
    {'title': 'Junior Developer', 'company': 'Bento Box'}
]
```

And while reading through all of that green and blue text is not sustainable, asking questions of a list of dictionaries is.  In fact, we do it all of the time in programming.
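The text that a web API sends back is usually JSON, which Python can turn into a list of dictionaries.  The sketch below assumes a small, hypothetical JSON string (`raw_text`) standing in for a real response, and uses the standard library's `json.loads` to parse it.

```python
import json

# Hypothetical stand-in for the text a jobs API might send back.
raw_text = (
    '[{"title": "Senior Developer", "company": "Postmates"}, '
    '{"title": "Junior Developer", "company": "Bento Box"}]'
)

# json.loads converts the JSON text into Python objects.
parsed_jobs = json.loads(raw_text)

print(type(parsed_jobs))     # the outer container is a list
print(type(parsed_jobs[0]))  # each element is a dictionary
print(len(parsed_jobs))      # how many jobs we received
```

Once the text has been parsed, we no longer have to read it by eye.  We can ask Python about it instead.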

Let's learn how to ask questions of a list of dictionaries.  It's great training for working with data from the Internet.

## Let's get a job

* First review

Remember with a dictionary, we use the squiggly brackets to create the dictionary `{}`.

In [1]:
developer_job = {'company': 'Postmates', 'title': 'senior developer'}

> Press shift + enter to assign the dictionary to `developer_job`.

And we use the square brackets to access data from a dictionary.

In [2]:
developer_job['company']

'Postmates'

As we said, we think of this as asking questions of our dictionary.  Hey `developer_job`, what is the `'company'` of the job?

And our dictionary responds with `'Postmates'`.
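We can ask the same dictionary more than one question.  Here is a minimal sketch that reuses `developer_job`.  The `'salary'` key is a hypothetical example of a key that is not in the dictionary, and `.get` is one way to ask about it without causing an error.

```python
developer_job = {'company': 'Postmates', 'title': 'senior developer'}

# Each key is a different question we can ask.
print(developer_job['company'])  # Postmates
print(developer_job['title'])    # senior developer

# Asking for a missing key with square brackets raises a KeyError.
# The .get method returns None (or a default we choose) instead.
print(developer_job.get('salary'))             # None
print(developer_job.get('salary', 'unknown'))  # unknown
```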

* Now for nested data

Ok, now below we assign a list of jobs to the variable `jobs`.

In [101]:
jobs = [
    {'company': 'Postmates', 'title': 'Senior Developer'},
    {'company': 'Bento Box', 'title': 'Python Programmer'}
]
jobs

[{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

Remember that we use the index to retrieve a specific element from a list.  Ok, let's ask our list above for the first job.

> Press shift + return below.

In [135]:
jobs[0]

{'company': 'Postmates', 'title': 'Senior Developer'}

> It gives us the first element in the list, the dictionary representing the job from `Postmates`.

Now, on your own, access the second element from the list of jobs in the cell below.

> Think about asking the `jobs` list a question.  If you're correct, you'll see the return value below the cell match what we have commented out.

In [159]:


# {'company': 'Bento Box', 'title': 'Python Programmer'}

> Audio help.  Press shift + return to hear it below.

In [168]:
beginning = "https://storage.googleapis.com/curriculum-assets/curriculum-assets.nosync/mom-files/"

import IPython.display as ipd
ipd.Audio(beginning + "nested-data-select-first.wav")

### More complicated

Now let's try to dig further inside of this list.

In [104]:
jobs = [
    {'company': 'Postmates', 'title': 'Senior Developer'},
    {'company': 'Bento Box', 'title': 'Python Programmer'}
]
jobs

[{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

We'll start by trying to select the `title` of the **last element**, which is `Python Programmer`.  Notice that we can't do this directly.  All we have is a variable that points to the list of `jobs`.

But in that list of `jobs`, the last element is the dictionary representing the second job, and inside that dictionary is the `title` of `Python Programmer`.

So to get there, we need to ask two questions:

1. Hey, `jobs`, what's your second element?
2. Hey second element, what's your `title`?

`jobs -> second job -> title`

So that's the path we need to go through to access our data.  Let's try it.

1. Start with the list of jobs and ask for the second element

In [105]:
jobs[1]

{'company': 'Bento Box', 'title': 'Python Programmer'}

2. Now that we have the second element, ask for the `title`

In [106]:
jobs[1]['title']

'Python Programmer'

> Notice that there are no spaces in the line above.
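The two questions can also be asked one at a time, storing each answer along the way.  The sketch below uses hypothetical variable names (`second_job`, `second_title`) to show that the chained version is shorthand for these two steps.

```python
jobs = [
    {'company': 'Postmates', 'title': 'Senior Developer'},
    {'company': 'Bento Box', 'title': 'Python Programmer'}
]

# Question 1: ask the list for its second element.
second_job = jobs[1]

# Question 2: ask that dictionary for its title.
second_title = second_job['title']

# Chaining the brackets asks both questions in one line.
print(second_title == jobs[1]['title'])  # True

# A negative index counts from the end, so -1 is the last element.
print(jobs[-1]['title'])  # Python Programmer
```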

### A little analogy

So we accessed this information with the following logic:

`jobs -> second job -> title`

Now for an analogy.  Let's think of our data as a mailbox.

<img src="https://storage.googleapis.com/curriculum-assets/curriculum-assets.nosync/intro-to-coding/mailboxes.jpg" width = 30%>

In [107]:
jobs = [
    {'company': 'Postmates', 'title': 'Senior Developer'},
    {'company': 'Bento Box', 'title': 'Python Programmer'}
]
jobs

[{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

* The whole gray mailbox represents the list of **jobs**.  
* Each door contains a different element in our list.  
* Then inside the mailbox representing a specific job, there are different pieces of information about that job.

``jobs -> second job -> title``

In [108]:
jobs[1]['title']

'Python Programmer'
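To make the analogy concrete, here is a small sketch.  The function `look_inside` and its parameter names `door` and `label` are hypothetical, chosen only to mirror the mailbox picture.

```python
jobs = [
    {'company': 'Postmates', 'title': 'Senior Developer'},
    {'company': 'Bento Box', 'title': 'Python Programmer'}
]

def look_inside(mailboxes, door, label):
    """Open the door at position `door`, then read the item called `label`."""
    compartment = mailboxes[door]  # pick one door: this gives a dictionary
    return compartment[label]      # pick one piece of information inside it

print(look_inside(jobs, 1, 'title'))    # Python Programmer
print(look_inside(jobs, 1, 'company'))  # Bento Box
```

The door number plays the role of the list index, and the label plays the role of the dictionary key.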

Ok, now it's your turn to practice.  

Once again, here is our list of jobs.

In [110]:
jobs = [{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

Select the company related to the first job.  You'll see it's correct if pressing `shift + return` returns the string `'Postmates'`. 

In [143]:


# 'Postmates'

'Postmates'

> Walkthrough below.  Press shift + enter to hear audio.

In [167]:
beginning = "https://storage.googleapis.com/curriculum-assets/curriculum-assets.nosync/mom-files/"

import IPython.display as ipd
ipd.Audio(beginning + "selecting-nested.wav")

If you were able to do this, congratulations :)  You are certainly on your way.  Being able to work with nested data structures is important because we see them so often on the web.

In fact, let's go get some live jobs data right now.

### Wrapping Up

In [149]:
import pandas as pd
url = "https://jobs.github.com/positions.json?description=python&location=new+york"
jobs_df = pd.read_json(url)
internet_jobs = jobs_df.to_dict('records')

> Press shift + return on the cell above.

Guess what, we just gathered a list of jobs from the Internet.  Now if we look at the entire list we'll see that it's too much information to handle.  In fact, let's just look at the first dictionary.

In [114]:
internet_jobs[0]

{'id': '62036836-5b6b-4e69-a9d5-805e3f4f4ff1',
 'type': 'Full Time',
 'url': 'https://jobs.github.com/positions/62036836-5b6b-4e69-a9d5-805e3f4f4ff1',
 'created_at': Timestamp('2019-12-18 17:05:11+0000', tz='UTC'),
 'company': 'Markacy',
 'company_url': 'http://www.markacy.com',
 'location': 'New York City, NY, USA',
 'title': 'Senior Shopify Developer',
 'description': '<p>We are looking for a Shopify Expert to join our rapidly growing team to implement and maintain functional web pages and third-party integrations for our clients.</p>\n<p>Senior Web Developer responsibilities include working with Markacy leadership and client teams to understand, document requirements and execute weekly growth tests on the website to increase key client metrics. To be successful in this role, you should have extensive experience building web pages from scratch and in-depth knowledge of Shopify, WordPress, and other major CMS E-commerce platforms. You should also have experience using HotJar, Google A

**But** it's just a dictionary, so we can use the same logic as before to select information from the first dictionary in the list -- `internet_jobs[0]['company']`.

In [116]:
internet_jobs[0]['company']

'Markacy'

> So you can see that by knowing how to navigate a list of dictionaries, we can browse data that comes from the Internet.
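When a dictionary is too long to read, one option is to ask which keys it has before asking for any values.  The sketch below assumes a shortened, hypothetical copy of the first job (`sample_job`, inside `internet_jobs_sample`) so that it runs without an Internet connection.

```python
# Shortened stand-in for internet_jobs[0]; only a few of its keys are kept.
sample_job = {
    'type': 'Full Time',
    'company': 'Markacy',
    'location': 'New York City, NY, USA',
    'title': 'Senior Shopify Developer',
}
internet_jobs_sample = [sample_job]

# Calling list(...) on a dictionary gives its keys in insertion order.
print(list(internet_jobs_sample[0]))
# ['type', 'company', 'location', 'title']

# Once we know the keys, we can pick the one we care about.
print(internet_jobs_sample[0]['location'])  # New York City, NY, USA
```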

Try accessing the `company` from the second element in our `internet_jobs` list.

'Aon Cyber Solutions'

And in fact, we can use a loop to print out how to apply to each of the jobs in our list.

In [153]:
for job in internet_jobs:
    print(job['how_to_apply'])

<p>Email your resume to <a href="mailto:tucker.matheson@markacy.com">tucker.matheson@markacy.com</a></p>

<p>APPLY HERE: <a href="https://us-strozfriedberg-aon.icims.com/jobs/24315/senior-developer/job">https://us-strozfriedberg-aon.icims.com/jobs/24315/senior-developer/job</a></p>

<p><a href="https://jobs.lever.co/sesamecare/26d259b5-5bcc-4ee9-a5d0-068d9c856381?lever-origin=applied&amp;lever-source%5B%5D=GitHub">https://jobs.lever.co/sesamecare/26d259b5-5bcc-4ee9-a5d0-068d9c856381?lever-origin=applied&amp;lever-source%5B%5D=GitHub</a></p>

<p>Apply Here: <a href="http://www.Click2apply.net/cyg9sy9m99zdm9v2">http://www.Click2apply.net/cyg9sy9m99zdm9v2</a></p>



> But we're getting ahead of ourselves.  We'll dedicate future lessons to looping through a list of dictionaries.

### Summary

In this lesson, we saw how to work with nested data structures.  We saw that we can represent a table of data as a list of dictionaries.

In [117]:
jobs = [{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

And we can select this data by working from the outside in.  So if we want to get the company of a specific job, we take the following path.

`jobs -> first job -> company`

In [118]:
jobs[0]['company']

'Postmates'
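The idea that a table can be represented as a list of dictionaries can be sketched as follows.  The names `header`, `rows`, and `table_as_dicts` are hypothetical, and the list comprehension is just one of several ways to build the list.

```python
# A small table: one header row, then one row per job.
header = ['company', 'title']
rows = [
    ['Postmates', 'Senior Developer'],
    ['Bento Box', 'Python Programmer'],
]

# Pair each column name with the value in that row to build one dictionary per row.
table_as_dicts = [dict(zip(header, row)) for row in rows]

# The result is navigated the same way as before: list index, then key.
print(table_as_dicts[0]['company'])  # Postmates
print(table_as_dicts[1]['title'])    # Python Programmer
```

Each row of the table becomes one dictionary, and each column name becomes a key.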

If you feel like you need some more practice with this, take some time and practice.  In the next lesson, we'll learn how to loop through our data.

> For more practice, use your knowledge of nested data structures to select the audio file you want to hear.

In [154]:
beginning = "https://storage.googleapis.com/curriculum-assets/curriculum-assets.nosync/mom-files/"

scenes = [
    {'audio': 'preposition-short.wav'},
    {'audio': 'like-a-dinosaur.wav'}
]

In [162]:
selected_scene = None  # replace None with an audio file name selected from scenes

In [166]:
import IPython.display as ipd
ipd.Audio(beginning + selected_scene)

<right> 
<a href="https://colab.research.google.com/github/jigsawlabs-student/code-intro/blob/master/7-make-it-easy.ipynb">
<img src="https://storage.cloud.google.com/curriculum-assets/curriculum-assets.nosync/mom-files/pngfuel.com.png" align="right" style="padding-right: 20px" width="10%">
    </a>
</right>

<center>
<a href="https://www.jigsawlabs.io/free" style="position: center"><img src="https://storage.cloud.google.com/curriculum-assets/curriculum-assets.nosync/mom-files/jigsaw-labs.png" width="15%" style="text-align: center"></a>
</center>

### Answers

In [164]:
jobs = [{'company': 'Postmates', 'title': 'Senior Developer'},
 {'company': 'Bento Box', 'title': 'Python Programmer'}]

In [163]:
jobs[1]

{'company': 'Bento Box', 'title': 'Python Programmer'}

In [165]:
jobs[0]['company']

'Postmates'

In [None]:
internet_jobs[1]['company']